llvm/docs/LangRef.rst

   1 ==============================
   2 LLVM Language Reference Manual
   3 ==============================
   4
   5 .. contents::
   6    :local:
   7    :depth: 4
   8
   9 Abstract
  10 ========
  11
  12 This document is a reference manual for the LLVM assembly language. LLVM
  13 is a Static Single Assignment (SSA) based representation that provides
  14 type safety, low-level operations, flexibility, and the capability of
  15 representing 'all' high-level languages cleanly. It is the common code
  16 representation used throughout all phases of the LLVM compilation
  17 strategy.
  18
  19 Introduction
  20 ============
  21
  22 The LLVM code representation is designed to be used in three different
  23 forms: as an in-memory compiler IR, as an on-disk bitcode representation
  24 (suitable for fast loading by a Just-In-Time compiler), and as a human
  25 readable assembly language representation. This allows LLVM to provide a
  26 powerful intermediate representation for efficient compiler
  27 transformations and analysis, while providing a natural means to debug
  28 and visualize the transformations. The three different forms of LLVM are
  29 all equivalent. This document describes the human readable
  30 representation and notation.
  31
  32 The LLVM representation aims to be light-weight and low-level while
  33 being expressive, typed, and extensible at the same time. It aims to be
  34 a "universal IR" of sorts, by being at a low enough level that
  35 high-level ideas may be cleanly mapped to it (similar to how
  36 microprocessors are "universal IRs", allowing many source languages to
  37 be mapped to them). By providing type information, LLVM can be used as
  38 the target of optimizations: for example, through pointer analysis, it
  39 can be proven that a C automatic variable is never accessed outside of
  40 the current function, allowing it to be promoted to a simple SSA value
  41 instead of a memory location.
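
As a minimal sketch of that promotion (the function names ``@before`` and
``@after`` are hypothetical, chosen only for illustration), the first function
keeps a local variable in a stack slot, and the second shows the form an
optimizer could produce once it has shown that the slot is never accessed
outside the function:

.. code-block:: llvm

    ; Before promotion: the automatic variable lives in memory.
    define i32 @before(i32 %a) {
      %x = alloca i32
      store i32 %a, i32* %x
      %v = load i32, i32* %x
      %r = add i32 %v, 1
      ret i32 %r
    }

    ; After promotion: the stack slot is gone and %a is used directly.
    define i32 @after(i32 %a) {
      %r = add i32 %a, 1
      ret i32 %r
    }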
  42
  43 .. _wellformed:
  44
  45 Well-Formedness
  46 ---------------
  47
  48 It is important to note that this document describes 'well formed' LLVM
  49 assembly language. There is a difference between what the parser accepts
  50 and what is considered 'well formed'. For example, the following
  51 instruction is syntactically okay, but not well formed:
  52
  53 .. code-block:: llvm
  54
  55     %x = add i32 1, %x
  56
  57 because the definition of ``%x`` does not dominate all of its uses. The
  58 LLVM infrastructure provides a verification pass that may be used to
  59 verify that an LLVM module is well formed. This pass is automatically
  60 run by the parser after parsing input assembly and by the optimizer
  61 before it outputs bitcode. The violations pointed out by the verifier
  62 pass indicate bugs in transformation passes or input to the parser.
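
By contrast, a value may feed its own recomputation when the cycle goes
through a ``phi`` node, because the use of an incoming value is treated as
occurring on the corresponding incoming edge. The following sketch (function
and value names are illustrative assumptions) is expected to be accepted by
the verifier:

.. code-block:: llvm

    define i32 @count_to(i32 %n) {
    entry:
      br label %loop

    loop:
      ; %x.next is defined later in this block, but the phi only reads it
      ; along the back edge from %loop, where the definition dominates.
      %x = phi i32 [ 0, %entry ], [ %x.next, %loop ]
      %x.next = add i32 1, %x
      %done = icmp eq i32 %x.next, %n
      br i1 %done, label %exit, label %loop

    exit:
      ret i32 %x.next
    }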
  63
  64 .. _identifiers:
  65
  66 Identifiers
  67 ===========
  68
  69 LLVM identifiers come in two basic types: global and local. Global
  70 identifiers (functions, global variables) begin with the ``'@'``
  71 character. Local identifiers (register names, types) begin with the
  72 ``'%'`` character. Additionally, there are three different formats for
  73 identifiers, for different purposes:
  74
  75 #. Named values are represented as a string of characters with their
  76    prefix. For example, ``%foo``, ``@DivisionByZero``,
  77    ``%a.really.long.identifier``. The actual regular expression used is
  78    '``[%@][-a-zA-Z$._][-a-zA-Z$._0-9]*``'. Identifiers that require other
  79    characters in their names can be surrounded with quotes. Special
  80    characters may be escaped using ``"\xx"`` where ``xx`` is the ASCII
  81    code for the character in hexadecimal. In this way, any character can
  82    be used in a name value, even quotes themselves. The ``"\01"`` prefix
  83    can be used on global values to suppress mangling.
  84 #. Unnamed values are represented as an unsigned numeric value with
  85    their prefix. For example, ``%12``, ``@2``, ``%44``.
  86 #. Constants, which are described in the section Constants_ below.
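
The first two identifier formats can be sketched as follows. Apart from
``@DivisionByZero`` and ``%foo``, which are borrowed from the list above, all
names here are hypothetical:

.. code-block:: llvm

    @DivisionByZero = global i32 0          ; named global
    @"name with spaces" = global i32 1      ; quoted name with other characters
    @"say \22hi\22" = global i32 2          ; \22 is the hexadecimal ASCII code of a quote
    @"\01raw_symbol" = global i32 3         ; the \01 prefix suppresses mangling
    @0 = global i32 4                       ; unnamed global

    define i32 @f(i32 %foo) {
      %1 = add i32 %foo, 1                  ; unnamed local (the entry block is %0)
      ret i32 %1
    }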
  87
  88 LLVM requires that values start with a prefix for two reasons: Compilers
  89 don't need to worry about name clashes with reserved words, and the set
  90 of reserved words may be expanded in the future without penalty.
  91 Additionally, unnamed identifiers allow a compiler to quickly come up
  92 with a temporary variable without having to avoid symbol table
  93 conflicts.
  94
  95 Reserved words in LLVM are very similar to reserved words in other
  96 languages. There are keywords for different opcodes ('``add``',
  97 '``bitcast``', '``ret``', etc.), for primitive type names ('``void``',
  98 '``i32``', etc.), and others. These reserved words cannot conflict
  99 with variable names, because none of them start with a prefix character
 100 (``'%'`` or ``'@'``).
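
For instance, the following sketch deliberately reuses reserved words as
identifiers. The names are chosen only to illustrate the point and would be
poor style in practice:

.. code-block:: llvm

    ; '@add', '%ret' and '%i32' are ordinary identifiers because of their prefix;
    ; the bare words 'add', 'ret' and 'i32' remain keywords.
    define i32 @add(i32 %ret, i32 %i32) {
      %add = add i32 %ret, %i32
      ret i32 %add
    }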
 101
 102 Here is an example of LLVM code to multiply the integer variable
 103 '``%X``' by 8:
 104
 105 The easy way:
 106
 107 .. code-block:: llvm
 108
 109     %result = mul i32 %X, 8
 110
 111 After strength reduction:
 112
 113 .. code-block:: llvm
 114
 115     %result = shl i32 %X, 3
 116
 117 And the hard way:
 118
 119 .. code-block:: llvm
 120
 121     %0 = add i32 %X, %X           ; yields i32:%0
 122     %1 = add i32 %0, %0           ; yields i32:%1
 123     %result = add i32 %1, %1
 124
 125 This last way of multiplying ``%X`` by 8 illustrates several important
 126 lexical features of LLVM:
 127
 128 #. Comments are delimited with a '``;``' and go until the end of line.
 129 #. Unnamed temporaries are created when the result of a computation is
 130    not assigned to a named value.
 131 #. Unnamed temporaries are numbered sequentially (using a per-function
 132    incrementing counter, starting with 0). Note that basic blocks and unnamed
 133    function parameters are included in this numbering. For example, if the
 134    entry basic block is not given a label name and all function parameters are
 135    named, then it will get number 0.
 136
 137 It also shows a convention that we follow in this document. When
 138 demonstrating instructions, we will follow an instruction with a comment
 139 that defines the type and name of the value produced.
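
The numbering rule for unnamed values can be sketched with a function whose
parameters and entry block are all unnamed (``@sum`` is a hypothetical name):

.. code-block:: llvm

    define i32 @sum(i32, i32) {   ; the unnamed parameters are %0 and %1
                                  ; the unlabeled entry block takes number 2
      %3 = add i32 %0, %1         ; yields i32:%3
      ret i32 %3
    }

Writing ``%2`` instead of ``%3`` for the first instruction would be rejected,
since that number is already taken by the entry block.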
 140
 141 High Level Structure
 142 ====================
 143
 144 Module Structure
 145 ----------------
 146
 147 LLVM programs are composed of ``Module``\ s, each of which is a
 148 translation unit of the input programs. Each module consists of
 149 functions, global variables, and symbol table entries. Modules may be
 150 combined together with the LLVM linker, which merges function (and
 151 global variable) definitions, resolves forward declarations, and merges
 152 symbol table entries. Here is an example of the "hello world" module:
 153
 154 .. code-block:: llvm
 155
 156     ; Declare the string constant as a global constant.
 157     @.str = private unnamed_addr constant [13 x i8] c"hello world\0A\00"
 158
 159     ; External declaration of the puts function
 160     declare i32 @puts(i8* nocapture) nounwind
 161
 162     ; Definition of main function
 163     define i32 @main() {   ; i32()*
 164       ; Convert [13 x i8]* to i8*...
 165       %cast210 = getelementptr [13 x i8], [13 x i8]* @.str, i64 0, i64 0
 166
 167       ; Call puts function to write out the string to stdout.
 168       call i32 @puts(i8* %cast210)
 169       ret i32 0
 170     }
 171
 172     ; Named metadata
 173     !0 = !{i32 42, null, !"string"}
 174     !foo = !{!0}
 175
 176 This example is made up of a :ref:`global variable <globalvars>` named
 177 "``.str``", an external declaration of the "``puts``" function, a
 178 :ref:`function definition <functionstructure>` for "``main``" and
 179 :ref:`named metadata <namedmetadatastructure>` "``foo``".
 180
 181 In general, a module is made up of a list of global values (where both
 182 functions and global variables are global values). Global values are
 183 represented by a pointer to a memory location (in this case, a pointer
 184 to an array of char, and a pointer to a function), and have one of the
 185 following :ref:`linkage types <linkage>`.
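
The pointer nature of global values can be sketched as follows (all names are
hypothetical). A global variable of value type ``i32`` is used through an
``i32*``, and a function is used through a pointer to its function type:

.. code-block:: llvm

    @counter = global i32 0

    define i32 @bump() {
      %old = load i32, i32* @counter      ; @counter has type i32*
      %new = add i32 %old, 1
      store i32 %new, i32* @counter
      ret i32 %new
    }

    define i32 ()* @get_bump() {
      ret i32 ()* @bump                   ; @bump has type i32 ()*
    }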
 186
 187 .. _linkage:
 188
 189 Linkage Types
 190 -------------
 191
 192 All Global Variables and Functions have one of the following types of
 193 linkage:
 194
 195 ``private``
 196     Global values with "``private``" linkage are only directly
 197     accessible by objects in the current module. In particular, linking
 198     code into a module with a private global value may cause the
 199     private to be renamed as necessary to avoid collisions. Because the
 200     symbol is private to the module, all references can be updated. This
 201     doesn't show up in any symbol table in the object file.
 202 ``internal``
 203     Similar to private, but the value shows as a local symbol
 204     (``STB_LOCAL`` in the case of ELF) in the object file. This
 205     corresponds to the notion of the '``static``' keyword in C.
 206 ``available_externally``
 207     Globals with "``available_externally``" linkage are never emitted into
 208     the object file corresponding to the LLVM module. From the linker's
 209     perspective, an ``available_externally`` global is equivalent to
 210     an external declaration. They exist to allow inlining and other
 211     optimizations to take place given knowledge of the definition of the
 212     global, which is known to be somewhere outside the module. Globals
 213     with ``available_externally`` linkage are allowed to be discarded at
 214     will, and allow inlining and other optimizations. This linkage type is
 215     only allowed on definitions, not declarations.
 216 ``linkonce``
 217     Globals with "``linkonce``" linkage are merged with other globals of
 218     the same name when linkage occurs. This can be used to implement
 219     some forms of inline functions, templates, or other code which must
 220     be generated in each translation unit that uses it, but where the
 221     body may be overridden with a more definitive definition later.
 222     Unreferenced ``linkonce`` globals are allowed to be discarded. Note
 223     that ``linkonce`` linkage does not actually allow the optimizer to
 224     inline the body of this function into callers because it doesn't
 225     know if this definition of the function is the definitive definition
 226     within the program or whether it will be overridden by a stronger
 227     definition. To enable inlining and other optimizations, use
 228     "``linkonce_odr``" linkage.
 229 ``weak``
 230     "``weak``" linkage has the same merging semantics as ``linkonce``
 231     linkage, except that unreferenced globals with ``weak`` linkage may
 232     not be discarded. This is used for globals that are declared "weak"
 233     in C source code.
 234 ``common``
 235     "``common``" linkage is most similar to "``weak``" linkage, but they
 236     are used for tentative definitions in C, such as "``int X;``" at
 237     global scope. Symbols with "``common``" linkage are merged in the
 238     same way as ``weak`` symbols, and they may not be deleted if
 239     unreferenced. ``common`` symbols may not have an explicit section,
 240     must have a zero initializer, and may not be marked
 241     ':ref:`constant <globalvars>`'. Functions and aliases may not have
 242     common linkage.
 243
 244 .. _linkage_appending:
 245
 246 ``appending``
 247     "``appending``" linkage may only be applied to global variables of
 248     pointer to array type. When two global variables with appending
 249     linkage are linked together, the two global arrays are appended
 250     together. This is the type-safe LLVM equivalent of having the
 251     system linker append together "sections" with identical names when
 252     .o files are linked.
 253
 254     Unfortunately this doesn't correspond to any feature in .o files, so it
 255     can only be used for variables like ``llvm.global_ctors`` which LLVM
 256     interprets specially.
 257
 258 ``extern_weak``
 259     The semantics of this linkage follow the ELF object file model: the
 260     symbol is weak until linked; if not linked, the symbol becomes null
 261     instead of being an undefined reference.
 262 ``linkonce_odr``, ``weak_odr``
 263     Some languages allow differing globals to be merged, such as two
 264     functions with different semantics. Other languages, such as
 265     ``C++``, ensure that only equivalent globals are ever merged (the
 266     "one definition rule" --- "ODR"). Such languages can use the
 267     ``linkonce_odr`` and ``weak_odr`` linkage types to indicate that the
 268     global will only be merged with equivalent globals. These linkage
 269     types are otherwise the same as their non-``odr`` versions.
 270 ``external``
 271     If none of the above identifiers are used, the global is externally
 272     visible, meaning that it participates in linkage and can be used to
 273     resolve external symbol references.
 274
 275 It is illegal for a global variable or function *declaration* to have any
 276 linkage type other than ``external`` or ``extern_weak``.
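
The following module is a sketch of how the linkage keywords are spelled on
definitions and declarations. Every name except ``llvm.global_ctors`` is
hypothetical, and the constructor priority ``65535`` is only an example
value:

.. code-block:: llvm

    @.msg = private constant [3 x i8] c"hi\00"   ; no symbol table entry
    @count = internal global i32 0               ; local symbol, like C 'static'
    @tentative = common global i32 0             ; must be zero-initialized
    @table = weak_odr constant i32 7
    @hook = extern_weak global i32               ; declaration; null if never defined

    ; Appending linkage on a global that LLVM interprets specially.
    @llvm.global_ctors = appending global [1 x { i32, void ()*, i8* }]
        [{ i32, void ()*, i8* } { i32 65535, void ()* @init, i8* null }]

    define internal void @init() {
      ret void
    }

    define linkonce_odr i32 @twice(i32 %x) {
      %r = shl i32 %x, 1
      ret i32 %r
    }

    define available_externally i32 @hint() {    ; never emitted into the object file
      ret i32 1
    }

    declare extern_weak void @optional_hook()
    declare void @external_function()            ; external linkage is the default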
 277
 278 .. _callingconv:
 279
 280 Calling Conventions
 281 -------------------
 282
 283 LLVM :ref:`functions <functionstructure>`, :ref:`calls <i_call>` and
 284 :ref:`invokes <i_invoke>` can all have an optional calling convention
 285 specified for the call. The calling convention of any pair of dynamic
 286 caller/callee must match, or the behavior of the program is undefined.
 287 The following calling conventions are supported by LLVM, and more may be
 288 added in the future:
 289
 290 "``ccc``" - The C calling convention
 291     This calling convention (the default if no other calling convention
 292     is specified) matches the target C calling conventions. This calling
 293     convention supports varargs function calls and tolerates some
 294     mismatch in the declared prototype and implemented declaration of
 295     the function (as does normal C).
 296 "``fastcc``" - The fast calling convention
 297     This calling convention attempts to make calls as fast as possible
 298     (e.g. by passing things in registers). This calling convention
 299     allows the target to use whatever tricks it wants to produce fast
 300     code for the target, without having to conform to an externally
 301     specified ABI (Application Binary Interface). `Tail calls can only
 302     be optimized when this, the tailcc, the GHC or the HiPE convention is
 303     used. <CodeGenerator.html#tail-call-optimization>`_ This calling
 304     convention does not support varargs and requires the prototype of all
 305     callees to exactly match the prototype of the function definition.
 306 "``coldcc``" - The cold calling convention
 307     This calling convention attempts to make code in the caller as
 308     efficient as possible under the assumption that the call is not
 309     commonly executed. As such, these calls often preserve all registers
 310     so that the call does not break any live ranges in the caller side.
 311     This calling convention does not support varargs and requires the
 312     prototype of all callees to exactly match the prototype of the
 313     function definition. Furthermore the inliner doesn't consider such function
 314     calls for inlining.
 315 "``cc 10``" - GHC convention
 316     This calling convention has been implemented specifically for use by
 317     the `Glasgow Haskell Compiler (GHC) <http://www.haskell.org/ghc>`_.
 318     It passes everything in registers, going to extremes to achieve this
 319     by disabling callee save registers. This calling convention should
 320     not be used lightly but only for specific situations such as an
 321     alternative to the *register pinning* performance technique often
 322     used when implementing functional programming languages. At the
 323     moment only X86 supports this convention and it has the following
 324     limitations:
 325
 326     -  On *X86-32*, it supports only up to 4 bit-type parameters. No
 327        floating-point types are supported.
 328     -  On *X86-64*, it supports only up to 10 bit-type parameters and 6
 329        floating-point parameters.
 330
 331     This calling convention supports `tail call
 332     optimization <CodeGenerator.html#tail-call-optimization>`_ but requires
 333     that both the caller and the callee use it.
 334 "``cc 11``" - The HiPE calling convention
 335     This calling convention has been implemented specifically for use by
 336     the `High-Performance Erlang
 337     (HiPE) <http://www.it.uu.se/research/group/hipe/>`_ compiler, *the*
 338     native code compiler of `Ericsson's Open Source Erlang/OTP
 339     system <http://www.erlang.org/download.shtml>`_. It uses more
 340     registers for argument passing than the ordinary C calling
 341     convention and defines no callee-saved registers. The calling
 342     convention properly supports `tail call
 343     optimization <CodeGenerator.html#tail-call-optimization>`_ but requires
 344     that both the caller and the callee use it. It uses a *register pinning*
 345     mechanism, similar to GHC's convention, for keeping frequently
 346     accessed runtime components pinned to specific hardware registers.
 347     At the moment only X86 supports this convention (both 32 and 64
 348     bit).
 349 "``webkit_jscc``" - WebKit's JavaScript calling convention
 350     This calling convention has been implemented for `WebKit FTL JIT
 351     <https://trac.webkit.org/wiki/FTLJIT>`_. It passes arguments on the
 352     stack right to left (as cdecl does), and returns a value in the
 353     platform's customary return register.
 354 "``anyregcc``" - Dynamic calling convention for code patching
 355     This is a special convention that supports patching an arbitrary code
 356     sequence in place of a call site. This convention forces the call
 357     arguments into registers but allows them to be dynamically
 358     allocated. This can currently only be used with calls to
 359     ``llvm.experimental.patchpoint`` because only this intrinsic records
 360     the location of its arguments in a side table. See :doc:`StackMaps`.
 361 "``preserve_mostcc``" - The `PreserveMost` calling convention
 362     This calling convention attempts to make the code in the caller as
 363     unintrusive as possible. This convention behaves identically to the `C`
 364     calling convention on how arguments and return values are passed, but it
 365     uses a different set of caller/callee-saved registers. This alleviates the
 366     burden of saving and recovering a large register set before and after the
 367     call in the caller. If the arguments are passed in callee-saved registers,
 368     then they will be preserved by the callee across the call. This doesn't
 369     apply to values returned in callee-saved registers.
 370
 371     - On X86-64 the callee preserves all general purpose registers, except for
 372       R11. R11 can be used as a scratch register. Floating-point registers
 373       (XMMs/YMMs) are not preserved and need to be saved by the caller.
 374
 375     The idea behind this convention is to support calls to runtime functions
 376     that have a hot path and a cold path. The hot path is usually a small piece
 377     of code that doesn't use many registers. The cold path might need to call out to
 378     another function and therefore only needs to preserve the caller-saved
 379     registers, which haven't already been saved by the caller. The
 380     `PreserveMost` calling convention is very similar to the `cold` calling
 381     convention in terms of caller/callee-saved registers, but they are used for
 382     different types of function calls. `coldcc` is for function calls that are
 383     rarely executed, whereas `preserve_mostcc` function calls are intended to be
 384     on the hot path and definitely executed a lot. Furthermore `preserve_mostcc`
 385     doesn't prevent the inliner from inlining the function call.
 386
 387     This calling convention will be used by a future version of the Objective-C
 388     runtime and should therefore still be considered experimental at this time.
 389     Although this convention was created to optimize certain runtime calls to
 390     the Objective-C runtime, it is not limited to this runtime and might be used
 391     by other runtimes in the future too. The current implementation only
 392     supports X86-64, but the intention is to support more architectures in the
 393     future.
 394 "``preserve_allcc``" - The `PreserveAll` calling convention
 395     This calling convention attempts to make the code in the caller even less
 396     intrusive than the `PreserveMost` calling convention. This calling
 397     convention also behaves identically to the `C` calling convention on how
 398     arguments and return values are passed, but it uses a different set of
 399     caller/callee-saved registers. This removes the burden of saving and
 400     recovering a large register set before and after the call in the caller. If
 401     the arguments are passed in callee-saved registers, then they will be
 402     preserved by the callee across the call. This doesn't apply to values
 403     returned in callee-saved registers.
 404
 405     - On X86-64 the callee preserves all general purpose registers, except for
 406       R11. R11 can be used as a scratch register. Furthermore it also preserves
 407       all floating-point registers (XMMs/YMMs).
 408
 409     The idea behind this convention is to support calls to runtime functions
 410     that don't need to call out to any other functions.
 411
 412     This calling convention, like the `PreserveMost` calling convention, will be
 413     used by a future version of the Objective-C runtime and should be considered
 414     experimental at this time.
 415 "``cxx_fast_tlscc``" - The `CXX_FAST_TLS` calling convention for access functions
 416     Clang generates an access function to access C++-style TLS. The access
 417     function generally has an entry block, an exit block and an initialization
 418     block that is run the first time. The entry and exit blocks can access
 419     a few TLS IR variables; each access will be lowered to a platform-specific
 420     sequence.
 421
 422     This calling convention aims to minimize overhead in the caller by
 423     preserving as many registers as possible (all the registers that are
 424     preserved on the fast path, composed of the entry and exit blocks).
 425
 426     This calling convention behaves identically to the `C` calling convention on
 427     how arguments and return values are passed, but it uses a different set of
 428     caller/callee-saved registers.
 429
 430     Given that each platform has its own lowering sequence, hence its own set
 431     of preserved registers, we can't use the existing `PreserveMost`.
 432
 433     - On X86-64 the callee preserves all general purpose registers, except for
 434       RDI and RAX.
 435 "``tailcc``" - Tail callable calling convention
 436     This calling convention ensures that calls in tail position will always be
 437     tail call optimized. This calling convention is equivalent to fastcc,
 438     except for an additional guarantee that tail calls will be produced
 439     whenever possible. `Tail calls can only be optimized when this, the fastcc,
 440     the GHC or the HiPE convention is used. <CodeGenerator.html#tail-call-optimization>`_
 441     This calling convention does not support varargs and requires the prototype of
 442     all callees to exactly match the prototype of the function definition.
 443 "``swiftcc``" - This calling convention is used for the Swift language.
 444     - On X86-64 RCX and R8 are available for additional integer returns, and
 445       XMM2 and XMM3 are available for additional FP/vector returns.
 446     - On iOS platforms, we use the AAPCS-VFP calling convention.
 447 "``swifttailcc``"
 448     This calling convention is like ``swiftcc`` in most respects, but also the
 449     callee pops the argument area of the stack so that mandatory tail calls are
 450     possible as in ``tailcc``.
 451 "``cfguard_checkcc``" - Windows Control Flow Guard (Check mechanism)
 452     This calling convention is used for the Control Flow Guard check function,
 453     calls to which can be inserted before indirect calls to check that the call
 454     target is a valid function address. The check function has no return value,
 455     but it will trigger an OS-level error if the address is not a valid target.
 456     The set of registers preserved by the check function, and the register
 457     containing the target address are architecture-specific.
 458
 459     - On X86 the target address is passed in ECX.
 460     - On ARM the target address is passed in R0.
 461     - On AArch64 the target address is passed in X15.
 462 "``cc <n>``" - Numbered convention
 463     Any calling convention may be specified by number, allowing
 464     target-specific calling conventions to be used. Target-specific
 465     calling conventions start at 64.
 466
 467 More calling conventions can be added/defined on an as-needed basis, to
 468 support Pascal conventions or any other well-known target-independent
 469 convention.
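
As a sketch of how a convention is attached to both a function and its call
sites (all function names are hypothetical), note that each ``call`` repeats
the convention of its callee, since a mismatch results in undefined behavior:

.. code-block:: llvm

    define fastcc i32 @fast_add(i32 %a, i32 %b) {
      %s = add i32 %a, %b
      ret i32 %s
    }

    declare coldcc void @report_error(i32)

    define i32 @caller(i32 %x) {          ; no keyword given, so this is ccc
    entry:
      %s = call fastcc i32 @fast_add(i32 %x, i32 1)
      %bad = icmp slt i32 %s, 0
      br i1 %bad, label %error, label %done

    error:
      call coldcc void @report_error(i32 %s)
      br label %done

    done:
      ret i32 %s
    }

    ; A self-recursive call in tail position under tailcc.
    define tailcc i32 @count_down(i32 %n) {
    entry:
      %z = icmp eq i32 %n, 0
      br i1 %z, label %base, label %rec

    base:
      ret i32 0

    rec:
      %m = sub i32 %n, 1
      %r = tail call tailcc i32 @count_down(i32 %m)
      ret i32 %r
    }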
 470
 471 .. _visibilitystyles:
 472
 473 Visibility Styles
 474 -----------------
 475
 476 All Global Variables and Functions have one of the following visibility
 477 styles:
 478
 479 "``default``" - Default style
 480     On targets that use the ELF object file format, default visibility
 481     means that the declaration is visible to other modules and, in
 482     shared libraries, means that the declared entity may be overridden.
 483     On Darwin, default visibility means that the declaration is visible
 484     to other modules. Default visibility corresponds to "external
 485     linkage" in the language.
 486 "``hidden``" - Hidden style
 487     Two declarations of an object with hidden visibility refer to the
 488     same object if they are in the same shared object. Usually, hidden
 489     visibility indicates that the symbol will not be placed into the
 490     dynamic symbol table, so no other module (executable or shared
 491     library) can reference it directly.
 492 "``protected``" - Protected style
 493     On ELF, protected visibility indicates that the symbol will be
 494     placed in the dynamic symbol table, but that references within the
 495     defining module will bind to the local symbol. That is, the symbol
 496     cannot be overridden by another module.
 497
 498 A symbol with ``internal`` or ``private`` linkage must have ``default``
 499 visibility.
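
A minimal sketch of the visibility keywords, using hypothetical names. The
keyword follows the linkage type, if any, and is omitted for default
visibility:

.. code-block:: llvm

    @exported_state = global i32 0          ; default visibility (implicit)
    @shared_state = hidden global i32 0     ; usually kept out of the dynamic symbol table
    @file_state = internal global i32 0     ; internal linkage keeps default visibility

    ; References from within the defining module bind to this definition.
    define protected i32 @get_state() {
      %v = load i32, i32* @shared_state
      ret i32 %v
    }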
 500
 501 .. _dllstorageclass:
 502
 503 DLL Storage Classes
 504 -------------------
 505
 506 All Global Variables, Functions and Aliases can have one of the following
 507 DLL storage classes:
 508
 509 ``dllimport``
 510     "``dllimport``" causes the compiler to reference a function or variable via
 511     a global pointer to a pointer that is set up by the DLL exporting the
 512     symbol. On Microsoft Windows targets, the pointer name is formed by
 513     combining ``__imp_`` and the function or variable name.
 514 ``dllexport``
 515     "``dllexport``" causes the compiler to provide a global pointer to a pointer
 516     in a DLL, so that it can be referenced with the ``dllimport`` attribute. On
 517     Microsoft Windows targets, the pointer name is formed by combining
 518     ``__imp_`` and the function or variable name. Since this storage class
 519     exists for defining a dll interface, the compiler, assembler and linker know
 520     it is externally referenced and must refrain from deleting the symbol.
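
The two storage classes can be sketched as follows. The names are
hypothetical, and the example is only meaningful when compiling for a Windows
target:

.. code-block:: llvm

    ; Symbols provided by another DLL.
    @imported_var = external dllimport global i32
    declare dllimport i32 @imported_fn(i32)

    ; Symbol made available to users of this DLL.
    define dllexport i32 @exported_fn() {
      %v = load i32, i32* @imported_var
      %r = call i32 @imported_fn(i32 %v)
      ret i32 %r
    }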
 521
 522 .. _tls_model:
 523
 524 Thread Local Storage Models
 525 ---------------------------
 526
 527 A variable may be defined as ``thread_local``, which means that it will
 528 not be shared by threads (each thread will have a separate copy of the
 529 variable). Not all targets support thread-local variables. Optionally, a
 530 TLS model may be specified:
 531
 532 ``localdynamic``
 533     For variables that are only used within the current shared library.
 534 ``initialexec``
 535     For variables in modules that will not be loaded dynamically.
 536 ``localexec``
 537     For variables defined in the executable and only used within it.
 538
 539 If no explicit model is given, the "general dynamic" model is used.
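
As a sketch with hypothetical variable names, the model is written in
parentheses after the ``thread_local`` keyword:

.. code-block:: llvm

    @tls_default = thread_local global i32 0                 ; general dynamic
    @tls_ld = thread_local(localdynamic) global i32 0
    @tls_ie = thread_local(initialexec) global i32 0
    @tls_le = thread_local(localexec) global i32 0

    define i32 @read_tls() {
      ; Each thread reads its own copy of the variable.
      %v = load i32, i32* @tls_le
      ret i32 %v
    }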
 540
 541 The models correspond to the ELF TLS models; see `ELF Handling For
 542 Thread-Local Storage <http://people.redhat.com/drepper/tls.pdf>`_ for
 543 more information on the circumstances under which the different models may
 544 be used. The target may choose a different TLS model if the specified
 545 model is not supported, or if a better choice of model can be made.
 546
 547 A model can also be specified in an alias, but then it only governs how
 548 the alias is accessed. It will not have any effect on the aliasee.
 549
 550 For platforms without linker support for the ELF TLS model, the
 551 ``-femulated-tls`` flag can be used to generate GCC-compatible emulated TLS code.
 552
 553 .. _runtime_preemption_model:
 554
 555 Runtime Preemption Specifiers
 556 -----------------------------
 557
 558 Global variables, functions and aliases may have an optional runtime preemption
 559 specifier. If a preemption specifier isn't given explicitly, then a
 560 symbol is assumed to be ``dso_preemptable``.
 561
 562 ``dso_preemptable``
 563     Indicates that the function or variable may be replaced by a symbol from
 564     outside the linkage unit at runtime.
 565
 566 ``dso_local``
 567     The compiler may assume that a function or variable marked as ``dso_local``
 568     will resolve to a symbol within the same linkage unit. Direct access will
 569     be generated even if the definition is not within this compilation unit.
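
A short sketch of both specifiers, using hypothetical names:

.. code-block:: llvm

    @local_data = dso_local global i32 0
    @replaceable = dso_preemptable global i32 0     ; same as writing no specifier

    ; Declared here, defined elsewhere in the same linkage unit.
    declare dso_local i32 @local_helper(i32)

    define i32 @use_local(i32 %x) {
      %r = call i32 @local_helper(i32 %x)
      ret i32 %r
    }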
 570
 571 .. _namedtypes:
 572
 573 Structure Types
 574 ---------------
 575
 576 LLVM IR allows you to specify both "identified" and "literal" :ref:`structure
 577 types <t_struct>`. Literal types are uniqued structurally, but identified types
 578 are never uniqued. An :ref:`opaque structural type <t_opaque>` can also be used
 579 to forward declare a type that is not yet available.
 580
 581 An example of an identified structure specification is:
 582
 583 .. code-block:: llvm
 584
 585     %mytype = type { %mytype*, i32 }
 586
 587 Prior to the LLVM 3.0 release, identified types were structurally uniqued. Only
 588 literal types are uniqued in recent versions of LLVM.
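
The three kinds of structure type can be sketched side by side (all names
are hypothetical):

.. code-block:: llvm

    %node = type { %node*, i32 }          ; identified, may refer to itself
    %handle = type opaque                 ; forward declaration without a body

    @origin = global { i32, i32 } { i32 0, i32 0 }   ; literal structure type
    @head = global %node { %node* null, i32 0 }

    declare void @close_handle(%handle*)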
 589
 590 .. _nointptrtype:
 591
 592 Non-Integral Pointer Type
 593 -------------------------
 594
 595 Note: non-integral pointer types are a work in progress, and they should be
 596 considered experimental at this time.
 597
 598 LLVM IR optionally allows the frontend to denote pointers in certain address
 599 spaces as "non-integral" via the :ref:`datalayout string<langref_datalayout>`.
 600 Non-integral pointer types represent pointers that have an *unspecified* bitwise
 601 representation; that is, the integral representation may be target dependent or
 602 unstable (not backed by a fixed integer).
 603
 604 ``inttoptr`` and ``ptrtoint`` instructions have the same semantics as for
 605 integral (i.e. normal) pointers in that they convert integers to and from
 606 corresponding pointer types, but there are additional implications to be
 607 aware of.  Because the bit-representation of a non-integral pointer may
 608 not be stable, two identical casts of the same operand may or may not
 609 return the same value.  Said differently, the conversion to or from the
 610 non-integral type depends on environmental state in an implementation
 611 defined manner.
 612
 613 If the frontend wishes to observe a *particular* value following a cast, the
 614 generated IR must fence with the underlying environment in an implementation
 615 defined manner. (In practice, this tends to require ``noinline`` routines for
 616 such operations.)
 617
 618 From the perspective of the optimizer, ``inttoptr`` and ``ptrtoint`` for
 619 non-integral types are analogous to ones on integral types with one
 620 key exception: the optimizer may not, in general, insert new dynamic
 621 occurrences of such casts.  If a new cast is inserted, the optimizer would
 622 need to either ensure that a) all possible values are valid, or b)
 623 appropriate fencing is inserted.  Since the appropriate fencing is
 624 implementation defined, the optimizer can't do the latter.  The former is
 625 challenging as many commonly expected properties, such as
 626 ``ptrtoint(v)-ptrtoint(v) == 0``, don't hold for non-integral types.
 627
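The following sketch illustrates the last point. It assumes, purely for
illustration, a datalayout that marks address space 1 as non-integral; the
function name ``@diff`` is hypothetical:

.. code-block:: llvm

    ; Assumption: address space 1 is declared non-integral.
    target datalayout = "e-ni:1"

    define i64 @diff(i8 addrspace(1)* %v) {
      %a = ptrtoint i8 addrspace(1)* %v to i64
      %b = ptrtoint i8 addrspace(1)* %v to i64
      ; For an integral pointer this difference is always 0. For a
      ; non-integral pointer the two casts may observe different
      ; representations, so the result need not be 0.
      %d = sub i64 %a, %b
      ret i64 %d
    }
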
 628 .. _globalvars:
 629
 630 Global Variables
 631 ----------------
 632
 633 Global variables define regions of memory allocated at compilation time
 634 instead of run-time.
 635
 636 Global variable definitions must be initialized.
 637
 638 Global variables in other translation units can also be declared, in which
 639 case they don't have an initializer.
 640
 641 Global variables can optionally specify a :ref:`linkage type <linkage>`.
 642
 643 Either global variable definitions or declarations may have an explicit section
 644 to be placed in and may have an optional explicit alignment specified. If there
 645 is a mismatch between the explicit or inferred section information for the
 646 variable declaration and its definition the resulting behavior is undefined.
 647
 648 A variable may be defined as a global ``constant``, which indicates that
 649 the contents of the variable will **never** be modified (enabling better
 650 optimization, allowing the global data to be placed in the read-only
 651 section of an executable, etc). Note that variables that need runtime
 652 initialization cannot be marked ``constant`` as there is a store to the
 653 variable.
 654
 655 LLVM explicitly allows *declarations* of global variables to be marked
 656 constant, even if the final definition of the global is not. This
 657 capability can be used to enable slightly better optimization of the
 658 program, but requires the language definition to guarantee that
 659 optimizations based on the 'constantness' are valid for the translation
 660 units that do not include the definition.
 661
 662 As SSA values, global variables define pointer values that are in scope
 663 (i.e. they dominate) all basic blocks in the program. Global variables
 664 always define a pointer to their "content" type because they describe a
 665 region of memory, and all memory objects in LLVM are accessed through
 666 pointers.
 667
 668 Global variables can be marked with ``unnamed_addr`` which indicates
 669 that the address is not significant, only the content. Constants marked
 670 like this can be merged with other constants if they have the same
 671 initializer. Note that a constant with significant address *can* be
 672 merged with a ``unnamed_addr`` constant, the result being a constant
 673 whose address is significant.
 674
 675 If the ``local_unnamed_addr`` attribute is given, the address is known to
 676 not be significant within the module.
 677
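A minimal sketch of both attributes on global constants is shown here; the
global names are hypothetical:

.. code-block:: llvm

    ; Only the contents matter, so these two constants may be merged.
    @.str.a = private unnamed_addr constant [3 x i8] c"hi\00"
    @.str.b = private unnamed_addr constant [3 x i8] c"hi\00"

    ; The address is known not to be significant within this module only.
    @table = local_unnamed_addr constant [2 x i32] [i32 1, i32 2]
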
 678 A global variable may be declared to reside in a target-specific
 679 numbered address space. For targets that support them, address spaces
 680 may affect how optimizations are performed and/or what target
 681 instructions are used to access the variable. The default address space
 682 is zero. The address space qualifier must precede any other attributes.
 683
 684 LLVM allows an explicit section to be specified for globals. If the
 685 target supports it, it will emit globals to the section specified.
Additionally, the global can be placed in a comdat if the target has the necessary
support.
 688
 689 External declarations may have an explicit section specified. Section
 690 information is retained in LLVM IR for targets that make use of this
 691 information. Attaching section information to an external declaration is an
 692 assertion that its definition is located in the specified section. If the
 693 definition is located in a different section, the behavior is undefined.
 694
 695 By default, global initializers are optimized by assuming that global
 696 variables defined within the module are not modified from their
 697 initial values before the start of the global initializer. This is
 698 true even for variables potentially accessible from outside the
 699 module, including those with external linkage or appearing in
 700 ``@llvm.used`` or dllexported variables. This assumption may be suppressed
 701 by marking the variable with ``externally_initialized``.
 702
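As a hedged illustration, the two definitions below differ only in whether
that assumption applies. The names ``@plain`` and ``@patched`` are
hypothetical:

.. code-block:: llvm

    ; Ordinary definition: optimizations of global initializers may assume
    ; this variable still holds 0 when the initializers start running.
    @plain = internal global i32 0

    ; The environment may change this value before that point, so the
    ; initial value is not assumed.
    @patched = externally_initialized global i32 0
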
 703 An explicit alignment may be specified for a global, which must be a
 704 power of 2. If not present, or if the alignment is set to zero, the
 705 alignment of the global is set by the target to whatever it feels
 706 convenient. If an explicit alignment is specified, the global is forced
 707 to have exactly that alignment. Targets and optimizers are not allowed
 708 to over-align the global if the global has an assigned section. In this
 709 case, the extra alignment could be observable: for example, code could
 710 assume that the globals are densely packed in their section and try to
iterate over them as an array; alignment padding would break this
 712 iteration. The maximum alignment is ``1 << 32``.
 713
For global variable declarations, as well as definitions that may be
 715 replaced at link time (``linkonce``, ``weak``, ``extern_weak`` and ``common``
 716 linkage types), LLVM makes no assumptions about the allocation size of the
 717 variables, except that they may not overlap. The alignment of a global variable
 718 declaration or replaceable definition must not be greater than the alignment of
 719 the definition it resolves to.
 720
 721 Globals can also have a :ref:`DLL storage class <dllstorageclass>`,
 722 an optional :ref:`runtime preemption specifier <runtime_preemption_model>`,
optional :ref:`global attributes <glattrs>`, and
 724 an optional list of attached :ref:`metadata <metadata>`.
 725
 726 Variables and aliases can have a
 727 :ref:`Thread Local Storage Model <tls_model>`.
 728
 729 :ref:`Scalable vectors <t_vector>` cannot be global variables or members of
 730 arrays because their size is unknown at compile time. They are allowed in
 731 structs to facilitate intrinsics returning multiple values. Structs containing
 732 scalable vectors cannot be used in loads, stores, allocas, or GEPs.
 733
 734 Syntax::
 735
 736       @<GlobalVarName> = [Linkage] [PreemptionSpecifier] [Visibility]
 737                          [DLLStorageClass] [ThreadLocal]
 738                          [(unnamed_addr|local_unnamed_addr)] [AddrSpace]
 739                          [ExternallyInitialized]
 740                          <global | constant> <Type> [<InitializerConstant>]
 741                          [, section "name"] [, partition "name"]
 742                          [, comdat [($name)]] [, align <Alignment>]
 743                          (, !name !N)*
 744
 745 For example, the following defines a global in a numbered address space
 746 with an initializer, section, and alignment:
 747
 748 .. code-block:: llvm
 749
 750     @G = addrspace(5) constant float 1.0, section "foo", align 4
 751
The following example just declares a global variable:
 753
 754 .. code-block:: llvm
 755
 756    @G = external global i32
 757
 758 The following example defines a thread-local global with the
 759 ``initialexec`` TLS model:
 760
 761 .. code-block:: llvm
 762
 763     @G = thread_local(initialexec) global i32 0, align 4
 764
 765 .. _functionstructure:
 766
 767 Functions
 768 ---------
 769
 770 LLVM function definitions consist of the "``define``" keyword, an
 771 optional :ref:`linkage type <linkage>`, an optional :ref:`runtime preemption
 772 specifier <runtime_preemption_model>`,  an optional :ref:`visibility
 773 style <visibility>`, an optional :ref:`DLL storage class <dllstorageclass>`,
 774 an optional :ref:`calling convention <callingconv>`,
 775 an optional ``unnamed_addr`` attribute, a return type, an optional
 776 :ref:`parameter attribute <paramattrs>` for the return type, a function
 777 name, a (possibly empty) argument list (each with optional :ref:`parameter
 778 attributes <paramattrs>`), optional :ref:`function attributes <fnattrs>`,
 779 an optional address space, an optional section, an optional alignment,
 780 an optional :ref:`comdat <langref_comdats>`,
 781 an optional :ref:`garbage collector name <gc>`, an optional :ref:`prefix <prefixdata>`,
 782 an optional :ref:`prologue <prologuedata>`,
 783 an optional :ref:`personality <personalityfn>`,
 784 an optional list of attached :ref:`metadata <metadata>`,
 785 an opening curly brace, a list of basic blocks, and a closing curly brace.
 786
 787 LLVM function declarations consist of the "``declare``" keyword, an
 788 optional :ref:`linkage type <linkage>`, an optional :ref:`visibility style
 789 <visibility>`, an optional :ref:`DLL storage class <dllstorageclass>`, an
 790 optional :ref:`calling convention <callingconv>`, an optional ``unnamed_addr``
 791 or ``local_unnamed_addr`` attribute, an optional address space, a return type,
 792 an optional :ref:`parameter attribute <paramattrs>` for the return type, a function name, a possibly
 793 empty list of arguments, an optional alignment, an optional :ref:`garbage
 794 collector name <gc>`, an optional :ref:`prefix <prefixdata>`, and an optional
 795 :ref:`prologue <prologuedata>`.
 796
 797 A function definition contains a list of basic blocks, forming the CFG (Control
 798 Flow Graph) for the function. Each basic block may optionally start with a label
 799 (giving the basic block a symbol table entry), contains a list of instructions,
 800 and ends with a :ref:`terminator <terminators>` instruction (such as a branch or
 801 function return). If an explicit label name is not provided, a block is assigned
 802 an implicit numbered label, using the next value from the same counter as used
 803 for unnamed temporaries (:ref:`see above<identifiers>`). For example, if a
 804 function entry block does not have an explicit label, it will be assigned label
 805 "%0", then the first unnamed temporary in that block will be "%1", etc. If a
 806 numeric label is explicitly specified, it must match the numeric label that
 807 would be used implicitly.
 808
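The numbering rule can be sketched with a function whose blocks and
temporaries are all unnamed. The function name ``@inc`` is hypothetical:

.. code-block:: llvm

    define i32 @inc(i32 %x) {
      ; The entry block has no explicit label, so it is implicitly %0.
      %1 = add i32 %x, 1
      br label %2

    2:                        ; explicit numeric label, matches the implicit one
      ret i32 %1
    }
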
 809 The first basic block in a function is special in two ways: it is
 810 immediately executed on entrance to the function, and it is not allowed
 811 to have predecessor basic blocks (i.e. there can not be any branches to
 812 the entry block of a function). Because the block can have no
 813 predecessors, it also cannot have any :ref:`PHI nodes <i_phi>`.
 814
 815 LLVM allows an explicit section to be specified for functions. If the
 816 target supports it, it will emit functions to the section specified.
 817 Additionally, the function can be placed in a COMDAT.
 818
 819 An explicit alignment may be specified for a function. If not present,
 820 or if the alignment is set to zero, the alignment of the function is set
 821 by the target to whatever it feels convenient. If an explicit alignment
 822 is specified, the function is forced to have at least that much
 823 alignment. All alignments must be a power of 2.
 824
 825 If the ``unnamed_addr`` attribute is given, the address is known to not
 826 be significant and two identical functions can be merged.
 827
 828 If the ``local_unnamed_addr`` attribute is given, the address is known to
 829 not be significant within the module.
 830
 831 If an explicit address space is not given, it will default to the program
 832 address space from the :ref:`datalayout string<langref_datalayout>`.
 833
 834 Syntax::
 835
 836     define [linkage] [PreemptionSpecifier] [visibility] [DLLStorageClass]
 837            [cconv] [ret attrs]
 838            <ResultType> @<FunctionName> ([argument list])
 839            [(unnamed_addr|local_unnamed_addr)] [AddrSpace] [fn Attrs]
 840            [section "name"] [partition "name"] [comdat [($name)]] [align N]
 841            [gc] [prefix Constant] [prologue Constant] [personality Constant]
 842            (!name !N)* { ... }
 843
 844 The argument list is a comma separated sequence of arguments where each
 845 argument is of the following form:
 846
 847 Syntax::
 848
 849    <type> [parameter Attrs] [name]
 850
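As an illustrative sketch of the syntax above, here is one declaration and one
definition that use several of the optional components. The function names,
the section name and the chosen attributes are assumptions made only for this
example:

.. code-block:: llvm

    ; Declaration: no body, and parameter names are optional.
    declare i32 @puts(i8* nocapture) nounwind

    ; Definition: linkage, calling convention, unnamed_addr, a function
    ; attribute, a section and an alignment, in the order given by the syntax.
    define internal fastcc i32 @add(i32 %a, i32 %b) unnamed_addr nounwind
        section ".text.hot" align 16 {
    entry:
      %sum = add i32 %a, %b
      ret i32 %sum
    }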
 851
 852 .. _langref_aliases:
 853
 854 Aliases
 855 -------
 856
Aliases, unlike functions or variables, don't create any new data. They
 858 are just a new symbol and metadata for an existing position.
 859
 860 Aliases have a name and an aliasee that is either a global value or a
 861 constant expression.
 862
 863 Aliases may have an optional :ref:`linkage type <linkage>`, an optional
 864 :ref:`runtime preemption specifier <runtime_preemption_model>`, an optional
 865 :ref:`visibility style <visibility>`, an optional :ref:`DLL storage class
 866 <dllstorageclass>` and an optional :ref:`tls model <tls_model>`.
 867
 868 Syntax::
 869
 870     @<Name> = [Linkage] [PreemptionSpecifier] [Visibility] [DLLStorageClass] [ThreadLocal] [(unnamed_addr|local_unnamed_addr)] alias <AliaseeTy>, <AliaseeTy>* @<Aliasee>
 871               [, partition "name"]
 872
 873 The linkage must be one of ``private``, ``internal``, ``linkonce``, ``weak``,
 874 ``linkonce_odr``, ``weak_odr``, ``external``. Note that some system linkers
 875 might not correctly handle dropping a weak symbol that is aliased.
 876
 877 Aliases that are not ``unnamed_addr`` are guaranteed to have the same address as
 878 the aliasee expression. ``unnamed_addr`` ones are only guaranteed to point
 879 to the same content.
 880
 881 If the ``local_unnamed_addr`` attribute is given, the address is known to
 882 not be significant within the module.
 883
 884 Since aliases are only a second name, some restrictions apply, of which
 885 some can only be checked when producing an object file:
 886
 887 * The expression defining the aliasee must be computable at assembly
 888   time. Since it is just a name, no relocations can be used.
 889
 890 * No alias in the expression can be weak as the possibility of the
 891   intermediate alias being overridden cannot be represented in an
 892   object file.
 893
 894 * No global value in the expression can be a declaration, since that
 895   would require a relocation, which is not possible.
 896
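A minimal sketch of alias definitions that satisfy these restrictions is shown
here. All names are hypothetical, and both aliasees are defined in the same
module:

.. code-block:: llvm

    @var = global i32 7

    define void @impl() {
      ret void
    }

    ; A second name for @var; it has the same address as @var.
    @var_alias = alias i32, i32* @var

    ; A weak alias for a function defined in this module.
    @api = weak alias void (), void ()* @impl
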
 897 .. _langref_ifunc:
 898
 899 IFuncs
 900 -------
 901
IFuncs, like aliases, don't create any new data or functions. They are just a new
symbol that the dynamic linker resolves at runtime by calling a resolver function.

IFuncs have a name and a resolver, which is a function called by the dynamic linker
that returns the address of another function associated with the name.

An IFunc may have an optional :ref:`linkage type <linkage>` and an optional
 909 :ref:`visibility style <visibility>`.
 910
 911 Syntax::
 912
 913     @<Name> = [Linkage] [PreemptionSpecifier] [Visibility] ifunc <IFuncTy>, <ResolverTy>* @<Resolver>
 914               [, partition "name"]
 915
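The following is a sketch of an IFunc whose resolver always selects a single
implementation. The names ``@foo``, ``@foo_impl`` and ``@foo_resolver`` are
hypothetical; a real resolver would typically choose between several
implementations based on properties of the running system:

.. code-block:: llvm

    define internal i32 @foo_impl(i32 %x) {
      ret i32 %x
    }

    ; Called by the dynamic linker; returns the function that @foo resolves to.
    define internal i32 (i32)* @foo_resolver() {
      ret i32 (i32)* @foo_impl
    }

    @foo = ifunc i32 (i32), i32 (i32)* ()* @foo_resolver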
 916
 917 .. _langref_comdats:
 918
 919 Comdats
 920 -------
 921
 922 Comdat IR provides access to object file COMDAT/section group functionality
 923 which represents interrelated sections.
 924
 925 Comdats have a name which represents the COMDAT key and a selection kind to
 926 provide input on how the linker deduplicates comdats with the same key in two
 927 different object files. A comdat must be included or omitted as a unit.
 928 Discarding the whole comdat is allowed but discarding a subset is not.
 929
 930 A global object may be a member of at most one comdat. Aliases are placed in the
 931 same COMDAT that their aliasee computes to, if any.
 932
 933 Syntax::
 934
 935     $<Name> = comdat SelectionKind
 936
 937 For selection kinds other than ``nodeduplicate``, only one of the duplicate
 938 comdats may be retained by the linker and the members of the remaining comdats
 939 must be discarded. The following selection kinds are supported:
 940
 941 ``any``
    The linker may choose any COMDAT key; the choice is arbitrary.
 943 ``exactmatch``
 944     The linker may choose any COMDAT key but the sections must contain the
 945     same data.
 946 ``largest``
 947     The linker will choose the section containing the largest COMDAT key.
 948 ``nodeduplicate``
 949     No deduplication is performed.
 950 ``samesize``
 951     The linker may choose any COMDAT key but the sections must contain the
 952     same amount of data.
 953
 954 - XCOFF and Mach-O don't support COMDATs.
 955 - COFF supports all selection kinds. Non-``nodeduplicate`` selection kinds need
 956   a non-local linkage COMDAT symbol.
 957 - ELF supports ``any`` and ``nodeduplicate``.
 958 - WebAssembly only supports ``any``.
 959
 960 Here is an example of a COFF COMDAT where a function will only be selected if
 961 the COMDAT key's section is the largest:
 962
 963 .. code-block:: text
 964
 965    $foo = comdat largest
 966    @foo = global i32 2, comdat($foo)
 967
 968    define void @bar() comdat($foo) {
 969      ret void
 970    }
 971
 972 In a COFF object file, this will create a COMDAT section with selection kind
 973 ``IMAGE_COMDAT_SELECT_LARGEST`` containing the contents of the ``@foo`` symbol
 974 and another COMDAT section with selection kind
 975 ``IMAGE_COMDAT_SELECT_ASSOCIATIVE`` which is associated with the first COMDAT
 976 section and contains the contents of the ``@bar`` symbol.
 977
As syntactic sugar, the ``$name`` can be omitted if the name is the same as
 979 the global name:
 980
 981 .. code-block:: llvm
 982
 983   $foo = comdat any
 984   @foo = global i32 2, comdat
 985   @bar = global i32 3, comdat($foo)
 986
 987 There are some restrictions on the properties of the global object.
 988 It, or an alias to it, must have the same name as the COMDAT group when
 989 targeting COFF.
 990 The contents and size of this object may be used during link-time to determine
 991 which COMDAT groups get selected depending on the selection kind.
 992 Because the name of the object must match the name of the COMDAT group, the
 993 linkage of the global object must not be local; local symbols can get renamed
 994 if a collision occurs in the symbol table.
 995
The combined use of COMDATs and section attributes may yield surprising results.
 997 For example:
 998
 999 .. code-block:: llvm
1000
1001    $foo = comdat any
1002    $bar = comdat any
1003    @g1 = global i32 42, section "sec", comdat($foo)
1004    @g2 = global i32 42, section "sec", comdat($bar)
1005
1006 From the object file perspective, this requires the creation of two sections
1007 with the same name. This is necessary because both globals belong to different
1008 COMDAT groups and COMDATs, at the object file level, are represented by
1009 sections.
1010
1011 Note that certain IR constructs like global variables and functions may
1012 create COMDATs in the object file in addition to any which are specified using
1013 COMDAT IR. This arises when the code generator is configured to emit globals
in individual sections (e.g. when ``-data-sections`` or ``-function-sections``
is supplied to ``llc``).
1016
1017 .. _namedmetadatastructure:
1018
1019 Named Metadata
1020 --------------
1021
1022 Named metadata is a collection of metadata. :ref:`Metadata
1023 nodes <metadata>` (but not metadata strings) are the only valid
1024 operands for a named metadata.
1025
1026 #. Named metadata are represented as a string of characters with the
1027    metadata prefix. The rules for metadata names are the same as for
1028    identifiers, but quoted names are not allowed. ``"\xx"`` type escapes
1029    are still valid, which allows any character to be part of a name.
1030
1031 Syntax::
1032
1033     ; Some unnamed metadata nodes, which are referenced by the named metadata.
1034     !0 = !{!"zero"}
1035     !1 = !{!"one"}
1036     !2 = !{!"two"}
1037     ; A named metadata.
1038     !name = !{!0, !1, !2}
1039
1040 .. _paramattrs:
1041
1042 Parameter Attributes
1043 --------------------
1044
1045 The return type and each parameter of a function type may have a set of
1046 *parameter attributes* associated with them. Parameter attributes are
1047 used to communicate additional information about the result or
1048 parameters of a function. Parameter attributes are considered to be part
1049 of the function, not of the function type, so functions with different
1050 parameter attributes can have the same function type.
1051
1052 Parameter attributes are simple keywords that follow the type specified.
1053 If multiple parameter attributes are needed, they are space separated.
1054 For example:
1055
1056 .. code-block:: llvm
1057
1058     declare i32 @printf(i8* noalias nocapture, ...)
1059     declare i32 @atoi(i8 zeroext)
1060     declare signext i8 @returns_signed_char()
1061
Note that attributes for the function result (such as ``signext``) come
before the result type, while function attributes (such as ``nounwind`` or
``readonly``) come immediately after the argument list.
1064
1065 Currently, only the following parameter attributes are defined:
1066
1067 ``zeroext``
1068     This indicates to the code generator that the parameter or return
1069     value should be zero-extended to the extent required by the target's
1070     ABI by the caller (for a parameter) or the callee (for a return value).
1071 ``signext``
1072     This indicates to the code generator that the parameter or return
1073     value should be sign-extended to the extent required by the target's
    ABI (which is usually 32 bits) by the caller (for a parameter) or
1075     the callee (for a return value).
1076 ``inreg``
1077     This indicates that this parameter or return value should be treated
1078     in a special target-dependent fashion while emitting code for
1079     a function call or return (usually, by putting it in a register as
1080     opposed to memory, though some targets use it to distinguish between
1081     two different kinds of registers). Use of this attribute is
1082     target-specific.
1083 ``byval(<ty>)``
1084     This indicates that the pointer parameter should really be passed by
1085     value to the function. The attribute implies that a hidden copy of
1086     the pointee is made between the caller and the callee, so the callee
1087     is unable to modify the value in the caller. This attribute is only
1088     valid on LLVM pointer arguments. It is generally used to pass
1089     structs and arrays by value, but is also valid on pointers to
1090     scalars. The copy is considered to belong to the caller not the
1091     callee (for example, ``readonly`` functions should not write to
1092     ``byval`` parameters). This is not a valid attribute for return
1093     values.
1094
1095     The byval type argument indicates the in-memory value type, and
1096     must be the same as the pointee type of the argument.
1097
1098     The byval attribute also supports specifying an alignment with the
1099     align attribute. It indicates the alignment of the stack slot to
1100     form and the known alignment of the pointer specified to the call
1101     site. If the alignment is not specified, then the code generator
1102     makes a target-specific assumption.
1103
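As a hedged sketch of ``byval`` at a declaration and at a call site, with a
hypothetical struct type and hypothetical function names:

.. code-block:: llvm

    %struct.Point = type { i32, i32 }

    declare void @draw(%struct.Point* byval(%struct.Point) align 4)

    define void @caller(%struct.Point* %p) {
      ; The callee receives a hidden copy of the pointee of %p, so it cannot
      ; modify the caller's struct.
      call void @draw(%struct.Point* byval(%struct.Point) align 4 %p)
      ret void
    }
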
1104 .. _attr_byref:
1105
1106 ``byref(<ty>)``
1107
1108     The ``byref`` argument attribute allows specifying the pointee
1109     memory type of an argument. This is similar to ``byval``, but does
1110     not imply a copy is made anywhere, or that the argument is passed
1111     on the stack. This implies the pointer is dereferenceable up to
1112     the storage size of the type.
1113
    It is not generally permissible to introduce a write to a
1115     ``byref`` pointer. The pointer may have any address space and may
1116     be read only.
1117
1118     This is not a valid attribute for return values.
1119
    The alignment for a ``byref`` parameter can be explicitly
1121     specified by combining it with the ``align`` attribute, similar to
1122     ``byval``. If the alignment is not specified, then the code generator
1123     makes a target-specific assumption.
1124
1125     This is intended for representing ABI constraints, and is not
1126     intended to be inferred for optimization use.
1127
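A minimal sketch of a ``byref`` parameter follows. The struct type, the
function name and the choice of address space 4 are assumptions made only for
this example:

.. code-block:: llvm

    %struct.Args = type { i64, i64 }

    ; The argument points to a %struct.Args object; no copy is implied, and
    ; the pointer may live in any address space.
    declare void @kernel(%struct.Args addrspace(4)* byref(%struct.Args) align 8)
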
1128 .. _attr_preallocated:
1129
1130 ``preallocated(<ty>)``
1131     This indicates that the pointer parameter should really be passed by
1132     value to the function, and that the pointer parameter's pointee has
1133     already been initialized before the call instruction. This attribute
1134     is only valid on LLVM pointer arguments. The argument must be the value
1135     returned by the appropriate
1136     :ref:`llvm.call.preallocated.arg<int_call_preallocated_arg>` on non
1137     ``musttail`` calls, or the corresponding caller parameter in ``musttail``
1138     calls, although it is ignored during codegen.
1139
1140     A non ``musttail`` function call with a ``preallocated`` attribute in
1141     any parameter must have a ``"preallocated"`` operand bundle. A ``musttail``
1142     function call cannot have a ``"preallocated"`` operand bundle.
1143
1144     The preallocated attribute requires a type argument, which must be
1145     the same as the pointee type of the argument.
1146
1147     The preallocated attribute also supports specifying an alignment with the
1148     align attribute. It indicates the alignment of the stack slot to
1149     form and the known alignment of the pointer specified to the call
1150     site. If the alignment is not specified, then the code generator
1151     makes a target-specific assumption.
1152
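The sketch below shows how a non-``musttail`` call with one ``preallocated``
argument might be set up. The struct type and the functions ``@init`` and
``@consume`` are hypothetical:

.. code-block:: llvm

    %struct.Big = type { i32, i32 }

    declare token @llvm.call.preallocated.setup(i32)
    declare i8* @llvm.call.preallocated.arg(token, i32)
    declare void @init(%struct.Big*)
    declare void @consume(%struct.Big* preallocated(%struct.Big))

    define void @caller() {
      ; Set up a call that has one preallocated argument.
      %t = call token @llvm.call.preallocated.setup(i32 1)
      %raw = call i8* @llvm.call.preallocated.arg(token %t, i32 0) preallocated(%struct.Big)
      %arg = bitcast i8* %raw to %struct.Big*
      ; Initialize the argument in place before the call.
      call void @init(%struct.Big* %arg)
      call void @consume(%struct.Big* preallocated(%struct.Big) %arg) ["preallocated"(token %t)]
      ret void
    }
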
1153 .. _attr_inalloca:
1154
1155 ``inalloca(<ty>)``
1156
1157     The ``inalloca`` argument attribute allows the caller to take the
1158     address of outgoing stack arguments. An ``inalloca`` argument must
1159     be a pointer to stack memory produced by an ``alloca`` instruction.
1160     The alloca, or argument allocation, must also be tagged with the
1161     inalloca keyword. Only the last argument may have the ``inalloca``
1162     attribute, and that argument is guaranteed to be passed in memory.
1163
1164     An argument allocation may be used by a call at most once because
1165     the call may deallocate it. The ``inalloca`` attribute cannot be
1166     used in conjunction with other attributes that affect argument
1167     storage, like ``inreg``, ``nest``, ``sret``, or ``byval``. The
1168     ``inalloca`` attribute also disables LLVM's implicit lowering of
1169     large aggregate return values, which means that frontend authors
1170     must lower them with ``sret`` pointers.
1171
1172     When the call site is reached, the argument allocation must have
1173     been the most recent stack allocation that is still live, or the
1174     behavior is undefined. It is possible to allocate additional stack
1175     space after an argument allocation and before its call site, but it
1176     must be cleared off with :ref:`llvm.stackrestore
1177     <int_stackrestore>`.
1178
1179     The inalloca attribute requires a type argument, which must be the
1180     same as the pointee type of the argument.
1181
1182     See :doc:`InAlloca` for more information on how to use this
1183     attribute.
1184
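As an illustrative sketch, a call with an ``inalloca`` argument might look
like the following. The types ``%struct.Foo`` and ``%frame`` and the function
``@consume`` are hypothetical:

.. code-block:: llvm

    %struct.Foo = type { i32, i32 }
    %frame = type <{ %struct.Foo }>

    declare i8* @llvm.stacksave()
    declare void @llvm.stackrestore(i8*)
    declare void @consume(%frame* inalloca(%frame))

    define void @caller() {
      %sp = call i8* @llvm.stacksave()
      ; The argument allocation is tagged with the inalloca keyword.
      %args = alloca inalloca %frame
      %foo = getelementptr inbounds %frame, %frame* %args, i32 0, i32 0
      %x = getelementptr inbounds %struct.Foo, %struct.Foo* %foo, i32 0, i32 0
      store i32 1, i32* %x
      ; %args is the most recent live stack allocation at this call site.
      call void @consume(%frame* inalloca(%frame) %args)
      call void @llvm.stackrestore(i8* %sp)
      ret void
    }
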
1185 ``sret(<ty>)``
1186     This indicates that the pointer parameter specifies the address of a
1187     structure that is the return value of the function in the source
1188     program. This pointer must be guaranteed by the caller to be valid:
1189     loads and stores to the structure may be assumed by the callee not
1190     to trap and to be properly aligned. This is not a valid attribute
1191     for return values.
1192
    The sret type argument specifies the in-memory type, which must be
1194     the same as the pointee type of the argument.
1195
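A minimal sketch of a function that returns a structure through an ``sret``
pointer, using a hypothetical struct type and function name:

.. code-block:: llvm

    %struct.Pair = type { i64, i64 }

    define void @make_pair(%struct.Pair* sret(%struct.Pair) align 8 %out) {
      %first = getelementptr inbounds %struct.Pair, %struct.Pair* %out, i32 0, i32 0
      %second = getelementptr inbounds %struct.Pair, %struct.Pair* %out, i32 0, i32 1
      ; The caller guarantees that %out is valid, so these stores cannot trap.
      store i64 1, i64* %first
      store i64 2, i64* %second
      ret void
    }
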
1196 .. _attr_elementtype:
1197
1198 ``elementtype(<ty>)``
1199
1200     The ``elementtype`` argument attribute can be used to specify a pointer
1201     element type in a way that is compatible with `opaque pointers
1202     <OpaquePointers.html>`__.
1203
1204     The ``elementtype`` attribute by itself does not carry any specific
1205     semantics. However, certain intrinsics may require this attribute to be
1206     present and assign it particular semantics. This will be documented on
1207     individual intrinsics.
1208
1209     The attribute may only be applied to pointer typed arguments of intrinsic
1210     calls. It cannot be applied to non-intrinsic calls, and cannot be applied
1211     to parameters on function declarations. For non-opaque pointers, the type
1212     passed to ``elementtype`` must match the pointer element type.
1213
1214 .. _attr_align:
1215
1216 ``align <n>`` or ``align(<n>)``
1217     This indicates that the pointer value or vector of pointers has the
1218     specified alignment. If applied to a vector of pointers, *all* pointers
1219     (elements) have the specified alignment. If the pointer value does not have
    the specified alignment, a :ref:`poison value <poisonvalues>` is returned or
1221     passed instead.  The ``align`` attribute should be combined with the
1222     ``noundef`` attribute to ensure a pointer is aligned, or otherwise the
1223     behavior is undefined. Note that ``align 1`` has no effect on non-byval,
1224     non-preallocated arguments.
1225
1226     Note that this attribute has additional semantics when combined with the
1227     ``byval`` or ``preallocated`` attribute, which are documented there.
1228
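The difference that ``noundef`` makes can be sketched with two hypothetical
declarations:

.. code-block:: llvm

    ; If the pointer is not 16-byte aligned, the callee receives poison.
    declare void @maybe_aligned(i32* align 16 %p)

    ; With noundef, passing a misaligned pointer is undefined behavior.
    declare void @must_be_aligned(i32* noundef align 16 %p)
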
1229 .. _noalias:
1230
1231 ``noalias``
1232     This indicates that memory locations accessed via pointer values
1233     :ref:`based <pointeraliasing>` on the argument or return value are not also
1234     accessed, during the execution of the function, via pointer values not
1235     *based* on the argument or return value. This guarantee only holds for
1236     memory locations that are *modified*, by any means, during the execution of
1237     the function. The attribute on a return value also has additional semantics
1238     described below. The caller shares the responsibility with the callee for
1239     ensuring that these requirements are met.  For further details, please see
1240     the discussion of the NoAlias response in :ref:`alias analysis <Must, May,
1241     or No>`.
1242
1243     Note that this definition of ``noalias`` is intentionally similar
1244     to the definition of ``restrict`` in C99 for function arguments.
1245
1246     For function return values, C99's ``restrict`` is not meaningful,
1247     while LLVM's ``noalias`` is. Furthermore, the semantics of the ``noalias``
1248     attribute on return values are stronger than the semantics of the attribute
1249     when used on function arguments. On function return values, the ``noalias``
1250     attribute indicates that the function acts like a system memory allocation
1251     function, returning a pointer to allocated storage disjoint from the
1252     storage for any other object accessible to the caller.
1253
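As a hedged illustration of both uses, with hypothetical function names:

.. code-block:: llvm

    ; On a return value: the function behaves like an allocator and returns
    ; storage disjoint from any other object accessible to the caller.
    declare noalias i8* @my_alloc(i64)

    ; On arguments: memory written through %dst is not also accessed through
    ; pointers that are not based on %dst during the execution of @scale.
    define void @scale(float* noalias %dst, float* noalias %src) {
      %v = load float, float* %src
      %r = fmul float %v, 2.0
      store float %r, float* %dst
      ret void
    }
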
1254 .. _nocapture:
1255
1256 ``nocapture``
1257     This indicates that the callee does not :ref:`capture <pointercapture>` the
1258     pointer. This is not a valid attribute for return values.
1259     This attribute applies only to the particular copy of the pointer passed in
1260     this argument. A caller could pass two copies of the same pointer with one
    being annotated ``nocapture`` and the other not, and the callee could validly
    capture through the non-annotated parameter.
1263
1264 .. code-block:: llvm
1265
1266     define void @f(i8* nocapture %a, i8* %b) {
1267       ; (capture %b)
1268     }
1269
1270     call void @f(i8* @glb, i8* @glb) ; well-defined
1271
1272 ``nofree``
1273     This indicates that the callee does not free the pointer argument. This is
1274     not a valid attribute for return values.
1275
1276 .. _nest:
1277
1278 ``nest``
1279     This indicates that the pointer parameter can be excised using the
1280     :ref:`trampoline intrinsics <int_trampoline>`. This is not a valid
1281     attribute for return values and can only be applied to one parameter.
1282
1283 ``returned``
1284     This indicates that the function always returns the argument as its return
1285     value. This is a hint to the optimizer and code generator used when
1286     generating the caller, allowing value propagation, tail call optimization,
1287     and omission of register saves and restores in some cases; it is not
1288     checked or enforced when generating the callee. The parameter and the
1289     function return type must be valid operands for the
1290     :ref:`bitcast instruction <i_bitcast>`. This is not a valid attribute for
1291     return values and can only be applied to one parameter.
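
    As a rough illustration, assuming a hypothetical ``@fill_zero`` that hands
    back its first argument, a caller might look like this:

    .. code-block:: llvm

        ; Hypothetical callee: always returns its first argument.
        declare i8* @fill_zero(i8* returned, i64)

        define i8* @caller(i8* %buf) {
          %r = call i8* @fill_zero(i8* %buf, i64 16)
          ; Because of ``returned``, the optimizer may treat %r as %buf.
          ret i8* %r
        }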
1292
1293 ``nonnull``
1294     This indicates that the parameter or return pointer is not null. This
1295     attribute may only be applied to pointer typed parameters. This is not
1296     checked or enforced by LLVM; if the parameter or return pointer is null,
1297     a :ref:`poison value <poisonvalues>` is returned or passed instead.
1298     The ``nonnull`` attribute should be combined with the ``noundef`` attribute
1299     so that a null pointer results in undefined behavior rather than poison.
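
    The difference can be sketched as follows; ``@read`` and ``@get_buffer``
    are hypothetical names used only for this example:

    .. code-block:: llvm

        ; nonnull + noundef: passing null for %p is undefined behavior.
        define i32 @read(i32* nonnull noundef %p) {
          %v = load i32, i32* %p
          ret i32 %v
        }

        ; nonnull alone: if the callee returns null, the caller sees poison.
        declare nonnull i8* @get_buffer()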
1300
1301 ``dereferenceable(<n>)``
1302     This indicates that the parameter or return pointer is dereferenceable. This
1303     attribute may only be applied to pointer typed parameters. A pointer that
1304     is dereferenceable can be loaded from speculatively without a risk of
1305     trapping. The number of bytes known to be dereferenceable must be provided
1306     in parentheses. It is legal for the number of bytes to be less than the
1307     size of the pointee type. The ``nonnull`` attribute does not imply
1308     dereferenceability (consider a pointer to one element past the end of an
1309     array), however ``dereferenceable(<n>)`` does imply ``nonnull`` in
1310     ``addrspace(0)`` (which is the default address space), except if the
1311     ``null_pointer_is_valid`` function attribute is present.
1312     ``n`` should be a positive number. The pointer should be well defined,
1313     otherwise it is undefined behavior. This means ``dereferenceable(<n>)``
1314     implies ``noundef``.
1315
1316 ``dereferenceable_or_null(<n>)``
1317     This indicates that the parameter or return value isn't both
1318     non-null and non-dereferenceable (up to ``<n>`` bytes) at the same
1319     time. All non-null pointers tagged with
1320     ``dereferenceable_or_null(<n>)`` are ``dereferenceable(<n>)``.
1321     For address space 0 ``dereferenceable_or_null(<n>)`` implies that
1322     a pointer is exactly one of ``dereferenceable(<n>)`` or ``null``,
1323     and in other address spaces ``dereferenceable_or_null(<n>)``
1324     implies that a pointer is at least one of ``dereferenceable(<n>)``
1325     or ``null`` (i.e. it may be both ``null`` and
1326     ``dereferenceable(<n>)``). This attribute may only be applied to
1327     pointer typed parameters.
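
    A short sketch contrasting the two attributes, using the hypothetical
    functions ``@first_word`` and ``@try_alloc16``:

    .. code-block:: llvm

        ; At least 8 bytes may be loaded speculatively through %p.
        define i64 @first_word(i64* dereferenceable(8) %p) {
          %v = load i64, i64* %p
          ret i64 %v
        }

        ; In address space 0 the result is either null or points to at
        ; least 16 dereferenceable bytes.
        declare dereferenceable_or_null(16) i8* @try_alloc16()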
1328
1329 ``swiftself``
1330     This indicates that the parameter is the self/context parameter. This is not
1331     a valid attribute for return values and can only be applied to one
1332     parameter.
1333
1334 ``swiftasync``
1335     This indicates that the parameter is the asynchronous context parameter and
1336     triggers the creation of a target-specific extended frame record to store
1337     this pointer. This is not a valid attribute for return values and can only
1338     be applied to one parameter.
1339
1340 ``swifterror``
1341     This attribute exists to model and optimize Swift error handling. It
1342     can be applied to a parameter with pointer to pointer type or a
1343     pointer-sized alloca. At the call site, the actual argument that corresponds
1344     to a ``swifterror`` parameter has to come from a ``swifterror`` alloca or
1345     the ``swifterror`` parameter of the caller. A ``swifterror`` value (either
1346     the parameter or the alloca) can only be loaded from, stored to, or used as
1347     a ``swifterror`` argument. This is not a valid attribute for return values
1348     and can only be applied to one parameter.
1349
1350     These constraints allow the calling convention to optimize access to
1351     ``swifterror`` variables by associating them with a specific register at
1352     call boundaries rather than placing them in memory. Since this does change
1353     the calling convention, a function which uses the ``swifterror`` attribute
1354     on a parameter is not ABI-compatible with one which does not.
1355
1356     These constraints also allow LLVM to assume that a ``swifterror`` argument
1357     does not alias any other memory visible within a function and that a
1358     ``swifterror`` alloca passed as an argument does not escape.
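
    A minimal sketch of these constraints, assuming a hypothetical opaque type
    ``%swift.error`` and a hypothetical callee ``@may_fail``:

    .. code-block:: llvm

        %swift.error = type opaque

        declare swiftcc void @may_fail(%swift.error** swifterror)

        define swiftcc void @caller() {
          ; The argument must come from a swifterror alloca (or from the
          ; caller's own swifterror parameter).
          %err = alloca swifterror %swift.error*
          store %swift.error* null, %swift.error** %err
          call swiftcc void @may_fail(%swift.error** swifterror %err)
          ; Besides being passed as a swifterror argument, the value may
          ; only be loaded from and stored to.
          %e = load %swift.error*, %swift.error** %err
          ret void
        }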
1359
1360 ``immarg``
1361     This indicates the parameter is required to be an immediate
1362     value. This must be a trivial immediate integer or floating-point
1363     constant. Undef or constant expressions are not valid. This is
1364     only valid on intrinsic declarations and cannot be applied to a
1365     call site or arbitrary function.
1366
1367 ``noundef``
1368     This attribute applies to parameters and return values. If the value
1369     representation contains any undefined or poison bits, the behavior is
1370     undefined. Note that this does not refer to padding introduced by the
1371     type's storage representation.
1372
1373 ``alignstack(<n>)``
1374     This indicates the alignment that should be considered by the backend when
1375     assigning this parameter to a stack slot during calling convention
1376     lowering. The enforcement of the specified alignment is target-dependent,
1377     as target-specific calling convention rules may override this value. This
1378     attribute serves the purpose of carrying language specific alignment
1379     information that is not mapped to base types in the backend (for example,
1380     over-alignment specification through language attributes).
1381
1382 .. _gc:
1383
1384 Garbage Collector Strategy Names
1385 --------------------------------
1386
1387 Each function may specify a garbage collector strategy name, which is simply a
1388 string:
1389
1390 .. code-block:: llvm
1391
1392     define void @f() gc "name" { ... }
1393
1394 The supported values of *name* include those :ref:`built in to LLVM
1395 <builtin-gc-strategies>` and any provided by loaded plugins. Specifying a GC
1396 strategy will cause the compiler to alter its output in order to support the
1397 named garbage collection algorithm. Note that LLVM itself does not contain a
1398 garbage collector; this functionality is restricted to generating machine code
1399 which can interoperate with a collector provided externally.
1400
1401 .. _prefixdata:
1402
1403 Prefix Data
1404 -----------
1405
1406 Prefix data is data associated with a function which the code
1407 generator will emit immediately before the function's entrypoint.
1408 The purpose of this feature is to allow frontends to associate
1409 language-specific runtime metadata with specific functions and make it
1410 available through the function pointer while still allowing the
1411 function pointer to be called.
1412
1413 To access the data for a given function, a program may bitcast the
1414 function pointer to a pointer to the constant's type and dereference
1415 index -1. This implies that the IR symbol points just past the end of
1416 the prefix data. For instance, take the example of a function annotated
1417 with a single ``i32``,
1418
1419 .. code-block:: llvm
1420
1421     define void @f() prefix i32 123 { ... }
1422
1423 The prefix data can be referenced as,
1424
1425 .. code-block:: llvm
1426
1427     %0 = bitcast void ()* @f to i32*
1428     %a = getelementptr inbounds i32, i32* %0, i32 -1
1429     %b = load i32, i32* %a
1430
1431 Prefix data is laid out as if it were an initializer for a global variable
1432 of the prefix data's type. The function will be placed such that the
1433 beginning of the prefix data is aligned. This means that if the size
1434 of the prefix data is not a multiple of the alignment size, the
1435 function's entrypoint will not be aligned. If alignment of the
1436 function's entrypoint is desired, padding must be added to the prefix
1437 data.
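
As a sketch of the padding rule, assume that an 8-byte function alignment is
wanted (an illustrative choice, not a requirement). Padding the ``i32`` to 8
bytes with a leading dummy field keeps the entrypoint aligned while leaving the
payload at index -1:

.. code-block:: llvm

    ; The first i32 is padding; the second holds the actual prefix value.
    define void @f() align 8 prefix <{ i32, i32 }> <{ i32 0, i32 123 }> {
      ret void
    }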
1438
1439 A function may have prefix data but no body. This has similar semantics
1440 to the ``available_externally`` linkage in that the data may be used by the
1441 optimizers but will not be emitted in the object file.
1442
1443 .. _prologuedata:
1444
1445 Prologue Data
1446 -------------
1447
1448 The ``prologue`` attribute allows arbitrary code (encoded as bytes) to
1449 be inserted prior to the function body. This can be used for enabling
1450 function hot-patching and instrumentation.
1451
1452 To maintain the semantics of ordinary function calls, the prologue data must
1453 have a particular format. Specifically, it must begin with a sequence of
1454 bytes which decode to a sequence of machine instructions, valid for the
1455 module's target, which transfer control to the point immediately succeeding
1456 the prologue data, without performing any other visible action. This allows
1457 the inliner and other passes to reason about the semantics of the function
1458 definition without needing to reason about the prologue data. Obviously this
1459 makes the format of the prologue data highly target dependent.
1460
1461 A trivial example of valid prologue data for the x86 architecture is ``i8 144``,
1462 which encodes the ``nop`` instruction:
1463
1464 .. code-block:: text
1465
1466     define void @f() prologue i8 144 { ... }
1467
1468 Generally prologue data can be formed by encoding a relative branch instruction
1469 which skips the metadata, as in this example of valid prologue data for the
1470 x86_64 architecture, where the first two bytes encode ``jmp .+10``:
1471
1472 .. code-block:: text
1473
1474     %0 = type <{ i8, i8, i8* }>
1475
1476     define void @f() prologue %0 <{ i8 235, i8 8, i8* @md}> { ... }
1477
1478 A function may have prologue data but no body. This has similar semantics
1479 to the ``available_externally`` linkage in that the data may be used by the
1480 optimizers but will not be emitted in the object file.
1481
1482 .. _personalityfn:
1483
1484 Personality Function
1485 --------------------
1486
1487 The ``personality`` attribute permits functions to specify what function
1488 to use for exception handling.
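
A minimal sketch, assuming the Itanium C++ personality routine
``@__gxx_personality_v0`` and a hypothetical callee ``@may_throw``:

.. code-block:: llvm

    declare i32 @__gxx_personality_v0(...)
    declare void @may_throw()

    define void @f() personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*) {
    entry:
      invoke void @may_throw()
              to label %cont unwind label %lpad

    cont:
      ret void

    lpad:
      ; Run cleanups only, then continue unwinding.
      %lp = landingpad { i8*, i32 }
              cleanup
      resume { i8*, i32 } %lp
    }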
1489
1490 .. _attrgrp:
1491
1492 Attribute Groups
1493 ----------------
1494
1495 Attribute groups are groups of attributes that are referenced by objects within
1496 the IR. They are important for keeping ``.ll`` files readable, because a lot of
1497 functions will use the same set of attributes. In the degenerate case of a
1498 ``.ll`` file that corresponds to a single ``.c`` file, the single attribute
1499 group will capture the important command line flags used to build that file.
1500
1501 An attribute group is a module-level object. To use an attribute group, an
1502 object references the attribute group's ID (e.g. ``#37``). An object may refer
1503 to more than one attribute group. In that situation, the attributes from the
1504 different groups are merged.
1505
1506 Here is an example of attribute groups for a function that should always be
1507 inlined, has a stack alignment of 4, and which shouldn't use SSE instructions:
1508
1509 .. code-block:: llvm
1510
1511    ; Target-independent attributes:
1512    attributes #0 = { alwaysinline alignstack=4 }
1513
1514    ; Target-dependent attributes:
1515    attributes #1 = { "no-sse" }
1516
1517    ; Function @f has attributes: alwaysinline, alignstack=4, and "no-sse".
1518    define void @f() #0 #1 { ... }
1519
1520 .. _fnattrs:
1521
1522 Function Attributes
1523 -------------------
1524
1525 Function attributes are set to communicate additional information about
1526 a function. Function attributes are considered to be part of the
1527 function, not of the function type, so functions with different function
1528 attributes can have the same function type.
1529
1530 Function attributes are simple keywords that follow the type specified.
1531 If multiple attributes are needed, they are space separated. For
1532 example:
1533
1534 .. code-block:: llvm
1535
1536     define void @f() noinline { ... }
1537     define void @f() alwaysinline { ... }
1538     define void @f() alwaysinline optsize { ... }
1539     define void @f() optsize { ... }
1540
1541 ``alignstack(<n>)``
1542     This attribute indicates that, when emitting the prologue and
1543     epilogue, the backend should forcibly align the stack pointer.
1544     Specify the desired alignment, which must be a power of two, in
1545     parentheses.
1546 ``allocsize(<EltSizeParam>[, <NumEltsParam>])``
1547     This attribute indicates that the annotated function will always return at
1548     least a given number of bytes (or null). Its arguments are zero-indexed
1549     parameter numbers; if one argument is provided, then it's assumed that at
1550     least ``CallSite.Args[EltSizeParam]`` bytes will be available at the
1551     returned pointer. If two are provided, then it's assumed that
1552     ``CallSite.Args[EltSizeParam] * CallSite.Args[NumEltsParam]`` bytes are
1553     available. The referenced parameters must be integer types. No assumptions
1554     are made about the contents of the returned block of memory.
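
    For illustration, two hypothetical allocators (``@my_malloc`` and
    ``@my_calloc`` are assumed names) could be declared as:

    .. code-block:: llvm

        ; Returns at least %size bytes (parameter 0), or null.
        declare i8* @my_malloc(i64) allocsize(0)

        ; Returns at least parameter 0 * parameter 1 bytes, or null.
        declare i8* @my_calloc(i64, i64) allocsize(0, 1)
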
1555 ``alwaysinline``
1556     This attribute indicates that the inliner should attempt to inline
1557     this function into callers whenever possible, ignoring any active
1558     inlining size threshold for this caller.
1559 ``builtin``
1560     This indicates that the callee function at a call site should be
1561     recognized as a built-in function, even though the function's declaration
1562     uses the ``nobuiltin`` attribute. This is only valid at call sites for
1563     direct calls to functions that are declared with the ``nobuiltin``
1564     attribute.
1565 ``cold``
1566     This attribute indicates that this function is rarely called. When
1567     computing edge weights, basic blocks post-dominated by a cold
1568     function call are also considered to be cold; and, thus, given low
1569     weight.
1570 ``convergent``
1571     In some parallel execution models, there exist operations that cannot be
1572     made control-dependent on any additional values.  We call such operations
1573     ``convergent``, and mark them with this attribute.
1574
1575     The ``convergent`` attribute may appear on functions or call/invoke
1576     instructions.  When it appears on a function, it indicates that calls to
1577     this function should not be made control-dependent on additional values.
1578     For example, the intrinsic ``llvm.nvvm.barrier0`` is ``convergent``, so
1579     calls to this intrinsic cannot be made control-dependent on additional
1580     values.
1581
1582     When it appears on a call/invoke, the ``convergent`` attribute indicates
1583     that we should treat the call as though we're calling a convergent
1584     function.  This is particularly useful on indirect calls; without this we
1585     may treat such calls as though the target is non-convergent.
1586
1587     The optimizer may remove the ``convergent`` attribute on functions when it
1588     can prove that the function does not execute any convergent operations.
1589     Similarly, the optimizer may remove ``convergent`` on calls/invokes when it
1590     can prove that the call/invoke cannot call a convergent function.
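
    A small sketch of both placements, using a hypothetical ``@block_sync``
    function and an indirect call through ``%fp``:

    .. code-block:: llvm

        ; Function-level: calls to @block_sync must not be made
        ; control-dependent on additional values.
        declare void @block_sync() convergent

        define void @kernel(void ()* %fp) {
          call void @block_sync()
          ; Call-site level: treat the unknown target as convergent.
          call void %fp() convergent
          ret void
        }
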
1591 ``disable_sanitizer_instrumentation``
1592     When instrumenting code with sanitizers, it can be important to skip certain
1593     functions to ensure no instrumentation is applied to them.
1594
1595     This attribute is not always similar to absent ``sanitize_<name>``
1596     attributes: depending on the specific sanitizer, code can be inserted into
1597     functions regardless of the ``sanitize_<name>`` attribute to prevent false
1598     positive reports.
1599
1600     ``disable_sanitizer_instrumentation`` disables all kinds of instrumentation,
1601     taking precedence over the ``sanitize_<name>`` attributes and other compiler
1602     flags.
1603 ``"dontcall-error"``
1604     This attribute denotes that an error diagnostic should be emitted when a
1605     call of a function with this attribute is not eliminated via optimization.
1606     Front ends can provide optional ``srcloc`` metadata nodes on call sites of
1607     such callees to attach information about where in the source language such a
1608     call came from. A string value can be provided as a note.
1609 ``"dontcall-warn"``
1610     This attribute denotes that a warning diagnostic should be emitted when a
1611     call of a function with this attribute is not eliminated via optimization.
1612     Front ends can provide optional ``srcloc`` metadata nodes on call sites of
1613     such callees to attach information about where in the source language such a
1614     call came from. A string value can be provided as a note.
1615 ``"frame-pointer"``
1616     This attribute tells the code generator whether the function
1617     should keep the frame pointer. The code generator may emit the frame pointer
1618     even if this attribute says the frame pointer can be eliminated.
1619     The allowed string values are:
1620
1621      * ``"none"`` (default) - the frame pointer can be eliminated.
1622      * ``"non-leaf"`` - the frame pointer should be kept if the function calls
1623        other functions.
1624      * ``"all"`` - the frame pointer should be kept.
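
    For example, the attribute can be written inline or through an attribute
    group (the function names here are placeholders):

    .. code-block:: llvm

        define void @f() "frame-pointer"="non-leaf" {
          ret void
        }

        define void @g() #0 {
          ret void
        }

        attributes #0 = { "frame-pointer"="all" }
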
1625 ``hot``
1626     This attribute indicates that this function is a hot spot of the program
1627     execution. The function will be optimized more aggressively and will be
1628     placed into a special subsection of the text section to improve locality.
1629
1630     When profile feedback is enabled, this attribute takes precedence over
1631     the profile information. By marking a function ``hot``, users can work
1632     around the cases where the training input does not have good coverage
1633     on all the hot functions.
1634 ``inaccessiblememonly``
1635     This attribute indicates that the function may only access memory that
1636     is not accessible by the module being compiled. This is a weaker form
1637     of ``readnone``. If the function reads or writes other memory, the
1638     behavior is undefined.
1639 ``inaccessiblemem_or_argmemonly``
1640     This attribute indicates that the function may only access memory that is
1641     either not accessible by the module being compiled, or is pointed to
1642     by its pointer arguments. This is a weaker form of  ``argmemonly``. If the
1643     function reads or writes other memory, the behavior is undefined.
1644 ``inlinehint``
1645     This attribute indicates that the source code contained a hint that
1646     inlining this function is desirable (such as the "inline" keyword in
1647     C/C++). It is just a hint; it imposes no requirements on the
1648     inliner.
1649 ``jumptable``
1650     This attribute indicates that the function should be added to a
1651     jump-instruction table at code-generation time, and that all address-taken
1652     references to this function should be replaced with a reference to the
1653     appropriate jump-instruction-table function pointer. Note that this creates
1654     a new pointer for the original function, which means that code that depends
1655     on function-pointer identity can break. So, any function annotated with
1656     ``jumptable`` must also be ``unnamed_addr``.
1657 ``minsize``
1658     This attribute suggests that optimization passes and code generator
1659     passes make choices that keep the code size of this function as small
1660     as possible and perform optimizations that may sacrifice runtime
1661     performance in order to minimize the size of the generated code.
1662 ``naked``
1663     This attribute disables prologue / epilogue emission for the
1664     function. This can have very system-specific consequences.
1665 ``"no-inline-line-tables"``
1666     When this attribute is set to true, the inliner discards source locations
1667     when inlining code and instead uses the source location of the call site.
1668     Breakpoints set on code that was inlined into the current function will
1669     not fire during the execution of the inlined call sites. If the debugger
1670     stops inside an inlined call site, it will appear to be stopped at the
1671     outermost inlined call site.
1672 ``"no-jump-tables"``
1673     When this attribute is set to true, the jump tables and lookup tables that
1674     can be generated from a switch case lowering are disabled.
1675 ``nobuiltin``
1676     This indicates that the callee function at a call site is not recognized as
1677     a built-in function. LLVM will retain the original call and not replace it
1678     with equivalent code based on the semantics of the built-in function, unless
1679     the call site uses the ``builtin`` attribute. This is valid at call sites
1680     and on function declarations and definitions.
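
    The interaction with ``builtin`` can be sketched as follows, using
    ``@_Znwm`` (the Itanium-mangled ``operator new(unsigned long)``) as the
    example callee:

    .. code-block:: llvm

        declare noalias i8* @_Znwm(i64) nobuiltin

        define i8* @plain_call() {
          ; Not recognized as a built-in; the call is retained as written.
          %p = call i8* @_Znwm(i64 4)
          ret i8* %p
        }

        define i8* @builtin_call() {
          ; The call-site attribute re-enables built-in treatment.
          %p = call i8* @_Znwm(i64 4) builtin
          ret i8* %p
        }
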
1681 ``noduplicate``
1682     This attribute indicates that calls to the function cannot be
1683     duplicated. A call to a ``noduplicate`` function may be moved
1684     within its parent function, but may not be duplicated within
1685     its parent function.
1686
1687     A function containing a ``noduplicate`` call may still
1688     be an inlining candidate, provided that the call is not
1689     duplicated by inlining. That implies that the function has
1690     internal linkage and only has one call site, so the original
1691     call is dead after inlining.
1692 ``nofree``
1693     This function attribute indicates that the function does not, directly or
1694     transitively, call a memory-deallocation function (``free``, for example)
1695     on a memory allocation which existed before the call.
1696
1697     As a result, uncaptured pointers that are known to be dereferenceable
1698     prior to a call to a function with the ``nofree`` attribute are still
1699     known to be dereferenceable after the call. The capturing condition is
1700     necessary in environments where the function might communicate the
1701     pointer to another thread which then deallocates the memory.  Alternatively,
1702     ``nosync`` would ensure such communication cannot happen and even captured
1703     pointers cannot be freed by the function.
1704
1705     A ``nofree`` function is explicitly allowed to free memory which it
1706     allocated or (if not ``nosync``) arrange for another thread to free
1707     memory on its behalf.  As a result, perhaps surprisingly, a ``nofree``
1708     function can return a pointer to a previously deallocated memory object.
1709 ``noimplicitfloat``
1710     Disallows implicit floating-point code. This inhibits optimizations that
1711     use floating-point code and floating-point/SIMD/vector registers for
1712     operations that are not nominally floating-point. LLVM instructions that
1713     perform floating-point operations or require access to floating-point
1714     registers may still cause floating-point code to be generated.
1715 ``noinline``
1716     This attribute indicates that the inliner should never inline this
1717     function in any situation. This attribute may not be used together
1718     with the ``alwaysinline`` attribute.
1719 ``nomerge``
1720     This attribute indicates that calls to this function should never be merged
1721     during optimization. For example, it will prevent tail merging otherwise
1722     identical code sequences that raise an exception or terminate the program.
1723     Tail merging normally reduces the precision of source location information,
1724     making stack traces less useful for debugging. This attribute gives the
1725     user control over the tradeoff between code size and debug information
1726     precision.
1727 ``nonlazybind``
1728     This attribute suppresses lazy symbol binding for the function. This
1729     may make calls to the function faster, at the cost of extra program
1730     startup time if the function is not called during program startup.
1731 ``noprofile``
1732     This function attribute prevents instrumentation based profiling, used for
1733     coverage or profile based optimization, from being added to a function,
1734     even when inlined.
1735 ``noredzone``
1736     This attribute indicates that the code generator should not use a
1737     red zone, even if the target-specific ABI normally permits it.
1738 ``"indirect-tls-seg-refs"``
1739     This attribute indicates that the code generator should not use
1740     direct TLS access through segment registers, even if the
1741     target-specific ABI normally permits it.
1742 ``noreturn``
1743     This function attribute indicates that the function never returns
1744     normally, that is, through a return instruction. This produces undefined
1745     behavior at runtime if the function ever does dynamically return. Annotated
1746     functions may still raise an exception, i.e., ``nounwind`` is not implied.
1747 ``norecurse``
1748     This function attribute indicates that the function does not call itself
1749     either directly or indirectly down any possible call path. This produces
1750     undefined behavior at runtime if the function ever does recurse.
1751 ``willreturn``
1752     This function attribute indicates that a call of this function will
1753     either exhibit undefined behavior or come back and continue execution
1754     at a point in the existing call stack that includes the current invocation.
1755     Annotated functions may still raise an exception, i.e., ``nounwind`` is not implied.
1756     If an invocation of an annotated function does not return control back
1757     to a point in the call stack, the behavior is undefined.
1758 ``nosync``
1759     This function attribute indicates that the function does not communicate
1760     (synchronize) with another thread through memory or other well-defined means.
1761     Synchronization is considered possible in the presence of ``atomic`` accesses
1762     that enforce an order (that is, not "unordered" or "monotonic"), ``volatile``
1763     accesses, as well as ``convergent`` function calls. Note that ``convergent``
1764     function calls also allow non-memory communication, e.g., cross-lane operations,
1765     which is considered synchronization as well. However, ``convergent`` does not contradict ``nosync``.
1766     If an annotated function does ever synchronize with another thread,
1767     the behavior is undefined.
1768 ``nounwind``
1769     This function attribute indicates that the function never raises an
1770     exception. If the function does raise an exception, its runtime
1771     behavior is undefined. However, functions marked nounwind may still
1772     trap or generate asynchronous exceptions. Exception handling schemes
1773     that are recognized by LLVM to handle asynchronous exceptions, such
1774     as SEH, will still provide their implementation defined semantics.
1775 ``nosanitize_coverage``
1776     This attribute indicates that SanitizerCoverage instrumentation is disabled
1777     for this function.
1778 ``null_pointer_is_valid``
1779    If ``null_pointer_is_valid`` is set, then the ``null`` address
1780    in address-space 0 is considered to be a valid address for memory loads and
1781    stores. Any analysis or optimization should not treat dereferencing a
1782    pointer to ``null`` as undefined behavior in this function.
1783    Note: Comparing the address of a global variable to ``null`` may still
1784    evaluate to false because of a limitation in querying this attribute inside
1785    constant expressions.
1786 ``optforfuzzing``
1787     This attribute indicates that this function should be optimized
1788     for maximum fuzzing signal.
1789 ``optnone``
1790     This function attribute indicates that most optimization passes will skip
1791     this function, with the exception of interprocedural optimization passes.
1792     Code generation defaults to the "fast" instruction selector.
1793     This attribute cannot be used together with the ``alwaysinline``
1794     attribute; this attribute is also incompatible
1795     with the ``minsize`` attribute and the ``optsize`` attribute.
1796
1797     This attribute requires the ``noinline`` attribute to be specified on
1798     the function as well, so the function is never inlined into any caller.
1799     Only functions with the ``alwaysinline`` attribute are valid
1800     candidates for inlining into the body of this function.
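
    Since ``optnone`` requires ``noinline``, a definition using it looks like
    this (``@debug_me`` is a placeholder name):

    .. code-block:: llvm

        define void @debug_me() optnone noinline {
          ret void
        }
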
1801 ``optsize``
1802     This attribute suggests that optimization passes and code generator
1803     passes make choices that keep the code size of this function low,
1804     and otherwise do optimizations specifically to reduce code size as
1805     long as they do not significantly impact runtime performance.
1806 ``"patchable-function"``
1807     This attribute tells the code generator that the code
1808     generated for this function needs to follow certain conventions that
1809     make it possible for a runtime function to patch over it later.
1810     The exact effect of this attribute depends on its string value,
1811     for which there currently is one legal possibility:
1812
1813      * ``"prologue-short-redirect"`` - This style of patchable
1814        function is intended to support patching a function prologue to
1815        redirect control away from the function in a thread safe
1816        manner.  It guarantees that the first instruction of the
1817        function will be large enough to accommodate a short jump
1818        instruction, and will be sufficiently aligned to allow being
1819        fully changed via an atomic compare-and-swap instruction.
1820        While the first requirement can be satisfied by inserting a large
1821        enough NOP, LLVM can and will try to re-purpose an existing
1822        instruction (i.e. one that would have to be emitted anyway) that
1823        is larger than a short jump as the patchable instruction.
1824
1825        ``"prologue-short-redirect"`` is currently only supported on
1826        x86-64.
1827
1828     This attribute by itself does not imply restrictions on
1829     inter-procedural optimizations.  Any semantic effects the patching may
1830     have must be conveyed separately via the linkage type.
1831 ``"probe-stack"``
1832     This attribute indicates that the function will trigger a guard region
1833     at the end of the stack. It ensures that accesses to the stack must be
1834     no further apart than the size of the guard region to a previous
1835     access of the stack. It takes one required string value, the name of
1836     the stack probing function that will be called.
1837
1838     If a function that has a ``"probe-stack"`` attribute is inlined into
1839     a function with another ``"probe-stack"`` attribute, the resulting
1840     function has the ``"probe-stack"`` attribute of the caller. If a
1841     function that has a ``"probe-stack"`` attribute is inlined into a
1842     function that has no ``"probe-stack"`` attribute at all, the resulting
1843     function has the ``"probe-stack"`` attribute of the callee.
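
    A sketch of the attribute in use; the probe function name
    ``__my_probestack`` is a made-up placeholder, not a name that LLVM
    provides:

    .. code-block:: llvm

        ; The string value names the stack probing function to call.
        define void @big_frame() "probe-stack"="__my_probestack" {
          %buf = alloca [65536 x i8]
          ret void
        }
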
1844 ``readnone``
1845     On a function, this attribute indicates that the function computes its
1846     result (or decides to unwind an exception) based strictly on its arguments,
1847     without dereferencing any pointer arguments or otherwise accessing
1848     any mutable state (e.g. memory, control registers, etc) visible to
1849     caller functions. It does not write through any pointer arguments
1850     (including ``byval`` arguments) and never changes any state visible
1851     to callers. This means while it cannot unwind exceptions by calling
1852     the ``C++`` exception throwing methods (since they write to memory), there may
1853     be non-``C++`` mechanisms that throw exceptions without writing to LLVM
1854     visible memory.
1855
1856     On an argument, this attribute indicates that the function does not
1857     dereference that pointer argument, even though it may read or write the
1858     memory that the pointer points to if accessed through other pointers.
1859
1860     If a readnone function reads or writes memory visible to the program, or
1861     has other side-effects, the behavior is undefined. If a function reads from
1862     or writes to a readnone pointer argument, the behavior is undefined.
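
    A minimal sketch of both uses described above (the function names are
    illustrative): ``@square`` touches no memory at all, and ``@is_null``
    inspects the pointer value without dereferencing it.

    .. code-block:: llvm

        ; Function attribute: the result depends only on %x.
        define i32 @square(i32 %x) readnone {
          %r = mul i32 %x, %x
          ret i32 %r
        }

        ; Argument attribute: %p is compared but never loaded from or stored to.
        define i1 @is_null(i8* readnone %p) {
          %c = icmp eq i8* %p, null
          ret i1 %c
        }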
1863 ``readonly``
1864     On a function, this attribute indicates that the function does not write
1865     through any pointer arguments (including ``byval`` arguments) or otherwise
1866     modify any state (e.g. memory, control registers, etc) visible to
1867     caller functions. It may dereference pointer arguments and read
1868     state that may be set in the caller. A readonly function always
1869     returns the same value (or unwinds an exception identically) when
1870     called with the same set of arguments and global state.  This means while it
1871     cannot unwind exceptions by calling the ``C++`` exception throwing methods
1872     (since they write to memory), there may be non-``C++`` mechanisms that throw
1873     exceptions without writing to LLVM visible memory.
1874
1875     On an argument, this attribute indicates that the function does not write
1876     through this pointer argument, even though it may write to the memory that
1877     the pointer points to.
1878
1879     If a readonly function writes memory visible to the program, or
1880     has other side-effects, the behavior is undefined. If a function writes to
1881     a readonly pointer argument, the behavior is undefined.
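
    A small sketch of a function that satisfies these rules (``@counter`` and
    ``@peek`` are illustrative names): it loads through its argument and from
    a global, but performs no stores.

    .. code-block:: llvm

        @counter = global i32 0

        define i32 @peek(i32* readonly %p) readonly {
          %a = load i32, i32* %p         ; read through the readonly argument
          %b = load i32, i32* @counter   ; read global state set elsewhere
          %s = add i32 %a, %b
          ret i32 %s
        }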
1882 ``"stack-probe-size"``
1883     This attribute controls the behavior of stack probes: either
1884     the ``"probe-stack"`` attribute, or ABI-required stack probes, if any.
1885     It defines the size of the guard region. It ensures that if the function
1886     may use more stack space than the size of the guard region, a stack probing
1887     sequence will be emitted. It takes one required integer value, which
1888     is 4096 by default.
1889
1890     If a function that has a ``"stack-probe-size"`` attribute is inlined into
1891     a function with another ``"stack-probe-size"`` attribute, the resulting
1892     function has the ``"stack-probe-size"`` attribute that has the lower
1893     numeric value. If a function that has a ``"stack-probe-size"`` attribute is
1894     inlined into a function that has no ``"stack-probe-size"`` attribute
1895     at all, the resulting function has the ``"stack-probe-size"`` attribute
1896     of the callee.
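
    The inlining rule can be pictured with the following sketch (function
    names and sizes are illustrative assumptions):

    .. code-block:: llvm

        ; Before inlining: the callee requests the smaller guard size.
        define void @callee() "stack-probe-size"="4096" {
          ret void
        }

        define void @caller() "stack-probe-size"="8192" {
          call void @callee()
          ret void
        }

        ; After @callee is inlined into @caller, the rule above implies that
        ; @caller carries the lower value, i.e. "stack-probe-size"="4096".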
1897 ``"no-stack-arg-probe"``
1898     This attribute disables ABI-required stack probes, if any.
1899 ``writeonly``
1900     On a function, this attribute indicates that the function may write to but
1901     does not read from memory.
1902
1903     On an argument, this attribute indicates that the function may write to but
1904     does not read through this pointer argument (even though it may read from
1905     the memory that the pointer points to).
1906
1907     If a writeonly function reads memory visible to the program, or
1908     has other side-effects, the behavior is undefined. If a function reads
1909     from a writeonly pointer argument, the behavior is undefined.
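
    For instance, an initializer that only stores through its argument could
    be annotated as follows (``@init`` is an illustrative name):

    .. code-block:: llvm

        define void @init(i32* writeonly %out) writeonly {
          store i32 0, i32* %out   ; a store, never a load, through %out
          ret void
        }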
1910 ``argmemonly``
1911     This attribute indicates that the only memory accesses inside the function are
1912     loads and stores from objects pointed to by its pointer-typed arguments,
1913     with arbitrary offsets. Or in other words, all memory operations in the
1914     function can refer to memory only using pointers based on its function
1915     arguments.
1916
1917     Note that ``argmemonly`` can be used together with the ``readonly`` attribute
1918     in order to specify that the function reads only from its arguments.
1919
1920     If an argmemonly function reads or writes memory other than the pointer
1921     arguments, or has other side-effects, the behavior is undefined.
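
    A sketch of the ``argmemonly`` plus ``readonly`` combination mentioned
    above (``@sum2`` is an illustrative name). Every access uses a pointer
    based on the argument ``%p``, at offsets 0 and 1:

    .. code-block:: llvm

        define i32 @sum2(i32* %p) argmemonly readonly {
          %q = getelementptr i32, i32* %p, i64 1   ; still based on %p
          %a = load i32, i32* %p
          %b = load i32, i32* %q
          %s = add i32 %a, %b
          ret i32 %s
        }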
1922 ``returns_twice``
1923     This attribute indicates that this function can return twice. The C
1924     ``setjmp`` is an example of such a function. The compiler disables
1925     some optimizations (like tail calls) in the caller of these
1926     functions.
1927 ``safestack``
1928     This attribute indicates that
1929     `SafeStack <https://clang.llvm.org/docs/SafeStack.html>`_
1930     protection is enabled for this function.
1931
1932     If a function that has a ``safestack`` attribute is inlined into a
1933     function that doesn't have a ``safestack`` attribute or which has an
1934     ``ssp``, ``sspstrong`` or ``sspreq`` attribute, then the resulting
1935     function will have a ``safestack`` attribute.
1936 ``sanitize_address``
1937     This attribute indicates that AddressSanitizer checks
1938     (dynamic address safety analysis) are enabled for this function.
1939 ``sanitize_memory``
1940     This attribute indicates that MemorySanitizer checks (dynamic detection
1941     of accesses to uninitialized memory) are enabled for this function.
1942 ``sanitize_thread``
1943     This attribute indicates that ThreadSanitizer checks
1944     (dynamic thread safety analysis) are enabled for this function.
1945 ``sanitize_hwaddress``
1946     This attribute indicates that HWAddressSanitizer checks
1947     (dynamic address safety analysis based on tagged pointers) are enabled for
1948     this function.
1949 ``sanitize_memtag``
1950     This attribute indicates that MemTagSanitizer checks
1951     (dynamic address safety analysis based on Armv8 MTE) are enabled for
1952     this function.
1953 ``speculative_load_hardening``
1954     This attribute indicates that
1955     `Speculative Load Hardening <https://llvm.org/docs/SpeculativeLoadHardening.html>`_
1956     should be enabled for the function body.
1957
1958     Speculative Load Hardening is a best-effort mitigation against
1959     information leak attacks that make use of control flow
1960     mis-speculation - specifically mis-speculation of whether a branch
1961     is taken or not. Typically vulnerabilities enabling such attacks are
1962     classified as "Spectre variant #1". Notably, this does not attempt to
1963     mitigate against mis-speculation of branch target, classified as
1964     "Spectre variant #2" vulnerabilities.
1965
1966     When inlining, the attribute is sticky. Inlining a function that carries
1967     this attribute will cause the caller to gain the attribute. This is intended
1968     to provide a maximally conservative model where the code in a function
1969     annotated with this attribute will always (even after inlining) end up
1970     hardened.
1971 ``speculatable``
1972     This function attribute indicates that the function does not have any
1973     effects besides calculating its result and does not have undefined behavior.
1974     Note that ``speculatable`` is not enough to conclude that along any
1975     particular execution path the number of calls to this function will not be
1976     externally observable. This attribute is only valid on functions
1977     and declarations, not on individual call sites. If a function is
1978     incorrectly marked as speculatable and really does exhibit
1979     undefined behavior, the undefined behavior may be observed even
1980     if the call site is dead code.
1981
1982 ``ssp``
1983     This attribute indicates that the function should emit a stack
1984     smashing protector. It is in the form of a "canary" --- a random value
1985     placed on the stack before the local variables that's checked upon
1986     return from the function to see if it has been overwritten. A
1987     heuristic is used to determine if a function needs stack protectors
1988     or not. The heuristic used will enable protectors for functions with:
1989
1990     - Character arrays larger than ``ssp-buffer-size`` (default 8).
1991     - Aggregates containing character arrays larger than ``ssp-buffer-size``.
1992     - Calls to alloca() with variable sizes or constant sizes greater than
1993       ``ssp-buffer-size``.
1994
1995     Variables that are identified as requiring a protector will be arranged
1996     on the stack such that they are adjacent to the stack protector guard.
1997
1998     If a function with an ``ssp`` attribute is inlined into a calling function,
1999     the attribute is not carried over to the calling function.
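
    As a sketch (``@fill`` and ``@f`` are illustrative names), the function
    below holds a 16-byte character array. Assuming the default
    ``ssp-buffer-size`` of 8, the heuristic above would be expected to give
    it a protector:

    .. code-block:: llvm

        declare void @fill(i8*)

        define void @f() ssp {
          %buf = alloca [16 x i8]   ; character array larger than 8 bytes
          %p = getelementptr inbounds [16 x i8], [16 x i8]* %buf, i64 0, i64 0
          call void @fill(i8* %p)
          ret void
        }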
2000
2001 ``sspstrong``
2002     This attribute indicates that the function should emit a stack smashing
2003     protector. This attribute causes a strong heuristic to be used when
2004     determining if a function needs stack protectors. The strong heuristic
2005     will enable protectors for functions with:
2006
2007     - Arrays of any size and type
2008     - Aggregates containing an array of any size and type.
2009     - Calls to alloca().
2010     - Local variables that have had their address taken.
2011
2012     Variables that are identified as requiring a protector will be arranged
2013     on the stack such that they are adjacent to the stack protector guard.
2014     The specific layout rules are:
2015
2016     #. Large arrays and structures containing large arrays
2017        (``>= ssp-buffer-size``) are closest to the stack protector.
2018     #. Small arrays and structures containing small arrays
2019        (``< ssp-buffer-size``) are 2nd closest to the protector.
2020     #. Variables that have had their address taken are 3rd closest to the
2021        protector.
2022
2023     This overrides the ``ssp`` function attribute.
2024
2025     If a function with an ``sspstrong`` attribute is inlined into a calling
2026     function which has an ``ssp`` attribute, the calling function's attribute
2027     will be upgraded to ``sspstrong``.
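
    A sketch of a function that the strong heuristic covers (``@use`` and
    ``@g`` are illustrative names). It contains no arrays, but the address of
    a local variable is taken and passed to a call:

    .. code-block:: llvm

        declare void @use(i32*)

        define void @g() sspstrong {
          %x = alloca i32
          call void @use(i32* %x)   ; address of the local %x is taken
          ret void
        }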
2028
2029 ``sspreq``
2030     This attribute indicates that the function should *always* emit a stack
2031     smashing protector. This overrides the ``ssp`` and ``sspstrong`` function
2032     attributes.
2033
2034     Variables that are identified as requiring a protector will be arranged
2035     on the stack such that they are adjacent to the stack protector guard.
2036     The specific layout rules are:
2037
2038     #. Large arrays and structures containing large arrays
2039        (``>= ssp-buffer-size``) are closest to the stack protector.
2040     #. Small arrays and structures containing small arrays
2041        (``< ssp-buffer-size``) are 2nd closest to the protector.
2042     #. Variables that have had their address taken are 3rd closest to the
2043        protector.
2044
2045     If a function with an ``sspreq`` attribute is inlined into a calling
2046     function which has an ``ssp`` or ``sspstrong`` attribute, the calling
2047     function's attribute will be upgraded to ``sspreq``.
2048
2049 ``strictfp``
2050     This attribute indicates that the function was called from a scope that
2051     requires strict floating-point semantics.  LLVM will not attempt any
2052     optimizations that require assumptions about the floating-point rounding
2053     mode or that might alter the state of floating-point status flags that
2054     might otherwise be set or cleared by calling this function. LLVM will
2055     not introduce any new floating-point instructions that may trap.
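
    A sketch of how such a function typically looks. It uses the constrained
    floating-point intrinsics, which this paragraph does not itself mention
    but which are described elsewhere in this manual; ``@add`` is an
    illustrative name:

    .. code-block:: llvm

        declare double @llvm.experimental.constrained.fadd.f64(double, double, metadata, metadata)

        define double @add(double %a, double %b) strictfp {
          ; Dynamic rounding mode, strict exception behavior.
          %r = call double @llvm.experimental.constrained.fadd.f64(
                   double %a, double %b,
                   metadata !"round.dynamic",
                   metadata !"fpexcept.strict") strictfp
          ret double %r
        }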
2056
2057 ``"denormal-fp-math"``
2058     This indicates the denormal (subnormal) handling that may be
2059     assumed for the default floating-point environment. This is a
2060     comma separated pair. The elements may be one of ``"ieee"``,
2061     ``"preserve-sign"``, or ``"positive-zero"``. The first entry
2062     indicates the flushing mode for the result of floating point
2063     operations. The second indicates the handling of denormal inputs
2064     to floating point instructions. For compatibility with older
2065     bitcode, if the second value is omitted, both input and output
2066     modes will assume the same mode.
2067
2068     If this attribute is not specified, the default is
2069     ``"ieee,ieee"``.
2070
2071     If the output mode is ``"preserve-sign"``, or ``"positive-zero"``,
2072     denormal outputs may be flushed to zero by standard floating-point
2073     operations. It is not mandated that flushing to zero occurs, but if
2074     a denormal output is flushed to zero, it must respect the sign
2075     mode. Not all targets support all modes. While this indicates the
2076     expected floating point mode the function will be executed with,
2077     this does not make any attempt to ensure the mode is
2078     consistent. User or platform code is expected to set the floating
2079     point mode appropriately before function entry.
2080
2081     If the input mode is ``"preserve-sign"``, or ``"positive-zero"``, a
2082     floating-point operation must treat any input denormal value as
2083     zero. In some situations, if an instruction does not respect this
2084     mode, the input may need to be converted to 0 as if by
2085     ``@llvm.canonicalize`` during lowering for correctness.
2086
2087 ``"denormal-fp-math-f32"``
2088     Same as ``"denormal-fp-math"``, but only controls the behavior of
2089     the 32-bit float type (or vectors of 32-bit floats). If both are
2090     present, this overrides ``"denormal-fp-math"``. Not all targets
2091     support separately setting the denormal mode per type, and no
2092     attempt is made to diagnose unsupported uses. Currently this
2093     attribute is respected by the AMDGPU and NVPTX backends.
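
    A sketch combining both attributes (``@scale`` is an illustrative name).
    The general mode is IEEE, while 32-bit float operations may flush
    denormal results to a sign-preserving zero and treat denormal inputs as
    zero:

    .. code-block:: llvm

        define float @scale(float %x, float %y) "denormal-fp-math"="ieee,ieee" "denormal-fp-math-f32"="preserve-sign,preserve-sign" {
          %r = fmul float %x, %y   ; governed by the f32-specific mode
          ret float %r
        }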
2094
2095 ``"thunk"``
2096     This attribute indicates that the function will delegate to some other
2097     function with a tail call. The prototype of a thunk should not be used for
2098     optimization purposes. The caller is expected to cast the thunk prototype to
2099     match the thunk target prototype.
2100 ``uwtable``
2101     This attribute indicates that the ABI being targeted requires that
2102     an unwind table entry be produced for this function even if we can
2103     show that no exceptions pass through it. This is normally the case for
2104     the ELF x86-64 ABI, but it can be disabled for some compilation
2105     units.
2106 ``nocf_check``
2107     This attribute indicates that no control-flow check will be performed on
2108     the attributed entity. It disables -fcf-protection=<> for a specific
2109     entity to fine grain the HW control flow protection mechanism. The flag
2110     is target independent and currently appertains to a function or function
2111     pointer.
2112 ``shadowcallstack``
2113     This attribute indicates that the ShadowCallStack checks are enabled for
2114     the function. The instrumentation checks that the return address for the
2115     function has not changed between the function prolog and epilog. It is
2116     currently x86_64-specific.
2117 ``mustprogress``
2118     This attribute indicates that the function is required to return, unwind,
2119     or interact with the environment in an observable way, e.g. via a volatile
2120     memory access, I/O, or other synchronization.  The ``mustprogress``
2121     attribute is intended to model the requirements of the first section of
2122     [intro.progress] of the C++ Standard. As a consequence, a loop in a
2123     function with the ``mustprogress`` attribute can be assumed to terminate if
2124     it does not interact with the environment in an observable way, and
2125     terminating loops without side-effects can be removed. If a ``mustprogress``
2126     function does not satisfy this contract, the behavior is undefined.  This
2127     attribute does not apply transitively to callees, but does apply to call
2128     sites within the function. Note that ``willreturn`` implies ``mustprogress``.
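
    As a sketch (``@spin`` is an illustrative name), the loop below steps by
    two and so might never reach an odd ``%n``. Because it has no observable
    effects, the attribute lets the optimizer assume that it terminates and
    therefore remove it:

    .. code-block:: llvm

        define void @spin(i32 %n) mustprogress {
        entry:
          br label %loop

        loop:
          %i = phi i32 [ 0, %entry ], [ %i.next, %loop ]
          %i.next = add i32 %i, 2
          %done = icmp eq i32 %i.next, %n
          br i1 %done, label %exit, label %loop

        exit:
          ret void
        }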
2129 ``"warn-stack-size"="<threshold>"``
2130     This attribute sets a threshold to emit diagnostics once the frame size is
2131     known should the frame size exceed the specified value.  It takes one
2132     required integer value, which should be a non-negative integer, and less
2133     than ``UINT_MAX``.  It's unspecified which threshold will be used when
2134     duplicate definitions are linked together with differing values.
2135 ``vscale_range(<min>[, <max>])``
2136     This attribute indicates the minimum and maximum vscale value for the given
2137     function. The min must be greater than 0. A maximum value of 0 means
2138     unbounded. If the optional max value is omitted then max is set to the
2139     value of min. If the attribute is not present, no assumptions are made
2140     about the range of vscale.
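
    A sketch of the syntax, with an illustrative function name and bounds.
    Inside the function, ``vscale`` may be assumed to lie between 1 and 16:

    .. code-block:: llvm

        declare i64 @llvm.vscale.i64()

        define i64 @elements_per_vector() vscale_range(1,16) {
          %vs = call i64 @llvm.vscale.i64()
          %n = mul i64 %vs, 4   ; element count of a <vscale x 4 x i32>
          ret i64 %n
        }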
2141
2142 Call Site Attributes
2143 --------------------
2144
2145 In addition to function attributes the following call site only
2146 attributes are supported:
2147
2148 ``vector-function-abi-variant``
2149     This attribute can be attached to a :ref:`call <i_call>` to list
2150     the vector functions associated with the function. Notice that the
2151     attribute cannot be attached to an :ref:`invoke <i_invoke>` or a
2152     :ref:`callbr <i_callbr>` instruction. The attribute consists of a
2153     comma separated list of mangled names. The order of the list does
2154     not imply preference (it is logically a set). The compiler is free
2155     to pick any listed vector function of its choosing.
2156
2157     The syntax for the mangled names is as follows::
2158
2159         _ZGV<isa><mask><vlen><parameters>_<scalar_name>[(<vector_redirection>)]
2160
2161     When present, the attribute informs the compiler that the function
2162     ``<scalar_name>`` has a corresponding vector variant that can be
2163     used to perform the concurrent invocation of ``<scalar_name>`` on
2164     vectors. The shape of the vector function is described by the
2165     tokens between the prefix ``_ZGV`` and the ``<scalar_name>``
2166     token. The standard name of the vector function is
2167     ``_ZGV<isa><mask><vlen><parameters>_<scalar_name>``. When present,
2168     the optional token ``(<vector_redirection>)`` informs the compiler
2169     that a custom name is provided in addition to the standard one
2170     (custom names can be provided for example via the use of ``declare
2171     variant`` in OpenMP 5.0). The declaration of the variant must be
2172     present in the IR Module. The signature of the vector variant is
2173     determined by the rules of the Vector Function ABI (VFABI)
2174     specifications of the target. For Arm and X86, the VFABI can be
2175     found at https://github.com/ARM-software/abi-aa and
2176     https://software.intel.com/content/www/us/en/develop/download/vector-simd-function-abi.html,
2177     respectively.
2178
2179     For X86 and Arm targets, the values of the tokens in the standard
2180     name are those that are defined in the VFABI. LLVM has an internal
2181     ``<isa>`` token that can be used to create scalar-to-vector
2182     mappings for functions that are not directly associated to any of
2183     the target ISAs (for example, some of the mappings stored in the
2184     TargetLibraryInfo). Valid values for the ``<isa>`` token are::
2185
2186         <isa>:= b | c | d | e  -> X86 SSE, AVX, AVX2, AVX512
2187               | n | s          -> Armv8 Advanced SIMD, SVE
2188               | __LLVM__       -> Internal LLVM Vector ISA
2189
2190     For all targets currently supported (x86, Arm and Internal LLVM),
2191     the remaining tokens can have the following values::
2192
2193         <mask>:= M | N         -> mask | no mask
2194
2195         <vlen>:= number        -> number of lanes
2196                | x             -> VLA (Vector Length Agnostic)
2197
2198         <parameters>:= v              -> vector
2199                      | l | l <number> -> linear
2200                      | R | R <number> -> linear with ref modifier
2201                      | L | L <number> -> linear with val modifier
2202                      | U | U <number> -> linear with uval modifier
2203                      | ls <pos>       -> runtime linear
2204                      | Rs <pos>       -> runtime linear with ref modifier
2205                      | Ls <pos>       -> runtime linear with val modifier
2206                      | Us <pos>       -> runtime linear with uval modifier
2207                      | u              -> uniform
2208
2209         <scalar_name>:= name of the scalar function
2210
2211         <vector_redirection>:= optional, custom name of the vector function
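
    Putting the grammar together, a sketch for an x86 target might look as
    follows. The variant ``@custom_sin4`` is a hypothetical custom name used
    only for illustration. ``_ZGVbN2v_sin`` decodes as SSE (``b``), unmasked
    (``N``), 2 lanes, one vector parameter (``v``), scalar name ``sin``:

    .. code-block:: llvm

        declare double @sin(double)

        ; Declarations of the listed variants must be present in the module.
        declare <2 x double> @_ZGVbN2v_sin(<2 x double>)
        declare <4 x double> @custom_sin4(<4 x double>)   ; hypothetical name

        define double @f(double %x) {
          %r = call double @sin(double %x) #0
          ret double %r
        }

        ; Second entry: AVX2 (d), unmasked, 4 lanes, redirected to @custom_sin4.
        attributes #0 = { "vector-function-abi-variant"="_ZGVbN2v_sin,_ZGVdN4v_sin(custom_sin4)" }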
2212
2213 ``preallocated(<ty>)``
2214     This attribute is required on calls to ``llvm.call.preallocated.arg``
2215     and cannot be used on any other call. See
2216     :ref:`llvm.call.preallocated.arg<int_call_preallocated_arg>` for more
2217     details.
2218
2219 .. _glattrs:
2220
2221 Global Attributes
2222 -----------------
2223
2224 Attributes may be set to communicate additional information about a global variable.
2225 Unlike :ref:`function attributes <fnattrs>`, attributes on a global variable
2226 are grouped into a single :ref:`attribute group <attrgrp>`.
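
A minimal sketch of what this can look like, assuming the usual trailing
``#<n>`` group reference. The global name and the attribute key and value are
placeholders:

.. code-block:: llvm

    ; "my-key"="my-value" is a placeholder string attribute.
    @g = global i32 0 #0

    attributes #0 = { "my-key"="my-value" }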
2227
2228 .. _opbundles:
2229
2230 Operand Bundles
2231 ---------------
2232
2233 Operand bundles are tagged sets of SSA values that can be associated
2234 with certain LLVM instructions (currently only ``call`` s and
2235 ``invoke`` s).  In a way they are like metadata, but dropping them is
2236 incorrect and will change program semantics.
2237
2238 Syntax::
2239
2240     operand bundle set ::= '[' operand bundle (, operand bundle )* ']'
2241     operand bundle ::= tag '(' [ bundle operand ] (, bundle operand )* ')'
2242     bundle operand ::= SSA value
2243     tag ::= string constant
2244
2245 Operand bundles are **not** part of a function's signature, and a
2246 given function may be called from multiple places with different kinds
2247 of operand bundles.  This reflects the fact that the operand bundles
2248 are conceptually a part of the ``call`` (or ``invoke``), not the
2249 callee being dispatched to.
2250
2251 Operand bundles are a generic mechanism intended to support
2252 runtime-introspection-like functionality for managed languages.  While
2253 the exact semantics of an operand bundle depend on the bundle tag,
2254 there are certain limitations to how much the presence of an operand
2255 bundle can influence the semantics of a program.  These restrictions
2256 are described as the semantics of an "unknown" operand bundle.  As
2257 long as the behavior of an operand bundle is describable within these
2258 restrictions, LLVM does not need to have special knowledge of the
2259 operand bundle to not miscompile programs containing it.
2260
2261 - The bundle operands for an unknown operand bundle escape in unknown
2262   ways before control is transferred to the callee or invokee.
2263 - Calls and invokes with operand bundles have unknown read / write
2264   effect on the heap on entry and exit (even if the call target is
2265   ``readnone`` or ``readonly``), unless they're overridden with
2266   callsite specific attributes.
2267 - An operand bundle at a call site cannot change the implementation
2268   of the called function.  Inter-procedural optimizations work as
2269   usual as long as they take into account the first two properties.
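
As a sketch of the syntax above, the call below carries two bundles whose tags
(``"my-tag"`` and ``"other-tag"``) are made-up names that LLVM does not
recognize, so the "unknown" operand bundle semantics would apply to them:

.. code-block:: llvm

    declare void @callee(i32)

    define void @caller(i32 %a, i8* %p) {
      ; One bundle with two operands, one bundle with none.
      call void @callee(i32 %a) [ "my-tag"(i32 %a, i8* %p), "other-tag"() ]
      ret void
    }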
2270
2271 More specific types of operand bundles are described below.
2272
2273 .. _deopt_opbundles:
2274
2275 Deoptimization Operand Bundles
2276 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2277
2278 Deoptimization operand bundles are characterized by the ``"deopt"``
2279 operand bundle tag.  These operand bundles represent an alternate
2280 "safe" continuation for the call site they're attached to, and can be
2281 used by a suitable runtime to deoptimize the compiled frame at the
2282 specified call site.  There can be at most one ``"deopt"`` operand
2283 bundle attached to a call site.  Exact details of deoptimization are
2284 out of scope for the language reference, but it usually involves
2285 rewriting a compiled frame into a set of interpreted frames.
2286
2287 From the compiler's perspective, deoptimization operand bundles make
2288 the call sites they're attached to at least ``readonly``.  They read
2289 through all of their pointer typed operands (even if they're not
2290 otherwise escaped) and the entire visible heap.  Deoptimization
2291 operand bundles do not capture their operands except during
2292 deoptimization, in which case control will not be returned to the
2293 compiled frame.
2294
2295 The inliner knows how to inline through calls that have deoptimization
2296 operand bundles.  Just like inlining through a normal call site
2297 involves composing the normal and exceptional continuations, inlining
2298 through a call site with a deoptimization operand bundle needs to
2299 appropriately compose the "safe" deoptimization continuation.  The
2300 inliner does this by prepending the parent's deoptimization
2301 continuation to every deoptimization continuation in the inlined body.
2302 E.g. inlining ``@f`` into ``@g`` in the following example
2303
2304 .. code-block:: llvm
2305
2306     define void @f() {
2307       call void @x()  ;; no deopt state
2308       call void @y() [ "deopt"(i32 10) ]
2309       call void @y() [ "deopt"(i32 10), "unknown"(i8* null) ]
2310       ret void
2311     }
2312
2313     define void @g() {
2314       call void @f() [ "deopt"(i32 20) ]
2315       ret void
2316     }
2317
2318 will result in
2319
2320 .. code-block:: llvm
2321
2322     define void @g() {
2323       call void @x()  ;; still no deopt state
2324       call void @y() [ "deopt"(i32 20, i32 10) ]
2325       call void @y() [ "deopt"(i32 20, i32 10), "unknown"(i8* null) ]
2326       ret void
2327     }
2328
2329 It is the frontend's responsibility to structure or encode the
2330 deoptimization state in a way that syntactically prepending the
2331 caller's deoptimization state to the callee's deoptimization state is
2332 semantically equivalent to composing the caller's deoptimization
2333 continuation after the callee's deoptimization continuation.
2334
2335 .. _ob_funclet:
2336
2337 Funclet Operand Bundles
2338 ^^^^^^^^^^^^^^^^^^^^^^^
2339
2340 Funclet operand bundles are characterized by the ``"funclet"``
2341 operand bundle tag.  These operand bundles indicate that a call site
2342 is within a particular funclet.  There can be at most one
2343 ``"funclet"`` operand bundle attached to a call site and it must have
2344 exactly one bundle operand.
2345
2346 If any funclet EH pads have been "entered" but not "exited" (per the
2347 `description in the EH doc\ <ExceptionHandling.html#wineh-constraints>`_),
2348 it is undefined behavior to execute a ``call`` or ``invoke`` which:
2349
2350 * does not have a ``"funclet"`` bundle and is not a ``call`` to a nounwind
2351   intrinsic, or
2352 * has a ``"funclet"`` bundle whose operand is not the most-recently-entered
2353   not-yet-exited funclet EH pad.
2354
2355 Similarly, if no funclet EH pads have been entered-but-not-yet-exited,
2356 executing a ``call`` or ``invoke`` with a ``"funclet"`` bundle is undefined behavior.
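
A sketch of a call inside a cleanup funclet. The personality routine is the
MSVC C++ one, and ``@may_throw`` and ``@cleanup_fn`` are illustrative names.
The call in the ``cleanup`` block names the pad it executes in:

.. code-block:: llvm

    declare void @may_throw()
    declare void @cleanup_fn()
    declare i32 @__CxxFrameHandler3(...)

    define void @f() personality i8* bitcast (i32 (...)* @__CxxFrameHandler3 to i8*) {
    entry:
      invoke void @may_throw()
              to label %done unwind label %cleanup

    cleanup:
      %cp = cleanuppad within none []
      ; %cp is the most-recently-entered, not-yet-exited funclet EH pad.
      call void @cleanup_fn() [ "funclet"(token %cp) ]
      cleanupret from %cp unwind to caller

    done:
      ret void
    }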
2357
2358 GC Transition Operand Bundles
2359 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2360
2361 GC transition operand bundles are characterized by the
2362 ``"gc-transition"`` operand bundle tag. These operand bundles mark a
2363 call as a transition from a function with one GC strategy to a
2364 function with a different GC strategy. If coordinating the transition
2365 between GC strategies requires additional code generation at the call
2366 site, these bundles may contain any values that are needed by the
2367 generated code.  For more details, see :ref:`GC Transitions
2368 <gc_transition_args>`.
2369
2370 The bundle contains an arbitrary list of Values which need to be passed
2371 to GC transition code. They will be lowered and passed as operands to
2372 the appropriate GC_TRANSITION nodes in the selection DAG. It is assumed
2373 that these arguments must be available before and after (but not
2374 necessarily during) the execution of the callee.
2375
2376 .. _assume_opbundles:
2377
2378 Assume Operand Bundles
2379 ^^^^^^^^^^^^^^^^^^^^^^
2380
2381 Operand bundles on an :ref:`llvm.assume <int_assume>` allow representing
2382 assumptions that a :ref:`parameter attribute <paramattrs>` or a
2383 :ref:`function attribute <fnattrs>` holds for a certain value at a certain
2384 location. Operand bundles enable assumptions that are either hard or impossible
2385 to represent as a boolean argument of an :ref:`llvm.assume <int_assume>`.
2386
2387 An assume operand bundle has the form:
2388
2389 ::
2390
2391       "<tag>"([ <holds for value> [, <attribute argument>] ])
2392
2393 * The tag of the operand bundle is usually the name of the attribute that can be
2394   assumed to hold. It can also be ``ignore``; this tag doesn't contain any
2395   information and should be ignored.
2396 * The first argument, if present, is the value for which the attribute holds.
2397 * The second argument, if present, is an argument of the attribute.
2398
2399 If there are no arguments the attribute is a property of the call location.
2400
2401 If the represented attribute expects a constant argument, the argument provided
2402 to the operand bundle should be a constant as well.
2403
2404 For example:
2405
2406 .. code-block:: llvm
2407
2408       call void @llvm.assume(i1 true) ["align"(i32* %val, i32 8)]
2409
2410 allows the optimizer to assume that at the location of the call to
2411 :ref:`llvm.assume <int_assume>` ``%val`` has an alignment of at least 8.
2412
2413 .. code-block:: llvm
2414
2415       call void @llvm.assume(i1 %cond) ["cold"(), "nonnull"(i64* %val)]
2416
2417 allows the optimizer to assume that the :ref:`llvm.assume <int_assume>`
2418 call location is cold and that ``%val`` is not null.
2419
2420 Just like for the argument of :ref:`llvm.assume <int_assume>`, if any of the
2421 provided guarantees are violated at runtime the behavior is undefined.
2422
2423 Even if the assumed property can be encoded as a boolean value, like
2424 ``nonnull``, using operand bundles to express the property can still have
2425 benefits:
2426
2427 * Attributes that can be expressed via operand bundles are directly the
2428   property that the optimizer uses and cares about. Encoding attributes as
2429   operand bundles removes the need for an instruction sequence that represents
2430   the property (e.g., ``icmp ne i32* %p, null`` for ``nonnull``) and for the
2431   optimizer to deduce the property from that instruction sequence.
2432 * Expressing the property using operand bundles makes it easy to identify the
2433   use of the value as a use in an :ref:`llvm.assume <int_assume>`. This then
2434   simplifies and improves heuristics, e.g., for "use-sensitive"
2435   optimizations.
2436
2437 .. _ob_preallocated:
2438
2439 Preallocated Operand Bundles
2440 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2441
2442 Preallocated operand bundles are characterized by the ``"preallocated"``
2443 operand bundle tag.  These operand bundles allow separation of the allocation
2444 of the call argument memory from the call site.  This is necessary to pass
2445 non-trivially copyable objects by value in a way that is compatible with MSVC
2446 on some targets.  There can be at most one ``"preallocated"`` operand bundle
2447 attached to a call site and it must have exactly one bundle operand, which is
2448 a token generated by ``@llvm.call.preallocated.setup``.  A call with this
2449 operand bundle should not adjust the stack before entering the function, as
2450 that will have been done by one of the ``@llvm.call.preallocated.*`` intrinsics.
2451
2452 .. code-block:: llvm
2453
2454       %foo = type { i64, i32 }
2455
2456       ...
2457
2458       %t = call token @llvm.call.preallocated.setup(i32 1)
2459       %a = call i8* @llvm.call.preallocated.arg(token %t, i32 0) preallocated(%foo)
2460       %b = bitcast i8* %a to %foo*
2461       ; initialize %b
2462       call void @bar(i32 42, %foo* preallocated(%foo) %b) ["preallocated"(token %t)]
2463
2464 .. _ob_gc_live:
2465
2466 GC Live Operand Bundles
2467 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2468
2469 A "gc-live" operand bundle is only valid on a :ref:`gc.statepoint <gc_statepoint>`
2470 intrinsic. The operand bundle must contain every pointer to a garbage collected
2471 object which potentially needs to be updated by the garbage collector.
2472
2473 When lowered, any relocated value will be recorded in the corresponding
2474 :ref:`stackmap entry <statepoint-stackmap-format>`.  See the intrinsic description
2475 for further details.
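
As a hedged sketch of how the bundle is used, the fragment below lists a single
pointer in ``"gc-live"`` and reads its relocated value back with
``gc.relocate``. The function names ``@foo`` and ``@test`` are hypothetical, and
the intrinsic name manglings are assumptions based on the typed-pointer form of
the statepoint intrinsics; check them against the intrinsic description before
relying on them.

.. code-block:: llvm

      declare void @foo()
      declare token @llvm.experimental.gc.statepoint.p0f_isVoidf(i64, i32, void ()*, i32, i32, ...)
      declare i8 addrspace(1)* @llvm.experimental.gc.relocate.p1i8(token, i32, i32)

      define i8 addrspace(1)* @test(i8 addrspace(1)* %obj) gc "statepoint-example" {
        ; %obj may be moved by the collector during the call to @foo, so it is
        ; listed in the "gc-live" bundle.
        %tok = call token (i64, i32, void ()*, i32, i32, ...) @llvm.experimental.gc.statepoint.p0f_isVoidf(i64 0, i32 0, void ()* @foo, i32 0, i32 0, i32 0, i32 0) ["gc-live"(i8 addrspace(1)* %obj)]
        ; Both indices (base and derived pointer) select operand 0 of the bundle.
        %obj.relocated = call i8 addrspace(1)* @llvm.experimental.gc.relocate.p1i8(token %tok, i32 0, i32 0)
        ret i8 addrspace(1)* %obj.relocated
      }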
2476
2477 ObjC ARC Attached Call Operand Bundles
2478 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2479
2480 A ``"clang.arc.attachedcall"`` operand bundle on a call indicates the call is
2481 implicitly followed by a marker instruction and a call to an ObjC runtime
2482 function that uses the result of the call. The operand bundle takes either the
2483 pointer to the runtime function (``@objc_retainAutoreleasedReturnValue`` or
2484 ``@objc_unsafeClaimAutoreleasedReturnValue``) or no arguments. If the bundle
2485 doesn't take any arguments, only the marker instruction has to be emitted after
2486 the call; the runtime function calls don't have to be emitted since they already
2487 have been emitted. The return value of a call with this bundle is used by a call
2488 to ``@llvm.objc.clang.arc.noop.use`` unless the called function's return type is
2489 void, in which case the operand bundle is ignored.
2490
2491 .. code-block:: llvm
2492
2493    ; The marker instruction and a runtime function call are inserted after the call
2494    ; to @foo.
2495    call i8* @foo() [ "clang.arc.attachedcall"(i8* (i8*)* @objc_retainAutoreleasedReturnValue) ]
2496    call i8* @foo() [ "clang.arc.attachedcall"(i8* (i8*)* @objc_unsafeClaimAutoreleasedReturnValue) ]
2497
2498    ; Only the marker instruction is inserted after the call to @foo.
2499    call i8* @foo() [ "clang.arc.attachedcall"() ]
2500
2501 The operand bundle is needed to ensure the call is immediately followed by the
2502 marker instruction or the ObjC runtime call in the final output.
2503
2504 .. _moduleasm:
2505
2506 Module-Level Inline Assembly
2507 ----------------------------
2508
Modules may contain "module-level inline asm" blocks, which correspond
2510 to the GCC "file scope inline asm" blocks. These blocks are internally
2511 concatenated by LLVM and treated as a single unit, but may be separated
2512 in the ``.ll`` file if desired. The syntax is very simple:
2513
2514 .. code-block:: llvm
2515
2516     module asm "inline asm code goes here"
2517     module asm "more can go here"
2518
The strings can contain any character; non-printable characters are written
using the escape sequence "\\xx", where "xx" is the two-digit hex code of the
character.
2522
2523 Note that the assembly string *must* be parseable by LLVM's integrated assembler
2524 (unless it is disabled), even when emitting a ``.s`` file.
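
The following sketch shows several blocks and one escape sequence. It assumes a
target whose assembler accepts GNU-style directives; the symbol name ``marker``
is made up for illustration.

.. code-block:: llvm

    ; The three strings are concatenated, in order, into one assembly unit.
    module asm ".globl marker"
    module asm "marker:"
    ; \09 is the escape for a horizontal tab (hex code 09).
    module asm "\09.byte 42"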
2525
2526 .. _langref_datalayout:
2527
2528 Data Layout
2529 -----------
2530
2531 A module may specify a target specific data layout string that specifies
2532 how data is to be laid out in memory. The syntax for the data layout is
2533 simply:
2534
2535 .. code-block:: llvm
2536
2537     target datalayout = "layout specification"
2538
2539 The *layout specification* consists of a list of specifications
2540 separated by the minus sign character ('-'). Each specification starts
2541 with a letter and may include other information after the letter to
2542 define some aspect of the data layout. The specifications accepted are
2543 as follows:
2544
2545 ``E``
2546     Specifies that the target lays out data in big-endian form. That is,
2547     the bits with the most significance have the lowest address
2548     location.
2549 ``e``
2550     Specifies that the target lays out data in little-endian form. That
2551     is, the bits with the least significance have the lowest address
2552     location.
2553 ``S<size>``
2554     Specifies the natural alignment of the stack in bits. Alignment
2555     promotion of stack variables is limited to the natural stack
2556     alignment to avoid dynamic stack realignment. The stack alignment
2557     must be a multiple of 8-bits. If omitted, the natural stack
2558     alignment defaults to "unspecified", which does not prevent any
2559     alignment promotions.
2560 ``P<address space>``
2561     Specifies the address space that corresponds to program memory.
2562     Harvard architectures can use this to specify what space LLVM
2563     should place things such as functions into. If omitted, the
2564     program memory space defaults to the default address space of 0,
2565     which corresponds to a Von Neumann architecture that has code
2566     and data in the same space.
2567 ``G<address space>``
2568     Specifies the address space to be used by default when creating global
2569     variables. If omitted, the globals address space defaults to the default
2570     address space 0.
    Note: variable declarations without an address space are always created in
    address space 0; this property only affects the default value to be used
2573     when creating globals without additional contextual information (e.g. in
2574     LLVM passes).
2575 ``A<address space>``
2576     Specifies the address space of objects created by '``alloca``'.
2577     Defaults to the default address space of 0.
2578 ``p[n]:<size>:<abi>[:<pref>][:<idx>]``
2579     This specifies the *size* of a pointer and its ``<abi>`` and
2580     ``<pref>``\erred alignments for address space ``n``. ``<pref>`` is optional
2581     and defaults to ``<abi>``. The fourth parameter ``<idx>`` is the size of the
    index that is used for address calculation. If not
2583     specified, the default index size is equal to the pointer size. All sizes
2584     are in bits. The address space, ``n``, is optional, and if not specified,
2585     denotes the default address space 0. The value of ``n`` must be
2586     in the range [1,2^23).
2587 ``i<size>:<abi>[:<pref>]``
2588     This specifies the alignment for an integer type of a given bit
2589     ``<size>``. The value of ``<size>`` must be in the range [1,2^23).
2590     ``<pref>`` is optional and defaults to ``<abi>``.
2591 ``v<size>:<abi>[:<pref>]``
2592     This specifies the alignment for a vector type of a given bit
2593     ``<size>``. The value of ``<size>`` must be in the range [1,2^23).
2594     ``<pref>`` is optional and defaults to ``<abi>``.
2595 ``f<size>:<abi>[:<pref>]``
2596     This specifies the alignment for a floating-point type of a given bit
2597     ``<size>``. Only values of ``<size>`` that are supported by the target
2598     will work. 32 (float) and 64 (double) are supported on all targets; 80
2599     or 128 (different flavors of long double) are also supported on some
2600     targets. The value of ``<size>`` must be in the range [1,2^23).
2601     ``<pref>`` is optional and defaults to ``<abi>``.
2602 ``a:<abi>[:<pref>]``
2603     This specifies the alignment for an object of aggregate type.
2604     ``<pref>`` is optional and defaults to ``<abi>``.
2605 ``F<type><abi>``
2606     This specifies the alignment for function pointers.
2607     The options for ``<type>`` are:
2608
2609     * ``i``: The alignment of function pointers is independent of the alignment
2610       of functions, and is a multiple of ``<abi>``.
2611     * ``n``: The alignment of function pointers is a multiple of the explicit
2612       alignment specified on the function, and is a multiple of ``<abi>``.
2613 ``m:<mangling>``
    If present, specifies that LLVM names are mangled in the output. Symbols
    prefixed with the mangling escape character ``\01`` are passed through
    directly to the assembler without the escape character. The mangling style
    options are:
2618
2619     * ``e``: ELF mangling: Private symbols get a ``.L`` prefix.
2620     * ``l``: GOFF mangling: Private symbols get a ``@`` prefix.
2621     * ``m``: Mips mangling: Private symbols get a ``$`` prefix.
    * ``o``: Mach-O mangling: Private symbols get an ``L`` prefix. Other
2623       symbols get a ``_`` prefix.
2624     * ``x``: Windows x86 COFF mangling: Private symbols get the usual prefix.
2625       Regular C symbols get a ``_`` prefix. Functions with ``__stdcall``,
2626       ``__fastcall``, and ``__vectorcall`` have custom mangling that appends
2627       ``@N`` where N is the number of bytes used to pass parameters. C++ symbols
2628       starting with ``?`` are not mangled in any way.
2629     * ``w``: Windows COFF mangling: Similar to ``x``, except that normal C
2630       symbols do not receive a ``_`` prefix.
    * ``a``: XCOFF mangling: Private symbols get an ``L..`` prefix.
2632 ``n<size1>:<size2>:<size3>...``
2633     This specifies a set of native integer widths for the target CPU in
2634     bits. For example, it might contain ``n32`` for 32-bit PowerPC,
2635     ``n32:64`` for PowerPC 64, or ``n8:16:32:64`` for X86-64. Elements of
2636     this set are considered to support most general arithmetic operations
2637     efficiently.
2638 ``ni:<address space0>:<address space1>:<address space2>...``
2639     This specifies pointer types with the specified address spaces
2640     as :ref:`Non-Integral Pointer Type <nointptrtype>` s.  The ``0``
2641     address space cannot be specified as non-integral.
2642
2643 On every specification that takes a ``<abi>:<pref>``, specifying the
2644 ``<pref>`` alignment is optional. If omitted, the preceding ``:``
2645 should be omitted too and ``<pref>`` will be equal to ``<abi>``.
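
To illustrate how the specifications combine, here is a layout string of the
kind used for x86-64 ELF targets. The exact string is an assumption for
illustration and differs between targets and LLVM releases; the comments read
it one specification at a time.

.. code-block:: llvm

    target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"

    ; e            little-endian
    ; m:e          ELF mangling (private symbols get a .L prefix)
    ; p270:32:32   32-bit pointers, 32-bit ABI alignment, in address space 270
    ; p271:32:32   the same for address space 271
    ; p272:64:64   64-bit pointers, 64-bit ABI alignment, in address space 272
    ; i64:64       i64 has 64-bit ABI alignment (preferred defaults to ABI)
    ; f80:128      the 80-bit floating-point type is 128-bit aligned
    ; n8:16:32:64  native integer widths are 8, 16, 32 and 64 bits
    ; S128         natural stack alignment is 128 bits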
2646
2647 When constructing the data layout for a given target, LLVM starts with a
2648 default set of specifications which are then (possibly) overridden by
2649 the specifications in the ``datalayout`` keyword. The default
2650 specifications are given in this list:
2651
2652 -  ``e`` - little endian
2653 -  ``p:64:64:64`` - 64-bit pointers with 64-bit alignment.
2654 -  ``p[n]:64:64:64`` - Other address spaces are assumed to be the
2655    same as the default address space.
2656 -  ``S0`` - natural stack alignment is unspecified
2657 -  ``i1:8:8`` - i1 is 8-bit (byte) aligned
2658 -  ``i8:8:8`` - i8 is 8-bit (byte) aligned
2659 -  ``i16:16:16`` - i16 is 16-bit aligned
2660 -  ``i32:32:32`` - i32 is 32-bit aligned
2661 -  ``i64:32:64`` - i64 has ABI alignment of 32-bits but preferred
2662    alignment of 64-bits
2663 -  ``f16:16:16`` - half is 16-bit aligned
2664 -  ``f32:32:32`` - float is 32-bit aligned
2665 -  ``f64:64:64`` - double is 64-bit aligned
2666 -  ``f128:128:128`` - quad is 128-bit aligned
2667 -  ``v64:64:64`` - 64-bit vector is 64-bit aligned
2668 -  ``v128:128:128`` - 128-bit vector is 128-bit aligned
2669 -  ``a:0:64`` - aggregates are 64-bit aligned
2670
2671 When LLVM is determining the alignment for a given type, it uses the
2672 following rules:
2673
2674 #. If the type sought is an exact match for one of the specifications,
2675    that specification is used.
2676 #. If no match is found, and the type sought is an integer type, then
2677    the smallest integer type that is larger than the bitwidth of the
2678    sought type is used. If none of the specifications are larger than
2679    the bitwidth then the largest integer type is used. For example,
2680    given the default specifications above, the i7 type will use the
2681    alignment of i8 (next largest) while both i65 and i256 will use the
2682    alignment of i64 (largest specified).
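
The lookup rules above can be read off a few struct types. This is only a
sketch: the type names are hypothetical, and it assumes the module has no
``target datalayout`` line, so the default specifications apply.

.. code-block:: llvm

    ; i32 matches the i32:32:32 specification exactly (rule 1).
    %A = type { i8, i32 }

    ; i7 has no exact match, so it uses the alignment of the next larger
    ; specified integer type, i8 (rule 2).
    %B = type { i8, i7 }

    ; i65 is wider than every specified integer type, so it uses the
    ; alignment of the largest one, i64 (rule 2).
    %C = type { i8, i65 }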
2683
2684 The function of the data layout string may not be what you expect.
2685 Notably, this is not a specification from the frontend of what alignment
2686 the code generator should use.
2687
2688 Instead, if specified, the target data layout is required to match what
2689 the ultimate *code generator* expects. This string is used by the
2690 mid-level optimizers to improve code, and this only works if it matches
2691 what the ultimate code generator uses. There is no way to generate IR
2692 that does not embed this target-specific detail into the IR. If you
2693 don't specify the string, the default specifications will be used to
2694 generate a Data Layout and the optimization phases will operate
2695 accordingly and introduce target specificity into the IR with respect to
2696 these default specifications.
2697
2698 .. _langref_triple:
2699
2700 Target Triple
2701 -------------
2702
2703 A module may specify a target triple string that describes the target
2704 host. The syntax for the target triple is simply:
2705
2706 .. code-block:: llvm
2707
2708     target triple = "x86_64-apple-macosx10.7.0"
2709
2710 The *target triple* string consists of a series of identifiers delimited
2711 by the minus sign character ('-'). The canonical forms are:
2712
2713 ::
2714
2715     ARCHITECTURE-VENDOR-OPERATING_SYSTEM
2716     ARCHITECTURE-VENDOR-OPERATING_SYSTEM-ENVIRONMENT
2717
2718 This information is passed along to the backend so that it generates
2719 code for the proper architecture. It's possible to override this on the
2720 command line with the ``-mtriple`` command line option.
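
For comparison with the three-component example above, a triple in the
four-component form might look as follows. The specific triples are chosen only
for illustration.

.. code-block:: llvm

    ; ARCHITECTURE-VENDOR-OPERATING_SYSTEM-ENVIRONMENT
    target triple = "x86_64-unknown-linux-gnu"

    ; A tool that accepts -mtriple can override the triple above, e.g.
    ; -mtriple=aarch64-unknown-linux-gnu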
2721
2722 .. _objectlifetime:
2723
2724 Object Lifetime
2725 ----------------------
2726
2727 A memory object, or simply object, is a region of a memory space that is
2728 reserved by a memory allocation such as :ref:`alloca <i_alloca>`, heap
2729 allocation calls, and global variable definitions.
2730 Once it is allocated, the bytes stored in the region can only be read or written
2731 through a pointer that is :ref:`based on <pointeraliasing>` the allocation
2732 value.
2733 If a pointer that is not based on the object tries to read or write to the
2734 object, it is undefined behavior.
2735
The lifetime of a memory object is a property that determines its accessibility.
Unless stated otherwise, a memory object is alive from its allocation until its
deallocation, and dead afterwards.
It is undefined behavior to access a memory object that isn't alive, but
operations that don't dereference it such as
:ref:`getelementptr <i_getelementptr>`, :ref:`ptrtoint <i_ptrtoint>` and
:ref:`icmp <i_icmp>` return a valid result.
This is what permits code motion of these instructions across operations that
impact the object's lifetime.
2745 A stack object's lifetime can be explicitly specified using
2746 :ref:`llvm.lifetime.start <int_lifestart>` and
2747 :ref:`llvm.lifetime.end <int_lifeend>` intrinsic function calls.
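
A minimal sketch of these rules, using a hypothetical function ``@f`` and the
typed-pointer forms of the lifetime intrinsics: the accesses between the two
markers are defined, and the address computations after the end marker are
still valid because they do not dereference the object.

.. code-block:: llvm

    declare void @llvm.lifetime.start.p0i8(i64, i8* nocapture)
    declare void @llvm.lifetime.end.p0i8(i64, i8* nocapture)

    define i8 @f() {
    entry:
      %buf = alloca [4 x i8]
      %p = getelementptr [4 x i8], [4 x i8]* %buf, i64 0, i64 0
      call void @llvm.lifetime.start.p0i8(i64 4, i8* %p)
      store i8 1, i8* %p            ; the object is alive: defined
      %v = load i8, i8* %p          ; the object is alive: defined
      call void @llvm.lifetime.end.p0i8(i64 4, i8* %p)
      ; The object is now dead. Loading from or storing to %p here would be
      ; undefined behavior, but computing addresses is still fine.
      %q = getelementptr i8, i8* %p, i64 1
      %n = ptrtoint i8* %q to i64
      ret i8 %v
    }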
2748
2749 .. _pointeraliasing:
2750
2751 Pointer Aliasing Rules
2752 ----------------------
2753
2754 Any memory access must be done through a pointer value associated with
2755 an address range of the memory access, otherwise the behavior is
2756 undefined. Pointer values are associated with address ranges according
2757 to the following rules:
2758
2759 -  A pointer value is associated with the addresses associated with any
2760    value it is *based* on.
2761 -  An address of a global variable is associated with the address range
2762    of the variable's storage.
2763 -  The result value of an allocation instruction is associated with the
2764    address range of the allocated storage.
2765 -  A null pointer in the default address-space is associated with no
2766    address.
2767 -  An :ref:`undef value <undefvalues>` in *any* address-space is
2768    associated with no address.
2769 -  An integer constant other than zero or a pointer value returned from
2770    a function not defined within LLVM may be associated with address
2771    ranges allocated through mechanisms other than those provided by
2772    LLVM. Such ranges shall not overlap with any ranges of addresses
2773    allocated by mechanisms provided by LLVM.
2774
2775 A pointer value is *based* on another pointer value according to the
2776 following rules:
2777
2778 -  A pointer value formed from a scalar ``getelementptr`` operation is *based* on
2779    the pointer-typed operand of the ``getelementptr``.
2780 -  The pointer in lane *l* of the result of a vector ``getelementptr`` operation
2781    is *based* on the pointer in lane *l* of the vector-of-pointers-typed operand
2782    of the ``getelementptr``.
2783 -  The result value of a ``bitcast`` is *based* on the operand of the
2784    ``bitcast``.
2785 -  A pointer value formed by an ``inttoptr`` is *based* on all pointer
2786    values that contribute (directly or indirectly) to the computation of
2787    the pointer's value.
2788 -  The "*based* on" relationship is transitive.
2789
2790 Note that this definition of *"based"* is intentionally similar to the
2791 definition of *"based"* in C99, though it is slightly weaker.
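
The "*based* on" rules can be traced through a short, hypothetical function;
each comment names the rule that applies.

.. code-block:: llvm

    define i32 @based(i32* %p) {
      %q = getelementptr i32, i32* %p, i64 1   ; %q is based on %p (getelementptr)
      %r = bitcast i32* %q to i8*              ; %r is based on %q (bitcast),
                                               ; and so on %p (transitivity)
      %i = ptrtoint i32* %p to i64
      %j = add i64 %i, 4
      %s = inttoptr i64 %j to i32*             ; %s is based on %p, which
                                               ; contributes to %j (inttoptr)
      %v = load i32, i32* %s
      ret i32 %v
    }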
2792
2793 LLVM IR does not associate types with memory. The result type of a
2794 ``load`` merely indicates the size and alignment of the memory from
2795 which to load, as well as the interpretation of the value. The first
2796 operand type of a ``store`` similarly only indicates the size and
2797 alignment of the store.
2798
2799 Consequently, type-based alias analysis, aka TBAA, aka
2800 ``-fstrict-aliasing``, is not applicable to general unadorned LLVM IR.
2801 :ref:`Metadata <metadata>` may be used to encode additional information
2802 which specialized optimization passes may use to implement type-based
2803 alias analysis.
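
As a small illustration of memory being untyped, the hypothetical function
below stores a ``float`` and reads the same bytes back as an ``i32``. In the
absence of TBAA metadata, nothing in the IR makes this access pattern
undefined.

.. code-block:: llvm

    define i32 @float_bits(float %f) {
      %slot = alloca float
      store float %f, float* %slot     ; writes 4 bytes
      %ip = bitcast float* %slot to i32*
      %bits = load i32, i32* %ip       ; reads the same 4 bytes as an integer
      ret i32 %bits
    }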
2804
2805 .. _pointercapture:
2806
2807 Pointer Capture
2808 ---------------
2809
2810 Given a function call and a pointer that is passed as an argument or stored in
2811 the memory before the call, a pointer is *captured* by the call if it makes a
2812 copy of any part of the pointer that outlives the call.
2813 To be precise, a pointer is captured if one or more of the following conditions
2814 hold:
2815
2816 1. The call stores any bit of the pointer carrying information into a place,
2817    and the stored bits can be read from the place by the caller after this call
2818    exits.
2819
2820 .. code-block:: llvm
2821
2822     @glb  = global i8* null
2823     @glb2 = global i8* null
2824     @glb3 = global i8* null
2825     @glbi = global i32 0
2826
2827     define i8* @f(i8* %a, i8* %b, i8* %c, i8* %d, i8* %e) {
2828       store i8* %a, i8** @glb ; %a is captured by this call
2829
2830       store i8* %b,   i8** @glb2 ; %b isn't captured because the stored value is overwritten by the store below
2831       store i8* null, i8** @glb2
2832
2833       store i8* %c,   i8** @glb3
2834       call void @g() ; If @g makes a copy of %c that outlives this call (@f), %c is captured
2835       store i8* null, i8** @glb3
2836
2837       %i = ptrtoint i8* %d to i64
2838       %j = trunc i64 %i to i32
2839       store i32 %j, i32* @glbi ; %d is captured
2840
2841       ret i8* %e ; %e is captured
2842     }
2843
2844 2. The call stores any bit of the pointer carrying information into a place,
2845    and the stored bits can be safely read from the place by another thread via
2846    synchronization.
2847
2848 .. code-block:: llvm
2849
2850     @lock = global i1 true
2851
2852     define void @f(i8* %a) {
2853       store i8* %a, i8** @glb
2854       store atomic i1 false, i1* @lock release ; %a is captured because another thread can safely read @glb
2855       store i8* null, i8** @glb
2856       ret void
2857     }
2858
2859 3. The call's behavior depends on any bit of the pointer carrying information.
2860
2861 .. code-block:: llvm
2862
2863     @glb = global i8 0
2864
2865     define void @f(i8* %a) {
2866       %c = icmp eq i8* %a, @glb
      br i1 %c, label %BB_EXIT, label %BB_CONTINUE ; captures %a
2868     BB_EXIT:
2869       call void @exit()
2870       unreachable
2871     BB_CONTINUE:
2872       ret void
2873     }
2874
2875 4. The pointer is used in a volatile access as its address.
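
A minimal sketch of the fourth condition, in the same style as the examples
above (the function name is arbitrary):

.. code-block:: llvm

    define void @f(i8* %a) {
      store volatile i8 0, i8* %a ; %a is captured: it is the address of a volatile access
      ret void
    }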
2876
2877
2878 .. _volatile:
2879
2880 Volatile Memory Accesses
2881 ------------------------
2882
2883 Certain memory accesses, such as :ref:`load <i_load>`'s,
2884 :ref:`store <i_store>`'s, and :ref:`llvm.memcpy <int_memcpy>`'s may be
2885 marked ``volatile``. The optimizers must not change the number of
2886 volatile operations or change their order of execution relative to other
2887 volatile operations. The optimizers *may* change the order of volatile
2888 operations relative to non-volatile operations. This is not Java's
2889 "volatile" and has no cross-thread synchronization behavior.
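
For instance, in the following sketch (``@mmio`` and ``@poll`` are hypothetical
names), the two volatile loads may not be merged into one and may not be
swapped, even though they read the same address.

.. code-block:: llvm

    @mmio = external global i32

    define i32 @poll() {
      %a = load volatile i32, i32* @mmio   ; must be performed
      %b = load volatile i32, i32* @mmio   ; must also be performed, after %a
      %sum = add i32 %a, %b
      ret i32 %sum
    }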
2890
2891 A volatile load or store may have additional target-specific semantics.
2892 Any volatile operation can have side effects, and any volatile operation
2893 can read and/or modify state which is not accessible via a regular load
2894 or store in this module. Volatile operations may use addresses which do
2895 not point to memory (like MMIO registers). This means the compiler may
2896 not use a volatile operation to prove a non-volatile access to that
2897 address has defined behavior.
2898
2899 The allowed side-effects for volatile accesses are limited.  If a
2900 non-volatile store to a given address would be legal, a volatile
2901 operation may modify the memory at that address. A volatile operation
2902 may not modify any other memory accessible by the module being compiled.
2903 A volatile operation may not call any code in the current module.
2904
2905 The compiler may assume execution will continue after a volatile operation,
2906 so operations which modify memory or may have undefined behavior can be
2907 hoisted past a volatile operation.
2908
2909 As an exception to the preceding rule, the compiler may not assume execution
2910 will continue after a volatile store operation. This restriction is necessary
2911 to support the somewhat common pattern in C of intentionally storing to an
2912 invalid pointer to crash the program. In the future, it might make sense to
2913 allow frontends to control this behavior.
2914
2915 IR-level volatile loads and stores cannot safely be optimized into llvm.memcpy
2916 or llvm.memmove intrinsics even when those intrinsics are flagged volatile.
2917 Likewise, the backend should never split or merge target-legal volatile
2918 load/store instructions. Similarly, IR-level volatile loads and stores cannot
2919 change from integer to floating-point or vice versa.
2920
2921 .. admonition:: Rationale
2922
2923  Platforms may rely on volatile loads and stores of natively supported
 data width to be executed as a single instruction. For example, in C
2925  this holds for an l-value of volatile primitive type with native
2926  hardware support, but not necessarily for aggregate types. The
2927  frontend upholds these expectations, which are intentionally
2928  unspecified in the IR. The rules above ensure that IR transformations
2929  do not violate the frontend's contract with the language.
2930
2931 .. _memmodel:
2932
2933 Memory Model for Concurrent Operations
2934 --------------------------------------
2935
2936 The LLVM IR does not define any way to start parallel threads of
2937 execution or to register signal handlers. Nonetheless, there are
2938 platform-specific ways to create them, and we define LLVM IR's behavior
2939 in their presence. This model is inspired by the C++0x memory model.
2940
2941 For a more informal introduction to this model, see the :doc:`Atomics`.
2942
2943 We define a *happens-before* partial order as the least partial order
2944 that
2945
2946 -  Is a superset of single-thread program order, and
2947 -  When a *synchronizes-with* ``b``, includes an edge from ``a`` to
2948    ``b``. *Synchronizes-with* pairs are introduced by platform-specific
2949    techniques, like pthread locks, thread creation, thread joining,
2950    etc., and by atomic instructions. (See also :ref:`Atomic Memory Ordering
2951    Constraints <ordering>`).
2952
2953 Note that program order does not introduce *happens-before* edges
2954 between a thread and signals executing inside that thread.
2955
2956 Every (defined) read operation (load instructions, memcpy, atomic
2957 loads/read-modify-writes, etc.) R reads a series of bytes written by
2958 (defined) write operations (store instructions, atomic
2959 stores/read-modify-writes, memcpy, etc.). For the purposes of this
2960 section, initialized globals are considered to have a write of the
2961 initializer which is atomic and happens before any other read or write
2962 of the memory in question. For each byte of a read R, R\ :sub:`byte`
2963 may see any write to the same byte, except:
2964
2965 -  If write\ :sub:`1`  happens before write\ :sub:`2`, and
2966    write\ :sub:`2` happens before R\ :sub:`byte`, then
2967    R\ :sub:`byte` does not see write\ :sub:`1`.
2968 -  If R\ :sub:`byte` happens before write\ :sub:`3`, then
2969    R\ :sub:`byte` does not see write\ :sub:`3`.
2970
2971 Given that definition, R\ :sub:`byte` is defined as follows:
2972
2973 -  If R is volatile, the result is target-dependent. (Volatile is
2974    supposed to give guarantees which can support ``sig_atomic_t`` in
2975    C/C++, and may be used for accesses to addresses that do not behave
2976    like normal memory. It does not generally provide cross-thread
2977    synchronization.)
2978 -  Otherwise, if there is no write to the same byte that happens before
2979    R\ :sub:`byte`, R\ :sub:`byte` returns ``undef`` for that byte.
2980 -  Otherwise, if R\ :sub:`byte` may see exactly one write,
2981    R\ :sub:`byte` returns the value written by that write.
2982 -  Otherwise, if R is atomic, and all the writes R\ :sub:`byte` may
2983    see are atomic, it chooses one of the values written. See the :ref:`Atomic
2984    Memory Ordering Constraints <ordering>` section for additional
2985    constraints on how the choice is made.
2986 -  Otherwise R\ :sub:`byte` returns ``undef``.
2987
2988 R returns the value composed of the series of bytes it read. This
2989 implies that some bytes within the value may be ``undef`` **without**
2990 the entire value being ``undef``. Note that this only defines the
2991 semantics of the operation; it doesn't mean that targets will emit more
2992 than one instruction to read the series of bytes.
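
The rules for R\ :sub:`byte` can be applied to a simple data race. The sketch
below assumes that ``@writer`` and ``@reader`` (hypothetical names) run on
different threads, started by some platform-specific mechanism, with no
synchronization between them.

.. code-block:: llvm

    @x = global i32 0

    define void @writer() {
      store i32 1, i32* @x        ; non-atomic write
      ret void
    }

    define i32 @reader() {
      ; Each byte of this load may see two writes: the initializer of @x and
      ; the store in @writer, which is not ordered with the load by
      ; happens-before. The load is neither volatile nor atomic, so each byte,
      ; and hence %v, is undef.
      %v = load i32, i32* @x
      ret i32 %v
    }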
2993
2994 Note that in cases where none of the atomic intrinsics are used, this
2995 model places only one restriction on IR transformations on top of what
2996 is required for single-threaded execution: introducing a store to a byte
2997 which might not otherwise be stored is not allowed in general.
2998 (Specifically, in the case where another thread might write to and read
2999 from an address, introducing a store can change a load that may see
3000 exactly one write into a load that may see multiple writes.)
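
As an illustration of that restriction, consider the hypothetical pair of
functions below. Rewriting the first into the second would be fine for
single-threaded execution, but it is not allowed in general because the second
stores to ``@x`` on a path where the first does not.

.. code-block:: llvm

    @x = global i32 0

    ; Original: stores to @x only when %c is true.
    define void @maybe_store(i1 %c) {
    entry:
      br i1 %c, label %then, label %exit
    then:
      store i32 1, i32* @x
      br label %exit
    exit:
      ret void
    }

    ; Not a valid replacement: the store is now unconditional. When %c is
    ; false, another thread's load of @x may see this introduced write in
    ; addition to the writes it could see before.
    define void @maybe_store_speculated(i1 %c) {
      %old = load i32, i32* @x
      %new = select i1 %c, i32 1, i32 %old
      store i32 %new, i32* @x
      ret void
    }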
3001
3002 .. _ordering:
3003
3004 Atomic Memory Ordering Constraints
3005 ----------------------------------
3006
3007 Atomic instructions (:ref:`cmpxchg <i_cmpxchg>`,
3008 :ref:`atomicrmw <i_atomicrmw>`, :ref:`fence <i_fence>`,
3009 :ref:`atomic load <i_load>`, and :ref:`atomic store <i_store>`) take
3010 ordering parameters that determine which other atomic instructions on
3011 the same address they *synchronize with*. These semantics are borrowed
3012 from Java and C++0x, but are somewhat more colloquial. If these
3013 descriptions aren't precise enough, check those specs (see spec
3014 references in the :doc:`atomics guide <Atomics>`).
3015 :ref:`fence <i_fence>` instructions treat these orderings somewhat
3016 differently since they don't take an address. See that instruction's
3017 documentation for details.
3018
3019 For a simpler introduction to the ordering constraints, see the
3020 :doc:`Atomics`.
3021
3022 ``unordered``
3023     The set of values that can be read is governed by the happens-before
3024     partial order. A value cannot be read unless some operation wrote
3025     it. This is intended to provide a guarantee strong enough to model
3026     Java's non-volatile shared variables. This ordering cannot be
3027     specified for read-modify-write operations; it is not strong enough
3028     to make them atomic in any interesting way.
3029 ``monotonic``
3030     In addition to the guarantees of ``unordered``, there is a single
3031     total order for modifications by ``monotonic`` operations on each
3032     address. All modification orders must be compatible with the
3033     happens-before order. There is no guarantee that the modification
3034     orders can be combined to a global total order for the whole program
3035     (and this often will not be possible). The read in an atomic
3036     read-modify-write operation (:ref:`cmpxchg <i_cmpxchg>` and
3037     :ref:`atomicrmw <i_atomicrmw>`) reads the value in the modification
3038     order immediately before the value it writes. If one atomic read
3039     happens before another atomic read of the same address, the later
3040     read must see the same value or a later value in the address's
3041     modification order. This disallows reordering of ``monotonic`` (or
3042     stronger) operations on the same address. If an address is written
3043     ``monotonic``-ally by one thread, and other threads ``monotonic``-ally
3044     read that address repeatedly, the other threads must eventually see
3045     the write. This corresponds to the C++0x/C1x
3046     ``memory_order_relaxed``.
3047 ``acquire``
3048     In addition to the guarantees of ``monotonic``, a
3049     *synchronizes-with* edge may be formed with a ``release`` operation.
3050     This is intended to model C++'s ``memory_order_acquire``.
3051 ``release``
3052     In addition to the guarantees of ``monotonic``, if this operation
3053     writes a value which is subsequently read by an ``acquire``
3054     operation, it *synchronizes-with* that operation. (This isn't a
3055     complete description; see the C++0x definition of a release
3056     sequence.) This corresponds to the C++0x/C1x
3057     ``memory_order_release``.
3058 ``acq_rel`` (acquire+release)
3059     Acts as both an ``acquire`` and ``release`` operation on its
3060     address. This corresponds to the C++0x/C1x ``memory_order_acq_rel``.
3061 ``seq_cst`` (sequentially consistent)
3062     In addition to the guarantees of ``acq_rel`` (``acquire`` for an
3063     operation that only reads, ``release`` for an operation that only
3064     writes), there is a global total order on all
3065     sequentially-consistent operations on all addresses, which is
3066     consistent with the *happens-before* partial order and with the
3067     modification orders of all the affected addresses. Each
3068     sequentially-consistent read sees the last preceding write to the
3069     same address in this global order. This corresponds to the C++0x/C1x
3070     ``memory_order_seq_cst`` and Java volatile.
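
A common use of ``release`` and ``acquire`` is passing data between threads.
The following is a minimal sketch with hypothetical names, assuming
``@producer`` and ``@consumer`` run on different threads.

.. code-block:: llvm

    @data = global i32 0
    @flag = global i32 0

    define void @producer() {
      store i32 42, i32* @data                         ; plain store
      store atomic i32 1, i32* @flag release, align 4  ; publish
      ret void
    }

    define i32 @consumer() {
    entry:
      br label %spin
    spin:
      %f = load atomic i32, i32* @flag acquire, align 4
      %ready = icmp eq i32 %f, 1
      br i1 %ready, label %done, label %spin
    done:
      ; The acquire load read the value written by the release store, so the
      ; store to @data happens before this load, which therefore sees 42.
      %v = load i32, i32* @data
      ret i32 %v
    }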
3071
3072 .. _syncscope:
3073
3074 If an atomic operation is marked ``syncscope("singlethread")``, it only
3075 *synchronizes with* and only participates in the seq\_cst total orderings of
3076 other operations running in the same thread (for example, in signal handlers).
3077
If an atomic operation is marked ``syncscope("<target-scope>")``, where
``<target-scope>`` is a target-specific synchronization scope, then it is
target-dependent whether it *synchronizes with* and participates in the
seq\_cst total orderings of other operations.
3082
3083 Otherwise, an atomic operation that is not marked ``syncscope("singlethread")``
3084 or ``syncscope("<target-scope>")`` *synchronizes with* and participates in the
3085 seq\_cst total orderings of other operations that are not marked
3086 ``syncscope("singlethread")`` or ``syncscope("<target-scope>")``.
3087
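A minimal sketch of the ``singlethread`` scope is shown below, assuming a flag
that is shared only between a thread and a signal handler running on that same
thread. The function name ``@read_flag`` is hypothetical.

.. code-block:: llvm

    define i32 @read_flag(i32* %flag) {
      ; Only synchronizes with operations running in the same thread,
      ; for example a signal handler that wrote %flag.
      %v = load atomic i32, i32* %flag syncscope("singlethread") acquire, align 4
      ; A fence restricted to the same scope.
      fence syncscope("singlethread") seq_cst
      ret i32 %v
    }
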
3088 .. _floatenv:
3089
3090 Floating-Point Environment
3091 --------------------------
3092
3093 The default LLVM floating-point environment assumes that floating-point
3094 instructions do not have side effects. Results assume the round-to-nearest
3095 rounding mode. No floating-point exception state is maintained in this
3096 environment. Therefore, there is no attempt to create or preserve invalid
3097 operation (SNaN) or division-by-zero exceptions.
3098
3099 The benefit of this exception-free assumption is that floating-point
3100 operations may be speculated freely without any other fast-math relaxations
3101 to the floating-point model.
3102
3103 Code that requires different behavior than this should use the
3104 :ref:`Constrained Floating-Point Intrinsics <constrainedfp>`.
3105
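The contrast between the default environment and the constrained intrinsics can
be sketched as follows. The function names ``@default_env`` and ``@strict_env``
are hypothetical; the example assumes the ``fadd`` member of the constrained
intrinsic family with dynamic rounding and strict exception behavior.

.. code-block:: llvm

    declare float @llvm.experimental.constrained.fadd.f32(float, float,
                                                          metadata, metadata)

    ; Default environment: round-to-nearest, no exception state, so this
    ; fadd may be speculated freely.
    define float @default_env(float %a, float %b) {
      %r = fadd float %a, %b
      ret float %r
    }

    ; Code that depends on the rounding mode or on exception flags uses a
    ; constrained intrinsic instead of the plain instruction.
    define float @strict_env(float %a, float %b) strictfp {
      %r = call float @llvm.experimental.constrained.fadd.f32(
                   float %a, float %b,
                   metadata !"round.dynamic",
                   metadata !"fpexcept.strict") strictfp
      ret float %r
    }
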
3106 .. _fastmath:
3107
3108 Fast-Math Flags
3109 ---------------
3110
3111 LLVM IR floating-point operations (:ref:`fneg <i_fneg>`, :ref:`fadd <i_fadd>`,
3112 :ref:`fsub <i_fsub>`, :ref:`fmul <i_fmul>`, :ref:`fdiv <i_fdiv>`,
3113 :ref:`frem <i_frem>`, :ref:`fcmp <i_fcmp>`), :ref:`phi <i_phi>`,
3114 :ref:`select <i_select>` and :ref:`call <i_call>`
3115 may use the following flags to enable otherwise unsafe
3116 floating-point transformations.
3117
3118 ``nnan``
   No NaNs - Allow optimizations to assume the arguments and result are not
   NaN. If an argument is a NaN, or the result would be a NaN, it produces
   a :ref:`poison value <poisonvalues>` instead.
3122
3123 ``ninf``
3124    No Infs - Allow optimizations to assume the arguments and result are not
3125    +/-Inf. If an argument is +/-Inf, or the result would be +/-Inf, it
3126    produces a :ref:`poison value <poisonvalues>` instead.
3127
3128 ``nsz``
3129    No Signed Zeros - Allow optimizations to treat the sign of a zero
3130    argument or result as insignificant. This does not imply that -0.0
3131    is poison and/or guaranteed to not exist in the operation.
3132
3133 ``arcp``
3134    Allow Reciprocal - Allow optimizations to use the reciprocal of an
3135    argument rather than perform division.
3136
3137 ``contract``
3138    Allow floating-point contraction (e.g. fusing a multiply followed by an
3139    addition into a fused multiply-and-add). This does not enable reassociating
   to form arbitrary contractions. For example, ``(a*b) + (c*d) + e`` cannot
3141    be transformed into ``(a*b) + ((c*d) + e)`` to create two fma operations.
3142
3143 ``afn``
3144    Approximate functions - Allow substitution of approximate calculations for
   functions (sin, log, sqrt, etc.). See floating-point intrinsic definitions
3146    for places where this can apply to LLVM's intrinsic math functions.
3147
3148 ``reassoc``
3149    Allow reassociation transformations for floating-point instructions.
   This may dramatically change floating-point results.
3151
3152 ``fast``
3153    This flag implies all of the others.
3154
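The flags are written between the opcode and the type. The following sketch
shows a few of them on individual instructions; the function names are
hypothetical and the comments describe transformations that the flags permit,
not ones that are guaranteed to happen.

.. code-block:: llvm

    ; 'contract' on both operations permits fusing them into a single
    ; fused multiply-and-add.
    define float @fma_candidate(float %a, float %b, float %c) {
      %mul = fmul contract float %a, %b
      %add = fadd contract float %mul, %c
      ret float %add
    }

    ; 'reassoc' permits evaluating this as %x + (%y + %z).
    define float @sum3(float %x, float %y, float %z) {
      %t = fadd reassoc float %x, %y
      %r = fadd reassoc float %t, %z
      ret float %r
    }

    ; 'arcp' permits computing %x * (1.0 / %y) instead of dividing.
    define float @recip(float %x, float %y) {
      %q = fdiv arcp float %x, %y
      ret float %q
    }

    ; With 'nnan', the result is poison if %x, %y, or the sum is NaN.
    define float @no_nans(float %x, float %y) {
      %r = fadd nnan float %x, %y
      ret float %r
    }
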
3155 .. _uselistorder:
3156
3157 Use-list Order Directives
3158 -------------------------
3159
3160 Use-list directives encode the in-memory order of each use-list, allowing the
3161 order to be recreated. ``<order-indexes>`` is a comma-separated list of
3162 indexes that are assigned to the referenced value's uses. The referenced
3163 value's use-list is immediately sorted by these indexes.
3164
3165 Use-list directives may appear at function scope or global scope. They are not
3166 instructions, and have no effect on the semantics of the IR. When they're at
3167 function scope, they must appear after the terminator of the final basic block.
3168
3169 If basic blocks have their address taken via ``blockaddress()`` expressions,
3170 ``uselistorder_bb`` can be used to reorder their use-lists from outside their
3171 function's scope.
3172
3173 :Syntax:
3174
3175 ::
3176
3177     uselistorder <ty> <value>, { <order-indexes> }
    uselistorder_bb @function, %block, { <order-indexes> }
3179
3180 :Examples:
3181
3182 ::
3183
3184     define void @foo(i32 %arg1, i32 %arg2) {
3185     entry:
3186       ; ... instructions ...
3187     bb:
3188       ; ... instructions ...
3189
3190       ; At function scope.
3191       uselistorder i32 %arg1, { 1, 0, 2 }
3192       uselistorder label %bb, { 1, 0 }
3193     }
3194
3195     ; At global scope.
3196     uselistorder i32* @global, { 1, 2, 0 }
3197     uselistorder i32 7, { 1, 0 }
3198     uselistorder i32 (i32) @bar, { 1, 0 }
3199     uselistorder_bb @foo, %bb, { 5, 1, 3, 2, 0, 4 }
3200
3201 .. _source_filename:
3202
3203 Source Filename
3204 ---------------
3205
3206 The *source filename* string is set to the original module identifier,
3207 which will be the name of the compiled source file when compiling from
3208 source through the clang front end, for example. It is then preserved through
3209 the IR and bitcode.
3210
3211 This is currently necessary to generate a consistent unique global
3212 identifier for local functions used in profile data, which prepends the
3213 source file name to the local function name.
3214
3215 The syntax for the source file name is simply:
3216
3217 .. code-block:: text
3218
3219     source_filename = "/path/to/source.c"
3220
3221 .. _typesystem:
3222
3223 Type System
3224 ===========
3225
3226 The LLVM type system is one of the most important features of the
3227 intermediate representation. Being typed enables a number of
3228 optimizations to be performed on the intermediate representation
3229 directly, without having to do extra analyses on the side before the
3230 transformation. A strong type system makes it easier to read the
3231 generated code and enables novel analyses and transformations that are
not feasible to perform on normal three-address code representations.
3233
3234 .. _t_void:
3235
3236 Void Type
3237 ---------
3238
3239 :Overview:
3240
3241
3242 The void type does not represent any value and has no size.
3243
3244 :Syntax:
3245
3246
3247 ::
3248
3249       void
3250
3251
3252 .. _t_function:
3253
3254 Function Type
3255 -------------
3256
3257 :Overview:
3258
3259
3260 The function type can be thought of as a function signature. It consists of a
3261 return type and a list of formal parameter types. The return type of a function
3262 type is a void type or first class type --- except for :ref:`label <t_label>`
3263 and :ref:`metadata <t_metadata>` types.
3264
3265 :Syntax:
3266
3267 ::
3268
3269       <returntype> (<parameter list>)
3270
3271 ...where '``<parameter list>``' is a comma-separated list of type
3272 specifiers. Optionally, the parameter list may include a type ``...``, which
3273 indicates that the function takes a variable number of arguments. Variable
3274 argument functions can access their arguments with the :ref:`variable argument
3275 handling intrinsic <int_varargs>` functions. '``<returntype>``' is any type
3276 except :ref:`label <t_label>` and :ref:`metadata <t_metadata>`.
3277
3278 :Examples:
3279
3280 +---------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------+
3281 | ``i32 (i32)``                   | function taking an ``i32``, returning an ``i32``                                                                                                                    |
3282 +---------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------+
3283 | ``float (i16, i32 *) *``        | :ref:`Pointer <t_pointer>` to a function that takes an ``i16`` and a :ref:`pointer <t_pointer>` to ``i32``, returning ``float``.                                    |
3284 +---------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------+
3285 | ``i32 (i8*, ...)``              | A vararg function that takes at least one :ref:`pointer <t_pointer>` to ``i8`` (char in C), which returns an integer. This is the signature for ``printf`` in LLVM. |
3286 +---------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------+
3287 | ``{i32, i32} (i32)``            | A function taking an ``i32``, returning a :ref:`structure <t_struct>` containing two ``i32`` values                                                                 |
3288 +---------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------+
3289
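As a small sketch of where function types appear in practice, the module below
declares ``printf`` with the vararg signature from the table and calls it. The
global ``@fmt`` and the argument value are illustrative only.

.. code-block:: llvm

    @fmt = private constant [4 x i8] c"%d\0A\00"

    ; The function type of @printf is 'i32 (i8*, ...)'.
    declare i32 @printf(i8*, ...)

    define i32 @main() {
      %p = getelementptr inbounds [4 x i8], [4 x i8]* @fmt, i64 0, i64 0
      ; Calls to vararg functions spell out the full function type.
      %r = call i32 (i8*, ...) @printf(i8* %p, i32 7)
      ret i32 0
    }
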
3290 .. _t_firstclass:
3291
3292 First Class Types
3293 -----------------
3294
3295 The :ref:`first class <t_firstclass>` types are perhaps the most important.
3296 Values of these types are the only ones which can be produced by
3297 instructions.
3298
3299 .. _t_single_value:
3300
3301 Single Value Types
3302 ^^^^^^^^^^^^^^^^^^
3303
3304 These are the types that are valid in registers from CodeGen's perspective.
3305
3306 .. _t_integer:
3307
3308 Integer Type
3309 """"""""""""
3310
3311 :Overview:
3312
The integer type is a simple type that specifies an arbitrary bit width
for the desired integer. Any bit width from 1 bit to
2\ :sup:`23`\ (about 8 million) can be specified.
3316
3317 :Syntax:
3318
3319 ::
3320
3321       iN
3322
3323 The number of bits the integer will occupy is specified by the ``N``
3324 value.
3325
:Examples:
3328
3329 +----------------+------------------------------------------------+
3330 | ``i1``         | a single-bit integer.                          |
3331 +----------------+------------------------------------------------+
3332 | ``i32``        | a 32-bit integer.                              |
3333 +----------------+------------------------------------------------+
3334 | ``i1942652``   | a really big integer of over 1 million bits.   |
3335 +----------------+------------------------------------------------+
3336
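Because any width is allowed, conversions between unusual widths use the same
instructions as conversions between common ones. A brief sketch, with
hypothetical function names:

.. code-block:: llvm

    ; Keep only the lowest bit of a 7-bit integer.
    define i1 @low_bit(i7 %x) {
      %bit = trunc i7 %x to i1
      ret i1 %bit
    }

    ; Zero-extend a 32-bit integer to 128 bits.
    define i128 @widen(i32 %x) {
      %w = zext i32 %x to i128
      ret i128 %w
    }
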
3337 .. _t_floating:
3338
3339 Floating-Point Types
3340 """"""""""""""""""""
3341
3342 .. list-table::
3343    :header-rows: 1
3344
3345    * - Type
3346      - Description
3347
3348    * - ``half``
3349      - 16-bit floating-point value
3350
3351    * - ``bfloat``
3352      - 16-bit "brain" floating-point value (7-bit significand).  Provides the
3353        same number of exponent bits as ``float``, so that it matches its dynamic
3354        range, but with greatly reduced precision.  Used in Intel's AVX-512 BF16
3355        extensions and Arm's ARMv8.6-A extensions, among others.
3356
3357    * - ``float``
3358      - 32-bit floating-point value
3359
3360    * - ``double``
3361      - 64-bit floating-point value
3362
3363    * - ``fp128``
3364      - 128-bit floating-point value (113-bit significand)
3365
3366    * - ``x86_fp80``
3367      -  80-bit floating-point value (X87)
3368
3369    * - ``ppc_fp128``
     - 128-bit floating-point value (two 64-bit values)
3371
The binary formats of half, float, double, and fp128 correspond to the
IEEE-754-2008 specifications for binary16, binary32, binary64, and binary128,
respectively.
3375
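Values move between these types with the floating-point conversion
instructions. The following is a minimal sketch with hypothetical function
names:

.. code-block:: llvm

    ; Widen a half to a double in two steps.
    define double @half_to_double(half %h) {
      %f = fpext half %h to float
      %d = fpext float %f to double
      ret double %d
    }

    ; Narrow a float to bfloat: the exponent range is kept, while the
    ; precision is reduced.
    define bfloat @float_to_bfloat(float %f) {
      %b = fptrunc float %f to bfloat
      ret bfloat %b
    }
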
3376 X86_amx Type
3377 """"""""""""
3378
3379 :Overview:
3380
The x86_amx type represents a value held in an AMX tile register on an x86
machine. The operations allowed on it are quite limited. Only a few
intrinsics are allowed: stride load and store, zero and dot product. No
instruction is allowed for this type. There are no arguments, arrays,
pointers, vectors or constants of this type.
3386
3387 :Syntax:
3388
3389 ::
3390
3391       x86_amx
3392
3393
3394 X86_mmx Type
3395 """"""""""""
3396
3397 :Overview:
3398
3399 The x86_mmx type represents a value held in an MMX register on an x86
3400 machine. The operations allowed on it are quite limited: parameters and
3401 return values, load and store, and bitcast. User-specified MMX
3402 instructions are represented as intrinsic or asm calls with arguments
3403 and/or results of this type. There are no arrays, vectors or constants
3404 of this type.
3405
3406 :Syntax:
3407
3408 ::
3409
3410       x86_mmx
3411
3412
3413 .. _t_pointer:
3414
3415 Pointer Type
3416 """"""""""""
3417
3418 :Overview:
3419
3420 The pointer type is used to specify memory locations. Pointers are
3421 commonly used to reference objects in memory.
3422
3423 Pointer types may have an optional address space attribute defining the
3424 numbered address space where the pointed-to object resides. The default
3425 address space is number zero. The semantics of non-zero address spaces
3426 are target-specific.
3427
3428 Note that LLVM does not permit pointers to void (``void*``) nor does it
3429 permit pointers to labels (``label*``). Use ``i8*`` instead.
3430
3431 LLVM is in the process of transitioning to
3432 `opaque pointers <OpaquePointers.html#opaque-pointers>`_.
Opaque pointers do not have a pointee type. Rather, instructions
interacting through pointers specify the type of the underlying memory
they are interacting with. Support for opaque pointers is still in
progress and is not yet complete.
3437
3438 :Syntax:
3439
3440 ::
3441
3442       <type> *
3443       ptr
3444
3445 :Examples:
3446
3447 +-------------------------+--------------------------------------------------------------------------------------------------------------+
3448 | ``[4 x i32]*``          | A :ref:`pointer <t_pointer>` to :ref:`array <t_array>` of four ``i32`` values.                               |
3449 +-------------------------+--------------------------------------------------------------------------------------------------------------+
3450 | ``i32 (i32*) *``        | A :ref:`pointer <t_pointer>` to a :ref:`function <t_function>` that takes an ``i32*``, returning an ``i32``. |
3451 +-------------------------+--------------------------------------------------------------------------------------------------------------+
3452 | ``i32 addrspace(5)*``   | A :ref:`pointer <t_pointer>` to an ``i32`` value that resides in address space 5.                            |
3453 +-------------------------+--------------------------------------------------------------------------------------------------------------+
3454 | ``ptr``                 | An opaque pointer type to a value that resides in address space 0.                                           |
3455 +-------------------------+--------------------------------------------------------------------------------------------------------------+
3456 | ``ptr addrspace(5)``    | An opaque pointer type to a value that resides in address space 5.                                           |
3457 +-------------------------+--------------------------------------------------------------------------------------------------------------+
3458
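The following sketch loads through a pointer in the default address space and
through a pointer in address space 5, using the typed-pointer syntax from the
table above. The function name is hypothetical, and the meaning of address
space 5 depends on the target.

.. code-block:: llvm

    define i32 @sum_two(i32* %p, i32 addrspace(5)* %q) {
      ; Address space 0 is the default and is not written out.
      %a = load i32, i32* %p, align 4
      ; The address space is part of the pointer type.
      %b = load i32, i32 addrspace(5)* %q, align 4
      %s = add i32 %a, %b
      ret i32 %s
    }
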
3459 .. _t_vector:
3460
3461 Vector Type
3462 """""""""""
3463
3464 :Overview:
3465
3466 A vector type is a simple derived type that represents a vector of
elements. Vector types are used when multiple primitive data values are
operated on in parallel using a single instruction (SIMD). A vector type
3469 requires a size (number of elements), an underlying primitive data type,
3470 and a scalable property to represent vectors where the exact hardware
3471 vector length is unknown at compile time. Vector types are considered
3472 :ref:`first class <t_firstclass>`.
3473
3474 :Memory Layout:
3475
In general vector elements are laid out in memory in the same way as
:ref:`array types <t_array>`. Such an analogy works fine as long as the vector
elements are byte sized. However, when the elements of the vector aren't byte
sized it gets a bit more complicated. One way to describe the layout is by
describing what happens when a vector such as ``<N x iM>`` is bitcast to an
integer type with ``N*M`` bits, and then following the rules for storing such
an integer to memory.

A bitcast from a vector type to a scalar integer type will see the elements
being packed together (without padding). The order in which elements are
inserted in the integer depends on endianness. For little endian, element zero
is put in the least significant bits of the integer, and for big endian,
element zero is put in the most significant bits.
3489
3490 Using a vector such as ``<i4 1, i4 2, i4 3, i4 5>`` as an example, together
3491 with the analogy that we can replace a vector store by a bitcast followed by
3492 an integer store, we get this for big endian:
3493
3494 .. code-block:: llvm
3495
3496       %val = bitcast <4 x i4> <i4 1, i4 2, i4 3, i4 5> to i16
3497
3498       ; Bitcasting from a vector to an integral type can be seen as
3499       ; concatenating the values:
3500       ;   %val now has the hexadecimal value 0x1235.
3501
3502       store i16 %val, i16* %ptr
3503
3504       ; In memory the content will be (8-bit addressing):
3505       ;
3506       ;    [%ptr + 0]: 00010010  (0x12)
3507       ;    [%ptr + 1]: 00110101  (0x35)
3508
3509 The same example for little endian:
3510
3511 .. code-block:: llvm
3512
3513       %val = bitcast <4 x i4> <i4 1, i4 2, i4 3, i4 5> to i16
3514
3515       ; Bitcasting from a vector to an integral type can be seen as
3516       ; concatenating the values:
3517       ;   %val now has the hexadecimal value 0x5321.
3518
3519       store i16 %val, i16* %ptr
3520
3521       ; In memory the content will be (8-bit addressing):
3522       ;
      ;    [%ptr + 0]: 00100001  (0x21)
      ;    [%ptr + 1]: 01010011  (0x53)
3525
When ``N*M`` isn't evenly divisible by the byte size, the exact memory layout
is unspecified (just like it is for an integral type of the same size). This
is because different targets could put the padding at different positions when
the type size is smaller than the type's store size.
3530
3531 :Syntax:
3532
3533 ::
3534
3535       < <# elements> x <elementtype> >          ; Fixed-length vector
3536       < vscale x <# elements> x <elementtype> > ; Scalable vector
3537
3538 The number of elements is a constant integer value larger than 0;
3539 elementtype may be any integer, floating-point or pointer type. Vectors
3540 of size zero are not allowed. For scalable vectors, the total number of
3541 elements is a constant multiple (called vscale) of the specified number
3542 of elements; vscale is a positive integer that is unknown at compile time
3543 and the same hardware-dependent constant for all scalable vectors at run
3544 time. The size of a specific scalable vector type is thus constant within
3545 IR, even if the exact size in bytes cannot be determined until run time.
3546
3547 :Examples:
3548
3549 +------------------------+----------------------------------------------------+
3550 | ``<4 x i32>``          | Vector of 4 32-bit integer values.                 |
3551 +------------------------+----------------------------------------------------+
3552 | ``<8 x float>``        | Vector of 8 32-bit floating-point values.          |
3553 +------------------------+----------------------------------------------------+
3554 | ``<2 x i64>``          | Vector of 2 64-bit integer values.                 |
3555 +------------------------+----------------------------------------------------+
3556 | ``<4 x i64*>``         | Vector of 4 pointers to 64-bit integer values.     |
3557 +------------------------+----------------------------------------------------+
3558 | ``<vscale x 4 x i32>`` | Vector with a multiple of 4 32-bit integer values. |
3559 +------------------------+----------------------------------------------------+
3560
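Ordinary arithmetic instructions accept both fixed-length and scalable vectors
and operate element-wise. A short sketch, with hypothetical function names:

.. code-block:: llvm

    ; Four 32-bit additions expressed as one instruction.
    define <4 x i32> @add_fixed(<4 x i32> %a, <4 x i32> %b) {
      %r = add <4 x i32> %a, %b
      ret <4 x i32> %r
    }

    ; The same operation on vscale x 4 elements; vscale is not known
    ; until run time.
    define <vscale x 4 x i32> @add_scalable(<vscale x 4 x i32> %a,
                                            <vscale x 4 x i32> %b) {
      %r = add <vscale x 4 x i32> %a, %b
      ret <vscale x 4 x i32> %r
    }
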
3561 .. _t_label:
3562
3563 Label Type
3564 ^^^^^^^^^^
3565
3566 :Overview:
3567
3568 The label type represents code labels.
3569
3570 :Syntax:
3571
3572 ::
3573
3574       label
3575
3576 .. _t_token:
3577
3578 Token Type
3579 ^^^^^^^^^^
3580
3581 :Overview:
3582
3583 The token type is used when a value is associated with an instruction
3584 but all uses of the value must not attempt to introspect or obscure it.
3585 As such, it is not appropriate to have a :ref:`phi <i_phi>` or
3586 :ref:`select <i_select>` of type token.
3587
3588 :Syntax:
3589
3590 ::
3591
3592       token
3593
3594
3595
3596 .. _t_metadata:
3597
3598 Metadata Type
3599 ^^^^^^^^^^^^^
3600
3601 :Overview:
3602
3603 The metadata type represents embedded metadata. No derived types may be
3604 created from metadata except for :ref:`function <t_function>` arguments.
3605
3606 :Syntax:
3607
3608 ::
3609
3610       metadata
3611
3612 .. _t_aggregate:
3613
3614 Aggregate Types
3615 ^^^^^^^^^^^^^^^
3616
3617 Aggregate Types are a subset of derived types that can contain multiple
3618 member types. :ref:`Arrays <t_array>` and :ref:`structs <t_struct>` are
3619 aggregate types. :ref:`Vectors <t_vector>` are not considered to be
3620 aggregate types.
3621
3622 .. _t_array:
3623
3624 Array Type
3625 """"""""""
3626
3627 :Overview:
3628
3629 The array type is a very simple derived type that arranges elements
3630 sequentially in memory. The array type requires a size (number of
3631 elements) and an underlying data type.
3632
3633 :Syntax:
3634
3635 ::
3636
3637       [<# elements> x <elementtype>]
3638
3639 The number of elements is a constant integer value; ``elementtype`` may
3640 be any type with a size.
3641
3642 :Examples:
3643
3644 +------------------+--------------------------------------+
3645 | ``[40 x i32]``   | Array of 40 32-bit integer values.   |
3646 +------------------+--------------------------------------+
3647 | ``[41 x i32]``   | Array of 41 32-bit integer values.   |
3648 +------------------+--------------------------------------+
3649 | ``[4 x i8]``     | Array of 4 8-bit integer values.     |
3650 +------------------+--------------------------------------+
3651
3652 Here are some examples of multidimensional arrays:
3653
3654 +-----------------------------+----------------------------------------------------------+
3655 | ``[3 x [4 x i32]]``         | 3x4 array of 32-bit integer values.                      |
3656 +-----------------------------+----------------------------------------------------------+
3657 | ``[12 x [10 x float]]``     | 12x10 array of single precision floating-point values.   |
3658 +-----------------------------+----------------------------------------------------------+
3659 | ``[2 x [3 x [4 x i16]]]``   | 2x3x4 array of 16-bit integer values.                    |
3660 +-----------------------------+----------------------------------------------------------+
3661
3662 There is no restriction on indexing beyond the end of the array implied
3663 by a static type (though there are restrictions on indexing beyond the
3664 bounds of an allocated object in some cases). This means that
3665 single-dimension 'variable sized array' addressing can be implemented in
LLVM with a zero length array type. An implementation of 'Pascal style
arrays' in LLVM could use the type "``{ i32, [0 x float]}``", for
example.
3669
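The Pascal-style layout described above might be indexed as in the following
sketch. The type name ``%pascal_array`` and the function ``@element`` are
hypothetical, and the caller is assumed to have allocated enough storage
after the length field for index ``%i`` to be in bounds of the allocation.

.. code-block:: llvm

    ; A length field followed by a variable number of floats.
    %pascal_array = type { i32, [0 x float] }

    define float @element(%pascal_array* %arr, i64 %i) {
      ; Step into field 1 (the zero length array), then to element %i.
      %p = getelementptr %pascal_array, %pascal_array* %arr, i64 0, i32 1, i64 %i
      %v = load float, float* %p, align 4
      ret float %v
    }
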
3670 .. _t_struct:
3671
3672 Structure Type
3673 """"""""""""""
3674
3675 :Overview:
3676
3677 The structure type is used to represent a collection of data members
3678 together in memory. The elements of a structure may be any type that has
3679 a size.
3680
3681 Structures in memory are accessed using '``load``' and '``store``' by
3682 getting a pointer to a field with the '``getelementptr``' instruction.
3683 Structures in registers are accessed using the '``extractvalue``' and
3684 '``insertvalue``' instructions.
3685
Structures may optionally be "packed" structures, which indicates that
3687 the alignment of the struct is one byte, and that there is no padding
3688 between the elements. In non-packed structs, padding between field types
3689 is inserted as defined by the DataLayout string in the module, which is
3690 required to match what the underlying code generator expects.
3691
3692 Structures can either be "literal" or "identified". A literal structure
3693 is defined inline with other types (e.g. ``{i32, i32}*``) whereas
3694 identified types are always defined at the top level with a name.
3695 Literal types are uniqued by their contents and can never be recursive
or opaque since there is no way to write one. Identified types can be
recursive, can be opaque, and are never uniqued.
3698
3699 :Syntax:
3700
3701 ::
3702
3703       %T1 = type { <type list> }     ; Identified normal struct type
3704       %T2 = type <{ <type list> }>   ; Identified packed struct type
3705
3706 :Examples:
3707
3708 +------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
3709 | ``{ i32, i32, i32 }``        | A triple of three ``i32`` values                                                                                                                                                      |
3710 +------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
3711 | ``{ float, i32 (i32) * }``   | A pair, where the first element is a ``float`` and the second element is a :ref:`pointer <t_pointer>` to a :ref:`function <t_function>` that takes an ``i32``, returning an ``i32``.  |
3712 +------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
3713 | ``<{ i8, i32 }>``            | A packed struct known to be 5 bytes in size.                                                                                                                                          |
3714 +------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
3715
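The two access styles described in the overview can be sketched as follows.
The identified type ``%node`` and both function names are hypothetical.

.. code-block:: llvm

    ; An identified, recursive structure type.
    %node = type { i32, %node* }

    ; Structure in memory: compute a field address, then load from it.
    define i32 @node_value(%node* %n) {
      %p = getelementptr inbounds %node, %node* %n, i32 0, i32 0
      %v = load i32, i32* %p, align 4
      ret i32 %v
    }

    ; Structure in registers: a literal struct handled with
    ; extractvalue and insertvalue.
    define { i32, i32 } @swap({ i32, i32 } %pair) {
      %a = extractvalue { i32, i32 } %pair, 0
      %b = extractvalue { i32, i32 } %pair, 1
      %t = insertvalue { i32, i32 } undef, i32 %b, 0
      %r = insertvalue { i32, i32 } %t, i32 %a, 1
      ret { i32, i32 } %r
    }
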
3716 .. _t_opaque:
3717
3718 Opaque Structure Types
3719 """"""""""""""""""""""
3720
3721 :Overview:
3722
3723 Opaque structure types are used to represent structure types that
3724 do not have a body specified. This corresponds (for example) to the C
3725 notion of a forward declared structure. They can be named (``%X``) or
3726 unnamed (``%52``).
3727
3728 :Syntax:
3729
3730 ::
3731
3732       %X = type opaque
3733       %52 = type opaque
3734
3735 :Examples:
3736
3737 +--------------+-------------------+
3738 | ``opaque``   | An opaque type.   |
3739 +--------------+-------------------+
3740
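A typical use is passing around a handle whose layout is known only to another
module. In this sketch the names ``%handle``, ``@open_handle`` and
``@close_handle`` are hypothetical:

.. code-block:: llvm

    ; The body of %handle is not specified in this module.
    %handle = type opaque

    declare %handle* @open_handle()
    declare void @close_handle(%handle*)

    define void @use_handle() {
      ; Pointers to the opaque type can be passed around, but the
      ; structure itself cannot be loaded, stored or indexed here.
      %h = call %handle* @open_handle()
      call void @close_handle(%handle* %h)
      ret void
    }
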
3741 .. _constants:
3742
3743 Constants
3744 =========
3745
3746 LLVM has several different basic types of constants. This section
3747 describes them all and their syntax.
3748
3749 Simple Constants
3750 ----------------
3751
3752 **Boolean constants**
3753     The two strings '``true``' and '``false``' are both valid constants
3754     of the ``i1`` type.
3755 **Integer constants**
3756     Standard integers (such as '4') are constants of the
3757     :ref:`integer <t_integer>` type. Negative numbers may be used with
3758     integer types.
3759 **Floating-point constants**
3760     Floating-point constants use standard decimal notation (e.g.
3761     123.421), exponential notation (e.g. 1.23421e+2), or a more precise
3762     hexadecimal notation (see below). The assembler requires the exact
3763     decimal value of a floating-point constant. For example, the
3764     assembler accepts 1.25 but rejects 1.3 because 1.3 is a repeating
3765     decimal in binary. Floating-point constants must have a
3766     :ref:`floating-point <t_floating>` type.
3767 **Null pointer constants**
3768     The identifier '``null``' is recognized as a null pointer constant
3769     and must be of :ref:`pointer type <t_pointer>`.
3770 **Token constants**
3771     The identifier '``none``' is recognized as an empty token constant
3772     and must be of :ref:`token type <t_token>`.
3773
3774 The one non-intuitive notation for constants is the hexadecimal form of
3775 floating-point constants. For example, the form
3776 '``double    0x432ff973cafa8000``' is equivalent to (but harder to read
3777 than) '``double 4.5e+15``'. The only time hexadecimal floating-point
3778 constants are required (and the only time that they are generated by the
3779 disassembler) is when a floating-point constant must be emitted but it
3780 cannot be represented as a decimal floating-point number in a reasonable
3781 number of digits. For example, NaN's, infinities, and other special
3782 values are represented in their IEEE hexadecimal format so that assembly
3783 and disassembly do not cause any bits to change in the constants.
3784
3785 When using the hexadecimal form, constants of types bfloat, half, float, and
3786 double are represented using the 16-digit form shown above (which matches the
3787 IEEE754 representation for double); bfloat, half and float values must, however,
3788 be exactly representable as bfloat, IEEE 754 half, and IEEE 754 single
3789 precision respectively. Hexadecimal format is always used for long double, and
3790 there are three forms of long double. The 80-bit format used by x86 is
3791 represented as ``0xK`` followed by 20 hexadecimal digits. The 128-bit format
3792 used by PowerPC (two adjacent doubles) is represented by ``0xM`` followed by 32
3793 hexadecimal digits. The IEEE 128-bit format is represented by ``0xL`` followed
3794 by 32 hexadecimal digits. Long doubles will only work if they match the long
3795 double format on your target.  The IEEE 16-bit format (half precision) is
3796 represented by ``0xH`` followed by 4 hexadecimal digits. The bfloat 16-bit
3797 format is represented by ``0xR`` followed by 4 hexadecimal digits. All
3798 hexadecimal formats are big-endian (sign bit at the left).
3799
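The forms described above might look as follows in a module. The global names
are hypothetical, and the bit patterns other than the ``4.5e+15`` example were
worked out by hand from the respective encodings of 1.0, 1.25 and infinity.

.. code-block:: llvm

    ; Decimal and hexadecimal spellings of the same double values.
    @d_decimal = global double 1.25
    @d_hex     = global double 0x3FF4000000000000
    @d_large   = global double 0x432ff973cafa8000   ; 4.5e+15

    ; A float written in the 16-digit double form: positive infinity.
    @f_inf     = global float 0x7FF0000000000000

    ; 1.0 in the prefixed 16-bit and 80-bit forms.
    @h_one     = global half 0xH3C00
    @b_one     = global bfloat 0xR3F80
    @x_one     = global x86_fp80 0xK3FFF8000000000000000
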
There are no constants of type x86_mmx or x86_amx.
3801
3802 .. _complexconstants:
3803
3804 Complex Constants
3805 -----------------
3806
3807 Complex constants are a (potentially recursive) combination of simple
3808 constants and smaller complex constants.
3809
3810 **Structure constants**
3811     Structure constants are represented with notation similar to
3812     structure type definitions (a comma separated list of elements,
3813     surrounded by braces (``{}``)). For example:
3814     "``{ i32 4, float 17.0, i32* @G }``", where "``@G``" is declared as
3815     "``@G = external global i32``". Structure constants must have
3816     :ref:`structure type <t_struct>`, and the number and types of elements
3817     must match those specified by the type.
3818 **Array constants**
3819     Array constants are represented with notation similar to array type
3820     definitions (a comma separated list of elements, surrounded by
3821     square brackets (``[]``)). For example:
3822     "``[ i32 42, i32 11, i32 74 ]``". Array constants must have
3823     :ref:`array type <t_array>`, and the number and types of elements must
3824     match those specified by the type. As a special case, character array
3825     constants may also be represented as a double-quoted string using the ``c``
3826     prefix. For example: "``c"Hello World\0A\00"``".
3827 **Vector constants**
3828     Vector constants are represented with notation similar to vector
3829     type definitions (a comma separated list of elements, surrounded by
3830     less-than/greater-than's (``<>``)). For example:
3831     "``< i32 42, i32 11, i32 74, i32 100 >``". Vector constants
3832     must have :ref:`vector type <t_vector>`, and the number and types of
3833     elements must match those specified by the type.
3834 **Zero initialization**
3835     The string '``zeroinitializer``' can be used to zero initialize a
3836     value of *any* type, including scalar and
3837     :ref:`aggregate <t_aggregate>` types. This is often used to avoid
3838     having to print large zero initializers (e.g. for large arrays) and
3839     is always exactly equivalent to using explicit zero initializers.
3840 **Metadata node**
3841     A metadata node is a constant tuple without types. For example:
3842     "``!{!0, !{!2, !0}, !"test"}``". Metadata can reference constant values,
3843     for example: "``!{!0, i32 0, i8* @global, i64 (i64)* @function, !"str"}``".
3844     Unlike other typed constants that are meant to be interpreted as part of
3845     the instruction stream, metadata is a place to attach additional
3846     information such as debug info.
3847
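As a minimal sketch of the constant forms described above, the following
module fragment uses each one as a global initializer. The global names
(``@S``, ``@A``, ``@Str``, ``@V``, ``@Big``, ``@Zero``) are chosen here for
illustration only:

.. code-block:: llvm

    @G = external global i32

    ; Structure constant: element count and types match { i32, float, i32* }.
    @S = global { i32, float, i32* } { i32 4, float 17.0, i32* @G }

    ; Array constant, and a character array written in the c"..." string form
    ; (11 characters plus the two escaped bytes gives 13 elements).
    @A = global [3 x i32] [ i32 42, i32 11, i32 74 ]
    @Str = constant [13 x i8] c"Hello World\0A\00"

    ; Vector constant.
    @V = global <4 x i32> < i32 42, i32 11, i32 74, i32 100 >

    ; zeroinitializer applies to aggregates and scalars alike.
    @Big = global [1024 x i32] zeroinitializer
    @Zero = global i32 zeroinitializer
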
3848 Global Variable and Function Addresses
3849 --------------------------------------
3850
3851 The addresses of :ref:`global variables <globalvars>` and
3852 :ref:`functions <functionstructure>` are always implicitly valid
3853 (link-time) constants. These constants are explicitly referenced when
3854 the :ref:`identifier for the global <identifiers>` is used and always have
3855 :ref:`pointer <t_pointer>` type. For example, the following is a legal LLVM
3856 file:
3857
3858 .. code-block:: llvm
3859
3860     @X = global i32 17
3861     @Y = global i32 42
3862     @Z = global [2 x i32*] [ i32* @X, i32* @Y ]
3863
3864 .. _undefvalues:
3865
3866 Undefined Values
3867 ----------------
3868
3869 The string '``undef``' can be used anywhere a constant is expected, and
3870 indicates that the user of the value may receive an unspecified
3871 bit-pattern. Undefined values may be of any type other than '``label``'
3872 or '``void``'.
3873
3874 Undefined values are useful because they indicate to the compiler that
3875 the program is well defined no matter what value is used. This gives the
3876 compiler more freedom to optimize. Here are some examples of
3877 (potentially surprising) transformations that are valid (in pseudo IR):
3878
3879 .. code-block:: llvm
3880
3881       %A = add %X, undef
3882       %B = sub %X, undef
3883       %C = xor %X, undef
3884     Safe:
3885       %A = undef
3886       %B = undef
3887       %C = undef
3888
3889 This is safe because all of the output bits are affected by the undef
3890 bits. Any output bit can have a zero or one depending on the input bits.
3891
3892 .. code-block:: llvm
3893
3894       %A = or %X, undef
3895       %B = and %X, undef
3896     Safe:
3897       %A = -1
3898       %B = 0
3899     Safe:
3900       %A = %X  ;; By choosing undef as 0
3901       %B = %X  ;; By choosing undef as -1
3902     Unsafe:
3903       %A = undef
3904       %B = undef
3905
3906 These logical operations have bits that are not always affected by the
3907 input. For example, if ``%X`` has a zero bit, then the output of the
3908 '``and``' operation will always be a zero for that bit, no matter what
3909 the corresponding bit from the '``undef``' is. As such, it is unsafe to
3910 optimize or assume that the result of the '``and``' is '``undef``'.
3911 However, it is safe to assume that all bits of the '``undef``' could be
3912 0, and optimize the '``and``' to 0. Likewise, it is safe to assume that
3913 all the bits of the '``undef``' operand to the '``or``' could be set,
3914 allowing the '``or``' to be folded to -1.
3915
3916 .. code-block:: llvm
3917
3918       %A = select undef, %X, %Y
3919       %B = select undef, 42, %Y
3920       %C = select %X, %Y, undef
3921     Safe:
3922       %A = %X     (or %Y)
3923       %B = 42     (or %Y)
3924       %C = %Y
3925     Unsafe:
3926       %A = undef
3927       %B = undef
3928       %C = undef
3929
3930 This set of examples shows that undefined '``select``' (and conditional
3931 branch) conditions can go *either way*, but they have to come from one
3932 of the two operands. In the ``%A`` example, if ``%X`` and ``%Y`` were
3933 both known to have a clear low bit, then ``%A`` would have to have a
3934 cleared low bit. However, in the ``%C`` example, the optimizer is
3935 allowed to assume that the '``undef``' operand could be the same as
3936 ``%Y``, allowing the whole '``select``' to be eliminated.
3937
3938 .. code-block:: llvm
3939
3940       %A = xor undef, undef
3941
3942       %B = undef
3943       %C = xor %B, %B
3944
3945       %D = undef
3946       %E = icmp slt %D, 4
3947       %F = icmp sge %D, 4
3948
3949     Safe:
3950       %A = undef
3951       %B = undef
3952       %C = undef
3953       %D = undef
3954       %E = undef
3955       %F = undef
3956
3957 This example points out that two '``undef``' operands are not
3958 necessarily the same. This matches C semantics, but it can surprise people
3959 who assume that "``X^X``" is always zero, even if
3960 ``X`` is undefined. This isn't true for a number of reasons, but the
3961 short answer is that an '``undef``' "variable" can arbitrarily change
3962 its value over its "live range". This is true because the variable
3963 doesn't actually *have a live range*. Instead, the value is logically
3964 read from arbitrary registers that happen to be around when needed, so
3965 the value is not necessarily consistent over time. In fact, ``%A`` and
3966 ``%C`` need to have the same semantics or the core LLVM "replace all
3967 uses with" concept would not hold.
3968
3969 To ensure all uses of a given register observe the same value (even if
3970 '``undef``'), the :ref:`freeze instruction <i_freeze>` can be used.
3971
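As an illustrative sketch (the function names are hypothetical), the two
functions below contrast an ``xor`` of two separate ``undef`` operands with an
``xor`` of a single frozen value:

.. code-block:: llvm

    define i32 @xor_undef() {
      %r = xor i32 undef, undef   ; each undef may differ, so %r may fold to undef
      ret i32 %r
    }

    define i32 @xor_frozen() {
      %f = freeze i32 undef       ; an arbitrary but fixed i32 value
      %r = xor i32 %f, %f         ; both uses observe the same value, so %r is 0
      ret i32 %r
    }
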
3972 .. code-block:: llvm
3973
3974       %A = sdiv undef, %X
3975       %B = sdiv %X, undef
3976     Safe:
3977       %A = 0
3978     b: unreachable
3979
3980 These examples show the crucial difference between an *undefined value*
3981 and *undefined behavior*. An undefined value (like '``undef``') is
3982 allowed to have an arbitrary bit-pattern. This means that the ``%A``
3983 operation can be constant folded to '``0``', because the '``undef``'
3984 could be zero, and zero divided by any value is zero.
3985 However, in the second example, we can make a more aggressive
3986 assumption: because the ``undef`` is allowed to be an arbitrary value,
3987 we are allowed to assume that it could be zero. Since a divide by zero
3988 has *undefined behavior*, we are allowed to assume that the operation
3989 does not execute at all. This allows us to delete the divide and all
3990 code after it. Because the undefined operation "can't happen", the
3991 optimizer can assume that it occurs in dead code.
3992
3993 .. code-block:: text
3994
3995     a:  store undef -> %X
3996     b:  store %X -> undef
3997     Safe:
3998     a: <deleted>
3999     b: unreachable
4000
4001 A store *of* an undefined value can be assumed to not have any effect;
4002 we can assume that the value is overwritten with bits that happen to
4003 match what was already there. However, a store *to* an undefined
4004 location could clobber arbitrary memory, therefore, it has undefined
4005 behavior.
4006
4007 Branching on an undefined value is undefined behavior.
4008 This explains optimizations that depend on branch conditions to construct
4009 predicates, such as Correlated Value Propagation and Global Value Numbering.
4010 In the case of a ``switch`` instruction, the branch condition should be frozen; otherwise
4011 it is undefined behavior.
4012
4013 .. code-block:: llvm
4014
4015     Unsafe:
4016       br undef, BB1, BB2 ; UB
4017
4018       %X = and i32 undef, 255
4019       switch %X, label %ret [ .. ] ; UB
4020
4021       store undef, i8* %ptr
4022       %X = load i8* %ptr ; %X is undef
4023       switch i8 %X, label %ret [ .. ] ; UB
4024
4025     Safe:
4026       %X = or i8 undef, 255 ; always 255
4027       switch i8 %X, label %ret [ .. ] ; Well-defined
4028
4029       %X = freeze i1 undef
4030       br %X, BB1, BB2 ; Well-defined (non-deterministic jump)
4031
4032
4033 This is also consistent with the behavior of MemorySanitizer.
4034 MemorySanitizer, a detector of uses of uninitialized memory,
4035 defines a branch whose condition depends on an undef value (or
4036 certain other values, e.g. the result of a load from heap-allocated
4037 memory that has never been stored to) to have an externally visible
4038 side effect. For this reason, functions with the *sanitize_memory*
4039 attribute are not allowed to produce such branches "out of thin
4040 air". More strictly, an optimization that inserts a conditional branch
4041 is only valid if in all executions where the branch condition has at
4042 least one undefined bit, the same branch condition is evaluated in the
4043 input IR as well.
4044
4045 .. _poisonvalues:
4046
4047 Poison Values
4048 -------------
4049
4050 A poison value is a result of an erroneous operation.
4051 In order to facilitate speculative execution, many instructions do not
4052 invoke immediate undefined behavior when provided with illegal operands,
4053 and return a poison value instead.
4054 The string '``poison``' can be used anywhere a constant is expected, and
4055 operations such as :ref:`add <i_add>` with the ``nsw`` flag can produce
4056 a poison value.
4057
4058 Poison value behavior is defined in terms of value *dependence*:
4059
4060 -  Values other than :ref:`phi <i_phi>` nodes, :ref:`select <i_select>`, and
4061    :ref:`freeze <i_freeze>` instructions depend on their operands.
4062 -  :ref:`Phi <i_phi>` nodes depend on the operand corresponding to
4063    their dynamic predecessor basic block.
4064 -  :ref:`Select <i_select>` instructions depend on their condition operand and
4065    their selected operand.
4066 -  Function arguments depend on the corresponding actual argument values
4067    in the dynamic callers of their functions.
4068 -  :ref:`Call <i_call>` instructions depend on the :ref:`ret <i_ret>`
4069    instructions that dynamically transfer control back to them.
4070 -  :ref:`Invoke <i_invoke>` instructions depend on the
4071    :ref:`ret <i_ret>`, :ref:`resume <i_resume>`, or exception-throwing
4072    call instructions that dynamically transfer control back to them.
4073 -  Non-volatile loads and stores depend on the most recent stores to all
4074    of the referenced memory addresses, following the order in the IR
4075    (including loads and stores implied by intrinsics such as
4076    :ref:`@llvm.memcpy <int_memcpy>`).
4077 -  An instruction with externally visible side effects depends on the
4078    most recent preceding instruction with externally visible side
4079    effects, following the order in the IR. (This includes :ref:`volatile
4080    operations <volatile>`.)
4081 -  An instruction *control-depends* on a :ref:`terminator
4082    instruction <terminators>` if the terminator instruction has
4083    multiple successors and the instruction is always executed when
4084    control transfers to one of the successors, and may not be executed
4085    when control is transferred to another.
4086 -  Additionally, an instruction also *control-depends* on a terminator
4087    instruction if the set of instructions it otherwise depends on would
4088    be different if the terminator had transferred control to a different
4089    successor.
4090 -  Dependence is transitive.
4091 -  Vector elements may be independently poisoned. Therefore, transforms
4092    on instructions such as shufflevector must be careful to propagate
4093    poison across values or elements only as allowed by the original code.
4094
4095 An instruction that *depends* on a poison value produces a poison value
4096 itself. A poison value may be relaxed into an
4097 :ref:`undef value <undefvalues>`, which takes an arbitrary bit-pattern.
4098 Propagation of poison can be stopped with the
4099 :ref:`freeze instruction <i_freeze>`.
4100
4101 This means that immediate undefined behavior occurs if a poison value is
4102 used as an instruction operand that has any values that trigger undefined
4103 behavior. Notably this includes (but is not limited to):
4104
4105 -  The pointer operand of a :ref:`load <i_load>`, :ref:`store <i_store>` or
4106    any other pointer dereferencing instruction (independent of address
4107    space).
4108 -  The divisor operand of a ``udiv``, ``sdiv``, ``urem`` or ``srem``
4109    instruction.
4110 -  The condition operand of a :ref:`br <i_br>` instruction.
4111 -  The callee operand of a :ref:`call <i_call>` or :ref:`invoke <i_invoke>`
4112    instruction.
4113 -  The parameter operand of a :ref:`call <i_call>` or :ref:`invoke <i_invoke>`
4114    instruction, when the function or invoking call site has a ``noundef``
4115    attribute in the corresponding position.
4116 -  The operand of a :ref:`ret <i_ret>` instruction if the function or invoking
4117    call site has a ``noundef`` attribute in the return value position.
4118
4119 Here are some examples:
4120
4121 .. code-block:: llvm
4122
4123     entry:
4124       %poison = sub nuw i32 0, 1           ; Results in a poison value.
4125       %poison2 = sub i32 poison, 1         ; Also results in a poison value.
4126       %still_poison = and i32 %poison, 0   ; 0, but also poison.
4127       %poison_yet_again = getelementptr i32, i32* @h, i32 %still_poison
4128       store i32 0, i32* %poison_yet_again  ; Undefined behavior due to
4129                                            ; store to poison.
4130
4131       store i32 %poison, i32* @g           ; Poison value stored to memory.
4132       %poison3 = load i32, i32* @g         ; Poison value loaded back from memory.
4133
4134       %narrowaddr = bitcast i32* @g to i16*
4135       %wideaddr = bitcast i32* @g to i64*
4136       %poison4 = load i16, i16* %narrowaddr ; Returns a poison value.
4137       %poison5 = load i64, i64* %wideaddr   ; Returns a poison value.
4138
4139       %cmp = icmp slt i32 %poison, 0       ; Returns a poison value.
4140       br i1 %cmp, label %end, label %end   ; undefined behavior
4141
4142     end:
4143
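The dependence rules for ``select`` and ``freeze`` can be sketched in the same
style. The function below is a hypothetical example written for this section,
not taken from the LLVM test suite:

.. code-block:: llvm

    define i32 @poison_flow(i1 %c, i32 %x) {
    entry:
      %p = add nsw i32 %x, 2147483647   ; Poison if the signed add overflows.
      %s = select i1 %c, i32 0, i32 %p  ; Depends only on %c and the selected
                                        ; operand: 0 when %c is true, even if
                                        ; %p is poison.
      %f = freeze i32 %p                ; Never poison: an arbitrary but fixed
                                        ; i32 if %p was poison, else %p itself.
      %m = and i32 %f, 1                ; Well defined, because %f is.
      %r = add i32 %s, %m               ; Poison only if %s is poison.
      ret i32 %r
    }
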
4144 .. _welldefinedvalues:
4145
4146 Well-Defined Values
4147 -------------------
4148
4149 Given a program execution, a value is *well defined* if the value does not
4150 have an undef bit and is not poison in the execution.
4151 An aggregate value or vector is well defined if its elements are well defined.
4152 The padding of an aggregate isn't considered, since it isn't visible
4153 without storing it into memory and loading it with a different type.
4154
4155 A constant of a :ref:`single value <t_single_value>`, non-vector type is well
4156 defined if it is neither '``undef``' constant nor '``poison``' constant.
4157 The result of :ref:`freeze instruction <i_freeze>` is well defined regardless
4158 of its operand.
4159
4160 .. _blockaddress:
4161
4162 Addresses of Basic Blocks
4163 -------------------------
4164
4165 ``blockaddress(@function, %block)``
4166
4167 The '``blockaddress``' constant computes the address of the specified
4168 basic block in the specified function.
4169
4170 It always has an ``i8 addrspace(P)*`` type, where ``P`` is the address space
4171 of the function containing ``%block`` (usually ``addrspace(0)``).
4172
4173 Taking the address of the entry block is illegal.
4174
4175 This value only has defined behavior when used as an operand to the
4176 ':ref:`indirectbr <i_indirectbr>`' or ':ref:`callbr <i_callbr>`' instruction, or
4177 for comparisons against null. Pointer equality tests between label addresses
4178 result in undefined behavior, though, again, comparison against null is ok,
4179 and no label is equal to the null pointer. This may be passed around as an
4180 opaque pointer sized value as long as the bits are not inspected. This
4181 allows ``ptrtoint`` and arithmetic to be performed on these values so
4182 long as the original value is reconstituted before the ``indirectbr`` or
4183 ``callbr`` instruction.
4184
4185 Finally, some targets may provide defined semantics when using the value
4186 as the operand to an inline assembly, but that is target specific.
4187
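A minimal sketch of the intended use with ``indirectbr`` follows. The function
``@dispatch`` and its block names are hypothetical; note that neither address
refers to the entry block:

.. code-block:: llvm

    define i32 @dispatch(i1 %c) {
    entry:
      ; Both operands are i8* constants naming non-entry blocks of @dispatch.
      %addr = select i1 %c, i8* blockaddress(@dispatch, %bb1),
                            i8* blockaddress(@dispatch, %bb2)
      ; Every block whose address may reach here must be listed.
      indirectbr i8* %addr, [label %bb1, label %bb2]

    bb1:
      ret i32 1

    bb2:
      ret i32 2
    }
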
4188 .. _dso_local_equivalent:
4189
4190 DSO Local Equivalent
4191 --------------------
4192
4193 ``dso_local_equivalent @func``
4194
4195 A '``dso_local_equivalent``' constant represents a function which is
4196 functionally equivalent to a given function, but is always defined in the
4197 current linkage unit. The resulting pointer has the same type as the underlying
4198 function. The resulting pointer is permitted, but not required, to be different
4199 from a pointer to the function, and it may have different values in different
4200 translation units.
4201
4202 The target function may not have ``extern_weak`` linkage.
4203
4204 ``dso_local_equivalent`` can be implemented as such:
4205
4206 - If the function has local linkage, hidden visibility, or is
4207   ``dso_local``, ``dso_local_equivalent`` can be implemented as simply a pointer
4208   to the function.
4209 - ``dso_local_equivalent`` can be implemented with a stub that tail-calls the
4210   function. Many targets support relocations that resolve at link time to either
4211   a function or a stub for it, depending on whether the function is defined within the
4212   linkage unit; LLVM will use this when available. (This is commonly called a
4213   "PLT stub".) On other targets, the stub may need to be emitted explicitly.
4214
4215 This can be used wherever a ``dso_local`` instance of a function is needed without
4216 needing to explicitly make the original function ``dso_local``. An instance where
4217 this can be used is for static offset calculations between a function and some other
4218 ``dso_local`` symbol. This is especially useful for the Relative VTables C++ ABI,
4219 where dynamic relocations for function pointers in VTables can be replaced with
4220 static relocations for offsets between the VTable and virtual functions which
4221 may not be ``dso_local``.
4222
4223 This is currently only supported for ELF binary formats.
4224
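As a rough sketch of the static offset calculation described above, the global
below stores the 32-bit distance between itself and a function that is not
``dso_local``. The names ``@extern_func`` and ``@rel_slot`` are hypothetical,
and the fragment assumes an ELF target:

.. code-block:: llvm

    declare void @extern_func()

    ; Offset from @rel_slot to a dso_local equivalent of @extern_func
    ; (for example a PLT stub), computable without a dynamic relocation.
    @rel_slot = dso_local constant i32 trunc (
        i64 sub (
          i64 ptrtoint (void ()* dso_local_equivalent @extern_func to i64),
          i64 ptrtoint (i32* @rel_slot to i64)
        ) to i32)
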
4225 .. _no_cfi:
4226
4227 No CFI
4228 ------
4229
4230 ``no_cfi @func``
4231
4232 With `Control-Flow Integrity (CFI)
4233 <https://clang.llvm.org/docs/ControlFlowIntegrity.html>`_, a '``no_cfi``'
4234 constant represents a function reference that does not get replaced with a
4235 reference to the CFI jump table in the ``LowerTypeTests`` pass. These constants
4236 may be useful in low-level programs, such as operating system kernels, which
4237 need to refer to the actual function body.
4238
4239 .. _constantexprs:
4240
4241 Constant Expressions
4242 --------------------
4243
4244 Constant expressions are used to allow expressions involving other
4245 constants to be used as constants. Constant expressions may be of any
4246 :ref:`first class <t_firstclass>` type and may involve any LLVM operation
4247 that does not have side effects (e.g. load and call are not supported).
4248 The following is the syntax for constant expressions:
4249
4250 ``trunc (CST to TYPE)``
4251     Perform the :ref:`trunc operation <i_trunc>` on constants.
4252 ``zext (CST to TYPE)``
4253     Perform the :ref:`zext operation <i_zext>` on constants.
4254 ``sext (CST to TYPE)``
4255     Perform the :ref:`sext operation <i_sext>` on constants.
4256 ``fptrunc (CST to TYPE)``
4257     Truncate a floating-point constant to another floating-point type.
4258     The size of CST must be larger than the size of TYPE. Both types
4259     must be floating-point.
4260 ``fpext (CST to TYPE)``
4261     Floating-point extend a constant to another type. The size of CST
4262     must be smaller or equal to the size of TYPE. Both types must be
4263     floating-point.
4264 ``fptoui (CST to TYPE)``
4265     Convert a floating-point constant to the corresponding unsigned
4266     integer constant. TYPE must be a scalar or vector integer type. CST
4267     must be of scalar or vector floating-point type. Both CST and TYPE
4268     must be scalars, or vectors of the same number of elements. If the
4269     value won't fit in the integer type, the result is a
4270     :ref:`poison value <poisonvalues>`.
4271 ``fptosi (CST to TYPE)``
4272     Convert a floating-point constant to the corresponding signed
4273     integer constant. TYPE must be a scalar or vector integer type. CST
4274     must be of scalar or vector floating-point type. Both CST and TYPE
4275     must be scalars, or vectors of the same number of elements. If the
4276     value won't fit in the integer type, the result is a
4277     :ref:`poison value <poisonvalues>`.
4278 ``uitofp (CST to TYPE)``
4279     Convert an unsigned integer constant to the corresponding
4280     floating-point constant. TYPE must be a scalar or vector floating-point
4281     type.  CST must be of scalar or vector integer type. Both CST and TYPE must
4282     be scalars, or vectors of the same number of elements.
4283 ``sitofp (CST to TYPE)``
4284     Convert a signed integer constant to the corresponding floating-point
4285     constant. TYPE must be a scalar or vector floating-point type.
4286     CST must be of scalar or vector integer type. Both CST and TYPE must
4287     be scalars, or vectors of the same number of elements.
4288 ``ptrtoint (CST to TYPE)``
4289     Perform the :ref:`ptrtoint operation <i_ptrtoint>` on constants.
4290 ``inttoptr (CST to TYPE)``
4291     Perform the :ref:`inttoptr operation <i_inttoptr>` on constants.
4292     This one is *really* dangerous!
4293 ``bitcast (CST to TYPE)``
4294     Convert a constant, CST, to another TYPE.
4295     The constraints of the operands are the same as those for the
4296     :ref:`bitcast instruction <i_bitcast>`.
4297 ``addrspacecast (CST to TYPE)``
4298     Convert a constant pointer or constant vector of pointers, CST, to another
4299     TYPE in a different address space. The constraints of the operands are the
4300     same as those for the :ref:`addrspacecast instruction <i_addrspacecast>`.
4301 ``getelementptr (TY, CSTPTR, IDX0, IDX1, ...)``, ``getelementptr inbounds (TY, CSTPTR, IDX0, IDX1, ...)``
4302     Perform the :ref:`getelementptr operation <i_getelementptr>` on
4303     constants. As with the :ref:`getelementptr <i_getelementptr>`
4304     instruction, the index list may have one or more indexes, which are
4305     required to make sense for the type of "pointer to TY".
4306 ``select (COND, VAL1, VAL2)``
4307     Perform the :ref:`select operation <i_select>` on constants.
4308 ``icmp COND (VAL1, VAL2)``
4309     Perform the :ref:`icmp operation <i_icmp>` on constants.
4310 ``fcmp COND (VAL1, VAL2)``
4311     Perform the :ref:`fcmp operation <i_fcmp>` on constants.
4312 ``extractelement (VAL, IDX)``
4313     Perform the :ref:`extractelement operation <i_extractelement>` on
4314     constants.
4315 ``insertelement (VAL, ELT, IDX)``
4316     Perform the :ref:`insertelement operation <i_insertelement>` on
4317     constants.
4318 ``shufflevector (VEC1, VEC2, IDXMASK)``
4319     Perform the :ref:`shufflevector operation <i_shufflevector>` on
4320     constants.
4321 ``extractvalue (VAL, IDX0, IDX1, ...)``
4322     Perform the :ref:`extractvalue operation <i_extractvalue>` on
4323     constants. The index list is interpreted in a similar manner as
4324     indices in a ':ref:`getelementptr <i_getelementptr>`' operation. At
4325     least one index value must be specified.
4326 ``insertvalue (VAL, ELT, IDX0, IDX1, ...)``
4327     Perform the :ref:`insertvalue operation <i_insertvalue>` on constants.
4328     The index list is interpreted in a similar manner as indices in a
4329     ':ref:`getelementptr <i_getelementptr>`' operation. At least one index
4330     value must be specified.
4331 ``OPCODE (LHS, RHS)``
4332     Perform the specified operation on the LHS and RHS constants. OPCODE
4333     may be any of the :ref:`binary <binaryops>` or :ref:`bitwise
4334     binary <bitwiseops>` operations. The constraints on operands are
4335     the same as those for the corresponding instruction (e.g. no bitwise
4336     operations on floating-point values are allowed).
4337
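A few of the forms above, used as global initializers, might look like the
following sketch. The global names are hypothetical and chosen only for
illustration:

.. code-block:: llvm

    @arr = global [4 x i32] [ i32 1, i32 2, i32 3, i32 4 ]

    ; Address of element 2 of @arr.
    @third = global i32* getelementptr inbounds ([4 x i32], [4 x i32]* @arr, i64 0, i64 2)

    ; The address of @arr as an integer, and the same address plus 8 bytes.
    @addr = global i64 ptrtoint ([4 x i32]* @arr to i64)
    @addr8 = global i64 add (i64 ptrtoint ([4 x i32]* @arr to i64), i64 8)

    ; Reinterpret the array pointer as a byte pointer.
    @bytes = global i8* bitcast ([4 x i32]* @arr to i8*)
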
4338 Other Values
4339 ============
4340
4341 .. _inlineasmexprs:
4342
4343 Inline Assembler Expressions
4344 ----------------------------
4345
4346 LLVM supports inline assembler expressions (as opposed to :ref:`Module-Level
4347 Inline Assembly <moduleasm>`) through the use of a special value. This value
4348 represents the inline assembler as a template string (containing the
4349 instructions to emit), a list of operand constraints (stored as a string), a
4350 flag that indicates whether or not the inline asm expression has side effects,
4351 and a flag indicating whether the function containing the asm needs to align its
4352 stack conservatively.
4353
4354 The template string supports argument substitution of the operands using "``$``"
4355 followed by a number, to indicate substitution of the given register/memory
4356 location, as specified by the constraint string. "``${NUM:MODIFIER}``" may also
4357 be used, where ``MODIFIER`` is a target-specific annotation for how to print the
4358 operand (See :ref:`inline-asm-modifiers`).
4359
4360 A literal "``$``" may be included by using "``$$``" in the template. To include
4361 other special characters into the output, the usual "``\XX``" escapes may be
4362 used, just as in other strings. Note that after template substitution, the
4363 resulting assembly string is parsed by LLVM's integrated assembler unless it is
4364 disabled -- even when emitting a ``.s`` file -- and thus must contain assembly
4365 syntax known to LLVM.
4366
4367 LLVM also supports a few more substitutions useful for writing inline assembly:
4368
4369 - ``${:uid}``: Expands to a decimal integer unique to this inline assembly blob.
4370   This substitution is useful when declaring a local label. Many standard
4371   compiler optimizations, such as inlining, may duplicate an inline asm blob.
4372   Adding a blob-unique identifier ensures that the two labels will not conflict
4373   during assembly. This is used to implement `GCC's %= special format
4374   string <https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html>`_.
4375 - ``${:comment}``: Expands to the comment character of the current target's
4376   assembly dialect. This is usually ``#``, but many targets use other strings,
4377   such as ``;``, ``//``, or ``!``.
4378 - ``${:private}``: Expands to the assembler private label prefix. Labels with
4379   this prefix will not appear in the symbol table of the assembled object.
4380   Typically the prefix is ``L``, but targets may use other strings. ``.L`` is
4381   relatively popular.
4382
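As a hedged sketch, the three substitutions can be combined to declare a
private local label that stays unique if the blob is duplicated. The function
name and label text are hypothetical, and the example assumes a target whose
assembler accepts a plain ``nop`` mnemonic:

.. code-block:: llvm

    define void @local_label() {
      ; Emits "<private prefix>spin<unique id>:" followed by a nop and a comment.
      call void asm sideeffect "${:private}spin${:uid}:\0A\09nop ${:comment} per-copy label", ""()
      ret void
    }
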
4383 LLVM's support for inline asm is modeled closely on the requirements of Clang's
4384 GCC-compatible inline-asm support. Thus, the feature-set and the constraint and
4385 modifier codes listed here are similar or identical to those in GCC's inline asm
4386 support. However, to be clear, the syntax of the template and constraint strings
4387 described here is *not* the same as the syntax accepted by GCC and Clang, and,
4388 while most constraint letters are passed through as-is by Clang, some get
4389 translated to other codes when converting from the C source to the LLVM
4390 assembly.
4391
4392 An example inline assembler expression is:
4393
4394 .. code-block:: llvm
4395
4396     i32 (i32) asm "bswap $0", "=r,r"
4397
4398 Inline assembler expressions may **only** be used as the callee operand
4399 of a :ref:`call <i_call>` or an :ref:`invoke <i_invoke>` instruction.
4400 Thus, typically we have:
4401
4402 .. code-block:: llvm
4403
4404     %X = call i32 asm "bswap $0", "=r,r"(i32 %Y)
4405
4406 Inline asms with side effects not visible in the constraint list must be
4407 marked as having side effects. This is done through the use of the
4408 '``sideeffect``' keyword, like so:
4409
4410 .. code-block:: llvm
4411
4412     call void asm sideeffect "eieio", ""()
4413
4414 In some cases inline asms will contain code that will not work unless
4415 the stack is aligned in some way, such as calls or SSE instructions on
4416 x86, yet will not contain code that does that alignment within the asm.
4417 The compiler should make conservative assumptions about what the asm
4418 might contain and should generate its usual stack alignment code in the
4419 prologue if the '``alignstack``' keyword is present:
4420
4421 .. code-block:: llvm
4422
4423     call void asm alignstack "eieio", ""()
4424
4425 Inline asms also support using non-standard assembly dialects. The
4426 assumed dialect is ATT. When the '``inteldialect``' keyword is present,
4427 the inline asm is using the Intel dialect. Currently, ATT and Intel are
4428 the only supported dialects. An example is:
4429
4430 .. code-block:: llvm
4431
4432     call void asm inteldialect "eieio", ""()
4433
4434 In the case that the inline asm might unwind the stack,
4435 the '``unwind``' keyword must be used, so that the compiler emits
4436 unwinding information:
4437
4438 .. code-block:: llvm
4439
4440     call void asm unwind "call func", ""()
4441
4442 If the inline asm unwinds the stack and isn't marked with
4443 the '``unwind``' keyword, the behavior is undefined.
4444
4445 If multiple keywords appear, the '``sideeffect``' keyword must come
4446 first, the '``alignstack``' keyword second, the '``inteldialect``' keyword
4447 third and the '``unwind``' keyword last.
4448
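For instance, a call that combines three of the keywords would be written in
the required order as in the following sketch (the function name is
hypothetical, and the ``nop`` template is only a placeholder):

.. code-block:: llvm

    define void @ordered_keywords() {
      ; sideeffect first, then alignstack, then inteldialect.
      call void asm sideeffect alignstack inteldialect "nop", ""()
      ret void
    }
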
4449 Inline Asm Constraint String
4450 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
4451
4452 The constraint list is a comma-separated string, each element containing one or
4453 more constraint codes.
4454
4455 For each element in the constraint list an appropriate register or memory
4456 operand will be chosen, and it will be made available to assembly template
4457 string expansion as ``$0`` for the first constraint in the list, ``$1`` for the
4458 second, etc.
4459
4460 There are three different types of constraints, which are distinguished by a
4461 prefix symbol in front of the constraint code: Output, Input, and Clobber. The
4462 constraints must always be given in that order: outputs first, then inputs, then
4463 clobbers. They cannot be intermingled.
4464
4465 There are also three different categories of constraint codes:
4466
4467 - Register constraint. This is either a register class, or a fixed physical
4468   register. This kind of constraint will allocate a register, and if necessary,
4469   bitcast the argument or result to the appropriate type.
4470 - Memory constraint. This kind of constraint is for use with an instruction
4471   taking a memory operand. Different constraints allow for different addressing
4472   modes used by the target.
4473 - Immediate value constraint. This kind of constraint is for an integer or other
4474   immediate value which can be rendered directly into an instruction. The
4475   various target-specific constraints allow the selection of a value in the
4476   proper range for the instruction you wish to use it with.
4477
4478 Output constraints
4479 """"""""""""""""""
4480
4481 Output constraints are specified by an "``=``" prefix (e.g. "``=r``"). This
4482 indicates that the assembly will write to this operand, and the operand will
4483 then be made available as a return value of the ``asm`` expression. Output
4484 constraints do not consume an argument from the call instruction (except for
4485 indirect outputs, described below).
4486
4487 Normally, it is expected that no output locations are written to by the assembly
4488 expression until *all* of the inputs have been read. As such, LLVM may assign
4489 the same register to an output and an input. If this is not safe (e.g. if the
4490 assembly contains two instructions, where the first writes to one output, and
4491 the second reads an input and writes to a second output), then the "``&``"
4492 modifier must be used (e.g. "``=&r``") to specify that the output is an
4493 "early-clobber" output. Marking an output as "early-clobber" ensures that LLVM
4494 will not use the same register for any inputs (other than an input tied to this
4495 output).
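
As a minimal sketch of both forms, assuming an x86 target with AT&T syntax (the
value names ``%a`` and ``%b`` are hypothetical, and the ``~{flags}`` clobber is
explained under clobber constraints):

.. code-block:: llvm

    ; "=r": the asm writes operand $0, which becomes the result of the call.
    %one = call i32 asm "movl $$1, $0", "=r"()

    ; "=&r": the first instruction writes $0 before the second one reads
    ; input $2, so $0 must not share a register with either input.
    %sum = call i32 asm "movl $1, $0\0Aaddl $2, $0", "=&r,r,r,~{flags}"(i32 %a, i32 %b)

Without the "``&``" modifier, LLVM would be free to assign ``$0`` and ``$2``
the same register, and the ``movl`` would overwrite ``%b`` before it is read.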
4496
4497 Input constraints
4498 """""""""""""""""
4499
4500 Input constraints do not have a prefix -- just the constraint codes. Each input
4501 constraint will consume one argument from the call instruction. It is not
4502 permitted for the asm to write to any input register or memory location (unless
4503 that input is tied to an output). Note also that multiple inputs may all be
4504 assigned to the same register, if LLVM can determine that they necessarily all
4505 contain the same value.
4506
4507 Instead of providing a Constraint Code, input constraints may also "tie"
4508 themselves to an output constraint, by providing an integer as the constraint
4509 string. Tied inputs still consume an argument from the call instruction, and
4510 take up a position in the asm template numbering as is usual -- they will simply
4511 be constrained to always use the same register as the output they've been tied
4512 to. For example, a constraint string of "``=r,0``" says to assign a register for
4513 output, and use that register as an input as well (it being the 0'th
4514 constraint).
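
For illustration, a tied input might look like the following sketch, which
assumes an x86 target (``%x`` is a hypothetical value):

.. code-block:: llvm

    ; Operand $0 is both read and written: the input %x is tied to output 0,
    ; so it is passed in the same register the result is returned in.
    %swapped = call i32 asm "bswap $0", "=r,0"(i32 %x)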
4515
4516 It is permitted to tie an input to an "early-clobber" output. In that case, no
4517 *other* input may share the same register as the input tied to the early-clobber
4518 (even when the other input has the same value).
4519
4520 You may only tie an input to an output which has a register constraint, not a
4521 memory constraint. Only a single input may be tied to an output.
4522
4523 There is also an "interesting" feature which deserves a bit of explanation: if a
4524 register class constraint allocates a register which is too small for the value
4525 type operand provided as input, the input value will be split into multiple
4526 registers, and all of them passed to the inline asm.
4527
4528 However, this feature is often not as useful as you might think.
4529
4530 Firstly, the registers are *not* guaranteed to be consecutive. So, on those
4531 architectures that have instructions which operate on multiple consecutive
4532 registers, this is not an appropriate way to support them. (For example, the
4533 32-bit SparcV8 has a 64-bit load instruction that names a single 32-bit
4534 register; the hardware then loads into both the named register and the next
4535 register. This feature of inline asm would not be useful to support that.)
4536
4537 A few of the targets provide a template string modifier allowing explicit access
4538 to the second register of a two-register operand (e.g. MIPS ``L``, ``M``, and
4539 ``D``). On such an architecture, you can actually access the second allocated
4540 register (yet, still, not any subsequent ones). But, in that case, you're still
4541 probably better off simply splitting the value into two separate operands, for
4542 clarity. (For example, see the description of the ``A`` constraint on X86,
4543 which, despite existing only for use with this feature, is not really a good
4544 idea to use.)
4545
4546 Indirect inputs and outputs
4547 """""""""""""""""""""""""""
4548
4549 Indirect output or input constraints can be specified by the "``*``" modifier
4550 (which goes after the "``=``" in case of an output). This indicates that the asm
4551 will write to or read from the contents of an *address* provided as an input
4552 argument. (Note that in this way, indirect outputs act more like an *input* than
4553 an output: just like an input, they consume an argument of the call expression,
4554 rather than producing a return value. An indirect output constraint is an
4555 "output" only in that the asm is expected to write to the contents of the input
4556 memory location, instead of just read from it).
4557
4558 This is most typically used with a memory constraint, e.g. "``=*m``", to pass
4559 the address of a variable as a value.
4560
4561 It is also possible to use an indirect *register* constraint, but only on output
4562 (e.g. "``=*r``"). This will cause LLVM to allocate a register for an output
4563 value normally, and then, separately emit a store to the address provided as
4564 input, after the provided inline asm. (It's not clear what value this
4565 functionality provides, compared to writing the store explicitly after the asm
4566 statement, and it can only produce worse code, since it bypasses many
4567 optimization passes. Its use is not recommended.)
4568
4569 Call arguments for indirect constraints must have pointer type and must specify
4570 the :ref:`elementtype <attr_elementtype>` attribute to indicate the pointer
4571 element type.
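
The following sketch shows an indirect output and an indirect input. It assumes
an x86 target and the opaque ``ptr`` spelling of pointer types; ``%p`` is a
hypothetical pointer to an ``i32``:

.. code-block:: llvm

    ; "=*m": the asm stores through the pointer argument; nothing is returned.
    call void asm sideeffect "movl $$1, $0", "=*m"(ptr elementtype(i32) %p)

    ; "*m": an indirect input; the asm loads the i32 that %p points to.
    %v = call i32 asm "movl $1, $0", "=r,*m"(ptr elementtype(i32) %p)

In both calls the pointer is passed as an ordinary call argument, which is why
the indirect output consumes an argument instead of producing a return value.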
4572
4573 Clobber constraints
4574 """""""""""""""""""
4575
4576 A clobber constraint is indicated by a "``~``" prefix. A clobber does not
4577 consume an input operand, nor generate an output. Clobbers cannot use any of the
4578 general constraint code letters -- they may use only explicit register
4579 constraints, e.g. "``~{eax}``". The one exception is that a clobber string of
4580 "``~{memory}``" indicates that the assembly writes to arbitrary undeclared
4581 memory locations -- not only the memory pointed to by a declared indirect
4582 output.
4583
4584 Note that clobbering named registers that are also present in output
4585 constraints is not legal.
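
Two hedged examples of clobber lists, assuming an x86 target with AT&T syntax:

.. code-block:: llvm

    ; An empty template that is declared to write arbitrary memory. This is
    ; commonly used to keep memory accesses from being moved across the asm.
    call void asm sideeffect "", "~{memory}"()

    ; The template overwrites EAX and the flags without declaring outputs.
    call void asm sideeffect "xorl %eax, %eax", "~{eax},~{flags}"()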
4586
4587
4588 Constraint Codes
4589 """"""""""""""""
4590 After an optional prefix comes the constraint code, or codes.
4591
4592 A Constraint Code is either a single letter (e.g. "``r``"), a "``^``" character
4593 followed by two letters (e.g. "``^wc``"), or "``{``" register-name "``}``"
4594 (e.g. "``{eax}``").
4595
4596 The one and two letter constraint codes are typically chosen to be the same as
4597 GCC's constraint codes.
4598
4599 A single constraint may include more than one constraint code, leaving it up
4600 to LLVM to choose which one to use. This is included mainly for
4601 compatibility with the translation of GCC inline asm coming from clang.
4602
4603 There are two ways to specify alternatives, and either or both may be used in an
4604 inline asm constraint list:
4605
4606 1) Append the codes to each other, making a constraint code set. E.g. "``im``"
4607    or "``{eax}m``". This means "choose any of the options in the set". The
4608    choice of constraint is made independently for each constraint in the
4609    constraint list.
4610
4611 2) Use "``|``" between constraint code sets, creating alternatives. Every
4612    constraint in the constraint list must have the same number of alternative
4613    sets. With this syntax, the same alternative in *all* of the items in the
4614    constraint list will be chosen together.
4615
4616 Putting those together, you might have a two operand constraint string like
4617 ``"rm|r,ri|rm"``. This indicates that if operand 0 is ``r`` or ``m``, then
4618 operand 1 may be one of ``r`` or ``i``. If operand 0 is ``r``, then operand 1
4619 may be one of ``r`` or ``m``. But operands 0 and 1 cannot both be ``m``.
4620
4621 However, the use of either of the alternatives features is *NOT* recommended, as
4622 LLVM is not able to make an intelligent choice about which one to use. (At the
4623 point it currently needs to choose, not enough information is available to do so
4624 in a smart way.) Thus, it simply tries to make a choice that's most likely to
4625 compile, not one that will give the best performance (e.g., given "``rm``",
4626 it'll always choose to use memory, not registers). And, if given multiple
4627 registers, or multiple register classes, it will simply choose the first one.
4628 (In fact, it doesn't currently even ensure explicitly specified physical
4629 registers are unique, so specifying multiple physical registers as
4630 alternatives, like ``{r11}{r12},{r11}{r12}``, will assign r11 to both operands,
4631 which is not at all what was intended.)
4632
4633 Supported Constraint Code List
4634 """"""""""""""""""""""""""""""
4635
4636 The constraint codes are, in general, expected to behave the same way they do in
4637 GCC. LLVM's support is often implemented on an 'as-needed' basis, to support C
4638 inline asm code which was supported by GCC. A mismatch in behavior between LLVM
4639 and GCC likely indicates a bug in LLVM.
4640
4641 Some constraint codes are typically supported by all targets:
4642
4643 - ``r``: A register in the target's general purpose register class.
4644 - ``m``: A memory address operand. It is target-specific what addressing modes
4645   are supported; typical examples are register, or register + register offset,
4646   or register + immediate offset (of some target-specific size).
4647 - ``i``: An integer constant (of target-specific width). Allows either a simple
4648   immediate, or a relocatable value.
4649 - ``n``: An integer constant -- *not* including relocatable values.
4650 - ``s``: An integer constant, but allowing *only* relocatable values.
4651 - ``X``: Allows an operand of any kind, no constraint whatsoever. Typically
4652   useful to pass a label for an asm branch or call.
4653
4654   .. FIXME: but that surely isn't actually okay to jump out of an asm
4655      block without telling llvm about the control transfer???)
4656
4657 - ``{register-name}``: Requires exactly the named physical register.
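
As a rough illustration of the generic codes above, assuming an x86 target with
AT&T syntax (``%x`` is a hypothetical value):

.. code-block:: llvm

    ; "{edi}" and "={eax}" pin the input and the result to named registers.
    %r = call i32 asm "movl %edi, %eax", "={eax},{edi}"(i32 %x)

    ; "i": the argument must be a constant; $1 expands to an immediate operand.
    %c = call i32 asm "movl $1, $0", "=r,i"(i32 42)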
4658
4659 Other constraints are target-specific:
4660
4661 AArch64:
4662
4663 - ``z``: An immediate integer 0. Outputs ``WZR`` or ``XZR``, as appropriate.
4664 - ``I``: An immediate integer valid for an ``ADD`` or ``SUB`` instruction,
4665   i.e. 0 to 4095 with optional shift by 12.
4666 - ``J``: An immediate integer that, when negated, is valid for an ``ADD`` or
4667   ``SUB`` instruction, i.e. -1 to -4095 with optional left shift by 12.
4668 - ``K``: An immediate integer that is valid for the 'bitmask immediate 32' of a
4669   logical instruction like ``AND``, ``EOR``, or ``ORR`` with a 32-bit register.
4670 - ``L``: An immediate integer that is valid for the 'bitmask immediate 64' of a
4671   logical instruction like ``AND``, ``EOR``, or ``ORR`` with a 64-bit register.
4672 - ``M``: An immediate integer for use with the ``MOV`` assembly alias on a
4673   32-bit register. This is a superset of ``K``: in addition to the bitmask
4674   immediate, also allows immediate integers which can be loaded with a single
4675   ``MOVZ`` or ``MOVN`` instruction.
4676 - ``N``: An immediate integer for use with the ``MOV`` assembly alias on a
4677   64-bit register. This is a superset of ``L``.
4678 - ``Q``: Memory address operand must be in a single register (no
4679   offsets). (However, LLVM currently does this for the ``m`` constraint as
4680   well.)
4681 - ``r``: A 32 or 64-bit integer register (W* or X*).
4682 - ``w``: A 32, 64, or 128-bit floating-point, SIMD or SVE vector register.
4683 - ``x``: Like w, but restricted to registers 0 to 15 inclusive.
4684 - ``y``: Like w, but restricted to SVE vector registers Z0 to Z7 inclusive.
4685 - ``Upl``: One of the low eight SVE predicate registers (P0 to P7).
4686 - ``Upa``: Any of the SVE predicate registers (P0 to P15).
4687
4688 AMDGPU:
4689
4690 - ``r``: A 32 or 64-bit integer register.
4691 - ``[0-9]v``: The 32-bit VGPR register, number 0-9.
4692 - ``[0-9]s``: The 32-bit SGPR register, number 0-9.
4693 - ``[0-9]a``: The 32-bit AGPR register, number 0-9.
4694 - ``I``: An integer inline constant in the range from -16 to 64.
4695 - ``J``: A 16-bit signed integer constant.
4696 - ``A``: An integer or a floating-point inline constant.
4697 - ``B``: A 32-bit signed integer constant.
4698 - ``C``: A 32-bit unsigned integer constant or an integer inline constant in the range from -16 to 64.
4699 - ``DA``: A 64-bit constant that can be split into two "A" constants.
4700 - ``DB``: A 64-bit constant that can be split into two "B" constants.
4701
4702 All ARM modes:
4703
4704 - ``Q``, ``Um``, ``Un``, ``Uq``, ``Us``, ``Ut``, ``Uv``, ``Uy``: Memory address
4705   operand. Treated the same as operand ``m``, at the moment.
4706 - ``Te``: An even general-purpose 32-bit integer register: ``r0,r2,...,r12,r14``
4707 - ``To``: An odd general-purpose 32-bit integer register: ``r1,r3,...,r11``
4708
4709 ARM and ARM's Thumb2 mode:
4710
4711 - ``j``: An immediate integer between 0 and 65535 (valid for ``MOVW``)
4712 - ``I``: An immediate integer valid for a data-processing instruction.
4713 - ``J``: An immediate integer between -4095 and 4095.
4714 - ``K``: An immediate integer whose bitwise inverse is valid for a
4715   data-processing instruction. (Can be used with template modifier "``B``" to
4716   print the inverted value).
4717 - ``L``: An immediate integer whose negation is valid for a data-processing
4718   instruction. (Can be used with template modifier "``n``" to print the negated
4719   value).
4720 - ``M``: A power of two or an integer between 0 and 32.
4721 - ``N``: Invalid immediate constraint.
4722 - ``O``: Invalid immediate constraint.
4723 - ``r``: A general-purpose 32-bit integer register (``r0-r15``).
4724 - ``l``: In Thumb2 mode, low 32-bit GPR registers (``r0-r7``). In ARM mode, same
4725   as ``r``.
4726 - ``h``: In Thumb2 mode, a high 32-bit GPR register (``r8-r15``). In ARM mode,
4727   invalid.
4728 - ``w``: A 32, 64, or 128-bit floating-point/SIMD register in the ranges
4729   ``s0-s31``, ``d0-d31``, or ``q0-q15``, respectively.
4730 - ``t``: A 32, 64, or 128-bit floating-point/SIMD register in the ranges
4731   ``s0-s31``, ``d0-d15``, or ``q0-q7``, respectively.
4732 - ``x``: A 32, 64, or 128-bit floating-point/SIMD register in the ranges
4733   ``s0-s15``, ``d0-d7``, or ``q0-q3``, respectively.
4734
4735 ARM's Thumb1 mode:
4736
4737 - ``I``: An immediate integer between 0 and 255.
4738 - ``J``: An immediate integer between -255 and -1.
4739 - ``K``: An immediate integer between 0 and 255, with optional left-shift by
4740   some amount.
4741 - ``L``: An immediate integer between -7 and 7.
4742 - ``M``: An immediate integer which is a multiple of 4 between 0 and 1020.
4743 - ``N``: An immediate integer between 0 and 31.
4744 - ``O``: An immediate integer which is a multiple of 4 between -508 and 508.
4745 - ``r``: A low 32-bit GPR register (``r0-r7``).
4746 - ``l``: A low 32-bit GPR register (``r0-r7``).
4747 - ``h``: A high GPR register (``r8-r15``).
4748 - ``w``: A 32, 64, or 128-bit floating-point/SIMD register in the ranges
4749   ``s0-s31``, ``d0-d31``, or ``q0-q15``, respectively.
4750 - ``t``: A 32, 64, or 128-bit floating-point/SIMD register in the ranges
4751   ``s0-s31``, ``d0-d15``, or ``q0-q7``, respectively.
4752 - ``x``: A 32, 64, or 128-bit floating-point/SIMD register in the ranges
4753   ``s0-s15``, ``d0-d7``, or ``q0-q3``, respectively.
4754
4755
4756 Hexagon:
4757
4758 - ``o``, ``v``: A memory address operand, treated the same as constraint ``m``,
4759   at the moment.
4760 - ``r``: A 32 or 64-bit register.
4761
4762 MSP430:
4763
4764 - ``r``: An 8 or 16-bit register.
4765
4766 MIPS:
4767
4768 - ``I``: An immediate signed 16-bit integer.
4769 - ``J``: An immediate integer zero.
4770 - ``K``: An immediate unsigned 16-bit integer.
4771 - ``L``: An immediate 32-bit integer, where the lower 16 bits are 0.
4772 - ``N``: An immediate integer between -65535 and -1.
4773 - ``O``: An immediate signed 15-bit integer.
4774 - ``P``: An immediate integer between 1 and 65535.
4775 - ``m``: A memory address operand. In MIPS-SE mode, allows a base address
4776   register plus 16-bit immediate offset. In MIPS mode, just a base register.
4777 - ``R``: A memory address operand. In MIPS-SE mode, allows a base address
4778   register plus a 9-bit signed offset. In MIPS mode, the same as constraint
4779   ``m``.
4780 - ``ZC``: A memory address operand, suitable for use in a ``pref``, ``ll``, or
4781   ``sc`` instruction on the given subtarget (details vary).
4782 - ``r``, ``d``,  ``y``: A 32 or 64-bit GPR register.
4783 - ``f``: A 32 or 64-bit FPU register (``F0-F31``), or a 128-bit MSA register
4784   (``W0-W31``). In the case of MSA registers, it is recommended to use the ``w``
4785   argument modifier for compatibility with GCC.
4786 - ``c``: A 32-bit or 64-bit GPR register suitable for indirect jump (always
4787   ``25``).
4788 - ``l``: The ``lo`` register, 32 or 64-bit.
4789 - ``x``: Invalid.
4790
4791 NVPTX:
4792
4793 - ``b``: A 1-bit integer register.
4794 - ``c`` or ``h``: A 16-bit integer register.
4795 - ``r``: A 32-bit integer register.
4796 - ``l`` or ``N``: A 64-bit integer register.
4797 - ``f``: A 32-bit float register.
4798 - ``d``: A 64-bit float register.
4799
4800
4801 PowerPC:
4802
4803 - ``I``: An immediate signed 16-bit integer.
4804 - ``J``: An immediate unsigned 16-bit integer, shifted left 16 bits.
4805 - ``K``: An immediate unsigned 16-bit integer.
4806 - ``L``: An immediate signed 16-bit integer, shifted left 16 bits.
4807 - ``M``: An immediate integer greater than 31.
4808 - ``N``: An immediate integer that is an exact power of 2.
4809 - ``O``: The immediate integer constant 0.
4810 - ``P``: An immediate integer constant whose negation is a signed 16-bit
4811   constant.
4812 - ``es``, ``o``, ``Q``, ``Z``, ``Zy``: A memory address operand, currently
4813   treated the same as ``m``.
4814 - ``r``: A 32 or 64-bit integer register.
4815 - ``b``: A 32 or 64-bit integer register, excluding ``R0`` (that is:
4816   ``R1-R31``).
4817 - ``f``: A 32 or 64-bit float register (``F0-F31``).
4818 - ``v``: For ``4 x f32`` or ``4 x f64`` types, a 128-bit Altivec vector
4819   register (``V0-V31``).
4820
4821 - ``y``: Condition register (``CR0-CR7``).
4822 - ``wc``: An individual CR bit in a CR register.
4823 - ``wa``, ``wd``, ``wf``: Any 128-bit VSX vector register, from the full VSX
4824   register set (overlapping both the floating-point and vector register files).
4825 - ``ws``: A 32 or 64-bit floating-point register, from the full VSX register
4826   set.
4827
4828 RISC-V:
4829
4830 - ``A``: An address operand (using a general-purpose register, without an
4831   offset).
4832 - ``I``: A 12-bit signed integer immediate operand.
4833 - ``J``: A zero integer immediate operand.
4834 - ``K``: A 5-bit unsigned integer immediate operand.
4835 - ``f``: A 32- or 64-bit floating-point register (requires F or D extension).
4836 - ``r``: A 32- or 64-bit general-purpose register (depending on the platform
4837   ``XLEN``).
4838 - ``vr``: A vector register. (requires V extension).
4839 - ``vm``: A vector register for masking operand. (requires V extension).
4840
4841 Sparc:
4842
4843 - ``I``: An immediate 13-bit signed integer.
4844 - ``r``: A 32-bit integer register.
4845 - ``f``: Any floating-point register on SparcV8, or a floating-point
4846   register in the "low" half of the registers on SparcV9.
4847 - ``e``: Any floating-point register. (Same as ``f`` on SparcV8.)
4848
4849 SystemZ:
4850
4851 - ``I``: An immediate unsigned 8-bit integer.
4852 - ``J``: An immediate unsigned 12-bit integer.
4853 - ``K``: An immediate signed 16-bit integer.
4854 - ``L``: An immediate signed 20-bit integer.
4855 - ``M``: An immediate integer 0x7fffffff.
4856 - ``Q``: A memory address operand with a base address and a 12-bit immediate
4857   unsigned displacement.
4858 - ``R``: A memory address operand with a base address, a 12-bit immediate
4859   unsigned displacement, and an index register.
4860 - ``S``: A memory address operand with a base address and a 20-bit immediate
4861   signed displacement.
4862 - ``T``: A memory address operand with a base address, a 20-bit immediate
4863   signed displacement, and an index register.
4864 - ``r`` or ``d``: A 32, 64, or 128-bit integer register.
4865 - ``a``: A 32, 64, or 128-bit integer address register (excludes R0, which in an
4866   address context evaluates as zero).
4867 - ``h``: A 32-bit value in the high part of a 64-bit data register
4868   (LLVM-specific).
4869 - ``f``: A 32, 64, or 128-bit floating-point register.
4870
4871 X86:
4872
4873 - ``I``: An immediate integer between 0 and 31.
4874 - ``J``: An immediate integer between 0 and 63.
4875 - ``K``: An immediate signed 8-bit integer.
4876 - ``L``: An immediate integer, 0xff or 0xffff or (in 64-bit mode only)
4877   0xffffffff.
4878 - ``M``: An immediate integer between 0 and 3.
4879 - ``N``: An immediate unsigned 8-bit integer.
4880 - ``O``: An immediate integer between 0 and 127.
4881 - ``e``: An immediate 32-bit signed integer.
4882 - ``Z``: An immediate 32-bit unsigned integer.
4883 - ``o``, ``v``: Treated the same as ``m``, at the moment.
4884 - ``q``: An 8, 16, 32, or 64-bit register which can be accessed as an 8-bit
4885   ``l`` integer register. On X86-32, this is the ``a``, ``b``, ``c``, and ``d``
4886   registers, and on X86-64, it is all of the integer registers.
4887 - ``Q``: An 8, 16, 32, or 64-bit register which can be accessed as an 8-bit
4888   ``h`` integer register. This is the ``a``, ``b``, ``c``, and ``d`` registers.
4889 - ``r`` or ``l``: An 8, 16, 32, or 64-bit integer register.
4890 - ``R``: An 8, 16, 32, or 64-bit "legacy" integer register -- one which has
4891   existed since i386, and can be accessed without the REX prefix.
4892 - ``f``: A 32, 64, or 80-bit '387 FPU stack pseudo-register.
4893 - ``y``: A 64-bit MMX register, if MMX is enabled.
4894 - ``x``: If SSE is enabled: a 32 or 64-bit scalar operand, or 128-bit vector
4895   operand in a SSE register. If AVX is also enabled, can also be a 256-bit
4896   vector operand in an AVX register. If AVX-512 is also enabled, can also be a
4897   512-bit vector operand in an AVX512 register. Otherwise, an error.
4898 - ``Y``: The same as ``x``, if *SSE2* is enabled, otherwise an error.
4899 - ``A``: Special case: allocates EAX first, then EDX, for a single operand (in
4900   32-bit mode, a 64-bit integer operand will get split into two registers). It
4901   is not recommended to use this constraint, as in 64-bit mode, the 64-bit
4902   operand will get allocated only to RAX -- if two 32-bit operands are needed,
4903   you're better off splitting it yourself, before passing it to the asm
4904   statement.
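
A small sketch of a target-specific register class, assuming an X86 target with
SSE enabled (``%a`` and ``%b`` are hypothetical values):

.. code-block:: llvm

    ; "x": both operands are placed in SSE registers. Input %a is tied to the
    ; output because addps reads and writes its destination operand.
    %sum = call <4 x float> asm "addps $2, $0", "=x,0,x"(<4 x float> %a, <4 x float> %b)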
4905
4906 XCore:
4907
4908 - ``r``: A 32-bit integer register.
4909
4910
4911 .. _inline-asm-modifiers:
4912
4913 Asm template argument modifiers
4914 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
4915
4916 In the asm template string, modifiers can be used on the operand reference, like
4917 "``${0:n}``".
4918
4919 The modifiers are, in general, expected to behave the same way they do in
4920 GCC. LLVM's support is often implemented on an 'as-needed' basis, to support C
4921 inline asm code which was supported by GCC. A mismatch in behavior between LLVM
4922 and GCC likely indicates a bug in LLVM.
4923
4924 Target-independent:
4925
4926 - ``c``: Print an immediate integer constant unadorned, without
4927   the target-specific immediate punctuation (e.g. no ``$`` prefix).
4928 - ``n``: Negate and print immediate integer constant unadorned, without the
4929   target-specific immediate punctuation (e.g. no ``$`` prefix).
4930 - ``l``: Print as an unadorned label, without the target-specific label
4931   punctuation (e.g. no ``$`` prefix).
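
For instance, on a target such as x86-64 where immediates are normally printed
with a ``$`` prefix, the ``c`` and ``n`` modifiers allow a constant to be used
as an address displacement. This is only a sketch; ``%base`` is a hypothetical
value:

.. code-block:: llvm

    ; ${2:c} prints the constant as a bare "16", giving e.g. "leaq 16(%rdi), %rax".
    %fwd = call i64 asm "leaq ${2:c}($1), $0", "=r,r,i"(i64 %base, i32 16)

    ; ${2:n} negates the constant first, giving a displacement of "-16".
    %back = call i64 asm "leaq ${2:n}($1), $0", "=r,r,i"(i64 %base, i32 16)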
4932
4933 AArch64:
4934
4935 - ``w``: Print a GPR register with a ``w*`` name instead of ``x*`` name. E.g.,
4936   instead of ``x30``, print ``w30``.
4937 - ``x``: Print a GPR register with a ``x*`` name. (this is the default, anyhow).
4938 - ``b``, ``h``, ``s``, ``d``, ``q``: Print a floating-point/SIMD register with a
4939   ``b*``, ``h*``, ``s*``, ``d*``, or ``q*`` name, rather than the default of
4940   ``v*``.
4941
4942 AMDGPU:
4943
4944 - ``r``: No effect.
4945
4946 ARM:
4947
4948 - ``a``: Print an operand as an address (with ``[`` and ``]`` surrounding a
4949   register).
4950 - ``P``: No effect.
4951 - ``q``: No effect.
4952 - ``y``: Print a VFP single-precision register as an indexed double (e.g. print
4953   as ``d4[1]`` instead of ``s9``)
4954 - ``B``: Bitwise invert and print an immediate integer constant without ``#``
4955   prefix.
4956 - ``L``: Print the low 16-bits of an immediate integer constant.
4957 - ``M``: Print as a register set suitable for ldm/stm. Also prints *all*
4958   register operands subsequent to the specified one (!), so use carefully.
4959 - ``Q``: Print the low-order register of a register-pair, or the low-order
4960   register of a two-register operand.
4961 - ``R``: Print the high-order register of a register-pair, or the high-order
4962   register of a two-register operand.
4963 - ``H``: Print the second register of a register-pair. (On a big-endian system,
4964   ``H`` is equivalent to ``Q``, and on a little-endian system, ``H`` is
4965   equivalent to ``R``.)
4966
4967   .. FIXME: H doesn't currently support printing the second register
4968      of a two-register operand.
4969
4970 - ``e``: Print the low doubleword register of a NEON quad register.
4971 - ``f``: Print the high doubleword register of a NEON quad register.
4972 - ``m``: Print the base register of a memory operand without the ``[`` and ``]``
4973   adornment.
4974
4975 Hexagon:
4976
4977 - ``L``: Print the second register of a two-register operand. Requires that it
4978   has been allocated consecutively to the first.
4979
4980   .. FIXME: why is it restricted to consecutive ones? And there's
4981      nothing that ensures that happens, is there?
4982
4983 - ``I``: Print the letter 'i' if the operand is an integer constant, otherwise
4984   nothing. Used to print 'addi' vs 'add' instructions.
4985
4986 MSP430:
4987
4988 No additional modifiers.
4989
4990 MIPS:
4991
4992 - ``X``: Print an immediate integer as hexadecimal.
4993 - ``x``: Print the low 16 bits of an immediate integer as hexadecimal.
4994 - ``d``: Print an immediate integer as decimal.
4995 - ``m``: Subtract one and print an immediate integer as decimal.
4996 - ``z``: Print ``$0`` if an immediate zero, otherwise print normally.
4997 - ``L``: Print the low-order register of a two-register operand, or prints the
4998   address of the low-order word of a double-word memory operand.
4999
5000   .. FIXME: L seems to be missing memory operand support.
5001
5002 - ``M``: Print the high-order register of a two-register operand, or prints the
5003   address of the high-order word of a double-word memory operand.
5004
5005   .. FIXME: M seems to be missing memory operand support.
5006
5007 - ``D``: Print the second register of a two-register operand, or the second
5008   word of a double-word memory operand. (On a big-endian system, ``D`` is
5009   equivalent to ``L``, and on a little-endian system, ``D`` is equivalent to
5010   ``M``.)
5011 - ``w``: No effect. Provided for compatibility with GCC which requires this
5012   modifier in order to print MSA registers (``W0-W31``) with the ``f``
5013   constraint.
5014
5015 NVPTX:
5016
5017 - ``r``: No effect.
5018
5019 PowerPC:
5020
5021 - ``L``: Print the second register of a two-register operand. Requires that it
5022   has been allocated consecutively to the first.
5023
5024   .. FIXME: why is it restricted to consecutive ones? And there's
5025      nothing that ensures that happens, is there?
5026
5027 - ``I``: Print the letter 'i' if the operand is an integer constant, otherwise
5028   nothing. Used to print 'addi' vs 'add' instructions.
5029 - ``y``: For a memory operand, prints formatter for a two-register X-form
5030   instruction. (Currently always prints ``r0,OPERAND``).
5031 - ``U``: Prints 'u' if the memory operand is an update form, and nothing
5032   otherwise. (NOTE: LLVM does not support update form, so this will currently
5033   always print nothing)
5034 - ``X``: Prints 'x' if the memory operand is an indexed form. (NOTE: LLVM does
5035   not support indexed form, so this will currently always print nothing)
5036
5037 RISC-V:
5038
5039 - ``i``: Print the letter 'i' if the operand is not a register, otherwise print
5040   nothing. Used to print 'addi' vs 'add' instructions, etc.
5041 - ``z``: Print the register ``zero`` if an immediate zero, otherwise print
5042   normally.
5043
5044 Sparc:
5045
5046 - ``r``: No effect.
5047
5048 SystemZ:
5049
5050 SystemZ implements only ``n``, and does *not* support any of the other
5051 target-independent modifiers.
5052
5053 X86:
5054
5055 - ``c``: Print an unadorned integer or symbol name. (The latter is
5056   target-specific behavior for this typically target-independent modifier).
5057 - ``A``: Print a register name with a '``*``' before it.
5058 - ``b``: Print an 8-bit register name (e.g. ``al``); do nothing on a memory
5059   operand.
5060 - ``h``: Print the upper 8-bit register name (e.g. ``ah``); do nothing on a
5061   memory operand.
5062 - ``w``: Print the 16-bit register name (e.g. ``ax``); do nothing on a memory
5063   operand.
5064 - ``k``: Print the 32-bit register name (e.g. ``eax``); do nothing on a memory
5065   operand.
5066 - ``q``: Print the 64-bit register name (e.g. ``rax``), if 64-bit registers are
5067   available, otherwise the 32-bit register name; do nothing on a memory operand.
5068 - ``n``: Negate and print an unadorned integer, or, for operands other than an
5069   immediate integer (e.g. a relocatable symbol expression), print a '-' before
5070   the operand. (The behavior for relocatable symbol expressions is a
5071   target-specific behavior for this typically target-independent modifier)
5072 - ``H``: Print a memory reference with additional offset +8.
5073 - ``P``: Print a memory reference or operand for use as the argument of a call
5074   instruction. (E.g. omit ``(rip)``, even though it's PC-relative.)
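
As an example of the register-size modifiers, the following sketch (``%x`` is a
hypothetical value) uses the ``q`` constraint so that the chosen register has
an 8-bit form:

.. code-block:: llvm

    ; ${1:b} prints the 8-bit name of the register allocated for operand 1
    ; (e.g. "al" if EAX was chosen), as required by the movzbl source operand.
    %low = call i32 asm "movzbl ${1:b}, $0", "=r,q"(i32 %x)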
5075
5076 XCore:
5077
5078 No additional modifiers.
5079
5080
5081 Inline Asm Metadata
5082 ^^^^^^^^^^^^^^^^^^^
5083
5084 The call instructions that wrap inline asm nodes may have a
5085 "``!srcloc``" MDNode attached to them that contains a list of constant
5086 integers. If present, the code generator will use the integer as the
5087 location cookie value when reporting errors through the ``LLVMContext``
5088 error reporting mechanisms. This allows a front-end to correlate backend
5089 errors that occur with inline asm back to the source code that produced
5090 it. For example:
5091
5092 .. code-block:: llvm
5093
5094     call void asm sideeffect "something bad", ""(), !srcloc !42
5095     ...
5096     !42 = !{ i32 1234567 }
5097
5098 It is up to the front-end to make sense of the magic numbers it places
5099 in the IR. If the MDNode contains multiple constants, the code generator
5100 will use the one that corresponds to the line of the asm that the error
5101 occurs on.
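
A sketch of the multi-constant form, with one hypothetical cookie value per
line of the asm string:

.. code-block:: llvm

    call void asm sideeffect "nop\0Asomething bad", ""(), !srcloc !43
    ...
    !43 = !{ i32 1234567, i32 1234599 }

Here an error on the second line of the asm would be reported with the second
constant as its cookie.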
5102
5103 .. _metadata:
5104
5105 Metadata
5106 ========
5107
5108 LLVM IR allows metadata to be attached to instructions and global objects in the
5109 program that can convey extra information about the code to the optimizers and
5110 code generator. One example application of metadata is source-level
5111 debug information. There are two metadata primitives: strings and nodes.
5112
5113 Metadata does not have a type, and is not a value. If referenced from a
5114 ``call`` instruction, it uses the ``metadata`` type.
5115
5116 All metadata are identified in syntax by an exclamation point ('``!``').
5117
5118 .. _metadata-string:
5119
5120 Metadata Nodes and Metadata Strings
5121 -----------------------------------
5122
5123 A metadata string is a string surrounded by double quotes. It can
5124 contain any character by escaping non-printable characters with
5125 "``\xx``" where "``xx``" is the two digit hex code. For example:
5126 "``!"test\00"``".
5127
5128 Metadata nodes are represented with notation similar to structure
5129 constants (a comma separated list of elements, surrounded by braces and
5130 preceded by an exclamation point). Metadata nodes can have any values as
5131 their operands. For example:
5132
5133 .. code-block:: llvm
5134
5135     !{ !"test\00", i32 10}
5136
5137 Metadata nodes that aren't uniqued use the ``distinct`` keyword. For example:
5138
5139 .. code-block:: text
5140
5141     !0 = distinct !{!"test\00", i32 10}
5142
5143 ``distinct`` nodes are useful when nodes shouldn't be merged based on their
5144 content. They can also occur when transformations cause uniquing collisions
5145 when metadata operands change.
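
The difference can be sketched with three nodes that have identical content.
The node numbers are arbitrary:

.. code-block:: text

    !0 = !{!"test\00", i32 10}           ; uniqued: equal nodes may be merged
    !1 = distinct !{!"test\00", i32 10}  ; never merged with !0 or !2
    !2 = distinct !{!"test\00", i32 10}  ; same content, still a separate node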
5146
5147 A :ref:`named metadata <namedmetadatastructure>` is a collection of
5148 metadata nodes, which can be looked up in the module symbol table. For
5149 example:
5150
5151 .. code-block:: llvm
5152
5153     !foo = !{!4, !3}
5154
5155 Metadata can be used as function arguments. Here the ``llvm.dbg.value``
5156 intrinsic is using three metadata arguments:
5157
5158 .. code-block:: llvm
5159
5160     call void @llvm.dbg.value(metadata !24, metadata !25, metadata !26)
5161
5162 Metadata can be attached to an instruction. Here metadata ``!21`` is attached
5163 to the ``add`` instruction using the ``!dbg`` identifier:
5164
5165 .. code-block:: llvm
5166
5167     %indvar.next = add i64 %indvar, 1, !dbg !21
5168
5169 Instructions may not have multiple metadata attachments with the same
5170 identifier.
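
Attachments with different identifiers can still be combined on one
instruction. In this sketch ``!my.md`` is a hypothetical, front-end-defined
identifier and ``!30`` is an arbitrary node:

.. code-block:: llvm

    ; One !dbg attachment and one custom attachment on the same instruction.
    %indvar.next = add i64 %indvar, 1, !dbg !21, !my.md !30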
5171
5172 Metadata can also be attached to a function or a global variable. Here metadata
5173 ``!22`` is attached to the ``f1`` and ``f2`` functions, and the globals ``g1``
5174 and ``g2`` using the ``!dbg`` identifier:
5175
5176 .. code-block:: llvm
5177
5178     declare !dbg !22 void @f1()
5179     define void @f2() !dbg !22 {
5180       ret void
5181     }
5182
5183     @g1 = global i32 0, !dbg !22
5184     @g2 = external global i32, !dbg !22
5185
5186 Unlike instructions, global objects (functions and global variables) may have
5187 multiple metadata attachments with the same identifier.
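
For illustration, a global carrying two ``!dbg`` attachments might look like
the following. The name ``@g3`` and the node ``!23`` are made up for this
sketch:

.. code-block:: llvm

    ; Two attachments that share the !dbg identifier on one global.
    @g3 = global i32 0, !dbg !22, !dbg !23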
5188
5189 A transformation is required to drop any metadata attachment that it does not
5190 recognize or knows it can't preserve. Currently there is an exception for
5191 ``!type`` and ``!absolute_symbol`` attachments on globals, which can't be
5192 unconditionally dropped unless the global is itself deleted.
5193
5194 Metadata attached to a module using named metadata may not be dropped, with
5195 the exception of debug metadata (named metadata with the name ``!llvm.dbg.*``).
5196
5197 More information about specific metadata nodes recognized by the
5198 optimizers and code generator is found below.
5199
5200 .. _specialized-metadata:
5201
5202 Specialized Metadata Nodes
5203 ^^^^^^^^^^^^^^^^^^^^^^^^^^
5204
5205 Specialized metadata nodes are custom data structures in metadata (as opposed
5206 to generic tuples). Their fields are labelled, and can be specified in any
5207 order.
5208
5209 These aren't inherently debug info centric, but currently all the specialized
5210 metadata nodes are related to debug info.
5211
5212 .. _DICompileUnit:
5213
5214 DICompileUnit
5215 """""""""""""
5216
5217 ``DICompileUnit`` nodes represent a compile unit. The ``enums:``,
5218 ``retainedTypes:``, ``globals:``, ``imports:`` and ``macros:`` fields are tuples
5219 containing the debug info to be emitted along with the compile unit, regardless
5220 of code optimizations (some nodes are only emitted if there are references to
5221 them from instructions). The ``debugInfoForProfiling:`` field is a boolean
5222 indicating whether or not line-table discriminators are updated to provide
5223 more-accurate debug info for profiling results.
5224
5225 .. code-block:: text
5226
5227     !0 = !DICompileUnit(language: DW_LANG_C99, file: !1, producer: "clang",
5228                         isOptimized: true, flags: "-O2", runtimeVersion: 2,
5229                         splitDebugFilename: "abc.debug", emissionKind: FullDebug,
5230                         enums: !2, retainedTypes: !3, globals: !4, imports: !5,
5231                         macros: !6, dwoId: 0x0abcd)
5232
5233 Compile unit descriptors provide the root scope for objects declared in a
5234 specific compilation unit. File descriptors are defined using this scope.  These
5235 descriptors are collected by a named metadata node ``!llvm.dbg.cu``. They keep
5236 track of global variables, type information, and imported entities (declarations
5237 and namespaces).
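
A minimal sketch of how a compile unit is reached from the module, assuming a
single translation unit and made-up file names. The compile unit is written as
``distinct`` here, which is the form LLVM expects for these nodes:

.. code-block:: text

    !llvm.dbg.cu = !{!0}    ; named metadata collecting every compile unit
    !0 = distinct !DICompileUnit(language: DW_LANG_C99, file: !1,
                                 producer: "clang", emissionKind: FullDebug)
    !1 = !DIFile(filename: "a.c", directory: "/tmp")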
5238
5239 .. _DIFile:
5240
5241 DIFile
5242 """"""
5243
5244 ``DIFile`` nodes represent files. The ``filename:`` can include slashes.
5245
5246 .. code-block:: none
5247
5248     !0 = !DIFile(filename: "path/to/file", directory: "/path/to/dir",
5249                  checksumkind: CSK_MD5,
5250                  checksum: "000102030405060708090a0b0c0d0e0f")
5251
5252 Files are sometimes used in ``scope:`` fields, and are the only valid target
5253 for ``file:`` fields.
5254 Valid values for the ``checksumkind:`` field are: {CSK_None, CSK_MD5, CSK_SHA1, CSK_SHA256}.
5255
5256 .. _DIBasicType:
5257
5258 DIBasicType
5259 """""""""""
5260
5261 ``DIBasicType`` nodes represent primitive types, such as ``int``, ``bool`` and
5262 ``float``. ``tag:`` defaults to ``DW_TAG_base_type``.
5263
5264 .. code-block:: text
5265
5266     !0 = !DIBasicType(name: "unsigned char", size: 8, align: 8,
5267                       encoding: DW_ATE_unsigned_char)
5268     !1 = !DIBasicType(tag: DW_TAG_unspecified_type, name: "decltype(nullptr)")
5269
5270 The ``encoding:`` describes the details of the type. Usually it's one of the
5271 following:
5272
5273 .. code-block:: text
5274
5275   DW_ATE_address       = 1
5276   DW_ATE_boolean       = 2
5277   DW_ATE_float         = 4
5278   DW_ATE_signed        = 5
5279   DW_ATE_signed_char   = 6
5280   DW_ATE_unsigned      = 7
5281   DW_ATE_unsigned_char = 8
5282
5283 .. _DISubroutineType:
5284
5285 DISubroutineType
5286 """"""""""""""""
5287
5288 ``DISubroutineType`` nodes represent subroutine types. Their ``types:`` field
5289 refers to a tuple; the first operand is the return type, while the rest are the
5290 types of the formal arguments in order. If the first operand is ``null``, that
5291 represents a function with no return value (such as ``void foo() {}`` in C++).
5292
5293 .. code-block:: text
5294
5295     !0 = !DIBasicType(name: "int", size: 32, align: 32, encoding: DW_ATE_signed)
5296     !1 = !DIBasicType(name: "char", size: 8, align: 8, encoding: DW_ATE_signed_char)
5297     !2 = !DISubroutineType(types: !{null, !0, !1}) ; void (int, char)
5298
5299 .. _DIDerivedType:
5300
5301 DIDerivedType
5302 """""""""""""
5303
5304 ``DIDerivedType`` nodes represent types derived from other types, such as
5305 qualified types.
5306
5307 .. code-block:: text
5308
5309     !0 = !DIBasicType(name: "unsigned char", size: 8, align: 8,
5310                       encoding: DW_ATE_unsigned_char)
5311     !1 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !0, size: 32,
5312                         align: 32)
5313
5314 The following ``tag:`` values are valid:
5315
5316 .. code-block:: text
5317
5318   DW_TAG_member             = 13
5319   DW_TAG_pointer_type       = 15
5320   DW_TAG_reference_type     = 16
5321   DW_TAG_typedef            = 22
5322   DW_TAG_inheritance        = 28
5323   DW_TAG_ptr_to_member_type = 31
5324   DW_TAG_const_type         = 38
5325   DW_TAG_friend             = 42
5326   DW_TAG_volatile_type      = 53
5327   DW_TAG_restrict_type      = 55
5328   DW_TAG_atomic_type        = 71
5329   DW_TAG_immutable_type     = 75
5330
5331 .. _DIDerivedTypeMember:
5332
5333 ``DW_TAG_member`` is used to define a member of a :ref:`composite type
5334 <DICompositeType>`. The type of the member is the ``baseType:``. The
5335 ``offset:`` is the member's bit offset.  If the composite type has an ODR
5336 ``identifier:`` and does not set ``flags: DIFlagFwdDecl``, then the member is
5337 uniqued based only on its ``name:`` and ``scope:``.
5338
5339 ``DW_TAG_inheritance`` and ``DW_TAG_friend`` are used in the ``elements:``
5340 field of :ref:`composite types <DICompositeType>` to describe parents and
5341 friends.
5342
5343 ``DW_TAG_typedef`` is used to provide a name for the ``baseType:``.
5344
5345 ``DW_TAG_pointer_type``, ``DW_TAG_reference_type``, ``DW_TAG_const_type``,
5346 ``DW_TAG_volatile_type``, ``DW_TAG_restrict_type``, ``DW_TAG_atomic_type`` and
5347 ``DW_TAG_immutable_type`` are used to qualify the ``baseType:``.
5348
5349 Note that the ``void *`` type is expressed as a type derived from NULL.
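
The qualifier and typedef tags can be sketched as follows. The 64-bit pointer
size, the typedef name and the file node ``!4`` are assumptions made for this
example:

.. code-block:: text

    !0 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
    !1 = !DIDerivedType(tag: DW_TAG_const_type, baseType: !0)       ; const int
    !2 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: null,
                        size: 64)                                   ; void *
    !3 = !DIDerivedType(tag: DW_TAG_typedef, name: "myint", file: !4, line: 3,
                        baseType: !0)                               ; typedef int myint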
5350
5351 .. _DICompositeType:
5352
5353 DICompositeType
5354 """""""""""""""
5355
5356 ``DICompositeType`` nodes represent types composed of other types, like
5357 structures and unions. ``elements:`` points to a tuple of the composed types.
5358
5359 If the source language supports ODR, the ``identifier:`` field gives the unique
5360 identifier used for type merging between modules.  When specified,
5361 :ref:`subprogram declarations <DISubprogramDeclaration>` and :ref:`member
5362 derived types <DIDerivedTypeMember>` that reference the ODR-type in their
5363 ``scope:`` change uniquing rules.
5364
5365 For a given ``identifier:``, there should only be a single composite type that
5366 does not have  ``flags: DIFlagFwdDecl`` set.  LLVM tools that link modules
5367 together will unique such definitions at parse time via the ``identifier:``
5368 field, even if the nodes are ``distinct``.
5369
5370 .. code-block:: text
5371
5372     !0 = !DIEnumerator(name: "SixKind", value: 6)
5373     !1 = !DIEnumerator(name: "SevenKind", value: 7)
5374     !2 = !DIEnumerator(name: "NegEightKind", value: -8)
5375     !3 = !DICompositeType(tag: DW_TAG_enumeration_type, name: "Enum", file: !12,
5376                           line: 2, size: 32, align: 32, identifier: "_M4Enum",
5377                           elements: !{!0, !1, !2})
5378
5379 The following ``tag:`` values are valid:
5380
5381 .. code-block:: text
5382
5383   DW_TAG_array_type       = 1
5384   DW_TAG_class_type       = 2
5385   DW_TAG_enumeration_type = 4
5386   DW_TAG_structure_type   = 19
5387   DW_TAG_union_type       = 23
5388
5389 For ``DW_TAG_array_type``, the ``elements:`` should be :ref:`subrange
5390 descriptors <DISubrange>`, each representing the range of subscripts at that
5391 level of indexing. The ``DIFlagVector`` flag to ``flags:`` indicates that an
5392 array type is a native packed vector. The optional ``dataLocation`` is a
5393 DIExpression that describes how to get from an object's address to the actual
5394 raw data, if they aren't equivalent. This is only supported for array types,
5395 particularly to describe Fortran arrays, which have an array descriptor in
5396 addition to the array data. Alternatively, it can be a DIVariable that holds
5397 the address of the actual raw data. Fortran supports pointer arrays that can
5398 be attached to actual arrays; this attachment between pointer and pointee is
5399 called association. The optional ``associated`` is a DIExpression that
5400 describes whether the pointer array is currently associated. The optional
5401 ``allocated`` is a DIExpression that describes whether the allocatable array
5402 is currently allocated. The optional ``rank`` is a DIExpression that
5403 describes the rank (number of dimensions) of a Fortran assumed-rank array,
5404 whose rank is known only at run time.
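
A fixed-size, two-dimensional C array such as ``int a[4][8]`` could be
described along these lines. This is a sketch only, with one subrange per
level of indexing and arbitrary node numbers:

.. code-block:: text

    !0 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
    !1 = !DISubrange(count: 4)    ; outer dimension
    !2 = !DISubrange(count: 8)    ; inner dimension
    !3 = !DICompositeType(tag: DW_TAG_array_type, baseType: !0,
                          size: 1024,              ; 4 * 8 * 32 bits
                          elements: !{!1, !2})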
5405
5406 For ``DW_TAG_enumeration_type``, the ``elements:`` should be :ref:`enumerator
5407 descriptors <DIEnumerator>`, each representing the definition of an enumeration
5408 value for the set. All enumeration type descriptors are collected in the
5409 ``enums:`` field of the :ref:`compile unit <DICompileUnit>`.
5410
5411 For ``DW_TAG_structure_type``, ``DW_TAG_class_type``, and
5412 ``DW_TAG_union_type``, the ``elements:`` should be :ref:`derived types
5413 <DIDerivedType>` with ``tag: DW_TAG_member``, ``tag: DW_TAG_inheritance``, or
5414 ``tag: DW_TAG_friend``; or :ref:`subprograms <DISubprogram>` with
5415 ``isDefinition: false``.
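
As a sketch, ``struct Point { int x; int y; };`` might be described as shown
below. The file node ``!4`` and the line numbers are assumptions made for the
example:

.. code-block:: text

    !0 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
    !1 = !DICompositeType(tag: DW_TAG_structure_type, name: "Point", file: !4,
                          line: 1, size: 64, elements: !{!2, !3})
    !2 = !DIDerivedType(tag: DW_TAG_member, name: "x", scope: !1, file: !4,
                        line: 1, baseType: !0, size: 32)
    !3 = !DIDerivedType(tag: DW_TAG_member, name: "y", scope: !1, file: !4,
                        line: 1, baseType: !0, size: 32, offset: 32)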
5416
5417 .. _DISubrange:
5418
5419 DISubrange
5420 """"""""""
5421
5422 ``DISubrange`` nodes are the elements for ``DW_TAG_array_type`` variants of
5423 :ref:`DICompositeType`.
5424
5425 - ``count: -1`` indicates an empty array.
5426 - ``count: !10`` describes the count with a :ref:`DILocalVariable`.
5427 - ``count: !12`` describes the count with a :ref:`DIGlobalVariable`.
5428
5429 .. code-block:: text
5430
5431     !0 = !DISubrange(count: 5, lowerBound: 0) ; array counting from 0
5432     !1 = !DISubrange(count: 5, lowerBound: 1) ; array counting from 1
5433     !2 = !DISubrange(count: -1) ; empty array.
5434
5435     ; Scopes used in rest of example
5436     !6 = !DIFile(filename: "vla.c", directory: "/path/to/file")
5437     !7 = distinct !DICompileUnit(language: DW_LANG_C99, file: !6)
5438     !8 = distinct !DISubprogram(name: "foo", scope: !7, file: !6, line: 5)
5439
5440     ; Use of local variable as count value
5441     !9 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
5442     !10 = !DILocalVariable(name: "count", scope: !8, file: !6, line: 42, type: !9)
5443     !11 = !DISubrange(count: !10, lowerBound: 0)
5444
5445     ; Use of global variable as count value
5446     !12 = !DIGlobalVariable(name: "count", scope: !8, file: !6, line: 22, type: !9)
5447     !13 = !DISubrange(count: !12, lowerBound: 0)
5448
5449 .. _DIEnumerator:
5450
5451 DIEnumerator
5452 """"""""""""
5453
5454 ``DIEnumerator`` nodes are the elements for ``DW_TAG_enumeration_type``
5455 variants of :ref:`DICompositeType`.
5456
5457 .. code-block:: text
5458
5459     !0 = !DIEnumerator(name: "SixKind", value: 6)
5460     !1 = !DIEnumerator(name: "SevenKind", value: 7)
5461     !2 = !DIEnumerator(name: "NegEightKind", value: -8)
5462
5463 DITemplateTypeParameter
5464 """""""""""""""""""""""
5465
5466 ``DITemplateTypeParameter`` nodes represent type parameters to generic source
5467 language constructs. They are used (optionally) in :ref:`DICompositeType` and
5468 :ref:`DISubprogram` ``templateParams:`` fields.
5469
5470 .. code-block:: text
5471
5472     !0 = !DITemplateTypeParameter(name: "Ty", type: !1)
5473
5474 DITemplateValueParameter
5475 """"""""""""""""""""""""
5476
5477 ``DITemplateValueParameter`` nodes represent value parameters to generic source
5478 language constructs. ``tag:`` defaults to ``DW_TAG_template_value_parameter``,
5479 but if specified can also be set to ``DW_TAG_GNU_template_template_param`` or
5480 ``DW_TAG_GNU_template_param_pack``. They are used (optionally) in
5481 :ref:`DICompositeType` and :ref:`DISubprogram` ``templateParams:`` fields.
5482
5483 .. code-block:: text
5484
5485     !0 = !DITemplateValueParameter(name: "Ty", type: !1, value: i32 7)
5486
5487 DINamespace
5488 """""""""""
5489
5490 ``DINamespace`` nodes represent namespaces in the source language.
5491
5492 .. code-block:: text
5493
5494     !0 = !DINamespace(name: "myawesomeproject", scope: !1, file: !2, line: 7)
5495
5496 .. _DIGlobalVariable:
5497
5498 DIGlobalVariable
5499 """"""""""""""""
5500
5501 ``DIGlobalVariable`` nodes represent global variables in the source language.
5502
5503 .. code-block:: text
5504
5505     @foo = global i32 0, !dbg !0
5506     !0 = !DIGlobalVariableExpression(var: !1, expr: !DIExpression())
5507     !1 = !DIGlobalVariable(name: "foo", linkageName: "foo", scope: !2,
5508                            file: !3, line: 7, type: !4, isLocal: true,
5509                            isDefinition: false, declaration: !5)
5510
5511
5512 DIGlobalVariableExpression
5513 """"""""""""""""""""""""""
5514
5515 ``DIGlobalVariableExpression`` nodes tie a :ref:`DIGlobalVariable` together
5516 with a :ref:`DIExpression`.
5517
5518 .. code-block:: text
5519
5520     @lower = global i32 0, !dbg !0
5521     @upper = global i32 0, !dbg !1
5522     !0 = !DIGlobalVariableExpression(
5523              var: !2,
5524              expr: !DIExpression(DW_OP_LLVM_fragment, 0, 32)
5525              )
5526     !1 = !DIGlobalVariableExpression(
5527              var: !2,
5528              expr: !DIExpression(DW_OP_LLVM_fragment, 32, 32)
5529              )
5530     !2 = !DIGlobalVariable(name: "split64", linkageName: "split64", scope: !3,
5531                            file: !4, line: 8, type: !5, declaration: !6)
5532
5533 All global variable expressions should be referenced by the ``globals:`` field of
5534 a :ref:`compile unit <DICompileUnit>`.
5535
5536 .. _DISubprogram:
5537
5538 DISubprogram
5539 """"""""""""
5540
5541 ``DISubprogram`` nodes represent functions from the source language. A distinct
5542 ``DISubprogram`` may be attached to a function definition using ``!dbg``
5543 metadata. A unique ``DISubprogram`` may be attached to a function declaration
5544 used for call site debug info. The ``retainedNodes:`` field is a list of
5545 :ref:`variables <DILocalVariable>` and :ref:`labels <DILabel>` that must be
5546 retained, even if their IR counterparts are optimized out of the IR. The
5547 ``type:`` field must point at a :ref:`DISubroutineType`.
5548
5549 .. _DISubprogramDeclaration:
5550
5551 When ``isDefinition: false``, subprograms describe a declaration in the type
5552 tree as opposed to a definition of a function.  If the scope is a composite
5553 type with an ODR ``identifier:`` that does not set ``flags: DIFlagFwdDecl``,
5554 then the subprogram declaration is uniqued based only on its ``linkageName:``
5555 and ``scope:``.
5556
5557 .. code-block:: text
5558
5559     define void @_Z3foov() !dbg !0 {
5560       ...
5561     }
5562
5563     !0 = distinct !DISubprogram(name: "foo", linkageName: "_Z3foov", scope: !1,
5564                                 file: !2, line: 7, type: !3, isLocal: true,
5565                                 isDefinition: true, scopeLine: 8,
5566                                 containingType: !4,
5567                                 virtuality: DW_VIRTUALITY_pure_virtual,
5568                                 virtualIndex: 10, flags: DIFlagPrototyped,
5569                                 isOptimized: true, unit: !5, templateParams: !6,
5570                                 declaration: !7, retainedNodes: !8,
5571                                 thrownTypes: !9)
5572
5573 .. _DILexicalBlock:
5574
5575 DILexicalBlock
5576 """"""""""""""
5577
5578 ``DILexicalBlock`` nodes describe nested blocks within a :ref:`subprogram
5579 <DISubprogram>`. The line and column numbers are used to distinguish
5580 two lexical blocks at the same depth. They are valid targets for ``scope:``
5581 fields.
5582
5583 .. code-block:: text
5584
5585     !0 = distinct !DILexicalBlock(scope: !1, file: !2, line: 7, column: 35)
5586
5587 Usually lexical blocks are ``distinct`` to prevent node merging based on
5588 operands.
5589
5590 .. _DILexicalBlockFile:
5591
5592 DILexicalBlockFile
5593 """"""""""""""""""
5594
5595 ``DILexicalBlockFile`` nodes are used to discriminate between sections of a
5596 :ref:`lexical block <DILexicalBlock>`. The ``file:`` field can be changed to
5597 indicate textual inclusion, or the ``discriminator:`` field can be used to
5598 discriminate between control flow within a single block in the source language.
5599
5600 .. code-block:: text
5601
5602     !0 = !DILexicalBlock(scope: !3, file: !4, line: 7, column: 35)
5603     !1 = !DILexicalBlockFile(scope: !0, file: !4, discriminator: 0)
5604     !2 = !DILexicalBlockFile(scope: !0, file: !4, discriminator: 1)
5605
5606 .. _DILocation:
5607
5608 DILocation
5609 """"""""""
5610
5611 ``DILocation`` nodes represent source debug locations. The ``scope:`` field is
5612 mandatory, and points at a :ref:`DILexicalBlockFile`, a
5613 :ref:`DILexicalBlock`, or a :ref:`DISubprogram`.
5614
5615 .. code-block:: text
5616
5617     !0 = !DILocation(line: 2900, column: 42, scope: !1, inlinedAt: !2)
5618
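
The ``inlinedAt:`` field chains a location to the call site it was inlined
into. A sketch, in which ``!1`` and ``!3`` stand for the (undefined here)
scopes of a hypothetical callee and caller:

.. code-block:: text

    ; Instruction originally at line 3, column 5 of the callee ...
    !0 = !DILocation(line: 3, column: 5, scope: !1, inlinedAt: !2)
    ; ... inlined at a call on line 10, column 3 of the caller.
    !2 = !DILocation(line: 10, column: 3, scope: !3)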
5619 .. _DILocalVariable:
5620
5621 DILocalVariable
5622 """""""""""""""
5623
5624 ``DILocalVariable`` nodes represent local variables in the source language. If
5625 the ``arg:`` field is set to non-zero, then this variable is a subprogram
5626 parameter, and it will be included in the ``retainedNodes:`` field of its
5627 :ref:`DISubprogram`.
5628
5629 .. code-block:: text
5630
5631     !0 = !DILocalVariable(name: "this", arg: 1, scope: !3, file: !2, line: 7,
5632                           type: !3, flags: DIFlagArtificial)
5633     !1 = !DILocalVariable(name: "x", arg: 2, scope: !4, file: !2, line: 7,
5634                           type: !3)
5635     !2 = !DILocalVariable(name: "y", scope: !5, file: !2, line: 7, type: !3)
5636
5637 .. _DIExpression:
5638
5639 DIExpression
5640 """"""""""""
5641
5642 ``DIExpression`` nodes represent expressions that are inspired by the DWARF
5643 expression language. They are used in :ref:`debug intrinsics<dbg_intrinsics>`
5644 (such as ``llvm.dbg.declare`` and ``llvm.dbg.value``) to describe how the
5645 referenced LLVM variable relates to the source language variable. Debug
5646 expressions are interpreted left-to-right: start by pushing the value/address
5647 operand of the intrinsic onto a stack, then repeatedly push and evaluate
5648 opcodes from the DIExpression until the final variable description is produced.
5649
5650 The currently supported opcode vocabulary is limited:
5651
5652 - ``DW_OP_deref`` dereferences the top of the expression stack.
5653 - ``DW_OP_plus`` pops the last two entries from the expression stack, adds
5654   them together and appends the result to the expression stack.
5655 - ``DW_OP_minus`` pops the last two entries from the expression stack, subtracts
5656   the last entry from the second last entry and appends the result to the
5657   expression stack.
5658 - ``DW_OP_plus_uconst, 93`` adds ``93`` to the working expression.
5659 - ``DW_OP_LLVM_fragment, 16, 8`` specifies the offset and size (``16`` and ``8``
5660   here, respectively) of the variable fragment from the working expression. Note
5661   that contrary to ``DW_OP_bit_piece``, the offset describes the location
5662   within the described source variable.
5663 - ``DW_OP_LLVM_convert, 16, DW_ATE_signed`` specifies a bit size and encoding
5664   (``16`` and ``DW_ATE_signed`` here, respectively) to which the top of the
5665   expression stack is to be converted. Maps into a ``DW_OP_convert`` operation
5666   that references a base type constructed from the supplied values.
5667 - ``DW_OP_LLVM_tag_offset, tag_offset`` specifies that a memory tag should be
5668   optionally applied to the pointer. The memory tag is derived from the
5669   given tag offset in an implementation-defined manner.
5670 - ``DW_OP_swap`` swaps top two stack entries.
5671 - ``DW_OP_xderef`` provides extended dereference mechanism. The entry at the top
5672   of the stack is treated as an address. The second stack entry is treated as an
5673   address space identifier.
5674 - ``DW_OP_stack_value`` marks a constant value.
5675 - ``DW_OP_LLVM_entry_value, N`` may only appear in MIR and at the
5676   beginning of a ``DIExpression``. In DWARF, a ``DBG_VALUE``
5677   instruction binding a ``DW_OP_LLVM_entry_value`` expression to a
5678   register is lowered to a ``DW_OP_entry_value [reg]``, pushing the
5679   value the register had upon function entry onto the stack.  The next
5680   ``(N - 1)`` operations will be part of the ``DW_OP_entry_value``
5681   block argument. For example, ``!DIExpression(DW_OP_LLVM_entry_value,
5682   1, DW_OP_plus_uconst, 123, DW_OP_stack_value)`` specifies an
5683   expression where the entry value of the debug value instruction's
5684   value/address operand is pushed to the stack and 123 is added
5685   to it. Due to framework limitations, ``N`` can currently only
5686   be 1.
5687
5688   The operation is introduced by the ``LiveDebugValues`` pass, which
5689   applies it only to function parameters that are unmodified
5690   throughout the function. Support is limited to simple register
5691   location descriptions, or as indirect locations (e.g., when a struct
5692   is passed-by-value to a callee via a pointer to a temporary copy
5693   made in the caller). The entry value op is also introduced by the
5694   ``AsmPrinter`` pass when a call site parameter value
5695   (``DW_AT_call_site_parameter_value``) is represented as entry value
5696   of the parameter.
5697 - ``DW_OP_LLVM_arg, N`` is used in debug intrinsics that refer to more than one
5698   value, such as one that calculates the sum of two registers. This is always
5699   used in combination with an ordered list of values, such that
5700   ``DW_OP_LLVM_arg, N`` refers to element ``N`` (counting from 0) in that list. For
5701   example, ``!DIExpression(DW_OP_LLVM_arg, 0, DW_OP_LLVM_arg, 1, DW_OP_minus,
5702   DW_OP_stack_value)`` used with the list ``(%reg1, %reg2)`` would evaluate to
5703   ``%reg1 - %reg2``. This list of values should be provided by the containing
5704   intrinsic/instruction.
5705 - ``DW_OP_breg`` (or ``DW_OP_bregx``) represents the contents at the provided
5706   signed offset from the specified register. The opcode is only generated by the
5707   ``AsmPrinter`` pass to describe a call site parameter value that requires an
5708   expression over two registers.
5709 - ``DW_OP_push_object_address`` pushes the address of the object, which can then
5710   serve as a descriptor in subsequent calculations. This opcode can be used to
5711   calculate the bounds of a Fortran allocatable array that has an array descriptor.
5712 - ``DW_OP_over`` duplicates the entry currently second in the stack at the top
5713   of the stack. This opcode can be used to calculate the bounds of a Fortran
5714   assumed-rank array, whose rank is known at run time and whose current
5715   dimension number is implicitly the first element of the stack.
5716 - ``DW_OP_LLVM_implicit_pointer`` specifies the dereferenced value. It can
5717   be used to represent pointer variables that are optimized out while the value
5718   they point to is known. This operator is required because it differs from the
5719   DWARF operator ``DW_OP_implicit_pointer`` in representation and specification
5720   (number and types of operands), and the latter cannot be used at multiple levels.
5721
5722 .. code-block:: text
5723
5724     IR for "*ptr = 4;"
5725     --------------
5726     call void @llvm.dbg.value(metadata i32 4, metadata !17, metadata !20)
5727     !17 = !DILocalVariable(name: "ptr1", scope: !12, file: !3, line: 5,
5728                            type: !18)
5729     !18 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !19, size: 64)
5730     !19 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
5731     !20 = !DIExpression(DW_OP_LLVM_implicit_pointer)
5732
5733     IR for "**ptr = 4;"
5734     --------------
5735     call void @llvm.dbg.value(metadata i32 4, metadata !17, metadata !21)
5736     !17 = !DILocalVariable(name: "ptr1", scope: !12, file: !3, line: 5,
5737                            type: !18)
5738     !18 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !19, size: 64)
5739     !19 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !20, size: 64)
5740     !20 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
5741     !21 = !DIExpression(DW_OP_LLVM_implicit_pointer,
5742                         DW_OP_LLVM_implicit_pointer)
5743
5744 DWARF specifies three kinds of simple location descriptions: register, memory,
5745 and implicit location descriptions.  Note that a location description is
5746 defined over certain ranges of a program, i.e., the location of a variable may
5747 change over the course of the program. Register and memory location
5748 descriptions describe the *concrete location* of a source variable (in the
5749 sense that a debugger might modify its value), whereas *implicit locations*
5750 describe merely the actual *value* of a source variable which might not exist
5751 in registers or in memory (see ``DW_OP_stack_value``).
5752
5753 An ``llvm.dbg.addr`` or ``llvm.dbg.declare`` intrinsic describes an indirect
5754 value (the address) of a source variable. The first operand of the intrinsic
5755 must be an address of some kind. A DIExpression attached to the intrinsic
5756 refines this address to produce a concrete location for the source variable.
5757
5758 An ``llvm.dbg.value`` intrinsic describes the direct value of a source variable.
5759 The first operand of the intrinsic may be a direct or indirect value. A
5760 DIExpression attached to the intrinsic refines the first operand to produce a
5761 direct value. For example, if the first operand is an indirect value, it may be
5762 necessary to insert ``DW_OP_deref`` into the DIExpression in order to produce a
5763 valid debug intrinsic.
5764
5765 .. note::
5766
5767    A DIExpression is interpreted in the same way regardless of which kind of
5768    debug intrinsic it's attached to.
5769
5770 .. code-block:: text
5771
5772     !0 = !DIExpression(DW_OP_deref)
5773     !1 = !DIExpression(DW_OP_plus_uconst, 3)
5774     !1 = !DIExpression(DW_OP_constu, 3, DW_OP_plus)
5775     !2 = !DIExpression(DW_OP_bit_piece, 3, 7)
5776     !3 = !DIExpression(DW_OP_deref, DW_OP_constu, 3, DW_OP_plus, DW_OP_LLVM_fragment, 3, 7)
5777     !4 = !DIExpression(DW_OP_constu, 2, DW_OP_swap, DW_OP_xderef)
5778     !5 = !DIExpression(DW_OP_constu, 42, DW_OP_stack_value)
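
The left-to-right evaluation described above can be traced by hand. The sketch
below assumes an intrinsic operand ``V`` and a made-up offset of 8; the stack
states in the comments follow the opcode descriptions in this section and are
illustrative only. The second half shows ``DW_OP_LLVM_fragment`` splitting a
hypothetical 64-bit variable ``!10`` across two ``i32`` values:

.. code-block:: text

    ; !DIExpression(DW_OP_plus_uconst, 8, DW_OP_deref)
    ;
    ;   step                   stack (top on the right)
    ;   push operand           [V]
    ;   DW_OP_plus_uconst, 8   [V + 8]
    ;   DW_OP_deref            [*(V + 8)]

    ; Bits 0-31 of !10 are in %lo, bits 32-63 are in %hi.
    call void @llvm.dbg.value(metadata i32 %lo, metadata !10,
                              metadata !DIExpression(DW_OP_LLVM_fragment, 0, 32))
    call void @llvm.dbg.value(metadata i32 %hi, metadata !10,
                              metadata !DIExpression(DW_OP_LLVM_fragment, 32, 32))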
5779
5780 DIArgList
5781 """""""""
5782
5783 ``DIArgList`` nodes hold a list of constant or SSA value references. These are
5784 used in :ref:`debug intrinsics<dbg_intrinsics>` (currently only in
5785 ``llvm.dbg.value``) in combination with a ``DIExpression`` that uses the
5786 ``DW_OP_LLVM_arg`` operator. Because a DIArgList may refer to local values
5787 within a function, it must only be used as a function argument, must always be
5788 inlined, and cannot appear in named metadata.
5789
5790 .. code-block:: text
5791
5792     call void @llvm.dbg.value(metadata !DIArgList(i32 %a, i32 %b),
5793                               metadata !16,
5794                               metadata !DIExpression(DW_OP_LLVM_arg, 0, DW_OP_LLVM_arg, 1, DW_OP_plus))
5795
5796 DIFlags
5797 """""""
5798
5799 These flags encode various properties of DINodes.
5800
5801 The ``ExportSymbols`` flag marks a class, struct or union whose members
5802 may be referenced as if they were defined in the containing class or
5803 union. This flag is used to decide whether ``DW_AT_export_symbols`` can
5804 be used for the structure type.
5805
5806 DIObjCProperty
5807 """"""""""""""
5808
5809 ``DIObjCProperty`` nodes represent Objective-C property nodes.
5810
5811 .. code-block:: text
5812
5813     !3 = !DIObjCProperty(name: "foo", file: !1, line: 7, setter: "setFoo",
5814                          getter: "getFoo", attributes: 7, type: !2)
5815
5816 DIImportedEntity
5817 """"""""""""""""
5818
5819 ``DIImportedEntity`` nodes represent entities (such as modules) imported into a
5820 compile unit. The ``elements:`` field is a list of renamed entities (such as
5821 variables and subprograms) in the imported entity (such as a module).
5822
5823 .. code-block:: text
5824
5825    !2 = !DIImportedEntity(tag: DW_TAG_imported_module, name: "foo", scope: !0,
5826                           entity: !1, line: 7, elements: !3)
5827    !3 = !{!4}
5828    !4 = !DIImportedEntity(tag: DW_TAG_imported_declaration, name: "bar", scope: !0,
5829                           entity: !5, line: 7)
5830
5831 DIMacro
5832 """""""
5833
5834 ``DIMacro`` nodes represent the definition or undefinition of a macro identifier.
5835 The ``name:`` field is the macro identifier, followed by macro parameters when
5836 defining a function-like macro, and the ``value:`` field is the token-string
5837 used to expand the macro identifier.
5838
5839 .. code-block:: text
5840
5841    !2 = !DIMacro(macinfo: DW_MACINFO_define, line: 7, name: "foo(x)",
5842                  value: "((x) + 1)")
5843    !3 = !DIMacro(macinfo: DW_MACINFO_undef, line: 30, name: "foo")
5844
5845 DIMacroFile
5846 """""""""""
5847
5848 ``DIMacroFile`` nodes represent inclusion of source files.
5849 The ``nodes:`` field is a list of ``DIMacro`` and ``DIMacroFile`` nodes that
5850 appear in the included source file.
5851
5852 .. code-block:: text
5853
5854    !2 = !DIMacroFile(macinfo: DW_MACINFO_start_file, line: 7, file: !1,
5855                      nodes: !3)
5856
5857 .. _DILabel:
5858
5859 DILabel
5860 """""""
5861
``DILabel`` nodes represent labels within a :ref:`DISubprogram`. All fields of
a ``DILabel`` are mandatory. The ``scope:`` field must be one of a
:ref:`DILexicalBlockFile`, a :ref:`DILexicalBlock`, or a :ref:`DISubprogram`.
The ``name:`` field is the label identifier. The ``file:`` field is the
:ref:`DIFile` the label is present in. The ``line:`` field is the source line
within the file where the label is declared.
5868
5869 .. code-block:: text
5870
5871   !2 = !DILabel(scope: !0, name: "foo", file: !1, line: 7)
5872
5873 '``tbaa``' Metadata
5874 ^^^^^^^^^^^^^^^^^^^
5875
5876 In LLVM IR, memory does not have types, so LLVM's own type system is not
5877 suitable for doing type based alias analysis (TBAA). Instead, metadata is
5878 added to the IR to describe a type system of a higher level language. This
5879 can be used to implement C/C++ strict type aliasing rules, but it can also
5880 be used to implement custom alias analysis behavior for other languages.
5881
5882 This description of LLVM's TBAA system is broken into two parts:
5883 :ref:`Semantics<tbaa_node_semantics>` talks about high level issues, and
5884 :ref:`Representation<tbaa_node_representation>` talks about the metadata
5885 encoding of various entities.
5886
5887 It is always possible to trace any TBAA node to a "root" TBAA node (details
5888 in the :ref:`Representation<tbaa_node_representation>` section).  TBAA
5889 nodes with different roots have an unknown aliasing relationship, and LLVM
5890 conservatively infers ``MayAlias`` between them.  The rules mentioned in
5891 this section only pertain to TBAA nodes living under the same root.
5892
5893 .. _tbaa_node_semantics:
5894
5895 Semantics
5896 """""""""
5897
5898 The TBAA metadata system, referred to as "struct path TBAA" (not to be
5899 confused with ``tbaa.struct``), consists of the following high level
5900 concepts: *Type Descriptors*, further subdivided into scalar type
5901 descriptors and struct type descriptors; and *Access Tags*.
5902
5903 **Type descriptors** describe the type system of the higher level language
5904 being compiled.  **Scalar type descriptors** describe types that do not
5905 contain other types.  Each scalar type has a parent type, which must also
5906 be a scalar type or the TBAA root.  Via this parent relation, scalar types
5907 within a TBAA root form a tree.  **Struct type descriptors** denote types
5908 that contain a sequence of other type descriptors, at known offsets.  These
5909 contained type descriptors can either be struct type descriptors themselves
5910 or scalar type descriptors.
5911
5912 **Access tags** are metadata nodes attached to load and store instructions.
5913 Access tags use type descriptors to describe the *location* being accessed
5914 in terms of the type system of the higher level language.  Access tags are
5915 tuples consisting of a base type, an access type and an offset.  The base
5916 type is a scalar type descriptor or a struct type descriptor, the access
5917 type is a scalar type descriptor, and the offset is a constant integer.
5918
5919 The access tag ``(BaseTy, AccessTy, Offset)`` can describe one of two
5920 things:
5921
5922  * If ``BaseTy`` is a struct type, the tag describes a memory access (load
5923    or store) of a value of type ``AccessTy`` contained in the struct type
5924    ``BaseTy`` at offset ``Offset``.
5925
5926  * If ``BaseTy`` is a scalar type, ``Offset`` must be 0 and ``BaseTy`` and
5927    ``AccessTy`` must be the same; and the access tag describes a scalar
5928    access with scalar type ``AccessTy``.
5929
5930 We first define an ``ImmediateParent`` relation on ``(BaseTy, Offset)``
5931 tuples this way:
5932
5933  * If ``BaseTy`` is a scalar type then ``ImmediateParent(BaseTy, 0)`` is
5934    ``(ParentTy, 0)`` where ``ParentTy`` is the parent of the scalar type as
5935    described in the TBAA metadata.  ``ImmediateParent(BaseTy, Offset)`` is
5936    undefined if ``Offset`` is non-zero.
5937
5938  * If ``BaseTy`` is a struct type then ``ImmediateParent(BaseTy, Offset)``
5939    is ``(NewTy, NewOffset)`` where ``NewTy`` is the type contained in
5940    ``BaseTy`` at offset ``Offset`` and ``NewOffset`` is ``Offset`` adjusted
5941    to be relative within that inner type.
5942
A memory access with an access tag ``(BaseTy1, AccessTy1, Offset1)``
aliases a memory access with an access tag ``(BaseTy2, AccessTy2,
Offset2)`` if either ``(BaseTy1, Offset1)`` is reachable from ``(BaseTy2,
Offset2)`` by repeatedly applying the ``ImmediateParent`` relation or vice
versa.
5947
5948 As a concrete example, the type descriptor graph for the following program
5949
5950 .. code-block:: c
5951
5952     struct Inner {
5953       int i;    // offset 0
5954       float f;  // offset 4
5955     };
5956
5957     struct Outer {
5958       float f;  // offset 0
5959       double d; // offset 4
5960       struct Inner inner_a;  // offset 12
5961     };
5962
5963     void f(struct Outer* outer, struct Inner* inner, float* f, int* i, char* c) {
5964       outer->f = 0;            // tag0: (OuterStructTy, FloatScalarTy, 0)
5965       outer->inner_a.i = 0;    // tag1: (OuterStructTy, IntScalarTy, 12)
5966       outer->inner_a.f = 0.0;  // tag2: (OuterStructTy, FloatScalarTy, 16)
5967       *f = 0.0;                // tag3: (FloatScalarTy, FloatScalarTy, 0)
5968     }
5969
5970 is (note that in C and C++, ``char`` can be used to access any arbitrary
5971 type):
5972
5973 .. code-block:: text
5974
5975     Root = "TBAA Root"
5976     CharScalarTy = ("char", Root, 0)
5977     FloatScalarTy = ("float", CharScalarTy, 0)
5978     DoubleScalarTy = ("double", CharScalarTy, 0)
5979     IntScalarTy = ("int", CharScalarTy, 0)
    InnerStructTy = {"Inner", (IntScalarTy, 0), (FloatScalarTy, 4)}
5981     OuterStructTy = {"Outer", (FloatScalarTy, 0), (DoubleScalarTy, 4),
5982                      (InnerStructTy, 12)}
5983
5984
5985 with (e.g.) ``ImmediateParent(OuterStructTy, 12)`` = ``(InnerStructTy,
5986 0)``, ``ImmediateParent(InnerStructTy, 0)`` = ``(IntScalarTy, 0)``, and
5987 ``ImmediateParent(IntScalarTy, 0)`` = ``(CharScalarTy, 0)``.
5988
5989 .. _tbaa_node_representation:
5990
5991 Representation
5992 """"""""""""""
5993
5994 The root node of a TBAA type hierarchy is an ``MDNode`` with 0 operands or
5995 with exactly one ``MDString`` operand.
5996
Scalar type descriptors are represented as ``MDNode`` s with two
operands.  The first operand is an ``MDString`` denoting the name of the
scalar type.  LLVM does not assign meaning to the value of this operand; it
only cares about it being an ``MDString``.  The second operand is an
``MDNode`` which points to the parent for said scalar type descriptor,
which is either another scalar type descriptor or the TBAA root.  Scalar
type descriptors can have an optional third operand, but that must be the
constant integer zero.
6005
Struct type descriptors are represented as ``MDNode`` s with an odd number
of operands greater than 1.  The first operand is an ``MDString`` denoting
the name of the struct type.  As in scalar type descriptors, the actual
value of this name operand is irrelevant to LLVM.  After the name operand,
the struct type descriptors have a sequence of alternating ``MDNode`` and
``ConstantInt`` operands.  Counting the name as operand 0 and with N
starting from 1, operand 2N - 1, an ``MDNode``, denotes a contained field,
and operand 2N, a ``ConstantInt``, is the offset of that field.  The
offsets must be in non-decreasing order.
6015
6016 Access tags are represented as ``MDNode`` s with either 3 or 4 operands.
6017 The first operand is an ``MDNode`` pointing to the node representing the
6018 base type.  The second operand is an ``MDNode`` pointing to the node
6019 representing the access type.  The third operand is a ``ConstantInt`` that
6020 states the offset of the access.  If a fourth field is present, it must be
6021 a ``ConstantInt`` valued at 0 or 1.  If it is 1 then the access tag states
6022 that the location being accessed is "constant" (meaning
6023 ``pointsToConstantMemory`` should return true; see `other useful
6024 AliasAnalysis methods <AliasAnalysis.html#OtherItfs>`_).  The TBAA root of
6025 the access type and the base type of an access tag must be the same, and
6026 that is the TBAA root of the access tag.
6027
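As a sketch of how these encodings fit together, the type descriptor graph and
the access tags from the example in the :ref:`Semantics<tbaa_node_semantics>`
section could be written as the metadata below.  The metadata numbering, the
function name ``@f``, and the pointer parameters (which stand in for the
already computed field addresses) are assumptions made for illustration only.

.. code-block:: llvm

    define void @f(float* %outer_f, i32* %inner_a_i, float* %inner_a_f, float* %f) {
      store float 0.000000e+00, float* %outer_f, align 4, !tbaa !7    ; tag0
      store i32 0, i32* %inner_a_i, align 4, !tbaa !8                 ; tag1
      store float 0.000000e+00, float* %inner_a_f, align 4, !tbaa !9  ; tag2
      store float 0.000000e+00, float* %f, align 4, !tbaa !10         ; tag3
      ret void
    }

    ; Root and scalar type descriptors: (name, parent, 0)
    !0 = !{!"TBAA Root"}
    !1 = !{!"char", !0, i64 0}      ; CharScalarTy
    !2 = !{!"float", !1, i64 0}     ; FloatScalarTy
    !3 = !{!"double", !1, i64 0}    ; DoubleScalarTy
    !4 = !{!"int", !1, i64 0}       ; IntScalarTy

    ; Struct type descriptors: name followed by (field type, offset) pairs
    !5 = !{!"Inner", !4, i64 0, !2, i64 4}               ; InnerStructTy
    !6 = !{!"Outer", !2, i64 0, !3, i64 4, !5, i64 12}   ; OuterStructTy

    ; Access tags: (base type, access type, offset)
    !7 = !{!6, !2, i64 0}     ; tag0: (OuterStructTy, FloatScalarTy, 0)
    !8 = !{!6, !4, i64 12}    ; tag1: (OuterStructTy, IntScalarTy, 12)
    !9 = !{!6, !2, i64 16}    ; tag2: (OuterStructTy, FloatScalarTy, 16)
    !10 = !{!2, !2, i64 0}    ; tag3: (FloatScalarTy, FloatScalarTy, 0)

Under the rules from the Semantics section, tag1 and tag3 do not alias:
starting from ``(OuterStructTy, 12)``, ``ImmediateParent`` yields
``(InnerStructTy, 0)``, ``(IntScalarTy, 0)`` and ``(CharScalarTy, 0)`` but never
``(FloatScalarTy, 0)``, and starting from ``(FloatScalarTy, 0)`` it never
reaches ``(OuterStructTy, 12)``.  In contrast, tag2 aliases tag3 because
``(OuterStructTy, 16)`` leads to ``(InnerStructTy, 4)`` and then to
``(FloatScalarTy, 0)``.
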
6028 '``tbaa.struct``' Metadata
6029 ^^^^^^^^^^^^^^^^^^^^^^^^^^
6030
The :ref:`llvm.memcpy <int_memcpy>` intrinsic is often used to implement
aggregate assignment operations in C and similar languages. However, it
is defined to copy a contiguous region of memory, which is more than
strictly necessary for aggregate types which contain holes due to
padding. Also, it doesn't contain any TBAA information about the fields
of the aggregate.
6037
6038 ``!tbaa.struct`` metadata can describe which memory subregions in a
6039 memcpy are padding and what the TBAA tags of the struct are.
6040
The current metadata format is very simple. ``!tbaa.struct`` metadata
nodes are a list of operands which are in conceptual groups of three.
For each group of three, the first operand gives the offset of a
field in bytes, the second gives its size in bytes, and the third gives
its TBAA tag. For example:
6046
6047 .. code-block:: llvm
6048
6049     !4 = !{ i64 0, i64 4, !1, i64 8, i64 4, !2 }
6050
This describes a struct with two fields. The first is at offset 0 bytes
with size 4 bytes, and has TBAA tag ``!1``. The second is at offset 8 bytes
with size 4 bytes, and has TBAA tag ``!2``.
6054
6055 Note that the fields need not be contiguous. In this example, there is a
6056 4 byte gap between the two fields. This gap represents padding which
6057 does not carry useful data and need not be preserved.
6058
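A minimal sketch of how such a node might be attached to a copy is shown
below. The function name ``@copy_struct``, the TBAA type names, the 12 byte
copy size, and the metadata numbering are assumptions chosen for illustration;
the field layout matches the two-field example above.

.. code-block:: llvm

    declare void @llvm.memcpy.p0i8.p0i8.i64(i8*, i8*, i64, i1)

    define void @copy_struct(i8* %dst, i8* %src) {
      ; The copy covers 12 bytes, but only bytes [0,4) and [8,12) hold fields.
      call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 4 %dst, i8* align 4 %src,
                                           i64 12, i1 false), !tbaa.struct !6
      ret void
    }

    ; A small TBAA hierarchy (hypothetical names)
    !0 = !{!"Example TBAA Root"}
    !1 = !{!"char", !0, i64 0}
    !2 = !{!"int", !1, i64 0}
    !3 = !{!"float", !1, i64 0}

    ; TBAA access tags for the two fields
    !4 = !{!2, !2, i64 0}
    !5 = !{!3, !3, i64 0}

    ; (offset, size, tag) groups: an int at offset 0 and a float at offset 8
    !6 = !{i64 0, i64 4, !4, i64 8, i64 4, !5}
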
6059 '``noalias``' and '``alias.scope``' Metadata
6060 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6061
``noalias`` and ``alias.scope`` metadata provide the ability to specify generic
noalias memory-access sets. This means that some collection of memory access
instructions (loads, stores, memory-accessing calls, etc.) that carry
``noalias`` metadata can be declared not to alias with some other
collection of memory access instructions that carry ``alias.scope`` metadata.
Each type of metadata specifies a list of scopes where each scope has an id and
a domain.
6069
6070 When evaluating an aliasing query, if for some domain, the set
6071 of scopes with that domain in one instruction's ``alias.scope`` list is a
6072 subset of (or equal to) the set of scopes for that domain in another
6073 instruction's ``noalias`` list, then the two memory accesses are assumed not to
6074 alias.
6075
6076 Because scopes in one domain don't affect scopes in other domains, separate
6077 domains can be used to compose multiple independent noalias sets.  This is
6078 used for example during inlining.  As the noalias function parameters are
6079 turned into noalias scope metadata, a new domain is used every time the
6080 function is inlined.
6081
6082 The metadata identifying each domain is itself a list containing one or two
6083 entries. The first entry is the name of the domain. Note that if the name is a
6084 string then it can be combined across functions and translation units. A
6085 self-reference can be used to create globally unique domain names. A
6086 descriptive string may optionally be provided as a second list entry.
6087
6088 The metadata identifying each scope is also itself a list containing two or
6089 three entries. The first entry is the name of the scope. Note that if the name
6090 is a string then it can be combined across functions and translation units. A
6091 self-reference can be used to create globally unique scope names. A metadata
6092 reference to the scope's domain is the second entry. A descriptive string may
6093 optionally be provided as a third list entry.
6094
6095 For example,
6096
6097 .. code-block:: llvm
6098
6099     ; Two scope domains:
6100     !0 = !{!0}
6101     !1 = !{!1}
6102
6103     ; Some scopes in these domains:
6104     !2 = !{!2, !0}
6105     !3 = !{!3, !0}
6106     !4 = !{!4, !1}
6107
6108     ; Some scope lists:
6109     !5 = !{!4} ; A list containing only scope !4
6110     !6 = !{!4, !3, !2}
6111     !7 = !{!3}
6112
6113     ; These two instructions don't alias:
6114     %0 = load float, float* %c, align 4, !alias.scope !5
6115     store float %0, float* %arrayidx.i, align 4, !noalias !5
6116
6117     ; These two instructions also don't alias (for domain !1, the set of scopes
6118     ; in the !alias.scope equals that in the !noalias list):
    %1 = load float, float* %c, align 4, !alias.scope !5
    store float %1, float* %arrayidx.i2, align 4, !noalias !6
6121
6122     ; These two instructions may alias (for domain !0, the set of scopes in
6123     ; the !noalias list is not a superset of, or equal to, the scopes in the
6124     ; !alias.scope list):
6125     %2 = load float, float* %c, align 4, !alias.scope !6
6126     store float %0, float* %arrayidx.i, align 4, !noalias !7
6127
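The domains and scopes in the example above use only self-referential names.
As a further sketch, the optional descriptive strings and string names
described earlier could look as follows; all strings and metadata numbers here
are made up for illustration.

.. code-block:: llvm

    ; A domain and a scope with self-referential (globally unique) names,
    ; each followed by an optional descriptive string:
    !0 = !{!0, !"example domain"}
    !1 = !{!1, !0, !"example scope"}

    ; A domain and a scope named by strings; nodes with the same string name
    ; can be combined across functions and translation units:
    !2 = !{!"shared domain"}
    !3 = !{!"shared scope", !2}

    ; Scope lists are built from these nodes in the same way as before:
    !4 = !{!1}
    !5 = !{!3}
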
6128 '``fpmath``' Metadata
6129 ^^^^^^^^^^^^^^^^^^^^^
6130
6131 ``fpmath`` metadata may be attached to any instruction of floating-point
6132 type. It can be used to express the maximum acceptable error in the
6133 result of that instruction, in ULPs, thus potentially allowing the
6134 compiler to use a more efficient but less accurate method of computing
6135 it. ULP is defined as follows:
6136
6137     If ``x`` is a real number that lies between two finite consecutive
6138     floating-point numbers ``a`` and ``b``, without being equal to one
6139     of them, then ``ulp(x) = |b - a|``, otherwise ``ulp(x)`` is the
6140     distance between the two non-equal finite floating-point numbers
6141     nearest ``x``. Moreover, ``ulp(NaN)`` is ``NaN``.
6142
6143 The metadata node shall consist of a single positive float type number
6144 representing the maximum relative error, for example:
6145
6146 .. code-block:: llvm
6147
6148     !0 = !{ float 2.5 } ; maximum acceptable inaccuracy is 2.5 ULPs
6149
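As a brief sketch, the node is attached to the floating-point instruction
whose accuracy requirement is being relaxed. The function name
``@approx_div`` is hypothetical.

.. code-block:: llvm

    define float @approx_div(float %x, float %y) {
      ; The result of this division may be off by up to 2.5 ULPs.
      %q = fdiv float %x, %y, !fpmath !0
      ret float %q
    }

    !0 = !{ float 2.5 }
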
6150 .. _range-metadata:
6151
6152 '``range``' Metadata
6153 ^^^^^^^^^^^^^^^^^^^^
6154
``range`` metadata may be attached only to ``load``, ``call`` and ``invoke`` of
integer types. It expresses the possible ranges that the loaded value, or the
value returned by the called function at this call site, can lie in. If the
loaded or returned value is not in the specified range, the behavior is
undefined. The ranges are represented with a flattened list of integers. The
loaded value or the value returned is known to be in the union of the ranges
defined by each consecutive pair. Each pair has the following properties:
6162
-  The type must match the type loaded or returned by the instruction.
6164 -  The pair ``a,b`` represents the range ``[a,b)``.
6165 -  Both ``a`` and ``b`` are constants.
6166 -  The range is allowed to wrap.
6167 -  The range should not represent the full or empty set. That is,
6168    ``a!=b``.
6169
6170 In addition, the pairs must be in signed order of the lower bound and
6171 they must be non-contiguous.
6172
6173 Examples:
6174
6175 .. code-block:: llvm
6176
6177       %a = load i8, i8* %x, align 1, !range !0 ; Can only be 0 or 1
6178       %b = load i8, i8* %y, align 1, !range !1 ; Can only be 255 (-1), 0 or 1
6179       %c = call i8 @foo(),       !range !2 ; Can only be 0, 1, 3, 4 or 5
6180       %d = invoke i8 @bar() to label %cont
6181              unwind label %lpad, !range !3 ; Can only be -2, -1, 3, 4 or 5
6182     ...
6183     !0 = !{ i8 0, i8 2 }
6184     !1 = !{ i8 255, i8 2 }
6185     !2 = !{ i8 0, i8 2, i8 3, i8 6 }
6186     !3 = !{ i8 -2, i8 0, i8 3, i8 6 }
6187
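To illustrate the ordering and contiguity rules, the following sketch
contrasts two well-formed nodes with ill-formed variants. The ill-formed
variants appear only in comments, and the node numbers are arbitrary.

.. code-block:: llvm

    ; Well-formed: pairs are sorted by signed lower bound (-2 before 3).
    !0 = !{ i8 -2, i8 0, i8 3, i8 6 }
    ; Ill-formed, pairs out of signed order:
    ;   !{ i8 3, i8 6, i8 -2, i8 0 }

    ; Well-formed: [0,2) and [2,4) touch, so they are written as one pair.
    !1 = !{ i8 0, i8 4 }
    ; Ill-formed, contiguous pairs:
    ;   !{ i8 0, i8 2, i8 2, i8 4 }
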
6188 '``absolute_symbol``' Metadata
6189 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6190
``absolute_symbol`` metadata may be attached to a global variable
declaration. It marks the declaration as a reference to an absolute symbol,
which causes the backend to use absolute relocations for the symbol even
in position independent code. It also expresses the possible ranges that the
global variable's *address* (not its value) is in, in the same format as
``range`` metadata, with the extension that the pair ``all-ones,all-ones``
may be used to represent the full set.
6198
6199 Example (assuming 64-bit pointers):
6200
6201 .. code-block:: llvm
6202
6203       @a = external global i8, !absolute_symbol !0 ; Absolute symbol in range [0,256)
6204       @b = external global i8, !absolute_symbol !1 ; Absolute symbol in range [0,2^64)
6205
6206     ...
6207     !0 = !{ i64 0, i64 256 }
6208     !1 = !{ i64 -1, i64 -1 }
6209
6210 '``callees``' Metadata
6211 ^^^^^^^^^^^^^^^^^^^^^^
6212
6213 ``callees`` metadata may be attached to indirect call sites. If ``callees``
6214 metadata is attached to a call site, and any callee is not among the set of
6215 functions provided by the metadata, the behavior is undefined. The intent of
6216 this metadata is to facilitate optimizations such as indirect-call promotion.
6217 For example, in the code below, the call instruction may only target the
6218 ``add`` or ``sub`` functions:
6219
6220 .. code-block:: llvm
6221
6222     %result = call i64 %binop(i64 %x, i64 %y), !callees !0
6223
6224     ...
6225     !0 = !{i64 (i64, i64)* @add, i64 (i64, i64)* @sub}
6226
6227 '``callback``' Metadata
6228 ^^^^^^^^^^^^^^^^^^^^^^^
6229
``callback`` metadata may be attached to a function declaration or definition.
(Call sites are excluded only due to the lack of a use case.) For ease of
exposition, we'll refer to the function annotated with metadata as a broker
function. The metadata describes how the arguments of a call to the broker are
in turn passed to the callback function specified by the metadata. Thus, the
``callback`` metadata provides a partial description of a call site inside the
broker function with regard to the arguments of a call to the broker. The only
semantic restriction on the broker function itself is that it is not allowed to
inspect or modify arguments referenced in the ``callback`` metadata as
pass-through to the callback function.
6240
6241 The broker is not required to actually invoke the callback function at runtime.
6242 However, the assumptions about not inspecting or modifying arguments that would
6243 be passed to the specified callback function still hold, even if the callback
6244 function is not dynamically invoked. The broker is allowed to invoke the
6245 callback function more than once per invocation of the broker. The broker is
6246 also allowed to invoke (directly or indirectly) the function passed as a
6247 callback through another use. Finally, the broker is also allowed to relay the
6248 callback callee invocation to a different thread.
6249
6250 The metadata is structured as follows: At the outer level, ``callback``
6251 metadata is a list of ``callback`` encodings. Each encoding starts with a
6252 constant ``i64`` which describes the argument position of the callback function
6253 in the call to the broker. The following elements, except the last, describe
6254 what arguments are passed to the callback function. Each element is again an
6255 ``i64`` constant identifying the argument of the broker that is passed through,
6256 or ``i64 -1`` to indicate an unknown or inspected argument. The order in which
6257 they are listed has to be the same in which they are passed to the callback
6258 callee. The last element of the encoding is a boolean which specifies how
6259 variadic arguments of the broker are handled. If it is true, all variadic
6260 arguments of the broker are passed through to the callback function *after* the
6261 arguments encoded explicitly before.
6262
In the code below, the ``pthread_create`` function is marked as a broker
through the ``!callback !1`` metadata. In the example, there is only one
callback encoding, namely ``!2``, associated with the broker. Argument
positions are zero-based, so this encoding identifies the callback function
as the third argument of the broker (``i64 2``) and the sole argument of the
callback function as the fourth argument of the broker (``i64 3``).
6269
6270 .. FIXME why does the llvm-sphinx-docs builder give a highlighting
6271    error if the below is set to highlight as 'llvm', despite that we
6272    have misc.highlighting_failure set?
6273
6274 .. code-block:: text
6275
6276     declare !callback !1 dso_local i32 @pthread_create(i64*, %union.pthread_attr_t*, i8* (i8*)*, i8*)
6277
6278     ...
6279     !2 = !{i64 2, i64 3, i1 false}
6280     !1 = !{!2}
6281
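To make the effect of this encoding concrete, the sketch below adds a call
site to the declaration above. The functions ``@worker`` and ``@spawn`` are
hypothetical, and ``%union.pthread_attr_t`` is declared opaque as a
placeholder. The block is marked as ``text`` for the same highlighting reason
as the block above.

.. code-block:: text

    %union.pthread_attr_t = type opaque

    declare !callback !1 dso_local i32 @pthread_create(i64*, %union.pthread_attr_t*, i8* (i8*)*, i8*)
    declare i8* @worker(i8*)

    define i32 @spawn(i64* %thread, i8* %arg) {
      ; Per the encoding !2, the broker may pass %arg through unmodified,
      ; as if it performed:
      ;   call i8* @worker(i8* %arg)
      %r = call i32 @pthread_create(i64* %thread, %union.pthread_attr_t* null,
                                    i8* (i8*)* @worker, i8* %arg)
      ret i32 %r
    }

    !2 = !{i64 2, i64 3, i1 false}
    !1 = !{!2}
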
Another example is shown below. The callback callee is the third argument of
the ``__kmpc_fork_call`` function (zero-based position ``i64 2``). The callee
is given two unknown values (each identified by a ``i64 -1``) and afterwards all
variadic arguments that are passed to the ``__kmpc_fork_call`` call (due to the
final ``i1 true``).
6287
6288 .. FIXME why does the llvm-sphinx-docs builder give a highlighting
6289    error if the below is set to highlight as 'llvm', despite that we
6290    have misc.highlighting_failure set?
6291
6292 .. code-block:: text
6293
6294     declare !callback !0 dso_local void @__kmpc_fork_call(%struct.ident_t*, i32, void (i32*, i32*, ...)*, ...)
6295
6296     ...
6297     !1 = !{i64 2, i64 -1, i64 -1, i1 true}
6298     !0 = !{!1}
6299
6300
6301 '``unpredictable``' Metadata
6302 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6303
``unpredictable`` metadata may be attached to any branch or switch
instruction. It can be used to express the unpredictability of control
flow. Similar to the ``llvm.expect`` intrinsic, it may be used to alter
optimizations related to compare and branch instructions. The metadata
is treated as a boolean value; if it exists, it signals that the branch
or switch that it is attached to is completely unpredictable.
6310
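A minimal sketch of a branch carrying this metadata follows. Since only the
presence of the metadata matters, an empty node is used here; the function
name ``@select_path`` is hypothetical.

.. code-block:: llvm

    define i32 @select_path(i1 %cond) {
    entry:
      ; The direction of this branch is declared to be unpredictable.
      br i1 %cond, label %then, label %else, !unpredictable !0

    then:
      ret i32 1

    else:
      ret i32 0
    }

    !0 = !{}
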
6311 .. _md_dereferenceable:
6312
6313 '``dereferenceable``' Metadata
6314 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6315
The existence of the ``!dereferenceable`` metadata on the instruction
tells the optimizer that the value loaded is known to be dereferenceable.
The number of bytes known to be dereferenceable is specified by the integer
value in the metadata node. This is analogous to the ``dereferenceable``
attribute on parameters and return values.
6321
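As a short sketch, the metadata is placed on the load that produces the
pointer, and the node holds the byte count as an ``i64``. The function name
``@read_through`` is hypothetical.

.. code-block:: llvm

    define i32 @read_through(i32** %pp) {
      ; The loaded pointer %p is known to point to at least 4
      ; dereferenceable bytes.
      %p = load i32*, i32** %pp, align 8, !dereferenceable !0
      %v = load i32, i32* %p, align 4
      ret i32 %v
    }

    !0 = !{i64 4}
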
6322 .. _md_dereferenceable_or_null:
6323
6324 '``dereferenceable_or_null``' Metadata
6325 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6326
The existence of the ``!dereferenceable_or_null`` metadata on the
instruction tells the optimizer that the value loaded is known to be either
dereferenceable or null.
The number of bytes known to be dereferenceable is specified by the integer
value in the metadata node. This is analogous to the ``dereferenceable_or_null``
attribute on parameters and return values.
6333
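The sketch below shows one way this could appear together with an explicit
null check before the pointer is used. The function name
``@read_or_default`` is hypothetical.

.. code-block:: llvm

    define i32 @read_or_default(i32** %pp) {
    entry:
      ; %p is either null or points to at least 4 dereferenceable bytes.
      %p = load i32*, i32** %pp, align 8, !dereferenceable_or_null !0
      %isnull = icmp eq i32* %p, null
      br i1 %isnull, label %default, label %read

    read:
      %v = load i32, i32* %p, align 4
      ret i32 %v

    default:
      ret i32 0
    }

    !0 = !{i64 4}
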
6334 .. _llvm.loop:
6335
6336 '``llvm.loop``'
6337 ^^^^^^^^^^^^^^^
6338
6339 It is sometimes useful to attach information to loop constructs. Currently,
6340 loop metadata is implemented as metadata attached to the branch instruction
6341 in the loop latch block. The loop metadata node is a list of
6342 other metadata nodes, each representing a property of the loop. Usually,
6343 the first item of the property node is a string. For example, the
6344 ``llvm.loop.unroll.count`` suggests an unroll factor to the loop
6345 unroller:
6346
6347 .. code-block:: llvm
6348
6349       br i1 %exitcond, label %._crit_edge, label %.lr.ph, !llvm.loop !0
6350     ...
6351     !0 = !{!0, !1, !2}
6352     !1 = !{!"llvm.loop.unroll.enable"}
6353     !2 = !{!"llvm.loop.unroll.count", i32 4}
6354
6355 For legacy reasons, the first item of a loop metadata node must be a
6356 reference to itself. Before the advent of the 'distinct' keyword, this
6357 forced the preservation of otherwise identical metadata nodes. Since
6358 the loop-metadata node can be attached to multiple nodes, the 'distinct'
6359 keyword has become unnecessary.
6360
6361 Prior to the property nodes, one or two ``DILocation`` (debug location)
6362 nodes can be present in the list. The first, if present, identifies the
6363 source-code location where the loop begins. The second, if present,
6364 identifies the source-code location where the loop ends.
6365
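A sketch of a loop metadata node carrying both debug locations ahead of a
property node is shown below. The line and column numbers are invented, and
``!4`` is assumed to be a scope node (for example a ``DISubprogram``) defined
elsewhere in the module.

.. code-block:: llvm

      br i1 %exitcond, label %exit, label %loop, !llvm.loop !0
    ...
    !0 = !{!0, !1, !2, !3}
    !1 = !DILocation(line: 3, column: 3, scope: !4) ; where the loop begins
    !2 = !DILocation(line: 6, column: 3, scope: !4) ; where the loop ends
    !3 = !{!"llvm.loop.unroll.count", i32 4}
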
6366 Loop metadata nodes cannot be used as unique identifiers. They are
6367 neither persistent for the same loop through transformations nor
6368 necessarily unique to just one loop.
6369
6370 '``llvm.loop.disable_nonforced``'
6371 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6372
This metadata disables all optional loop transformations unless
explicitly instructed using other transformation metadata such as
``llvm.loop.unroll.enable``. That is, no heuristic will try to determine
whether a transformation is profitable. The purpose is to prevent the
loop from being transformed into a different loop before an explicitly
requested (forced) transformation is applied. For instance, loop fusion
can make other transformations impossible. Mandatory loop
canonicalizations such as loop rotation are still applied.
6381
It is recommended to use this metadata in addition to any ``llvm.loop.*``
transformation directive. Also, any loop should have at most one
directive applied to it (and a sequence of transformations built using
followup-attributes). Otherwise, which transformation will be applied
depends on implementation details such as the pass pipeline order.
6387
6388 See :ref:`transformation-metadata` for details.
6389
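As an illustration of the recommendation above, the following sketch combines
``llvm.loop.disable_nonforced`` with a single explicit unrolling directive.
The block labels and metadata numbers are arbitrary.

.. code-block:: llvm

      br i1 %exitcond, label %exit, label %loop, !llvm.loop !0
    ...
    !0 = !{!0, !1, !2}
    ; No heuristic-driven transformation is applied to this loop ...
    !1 = !{!"llvm.loop.disable_nonforced"}
    ; ... but the explicitly requested unrolling is still honored.
    !2 = !{!"llvm.loop.unroll.count", i32 4}
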
6390 '``llvm.loop.vectorize``' and '``llvm.loop.interleave``'
6391 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6392
6393 Metadata prefixed with ``llvm.loop.vectorize`` or ``llvm.loop.interleave`` are
6394 used to control per-loop vectorization and interleaving parameters such as
6395 vectorization width and interleave count. These metadata should be used in
6396 conjunction with ``llvm.loop`` loop identification metadata. The
6397 ``llvm.loop.vectorize`` and ``llvm.loop.interleave`` metadata are only
6398 optimization hints and the optimizer will only interleave and vectorize loops if
it believes it is safe to do so. The ``llvm.loop.parallel_accesses`` metadata,
which contains information about loop-carried memory dependencies, can be helpful
in determining the safety of these transformations.
6402
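The sketch below shows how several of these hints might be combined on one
loop through its ``llvm.loop`` node. The individual hints are described in the
following sections. The function ``@scale`` and the chosen width and
interleave count are assumptions for illustration, and the optimizer remains
free to ignore the hints.

.. code-block:: llvm

    define void @scale(float* %a, i64 %n) {
    entry:
      ; Skip the loop entirely when there is nothing to do.
      %empty = icmp eq i64 %n, 0
      br i1 %empty, label %exit, label %loop

    loop:
      %i = phi i64 [ 0, %entry ], [ %i.next, %loop ]
      %p = getelementptr inbounds float, float* %a, i64 %i
      %v = load float, float* %p, align 4
      %w = fmul float %v, 2.000000e+00
      store float %w, float* %p, align 4
      %i.next = add nuw i64 %i, 1
      %done = icmp eq i64 %i.next, %n
      ; The loop metadata is attached to the branch in the latch block.
      br i1 %done, label %exit, label %loop, !llvm.loop !0

    exit:
      ret void
    }

    !0 = !{!0, !1, !2, !3}
    !1 = !{!"llvm.loop.vectorize.enable", i1 1}
    !2 = !{!"llvm.loop.vectorize.width", i32 4}
    !3 = !{!"llvm.loop.interleave.count", i32 2}
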
6403 '``llvm.loop.interleave.count``' Metadata
6404 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6405
6406 This metadata suggests an interleave count to the loop interleaver.
6407 The first operand is the string ``llvm.loop.interleave.count`` and the
6408 second operand is an integer specifying the interleave count. For
6409 example:
6410
6411 .. code-block:: llvm
6412
6413    !0 = !{!"llvm.loop.interleave.count", i32 4}
6414
6415 Note that setting ``llvm.loop.interleave.count`` to 1 disables interleaving
6416 multiple iterations of the loop. If ``llvm.loop.interleave.count`` is set to 0
6417 then the interleave count will be determined automatically.
6418
6419 '``llvm.loop.vectorize.enable``' Metadata
6420 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6421
This metadata selectively enables or disables vectorization for the loop. The
first operand is the string ``llvm.loop.vectorize.enable`` and the second operand
is a bit. If the bit operand value is 1, vectorization is enabled. A value of
0 disables vectorization:
6426
6427 .. code-block:: llvm
6428
6429    !0 = !{!"llvm.loop.vectorize.enable", i1 0}
6430    !1 = !{!"llvm.loop.vectorize.enable", i1 1}
6431
6432 '``llvm.loop.vectorize.predicate.enable``' Metadata
6433 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6434
This metadata selectively enables or disables creating predicated instructions
for the loop, which can enable folding of the scalar epilogue loop into the
main loop. The first operand is the string
``llvm.loop.vectorize.predicate.enable`` and the second operand is a bit. If
the bit operand value is 1, predication is enabled. A value of 0 disables
predication:
6441
6442 .. code-block:: llvm
6443
6444    !0 = !{!"llvm.loop.vectorize.predicate.enable", i1 0}
6445    !1 = !{!"llvm.loop.vectorize.predicate.enable", i1 1}
6446
6447 '``llvm.loop.vectorize.scalable.enable``' Metadata
6448 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6449
This metadata selectively enables or disables scalable vectorization for the
loop, and only has any effect if vectorization for the loop is already enabled.
The first operand is the string ``llvm.loop.vectorize.scalable.enable``
and the second operand is a bit. If the bit operand value is 1, scalable
vectorization is enabled, whereas a value of 0 reverts to the default
fixed-width vectorization:
6456
6457 .. code-block:: llvm
6458
6459    !0 = !{!"llvm.loop.vectorize.scalable.enable", i1 0}
6460    !1 = !{!"llvm.loop.vectorize.scalable.enable", i1 1}
6461
6462 '``llvm.loop.vectorize.width``' Metadata
6463 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6464
6465 This metadata sets the target width of the vectorizer. The first
6466 operand is the string ``llvm.loop.vectorize.width`` and the second
6467 operand is an integer specifying the width. For example:
6468
6469 .. code-block:: llvm
6470
6471    !0 = !{!"llvm.loop.vectorize.width", i32 4}
6472
Note that setting ``llvm.loop.vectorize.width`` to 1 disables
vectorization of the loop. If ``llvm.loop.vectorize.width`` is set to
0, or if the loop does not have this metadata, the width will be
determined automatically.
6477
6478 '``llvm.loop.vectorize.followup_vectorized``' Metadata
6479 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6480
6481 This metadata defines which loop attributes the vectorized loop will
6482 have. See :ref:`transformation-metadata` for details.
6483
6484 '``llvm.loop.vectorize.followup_epilogue``' Metadata
6485 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6486
6487 This metadata defines which loop attributes the epilogue will have. The
6488 epilogue is not vectorized and is executed when either the vectorized
6489 loop is not known to preserve semantics (because e.g., it processes two
6490 arrays that are found to alias by a runtime check) or for the last
6491 iterations that do not fill a complete set of vector lanes. See
6492 :ref:`Transformation Metadata <transformation-metadata>` for details.
6493
6494 '``llvm.loop.vectorize.followup_all``' Metadata
6495 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6496
6497 Attributes in the metadata will be added to both the vectorized and
6498 epilogue loop.
6499 See :ref:`Transformation Metadata <transformation-metadata>` for details.
6500
6501 '``llvm.loop.unroll``'
6502 ^^^^^^^^^^^^^^^^^^^^^^
6503
6504 Metadata prefixed with ``llvm.loop.unroll`` are loop unrolling
6505 optimization hints such as the unroll factor. ``llvm.loop.unroll``
6506 metadata should be used in conjunction with ``llvm.loop`` loop
6507 identification metadata. The ``llvm.loop.unroll`` metadata are only
6508 optimization hints and the unrolling will only be performed if the
6509 optimizer believes it is safe to do so.
6510
6511 '``llvm.loop.unroll.count``' Metadata
6512 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6513
6514 This metadata suggests an unroll factor to the loop unroller. The
6515 first operand is the string ``llvm.loop.unroll.count`` and the second
6516 operand is a positive integer specifying the unroll factor. For
6517 example:
6518
6519 .. code-block:: llvm
6520
6521    !0 = !{!"llvm.loop.unroll.count", i32 4}
6522
If the trip count of the loop is less than the unroll count, the loop
will be partially unrolled.
6525
6526 '``llvm.loop.unroll.disable``' Metadata
6527 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6528
6529 This metadata disables loop unrolling. The metadata has a single operand
6530 which is the string ``llvm.loop.unroll.disable``. For example:
6531
6532 .. code-block:: llvm
6533
6534    !0 = !{!"llvm.loop.unroll.disable"}
6535
6536 '``llvm.loop.unroll.runtime.disable``' Metadata
6537 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6538
6539 This metadata disables runtime loop unrolling. The metadata has a single
6540 operand which is the string ``llvm.loop.unroll.runtime.disable``. For example:
6541
6542 .. code-block:: llvm
6543
6544    !0 = !{!"llvm.loop.unroll.runtime.disable"}
6545
6546 '``llvm.loop.unroll.enable``' Metadata
6547 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6548
6549 This metadata suggests that the loop should be fully unrolled if the trip count
6550 is known at compile time and partially unrolled if the trip count is not known
6551 at compile time. The metadata has a single operand which is the string
6552 ``llvm.loop.unroll.enable``.  For example:
6553
6554 .. code-block:: llvm
6555
6556    !0 = !{!"llvm.loop.unroll.enable"}
6557
6558 '``llvm.loop.unroll.full``' Metadata
6559 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6560
6561 This metadata suggests that the loop should be unrolled fully. The
6562 metadata has a single operand which is the string ``llvm.loop.unroll.full``.
6563 For example:
6564
6565 .. code-block:: llvm
6566
6567    !0 = !{!"llvm.loop.unroll.full"}
6568
6569 '``llvm.loop.unroll.followup``' Metadata
6570 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6571
6572 This metadata defines which loop attributes the unrolled loop will have.
6573 See :ref:`Transformation Metadata <transformation-metadata>` for details.
6574
6575 '``llvm.loop.unroll.followup_remainder``' Metadata
6576 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6577
6578 This metadata defines which loop attributes the remainder loop after
6579 partial/runtime unrolling will have. See
6580 :ref:`Transformation Metadata <transformation-metadata>` for details.
6581
6582 '``llvm.loop.unroll_and_jam``'
6583 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6584
This metadata is treated very similarly to the ``llvm.loop.unroll`` metadata
above, but affects the unroll and jam pass. In addition, any loop with
``llvm.loop.unroll`` metadata but no ``llvm.loop.unroll_and_jam`` metadata will
disable unroll and jam (so ``llvm.loop.unroll`` metadata will be left to the
unroller, and ``llvm.loop.unroll.disable`` metadata will disable unroll and jam
too).
6591
The metadata for unroll and jam is otherwise the same as for ``unroll``.
``llvm.loop.unroll_and_jam.enable``, ``llvm.loop.unroll_and_jam.disable`` and
``llvm.loop.unroll_and_jam.count`` do the same as for unroll.
``llvm.loop.unroll_and_jam.full`` is not supported. Again, these are only hints
and the normal safety checks will still be performed.
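
For illustration, the sketch below assumes a two-level loop nest and places
the hint on the outer loop, which is the loop that unroll and jam unrolls.
The block names and exit conditions are placeholders:

.. code-block:: llvm

   outer.for.body:
     ...
     br label %inner.for.body

   inner.for.body:
     ...
     br i1 %exitcond.inner, label %inner.for.end, label %inner.for.body

   inner.for.end:
     ...
     br i1 %exitcond.outer, label %outer.for.end, label %outer.for.body, !llvm.loop !0
   ...
   !0 = distinct !{!0, !1}                            ; identifier of the outer loop
   !1 = !{!"llvm.loop.unroll_and_jam.count", i32 4}   ; suggested unroll and jam factor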
6597
6598 '``llvm.loop.unroll_and_jam.count``' Metadata
6599 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6600
6601 This metadata suggests an unroll and jam factor to use, similarly to
6602 ``llvm.loop.unroll.count``. The first operand is the string
6603 ``llvm.loop.unroll_and_jam.count`` and the second operand is a positive integer
6604 specifying the unroll factor. For example:
6605
6606 .. code-block:: llvm
6607
6608    !0 = !{!"llvm.loop.unroll_and_jam.count", i32 4}
6609
If the trip count of the loop is less than the unroll count, the loop
will be partially unrolled and jammed.
6612
6613 '``llvm.loop.unroll_and_jam.disable``' Metadata
6614 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6615
6616 This metadata disables loop unroll and jamming. The metadata has a single
6617 operand which is the string ``llvm.loop.unroll_and_jam.disable``. For example:
6618
6619 .. code-block:: llvm
6620
6621    !0 = !{!"llvm.loop.unroll_and_jam.disable"}
6622
6623 '``llvm.loop.unroll_and_jam.enable``' Metadata
6624 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6625
This metadata suggests that the loop should be fully unrolled and jammed if the
trip count is known at compile time and partially unrolled if the trip count is
not known at compile time. The metadata has a single operand which is the
string ``llvm.loop.unroll_and_jam.enable``. For example:
6630
6631 .. code-block:: llvm
6632
6633    !0 = !{!"llvm.loop.unroll_and_jam.enable"}
6634
6635 '``llvm.loop.unroll_and_jam.followup_outer``' Metadata
6636 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6637
6638 This metadata defines which loop attributes the outer unrolled loop will
6639 have. See :ref:`Transformation Metadata <transformation-metadata>` for
6640 details.
6641
6642 '``llvm.loop.unroll_and_jam.followup_inner``' Metadata
6643 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6644
6645 This metadata defines which loop attributes the inner jammed loop will
6646 have. See :ref:`Transformation Metadata <transformation-metadata>` for
6647 details.
6648
6649 '``llvm.loop.unroll_and_jam.followup_remainder_outer``' Metadata
6650 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6651
6652 This metadata defines which attributes the epilogue of the outer loop
6653 will have. This loop is usually unrolled, meaning there is no such
6654 loop. This attribute will be ignored in this case. See
6655 :ref:`Transformation Metadata <transformation-metadata>` for details.
6656
6657 '``llvm.loop.unroll_and_jam.followup_remainder_inner``' Metadata
6658 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6659
6660 This metadata defines which attributes the inner loop of the epilogue
6661 will have. The outer epilogue will usually be unrolled, meaning there
6662 can be multiple inner remainder loops. See
6663 :ref:`Transformation Metadata <transformation-metadata>` for details.
6664
6665 '``llvm.loop.unroll_and_jam.followup_all``' Metadata
6666 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6667
Attributes specified in the metadata are added to all
``llvm.loop.unroll_and_jam.*`` loops. See
6670 :ref:`Transformation Metadata <transformation-metadata>` for details.
6671
6672 '``llvm.loop.licm_versioning.disable``' Metadata
6673 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6674
6675 This metadata indicates that the loop should not be versioned for the purpose
6676 of enabling loop-invariant code motion (LICM). The metadata has a single operand
6677 which is the string ``llvm.loop.licm_versioning.disable``. For example:
6678
6679 .. code-block:: llvm
6680
6681    !0 = !{!"llvm.loop.licm_versioning.disable"}
6682
6683 '``llvm.loop.distribute.enable``' Metadata
6684 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6685
6686 Loop distribution allows splitting a loop into multiple loops.  Currently,
6687 this is only performed if the entire loop cannot be vectorized due to unsafe
6688 memory dependencies.  The transformation will attempt to isolate the unsafe
6689 dependencies into their own loop.
6690
6691 This metadata can be used to selectively enable or disable distribution of the
6692 loop.  The first operand is the string ``llvm.loop.distribute.enable`` and the
6693 second operand is a bit. If the bit operand value is 1 distribution is
6694 enabled. A value of 0 disables distribution:
6695
6696 .. code-block:: llvm
6697
6698    !0 = !{!"llvm.loop.distribute.enable", i1 0}
6699    !1 = !{!"llvm.loop.distribute.enable", i1 1}
6700
6701 This metadata should be used in conjunction with ``llvm.loop`` loop
6702 identification metadata.
6703
6704 '``llvm.loop.distribute.followup_coincident``' Metadata
6705 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6706
6707 This metadata defines which attributes extracted loops with no cyclic
6708 dependencies will have (i.e. can be vectorized). See
6709 :ref:`Transformation Metadata <transformation-metadata>` for details.
6710
6711 '``llvm.loop.distribute.followup_sequential``' Metadata
6712 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6713
6714 This metadata defines which attributes the isolated loops with unsafe
6715 memory dependencies will have. See
6716 :ref:`Transformation Metadata <transformation-metadata>` for details.
6717
6718 '``llvm.loop.distribute.followup_fallback``' Metadata
6719 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6720
If loop versioning is necessary, this metadata defines the attributes
the non-distributed fallback version will have. See
6723 :ref:`Transformation Metadata <transformation-metadata>` for details.
6724
6725 '``llvm.loop.distribute.followup_all``' Metadata
6726 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6727
The attributes in this metadata are added to all followup loops of the
6729 loop distribution pass. See
6730 :ref:`Transformation Metadata <transformation-metadata>` for details.
6731
6732 '``llvm.licm.disable``' Metadata
6733 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6734
6735 This metadata indicates that loop-invariant code motion (LICM) should not be
6736 performed on this loop. The metadata has a single operand which is the string
6737 ``llvm.licm.disable``. For example:
6738
6739 .. code-block:: llvm
6740
6741    !0 = !{!"llvm.licm.disable"}
6742
Note that although it operates per loop, it isn't given the ``llvm.loop``
prefix, as it is not affected by the ``llvm.loop.disable_nonforced`` metadata.
6745
6746 '``llvm.access.group``' Metadata
6747 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6748
``llvm.access.group`` metadata can be attached to any instruction that
potentially accesses memory. It can point to a single distinct metadata
node, which we call an access group. This node represents all memory access
instructions referring to it via ``llvm.access.group``. When an
instruction belongs to multiple access groups, it can also point to a
list of access groups, as illustrated by the following example.
6755
6756 .. code-block:: llvm
6757
6758    %val = load i32, i32* %arrayidx, !llvm.access.group !0
6759    ...
6760    !0 = !{!1, !2}
6761    !1 = distinct !{}
6762    !2 = distinct !{}
6763
6764 It is illegal for the list node to be empty since it might be confused
6765 with an access group.
6766
The access group metadata node must be 'distinct' to avoid collapsing
multiple access groups by content. An access group metadata node must
always be empty, which distinguishes it from a list of access groups.
Being empty also avoids the situation where the content must be updated,
which, because metadata is immutable by design, would require finding and
updating all references to the access group node.
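
The two node shapes can be told apart as in the following sketch. The
pointers ``%p`` and ``%q`` are placeholders introduced for illustration:

.. code-block:: llvm

   %a = load i32, i32* %p, !llvm.access.group !0   ; member of a single access group
   %b = load i32, i32* %q, !llvm.access.group !1   ; member of two access groups
   ...
   !0 = distinct !{}     ; an access group: distinct and empty
   !1 = !{!0, !2}        ; a list of access groups: never empty
   !2 = distinct !{}     ; another access group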
6774
6775 The access group can be used to refer to a memory access instruction
6776 without pointing to it directly (which is not possible in global
6777 metadata). Currently, the only metadata making use of it is
6778 ``llvm.loop.parallel_accesses``.
6779
6780 '``llvm.loop.parallel_accesses``' Metadata
6781 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6782
The ``llvm.loop.parallel_accesses`` metadata refers to one or more
access group metadata nodes (see ``llvm.access.group``). It denotes that
no loop-carried memory dependence exists between instructions in the loop
that belong to those access groups.
6787
Let ``m1`` and ``m2`` be two instructions that have ``llvm.access.group``
metadata referring to the access groups ``g1`` and ``g2``, respectively
(which might be identical). If a loop contains both access groups
in its ``llvm.loop.parallel_accesses`` metadata, then the compiler can
assume that there is no dependency between ``m1`` and ``m2`` carried by
this loop. Instructions that belong to multiple access groups are
considered to have this property if at least one of their access groups
appears in the ``llvm.loop.parallel_accesses`` list.
6796
6797 If all memory-accessing instructions in a loop have
6798 ``llvm.access.group`` metadata that each refer to one of the access
6799 groups of a loop's ``llvm.loop.parallel_accesses`` metadata, then the
loop has no loop-carried memory dependences and is considered to be a
6801 parallel loop.
6802
Note that if not all memory access instructions belong to an access
group referred to by ``llvm.loop.parallel_accesses``, then the loop must
not be considered trivially parallel. Additional
memory dependence analysis is required to make that determination. This acts
as a fail-safe mechanism: if an optimization pass that is unaware of the
parallel semantics inserts new memory instructions into the loop body, a loop
that was originally parallel is conservatively treated as sequential.
6810
6811 Example of a loop that is considered parallel due to its correct use of
6812 both ``llvm.access.group`` and ``llvm.loop.parallel_accesses``
6813 metadata types.
6814
6815 .. code-block:: llvm
6816
6817    for.body:
6818      ...
6819      %val0 = load i32, i32* %arrayidx, !llvm.access.group !1
6820      ...
6821      store i32 %val0, i32* %arrayidx1, !llvm.access.group !1
6822      ...
6823      br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !0
6824
6825    for.end:
6826    ...
6827    !0 = distinct !{!0, !{!"llvm.loop.parallel_accesses", !1}}
6828    !1 = distinct !{}
6829
6830 It is also possible to have nested parallel loops:
6831
6832 .. code-block:: llvm
6833
6834    outer.for.body:
6835      ...
6836      %val1 = load i32, i32* %arrayidx3, !llvm.access.group !4
6837      ...
6838      br label %inner.for.body
6839
6840    inner.for.body:
6841      ...
6842      %val0 = load i32, i32* %arrayidx1, !llvm.access.group !3
6843      ...
6844      store i32 %val0, i32* %arrayidx2, !llvm.access.group !3
6845      ...
6846      br i1 %exitcond, label %inner.for.end, label %inner.for.body, !llvm.loop !1
6847
6848    inner.for.end:
6849      ...
6850      store i32 %val1, i32* %arrayidx4, !llvm.access.group !4
6851      ...
6852      br i1 %exitcond, label %outer.for.end, label %outer.for.body, !llvm.loop !2
6853
   outer.for.end:
6855    ...
6856    !1 = distinct !{!1, !{!"llvm.loop.parallel_accesses", !3}}     ; metadata for the inner loop
6857    !2 = distinct !{!2, !{!"llvm.loop.parallel_accesses", !3, !4}} ; metadata for the outer loop
6858    !3 = distinct !{} ; access group for instructions in the inner loop (which are implicitly contained in outer loop as well)
6859    !4 = distinct !{} ; access group for instructions in the outer, but not the inner loop
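
For contrast, the following sketch shows a loop that must not be treated as
trivially parallel, because one of its memory accesses carries no
``llvm.access.group`` metadata. It reuses the names of the first example above:

.. code-block:: llvm

   for.body:
     ...
     %val0 = load i32, i32* %arrayidx, !llvm.access.group !1
     ...
     store i32 %val0, i32* %arrayidx1      ; no access group on this store
     ...
     br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !0

   for.end:
   ...
   !0 = distinct !{!0, !{!"llvm.loop.parallel_accesses", !1}}
   !1 = distinct !{}

Dependence analysis would still be needed for the store before the loop could
be considered parallel.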
6860
6861 '``llvm.loop.mustprogress``' Metadata
6862 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6863
6864 The ``llvm.loop.mustprogress`` metadata indicates that this loop is required to
6865 terminate, unwind, or interact with the environment in an observable way e.g.
6866 via a volatile memory access, I/O, or other synchronization. If such a loop is
6867 not found to interact with the environment in an observable way, the loop may
6868 be removed. This corresponds to the ``mustprogress`` function attribute.
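
A minimal sketch of the metadata on a loop, assuming the usual ``llvm.loop``
attachment on the loop's closing branch. The loop below has no observable
effects, so with this metadata it may be removed:

.. code-block:: llvm

   loop:
     br label %loop, !llvm.loop !0
   ...
   !0 = distinct !{!0, !1}
   !1 = !{!"llvm.loop.mustprogress"}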
6869
6870 '``irr_loop``' Metadata
6871 ^^^^^^^^^^^^^^^^^^^^^^^
6872
``irr_loop`` metadata may be attached to the terminator instruction of a basic
block that's an irreducible loop header (note that an irreducible loop has more
than one header basic block). If ``irr_loop`` metadata is attached to the
6876 terminator instruction of a basic block that is not really an irreducible loop
6877 header, the behavior is undefined. The intent of this metadata is to improve the
6878 accuracy of the block frequency propagation. For example, in the code below, the
6879 block ``header0`` may have a loop header weight (relative to the other headers of
6880 the irreducible loop) of 100:
6881
6882 .. code-block:: llvm
6883
6884     header0:
6885     ...
6886     br i1 %cmp, label %t1, label %t2, !irr_loop !0
6887
6888     ...
    !0 = !{!"loop_header_weight", i64 100}
6890
6891 Irreducible loop header weights are typically based on profile data.
6892
6893 .. _md_invariant.group:
6894
6895 '``invariant.group``' Metadata
6896 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6897
6898 The experimental ``invariant.group`` metadata may be attached to
6899 ``load``/``store`` instructions referencing a single metadata with no entries.
6900 The existence of the ``invariant.group`` metadata on the instruction tells
6901 the optimizer that every ``load`` and ``store`` to the same pointer operand
6902 can be assumed to load or store the same
6903 value (but see the ``llvm.launder.invariant.group`` intrinsic which affects
6904 when two pointers are considered the same). Pointers returned by bitcast or
6905 getelementptr with only zero indices are considered the same.
6906
6907 Examples:
6908
6909 .. code-block:: llvm
6910
6911    @unknownPtr = external global i8
6912    ...
6913    %ptr = alloca i8
6914    store i8 42, i8* %ptr, !invariant.group !0
6915    call void @foo(i8* %ptr)
6916
6917    %a = load i8, i8* %ptr, !invariant.group !0 ; Can assume that value under %ptr didn't change
6918    call void @foo(i8* %ptr)
6919
6920    %newPtr = call i8* @getPointer(i8* %ptr)
6921    %c = load i8, i8* %newPtr, !invariant.group !0 ; Can't assume anything, because we only have information about %ptr
6922
6923    %unknownValue = load i8, i8* @unknownPtr
6924    store i8 %unknownValue, i8* %ptr, !invariant.group !0 ; Can assume that %unknownValue == 42
6925
6926    call void @foo(i8* %ptr)
6927    %newPtr2 = call i8* @llvm.launder.invariant.group(i8* %ptr)
6928    %d = load i8, i8* %newPtr2, !invariant.group !0  ; Can't step through launder.invariant.group to get value of %ptr
6929
6930    ...
6931    declare void @foo(i8*)
6932    declare i8* @getPointer(i8*)
6933    declare i8* @llvm.launder.invariant.group(i8*)
6934
6935    !0 = !{}
6936
6937 The invariant.group metadata must be dropped when replacing one pointer by
6938 another based on aliasing information. This is because invariant.group is tied
6939 to the SSA value of the pointer operand.
6940
6941 .. code-block:: llvm
6942
6943   %v = load i8, i8* %x, !invariant.group !0
6944   ; if %x mustalias %y then we can replace the above instruction with
6945   %v = load i8, i8* %y
6946
6947 Note that this is an experimental feature, which means that its semantics might
6948 change in the future.
6949
6950 '``type``' Metadata
6951 ^^^^^^^^^^^^^^^^^^^
6952
6953 See :doc:`TypeMetadata`.
6954
6955 '``associated``' Metadata
6956 ^^^^^^^^^^^^^^^^^^^^^^^^^
6957
6958 The ``associated`` metadata may be attached to a global variable definition with
6959 a single argument that references a global object (optionally through an alias).
6960
6961 This metadata lowers to the ELF section flag ``SHF_LINK_ORDER`` which prevents
6962 discarding of the global variable in linker GC unless the referenced object is
6963 also discarded. The linker support for this feature is spotty. For best
6964 compatibility, globals carrying this metadata should:
6965
6966 - Be in ``@llvm.compiler.used``.
6967 - If the referenced global variable is in a comdat, be in the same comdat.
6968
``!associated`` cannot express a many-to-one relationship. A global variable with
6970 the metadata should generally not be referenced by a function: the function may
6971 be inlined into other functions, leading to more references to the metadata.
6972 Ideally we would want to keep metadata alive as long as any inline location is
6973 alive, but this many-to-one relationship is not representable. Moreover, if the
6974 metadata is retained while the function is discarded, the linker will report an
6975 error of a relocation referencing a discarded section.
6976
6977 The metadata is often used with an explicit section consisting of valid C
6978 identifiers so that the runtime can find the metadata section with
6979 linker-defined encapsulation symbols ``__start_<section_name>`` and
6980 ``__stop_<section_name>``.
6981
6982 It does not have any effect on non-ELF targets.
6983
6984 Example:
6985
6986 .. code-block:: text
6987
6988     $a = comdat any
6989     @a = global i32 1, comdat $a
6990     @b = internal global i32 2, comdat $a, section "abc", !associated !0
6991     !0 = !{i32* @a}
6992
6993
6994 '``prof``' Metadata
6995 ^^^^^^^^^^^^^^^^^^^
6996
6997 The ``prof`` metadata is used to record profile data in the IR.
6998 The first operand of the metadata node indicates the profile metadata
6999 type. There are currently 3 types:
7000 :ref:`branch_weights<prof_node_branch_weights>`,
7001 :ref:`function_entry_count<prof_node_function_entry_count>`, and
7002 :ref:`VP<prof_node_VP>`.
7003
7004 .. _prof_node_branch_weights:
7005
7006 branch_weights
7007 """"""""""""""
7008
Branch weight metadata attached to a branch, select, switch or call instruction
represents the likelihood of the associated branch being taken.
7011 For more information, see :doc:`BranchWeightMetadata`.
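
As a small sketch, a conditional branch whose true edge is taken far more often
than its false edge could be annotated as follows. The weights and block names
are invented for illustration:

.. code-block:: llvm

   br i1 %cond, label %hot, label %cold, !prof !0
   ...
   !0 = !{!"branch_weights", i32 64, i32 4}   ; weights for the true and false successors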
7012
7013 .. _prof_node_function_entry_count:
7014
7015 function_entry_count
7016 """"""""""""""""""""
7017
7018 Function entry count metadata can be attached to function definitions
to record the number of times the function is called. Together with block
frequency information (BFI), it is also used to derive basic block profile counts.
7021 For more information, see :doc:`BranchWeightMetadata`.
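
For instance, a function recorded as having been entered 100 times might look
like this sketch, where the function name and the count are placeholders:

.. code-block:: llvm

   define void @f() !prof !0 {
     ret void
   }
   !0 = !{!"function_entry_count", i64 100}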
7022
7023 .. _prof_node_VP:
7024
7025 VP
7026 ""
7027
7028 VP (value profile) metadata can be attached to instructions that have
7029 value profile information. Currently this is indirect calls (where it
7030 records the hottest callees) and calls to memory intrinsics such as memcpy,
7031 memmove, and memset (where it records the hottest byte lengths).
7032
Each VP metadata node contains the "VP" string, then a uint32_t value for the value
7034 profiling kind, a uint64_t value for the total number of times the instruction
7035 is executed, followed by uint64_t value and execution count pairs.
7036 The value profiling kind is 0 for indirect call targets and 1 for memory
7037 operations. For indirect call targets, each profile value is a hash
7038 of the callee function name, and for memory operations each value is the
7039 byte length.
7040
7041 Note that the value counts do not need to add up to the total count
7042 listed in the third operand (in practice only the top hottest values
7043 are tracked and reported).
7044
7045 Indirect call example:
7046
7047 .. code-block:: llvm
7048
7049     call void %f(), !prof !1
7050     !1 = !{!"VP", i32 0, i64 1600, i64 7651369219802541373, i64 1030, i64 -4377547752858689819, i64 410}
7051
Note that the VP type is 0 (the second operand), which indicates that this is
value profile data for an indirect call. The third operand indicates that the
indirect call executed 1600 times. The 4th and 6th operands give the
hashes of the 2 hottest target functions' names (this is the same hash used
to represent function names in the profile database), and the 5th and 7th
operands give the number of times each of the respective preceding target
functions was called.
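
A memory-operation example along the same lines, with counts invented for
illustration and ``%dst``, ``%src`` and ``%n`` as placeholders:

.. code-block:: llvm

    call void @llvm.memcpy.p0i8.p0i8.i64(i8* %dst, i8* %src, i64 %n, i1 false), !prof !1
    !1 = !{!"VP", i32 1, i64 1000, i64 8, i64 600, i64 16, i64 300}

Here the VP type is 1, the call executed 1000 times, a byte length of 8 was
seen 600 times and a byte length of 16 was seen 300 times. The listed counts
add up to 900, which is permitted because only the hottest values are recorded.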
7059
7060 .. _md_annotation:
7061
7062 '``annotation``' Metadata
7063 ^^^^^^^^^^^^^^^^^^^^^^^^^
7064
7065 The ``annotation`` metadata can be used to attach a tuple of annotation strings
7066 to any instruction. This metadata does not impact the semantics of the program
7067 and may only be used to provide additional insight about the program and
7068 transformations to users.
7069
7070 Example:
7071
7072 .. code-block:: text
7073
7074     %a.addr = alloca float*, align 8, !annotation !0
7075     !0 = !{!"auto-init"}
7076
7077 Module Flags Metadata
7078 =====================
7079
7080 Information about the module as a whole is difficult to convey to LLVM's
7081 subsystems. The LLVM IR isn't sufficient to transmit this information.
7082 The ``llvm.module.flags`` named metadata exists in order to facilitate
this. These flags are in the form of key / value pairs --- much like a
dictionary --- making it easy for any subsystem that cares about a flag to
look it up.
7086
7087 The ``llvm.module.flags`` metadata contains a list of metadata triplets.
7088 Each triplet has the following form:
7089
-  The first element is a *behavior* flag, which specifies the behavior
   when two (or more) modules are merged together and the merge encounters
   two (or more) metadata with the same ID. The supported behaviors are
   described below.
7094 -  The second element is a metadata string that is a unique ID for the
7095    metadata. Each module may only have one flag entry for each unique ID (not
7096    including entries with the **Require** behavior).
7097 -  The third element is the value of the flag.
7098
7099 When two (or more) modules are merged together, the resulting
7100 ``llvm.module.flags`` metadata is the union of the modules' flags. That is, for
each unique metadata ID string, there will be exactly one entry in the merged
module's ``llvm.module.flags`` metadata table, and the value for that entry will
7103 be determined by the merge behavior flag, as described below. The only exception
7104 is that entries with the *Require* behavior are always preserved.
7105
7106 The following behaviors are supported:
7107
7108 .. list-table::
7109    :header-rows: 1
7110    :widths: 10 90
7111
7112    * - Value
7113      - Behavior
7114
7115    * - 1
7116      - **Error**
7117            Emits an error if two values disagree, otherwise the resulting value
7118            is that of the operands.
7119
7120    * - 2
7121      - **Warning**
7122            Emits a warning if two values disagree. The result value will be the
7123            operand for the flag from the first module being linked, or the max
7124            if the other module uses **Max** (in which case the resulting flag
7125            will be **Max**).
7126
7127    * - 3
7128      - **Require**
7129            Adds a requirement that another module flag be present and have a
7130            specified value after linking is performed. The value must be a
7131            metadata pair, where the first element of the pair is the ID of the
7132            module flag to be restricted, and the second element of the pair is
7133            the value the module flag should be restricted to. This behavior can
7134            be used to restrict the allowable results (via triggering of an
7135            error) of linking IDs with the **Override** behavior.
7136
7137    * - 4
7138      - **Override**
7139            Uses the specified value, regardless of the behavior or value of the
7140            other module. If both modules specify **Override**, but the values
7141            differ, an error will be emitted.
7142
7143    * - 5
7144      - **Append**
7145            Appends the two values, which are required to be metadata nodes.
7146
7147    * - 6
7148      - **AppendUnique**
7149            Appends the two values, which are required to be metadata
7150            nodes. However, duplicate entries in the second list are dropped
7151            during the append operation.
7152
7153    * - 7
7154      - **Max**
7155            Takes the max of the two values, which are required to be integers.
7156
7157 It is an error for a particular unique flag ID to have multiple behaviors,
7158 except in the case of **Require** (which adds restrictions on another metadata
7159 value) or **Override**.
7160
7161 An example of module flags:
7162
7163 .. code-block:: llvm
7164
7165     !0 = !{ i32 1, !"foo", i32 1 }
7166     !1 = !{ i32 4, !"bar", i32 37 }
7167     !2 = !{ i32 2, !"qux", i32 42 }
7168     !3 = !{ i32 3, !"qux",
7169       !{
7170         !"foo", i32 1
7171       }
7172     }
7173     !llvm.module.flags = !{ !0, !1, !2, !3 }
7174
7175 -  Metadata ``!0`` has the ID ``!"foo"`` and the value '1'. The behavior
7176    if two or more ``!"foo"`` flags are seen is to emit an error if their
7177    values are not equal.
7178
7179 -  Metadata ``!1`` has the ID ``!"bar"`` and the value '37'. The
7180    behavior if two or more ``!"bar"`` flags are seen is to use the value
7181    '37'.
7182
7183 -  Metadata ``!2`` has the ID ``!"qux"`` and the value '42'. The
7184    behavior if two or more ``!"qux"`` flags are seen is to emit a
7185    warning if their values are not equal.
7186
7187 -  Metadata ``!3`` has the ID ``!"qux"`` and the value:
7188
7189    ::
7190
7191        !{ !"foo", i32 1 }
7192
7193    The behavior is to emit an error if the ``llvm.module.flags`` does not
7194    contain a flag with the ID ``!"foo"`` that has the value '1' after linking is
7195    performed.
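
As a further sketch of the merge rules, consider two modules that both define
flags named ``"limit"`` and ``"opts"`` (names and values invented for
illustration) and are then merged. The comments state the result expected from
the behavior table above:

.. code-block:: llvm

    ; Module A
    !0 = !{ i32 7, !"limit", i32 2 }               ; Max
    !1 = !{ i32 6, !"opts", !{ !"-a", !"-b" } }    ; AppendUnique
    !llvm.module.flags = !{ !0, !1 }

    ; Module B
    !0 = !{ i32 7, !"limit", i32 5 }               ; Max
    !1 = !{ i32 6, !"opts", !{ !"-b", !"-c" } }    ; AppendUnique
    !llvm.module.flags = !{ !0, !1 }

    ; After merging, "limit" has the value 5 (the max of 2 and 5), and
    ; "opts" holds "-a", "-b", "-c" (the duplicate "-b" is dropped).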
7196
7197 Synthesized Functions Module Flags Metadata
7198 -------------------------------------------
7199
7200 These metadata specify the default attributes synthesized functions should have.
7201 These metadata are currently respected by a few instrumentation passes, such as
7202 sanitizers.
7203
7204 These metadata correspond to a few function attributes with significant code
7205 generation behaviors. Function attributes with just optimization purposes
7206 should not be listed because the performance impact of these synthesized
7207 functions is small.
7208
7209 - "frame-pointer": **Max**. The value can be 0, 1, or 2. A synthesized function
7210   will get the "frame-pointer" function attribute, with value being "none",
7211   "non-leaf", or "all", respectively.
7212 - "uwtable": **Max**. The value can be 0 or 1. If the value is 1, a synthesized
7213   function will get the ``uwtable`` function attribute.
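
A module that requests both attributes for synthesized functions could carry
flags like the following sketch, which uses the **Max** behavior (value 7)
named above:

.. code-block:: llvm

    !llvm.module.flags = !{!0, !1}
    !0 = !{i32 7, !"frame-pointer", i32 2}   ; synthesized functions get "frame-pointer"="all"
    !1 = !{i32 7, !"uwtable", i32 1}         ; synthesized functions get uwtable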
7214
7215 Objective-C Garbage Collection Module Flags Metadata
7216 ----------------------------------------------------
7217
7218 On the Mach-O platform, Objective-C stores metadata about garbage
7219 collection in a special section called "image info". The metadata
7220 consists of a version number and a bitmask specifying what types of
7221 garbage collection are supported (if any) by the file. If two or more
7222 modules are linked together their garbage collection metadata needs to
7223 be merged rather than appended together.
7224
7225 The Objective-C garbage collection module flags metadata consists of the
7226 following key-value pairs:
7227
7228 .. list-table::
7229    :header-rows: 1
7230    :widths: 30 70
7231
7232    * - Key
7233      - Value
7234
7235    * - ``Objective-C Version``
7236      - **[Required]** --- The Objective-C ABI version. Valid values are 1 and 2.
7237
7238    * - ``Objective-C Image Info Version``
7239      - **[Required]** --- The version of the image info section. Currently
7240        always 0.
7241
7242    * - ``Objective-C Image Info Section``
7243      - **[Required]** --- The section to place the metadata. Valid values are
7244        ``"__OBJC, __image_info, regular"`` for Objective-C ABI version 1, and
7245        ``"__DATA,__objc_imageinfo, regular, no_dead_strip"`` for
7246        Objective-C ABI version 2.
7247
7248    * - ``Objective-C Garbage Collection``
7249      - **[Required]** --- Specifies whether garbage collection is supported or
7250        not. Valid values are 0, for no garbage collection, and 2, for garbage
7251        collection supported.
7252
7253    * - ``Objective-C GC Only``
7254      - **[Optional]** --- Specifies that only garbage collection is supported.
7255        If present, its value must be 6. This flag requires that the
7256        ``Objective-C Garbage Collection`` flag have the value 2.
7257
7258 Some important flag interactions:
7259
7260 -  If a module with ``Objective-C Garbage Collection`` set to 0 is
7261    merged with a module with ``Objective-C Garbage Collection`` set to
7262    2, then the resulting module has the
7263    ``Objective-C Garbage Collection`` flag set to 0.
7264 -  A module with ``Objective-C Garbage Collection`` set to 0 cannot be
7265    merged with a module with ``Objective-C GC Only`` set to 6.
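
A sketch of how these keys might appear for an Objective-C ABI version 2
module without garbage collection. The keys and values come from the table
above; the behavior values in the first element of each flag (1 and 4) are
assumptions made for this example and are not specified in this section:

.. code-block:: llvm

    !llvm.module.flags = !{!0, !1, !2, !3}
    !0 = !{i32 1, !"Objective-C Version", i32 2}
    !1 = !{i32 1, !"Objective-C Image Info Version", i32 0}
    !2 = !{i32 1, !"Objective-C Image Info Section", !"__DATA,__objc_imageinfo, regular, no_dead_strip"}
    !3 = !{i32 4, !"Objective-C Garbage Collection", i32 0}   ; 0: no garbage collection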
7266
7267 C type width Module Flags Metadata
7268 ----------------------------------
7269
7270 The ARM backend emits a section into each generated object file describing the
7271 options that it was compiled with (in a compiler-independent way) to prevent
7272 linking incompatible objects, and to allow automatic library selection. Some
7273 of these options are not visible at the IR level, namely wchar_t width and enum
7274 width.
7275
7276 To pass this information to the backend, these options are encoded in module
7277 flags metadata, using the following key-value pairs:
7278
7279 .. list-table::
7280    :header-rows: 1
7281    :widths: 30 70
7282
7283    * - Key
7284      - Value
7285
7286    * - short_wchar
7287      - * 0 --- sizeof(wchar_t) == 4
7288        * 1 --- sizeof(wchar_t) == 2
7289
7290    * - short_enum
7291      - * 0 --- Enums are at least as large as an ``int``.
7292        * 1 --- Enums are stored in the smallest integer type which can
7293          represent all of its values.
7294
7295 For example, the following metadata section specifies that the module was
7296 compiled with a ``wchar_t`` width of 4 bytes, and the underlying type of an
7297 enum is the smallest type which can represent all of its values::
7298
7299     !llvm.module.flags = !{!0, !1}
7300     !0 = !{i32 1, !"short_wchar", i32 0}
7301     !1 = !{i32 1, !"short_enum", i32 1}
7302
7303 LTO Post-Link Module Flags Metadata
7304 -----------------------------------
7305
7306 Some optimizations are only possible when the entire LTO unit is present in the
7307 current module. This is represented by the ``LTOPostLink`` module flags metadata,
7308 which will be created with a value of ``1`` when LTO linking occurs.
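
A minimal sketch of how this flag could appear in a post-link module. The
behavior operand ``i32 1`` (error on conflicting values) is an assumption made
for illustration:

.. code-block:: llvm

    !llvm.module.flags = !{!0}
    ; Key and value as described above; the behavior operand is assumed.
    !0 = !{i32 1, !"LTOPostLink", i32 1}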
7309
7310 Automatic Linker Flags Named Metadata
7311 =====================================
7312
7313 Some targets support embedding of flags to the linker inside individual object
7314 files. Typically this is used in conjunction with language extensions which
7315 allow source files to contain linker command line options, and have these
7316 automatically be transmitted to the linker via object files.
7317
7318 These flags are encoded in the IR using named metadata with the name
7319 ``!llvm.linker.options``. Each operand is expected to be a metadata node
7320 which should be a list of other metadata nodes, each of which should be a
7321 list of metadata strings defining linker options.
7322
7323 For example, the following metadata section specifies two separate sets of
7324 linker options, presumably to link against ``libz`` and the ``Cocoa``
7325 framework::
7326
7327     !0 = !{ !"-lz" }
7328     !1 = !{ !"-framework", !"Cocoa" }
7329     !llvm.linker.options = !{ !0, !1 }
7330
7331 The metadata encoding as lists of lists of options, as opposed to a collapsed
7332 list of options, is chosen so that the IR encoding can use multiple option
7333 strings to specify e.g., a single library, while still having that specifier be
7334 preserved as an atomic element that can be recognized by a target specific
7335 assembly writer or object file emitter.
7336
7337 Each individual option is required to be either a valid option for the target's
7338 linker, or an option that is reserved by the target specific assembly writer or
7339 object file emitter. No other aspect of these options is defined by the IR.
7340
7341 Dependent Libs Named Metadata
7342 =============================
7343
7344 Some targets support embedding of strings into object files to indicate
7345 a set of libraries to add to the link. Typically this is used in conjunction
7346 with language extensions which allow source files to explicitly declare the
7347 libraries they depend on, and have these automatically be transmitted to the
7348 linker via object files.
7349
7350 The list is encoded in the IR using named metadata with the name
7351 ``!llvm.dependent-libraries``. Each operand is expected to be a metadata node
7352 which should contain a single string operand.
7353
7354 For example, the following metadata section contains two library specifiers::
7355
7356     !0 = !{!"a library specifier"}
7357     !1 = !{!"another library specifier"}
7358     !llvm.dependent-libraries = !{ !0, !1 }
7359
7360 Each library specifier will be handled independently by the consuming linker.
7361 The effect of the library specifiers is defined by the consuming linker.
7362
7363 .. _summary:
7364
7365 ThinLTO Summary
7366 ===============
7367
7368 Compiling with `ThinLTO <https://clang.llvm.org/docs/ThinLTO.html>`_
7369 causes the building of a compact summary of the module that is emitted into
7370 the bitcode. The summary is emitted into the LLVM assembly and identified
7371 in syntax by a caret ('``^``').
7372
7373 The summary is parsed into a bitcode output, along with the Module
7374 IR, via the "``llvm-as``" tool. Tools that parse the Module IR for the purposes
7375 of optimization (e.g. "``clang -x ir``" and "``opt``"), will ignore the
7376 summary entries (just as they currently ignore summary entries in a bitcode
7377 input file).
7378
7379 Eventually, the summary will be parsed into a ModuleSummaryIndex object under
7380 the same conditions where the summary index is currently built from bitcode:
7381 in tools that test the Thin Link portion of a ThinLTO compile (i.e. llvm-lto
7382 and llvm-lto2), and when parsing a combined index for a distributed ThinLTO
7383 backend via clang's "``-fthinlto-index=<>``" flag. This part is not yet
7384 implemented; for now, use llvm-as to create a bitcode object before feeding it
7385 into thin link tools.
7386
7387 There are currently 3 types of summary entries in the LLVM assembly:
7388 :ref:`module paths<module_path_summary>`,
7389 :ref:`global values<gv_summary>`, and
7390 :ref:`type identifiers<typeid_summary>`.
7391
7392 .. _module_path_summary:
7393
7394 Module Path Summary Entry
7395 -------------------------
7396
7397 Each module path summary entry lists a module containing global values included
7398 in the summary. For a single IR module there will be one such entry, but
7399 in a combined summary index produced during the thin link, there will be
7400 one module path entry per linked module with summary.
7401
7402 Example:
7403
7404 .. code-block:: text
7405
7406     ^0 = module: (path: "/path/to/file.o", hash: (2468601609, 1329373163, 1565878005, 638838075, 3148790418))
7407
7408 The ``path`` field is a string path to the bitcode file, and the ``hash``
7409 field is the 160-bit SHA-1 hash of the IR bitcode contents, used for
7410 incremental builds and caching.
7411
7412 .. _gv_summary:
7413
7414 Global Value Summary Entry
7415 --------------------------
7416
7417 Each global value summary entry corresponds to a global value defined or
7418 referenced by a summarized module.
7419
7420 Example:
7421
7422 .. code-block:: text
7423
7424     ^4 = gv: (name: "f"[, summaries: (Summary)[, (Summary)]*]?) ; guid = 14740650423002898831
7425
7426 For declarations, there will not be a summary list. For definitions, a
7427 global value will contain a list of summaries, one per module containing
7428 a definition. There can be multiple entries in a combined summary index
7429 for symbols with weak linkage.
7430
7431 Each ``Summary`` format will depend on whether the global value is a
7432 :ref:`function<function_summary>`, :ref:`variable<variable_summary>`, or
7433 :ref:`alias<alias_summary>`.
7434
7435 .. _function_summary:
7436
7437 Function Summary
7438 ^^^^^^^^^^^^^^^^
7439
7440 If the global value is a function, the ``Summary`` entry will look like:
7441
7442 .. code-block:: text
7443
7444     function: (module: ^0, flags: (linkage: external, notEligibleToImport: 0, live: 0, dsoLocal: 0), insts: 2[, FuncFlags]?[, Calls]?[, TypeIdInfo]?[, Params]?[, Refs]?)
7445
7446 The ``module`` field includes the summary entry id for the module containing
7447 this definition, and the ``flags`` field contains information such as
7448 the linkage type, a flag indicating whether it is legal to import the
7449 definition, whether it is globally live and whether the linker resolved it
7450 to a local definition (the latter two are populated during the thin link).
7451 The ``insts`` field contains the number of IR instructions in the function.
7452 Finally, there are several optional fields: :ref:`FuncFlags<funcflags_summary>`,
7453 :ref:`Calls<calls_summary>`, :ref:`TypeIdInfo<typeidinfo_summary>`,
7454 :ref:`Params<params_summary>`, and :ref:`Refs<refs_summary>`.
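
As a hedged illustration, a module containing a single trivial function could
carry summary entries like the following. The path and the all-zero hash are
placeholders, and the field set simply mirrors the form shown above:

.. code-block:: text

    define void @f() {
      ret void
    }

    ^0 = module: (path: "example.o", hash: (0, 0, 0, 0, 0))
    ^1 = gv: (name: "f", summaries: (function: (module: ^0, flags: (linkage: external, notEligibleToImport: 0, live: 0, dsoLocal: 0), insts: 1)))

Here ``insts: 1`` counts the single ``ret`` instruction, and none of the
optional fields are present.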
7455
7456 .. _variable_summary:
7457
7458 Global Variable Summary
7459 ^^^^^^^^^^^^^^^^^^^^^^^
7460
7461 If the global value is a variable, the ``Summary`` entry will look like:
7462
7463 .. code-block:: text
7464
7465     variable: (module: ^0, flags: (linkage: external, notEligibleToImport: 0, live: 0, dsoLocal: 0)[, Refs]?)
7466
7467 The variable entry contains a subset of the fields in a
7468 :ref:`function summary <function_summary>`, see the descriptions there.
7469
7470 .. _alias_summary:
7471
7472 Alias Summary
7473 ^^^^^^^^^^^^^
7474
7475 If the global value is an alias, the ``Summary`` entry will look like:
7476
7477 .. code-block:: text
7478
7479     alias: (module: ^0, flags: (linkage: external, notEligibleToImport: 0, live: 0, dsoLocal: 0), aliasee: ^2)
7480
7481 The ``module`` and ``flags`` fields are as described for a
7482 :ref:`function summary <function_summary>`. The ``aliasee`` field
7483 contains a reference to the global value summary entry of the aliasee.
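
A small sketch of an alias and its aliasee in one module, assuming the same
placeholder module path entry and flag values used for illustration elsewhere
in this section. The names ``@a`` and ``@g`` are hypothetical:

.. code-block:: text

    define void @g() {
      ret void
    }
    @a = alias void (), void ()* @g

    ^0 = module: (path: "example.o", hash: (0, 0, 0, 0, 0))
    ^1 = gv: (name: "a", summaries: (alias: (module: ^0, flags: (linkage: external, notEligibleToImport: 0, live: 0, dsoLocal: 0), aliasee: ^2)))
    ^2 = gv: (name: "g", summaries: (function: (module: ^0, flags: (linkage: external, notEligibleToImport: 0, live: 0, dsoLocal: 0), insts: 1)))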
7484
7485 .. _funcflags_summary:
7486
7487 Function Flags
7488 ^^^^^^^^^^^^^^
7489
7490 The optional ``FuncFlags`` field looks like:
7491
7492 .. code-block:: text
7493
7494     funcFlags: (readNone: 0, readOnly: 0, noRecurse: 0, returnDoesNotAlias: 0, noInline: 0, alwaysInline: 0, noUnwind: 1, mayThrow: 0, hasUnknownCall: 0)
7495
7496 If unspecified, flags are assumed to hold the conservative ``false`` value of
7497 ``0``.
7498
7499 .. _calls_summary:
7500
7501 Calls
7502 ^^^^^
7503
7504 The optional ``Calls`` field looks like:
7505
7506 .. code-block:: text
7507
7508     calls: ((Callee)[, (Callee)]*)
7509
7510 where each ``Callee`` looks like:
7511
7512 .. code-block:: text
7513
7514     callee: ^1[, hotness: None]?[, relbf: 0]?
7515
7516 The ``callee`` refers to the summary entry id of the callee. At most one
7517 of ``hotness`` (which can take the values ``Unknown``, ``Cold``, ``None``,
7518 ``Hot``, and ``Critical``), and ``relbf`` (which holds the integer
7519 branch frequency relative to the entry frequency, scaled down by 2^8)
7520 may be specified. The defaults are ``Unknown`` and ``0``, respectively.
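
For illustration, a function summary with a single call edge might look like
the sketch below. The names ``caller`` and ``helper`` are hypothetical, the
module entry ``^0`` is assumed to be defined elsewhere, and neither ``hotness``
nor ``relbf`` is given, so the defaults apply:

.. code-block:: text

    ^1 = gv: (name: "caller", summaries: (function: (module: ^0, flags: (linkage: external, notEligibleToImport: 0, live: 0, dsoLocal: 0), insts: 2, calls: ((callee: ^2)))))
    ^2 = gv: (name: "helper")

The entry for ``helper`` has no summary list because it is only declared, not
defined, in this module.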
7521
7522 .. _params_summary:
7523
7524 Params
7525 ^^^^^^
7526
7527 The optional ``Params`` is used by ``StackSafety`` and looks like:
7528
7529 .. code-block:: text
7530
7531     params: ((Param)[, (Param)]*)
7532
7533 where each ``Param`` describes pointer parameter access inside of the
7534 function and looks like:
7535
7536 .. code-block:: text
7537
7538     param: 4, offset: [0, 5][, calls: ((Callee)[, (Callee)]*)]?
7539
7540 where ``param`` is the number of the parameter it describes and
7541 ``offset`` is the inclusive range of offsets from the pointer parameter to bytes
7542 which can be accessed by the function. This range does not include accesses by
7543 function calls from the ``calls`` list.
7544
7545 Each ``Callee`` describes how the parameter is forwarded into other
7546 functions and looks like:
7547
7548 .. code-block:: text
7549
7550     callee: ^3, param: 5, offset: [-3, 3]
7551
7552 The ``callee`` refers to the summary entry id of the callee, and ``param`` is
7553 the number of the callee parameter which points into the caller's parameter
7554 with an offset known to be inside of the ``offset`` range. ``calls`` will be
7555 consumed and removed by the thin link stage to update ``Param::offset`` so it
7556 covers all accesses possible by ``calls``.
7557
7558 A pointer parameter without a corresponding ``Param`` is considered unsafe, and
7559 access with any offset is assumed to be possible.
7560
7561 Example:
7562
7563 If we have the following function:
7564
7565 .. code-block:: text
7566
7567     define i64 @foo(i64* %0, i32* %1, i8* %2, i8 %3) {
7568       store i32* %1, i32** @x
7569       %5 = getelementptr inbounds i8, i8* %2, i64 5
7570       %6 = load i8, i8* %5
7571       %7 = getelementptr inbounds i8, i8* %2, i8 %3
7572       tail call void @bar(i8 %3, i8* %7)
7573       %8 = load i64, i64* %0
7574       ret i64 %8
7575     }
7576
7577 We can expect a record like this:
7578
7579 .. code-block:: text
7580
7581     params: ((param: 0, offset: [0, 7]),(param: 2, offset: [5, 5], calls: ((callee: ^3, param: 1, offset: [-128, 127]))))
7582
7583 The function may access just 8 bytes of the parameter ``%0``. ``calls`` is empty,
7584 so the parameter is either not used for function calls or ``offset`` already
7585 covers all accesses from nested function calls.
7586 Parameter ``%1`` escapes, so access is unknown.
7587 The function itself can access just a single byte of the parameter ``%2``. Additional
7588 access is possible inside of ``@bar``, i.e. ``^3``. The function adds a signed
7589 offset to the pointer and passes the result as the argument ``%1`` into ``^3``.
7590 This record itself does not tell us how ``^3`` will access the parameter.
7591 Parameter ``%3`` is not a pointer.
7592
7593 .. _refs_summary:
7594
7595 Refs
7596 ^^^^
7597
7598 The optional ``Refs`` field looks like:
7599
7600 .. code-block:: text
7601
7602     refs: ((Ref)[, (Ref)]*)
7603
7604 where each ``Ref`` contains a reference to the summary id of the referenced
7605 value (e.g. ``^1``).
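
As a hedged sketch, a function that loads from a global variable could produce
a ``refs`` list pointing at that variable's entry. The names ``@f`` and ``@x``
are hypothetical, and the path, hash, and flag values are placeholders:

.. code-block:: text

    @x = global i32 0

    define i32 @f() {
      %v = load i32, i32* @x
      ret i32 %v
    }

    ^0 = module: (path: "example.o", hash: (0, 0, 0, 0, 0))
    ^1 = gv: (name: "f", summaries: (function: (module: ^0, flags: (linkage: external, notEligibleToImport: 0, live: 0, dsoLocal: 0), insts: 2, refs: (^2))))
    ^2 = gv: (name: "x", summaries: (variable: (module: ^0, flags: (linkage: external, notEligibleToImport: 0, live: 0, dsoLocal: 0))))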
7606
7607 .. _typeidinfo_summary:
7608
7609 TypeIdInfo
7610 ^^^^^^^^^^
7611
7612 The optional ``TypeIdInfo`` field, used for
7613 `Control Flow Integrity <https://clang.llvm.org/docs/ControlFlowIntegrity.html>`_,
7614 looks like:
7615
7616 .. code-block:: text
7617
7618     typeIdInfo: [(TypeTests)]?[, (TypeTestAssumeVCalls)]?[, (TypeCheckedLoadVCalls)]?[, (TypeTestAssumeConstVCalls)]?[, (TypeCheckedLoadConstVCalls)]?
7619
7620 These optional fields have the following forms:
7621
7622 TypeTests
7623 """""""""
7624
7625 .. code-block:: text
7626
7627     typeTests: (TypeIdRef[, TypeIdRef]*)
7628
7629 Where each ``TypeIdRef`` refers to a :ref:`type id<typeid_summary>`
7630 by summary id or ``GUID``.
7631
7632 TypeTestAssumeVCalls
7633 """"""""""""""""""""
7634
7635 .. code-block:: text
7636
7637     typeTestAssumeVCalls: (VFuncId[, VFuncId]*)
7638
7639 Where each VFuncId has the format:
7640
7641 .. code-block:: text
7642
7643     vFuncId: (TypeIdRef, offset: 16)
7644
7645 Where each ``TypeIdRef`` refers to a :ref:`type id<typeid_summary>`
7646 by summary id or ``GUID`` preceded by a ``guid:`` tag.
7647
7648 TypeCheckedLoadVCalls
7649 """""""""""""""""""""
7650
7651 .. code-block:: text
7652
7653     typeCheckedLoadVCalls: (VFuncId[, VFuncId]*)
7654
7655 Where each VFuncId has the format described for ``TypeTestAssumeVCalls``.
7656
7657 TypeTestAssumeConstVCalls
7658 """""""""""""""""""""""""
7659
7660 .. code-block:: text
7661
7662     typeTestAssumeConstVCalls: (ConstVCall[, ConstVCall]*)
7663
7664 Where each ConstVCall has the format:
7665
7666 .. code-block:: text
7667
7668     (VFuncId, args: (Arg[, Arg]*))
7669
7670 and where each VFuncId has the format described for ``TypeTestAssumeVCalls``,
7671 and each Arg is an integer argument number.
7672
7673 TypeCheckedLoadConstVCalls
7674 """"""""""""""""""""""""""
7675
7676 .. code-block:: text
7677
7678     typeCheckedLoadConstVCalls: (ConstVCall[, ConstVCall]*)
7679
7680 Where each ConstVCall has the format described for
7681 ``TypeTestAssumeConstVCalls``.
7682
7683 .. _typeid_summary:
7684
7685 Type ID Summary Entry
7686 ---------------------
7687
7688 Each type id summary entry corresponds to a type identifier resolution
7689 which is generated during the LTO link portion of the compile when building
7690 with `Control Flow Integrity <https://clang.llvm.org/docs/ControlFlowIntegrity.html>`_,
7691 so these are only present in a combined summary index.
7692
7693 Example:
7694
7695 .. code-block:: text
7696
7697     ^4 = typeid: (name: "_ZTS1A", summary: (typeTestRes: (kind: allOnes, sizeM1BitWidth: 7[, alignLog2: 0]?[, sizeM1: 0]?[, bitMask: 0]?[, inlineBits: 0]?)[, WpdResolutions]?)) ; guid = 7004155349499253778
7698
7699 The ``typeTestRes`` gives the type test resolution ``kind`` (which may
7700 be ``unsat``, ``byteArray``, ``inline``, ``single``, or ``allOnes``), and
7701 the ``size-1`` bit width. It is followed by optional flags, which default to 0,
7702 and an optional WpdResolutions (whole program devirtualization resolution)
7703 field that looks like:
7704
7705 .. code-block:: text
7706
7707     wpdResolutions: ((offset: 0, WpdRes)[, (offset: 1, WpdRes)]*)
7708
7709 where each entry is a mapping from the given byte offset to the whole-program
7710 devirtualization resolution WpdRes, which has one of the following formats:
7711
7712 .. code-block:: text
7713
7714     wpdRes: (kind: branchFunnel)
7715     wpdRes: (kind: singleImpl, singleImplName: "_ZN1A1nEi")
7716     wpdRes: (kind: indir)
7717
7718 Additionally, each wpdRes has an optional ``resByArg`` field, which
7719 describes the resolutions for calls with all constant integer arguments:
7720
7721 .. code-block:: text
7722
7723     resByArg: (ResByArg[, ResByArg]*)
7724
7725 where ResByArg is:
7726
7727 .. code-block:: text
7728
7729     args: (Arg[, Arg]*), byArg: (kind: UniformRetVal[, info: 0]?[, byte: 0]?[, bit: 0]?)
7730
7731 Where the ``kind`` can be ``Indir``, ``UniformRetVal``, ``UniqueRetVal``
7732 or ``VirtualConstProp``. The ``info`` field is only used if the kind
7733 is ``UniformRetVal`` (indicates the uniform return value), or
7734 ``UniqueRetVal`` (holds the return value associated with the unique vtable
7735 (0 or 1)). The ``byte`` and ``bit`` fields are only used if the target does
7736 not support the use of absolute symbols to store constants.
7737
7738 .. _intrinsicglobalvariables:
7739
7740 Intrinsic Global Variables
7741 ==========================
7742
7743 LLVM has a number of "magic" global variables that contain data that
7744 affect code generation or other IR semantics. These are documented here.
7745 All globals of this sort should have a section specified as
7746 "``llvm.metadata``". This section and all globals that start with
7747 "``llvm.``" are reserved for use by LLVM.
7748
7749 .. _gv_llvmused:
7750
7751 The '``llvm.used``' Global Variable
7752 -----------------------------------
7753
7754 The ``@llvm.used`` global is an array which has
7755 :ref:`appending linkage <linkage_appending>`. This array contains a list of
7756 pointers to named global variables, functions and aliases which may optionally
7757 have a pointer cast formed of bitcast or getelementptr. For example, a legal
7758 use of it is:
7759
7760 .. code-block:: llvm
7761
7762     @X = global i8 4
7763     @Y = global i32 123
7764
7765     @llvm.used = appending global [2 x i8*] [
7766        i8* @X,
7767        i8* bitcast (i32* @Y to i8*)
7768     ], section "llvm.metadata"
7769
7770 If a symbol appears in the ``@llvm.used`` list, then the compiler, assembler,
7771 and linker are required to treat the symbol as if there is a reference to the
7772 symbol that it cannot see (which is why they have to be named). For example, if
7773 a variable has internal linkage and no references other than that from the
7774 ``@llvm.used`` list, it cannot be deleted. This is commonly used to represent
7775 references from inline asms and other things the compiler cannot "see", and
7776 corresponds to "``__attribute__((used))``" in GNU C.
7777
7778 On some targets, the code generator must emit a directive to the
7779 assembler or object file to prevent the assembler and linker from
7780 removing the symbol.
7781
7782 .. _gv_llvmcompilerused:
7783
7784 The '``llvm.compiler.used``' Global Variable
7785 --------------------------------------------
7786
7787 The ``@llvm.compiler.used`` directive is the same as the ``@llvm.used``
7788 directive, except that it only prevents the compiler from touching the
7789 symbol. On targets that support it, this allows an intelligent linker to
7790 optimize references to the symbol without being impeded as it would be
7791 by ``@llvm.used``.
7792
7793 This construct should be used only in rare circumstances,
7794 and should not be exposed to source languages.
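
For illustration, the following sketch keeps an otherwise unreferenced internal
variable alive through the compiler only. The name ``@Z`` is hypothetical:

.. code-block:: llvm

    @Z = internal global i32 0

    ; Same array shape as @llvm.used, but only the compiler must preserve @Z.
    @llvm.compiler.used = appending global [1 x i8*] [
       i8* bitcast (i32* @Z to i8*)
    ], section "llvm.metadata"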
7795
7796 .. _gv_llvmglobalctors:
7797
7798 The '``llvm.global_ctors``' Global Variable
7799 -------------------------------------------
7800
7801 .. code-block:: llvm
7802
7803     %0 = type { i32, void ()*, i8* }
7804     @llvm.global_ctors = appending global [1 x %0] [%0 { i32 65535, void ()* @ctor, i8* @data }]
7805
7806 The ``@llvm.global_ctors`` array contains a list of constructor
7807 functions, priorities, and an associated global or function.
7808 The functions referenced by this array will be called in ascending order
7809 of priority (i.e. lowest first) when the module is loaded. The order of
7810 functions with the same priority is not defined.
7811
7812 If the third field is non-null, and points to a global variable
7813 or function, the initializer function will only run if the associated
7814 data from the current module is not discarded.
7815 On ELF the referenced global variable or function must be in a comdat.
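
As a sketch of the ordering rule, the array below registers two constructors
with different priorities and no associated data (a null third field). The
function names are hypothetical. ``@init_early`` has the lower priority value,
so it is called first:

.. code-block:: llvm

    @llvm.global_ctors = appending global [2 x { i32, void ()*, i8* }] [
      { i32, void ()*, i8* } { i32 101, void ()* @init_early, i8* null },
      { i32, void ()*, i8* } { i32 65535, void ()* @init_late, i8* null }
    ]

    define internal void @init_early() {
      ret void
    }

    define internal void @init_late() {
      ret void
    }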
7816
7817 .. _llvmglobaldtors:
7818
7819 The '``llvm.global_dtors``' Global Variable
7820 -------------------------------------------
7821
7822 .. code-block:: llvm
7823
7824     %0 = type { i32, void ()*, i8* }
7825     @llvm.global_dtors = appending global [1 x %0] [%0 { i32 65535, void ()* @dtor, i8* @data }]
7826
7827 The ``@llvm.global_dtors`` array contains a list of destructor
7828 functions, priorities, and an associated global or function.
7829 The functions referenced by this array will be called in descending
7830 order of priority (i.e. highest first) when the module is unloaded. The
7831 order of functions with the same priority is not defined.
7832
7833 If the third field is non-null, and points to a global variable
7834 or function, the destructor function will only run if the associated
7835 data from the current module is not discarded.
7836 On ELF the referenced global variable or function must be in a comdat.
7837
7838 Instruction Reference
7839 =====================
7840
7841 The LLVM instruction set consists of several different classifications
7842 of instructions: :ref:`terminator instructions <terminators>`, :ref:`binary
7843 instructions <binaryops>`, :ref:`bitwise binary
7844 instructions <bitwiseops>`, :ref:`memory instructions <memoryops>`, and
7845 :ref:`other instructions <otherops>`.
7846
7847 .. _terminators:
7848
7849 Terminator Instructions
7850 -----------------------
7851
7852 As mentioned :ref:`previously <functionstructure>`, every basic block in a
7853 program ends with a "Terminator" instruction, which indicates which
7854 block should be executed after the current block is finished. These
7855 terminator instructions typically yield a '``void``' value: they produce
7856 control flow, not values (the one exception being the
7857 ':ref:`invoke <i_invoke>`' instruction).
7858
7859 The terminator instructions are: ':ref:`ret <i_ret>`',
7860 ':ref:`br <i_br>`', ':ref:`switch <i_switch>`',
7861 ':ref:`indirectbr <i_indirectbr>`', ':ref:`invoke <i_invoke>`',
7862 ':ref:`callbr <i_callbr>`',
7863 ':ref:`resume <i_resume>`', ':ref:`catchswitch <i_catchswitch>`',
7864 ':ref:`catchret <i_catchret>`',
7865 ':ref:`cleanupret <i_cleanupret>`',
7866 and ':ref:`unreachable <i_unreachable>`'.
7867
7868 .. _i_ret:
7869
7870 '``ret``' Instruction
7871 ^^^^^^^^^^^^^^^^^^^^^
7872
7873 Syntax:
7874 """""""
7875
7876 ::
7877
7878       ret <type> <value>       ; Return a value from a non-void function
7879       ret void                 ; Return from void function
7880
7881 Overview:
7882 """""""""
7883
7884 The '``ret``' instruction is used to return control flow (and optionally
7885 a value) from a function back to the caller.
7886
7887 There are two forms of the '``ret``' instruction: one that returns a
7888 value and then causes control flow, and one that just causes control
7889 flow to occur.
7890
7891 Arguments:
7892 """"""""""
7893
7894 The '``ret``' instruction optionally accepts a single argument, the
7895 return value. The type of the return value must be a ':ref:`first
7896 class <t_firstclass>`' type.
7897
7898 A function is not :ref:`well formed <wellformed>` if it has a non-void
7899 return type and contains a '``ret``' instruction with no return value or
7900 a return value with a type that does not match its type, or if it has a
7901 void return type and contains a '``ret``' instruction with a return
7902 value.
7903
7904 Semantics:
7905 """"""""""
7906
7907 When the '``ret``' instruction is executed, control flow returns back to
7908 the calling function's context. If the caller is a
7909 ":ref:`call <i_call>`" instruction, execution continues at the
7910 instruction after the call. If the caller is an
7911 ":ref:`invoke <i_invoke>`" instruction, execution continues at the
7912 beginning of the "normal" destination block. If the instruction returns
7913 a value, that value shall set the call or invoke instruction's return
7914 value.
7915
7916 Example:
7917 """"""""
7918
7919 .. code-block:: llvm
7920
7921       ret i32 5                       ; Return an integer value of 5
7922       ret void                        ; Return from a void function
7923       ret { i32, i8 } { i32 4, i8 2 } ; Return a struct of values 4 and 2
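
The following sketch, with hypothetical function names, shows the returned
value becoming the result of the ``call`` instruction in the caller:

.. code-block:: llvm

    define i32 @five() {
      ret i32 5                 ; The value 5 becomes the result of the call
    }

    define i32 @caller() {
      %r = call i32 @five()     ; Execution resumes here after the ret
      %s = add i32 %r, 1
      ret i32 %s
    }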
7924
7925 .. _i_br:
7926
7927 '``br``' Instruction
7928 ^^^^^^^^^^^^^^^^^^^^
7929
7930 Syntax:
7931 """""""
7932
7933 ::
7934
7935       br i1 <cond>, label <iftrue>, label <iffalse>
7936       br label <dest>          ; Unconditional branch
7937
7938 Overview:
7939 """""""""
7940
7941 The '``br``' instruction is used to cause control flow to transfer to a
7942 different basic block in the current function. There are two forms of
7943 this instruction, corresponding to a conditional branch and an
7944 unconditional branch.
7945
7946 Arguments:
7947 """"""""""
7948
7949 The conditional branch form of the '``br``' instruction takes a single
7950 '``i1``' value and two '``label``' values. The unconditional form of the
7951 '``br``' instruction takes a single '``label``' value as a target.
7952
7953 Semantics:
7954 """"""""""
7955
7956 Upon execution of a conditional '``br``' instruction, the '``i1``'
7957 argument is evaluated. If the value is ``true``, control flows to the
7958 '``iftrue``' ``label`` argument. If '``cond``' is ``false``, control flows
7959 to the '``iffalse``' ``label`` argument.
7960 If '``cond``' is ``poison`` or ``undef``, this instruction has undefined
7961 behavior.
7962
7963 Example:
7964 """"""""
7965
7966 .. code-block:: llvm
7967
7968     Test:
7969       %cond = icmp eq i32 %a, %b
7970       br i1 %cond, label %IfEqual, label %IfUnequal
7971     IfEqual:
7972       ret i32 1
7973     IfUnequal:
7974       ret i32 0
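
Both forms can be combined to build a loop. The sketch below uses a
hypothetical function that counts up to ``%n``: the unconditional form enters
the loop, and the conditional form either repeats it or leaves it:

.. code-block:: llvm

    define i32 @count(i32 %n) {
    entry:
      br label %loop                          ; Unconditional branch
    loop:
      %i = phi i32 [ 0, %entry ], [ %next, %loop ]
      %next = add i32 %i, 1
      %done = icmp eq i32 %next, %n
      br i1 %done, label %exit, label %loop   ; Conditional branch
    exit:
      ret i32 %next
    }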
7975
7976 .. _i_switch:
7977
7978 '``switch``' Instruction
7979 ^^^^^^^^^^^^^^^^^^^^^^^^
7980
7981 Syntax:
7982 """""""
7983
7984 ::
7985
7986       switch <intty> <value>, label <defaultdest> [ <intty> <val>, label <dest> ... ]
7987
7988 Overview:
7989 """""""""
7990
7991 The '``switch``' instruction is used to transfer control flow to one of
7992 several different places. It is a generalization of the '``br``'
7993 instruction, allowing a branch to occur to one of many possible
7994 destinations.
7995
7996 Arguments:
7997 """"""""""
7998
7999 The '``switch``' instruction uses three parameters: an integer
8000 comparison value '``value``', a default '``label``' destination, and an
8001 array of pairs of comparison value constants and '``label``'s. The table
8002 is not allowed to contain duplicate constant entries.
8003
8004 Semantics:
8005 """"""""""
8006
8007 The ``switch`` instruction specifies a table of values and destinations.
8008 When the '``switch``' instruction is executed, this table is searched
8009 for the given value. If the value is found, control flow is transferred
8010 to the corresponding destination; otherwise, control flow is transferred
8011 to the default destination.
8012 If '``value``' is ``poison`` or ``undef``, this instruction has undefined
8013 behavior.
8014
8015 Implementation:
8016 """""""""""""""
8017
8018 Depending on properties of the target machine and the particular
8019 ``switch`` instruction, this instruction may be code generated in
8020 different ways. For example, it could be generated as a series of
8021 chained conditional branches or with a lookup table.
8022
8023 Example:
8024 """"""""
8025
8026 .. code-block:: llvm
8027
8028      ; Emulate a conditional br instruction
8029      %Val = zext i1 %value to i32
8030      switch i32 %Val, label %truedest [ i32 0, label %falsedest ]
8031
8032      ; Emulate an unconditional br instruction
8033      switch i32 0, label %dest [ ]
8034
8035      ; Implement a jump table:
8036      switch i32 %val, label %otherwise [ i32 0, label %onzero
8037                                          i32 1, label %onone
8038                                          i32 2, label %ontwo ]
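
A complete function using ``switch`` might look like the following sketch. The
function name, block names, and returned constants are hypothetical:

.. code-block:: llvm

    define i32 @classify(i32 %val) {
    entry:
      switch i32 %val, label %otherwise [ i32 0, label %onzero
                                          i32 1, label %onone ]
    onzero:
      ret i32 10
    onone:
      ret i32 11
    otherwise:                  ; Reached when no table entry matches %val
      ret i32 -1
    }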
8039
8040 .. _i_indirectbr:
8041
8042 '``indirectbr``' Instruction
8043 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
8044
8045 Syntax:
8046 """""""
8047
8048 ::
8049
8050       indirectbr <somety>* <address>, [ label <dest1>, label <dest2>, ... ]
8051
8052 Overview:
8053 """""""""
8054
8055 The '``indirectbr``' instruction implements an indirect branch to a
8056 label within the current function, whose address is specified by
8057 "``address``". Address must be derived from a
8058 :ref:`blockaddress <blockaddress>` constant.
8059
8060 Arguments:
8061 """"""""""
8062
8063 The '``address``' argument is the address of the label to jump to. The
8064 rest of the arguments indicate the full set of possible destinations
8065 that the address may point to. Blocks are allowed to occur multiple
8066 times in the destination list, though this isn't particularly useful.
8067
8068 This destination list is required so that dataflow analysis has an
8069 accurate understanding of the CFG.
8070
8071 Semantics:
8072 """"""""""
8073
8074 Control transfers to the block specified in the address argument. All
8075 possible destination blocks must be listed in the label list, otherwise
8076 this instruction has undefined behavior. This implies that jumps to
8077 labels defined in other functions have undefined behavior as well.
8078 If '``address``' is ``poison`` or ``undef``, this instruction has undefined
8079 behavior.
8080
8081 Implementation:
8082 """""""""""""""
8083
8084 This is typically implemented with a jump through a register.
8085
8086 Example:
8087 """"""""
8088
8089 .. code-block:: llvm
8090
8091      indirectbr i8* %Addr, [ label %bb1, label %bb2, label %bb3 ]
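
As a fuller sketch, the address is typically loaded from a table of
``blockaddress`` constants. The global ``@targets`` and the function
``@dispatch`` are hypothetical names, and ``%idx`` is assumed to be 0 or 1:

.. code-block:: llvm

    @targets = private constant [2 x i8*] [
      i8* blockaddress(@dispatch, %bb1),
      i8* blockaddress(@dispatch, %bb2)
    ]

    define i32 @dispatch(i64 %idx) {
    entry:
      %slot = getelementptr inbounds [2 x i8*], [2 x i8*]* @targets, i64 0, i64 %idx
      %addr = load i8*, i8** %slot
      ; Every block whose address may be loaded is listed as a destination.
      indirectbr i8* %addr, [ label %bb1, label %bb2 ]
    bb1:
      ret i32 1
    bb2:
      ret i32 2
    }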
8092
8093 .. _i_invoke:
8094
8095 '``invoke``' Instruction
8096 ^^^^^^^^^^^^^^^^^^^^^^^^
8097
8098 Syntax:
8099 """""""
8100
8101 ::
8102
8103       <result> = invoke [cconv] [ret attrs] [addrspace(<num>)] <ty>|<fnty> <fnptrval>(<function args>) [fn attrs]
8104                     [operand bundles] to label <normal label> unwind label <exception label>
8105
8106 Overview:
8107 """""""""
8108
8109 The '``invoke``' instruction causes control to transfer to a specified
8110 function, with the possibility of control flow transfer to either the
8111 '``normal``' label or the '``exception``' label. If the callee function
8112 returns with the "``ret``" instruction, control flow will return to the
8113 "normal" label. If the callee (or any indirect callees) returns via the
8114 ":ref:`resume <i_resume>`" instruction or other exception handling
8115 mechanism, control is interrupted and continued at the dynamically
8116 nearest "exception" label.
8117
8118 The '``exception``' label is a `landing
8119 pad <ExceptionHandling.html#overview>`_ for the exception. As such, the
8120 '``exception``' label is required to have the
8121 ":ref:`landingpad <i_landingpad>`" instruction, which contains the
8122 information about the behavior of the program after unwinding happens,
8123 as its first non-PHI instruction. The restrictions on the
8124 "``landingpad``" instruction tightly couple it to the "``invoke``"
8125 instruction, so that the important information contained within the
8126 "``landingpad``" instruction can't be lost through normal code motion.
8127
8128 Arguments:
8129 """"""""""
8130
8131 This instruction requires several arguments:
8132
8133 #. The optional "cconv" marker indicates which :ref:`calling
8134    convention <callingconv>` the call should use. If none is
8135    specified, the call defaults to using C calling conventions.
8136 #. The optional :ref:`Parameter Attributes <paramattrs>` list for return
8137    values. Only '``zeroext``', '``signext``', and '``inreg``' attributes
8138    are valid here.
8139 #. The optional addrspace attribute can be used to indicate the address space
8140    of the called function. If it is not specified, the program address space
8141    from the :ref:`datalayout string<langref_datalayout>` will be used.
8142 #. '``ty``': the type of the call instruction itself which is also the
8143    type of the return value. Functions that return no value are marked
8144    ``void``.
8145 #. '``fnty``': shall be the signature of the function being invoked. The
8146    argument types must match the types implied by this signature. This
8147    type can be omitted if the function is not varargs.
8148 #. '``fnptrval``': An LLVM value containing a pointer to a function to
8149    be invoked. In most cases, this is a direct function invocation, but
8150    indirect ``invoke``'s are just as possible, calling an arbitrary pointer
8151    to function value.
8152 #. '``function args``': argument list whose types match the function
8153    signature argument types and parameter attributes. All arguments must
8154    be of :ref:`first class <t_firstclass>` type. If the function signature
8155    indicates the function accepts a variable number of arguments, the
8156    extra arguments can be specified.
8157 #. '``normal label``': the label reached when the called function
8158    executes a '``ret``' instruction.
8159 #. '``exception label``': the label reached when a callee returns via
8160    the :ref:`resume <i_resume>` instruction or other exception handling
8161    mechanism.
8162 #. The optional :ref:`function attributes <fnattrs>` list.
8163 #. The optional :ref:`operand bundles <opbundles>` list.
8164
8165 Semantics:
8166 """"""""""
8167
8168 This instruction is designed to operate as a standard '``call``'
8169 instruction in most regards. The primary difference is that it
8170 establishes an association with a label, which is used by the runtime
8171 library to unwind the stack.
8172
8173 This instruction is used in languages with destructors to ensure that
8174 proper cleanup is performed in the case of either a ``longjmp`` or a
8175 thrown exception. Additionally, this is important for implementation of
8176 '``catch``' clauses in high-level languages that support them.
8177
8178 For the purposes of the SSA form, the definition of the value returned
8179 by the '``invoke``' instruction is deemed to occur on the edge from the
8180 current block to the "normal" label. If the callee unwinds then no
8181 return value is available.
8182
8183 Example:
8184 """"""""
8185
8186 .. code-block:: llvm
8187
8188       %retval = invoke i32 @Test(i32 15) to label %Continue
8189                   unwind label %TestCleanup              ; i32:retval set
8190       %retval = invoke coldcc i32 %Testfnptr(i32 15) to label %Continue
8191                   unwind label %TestCleanup              ; i32:retval set
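
The following sketch places an ``invoke`` in a complete function so that
both successors are visible. It assumes an Itanium-style C++ personality
routine (``@__gxx_personality_v0``). The names ``@caller`` and ``@release``
are hypothetical and stand in for whatever the front end would emit.

.. code-block:: llvm

      declare i32 @Test(i32)
      declare void @release()
      declare i32 @__gxx_personality_v0(...)

      define i32 @caller() personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*) {
      entry:
        %retval = invoke i32 @Test(i32 15)
                    to label %Continue unwind label %TestCleanup

      Continue:                        ; reached when @Test returns normally
        ret i32 %retval                ; %retval is defined on this edge only

      TestCleanup:                     ; reached when @Test unwinds
        %exn = landingpad { i8*, i32 }
                  cleanup
        call void @release()           ; run cleanup code, then keep unwinding
        resume { i8*, i32 } %exn
      }

Because ``%retval`` is defined on the edge to ``%Continue``, using it in
``%TestCleanup`` would be rejected by the verifier.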
8192
8193 .. _i_callbr:
8194
8195 '``callbr``' Instruction
8196 ^^^^^^^^^^^^^^^^^^^^^^^^
8197
8198 Syntax:
8199 """""""
8200
8201 ::
8202
8203       <result> = callbr [cconv] [ret attrs] [addrspace(<num>)] <ty>|<fnty> <fnptrval>(<function args>) [fn attrs]
8204                     [operand bundles] to label <fallthrough label> [indirect labels]
8205
8206 Overview:
8207 """""""""
8208
8209 The '``callbr``' instruction causes control to transfer to a specified
8210 function, with the possibility of control flow transfer to either the
8211 '``fallthrough``' label or one of the '``indirect``' labels.
8212
8213 This instruction should only be used to implement the "goto" feature of gcc
8214 style inline assembly. Any other usage is an error in the IR verifier.
8215
8216 Arguments:
8217 """"""""""
8218
8219 This instruction requires several arguments:
8220
8221 #. The optional "cconv" marker indicates which :ref:`calling
8222    convention <callingconv>` the call should use. If none is
8223    specified, the call defaults to using C calling conventions.
8224 #. The optional :ref:`Parameter Attributes <paramattrs>` list for return
8225    values. Only '``zeroext``', '``signext``', and '``inreg``' attributes
8226    are valid here.
8227 #. The optional addrspace attribute can be used to indicate the address space
8228    of the called function. If it is not specified, the program address space
8229    from the :ref:`datalayout string<langref_datalayout>` will be used.
8230 #. '``ty``': the type of the call instruction itself which is also the
8231    type of the return value. Functions that return no value are marked
8232    ``void``.
8233 #. '``fnty``': shall be the signature of the function being called. The
8234    argument types must match the types implied by this signature. This
8235    type can be omitted if the function is not varargs.
8236 #. '``fnptrval``': An LLVM value containing a pointer to a function to
8237    be called. In most cases, this is a direct function call, but
   indirect ``callbr``'s are just as possible, calling an arbitrary pointer
8239    to function value.
8240 #. '``function args``': argument list whose types match the function
8241    signature argument types and parameter attributes. All arguments must
8242    be of :ref:`first class <t_firstclass>` type. If the function signature
8243    indicates the function accepts a variable number of arguments, the
8244    extra arguments can be specified.
8245 #. '``fallthrough label``': the label reached when the inline assembly's
8246    execution exits the bottom.
8247 #. '``indirect labels``': the labels reached when a callee transfers control
8248    to a location other than the '``fallthrough label``'. The blockaddress
8249    constant for these should also be in the list of '``function args``'.
8250 #. The optional :ref:`function attributes <fnattrs>` list.
8251 #. The optional :ref:`operand bundles <opbundles>` list.
8252
8253 Semantics:
8254 """"""""""
8255
8256 This instruction is designed to operate as a standard '``call``'
8257 instruction in most regards. The primary difference is that it
8258 establishes an association with additional labels to define where control
8259 flow goes after the call.
8260
8261 The output values of a '``callbr``' instruction are available only to
the '``fallthrough``' block, not to any '``indirect``' block(s).
8263
8264 The only use of this today is to implement the "goto" feature of gcc inline
8265 assembly where additional labels can be provided as locations for the inline
8266 assembly to jump to.
8267
8268 Example:
8269 """"""""
8270
8271 .. code-block:: llvm
8272
8273       ; "asm goto" without output constraints.
8274       callbr void asm "", "r,X"(i32 %x, i8 *blockaddress(@foo, %indirect))
8275                   to label %fallthrough [label %indirect]
8276
8277       ; "asm goto" with output constraints.
8278       <result> = callbr i32 asm "", "=r,r,X"(i32 %x, i8 *blockaddress(@foo, %indirect))
8279                   to label %fallthrough [label %indirect]
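
For context, the first form above can be embedded in a complete function as
sketched below. The function ``@foo`` and the values returned from each
block are illustrative assumptions. The operand conventions, including the
``blockaddress`` constant in the argument list, follow the description in
this section.

.. code-block:: llvm

      define i32 @foo(i32 %x) {
      entry:
        callbr void asm "", "r,X"(i32 %x, i8* blockaddress(@foo, %indirect))
                to label %fallthrough [label %indirect]

      fallthrough:                     ; the assembly ran off its end
        ret i32 0

      indirect:                        ; the assembly jumped to the label
        ret i32 1
      }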
8280
8281 .. _i_resume:
8282
8283 '``resume``' Instruction
8284 ^^^^^^^^^^^^^^^^^^^^^^^^
8285
8286 Syntax:
8287 """""""
8288
8289 ::
8290
8291       resume <type> <value>
8292
8293 Overview:
8294 """""""""
8295
8296 The '``resume``' instruction is a terminator instruction that has no
8297 successors.
8298
8299 Arguments:
8300 """"""""""
8301
8302 The '``resume``' instruction requires one argument, which must have the
8303 same type as the result of any '``landingpad``' instruction in the same
8304 function.
8305
8306 Semantics:
8307 """"""""""
8308
8309 The '``resume``' instruction resumes propagation of an existing
8310 (in-flight) exception whose unwinding was interrupted with a
8311 :ref:`landingpad <i_landingpad>` instruction.
8312
8313 Example:
8314 """"""""
8315
8316 .. code-block:: llvm
8317
8318       resume { i8*, i32 } %exn
8319
8320 .. _i_catchswitch:
8321
8322 '``catchswitch``' Instruction
8323 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
8324
8325 Syntax:
8326 """""""
8327
8328 ::
8329
8330       <resultval> = catchswitch within <parent> [ label <handler1>, label <handler2>, ... ] unwind to caller
8331       <resultval> = catchswitch within <parent> [ label <handler1>, label <handler2>, ... ] unwind label <default>
8332
8333 Overview:
8334 """""""""
8335
8336 The '``catchswitch``' instruction is used by `LLVM's exception handling system
8337 <ExceptionHandling.html#overview>`_ to describe the set of possible catch handlers
8338 that may be executed by the :ref:`EH personality routine <personalityfn>`.
8339
8340 Arguments:
8341 """"""""""
8342
8343 The ``parent`` argument is the token of the funclet that contains the
8344 ``catchswitch`` instruction. If the ``catchswitch`` is not inside a funclet,
8345 this operand may be the token ``none``.
8346
8347 The ``default`` argument is the label of another basic block beginning with
8348 either a ``cleanuppad`` or ``catchswitch`` instruction.  This unwind destination
8349 must be a legal target with respect to the ``parent`` links, as described in
8350 the `exception handling documentation\ <ExceptionHandling.html#wineh-constraints>`_.
8351
8352 The ``handlers`` are a nonempty list of successor blocks that each begin with a
8353 :ref:`catchpad <i_catchpad>` instruction.
8354
8355 Semantics:
8356 """"""""""
8357
8358 Executing this instruction transfers control to one of the successors in
8359 ``handlers``, if appropriate, or continues to unwind via the unwind label if
8360 present.
8361
8362 The ``catchswitch`` is both a terminator and a "pad" instruction, meaning that
8363 it must be both the first non-phi instruction and last instruction in the basic
8364 block. Therefore, it must be the only non-phi instruction in the block.
8365
8366 Example:
8367 """"""""
8368
8369 .. code-block:: text
8370
8371     dispatch1:
8372       %cs1 = catchswitch within none [label %handler0, label %handler1] unwind to caller
8373     dispatch2:
8374       %cs2 = catchswitch within %parenthandler [label %handler0] unwind label %cleanup
8375
8376 .. _i_catchret:
8377
8378 '``catchret``' Instruction
8379 ^^^^^^^^^^^^^^^^^^^^^^^^^^
8380
8381 Syntax:
8382 """""""
8383
8384 ::
8385
8386       catchret from <token> to label <normal>
8387
8388 Overview:
8389 """""""""
8390
8391 The '``catchret``' instruction is a terminator instruction that has a
8392 single successor.
8393
8394
8395 Arguments:
8396 """"""""""
8397
8398 The first argument to a '``catchret``' indicates which ``catchpad`` it
8399 exits.  It must be a :ref:`catchpad <i_catchpad>`.
8400 The second argument to a '``catchret``' specifies where control will
8401 transfer to next.
8402
8403 Semantics:
8404 """"""""""
8405
8406 The '``catchret``' instruction ends an existing (in-flight) exception whose
8407 unwinding was interrupted with a :ref:`catchpad <i_catchpad>` instruction.  The
8408 :ref:`personality function <personalityfn>` gets a chance to execute arbitrary
8409 code to, for example, destroy the active exception.  Control then transfers to
8410 ``normal``.
8411
8412 The ``token`` argument must be a token produced by a ``catchpad`` instruction.
8413 If the specified ``catchpad`` is not the most-recently-entered not-yet-exited
8414 funclet pad (as described in the `EH documentation\ <ExceptionHandling.html#wineh-constraints>`_),
8415 the ``catchret``'s behavior is undefined.
8416
8417 Example:
8418 """"""""
8419
8420 .. code-block:: text
8421
8422       catchret from %catch to label %continue
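
The sketch below shows how ``catchswitch``, ``catchpad`` and ``catchret``
fit together in one function. It assumes the MSVC C++ personality routine
(``@__CxxFrameHandler3``), and the ``catchpad`` arguments are the catch-all
form used with that personality. The names ``@f`` and ``@may_throw`` are
hypothetical.

.. code-block:: llvm

      declare void @may_throw()
      declare i32 @__CxxFrameHandler3(...)

      define void @f() personality i8* bitcast (i32 (...)* @__CxxFrameHandler3 to i8*) {
      entry:
        invoke void @may_throw()
                to label %done unwind label %dispatch

      dispatch:                        ; only non-phi instruction in its block
        %cs = catchswitch within none [label %handler] unwind to caller

      handler:
        %cp = catchpad within %cs [i8* null, i32 64, i8* null]
        catchret from %cp to label %done

      done:
        ret void
      }

Here ``%cp`` is the most recently entered funclet pad that has not yet been
exited when the ``catchret`` executes, which is what the rule above requires.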
8423
8424 .. _i_cleanupret:
8425
8426 '``cleanupret``' Instruction
8427 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
8428
8429 Syntax:
8430 """""""
8431
8432 ::
8433
8434       cleanupret from <value> unwind label <continue>
8435       cleanupret from <value> unwind to caller
8436
8437 Overview:
8438 """""""""
8439
8440 The '``cleanupret``' instruction is a terminator instruction that has
8441 an optional successor.
8442
8443
8444 Arguments:
8445 """"""""""
8446
8447 The '``cleanupret``' instruction requires one argument, which indicates
8448 which ``cleanuppad`` it exits, and must be a :ref:`cleanuppad <i_cleanuppad>`.
8449 If the specified ``cleanuppad`` is not the most-recently-entered not-yet-exited
8450 funclet pad (as described in the `EH documentation\ <ExceptionHandling.html#wineh-constraints>`_),
8451 the ``cleanupret``'s behavior is undefined.
8452
8453 The '``cleanupret``' instruction also has an optional successor, ``continue``,
8454 which must be the label of another basic block beginning with either a
8455 ``cleanuppad`` or ``catchswitch`` instruction.  This unwind destination must
8456 be a legal target with respect to the ``parent`` links, as described in the
8457 `exception handling documentation\ <ExceptionHandling.html#wineh-constraints>`_.
8458
8459 Semantics:
8460 """"""""""
8461
8462 The '``cleanupret``' instruction indicates to the
8463 :ref:`personality function <personalityfn>` that one
8464 :ref:`cleanuppad <i_cleanuppad>` it transferred control to has ended.
8465 It transfers control to ``continue`` or unwinds out of the function.
8466
8467 Example:
8468 """"""""
8469
8470 .. code-block:: text
8471
8472       cleanupret from %cleanup unwind to caller
8473       cleanupret from %cleanup unwind label %continue
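
As a sketch of a cleanup funclet, the function below runs a cleanup call
when ``@may_throw`` unwinds and then continues unwinding into the caller.
The MSVC C++ personality routine and the names ``@g``, ``@may_throw`` and
``@release`` are assumptions made for illustration.

.. code-block:: llvm

      declare void @may_throw()
      declare void @release()
      declare i32 @__CxxFrameHandler3(...)

      define void @g() personality i8* bitcast (i32 (...)* @__CxxFrameHandler3 to i8*) {
      entry:
        invoke void @may_throw()
                to label %done unwind label %cleanup

      cleanup:
        %cp = cleanuppad within none []
        call void @release() [ "funclet"(token %cp) ]
        cleanupret from %cp unwind to caller

      done:
        ret void
      }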
8474
8475 .. _i_unreachable:
8476
8477 '``unreachable``' Instruction
8478 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
8479
8480 Syntax:
8481 """""""
8482
8483 ::
8484
8485       unreachable
8486
8487 Overview:
8488 """""""""
8489
The '``unreachable``' instruction has no defined semantics. This
instruction is used to inform the optimizer that a particular portion of
the code is not reachable. For example, it can indicate that the code
after a call to a no-return function cannot be reached.
8494
8495 Semantics:
8496 """"""""""
8497
8498 The '``unreachable``' instruction has no defined semantics.
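
A typical use, sketched under the assumption that ``@abort`` never returns,
is to terminate the block that follows a call to a ``noreturn`` function.
The function name ``@checked`` is hypothetical.

.. code-block:: llvm

      declare void @abort() noreturn

      define i32 @checked(i32 %x) {
      entry:
        %neg = icmp slt i32 %x, 0
        br i1 %neg, label %fail, label %ok

      fail:
        call void @abort()
        unreachable                    ; control never continues past the call

      ok:
        ret i32 %x
      }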
8499
8500 .. _unaryops:
8501
8502 Unary Operations
8503 -----------------
8504
8505 Unary operators require a single operand, execute an operation on
8506 it, and produce a single value. The operand might represent multiple
8507 data, as is the case with the :ref:`vector <t_vector>` data type. The
8508 result value has the same type as its operand.
8509
8510 .. _i_fneg:
8511
8512 '``fneg``' Instruction
8513 ^^^^^^^^^^^^^^^^^^^^^^
8514
8515 Syntax:
8516 """""""
8517
8518 ::
8519
8520       <result> = fneg [fast-math flags]* <ty> <op1>   ; yields ty:result
8521
8522 Overview:
8523 """""""""
8524
8525 The '``fneg``' instruction returns the negation of its operand.
8526
8527 Arguments:
8528 """"""""""
8529
8530 The argument to the '``fneg``' instruction must be a
8531 :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
8532 floating-point values.
8533
8534 Semantics:
8535 """"""""""
8536
8537 The value produced is a copy of the operand with its sign bit flipped.
8538 This instruction can also take any number of :ref:`fast-math
8539 flags <fastmath>`, which are optimization hints to enable otherwise
unsafe floating-point optimizations.
8541
8542 Example:
8543 """"""""
8544
8545 .. code-block:: text
8546
      <result> = fneg float %val          ; yields float:result = -%val
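
The following sketch shows the scalar and vector forms in complete
functions. The function names are illustrative, and the ``nnan`` flag is
included only to show where fast-math flags are written.

.. code-block:: llvm

      define float @negate(float %x) {
        ; Only the sign bit changes: 1.5 becomes -1.5 and +0.0 becomes -0.0.
        %r = fneg float %x
        ret float %r
      }

      define <2 x float> @negate_vec(<2 x float> %v) {
        ; The operation is applied to each element of the vector.
        %r = fneg nnan <2 x float> %v
        ret <2 x float> %r
      }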
8548
8549 .. _binaryops:
8550
8551 Binary Operations
8552 -----------------
8553
8554 Binary operators are used to do most of the computation in a program.
8555 They require two operands of the same type, execute an operation on
8556 them, and produce a single value. The operands might represent multiple
8557 data, as is the case with the :ref:`vector <t_vector>` data type. The
8558 result value has the same type as its operands.
8559
8560 There are several different binary operators:
8561
8562 .. _i_add:
8563
8564 '``add``' Instruction
8565 ^^^^^^^^^^^^^^^^^^^^^
8566
8567 Syntax:
8568 """""""
8569
8570 ::
8571
8572       <result> = add <ty> <op1>, <op2>          ; yields ty:result
8573       <result> = add nuw <ty> <op1>, <op2>      ; yields ty:result
8574       <result> = add nsw <ty> <op1>, <op2>      ; yields ty:result
8575       <result> = add nuw nsw <ty> <op1>, <op2>  ; yields ty:result
8576
8577 Overview:
8578 """""""""
8579
8580 The '``add``' instruction returns the sum of its two operands.
8581
8582 Arguments:
8583 """"""""""
8584
8585 The two arguments to the '``add``' instruction must be
8586 :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
8587 arguments must have identical types.
8588
8589 Semantics:
8590 """"""""""
8591
8592 The value produced is the integer sum of the two operands.
8593
8594 If the sum has unsigned overflow, the result returned is the
8595 mathematical result modulo 2\ :sup:`n`\ , where n is the bit width of
8596 the result.
8597
8598 Because LLVM integers use a two's complement representation, this
8599 instruction is appropriate for both signed and unsigned integers.
8600
8601 ``nuw`` and ``nsw`` stand for "No Unsigned Wrap" and "No Signed Wrap",
8602 respectively. If the ``nuw`` and/or ``nsw`` keywords are present, the
8603 result value of the ``add`` is a :ref:`poison value <poisonvalues>` if
8604 unsigned and/or signed overflow, respectively, occurs.
8605
8606 Example:
8607 """"""""
8608
8609 .. code-block:: text
8610
8611       <result> = add i32 4, %var          ; yields i32:result = 4 + %var
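
The wrapping and poison rules above can be illustrated with ``i8``
operands. The function names are hypothetical, and the comments describe
the result for the specific argument values they mention.

.. code-block:: llvm

      define i8 @add_wrap(i8 %a, i8 %b) {
        ; With %a = -1 (bit pattern 0xFF, 255 unsigned) and %b = 1, the
        ; mathematical sum 256 wraps modulo 2^8, so the result is 0.
        %sum = add i8 %a, %b
        ret i8 %sum
      }

      define i8 @add_nuw(i8 %a, i8 %b) {
        ; With the same operands the unsigned sum wraps, so the result
        ; is a poison value.
        %sum = add nuw i8 %a, %b
        ret i8 %sum
      }

      define i8 @add_nsw(i8 %a, i8 %b) {
        ; With %a = 127 and %b = 1, the signed sum 128 does not fit in
        ; i8, so the result is a poison value. With %a = -1 and %b = 1
        ; there is no signed overflow and the result is 0.
        %sum = add nsw i8 %a, %b
        ret i8 %sum
      }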
8612
8613 .. _i_fadd:
8614
8615 '``fadd``' Instruction
8616 ^^^^^^^^^^^^^^^^^^^^^^
8617
8618 Syntax:
8619 """""""
8620
8621 ::
8622
8623       <result> = fadd [fast-math flags]* <ty> <op1>, <op2>   ; yields ty:result
8624
8625 Overview:
8626 """""""""
8627
8628 The '``fadd``' instruction returns the sum of its two operands.
8629
8630 Arguments:
8631 """"""""""
8632
8633 The two arguments to the '``fadd``' instruction must be
8634 :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
8635 floating-point values. Both arguments must have identical types.
8636
8637 Semantics:
8638 """"""""""
8639
8640 The value produced is the floating-point sum of the two operands.
8641 This instruction is assumed to execute in the default :ref:`floating-point
8642 environment <floatenv>`.
8643 This instruction can also take any number of :ref:`fast-math
8644 flags <fastmath>`, which are optimization hints to enable otherwise
unsafe floating-point optimizations.
8646
8647 Example:
8648 """"""""
8649
8650 .. code-block:: text
8651
8652       <result> = fadd float 4.0, %var          ; yields float:result = 4.0 + %var
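
As a sketch of how a fast-math flag is attached, the function below marks
both additions ``reassoc``. Under that flag an optimizer is permitted, but
not required, to regroup the sum as ``%a + (%b + %c)``. The function name
``@sum3`` is illustrative.

.. code-block:: llvm

      define float @sum3(float %a, float %b, float %c) {
        %t = fadd reassoc float %a, %b
        %s = fadd reassoc float %t, %c
        ret float %s
      }

Without the flag, the two additions must be evaluated in the order written,
because floating-point addition is not associative in general.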
8653
8654 .. _i_sub:
8655
8656 '``sub``' Instruction
8657 ^^^^^^^^^^^^^^^^^^^^^
8658
8659 Syntax:
8660 """""""
8661
8662 ::
8663
8664       <result> = sub <ty> <op1>, <op2>          ; yields ty:result
8665       <result> = sub nuw <ty> <op1>, <op2>      ; yields ty:result
8666       <result> = sub nsw <ty> <op1>, <op2>      ; yields ty:result
8667       <result> = sub nuw nsw <ty> <op1>, <op2>  ; yields ty:result
8668
8669 Overview:
8670 """""""""
8671
8672 The '``sub``' instruction returns the difference of its two operands.
8673
8674 Note that the '``sub``' instruction is used to represent the '``neg``'
8675 instruction present in most other intermediate representations.
8676
8677 Arguments:
8678 """"""""""
8679
8680 The two arguments to the '``sub``' instruction must be
8681 :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
8682 arguments must have identical types.
8683
8684 Semantics:
8685 """"""""""
8686
8687 The value produced is the integer difference of the two operands.
8688
8689 If the difference has unsigned overflow, the result returned is the
8690 mathematical result modulo 2\ :sup:`n`\ , where n is the bit width of
8691 the result.
8692
8693 Because LLVM integers use a two's complement representation, this
8694 instruction is appropriate for both signed and unsigned integers.
8695
8696 ``nuw`` and ``nsw`` stand for "No Unsigned Wrap" and "No Signed Wrap",
8697 respectively. If the ``nuw`` and/or ``nsw`` keywords are present, the
8698 result value of the ``sub`` is a :ref:`poison value <poisonvalues>` if
8699 unsigned and/or signed overflow, respectively, occurs.
8700
8701 Example:
8702 """"""""
8703
8704 .. code-block:: text
8705
8706       <result> = sub i32 4, %var          ; yields i32:result = 4 - %var
      <result> = sub i32 0, %val          ; yields i32:result = -%val
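
The sketch below shows negation written as a subtraction from zero,
together with the effect of the ``nsw`` and ``nuw`` flags on ``i8``
operands. The function names are hypothetical.

.. code-block:: llvm

      define i8 @negate(i8 %x) {
        ; Without flags the result wraps: with %x = -128 the result is
        ; again -128.
        %neg = sub i8 0, %x
        ret i8 %neg
      }

      define i8 @negate_nsw(i8 %x) {
        ; With %x = -128 the signed result 128 does not fit in i8, so
        ; the result is a poison value.
        %neg = sub nsw i8 0, %x
        ret i8 %neg
      }

      define i8 @sub_nuw(i8 %a, i8 %b) {
        ; The result is a poison value whenever %b is greater than %a
        ; when both are read as unsigned integers.
        %d = sub nuw i8 %a, %b
        ret i8 %d
      }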
8708
8709 .. _i_fsub:
8710
8711 '``fsub``' Instruction
8712 ^^^^^^^^^^^^^^^^^^^^^^
8713
8714 Syntax:
8715 """""""
8716
8717 ::
8718
8719       <result> = fsub [fast-math flags]* <ty> <op1>, <op2>   ; yields ty:result
8720
8721 Overview:
8722 """""""""
8723
8724 The '``fsub``' instruction returns the difference of its two operands.
8725
8726 Arguments:
8727 """"""""""
8728
8729 The two arguments to the '``fsub``' instruction must be
8730 :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
8731 floating-point values. Both arguments must have identical types.
8732
8733 Semantics:
8734 """"""""""
8735
8736 The value produced is the floating-point difference of the two operands.
8737 This instruction is assumed to execute in the default :ref:`floating-point
8738 environment <floatenv>`.
8739 This instruction can also take any number of :ref:`fast-math
8740 flags <fastmath>`, which are optimization hints to enable otherwise
unsafe floating-point optimizations.
8742
8743 Example:
8744 """"""""
8745
8746 .. code-block:: text
8747
8748       <result> = fsub float 4.0, %var           ; yields float:result = 4.0 - %var
      <result> = fsub float -0.0, %val          ; yields float:result = -%val
8750
8751 .. _i_mul:
8752
8753 '``mul``' Instruction
8754 ^^^^^^^^^^^^^^^^^^^^^
8755
8756 Syntax:
8757 """""""
8758
8759 ::
8760
8761       <result> = mul <ty> <op1>, <op2>          ; yields ty:result
8762       <result> = mul nuw <ty> <op1>, <op2>      ; yields ty:result
8763       <result> = mul nsw <ty> <op1>, <op2>      ; yields ty:result
8764       <result> = mul nuw nsw <ty> <op1>, <op2>  ; yields ty:result
8765
8766 Overview:
8767 """""""""
8768
8769 The '``mul``' instruction returns the product of its two operands.
8770
8771 Arguments:
8772 """"""""""
8773
8774 The two arguments to the '``mul``' instruction must be
8775 :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
8776 arguments must have identical types.
8777
8778 Semantics:
8779 """"""""""
8780
8781 The value produced is the integer product of the two operands.
8782
8783 If the result of the multiplication has unsigned overflow, the result
8784 returned is the mathematical result modulo 2\ :sup:`n`\ , where n is the
8785 bit width of the result.
8786
8787 Because LLVM integers use a two's complement representation, and the
8788 result is the same width as the operands, this instruction returns the
8789 correct result for both signed and unsigned integers. If a full product
8790 (e.g. ``i32`` * ``i32`` -> ``i64``) is needed, the operands should be
8791 sign-extended or zero-extended as appropriate to the width of the full
8792 product.
8793
8794 ``nuw`` and ``nsw`` stand for "No Unsigned Wrap" and "No Signed Wrap",
8795 respectively. If the ``nuw`` and/or ``nsw`` keywords are present, the
8796 result value of the ``mul`` is a :ref:`poison value <poisonvalues>` if
8797 unsigned and/or signed overflow, respectively, occurs.
8798
8799 Example:
8800 """"""""
8801
8802 .. code-block:: text
8803
8804       <result> = mul i32 4, %var          ; yields i32:result = 4 * %var
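
The full-product pattern described above can be sketched as follows. The
function names are illustrative. The ``nuw`` and ``nsw`` flags are optional
here and are justified only because the extended operands cannot overflow
a 64-bit product.

.. code-block:: llvm

      define i64 @umul_full(i32 %a, i32 %b) {
        ; Zero-extend for an unsigned product: (2^32 - 1)^2 < 2^64.
        %a.ext = zext i32 %a to i64
        %b.ext = zext i32 %b to i64
        %prod = mul nuw i64 %a.ext, %b.ext
        ret i64 %prod
      }

      define i64 @smul_full(i32 %a, i32 %b) {
        ; Sign-extend for a signed product: the magnitude is at most 2^62.
        %a.ext = sext i32 %a to i64
        %b.ext = sext i32 %b to i64
        %prod = mul nsw i64 %a.ext, %b.ext
        ret i64 %prod
      }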
8805
8806 .. _i_fmul:
8807
8808 '``fmul``' Instruction
8809 ^^^^^^^^^^^^^^^^^^^^^^
8810
8811 Syntax:
8812 """""""
8813
8814 ::
8815
8816       <result> = fmul [fast-math flags]* <ty> <op1>, <op2>   ; yields ty:result
8817
8818 Overview:
8819 """""""""
8820
8821 The '``fmul``' instruction returns the product of its two operands.
8822
8823 Arguments:
8824 """"""""""
8825
8826 The two arguments to the '``fmul``' instruction must be
8827 :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
8828 floating-point values. Both arguments must have identical types.
8829
8830 Semantics:
8831 """"""""""
8832
8833 The value produced is the floating-point product of the two operands.
8834 This instruction is assumed to execute in the default :ref:`floating-point
8835 environment <floatenv>`.
8836 This instruction can also take any number of :ref:`fast-math
8837 flags <fastmath>`, which are optimization hints to enable otherwise
unsafe floating-point optimizations.
8839
8840 Example:
8841 """"""""
8842
8843 .. code-block:: text
8844
8845       <result> = fmul float 4.0, %var          ; yields float:result = 4.0 * %var
8846
8847 .. _i_udiv:
8848
8849 '``udiv``' Instruction
8850 ^^^^^^^^^^^^^^^^^^^^^^
8851
8852 Syntax:
8853 """""""
8854
8855 ::
8856
8857       <result> = udiv <ty> <op1>, <op2>         ; yields ty:result
8858       <result> = udiv exact <ty> <op1>, <op2>   ; yields ty:result
8859
8860 Overview:
8861 """""""""
8862
8863 The '``udiv``' instruction returns the quotient of its two operands.
8864
8865 Arguments:
8866 """"""""""
8867
8868 The two arguments to the '``udiv``' instruction must be
8869 :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
8870 arguments must have identical types.
8871
8872 Semantics:
8873 """"""""""
8874
8875 The value produced is the unsigned integer quotient of the two operands.
8876
8877 Note that unsigned integer division and signed integer division are
8878 distinct operations; for signed integer division, use '``sdiv``'.
8879
8880 Division by zero is undefined behavior. For vectors, if any element
8881 of the divisor is zero, the operation has undefined behavior.
8882
If the ``exact`` keyword is present, the result value of the ``udiv`` is
a :ref:`poison value <poisonvalues>` if ``op1`` is not a multiple of ``op2``
(as such, "((a udiv exact b) mul b) == a").
8887
8888 Example:
8889 """"""""
8890
8891 .. code-block:: text
8892
8893       <result> = udiv i32 4, %var          ; yields i32:result = 4 / %var
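
A few constant cases, written as a sketch with a hypothetical function
``@udiv_demo``, show how operands are interpreted and when ``exact``
produces poison.

.. code-block:: llvm

      define i32 @udiv_demo() {
        %q1 = udiv i32 7, 2          ; 3
        %q2 = udiv i32 -8, 2         ; 2147483644, since -8 is 4294967288 unsigned
        %q3 = udiv exact i32 8, 2    ; 4
        %q4 = udiv exact i32 7, 2    ; poison, since 7 is not a multiple of 2
        ret i32 %q1
      }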
8894
8895 .. _i_sdiv:
8896
8897 '``sdiv``' Instruction
8898 ^^^^^^^^^^^^^^^^^^^^^^
8899
8900 Syntax:
8901 """""""
8902
8903 ::
8904
8905       <result> = sdiv <ty> <op1>, <op2>         ; yields ty:result
8906       <result> = sdiv exact <ty> <op1>, <op2>   ; yields ty:result
8907
8908 Overview:
8909 """""""""
8910
8911 The '``sdiv``' instruction returns the quotient of its two operands.
8912
8913 Arguments:
8914 """"""""""
8915
8916 The two arguments to the '``sdiv``' instruction must be
8917 :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
8918 arguments must have identical types.
8919
8920 Semantics:
8921 """"""""""
8922
8923 The value produced is the signed integer quotient of the two operands
8924 rounded towards zero.
8925
8926 Note that signed integer division and unsigned integer division are
8927 distinct operations; for unsigned integer division, use '``udiv``'.
8928
8929 Division by zero is undefined behavior. For vectors, if any element
8930 of the divisor is zero, the operation has undefined behavior.
8931 Overflow also leads to undefined behavior; this is a rare case, but can
8932 occur, for example, by doing a 32-bit division of -2147483648 by -1.
8933
8934 If the ``exact`` keyword is present, the result value of the ``sdiv`` is
8935 a :ref:`poison value <poisonvalues>` if the result would be rounded.
8936
8937 Example:
8938 """"""""
8939
8940 .. code-block:: text
8941
8942       <result> = sdiv i32 4, %var          ; yields i32:result = 4 / %var
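
Because both division by zero and the overflowing case are undefined
behavior, a front end for a language that defines those cases has to guard
the instruction. The sketch below is one way to do so. The function name
``@guarded_sdiv`` and the fallback value of 0 are assumptions made for
illustration, not a recommended policy.

.. code-block:: llvm

      define i32 @guarded_sdiv(i32 %a, i32 %b) {
      entry:
        %is.zero = icmp eq i32 %b, 0
        %is.min = icmp eq i32 %a, -2147483648
        %is.m1 = icmp eq i32 %b, -1
        %ovf = and i1 %is.min, %is.m1
        %bad = or i1 %is.zero, %ovf
        br i1 %bad, label %fallback, label %divide

      divide:                          ; neither undefined case can occur here
        %q = sdiv i32 %a, %b           ; e.g. -7 / 2 yields -3 (rounds toward zero)
        ret i32 %q

      fallback:
        ret i32 0
      }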
8943
8944 .. _i_fdiv:
8945
8946 '``fdiv``' Instruction
8947 ^^^^^^^^^^^^^^^^^^^^^^
8948
8949 Syntax:
8950 """""""
8951
8952 ::
8953
8954       <result> = fdiv [fast-math flags]* <ty> <op1>, <op2>   ; yields ty:result
8955
8956 Overview:
8957 """""""""
8958
8959 The '``fdiv``' instruction returns the quotient of its two operands.
8960
8961 Arguments:
8962 """"""""""
8963
8964 The two arguments to the '``fdiv``' instruction must be
8965 :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
8966 floating-point values. Both arguments must have identical types.
8967
8968 Semantics:
8969 """"""""""
8970
8971 The value produced is the floating-point quotient of the two operands.
8972 This instruction is assumed to execute in the default :ref:`floating-point
8973 environment <floatenv>`.
8974 This instruction can also take any number of :ref:`fast-math
8975 flags <fastmath>`, which are optimization hints to enable otherwise
unsafe floating-point optimizations.
8977
8978 Example:
8979 """"""""
8980
8981 .. code-block:: text
8982
8983       <result> = fdiv float 4.0, %var          ; yields float:result = 4.0 / %var
8984
8985 .. _i_urem:
8986
8987 '``urem``' Instruction
8988 ^^^^^^^^^^^^^^^^^^^^^^
8989
8990 Syntax:
8991 """""""
8992
8993 ::
8994
8995       <result> = urem <ty> <op1>, <op2>   ; yields ty:result
8996
8997 Overview:
8998 """""""""
8999
9000 The '``urem``' instruction returns the remainder from the unsigned
9001 division of its two arguments.
9002
9003 Arguments:
9004 """"""""""
9005
9006 The two arguments to the '``urem``' instruction must be
9007 :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
9008 arguments must have identical types.
9009
9010 Semantics:
9011 """"""""""
9012
This instruction returns the unsigned integer *remainder* of a division.
Both operands are always treated as unsigned integers when computing the
remainder.
9016
9017 Note that unsigned integer remainder and signed integer remainder are
9018 distinct operations; for signed integer remainder, use '``srem``'.
9019
9020 Taking the remainder of a division by zero is undefined behavior.
9021 For vectors, if any element of the divisor is zero, the operation has
9022 undefined behavior.
9023
9024 Example:
9025 """"""""
9026
9027 .. code-block:: text
9028
9029       <result> = urem i32 4, %var          ; yields i32:result = 4 % %var
9030
9031 .. _i_srem:
9032
9033 '``srem``' Instruction
9034 ^^^^^^^^^^^^^^^^^^^^^^
9035
9036 Syntax:
9037 """""""
9038
9039 ::
9040
9041       <result> = srem <ty> <op1>, <op2>   ; yields ty:result
9042
9043 Overview:
9044 """""""""
9045
9046 The '``srem``' instruction returns the remainder from the signed
9047 division of its two operands. This instruction can also take
9048 :ref:`vector <t_vector>` versions of the values in which case the elements
9049 must be integers.
9050
9051 Arguments:
9052 """"""""""
9053
9054 The two arguments to the '``srem``' instruction must be
9055 :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
9056 arguments must have identical types.
9057
9058 Semantics:
9059 """"""""""
9060
9061 This instruction returns the *remainder* of a division (where the result
9062 is either zero or has the same sign as the dividend, ``op1``), not the
9063 *modulo* operator (where the result is either zero or has the same sign
9064 as the divisor, ``op2``) of a value. For more information about the
9065 difference, see `The Math
9066 Forum <http://mathforum.org/dr.math/problems/anne.4.28.99.html>`_. For a
9067 table of how this is implemented in various languages, please see
9068 `Wikipedia: modulo
9069 operation <http://en.wikipedia.org/wiki/Modulo_operation>`_.
9070
9071 Note that signed integer remainder and unsigned integer remainder are
9072 distinct operations; for unsigned integer remainder, use '``urem``'.
9073
9074 Taking the remainder of a division by zero is undefined behavior.
9075 For vectors, if any element of the divisor is zero, the operation has
9076 undefined behavior.
9077 Overflow also leads to undefined behavior; this is a rare case, but can
9078 occur, for example, by taking the remainder of a 32-bit division of
9079 -2147483648 by -1. (The remainder doesn't actually overflow, but this
9080 rule lets srem be implemented using instructions that return both the
9081 result of the division and the remainder.)
9082
9083 Example:
9084 """"""""
9085
9086 .. code-block:: text
9087
9088       <result> = srem i32 4, %var          ; yields i32:result = 4 % %var
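
As a minimal sketch of the difference between remainder and modulo, the
function below derives a floored modulo from '``srem``'. The name
``@floor_mod`` is hypothetical and not part of LLVM. The function inherits
the undefined behavior of '``srem``' for a zero divisor and for the
overflow case.

.. code-block:: llvm

      ; Hypothetical helper: floored modulo, whose result has the sign of %b.
      define i32 @floor_mod(i32 %a, i32 %b) {
      entry:
        %r = srem i32 %a, %b               ; remainder: zero or the sign of %a
        %rb = xor i32 %r, %b               ; sign bit set if %r and %b differ in sign
        %differs = icmp slt i32 %rb, 0
        %nonzero = icmp ne i32 %r, 0
        %fix = and i1 %differs, %nonzero   ; adjust only a non-zero remainder
        %adj = add i32 %r, %b              ; cannot overflow: the signs differ
        %m = select i1 %fix, i32 %adj, i32 %r
        ret i32 %m
      }

With operands -7 and 3, '``srem``' yields -1 while ``@floor_mod`` yields 2;
with 7 and -3, they yield 1 and -2 respectively.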
9089
9090 .. _i_frem:
9091
9092 '``frem``' Instruction
9093 ^^^^^^^^^^^^^^^^^^^^^^
9094
9095 Syntax:
9096 """""""
9097
9098 ::
9099
9100       <result> = frem [fast-math flags]* <ty> <op1>, <op2>   ; yields ty:result
9101
9102 Overview:
9103 """""""""
9104
9105 The '``frem``' instruction returns the remainder from the division of
9106 its two operands.
9107
9108 Arguments:
9109 """"""""""
9110
9111 The two arguments to the '``frem``' instruction must be
9112 :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
9113 floating-point values. Both arguments must have identical types.
9114
9115 Semantics:
9116 """"""""""
9117
9118 The value produced is the floating-point remainder of the two operands.
9119 This is the same output as a libm '``fmod``' function, but without any
9120 possibility of setting ``errno``. The remainder has the same sign as the
9121 dividend.
9122 This instruction is assumed to execute in the default :ref:`floating-point
9123 environment <floatenv>`.
This instruction can also take any number of :ref:`fast-math
flags <fastmath>`, which are optimization hints to enable otherwise
unsafe floating-point optimizations.
9127
9128 Example:
9129 """"""""
9130
9131 .. code-block:: text
9132
9133       <result> = frem float 4.0, %var          ; yields float:result = 4.0 % %var
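
The sign rule can be illustrated with a small sketch. The function name
``@wrap_degrees`` is an assumption made for this example only.

.. code-block:: llvm

      ; Hypothetical helper: reduce an angle in degrees into (-360.0, 360.0).
      define double @wrap_degrees(double %deg) {
      entry:
        %r = frem double %deg, 360.0   ; result takes the sign of %deg
        ret double %r
      }

For an argument of 450.0 the result is 90.0, and for -450.0 it is -90.0,
matching what ``fmod`` returns for the same operands.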
9134
9135 .. _bitwiseops:
9136
9137 Bitwise Binary Operations
9138 -------------------------
9139
9140 Bitwise binary operators are used to do various forms of bit-twiddling
9141 in a program. They are generally very efficient instructions and can
9142 commonly be strength reduced from other instructions. They require two
9143 operands of the same type, execute an operation on them, and produce a
9144 single value. The resulting value is the same type as its operands.
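
As an illustration of strength reduction, the sketch below shows bitwise
forms of multiplication, unsigned division, and unsigned remainder by the
power of two 8. The function names are hypothetical; each function returns
the same value as the arithmetic instruction named in its comment.

.. code-block:: llvm

      define i32 @mul8(i32 %x) {
        %r = shl i32 %x, 3     ; same value as: mul i32 %x, 8
        ret i32 %r
      }

      define i32 @udiv8(i32 %x) {
        %r = lshr i32 %x, 3    ; same value as: udiv i32 %x, 8
        ret i32 %r
      }

      define i32 @urem8(i32 %x) {
        %r = and i32 %x, 7     ; same value as: urem i32 %x, 8
        ret i32 %r
      }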
9145
9146 .. _i_shl:
9147
9148 '``shl``' Instruction
9149 ^^^^^^^^^^^^^^^^^^^^^
9150
9151 Syntax:
9152 """""""
9153
9154 ::
9155
9156       <result> = shl <ty> <op1>, <op2>           ; yields ty:result
9157       <result> = shl nuw <ty> <op1>, <op2>       ; yields ty:result
9158       <result> = shl nsw <ty> <op1>, <op2>       ; yields ty:result
9159       <result> = shl nuw nsw <ty> <op1>, <op2>   ; yields ty:result
9160
9161 Overview:
9162 """""""""
9163
9164 The '``shl``' instruction returns the first operand shifted to the left
9165 a specified number of bits.
9166
9167 Arguments:
9168 """"""""""
9169
9170 Both arguments to the '``shl``' instruction must be the same
9171 :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer type.
9172 '``op2``' is treated as an unsigned value.
9173
9174 Semantics:
9175 """"""""""
9176
9177 The value produced is ``op1`` \* 2\ :sup:`op2` mod 2\ :sup:`n`,
9178 where ``n`` is the width of the result. If ``op2`` is (statically or
9179 dynamically) equal to or larger than the number of bits in
9180 ``op1``, this instruction returns a :ref:`poison value <poisonvalues>`.
9181 If the arguments are vectors, each vector element of ``op1`` is shifted
9182 by the corresponding shift amount in ``op2``.
9183
9184 If the ``nuw`` keyword is present, then the shift produces a poison
9185 value if it shifts out any non-zero bits.
9186 If the ``nsw`` keyword is present, then the shift produces a poison
9187 value if it shifts out any bits that disagree with the resultant sign bit.
9188
9189 Example:
9190 """"""""
9191
9192 .. code-block:: text
9193
9194       <result> = shl i32 4, %var   ; yields i32: 4 << %var
9195       <result> = shl i32 4, 2      ; yields i32: 16
9196       <result> = shl i32 1, 10     ; yields i32: 1024
      <result> = shl i32 1, 32     ; yields i32: poison value
9198       <result> = shl <2 x i32> < i32 1, i32 1>, < i32 1, i32 2>   ; yields: result=<2 x i32> < i32 2, i32 4>
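
A front end that wants a defined result for every shift amount can mask
the amount before shifting. The following is a minimal sketch, assuming a
hypothetical function ``@shl_masked`` and a 32-bit operand; it changes the
meaning of large shift amounts to "modulo 32" rather than reproducing any
particular source language.

.. code-block:: llvm

      define i32 @shl_masked(i32 %x, i32 %n) {
        %amt = and i32 %n, 31      ; keep the amount in the range 0..31
        %r = shl i32 %x, %amt      ; never poison: %amt is less than 32
        ret i32 %r
      }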
9199
9200 .. _i_lshr:
9201
9202
9203 '``lshr``' Instruction
9204 ^^^^^^^^^^^^^^^^^^^^^^
9205
9206 Syntax:
9207 """""""
9208
9209 ::
9210
9211       <result> = lshr <ty> <op1>, <op2>         ; yields ty:result
9212       <result> = lshr exact <ty> <op1>, <op2>   ; yields ty:result
9213
9214 Overview:
9215 """""""""
9216
9217 The '``lshr``' instruction (logical shift right) returns the first
9218 operand shifted to the right a specified number of bits with zero fill.
9219
9220 Arguments:
9221 """"""""""
9222
9223 Both arguments to the '``lshr``' instruction must be the same
9224 :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer type.
9225 '``op2``' is treated as an unsigned value.
9226
9227 Semantics:
9228 """"""""""
9229
9230 This instruction always performs a logical shift right operation. The
9231 most significant bits of the result will be filled with zero bits after
9232 the shift. If ``op2`` is (statically or dynamically) equal to or larger
9233 than the number of bits in ``op1``, this instruction returns a :ref:`poison
9234 value <poisonvalues>`. If the arguments are vectors, each vector element
9235 of ``op1`` is shifted by the corresponding shift amount in ``op2``.
9236
9237 If the ``exact`` keyword is present, the result value of the ``lshr`` is
9238 a poison value if any of the bits shifted out are non-zero.
9239
9240 Example:
9241 """"""""
9242
9243 .. code-block:: text
9244
9245       <result> = lshr i32 4, 1   ; yields i32:result = 2
9246       <result> = lshr i32 4, 2   ; yields i32:result = 1
9247       <result> = lshr i8  4, 3   ; yields i8:result = 0
9248       <result> = lshr i8 -2, 1   ; yields i8:result = 0x7F
      <result> = lshr i32 1, 32  ; yields i32: poison value
9250       <result> = lshr <2 x i32> < i32 -2, i32 4>, < i32 1, i32 2>   ; yields: result=<2 x i32> < i32 0x7FFFFFFF, i32 1>
9251
9252 .. _i_ashr:
9253
9254 '``ashr``' Instruction
9255 ^^^^^^^^^^^^^^^^^^^^^^
9256
9257 Syntax:
9258 """""""
9259
9260 ::
9261
9262       <result> = ashr <ty> <op1>, <op2>         ; yields ty:result
9263       <result> = ashr exact <ty> <op1>, <op2>   ; yields ty:result
9264
9265 Overview:
9266 """""""""
9267
9268 The '``ashr``' instruction (arithmetic shift right) returns the first
9269 operand shifted to the right a specified number of bits with sign
9270 extension.
9271
9272 Arguments:
9273 """"""""""
9274
9275 Both arguments to the '``ashr``' instruction must be the same
9276 :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer type.
9277 '``op2``' is treated as an unsigned value.
9278
9279 Semantics:
9280 """"""""""
9281
This instruction always performs an arithmetic shift right operation.
The most significant bits of the result will be filled with the sign bit
of ``op1``. If ``op2`` is (statically or dynamically) equal to or larger
than the number of bits in ``op1``, this instruction returns a :ref:`poison
value <poisonvalues>`. If the arguments are vectors, each vector element
of ``op1`` is shifted by the corresponding shift amount in ``op2``.
9288
9289 If the ``exact`` keyword is present, the result value of the ``ashr`` is
9290 a poison value if any of the bits shifted out are non-zero.
9291
9292 Example:
9293 """"""""
9294
9295 .. code-block:: text
9296
9297       <result> = ashr i32 4, 1   ; yields i32:result = 2
9298       <result> = ashr i32 4, 2   ; yields i32:result = 1
9299       <result> = ashr i8  4, 3   ; yields i8:result = 0
9300       <result> = ashr i8 -2, 1   ; yields i8:result = -1
      <result> = ashr i32 1, 32  ; yields i32: poison value
9302       <result> = ashr <2 x i32> < i32 -2, i32 4>, < i32 1, i32 3>   ; yields: result=<2 x i32> < i32 -1, i32 0>
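
The difference between the two right shifts shows up only for operands
whose sign bit is set. The sketch below applies both to the same input;
the function names are hypothetical.

.. code-block:: llvm

      define i8 @shift_logical(i8 %x) {
        %r = lshr i8 %x, 2     ; zero fill:  -8 (0xF8) becomes 62 (0x3E)
        ret i8 %r
      }

      define i8 @shift_arith(i8 %x) {
        %r = ashr i8 %x, 2     ; sign fill:  -8 becomes -2, and -7 also becomes -2
        ret i8 %r
      }

Note that ``ashr i8 -7, 2`` yields -2, whereas ``sdiv i8 -7, 4`` yields -1:
the arithmetic shift rounds toward negative infinity, while '``sdiv``'
rounds toward zero.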
9303
9304 .. _i_and:
9305
9306 '``and``' Instruction
9307 ^^^^^^^^^^^^^^^^^^^^^
9308
9309 Syntax:
9310 """""""
9311
9312 ::
9313
9314       <result> = and <ty> <op1>, <op2>   ; yields ty:result
9315
9316 Overview:
9317 """""""""
9318
9319 The '``and``' instruction returns the bitwise logical and of its two
9320 operands.
9321
9322 Arguments:
9323 """"""""""
9324
9325 The two arguments to the '``and``' instruction must be
9326 :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
9327 arguments must have identical types.
9328
9329 Semantics:
9330 """"""""""
9331
9332 The truth table used for the '``and``' instruction is:
9333
9334 +-----+-----+-----+
9335 | In0 | In1 | Out |
9336 +-----+-----+-----+
9337 |   0 |   0 |   0 |
9338 +-----+-----+-----+
9339 |   0 |   1 |   0 |
9340 +-----+-----+-----+
9341 |   1 |   0 |   0 |
9342 +-----+-----+-----+
9343 |   1 |   1 |   1 |
9344 +-----+-----+-----+
9345
9346 Example:
9347 """"""""
9348
9349 .. code-block:: text
9350
9351       <result> = and i32 4, %var         ; yields i32:result = 4 & %var
9352       <result> = and i32 15, 40          ; yields i32:result = 8
9353       <result> = and i32 4, 8            ; yields i32:result = 0
9354
9355 .. _i_or:
9356
9357 '``or``' Instruction
9358 ^^^^^^^^^^^^^^^^^^^^
9359
9360 Syntax:
9361 """""""
9362
9363 ::
9364
9365       <result> = or <ty> <op1>, <op2>   ; yields ty:result
9366
9367 Overview:
9368 """""""""
9369
9370 The '``or``' instruction returns the bitwise logical inclusive or of its
9371 two operands.
9372
9373 Arguments:
9374 """"""""""
9375
9376 The two arguments to the '``or``' instruction must be
9377 :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
9378 arguments must have identical types.
9379
9380 Semantics:
9381 """"""""""
9382
9383 The truth table used for the '``or``' instruction is:
9384
9385 +-----+-----+-----+
9386 | In0 | In1 | Out |
9387 +-----+-----+-----+
9388 |   0 |   0 |   0 |
9389 +-----+-----+-----+
9390 |   0 |   1 |   1 |
9391 +-----+-----+-----+
9392 |   1 |   0 |   1 |
9393 +-----+-----+-----+
9394 |   1 |   1 |   1 |
9395 +-----+-----+-----+
9396
9397 Example:
9398 """"""""
9399
.. code-block:: text
9401
9402       <result> = or i32 4, %var         ; yields i32:result = 4 | %var
9403       <result> = or i32 15, 40          ; yields i32:result = 47
9404       <result> = or i32 4, 8            ; yields i32:result = 12
9405
9406 .. _i_xor:
9407
9408 '``xor``' Instruction
9409 ^^^^^^^^^^^^^^^^^^^^^
9410
9411 Syntax:
9412 """""""
9413
9414 ::
9415
9416       <result> = xor <ty> <op1>, <op2>   ; yields ty:result
9417
9418 Overview:
9419 """""""""
9420
9421 The '``xor``' instruction returns the bitwise logical exclusive or of
9422 its two operands. The ``xor`` is used to implement the "one's
9423 complement" operation, which is the "~" operator in C.
9424
9425 Arguments:
9426 """"""""""
9427
9428 The two arguments to the '``xor``' instruction must be
9429 :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
9430 arguments must have identical types.
9431
9432 Semantics:
9433 """"""""""
9434
9435 The truth table used for the '``xor``' instruction is:
9436
9437 +-----+-----+-----+
9438 | In0 | In1 | Out |
9439 +-----+-----+-----+
9440 |   0 |   0 |   0 |
9441 +-----+-----+-----+
9442 |   0 |   1 |   1 |
9443 +-----+-----+-----+
9444 |   1 |   0 |   1 |
9445 +-----+-----+-----+
9446 |   1 |   1 |   0 |
9447 +-----+-----+-----+
9448
9449 Example:
9450 """"""""
9451
9452 .. code-block:: text
9453
9454       <result> = xor i32 4, %var         ; yields i32:result = 4 ^ %var
9455       <result> = xor i32 15, 40          ; yields i32:result = 39
9456       <result> = xor i32 4, 8            ; yields i32:result = 12
9457       <result> = xor i32 %V, -1          ; yields i32:result = ~%V
9458
9459 Vector Operations
9460 -----------------
9461
9462 LLVM supports several instructions to represent vector operations in a
9463 target-independent manner. These instructions cover the element-access
9464 and vector-specific operations needed to process vectors effectively.
9465 While LLVM does directly support these vector operations, many
9466 sophisticated algorithms will want to use target-specific intrinsics to
9467 take full advantage of a specific target.
9468
9469 .. _i_extractelement:
9470
9471 '``extractelement``' Instruction
9472 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
9473
9474 Syntax:
9475 """""""
9476
9477 ::
9478
9479       <result> = extractelement <n x <ty>> <val>, <ty2> <idx>  ; yields <ty>
9480       <result> = extractelement <vscale x n x <ty>> <val>, <ty2> <idx> ; yields <ty>
9481
9482 Overview:
9483 """""""""
9484
9485 The '``extractelement``' instruction extracts a single scalar element
9486 from a vector at a specified index.
9487
9488 Arguments:
9489 """"""""""
9490
9491 The first operand of an '``extractelement``' instruction is a value of
9492 :ref:`vector <t_vector>` type. The second operand is an index indicating
9493 the position from which to extract the element. The index may be a
9494 variable of any integer type.
9495
9496 Semantics:
9497 """"""""""
9498
The result is a scalar of the same type as the element type of ``val``.
Its value is the value at position ``idx`` of ``val``. If ``idx`` is
greater than or equal to the length of ``val`` for a fixed-length vector,
the result is a :ref:`poison value <poisonvalues>`. For a scalable vector,
if the value of ``idx`` is greater than or equal to the runtime length of
the vector, the result is a :ref:`poison value <poisonvalues>`.
9505
9506 Example:
9507 """"""""
9508
9509 .. code-block:: text
9510
9511       <result> = extractelement <4 x i32> %vec, i32 0    ; yields i32
9512
9513 .. _i_insertelement:
9514
9515 '``insertelement``' Instruction
9516 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
9517
9518 Syntax:
9519 """""""
9520
9521 ::
9522
9523       <result> = insertelement <n x <ty>> <val>, <ty> <elt>, <ty2> <idx>    ; yields <n x <ty>>
9524       <result> = insertelement <vscale x n x <ty>> <val>, <ty> <elt>, <ty2> <idx> ; yields <vscale x n x <ty>>
9525
9526 Overview:
9527 """""""""
9528
9529 The '``insertelement``' instruction inserts a scalar element into a
9530 vector at a specified index.
9531
9532 Arguments:
9533 """"""""""
9534
9535 The first operand of an '``insertelement``' instruction is a value of
9536 :ref:`vector <t_vector>` type. The second operand is a scalar value whose
9537 type must equal the element type of the first operand. The third operand
9538 is an index indicating the position at which to insert the value. The
9539 index may be a variable of any integer type.
9540
9541 Semantics:
9542 """"""""""
9543
The result is a vector of the same type as ``val``. Its element values
are those of ``val`` except at position ``idx``, where it gets the value
``elt``. If ``idx`` is greater than or equal to the length of ``val`` for a
fixed-length vector, the result is a :ref:`poison value <poisonvalues>`.
For a scalable vector, if the value of ``idx`` is greater than or equal to
the runtime length of the vector, the result is a
:ref:`poison value <poisonvalues>`.
9550
9551 Example:
9552 """"""""
9553
9554 .. code-block:: text
9555
9556       <result> = insertelement <4 x i32> %vec, i32 1, i32 0    ; yields <4 x i32>
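
The two element-access instructions are often used together. The sketch
below swaps the first two elements of a vector; the function name
``@swap01`` is hypothetical.

.. code-block:: llvm

      define <4 x i32> @swap01(<4 x i32> %vec) {
        %e0 = extractelement <4 x i32> %vec, i32 0
        %e1 = extractelement <4 x i32> %vec, i32 1
        %t = insertelement <4 x i32> %vec, i32 %e1, i32 0   ; element 0 becomes old element 1
        %r = insertelement <4 x i32> %t, i32 %e0, i32 1     ; element 1 becomes old element 0
        ret <4 x i32> %r
      }

Each '``insertelement``' produces a new vector value; ``%vec`` itself is
not modified.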
9557
9558 .. _i_shufflevector:
9559
9560 '``shufflevector``' Instruction
9561 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
9562
9563 Syntax:
9564 """""""
9565
9566 ::
9567
9568       <result> = shufflevector <n x <ty>> <v1>, <n x <ty>> <v2>, <m x i32> <mask>    ; yields <m x <ty>>
      <result> = shufflevector <vscale x n x <ty>> <v1>, <vscale x n x <ty>> <v2>, <vscale x m x i32> <mask>  ; yields <vscale x m x <ty>>
9570
9571 Overview:
9572 """""""""
9573
9574 The '``shufflevector``' instruction constructs a permutation of elements
9575 from two input vectors, returning a vector with the same element type as
9576 the input and length that is the same as the shuffle mask.
9577
9578 Arguments:
9579 """"""""""
9580
9581 The first two operands of a '``shufflevector``' instruction are vectors
9582 with the same type. The third argument is a shuffle mask vector constant
9583 whose element type is ``i32``. The mask vector elements must be constant
9584 integers or ``undef`` values. The result of the instruction is a vector
9585 whose length is the same as the shuffle mask and whose element type is the
9586 same as the element type of the first two operands.
9587
9588 Semantics:
9589 """"""""""
9590
9591 The elements of the two input vectors are numbered from left to right
9592 across both of the vectors. For each element of the result vector, the
9593 shuffle mask selects an element from one of the input vectors to copy
9594 to the result. Non-negative elements in the mask represent an index
9595 into the concatenated pair of input vectors.
9596
9597 If the shuffle mask is undefined, the result vector is undefined. If
9598 the shuffle mask selects an undefined element from one of the input
9599 vectors, the resulting element is undefined. An undefined element
9600 in the mask vector specifies that the resulting element is undefined.
9601 An undefined element in the mask vector prevents a poisoned vector
9602 element from propagating.
9603
9604 For scalable vectors, the only valid mask values at present are
9605 ``zeroinitializer`` and ``undef``, since we cannot write all indices as
9606 literals for a vector with a length unknown at compile time.
9607
9608 Example:
9609 """"""""
9610
9611 .. code-block:: text
9612
9613       <result> = shufflevector <4 x i32> %v1, <4 x i32> %v2,
9614                               <4 x i32> <i32 0, i32 4, i32 1, i32 5>  ; yields <4 x i32>
9615       <result> = shufflevector <4 x i32> %v1, <4 x i32> undef,
9616                               <4 x i32> <i32 0, i32 1, i32 2, i32 3>  ; yields <4 x i32> - Identity shuffle.
9617       <result> = shufflevector <8 x i32> %v1, <8 x i32> undef,
9618                               <4 x i32> <i32 0, i32 1, i32 2, i32 3>  ; yields <4 x i32>
9619       <result> = shufflevector <4 x i32> %v1, <4 x i32> %v2,
9620                               <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7 >  ; yields <8 x i32>
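
Two further uses of the mask are sketched below: broadcasting one scalar
to every element with a ``zeroinitializer`` mask, and reversing the
element order. The function names are hypothetical.

.. code-block:: llvm

      define <4 x i32> @splat(i32 %x) {
        %ins = insertelement <4 x i32> undef, i32 %x, i32 0
        ; Every mask element is 0, so every result element copies element 0.
        %r = shufflevector <4 x i32> %ins, <4 x i32> undef,
                           <4 x i32> zeroinitializer
        ret <4 x i32> %r
      }

      define <4 x i32> @reverse(<4 x i32> %v) {
        ; Indices 0..3 select from %v; the second operand is never read.
        %r = shufflevector <4 x i32> %v, <4 x i32> undef,
                           <4 x i32> <i32 3, i32 2, i32 1, i32 0>
        ret <4 x i32> %r
      }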
9621
9622 Aggregate Operations
9623 --------------------
9624
9625 LLVM supports several instructions for working with
9626 :ref:`aggregate <t_aggregate>` values.
9627
9628 .. _i_extractvalue:
9629
9630 '``extractvalue``' Instruction
9631 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
9632
9633 Syntax:
9634 """""""
9635
9636 ::
9637
9638       <result> = extractvalue <aggregate type> <val>, <idx>{, <idx>}*
9639
9640 Overview:
9641 """""""""
9642
9643 The '``extractvalue``' instruction extracts the value of a member field
9644 from an :ref:`aggregate <t_aggregate>` value.
9645
9646 Arguments:
9647 """"""""""
9648
9649 The first operand of an '``extractvalue``' instruction is a value of
9650 :ref:`struct <t_struct>` or :ref:`array <t_array>` type. The other operands are
9651 constant indices to specify which value to extract in a similar manner
9652 as indices in a '``getelementptr``' instruction.
9653
9654 The major differences to ``getelementptr`` indexing are:
9655
9656 -  Since the value being indexed is not a pointer, the first index is
9657    omitted and assumed to be zero.
9658 -  At least one index must be specified.
9659 -  Not only struct indices but also array indices must be in bounds.
9660
9661 Semantics:
9662 """"""""""
9663
9664 The result is the value at the position in the aggregate specified by
9665 the index operands.
9666
9667 Example:
9668 """"""""
9669
9670 .. code-block:: text
9671
9672       <result> = extractvalue {i32, float} %agg, 0    ; yields i32
9673
9674 .. _i_insertvalue:
9675
9676 '``insertvalue``' Instruction
9677 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
9678
9679 Syntax:
9680 """""""
9681
9682 ::
9683
9684       <result> = insertvalue <aggregate type> <val>, <ty> <elt>, <idx>{, <idx>}*    ; yields <aggregate type>
9685
9686 Overview:
9687 """""""""
9688
9689 The '``insertvalue``' instruction inserts a value into a member field in
9690 an :ref:`aggregate <t_aggregate>` value.
9691
9692 Arguments:
9693 """"""""""
9694
The first operand of an '``insertvalue``' instruction is a value of
:ref:`struct <t_struct>` or :ref:`array <t_array>` type. The second operand is
a first-class value to insert. The following operands are constant
indices indicating the position at which to insert the value in a
similar manner as indices in an '``extractvalue``' instruction. The value
to insert must have the same type as the value identified by the
indices.
9702
9703 Semantics:
9704 """"""""""
9705
9706 The result is an aggregate of the same type as ``val``. Its value is
9707 that of ``val`` except that the value at the position specified by the
9708 indices is that of ``elt``.
9709
9710 Example:
9711 """"""""
9712
9713 .. code-block:: llvm
9714
9715       %agg1 = insertvalue {i32, float} undef, i32 1, 0              ; yields {i32 1, float undef}
9716       %agg2 = insertvalue {i32, float} %agg1, float %val, 1         ; yields {i32 1, float %val}
9717       %agg3 = insertvalue {i32, {float}} undef, float %val, 1, 0    ; yields {i32 undef, {float %val}}
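
The aggregate instructions can be combined to build and take apart
values without going through memory. A minimal sketch, using hypothetical
function names, that constructs a two-field struct and reads a nested
array element:

.. code-block:: llvm

      define { i32, i1 } @make_pair(i32 %v, i1 %ok) {
        %p0 = insertvalue { i32, i1 } undef, i32 %v, 0
        %p1 = insertvalue { i32, i1 } %p0, i1 %ok, 1
        ret { i32, i1 } %p1
      }

      define float @second_float({ i32, [2 x float] } %agg) {
        ; Index 1 selects the array field, the next index 1 its second element.
        %f = extractvalue { i32, [2 x float] } %agg, 1, 1
        ret float %f
      }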
9718
9719 .. _memoryops:
9720
9721 Memory Access and Addressing Operations
9722 ---------------------------------------
9723
9724 A key design point of an SSA-based representation is how it represents
9725 memory. In LLVM, no memory locations are in SSA form, which makes things
9726 very simple. This section describes how to read, write, and allocate
9727 memory in LLVM.
9728
9729 .. _i_alloca:
9730
9731 '``alloca``' Instruction
9732 ^^^^^^^^^^^^^^^^^^^^^^^^
9733
9734 Syntax:
9735 """""""
9736
9737 ::
9738
9739       <result> = alloca [inalloca] <type> [, <ty> <NumElements>] [, align <alignment>] [, addrspace(<num>)]     ; yields type addrspace(num)*:result
9740
9741 Overview:
9742 """""""""
9743
9744 The '``alloca``' instruction allocates memory on the stack frame of the
9745 currently executing function, to be automatically released when this
9746 function returns to its caller.  If the address space is not explicitly
9747 specified, the object is allocated in the alloca address space from the
9748 :ref:`datalayout string<langref_datalayout>`.
9749
9750 Arguments:
9751 """"""""""
9752
The '``alloca``' instruction allocates ``sizeof(<type>)*NumElements``
bytes of memory on the runtime stack, returning a pointer of the
appropriate type to the program. If "NumElements" is specified, it is
the number of elements allocated; otherwise "NumElements" defaults
to one. If a constant alignment is specified, the result of the
allocation is guaranteed to be aligned to at least that boundary. The
alignment may not be greater than ``1 << 32``. If not specified, or if
zero, the target can choose to align the allocation on any convenient
boundary compatible with the type.
9762
9763 '``type``' may be any sized type.
9764
9765 Semantics:
9766 """"""""""
9767
Memory is allocated; a pointer is returned. The allocated memory is
uninitialized, and loading from uninitialized memory produces an undefined
value. The operation itself is undefined if there is insufficient stack
space for the allocation. '``alloca``'d memory is automatically released
when the function returns (either with the ``ret`` or ``resume``
instructions). The '``alloca``' instruction is commonly used to represent
automatic variables that must have an address available. Allocating zero
bytes is legal, but the returned pointer may not be unique. The order in
which memory is allocated (i.e., which way the stack grows) is not
specified.
9778
9779 Note that '``alloca``' outside of the alloca address space from the
9780 :ref:`datalayout string<langref_datalayout>` is meaningful only if the
9781 target has assigned it a semantics.
9782
9783 If the returned pointer is used by :ref:`llvm.lifetime.start <int_lifestart>`,
9784 the returned object is initially dead.
9785 See :ref:`llvm.lifetime.start <int_lifestart>` and
9786 :ref:`llvm.lifetime.end <int_lifeend>` for the precise semantics of
9787 lifetime-manipulating intrinsics.
9788
9789 Example:
9790 """"""""
9791
9792 .. code-block:: llvm
9793
9794       %ptr = alloca i32                             ; yields i32*:ptr
9795       %ptr = alloca i32, i32 4                      ; yields i32*:ptr
9796       %ptr = alloca i32, i32 4, align 1024          ; yields i32*:ptr
9797       %ptr = alloca i32, align 1024                 ; yields i32*:ptr
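
The following sketch shows how a front end might use '``alloca``' for a
mutable automatic variable. The function name ``@abs_via_slot`` is
hypothetical, and the typed-pointer syntax (``i32*``) follows the rest of
this document. When the slot's address does not escape, as here, the
``mem2reg`` pass can promote it to SSA values.

.. code-block:: llvm

      define i32 @abs_via_slot(i32 %x) {
      entry:
        %slot = alloca i32, align 4            ; storage for one local i32
        store i32 %x, i32* %slot, align 4
        %neg = icmp slt i32 %x, 0
        br i1 %neg, label %flip, label %done

      flip:
        %n = sub i32 0, %x                     ; wraps for the minimum i32 value
        store i32 %n, i32* %slot, align 4
        br label %done

      done:
        %r = load i32, i32* %slot, align 4     ; reads whichever store ran last
        ret i32 %r
      }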
9798
9799 .. _i_load:
9800
9801 '``load``' Instruction
9802 ^^^^^^^^^^^^^^^^^^^^^^
9803
9804 Syntax:
9805 """""""
9806
9807 ::
9808
9809       <result> = load [volatile] <ty>, <ty>* <pointer>[, align <alignment>][, !nontemporal !<nontemp_node>][, !invariant.load !<empty_node>][, !invariant.group !<empty_node>][, !nonnull !<empty_node>][, !dereferenceable !<deref_bytes_node>][, !dereferenceable_or_null !<deref_bytes_node>][, !align !<align_node>][, !noundef !<empty_node>]
9810       <result> = load atomic [volatile] <ty>, <ty>* <pointer> [syncscope("<target-scope>")] <ordering>, align <alignment> [, !invariant.group !<empty_node>]
9811       !<nontemp_node> = !{ i32 1 }
9812       !<empty_node> = !{}
9813       !<deref_bytes_node> = !{ i64 <dereferenceable_bytes> }
9814       !<align_node> = !{ i64 <value_alignment> }
9815
9816 Overview:
9817 """""""""
9818
9819 The '``load``' instruction is used to read from memory.
9820
9821 Arguments:
9822 """"""""""
9823
9824 The argument to the ``load`` instruction specifies the memory address from which
9825 to load. The type specified must be a :ref:`first class <t_firstclass>` type of
9826 known size (i.e. not containing an :ref:`opaque structural type <t_opaque>`). If
9827 the ``load`` is marked as ``volatile``, then the optimizer is not allowed to
9828 modify the number or order of execution of this ``load`` with other
9829 :ref:`volatile operations <volatile>`.
9830
9831 If the ``load`` is marked as ``atomic``, it takes an extra :ref:`ordering
9832 <ordering>` and optional ``syncscope("<target-scope>")`` argument. The
9833 ``release`` and ``acq_rel`` orderings are not valid on ``load`` instructions.
9834 Atomic loads produce :ref:`defined <memmodel>` results when they may see
9835 multiple atomic stores. The type of the pointee must be an integer, pointer, or
9836 floating-point type whose bit width is a power of two greater than or equal to
9837 eight and less than or equal to a target-specific size limit.  ``align`` must be
9838 explicitly specified on atomic loads, and the load has undefined behavior if the
9839 alignment is not set to a value which is at least the size in bytes of the
9840 pointee. ``!nontemporal`` does not have any defined semantics for atomic loads.
9841
The optional constant ``align`` argument specifies the alignment of the
operation (that is, the alignment of the memory address). A value of 0
or an omitted ``align`` argument means that the operation has the ABI
alignment for the target. It is the responsibility of the code emitter
to ensure that the alignment information is correct. Overestimating the
alignment results in undefined behavior. Underestimating the alignment
may produce less efficient code. An alignment of 1 is always safe. The
maximum possible alignment is ``1 << 32``. An alignment value higher
than the size of the loaded type implies that memory up to the alignment
value in bytes can be safely loaded without trapping in the default
address space. Accessing the high bytes can interfere with debugging
tools, so they should not be accessed if the function has the
``sanitize_thread`` or ``sanitize_address`` attributes.
9855
9856 The optional ``!nontemporal`` metadata must reference a single
9857 metadata name ``<nontemp_node>`` corresponding to a metadata node with one
9858 ``i32`` entry of value 1. The existence of the ``!nontemporal``
9859 metadata on the instruction tells the optimizer and code generator
9860 that this load is not expected to be reused in the cache. The code
9861 generator may select special instructions to save cache bandwidth, such
9862 as the ``MOVNT`` instruction on x86.
9863
9864 The optional ``!invariant.load`` metadata must reference a single
9865 metadata name ``<empty_node>`` corresponding to a metadata node with no
9866 entries. If a load instruction tagged with the ``!invariant.load``
9867 metadata is executed, the memory location referenced by the load has
9868 to contain the same value at all points in the program where the
9869 memory location is dereferenceable; otherwise, the behavior is
9870 undefined.
9871
The optional ``!invariant.group`` metadata must reference a single metadata name
``<empty_node>`` corresponding to a metadata node with no entries.
See ``invariant.group`` metadata :ref:`invariant.group <md_invariant.group>`.
9875
9876 The optional ``!nonnull`` metadata must reference a single
9877 metadata name ``<empty_node>`` corresponding to a metadata node with no
9878 entries. The existence of the ``!nonnull`` metadata on the
9879 instruction tells the optimizer that the value loaded is known to
9880 never be null. If the value is null at runtime, the behavior is undefined.
9881 This is analogous to the ``nonnull`` attribute on parameters and return
9882 values. This metadata can only be applied to loads of a pointer type.
9883
9884 The optional ``!dereferenceable`` metadata must reference a single metadata
9885 name ``<deref_bytes_node>`` corresponding to a metadata node with one ``i64``
9886 entry.
9887 See ``dereferenceable`` metadata :ref:`dereferenceable <md_dereferenceable>`.
9888
9889 The optional ``!dereferenceable_or_null`` metadata must reference a single
9890 metadata name ``<deref_bytes_node>`` corresponding to a metadata node with one
9891 ``i64`` entry.
9892 See ``dereferenceable_or_null`` metadata :ref:`dereferenceable_or_null
9893 <md_dereferenceable_or_null>`.
9894
The optional ``!align`` metadata must reference a single metadata name
``<align_node>`` corresponding to a metadata node with one ``i64`` entry.
The existence of the ``!align`` metadata on the instruction tells the
optimizer that the value loaded is known to be aligned to a boundary specified
by the integer value in the metadata node. The alignment must be a power of 2.
This is analogous to the ``align`` attribute on parameters and return values.
This metadata can only be applied to loads of a pointer type. If the returned
value is not appropriately aligned at runtime, the behavior is undefined.
9903
9904 The optional ``!noundef`` metadata must reference a single metadata name
9905 ``<empty_node>`` corresponding to a node with no entries. The existence of
9906 ``!noundef`` metadata on the instruction tells the optimizer that the value
9907 loaded is known to be :ref:`well defined <welldefinedvalues>`.
9908 If the value isn't well defined, the behavior is undefined.
9909
9910 Semantics:
9911 """"""""""
9912
9913 The location of memory pointed to is loaded. If the value being loaded
9914 is of scalar type then the number of bytes read does not exceed the
9915 minimum number of bytes needed to hold all bits of the type. For
9916 example, loading an ``i24`` reads at most three bytes. When loading a
9917 value of a type like ``i20`` with a size that is not an integral number
9918 of bytes, the result is undefined if the value was not originally
9919 written using a store of the same type.
9920 If the value being loaded is of aggregate type, the bytes that correspond to
9921 padding may be accessed but are ignored, because it is impossible to observe
9922 padding from the loaded aggregate value.
9923 If ``<pointer>`` is not a well-defined value, the behavior is undefined.
9924
9925 Examples:
9926 """""""""
9927
9928 .. code-block:: llvm
9929
9930       %ptr = alloca i32                               ; yields i32*:ptr
9931       store i32 3, i32* %ptr                          ; yields void
9932       %val = load i32, i32* %ptr                      ; yields i32:val = i32 3
9933
9934 .. _i_store:
9935
9936 '``store``' Instruction
9937 ^^^^^^^^^^^^^^^^^^^^^^^
9938
9939 Syntax:
9940 """""""
9941
9942 ::
9943
9944       store [volatile] <ty> <value>, <ty>* <pointer>[, align <alignment>][, !nontemporal !<nontemp_node>][, !invariant.group !<empty_node>]        ; yields void
9945       store atomic [volatile] <ty> <value>, <ty>* <pointer> [syncscope("<target-scope>")] <ordering>, align <alignment> [, !invariant.group !<empty_node>] ; yields void
9946       !<nontemp_node> = !{ i32 1 }
9947       !<empty_node> = !{}
9948
9949 Overview:
9950 """""""""
9951
9952 The '``store``' instruction is used to write to memory.
9953
9954 Arguments:
9955 """"""""""
9956
9957 There are two arguments to the ``store`` instruction: a value to store and an
9958 address at which to store it. The type of the ``<pointer>`` operand must be a
9959 pointer to the :ref:`first class <t_firstclass>` type of the ``<value>``
9960 operand. If the ``store`` is marked as ``volatile``, then the optimizer is not
9961 allowed to modify the number or order of execution of this ``store`` with other
9962 :ref:`volatile operations <volatile>`.  Only values of :ref:`first class
9963 <t_firstclass>` types of known size (i.e. not containing an :ref:`opaque
9964 structural type <t_opaque>`) can be stored.
9965
9966 If the ``store`` is marked as ``atomic``, it takes an extra :ref:`ordering
9967 <ordering>` and optional ``syncscope("<target-scope>")`` argument. The
9968 ``acquire`` and ``acq_rel`` orderings aren't valid on ``store`` instructions.
9969 Atomic loads produce :ref:`defined <memmodel>` results when they may see
9970 multiple atomic stores. The type of the pointee must be an integer, pointer, or
9971 floating-point type whose bit width is a power of two greater than or equal to
9972 eight and less than or equal to a target-specific size limit.  ``align`` must be
9973 explicitly specified on atomic stores, and the store has undefined behavior if
9974 the alignment is not set to a value which is at least the size in bytes of the
9975 pointee. ``!nontemporal`` does not have any defined semantics for atomic stores.
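
The following sketch shows two atomic stores that satisfy the constraints
above. The function name ``@publish`` and its parameters are hypothetical and
serve only as an illustration:

.. code-block:: llvm

    define void @publish(i32* %flag, i64* %counter, i64 %n) {
      ; Release store of an i32: "align 4" is at least the size of the pointee.
      store atomic i32 1, i32* %flag release, align 4
      ; Volatile, sequentially consistent store of an i64.
      store atomic volatile i64 %n, i64* %counter seq_cst, align 8
      ret void
    }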
9976
9977 The optional constant ``align`` argument specifies the alignment of the
9978 operation (that is, the alignment of the memory address). A value of 0
9979 or an omitted ``align`` argument means that the operation has the ABI
9980 alignment for the target. It is the responsibility of the code emitter
9981 to ensure that the alignment information is correct. Overestimating the
9982 alignment results in undefined behavior. Underestimating the
9983 alignment may produce less efficient code. An alignment of 1 is always
9984 safe. The maximum possible alignment is ``1 << 32``. An alignment
9985 value higher than the size of the stored type implies memory up to the
9986 alignment value bytes can be stored to without trapping in the default
9987 address space. Storing to the higher bytes however may result in data
9988 races if another thread can access the same address. Introducing a
9989 data race is not allowed. Storing to the extra bytes is not allowed
9990 even in situations where a data race is known to not exist if the
9991 function has the ``sanitize_address`` attribute.
9992
9993 The optional ``!nontemporal`` metadata must reference a single metadata
9994 name ``<nontemp_node>`` corresponding to a metadata node with one ``i32`` entry
9995 of value 1. The existence of the ``!nontemporal`` metadata on the instruction
tells the optimizer and code generator that this store is not expected to
9997 be reused in the cache. The code generator may select special
9998 instructions to save cache bandwidth, such as the ``MOVNT`` instruction on
9999 x86.
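
A minimal sketch of a non-temporal store follows. The function name
``@stream`` is an assumption made for illustration, and whether a special
instruction is actually selected depends on the target:

.. code-block:: llvm

    define void @stream(<4 x float>* %dst, <4 x float> %v) {
      ; Hint that the stored data is not expected to be reused in the cache.
      store <4 x float> %v, <4 x float>* %dst, align 16, !nontemporal !0
      ret void
    }

    !0 = !{ i32 1 }    ; <nontemp_node>: one i32 entry of value 1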
10000
10001 The optional ``!invariant.group`` metadata must reference a
10002 single metadata name ``<empty_node>``. See ``invariant.group`` metadata.
10003
10004 Semantics:
10005 """"""""""
10006
10007 The contents of memory are updated to contain ``<value>`` at the
10008 location specified by the ``<pointer>`` operand. If ``<value>`` is
10009 of scalar type then the number of bytes written does not exceed the
10010 minimum number of bytes needed to hold all bits of the type. For
10011 example, storing an ``i24`` writes at most three bytes. When writing a
10012 value of a type like ``i20`` with a size that is not an integral number
10013 of bytes, it is unspecified what happens to the extra bits that do not
10014 belong to the type, but they will typically be overwritten.
10015 If ``<value>`` is of aggregate type, padding is filled with
10016 :ref:`undef <undefvalues>`.
10017 If ``<pointer>`` is not a well-defined value, the behavior is undefined.
10018
10019 Example:
10020 """"""""
10021
10022 .. code-block:: llvm
10023
10024       %ptr = alloca i32                               ; yields i32*:ptr
10025       store i32 3, i32* %ptr                          ; yields void
10026       %val = load i32, i32* %ptr                      ; yields i32:val = i32 3
10027
10028 .. _i_fence:
10029
10030 '``fence``' Instruction
10031 ^^^^^^^^^^^^^^^^^^^^^^^
10032
10033 Syntax:
10034 """""""
10035
10036 ::
10037
10038       fence [syncscope("<target-scope>")] <ordering>  ; yields void
10039
10040 Overview:
10041 """""""""
10042
10043 The '``fence``' instruction is used to introduce happens-before edges
10044 between operations.
10045
10046 Arguments:
10047 """"""""""
10048
10049 '``fence``' instructions take an :ref:`ordering <ordering>` argument which
10050 defines what *synchronizes-with* edges they add. They can only be given
10051 ``acquire``, ``release``, ``acq_rel``, and ``seq_cst`` orderings.
10052
10053 Semantics:
10054 """"""""""
10055
10056 A fence A which has (at least) ``release`` ordering semantics
10057 *synchronizes with* a fence B with (at least) ``acquire`` ordering
10058 semantics if and only if there exist atomic operations X and Y, both
10059 operating on some atomic object M, such that A is sequenced before X, X
10060 modifies M (either directly or through some side effect of a sequence
10061 headed by X), Y is sequenced before B, and Y observes M. This provides a
10062 *happens-before* dependency between A and B. Rather than an explicit
10063 ``fence``, one (but not both) of the atomic operations X or Y might
10064 provide a ``release`` or ``acquire`` (resp.) ordering constraint and
10065 still *synchronize-with* the explicit ``fence`` and establish the
10066 *happens-before* edge.
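
The pairing described above can be sketched as follows. The globals ``@data``
and ``@flag`` and both function names are hypothetical; ``@flag`` plays the
role of the atomic object M, and the comments label the fences A and B and the
atomic operations X and Y:

.. code-block:: llvm

    @data = global i32 0
    @flag = global i32 0

    define void @producer() {
      store i32 42, i32* @data, align 4
      fence release                                        ; fence A
      store atomic i32 1, i32* @flag monotonic, align 4    ; X: modifies M
      ret void
    }

    define i32 @consumer() {
    entry:
      br label %spin

    spin:
      %f = load atomic i32, i32* @flag monotonic, align 4  ; Y: observes M
      %ready = icmp eq i32 %f, 1
      br i1 %ready, label %done, label %spin

    done:
      fence acquire                                        ; fence B
      ; A synchronizes with B, so the store of 42 happens before this load.
      %v = load i32, i32* @data, align 4
      ret i32 %v
    }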
10067
10068 A ``fence`` which has ``seq_cst`` ordering, in addition to having both
10069 ``acquire`` and ``release`` semantics specified above, participates in
10070 the global program order of other ``seq_cst`` operations and/or fences.
10071
10072 A ``fence`` instruction can also take an optional
10073 ":ref:`syncscope <syncscope>`" argument.
10074
10075 Example:
10076 """"""""
10077
10078 .. code-block:: text
10079
10080       fence acquire                                        ; yields void
10081       fence syncscope("singlethread") seq_cst              ; yields void
10082       fence syncscope("agent") seq_cst                     ; yields void
10083
10084 .. _i_cmpxchg:
10085
10086 '``cmpxchg``' Instruction
10087 ^^^^^^^^^^^^^^^^^^^^^^^^^
10088
10089 Syntax:
10090 """""""
10091
10092 ::
10093
10094       cmpxchg [weak] [volatile] <ty>* <pointer>, <ty> <cmp>, <ty> <new> [syncscope("<target-scope>")] <success ordering> <failure ordering>[, align <alignment>] ; yields  { ty, i1 }
10095
10096 Overview:
10097 """""""""
10098
10099 The '``cmpxchg``' instruction is used to atomically modify memory. It
10100 loads a value in memory and compares it to a given value. If they are
10101 equal, it tries to store a new value into the memory.
10102
10103 Arguments:
10104 """"""""""
10105
10106 There are three arguments to the '``cmpxchg``' instruction: an address
to operate on, a value to compare to the value currently at that
10108 address, and a new value to place at that address if the compared values
10109 are equal. The type of '<cmp>' must be an integer or pointer type whose
10110 bit width is a power of two greater than or equal to eight and less
10111 than or equal to a target-specific size limit. '<cmp>' and '<new>' must
10112 have the same type, and the type of '<pointer>' must be a pointer to
10113 that type. If the ``cmpxchg`` is marked as ``volatile``, then the
10114 optimizer is not allowed to modify the number or order of execution of
10115 this ``cmpxchg`` with other :ref:`volatile operations <volatile>`.
10116
10117 The success and failure :ref:`ordering <ordering>` arguments specify how this
10118 ``cmpxchg`` synchronizes with other atomic operations. Both ordering parameters
must be at least ``monotonic``; the failure ordering cannot be either
10120 ``release`` or ``acq_rel``.
10121
10122 A ``cmpxchg`` instruction can also take an optional
10123 ":ref:`syncscope <syncscope>`" argument.
10124
The instruction can take an optional ``align`` attribute.
The alignment must be a power of two greater than or equal to the size of the
``<cmp>`` type. If unspecified, the alignment is assumed to be equal to the
size of the ``<cmp>`` type. Note that this default alignment assumption is
different from the alignment used for the load/store instructions when
``align`` isn't specified.
10131
10132 The pointer passed into cmpxchg must have alignment greater than or
10133 equal to the size in memory of the operand.
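
To illustrate the ordering and alignment rules, the following sketch lists a
few accepted forms. The function name ``@orderings`` and the value names are
illustrative assumptions, and the commented-out line shows a form that the
rules above rule out:

.. code-block:: llvm

    define void @orderings(i32* %p, i32 %old, i32 %new) {
      ; Weakest permitted pair: both orderings are monotonic.
      %r0 = cmpxchg i32* %p, i32 %old, i32 %new monotonic monotonic
      ; acq_rel on success, acquire on failure.
      %r1 = cmpxchg i32* %p, i32 %old, i32 %new acq_rel acquire
      ; weak and volatile, with an explicit alignment equal to the i32 size.
      %r2 = cmpxchg weak volatile i32* %p, i32 %old, i32 %new seq_cst seq_cst, align 4
      ; Not allowed: the failure ordering cannot be release or acq_rel.
      ; %bad = cmpxchg i32* %p, i32 %old, i32 %new seq_cst release
      ret void
    }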
10134
10135 Semantics:
10136 """"""""""
10137
10138 The contents of memory at the location specified by the '``<pointer>``' operand
10139 is read and compared to '``<cmp>``'; if the values are equal, '``<new>``' is
10140 written to the location. The original value at the location is returned,
10141 together with a flag indicating success (true) or failure (false).
10142
10143 If the cmpxchg operation is marked as ``weak`` then a spurious failure is
10144 permitted: the operation may not write ``<new>`` even if the comparison
10145 matched.
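
Because a weak ``cmpxchg`` may fail spuriously, callers typically decide
whether to retry from the returned ``i1`` flag rather than by comparing the
loaded value against ``<cmp>``. A minimal sketch, with the function name
``@atomic_double`` and all value names chosen for illustration:

.. code-block:: llvm

    define i32 @atomic_double(i32* %ptr) {
    entry:
      %init = load atomic i32, i32* %ptr monotonic, align 4
      br label %retry

    retry:
      %expected = phi i32 [ %init, %entry ], [ %seen, %retry ]
      %desired = shl i32 %expected, 1
      %pair = cmpxchg weak i32* %ptr, i32 %expected, i32 %desired seq_cst monotonic
      %seen = extractvalue { i32, i1 } %pair, 0
      %ok = extractvalue { i32, i1 } %pair, 1
      ; Retry on any failure, spurious or not, using the value just observed.
      br i1 %ok, label %done, label %retry

    done:
      ret i32 %expected    ; the value that was replaced
    }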
10146
10147 If the cmpxchg operation is strong (the default), the i1 value is 1 if and only
if the value loaded equals ``<cmp>``.
10149
10150 A successful ``cmpxchg`` is a read-modify-write instruction for the purpose of
10151 identifying release sequences. A failed ``cmpxchg`` is equivalent to an atomic
load with an ordering parameter determined by the second ordering parameter.
10153
10154 Example:
10155 """"""""
10156
10157 .. code-block:: llvm
10158
10159     entry:
10160       %orig = load atomic i32, i32* %ptr unordered, align 4                      ; yields i32
10161       br label %loop
10162
10163     loop:
10164       %cmp = phi i32 [ %orig, %entry ], [%value_loaded, %loop]
10165       %squared = mul i32 %cmp, %cmp
10166       %val_success = cmpxchg i32* %ptr, i32 %cmp, i32 %squared acq_rel monotonic ; yields  { i32, i1 }
10167       %value_loaded = extractvalue { i32, i1 } %val_success, 0
10168       %success = extractvalue { i32, i1 } %val_success, 1
10169       br i1 %success, label %done, label %loop
10170
10171     done:
10172       ...
10173
10174 .. _i_atomicrmw:
10175
10176 '``atomicrmw``' Instruction
10177 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
10178
10179 Syntax:
10180 """""""
10181
10182 ::
10183
10184       atomicrmw [volatile] <operation> <ty>* <pointer>, <ty> <value> [syncscope("<target-scope>")] <ordering>[, align <alignment>]  ; yields ty
10185
10186 Overview:
10187 """""""""
10188
10189 The '``atomicrmw``' instruction is used to atomically modify memory.
10190
10191 Arguments:
10192 """"""""""
10193
There are three arguments to the '``atomicrmw``' instruction: an
operation to apply, an address whose value to modify, and an argument to the
operation. The operation must be one of the following keywords:
10197
10198 -  xchg
10199 -  add
10200 -  sub
10201 -  and
10202 -  nand
10203 -  or
10204 -  xor
10205 -  max
10206 -  min
10207 -  umax
10208 -  umin
10209 -  fadd
10210 -  fsub
10211
10212 For most of these operations, the type of '<value>' must be an integer
10213 type whose bit width is a power of two greater than or equal to eight
10214 and less than or equal to a target-specific size limit. For xchg, this
10215 may also be a floating point type with the same size constraints as
10216 integers.  For fadd/fsub, this must be a floating point type.  The
10217 type of the '``<pointer>``' operand must be a pointer to that type. If
10218 the ``atomicrmw`` is marked as ``volatile``, then the optimizer is not
10219 allowed to modify the number or order of execution of this
10220 ``atomicrmw`` with other :ref:`volatile operations <volatile>`.
10221
The instruction can take an optional ``align`` attribute.
The alignment must be a power of two greater than or equal to the size of the
``<value>`` type. If unspecified, the alignment is assumed to be equal to the
size of the ``<value>`` type. Note that this default alignment assumption is
different from the alignment used for the load/store instructions when
``align`` isn't specified.
10228
An ``atomicrmw`` instruction can also take an optional
10230 ":ref:`syncscope <syncscope>`" argument.
10231
10232 Semantics:
10233 """"""""""
10234
10235 The contents of memory at the location specified by the '``<pointer>``'
10236 operand are atomically read, modified, and written back. The original
10237 value at the location is returned. The modification is specified by the
10238 operation argument:
10239
10240 -  xchg: ``*ptr = val``
10241 -  add: ``*ptr = *ptr + val``
10242 -  sub: ``*ptr = *ptr - val``
10243 -  and: ``*ptr = *ptr & val``
10244 -  nand: ``*ptr = ~(*ptr & val)``
10245 -  or: ``*ptr = *ptr | val``
10246 -  xor: ``*ptr = *ptr ^ val``
10247 -  max: ``*ptr = *ptr > val ? *ptr : val`` (using a signed comparison)
10248 -  min: ``*ptr = *ptr < val ? *ptr : val`` (using a signed comparison)
10249 -  umax: ``*ptr = *ptr > val ? *ptr : val`` (using an unsigned comparison)
10250 -  umin: ``*ptr = *ptr < val ? *ptr : val`` (using an unsigned comparison)
-  fadd: ``*ptr = *ptr + val`` (using floating point arithmetic)
-  fsub: ``*ptr = *ptr - val`` (using floating point arithmetic)
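
To make the ``nand`` formula and the signed versus unsigned comparisons
concrete, the following sketch traces three operations. It assumes, purely for
illustration, that ``*ptr`` holds 12 on entry and that no other thread modifies
it; the function name ``@rmw_demo`` is hypothetical:

.. code-block:: llvm

    define i32 @rmw_demo(i32* %ptr) {
      ; *ptr = ~(12 & 10) = ~8 = -9; %a is the old value, 12.
      %a = atomicrmw nand i32* %ptr, i32 10 seq_cst
      ; Unsigned compare: -9 reads as 4294967287 > 5, so *ptr stays -9; %b = -9.
      %b = atomicrmw umax i32* %ptr, i32 5 seq_cst
      ; Signed compare: -9 < 5, so *ptr becomes 5; %c is the old value, -9.
      %c = atomicrmw max i32* %ptr, i32 5 seq_cst
      ret i32 %c
    }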
10253
10254 Example:
10255 """"""""
10256
10257 .. code-block:: llvm
10258
10259       %old = atomicrmw add i32* %ptr, i32 1 acquire                        ; yields i32
10260
10261 .. _i_getelementptr:
10262
10263 '``getelementptr``' Instruction
10264 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10265
10266 Syntax:
10267 """""""
10268
10269 ::
10270
10271       <result> = getelementptr <ty>, <ty>* <ptrval>{, [inrange] <ty> <idx>}*
10272       <result> = getelementptr inbounds <ty>, <ty>* <ptrval>{, [inrange] <ty> <idx>}*
10273       <result> = getelementptr <ty>, <ptr vector> <ptrval>, [inrange] <vector index type> <idx>
10274
10275 Overview:
10276 """""""""
10277
10278 The '``getelementptr``' instruction is used to get the address of a
10279 subelement of an :ref:`aggregate <t_aggregate>` data structure. It performs
10280 address calculation only and does not access memory. The instruction can also
10281 be used to calculate a vector of such addresses.
10282
10283 Arguments:
10284 """"""""""
10285
10286 The first argument is always a type used as the basis for the calculations.
10287 The second argument is always a pointer or a vector of pointers, and is the
10288 base address to start from. The remaining arguments are indices
10289 that indicate which of the elements of the aggregate object are indexed.
10290 The interpretation of each index is dependent on the type being indexed
10291 into. The first index always indexes the pointer value given as the
10292 second argument, the second index indexes a value of the type pointed to
10293 (not necessarily the value directly pointed to, since the first index
10294 can be non-zero), etc. The first type indexed into must be a pointer
10295 value, subsequent types can be arrays, vectors, and structs. Note that
10296 subsequent types being indexed into can never be pointers, since that
10297 would require loading the pointer before continuing calculation.
10298
10299 The type of each index argument depends on the type it is indexing into.
When indexing into an (optionally packed) structure, only ``i32`` integer
10301 **constants** are allowed (when using a vector of indices they must all
10302 be the **same** ``i32`` integer constant). When indexing into an array,
10303 pointer or vector, integers of any width are allowed, and they are not
10304 required to be constant. These integers are treated as signed values
10305 where relevant.
10306
10307 For example, let's consider a C code fragment and how it gets compiled
10308 to LLVM:
10309
10310 .. code-block:: c
10311
10312     struct RT {
10313       char A;
10314       int B[10][20];
10315       char C;
10316     };
10317     struct ST {
10318       int X;
10319       double Y;
10320       struct RT Z;
10321     };
10322
10323     int *foo(struct ST *s) {
10324       return &s[1].Z.B[5][13];
10325     }
10326
10327 The LLVM code generated by Clang is:
10328
10329 .. code-block:: llvm
10330
10331     %struct.RT = type { i8, [10 x [20 x i32]], i8 }
10332     %struct.ST = type { i32, double, %struct.RT }
10333
10334     define i32* @foo(%struct.ST* %s) nounwind uwtable readnone optsize ssp {
10335     entry:
10336       %arrayidx = getelementptr inbounds %struct.ST, %struct.ST* %s, i64 1, i32 2, i32 1, i64 5, i64 13
10337       ret i32* %arrayidx
10338     }
10339
10340 Semantics:
10341 """"""""""
10342
10343 In the example above, the first index is indexing into the
10344 '``%struct.ST*``' type, which is a pointer, yielding a '``%struct.ST``'
10345 = '``{ i32, double, %struct.RT }``' type, a structure. The second index
10346 indexes into the third element of the structure, yielding a
10347 '``%struct.RT``' = '``{ i8 , [10 x [20 x i32]], i8 }``' type, another
10348 structure. The third index indexes into the second element of the
10349 structure, yielding a '``[10 x [20 x i32]]``' type, an array. The two
10350 dimensions of the array are subscripted into, yielding an '``i32``'
10351 type. The '``getelementptr``' instruction returns a pointer to this
10352 element, thus computing a value of '``i32*``' type.
10353
10354 Note that it is perfectly legal to index partially through a structure,
10355 returning a pointer to an inner element. Because of this, the LLVM code
10356 for the given testcase is equivalent to:
10357
10358 .. code-block:: llvm
10359
10360     define i32* @foo(%struct.ST* %s) {
10361       %t1 = getelementptr %struct.ST, %struct.ST* %s, i32 1                        ; yields %struct.ST*:%t1
10362       %t2 = getelementptr %struct.ST, %struct.ST* %t1, i32 0, i32 2                ; yields %struct.RT*:%t2
10363       %t3 = getelementptr %struct.RT, %struct.RT* %t2, i32 0, i32 1                ; yields [10 x [20 x i32]]*:%t3
10364       %t4 = getelementptr [10 x [20 x i32]], [10 x [20 x i32]]* %t3, i32 0, i32 5  ; yields [20 x i32]*:%t4
10365       %t5 = getelementptr [20 x i32], [20 x i32]* %t4, i32 0, i32 13               ; yields i32*:%t5
10366       ret i32* %t5
10367     }
10368
10369 If the ``inbounds`` keyword is present, the result value of the
10370 ``getelementptr`` is a :ref:`poison value <poisonvalues>` if one of the
10371 following rules is violated:
10372
10373 *  The base pointer has an *in bounds* address of an allocated object, which
10374    means that it points into an allocated object, or to its end. The only
10375    *in bounds* address for a null pointer in the default address-space is the
10376    null pointer itself.
10377 *  If the type of an index is larger than the pointer index type, the
10378    truncation to the pointer index type preserves the signed value.
10379 *  The multiplication of an index by the type size does not wrap the pointer
10380    index type in a signed sense (``nsw``).
10381 *  The successive addition of offsets (without adding the base address) does
10382    not wrap the pointer index type in a signed sense (``nsw``).
10383 *  The successive addition of the current address, interpreted as an unsigned
10384    number, and an offset, interpreted as a signed number, does not wrap the
10385    unsigned address space and remains *in bounds* of the allocated object.
10386    As a corollary, if the added offset is non-negative, the addition does not
10387    wrap in an unsigned sense (``nuw``).
10388 *  In cases where the base is a vector of pointers, the ``inbounds`` keyword
10389    applies to each of the computations element-wise.
10390
10391 These rules are based on the assumption that no allocated object may cross
10392 the unsigned address space boundary, and no allocated object may be larger
10393 than half the pointer index type space.
10394
10395 If the ``inbounds`` keyword is not present, the offsets are added to the
10396 base address with silently-wrapping two's complement arithmetic. If the
10397 offsets have a different width from the pointer, they are sign-extended
10398 or truncated to the width of the pointer. The result value of the
10399 ``getelementptr`` may be outside the object pointed to by the base
10400 pointer. The result value may not necessarily be used to access memory
10401 though, even if it happens to point into allocated storage. See the
10402 :ref:`Pointer Aliasing Rules <pointeraliasing>` section for more
10403 information.
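
The difference between the two forms can be sketched on a small stack object.
The function and value names below are assumptions made for illustration:

.. code-block:: llvm

    define void @bounds_demo() {
      %a = alloca [4 x i32]
      ; Offset 16 bytes: one past the end of the 16-byte object, still in bounds.
      %end = getelementptr inbounds [4 x i32], [4 x i32]* %a, i64 0, i64 4
      ; Offset 20 bytes: beyond one past the end, so the result is poison.
      %oob = getelementptr inbounds [4 x i32], [4 x i32]* %a, i64 0, i64 5
      ; Same arithmetic without inbounds: not poison, but the result may not
      ; be used to access memory.
      %raw = getelementptr [4 x i32], [4 x i32]* %a, i64 0, i64 5
      ret void
    }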
10404
10405 If the ``inrange`` keyword is present before any index, loading from or
10406 storing to any pointer derived from the ``getelementptr`` has undefined
10407 behavior if the load or store would access memory outside of the bounds of
10408 the element selected by the index marked as ``inrange``. The result of a
10409 pointer comparison or ``ptrtoint`` (including ``ptrtoint``-like operations
10410 involving memory) involving a pointer derived from a ``getelementptr`` with
10411 the ``inrange`` keyword is undefined, with the exception of comparisons
10412 in the case where both operands are in the range of the element selected
10413 by the ``inrange`` keyword, inclusive of the address one past the end of
10414 that element. Note that the ``inrange`` keyword is currently only allowed
10415 in constant ``getelementptr`` expressions.
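
As a sketch of the constant-expression form (the globals ``@table`` and
``@slot`` are hypothetical), ``inrange`` below marks the index that selects the
array field, so loads and stores through pointers derived from ``@slot`` are
expected to stay within that ``[3 x i8*]`` array:

.. code-block:: llvm

    @table = constant { [3 x i8*] } zeroinitializer

    ; Address of element 2 of the array in field 0 of @table.
    @slot = constant i8** getelementptr inbounds ({ [3 x i8*] }, { [3 x i8*] }* @table, i32 0, inrange i32 0, i32 2)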
10416
10417 The getelementptr instruction is often confusing. For some more insight
10418 into how it works, see :doc:`the getelementptr FAQ <GetElementPtr>`.
10419
10420 Example:
10421 """"""""
10422
10423 .. code-block:: llvm
10424
10425         ; yields [12 x i8]*:aptr
10426         %aptr = getelementptr {i32, [12 x i8]}, {i32, [12 x i8]}* %saptr, i64 0, i32 1
10427         ; yields i8*:vptr
10428         %vptr = getelementptr {i32, <2 x i8>}, {i32, <2 x i8>}* %svptr, i64 0, i32 1, i32 1
10429         ; yields i8*:eptr
10430         %eptr = getelementptr [12 x i8], [12 x i8]* %aptr, i64 0, i32 1
10431         ; yields i32*:iptr
10432         %iptr = getelementptr [10 x i32], [10 x i32]* @arr, i16 0, i16 0
10433
10434 Vector of pointers:
10435 """""""""""""""""""
10436
10437 The ``getelementptr`` returns a vector of pointers, instead of a single address,
10438 when one or more of its arguments is a vector. In such cases, all vector
10439 arguments should have the same number of elements, and every scalar argument
10440 will be effectively broadcast into a vector during address calculation.
10441
10442 .. code-block:: llvm
10443
10444      ; All arguments are vectors:
10445      ;   A[i] = ptrs[i] + offsets[i]*sizeof(i8)
10446      %A = getelementptr i8, <4 x i8*> %ptrs, <4 x i64> %offsets
10447
10448      ; Add the same scalar offset to each pointer of a vector:
10449      ;   A[i] = ptrs[i] + offset*sizeof(i8)
10450      %A = getelementptr i8, <4 x i8*> %ptrs, i64 %offset
10451
10452      ; Add distinct offsets to the same pointer:
10453      ;   A[i] = ptr + offsets[i]*sizeof(i8)
10454      %A = getelementptr i8, i8* %ptr, <4 x i64> %offsets
10455
10456      ; In all cases described above the type of the result is <4 x i8*>
10457
10458 The two following instructions are equivalent:
10459
10460 .. code-block:: llvm
10461
10462      getelementptr  %struct.ST, <4 x %struct.ST*> %s, <4 x i64> %ind1,
10463        <4 x i32> <i32 2, i32 2, i32 2, i32 2>,
10464        <4 x i32> <i32 1, i32 1, i32 1, i32 1>,
10465        <4 x i32> %ind4,
10466        <4 x i64> <i64 13, i64 13, i64 13, i64 13>
10467
10468      getelementptr  %struct.ST, <4 x %struct.ST*> %s, <4 x i64> %ind1,
10469        i32 2, i32 1, <4 x i32> %ind4, i64 13
10470
The following C loop is a case where the vector version of ``getelementptr``
makes sense:
10473
10474 .. code-block:: c
10475
10476     // Let's assume that we vectorize the following loop:
10477     double *A, *B; int *C;
10478     for (int i = 0; i < size; ++i) {
10479       A[i] = B[C[i]];
10480     }
10481
10482 .. code-block:: llvm
10483
10484     ; get pointers for 8 elements from array B
10485     %ptrs = getelementptr double, double* %B, <8 x i32> %C
10486     ; load 8 elements from array B into A
10487     %A = call <8 x double> @llvm.masked.gather.v8f64.v8p0f64(<8 x double*> %ptrs,
10488          i32 8, <8 x i1> %mask, <8 x double> %passthru)
10489
10490 Conversion Operations
10491 ---------------------
10492
10493 The instructions in this category are the conversion instructions
10494 (casting) which all take a single operand and a type. They perform
10495 various bit conversions on the operand.
10496
10497 .. _i_trunc:
10498
10499 '``trunc .. to``' Instruction
10500 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10501
10502 Syntax:
10503 """""""
10504
10505 ::
10506
10507       <result> = trunc <ty> <value> to <ty2>             ; yields ty2
10508
10509 Overview:
10510 """""""""
10511
10512 The '``trunc``' instruction truncates its operand to the type ``ty2``.
10513
10514 Arguments:
10515 """"""""""
10516
The '``trunc``' instruction takes a value to truncate and a type to truncate
it to. Both types must be :ref:`integer <t_integer>` types, or vectors
of the same number of integers. The bit size of the ``value`` must be
larger than the bit size of the destination type, ``ty2``. Equal sized
types are not allowed.
10522
10523 Semantics:
10524 """"""""""
10525
10526 The '``trunc``' instruction truncates the high order bits in ``value``
10527 and converts the remaining bits to ``ty2``. Since the source size must
10528 be larger than the destination size, ``trunc`` cannot be a *no-op cast*.
10529 It will always truncate bits.
10530
10531 Example:
10532 """"""""
10533
10534 .. code-block:: llvm
10535
10536       %X = trunc i32 257 to i8                        ; yields i8:1
10537       %Y = trunc i32 123 to i1                        ; yields i1:true
10538       %Z = trunc i32 122 to i1                        ; yields i1:false
10539       %W = trunc <2 x i16> <i16 8, i16 7> to <2 x i8> ; yields <i8 8, i8 7>
10540
10541 .. _i_zext:
10542
10543 '``zext .. to``' Instruction
10544 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10545
10546 Syntax:
10547 """""""
10548
10549 ::
10550
10551       <result> = zext <ty> <value> to <ty2>             ; yields ty2
10552
10553 Overview:
10554 """""""""
10555
10556 The '``zext``' instruction zero extends its operand to type ``ty2``.
10557
10558 Arguments:
10559 """"""""""
10560
10561 The '``zext``' instruction takes a value to cast, and a type to cast it
to. Both types must be :ref:`integer <t_integer>` types, or vectors of
10563 the same number of integers. The bit size of the ``value`` must be
10564 smaller than the bit size of the destination type, ``ty2``.
10565
10566 Semantics:
10567 """"""""""
10568
10569 The ``zext`` fills the high order bits of the ``value`` with zero bits
10570 until it reaches the size of the destination type, ``ty2``.
10571
10572 When zero extending from i1, the result will always be either 0 or 1.
10573
10574 Example:
10575 """"""""
10576
10577 .. code-block:: llvm
10578
10579       %X = zext i32 257 to i64              ; yields i64:257
10580       %Y = zext i1 true to i32              ; yields i32:1
10581       %Z = zext <2 x i16> <i16 8, i16 7> to <2 x i32> ; yields <i32 8, i32 7>
10582
10583 .. _i_sext:
10584
10585 '``sext .. to``' Instruction
10586 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10587
10588 Syntax:
10589 """""""
10590
10591 ::
10592
10593       <result> = sext <ty> <value> to <ty2>             ; yields ty2
10594
10595 Overview:
10596 """""""""
10597
10598 The '``sext``' sign extends ``value`` to the type ``ty2``.
10599
10600 Arguments:
10601 """"""""""
10602
10603 The '``sext``' instruction takes a value to cast, and a type to cast it
to. Both types must be :ref:`integer <t_integer>` types, or vectors of
10605 the same number of integers. The bit size of the ``value`` must be
10606 smaller than the bit size of the destination type, ``ty2``.
10607
10608 Semantics:
10609 """"""""""
10610
10611 The '``sext``' instruction performs a sign extension by copying the sign
10612 bit (highest order bit) of the ``value`` until it reaches the bit size
10613 of the type ``ty2``.
10614
10615 When sign extending from i1, the extension always results in -1 or 0.
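
The contrast with ``zext`` is easiest to see when the same source bits are
extended both ways. A minimal sketch, with an illustrative function name:

.. code-block:: llvm

    define i16 @extend_demo() {
      %z  = zext i8 -1 to i16        ; yields i16:255 (bit pattern 0x00FF)
      %s  = sext i8 -1 to i16        ; yields i16:-1  (bit pattern 0xFFFF)
      %zb = zext i1 true to i16      ; yields i16:1
      %sb = sext i1 true to i16      ; yields i16:-1
      %sum = add i16 %z, %s          ; 255 + (-1) = 254
      ret i16 %sum
    }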
10616
10617 Example:
10618 """"""""
10619
10620 .. code-block:: llvm
10621
      %X = sext i8 -1 to i16               ; yields i16:65535
10623       %Y = sext i1 true to i32             ; yields i32:-1
10624       %Z = sext <2 x i16> <i16 8, i16 7> to <2 x i32> ; yields <i32 8, i32 7>
10625
10626 '``fptrunc .. to``' Instruction
10627 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10628
10629 Syntax:
10630 """""""
10631
10632 ::
10633
10634       <result> = fptrunc <ty> <value> to <ty2>             ; yields ty2
10635
10636 Overview:
10637 """""""""
10638
10639 The '``fptrunc``' instruction truncates ``value`` to type ``ty2``.
10640
10641 Arguments:
10642 """"""""""
10643
10644 The '``fptrunc``' instruction takes a :ref:`floating-point <t_floating>`
10645 value to cast and a :ref:`floating-point <t_floating>` type to cast it to.
10646 The size of ``value`` must be larger than the size of ``ty2``. This
10647 implies that ``fptrunc`` cannot be used to make a *no-op cast*.
10648
10649 Semantics:
10650 """"""""""
10651
10652 The '``fptrunc``' instruction casts a ``value`` from a larger
10653 :ref:`floating-point <t_floating>` type to a smaller :ref:`floating-point
10654 <t_floating>` type.
10655 This instruction is assumed to execute in the default :ref:`floating-point
10656 environment <floatenv>`.
10657
10658 Example:
10659 """"""""
10660
10661 .. code-block:: llvm
10662
10663       %X = fptrunc double 16777217.0 to float    ; yields float:16777216.0
10664       %Y = fptrunc double 1.0E+300 to half       ; yields half:+infinity
10665
10666 '``fpext .. to``' Instruction
10667 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10668
10669 Syntax:
10670 """""""
10671
10672 ::
10673
10674       <result> = fpext <ty> <value> to <ty2>             ; yields ty2
10675
10676 Overview:
10677 """""""""
10678
10679 The '``fpext``' extends a floating-point ``value`` to a larger floating-point
10680 value.
10681
10682 Arguments:
10683 """"""""""
10684
10685 The '``fpext``' instruction takes a :ref:`floating-point <t_floating>`
10686 ``value`` to cast, and a :ref:`floating-point <t_floating>` type to cast it
10687 to. The source type must be smaller than the destination type.
10688
10689 Semantics:
10690 """"""""""
10691
10692 The '``fpext``' instruction extends the ``value`` from a smaller
10693 :ref:`floating-point <t_floating>` type to a larger :ref:`floating-point
10694 <t_floating>` type. The ``fpext`` cannot be used to make a
10695 *no-op cast* because it always changes bits. Use ``bitcast`` to make a
10696 *no-op cast* for a floating-point cast.
10697
10698 Example:
10699 """"""""
10700
10701 .. code-block:: llvm
10702
10703       %X = fpext float 3.125 to double         ; yields double:3.125000e+00
10704       %Y = fpext double %X to fp128            ; yields fp128:0xL00000000000000004000900000000000
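
As a small illustration of how ``fpext`` and ``fptrunc`` pair up, the sketch
below widens a ``float`` to ``double`` and narrows it back. The function name
is hypothetical. Every ``float`` value is exactly representable as a
``double``, so the round trip is expected to return the original value for
any non-NaN input.

.. code-block:: llvm

    ; Hypothetical helper, not part of any LLVM library.
    define float @widen_then_narrow(float %x) {
      %wide = fpext float %x to double       ; exact: no rounding is needed
      %back = fptrunc double %wide to float  ; recovers %x for non-NaN inputs
      ret float %back
    }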
10705
10706 '``fptoui .. to``' Instruction
10707 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10708
10709 Syntax:
10710 """""""
10711
10712 ::
10713
10714       <result> = fptoui <ty> <value> to <ty2>             ; yields ty2
10715
10716 Overview:
10717 """""""""
10718
The '``fptoui``' instruction converts a floating-point ``value`` to its
unsigned integer equivalent of type ``ty2``.
10721
10722 Arguments:
10723 """"""""""
10724
The '``fptoui``' instruction takes a value to cast, which must be a
scalar or vector :ref:`floating-point <t_floating>` value, and a type to
cast it to, ``ty2``, which must be an :ref:`integer <t_integer>` type. If
``ty`` is a vector floating-point type, ``ty2`` must be a vector integer
type with the same number of elements as ``ty``.
10730
10731 Semantics:
10732 """"""""""
10733
10734 The '``fptoui``' instruction converts its :ref:`floating-point
10735 <t_floating>` operand into the nearest (rounding towards zero)
10736 unsigned integer value. If the value cannot fit in ``ty2``, the result
10737 is a :ref:`poison value <poisonvalues>`.
10738
10739 Example:
10740 """"""""
10741
10742 .. code-block:: llvm
10743
10744       %X = fptoui double 123.0 to i32      ; yields i32:123
      %Y = fptoui float 1.0E+300 to i1     ; yields poison (value does not fit in i1)
      %Z = fptoui float 1.04E+17 to i8     ; yields poison (value does not fit in i8)
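
Because an out-of-range input makes the result poison, code that cannot rule
out such inputs usually has to guard the conversion. The following is a
minimal sketch of one way to do that, not a prescribed idiom. The function
name and the fallback value ``0`` are assumptions made for illustration. The
range check is deliberately conservative: it also rejects inputs strictly
between 255.0 and 256.0, which would truncate to a value that fits.

.. code-block:: llvm

    ; Hypothetical helper: returns 0 when %x is NaN or outside [0.0, 255.0].
    define i8 @fptoui_guarded(float %x) {
    entry:
      %ge.zero  = fcmp oge float %x, 0.0      ; false if %x is NaN
      %le.max   = fcmp ole float %x, 255.0    ; false if %x is NaN
      %in.range = and i1 %ge.zero, %le.max
      %conv     = fptoui float %x to i8       ; poison when %x does not fit
      ; select only propagates poison from the arm it actually picks
      %res      = select i1 %in.range, i8 %conv, i8 0
      ret i8 %res
    }

The same pattern applies to ``fptosi`` with the bounds of the signed
destination type.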
10747
10748 '``fptosi .. to``' Instruction
10749 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10750
10751 Syntax:
10752 """""""
10753
10754 ::
10755
10756       <result> = fptosi <ty> <value> to <ty2>             ; yields ty2
10757
10758 Overview:
10759 """""""""
10760
The '``fptosi``' instruction converts a :ref:`floating-point <t_floating>`
``value`` to its signed integer equivalent of type ``ty2``.
10763
10764 Arguments:
10765 """"""""""
10766
The '``fptosi``' instruction takes a value to cast, which must be a
scalar or vector :ref:`floating-point <t_floating>` value, and a type to
cast it to, ``ty2``, which must be an :ref:`integer <t_integer>` type. If
``ty`` is a vector floating-point type, ``ty2`` must be a vector integer
type with the same number of elements as ``ty``.
10772
10773 Semantics:
10774 """"""""""
10775
10776 The '``fptosi``' instruction converts its :ref:`floating-point
10777 <t_floating>` operand into the nearest (rounding towards zero)
10778 signed integer value. If the value cannot fit in ``ty2``, the result
10779 is a :ref:`poison value <poisonvalues>`.
10780
10781 Example:
10782 """"""""
10783
10784 .. code-block:: llvm
10785
10786       %X = fptosi double -123.0 to i32      ; yields i32:-123
10787       %Y = fptosi float 1.0E-247 to i1      ; yields undefined:1
      %Z = fptosi float 1.04E+17 to i8      ; yields poison (value does not fit in i8)
10789
10790 '``uitofp .. to``' Instruction
10791 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10792
10793 Syntax:
10794 """""""
10795
10796 ::
10797
10798       <result> = uitofp <ty> <value> to <ty2>             ; yields ty2
10799
10800 Overview:
10801 """""""""
10802
10803 The '``uitofp``' instruction regards ``value`` as an unsigned integer
10804 and converts that value to the ``ty2`` type.
10805
10806 Arguments:
10807 """"""""""
10808
The '``uitofp``' instruction takes a value to cast, which must be a
scalar or vector :ref:`integer <t_integer>` value, and a type to cast it to,
``ty2``, which must be a :ref:`floating-point <t_floating>` type. If
``ty`` is a vector integer type, ``ty2`` must be a vector floating-point
type with the same number of elements as ``ty``.
10814
10815 Semantics:
10816 """"""""""
10817
10818 The '``uitofp``' instruction interprets its operand as an unsigned
10819 integer quantity and converts it to the corresponding floating-point
10820 value. If the value cannot be exactly represented, it is rounded using
10821 the default rounding mode.
10822
10823
10824 Example:
10825 """"""""
10826
10827 .. code-block:: llvm
10828
10829       %X = uitofp i32 257 to float         ; yields float:257.0
10830       %Y = uitofp i8 -1 to double          ; yields double:255.0
10831
10832 '``sitofp .. to``' Instruction
10833 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10834
10835 Syntax:
10836 """""""
10837
10838 ::
10839
10840       <result> = sitofp <ty> <value> to <ty2>             ; yields ty2
10841
10842 Overview:
10843 """""""""
10844
10845 The '``sitofp``' instruction regards ``value`` as a signed integer and
10846 converts that value to the ``ty2`` type.
10847
10848 Arguments:
10849 """"""""""
10850
The '``sitofp``' instruction takes a value to cast, which must be a
scalar or vector :ref:`integer <t_integer>` value, and a type to cast it to,
``ty2``, which must be a :ref:`floating-point <t_floating>` type. If
``ty`` is a vector integer type, ``ty2`` must be a vector floating-point
type with the same number of elements as ``ty``.
10856
10857 Semantics:
10858 """"""""""
10859
10860 The '``sitofp``' instruction interprets its operand as a signed integer
10861 quantity and converts it to the corresponding floating-point value. If the
10862 value cannot be exactly represented, it is rounded using the default rounding
10863 mode.
10864
10865 Example:
10866 """"""""
10867
10868 .. code-block:: llvm
10869
10870       %X = sitofp i32 257 to float         ; yields float:257.0
10871       %Y = sitofp i8 -1 to double          ; yields double:-1.0
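
The rounding rule for ``uitofp`` and ``sitofp`` can be seen with an integer
that needs more significant bits than the destination type provides. The
sketch below assumes the default floating-point environment, whose rounding
mode is round-to-nearest with ties to even. The function names are
hypothetical.

.. code-block:: llvm

    ; 16777217 = 2^24 + 1 needs 25 significant bits; float provides 24.
    define float @inexact_unsigned() {
      %f = uitofp i32 16777217 to float    ; rounds to 16777216.0
      ret float %f
    }

    define float @inexact_signed() {
      %f = sitofp i32 -16777217 to float   ; rounds to -16777216.0
      ret float %f
    }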
10872
10873 .. _i_ptrtoint:
10874
10875 '``ptrtoint .. to``' Instruction
10876 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10877
10878 Syntax:
10879 """""""
10880
10881 ::
10882
10883       <result> = ptrtoint <ty> <value> to <ty2>             ; yields ty2
10884
10885 Overview:
10886 """""""""
10887
10888 The '``ptrtoint``' instruction converts the pointer or a vector of
10889 pointers ``value`` to the integer (or vector of integers) type ``ty2``.
10890
10891 Arguments:
10892 """"""""""
10893
The '``ptrtoint``' instruction takes a ``value`` to cast, which must be
a value of type :ref:`pointer <t_pointer>` or a vector of pointers, and a
type to cast it to, ``ty2``, which must be an :ref:`integer <t_integer>` type
or a vector of integers type.
10898
10899 Semantics:
10900 """"""""""
10901
10902 The '``ptrtoint``' instruction converts ``value`` to integer type
10903 ``ty2`` by interpreting the pointer value as an integer and either
10904 truncating or zero extending that value to the size of the integer type.
10905 If ``value`` is smaller than ``ty2`` then a zero extension is done. If
10906 ``value`` is larger than ``ty2`` then a truncation is done. If they are
10907 the same size, then nothing is done (*no-op cast*) other than a type
10908 change.
10909
10910 Example:
10911 """"""""
10912
10913 .. code-block:: llvm
10914
      %X = ptrtoint i32* %P to i8                ; yields truncation on 32-bit architecture
      %Y = ptrtoint i32* %P to i64               ; yields zero extension on 32-bit architecture
      %Z = ptrtoint <4 x i32*> %P to <4 x i64>   ; yields vector zero extension for a vector of addresses on 32-bit architecture
10918
10919 .. _i_inttoptr:
10920
10921 '``inttoptr .. to``' Instruction
10922 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10923
10924 Syntax:
10925 """""""
10926
10927 ::
10928
10929       <result> = inttoptr <ty> <value> to <ty2>[, !dereferenceable !<deref_bytes_node>][, !dereferenceable_or_null !<deref_bytes_node>]             ; yields ty2
10930
10931 Overview:
10932 """""""""
10933
10934 The '``inttoptr``' instruction converts an integer ``value`` to a
10935 pointer type, ``ty2``.
10936
10937 Arguments:
10938 """"""""""
10939
10940 The '``inttoptr``' instruction takes an :ref:`integer <t_integer>` value to
10941 cast, and a type to cast it to, which must be a :ref:`pointer <t_pointer>`
10942 type.
10943
10944 The optional ``!dereferenceable`` metadata must reference a single metadata
10945 name ``<deref_bytes_node>`` corresponding to a metadata node with one ``i64``
10946 entry.
10947 See ``dereferenceable`` metadata.
10948
10949 The optional ``!dereferenceable_or_null`` metadata must reference a single
10950 metadata name ``<deref_bytes_node>`` corresponding to a metadata node with one
10951 ``i64`` entry.
10952 See ``dereferenceable_or_null`` metadata.
10953
10954 Semantics:
10955 """"""""""
10956
10957 The '``inttoptr``' instruction converts ``value`` to type ``ty2`` by
10958 applying either a zero extension or a truncation depending on the size
10959 of the integer ``value``. If ``value`` is larger than the size of a
10960 pointer then a truncation is done. If ``value`` is smaller than the size
10961 of a pointer then a zero extension is done. If they are the same size,
10962 nothing is done (*no-op cast*).
10963
10964 Example:
10965 """"""""
10966
10967 .. code-block:: llvm
10968
      %X = inttoptr i32 255 to i32*             ; yields zero extension on 64-bit architecture
      %Y = inttoptr i32 255 to i32*             ; yields no-op on 32-bit architecture
      %Z = inttoptr i64 0 to i32*               ; yields truncation on 32-bit architecture
      %W = inttoptr <4 x i32> %G to <4 x i8*>   ; yields truncation of vector G to four pointers
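
To show how ``ptrtoint`` and ``inttoptr`` combine, the sketch below converts
a pointer to an integer and back, and separately extracts the low byte of an
address. It assumes a target with 64-bit pointers, so that both casts in
``@round_trip`` are no-op casts, and it uses the typed-pointer syntax of the
examples above. The function names are hypothetical.

.. code-block:: llvm

    ; Assumes 64-bit pointers: neither cast changes any bits.
    define i32* @round_trip(i32* %p) {
      %addr = ptrtoint i32* %p to i64
      %q    = inttoptr i64 %addr to i32*
      ret i32* %q
    }

    ; The integer type is narrower than the pointer, so the address is truncated.
    define i8 @low_byte(i32* %p) {
      %b = ptrtoint i32* %p to i8
      ret i8 %b
    }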
10973
10974 .. _i_bitcast:
10975
10976 '``bitcast .. to``' Instruction
10977 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10978
10979 Syntax:
10980 """""""
10981
10982 ::
10983
10984       <result> = bitcast <ty> <value> to <ty2>             ; yields ty2
10985
10986 Overview:
10987 """""""""
10988
10989 The '``bitcast``' instruction converts ``value`` to type ``ty2`` without
10990 changing any bits.
10991
10992 Arguments:
10993 """"""""""
10994
10995 The '``bitcast``' instruction takes a value to cast, which must be a
10996 non-aggregate first class value, and a type to cast it to, which must
10997 also be a non-aggregate :ref:`first class <t_firstclass>` type. The
10998 bit sizes of ``value`` and the destination type, ``ty2``, must be
10999 identical. If the source type is a pointer, the destination type must
11000 also be a pointer of the same size. This instruction supports bitwise
11001 conversion of vectors to integers and to vectors of other types (as
11002 long as they have the same size).
11003
11004 Semantics:
11005 """"""""""
11006
11007 The '``bitcast``' instruction converts ``value`` to type ``ty2``. It
11008 is always a *no-op cast* because no bits change with this
11009 conversion. The conversion is done as if the ``value`` had been stored
11010 to memory and read back as type ``ty2``. Pointer (or vector of
11011 pointers) types may only be converted to other pointer (or vector of
11012 pointers) types with the same address space through this instruction.
11013 To convert pointers to other types, use the :ref:`inttoptr <i_inttoptr>`
11014 or :ref:`ptrtoint <i_ptrtoint>` instructions first.
11015
There is a caveat for bitcasts involving vector types in relation to
endianness. For example, ``bitcast <2 x i8> <value> to i16`` puts element zero
of the vector in the least significant bits of the ``i16`` for little-endian,
while element zero ends up in the most significant bits for big-endian.
11020
11021 Example:
11022 """"""""
11023
11024 .. code-block:: text
11025
      %X = bitcast i8 255 to i8                  ; yields i8:-1
      %Y = bitcast i32* %x to i16*               ; yields i16*:%x
      %Z = bitcast <2 x i32> %V to i64           ; yields i64:%V (depends on endianness)
      %W = bitcast <2 x i32*> %V to <2 x i64*>   ; yields <2 x i64*>
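
The endianness caveat is easier to see with concrete values. The following
sketch uses hypothetical function names. The comments restate the rule given
above for a ``<2 x i8>`` source and show the bit pattern for a
``float`` to ``i32`` cast.

.. code-block:: llvm

    define i16 @pack(<2 x i8> %v) {
      %r = bitcast <2 x i8> %v to i16
      ret i16 %r
    }
    ; With %v = <i8 1, i8 2>:
    ;   little-endian: %r = 0x0201 (element 0 in the least significant byte)
    ;   big-endian:    %r = 0x0102 (element 0 in the most significant byte)

    define i32 @float_bits(float %f) {
      %bits = bitcast float %f to i32   ; for %f = 1.0 this is 0x3F800000
      ret i32 %bits
    }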
11030
11031 .. _i_addrspacecast:
11032
11033 '``addrspacecast .. to``' Instruction
11034 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
11035
11036 Syntax:
11037 """""""
11038
11039 ::
11040
11041       <result> = addrspacecast <pty> <ptrval> to <pty2>       ; yields pty2
11042
11043 Overview:
11044 """""""""
11045
11046 The '``addrspacecast``' instruction converts ``ptrval`` from ``pty`` in
11047 address space ``n`` to type ``pty2`` in address space ``m``.
11048
11049 Arguments:
11050 """"""""""
11051
The '``addrspacecast``' instruction takes a pointer or vector of pointers
value to cast and a pointer type to cast it to, which must have a different
address space.
11055
11056 Semantics:
11057 """"""""""
11058
11059 The '``addrspacecast``' instruction converts the pointer value
11060 ``ptrval`` to type ``pty2``. It can be a *no-op cast* or a complex
11061 value modification, depending on the target and the address space
11062 pair. Pointer conversions within the same address space must be
11063 performed with the ``bitcast`` instruction. Note that if the address space
11064 conversion is legal then both result and operand refer to the same memory
11065 location.
11066
11067 Example:
11068 """"""""
11069
11070 .. code-block:: llvm
11071
11072       %X = addrspacecast i32* %x to i32 addrspace(1)*    ; yields i32 addrspace(1)*:%x
11073       %Y = addrspacecast i32 addrspace(1)* %y to i64 addrspace(2)*    ; yields i64 addrspace(2)*:%y
11074       %Z = addrspacecast <4 x i32*> %z to <4 x float addrspace(3)*>   ; yields <4 x float addrspace(3)*>:%z
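
A minimal sketch of using the result of an ``addrspacecast`` follows. Whether
the conversion is legal, and what address space ``1`` means, depends entirely
on the target, so both are assumptions here, as is the function name.

.. code-block:: llvm

    ; Assumes the target permits casting from address space 0 to address space 1.
    define i32 @load_via_as1(i32* %p) {
      %q = addrspacecast i32* %p to i32 addrspace(1)*
      ; If the cast is legal, %q refers to the same memory location as %p.
      %v = load i32, i32 addrspace(1)* %q
      ret i32 %v
    }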
11075
11076 .. _otherops:
11077
11078 Other Operations
11079 ----------------
11080
11081 The instructions in this category are the "miscellaneous" instructions,
11082 which defy better classification.
11083
11084 .. _i_icmp:
11085
11086 '``icmp``' Instruction
11087 ^^^^^^^^^^^^^^^^^^^^^^
11088
11089 Syntax:
11090 """""""
11091
11092 ::
11093
11094       <result> = icmp <cond> <ty> <op1>, <op2>   ; yields i1 or <N x i1>:result
11095
11096 Overview:
11097 """""""""
11098
11099 The '``icmp``' instruction returns a boolean value or a vector of
11100 boolean values based on comparison of its two integer, integer vector,
11101 pointer, or pointer vector operands.
11102
11103 Arguments:
11104 """"""""""
11105
11106 The '``icmp``' instruction takes three operands. The first operand is
11107 the condition code indicating the kind of comparison to perform. It is
11108 not a value, just a keyword. The possible condition codes are:
11109
11110 #. ``eq``: equal
11111 #. ``ne``: not equal
11112 #. ``ugt``: unsigned greater than
11113 #. ``uge``: unsigned greater or equal
11114 #. ``ult``: unsigned less than
11115 #. ``ule``: unsigned less or equal
11116 #. ``sgt``: signed greater than
11117 #. ``sge``: signed greater or equal
11118 #. ``slt``: signed less than
11119 #. ``sle``: signed less or equal
11120
The remaining two arguments must be of :ref:`integer <t_integer>`,
:ref:`pointer <t_pointer>`, or integer :ref:`vector <t_vector>` type. They
must also have identical types.
11124
11125 Semantics:
11126 """"""""""
11127
11128 The '``icmp``' compares ``op1`` and ``op2`` according to the condition
11129 code given as ``cond``. The comparison performed always yields either an
11130 :ref:`i1 <t_integer>` or vector of ``i1`` result, as follows:
11131
11132 #. ``eq``: yields ``true`` if the operands are equal, ``false``
11133    otherwise. No sign interpretation is necessary or performed.
11134 #. ``ne``: yields ``true`` if the operands are unequal, ``false``
11135    otherwise. No sign interpretation is necessary or performed.
11136 #. ``ugt``: interprets the operands as unsigned values and yields
11137    ``true`` if ``op1`` is greater than ``op2``.
11138 #. ``uge``: interprets the operands as unsigned values and yields
11139    ``true`` if ``op1`` is greater than or equal to ``op2``.
11140 #. ``ult``: interprets the operands as unsigned values and yields
11141    ``true`` if ``op1`` is less than ``op2``.
11142 #. ``ule``: interprets the operands as unsigned values and yields
11143    ``true`` if ``op1`` is less than or equal to ``op2``.
11144 #. ``sgt``: interprets the operands as signed values and yields ``true``
11145    if ``op1`` is greater than ``op2``.
11146 #. ``sge``: interprets the operands as signed values and yields ``true``
11147    if ``op1`` is greater than or equal to ``op2``.
11148 #. ``slt``: interprets the operands as signed values and yields ``true``
11149    if ``op1`` is less than ``op2``.
11150 #. ``sle``: interprets the operands as signed values and yields ``true``
11151    if ``op1`` is less than or equal to ``op2``.
11152
11153 If the operands are :ref:`pointer <t_pointer>` typed, the pointer values
11154 are compared as if they were integers.
11155
11156 If the operands are integer vectors, then they are compared element by
11157 element. The result is an ``i1`` vector with the same number of elements
11158 as the values being compared. Otherwise, the result is an ``i1``.
11159
11160 Example:
11161 """"""""
11162
11163 .. code-block:: text
11164
11165       <result> = icmp eq i32 4, 5          ; yields: result=false
11166       <result> = icmp ne float* %X, %X     ; yields: result=false
11167       <result> = icmp ult i16  4, 5        ; yields: result=true
11168       <result> = icmp sgt i16  4, 5        ; yields: result=false
11169       <result> = icmp ule i16 -4, 5        ; yields: result=false
11170       <result> = icmp sge i16  4, 5        ; yields: result=false
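
The difference between the signed and unsigned condition codes, and the
element-wise behavior on vectors, can be sketched as follows. The function
names are hypothetical.

.. code-block:: llvm

    ; The same bit patterns compare differently under ugt and sgt.
    define i1 @same_bits_different_answer() {
      %u = icmp ugt i8 -1, 1   ; true: -1 is 255 when interpreted as unsigned
      %s = icmp sgt i8 -1, 1   ; false: -1 is less than 1 when interpreted as signed
      %r = xor i1 %u, %s       ; true, because the two results differ
      ret i1 %r
    }

    ; A vector comparison yields one i1 per element.
    define <4 x i1> @lanes_below(<4 x i32> %a, <4 x i32> %b) {
      %m = icmp ult <4 x i32> %a, %b
      ret <4 x i1> %m
    }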
11171
11172 .. _i_fcmp:
11173
11174 '``fcmp``' Instruction
11175 ^^^^^^^^^^^^^^^^^^^^^^
11176
11177 Syntax:
11178 """""""
11179
11180 ::
11181
11182       <result> = fcmp [fast-math flags]* <cond> <ty> <op1>, <op2>     ; yields i1 or <N x i1>:result
11183
11184 Overview:
11185 """""""""
11186
11187 The '``fcmp``' instruction returns a boolean value or vector of boolean
11188 values based on comparison of its operands.
11189
11190 If the operands are floating-point scalars, then the result type is a
11191 boolean (:ref:`i1 <t_integer>`).
11192
11193 If the operands are floating-point vectors, then the result type is a
11194 vector of boolean with the same number of elements as the operands being
11195 compared.
11196
11197 Arguments:
11198 """"""""""
11199
11200 The '``fcmp``' instruction takes three operands. The first operand is
11201 the condition code indicating the kind of comparison to perform. It is
11202 not a value, just a keyword. The possible condition codes are:
11203
11204 #. ``false``: no comparison, always returns false
11205 #. ``oeq``: ordered and equal
11206 #. ``ogt``: ordered and greater than
11207 #. ``oge``: ordered and greater than or equal
11208 #. ``olt``: ordered and less than
11209 #. ``ole``: ordered and less than or equal
11210 #. ``one``: ordered and not equal
11211 #. ``ord``: ordered (no nans)
11212 #. ``ueq``: unordered or equal
11213 #. ``ugt``: unordered or greater than
11214 #. ``uge``: unordered or greater than or equal
11215 #. ``ult``: unordered or less than
11216 #. ``ule``: unordered or less than or equal
11217 #. ``une``: unordered or not equal
11218 #. ``uno``: unordered (either nans)
11219 #. ``true``: no comparison, always returns true
11220
11221 *Ordered* means that neither operand is a QNAN while *unordered* means
11222 that either operand may be a QNAN.
11223
Each of the ``op1`` and ``op2`` arguments must be either a :ref:`floating-point
<t_floating>` type or a :ref:`vector <t_vector>` of floating-point type.
They must have identical types.
11227
11228 Semantics:
11229 """"""""""
11230
11231 The '``fcmp``' instruction compares ``op1`` and ``op2`` according to the
11232 condition code given as ``cond``. If the operands are vectors, then the
11233 vectors are compared element by element. Each comparison performed
11234 always yields an :ref:`i1 <t_integer>` result, as follows:
11235
11236 #. ``false``: always yields ``false``, regardless of operands.
11237 #. ``oeq``: yields ``true`` if both operands are not a QNAN and ``op1``
11238    is equal to ``op2``.
11239 #. ``ogt``: yields ``true`` if both operands are not a QNAN and ``op1``
11240    is greater than ``op2``.
11241 #. ``oge``: yields ``true`` if both operands are not a QNAN and ``op1``
11242    is greater than or equal to ``op2``.
11243 #. ``olt``: yields ``true`` if both operands are not a QNAN and ``op1``
11244    is less than ``op2``.
11245 #. ``ole``: yields ``true`` if both operands are not a QNAN and ``op1``
11246    is less than or equal to ``op2``.
11247 #. ``one``: yields ``true`` if both operands are not a QNAN and ``op1``
11248    is not equal to ``op2``.
11249 #. ``ord``: yields ``true`` if both operands are not a QNAN.
11250 #. ``ueq``: yields ``true`` if either operand is a QNAN or ``op1`` is
11251    equal to ``op2``.
11252 #. ``ugt``: yields ``true`` if either operand is a QNAN or ``op1`` is
11253    greater than ``op2``.
11254 #. ``uge``: yields ``true`` if either operand is a QNAN or ``op1`` is
11255    greater than or equal to ``op2``.
11256 #. ``ult``: yields ``true`` if either operand is a QNAN or ``op1`` is
11257    less than ``op2``.
11258 #. ``ule``: yields ``true`` if either operand is a QNAN or ``op1`` is
11259    less than or equal to ``op2``.
11260 #. ``une``: yields ``true`` if either operand is a QNAN or ``op1`` is
11261    not equal to ``op2``.
11262 #. ``uno``: yields ``true`` if either operand is a QNAN.
11263 #. ``true``: always yields ``true``, regardless of operands.
11264
11265 The ``fcmp`` instruction can also optionally take any number of
11266 :ref:`fast-math flags <fastmath>`, which are optimization hints to enable
11267 otherwise unsafe floating-point optimizations.
11268
Any set of fast-math flags is legal on an ``fcmp`` instruction, but the
only flags that have any effect on its semantics are those that allow
assumptions to be made about the values of input arguments; namely
``nnan``, ``ninf``, and ``reassoc``. See :ref:`fastmath` for more information.
11273
11274 Example:
11275 """"""""
11276
11277 .. code-block:: text
11278
11279       <result> = fcmp oeq float 4.0, 5.0    ; yields: result=false
11280       <result> = fcmp one float 4.0, 5.0    ; yields: result=true
11281       <result> = fcmp olt float 4.0, 5.0    ; yields: result=true
11282       <result> = fcmp ueq double 1.0, 2.0   ; yields: result=false
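
The ordered and unordered condition codes differ only when an operand is a
NaN. The sketch below, with hypothetical function names, shows a common NaN
test, an ordered and an unordered less-than, and the placement of a fast-math
flag before the condition code.

.. code-block:: llvm

    define i1 @is_nan(float %x) {
      %r = fcmp uno float %x, %x       ; true exactly when %x is a NaN
      ret i1 %r
    }

    define i1 @less_ordered(double %a, double %b) {
      %r = fcmp olt double %a, %b      ; false whenever %a or %b is a NaN
      ret i1 %r
    }

    define i1 @less_unordered(double %a, double %b) {
      %r = fcmp ult double %a, %b      ; true whenever %a or %b is a NaN
      ret i1 %r
    }

    define i1 @equal_assuming_no_nans(float %a, float %b) {
      %r = fcmp nnan oeq float %a, %b  ; nnan: the optimizer may assume no NaN inputs
      ret i1 %r
    }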
11283
11284 .. _i_phi:
11285
11286 '``phi``' Instruction
11287 ^^^^^^^^^^^^^^^^^^^^^
11288
11289 Syntax:
11290 """""""
11291
11292 ::
11293
11294       <result> = phi [fast-math-flags] <ty> [ <val0>, <label0>], ...
11295
11296 Overview:
11297 """""""""
11298
11299 The '``phi``' instruction is used to implement the φ node in the SSA
11300 graph representing the function.
11301
11302 Arguments:
11303 """"""""""
11304
11305 The type of the incoming values is specified with the first type field.
11306 After this, the '``phi``' instruction takes a list of pairs as
11307 arguments, with one pair for each predecessor basic block of the current
11308 block. Only values of :ref:`first class <t_firstclass>` type may be used as
11309 the value arguments to the PHI node. Only labels may be used as the
11310 label arguments.
11311
11312 There must be no non-phi instructions between the start of a basic block
11313 and the PHI instructions: i.e. PHI instructions must be first in a basic
11314 block.
11315
11316 For the purposes of the SSA form, the use of each incoming value is
11317 deemed to occur on the edge from the corresponding predecessor block to
11318 the current block (but after any definition of an '``invoke``'
11319 instruction's return value on the same edge).
11320
11321 The optional ``fast-math-flags`` marker indicates that the phi has one
11322 or more :ref:`fast-math-flags <fastmath>`. These are optimization hints
11323 to enable otherwise unsafe floating-point optimizations. Fast-math-flags
11324 are only valid for phis that return a floating-point scalar or vector
11325 type, or an array (nested to any depth) of floating-point scalar or vector
11326 types.
11327
11328 Semantics:
11329 """"""""""
11330
11331 At runtime, the '``phi``' instruction logically takes on the value
11332 specified by the pair corresponding to the predecessor basic block that
11333 executed just prior to the current block.
11334
11335 Example:
11336 """"""""
11337
11338 .. code-block:: llvm
11339
11340     Loop:       ; Infinite loop that counts from 0 on up...
11341       %indvar = phi i32 [ 0, %LoopHeader ], [ %nextindvar, %Loop ]
11342       %nextindvar = add i32 %indvar, 1
11343       br label %Loop
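
For a self-contained illustration, the sketch below shows a complete function
in which each ``phi`` has one incoming pair per predecessor block. The
function name and its purpose (summing the integers below ``%n``) are
assumptions chosen for the example.

.. code-block:: llvm

    ; Hypothetical function: returns 0 + 1 + ... + (%n - 1), or 0 when %n is 0.
    define i32 @sum_below(i32 %n) {
    entry:
      %empty = icmp eq i32 %n, 0
      br i1 %empty, label %exit, label %loop

    loop:
      ; Both phis come first in the block, before any non-phi instruction.
      %i        = phi i32 [ 0, %entry ], [ %i.next, %loop ]
      %acc      = phi i32 [ 0, %entry ], [ %acc.next, %loop ]
      %acc.next = add i32 %acc, %i
      %i.next   = add i32 %i, 1
      %done     = icmp eq i32 %i.next, %n
      br i1 %done, label %exit, label %loop

    exit:
      ; %exit has two predecessors, so the phi lists two pairs.
      %res = phi i32 [ 0, %entry ], [ %acc.next, %loop ]
      ret i32 %res
    }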
11344
11345 .. _i_select:
11346
11347 '``select``' Instruction
11348 ^^^^^^^^^^^^^^^^^^^^^^^^
11349
11350 Syntax:
11351 """""""
11352
11353 ::
11354
11355       <result> = select [fast-math flags] selty <cond>, <ty> <val1>, <ty> <val2>             ; yields ty
11356
11357       selty is either i1 or {<N x i1>}
11358
11359 Overview:
11360 """""""""
11361
11362 The '``select``' instruction is used to choose one value based on a
11363 condition, without IR-level branching.
11364
11365 Arguments:
11366 """"""""""
11367
11368 The '``select``' instruction requires an 'i1' value or a vector of 'i1'
11369 values indicating the condition, and two values of the same :ref:`first
11370 class <t_firstclass>` type.
11371
11372 #. The optional ``fast-math flags`` marker indicates that the select has one or more
11373    :ref:`fast-math flags <fastmath>`. These are optimization hints to enable
11374    otherwise unsafe floating-point optimizations. Fast-math flags are only valid
11375    for selects that return a floating-point scalar or vector type, or an array
11376    (nested to any depth) of floating-point scalar or vector types.
11377
11378 Semantics:
11379 """"""""""
11380
11381 If the condition is an i1 and it evaluates to 1, the instruction returns
11382 the first value argument; otherwise, it returns the second value
11383 argument.
11384
11385 If the condition is a vector of i1, then the value arguments must be
11386 vectors of the same size, and the selection is done element by element.
11387
11388 If the condition is an i1 and the value arguments are vectors of the
11389 same size, then an entire vector is selected.
11390
11391 Example:
11392 """"""""
11393
11394 .. code-block:: llvm
11395
11396       %X = select i1 true, i8 17, i8 42          ; yields i8:17
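
The two vector cases described in the semantics can be sketched as follows.
The function names are hypothetical.

.. code-block:: llvm

    ; Vector condition: the choice is made independently for each element.
    define <4 x i32> @lane_max(<4 x i32> %a, <4 x i32> %b) {
      %gt = icmp sgt <4 x i32> %a, %b
      %r  = select <4 x i1> %gt, <4 x i32> %a, <4 x i32> %b
      ret <4 x i32> %r
    }

    ; Scalar condition with vector values: one of the two vectors is chosen whole.
    define <4 x i32> @pick_whole(i1 %c, <4 x i32> %a, <4 x i32> %b) {
      %r = select i1 %c, <4 x i32> %a, <4 x i32> %b
      ret <4 x i32> %r
    }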
11397
11398
11399 .. _i_freeze:
11400
11401 '``freeze``' Instruction
11402 ^^^^^^^^^^^^^^^^^^^^^^^^
11403
11404 Syntax:
11405 """""""
11406
11407 ::
11408
11409       <result> = freeze ty <val>    ; yields ty:result
11410
11411 Overview:
11412 """""""""
11413
11414 The '``freeze``' instruction is used to stop propagation of
11415 :ref:`undef <undefvalues>` and :ref:`poison <poisonvalues>` values.
11416
11417 Arguments:
11418 """"""""""
11419
11420 The '``freeze``' instruction takes a single argument.
11421
11422 Semantics:
11423 """"""""""
11424
11425 If the argument is ``undef`` or ``poison``, '``freeze``' returns an
11426 arbitrary, but fixed, value of type '``ty``'.
11427 Otherwise, this instruction is a no-op and returns the input argument.
11428 All uses of a value returned by the same '``freeze``' instruction are
11429 guaranteed to always observe the same value, while different '``freeze``'
11430 instructions may yield different values.
11431
11432 While ``undef`` and ``poison`` pointers can be frozen, the result is a
11433 non-dereferenceable pointer. See the
11434 :ref:`Pointer Aliasing Rules <pointeraliasing>` section for more information.
11435 If an aggregate value or vector is frozen, the operand is frozen element-wise.
11436 The padding of an aggregate isn't considered, since it isn't visible
11437 without storing it into memory and loading it with a different type.
11438
11439
11440 Example:
11441 """"""""
11442
11443 .. code-block:: text
11444
11445       %w = i32 undef
11446       %x = freeze i32 %w
11447       %y = add i32 %w, %w         ; undef
11448       %z = add i32 %x, %x         ; even number because all uses of %x observe
11449                                   ; the same value
11450       %x2 = freeze i32 %w
11451       %cmp = icmp eq i32 %x, %x2  ; can be true or false
11452
11453       ; example with vectors
11454       %v = <2 x i32> <i32 undef, i32 poison>
11455       %a = extractelement <2 x i32> %v, i32 0    ; undef
11456       %b = extractelement <2 x i32> %v, i32 1    ; poison
11457       %add = add i32 %a, %a                      ; undef
11458
11459       %v.fr = freeze <2 x i32> %v                ; element-wise freeze
11460       %d = extractelement <2 x i32> %v.fr, i32 0 ; not undef
11461       %add.f = add i32 %d, %d                    ; even number
11462
11463       ; branching on frozen value
11464       %poison = add nsw i1 %k, undef   ; poison
11465       %c = freeze i1 %poison
11466       br i1 %c, label %foo, label %bar ; non-deterministic branch to %foo or %bar
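
The fragments above are schematic. As a complete function, the branching case
might look like the following sketch, in which the function name is
hypothetical. Branching directly on an ``undef`` or ``poison`` condition is
undefined behavior, so the condition is frozen first. If ``%c`` is ``undef``
or ``poison``, the branch goes to an arbitrary but well-defined successor.

.. code-block:: llvm

    define i32 @pick(i1 %c, i32 %a, i32 %b) {
    entry:
      %c.fr = freeze i1 %c              ; a no-op if %c is already well defined
      br i1 %c.fr, label %then, label %else

    then:
      ret i32 %a

    else:
      ret i32 %b
    }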
11467
11468
11469 .. _i_call:
11470
11471 '``call``' Instruction
11472 ^^^^^^^^^^^^^^^^^^^^^^
11473
11474 Syntax:
11475 """""""
11476
11477 ::
11478
11479       <result> = [tail | musttail | notail ] call [fast-math flags] [cconv] [ret attrs] [addrspace(<num>)]
11480                  <ty>|<fnty> <fnptrval>(<function args>) [fn attrs] [ operand bundles ]
11481
11482 Overview:
11483 """""""""
11484
11485 The '``call``' instruction represents a simple function call.
11486
11487 Arguments:
11488 """"""""""
11489
11490 This instruction requires several arguments:
11491
11492 #. The optional ``tail`` and ``musttail`` markers indicate that the optimizers
11493    should perform tail call optimization. The ``tail`` marker is a hint that
11494    `can be ignored <CodeGenerator.html#sibcallopt>`_. The ``musttail`` marker
11495    means that the call must be tail call optimized in order for the program to
11496    be correct. The ``musttail`` marker provides these guarantees:
11497
11498    #. The call will not cause unbounded stack growth if it is part of a
11499       recursive cycle in the call graph.
11500    #. Arguments with the :ref:`inalloca <attr_inalloca>` or
11501       :ref:`preallocated <attr_preallocated>` attribute are forwarded in place.
   #. If the musttail call appears in a function with the ``"thunk"`` attribute
      and the caller and callee both have varargs, then any unprototyped
      arguments in register or memory are forwarded to the callee. Similarly,
      the return value of the callee is returned to the caller's caller, even
      if a void return type is in use.
11507
11508    Both markers imply that the callee does not access allocas from the caller.
11509    The ``tail`` marker additionally implies that the callee does not access
11510    varargs from the caller. Calls marked ``musttail`` must obey the following
   additional rules:
11512
11513    - The call must immediately precede a :ref:`ret <i_ret>` instruction,
11514      or a pointer bitcast followed by a ret instruction.
11515    - The ret instruction must return the (possibly bitcasted) value
11516      produced by the call, undef, or void.
11517    - The calling conventions of the caller and callee must match.
11518    - The callee must be varargs iff the caller is varargs. Bitcasting a
11519      non-varargs function to the appropriate varargs type is legal so
11520      long as the non-varargs prefixes obey the other rules.
   - The return type must not undergo automatic conversion to an ``sret`` pointer.
11522
   In addition, if the calling convention is not ``swifttailcc`` or ``tailcc``:
11524
11525    - All ABI-impacting function attributes, such as sret, byval, inreg,
11526      returned, and inalloca, must match.
11527    - The caller and callee prototypes must match. Pointer types of parameters
11528      or return types may differ in pointee type, but not in address space.
11529
   On the other hand, if the calling convention is ``swifttailcc`` or ``tailcc``:
11531
   - Only these ABI-impacting attributes are allowed: sret, byval,
     swiftself, and swiftasync.
11534    - Prototypes are not required to match.
11535
11536    Tail call optimization for calls marked ``tail`` is guaranteed to occur if
11537    the following conditions are met:
11538
11539    -  Caller and callee both have the calling convention ``fastcc`` or ``tailcc``.
11540    -  The call is in tail position (ret immediately follows call and ret
11541       uses value of call or is void).
11542    -  Option ``-tailcallopt`` is enabled,
11543       ``llvm::GuaranteedTailCallOpt`` is ``true``, or the calling convention
11544       is ``tailcc``.
11545    -  `Platform-specific constraints are
11546       met. <CodeGenerator.html#tailcallopt>`_
11547
11548 #. The optional ``notail`` marker indicates that the optimizers should not add
11549    ``tail`` or ``musttail`` markers to the call. It is used to prevent tail
11550    call optimization from being performed on the call.
11551
11552 #. The optional ``fast-math flags`` marker indicates that the call has one or more
11553    :ref:`fast-math flags <fastmath>`, which are optimization hints to enable
11554    otherwise unsafe floating-point optimizations. Fast-math flags are only valid
11555    for calls that return a floating-point scalar or vector type, or an array
11556    (nested to any depth) of floating-point scalar or vector types.
11557
11558 #. The optional ``cconv`` marker indicates which :ref:`calling
11559    convention <callingconv>` the call should use. If none is
11560    specified, the call defaults to using C calling conventions. The
11561    calling convention of the call must match the calling convention of
11562    the target function, or else the behavior is undefined.
11563 #. The optional :ref:`Parameter Attributes <paramattrs>` list for return
11564    values. Only '``zeroext``', '``signext``', and '``inreg``' attributes
11565    are valid here.
11566 #. The optional addrspace attribute can be used to indicate the address space
11567    of the called function. If it is not specified, the program address space
11568    from the :ref:`datalayout string<langref_datalayout>` will be used.
11569 #. '``ty``': the type of the call instruction itself which is also the
11570    type of the return value. Functions that return no value are marked
11571    ``void``.
11572 #. '``fnty``': shall be the signature of the function being called. The
11573    argument types must match the types implied by this signature. This
11574    type can be omitted if the function is not varargs.
11575 #. '``fnptrval``': An LLVM value containing a pointer to a function to
11576    be called. In most cases, this is a direct function call, but
11577    indirect calls, which call an arbitrary pointer-to-function value,
11578    are just as possible.
11579 #. '``function args``': argument list whose types match the function
11580    signature argument types and parameter attributes. All arguments must
11581    be of :ref:`first class <t_firstclass>` type. If the function signature
11582    indicates the function accepts a variable number of arguments, the
11583    extra arguments can be specified.
11584 #. The optional :ref:`function attributes <fnattrs>` list.
11585 #. The optional :ref:`operand bundles <opbundles>` list.
11586
11587 Semantics:
11588 """"""""""
11589
11590 The '``call``' instruction is used to cause control flow to transfer to
11591 a specified function, with its incoming arguments bound to the specified
11592 values. Upon a '``ret``' instruction in the called function, control
11593 flow continues with the instruction after the function call, and the
11594 return value of the function is bound to the result argument.
11595
11596 Example:
11597 """"""""
11598
11599 .. code-block:: llvm
11600
11601       %retval = call i32 @test(i32 %argc)
11602       call i32 (i8*, ...) @printf(i8* %msg, i32 12, i8 42)         ; yields i32
11603       %X = tail call i32 @foo()                                    ; yields i32
11604       %Y = tail call fastcc i32 @foo()  ; yields i32
11605       call void %foo(i8 signext 97)
11606
11607       %struct.A = type { i32, i8 }
11608       %r = call %struct.A @foo()                        ; yields { i32, i8 }
11609       %gr = extractvalue %struct.A %r, 0                ; yields i32
11610       %gr1 = extractvalue %struct.A %r, 1               ; yields i8
11611       call void @foo() noreturn                         ; indicates that @foo never returns normally
11612       %ZZ = call zeroext i32 @bar()                     ; return value is zero extended
11613
11614 LLVM treats calls to some functions with names and arguments that match
11615 the standard C99 library as being the C99 library functions, and may
11616 perform optimizations or generate code for them under that assumption.
11617 This is something we'd like to change in the future to provide better
11618 support for freestanding environments and non-C-based languages.
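
As a hedged illustration of the ``musttail`` and ``tail`` markers described
above, the following sketch shows a forwarding function whose call immediately
precedes ``ret`` and whose prototype and calling convention match the callee,
followed by a ``fastcc`` call in tail position. The names ``@callee``,
``@forward``, ``@fast_callee``, and ``@fast_caller`` are hypothetical.

.. code-block:: llvm

    declare i32 @callee(i32)
    declare fastcc i32 @fast_callee(i32)

    ; musttail: same prototype, same calling convention, and the call
    ; result is returned directly by the following ret.
    define i32 @forward(i32 %x) {
    entry:
      %r = musttail call i32 @callee(i32 %x)
      ret i32 %r
    }

    ; tail: caller and callee are both fastcc and the call is in tail
    ; position.
    define fastcc i32 @fast_caller(i32 %x) {
    entry:
      %r = tail call fastcc i32 @fast_callee(i32 %x)
      ret i32 %r
    }

Whether the second call is actually lowered as a tail call still depends on
the remaining conditions listed above, such as ``-tailcallopt`` and the
platform-specific constraints.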
11619
11620 .. _i_va_arg:
11621
11622 '``va_arg``' Instruction
11623 ^^^^^^^^^^^^^^^^^^^^^^^^
11624
11625 Syntax:
11626 """""""
11627
11628 ::
11629
11630       <resultval> = va_arg <va_list*> <arglist>, <argty>
11631
11632 Overview:
11633 """""""""
11634
11635 The '``va_arg``' instruction is used to access arguments passed through
11636 the "variable argument" area of a function call. It is used to implement
11637 the ``va_arg`` macro in C.
11638
11639 Arguments:
11640 """"""""""
11641
11642 This instruction takes a ``va_list*`` value and the type of the
11643 argument. It returns a value of the specified argument type and
11644 increments the ``va_list`` to point to the next argument. The actual
11645 type of ``va_list`` is target specific.
11646
11647 Semantics:
11648 """"""""""
11649
11650 The '``va_arg``' instruction loads an argument of the specified type
11651 from the specified ``va_list`` and causes the ``va_list`` to point to
11652 the next argument. For more information, see the variable argument
11653 handling :ref:`Intrinsic Functions <int_varargs>`.
11654
11655 It is legal for this instruction to be called in a function which does
11656 not take a variable number of arguments, for example, the ``vfprintf``
11657 function.
11658
11659 ``va_arg`` is an LLVM instruction instead of an :ref:`intrinsic
11660 function <intrinsics>` because it takes a type as an argument.
11661
11662 Example:
11663 """"""""
11664
11665 See the :ref:`variable argument processing <int_varargs>` section.
11666
11667 Note that the code generator does not yet fully support ``va_arg`` on many
11668 targets. Also, it does not currently support ``va_arg`` with aggregate
11669 types on any target.
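
A minimal sketch of the point above that ``va_arg`` may appear in a function
that is not itself variadic. The helper ``@next_i32`` is hypothetical: it
receives a pointer to a ``va_list`` that some variadic caller is assumed to
have initialized already with ``llvm.va_start``, and it assumes a target on
which the code generator supports ``va_arg`` for ``i32``.

.. code-block:: llvm

    define i32 @next_i32(i8* %ap) {
    entry:
      ; Load the next i32 and advance the caller's va_list.
      %v = va_arg i8* %ap, i32
      ret i32 %v
    }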
11670
11671 .. _i_landingpad:
11672
11673 '``landingpad``' Instruction
11674 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
11675
11676 Syntax:
11677 """""""
11678
11679 ::
11680
11681       <resultval> = landingpad <resultty> <clause>+
11682       <resultval> = landingpad <resultty> cleanup <clause>*
11683
11684       <clause> := catch <type> <value>
11685       <clause> := filter <array constant type> <array constant>
11686
11687 Overview:
11688 """""""""
11689
11690 The '``landingpad``' instruction is used by `LLVM's exception handling
11691 system <ExceptionHandling.html#overview>`_ to specify that a basic block
11692 is a landing pad --- one where the exception lands, and corresponds to the
11693 code found in the ``catch`` portion of a ``try``/``catch`` sequence. It
11694 defines values supplied by the :ref:`personality function <personalityfn>` upon
11695 re-entry to the function. The ``resultval`` has the type ``resultty``.
11696
11697 Arguments:
11698 """"""""""
11699
11700 The optional
11701 ``cleanup`` flag indicates that the landing pad block is a cleanup.
11702
11703 A ``clause`` begins with the clause type --- ``catch`` or ``filter`` --- and
11704 contains the global variable representing the "type" that may be caught
11705 or filtered respectively. Unlike the ``catch`` clause, the ``filter``
11706 clause takes an array constant as its argument. Use
11707 "``[0 x i8**] undef``" for a filter which cannot throw. The
11708 '``landingpad``' instruction must contain *at least* one ``clause`` or
11709 the ``cleanup`` flag.
11710
11711 Semantics:
11712 """"""""""
11713
11714 The '``landingpad``' instruction defines the values which are set by the
11715 :ref:`personality function <personalityfn>` upon re-entry to the function, and
11716 therefore the "result type" of the ``landingpad`` instruction. As with
11717 calling conventions, how the personality function results are
11718 represented in LLVM IR is target specific.
11719
11720 The clauses are applied in order from top to bottom. If two
11721 ``landingpad`` instructions are merged together through inlining, the
11722 clauses from the calling function are appended to the list of clauses.
11723 When the call stack is being unwound due to an exception being thrown,
11724 the exception is compared against each ``clause`` in turn. If it doesn't
11725 match any of the clauses, and the ``cleanup`` flag is not set, then
11726 unwinding continues further up the call stack.
11727
11728 The ``landingpad`` instruction has several restrictions:
11729
11730 -  A landing pad block is a basic block which is the unwind destination
11731    of an '``invoke``' instruction.
11732 -  A landing pad block must have a '``landingpad``' instruction as its
11733    first non-PHI instruction.
11734 -  There can be only one '``landingpad``' instruction within the landing
11735    pad block.
11736 -  A basic block that is not a landing pad block may not include a
11737    '``landingpad``' instruction.
11738
11739 Example:
11740 """"""""
11741
11742 .. code-block:: llvm
11743
11744       ;; A landing pad which can catch an integer.
11745       %res = landingpad { i8*, i32 }
11746                catch i8** @_ZTIi
11747       ;; A landing pad that is a cleanup.
11748       %res = landingpad { i8*, i32 }
11749                cleanup
11750       ;; A landing pad which can catch an integer and can only throw a double.
11751       %res = landingpad { i8*, i32 }
11752                catch i8** @_ZTIi
11753                filter [1 x i8**] [i8** @_ZTId]
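
As a hedged sketch of how the restrictions above fit together, the following
function places a ``landingpad`` as the first non-PHI instruction of the block
that is the unwind destination of an ``invoke``. The callee ``@may_throw`` and
the function ``@f`` are hypothetical, and ``@__gxx_personality_v0`` is assumed
to be the Itanium C++ ABI personality routine.

.. code-block:: llvm

    declare void @may_throw()
    declare i32 @__gxx_personality_v0(...)

    define void @f() personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*) {
    entry:
      invoke void @may_throw()
              to label %cont unwind label %lpad

    cont:
      ret void

    lpad:
      ; Cleanup-only landing pad: there are no clauses to match, so the
      ; exception is re-raised after any cleanup code has run.
      %exn = landingpad { i8*, i32 }
              cleanup
      resume { i8*, i32 } %exn
    }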
11754
11755 .. _i_catchpad:
11756
11757 '``catchpad``' Instruction
11758 ^^^^^^^^^^^^^^^^^^^^^^^^^^
11759
11760 Syntax:
11761 """""""
11762
11763 ::
11764
11765       <resultval> = catchpad within <catchswitch> [<args>*]
11766
11767 Overview:
11768 """""""""
11769
11770 The '``catchpad``' instruction is used by `LLVM's exception handling
11771 system <ExceptionHandling.html#overview>`_ to specify that a basic block
11772 begins a catch handler --- one where a personality routine attempts to transfer
11773 control to catch an exception.
11774
11775 Arguments:
11776 """"""""""
11777
11778 The ``catchswitch`` operand must always be a token produced by a
11779 :ref:`catchswitch <i_catchswitch>` instruction in a predecessor block. This
11780 ensures that each ``catchpad`` has exactly one predecessor block, and it always
11781 terminates in a ``catchswitch``.
11782
11783 The ``args`` correspond to whatever information the personality routine
11784 requires to know if this is an appropriate handler for the exception. Control
11785 will transfer to the ``catchpad`` if this is the first appropriate handler for
11786 the exception.
11787
11788 The ``resultval`` has the type :ref:`token <t_token>` and is used to match the
11789 ``catchpad`` to corresponding :ref:`catchrets <i_catchret>` and other nested EH
11790 pads.
11791
11792 Semantics:
11793 """"""""""
11794
11795 When the call stack is being unwound due to an exception being thrown, the
11796 exception is compared against the ``args``. If it doesn't match, control will
11797 not reach the ``catchpad`` instruction.  The representation of ``args`` is
11798 entirely target and personality function-specific.
11799
11800 Like the :ref:`landingpad <i_landingpad>` instruction, the ``catchpad``
11801 instruction must be the first non-phi of its parent basic block.
11802
11803 The meaning of the tokens produced and consumed by ``catchpad`` and other "pad"
11804 instructions is described in the
11805 `Windows exception handling documentation\ <ExceptionHandling.html#wineh>`_.
11806
11807 When a ``catchpad`` has been "entered" but not yet "exited" (as
11808 described in the `EH documentation\ <ExceptionHandling.html#wineh-constraints>`_),
11809 it is undefined behavior to execute a :ref:`call <i_call>` or :ref:`invoke <i_invoke>`
11810 that does not carry an appropriate :ref:`"funclet" bundle <ob_funclet>`.
11811
11812 Example:
11813 """"""""
11814
11815 .. code-block:: text
11816
11817     dispatch:
11818       %cs = catchswitch within none [label %handler0] unwind to caller
11819       ;; A catch block which can catch an integer.
11820     handler0:
11821       %tok = catchpad within %cs [i8** @_ZTIi]
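
The fragment above can be placed in a complete function, shown here as a
sketch only. The functions ``@may_throw``, ``@handle``, and ``@f`` are
hypothetical. The personality ``@__CxxFrameHandler3`` and the ``catchpad``
arguments (a null type descriptor, flags value 64, and a null catch object,
intended as a catch-all) are assumptions about the MSVC C++ personality,
since the representation of ``args`` is personality-specific.

.. code-block:: text

    declare void @may_throw()
    declare void @handle()
    declare i32 @__CxxFrameHandler3(...)

    define void @f() personality i8* bitcast (i32 (...)* @__CxxFrameHandler3 to i8*) {
    entry:
      invoke void @may_throw()
              to label %cont unwind label %dispatch

    dispatch:
      %cs = catchswitch within none [label %handler] unwind to caller

    handler:
      %cp = catchpad within %cs [i8* null, i32 64, i8* null]
      ; Calls made while the catchpad is entered carry a "funclet" bundle.
      call void @handle() [ "funclet"(token %cp) ]
      catchret from %cp to label %cont

    cont:
      ret void
    }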
11822
11823 .. _i_cleanuppad:
11824
11825 '``cleanuppad``' Instruction
11826 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
11827
11828 Syntax:
11829 """""""
11830
11831 ::
11832
11833       <resultval> = cleanuppad within <parent> [<args>*]
11834
11835 Overview:
11836 """""""""
11837
11838 The '``cleanuppad``' instruction is used by `LLVM's exception handling
11839 system <ExceptionHandling.html#overview>`_ to specify that a basic block
11840 is a cleanup block --- one where a personality routine attempts to
11841 transfer control to run cleanup actions.
11842 The ``args`` correspond to whatever additional
11843 information the :ref:`personality function <personalityfn>` requires to
11844 execute the cleanup.
11845 The ``resultval`` has the type :ref:`token <t_token>` and is used to
11846 match the ``cleanuppad`` to corresponding :ref:`cleanuprets <i_cleanupret>`.
11847 The ``parent`` argument is the token of the funclet that contains the
11848 ``cleanuppad`` instruction. If the ``cleanuppad`` is not inside a funclet,
11849 this operand may be the token ``none``.
11850
11851 Arguments:
11852 """"""""""
11853
11854 The instruction takes a list of arbitrary values which are interpreted
11855 by the :ref:`personality function <personalityfn>`.
11856
11857 Semantics:
11858 """"""""""
11859
11860 When the call stack is being unwound due to an exception being thrown,
11861 the :ref:`personality function <personalityfn>` transfers control to the
11862 ``cleanuppad`` with the aid of the personality-specific arguments.
11863 As with calling conventions, how the personality function results are
11864 represented in LLVM IR is target specific.
11865
11866 The ``cleanuppad`` instruction has several restrictions:
11867
11868 -  A cleanup block is a basic block which is the unwind destination of
11869    an exceptional instruction.
11870 -  A cleanup block must have a '``cleanuppad``' instruction as its
11871    first non-PHI instruction.
11872 -  There can be only one '``cleanuppad``' instruction within the
11873    cleanup block.
11874 -  A basic block that is not a cleanup block may not include a
11875    '``cleanuppad``' instruction.
11876
11877 When a ``cleanuppad`` has been "entered" but not yet "exited" (as
11878 described in the `EH documentation\ <ExceptionHandling.html#wineh-constraints>`_),
11879 it is undefined behavior to execute a :ref:`call <i_call>` or :ref:`invoke <i_invoke>`
11880 that does not carry an appropriate :ref:`"funclet" bundle <ob_funclet>`.
11881
11882 Example:
11883 """"""""
11884
11885 .. code-block:: text
11886
11887       %tok = cleanuppad within %cs []
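
A fuller sketch, under stated assumptions, shows a ``cleanuppad`` that is not
nested in another funclet (so its parent is ``none``) together with the
matching ``cleanupret``. The functions ``@may_throw``, ``@release``, and ``@g``
are hypothetical, and ``@__CxxFrameHandler3`` is assumed as the personality.

.. code-block:: text

    declare void @may_throw()
    declare void @release()
    declare i32 @__CxxFrameHandler3(...)

    define void @g() personality i8* bitcast (i32 (...)* @__CxxFrameHandler3 to i8*) {
    entry:
      invoke void @may_throw()
              to label %cont unwind label %cleanup

    cleanup:
      %cp = cleanuppad within none []
      ; Calls made while the cleanuppad is entered carry a "funclet" bundle.
      call void @release() [ "funclet"(token %cp) ]
      cleanupret from %cp unwind to caller

    cont:
      ret void
    }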
11888
11889 .. _intrinsics:
11890
11891 Intrinsic Functions
11892 ===================
11893
11894 LLVM supports the notion of an "intrinsic function". These functions
11895 have well known names and semantics and are required to follow certain
11896 restrictions. Overall, these intrinsics represent an extension mechanism
11897 for the LLVM language that does not require changing all of the
11898 transformations in LLVM when adding to the language (or the bitcode
11899 reader/writer, the parser, etc...).
11900
11901 Intrinsic function names must all start with an "``llvm.``" prefix. This
11902 prefix is reserved in LLVM for intrinsic names; thus, function names may
11903 not begin with this prefix. Intrinsic functions must always be external
11904 functions: you cannot define the body of intrinsic functions. Intrinsic
11905 functions may only be used in call or invoke instructions: it is illegal
11906 to take the address of an intrinsic function. Additionally, because
11907 intrinsic functions are part of the LLVM language, any newly added
11908 intrinsic must be documented here.
11909
11910 Some intrinsic functions can be overloaded, i.e., the intrinsic
11911 represents a family of functions that perform the same operation but on
11912 different data types. Because LLVM can represent over 8 million
11913 different integer types, overloading is used commonly to allow an
11914 intrinsic function to operate on any integer type. One or more of the
11915 argument types or the result type can be overloaded to accept any
11916 integer type. Argument types may also be defined as exactly matching a
11917 previous argument's type or the result type. This allows an intrinsic
11918 function which accepts multiple arguments, but needs all of them to be
11919 of the same type, to only be overloaded with respect to a single
11920 argument or the result.
11921
11922 An overloaded intrinsic has the names of its overloaded argument
11923 types encoded into its function name, each preceded by a period. Only
11924 those types which are overloaded result in a name suffix. Arguments
11925 whose type is matched against another type do not. For example, the
11926 ``llvm.ctpop`` function can take an integer of any width and returns an
11927 integer of exactly the same integer width. This leads to a family of
11928 functions such as ``i8 @llvm.ctpop.i8(i8 %val)`` and
11929 ``i29 @llvm.ctpop.i29(i29 %val)``. Only one type, the return type, is
11930 overloaded, and only one type suffix is required. Because the argument's
11931 type is matched against the return type, it does not require its own
11932 name suffix.
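
The naming scheme can be illustrated with a few declarations. This is a sketch
only: ``@popcount8`` is a hypothetical wrapper, and the vector declaration
assumes the ``v<count><elementtype>`` spelling for vector type suffixes.

.. code-block:: llvm

    ; One suffix per overloaded type; the argument type is matched
    ; against the return type and so adds no suffix of its own.
    declare i8 @llvm.ctpop.i8(i8)
    declare i29 @llvm.ctpop.i29(i29)
    declare <4 x i32> @llvm.ctpop.v4i32(<4 x i32>)

    define i8 @popcount8(i8 %val) {
    entry:
      %n = call i8 @llvm.ctpop.i8(i8 %val)
      ret i8 %n
    }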
11933
11934 :ref:`Unnamed types <t_opaque>` are encoded as ``s_s``. Overloaded intrinsics
11935 that depend on an unnamed type in one of their overloaded argument types get an
11936 additional ``.<number>`` suffix. This allows differentiating intrinsics with
11937 different unnamed types as arguments (for example,
11938 ``llvm.ssa.copy.p0s_s.2(%42*)``). The number is tracked in the LLVM module and
11939 ensures unique names in the module. While linking together two modules, it is
11940 still possible to get a name clash. In that case one of the names will be
11941 changed by getting a new number.
11942
11943 For target developers who are defining intrinsics for back-end code
11944 generation, any intrinsic overloads based solely on the distinction between
11945 integer or floating point types should not be relied upon for correct
11946 code generation. In such cases, the recommended approach for target
11947 maintainers when defining intrinsics is to create separate integer and
11948 FP intrinsics rather than rely on overloading. For example, if different
11949 codegen is required for ``llvm.target.foo(<4 x i32>)`` and
11950 ``llvm.target.foo(<4 x float>)`` then these should be split into
11951 different intrinsics.
11952
11953 To learn how to add an intrinsic function, please see the `Extending
11954 LLVM Guide <ExtendingLLVM.html>`_.
11955
11956 .. _int_varargs:
11957
11958 Variable Argument Handling Intrinsics
11959 -------------------------------------
11960
11961 Variable argument support is defined in LLVM with the
11962 :ref:`va_arg <i_va_arg>` instruction and these three intrinsic
11963 functions. These functions are related to the similarly named macros
11964 defined in the ``<stdarg.h>`` header file.
11965
11966 All of these functions operate on arguments that use a target-specific
11967 value type "``va_list``". The LLVM assembly language reference manual
11968 does not define what this type is, so all transformations should be
11969 prepared to handle these functions regardless of the type used.
11970
11971 This example shows how the :ref:`va_arg <i_va_arg>` instruction and the
11972 variable argument handling intrinsic functions are used.
11973
11974 .. code-block:: llvm
11975
11976     ; This struct is different for every platform. For most platforms,
11977     ; it is merely an i8*.
11978     %struct.va_list = type { i8* }
11979
11980     ; For Unix x86_64 platforms, va_list is the following struct:
11981     ; %struct.va_list = type { i32, i32, i8*, i8* }
11982
11983     define i32 @test(i32 %X, ...) {
11984       ; Initialize variable argument processing
11985       %ap = alloca %struct.va_list
11986       %ap2 = bitcast %struct.va_list* %ap to i8*
11987       call void @llvm.va_start(i8* %ap2)
11988
11989       ; Read a single integer argument
11990       %tmp = va_arg i8* %ap2, i32
11991
11992       ; Demonstrate usage of llvm.va_copy and llvm.va_end
11993       %aq = alloca i8*
11994       %aq2 = bitcast i8** %aq to i8*
11995       call void @llvm.va_copy(i8* %aq2, i8* %ap2)
11996       call void @llvm.va_end(i8* %aq2)
11997
11998       ; Stop processing of arguments.
11999       call void @llvm.va_end(i8* %ap2)
12000       ret i32 %tmp
12001     }
12002
12003     declare void @llvm.va_start(i8*)
12004     declare void @llvm.va_copy(i8*, i8*)
12005     declare void @llvm.va_end(i8*)
12006
12007 .. _int_va_start:
12008
12009 '``llvm.va_start``' Intrinsic
12010 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12011
12012 Syntax:
12013 """""""
12014
12015 ::
12016
12017       declare void @llvm.va_start(i8* <arglist>)
12018
12019 Overview:
12020 """""""""
12021
12022 The '``llvm.va_start``' intrinsic initializes ``*<arglist>`` for
12023 subsequent use by ``va_arg``.
12024
12025 Arguments:
12026 """"""""""
12027
12028 The argument is a pointer to a ``va_list`` element to initialize.
12029
12030 Semantics:
12031 """"""""""
12032
12033 The '``llvm.va_start``' intrinsic works just like the ``va_start`` macro
12034 available in C. In a target-dependent way, it initializes the
12035 ``va_list`` element to which the argument points, so that the next call
12036 to ``va_arg`` will produce the first variable argument passed to the
12037 function. Unlike the C ``va_start`` macro, this intrinsic does not need
12038 to know the last argument of the function as the compiler can figure
12039 that out.
12040
12041 '``llvm.va_end``' Intrinsic
12042 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
12043
12044 Syntax:
12045 """""""
12046
12047 ::
12048
12049       declare void @llvm.va_end(i8* <arglist>)
12050
12051 Overview:
12052 """""""""
12053
12054 The '``llvm.va_end``' intrinsic destroys ``*<arglist>``, which has been
12055 initialized previously with ``llvm.va_start`` or ``llvm.va_copy``.
12056
12057 Arguments:
12058 """"""""""
12059
12060 The argument is a pointer to a ``va_list`` to destroy.
12061
12062 Semantics:
12063 """"""""""
12064
12065 The '``llvm.va_end``' intrinsic works just like the ``va_end`` macro
12066 available in C. In a target-dependent way, it destroys the ``va_list``
12067 element to which the argument points. Calls to
12068 :ref:`llvm.va_start <int_va_start>` and
12069 :ref:`llvm.va_copy <int_va_copy>` must be matched exactly with calls to
12070 ``llvm.va_end``.
12071
12072 .. _int_va_copy:
12073
12074 '``llvm.va_copy``' Intrinsic
12075 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12076
12077 Syntax:
12078 """""""
12079
12080 ::
12081
12082       declare void @llvm.va_copy(i8* <destarglist>, i8* <srcarglist>)
12083
12084 Overview:
12085 """""""""
12086
12087 The '``llvm.va_copy``' intrinsic copies the current argument position
12088 from the source argument list to the destination argument list.
12089
12090 Arguments:
12091 """"""""""
12092
12093 The first argument is a pointer to a ``va_list`` element to initialize.
12094 The second argument is a pointer to a ``va_list`` element to copy from.
12095
12096 Semantics:
12097 """"""""""
12098
12099 The '``llvm.va_copy``' intrinsic works just like the ``va_copy`` macro
12100 available in C. In a target-dependent way, it copies the source
12101 ``va_list`` element into the destination ``va_list`` element. This
12102 intrinsic is necessary because the ``llvm.va_start`` intrinsic may be
12103 arbitrarily complex and require, for example, memory allocation.
12104
12105 Accurate Garbage Collection Intrinsics
12106 --------------------------------------
12107
12108 LLVM's support for `Accurate Garbage Collection <GarbageCollection.html>`_
12109 (GC) requires the frontend to generate code containing appropriate intrinsic
12110 calls and select an appropriate GC strategy which knows how to lower these
12111 intrinsics in a manner which is appropriate for the target collector.
12112
12113 These intrinsics allow identification of :ref:`GC roots on the
12114 stack <int_gcroot>`, as well as garbage collector implementations that
12115 require :ref:`read <int_gcread>` and :ref:`write <int_gcwrite>` barriers.
12116 Frontends for type-safe garbage collected languages should generate
12117 these intrinsics to make use of the LLVM garbage collectors. For more
12118 details, see `Garbage Collection with LLVM <GarbageCollection.html>`_.
12119
12120 LLVM provides a second experimental set of intrinsics for describing garbage
12121 collection safepoints in compiled code. These intrinsics are an alternative
12122 to the ``llvm.gcroot`` intrinsics, but are compatible with the ones for
12123 :ref:`read <int_gcread>` and :ref:`write <int_gcwrite>` barriers. The
12124 differences in approach are covered in the `Garbage Collection with LLVM
12125 <GarbageCollection.html>`_ documentation. The intrinsics themselves are
12126 described in :doc:`Statepoints`.
12127
12128 .. _int_gcroot:
12129
12130 '``llvm.gcroot``' Intrinsic
12131 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
12132
12133 Syntax:
12134 """""""
12135
12136 ::
12137
12138       declare void @llvm.gcroot(i8** %ptrloc, i8* %metadata)
12139
12140 Overview:
12141 """""""""
12142
12143 The '``llvm.gcroot``' intrinsic declares the existence of a GC root to
12144 the code generator, and allows some metadata to be associated with it.
12145
12146 Arguments:
12147 """"""""""
12148
12149 The first argument specifies the address of a stack object that contains
12150 the root pointer. The second argument (which must be either a constant or
12151 a global value address) contains the metadata to be associated with the
12152 root.
12153
12154 Semantics:
12155 """"""""""
12156
12157 At runtime, a call to this intrinsic stores a null pointer into the
12158 "ptrloc" location. At compile-time, the code generator generates
12159 information to allow the runtime to find the pointer at GC safe points.
12160 The '``llvm.gcroot``' intrinsic may only be used in a function which
12161 :ref:`specifies a GC algorithm <gc>`.
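
A minimal sketch of declaring a root, assuming the ``shadow-stack`` GC strategy
and a hypothetical allocator ``@allocate``. The root slot is an ``alloca`` in
the entry block and no metadata is attached (the second argument is null).

.. code-block:: llvm

    declare void @llvm.gcroot(i8** %ptrloc, i8* %metadata)
    declare i8* @allocate()

    define void @f() gc "shadow-stack" {
    entry:
      %root = alloca i8*
      ; Registers %root as a GC root; at runtime this also nulls the slot.
      call void @llvm.gcroot(i8** %root, i8* null)
      %obj = call i8* @allocate()
      store i8* %obj, i8** %root
      ret void
    }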
12162
12163 .. _int_gcread:
12164
12165 '``llvm.gcread``' Intrinsic
12166 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
12167
12168 Syntax:
12169 """""""
12170
12171 ::
12172
12173       declare i8* @llvm.gcread(i8* %ObjPtr, i8** %Ptr)
12174
12175 Overview:
12176 """""""""
12177
12178 The '``llvm.gcread``' intrinsic identifies reads of references from heap
12179 locations, allowing garbage collector implementations that require read
12180 barriers.
12181
12182 Arguments:
12183 """"""""""
12184
12185 The second argument is the address to read from, which should be an
12186 address allocated from the garbage collector. The first argument is a
12187 pointer to the start of the referenced object, if needed by the language
12188 runtime (otherwise null).
12189
12190 Semantics:
12191 """"""""""
12192
12193 The '``llvm.gcread``' intrinsic has the same semantics as a load
12194 instruction, but may be replaced with substantially more complex code by
12195 the garbage collector runtime, as needed. The '``llvm.gcread``'
12196 intrinsic may only be used in a function which :ref:`specifies a GC
12197 algorithm <gc>`.
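
As a hedged sketch, a read of a reference field through the barrier might look
as follows. The function ``@load_field`` is hypothetical and the
``shadow-stack`` strategy is assumed only so that the function specifies a GC.

.. code-block:: llvm

    declare i8* @llvm.gcread(i8* %ObjPtr, i8** %Ptr)

    define i8* @load_field(i8* %obj, i8** %field) gc "shadow-stack" {
    entry:
      ; %obj is the start of the object, %field the address to read from.
      %val = call i8* @llvm.gcread(i8* %obj, i8** %field)
      ret i8* %val
    }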
12198
12199 .. _int_gcwrite:
12200
12201 '``llvm.gcwrite``' Intrinsic
12202 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12203
12204 Syntax:
12205 """""""
12206
12207 ::
12208
12209       declare void @llvm.gcwrite(i8* %P1, i8* %Obj, i8** %P2)
12210
12211 Overview:
12212 """""""""
12213
12214 The '``llvm.gcwrite``' intrinsic identifies writes of references to heap
12215 locations, allowing garbage collector implementations that require write
12216 barriers (such as generational or reference counting collectors).
12217
12218 Arguments:
12219 """"""""""
12220
12221 The first argument is the reference to store, the second is the start of
12222 the object to store it to, and the third is the address of the field of
12223 Obj to store to. If the runtime does not require a pointer to the
12224 object, Obj may be null.
12225
12226 Semantics:
12227 """"""""""
12228
12229 The '``llvm.gcwrite``' intrinsic has the same semantics as a store
12230 instruction, but may be replaced with substantially more complex code by
12231 the garbage collector runtime, as needed. The '``llvm.gcwrite``'
12232 intrinsic may only be used in a function which :ref:`specifies a GC
12233 algorithm <gc>`.
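
As a hedged sketch, a store of a reference through the barrier might look as
follows. The function ``@store_field`` is hypothetical and the ``shadow-stack``
strategy is assumed only so that the function specifies a GC.

.. code-block:: llvm

    declare void @llvm.gcwrite(i8* %P1, i8* %Obj, i8** %P2)

    define void @store_field(i8* %val, i8* %obj, i8** %field) gc "shadow-stack" {
    entry:
      ; Operand order: reference to store, start of object, field address.
      call void @llvm.gcwrite(i8* %val, i8* %obj, i8** %field)
      ret void
    }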
12234
12235
12236 .. _gc_statepoint:
12237
12238 'llvm.experimental.gc.statepoint' Intrinsic
12239 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12240
12241 Syntax:
12242 """""""
12243
12244 ::
12245
12246       declare token
12247         @llvm.experimental.gc.statepoint(i64 <id>, i32 <num patch bytes>,
12248                        func_type <target>,
12249                        i64 <#call args>, i64 <flags>,
12250                        ... (call parameters),
12251                        i64 0, i64 0)
12252
12253 Overview:
12254 """""""""
12255
12256 The statepoint intrinsic represents a call which is parse-able by the
12257 runtime.
12258
12259 Operands:
12260 """""""""
12261
12262 The 'id' operand is a constant integer that is reported as the ID
12263 field in the generated stackmap.  LLVM does not interpret this
12264 parameter in any way and its meaning is up to the statepoint user to
12265 decide.  Note that LLVM is free to duplicate code containing
12266 statepoint calls, and this may transform IR that had a unique 'id' per
12267 lexical call to statepoint to IR that does not.
12268
12269 If 'num patch bytes' is non-zero then the call instruction
12270 corresponding to the statepoint is not emitted and LLVM emits 'num
12271 patch bytes' bytes of nops in its place.  LLVM will emit code to
12272 prepare the function arguments and retrieve the function return value
12273 in accordance with the calling convention; the former before the nop
12274 sequence and the latter after the nop sequence.  It is expected that
12275 the user will patch over the 'num patch bytes' bytes of nops with a
12276 calling sequence specific to their runtime before executing the
12277 generated machine code.  There are no guarantees with respect to the
12278 alignment of the nop sequence.  Unlike :doc:`StackMaps`, statepoints do
12279 not have a concept of shadow bytes.  Note that semantically the
12280 statepoint still represents a call or invoke to 'target', and the nop
12281 sequence after patching is expected to represent an operation
12282 equivalent to a call or invoke to 'target'.
12283
12284 The 'target' operand is the function actually being called.  The
12285 target can be specified as either a symbolic LLVM function, or as an
12286 arbitrary Value of appropriate function type.  Note that the function
12287 type must match the signature of the callee and the types of the 'call
12288 parameters' arguments.
12289
12290 The '#call args' operand is the number of arguments to the actual
12291 call.  It must exactly match the number of arguments passed in the
12292 'call parameters' variable length section.
12293
12294 The 'flags' operand is used to specify extra information about the
12295 statepoint. This is currently only used to mark certain statepoints
12296 as GC transitions. This operand is a 64-bit integer with the following
12297 layout, where bit 0 is the least significant bit:
12298
12299   +-------+---------------------------------------------------+
12300   | Bit # | Usage                                             |
12301   +=======+===================================================+
12302   |     0 | Set if the statepoint is a GC transition, cleared |
12303   |       | otherwise.                                        |
12304   +-------+---------------------------------------------------+
12305   |  1-63 | Reserved for future use; must be cleared.         |
12306   +-------+---------------------------------------------------+
12307
12308 The 'call parameters' arguments are simply the arguments which need to
12309 be passed to the call target.  They will be lowered according to the
12310 specified calling convention and otherwise handled like a normal call
12311 instruction.  The number of arguments must exactly match what is
12312 specified in '#call args'.  The types must match the signature of
12313 'target'.
12314
12315 The 'call parameter' attributes must be followed by two 'i64 0' constants.
12316 These were originally the length prefixes for 'gc transition parameter' and
12317 'deopt parameter' arguments, but the role of these parameter sets has been
12318 entirely replaced with the corresponding operand bundles.  In a future
12319 revision, these now-redundant arguments will be removed.
12320
12321 Semantics:
12322 """"""""""
12323
12324 A statepoint is assumed to read and write all memory.  As a result,
12325 memory operations cannot be reordered past a statepoint.  It is
12326 illegal to mark a statepoint as being either 'readonly' or 'readnone'.
12327
12328 Note that legal IR cannot perform any memory operation on a 'gc
12329 pointer' argument of the statepoint in a location statically reachable
12330 from the statepoint.  Instead, the explicitly relocated value (from a
12331 ``gc.relocate``) must be used.
12332
12333 'llvm.experimental.gc.result' Intrinsic
12334 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12335
12336 Syntax:
12337 """""""
12338
12339 ::
12340
12341       declare type*
12342         @llvm.experimental.gc.result(token %statepoint_token)
12343
12344 Overview:
12345 """""""""
12346
12347 ``gc.result`` extracts the result of the original call instruction
12348 which was replaced by the ``gc.statepoint``.  The ``gc.result``
12349 intrinsic is actually a family of three intrinsics due to an
12350 implementation limitation.  Other than the type of the return value,
12351 the semantics are the same.
12352
12353 Operands:
12354 """""""""
12355
12356 The first and only argument is the ``gc.statepoint`` which starts
12357 the safepoint sequence of which this ``gc.result`` is a part.
12358 Despite the typing of this as a generic token, *only* the value defined
12359 by a ``gc.statepoint`` is legal here.
12360
12361 Semantics:
12362 """"""""""
12363
12364 The ``gc.result`` represents the return value of the call target of
12365 the ``statepoint``.  The type of the ``gc.result`` must exactly match
12366 the type of the target.  If the call target returns void, there will
12367 be no ``gc.result``.
12368
12369 A ``gc.result`` is modeled as a 'readnone' pure function.  It has no
12370 side effects since it is just a projection of the return value of the
12371 previous call represented by the ``gc.statepoint``.
12372
12373 'llvm.experimental.gc.relocate' Intrinsic
12374 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12375
12376 Syntax:
12377 """""""
12378
12379 ::
12380
12381       declare <pointer type>
12382         @llvm.experimental.gc.relocate(token %statepoint_token,
12383                                        i32 %base_offset,
12384                                        i32 %pointer_offset)
12385
12386 Overview:
12387 """""""""
12388
12389 A ``gc.relocate`` returns the potentially relocated value of a pointer
12390 at the safepoint.
12391
12392 Operands:
12393 """""""""
12394
12395 The first argument is the ``gc.statepoint`` which starts the
12396 safepoint sequence of which this ``gc.relocate`` is a part.
12397 Despite the typing of this as a generic token, *only* the value defined
12398 by a ``gc.statepoint`` is legal here.
12399
12400 The second and third arguments are both indices into operands of the
12401 corresponding statepoint's :ref:`gc-live <ob_gc_live>` operand bundle.
12402
12403 The second argument is an index which specifies the allocation for the pointer
12404 being relocated. The associated value must be within the object with which the
12405 pointer being relocated is associated. The optimizer is free to change *which*
12406 interior derived pointer is reported, provided that it does not replace an
12407 actual base pointer with another interior derived pointer. Collectors are
12408 allowed to rely on the base pointer operand remaining an actual base pointer if
12409 so constructed.
12410
12411 The third argument is an index which specifies the (potentially) derived pointer
12412 being relocated.  It is legal for this index to be the same as the second
12413 argument if and only if a base pointer is being relocated.
12414
12415 Semantics:
12416 """"""""""
12417
12418 The return value of ``gc.relocate`` is the potentially relocated value
12419 of the pointer specified by its arguments.  It is unspecified how the
12420 value of the returned pointer relates to the argument to the
12421 ``gc.statepoint`` other than that a) it points to the same source
12422 language object with the same offset, and b) the 'based-on'
12423 relationship of the newly relocated pointers is a projection of the
12424 unrelocated pointers.  In particular, the integer value of the pointer
12425 returned is unspecified.
12426
12427 A ``gc.relocate`` is modeled as a ``readnone`` pure function.  It has no
12428 side effects since it is just a way to extract information about work
12429 done during the actual call modeled by the ``gc.statepoint``.
12430
12431 .. _gc.get.pointer.base:
12432
12433 'llvm.experimental.gc.get.pointer.base' Intrinsic
12434 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12435
12436 Syntax:
12437 """""""
12438
12439 ::
12440
12441       declare <pointer type>
12442         @llvm.experimental.gc.get.pointer.base(
12443           <pointer type> readnone nocapture %derived_ptr)
12444           nounwind readnone willreturn
12445
12446 Overview:
12447 """""""""
12448
12449 ``gc.get.pointer.base`` for a derived pointer returns its base pointer.
12450
12451 Operands:
12452 """""""""
12453
12454 The only argument is a pointer which is based on some object with
12455 an unknown offset from the base of said object.
12456
12457 Semantics:
12458 """"""""""
12459
12460 This intrinsic is used in the abstract machine model for GC to represent
12461 the base pointer for an arbitrary derived pointer.
12462
12463 This intrinsic is inlined by the :ref:`RewriteStatepointsForGC` pass by
12464 replacing all uses of this callsite with the base pointer value of the derived
12465 pointer. The replacement is done as part of the lowering to the explicit
12466 statepoint model.
12467
12468 The return pointer type must be the same as the type of the parameter.
12469
12470
12471 'llvm.experimental.gc.get.pointer.offset' Intrinsic
12472 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12473
12474 Syntax:
12475 """""""
12476
12477 ::
12478
12479       declare i64
12480         @llvm.experimental.gc.get.pointer.offset(
12481           <pointer type> readnone nocapture %derived_ptr)
12482           nounwind readnone willreturn
12483
12484 Overview:
12485 """""""""
12486
12487 ``gc.get.pointer.offset`` for a derived pointer returns the offset from its
12488 base pointer.
12489
12490 Operands:
12491 """""""""
12492
12493 The only argument is a pointer which is based on some object with
12494 an unknown offset from the base of said object.
12495
12496 Semantics:
12497 """"""""""
12498
12499 This intrinsic is used in the abstract machine model for GC to represent
12500 the offset of an arbitrary derived pointer from its base pointer.
12501
12502 This intrinsic is inlined by the :ref:`RewriteStatepointsForGC` pass by
12503 replacing all uses of this callsite with the offset of a derived pointer from
12504 its base pointer value. The replacement is done as part of the lowering to the
12505 explicit statepoint model.
12506
12507 In effect, this call computes the difference between the derived pointer and
12508 its base pointer (see :ref:`gc.get.pointer.base`), both cast with ``ptrtoint``.
12509 However, performing this cast outside the :ref:`RewriteStatepointsForGC` pass
12510 could cause the pointers to be lost during further lowering from the abstract
12511 model to the explicit physical one.
12512
12513 Code Generator Intrinsics
12514 -------------------------
12515
12516 These intrinsics are provided by LLVM to expose special features that
12517 may only be implemented with code generator support.
12518
12519 '``llvm.returnaddress``' Intrinsic
12520 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12521
12522 Syntax:
12523 """""""
12524
12525 ::
12526
12527       declare i8* @llvm.returnaddress(i32 <level>)
12528
12529 Overview:
12530 """""""""
12531
12532 The '``llvm.returnaddress``' intrinsic attempts to compute a
12533 target-specific value indicating the return address of the current
12534 function or one of its callers.
12535
12536 Arguments:
12537 """"""""""
12538
12539 The argument to this intrinsic indicates which function to return the
12540 address for. Zero indicates the calling function, one indicates its
12541 caller, etc. The argument is **required** to be a constant integer
12542 value.
12543
12544 Semantics:
12545 """"""""""
12546
12547 The '``llvm.returnaddress``' intrinsic either returns a pointer
12548 indicating the return address of the specified call frame, or zero if it
12549 cannot be identified. The value returned by this intrinsic is likely to
12550 be incorrect or 0 for arguments other than zero, so it should only be
12551 used for debugging purposes.
12552
12553 Note that calling this intrinsic does not prevent function inlining or
12554 other aggressive transformations, so the value returned may not be that
12555 of the obvious source-language caller.
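
As a minimal sketch, assuming a hypothetical helper named
``@get_return_address`` (not part of LLVM), a call with a constant level of
zero could look like this:

.. code-block:: llvm

      declare i8* @llvm.returnaddress(i32)

      ; Hypothetical helper: asks for its own return address.
      define i8* @get_return_address() {
      entry:
        ; The level must be a constant integer; 0 selects the current function.
        %ra = call i8* @llvm.returnaddress(i32 0)
        ret i8* %ra
      }

Because the intrinsic does not prevent inlining, the value observed by a
caller of such a helper may differ from what the source-level call structure
suggests.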
12556
12557 '``llvm.addressofreturnaddress``' Intrinsic
12558 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12559
12560 Syntax:
12561 """""""
12562
12563 ::
12564
12565       declare i8* @llvm.addressofreturnaddress()
12566
12567 Overview:
12568 """""""""
12569
12570 The '``llvm.addressofreturnaddress``' intrinsic returns a target-specific
12571 pointer to the place in the stack frame where the return address of the
12572 current function is stored.
12573
12574 Semantics:
12575 """"""""""
12576
12577 Note that calling this intrinsic does not prevent function inlining or
12578 other aggressive transformations, so the value returned may not be that
12579 of the obvious source-language caller.
12580
12581 This intrinsic is only implemented for x86 and AArch64.
12582
12583 '``llvm.sponentry``' Intrinsic
12584 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12585
12586 Syntax:
12587 """""""
12588
12589 ::
12590
12591       declare i8* @llvm.sponentry()
12592
12593 Overview:
12594 """""""""
12595
12596 The '``llvm.sponentry``' intrinsic returns the stack pointer value at
12597 the entry of the current function calling this intrinsic.
12598
12599 Semantics:
12600 """"""""""
12601
12602 Note that this intrinsic has only been verified on AArch64.
12603
12604 '``llvm.frameaddress``' Intrinsic
12605 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12606
12607 Syntax:
12608 """""""
12609
12610 ::
12611
12612       declare i8* @llvm.frameaddress(i32 <level>)
12613
12614 Overview:
12615 """""""""
12616
12617 The '``llvm.frameaddress``' intrinsic attempts to return the
12618 target-specific frame pointer value for the specified stack frame.
12619
12620 Arguments:
12621 """"""""""
12622
12623 The argument to this intrinsic indicates which function to return the
12624 frame pointer for. Zero indicates the calling function, one indicates
12625 its caller, etc. The argument is **required** to be a constant integer
12626 value.
12627
12628 Semantics:
12629 """"""""""
12630
12631 The '``llvm.frameaddress``' intrinsic either returns a pointer
12632 indicating the frame address of the specified call frame, or zero if it
12633 cannot be identified. The value returned by this intrinsic is likely to
12634 be incorrect or 0 for arguments other than zero, so it should only be
12635 used for debugging purposes.
12636
12637 Note that calling this intrinsic does not prevent function inlining or
12638 other aggressive transformations, so the value returned may not be that
12639 of the obvious source-language caller.
12640
12641 '``llvm.swift.async.context.addr``' Intrinsic
12642 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12643
12644 Syntax:
12645 """""""
12646
12647 ::
12648
12649       declare i8** @llvm.swift.async.context.addr()
12650
12651 Overview:
12652 """""""""
12653
12654 The '``llvm.swift.async.context.addr``' intrinsic returns a pointer to
12655 the part of the extended frame record containing the asynchronous
12656 context of a Swift execution.
12657
12658 Semantics:
12659 """"""""""
12660
12661 If the caller has a ``swiftasync`` parameter, that argument will initially
12662 be stored at the returned address. If not, it will be initialized to null.
12663
12664 '``llvm.localescape``' and '``llvm.localrecover``' Intrinsics
12665 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12666
12667 Syntax:
12668 """""""
12669
12670 ::
12671
12672       declare void @llvm.localescape(...)
12673       declare i8* @llvm.localrecover(i8* %func, i8* %fp, i32 %idx)
12674
12675 Overview:
12676 """""""""
12677
12678 The '``llvm.localescape``' intrinsic escapes offsets of a collection of static
12679 allocas, and the '``llvm.localrecover``' intrinsic applies those offsets to a
12680 live frame pointer to recover the address of the allocation. The offset is
12681 computed during frame layout of the caller of ``llvm.localescape``.
12682
12683 Arguments:
12684 """"""""""
12685
12686 All arguments to '``llvm.localescape``' must be pointers to static allocas or
12687 casts of static allocas. Each function can only call '``llvm.localescape``'
12688 once, and it can only do so from the entry block.
12689
12690 The ``func`` argument to '``llvm.localrecover``' must be a constant
12691 bitcasted pointer to a function defined in the current module. The code
12692 generator cannot determine the frame allocation offset of functions defined in
12693 other modules.
12694
12695 The ``fp`` argument to '``llvm.localrecover``' must be a frame pointer of a
12696 call frame that is currently live. The return value of '``llvm.localaddress``'
12697 is one way to produce such a value, but various runtimes also expose a suitable
12698 pointer in platform-specific ways.
12699
12700 The ``idx`` argument to '``llvm.localrecover``' indicates which alloca passed to
12701 '``llvm.localescape``' to recover. It is zero-indexed.
12702
12703 Semantics:
12704 """"""""""
12705
12706 These intrinsics allow a group of functions to share access to a set of local
12707 stack allocations of one parent function. The parent function may call the
12708 '``llvm.localescape``' intrinsic once from the function entry block, and the
12709 child functions can use '``llvm.localrecover``' to access the escaped allocas.
12710 The '``llvm.localescape``' intrinsic blocks inlining, as inlining changes where
12711 the escaped allocas are allocated, which would break attempts to use
12712 '``llvm.localrecover``'.
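
The parent/child relationship described above can be sketched as follows.
The function names ``@parent`` and ``@child`` are hypothetical, and passing
the frame pointer as an explicit argument obtained from
``llvm.localaddress`` is only one possible arrangement:

.. code-block:: llvm

      declare void @llvm.localescape(...)
      declare i8* @llvm.localrecover(i8*, i8*, i32)
      declare i8* @llvm.localaddress()

      ; Hypothetical child: recovers the parent's alloca via the parent's frame.
      define void @child(i8* %parent_fp) {
      entry:
        ; Index 0 names the first alloca passed to llvm.localescape in @parent.
        %x.raw = call i8* @llvm.localrecover(
            i8* bitcast (void ()* @parent to i8*), i8* %parent_fp, i32 0)
        %x = bitcast i8* %x.raw to i32*
        store i32 42, i32* %x
        ret void
      }

      ; Hypothetical parent: escapes %x exactly once, from the entry block.
      define void @parent() {
      entry:
        %x = alloca i32
        call void (...) @llvm.localescape(i32* %x)
        %fp = call i8* @llvm.localaddress()
        call void @child(i8* %fp)
        ret void
      }

Whether this lowers successfully depends on the target; the sketch only
illustrates how the three arguments of ``llvm.localrecover`` relate to the
single ``llvm.localescape`` call.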
12713
12714 '``llvm.seh.try.begin``' and '``llvm.seh.try.end``' Intrinsics
12715 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12716
12717 Syntax:
12718 """""""
12719
12720 ::
12721
12722       declare void @llvm.seh.try.begin()
12723       declare void @llvm.seh.try.end()
12724
12725 Overview:
12726 """""""""
12727
12728 The '``llvm.seh.try.begin``' and '``llvm.seh.try.end``' intrinsics mark
12729 the boundary of a _try region for Windows SEH Asynchronous Exception Handling.
12730
12731 Semantics:
12732 """"""""""
12733
12734 When a C function is compiled with the Windows SEH Asynchronous Exception option,
12735 -feh_asynch (aka MSVC -EHa), these two intrinsics are injected to mark the _try
12736 boundary and to prevent potential exceptions from being moved across the boundary.
12737 Any set of operations can then be confined to the region by reading their leaf
12738 inputs via volatile loads and writing their root outputs via volatile stores.
12739
12740 '``llvm.seh.scope.begin``' and '``llvm.seh.scope.end``' Intrinsics
12741 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12742
12743 Syntax:
12744 """""""
12745
12746 ::
12747
12748       declare void @llvm.seh.scope.begin()
12749       declare void @llvm.seh.scope.end()
12750
12751 Overview:
12752 """""""""
12753
12754 The '``llvm.seh.scope.begin``' and '``llvm.seh.scope.end``' intrinsics mark
12755 the boundary of a C++ object lifetime for Windows SEH Asynchronous Exception
12756 Handling (MSVC option -EHa).
12757
12758 Semantics:
12759 """"""""""
12760
12761 LLVM's ordinary exception-handling representation associates EH cleanups and
12762 handlers only with ``invoke``\ s, which normally correspond only to call sites.  To
12763 support arbitrary faulting instructions, it must be possible to recover the current
12764 EH scope for any instruction.  Turning every operation in LLVM that could fault
12765 into an ``invoke`` of a new, potentially-throwing intrinsic would require adding a
12766 large number of intrinsics, impede optimization of those operations, and make
12767 compilation slower by introducing many extra basic blocks.  These intrinsics can
12768 be used instead to mark the region protected by a cleanup, such as for a local
12769 C++ object with a non-trivial destructor.  ``llvm.seh.scope.begin`` is used to mark
12770 the start of the region; it is always called with ``invoke``, with the unwind block
12771 being the desired unwind destination for any potentially-throwing instructions
12772 within the region.  ``llvm.seh.scope.end`` is used to mark when the scope ends
12773 and the EH cleanup is no longer required (e.g. because the destructor is being
12774 called).
12775
12776 .. _int_read_register:
12777 .. _int_read_volatile_register:
12778 .. _int_write_register:
12779
12780 '``llvm.read_register``', '``llvm.read_volatile_register``', and '``llvm.write_register``' Intrinsics
12781 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12782
12783 Syntax:
12784 """""""
12785
12786 ::
12787
12788       declare i32 @llvm.read_register.i32(metadata)
12789       declare i64 @llvm.read_register.i64(metadata)
12790       declare i32 @llvm.read_volatile_register.i32(metadata)
12791       declare i64 @llvm.read_volatile_register.i64(metadata)
12792       declare void @llvm.write_register.i32(metadata, i32 %value)
12793       declare void @llvm.write_register.i64(metadata, i64 %value)
12794       !0 = !{!"sp\00"}
12795
12796 Overview:
12797 """""""""
12798
12799 The '``llvm.read_register``', '``llvm.read_volatile_register``', and
12800 '``llvm.write_register``' intrinsics provide access to the named register.
12801 The register must be valid on the architecture being compiled to. The type
12802 needs to be compatible with the register being read.
12803
12804 Semantics:
12805 """"""""""
12806
12807 The '``llvm.read_register``' and '``llvm.read_volatile_register``' intrinsics
12808 return the current value of the register, where possible. The
12809 '``llvm.write_register``' intrinsic sets the current value of the register,
12810 where possible.
12811
12812 A call to '``llvm.read_volatile_register``' is assumed to have side-effects
12813 and possibly return a different value each time (e.g. for a timer register).
12814
12815 This is useful to implement named register global variables that need
12816 to always be mapped to a specific register, as is common practice on
12817 bare-metal programs including OS kernels.
12818
12819 The compiler doesn't check for register availability or for uses of the named
12820 register in surrounding code, including inline assembly. Because of that,
12821 allocatable registers are not supported.
12822
12823 Warning: So far it only works with the stack pointer on selected
12824 architectures (ARM, AArch64, PowerPC and x86_64). A significant amount of
12825 work is needed to support other registers and, even more so, allocatable
12826 registers.
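
A minimal sketch of reading the stack pointer, assuming a target on which
the register is named ``sp`` (as in the metadata shown in the syntax block)
and a hypothetical wrapper named ``@get_stack_pointer``:

.. code-block:: llvm

      declare i64 @llvm.read_register.i64(metadata)

      ; Hypothetical wrapper: returns the current value of the named register.
      define i64 @get_stack_pointer() {
      entry:
        %sp = call i64 @llvm.read_register.i64(metadata !0)
        ret i64 %sp
      }

      ; Register name; "sp" is an assumption and differs between targets.
      !0 = !{!"sp\00"}

On targets where the stack pointer has a different name, the string in the
metadata node would have to change accordingly.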
12827
12828 .. _int_stacksave:
12829
12830 '``llvm.stacksave``' Intrinsic
12831 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12832
12833 Syntax:
12834 """""""
12835
12836 ::
12837
12838       declare i8* @llvm.stacksave()
12839
12840 Overview:
12841 """""""""
12842
12843 The '``llvm.stacksave``' intrinsic is used to remember the current state
12844 of the function stack, for use with
12845 :ref:`llvm.stackrestore <int_stackrestore>`. This is useful for
12846 implementing language features like scoped automatic variable sized
12847 arrays in C99.
12848
12849 Semantics:
12850 """"""""""
12851
12852 This intrinsic returns an opaque pointer value that can be passed to
12853 :ref:`llvm.stackrestore <int_stackrestore>`. When an
12854 ``llvm.stackrestore`` intrinsic is executed with a value saved from
12855 ``llvm.stacksave``, it effectively restores the state of the stack to
12856 the state it was in when the ``llvm.stacksave`` intrinsic executed. In
12857 practice, this pops any :ref:`alloca <i_alloca>` blocks from the stack that
12858 were allocated after the ``llvm.stacksave`` was executed.
12859
12860 .. _int_stackrestore:
12861
12862 '``llvm.stackrestore``' Intrinsic
12863 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12864
12865 Syntax:
12866 """""""
12867
12868 ::
12869
12870       declare void @llvm.stackrestore(i8* %ptr)
12871
12872 Overview:
12873 """""""""
12874
12875 The '``llvm.stackrestore``' intrinsic is used to restore the state of
12876 the function stack to the state it was in when the corresponding
12877 :ref:`llvm.stacksave <int_stacksave>` intrinsic executed. This is
12878 useful for implementing language features like scoped automatic variable
12879 sized arrays in C99.
12880
12881 Semantics:
12882 """"""""""
12883
12884 See the description for :ref:`llvm.stacksave <int_stacksave>`.
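
The save/restore pairing can be sketched as follows for a scoped
variable-sized array. The names ``@scoped_vla`` and ``@use`` are
hypothetical and stand in for frontend-generated code:

.. code-block:: llvm

      declare i8* @llvm.stacksave()
      declare void @llvm.stackrestore(i8*)
      declare void @use(i32*)          ; hypothetical consumer of the array

      define void @scoped_vla(i32 %n) {
      entry:
        ; Remember the stack state before the dynamic allocation.
        %saved = call i8* @llvm.stacksave()
        %vla = alloca i32, i32 %n
        call void @use(i32* %vla)
        ; Pop %vla (and anything allocated after the save) off the stack.
        call void @llvm.stackrestore(i8* %saved)
        ret void
      }

After the ``llvm.stackrestore`` call, ``%vla`` must no longer be accessed,
since its storage has been released.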
12885
12886 .. _int_get_dynamic_area_offset:
12887
12888 '``llvm.get.dynamic.area.offset``' Intrinsic
12889 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12890
12891 Syntax:
12892 """""""
12893
12894 ::
12895
12896       declare i32 @llvm.get.dynamic.area.offset.i32()
12897       declare i64 @llvm.get.dynamic.area.offset.i64()
12898
12899 Overview:
12900 """""""""
12901
12902       The '``llvm.get.dynamic.area.offset.*``' intrinsic family is used to
12903       get the offset from the native stack pointer to the address of the most
12904       recent dynamic alloca on the caller's stack. These intrinsics are
12905       intended for use in combination with
12906       :ref:`llvm.stacksave <int_stacksave>` to get a
12907       pointer to the most recent dynamic alloca. This is useful, for example,
12908       for AddressSanitizer's stack unpoisoning routines.
12909
12910 Semantics:
12911 """"""""""
12912
12913       These intrinsics return a non-negative integer value that can be used to
12914       get the address of the most recent dynamic alloca, allocated by :ref:`alloca <i_alloca>`
12915       on the caller's stack. In particular, for targets where the stack grows downwards,
12916       adding this offset to the native stack pointer would get the address of the most
12917       recent dynamic alloca. For targets where the stack grows upwards, the situation is a bit more
12918       complicated, because subtracting this value from the stack pointer would get the address
12919       one past the end of the most recent dynamic alloca.
12920
12921       Although for most targets :ref:`llvm.get.dynamic.area.offset <int_get_dynamic_area_offset>`
12922       returns just a zero, for others, such as PowerPC and PowerPC64, it returns a
12923       compile-time-known constant value.
12924
12925       The return value type of :ref:`llvm.get.dynamic.area.offset <int_get_dynamic_area_offset>`
12926       must match the target's default address space's (address space 0) pointer type.
12927
12928 '``llvm.prefetch``' Intrinsic
12929 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12930
12931 Syntax:
12932 """""""
12933
12934 ::
12935
12936       declare void @llvm.prefetch(i8* <address>, i32 <rw>, i32 <locality>, i32 <cache type>)
12937
12938 Overview:
12939 """""""""
12940
12941 The '``llvm.prefetch``' intrinsic is a hint to the code generator to
12942 insert a prefetch instruction if supported; otherwise, it is a noop.
12943 Prefetches have no effect on the behavior of the program but can change
12944 its performance characteristics.
12945
12946 Arguments:
12947 """"""""""
12948
12949 ``address`` is the address to be prefetched, ``rw`` is the specifier
12950 determining if the fetch should be for a read (0) or write (1), and
12951 ``locality`` is a temporal locality specifier ranging from (0), no
12952 locality, to (3), extremely local (keep in cache). The ``cache type``
12953 specifies whether the prefetch is performed on the data (1) or
12954 instruction (0) cache. The ``rw``, ``locality`` and ``cache type``
12955 arguments must be constant integers.
12956
12957 Semantics:
12958 """"""""""
12959
12960 This intrinsic does not modify the behavior of the program. In
12961 particular, prefetches cannot trap and do not produce a value. On
12962 targets that support this intrinsic, the prefetch can provide hints to
12963 the processor cache for better performance.
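
As an illustration of the argument encoding, the following sketch requests a
read prefetch into the data cache with maximal locality. The function name
``@warm`` is hypothetical, and the declaration simply copies the form shown
in the syntax block above; some LLVM versions may print the intrinsic name
with a pointer-type suffix:

.. code-block:: llvm

      declare void @llvm.prefetch(i8*, i32, i32, i32)

      ; Hypothetical helper: hints that the memory at %p will be read soon.
      define void @warm(i8* %p) {
      entry:
        ; rw = 0 (read), locality = 3 (keep in cache), cache type = 1 (data)
        call void @llvm.prefetch(i8* %p, i32 0, i32 3, i32 1)
        ret void
      }

Removing the call would not change the program's results, only (possibly)
its performance.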
12964
12965 '``llvm.pcmarker``' Intrinsic
12966 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12967
12968 Syntax:
12969 """""""
12970
12971 ::
12972
12973       declare void @llvm.pcmarker(i32 <id>)
12974
12975 Overview:
12976 """""""""
12977
12978 The '``llvm.pcmarker``' intrinsic is a method to export a Program
12979 Counter (PC) in a region of code to simulators and other tools. The
12980 method is target specific, but it is expected that the marker will use
12981 exported symbols to transmit the PC of the marker. The marker makes no
12982 guarantees that it will remain with any specific instruction after
12983 optimizations. It is possible that the presence of a marker will inhibit
12984 optimizations. The intended use is to be inserted after optimizations to
12985 allow correlations of simulation runs.
12986
12987 Arguments:
12988 """"""""""
12989
12990 ``id`` is a numerical id identifying the marker.
12991
12992 Semantics:
12993 """"""""""
12994
12995 This intrinsic does not modify the behavior of the program. Backends
12996 that do not support this intrinsic may ignore it.
12997
12998 '``llvm.readcyclecounter``' Intrinsic
12999 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13000
13001 Syntax:
13002 """""""
13003
13004 ::
13005
13006       declare i64 @llvm.readcyclecounter()
13007
13008 Overview:
13009 """""""""
13010
13011 The '``llvm.readcyclecounter``' intrinsic provides access to the cycle
13012 counter register (or similar low latency, high accuracy clocks) on those
13013 targets that support it. On X86, it should map to RDTSC. On Alpha, it
13014 should map to RPCC. As the backing counters overflow quickly (on the
13015 order of 9 seconds on Alpha), this should only be used for small
13016 timings.
13017
13018 Semantics:
13019 """"""""""
13020
13021 When directly supported, reading the cycle counter should not modify any
13022 memory. Implementations are allowed to either return an application
13023 specific value or a system wide value. On backends without support, this
13024 is lowered to a constant 0.
13025
13026 Note that runtime support may be conditional on the privilege level the code is
13027 running at and on the host platform.
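
A small timing sketch, assuming a hypothetical function ``@work`` whose
duration is to be measured:

.. code-block:: llvm

      declare i64 @llvm.readcyclecounter()
      declare void @work()             ; hypothetical code under measurement

      define i64 @time_work() {
      entry:
        %start = call i64 @llvm.readcyclecounter()
        call void @work()
        %end = call i64 @llvm.readcyclecounter()
        ; Elapsed count; meaningful only if the counter did not wrap and the
        ; backend supports the intrinsic (otherwise both reads are 0).
        %elapsed = sub i64 %end, %start
        ret i64 %elapsed
      }

The difference is in whatever unit the underlying counter uses, which is
target-specific.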
13028
13029 '``llvm.clear_cache``' Intrinsic
13030 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13031
13032 Syntax:
13033 """""""
13034
13035 ::
13036
13037       declare void @llvm.clear_cache(i8*, i8*)
13038
13039 Overview:
13040 """""""""
13041
13042 The '``llvm.clear_cache``' intrinsic ensures visibility of modifications
13043 in the specified range to the execution unit of the processor. On
13044 targets with non-unified instruction and data cache, the implementation
13045 flushes the instruction cache.
13046
13047 Semantics:
13048 """"""""""
13049
13050 On platforms with coherent instruction and data caches (e.g. x86), this
13051 intrinsic is a nop. On platforms with non-coherent instruction and data
13052 cache (e.g. ARM, MIPS), the intrinsic is lowered either to appropriate
13053 instructions or a system call, if cache flushing requires special
13054 privileges.
13055
13056 The default behavior is to emit a call to ``__clear_cache`` from the run
13057 time library.
13058
13059 This intrinsic does *not* empty the instruction pipeline. Modifications
13060 of the current function are outside the scope of the intrinsic.
13061
13062 '``llvm.instrprof.increment``' Intrinsic
13063 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13064
13065 Syntax:
13066 """""""
13067
13068 ::
13069
13070       declare void @llvm.instrprof.increment(i8* <name>, i64 <hash>,
13071                                              i32 <num-counters>, i32 <index>)
13072
13073 Overview:
13074 """""""""
13075
13076 The '``llvm.instrprof.increment``' intrinsic can be emitted by a
13077 frontend for use with instrumentation based profiling. These will be
13078 lowered by the ``-instrprof`` pass to generate execution counts of a
13079 program at runtime.
13080
13081 Arguments:
13082 """"""""""
13083
13084 The first argument is a pointer to a global variable containing the
13085 name of the entity being instrumented. This should generally be the
13086 (mangled) function name for a set of counters.
13087
13088 The second argument is a hash value that can be used by the consumer
13089 of the profile data to detect changes to the instrumented source, and
13090 the third is the number of counters associated with ``name``. It is an
13091 error if ``hash`` or ``num-counters`` differ between two instances of
13092 ``instrprof.increment`` that refer to the same name.
13093
13094 The last argument refers to which of the counters for ``name`` should
13095 be incremented. It should be at least 0 and less than ``num-counters``.
13096
13097 Semantics:
13098 """"""""""
13099
13100 This intrinsic represents an increment of a profiling counter. It will
13101 cause the ``-instrprof`` pass to generate the appropriate data
13102 structures and the code to increment the appropriate value, in a
13103 format that can be written out by a compiler runtime and consumed via
13104 the ``llvm-profdata`` tool.
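
A minimal sketch of what a frontend might emit for a function with a single
counter. The function ``@foo``, the name variable ``@__profn_foo``, and the
hash value 0 are illustrative assumptions:

.. code-block:: llvm

      ; Name of the instrumented entity, referenced by the first argument.
      @__profn_foo = private constant [3 x i8] c"foo"

      declare void @llvm.instrprof.increment(i8*, i64, i32, i32)

      define void @foo() {
      entry:
        ; hash = 0 (placeholder), num-counters = 1, index = 0
        call void @llvm.instrprof.increment(
            i8* getelementptr inbounds ([3 x i8], [3 x i8]* @__profn_foo, i32 0, i32 0),
            i64 0, i32 1, i32 0)
        ret void
      }

Every other ``llvm.instrprof.increment`` call that refers to
``@__profn_foo`` would need to pass the same ``hash`` and ``num-counters``
values.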
13105
13106 '``llvm.instrprof.increment.step``' Intrinsic
13107 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13108
13109 Syntax:
13110 """""""
13111
13112 ::
13113
13114       declare void @llvm.instrprof.increment.step(i8* <name>, i64 <hash>,
13115                                                   i32 <num-counters>,
13116                                                   i32 <index>, i64 <step>)
13117
13118 Overview:
13119 """""""""
13120
13121 The '``llvm.instrprof.increment.step``' intrinsic is an extension to
13122 the '``llvm.instrprof.increment``' intrinsic with an additional fifth
13123 argument to specify the step of the increment.
13124
13125 Arguments:
13126 """"""""""
13127 The first four arguments are the same as those of the
13128 '``llvm.instrprof.increment``' intrinsic.
13129
13130 The last argument specifies the value of the increment of the counter variable.
13131
13132 Semantics:
13133 """"""""""
13134 See the description of the '``llvm.instrprof.increment``' intrinsic.
13135
13136
13137 '``llvm.instrprof.value.profile``' Intrinsic
13138 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13139
13140 Syntax:
13141 """""""
13142
13143 ::
13144
13145       declare void @llvm.instrprof.value.profile(i8* <name>, i64 <hash>,
13146                                                  i64 <value>, i32 <value_kind>,
13147                                                  i32 <index>)
13148
13149 Overview:
13150 """""""""
13151
The '``llvm.instrprof.value.profile``' intrinsic can be emitted by a
frontend for use with instrumentation based profiling. It is lowered by
the ``-instrprof`` pass into code that records the target values that
instrumented expressions take at run time.
13156
13157 Arguments:
13158 """"""""""
13159
13160 The first argument is a pointer to a global variable containing the
13161 name of the entity being instrumented. ``name`` should generally be the
13162 (mangled) function name for a set of counters.
13163
13164 The second argument is a hash value that can be used by the consumer
13165 of the profile data to detect changes to the instrumented source. It
13166 is an error if ``hash`` differs between two instances of
13167 ``llvm.instrprof.*`` that refer to the same name.
13168
The third argument is the value of the expression being profiled. The profiled
expression's value should be representable as an unsigned 64-bit value.

The fourth argument represents the kind of value profiling that is being done.
The supported value profiling kinds are enumerated through the
``InstrProfValueKind`` type declared in the
``<include/llvm/ProfileData/InstrProf.h>`` header file.

The last argument is the index of the instrumented expression within ``name``.
It should be >= 0.
13176
13177 Semantics:
13178 """"""""""
13179
This intrinsic represents the point where a call to a runtime routine
should be inserted for value profiling of target expressions. The
``-instrprof`` pass generates the appropriate data structures and replaces
the ``llvm.instrprof.value.profile`` intrinsic with a call to the profile
runtime library with the proper arguments.
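
The following sketch shows what a frontend might emit before an indirect
call. The value kind ``0`` is assumed here to denote indirect call target
profiling; consult ``InstrProfValueKind`` for the actual enumerators. The
global ``@__profn_caller``, the hash ``12345``, and the function ``@caller``
are hypothetical.

.. code-block:: llvm

      ; Hypothetical name variable for the instrumented function.
      @__profn_caller = private constant [6 x i8] c"caller"

      declare void @llvm.instrprof.value.profile(i8*, i64, i64, i32, i32)

      define void @caller(void ()* %fp) {
      entry:
        ; Record the callee address as the profiled value for site 0.
        %target = ptrtoint void ()* %fp to i64
        call void @llvm.instrprof.value.profile(
            i8* getelementptr inbounds ([6 x i8], [6 x i8]* @__profn_caller, i32 0, i32 0),
            i64 12345, i64 %target, i32 0, i32 0)
        call void %fp()
        ret void
      }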
13185
13186 '``llvm.thread.pointer``' Intrinsic
13187 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13188
13189 Syntax:
13190 """""""
13191
13192 ::
13193
13194       declare i8* @llvm.thread.pointer()
13195
13196 Overview:
13197 """""""""
13198
13199 The '``llvm.thread.pointer``' intrinsic returns the value of the thread
13200 pointer.
13201
13202 Semantics:
13203 """"""""""
13204
13205 The '``llvm.thread.pointer``' intrinsic returns a pointer to the TLS area
13206 for the current thread.  The exact semantics of this value are target
specific: it may point to the start of the TLS area, to the end, or somewhere
13208 in the middle.  Depending on the target, this intrinsic may read a register,
13209 call a helper function, read from an alternate memory space, or perform
13210 other operations necessary to locate the TLS area.  Not all targets support
13211 this intrinsic.
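
As an illustration only, the sketch below reads a pointer-sized slot at a
fixed offset from the thread pointer. The offset of 16 bytes and the
function name ``@read_tls_slot`` are hypothetical; the real layout is
target and ABI specific.

.. code-block:: llvm

      declare i8* @llvm.thread.pointer()

      define i8* @read_tls_slot() {
      entry:
        %tp = call i8* @llvm.thread.pointer()
        ; Hypothetical layout: an i8* stored 16 bytes past the thread pointer.
        %addr = getelementptr i8, i8* %tp, i64 16
        %slot = bitcast i8* %addr to i8**
        %val = load i8*, i8** %slot
        ret i8* %val
      }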
13212
13213 '``llvm.call.preallocated.setup``' Intrinsic
13214 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13215
13216 Syntax:
13217 """""""
13218
13219 ::
13220
13221       declare token @llvm.call.preallocated.setup(i32 %num_args)
13222
13223 Overview:
13224 """""""""
13225
13226 The '``llvm.call.preallocated.setup``' intrinsic returns a token which can
13227 be used with a call's ``"preallocated"`` operand bundle to indicate that
13228 certain arguments are allocated and initialized before the call.
13229
13230 Semantics:
13231 """"""""""
13232
The '``llvm.call.preallocated.setup``' intrinsic returns a token which is
associated with at most one call. The token can be passed to
'``@llvm.call.preallocated.arg``' to get a pointer to the corresponding
argument. The token must be the parameter to a ``"preallocated"`` operand
bundle for the corresponding call.
13238
13239 Nested calls to '``llvm.call.preallocated.setup``' are allowed, but must
be properly nested. For example,

.. code-block:: llvm

      %t1 = call token @llvm.call.preallocated.setup(i32 0)
      %t2 = call token @llvm.call.preallocated.setup(i32 0)
      call void @foo() ["preallocated"(token %t2)]
      call void @foo() ["preallocated"(token %t1)]
13248
13249 is allowed, but not
13250
.. code-block:: llvm

      %t1 = call token @llvm.call.preallocated.setup(i32 0)
      %t2 = call token @llvm.call.preallocated.setup(i32 0)
      call void @foo() ["preallocated"(token %t1)]
      call void @foo() ["preallocated"(token %t2)]
13257
13258 .. _int_call_preallocated_arg:
13259
13260 '``llvm.call.preallocated.arg``' Intrinsic
13261 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13262
13263 Syntax:
13264 """""""
13265
13266 ::
13267
13268       declare i8* @llvm.call.preallocated.arg(token %setup_token, i32 %arg_index)
13269
13270 Overview:
13271 """""""""
13272
13273 The '``llvm.call.preallocated.arg``' intrinsic returns a pointer to the
13274 corresponding preallocated argument for the preallocated call.
13275
13276 Semantics:
13277 """"""""""
13278
The '``llvm.call.preallocated.arg``' intrinsic returns a pointer to the
``%arg_index``-th argument with the ``preallocated`` attribute for
the call associated with the ``%setup_token``, which must be from
'``llvm.call.preallocated.setup``'.
13283
13284 A call to '``llvm.call.preallocated.arg``' must have a call site
13285 ``preallocated`` attribute. The type of the ``preallocated`` attribute must
13286 match the type used by the ``preallocated`` attribute of the corresponding
13287 argument at the preallocated call. The type is used in the case that an
13288 ``llvm.call.preallocated.setup`` does not have a corresponding call (e.g. due
13289 to DCE), where otherwise we cannot know how large the arguments are.
13290
It is undefined behavior to call this intrinsic with a token from an
'``llvm.call.preallocated.setup``' if another
'``llvm.call.preallocated.setup``' has already been called or if the
preallocated call corresponding to that '``llvm.call.preallocated.setup``'
has already been called.
13296
13297 .. _int_call_preallocated_teardown:
13298
13299 '``llvm.call.preallocated.teardown``' Intrinsic
13300 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13301
13302 Syntax:
13303 """""""
13304
13305 ::
13306
      declare void @llvm.call.preallocated.teardown(token %setup_token)
13308
13309 Overview:
13310 """""""""
13311
13312 The '``llvm.call.preallocated.teardown``' intrinsic cleans up the stack
13313 created by a '``llvm.call.preallocated.setup``'.
13314
13315 Semantics:
13316 """"""""""
13317
The token argument must be the result of a call to
'``llvm.call.preallocated.setup``'.
13319
The '``llvm.call.preallocated.teardown``' intrinsic cleans up the stack
allocated by the corresponding '``llvm.call.preallocated.setup``'. Exactly
one of this intrinsic or the preallocated call must be called to prevent
stack leaks. It is undefined behavior to call both a
'``llvm.call.preallocated.teardown``' and the preallocated call for a given
'``llvm.call.preallocated.setup``'.
13325
13326 For example, if the stack is allocated for a preallocated call by a
13327 '``llvm.call.preallocated.setup``', then an initializer function called on an
13328 allocated argument throws an exception, there should be a
13329 '``llvm.call.preallocated.teardown``' in the exception handler to prevent
13330 stack leaks.
13331
13332 Following the nesting rules in '``llvm.call.preallocated.setup``', nested
13333 calls to '``llvm.call.preallocated.setup``' and
13334 '``llvm.call.preallocated.teardown``' are allowed but must be properly
13335 nested.
13336
13337 Example:
13338 """"""""
13339
13340 .. code-block:: llvm
13341
13342         %cs = call token @llvm.call.preallocated.setup(i32 1)
13343         %x = call i8* @llvm.call.preallocated.arg(token %cs, i32 0) preallocated(i32)
13344         %y = bitcast i8* %x to i32*
13345         invoke void @constructor(i32* %y) to label %conta unwind label %contb
13346     conta:
13347         call void @foo1(i32* preallocated(i32) %y) ["preallocated"(token %cs)]
13348         ret void
13349     contb:
13350         %s = catchswitch within none [label %catch] unwind to caller
13351     catch:
13352         %p = catchpad within %s []
13353         call void @llvm.call.preallocated.teardown(token %cs)
13354         ret void
13355
13356 Standard C/C++ Library Intrinsics
13357 ---------------------------------
13358
13359 LLVM provides intrinsics for a few important standard C/C++ library
13360 functions. These intrinsics allow source-language front-ends to pass
13361 information about the alignment of the pointer arguments to the code
13362 generator, providing opportunity for more efficient code generation.
13363
13364
13365 '``llvm.abs.*``' Intrinsic
13366 ^^^^^^^^^^^^^^^^^^^^^^^^^^
13367
13368 Syntax:
13369 """""""
13370
13371 This is an overloaded intrinsic. You can use ``llvm.abs`` on any
13372 integer bit width or any vector of integer elements.
13373
13374 ::
13375
13376       declare i32 @llvm.abs.i32(i32 <src>, i1 <is_int_min_poison>)
13377       declare <4 x i32> @llvm.abs.v4i32(<4 x i32> <src>, i1 <is_int_min_poison>)
13378
13379 Overview:
13380 """""""""
13381
13382 The '``llvm.abs``' family of intrinsic functions returns the absolute value
13383 of an argument.
13384
13385 Arguments:
13386 """"""""""
13387
13388 The first argument is the value for which the absolute value is to be returned.
13389 This argument may be of any integer type or a vector with integer element type.
13390 The return type must match the first argument type.
13391
13392 The second argument must be a constant and is a flag to indicate whether the
13393 result value of the '``llvm.abs``' intrinsic is a
13394 :ref:`poison value <poisonvalues>` if the argument is statically or dynamically
13395 an ``INT_MIN`` value.
13396
13397 Semantics:
13398 """"""""""
13399
The '``llvm.abs``' intrinsic returns the magnitude (a non-negative value,
except as noted below for ``INT_MIN``) of the argument or of each element of a
vector argument. If the argument is ``INT_MIN``, then the result is also
``INT_MIN`` if ``is_int_min_poison == 0`` and ``poison`` otherwise.
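
The effect of the flag can be sketched as follows. The function names
``@abs_scalar`` and ``@abs_vector`` are hypothetical and exist only for
illustration.

.. code-block:: llvm

      declare i32 @llvm.abs.i32(i32, i1)
      declare <4 x i32> @llvm.abs.v4i32(<4 x i32>, i1)

      define i32 @abs_scalar(i32 %x) {
        ; The flag is false, so an INT_MIN input yields INT_MIN.
        %r = call i32 @llvm.abs.i32(i32 %x, i1 false)
        ret i32 %r
      }

      define <4 x i32> @abs_vector(<4 x i32> %v) {
        ; The flag is true, so any INT_MIN element yields poison.
        %r = call <4 x i32> @llvm.abs.v4i32(<4 x i32> %v, i1 true)
        ret <4 x i32> %r
      }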
13404
13405
13406 '``llvm.smax.*``' Intrinsic
13407 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
13408
13409 Syntax:
13410 """""""
13411
13412 This is an overloaded intrinsic. You can use ``@llvm.smax`` on any
13413 integer bit width or any vector of integer elements.
13414
13415 ::
13416
13417       declare i32 @llvm.smax.i32(i32 %a, i32 %b)
13418       declare <4 x i32> @llvm.smax.v4i32(<4 x i32> %a, <4 x i32> %b)
13419
13420 Overview:
13421 """""""""
13422
13423 Return the larger of ``%a`` and ``%b`` comparing the values as signed integers.
13424 Vector intrinsics operate on a per-element basis. The larger element of ``%a``
13425 and ``%b`` at a given index is returned for that index.
13426
13427 Arguments:
13428 """"""""""
13429
13430 The arguments (``%a`` and ``%b``) may be of any integer type or a vector with
13431 integer element type. The argument types must match each other, and the return
13432 type must match the argument type.
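
A short sketch of scalar and vector use follows. The function names are
hypothetical.

.. code-block:: llvm

      declare i8 @llvm.smax.i8(i8, i8)
      declare <4 x i32> @llvm.smax.v4i32(<4 x i32>, <4 x i32>)

      define i8 @smax_scalar() {
        ; As signed values, -1 is less than 1, so the result is 1.
        %r = call i8 @llvm.smax.i8(i8 -1, i8 1)
        ret i8 %r
      }

      define <4 x i32> @clamp_below_at_zero(<4 x i32> %v) {
        ; Each negative element is replaced by 0.
        %r = call <4 x i32> @llvm.smax.v4i32(<4 x i32> %v, <4 x i32> zeroinitializer)
        ret <4 x i32> %r
      }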
13433
13434
13435 '``llvm.smin.*``' Intrinsic
13436 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
13437
13438 Syntax:
13439 """""""
13440
13441 This is an overloaded intrinsic. You can use ``@llvm.smin`` on any
13442 integer bit width or any vector of integer elements.
13443
13444 ::
13445
13446       declare i32 @llvm.smin.i32(i32 %a, i32 %b)
13447       declare <4 x i32> @llvm.smin.v4i32(<4 x i32> %a, <4 x i32> %b)
13448
13449 Overview:
13450 """""""""
13451
13452 Return the smaller of ``%a`` and ``%b`` comparing the values as signed integers.
13453 Vector intrinsics operate on a per-element basis. The smaller element of ``%a``
13454 and ``%b`` at a given index is returned for that index.
13455
13456 Arguments:
13457 """"""""""
13458
13459 The arguments (``%a`` and ``%b``) may be of any integer type or a vector with
13460 integer element type. The argument types must match each other, and the return
13461 type must match the argument type.
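
One common use, sketched below, pairs ``llvm.smin`` with ``llvm.smax`` to
clamp a value to a signed range. The function name ``@clamp_to_i8_range``
and the bounds are chosen for illustration.

.. code-block:: llvm

      declare i32 @llvm.smin.i32(i32, i32)
      declare i32 @llvm.smax.i32(i32, i32)

      define i32 @clamp_to_i8_range(i32 %x) {
        ; Raise values below -128 to -128, then lower values above 127 to 127.
        %lo = call i32 @llvm.smax.i32(i32 %x, i32 -128)
        %r = call i32 @llvm.smin.i32(i32 %lo, i32 127)
        ret i32 %r
      }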
13462
13463
13464 '``llvm.umax.*``' Intrinsic
13465 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
13466
13467 Syntax:
13468 """""""
13469
13470 This is an overloaded intrinsic. You can use ``@llvm.umax`` on any
13471 integer bit width or any vector of integer elements.
13472
13473 ::
13474
13475       declare i32 @llvm.umax.i32(i32 %a, i32 %b)
13476       declare <4 x i32> @llvm.umax.v4i32(<4 x i32> %a, <4 x i32> %b)
13477
13478 Overview:
13479 """""""""
13480
13481 Return the larger of ``%a`` and ``%b`` comparing the values as unsigned
13482 integers. Vector intrinsics operate on a per-element basis. The larger element
13483 of ``%a`` and ``%b`` at a given index is returned for that index.
13484
13485 Arguments:
13486 """"""""""
13487
13488 The arguments (``%a`` and ``%b``) may be of any integer type or a vector with
13489 integer element type. The argument types must match each other, and the return
13490 type must match the argument type.
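
The sketch below uses the same operands as a signed comparison of ``-1`` and
``1`` would, to show how the unsigned interpretation changes the result. The
function name is hypothetical.

.. code-block:: llvm

      declare i8 @llvm.umax.i8(i8, i8)

      define i8 @umax_scalar() {
        ; The bit pattern of i8 -1 is 255 when read as unsigned, so it is
        ; the larger operand and the result is i8 -1.
        %r = call i8 @llvm.umax.i8(i8 -1, i8 1)
        ret i8 %r
      }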
13491
13492
13493 '``llvm.umin.*``' Intrinsic
13494 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
13495
13496 Syntax:
13497 """""""
13498
13499 This is an overloaded intrinsic. You can use ``@llvm.umin`` on any
13500 integer bit width or any vector of integer elements.
13501
13502 ::
13503
13504       declare i32 @llvm.umin.i32(i32 %a, i32 %b)
13505       declare <4 x i32> @llvm.umin.v4i32(<4 x i32> %a, <4 x i32> %b)
13506
13507 Overview:
13508 """""""""
13509
13510 Return the smaller of ``%a`` and ``%b`` comparing the values as unsigned
13511 integers. Vector intrinsics operate on a per-element basis. The smaller element
13512 of ``%a`` and ``%b`` at a given index is returned for that index.
13513
13514 Arguments:
13515 """"""""""
13516
13517 The arguments (``%a`` and ``%b``) may be of any integer type or a vector with
13518 integer element type. The argument types must match each other, and the return
13519 type must match the argument type.
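
As a small illustration, the following limits a requested length to an
available capacity. The function and parameter names are hypothetical.

.. code-block:: llvm

      declare i64 @llvm.umin.i64(i64, i64)

      define i64 @bounded_len(i64 %requested, i64 %capacity) {
        ; Never return more than %capacity; both are treated as unsigned.
        %n = call i64 @llvm.umin.i64(i64 %requested, i64 %capacity)
        ret i64 %n
      }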
13520
13521
13522 .. _int_memcpy:
13523
13524 '``llvm.memcpy``' Intrinsic
13525 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
13526
13527 Syntax:
13528 """""""
13529
13530 This is an overloaded intrinsic. You can use ``llvm.memcpy`` on any
13531 integer bit width and for different address spaces. Not all targets
13532 support all bit widths however.
13533
13534 ::
13535
13536       declare void @llvm.memcpy.p0i8.p0i8.i32(i8* <dest>, i8* <src>,
13537                                               i32 <len>, i1 <isvolatile>)
13538       declare void @llvm.memcpy.p0i8.p0i8.i64(i8* <dest>, i8* <src>,
13539                                               i64 <len>, i1 <isvolatile>)
13540
13541 Overview:
13542 """""""""
13543
13544 The '``llvm.memcpy.*``' intrinsics copy a block of memory from the
13545 source location to the destination location.
13546
Note that, unlike the standard libc function, the ``llvm.memcpy.*``
intrinsics do not return a value, take an extra ``isvolatile``
argument, and take pointers that can be in specified address spaces.
13550
13551 Arguments:
13552 """"""""""
13553
13554 The first argument is a pointer to the destination, the second is a
13555 pointer to the source. The third argument is an integer argument
13556 specifying the number of bytes to copy, and the fourth is a
13557 boolean indicating a volatile access.
13558
13559 The :ref:`align <attr_align>` parameter attribute can be provided
13560 for the first and second arguments.
13561
13562 If the ``isvolatile`` parameter is ``true``, the ``llvm.memcpy`` call is
13563 a :ref:`volatile operation <volatile>`. The detailed access behavior is not
13564 very cleanly specified and it is unwise to depend on it.
13565
13566 Semantics:
13567 """"""""""
13568
13569 The '``llvm.memcpy.*``' intrinsics copy a block of memory from the source
13570 location to the destination location, which must either be equal or
13571 non-overlapping. It copies "len" bytes of memory over. If the argument is known
13572 to be aligned to some boundary, this can be specified as an attribute on the
13573 argument.
13574
If ``<len>`` is 0, it is a no-op modulo the behavior of attributes attached to
the arguments.
13577 If ``<len>`` is not a well-defined value, the behavior is undefined.
13578 If ``<len>`` is not zero, both ``<dest>`` and ``<src>`` should be well-defined,
13579 otherwise the behavior is undefined.
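
A minimal sketch of a non-volatile copy with alignment attributes follows.
The type ``%struct.Point`` and the function ``@copy_point`` are hypothetical.

.. code-block:: llvm

      declare void @llvm.memcpy.p0i8.p0i8.i64(i8*, i8*, i64, i1)

      %struct.Point = type { i32, i32, i32, i32 }

      define void @copy_point(%struct.Point* %dst, %struct.Point* %src) {
        %d = bitcast %struct.Point* %dst to i8*
        %s = bitcast %struct.Point* %src to i8*
        ; Copy 16 bytes; both pointers are stated to be 4-byte aligned.
        call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 4 %d, i8* align 4 %s,
                                             i64 16, i1 false)
        ret void
      }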
13580
13581 .. _int_memcpy_inline:
13582
13583 '``llvm.memcpy.inline``' Intrinsic
13584 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13585
13586 Syntax:
13587 """""""
13588
13589 This is an overloaded intrinsic. You can use ``llvm.memcpy.inline`` on any
13590 integer bit width and for different address spaces. Not all targets
13591 support all bit widths however.
13592
13593 ::
13594
13595       declare void @llvm.memcpy.inline.p0i8.p0i8.i32(i8* <dest>, i8* <src>,
13596                                                      i32 <len>, i1 <isvolatile>)
13597       declare void @llvm.memcpy.inline.p0i8.p0i8.i64(i8* <dest>, i8* <src>,
13598                                                      i64 <len>, i1 <isvolatile>)
13599
13600 Overview:
13601 """""""""
13602
The '``llvm.memcpy.inline.*``' intrinsics copy a block of memory from the
source location to the destination location and guarantee that no external
functions are called.
13606
Note that, unlike the standard libc function, the ``llvm.memcpy.inline.*``
intrinsics do not return a value, take an extra ``isvolatile``
argument, and take pointers that can be in specified address spaces.
13610
13611 Arguments:
13612 """"""""""
13613
13614 The first argument is a pointer to the destination, the second is a
13615 pointer to the source. The third argument is a constant integer argument
13616 specifying the number of bytes to copy, and the fourth is a
13617 boolean indicating a volatile access.
13618
13619 The :ref:`align <attr_align>` parameter attribute can be provided
13620 for the first and second arguments.
13621
13622 If the ``isvolatile`` parameter is ``true``, the ``llvm.memcpy.inline`` call is
13623 a :ref:`volatile operation <volatile>`. The detailed access behavior is not
13624 very cleanly specified and it is unwise to depend on it.
13625
13626 Semantics:
13627 """"""""""
13628
13629 The '``llvm.memcpy.inline.*``' intrinsics copy a block of memory from the
13630 source location to the destination location, which are not allowed to
13631 overlap. It copies "len" bytes of memory over. If the argument is known
13632 to be aligned to some boundary, this can be specified as an attribute on
the argument.

The behavior of '``llvm.memcpy.inline.*``' is equivalent to the behavior of
13635 '``llvm.memcpy.*``', but the generated code is guaranteed not to call any
13636 external functions.
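
For illustration, a sketch of a fixed-size copy follows. The length is a
constant, as the Arguments section requires. The function name ``@copy8``
is hypothetical.

.. code-block:: llvm

      declare void @llvm.memcpy.inline.p0i8.p0i8.i64(i8*, i8*, i64, i1)

      define void @copy8(i8* %dst, i8* %src) {
        ; Copy exactly 8 bytes without emitting a call to an external function.
        call void @llvm.memcpy.inline.p0i8.p0i8.i64(i8* align 1 %dst,
                                                    i8* align 1 %src,
                                                    i64 8, i1 false)
        ret void
      }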
13637
13638 .. _int_memmove:
13639
13640 '``llvm.memmove``' Intrinsic
13641 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13642
13643 Syntax:
13644 """""""
13645
This is an overloaded intrinsic. You can use ``llvm.memmove`` on any integer
bit width and for different address spaces. Not all targets support all
bit widths however.
13649
13650 ::
13651
13652       declare void @llvm.memmove.p0i8.p0i8.i32(i8* <dest>, i8* <src>,
13653                                                i32 <len>, i1 <isvolatile>)
13654       declare void @llvm.memmove.p0i8.p0i8.i64(i8* <dest>, i8* <src>,
13655                                                i64 <len>, i1 <isvolatile>)
13656
13657 Overview:
13658 """""""""
13659
13660 The '``llvm.memmove.*``' intrinsics move a block of memory from the
13661 source location to the destination location. It is similar to the
13662 '``llvm.memcpy``' intrinsic but allows the two memory locations to
13663 overlap.
13664
Note that, unlike the standard libc function, the ``llvm.memmove.*``
intrinsics do not return a value, take an extra ``isvolatile``
argument, and take pointers that can be in specified address spaces.
13668
13669 Arguments:
13670 """"""""""
13671
13672 The first argument is a pointer to the destination, the second is a
13673 pointer to the source. The third argument is an integer argument
13674 specifying the number of bytes to copy, and the fourth is a
13675 boolean indicating a volatile access.
13676
13677 The :ref:`align <attr_align>` parameter attribute can be provided
13678 for the first and second arguments.
13679
13680 If the ``isvolatile`` parameter is ``true``, the ``llvm.memmove`` call
13681 is a :ref:`volatile operation <volatile>`. The detailed access behavior is
13682 not very cleanly specified and it is unwise to depend on it.
13683
13684 Semantics:
13685 """"""""""
13686
13687 The '``llvm.memmove.*``' intrinsics copy a block of memory from the
13688 source location to the destination location, which may overlap. It
13689 copies "len" bytes of memory over. If the argument is known to be
13690 aligned to some boundary, this can be specified as an attribute on
13691 the argument.
13692
If ``<len>`` is 0, it is a no-op modulo the behavior of attributes attached to
the arguments.
13695 If ``<len>`` is not a well-defined value, the behavior is undefined.
13696 If ``<len>`` is not zero, both ``<dest>`` and ``<src>`` should be well-defined,
13697 otherwise the behavior is undefined.
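
The overlapping case can be sketched as below, assuming ``%buf`` points to
at least 16 accessible bytes. The function name ``@shift_left`` is
hypothetical.

.. code-block:: llvm

      declare void @llvm.memmove.p0i8.p0i8.i64(i8*, i8*, i64, i1)

      define void @shift_left(i8* %buf) {
        ; Move bytes 1..15 of the buffer down to offsets 0..14. The source
        ; and destination ranges overlap, which memmove permits.
        %src = getelementptr inbounds i8, i8* %buf, i64 1
        call void @llvm.memmove.p0i8.p0i8.i64(i8* %buf, i8* %src,
                                              i64 15, i1 false)
        ret void
      }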
13698
13699 .. _int_memset:
13700
13701 '``llvm.memset.*``' Intrinsics
13702 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13703
13704 Syntax:
13705 """""""
13706
This is an overloaded intrinsic. You can use ``llvm.memset`` on any integer
bit width and for different address spaces. However, not all targets
support all bit widths.
13710
13711 ::
13712
13713       declare void @llvm.memset.p0i8.i32(i8* <dest>, i8 <val>,
13714                                          i32 <len>, i1 <isvolatile>)
13715       declare void @llvm.memset.p0i8.i64(i8* <dest>, i8 <val>,
13716                                          i64 <len>, i1 <isvolatile>)
13717
13718 Overview:
13719 """""""""
13720
13721 The '``llvm.memset.*``' intrinsics fill a block of memory with a
13722 particular byte value.
13723
Note that, unlike the standard libc function, the ``llvm.memset``
intrinsic does not return a value and takes an extra ``isvolatile``
argument. Also, the destination can be in an arbitrary address space.
13727
13728 Arguments:
13729 """"""""""
13730
13731 The first argument is a pointer to the destination to fill, the second
13732 is the byte value with which to fill it, the third argument is an
13733 integer argument specifying the number of bytes to fill, and the fourth
13734 is a boolean indicating a volatile access.
13735
The :ref:`align <attr_align>` parameter attribute can be provided
for the first argument.
13738
13739 If the ``isvolatile`` parameter is ``true``, the ``llvm.memset`` call is
13740 a :ref:`volatile operation <volatile>`. The detailed access behavior is not
13741 very cleanly specified and it is unwise to depend on it.
13742
13743 Semantics:
13744 """"""""""
13745
13746 The '``llvm.memset.*``' intrinsics fill "len" bytes of memory starting
13747 at the destination location. If the argument is known to be
13748 aligned to some boundary, this can be specified as an attribute on
13749 the argument.
13750
If ``<len>`` is 0, it is a no-op modulo the behavior of attributes attached to
the arguments.
If ``<len>`` is not a well-defined value, the behavior is undefined.
If ``<len>`` is not zero, ``<dest>`` should be well-defined, otherwise the
behavior is undefined.
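
A minimal sketch of zero-filling a buffer follows, assuming ``%buf`` points
to at least 32 accessible bytes with 8-byte alignment. The function name
``@zero32`` is hypothetical.

.. code-block:: llvm

      declare void @llvm.memset.p0i8.i64(i8*, i8, i64, i1)

      define void @zero32(i8* %buf) {
        ; Store the byte value 0 into 32 consecutive bytes, not volatile.
        call void @llvm.memset.p0i8.i64(i8* align 8 %buf, i8 0,
                                        i64 32, i1 false)
        ret void
      }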
13756
13757 '``llvm.sqrt.*``' Intrinsic
13758 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
13759
13760 Syntax:
13761 """""""
13762
13763 This is an overloaded intrinsic. You can use ``llvm.sqrt`` on any
13764 floating-point or vector of floating-point type. Not all targets support
13765 all types however.
13766
13767 ::
13768
13769       declare float     @llvm.sqrt.f32(float %Val)
13770       declare double    @llvm.sqrt.f64(double %Val)
13771       declare x86_fp80  @llvm.sqrt.f80(x86_fp80 %Val)
13772       declare fp128     @llvm.sqrt.f128(fp128 %Val)
13773       declare ppc_fp128 @llvm.sqrt.ppcf128(ppc_fp128 %Val)
13774
13775 Overview:
13776 """""""""
13777
13778 The '``llvm.sqrt``' intrinsics return the square root of the specified value.
13779
13780 Arguments:
13781 """"""""""
13782
13783 The argument and return value are floating-point numbers of the same type.
13784
13785 Semantics:
13786 """"""""""
13787
13788 Return the same value as a corresponding libm '``sqrt``' function but without
13789 trapping or setting ``errno``. For types specified by IEEE-754, the result
13790 matches a conforming libm implementation.
13791
13792 When specified with the fast-math-flag 'afn', the result may be approximated
13793 using a less accurate calculation.
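
The following sketch shows a plain call and a call carrying the ``afn``
flag. The function names are hypothetical.

.. code-block:: llvm

      declare float @llvm.sqrt.f32(float)

      define float @length(float %x, float %y) {
        ; sqrt(x*x + y*y)
        %xx = fmul float %x, %x
        %yy = fmul float %y, %y
        %sum = fadd float %xx, %yy
        %len = call float @llvm.sqrt.f32(float %sum)
        ret float %len
      }

      define float @approx_sqrt(float %v) {
        ; 'afn' permits a less accurate approximation of the result.
        %r = call afn float @llvm.sqrt.f32(float %v)
        ret float %r
      }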
13794
13795 '``llvm.powi.*``' Intrinsic
13796 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
13797
13798 Syntax:
13799 """""""
13800
13801 This is an overloaded intrinsic. You can use ``llvm.powi`` on any
13802 floating-point or vector of floating-point type. Not all targets support
13803 all types however.
13804
Generally, the only supported type for the exponent is the one matching
the C type ``int``.
13807
13808 ::
13809
13810       declare float     @llvm.powi.f32.i32(float  %Val, i32 %power)
13811       declare double    @llvm.powi.f64.i16(double %Val, i16 %power)
13812       declare x86_fp80  @llvm.powi.f80.i32(x86_fp80  %Val, i32 %power)
13813       declare fp128     @llvm.powi.f128.i32(fp128 %Val, i32 %power)
13814       declare ppc_fp128 @llvm.powi.ppcf128.i32(ppc_fp128  %Val, i32 %power)
13815
13816 Overview:
13817 """""""""
13818
13819 The '``llvm.powi.*``' intrinsics return the first operand raised to the
13820 specified (positive or negative) power. The order of evaluation of
13821 multiplications is not defined. When a vector of floating-point type is
13822 used, the second argument remains a scalar integer value.
13823
13824 Arguments:
13825 """"""""""
13826
13827 The second argument is an integer power, and the first is a value to
13828 raise to that power.
13829
13830 Semantics:
13831 """"""""""
13832
13833 This function returns the first value raised to the second power with an
13834 unspecified sequence of rounding operations.
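
As a sketch, assuming a target where the C type ``int`` is 32 bits wide,
scalar and vector uses might look as follows. The function names are
hypothetical.

.. code-block:: llvm

      declare double @llvm.powi.f64.i32(double, i32)
      declare <4 x float> @llvm.powi.v4f32.i32(<4 x float>, i32)

      define double @cube(double %x) {
        ; x raised to the third power.
        %r = call double @llvm.powi.f64.i32(double %x, i32 3)
        ret double %r
      }

      define <4 x float> @inverse_square(<4 x float> %v) {
        ; The scalar exponent -2 is applied to every element of %v.
        %r = call <4 x float> @llvm.powi.v4f32.i32(<4 x float> %v, i32 -2)
        ret <4 x float> %r
      }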
13835
13836 '``llvm.sin.*``' Intrinsic
13837 ^^^^^^^^^^^^^^^^^^^^^^^^^^
13838
13839 Syntax:
13840 """""""
13841
13842 This is an overloaded intrinsic. You can use ``llvm.sin`` on any
13843 floating-point or vector of floating-point type. Not all targets support
13844 all types however.
13845
13846 ::
13847
13848       declare float     @llvm.sin.f32(float  %Val)
13849       declare double    @llvm.sin.f64(double %Val)
13850       declare x86_fp80  @llvm.sin.f80(x86_fp80  %Val)
13851       declare fp128     @llvm.sin.f128(fp128 %Val)
13852       declare ppc_fp128 @llvm.sin.ppcf128(ppc_fp128  %Val)
13853
13854 Overview:
13855 """""""""
13856
13857 The '``llvm.sin.*``' intrinsics return the sine of the operand.
13858
13859 Arguments:
13860 """"""""""
13861
13862 The argument and return value are floating-point numbers of the same type.
13863
13864 Semantics:
13865 """"""""""
13866
13867 Return the same value as a corresponding libm '``sin``' function but without
13868 trapping or setting ``errno``.
13869
13870 When specified with the fast-math-flag 'afn', the result may be approximated
13871 using a less accurate calculation.
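
Scalar and vector overloads can be used as sketched here. The function
names are hypothetical.

.. code-block:: llvm

      declare float @llvm.sin.f32(float)
      declare <4 x float> @llvm.sin.v4f32(<4 x float>)

      define float @sin_scalar(float %x) {
        %r = call float @llvm.sin.f32(float %x)
        ret float %r
      }

      define <4 x float> @sin_vector(<4 x float> %v) {
        ; The sine is computed independently for each element.
        %r = call <4 x float> @llvm.sin.v4f32(<4 x float> %v)
        ret <4 x float> %r
      }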
13872
13873 '``llvm.cos.*``' Intrinsic
13874 ^^^^^^^^^^^^^^^^^^^^^^^^^^
13875
13876 Syntax:
13877 """""""
13878
13879 This is an overloaded intrinsic. You can use ``llvm.cos`` on any
13880 floating-point or vector of floating-point type. Not all targets support
13881 all types however.
13882
13883 ::
13884
13885       declare float     @llvm.cos.f32(float  %Val)
13886       declare double    @llvm.cos.f64(double %Val)
13887       declare x86_fp80  @llvm.cos.f80(x86_fp80  %Val)
13888       declare fp128     @llvm.cos.f128(fp128 %Val)
13889       declare ppc_fp128 @llvm.cos.ppcf128(ppc_fp128  %Val)
13890
13891 Overview:
13892 """""""""
13893
13894 The '``llvm.cos.*``' intrinsics return the cosine of the operand.
13895
13896 Arguments:
13897 """"""""""
13898
13899 The argument and return value are floating-point numbers of the same type.
13900
13901 Semantics:
13902 """"""""""
13903
13904 Return the same value as a corresponding libm '``cos``' function but without
13905 trapping or setting ``errno``.
13906
13907 When specified with the fast-math-flag 'afn', the result may be approximated
13908 using a less accurate calculation.
13909
13910 '``llvm.pow.*``' Intrinsic
13911 ^^^^^^^^^^^^^^^^^^^^^^^^^^
13912
13913 Syntax:
13914 """""""
13915
13916 This is an overloaded intrinsic. You can use ``llvm.pow`` on any
13917 floating-point or vector of floating-point type. Not all targets support
13918 all types however.
13919
13920 ::
13921
13922       declare float     @llvm.pow.f32(float  %Val, float %Power)
13923       declare double    @llvm.pow.f64(double %Val, double %Power)
13924       declare x86_fp80  @llvm.pow.f80(x86_fp80  %Val, x86_fp80 %Power)
13925       declare fp128     @llvm.pow.f128(fp128 %Val, fp128 %Power)
      declare ppc_fp128 @llvm.pow.ppcf128(ppc_fp128  %Val, ppc_fp128 %Power)
13927
13928 Overview:
13929 """""""""
13930
13931 The '``llvm.pow.*``' intrinsics return the first operand raised to the
13932 specified (positive or negative) power.
13933
13934 Arguments:
13935 """"""""""
13936
13937 The arguments and return value are floating-point numbers of the same type.
13938
13939 Semantics:
13940 """"""""""
13941
13942 Return the same value as a corresponding libm '``pow``' function but without
13943 trapping or setting ``errno``.
13944
13945 When specified with the fast-math-flag 'afn', the result may be approximated
13946 using a less accurate calculation.
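
A brief sketch with a constant exponent follows. The function name
``@pow_examples`` and the exponents are chosen only for illustration.

.. code-block:: llvm

      declare double @llvm.pow.f64(double, double)

      define double @pow_examples(double %x) {
        ; x raised to the power 2.5.
        %a = call double @llvm.pow.f64(double %x, double 2.500000e+00)
        ; A negative power; 'afn' permits an approximate result.
        %b = call afn double @llvm.pow.f64(double %a, double -1.000000e+00)
        ret double %b
      }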
13947
13948 '``llvm.exp.*``' Intrinsic
13949 ^^^^^^^^^^^^^^^^^^^^^^^^^^
13950
13951 Syntax:
13952 """""""
13953
13954 This is an overloaded intrinsic. You can use ``llvm.exp`` on any
13955 floating-point or vector of floating-point type. Not all targets support
13956 all types however.
13957
13958 ::
13959
13960       declare float     @llvm.exp.f32(float  %Val)
13961       declare double    @llvm.exp.f64(double %Val)
13962       declare x86_fp80  @llvm.exp.f80(x86_fp80  %Val)
13963       declare fp128     @llvm.exp.f128(fp128 %Val)
13964       declare ppc_fp128 @llvm.exp.ppcf128(ppc_fp128  %Val)
13965
13966 Overview:
13967 """""""""
13968
13969 The '``llvm.exp.*``' intrinsics compute the base-e exponential of the specified
13970 value.
13971
13972 Arguments:
13973 """"""""""
13974
13975 The argument and return value are floating-point numbers of the same type.
13976
13977 Semantics:
13978 """"""""""
13979
13980 Return the same value as a corresponding libm '``exp``' function but without
13981 trapping or setting ``errno``.
13982
13983 When specified with the fast-math-flag 'afn', the result may be approximated
13984 using a less accurate calculation.
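
For example, an exponential decay term could be sketched as below. The
function name ``@decay`` is hypothetical.

.. code-block:: llvm

      declare float @llvm.exp.f32(float)

      define float @decay(float %t) {
        ; Computes e raised to the power -t.
        %neg = fneg float %t
        %r = call float @llvm.exp.f32(float %neg)
        ret float %r
      }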
13985
13986 '``llvm.exp2.*``' Intrinsic
13987 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
13988
13989 Syntax:
13990 """""""
13991
13992 This is an overloaded intrinsic. You can use ``llvm.exp2`` on any
13993 floating-point or vector of floating-point type. Not all targets support
13994 all types however.
13995
13996 ::
13997
13998       declare float     @llvm.exp2.f32(float  %Val)
13999       declare double    @llvm.exp2.f64(double %Val)
14000       declare x86_fp80  @llvm.exp2.f80(x86_fp80  %Val)
14001       declare fp128     @llvm.exp2.f128(fp128 %Val)
14002       declare ppc_fp128 @llvm.exp2.ppcf128(ppc_fp128  %Val)
14003
14004 Overview:
14005 """""""""
14006
14007 The '``llvm.exp2.*``' intrinsics compute the base-2 exponential of the
14008 specified value.
14009
14010 Arguments:
14011 """"""""""
14012
14013 The argument and return value are floating-point numbers of the same type.
14014
14015 Semantics:
14016 """"""""""
14017
14018 Return the same value as a corresponding libm '``exp2``' function but without
14019 trapping or setting ``errno``.
14020
14021 When specified with the fast-math-flag 'afn', the result may be approximated
14022 using a less accurate calculation.
14023
14024 '``llvm.log.*``' Intrinsic
14025 ^^^^^^^^^^^^^^^^^^^^^^^^^^
14026
14027 Syntax:
14028 """""""
14029
14030 This is an overloaded intrinsic. You can use ``llvm.log`` on any
14031 floating-point or vector of floating-point type. Not all targets support
14032 all types however.
14033
14034 ::
14035
14036       declare float     @llvm.log.f32(float  %Val)
14037       declare double    @llvm.log.f64(double %Val)
14038       declare x86_fp80  @llvm.log.f80(x86_fp80  %Val)
14039       declare fp128     @llvm.log.f128(fp128 %Val)
14040       declare ppc_fp128 @llvm.log.ppcf128(ppc_fp128  %Val)
14041
14042 Overview:
14043 """""""""
14044
14045 The '``llvm.log.*``' intrinsics compute the base-e logarithm of the specified
14046 value.
14047
14048 Arguments:
14049 """"""""""
14050
14051 The argument and return value are floating-point numbers of the same type.
14052
14053 Semantics:
14054 """"""""""
14055
14056 Return the same value as a corresponding libm '``log``' function but without
14057 trapping or setting ``errno``.
14058
14059 When specified with the fast-math-flag 'afn', the result may be approximated
14060 using a less accurate calculation.
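
Example:
""""""""

The overload also covers vectors of floating-point values. The sketch
below assumes a ``<4 x float>`` operand, for which the mangled suffix is
``v4f32``; the function name ``@log_vec_example`` is hypothetical.

.. code-block:: llvm

      declare <4 x float> @llvm.log.v4f32(<4 x float>)

      define <4 x float> @log_vec_example(<4 x float> %v) {
        ; Computes the base-e logarithm of each element of %v.
        %r = call <4 x float> @llvm.log.v4f32(<4 x float> %v)
        ret <4 x float> %r
      }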
14061
14062 '``llvm.log10.*``' Intrinsic
14063 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14064
14065 Syntax:
14066 """""""
14067
14068 This is an overloaded intrinsic. You can use ``llvm.log10`` on any
14069 floating-point or vector of floating-point type. Not all targets support
14070 all types however.
14071
14072 ::
14073
14074       declare float     @llvm.log10.f32(float  %Val)
14075       declare double    @llvm.log10.f64(double %Val)
14076       declare x86_fp80  @llvm.log10.f80(x86_fp80  %Val)
14077       declare fp128     @llvm.log10.f128(fp128 %Val)
14078       declare ppc_fp128 @llvm.log10.ppcf128(ppc_fp128  %Val)
14079
14080 Overview:
14081 """""""""
14082
14083 The '``llvm.log10.*``' intrinsics compute the base-10 logarithm of the
14084 specified value.
14085
14086 Arguments:
14087 """"""""""
14088
14089 The argument and return value are floating-point numbers of the same type.
14090
14091 Semantics:
14092 """"""""""
14093
14094 Return the same value as a corresponding libm '``log10``' function but without
14095 trapping or setting ``errno``.
14096
14097 When specified with the fast-math-flag 'afn', the result may be approximated
14098 using a less accurate calculation.
14099
14100 '``llvm.log2.*``' Intrinsic
14101 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
14102
14103 Syntax:
14104 """""""
14105
14106 This is an overloaded intrinsic. You can use ``llvm.log2`` on any
14107 floating-point or vector of floating-point type. Not all targets support
14108 all types however.
14109
14110 ::
14111
14112       declare float     @llvm.log2.f32(float  %Val)
14113       declare double    @llvm.log2.f64(double %Val)
14114       declare x86_fp80  @llvm.log2.f80(x86_fp80  %Val)
14115       declare fp128     @llvm.log2.f128(fp128 %Val)
14116       declare ppc_fp128 @llvm.log2.ppcf128(ppc_fp128  %Val)
14117
14118 Overview:
14119 """""""""
14120
14121 The '``llvm.log2.*``' intrinsics compute the base-2 logarithm of the specified
14122 value.
14123
14124 Arguments:
14125 """"""""""
14126
14127 The argument and return value are floating-point numbers of the same type.
14128
14129 Semantics:
14130 """"""""""
14131
14132 Return the same value as a corresponding libm '``log2``' function but without
14133 trapping or setting ``errno``.
14134
14135 When specified with the fast-math-flag 'afn', the result may be approximated
14136 using a less accurate calculation.
14137
14138 .. _int_fma:
14139
14140 '``llvm.fma.*``' Intrinsic
14141 ^^^^^^^^^^^^^^^^^^^^^^^^^^
14142
14143 Syntax:
14144 """""""
14145
14146 This is an overloaded intrinsic. You can use ``llvm.fma`` on any
14147 floating-point or vector of floating-point type. Not all targets support
14148 all types however.
14149
14150 ::
14151
14152       declare float     @llvm.fma.f32(float  %a, float  %b, float  %c)
14153       declare double    @llvm.fma.f64(double %a, double %b, double %c)
14154       declare x86_fp80  @llvm.fma.f80(x86_fp80 %a, x86_fp80 %b, x86_fp80 %c)
14155       declare fp128     @llvm.fma.f128(fp128 %a, fp128 %b, fp128 %c)
14156       declare ppc_fp128 @llvm.fma.ppcf128(ppc_fp128 %a, ppc_fp128 %b, ppc_fp128 %c)
14157
14158 Overview:
14159 """""""""
14160
14161 The '``llvm.fma.*``' intrinsics perform the fused multiply-add operation.
14162
14163 Arguments:
14164 """"""""""
14165
14166 The arguments and return value are floating-point numbers of the same type.
14167
14168 Semantics:
14169 """"""""""
14170
14171 Return the same value as a corresponding libm '``fma``' function but without
14172 trapping or setting ``errno``.
14173
14174 When specified with the fast-math-flag 'afn', the result may be approximated
14175 using a less accurate calculation.
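
Example:
""""""""

A minimal sketch of a fused multiply-add on ``double`` operands. The
function name ``@fma_example`` is hypothetical.

.. code-block:: llvm

      declare double @llvm.fma.f64(double, double, double)

      define double @fma_example(double %a, double %b, double %c) {
        ; Computes (%a * %b) + %c as the libm fma function would, that is,
        ; with a single rounding of the final result.
        %r = call double @llvm.fma.f64(double %a, double %b, double %c)
        ret double %r
      }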
14176
14177 '``llvm.fabs.*``' Intrinsic
14178 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
14179
14180 Syntax:
14181 """""""
14182
14183 This is an overloaded intrinsic. You can use ``llvm.fabs`` on any
14184 floating-point or vector of floating-point type. Not all targets support
14185 all types however.
14186
14187 ::
14188
14189       declare float     @llvm.fabs.f32(float  %Val)
14190       declare double    @llvm.fabs.f64(double %Val)
14191       declare x86_fp80  @llvm.fabs.f80(x86_fp80 %Val)
14192       declare fp128     @llvm.fabs.f128(fp128 %Val)
14193       declare ppc_fp128 @llvm.fabs.ppcf128(ppc_fp128 %Val)
14194
14195 Overview:
14196 """""""""
14197
14198 The '``llvm.fabs.*``' intrinsics return the absolute value of the
14199 operand.
14200
14201 Arguments:
14202 """"""""""
14203
14204 The argument and return value are floating-point numbers of the same
14205 type.
14206
14207 Semantics:
14208 """"""""""
14209
14210 This function returns the same values as the libm ``fabs`` functions
14211 would, and handles error conditions in the same way.
14212
14213 '``llvm.minnum.*``' Intrinsic
14214 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14215
14216 Syntax:
14217 """""""
14218
14219 This is an overloaded intrinsic. You can use ``llvm.minnum`` on any
14220 floating-point or vector of floating-point type. Not all targets support
14221 all types however.
14222
14223 ::
14224
14225       declare float     @llvm.minnum.f32(float %Val0, float %Val1)
14226       declare double    @llvm.minnum.f64(double %Val0, double %Val1)
14227       declare x86_fp80  @llvm.minnum.f80(x86_fp80 %Val0, x86_fp80 %Val1)
14228       declare fp128     @llvm.minnum.f128(fp128 %Val0, fp128 %Val1)
14229       declare ppc_fp128 @llvm.minnum.ppcf128(ppc_fp128 %Val0, ppc_fp128 %Val1)
14230
14231 Overview:
14232 """""""""
14233
14234 The '``llvm.minnum.*``' intrinsics return the minimum of the two
14235 arguments.
14236
14237
14238 Arguments:
14239 """"""""""
14240
14241 The arguments and return value are floating-point numbers of the same
14242 type.
14243
14244 Semantics:
14245 """"""""""
14246
Follows the IEEE-754 semantics for minNum, except for handling of
signaling NaNs. This matches the behavior of libm's fmin.
14249
14250 If either operand is a NaN, returns the other non-NaN operand. Returns
14251 NaN only if both operands are NaN. The returned NaN is always
14252 quiet. If the operands compare equal, returns a value that compares
14253 equal to both operands. This means that fmin(+/-0.0, +/-0.0) could
14254 return either -0.0 or 0.0.
14255
14256 Unlike the IEEE-754 2008 behavior, this does not distinguish between
14257 signaling and quiet NaN inputs. If a target's implementation follows
14258 the standard and returns a quiet NaN if either input is a signaling
14259 NaN, the intrinsic lowering is responsible for quieting the inputs to
14260 correctly return the non-NaN input (e.g. by using the equivalent of
14261 ``llvm.canonicalize``).
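
Example:
""""""""

The NaN handling described above can be sketched as follows. The
function name ``@minnum_example`` is hypothetical, and the constant
``0x7FF8000000000000`` is used here as a quiet NaN.

.. code-block:: llvm

      declare float @llvm.minnum.f32(float, float)

      define float @minnum_example(float %x) {
        ; One operand is a NaN, so the other operand is returned:
        ; %r is %x when %x is not a NaN, and a quiet NaN otherwise.
        %r = call float @llvm.minnum.f32(float %x, float 0x7FF8000000000000)
        ret float %r
      }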
14262
14263
14264 '``llvm.maxnum.*``' Intrinsic
14265 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14266
14267 Syntax:
14268 """""""
14269
14270 This is an overloaded intrinsic. You can use ``llvm.maxnum`` on any
14271 floating-point or vector of floating-point type. Not all targets support
14272 all types however.
14273
14274 ::
14275
14276       declare float     @llvm.maxnum.f32(float  %Val0, float  %Val1)
14277       declare double    @llvm.maxnum.f64(double %Val0, double %Val1)
14278       declare x86_fp80  @llvm.maxnum.f80(x86_fp80  %Val0, x86_fp80  %Val1)
14279       declare fp128     @llvm.maxnum.f128(fp128 %Val0, fp128 %Val1)
14280       declare ppc_fp128 @llvm.maxnum.ppcf128(ppc_fp128  %Val0, ppc_fp128  %Val1)
14281
14282 Overview:
14283 """""""""
14284
14285 The '``llvm.maxnum.*``' intrinsics return the maximum of the two
14286 arguments.
14287
14288
14289 Arguments:
14290 """"""""""
14291
14292 The arguments and return value are floating-point numbers of the same
14293 type.
14294
Semantics:
""""""""""

Follows the IEEE-754 semantics for maxNum except for the handling of
14298 signaling NaNs. This matches the behavior of libm's fmax.
14299
14300 If either operand is a NaN, returns the other non-NaN operand. Returns
14301 NaN only if both operands are NaN. The returned NaN is always
14302 quiet. If the operands compare equal, returns a value that compares
14303 equal to both operands. This means that fmax(+/-0.0, +/-0.0) could
14304 return either -0.0 or 0.0.
14305
14306 Unlike the IEEE-754 2008 behavior, this does not distinguish between
14307 signaling and quiet NaN inputs. If a target's implementation follows
14308 the standard and returns a quiet NaN if either input is a signaling
14309 NaN, the intrinsic lowering is responsible for quieting the inputs to
14310 correctly return the non-NaN input (e.g. by using the equivalent of
14311 ``llvm.canonicalize``).
14312
14313 '``llvm.minimum.*``' Intrinsic
14314 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14315
14316 Syntax:
14317 """""""
14318
14319 This is an overloaded intrinsic. You can use ``llvm.minimum`` on any
14320 floating-point or vector of floating-point type. Not all targets support
14321 all types however.
14322
14323 ::
14324
14325       declare float     @llvm.minimum.f32(float %Val0, float %Val1)
14326       declare double    @llvm.minimum.f64(double %Val0, double %Val1)
14327       declare x86_fp80  @llvm.minimum.f80(x86_fp80 %Val0, x86_fp80 %Val1)
14328       declare fp128     @llvm.minimum.f128(fp128 %Val0, fp128 %Val1)
14329       declare ppc_fp128 @llvm.minimum.ppcf128(ppc_fp128 %Val0, ppc_fp128 %Val1)
14330
14331 Overview:
14332 """""""""
14333
14334 The '``llvm.minimum.*``' intrinsics return the minimum of the two
14335 arguments, propagating NaNs and treating -0.0 as less than +0.0.
14336
14337
14338 Arguments:
14339 """"""""""
14340
14341 The arguments and return value are floating-point numbers of the same
14342 type.
14343
Semantics:
""""""""""

If either operand is a NaN, returns NaN. Otherwise returns the lesser
14347 of the two arguments. -0.0 is considered to be less than +0.0 for this
14348 intrinsic. Note that these are the semantics specified in the draft of
14349 IEEE 754-2018.
14350
14351 '``llvm.maximum.*``' Intrinsic
14352 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14353
14354 Syntax:
14355 """""""
14356
14357 This is an overloaded intrinsic. You can use ``llvm.maximum`` on any
14358 floating-point or vector of floating-point type. Not all targets support
14359 all types however.
14360
14361 ::
14362
14363       declare float     @llvm.maximum.f32(float %Val0, float %Val1)
14364       declare double    @llvm.maximum.f64(double %Val0, double %Val1)
14365       declare x86_fp80  @llvm.maximum.f80(x86_fp80 %Val0, x86_fp80 %Val1)
14366       declare fp128     @llvm.maximum.f128(fp128 %Val0, fp128 %Val1)
14367       declare ppc_fp128 @llvm.maximum.ppcf128(ppc_fp128 %Val0, ppc_fp128 %Val1)
14368
14369 Overview:
14370 """""""""
14371
14372 The '``llvm.maximum.*``' intrinsics return the maximum of the two
14373 arguments, propagating NaNs and treating -0.0 as less than +0.0.
14374
14375
14376 Arguments:
14377 """"""""""
14378
14379 The arguments and return value are floating-point numbers of the same
14380 type.
14381
Semantics:
""""""""""

If either operand is a NaN, returns NaN. Otherwise returns the greater
14385 of the two arguments. -0.0 is considered to be less than +0.0 for this
14386 intrinsic. Note that these are the semantics specified in the draft of
14387 IEEE 754-2018.
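
Example:
""""""""

A short sketch of the two cases that distinguish this intrinsic from
``llvm.maxnum``: signed zeros and NaN operands. The function names are
hypothetical, and ``0x7FF8000000000000`` is used as a quiet NaN.

.. code-block:: llvm

      declare float @llvm.maximum.f32(float, float)

      define float @maximum_zero_example() {
        ; -0.0 is treated as less than +0.0, so the result is +0.0.
        %r = call float @llvm.maximum.f32(float -0.0, float 0.0)
        ret float %r
      }

      define float @maximum_nan_example(float %x) {
        ; A NaN operand is propagated: the result is a NaN whatever %x is.
        %r = call float @llvm.maximum.f32(float %x, float 0x7FF8000000000000)
        ret float %r
      }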
14388
14389 '``llvm.copysign.*``' Intrinsic
14390 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14391
14392 Syntax:
14393 """""""
14394
14395 This is an overloaded intrinsic. You can use ``llvm.copysign`` on any
14396 floating-point or vector of floating-point type. Not all targets support
14397 all types however.
14398
14399 ::
14400
14401       declare float     @llvm.copysign.f32(float  %Mag, float  %Sgn)
14402       declare double    @llvm.copysign.f64(double %Mag, double %Sgn)
14403       declare x86_fp80  @llvm.copysign.f80(x86_fp80  %Mag, x86_fp80  %Sgn)
14404       declare fp128     @llvm.copysign.f128(fp128 %Mag, fp128 %Sgn)
14405       declare ppc_fp128 @llvm.copysign.ppcf128(ppc_fp128  %Mag, ppc_fp128  %Sgn)
14406
14407 Overview:
14408 """""""""
14409
14410 The '``llvm.copysign.*``' intrinsics return a value with the magnitude of the
14411 first operand and the sign of the second operand.
14412
14413 Arguments:
14414 """"""""""
14415
14416 The arguments and return value are floating-point numbers of the same
14417 type.
14418
14419 Semantics:
14420 """"""""""
14421
14422 This function returns the same values as the libm ``copysign``
14423 functions would, and handles error conditions in the same way.
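
Example:
""""""""

A minimal sketch that forces a negative sign onto a value. The function
name ``@copysign_example`` is hypothetical.

.. code-block:: llvm

      declare float @llvm.copysign.f32(float, float)

      define float @copysign_example(float %mag) {
        ; The result has the magnitude of %mag and the sign of -1.0,
        ; so an input of 3.0 gives -3.0.
        %r = call float @llvm.copysign.f32(float %mag, float -1.0)
        ret float %r
      }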
14424
14425 '``llvm.floor.*``' Intrinsic
14426 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14427
14428 Syntax:
14429 """""""
14430
14431 This is an overloaded intrinsic. You can use ``llvm.floor`` on any
14432 floating-point or vector of floating-point type. Not all targets support
14433 all types however.
14434
14435 ::
14436
14437       declare float     @llvm.floor.f32(float  %Val)
14438       declare double    @llvm.floor.f64(double %Val)
14439       declare x86_fp80  @llvm.floor.f80(x86_fp80  %Val)
14440       declare fp128     @llvm.floor.f128(fp128 %Val)
14441       declare ppc_fp128 @llvm.floor.ppcf128(ppc_fp128  %Val)
14442
14443 Overview:
14444 """""""""
14445
14446 The '``llvm.floor.*``' intrinsics return the floor of the operand.
14447
14448 Arguments:
14449 """"""""""
14450
14451 The argument and return value are floating-point numbers of the same
14452 type.
14453
14454 Semantics:
14455 """"""""""
14456
14457 This function returns the same values as the libm ``floor`` functions
14458 would, and handles error conditions in the same way.
14459
14460 '``llvm.ceil.*``' Intrinsic
14461 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
14462
14463 Syntax:
14464 """""""
14465
14466 This is an overloaded intrinsic. You can use ``llvm.ceil`` on any
14467 floating-point or vector of floating-point type. Not all targets support
14468 all types however.
14469
14470 ::
14471
14472       declare float     @llvm.ceil.f32(float  %Val)
14473       declare double    @llvm.ceil.f64(double %Val)
14474       declare x86_fp80  @llvm.ceil.f80(x86_fp80  %Val)
14475       declare fp128     @llvm.ceil.f128(fp128 %Val)
14476       declare ppc_fp128 @llvm.ceil.ppcf128(ppc_fp128  %Val)
14477
14478 Overview:
14479 """""""""
14480
14481 The '``llvm.ceil.*``' intrinsics return the ceiling of the operand.
14482
14483 Arguments:
14484 """"""""""
14485
14486 The argument and return value are floating-point numbers of the same
14487 type.
14488
14489 Semantics:
14490 """"""""""
14491
14492 This function returns the same values as the libm ``ceil`` functions
14493 would, and handles error conditions in the same way.
14494
14495 '``llvm.trunc.*``' Intrinsic
14496 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14497
14498 Syntax:
14499 """""""
14500
14501 This is an overloaded intrinsic. You can use ``llvm.trunc`` on any
14502 floating-point or vector of floating-point type. Not all targets support
14503 all types however.
14504
14505 ::
14506
14507       declare float     @llvm.trunc.f32(float  %Val)
14508       declare double    @llvm.trunc.f64(double %Val)
14509       declare x86_fp80  @llvm.trunc.f80(x86_fp80  %Val)
14510       declare fp128     @llvm.trunc.f128(fp128 %Val)
14511       declare ppc_fp128 @llvm.trunc.ppcf128(ppc_fp128  %Val)
14512
14513 Overview:
14514 """""""""
14515
The '``llvm.trunc.*``' intrinsics return the operand rounded to the
nearest integer not larger in magnitude than the operand.
14518
14519 Arguments:
14520 """"""""""
14521
14522 The argument and return value are floating-point numbers of the same
14523 type.
14524
14525 Semantics:
14526 """"""""""
14527
14528 This function returns the same values as the libm ``trunc`` functions
14529 would, and handles error conditions in the same way.
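
Example:
""""""""

The sketch below contrasts ``llvm.trunc`` with ``llvm.floor`` and
``llvm.ceil`` on the same negative input. The function names are
hypothetical.

.. code-block:: llvm

      declare double @llvm.floor.f64(double)
      declare double @llvm.ceil.f64(double)
      declare double @llvm.trunc.f64(double)

      define double @floor_example() {
        ; Rounds toward negative infinity: -1.5 becomes -2.0.
        %r = call double @llvm.floor.f64(double -1.5)
        ret double %r
      }

      define double @ceil_example() {
        ; Rounds toward positive infinity: -1.5 becomes -1.0.
        %r = call double @llvm.ceil.f64(double -1.5)
        ret double %r
      }

      define double @trunc_example() {
        ; Rounds toward zero: -1.5 becomes -1.0.
        %r = call double @llvm.trunc.f64(double -1.5)
        ret double %r
      }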
14530
14531 '``llvm.rint.*``' Intrinsic
14532 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
14533
14534 Syntax:
14535 """""""
14536
14537 This is an overloaded intrinsic. You can use ``llvm.rint`` on any
14538 floating-point or vector of floating-point type. Not all targets support
14539 all types however.
14540
14541 ::
14542
14543       declare float     @llvm.rint.f32(float  %Val)
14544       declare double    @llvm.rint.f64(double %Val)
14545       declare x86_fp80  @llvm.rint.f80(x86_fp80  %Val)
14546       declare fp128     @llvm.rint.f128(fp128 %Val)
14547       declare ppc_fp128 @llvm.rint.ppcf128(ppc_fp128  %Val)
14548
14549 Overview:
14550 """""""""
14551
The '``llvm.rint.*``' intrinsics return the operand rounded to the
nearest integer. They may raise an inexact floating-point exception if the
operand isn't an integer.
14555
14556 Arguments:
14557 """"""""""
14558
14559 The argument and return value are floating-point numbers of the same
14560 type.
14561
14562 Semantics:
14563 """"""""""
14564
14565 This function returns the same values as the libm ``rint`` functions
14566 would, and handles error conditions in the same way.
14567
14568 '``llvm.nearbyint.*``' Intrinsic
14569 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14570
14571 Syntax:
14572 """""""
14573
14574 This is an overloaded intrinsic. You can use ``llvm.nearbyint`` on any
14575 floating-point or vector of floating-point type. Not all targets support
14576 all types however.
14577
14578 ::
14579
14580       declare float     @llvm.nearbyint.f32(float  %Val)
14581       declare double    @llvm.nearbyint.f64(double %Val)
14582       declare x86_fp80  @llvm.nearbyint.f80(x86_fp80  %Val)
14583       declare fp128     @llvm.nearbyint.f128(fp128 %Val)
14584       declare ppc_fp128 @llvm.nearbyint.ppcf128(ppc_fp128  %Val)
14585
14586 Overview:
14587 """""""""
14588
The '``llvm.nearbyint.*``' intrinsics return the operand rounded to the
nearest integer.
14591
14592 Arguments:
14593 """"""""""
14594
14595 The argument and return value are floating-point numbers of the same
14596 type.
14597
14598 Semantics:
14599 """"""""""
14600
14601 This function returns the same values as the libm ``nearbyint``
14602 functions would, and handles error conditions in the same way.
14603
14604 '``llvm.round.*``' Intrinsic
14605 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14606
14607 Syntax:
14608 """""""
14609
14610 This is an overloaded intrinsic. You can use ``llvm.round`` on any
14611 floating-point or vector of floating-point type. Not all targets support
14612 all types however.
14613
14614 ::
14615
14616       declare float     @llvm.round.f32(float  %Val)
14617       declare double    @llvm.round.f64(double %Val)
14618       declare x86_fp80  @llvm.round.f80(x86_fp80  %Val)
14619       declare fp128     @llvm.round.f128(fp128 %Val)
14620       declare ppc_fp128 @llvm.round.ppcf128(ppc_fp128  %Val)
14621
14622 Overview:
14623 """""""""
14624
The '``llvm.round.*``' intrinsics return the operand rounded to the
nearest integer, with halfway cases rounded away from zero.
14627
14628 Arguments:
14629 """"""""""
14630
14631 The argument and return value are floating-point numbers of the same
14632 type.
14633
14634 Semantics:
14635 """"""""""
14636
14637 This function returns the same values as the libm ``round``
14638 functions would, and handles error conditions in the same way.
14639
14640 '``llvm.roundeven.*``' Intrinsic
14641 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14642
14643 Syntax:
14644 """""""
14645
14646 This is an overloaded intrinsic. You can use ``llvm.roundeven`` on any
14647 floating-point or vector of floating-point type. Not all targets support
14648 all types however.
14649
14650 ::
14651
14652       declare float     @llvm.roundeven.f32(float  %Val)
14653       declare double    @llvm.roundeven.f64(double %Val)
14654       declare x86_fp80  @llvm.roundeven.f80(x86_fp80  %Val)
14655       declare fp128     @llvm.roundeven.f128(fp128 %Val)
14656       declare ppc_fp128 @llvm.roundeven.ppcf128(ppc_fp128  %Val)
14657
14658 Overview:
14659 """""""""
14660
The '``llvm.roundeven.*``' intrinsics return the operand rounded to the nearest
integer in floating-point format, rounding halfway cases to even (that is, to
the nearest value that is an even integer).
14664
14665 Arguments:
14666 """"""""""
14667
14668 The argument and return value are floating-point numbers of the same type.
14669
14670 Semantics:
14671 """"""""""
14672
This function implements IEEE-754 operation ``roundToIntegralTiesToEven``. It
also behaves in the same way as C standard function ``roundeven``, except that
it does not raise floating-point exceptions.
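
Example:
""""""""

The difference from ``llvm.round`` only shows on halfway cases. The
sketch below applies both intrinsics to ``2.5``; the function names are
hypothetical.

.. code-block:: llvm

      declare float @llvm.round.f32(float)
      declare float @llvm.roundeven.f32(float)

      define float @round_example() {
        ; Like libm round, ties go away from zero: 2.5 becomes 3.0.
        %r = call float @llvm.round.f32(float 2.5)
        ret float %r
      }

      define float @roundeven_example() {
        ; Ties go to the even integer: 2.5 becomes 2.0.
        %r = call float @llvm.roundeven.f32(float 2.5)
        ret float %r
      }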
14676
14677
14678 '``llvm.lround.*``' Intrinsic
14679 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14680
14681 Syntax:
14682 """""""
14683
14684 This is an overloaded intrinsic. You can use ``llvm.lround`` on any
14685 floating-point type. Not all targets support all types however.
14686
14687 ::
14688
      declare i32 @llvm.lround.i32.f32(float %Val)
      declare i32 @llvm.lround.i32.f64(double %Val)
      declare i32 @llvm.lround.i32.f80(x86_fp80 %Val)
      declare i32 @llvm.lround.i32.f128(fp128 %Val)
      declare i32 @llvm.lround.i32.ppcf128(ppc_fp128 %Val)

      declare i64 @llvm.lround.i64.f32(float %Val)
      declare i64 @llvm.lround.i64.f64(double %Val)
      declare i64 @llvm.lround.i64.f80(x86_fp80 %Val)
      declare i64 @llvm.lround.i64.f128(fp128 %Val)
      declare i64 @llvm.lround.i64.ppcf128(ppc_fp128 %Val)
14700
14701 Overview:
14702 """""""""
14703
14704 The '``llvm.lround.*``' intrinsics return the operand rounded to the nearest
14705 integer with ties away from zero.
14706
14707
14708 Arguments:
14709 """"""""""
14710
14711 The argument is a floating-point number and the return value is an integer
14712 type.
14713
14714 Semantics:
14715 """"""""""
14716
This function returns the same values as the libm ``lround``
functions would, but without setting ``errno``.
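
Example:
""""""""

The intrinsic name carries both the result type and the operand type.
The sketch below shows one ``i32`` and one ``i64`` overload; the function
names are hypothetical.

.. code-block:: llvm

      declare i32 @llvm.lround.i32.f32(float)
      declare i64 @llvm.lround.i64.f64(double)

      define i32 @lround_i32_example() {
        ; Ties round away from zero: 2.5 becomes 3.
        %r = call i32 @llvm.lround.i32.f32(float 2.5)
        ret i32 %r
      }

      define i64 @lround_i64_example() {
        ; Ties round away from zero: -2.5 becomes -3.
        %r = call i64 @llvm.lround.i64.f64(double -2.5)
        ret i64 %r
      }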
14719
14720 '``llvm.llround.*``' Intrinsic
14721 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14722
14723 Syntax:
14724 """""""
14725
14726 This is an overloaded intrinsic. You can use ``llvm.llround`` on any
14727 floating-point type. Not all targets support all types however.
14728
14729 ::
14730
      declare i64 @llvm.llround.i64.f32(float %Val)
      declare i64 @llvm.llround.i64.f64(double %Val)
      declare i64 @llvm.llround.i64.f80(x86_fp80 %Val)
      declare i64 @llvm.llround.i64.f128(fp128 %Val)
      declare i64 @llvm.llround.i64.ppcf128(ppc_fp128 %Val)
14736
14737 Overview:
14738 """""""""
14739
14740 The '``llvm.llround.*``' intrinsics return the operand rounded to the nearest
14741 integer with ties away from zero.
14742
14743 Arguments:
14744 """"""""""
14745
14746 The argument is a floating-point number and the return value is an integer
14747 type.
14748
14749 Semantics:
14750 """"""""""
14751
This function returns the same values as the libm ``llround``
functions would, but without setting ``errno``.
14754
14755 '``llvm.lrint.*``' Intrinsic
14756 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14757
14758 Syntax:
14759 """""""
14760
14761 This is an overloaded intrinsic. You can use ``llvm.lrint`` on any
14762 floating-point type. Not all targets support all types however.
14763
14764 ::
14765
      declare i32 @llvm.lrint.i32.f32(float %Val)
      declare i32 @llvm.lrint.i32.f64(double %Val)
      declare i32 @llvm.lrint.i32.f80(x86_fp80 %Val)
      declare i32 @llvm.lrint.i32.f128(fp128 %Val)
      declare i32 @llvm.lrint.i32.ppcf128(ppc_fp128 %Val)

      declare i64 @llvm.lrint.i64.f32(float %Val)
      declare i64 @llvm.lrint.i64.f64(double %Val)
      declare i64 @llvm.lrint.i64.f80(x86_fp80 %Val)
      declare i64 @llvm.lrint.i64.f128(fp128 %Val)
      declare i64 @llvm.lrint.i64.ppcf128(ppc_fp128 %Val)
14777
14778 Overview:
14779 """""""""
14780
14781 The '``llvm.lrint.*``' intrinsics return the operand rounded to the nearest
14782 integer.
14783
14784
14785 Arguments:
14786 """"""""""
14787
14788 The argument is a floating-point number and the return value is an integer
14789 type.
14790
14791 Semantics:
14792 """"""""""
14793
This function returns the same values as the libm ``lrint``
functions would, but without setting ``errno``.
14796
14797 '``llvm.llrint.*``' Intrinsic
14798 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14799
14800 Syntax:
14801 """""""
14802
14803 This is an overloaded intrinsic. You can use ``llvm.llrint`` on any
14804 floating-point type. Not all targets support all types however.
14805
14806 ::
14807
      declare i64 @llvm.llrint.i64.f32(float %Val)
      declare i64 @llvm.llrint.i64.f64(double %Val)
      declare i64 @llvm.llrint.i64.f80(x86_fp80 %Val)
      declare i64 @llvm.llrint.i64.f128(fp128 %Val)
      declare i64 @llvm.llrint.i64.ppcf128(ppc_fp128 %Val)
14813
14814 Overview:
14815 """""""""
14816
14817 The '``llvm.llrint.*``' intrinsics return the operand rounded to the nearest
14818 integer.
14819
14820 Arguments:
14821 """"""""""
14822
14823 The argument is a floating-point number and the return value is an integer
14824 type.
14825
14826 Semantics:
14827 """"""""""
14828
This function returns the same values as the libm ``llrint``
functions would, but without setting ``errno``.
14831
14832 Bit Manipulation Intrinsics
14833 ---------------------------
14834
14835 LLVM provides intrinsics for a few important bit manipulation
14836 operations. These allow efficient code generation for some algorithms.
14837
14838 '``llvm.bitreverse.*``' Intrinsics
14839 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14840
14841 Syntax:
14842 """""""
14843
This is an overloaded intrinsic function. You can use ``llvm.bitreverse``
on any integer type.
14846
14847 ::
14848
14849       declare i16 @llvm.bitreverse.i16(i16 <id>)
14850       declare i32 @llvm.bitreverse.i32(i32 <id>)
14851       declare i64 @llvm.bitreverse.i64(i64 <id>)
14852       declare <4 x i32> @llvm.bitreverse.v4i32(<4 x i32> <id>)
14853
14854 Overview:
14855 """""""""
14856
14857 The '``llvm.bitreverse``' family of intrinsics is used to reverse the
14858 bitpattern of an integer value or vector of integer values; for example
14859 ``0b10110110`` becomes ``0b01101101``.
14860
14861 Semantics:
14862 """"""""""
14863
The ``llvm.bitreverse.iN`` intrinsic returns an iN value that has bit
``M`` in the input moved to bit ``N-M-1`` in the output, where bits are
numbered from 0 (least significant) to ``N-1``. The vector
intrinsics, such as ``llvm.bitreverse.v4i32``, operate on a per-element
basis and the element order is not affected.
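
Example:
""""""""

A minimal sketch of the scalar and vector forms, reusing the bit pattern
from the overview. The function names are hypothetical, and the ``i8``
overload is assumed to be available on the target.

.. code-block:: llvm

      declare i8 @llvm.bitreverse.i8(i8)
      declare <4 x i32> @llvm.bitreverse.v4i32(<4 x i32>)

      define i8 @bitreverse_example(i8 %x) {
        ; For %x = 0b10110110 (182) the result is 0b01101101 (109).
        %r = call i8 @llvm.bitreverse.i8(i8 %x)
        ret i8 %r
      }

      define <4 x i32> @bitreverse_vec_example(<4 x i32> %v) {
        ; Each element is reversed independently; element order is unchanged.
        %r = call <4 x i32> @llvm.bitreverse.v4i32(<4 x i32> %v)
        ret <4 x i32> %r
      }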
14868
14869 '``llvm.bswap.*``' Intrinsics
14870 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14871
14872 Syntax:
14873 """""""
14874
This is an overloaded intrinsic function. You can use ``llvm.bswap`` on any
integer type that is an even number of bytes (i.e. BitWidth % 16 == 0).
14877
14878 ::
14879
14880       declare i16 @llvm.bswap.i16(i16 <id>)
14881       declare i32 @llvm.bswap.i32(i32 <id>)
14882       declare i64 @llvm.bswap.i64(i64 <id>)
14883       declare <4 x i32> @llvm.bswap.v4i32(<4 x i32> <id>)
14884
14885 Overview:
14886 """""""""
14887
14888 The '``llvm.bswap``' family of intrinsics is used to byte swap an integer
14889 value or vector of integer values with an even number of bytes (positive
14890 multiple of 16 bits).
14891
14892 Semantics:
14893 """"""""""
14894
14895 The ``llvm.bswap.i16`` intrinsic returns an i16 value that has the high
14896 and low byte of the input i16 swapped. Similarly, the ``llvm.bswap.i32``
14897 intrinsic returns an i32 value that has the four bytes of the input i32
14898 swapped, so that if the input bytes are numbered 0, 1, 2, 3 then the
14899 returned i32 will have its bytes in 3, 2, 1, 0 order. The
14900 ``llvm.bswap.i48``, ``llvm.bswap.i64`` and other intrinsics extend this
14901 concept to additional even-byte lengths (6 bytes, 8 bytes and more,
14902 respectively). The vector intrinsics, such as ``llvm.bswap.v4i32``,
14903 operate on a per-element basis and the element order is not affected.
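
Example:
""""""""

A minimal sketch of the ``i32`` case. Integer constants are written in
decimal here, with the hexadecimal byte layout given in the comments; the
function name ``@bswap_example`` is hypothetical.

.. code-block:: llvm

      declare i32 @llvm.bswap.i32(i32)

      define i32 @bswap_example() {
        ; 287454020 is 0x11223344. Reversing the byte order gives
        ; 0x44332211, which is 1144201745.
        %r = call i32 @llvm.bswap.i32(i32 287454020)
        ret i32 %r
      }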
14904
14905 '``llvm.ctpop.*``' Intrinsic
14906 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14907
14908 Syntax:
14909 """""""
14910
This is an overloaded intrinsic. You can use ``llvm.ctpop`` on any integer
bit width, or on any vector with integer elements. Not all targets
support all bit widths or vector types, however.
14914
14915 ::
14916
14917       declare i8 @llvm.ctpop.i8(i8  <src>)
14918       declare i16 @llvm.ctpop.i16(i16 <src>)
14919       declare i32 @llvm.ctpop.i32(i32 <src>)
14920       declare i64 @llvm.ctpop.i64(i64 <src>)
14921       declare i256 @llvm.ctpop.i256(i256 <src>)
14922       declare <2 x i32> @llvm.ctpop.v2i32(<2 x i32> <src>)
14923
14924 Overview:
14925 """""""""
14926
14927 The '``llvm.ctpop``' family of intrinsics counts the number of bits set
14928 in a value.
14929
14930 Arguments:
14931 """"""""""
14932
14933 The only argument is the value to be counted. The argument may be of any
14934 integer type, or a vector with integer elements. The return type must
14935 match the argument type.
14936
14937 Semantics:
14938 """"""""""
14939
14940 The '``llvm.ctpop``' intrinsic counts the 1's in a variable, or within
14941 each element of a vector.
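
Example:
""""""""

A minimal sketch of a population count on a constant. The function name
``@ctpop_example`` is hypothetical.

.. code-block:: llvm

      declare i32 @llvm.ctpop.i32(i32)

      define i32 @ctpop_example() {
        ; 182 is 0b10110110, which has five bits set, so the result is 5.
        %r = call i32 @llvm.ctpop.i32(i32 182)
        ret i32 %r
      }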
14942
14943 '``llvm.ctlz.*``' Intrinsic
14944 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
14945
14946 Syntax:
14947 """""""
14948
14949 This is an overloaded intrinsic. You can use ``llvm.ctlz`` on any
14950 integer bit width, or any vector whose elements are integers. Not all
14951 targets support all bit widths or vector types, however.
14952
14953 ::
14954
14955       declare i8   @llvm.ctlz.i8  (i8   <src>, i1 <is_zero_undef>)
14956       declare i16  @llvm.ctlz.i16 (i16  <src>, i1 <is_zero_undef>)
14957       declare i32  @llvm.ctlz.i32 (i32  <src>, i1 <is_zero_undef>)
14958       declare i64  @llvm.ctlz.i64 (i64  <src>, i1 <is_zero_undef>)
14959       declare i256 @llvm.ctlz.i256(i256 <src>, i1 <is_zero_undef>)
14960       declare <2 x i32> @llvm.ctlz.v2i32(<2 x i32> <src>, i1 <is_zero_undef>)
14961
14962 Overview:
14963 """""""""
14964
14965 The '``llvm.ctlz``' family of intrinsic functions counts the number of
14966 leading zeros in a variable.
14967
14968 Arguments:
14969 """"""""""
14970
14971 The first argument is the value to be counted. This argument may be of
14972 any integer type, or a vector with integer element type. The return
14973 type must match the first argument type.
14974
14975 The second argument must be a constant and is a flag to indicate whether
14976 the intrinsic should ensure that a zero as the first argument produces a
14977 defined result. Historically some architectures did not provide a
14978 defined result for zero values as efficiently, and many algorithms are
14979 now predicated on avoiding zero-value inputs.
14980
14981 Semantics:
14982 """"""""""
14983
14984 The '``llvm.ctlz``' intrinsic counts the leading (most significant)
14985 zeros in a variable, or within each element of the vector. If
14986 ``src == 0`` then the result is the size in bits of the type of ``src``
14987 if ``is_zero_undef == 0`` and ``undef`` otherwise. For example,
14988 ``llvm.ctlz(i32 2) = 30``.
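
Example:
""""""""

The effect of the ``is_zero_undef`` flag can be sketched as follows. The
function names are hypothetical.

.. code-block:: llvm

      declare i32 @llvm.ctlz.i32(i32, i1)

      define i32 @ctlz_defined_example(i32 %x) {
        ; is_zero_undef is false: a zero input yields 32, the bit width.
        ; For example, an input of 1 yields 31.
        %r = call i32 @llvm.ctlz.i32(i32 %x, i1 false)
        ret i32 %r
      }

      define i32 @ctlz_nonzero_example(i32 %x) {
        ; is_zero_undef is true: the result is undef if %x is zero, so this
        ; form is only appropriate when %x is known to be non-zero.
        %r = call i32 @llvm.ctlz.i32(i32 %x, i1 true)
        ret i32 %r
      }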
14989
14990 '``llvm.cttz.*``' Intrinsic
14991 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
14992
14993 Syntax:
14994 """""""
14995
14996 This is an overloaded intrinsic. You can use ``llvm.cttz`` on any
14997 integer bit width, or any vector of integer elements. Not all targets
14998 support all bit widths or vector types, however.
14999
15000 ::
15001
15002       declare i8   @llvm.cttz.i8  (i8   <src>, i1 <is_zero_undef>)
15003       declare i16  @llvm.cttz.i16 (i16  <src>, i1 <is_zero_undef>)
15004       declare i32  @llvm.cttz.i32 (i32  <src>, i1 <is_zero_undef>)
15005       declare i64  @llvm.cttz.i64 (i64  <src>, i1 <is_zero_undef>)
15006       declare i256 @llvm.cttz.i256(i256 <src>, i1 <is_zero_undef>)
15007       declare <2 x i32> @llvm.cttz.v2i32(<2 x i32> <src>, i1 <is_zero_undef>)
15008
15009 Overview:
15010 """""""""
15011
15012 The '``llvm.cttz``' family of intrinsic functions counts the number of
15013 trailing zeros.
15014
15015 Arguments:
15016 """"""""""
15017
15018 The first argument is the value to be counted. This argument may be of
15019 any integer type, or a vector with integer element type. The return
15020 type must match the first argument type.
15021
15022 The second argument must be a constant and is a flag to indicate whether
15023 the intrinsic should ensure that a zero as the first argument produces a
15024 defined result. Historically some architectures did not provide a
15025 defined result for zero values as efficiently, and many algorithms are
15026 now predicated on avoiding zero-value inputs.
15027
15028 Semantics:
15029 """"""""""
15030
15031 The '``llvm.cttz``' intrinsic counts the trailing (least significant)
15032 zeros in a variable, or within each element of a vector. If ``src == 0``
15033 then the result is the size in bits of the type of ``src`` if
15034 ``is_zero_undef == 0`` and ``undef`` otherwise. For example,
15035 ``llvm.cttz(2) = 1``.
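
The same idea can be sketched for ``llvm.cttz``. As before, this Python model
is only an illustration of the scalar case: the helper name ``cttz``, the
``bits`` parameter, and ``None`` as a stand-in for ``undef`` are assumptions,
not LLVM APIs.

.. code-block:: python

      def cttz(src, bits, is_zero_undef=False):
          """Model llvm.cttz for an unsigned value that is `bits` wide."""
          assert 0 <= src < (1 << bits)
          if src == 0:
              # None is only a placeholder for undef in this sketch.
              return None if is_zero_undef else bits
          # src & -src isolates the lowest set bit; its index is the count.
          return (src & -src).bit_length() - 1

With this model, ``cttz(2, 32)`` evaluates to ``1``, matching the example
above.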
15036
15037 .. _int_overflow:
15038
15039 '``llvm.fshl.*``' Intrinsic
15040 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
15041
15042 Syntax:
15043 """""""
15044
15045 This is an overloaded intrinsic. You can use ``llvm.fshl`` on any
15046 integer bit width or any vector of integer elements. Not all targets
15047 support all bit widths or vector types, however.
15048
15049 ::
15050
15051       declare i8  @llvm.fshl.i8 (i8 %a, i8 %b, i8 %c)
15052       declare i67 @llvm.fshl.i67(i67 %a, i67 %b, i67 %c)
15053       declare <2 x i32> @llvm.fshl.v2i32(<2 x i32> %a, <2 x i32> %b, <2 x i32> %c)
15054
15055 Overview:
15056 """""""""
15057
15058 The '``llvm.fshl``' family of intrinsic functions performs a funnel shift left:
15059 the first two values are concatenated as { %a : %b } (%a is the most significant
15060 bits of the wide value), the combined value is shifted left, and the most
15061 significant bits are extracted to produce a result that is the same size as the
15062 original arguments. If the first 2 arguments are identical, this is equivalent
15063 to a rotate left operation. For vector types, the operation occurs for each
15064 element of the vector. The shift argument is treated as an unsigned amount
15065 modulo the element size of the arguments.
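
The description above can be sketched as a small Python model of the scalar
case. This is an illustration under stated assumptions, not LLVM code: the
helper name ``fshl`` and the explicit ``bits`` parameter (the element size)
are hypothetical, and all values are treated as unsigned bit patterns.

.. code-block:: python

      def fshl(a, b, c, bits):
          """Model llvm.fshl on unsigned `bits`-wide values."""
          mask = (1 << bits) - 1
          shift = c % bits                          # shift amount modulo the element size
          wide = ((a & mask) << bits) | (b & mask)  # concat(a, b), a in the high half
          shifted = (wide << shift) & ((1 << (2 * bits)) - 1)
          return shifted >> bits                    # keep the most significant half

For instance, ``fshl(255, 0, 15, 8)`` evaluates to ``128``, and passing the
same value for ``a`` and ``b`` gives a rotate left.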
15066
15067 Arguments:
15068 """"""""""
15069
15070 The first two arguments are the values to be concatenated. The third
15071 argument is the shift amount. The arguments may be any integer type or a
15072 vector with integer element type. All arguments and the return value must
15073 have the same type.
15074
15075 Example:
15076 """"""""
15077
15078 .. code-block:: text
15079
15080       %r = call i8 @llvm.fshl.i8(i8 %x, i8 %y, i8 %z)  ; %r = i8: msb_extract((concat(x, y) << (z % 8)), 8)
15081       %r = call i8 @llvm.fshl.i8(i8 255, i8 0, i8 15)  ; %r = i8: 128 (0b10000000)
15082       %r = call i8 @llvm.fshl.i8(i8 15, i8 15, i8 11)  ; %r = i8: 120 (0b01111000)
15083       %r = call i8 @llvm.fshl.i8(i8 0, i8 255, i8 8)   ; %r = i8: 0   (0b00000000)
15084
15085 '``llvm.fshr.*``' Intrinsic
15086 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
15087
15088 Syntax:
15089 """""""
15090
15091 This is an overloaded intrinsic. You can use ``llvm.fshr`` on any
15092 integer bit width or any vector of integer elements. Not all targets
15093 support all bit widths or vector types, however.
15094
15095 ::
15096
15097       declare i8  @llvm.fshr.i8 (i8 %a, i8 %b, i8 %c)
15098       declare i67 @llvm.fshr.i67(i67 %a, i67 %b, i67 %c)
15099       declare <2 x i32> @llvm.fshr.v2i32(<2 x i32> %a, <2 x i32> %b, <2 x i32> %c)
15100
15101 Overview:
15102 """""""""
15103
15104 The '``llvm.fshr``' family of intrinsic functions performs a funnel shift right:
15105 the first two values are concatenated as { %a : %b } (%a is the most significant
15106 bits of the wide value), the combined value is shifted right, and the least
15107 significant bits are extracted to produce a result that is the same size as the
15108 original arguments. If the first 2 arguments are identical, this is equivalent
15109 to a rotate right operation. For vector types, the operation occurs for each
15110 element of the vector. The shift argument is treated as an unsigned amount
15111 modulo the element size of the arguments.
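
The description above can be sketched as a small Python model of the scalar
case. This is an illustration under stated assumptions, not LLVM code: the
helper name ``fshr`` and the explicit ``bits`` parameter (the element size)
are hypothetical, and all values are treated as unsigned bit patterns.

.. code-block:: python

      def fshr(a, b, c, bits):
          """Model llvm.fshr on unsigned `bits`-wide values."""
          mask = (1 << bits) - 1
          shift = c % bits                          # shift amount modulo the element size
          wide = ((a & mask) << bits) | (b & mask)  # concat(a, b), a in the high half
          return (wide >> shift) & mask             # keep the least significant half

For instance, ``fshr(255, 0, 15, 8)`` evaluates to ``254``, and passing the
same value for ``a`` and ``b`` gives a rotate right.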
15112
15113 Arguments:
15114 """"""""""
15115
15116 The first two arguments are the values to be concatenated. The third
15117 argument is the shift amount. The arguments may be any integer type or a
15118 vector with integer element type. All arguments and the return value must
15119 have the same type.
15120
15121 Example:
15122 """"""""
15123
15124 .. code-block:: text
15125
15126       %r = call i8 @llvm.fshr.i8(i8 %x, i8 %y, i8 %z)  ; %r = i8: lsb_extract((concat(x, y) >> (z % 8)), 8)
15127       %r = call i8 @llvm.fshr.i8(i8 255, i8 0, i8 15)  ; %r = i8: 254 (0b11111110)
15128       %r = call i8 @llvm.fshr.i8(i8 15, i8 15, i8 11)  ; %r = i8: 225 (0b11100001)
15129       %r = call i8 @llvm.fshr.i8(i8 0, i8 255, i8 8)   ; %r = i8: 255 (0b11111111)
15130
15131 Arithmetic with Overflow Intrinsics
15132 -----------------------------------
15133
15134 LLVM provides intrinsics for fast arithmetic overflow checking.
15135
15136 Each of these intrinsics returns a two-element struct. The first
15137 element of this struct contains the result of the corresponding
15138 arithmetic operation modulo 2\ :sup:`n`\ , where n is the bit width of
15139 the result. Therefore, for example, the first element of the struct
15140 returned by ``llvm.sadd.with.overflow.i32`` is always the same as the
15141 result of a 32-bit ``add`` instruction with the same operands, where
15142 the ``add`` is *not* modified by an ``nsw`` or ``nuw`` flag.
15143
15144 The second element of the result is an ``i1`` that is 1 if the
15145 arithmetic operation overflowed and 0 otherwise. An operation
15146 overflows if, for any values of its operands ``A`` and ``B`` and for
15147 any ``N`` larger than the operands' width, ``ext(A op B) to iN`` is
15148 not equal to ``(ext(A) to iN) op (ext(B) to iN)`` where ``ext`` is
15149 ``sext`` for signed overflow and ``zext`` for unsigned overflow, and
15150 ``op`` is the underlying arithmetic operation.
15151
15152 The behavior of these intrinsics is well-defined for all argument
15153 values.
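
The two-element result can be modelled informally in Python. The sketch below
is an assumption-laden illustration rather than LLVM's implementation: the
helpers ``to_signed`` and ``with_overflow`` are hypothetical names, the first
element is returned as an unsigned bit pattern, and overflow is detected by
checking whether the exact (unbounded) result fits in the result type, which
is the condition the extension-based definition above describes.

.. code-block:: python

      import operator

      def to_signed(value, bits):
          """Reinterpret the low `bits` bits of value as two's complement."""
          value &= (1 << bits) - 1
          return value - (1 << bits) if value >> (bits - 1) else value

      def with_overflow(op, a, b, bits, signed):
          """Return (result modulo 2**bits, overflow flag) for op(a, b)."""
          mask = (1 << bits) - 1
          if signed:
              a, b = to_signed(a, bits), to_signed(b, bits)
              lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
          else:
              a, b = a & mask, b & mask
              lo, hi = 0, mask
          exact = op(a, b)  # Python integers are unbounded, so this never wraps
          return exact & mask, not (lo <= exact <= hi)

For instance, ``with_overflow(operator.add, 200, 100, 8, signed=False)``
models ``llvm.uadd.with.overflow.i8`` and yields ``(44, True)``. Passing
``operator.sub`` or ``operator.mul`` models the subtraction and
multiplication variants in the same way.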
15154
15155 '``llvm.sadd.with.overflow.*``' Intrinsics
15156 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15157
15158 Syntax:
15159 """""""
15160
15161 This is an overloaded intrinsic. You can use ``llvm.sadd.with.overflow``
15162 on any integer bit width or vectors of integers.
15163
15164 ::
15165
15166       declare {i16, i1} @llvm.sadd.with.overflow.i16(i16 %a, i16 %b)
15167       declare {i32, i1} @llvm.sadd.with.overflow.i32(i32 %a, i32 %b)
15168       declare {i64, i1} @llvm.sadd.with.overflow.i64(i64 %a, i64 %b)
15169       declare {<4 x i32>, <4 x i1>} @llvm.sadd.with.overflow.v4i32(<4 x i32> %a, <4 x i32> %b)
15170
15171 Overview:
15172 """""""""
15173
15174 The '``llvm.sadd.with.overflow``' family of intrinsic functions perform
15175 a signed addition of the two arguments, and indicate whether an overflow
15176 occurred during the signed summation.
15177
15178 Arguments:
15179 """"""""""
15180
15181 The arguments (%a and %b) and the first element of the result structure
15182 may be of integer types of any bit width, but they must have the same
15183 bit width. The second element of the result structure must be of type
15184 ``i1``. ``%a`` and ``%b`` are the two values that will undergo signed
15185 addition.
15186
15187 Semantics:
15188 """"""""""
15189
The '``llvm.sadd.with.overflow``' family of intrinsic functions perform
a signed addition of the two arguments. They return a structure whose
first element is the signed sum and whose second element is a bit
specifying whether the signed addition resulted in an overflow.
15195
15196 Examples:
15197 """""""""
15198
15199 .. code-block:: llvm
15200
15201       %res = call {i32, i1} @llvm.sadd.with.overflow.i32(i32 %a, i32 %b)
15202       %sum = extractvalue {i32, i1} %res, 0
15203       %obit = extractvalue {i32, i1} %res, 1
15204       br i1 %obit, label %overflow, label %normal
15205
15206 '``llvm.uadd.with.overflow.*``' Intrinsics
15207 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15208
15209 Syntax:
15210 """""""
15211
15212 This is an overloaded intrinsic. You can use ``llvm.uadd.with.overflow``
15213 on any integer bit width or vectors of integers.
15214
15215 ::
15216
15217       declare {i16, i1} @llvm.uadd.with.overflow.i16(i16 %a, i16 %b)
15218       declare {i32, i1} @llvm.uadd.with.overflow.i32(i32 %a, i32 %b)
15219       declare {i64, i1} @llvm.uadd.with.overflow.i64(i64 %a, i64 %b)
15220       declare {<4 x i32>, <4 x i1>} @llvm.uadd.with.overflow.v4i32(<4 x i32> %a, <4 x i32> %b)
15221
15222 Overview:
15223 """""""""
15224
15225 The '``llvm.uadd.with.overflow``' family of intrinsic functions perform
15226 an unsigned addition of the two arguments, and indicate whether a carry
15227 occurred during the unsigned summation.
15228
15229 Arguments:
15230 """"""""""
15231
15232 The arguments (%a and %b) and the first element of the result structure
15233 may be of integer types of any bit width, but they must have the same
15234 bit width. The second element of the result structure must be of type
15235 ``i1``. ``%a`` and ``%b`` are the two values that will undergo unsigned
15236 addition.
15237
15238 Semantics:
15239 """"""""""
15240
15241 The '``llvm.uadd.with.overflow``' family of intrinsic functions perform
15242 an unsigned addition of the two arguments. They return a structure --- the
15243 first element of which is the sum, and the second element of which is a
15244 bit specifying if the unsigned summation resulted in a carry.
15245
15246 Examples:
15247 """""""""
15248
15249 .. code-block:: llvm
15250
15251       %res = call {i32, i1} @llvm.uadd.with.overflow.i32(i32 %a, i32 %b)
15252       %sum = extractvalue {i32, i1} %res, 0
15253       %obit = extractvalue {i32, i1} %res, 1
15254       br i1 %obit, label %carry, label %normal
15255
15256 '``llvm.ssub.with.overflow.*``' Intrinsics
15257 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15258
15259 Syntax:
15260 """""""
15261
15262 This is an overloaded intrinsic. You can use ``llvm.ssub.with.overflow``
15263 on any integer bit width or vectors of integers.
15264
15265 ::
15266
15267       declare {i16, i1} @llvm.ssub.with.overflow.i16(i16 %a, i16 %b)
15268       declare {i32, i1} @llvm.ssub.with.overflow.i32(i32 %a, i32 %b)
15269       declare {i64, i1} @llvm.ssub.with.overflow.i64(i64 %a, i64 %b)
15270       declare {<4 x i32>, <4 x i1>} @llvm.ssub.with.overflow.v4i32(<4 x i32> %a, <4 x i32> %b)
15271
15272 Overview:
15273 """""""""
15274
15275 The '``llvm.ssub.with.overflow``' family of intrinsic functions perform
15276 a signed subtraction of the two arguments, and indicate whether an
15277 overflow occurred during the signed subtraction.
15278
15279 Arguments:
15280 """"""""""
15281
15282 The arguments (%a and %b) and the first element of the result structure
15283 may be of integer types of any bit width, but they must have the same
15284 bit width. The second element of the result structure must be of type
15285 ``i1``. ``%a`` and ``%b`` are the two values that will undergo signed
15286 subtraction.
15287
15288 Semantics:
15289 """"""""""
15290
The '``llvm.ssub.with.overflow``' family of intrinsic functions perform
a signed subtraction of the two arguments. They return a structure whose
first element is the difference and whose second element is a bit
specifying whether the signed subtraction resulted in an overflow.
15296
15297 Examples:
15298 """""""""
15299
15300 .. code-block:: llvm
15301
15302       %res = call {i32, i1} @llvm.ssub.with.overflow.i32(i32 %a, i32 %b)
      %diff = extractvalue {i32, i1} %res, 0
15304       %obit = extractvalue {i32, i1} %res, 1
15305       br i1 %obit, label %overflow, label %normal
15306
15307 '``llvm.usub.with.overflow.*``' Intrinsics
15308 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15309
15310 Syntax:
15311 """""""
15312
15313 This is an overloaded intrinsic. You can use ``llvm.usub.with.overflow``
15314 on any integer bit width or vectors of integers.
15315
15316 ::
15317
15318       declare {i16, i1} @llvm.usub.with.overflow.i16(i16 %a, i16 %b)
15319       declare {i32, i1} @llvm.usub.with.overflow.i32(i32 %a, i32 %b)
15320       declare {i64, i1} @llvm.usub.with.overflow.i64(i64 %a, i64 %b)
15321       declare {<4 x i32>, <4 x i1>} @llvm.usub.with.overflow.v4i32(<4 x i32> %a, <4 x i32> %b)
15322
15323 Overview:
15324 """""""""
15325
15326 The '``llvm.usub.with.overflow``' family of intrinsic functions perform
15327 an unsigned subtraction of the two arguments, and indicate whether an
15328 overflow occurred during the unsigned subtraction.
15329
15330 Arguments:
15331 """"""""""
15332
15333 The arguments (%a and %b) and the first element of the result structure
15334 may be of integer types of any bit width, but they must have the same
15335 bit width. The second element of the result structure must be of type
15336 ``i1``. ``%a`` and ``%b`` are the two values that will undergo unsigned
15337 subtraction.
15338
15339 Semantics:
15340 """"""""""
15341
The '``llvm.usub.with.overflow``' family of intrinsic functions perform
an unsigned subtraction of the two arguments. They return a structure whose
first element is the difference and whose second element is a bit
specifying whether the unsigned subtraction resulted in an overflow.
15347
15348 Examples:
15349 """""""""
15350
15351 .. code-block:: llvm
15352
15353       %res = call {i32, i1} @llvm.usub.with.overflow.i32(i32 %a, i32 %b)
      %diff = extractvalue {i32, i1} %res, 0
15355       %obit = extractvalue {i32, i1} %res, 1
15356       br i1 %obit, label %overflow, label %normal
15357
15358 '``llvm.smul.with.overflow.*``' Intrinsics
15359 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15360
15361 Syntax:
15362 """""""
15363
15364 This is an overloaded intrinsic. You can use ``llvm.smul.with.overflow``
15365 on any integer bit width or vectors of integers.
15366
15367 ::
15368
15369       declare {i16, i1} @llvm.smul.with.overflow.i16(i16 %a, i16 %b)
15370       declare {i32, i1} @llvm.smul.with.overflow.i32(i32 %a, i32 %b)
15371       declare {i64, i1} @llvm.smul.with.overflow.i64(i64 %a, i64 %b)
15372       declare {<4 x i32>, <4 x i1>} @llvm.smul.with.overflow.v4i32(<4 x i32> %a, <4 x i32> %b)
15373
15374 Overview:
15375 """""""""
15376
15377 The '``llvm.smul.with.overflow``' family of intrinsic functions perform
15378 a signed multiplication of the two arguments, and indicate whether an
15379 overflow occurred during the signed multiplication.
15380
15381 Arguments:
15382 """"""""""
15383
15384 The arguments (%a and %b) and the first element of the result structure
15385 may be of integer types of any bit width, but they must have the same
15386 bit width. The second element of the result structure must be of type
15387 ``i1``. ``%a`` and ``%b`` are the two values that will undergo signed
15388 multiplication.
15389
15390 Semantics:
15391 """"""""""
15392
The '``llvm.smul.with.overflow``' family of intrinsic functions perform
a signed multiplication of the two arguments. They return a structure whose
first element is the product and whose second element is a bit
specifying whether the signed multiplication resulted in an overflow.
15398
15399 Examples:
15400 """""""""
15401
15402 .. code-block:: llvm
15403
15404       %res = call {i32, i1} @llvm.smul.with.overflow.i32(i32 %a, i32 %b)
      %prod = extractvalue {i32, i1} %res, 0
15406       %obit = extractvalue {i32, i1} %res, 1
15407       br i1 %obit, label %overflow, label %normal
15408
15409 '``llvm.umul.with.overflow.*``' Intrinsics
15410 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15411
15412 Syntax:
15413 """""""
15414
15415 This is an overloaded intrinsic. You can use ``llvm.umul.with.overflow``
15416 on any integer bit width or vectors of integers.
15417
15418 ::
15419
15420       declare {i16, i1} @llvm.umul.with.overflow.i16(i16 %a, i16 %b)
15421       declare {i32, i1} @llvm.umul.with.overflow.i32(i32 %a, i32 %b)
15422       declare {i64, i1} @llvm.umul.with.overflow.i64(i64 %a, i64 %b)
15423       declare {<4 x i32>, <4 x i1>} @llvm.umul.with.overflow.v4i32(<4 x i32> %a, <4 x i32> %b)
15424
15425 Overview:
15426 """""""""
15427
15428 The '``llvm.umul.with.overflow``' family of intrinsic functions perform
an unsigned multiplication of the two arguments, and indicate whether an
15430 overflow occurred during the unsigned multiplication.
15431
15432 Arguments:
15433 """"""""""
15434
15435 The arguments (%a and %b) and the first element of the result structure
15436 may be of integer types of any bit width, but they must have the same
15437 bit width. The second element of the result structure must be of type
15438 ``i1``. ``%a`` and ``%b`` are the two values that will undergo unsigned
15439 multiplication.
15440
15441 Semantics:
15442 """"""""""
15443
The '``llvm.umul.with.overflow``' family of intrinsic functions perform
an unsigned multiplication of the two arguments. They return a structure
whose first element is the product and whose second element is a bit
specifying whether the unsigned multiplication resulted in an overflow.
15449
15450 Examples:
15451 """""""""
15452
15453 .. code-block:: llvm
15454
15455       %res = call {i32, i1} @llvm.umul.with.overflow.i32(i32 %a, i32 %b)
      %prod = extractvalue {i32, i1} %res, 0
15457       %obit = extractvalue {i32, i1} %res, 1
15458       br i1 %obit, label %overflow, label %normal
15459
15460 Saturation Arithmetic Intrinsics
15461 ---------------------------------
15462
15463 Saturation arithmetic is a version of arithmetic in which operations are
15464 limited to a fixed range between a minimum and maximum value. If the result of
15465 an operation is greater than the maximum value, the result is set (or
15466 "clamped") to this maximum. If it is below the minimum, it is clamped to this
15467 minimum.
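
Clamping can be illustrated with a short Python sketch covering the add and
subtract intrinsics described in this section. The helper names
(``saturate``, ``sadd_sat``, ``uadd_sat``, ``ssub_sat``, ``usub_sat``) and the
explicit ``bits`` parameter are assumptions made for illustration; operands
are taken to be ordinary Python integers already within the range of the
type.

.. code-block:: python

      def saturate(value, bits, signed):
          """Clamp an exact integer result to the range of a `bits`-wide type."""
          if signed:
              lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
          else:
              lo, hi = 0, (1 << bits) - 1
          return max(lo, min(hi, value))

      def sadd_sat(a, b, bits):
          return saturate(a + b, bits, signed=True)

      def uadd_sat(a, b, bits):
          return saturate(a + b, bits, signed=False)

      def ssub_sat(a, b, bits):
          return saturate(a - b, bits, signed=True)

      def usub_sat(a, b, bits):
          return saturate(a - b, bits, signed=False)

For a 4-bit type, ``sadd_sat(5, 6, 4)`` evaluates to ``7`` and
``usub_sat(2, 6, 4)`` evaluates to ``0``.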
15468
15469
15470 '``llvm.sadd.sat.*``' Intrinsics
15471 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15472
15473 Syntax
15474 """""""
15475
15476 This is an overloaded intrinsic. You can use ``llvm.sadd.sat``
15477 on any integer bit width or vectors of integers.
15478
15479 ::
15480
15481       declare i16 @llvm.sadd.sat.i16(i16 %a, i16 %b)
15482       declare i32 @llvm.sadd.sat.i32(i32 %a, i32 %b)
15483       declare i64 @llvm.sadd.sat.i64(i64 %a, i64 %b)
15484       declare <4 x i32> @llvm.sadd.sat.v4i32(<4 x i32> %a, <4 x i32> %b)
15485
15486 Overview
15487 """""""""
15488
15489 The '``llvm.sadd.sat``' family of intrinsic functions perform signed
15490 saturating addition on the 2 arguments.
15491
15492 Arguments
15493 """"""""""
15494
15495 The arguments (%a and %b) and the result may be of integer types of any bit
15496 width, but they must have the same bit width. ``%a`` and ``%b`` are the two
15497 values that will undergo signed addition.
15498
15499 Semantics:
15500 """"""""""
15501
15502 The maximum value this operation can clamp to is the largest signed value
15503 representable by the bit width of the arguments. The minimum value is the
15504 smallest signed value representable by this bit width.
15505
15506
15507 Examples
15508 """""""""
15509
15510 .. code-block:: llvm
15511
15512       %res = call i4 @llvm.sadd.sat.i4(i4 1, i4 2)  ; %res = 3
15513       %res = call i4 @llvm.sadd.sat.i4(i4 5, i4 6)  ; %res = 7
15514       %res = call i4 @llvm.sadd.sat.i4(i4 -4, i4 2)  ; %res = -2
15515       %res = call i4 @llvm.sadd.sat.i4(i4 -4, i4 -5)  ; %res = -8
15516
15517
15518 '``llvm.uadd.sat.*``' Intrinsics
15519 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15520
15521 Syntax
15522 """""""
15523
15524 This is an overloaded intrinsic. You can use ``llvm.uadd.sat``
15525 on any integer bit width or vectors of integers.
15526
15527 ::
15528
15529       declare i16 @llvm.uadd.sat.i16(i16 %a, i16 %b)
15530       declare i32 @llvm.uadd.sat.i32(i32 %a, i32 %b)
15531       declare i64 @llvm.uadd.sat.i64(i64 %a, i64 %b)
15532       declare <4 x i32> @llvm.uadd.sat.v4i32(<4 x i32> %a, <4 x i32> %b)
15533
15534 Overview
15535 """""""""
15536
15537 The '``llvm.uadd.sat``' family of intrinsic functions perform unsigned
15538 saturating addition on the 2 arguments.
15539
15540 Arguments
15541 """"""""""
15542
15543 The arguments (%a and %b) and the result may be of integer types of any bit
15544 width, but they must have the same bit width. ``%a`` and ``%b`` are the two
15545 values that will undergo unsigned addition.
15546
15547 Semantics:
15548 """"""""""
15549
15550 The maximum value this operation can clamp to is the largest unsigned value
15551 representable by the bit width of the arguments. Because this is an unsigned
15552 operation, the result will never saturate towards zero.
15553
15554
15555 Examples
15556 """""""""
15557
15558 .. code-block:: llvm
15559
15560       %res = call i4 @llvm.uadd.sat.i4(i4 1, i4 2)  ; %res = 3
15561       %res = call i4 @llvm.uadd.sat.i4(i4 5, i4 6)  ; %res = 11
15562       %res = call i4 @llvm.uadd.sat.i4(i4 8, i4 8)  ; %res = 15
15563
15564
15565 '``llvm.ssub.sat.*``' Intrinsics
15566 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15567
15568 Syntax
15569 """""""
15570
15571 This is an overloaded intrinsic. You can use ``llvm.ssub.sat``
15572 on any integer bit width or vectors of integers.
15573
15574 ::
15575
15576       declare i16 @llvm.ssub.sat.i16(i16 %a, i16 %b)
15577       declare i32 @llvm.ssub.sat.i32(i32 %a, i32 %b)
15578       declare i64 @llvm.ssub.sat.i64(i64 %a, i64 %b)
15579       declare <4 x i32> @llvm.ssub.sat.v4i32(<4 x i32> %a, <4 x i32> %b)
15580
15581 Overview
15582 """""""""
15583
15584 The '``llvm.ssub.sat``' family of intrinsic functions perform signed
15585 saturating subtraction on the 2 arguments.
15586
15587 Arguments
15588 """"""""""
15589
15590 The arguments (%a and %b) and the result may be of integer types of any bit
15591 width, but they must have the same bit width. ``%a`` and ``%b`` are the two
15592 values that will undergo signed subtraction.
15593
15594 Semantics:
15595 """"""""""
15596
15597 The maximum value this operation can clamp to is the largest signed value
15598 representable by the bit width of the arguments. The minimum value is the
15599 smallest signed value representable by this bit width.
15600
15601
15602 Examples
15603 """""""""
15604
15605 .. code-block:: llvm
15606
15607       %res = call i4 @llvm.ssub.sat.i4(i4 2, i4 1)  ; %res = 1
15608       %res = call i4 @llvm.ssub.sat.i4(i4 2, i4 6)  ; %res = -4
15609       %res = call i4 @llvm.ssub.sat.i4(i4 -4, i4 5)  ; %res = -8
15610       %res = call i4 @llvm.ssub.sat.i4(i4 4, i4 -5)  ; %res = 7
15611
15612
15613 '``llvm.usub.sat.*``' Intrinsics
15614 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15615
15616 Syntax
15617 """""""
15618
15619 This is an overloaded intrinsic. You can use ``llvm.usub.sat``
15620 on any integer bit width or vectors of integers.
15621
15622 ::
15623
15624       declare i16 @llvm.usub.sat.i16(i16 %a, i16 %b)
15625       declare i32 @llvm.usub.sat.i32(i32 %a, i32 %b)
15626       declare i64 @llvm.usub.sat.i64(i64 %a, i64 %b)
15627       declare <4 x i32> @llvm.usub.sat.v4i32(<4 x i32> %a, <4 x i32> %b)
15628
15629 Overview
15630 """""""""
15631
15632 The '``llvm.usub.sat``' family of intrinsic functions perform unsigned
15633 saturating subtraction on the 2 arguments.
15634
15635 Arguments
15636 """"""""""
15637
15638 The arguments (%a and %b) and the result may be of integer types of any bit
15639 width, but they must have the same bit width. ``%a`` and ``%b`` are the two
15640 values that will undergo unsigned subtraction.
15641
15642 Semantics:
15643 """"""""""
15644
15645 The minimum value this operation can clamp to is 0, which is the smallest
15646 unsigned value representable by the bit width of the unsigned arguments.
15647 Because this is an unsigned operation, the result will never saturate towards
15648 the largest possible value representable by this bit width.
15649
15650
15651 Examples
15652 """""""""
15653
15654 .. code-block:: llvm
15655
15656       %res = call i4 @llvm.usub.sat.i4(i4 2, i4 1)  ; %res = 1
15657       %res = call i4 @llvm.usub.sat.i4(i4 2, i4 6)  ; %res = 0
15658
15659
15660 '``llvm.sshl.sat.*``' Intrinsics
15661 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15662
15663 Syntax
15664 """""""
15665
15666 This is an overloaded intrinsic. You can use ``llvm.sshl.sat``
15667 on integers or vectors of integers of any bit width.
15668
15669 ::
15670
15671       declare i16 @llvm.sshl.sat.i16(i16 %a, i16 %b)
15672       declare i32 @llvm.sshl.sat.i32(i32 %a, i32 %b)
15673       declare i64 @llvm.sshl.sat.i64(i64 %a, i64 %b)
15674       declare <4 x i32> @llvm.sshl.sat.v4i32(<4 x i32> %a, <4 x i32> %b)
15675
15676 Overview
15677 """""""""
15678
15679 The '``llvm.sshl.sat``' family of intrinsic functions perform signed
15680 saturating left shift on the first argument.
15681
15682 Arguments
15683 """"""""""
15684
15685 The arguments (``%a`` and ``%b``) and the result may be of integer types of any
15686 bit width, but they must have the same bit width. ``%a`` is the value to be
shifted, and ``%b`` is the amount to shift by. If ``%b`` is (statically or
dynamically) equal to or larger than the integer bit width of the arguments,
the result is a :ref:`poison value <poisonvalues>`. If the arguments are
vectors, each vector element of ``%a`` is shifted by the corresponding shift
amount in ``%b``.
15692
15693
15694 Semantics:
15695 """"""""""
15696
15697 The maximum value this operation can clamp to is the largest signed value
15698 representable by the bit width of the arguments. The minimum value is the
15699 smallest signed value representable by this bit width.
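
A minimal Python sketch of the scalar case follows. It is an illustration
only: the helper name ``sshl_sat`` and the ``bits`` parameter are
hypothetical, ``a`` is taken as a signed Python integer within the range of
the type, and ``None`` is used as a placeholder for a poison value.

.. code-block:: python

      def sshl_sat(a, b, bits):
          """Model llvm.sshl.sat for a signed `bits`-wide value a."""
          if not 0 <= b < bits:
              return None  # placeholder for poison in this sketch
          lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
          # a << b is exact in Python; clamp it to the signed range.
          return max(lo, min(hi, a << b))

For a 4-bit type, ``sshl_sat(2, 2, 4)`` evaluates to ``7`` and
``sshl_sat(-5, 1, 4)`` evaluates to ``-8``.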
15700
15701
15702 Examples
15703 """""""""
15704
15705 .. code-block:: llvm
15706
15707       %res = call i4 @llvm.sshl.sat.i4(i4 2, i4 1)  ; %res = 4
15708       %res = call i4 @llvm.sshl.sat.i4(i4 2, i4 2)  ; %res = 7
15709       %res = call i4 @llvm.sshl.sat.i4(i4 -5, i4 1)  ; %res = -8
15710       %res = call i4 @llvm.sshl.sat.i4(i4 -1, i4 1)  ; %res = -2
15711
15712
15713 '``llvm.ushl.sat.*``' Intrinsics
15714 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15715
15716 Syntax
15717 """""""
15718
15719 This is an overloaded intrinsic. You can use ``llvm.ushl.sat``
15720 on integers or vectors of integers of any bit width.
15721
15722 ::
15723
15724       declare i16 @llvm.ushl.sat.i16(i16 %a, i16 %b)
15725       declare i32 @llvm.ushl.sat.i32(i32 %a, i32 %b)
15726       declare i64 @llvm.ushl.sat.i64(i64 %a, i64 %b)
15727       declare <4 x i32> @llvm.ushl.sat.v4i32(<4 x i32> %a, <4 x i32> %b)
15728
15729 Overview
15730 """""""""
15731
15732 The '``llvm.ushl.sat``' family of intrinsic functions perform unsigned
15733 saturating left shift on the first argument.
15734
15735 Arguments
15736 """"""""""
15737
15738 The arguments (``%a`` and ``%b``) and the result may be of integer types of any
15739 bit width, but they must have the same bit width. ``%a`` is the value to be
shifted, and ``%b`` is the amount to shift by. If ``%b`` is (statically or
dynamically) equal to or larger than the integer bit width of the arguments,
the result is a :ref:`poison value <poisonvalues>`. If the arguments are
vectors, each vector element of ``%a`` is shifted by the corresponding shift
amount in ``%b``.
15745
15746 Semantics:
15747 """"""""""
15748
15749 The maximum value this operation can clamp to is the largest unsigned value
15750 representable by the bit width of the arguments.
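
A minimal Python sketch of the scalar case follows. It is an illustration
only: the helper name ``ushl_sat`` and the ``bits`` parameter are
hypothetical, ``a`` is taken as an unsigned Python integer within the range
of the type, and ``None`` is used as a placeholder for a poison value.

.. code-block:: python

      def ushl_sat(a, b, bits):
          """Model llvm.ushl.sat for an unsigned `bits`-wide value a."""
          if not 0 <= b < bits:
              return None  # placeholder for poison in this sketch
          # a << b is exact in Python; clamp it to the unsigned maximum.
          return min((1 << bits) - 1, a << b)

For a 4-bit type, ``ushl_sat(3, 3, 4)`` evaluates to ``15``.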
15751
15752
15753 Examples
15754 """""""""
15755
15756 .. code-block:: llvm
15757
15758       %res = call i4 @llvm.ushl.sat.i4(i4 2, i4 1)  ; %res = 4
15759       %res = call i4 @llvm.ushl.sat.i4(i4 3, i4 3)  ; %res = 15
15760
15761
15762 Fixed Point Arithmetic Intrinsics
15763 ---------------------------------
15764
A fixed point number represents a real number with a fixed number of digits
after the radix point (the equivalent of the decimal point '.'). The number of
digits after the radix point is referred to as the `scale`. Fixed point
numbers are useful for representing fractional values to a specific precision.
The following intrinsics perform fixed point arithmetic operations on 2
operands of the same scale, which is specified as the third argument.
15771
The ``llvm.*mul.fix`` family of intrinsic functions represents a multiplication
of fixed point numbers through scaled integers. Fixed point multiplication can
be represented as:
15775
15776 .. code-block:: llvm
15777
15778         %result = call i4 @llvm.smul.fix.i4(i4 %a, i4 %b, i32 %scale)
15779
15780         ; Expands to
15781         %a2 = sext i4 %a to i8
15782         %b2 = sext i4 %b to i8
        %mul = mul nsw nuw i8 %a2, %b2
        %scale2 = trunc i32 %scale to i8
        %r = ashr i8 %mul, %scale2  ; this is for a target rounding down towards negative infinity
15786         %result = trunc i8 %r to i4
15787
15788 The ``llvm.*div.fix`` family of intrinsic functions represents a division of
15789 fixed point numbers through scaled integers. Fixed point division can be
15790 represented as:
15791
15792 .. code-block:: llvm
15793
        %result = call i4 @llvm.sdiv.fix.i4(i4 %a, i4 %b, i32 %scale)
15795
15796         ; Expands to
15797         %a2 = sext i4 %a to i8
15798         %b2 = sext i4 %b to i8
15799         %scale2 = trunc i32 %scale to i8
15800         %a3 = shl i8 %a2, %scale2
15801         %r = sdiv i8 %a3, %b2 ; this is for a target rounding towards zero
15802         %result = trunc i8 %r to i4
15803
For each of these functions, if the result cannot be represented exactly with
the provided scale, the result is rounded. The rounding direction is not fixed
by the IR, since the preferred rounding may vary between targets; it is
specified through a target hook. Different pipelines should legalize or
optimize this using the rounding specified by this hook if it is provided.
Operations like constant folding, instruction combining, KnownBits, and
ValueTracking should also use this hook, if provided, and not assume the
direction of rounding. A rounded result must always be within one unit of
precision from the true result. That is, the error between the returned result
and the true result must be less than 1/2^(scale).
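
The ``llvm.sdiv.fix`` expansion shown earlier in this section can be modelled
informally in Python. The sketch below assumes a target that rounds towards
zero, as in that expansion; the helper name ``sdiv_fix_toward_zero`` is
hypothetical, the operands are signed scaled integers, and neither division
by zero nor the range of the result type is modelled.

.. code-block:: python

      def sdiv_fix_toward_zero(a, b, scale):
          """Divide two scaled integers, returning a quotient in the same scale."""
          num = a << scale          # pre-shift so the quotient keeps `scale` fraction bits
          q = abs(num) // abs(b)    # magnitude of the quotient, truncated
          return q if (num < 0) == (b < 0) else -q

For instance, with a scale of 1, ``sdiv_fix_toward_zero(-5, 3, 1)`` divides
-2.5 by 1.5 and evaluates to ``-3``, which represents -1.5. The true quotient
is about -1.667, so the error is below 1/2^(scale) = 0.5.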
15814
15815
15816 '``llvm.smul.fix.*``' Intrinsics
15817 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15818
15819 Syntax
15820 """""""
15821
15822 This is an overloaded intrinsic. You can use ``llvm.smul.fix``
15823 on any integer bit width or vectors of integers.
15824
15825 ::
15826
15827       declare i16 @llvm.smul.fix.i16(i16 %a, i16 %b, i32 %scale)
15828       declare i32 @llvm.smul.fix.i32(i32 %a, i32 %b, i32 %scale)
15829       declare i64 @llvm.smul.fix.i64(i64 %a, i64 %b, i32 %scale)
15830       declare <4 x i32> @llvm.smul.fix.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)
15831
15832 Overview
15833 """""""""
15834
The '``llvm.smul.fix``' family of intrinsic functions performs signed
fixed point multiplication on two arguments of the same scale.
15837
15838 Arguments
15839 """"""""""
15840
The arguments (``%a`` and ``%b``) and the result may be of integer types of any
bit width, but they must have the same bit width. The arguments may also be
vectors of integers, provided that they have the same length and element bit
width. ``%a`` and ``%b`` are the two values that will undergo signed fixed point
multiplication. The argument ``%scale`` represents the scale of both operands,
and must be a constant integer.
15847
15848 Semantics:
15849 """"""""""
15850
15851 This operation performs fixed point multiplication on the 2 arguments of a
15852 specified scale. The result will also be returned in the same scale specified
15853 in the third argument.
15854
15855 If the result value cannot be precisely represented in the given scale, the
15856 value is rounded up or down to the closest representable value. The rounding
15857 direction is unspecified.
15858
15859 It is undefined behavior if the result value does not fit within the range of
15860 the fixed point type.
15861
15862
15863 Examples
15864 """""""""
15865
15866 .. code-block:: llvm
15867
15868       %res = call i4 @llvm.smul.fix.i4(i4 3, i4 2, i32 0)  ; %res = 6 (2 x 3 = 6)
15869       %res = call i4 @llvm.smul.fix.i4(i4 3, i4 2, i32 1)  ; %res = 3 (1.5 x 1 = 1.5)
15870       %res = call i4 @llvm.smul.fix.i4(i4 3, i4 -2, i32 1)  ; %res = -3 (1.5 x -1 = -1.5)
15871
15872       ; The result in the following could be rounded up to -2 or down to -2.5
15873       %res = call i4 @llvm.smul.fix.i4(i4 3, i4 -3, i32 1)  ; %res = -5 (or -4) (1.5 x -1.5 = -2.25)
15874
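The vector form applies the same operation to each pair of lanes. The following
sketch multiplies two vectors whose elements are assumed to be in a signed
Q16.16 format (a scale of 16). The function name ``@mul_q16_16`` is hypothetical
and only serves the illustration.

.. code-block:: llvm

      declare <4 x i32> @llvm.smul.fix.v4i32(<4 x i32>, <4 x i32>, i32)

      define <4 x i32> @mul_q16_16(<4 x i32> %x, <4 x i32> %y) {
        ; Each lane is multiplied independently; the constant scale (16) is
        ; shared by all lanes.
        %r = call <4 x i32> @llvm.smul.fix.v4i32(<4 x i32> %x, <4 x i32> %y, i32 16)
        ret <4 x i32> %r
      }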
15875
15876 '``llvm.umul.fix.*``' Intrinsics
15877 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15878
15879 Syntax
15880 """""""
15881
15882 This is an overloaded intrinsic. You can use ``llvm.umul.fix``
15883 on any integer bit width or vectors of integers.
15884
15885 ::
15886
15887       declare i16 @llvm.umul.fix.i16(i16 %a, i16 %b, i32 %scale)
15888       declare i32 @llvm.umul.fix.i32(i32 %a, i32 %b, i32 %scale)
15889       declare i64 @llvm.umul.fix.i64(i64 %a, i64 %b, i32 %scale)
15890       declare <4 x i32> @llvm.umul.fix.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)
15891
15892 Overview
15893 """""""""
15894
The '``llvm.umul.fix``' family of intrinsic functions performs unsigned
fixed point multiplication on two arguments of the same scale.
15897
15898 Arguments
15899 """"""""""
15900
The arguments (``%a`` and ``%b``) and the result may be of integer types of any
bit width, but they must have the same bit width. The arguments may also be
vectors of integers, provided that they have the same length and element bit
width. ``%a`` and ``%b`` are the two values that will undergo unsigned fixed
point multiplication. The argument ``%scale`` represents the scale of both
operands, and must be a constant integer.
15907
15908 Semantics:
15909 """"""""""
15910
15911 This operation performs unsigned fixed point multiplication on the 2 arguments of a
15912 specified scale. The result will also be returned in the same scale specified
15913 in the third argument.
15914
15915 If the result value cannot be precisely represented in the given scale, the
15916 value is rounded up or down to the closest representable value. The rounding
15917 direction is unspecified.
15918
15919 It is undefined behavior if the result value does not fit within the range of
15920 the fixed point type.
15921
15922
15923 Examples
15924 """""""""
15925
15926 .. code-block:: llvm
15927
15928       %res = call i4 @llvm.umul.fix.i4(i4 3, i4 2, i32 0)  ; %res = 6 (2 x 3 = 6)
15929       %res = call i4 @llvm.umul.fix.i4(i4 3, i4 2, i32 1)  ; %res = 3 (1.5 x 1 = 1.5)
15930
15931       ; The result in the following could be rounded down to 3.5 or up to 4
15932       %res = call i4 @llvm.umul.fix.i4(i4 15, i4 1, i32 1)  ; %res = 7 (or 8) (7.5 x 0.5 = 3.75)
15933
15934
15935 '``llvm.smul.fix.sat.*``' Intrinsics
15936 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15937
15938 Syntax
15939 """""""
15940
15941 This is an overloaded intrinsic. You can use ``llvm.smul.fix.sat``
15942 on any integer bit width or vectors of integers.
15943
15944 ::
15945
15946       declare i16 @llvm.smul.fix.sat.i16(i16 %a, i16 %b, i32 %scale)
15947       declare i32 @llvm.smul.fix.sat.i32(i32 %a, i32 %b, i32 %scale)
15948       declare i64 @llvm.smul.fix.sat.i64(i64 %a, i64 %b, i32 %scale)
15949       declare <4 x i32> @llvm.smul.fix.sat.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)
15950
15951 Overview
15952 """""""""
15953
The '``llvm.smul.fix.sat``' family of intrinsic functions performs signed
fixed point saturating multiplication on two arguments of the same scale.
15956
15957 Arguments
15958 """"""""""
15959
15960 The arguments (%a and %b) and the result may be of integer types of any bit
15961 width, but they must have the same bit width. ``%a`` and ``%b`` are the two
15962 values that will undergo signed fixed point multiplication. The argument
15963 ``%scale`` represents the scale of both operands, and must be a constant
15964 integer.
15965
15966 Semantics:
15967 """"""""""
15968
15969 This operation performs fixed point multiplication on the 2 arguments of a
15970 specified scale. The result will also be returned in the same scale specified
15971 in the third argument.
15972
15973 If the result value cannot be precisely represented in the given scale, the
15974 value is rounded up or down to the closest representable value. The rounding
15975 direction is unspecified.
15976
15977 The maximum value this operation can clamp to is the largest signed value
15978 representable by the bit width of the first 2 arguments. The minimum value is the
15979 smallest signed value representable by this bit width.
15980
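In the style of the expansions shown earlier in this section, a saturating
signed multiplication with a scale of 1 could be sketched as follows. The
sketch assumes a target that rounds towards negative infinity. The function
name is hypothetical, and clamping with ``llvm.smax`` and ``llvm.smin`` is only
one possible way of expressing the saturation, not a required lowering.

.. code-block:: llvm

      declare i8 @llvm.smax.i8(i8, i8)
      declare i8 @llvm.smin.i8(i8, i8)

      ; Hypothetical expansion of:
      ;   call i4 @llvm.smul.fix.sat.i4(i4 %a, i4 %b, i32 1)
      define i4 @smul_fix_sat_i4_scale1(i4 %a, i4 %b) {
        %a2 = sext i4 %a to i8
        %b2 = sext i4 %b to i8
        %mul = mul nsw i8 %a2, %b2                     ; full product, fits in i8
        %r = ashr i8 %mul, 1                           ; drop the scale bits (rounds down)
        %lo = call i8 @llvm.smax.i8(i8 %r, i8 -8)      ; clamp to the smallest i4 value
        %clamped = call i8 @llvm.smin.i8(i8 %lo, i8 7) ; clamp to the largest i4 value
        %result = trunc i8 %clamped to i4
        ret i4 %result
      }

With ``%a = -8`` and ``%b = -2`` the intermediate value ``%r`` is 8, which is
clamped to 7.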
15981
15982 Examples
15983 """""""""
15984
15985 .. code-block:: llvm
15986
15987       %res = call i4 @llvm.smul.fix.sat.i4(i4 3, i4 2, i32 0)  ; %res = 6 (2 x 3 = 6)
15988       %res = call i4 @llvm.smul.fix.sat.i4(i4 3, i4 2, i32 1)  ; %res = 3 (1.5 x 1 = 1.5)
15989       %res = call i4 @llvm.smul.fix.sat.i4(i4 3, i4 -2, i32 1)  ; %res = -3 (1.5 x -1 = -1.5)
15990
15991       ; The result in the following could be rounded up to -2 or down to -2.5
15992       %res = call i4 @llvm.smul.fix.sat.i4(i4 3, i4 -3, i32 1)  ; %res = -5 (or -4) (1.5 x -1.5 = -2.25)
15993
15994       ; Saturation
15995       %res = call i4 @llvm.smul.fix.sat.i4(i4 7, i4 2, i32 0)  ; %res = 7
15996       %res = call i4 @llvm.smul.fix.sat.i4(i4 7, i4 4, i32 2)  ; %res = 7
15997       %res = call i4 @llvm.smul.fix.sat.i4(i4 -8, i4 5, i32 2)  ; %res = -8
15998       %res = call i4 @llvm.smul.fix.sat.i4(i4 -8, i4 -2, i32 1)  ; %res = 7
15999
16000       ; Scale can affect the saturation result
16001       %res = call i4 @llvm.smul.fix.sat.i4(i4 2, i4 4, i32 0)  ; %res = 7 (2 x 4 -> clamped to 7)
16002       %res = call i4 @llvm.smul.fix.sat.i4(i4 2, i4 4, i32 1)  ; %res = 4 (1 x 2 = 2)
16003
16004
16005 '``llvm.umul.fix.sat.*``' Intrinsics
16006 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16007
16008 Syntax
16009 """""""
16010
16011 This is an overloaded intrinsic. You can use ``llvm.umul.fix.sat``
16012 on any integer bit width or vectors of integers.
16013
16014 ::
16015
16016       declare i16 @llvm.umul.fix.sat.i16(i16 %a, i16 %b, i32 %scale)
16017       declare i32 @llvm.umul.fix.sat.i32(i32 %a, i32 %b, i32 %scale)
16018       declare i64 @llvm.umul.fix.sat.i64(i64 %a, i64 %b, i32 %scale)
16019       declare <4 x i32> @llvm.umul.fix.sat.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)
16020
16021 Overview
16022 """""""""
16023
The '``llvm.umul.fix.sat``' family of intrinsic functions performs unsigned
fixed point saturating multiplication on two arguments of the same scale.
16026
16027 Arguments
16028 """"""""""
16029
16030 The arguments (%a and %b) and the result may be of integer types of any bit
16031 width, but they must have the same bit width. ``%a`` and ``%b`` are the two
16032 values that will undergo unsigned fixed point multiplication. The argument
16033 ``%scale`` represents the scale of both operands, and must be a constant
16034 integer.
16035
16036 Semantics:
16037 """"""""""
16038
16039 This operation performs fixed point multiplication on the 2 arguments of a
16040 specified scale. The result will also be returned in the same scale specified
16041 in the third argument.
16042
16043 If the result value cannot be precisely represented in the given scale, the
16044 value is rounded up or down to the closest representable value. The rounding
16045 direction is unspecified.
16046
16047 The maximum value this operation can clamp to is the largest unsigned value
16048 representable by the bit width of the first 2 arguments. The minimum value is the
16049 smallest unsigned value representable by this bit width (zero).
16050
16051
16052 Examples
16053 """""""""
16054
16055 .. code-block:: llvm
16056
16057       %res = call i4 @llvm.umul.fix.sat.i4(i4 3, i4 2, i32 0)  ; %res = 6 (2 x 3 = 6)
16058       %res = call i4 @llvm.umul.fix.sat.i4(i4 3, i4 2, i32 1)  ; %res = 3 (1.5 x 1 = 1.5)
16059
16060       ; The result in the following could be rounded down to 2 or up to 2.5
16061       %res = call i4 @llvm.umul.fix.sat.i4(i4 3, i4 3, i32 1)  ; %res = 4 (or 5) (1.5 x 1.5 = 2.25)
16062
16063       ; Saturation
16064       %res = call i4 @llvm.umul.fix.sat.i4(i4 8, i4 2, i32 0)  ; %res = 15 (8 x 2 -> clamped to 15)
16065       %res = call i4 @llvm.umul.fix.sat.i4(i4 8, i4 8, i32 2)  ; %res = 15 (2 x 2 -> clamped to 3.75)
16066
16067       ; Scale can affect the saturation result
      %res = call i4 @llvm.umul.fix.sat.i4(i4 4, i4 4, i32 0)  ; %res = 15 (4 x 4 -> clamped to 15)
      %res = call i4 @llvm.umul.fix.sat.i4(i4 4, i4 4, i32 1)  ; %res = 8 (2 x 2 = 4)
16070
16071
16072 '``llvm.sdiv.fix.*``' Intrinsics
16073 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16074
16075 Syntax
16076 """""""
16077
16078 This is an overloaded intrinsic. You can use ``llvm.sdiv.fix``
16079 on any integer bit width or vectors of integers.
16080
16081 ::
16082
16083       declare i16 @llvm.sdiv.fix.i16(i16 %a, i16 %b, i32 %scale)
16084       declare i32 @llvm.sdiv.fix.i32(i32 %a, i32 %b, i32 %scale)
16085       declare i64 @llvm.sdiv.fix.i64(i64 %a, i64 %b, i32 %scale)
16086       declare <4 x i32> @llvm.sdiv.fix.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)
16087
16088 Overview
16089 """""""""
16090
The '``llvm.sdiv.fix``' family of intrinsic functions performs signed
fixed point division on two arguments of the same scale.
16093
16094 Arguments
16095 """"""""""
16096
The arguments (``%a`` and ``%b``) and the result may be of integer types of any
bit width, but they must have the same bit width. The arguments may also be
vectors of integers, provided that they have the same length and element bit
width. ``%a`` and ``%b`` are the two values that will undergo signed fixed point
division. The argument ``%scale`` represents the scale of both operands, and
must be a constant integer.
16103
16104 Semantics:
16105 """"""""""
16106
16107 This operation performs fixed point division on the 2 arguments of a
16108 specified scale. The result will also be returned in the same scale specified
16109 in the third argument.
16110
16111 If the result value cannot be precisely represented in the given scale, the
16112 value is rounded up or down to the closest representable value. The rounding
16113 direction is unspecified.
16114
16115 It is undefined behavior if the result value does not fit within the range of
16116 the fixed point type, or if the second argument is zero.
16117
16118
16119 Examples
16120 """""""""
16121
16122 .. code-block:: llvm
16123
16124       %res = call i4 @llvm.sdiv.fix.i4(i4 6, i4 2, i32 0)  ; %res = 3 (6 / 2 = 3)
16125       %res = call i4 @llvm.sdiv.fix.i4(i4 6, i4 4, i32 1)  ; %res = 3 (3 / 2 = 1.5)
16126       %res = call i4 @llvm.sdiv.fix.i4(i4 3, i4 -2, i32 1) ; %res = -3 (1.5 / -1 = -1.5)
16127
16128       ; The result in the following could be rounded up to 1 or down to 0.5
16129       %res = call i4 @llvm.sdiv.fix.i4(i4 3, i4 4, i32 1)  ; %res = 2 (or 1) (1.5 / 2 = 0.75)
16130
16131
16132 '``llvm.udiv.fix.*``' Intrinsics
16133 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16134
16135 Syntax
16136 """""""
16137
16138 This is an overloaded intrinsic. You can use ``llvm.udiv.fix``
16139 on any integer bit width or vectors of integers.
16140
16141 ::
16142
16143       declare i16 @llvm.udiv.fix.i16(i16 %a, i16 %b, i32 %scale)
16144       declare i32 @llvm.udiv.fix.i32(i32 %a, i32 %b, i32 %scale)
16145       declare i64 @llvm.udiv.fix.i64(i64 %a, i64 %b, i32 %scale)
16146       declare <4 x i32> @llvm.udiv.fix.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)
16147
16148 Overview
16149 """""""""
16150
The '``llvm.udiv.fix``' family of intrinsic functions performs unsigned
fixed point division on two arguments of the same scale.
16153
16154 Arguments
16155 """"""""""
16156
The arguments (``%a`` and ``%b``) and the result may be of integer types of any
bit width, but they must have the same bit width. The arguments may also be
vectors of integers, provided that they have the same length and element bit
width. ``%a`` and ``%b`` are the two values that will undergo unsigned fixed
point division. The argument ``%scale`` represents the scale of both operands,
and must be a constant integer.
16163
16164 Semantics:
16165 """"""""""
16166
16167 This operation performs fixed point division on the 2 arguments of a
16168 specified scale. The result will also be returned in the same scale specified
16169 in the third argument.
16170
16171 If the result value cannot be precisely represented in the given scale, the
16172 value is rounded up or down to the closest representable value. The rounding
16173 direction is unspecified.
16174
16175 It is undefined behavior if the result value does not fit within the range of
16176 the fixed point type, or if the second argument is zero.
16177
16178
16179 Examples
16180 """""""""
16181
16182 .. code-block:: llvm
16183
16184       %res = call i4 @llvm.udiv.fix.i4(i4 6, i4 2, i32 0)  ; %res = 3 (6 / 2 = 3)
16185       %res = call i4 @llvm.udiv.fix.i4(i4 6, i4 4, i32 1)  ; %res = 3 (3 / 2 = 1.5)
16186       %res = call i4 @llvm.udiv.fix.i4(i4 1, i4 -8, i32 4) ; %res = 2 (0.0625 / 0.5 = 0.125)
16187
16188       ; The result in the following could be rounded up to 1 or down to 0.5
16189       %res = call i4 @llvm.udiv.fix.i4(i4 3, i4 4, i32 1)  ; %res = 2 (or 1) (1.5 / 2 = 0.75)
16190
16191
16192 '``llvm.sdiv.fix.sat.*``' Intrinsics
16193 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16194
16195 Syntax
16196 """""""
16197
16198 This is an overloaded intrinsic. You can use ``llvm.sdiv.fix.sat``
16199 on any integer bit width or vectors of integers.
16200
16201 ::
16202
16203       declare i16 @llvm.sdiv.fix.sat.i16(i16 %a, i16 %b, i32 %scale)
16204       declare i32 @llvm.sdiv.fix.sat.i32(i32 %a, i32 %b, i32 %scale)
16205       declare i64 @llvm.sdiv.fix.sat.i64(i64 %a, i64 %b, i32 %scale)
16206       declare <4 x i32> @llvm.sdiv.fix.sat.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)
16207
16208 Overview
16209 """""""""
16210
The '``llvm.sdiv.fix.sat``' family of intrinsic functions performs signed
fixed point saturating division on two arguments of the same scale.
16213
16214 Arguments
16215 """"""""""
16216
16217 The arguments (%a and %b) and the result may be of integer types of any bit
16218 width, but they must have the same bit width. ``%a`` and ``%b`` are the two
16219 values that will undergo signed fixed point division. The argument
16220 ``%scale`` represents the scale of both operands, and must be a constant
16221 integer.
16222
16223 Semantics:
16224 """"""""""
16225
16226 This operation performs fixed point division on the 2 arguments of a
16227 specified scale. The result will also be returned in the same scale specified
16228 in the third argument.
16229
16230 If the result value cannot be precisely represented in the given scale, the
16231 value is rounded up or down to the closest representable value. The rounding
16232 direction is unspecified.
16233
16234 The maximum value this operation can clamp to is the largest signed value
16235 representable by the bit width of the first 2 arguments. The minimum value is the
16236 smallest signed value representable by this bit width.
16237
16238 It is undefined behavior if the second argument is zero.
16239
16240
16241 Examples
16242 """""""""
16243
16244 .. code-block:: llvm
16245
16246       %res = call i4 @llvm.sdiv.fix.sat.i4(i4 6, i4 2, i32 0)  ; %res = 3 (6 / 2 = 3)
16247       %res = call i4 @llvm.sdiv.fix.sat.i4(i4 6, i4 4, i32 1)  ; %res = 3 (3 / 2 = 1.5)
16248       %res = call i4 @llvm.sdiv.fix.sat.i4(i4 3, i4 -2, i32 1) ; %res = -3 (1.5 / -1 = -1.5)
16249
16250       ; The result in the following could be rounded up to 1 or down to 0.5
16251       %res = call i4 @llvm.sdiv.fix.sat.i4(i4 3, i4 4, i32 1)  ; %res = 2 (or 1) (1.5 / 2 = 0.75)
16252
16253       ; Saturation
16254       %res = call i4 @llvm.sdiv.fix.sat.i4(i4 -8, i4 -1, i32 0)  ; %res = 7 (-8 / -1 = 8 => 7)
16255       %res = call i4 @llvm.sdiv.fix.sat.i4(i4 4, i4 2, i32 2)  ; %res = 7 (1 / 0.5 = 2 => 1.75)
16256       %res = call i4 @llvm.sdiv.fix.sat.i4(i4 -4, i4 1, i32 2)  ; %res = -8 (-1 / 0.25 = -4 => -2)
16257
16258
16259 '``llvm.udiv.fix.sat.*``' Intrinsics
16260 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16261
16262 Syntax
16263 """""""
16264
16265 This is an overloaded intrinsic. You can use ``llvm.udiv.fix.sat``
16266 on any integer bit width or vectors of integers.
16267
16268 ::
16269
16270       declare i16 @llvm.udiv.fix.sat.i16(i16 %a, i16 %b, i32 %scale)
16271       declare i32 @llvm.udiv.fix.sat.i32(i32 %a, i32 %b, i32 %scale)
16272       declare i64 @llvm.udiv.fix.sat.i64(i64 %a, i64 %b, i32 %scale)
16273       declare <4 x i32> @llvm.udiv.fix.sat.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)
16274
16275 Overview
16276 """""""""
16277
The '``llvm.udiv.fix.sat``' family of intrinsic functions performs unsigned
fixed point saturating division on two arguments of the same scale.
16280
16281 Arguments
16282 """"""""""
16283
16284 The arguments (%a and %b) and the result may be of integer types of any bit
16285 width, but they must have the same bit width. ``%a`` and ``%b`` are the two
16286 values that will undergo unsigned fixed point division. The argument
16287 ``%scale`` represents the scale of both operands, and must be a constant
16288 integer.
16289
16290 Semantics:
16291 """"""""""
16292
16293 This operation performs fixed point division on the 2 arguments of a
16294 specified scale. The result will also be returned in the same scale specified
16295 in the third argument.
16296
16297 If the result value cannot be precisely represented in the given scale, the
16298 value is rounded up or down to the closest representable value. The rounding
16299 direction is unspecified.
16300
16301 The maximum value this operation can clamp to is the largest unsigned value
16302 representable by the bit width of the first 2 arguments. The minimum value is the
16303 smallest unsigned value representable by this bit width (zero).
16304
16305 It is undefined behavior if the second argument is zero.
16306
16307 Examples
16308 """""""""
16309
16310 .. code-block:: llvm
16311
16312       %res = call i4 @llvm.udiv.fix.sat.i4(i4 6, i4 2, i32 0)  ; %res = 3 (6 / 2 = 3)
16313       %res = call i4 @llvm.udiv.fix.sat.i4(i4 6, i4 4, i32 1)  ; %res = 3 (3 / 2 = 1.5)
16314
16315       ; The result in the following could be rounded down to 0.5 or up to 1
16316       %res = call i4 @llvm.udiv.fix.sat.i4(i4 3, i4 4, i32 1)  ; %res = 1 (or 2) (1.5 / 2 = 0.75)
16317
16318       ; Saturation
16319       %res = call i4 @llvm.udiv.fix.sat.i4(i4 8, i4 2, i32 2)  ; %res = 15 (2 / 0.5 = 4 => 3.75)
16320
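The saturating unsigned division can be sketched as an expansion in the same
style as the ``llvm.*div.fix`` expansion shown earlier, here for a scale of 2.
The sketch assumes a target that rounds towards zero. The function name is
hypothetical, and ``llvm.umin`` is only one possible way of expressing the
clamp.

.. code-block:: llvm

      declare i8 @llvm.umin.i8(i8, i8)

      ; Hypothetical expansion of:
      ;   call i4 @llvm.udiv.fix.sat.i4(i4 %a, i4 %b, i32 2)
      define i4 @udiv_fix_sat_i4_scale2(i4 %a, i4 %b) {
        %a2 = zext i4 %a to i8
        %b2 = zext i4 %b to i8
        %a3 = shl nuw i8 %a2, 2                        ; pre-scale the dividend
        %r = udiv i8 %a3, %b2                          ; undefined behavior if %b is zero
        %clamped = call i8 @llvm.umin.i8(i8 %r, i8 15) ; clamp to the largest i4 value
        %result = trunc i8 %clamped to i4
        ret i4 %result
      }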
16321
16322 Specialised Arithmetic Intrinsics
16323 ---------------------------------
16324
16325 .. _i_intr_llvm_canonicalize:
16326
16327 '``llvm.canonicalize.*``' Intrinsic
16328 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16329
16330 Syntax:
16331 """""""
16332
16333 ::
16334
16335       declare float @llvm.canonicalize.f32(float %a)
16336       declare double @llvm.canonicalize.f64(double %b)
16337
16338 Overview:
16339 """""""""
16340
The '``llvm.canonicalize.*``' intrinsic returns the platform-specific canonical
16342 encoding of a floating-point number. This canonicalization is useful for
16343 implementing certain numeric primitives such as frexp. The canonical encoding is
16344 defined by IEEE-754-2008 to be:
16345
16346 ::
16347
16348       2.1.8 canonical encoding: The preferred encoding of a floating-point
16349       representation in a format. Applied to declets, significands of finite
16350       numbers, infinities, and NaNs, especially in decimal formats.
16351
16352 This operation can also be considered equivalent to the IEEE-754-2008
16353 conversion of a floating-point value to the same format. NaNs are handled
16354 according to section 6.2.
16355
16356 Examples of non-canonical encodings:
16357
16358 - x87 pseudo denormals, pseudo NaNs, pseudo Infinity, Unnormals. These are
16359   converted to a canonical representation per hardware-specific protocol.
16360 - Many normal decimal floating-point numbers have non-canonical alternative
16361   encodings.
16362 - Some machines, like GPUs or ARMv7 NEON, do not support subnormal values.
16363   These are treated as non-canonical encodings of zero and will be flushed to
16364   a zero of the same sign by this operation.
16365
16366 Note that per IEEE-754-2008 6.2, systems that support signaling NaNs with
16367 default exception handling must signal an invalid exception, and produce a
16368 quiet NaN result.
16369
16370 This function should always be implementable as multiplication by 1.0, provided
16371 that the compiler does not constant fold the operation. Likewise, division by
16372 1.0 and ``llvm.minnum(x, x)`` are possible implementations. Addition with
16373 -0.0 is also sufficient provided that the rounding mode is not -Infinity.
16374
16375 ``@llvm.canonicalize`` must preserve the equality relation. That is:
16376
- ``(@llvm.canonicalize(x) == x)`` is equivalent to ``(x == x)``
- ``(@llvm.canonicalize(x) == @llvm.canonicalize(y))`` is equivalent to
  ``(x == y)``
16380
16381 Additionally, the sign of zero must be conserved:
16382 ``@llvm.canonicalize(-0.0) = -0.0`` and ``@llvm.canonicalize(+0.0) = +0.0``
16383
16384 The payload bits of a NaN must be conserved, with two exceptions.
16385 First, environments which use only a single canonical representation of NaN
16386 must perform said canonicalization. Second, SNaNs must be quieted per the
16387 usual methods.
16388
16389 The canonicalization operation may be optimized away if:
16390
16391 - The input is known to be canonical. For example, it was produced by a
16392   floating-point operation that is required by the standard to be canonical.
16393 - The result is consumed only by (or fused with) other floating-point
16394   operations. That is, the bits of the floating-point value are not examined.
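
As a small illustration of the second condition, the hypothetical function
below examines the bits of the canonicalized value through a ``bitcast``. Under
the rules above, the call could then only be removed if ``%x`` were known to be
canonical already.

.. code-block:: llvm

      declare float @llvm.canonicalize.f32(float)

      define i32 @canonical_bits(float %x) {
        ; The result is not consumed by a floating-point operation: its
        ; encoding is observed directly as an integer.
        %c = call float @llvm.canonicalize.f32(float %x)
        %bits = bitcast float %c to i32
        ret i32 %bits
      }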
16395
16396 '``llvm.fmuladd.*``' Intrinsic
16397 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16398
16399 Syntax:
16400 """""""
16401
16402 ::
16403
16404       declare float @llvm.fmuladd.f32(float %a, float %b, float %c)
16405       declare double @llvm.fmuladd.f64(double %a, double %b, double %c)
16406
16407 Overview:
16408 """""""""
16409
16410 The '``llvm.fmuladd.*``' intrinsic functions represent multiply-add
16411 expressions that can be fused if the code generator determines that (a) the
16412 target instruction set has support for a fused operation, and (b) that the
16413 fused operation is more efficient than the equivalent, separate pair of mul
16414 and add instructions.
16415
16416 Arguments:
16417 """"""""""
16418
16419 The '``llvm.fmuladd.*``' intrinsics each take three arguments: two
16420 multiplicands, a and b, and an addend c.
16421
16422 Semantics:
16423 """"""""""
16424
16425 The expression:
16426
16427 ::
16428
      %0 = call float @llvm.fmuladd.f32(float %a, float %b, float %c)
16430
16431 is equivalent to the expression a \* b + c, except that it is unspecified
16432 whether rounding will be performed between the multiplication and addition
16433 steps. Fusion is not guaranteed, even if the target platform supports it.
16434 If a fused multiply-add is required, the corresponding
16435 :ref:`llvm.fma <int_fma>` intrinsic function should be used instead.
Like '``llvm.fma.*``', this intrinsic never sets errno.
16437
16438 Examples:
16439 """""""""
16440
16441 .. code-block:: llvm
16442
16443       %r2 = call float @llvm.fmuladd.f32(float %a, float %b, float %c) ; yields float:r2 = (a * b) + c
16444
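Since fusion is not guaranteed, a call to ``llvm.fmuladd`` may behave like
either of the two functions sketched below, and the choice is up to the code
generator. The function names are hypothetical.

.. code-block:: llvm

      declare float @llvm.fma.f32(float, float, float)

      ; Unfused form: the product is rounded before the addition.
      define float @fmuladd_as_mul_add(float %a, float %b, float %c) {
        %mul = fmul float %a, %b
        %sum = fadd float %mul, %c
        ret float %sum
      }

      ; Fused form: a single rounding is applied after the addition.
      define float @fmuladd_as_fma(float %a, float %b, float %c) {
        %r = call float @llvm.fma.f32(float %a, float %b, float %c)
        ret float %r
      }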
16445
16446 Hardware-Loop Intrinsics
16447 ------------------------
16448
LLVM supports several intrinsics to mark a loop as a hardware-loop. They are
hints to the backend, which is required to either lower these intrinsics
further to target-specific instructions, or revert the hardware-loop to a
normal loop if target-specific restrictions are not met and a hardware-loop
can't be generated.
16453
16454 These intrinsics may be modified in the future and are not intended to be used
16455 outside the backend. Thus, front-end and mid-level optimizations should not be
16456 generating these intrinsics.
16457
16458
16459 '``llvm.set.loop.iterations.*``' Intrinsic
16460 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16461
16462 Syntax:
16463 """""""
16464
16465 This is an overloaded intrinsic.
16466
16467 ::
16468
16469       declare void @llvm.set.loop.iterations.i32(i32)
16470       declare void @llvm.set.loop.iterations.i64(i64)
16471
16472 Overview:
16473 """""""""
16474
16475 The '``llvm.set.loop.iterations.*``' intrinsics are used to specify the
16476 hardware-loop trip count. They are placed in the loop preheader basic block and
16477 are marked as ``IntrNoDuplicate`` to avoid optimizers duplicating these
16478 instructions.
16479
16480 Arguments:
16481 """"""""""
16482
16483 The integer operand is the loop trip count of the hardware-loop, and thus
16484 not e.g. the loop back-edge taken count.
16485
16486 Semantics:
16487 """"""""""
16488
The '``llvm.set.loop.iterations.*``' intrinsics do not perform any arithmetic
on their operand. They are a hint to the backend, which can use the operand to
set up the hardware-loop count with a target-specific instruction, usually a
move of this value to a special register or a hardware-loop instruction.
16493
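As a hand-written sketch of where this intrinsic sits, the loop below sets the
trip count in the preheader and lets '``llvm.loop.decrement.*``' control the
back-edge. The function name and loop body are hypothetical, and ``%n`` is
assumed to be non-zero. Real uses are generated by the backend rather than
written by hand.

.. code-block:: llvm

      declare void @llvm.set.loop.iterations.i32(i32)
      declare i1 @llvm.loop.decrement.i32(i32)

      define i32 @repeat_add(i32 %x, i32 %n) {
      entry:
        ; Loop preheader: announce the trip count to the backend.
        call void @llvm.set.loop.iterations.i32(i32 %n)
        br label %loop

      loop:
        %acc = phi i32 [ 0, %entry ], [ %acc.next, %loop ]
        %acc.next = add i32 %acc, %x
        ; Decrement the loop counter by 1; false means the loop should exit.
        %continue = call i1 @llvm.loop.decrement.i32(i32 1)
        br i1 %continue, label %loop, label %exit

      exit:
        ret i32 %acc.next
      }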
16494
16495 '``llvm.start.loop.iterations.*``' Intrinsic
16496 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16497
16498 Syntax:
16499 """""""
16500
16501 This is an overloaded intrinsic.
16502
16503 ::
16504
16505       declare i32 @llvm.start.loop.iterations.i32(i32)
16506       declare i64 @llvm.start.loop.iterations.i64(i64)
16507
16508 Overview:
16509 """""""""
16510
The '``llvm.start.loop.iterations.*``' intrinsics are similar to the
'``llvm.set.loop.iterations.*``' intrinsics: they specify the hardware-loop
trip count, but also produce a value identical to the input that can be used
as the input to the loop. They are placed in the loop preheader basic block,
and the output is expected to be the input to the phi for the induction
variable of the loop, which is decremented by
'``llvm.loop.decrement.reg.*``'.
16518
16519 Arguments:
16520 """"""""""
16521
16522 The integer operand is the loop trip count of the hardware-loop, and thus
16523 not e.g. the loop back-edge taken count.
16524
16525 Semantics:
16526 """"""""""
16527
The '``llvm.start.loop.iterations.*``' intrinsics do not perform any arithmetic
on their operand. They are a hint to the backend, which can use the operand to
set up the hardware-loop count with a target-specific instruction, usually a
move of this value to a special register or a hardware-loop instruction.
16532
16533 '``llvm.test.set.loop.iterations.*``' Intrinsic
16534 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16535
16536 Syntax:
16537 """""""
16538
16539 This is an overloaded intrinsic.
16540
16541 ::
16542
16543       declare i1 @llvm.test.set.loop.iterations.i32(i32)
16544       declare i1 @llvm.test.set.loop.iterations.i64(i64)
16545
16546 Overview:
16547 """""""""
16548
The '``llvm.test.set.loop.iterations.*``' intrinsics are used to specify the
loop trip count, and also test that the given count is not zero, allowing
it to control entry to a while-loop. They are placed in the loop preheader's
predecessor basic block, and are marked as ``IntrNoDuplicate`` to avoid
optimizers duplicating these instructions.
16554
16555 Arguments:
16556 """"""""""
16557
16558 The integer operand is the loop trip count of the hardware-loop, and thus
16559 not e.g. the loop back-edge taken count.
16560
16561 Semantics:
16562 """"""""""
16563
The '``llvm.test.set.loop.iterations.*``' intrinsics do not perform any
arithmetic on their operand. They are a hint to the backend, which can use the
operand to set up the hardware-loop count with a target-specific instruction,
usually a move of this value to a special register or a hardware-loop
instruction. The result is a condition that is true if the given count is not
zero.
16569
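A hand-written sketch of the resulting while-loop shape is given below. The
call sits in the predecessor of the preheader, and its result decides whether
the loop is entered at all. The function and block names are hypothetical.

.. code-block:: llvm

      declare i1 @llvm.test.set.loop.iterations.i32(i32)
      declare i1 @llvm.loop.decrement.i32(i32)

      define i32 @repeat_add_guarded(i32 %x, i32 %n) {
      entry:
        ; Set up the trip count and test it; false skips the loop entirely.
        %enter = call i1 @llvm.test.set.loop.iterations.i32(i32 %n)
        br i1 %enter, label %preheader, label %exit

      preheader:
        br label %loop

      loop:
        %acc = phi i32 [ 0, %preheader ], [ %acc.next, %loop ]
        %acc.next = add i32 %acc, %x
        %continue = call i1 @llvm.loop.decrement.i32(i32 1)
        br i1 %continue, label %loop, label %exit

      exit:
        %result = phi i32 [ 0, %entry ], [ %acc.next, %loop ]
        ret i32 %result
      }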
16570
16571 '``llvm.test.start.loop.iterations.*``' Intrinsic
16572 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16573
16574 Syntax:
16575 """""""
16576
16577 This is an overloaded intrinsic.
16578
16579 ::
16580
16581       declare {i32, i1} @llvm.test.start.loop.iterations.i32(i32)
16582       declare {i64, i1} @llvm.test.start.loop.iterations.i64(i64)
16583
16584 Overview:
16585 """""""""
16586
The '``llvm.test.start.loop.iterations.*``' intrinsics are similar to the
'``llvm.test.set.loop.iterations.*``' and '``llvm.start.loop.iterations.*``'
intrinsics: they specify the hardware-loop trip count, but also produce a
value identical to the input that can be used as the input to the loop. The
second output, of type ``i1``, controls entry to a while-loop.
16592
16593 Arguments:
16594 """"""""""
16595
16596 The integer operand is the loop trip count of the hardware-loop, and thus
16597 not e.g. the loop back-edge taken count.
16598
16599 Semantics:
16600 """"""""""
16601
The '``llvm.test.start.loop.iterations.*``' intrinsics do not perform any
arithmetic on their operand. They are a hint to the backend, which can use the
operand to set up the hardware-loop count with a target-specific instruction,
usually a move of this value to a special register or a hardware-loop
instruction. The result is a pair consisting of the input and a condition that
is true if the given count is not zero.
16608
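The two members of the result can be separated with ``extractvalue``, as in the
hand-written sketch below: the first feeds the counter phi of the loop and the
second guards its entry. The function, block, and value names are hypothetical.

.. code-block:: llvm

      declare { i32, i1 } @llvm.test.start.loop.iterations.i32(i32)
      declare i32 @llvm.loop.decrement.reg.i32(i32, i32)

      define i32 @repeat_add_while(i32 %x, i32 %n) {
      entry:
        %pair = call { i32, i1 } @llvm.test.start.loop.iterations.i32(i32 %n)
        %count.start = extractvalue { i32, i1 } %pair, 0  ; identical to %n
        %enter = extractvalue { i32, i1 } %pair, 1        ; true if %n is not zero
        br i1 %enter, label %preheader, label %exit

      preheader:
        br label %loop

      loop:
        %count = phi i32 [ %count.start, %preheader ], [ %count.next, %loop ]
        %acc = phi i32 [ 0, %preheader ], [ %acc.next, %loop ]
        %acc.next = add i32 %acc, %x
        %count.next = call i32 @llvm.loop.decrement.reg.i32(i32 %count, i32 1)
        %continue = icmp ne i32 %count.next, 0
        br i1 %continue, label %loop, label %exit

      exit:
        %result = phi i32 [ 0, %entry ], [ %acc.next, %loop ]
        ret i32 %result
      }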
16609
16610 '``llvm.loop.decrement.reg.*``' Intrinsic
16611 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16612
16613 Syntax:
16614 """""""
16615
16616 This is an overloaded intrinsic.
16617
16618 ::
16619
16620       declare i32 @llvm.loop.decrement.reg.i32(i32, i32)
16621       declare i64 @llvm.loop.decrement.reg.i64(i64, i64)
16622
16623 Overview:
16624 """""""""
16625
16626 The '``llvm.loop.decrement.reg.*``' intrinsics are used to lower the loop
16627 iteration counter and return an updated value that will be used in the next
16628 loop test check.
16629
16630 Arguments:
16631 """"""""""
16632
16633 Both arguments must have identical integer types. The first operand is the
16634 loop iteration counter. The second operand is the maximum number of elements
16635 processed in an iteration.
16636
16637 Semantics:
16638 """"""""""
16639
The '``llvm.loop.decrement.reg.*``' intrinsics do an integer ``SUB`` of their
two operands, which is not allowed to wrap. They return the remaining number of
iterations still to be executed, and can be used together with a ``PHI``,
``ICMP`` and ``BR`` to control the number of loop iterations executed. Any
optimisation is allowed to treat the intrinsic as a ``SUB``, and it is
supported by SCEV, so it is the backend's responsibility to handle cases where
it may be optimised. These intrinsics are marked as ``IntrNoDuplicate`` to
avoid optimizers duplicating these instructions.
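
Examples:
"""""""""

The combination with a ``PHI``, ``ICMP`` and ``BR`` described above can be
sketched as follows. This is a hand-written illustration, not the output of any
particular pass: the function name ``@count_down`` and the value names are
hypothetical, the decrement of 1 is an arbitrary choice, and ``%n`` is assumed
to be non-zero so that the subtraction does not wrap.

.. code-block:: llvm

      declare i32 @llvm.start.loop.iterations.i32(i32)
      declare i32 @llvm.loop.decrement.reg.i32(i32, i32)

      define void @count_down(i32 %n) {
      entry:
        ; Pass the trip count to the backend; the result is identical to %n.
        %start = call i32 @llvm.start.loop.iterations.i32(i32 %n)
        br label %loop

      loop:
        %count = phi i32 [ %start, %entry ], [ %remaining, %loop ]
        ; ... loop body ...
        %remaining = call i32 @llvm.loop.decrement.reg.i32(i32 %count, i32 1)
        %again = icmp ne i32 %remaining, 0
        br i1 %again, label %loop, label %exit

      exit:
        ret void
      }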
16648
16649
16650 '``llvm.loop.decrement.*``' Intrinsic
16651 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16652
16653 Syntax:
16654 """""""
16655
16656 This is an overloaded intrinsic.
16657
16658 ::
16659
16660       declare i1 @llvm.loop.decrement.i32(i32)
16661       declare i1 @llvm.loop.decrement.i64(i64)
16662
16663 Overview:
16664 """""""""
16665
16666 The HardwareLoops pass allows the loop decrement value to be specified with an
16667 option. It defaults to a loop decrement value of 1, but it can be an unsigned
16668 integer value provided by this option.  The '``llvm.loop.decrement.*``'
16669 intrinsics decrement the loop iteration counter with this value, and return a
16670 false predicate if the loop should exit, and true otherwise.
16671 This is emitted if the loop counter is not updated via a ``PHI`` node, which
16672 can also be controlled with an option.
16673
16674 Arguments:
16675 """"""""""
16676
16677 The integer argument is the loop decrement value used to decrement the loop
16678 iteration counter.
16679
16680 Semantics:
16681 """"""""""
16682
The '``llvm.loop.decrement.*``' intrinsics do a ``SUB`` of the loop iteration
counter with the given loop decrement value, and return false if the loop
should exit. This ``SUB`` is not allowed to wrap. The result is a condition
that is used by the conditional branch controlling the loop.
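
Examples:
"""""""""

A minimal sketch of a loop whose counter is not carried in a ``PHI`` node. The
function and value names are hypothetical, and
``llvm.set.loop.iterations`` is assumed here as the companion intrinsic that
sets the counter before the loop is entered; ``%n`` is assumed to be non-zero.

.. code-block:: llvm

      declare void @llvm.set.loop.iterations.i32(i32)
      declare i1 @llvm.loop.decrement.i32(i32)

      define void @count_down_implicit(i32 %n) {
      entry:
        call void @llvm.set.loop.iterations.i32(i32 %n)
        br label %loop

      loop:
        ; ... loop body ...
        ; Decrement the implicit counter by 1; false means the loop exits.
        %continue = call i1 @llvm.loop.decrement.i32(i32 1)
        br i1 %continue, label %loop, label %exit

      exit:
        ret void
      }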
16687
16688
16689 Vector Reduction Intrinsics
16690 ---------------------------
16691
16692 Horizontal reductions of vectors can be expressed using the following
16693 intrinsics. Each one takes a vector operand as an input and applies its
16694 respective operation across all elements of the vector, returning a single
16695 scalar result of the same element type.
16696
16697 .. _int_vector_reduce_add:
16698
16699 '``llvm.vector.reduce.add.*``' Intrinsic
16700 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16701
16702 Syntax:
16703 """""""
16704
16705 ::
16706
16707       declare i32 @llvm.vector.reduce.add.v4i32(<4 x i32> %a)
16708       declare i64 @llvm.vector.reduce.add.v2i64(<2 x i64> %a)
16709
16710 Overview:
16711 """""""""
16712
16713 The '``llvm.vector.reduce.add.*``' intrinsics do an integer ``ADD``
16714 reduction of a vector, returning the result as a scalar. The return type matches
16715 the element-type of the vector input.
16716
16717 Arguments:
16718 """"""""""
16719 The argument to this intrinsic must be a vector of integer values.
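
Examples:
"""""""""

A small illustration with a constant input, chosen only for this example:

.. code-block:: llvm

      %sum = call i32 @llvm.vector.reduce.add.v4i32(<4 x i32> <i32 1, i32 2, i32 3, i32 4>) ; yields i32:10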
16720
16721 .. _int_vector_reduce_fadd:
16722
16723 '``llvm.vector.reduce.fadd.*``' Intrinsic
16724 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16725
16726 Syntax:
16727 """""""
16728
16729 ::
16730
16731       declare float @llvm.vector.reduce.fadd.v4f32(float %start_value, <4 x float> %a)
16732       declare double @llvm.vector.reduce.fadd.v2f64(double %start_value, <2 x double> %a)
16733
16734 Overview:
16735 """""""""
16736
16737 The '``llvm.vector.reduce.fadd.*``' intrinsics do a floating-point
16738 ``ADD`` reduction of a vector, returning the result as a scalar. The return type
16739 matches the element-type of the vector input.
16740
16741 If the intrinsic call has the 'reassoc' flag set, then the reduction will not
16742 preserve the associativity of an equivalent scalarized counterpart. Otherwise
16743 the reduction will be *sequential*, thus implying that the operation respects
16744 the associativity of a scalarized reduction. That is, the reduction begins with
16745 the start value and performs an fadd operation with consecutively increasing
16746 vector element indices. See the following pseudocode:
16747
16748 ::
16749
    float sequential_fadd(start_value, input_vector)
      result = start_value
      for i = 0 to length(input_vector) - 1
        result = result + input_vector[i]
      return result
16755
16756
16757 Arguments:
16758 """"""""""
16759 The first argument to this intrinsic is a scalar start value for the reduction.
16760 The type of the start value matches the element-type of the vector input.
16761 The second argument must be a vector of floating-point values.
16762
To ignore the start value, negative zero (``-0.0``) can be used, as it is
the neutral value of floating-point addition.
16765
16766 Examples:
16767 """""""""
16768
16769 ::
16770
16771       %unord = call reassoc float @llvm.vector.reduce.fadd.v4f32(float -0.0, <4 x float> %input) ; relaxed reduction
16772       %ord = call float @llvm.vector.reduce.fadd.v4f32(float %start_value, <4 x float> %input) ; sequential reduction
16773
16774
16775 .. _int_vector_reduce_mul:
16776
16777 '``llvm.vector.reduce.mul.*``' Intrinsic
16778 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16779
16780 Syntax:
16781 """""""
16782
16783 ::
16784
16785       declare i32 @llvm.vector.reduce.mul.v4i32(<4 x i32> %a)
16786       declare i64 @llvm.vector.reduce.mul.v2i64(<2 x i64> %a)
16787
16788 Overview:
16789 """""""""
16790
16791 The '``llvm.vector.reduce.mul.*``' intrinsics do an integer ``MUL``
16792 reduction of a vector, returning the result as a scalar. The return type matches
16793 the element-type of the vector input.
16794
16795 Arguments:
16796 """"""""""
16797 The argument to this intrinsic must be a vector of integer values.
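
Examples:
"""""""""

A small illustration with a constant input, chosen only for this example:

.. code-block:: llvm

      %prod = call i32 @llvm.vector.reduce.mul.v4i32(<4 x i32> <i32 1, i32 2, i32 3, i32 4>) ; yields i32:24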
16798
16799 .. _int_vector_reduce_fmul:
16800
16801 '``llvm.vector.reduce.fmul.*``' Intrinsic
16802 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16803
16804 Syntax:
16805 """""""
16806
16807 ::
16808
16809       declare float @llvm.vector.reduce.fmul.v4f32(float %start_value, <4 x float> %a)
16810       declare double @llvm.vector.reduce.fmul.v2f64(double %start_value, <2 x double> %a)
16811
16812 Overview:
16813 """""""""
16814
16815 The '``llvm.vector.reduce.fmul.*``' intrinsics do a floating-point
16816 ``MUL`` reduction of a vector, returning the result as a scalar. The return type
16817 matches the element-type of the vector input.
16818
16819 If the intrinsic call has the 'reassoc' flag set, then the reduction will not
16820 preserve the associativity of an equivalent scalarized counterpart. Otherwise
16821 the reduction will be *sequential*, thus implying that the operation respects
16822 the associativity of a scalarized reduction. That is, the reduction begins with
16823 the start value and performs an fmul operation with consecutively increasing
16824 vector element indices. See the following pseudocode:
16825
16826 ::
16827
    float sequential_fmul(start_value, input_vector)
      result = start_value
      for i = 0 to length(input_vector) - 1
        result = result * input_vector[i]
      return result
16833
16834
16835 Arguments:
16836 """"""""""
16837 The first argument to this intrinsic is a scalar start value for the reduction.
16838 The type of the start value matches the element-type of the vector input.
16839 The second argument must be a vector of floating-point values.
16840
To ignore the start value, one (``1.0``) can be used, as it is the neutral
value of floating-point multiplication.
16843
16844 Examples:
16845 """""""""
16846
16847 ::
16848
16849       %unord = call reassoc float @llvm.vector.reduce.fmul.v4f32(float 1.0, <4 x float> %input) ; relaxed reduction
16850       %ord = call float @llvm.vector.reduce.fmul.v4f32(float %start_value, <4 x float> %input) ; sequential reduction
16851
16852 .. _int_vector_reduce_and:
16853
16854 '``llvm.vector.reduce.and.*``' Intrinsic
16855 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16856
16857 Syntax:
16858 """""""
16859
16860 ::
16861
16862       declare i32 @llvm.vector.reduce.and.v4i32(<4 x i32> %a)
16863
16864 Overview:
16865 """""""""
16866
16867 The '``llvm.vector.reduce.and.*``' intrinsics do a bitwise ``AND``
16868 reduction of a vector, returning the result as a scalar. The return type matches
16869 the element-type of the vector input.
16870
16871 Arguments:
16872 """"""""""
16873 The argument to this intrinsic must be a vector of integer values.
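
Examples:
"""""""""

One possible use, sketched here with hypothetical function and value names, is
to test whether a comparison holds in every lane by reducing a vector of
``i1`` values:

.. code-block:: llvm

      declare i1 @llvm.vector.reduce.and.v4i1(<4 x i1>)

      define i1 @all_equal(<4 x i32> %a, <4 x i32> %b) {
        ; One i1 per lane: true where the lanes of %a and %b match.
        %eq = icmp eq <4 x i32> %a, %b
        ; True only if every lane compared equal.
        %all = call i1 @llvm.vector.reduce.and.v4i1(<4 x i1> %eq)
        ret i1 %all
      }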
16874
16875 .. _int_vector_reduce_or:
16876
16877 '``llvm.vector.reduce.or.*``' Intrinsic
16878 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16879
16880 Syntax:
16881 """""""
16882
16883 ::
16884
16885       declare i32 @llvm.vector.reduce.or.v4i32(<4 x i32> %a)
16886
16887 Overview:
16888 """""""""
16889
16890 The '``llvm.vector.reduce.or.*``' intrinsics do a bitwise ``OR`` reduction
16891 of a vector, returning the result as a scalar. The return type matches the
16892 element-type of the vector input.
16893
16894 Arguments:
16895 """"""""""
16896 The argument to this intrinsic must be a vector of integer values.
16897
16898 .. _int_vector_reduce_xor:
16899
16900 '``llvm.vector.reduce.xor.*``' Intrinsic
16901 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16902
16903 Syntax:
16904 """""""
16905
16906 ::
16907
16908       declare i32 @llvm.vector.reduce.xor.v4i32(<4 x i32> %a)
16909
16910 Overview:
16911 """""""""
16912
16913 The '``llvm.vector.reduce.xor.*``' intrinsics do a bitwise ``XOR``
16914 reduction of a vector, returning the result as a scalar. The return type matches
16915 the element-type of the vector input.
16916
16917 Arguments:
16918 """"""""""
16919 The argument to this intrinsic must be a vector of integer values.
16920
16921 .. _int_vector_reduce_smax:
16922
16923 '``llvm.vector.reduce.smax.*``' Intrinsic
16924 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16925
16926 Syntax:
16927 """""""
16928
16929 ::
16930
16931       declare i32 @llvm.vector.reduce.smax.v4i32(<4 x i32> %a)
16932
16933 Overview:
16934 """""""""
16935
16936 The '``llvm.vector.reduce.smax.*``' intrinsics do a signed integer
16937 ``MAX`` reduction of a vector, returning the result as a scalar. The return type
16938 matches the element-type of the vector input.
16939
16940 Arguments:
16941 """"""""""
16942 The argument to this intrinsic must be a vector of integer values.
16943
16944 .. _int_vector_reduce_smin:
16945
16946 '``llvm.vector.reduce.smin.*``' Intrinsic
16947 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16948
16949 Syntax:
16950 """""""
16951
16952 ::
16953
16954       declare i32 @llvm.vector.reduce.smin.v4i32(<4 x i32> %a)
16955
16956 Overview:
16957 """""""""
16958
16959 The '``llvm.vector.reduce.smin.*``' intrinsics do a signed integer
16960 ``MIN`` reduction of a vector, returning the result as a scalar. The return type
16961 matches the element-type of the vector input.
16962
16963 Arguments:
16964 """"""""""
16965 The argument to this intrinsic must be a vector of integer values.
16966
16967 .. _int_vector_reduce_umax:
16968
16969 '``llvm.vector.reduce.umax.*``' Intrinsic
16970 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16971
16972 Syntax:
16973 """""""
16974
16975 ::
16976
16977       declare i32 @llvm.vector.reduce.umax.v4i32(<4 x i32> %a)
16978
16979 Overview:
16980 """""""""
16981
16982 The '``llvm.vector.reduce.umax.*``' intrinsics do an unsigned
16983 integer ``MAX`` reduction of a vector, returning the result as a scalar. The
16984 return type matches the element-type of the vector input.
16985
16986 Arguments:
16987 """"""""""
16988 The argument to this intrinsic must be a vector of integer values.
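
Examples:
"""""""""

The following illustration contrasts the unsigned reduction with its signed
counterpart on the same constant input, which was chosen only for this
example. The lane holding ``-1`` is the largest value when interpreted as
unsigned.

.. code-block:: llvm

      %smax = call i32 @llvm.vector.reduce.smax.v4i32(<4 x i32> <i32 -1, i32 0, i32 1, i32 2>) ; yields i32:2
      %umax = call i32 @llvm.vector.reduce.umax.v4i32(<4 x i32> <i32 -1, i32 0, i32 1, i32 2>) ; yields i32:-1 (0xFFFFFFFF)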
16989
16990 .. _int_vector_reduce_umin:
16991
16992 '``llvm.vector.reduce.umin.*``' Intrinsic
16993 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16994
16995 Syntax:
16996 """""""
16997
16998 ::
16999
17000       declare i32 @llvm.vector.reduce.umin.v4i32(<4 x i32> %a)
17001
17002 Overview:
17003 """""""""
17004
17005 The '``llvm.vector.reduce.umin.*``' intrinsics do an unsigned
17006 integer ``MIN`` reduction of a vector, returning the result as a scalar. The
17007 return type matches the element-type of the vector input.
17008
17009 Arguments:
17010 """"""""""
17011 The argument to this intrinsic must be a vector of integer values.
17012
17013 .. _int_vector_reduce_fmax:
17014
17015 '``llvm.vector.reduce.fmax.*``' Intrinsic
17016 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17017
17018 Syntax:
17019 """""""
17020
17021 ::
17022
17023       declare float @llvm.vector.reduce.fmax.v4f32(<4 x float> %a)
17024       declare double @llvm.vector.reduce.fmax.v2f64(<2 x double> %a)
17025
17026 Overview:
17027 """""""""
17028
17029 The '``llvm.vector.reduce.fmax.*``' intrinsics do a floating-point
17030 ``MAX`` reduction of a vector, returning the result as a scalar. The return type
17031 matches the element-type of the vector input.
17032
This intrinsic has the same comparison semantics as the '``llvm.maxnum.*``'
intrinsic. That is, the result will always be a number unless all elements of
the vector are NaN. For a vector with maximum element magnitude 0.0 and
containing both +0.0 and -0.0 elements, the sign of the result is unspecified.
17037
17038 If the intrinsic call has the ``nnan`` fast-math flag, then the operation can
17039 assume that NaNs are not present in the input vector.
17040
17041 Arguments:
17042 """"""""""
17043 The argument to this intrinsic must be a vector of floating-point values.
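
Examples:
"""""""""

A small illustration of the NaN handling and of the ``nnan`` flag. The
constant vector and the value name ``%v`` are chosen for this example only.

.. code-block:: llvm

      ; The quiet NaN lane is ignored because other lanes hold numbers.
      %max = call float @llvm.vector.reduce.fmax.v4f32(<4 x float> <float 1.0, float 0x7FF8000000000000, float 3.0, float 2.0>) ; yields float:3.0
      ; With nnan, the operation may assume that %v contains no NaNs.
      %fast = call nnan float @llvm.vector.reduce.fmax.v4f32(<4 x float> %v)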
17044
17045 .. _int_vector_reduce_fmin:
17046
17047 '``llvm.vector.reduce.fmin.*``' Intrinsic
17048 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17049
17050 Syntax:
17051 """""""
17052 This is an overloaded intrinsic.
17053
17054 ::
17055
17056       declare float @llvm.vector.reduce.fmin.v4f32(<4 x float> %a)
17057       declare double @llvm.vector.reduce.fmin.v2f64(<2 x double> %a)
17058
17059 Overview:
17060 """""""""
17061
17062 The '``llvm.vector.reduce.fmin.*``' intrinsics do a floating-point
17063 ``MIN`` reduction of a vector, returning the result as a scalar. The return type
17064 matches the element-type of the vector input.
17065
This intrinsic has the same comparison semantics as the '``llvm.minnum.*``'
intrinsic. That is, the result will always be a number unless all elements of
the vector are NaN. For a vector with minimum element magnitude 0.0 and
containing both +0.0 and -0.0 elements, the sign of the result is unspecified.
17070
17071 If the intrinsic call has the ``nnan`` fast-math flag, then the operation can
17072 assume that NaNs are not present in the input vector.
17073
17074 Arguments:
17075 """"""""""
17076 The argument to this intrinsic must be a vector of floating-point values.
17077
17078 '``llvm.experimental.vector.insert``' Intrinsic
17079 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17080
17081 Syntax:
17082 """""""
17083 This is an overloaded intrinsic. You can use ``llvm.experimental.vector.insert``
17084 to insert a fixed-width vector into a scalable vector, but not the other way
17085 around.
17086
17087 ::
17088
17089       declare <vscale x 4 x float> @llvm.experimental.vector.insert.v4f32(<vscale x 4 x float> %vec, <4 x float> %subvec, i64 %idx)
17090       declare <vscale x 2 x double> @llvm.experimental.vector.insert.v2f64(<vscale x 2 x double> %vec, <2 x double> %subvec, i64 %idx)
17091
17092 Overview:
17093 """""""""
17094
17095 The '``llvm.experimental.vector.insert.*``' intrinsics insert a vector into another vector
17096 starting from a given index. The return type matches the type of the vector we
17097 insert into. Conceptually, this can be used to build a scalable vector out of
17098 non-scalable vectors.
17099
17100 Arguments:
17101 """"""""""
17102
17103 The ``vec`` is the vector which ``subvec`` will be inserted into.
17104 The ``subvec`` is the vector that will be inserted.
17105
17106 ``idx`` represents the starting element number at which ``subvec`` will be
17107 inserted. ``idx`` must be a constant multiple of ``subvec``'s known minimum
17108 vector length. If ``subvec`` is a scalable vector, ``idx`` is first scaled by
17109 the runtime scaling factor of ``subvec``. The elements of ``vec`` starting at
17110 ``idx`` are overwritten with ``subvec``. Elements ``idx`` through (``idx`` +
17111 num_elements(``subvec``) - 1) must be valid ``vec`` indices. If this condition
17112 cannot be determined statically but is false at runtime, then the result vector
17113 is undefined.
17114
17115
17116 '``llvm.experimental.vector.extract``' Intrinsic
17117 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17118
17119 Syntax:
17120 """""""
17121 This is an overloaded intrinsic. You can use
17122 ``llvm.experimental.vector.extract`` to extract a fixed-width vector from a
17123 scalable vector, but not the other way around.
17124
17125 ::
17126
17127       declare <4 x float> @llvm.experimental.vector.extract.v4f32(<vscale x 4 x float> %vec, i64 %idx)
17128       declare <2 x double> @llvm.experimental.vector.extract.v2f64(<vscale x 2 x double> %vec, i64 %idx)
17129
17130 Overview:
17131 """""""""
17132
17133 The '``llvm.experimental.vector.extract.*``' intrinsics extract a vector from
17134 within another vector starting from a given index. The return type must be
17135 explicitly specified. Conceptually, this can be used to decompose a scalable
17136 vector into non-scalable parts.
17137
17138 Arguments:
17139 """"""""""
17140
17141 The ``vec`` is the vector from which we will extract a subvector.
17142
17143 The ``idx`` specifies the starting element number within ``vec`` from which a
17144 subvector is extracted. ``idx`` must be a constant multiple of the known-minimum
17145 vector length of the result type. If the result type is a scalable vector,
17146 ``idx`` is first scaled by the result type's runtime scaling factor. Elements
17147 ``idx`` through (``idx`` + num_elements(result_type) - 1) must be valid vector
17148 indices. If this condition cannot be determined statically but is false at
17149 runtime, then the result vector is undefined. The ``idx`` parameter must be a
17150 vector index constant type (for most targets this will be an integer pointer
17151 type).
17152
17153 '``llvm.experimental.vector.reverse``' Intrinsic
17154 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17155
17156 Syntax:
17157 """""""
17158 This is an overloaded intrinsic.
17159
17160 ::
17161
17162       declare <2 x i8> @llvm.experimental.vector.reverse.v2i8(<2 x i8> %a)
17163       declare <vscale x 4 x i32> @llvm.experimental.vector.reverse.nxv4i32(<vscale x 4 x i32> %a)
17164
17165 Overview:
17166 """""""""
17167
The '``llvm.experimental.vector.reverse.*``' intrinsics reverse a vector.
The intrinsic takes a single vector and returns a vector of matching type but
with the original lane order reversed. These intrinsics work for both fixed
and scalable vectors. While this intrinsic is marked as experimental, the
recommended way to express reverse operations for fixed-width vectors is still
to use a shufflevector, as that may allow for more optimization opportunities.
17174
17175 Arguments:
17176 """"""""""
17177
17178 The argument to this intrinsic must be a vector.
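
Examples:
"""""""""

For a fixed-width vector, the intrinsic and the shufflevector form can be
compared side by side. The value names are hypothetical and the ``v4i32``
overload is chosen only for illustration.

.. code-block:: llvm

      %rev = call <4 x i32> @llvm.experimental.vector.reverse.v4i32(<4 x i32> %a)
      ; The same lane order expressed as a shufflevector:
      %rev.shuf = shufflevector <4 x i32> %a, <4 x i32> poison, <4 x i32> <i32 3, i32 2, i32 1, i32 0>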
17179
17180 '``llvm.experimental.vector.splice``' Intrinsic
17181 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17182
17183 Syntax:
17184 """""""
17185 This is an overloaded intrinsic.
17186
17187 ::
17188
17189       declare <2 x double> @llvm.experimental.vector.splice.v2f64(<2 x double> %vec1, <2 x double> %vec2, i32 %imm)
17190       declare <vscale x 4 x i32> @llvm.experimental.vector.splice.nxv4i32(<vscale x 4 x i32> %vec1, <vscale x 4 x i32> %vec2, i32 %imm)
17191
17192 Overview:
17193 """""""""
17194
17195 The '``llvm.experimental.vector.splice.*``' intrinsics construct a vector by
17196 concatenating elements from the first input vector with elements of the second
17197 input vector, returning a vector of the same type as the input vectors. The
17198 signed immediate, modulo the number of elements in the vector, is the index
17199 into the first vector from which to extract the result value. This means
17200 conceptually that for a positive immediate, a vector is extracted from
17201 ``concat(%vec1, %vec2)`` starting at index ``imm``, whereas for a negative
17202 immediate, it extracts ``-imm`` trailing elements from the first vector, and
17203 the remaining elements from ``%vec2``.
17204
17205 These intrinsics work for both fixed and scalable vectors. While this intrinsic
17206 is marked as experimental, the recommended way to express this operation for
17207 fixed-width vectors is still to use a shufflevector, as that may allow for more
17208 optimization opportunities.
17209
17210 For example:
17211
17212 .. code-block:: text
17213
17214  llvm.experimental.vector.splice(<A,B,C,D>, <E,F,G,H>, 1)  ==> <B, C, D, E> ; index
17215  llvm.experimental.vector.splice(<A,B,C,D>, <E,F,G,H>, -3) ==> <B, C, D, E> ; trailing elements
17216
17217
17218 Arguments:
17219 """"""""""
17220
17221 The first two operands are vectors with the same type. The start index is imm
17222 modulo the runtime number of elements in the source vector. For a fixed-width
17223 vector <N x eltty>, imm is a signed integer constant in the range
17224 -N <= imm < N. For a scalable vector <vscale x N x eltty>, imm is a signed
17225 integer constant in the range -X <= imm < X where X=vscale_range_min * N.
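
Examples:
"""""""""

As an illustration for a fixed-width vector with ``N = 4``, both a positive
index of 1 and a negative index of -3 select lanes 1 through 4 of
``concat(%vec1, %vec2)``. The value names and the ``v4i32`` overload are
chosen for this sketch only.

.. code-block:: llvm

      %s.idx = call <4 x i32> @llvm.experimental.vector.splice.v4i32(<4 x i32> %vec1, <4 x i32> %vec2, i32 1)
      %s.trail = call <4 x i32> @llvm.experimental.vector.splice.v4i32(<4 x i32> %vec1, <4 x i32> %vec2, i32 -3)
      ; Both calls produce the same lanes as this shufflevector:
      %s.shuf = shufflevector <4 x i32> %vec1, <4 x i32> %vec2, <4 x i32> <i32 1, i32 2, i32 3, i32 4>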
17226
17227 '``llvm.experimental.stepvector``' Intrinsic
17228 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17229
17230 This is an overloaded intrinsic. You can use ``llvm.experimental.stepvector``
17231 to generate a vector whose lane values comprise the linear sequence
17232 <0, 1, 2, ...>. It is primarily intended for scalable vectors.
17233
17234 ::
17235
17236       declare <vscale x 4 x i32> @llvm.experimental.stepvector.nxv4i32()
17237       declare <vscale x 8 x i16> @llvm.experimental.stepvector.nxv8i16()
17238
17239 The '``llvm.experimental.stepvector``' intrinsics are used to create vectors
17240 of integers whose elements contain a linear sequence of values starting from 0
17241 with a step of 1.  This experimental intrinsic can only be used for vectors
17242 with integer elements that are at least 8 bits in size. If the sequence value
17243 exceeds the allowed limit for the element type then the result for that lane is
17244 undefined.
17245
17246 These intrinsics work for both fixed and scalable vectors. While this intrinsic
17247 is marked as experimental, the recommended way to express this operation for
17248 fixed-width vectors is still to generate a constant vector instead.
17249
17250
17251 Arguments:
17252 """"""""""
17253
17254 None.
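
Examples:
"""""""""

One possible use, sketched with hypothetical value names, is to build a
vector of strided offsets ``<0, s, 2s, ...>`` for a scalable vector by
multiplying the step vector with a splat of the scalar ``%stride``. The
multiplication wraps like any other ``mul``.

.. code-block:: llvm

      %step = call <vscale x 4 x i32> @llvm.experimental.stepvector.nxv4i32()
      ; Splat the scalar %stride across all lanes.
      %ins = insertelement <vscale x 4 x i32> poison, i32 %stride, i32 0
      %splat = shufflevector <vscale x 4 x i32> %ins, <vscale x 4 x i32> poison, <vscale x 4 x i32> zeroinitializer
      ; Lane i now holds i * %stride.
      %offsets = mul <vscale x 4 x i32> %step, %splat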
17255
17256
17257 Matrix Intrinsics
17258 -----------------
17259
17260 Operations on matrixes requiring shape information (like number of rows/columns
17261 or the memory layout) can be expressed using the matrix intrinsics. These
17262 intrinsics require matrix dimensions to be passed as immediate arguments, and
17263 matrixes are passed and returned as vectors. This means that for a ``R`` x
17264 ``C`` matrix, element ``i`` of column ``j`` is at index ``j * R + i`` in the
17265 corresponding vector, with indices starting at 0. Currently column-major layout
17266 is assumed.  The intrinsics support both integer and floating point matrixes.
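
As a worked example of this layout, a 2 x 3 matrix (``R`` = 2, ``C`` = 3)
occupies a ``<6 x float>`` vector. The element in row 1 of column 2 is at
index ``2 * 2 + 1 = 5``. The value names below are hypothetical.

.. code-block:: llvm

      ; Columns are stored one after another: [m00, m10 | m01, m11 | m02, m12].
      ; Row i = 1, column j = 2: index j * R + i = 2 * 2 + 1 = 5.
      %m12 = extractelement <6 x float> %m, i32 5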
17267
17268
17269 '``llvm.matrix.transpose.*``' Intrinsic
17270 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17271
17272 Syntax:
17273 """""""
17274 This is an overloaded intrinsic.
17275
17276 ::
17277
17278       declare vectorty @llvm.matrix.transpose.*(vectorty %In, i32 <Rows>, i32 <Cols>)
17279
17280 Overview:
17281 """""""""
17282
17283 The '``llvm.matrix.transpose.*``' intrinsics treat ``%In`` as a ``<Rows> x
17284 <Cols>`` matrix and return the transposed matrix in the result vector.
17285
17286 Arguments:
17287 """"""""""
17288
17289 The first argument ``%In`` is a vector that corresponds to a ``<Rows> x
17290 <Cols>`` matrix. Thus, arguments ``<Rows>`` and ``<Cols>`` correspond to the
17291 number of rows and columns, respectively, and must be positive, constant
17292 integers. The returned vector must have ``<Rows> * <Cols>`` elements, and have
17293 the same float or integer element type as ``%In``.
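
Examples:
"""""""""

A minimal sketch that transposes a 2 x 3 matrix into a 3 x 2 matrix. The
value names are hypothetical, and the ``v6f32`` name suffix is an assumption
based on the usual overload naming. For this shape the result holds the same
lanes as the shufflevector shown after the call.

.. code-block:: llvm

      ; %In holds a 2 x 3 matrix; %T holds its 3 x 2 transpose.
      %T = call <6 x float> @llvm.matrix.transpose.v6f32(<6 x float> %In, i32 2, i32 3)
      ; Element (i, j) of %In sits at index j * 2 + i, so the transpose reads:
      %T.shuf = shufflevector <6 x float> %In, <6 x float> poison, <6 x i32> <i32 0, i32 2, i32 4, i32 1, i32 3, i32 5>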
17294
17295 '``llvm.matrix.multiply.*``' Intrinsic
17296 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17297
17298 Syntax:
17299 """""""
17300 This is an overloaded intrinsic.
17301
17302 ::
17303
17304       declare vectorty @llvm.matrix.multiply.*(vectorty %A, vectorty %B, i32 <OuterRows>, i32 <Inner>, i32 <OuterColumns>)
17305
17306 Overview:
17307 """""""""
17308
The '``llvm.matrix.multiply.*``' intrinsics treat ``%A`` as a ``<OuterRows> x
<Inner>`` matrix, ``%B`` as a ``<Inner> x <OuterColumns>`` matrix, and
multiply them. The result matrix is returned in the result vector.
17312
17313 Arguments:
17314 """"""""""
17315
17316 The first vector argument ``%A`` corresponds to a matrix with ``<OuterRows> *
17317 <Inner>`` elements, and the second argument ``%B`` to a matrix with
17318 ``<Inner> * <OuterColumns>`` elements. Arguments ``<OuterRows>``,
17319 ``<Inner>`` and ``<OuterColumns>`` must be positive, constant integers. The
17320 returned vector must have ``<OuterRows> * <OuterColumns>`` elements.
17321 Vectors ``%A``, ``%B``, and the returned vector all have the same float or
17322 integer element type.
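
Examples:
"""""""""

A minimal sketch that multiplies a 2 x 3 matrix by a 3 x 2 matrix, giving a
2 x 2 result. The value names are hypothetical, and the name suffix (result
type followed by the two operand types) is an assumption about the overload
naming that may differ between LLVM versions.

.. code-block:: llvm

      ; %A is 2 x 3 (6 elements), %B is 3 x 2 (6 elements), %C is 2 x 2 (4 elements).
      %C = call <4 x float> @llvm.matrix.multiply.v4f32.v6f32.v6f32(<6 x float> %A, <6 x float> %B, i32 2, i32 3, i32 2)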
17323
17324
17325 '``llvm.matrix.column.major.load.*``' Intrinsic
17326 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17327
17328 Syntax:
17329 """""""
17330 This is an overloaded intrinsic.
17331
17332 ::
17333
17334       declare vectorty @llvm.matrix.column.major.load.*(
17335           ptrty %Ptr, i64 %Stride, i1 <IsVolatile>, i32 <Rows>, i32 <Cols>)
17336
17337 Overview:
17338 """""""""
17339
17340 The '``llvm.matrix.column.major.load.*``' intrinsics load a ``<Rows> x <Cols>``
17341 matrix using a stride of ``%Stride`` to compute the start address of the
17342 different columns.  The offset is computed using ``%Stride``'s bitwidth. This
17343 allows for convenient loading of sub matrixes. If ``<IsVolatile>`` is true, the
17344 intrinsic is considered a :ref:`volatile memory access <volatile>`. The result
17345 matrix is returned in the result vector. If the ``%Ptr`` argument is known to
17346 be aligned to some boundary, this can be specified as an attribute on the
17347 argument.
17348
17349 Arguments:
17350 """"""""""
17351
The first argument ``%Ptr`` is a pointer type to the returned vector type, and
corresponds to the start address to load from. The second argument ``%Stride``
is a positive, constant integer with ``%Stride >= <Rows>``. ``%Stride`` is used
to compute the column memory addresses. I.e., for a column ``C``, its start
memory address is calculated with ``%Ptr + C * %Stride``. The third argument
``<IsVolatile>`` is a boolean value.  The fourth and fifth arguments,
``<Rows>`` and ``<Cols>``, correspond to the number of rows and columns,
respectively, and must be positive, constant integers. The returned vector must
have ``<Rows> * <Cols>`` elements.

The :ref:`align <attr_align>` parameter attribute can be provided for the
``%Ptr`` argument.
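
Examples:
"""""""""

The sub-matrix use case mentioned in the overview can be sketched as follows:
a 2 x 2 block is loaded from the top-left corner of a column-major 4 x 4
matrix of doubles, so the stride is 4 elements. The value names are
hypothetical. The ``v4f64.i64`` name suffix and the ``double*`` pointer
operand are assumptions about the overload naming and pointer syntax, which
may differ between LLVM versions.

.. code-block:: llvm

      ; Column C of the block starts at %Ptr + C * 4 elements.
      %block = call <4 x double> @llvm.matrix.column.major.load.v4f64.i64(double* align 8 %Ptr, i64 4, i1 false, i32 2, i32 2)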
17364
17365
17366 '``llvm.matrix.column.major.store.*``' Intrinsic
17367 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17368
17369 Syntax:
17370 """""""
17371
17372 ::
17373
17374       declare void @llvm.matrix.column.major.store.*(
17375           vectorty %In, ptrty %Ptr, i64 %Stride, i1 <IsVolatile>, i32 <Rows>, i32 <Cols>)
17376
17377 Overview:
17378 """""""""
17379
17380 The '``llvm.matrix.column.major.store.*``' intrinsics store the ``<Rows> x
17381 <Cols>`` matrix in ``%In`` to memory using a stride of ``%Stride`` between
17382 columns. The offset is computed using ``%Stride``'s bitwidth. If
17383 ``<IsVolatile>`` is true, the intrinsic is considered a
17384 :ref:`volatile memory access <volatile>`.
17385
17386 If the ``%Ptr`` argument is known to be aligned to some boundary, this can be
17387 specified as an attribute on the argument.
17388
17389 Arguments:
17390 """"""""""
17391
The first argument ``%In`` is a vector that corresponds to a ``<Rows> x
<Cols>`` matrix to be stored to memory. The second argument ``%Ptr`` is a
pointer to the vector type of ``%In``, and is the start address of the matrix
in memory. The third argument ``%Stride`` is a positive, constant integer with
``%Stride >= <Rows>``.  ``%Stride`` is used to compute the column memory
addresses. I.e., for a column ``C``, its start memory address is calculated
with ``%Ptr + C * %Stride``.  The fourth argument ``<IsVolatile>`` is a boolean
value. The arguments ``<Rows>`` and ``<Cols>`` correspond to the number of rows
and columns, respectively, and must be positive, constant integers.

The :ref:`align <attr_align>` parameter attribute can be provided
for the ``%Ptr`` argument.
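
Examples:
"""""""""

A minimal sketch that writes a 2 x 2 block into the top-left corner of a
column-major 4 x 4 matrix of doubles, using a stride of 4 elements. The value
names are hypothetical. The ``v4f64.i64`` name suffix and the ``double*``
pointer operand are assumptions about the overload naming and pointer syntax,
which may differ between LLVM versions.

.. code-block:: llvm

      ; Column C of %block is written starting at %Ptr + C * 4 elements.
      call void @llvm.matrix.column.major.store.v4f64.i64(<4 x double> %block, double* align 8 %Ptr, i64 4, i1 false, i32 2, i32 2)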
17404
17405
17406 Half Precision Floating-Point Intrinsics
17407 ----------------------------------------
17408
17409 For most target platforms, half precision floating-point is a
17410 storage-only format. This means that it is a dense encoding (in memory)
17411 but does not support computation in the format.
17412
This means that code must first load the half-precision floating-point
value as an i16, then convert it to float with
:ref:`llvm.convert.from.fp16 <int_convert_from_fp16>`. Computation can
then be performed on the float value (including extending to double
etc). To store the value back to memory, it is first converted to float
if needed, then converted to i16 with
:ref:`llvm.convert.to.fp16 <int_convert_to_fp16>`, and finally stored as an
i16 value.
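
The load, convert, compute, convert, store sequence described above can be
sketched as follows. The global ``@x`` and the function name
``@double_in_place`` are hypothetical; the example doubles a half-precision
value stored in memory.

.. code-block:: llvm

      @x = global i16 0, align 2

      declare float @llvm.convert.from.fp16.f32(i16)
      declare i16 @llvm.convert.to.fp16.f32(float)

      define void @double_in_place() {
        ; Load the raw half-precision bits and widen them to float.
        %bits = load i16, i16* @x, align 2
        %val = call float @llvm.convert.from.fp16.f32(i16 %bits)
        ; Compute in float.
        %twice = fadd float %val, %val
        ; Narrow back to half-precision bits and store them.
        %res = call i16 @llvm.convert.to.fp16.f32(float %twice)
        store i16 %res, i16* @x, align 2
        ret void
      }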
17421
17422 .. _int_convert_to_fp16:
17423
17424 '``llvm.convert.to.fp16``' Intrinsic
17425 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17426
17427 Syntax:
17428 """""""
17429
17430 ::
17431
17432       declare i16 @llvm.convert.to.fp16.f32(float %a)
17433       declare i16 @llvm.convert.to.fp16.f64(double %a)
17434
17435 Overview:
17436 """""""""
17437
17438 The '``llvm.convert.to.fp16``' intrinsic function performs a conversion from a
17439 conventional floating-point type to half precision floating-point format.
17440
17441 Arguments:
17442 """"""""""
17443
The intrinsic function takes a single argument: the value to be
converted.
17446
17447 Semantics:
17448 """"""""""
17449
17450 The '``llvm.convert.to.fp16``' intrinsic function performs a conversion from a
17451 conventional floating-point format to half precision floating-point format. The
17452 return value is an ``i16`` which contains the converted number.
17453
17454 Examples:
17455 """""""""
17456
17457 .. code-block:: llvm
17458
17459       %res = call i16 @llvm.convert.to.fp16.f32(float %a)
17460       store i16 %res, i16* @x, align 2
17461
17462 .. _int_convert_from_fp16:
17463
17464 '``llvm.convert.from.fp16``' Intrinsic
17465 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17466
17467 Syntax:
17468 """""""
17469
17470 ::
17471
17472       declare float @llvm.convert.from.fp16.f32(i16 %a)
17473       declare double @llvm.convert.from.fp16.f64(i16 %a)
17474
17475 Overview:
17476 """""""""
17477
17478 The '``llvm.convert.from.fp16``' intrinsic function performs a
17479 conversion from half precision floating-point format to single precision
17480 floating-point format.
17481
17482 Arguments:
17483 """"""""""
17484
The intrinsic function takes a single argument: the value to be
converted.
17487
17488 Semantics:
17489 """"""""""
17490
The '``llvm.convert.from.fp16``' intrinsic function performs a
conversion from half precision floating-point format to single
precision floating-point format. The input half-float value is
represented by an ``i16`` value.
17495
17496 Examples:
17497 """""""""
17498
17499 .. code-block:: llvm
17500
      %a = load i16, i16* @x, align 2
      %res = call float @llvm.convert.from.fp16.f32(i16 %a)
17503
17504 Saturating floating-point to integer conversions
17505 ------------------------------------------------
17506
17507 The ``fptoui`` and ``fptosi`` instructions return a
17508 :ref:`poison value <poisonvalues>` if the rounded-towards-zero value is not
17509 representable by the result type. These intrinsics provide an alternative
17510 conversion, which will saturate towards the smallest and largest representable
17511 integer values instead.
17512
17513 '``llvm.fptoui.sat.*``' Intrinsic
17514 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17515
17516 Syntax:
17517 """""""
17518
17519 This is an overloaded intrinsic. You can use ``llvm.fptoui.sat`` on any
17520 floating-point argument type and any integer result type, or vectors thereof.
17521 Not all targets may support all types, however.
17522
17523 ::
17524
17525       declare i32 @llvm.fptoui.sat.i32.f32(float %f)
17526       declare i19 @llvm.fptoui.sat.i19.f64(double %f)
17527       declare <4 x i100> @llvm.fptoui.sat.v4i100.v4f128(<4 x fp128> %f)
17528
17529 Overview:
17530 """""""""
17531
17532 This intrinsic converts the argument into an unsigned integer using saturating
17533 semantics.
17534
17535 Arguments:
17536 """"""""""
17537
17538 The argument may be any floating-point or vector of floating-point type. The
17539 return value may be any integer or vector of integer type. The number of vector
17540 elements in argument and return must be the same.
17541
17542 Semantics:
17543 """"""""""
17544
17545 The conversion to integer is performed subject to the following rules:
17546
17547 - If the argument is any NaN, zero is returned.
17548 - If the argument is smaller than zero (this includes negative infinity),
17549   zero is returned.
17550 - If the argument is larger than the largest representable unsigned integer of
17551   the result type (this includes positive infinity), the largest representable
17552   unsigned integer is returned.
17553 - Otherwise, the result of rounding the argument towards zero is returned.
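
The rules above can be sketched with ordinary instructions for the ``float`` to
``i8`` case. This is an illustration under stated assumptions, not a required
lowering: the function name is hypothetical, and the sketch relies on
``select`` ignoring a poison value in the operand it does not choose, because
the plain ``fptoui`` result is poison for NaN and out-of-range inputs.

.. code-block:: llvm

      ; Hypothetical expansion of @llvm.fptoui.sat.i8.f32.
      define i8 @fptoui_sat_i8_sketch(float %f) {
        %is.nan = fcmp uno float %f, %f            ; true only for NaN
        %is.neg = fcmp olt float %f, 0.0           ; includes negative infinity
        %is.big = fcmp ogt float %f, 2.550000e+02  ; includes positive infinity
        %trunc  = fptoui float %f to i8            ; poison when not representable
        %lo     = select i1 %is.neg, i8 0, i8 %trunc
        %hi     = select i1 %is.big, i8 -1, i8 %lo ; i8 -1 is 255 when read as unsigned
        %res    = select i1 %is.nan, i8 0, i8 %hi
        ret i8 %res
      }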
17554
17555 Example:
17556 """"""""
17557
17558 .. code-block:: text
17559
17560       %a = call i8 @llvm.fptoui.sat.i8.f32(float 123.9)              ; yields i8: 123
17561       %b = call i8 @llvm.fptoui.sat.i8.f32(float -5.7)               ; yields i8:   0
17562       %c = call i8 @llvm.fptoui.sat.i8.f32(float 377.0)              ; yields i8: 255
17563       %d = call i8 @llvm.fptoui.sat.i8.f32(float 0xFFF8000000000000) ; yields i8:   0
17564
17565 '``llvm.fptosi.sat.*``' Intrinsic
17566 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17567
17568 Syntax:
17569 """""""
17570
17571 This is an overloaded intrinsic. You can use ``llvm.fptosi.sat`` on any
17572 floating-point argument type and any integer result type, or vectors thereof.
17573 Not all targets may support all types, however.
17574
17575 ::
17576
17577       declare i32 @llvm.fptosi.sat.i32.f32(float %f)
17578       declare i19 @llvm.fptosi.sat.i19.f64(double %f)
17579       declare <4 x i100> @llvm.fptosi.sat.v4i100.v4f128(<4 x fp128> %f)
17580
17581 Overview:
17582 """""""""
17583
17584 This intrinsic converts the argument into a signed integer using saturating
17585 semantics.
17586
17587 Arguments:
17588 """"""""""
17589
17590 The argument may be any floating-point or vector of floating-point type. The
17591 return value may be any integer or vector of integer type. The number of vector
17592 elements in argument and return must be the same.
17593
17594 Semantics:
17595 """"""""""
17596
17597 The conversion to integer is performed subject to the following rules:
17598
17599 - If the argument is any NaN, zero is returned.
17600 - If the argument is smaller than the smallest representable signed integer of
17601   the result type (this includes negative infinity), the smallest
17602   representable signed integer is returned.
17603 - If the argument is larger than the largest representable signed integer of
17604   the result type (this includes positive infinity), the largest representable
17605   signed integer is returned.
17606 - Otherwise, the result of rounding the argument towards zero is returned.
17607
17608 Example:
17609 """"""""
17610
17611 .. code-block:: text
17612
17613       %a = call i8 @llvm.fptosi.sat.i8.f32(float 23.9)               ; yields i8:   23
17614       %b = call i8 @llvm.fptosi.sat.i8.f32(float -130.8)             ; yields i8: -128
17615       %c = call i8 @llvm.fptosi.sat.i8.f32(float 999.0)              ; yields i8:  127
17616       %d = call i8 @llvm.fptosi.sat.i8.f32(float 0xFFF8000000000000) ; yields i8:    0
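
The vector form follows the same naming scheme as the declarations above, with
the result type suffix first and the argument type suffix second. A minimal
sketch, using a hypothetical wrapper name:

.. code-block:: llvm

      declare <4 x i32> @llvm.fptosi.sat.v4i32.v4f32(<4 x float>)

      ; Each lane is converted independently with the saturating rules;
      ; argument and result have the same number of elements.
      define <4 x i32> @to_i32_saturating(<4 x float> %v) {
        %r = call <4 x i32> @llvm.fptosi.sat.v4i32.v4f32(<4 x float> %v)
        ret <4 x i32> %r
      }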
17617
17618 .. _dbg_intrinsics:
17619
17620 Debugger Intrinsics
17621 -------------------
17622
17623 The LLVM debugger intrinsics (all of which start with the ``llvm.dbg.``
17624 prefix) are described in the `LLVM Source Level
17625 Debugging <SourceLevelDebugging.html#format-common-intrinsics>`_
17626 document.
17627
17628 Exception Handling Intrinsics
17629 -----------------------------
17630
17631 The LLVM exception handling intrinsics (all of which start with the
17632 ``llvm.eh.`` prefix) are described in the `LLVM Exception
17633 Handling <ExceptionHandling.html#format-common-intrinsics>`_ document.
17634
17635 Pointer Authentication Intrinsics
17636 ---------------------------------
17637
17638 The LLVM pointer authentication intrinsics (all of which start with the
17639 ``llvm.ptrauth.`` prefix) are described in the `Pointer Authentication
17640 <PointerAuth.html#intrinsics>`_ document.
17641
17642 .. _int_trampoline:
17643
17644 Trampoline Intrinsics
17645 ---------------------
17646
17647 These intrinsics make it possible to excise one parameter, marked with
17648 the :ref:`nest <nest>` attribute, from a function. The result is a
17649 callable function pointer lacking the nest parameter - the caller does
17650 not need to provide a value for it. Instead, the value to use is stored
17651 in advance in a "trampoline", a block of memory usually allocated on the
17652 stack, which also contains code to splice the nest value into the
17653 argument list. This is used to implement the GCC nested function address
17654 extension.
17655
17656 For example, if the function is ``i32 f(i8* nest %c, i32 %x, i32 %y)``
17657 then the resulting function pointer has signature ``i32 (i32, i32)*``.
17658 It can be created as follows:
17659
17660 .. code-block:: llvm
17661
17662       %tramp = alloca [10 x i8], align 4 ; size and alignment only correct for X86
17663       %tramp1 = getelementptr [10 x i8], [10 x i8]* %tramp, i32 0, i32 0
17664       call void @llvm.init.trampoline(i8* %tramp1, i8* bitcast (i32 (i8*, i32, i32)* @f to i8*), i8* %nval)
17665       %p = call i8* @llvm.adjust.trampoline(i8* %tramp1)
17666       %fp = bitcast i8* %p to i32 (i32, i32)*
17667
17668 The call ``%val = call i32 %fp(i32 %x, i32 %y)`` is then equivalent to
17669 ``%val = call i32 @f(i8* %nval, i32 %x, i32 %y)``.
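
Put together as a module, the sequence might look like the sketch below. The
body of ``@f`` and the wrapper ``@call_through_trampoline`` are hypothetical,
and the trampoline size and alignment are simply carried over from the X86
example, so they are assumptions for any other target.

.. code-block:: llvm

      declare void @llvm.init.trampoline(i8*, i8*, i8*)
      declare i8* @llvm.adjust.trampoline(i8*)

      ; Hypothetical nested function: %c receives the value stored in the trampoline.
      define i32 @f(i8* nest %c, i32 %x, i32 %y) {
        %sum = add i32 %x, %y
        ret i32 %sum
      }

      define i32 @call_through_trampoline(i8* %nval, i32 %x, i32 %y) {
        %tramp = alloca [10 x i8], align 4 ; size and alignment assumed, X86 only
        %tramp1 = getelementptr [10 x i8], [10 x i8]* %tramp, i32 0, i32 0
        call void @llvm.init.trampoline(i8* %tramp1, i8* bitcast (i32 (i8*, i32, i32)* @f to i8*), i8* %nval)
        %p = call i8* @llvm.adjust.trampoline(i8* %tramp1)
        %fp = bitcast i8* %p to i32 (i32, i32)*
        ; The nest argument is supplied by the trampoline, not by the caller.
        %val = call i32 %fp(i32 %x, i32 %y)
        ret i32 %val
      }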
17670
17671 .. _int_it:
17672
17673 '``llvm.init.trampoline``' Intrinsic
17674 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17675
17676 Syntax:
17677 """""""
17678
17679 ::
17680
17681       declare void @llvm.init.trampoline(i8* <tramp>, i8* <func>, i8* <nval>)
17682
17683 Overview:
17684 """""""""
17685
17686 This fills the memory pointed to by ``tramp`` with executable code,
17687 turning it into a trampoline.
17688
17689 Arguments:
17690 """"""""""
17691
17692 The ``llvm.init.trampoline`` intrinsic takes three arguments, all
17693 pointers. The ``tramp`` argument must point to a sufficiently large and
17694 sufficiently aligned block of memory; this memory is written to by the
17695 intrinsic. Note that the size and the alignment are target-specific -
17696 LLVM currently provides no portable way of determining them, so a
17697 front-end that generates this intrinsic needs to have some
17698 target-specific knowledge. The ``func`` argument must hold a function
17699 bitcast to an ``i8*``. The ``nval`` argument supplies the ``nest`` value.
17700
17701 Semantics:
17702 """"""""""
17703
17704 The block of memory pointed to by ``tramp`` is filled with
17705 target-dependent code, turning it into a function. Then ``tramp`` needs to be
17706 passed to :ref:`llvm.adjust.trampoline <int_at>` to get a pointer which can
17707 be :ref:`bitcast (to a new function) and called <int_trampoline>`. The new
17708 function's signature is the same as that of ``func`` with any arguments
17709 marked with the ``nest`` attribute removed. At most one such ``nest``
17710 argument is allowed, and it must be of pointer type. Calling the new
17711 function is equivalent to calling ``func`` with the same argument list,
17712 but with ``nval`` used for the missing ``nest`` argument. If, after
17713 calling ``llvm.init.trampoline``, the memory pointed to by ``tramp`` is
17714 modified, then the effect of any later call to the returned function
17715 pointer is undefined.
17716
17717 .. _int_at:
17718
17719 '``llvm.adjust.trampoline``' Intrinsic
17720 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17721
17722 Syntax:
17723 """""""
17724
17725 ::
17726
17727       declare i8* @llvm.adjust.trampoline(i8* <tramp>)
17728
17729 Overview:
17730 """""""""
17731
17732 This performs any required machine-specific adjustment to the address of
17733 a trampoline (passed as ``tramp``).
17734
17735 Arguments:
17736 """"""""""
17737
17738 ``tramp`` must point to a block of memory which already has trampoline
17739 code filled in by a previous call to
17740 :ref:`llvm.init.trampoline <int_it>`.
17741
17742 Semantics:
17743 """"""""""
17744
17745 On some architectures the address of the code to be executed needs to be
17746 different than the address where the trampoline is actually stored. This
17747 intrinsic returns the executable address corresponding to ``tramp``
17748 after performing the required machine-specific adjustments. The pointer
17749 returned can then be :ref:`bitcast and executed <int_trampoline>`.
17750
17751
17752 .. _int_vp:
17753
17754 Vector Predication Intrinsics
17755 -----------------------------
17756 VP intrinsics are intended for predicated SIMD/vector code.  A typical VP
17757 operation takes a vector mask and an explicit vector length parameter as in:
17758
17759 ::
17760
17761       <W x T> llvm.vp.<opcode>.*(<W x T> %x, <W x T> %y, <W x i1> %mask, i32 %evl)
17762
17763 The vector mask parameter (``%mask``) always has a vector of ``i1`` type, for
17764 example ``<32 x i1>``.  The explicit vector length parameter (``%evl``) always
17765 has the type ``i32`` and is an unsigned integer value in
17766 the range:
17767
17768 ::
17769
17770       0 <= %evl <= W,  where W is the number of vector elements
17771
17772 Note that for :ref:`scalable vector types <t_vector>` ``W`` is the runtime
17773 length of the vector.
17774
17775 The VP intrinsic has undefined behavior if ``%evl > W``.  The explicit vector
17776 length (``%evl``) creates a mask, ``%EVLmask``, with all elements ``0 <= i < %evl``
17777 set to True, and all other lanes ``%evl <= i < W`` to False.  A new mask ``M`` is
17778 calculated with an element-wise AND from ``%mask`` and ``%EVLmask``:
17779
17780 ::
17781
17782       M = %mask AND %EVLmask
17783
17784 A vector operation ``<opcode>`` on vectors ``A`` and ``B`` calculates:
17785
17786 ::
17787
17788        A <opcode> B =  {  A[i] <opcode> B[i]   if M[i] = True, and
17789                        {  undef                 otherwise
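
For a fixed-width vector with ``W = 4``, the effective mask described above can
be sketched in plain IR. The function name is hypothetical, and the sketch
assumes ``%evl`` is already within the permitted range:

.. code-block:: llvm

      ; Hypothetical helper computing M = %mask AND %EVLmask for W = 4.
      define <4 x i1> @effective_mask(<4 x i1> %mask, i32 %evl) {
        ; Broadcast %evl to every lane.
        %evl.ins = insertelement <4 x i32> poison, i32 %evl, i32 0
        %evl.splat = shufflevector <4 x i32> %evl.ins, <4 x i32> poison, <4 x i32> zeroinitializer
        ; Lane i is enabled by the vector length when i < %evl.
        %evlmask = icmp ult <4 x i32> <i32 0, i32 1, i32 2, i32 3>, %evl.splat
        %m = and <4 x i1> %mask, %evlmask
        ret <4 x i1> %m
      }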
17790
17791 Optimization Hint
17792 ^^^^^^^^^^^^^^^^^
17793
17794 Some targets, such as AVX512, do not support the ``%evl`` parameter in hardware.
17795 The use of an effective ``%evl`` is discouraged for those targets.  The function
17796 ``TargetTransformInfo::hasActiveVectorLength()`` returns true when the target
17797 has native support for ``%evl``.
17798
17799 .. _int_vp_select:
17800
17801 '``llvm.vp.select.*``' Intrinsics
17802 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17803
17804 Syntax:
17805 """""""
17806 This is an overloaded intrinsic.
17807
17808 ::
17809
17810       declare <16 x i32>  @llvm.vp.select.v16i32 (<16 x i1> <condition>, <16 x i32> <on_true>, <16 x i32> <on_false>, i32 <evl>)
17811       declare <vscale x 4 x i64>  @llvm.vp.select.nxv4i64 (<vscale x 4 x i1> <condition>, <vscale x 4 x i64> <on_true>, <vscale x 4 x i64> <on_false>, i32 <evl>)
17812
17813 Overview:
17814 """""""""
17815
17816 The '``llvm.vp.select``' intrinsic is used to choose one value based on a
17817 condition vector, without IR-level branching.
17818
17819 Arguments:
17820 """"""""""
17821
17822 The first operand is a vector of ``i1`` and indicates the condition.  The
17823 second operand is the value that is selected where the condition vector is
17824 true.  The third operand is the value that is selected where the condition
17825 vector is false.  The vectors must be of the same size.  The fourth operand is
17826 the explicit vector length.
17827
17828 #. The optional ``fast-math flags`` marker indicates that the select has one or
17829    more :ref:`fast-math flags <fastmath>`. These are optimization hints to
17830    enable otherwise unsafe floating-point optimizations. Fast-math flags are
17831    only valid for selects that return a floating-point scalar or vector type,
17832    or an array (nested to any depth) of floating-point scalar or vector types.
17833
17834 Semantics:
17835 """"""""""
17836
17837 The intrinsic selects lanes from the second and third operand depending on a
17838 condition vector.
17839
17840 All result lanes at positions greater than or equal to ``%evl`` are undefined.
17841 For all lanes below ``%evl`` where the condition vector is true the lane is
17842 taken from the second operand.  Otherwise, the lane is taken from the third
17843 operand.
17844
17845 Example:
17846 """"""""
17847
17848 .. code-block:: llvm
17849
17850       %r = call <4 x i32> @llvm.vp.select.v4i32(<4 x i1> %cond, <4 x i32> %on_true, <4 x i32> %on_false, i32 %evl)
17851
17852       ;;; Expansion.
17853       ;; Any result is legal on lanes at and above %evl.
17854       %also.r = select <4 x i1> %cond, <4 x i32> %on_true, <4 x i32> %on_false
17855
17856
17857 .. _int_vp_merge:
17858
17859 '``llvm.vp.merge.*``' Intrinsics
17860 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17861
17862 Syntax:
17863 """""""
17864 This is an overloaded intrinsic.
17865
17866 ::
17867
17868       declare <16 x i32>  @llvm.vp.merge.v16i32 (<16 x i1> <condition>, <16 x i32> <on_true>, <16 x i32> <on_false>, i32 <pivot>)
17869       declare <vscale x 4 x i64>  @llvm.vp.merge.nxv4i64 (<vscale x 4 x i1> <condition>, <vscale x 4 x i64> <on_true>, <vscale x 4 x i64> <on_false>, i32 <pivot>)
17870
17871 Overview:
17872 """""""""
17873
17874 The '``llvm.vp.merge``' intrinsic is used to choose one value based on a
17875 condition vector and an index operand, without IR-level branching.
17876
17877 Arguments:
17878 """"""""""
17879
17880 The first operand is a vector of ``i1`` and indicates the condition.  The
17881 second operand is the value that is merged where the condition vector is true.
17882 The third operand is the value that is selected where the condition vector is
17883 false or the lane position is greater than or equal to the pivot. The fourth
17884 operand is the pivot.
17885
17886 #. The optional ``fast-math flags`` marker indicates that the merge has one or
17887    more :ref:`fast-math flags <fastmath>`. These are optimization hints to
17888    enable otherwise unsafe floating-point optimizations. Fast-math flags are
17889    only valid for merges that return a floating-point scalar or vector type,
17890    or an array (nested to any depth) of floating-point scalar or vector types.
17891
17892 Semantics:
17893 """"""""""
17894
17895 The intrinsic selects lanes from the second and third operand depending on a
17896 condition vector and pivot value.
17897
17898 For all lanes where the condition vector is true and the lane position is less
17899 than ``%pivot`` the lane is taken from the second operand.  Otherwise, the lane
17900 is taken from the third operand.
17901
17902 Example:
17903 """"""""
17904
17905 .. code-block:: llvm
17906
17907       %r = call <4 x i32> @llvm.vp.merge.v4i32(<4 x i1> %cond, <4 x i32> %on_true, <4 x i32> %on_false, i32 %pivot)
17908
17909       ;;; Expansion.
17910       ;; Lanes at and above %pivot are taken from %on_false
17911       %atfirst = insertelement <4 x i32> undef, i32 %pivot, i32 0
17912       %splat = shufflevector <4 x i32> %atfirst, <4 x i32> poison, <4 x i32> zeroinitializer
17913       %pivotmask = icmp ult <4 x i32> <i32 0, i32 1, i32 2, i32 3>, %splat
17914       %mergemask = and <4 x i1> %cond, %pivotmask
17915       %also.r = select <4 x i1> %mergemask, <4 x i32> %on_true, <4 x i32> %on_false
17916
17917
17918
17919 .. _int_vp_add:
17920
17921 '``llvm.vp.add.*``' Intrinsics
17922 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17923
17924 Syntax:
17925 """""""
17926 This is an overloaded intrinsic.
17927
17928 ::
17929
17930       declare <16 x i32>  @llvm.vp.add.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
17931       declare <vscale x 4 x i32>  @llvm.vp.add.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
17932       declare <256 x i64>  @llvm.vp.add.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
17933
17934 Overview:
17935 """""""""
17936
17937 Predicated integer addition of two vectors of integers.
17938
17939
17940 Arguments:
17941 """"""""""
17942
17943 The first two operands and the result have the same vector of integer type. The
17944 third operand is the vector mask and has the same number of elements as the
17945 result vector type. The fourth operand is the explicit vector length of the
17946 operation.
17947
17948 Semantics:
17949 """"""""""
17950
17951 The '``llvm.vp.add``' intrinsic performs integer addition (:ref:`add <i_add>`)
17952 of the first and second vector operand on each enabled lane.  The result on
17953 disabled lanes is undefined.
17954
17955 Examples:
17956 """""""""
17957
17958 .. code-block:: llvm
17959
17960       %r = call <4 x i32> @llvm.vp.add.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
17961       ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
17962
17963       %t = add <4 x i32> %a, %b
17964       %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef
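
With a scalable vector type, the number of elements ``W`` is only known at run
time. The sketch below, using a hypothetical function name, enables every lane
by passing an all-true mask and an explicit vector length of ``vscale * 4``,
computed with the ``llvm.vscale`` intrinsic. It assumes that product fits in an
``i32``.

.. code-block:: llvm

      declare <vscale x 4 x i32> @llvm.vp.add.nxv4i32(<vscale x 4 x i32>, <vscale x 4 x i32>, <vscale x 4 x i1>, i32)
      declare i32 @llvm.vscale.i32()

      define <vscale x 4 x i32> @add_all_lanes(<vscale x 4 x i32> %a, <vscale x 4 x i32> %b) {
        ; Splat 'true' into every mask lane.
        %t.ins = insertelement <vscale x 4 x i1> poison, i1 true, i32 0
        %alltrue = shufflevector <vscale x 4 x i1> %t.ins, <vscale x 4 x i1> poison, <vscale x 4 x i32> zeroinitializer
        ; W = vscale * 4 is the runtime number of elements.
        %vscale = call i32 @llvm.vscale.i32()
        %w = mul i32 %vscale, 4
        %r = call <vscale x 4 x i32> @llvm.vp.add.nxv4i32(<vscale x 4 x i32> %a, <vscale x 4 x i32> %b, <vscale x 4 x i1> %alltrue, i32 %w)
        ret <vscale x 4 x i32> %r
      }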
17965
17966 .. _int_vp_sub:
17967
17968 '``llvm.vp.sub.*``' Intrinsics
17969 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17970
17971 Syntax:
17972 """""""
17973 This is an overloaded intrinsic.
17974
17975 ::
17976
17977       declare <16 x i32>  @llvm.vp.sub.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
17978       declare <vscale x 4 x i32>  @llvm.vp.sub.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
17979       declare <256 x i64>  @llvm.vp.sub.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
17980
17981 Overview:
17982 """""""""
17983
17984 Predicated integer subtraction of two vectors of integers.
17985
17986
17987 Arguments:
17988 """"""""""
17989
17990 The first two operands and the result have the same vector of integer type. The
17991 third operand is the vector mask and has the same number of elements as the
17992 result vector type. The fourth operand is the explicit vector length of the
17993 operation.
17994
17995 Semantics:
17996 """"""""""
17997
17998 The '``llvm.vp.sub``' intrinsic performs integer subtraction
17999 (:ref:`sub <i_sub>`) of the first and second vector operand on each enabled
18000 lane. The result on disabled lanes is undefined.
18001
18002 Examples:
18003 """""""""
18004
18005 .. code-block:: llvm
18006
18007       %r = call <4 x i32> @llvm.vp.sub.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
18008       ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
18009
18010       %t = sub <4 x i32> %a, %b
18011       %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef
18012
18013
18014
18015 .. _int_vp_mul:
18016
18017 '``llvm.vp.mul.*``' Intrinsics
18018 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18019
18020 Syntax:
18021 """""""
18022 This is an overloaded intrinsic.
18023
18024 ::
18025
18026       declare <16 x i32>  @llvm.vp.mul.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
18027       declare <vscale x 4 x i32>  @llvm.vp.mul.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
18028       declare <256 x i64>  @llvm.vp.mul.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
18029
18030 Overview:
18031 """""""""
18032
18033 Predicated integer multiplication of two vectors of integers.
18034
18035
18036 Arguments:
18037 """"""""""
18038
18039 The first two operands and the result have the same vector of integer type. The
18040 third operand is the vector mask and has the same number of elements as the
18041 result vector type. The fourth operand is the explicit vector length of the
18042 operation.
18043
18044 Semantics:
18045 """"""""""
18046 The '``llvm.vp.mul``' intrinsic performs integer multiplication
18047 (:ref:`mul <i_mul>`) of the first and second vector operand on each enabled
18048 lane. The result on disabled lanes is undefined.
18049
18050 Examples:
18051 """""""""
18052
18053 .. code-block:: llvm
18054
18055       %r = call <4 x i32> @llvm.vp.mul.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
18056       ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
18057
18058       %t = mul <4 x i32> %a, %b
18059       %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef
18060
18061
18062 .. _int_vp_sdiv:
18063
18064 '``llvm.vp.sdiv.*``' Intrinsics
18065 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18066
18067 Syntax:
18068 """""""
18069 This is an overloaded intrinsic.
18070
18071 ::
18072
18073       declare <16 x i32>  @llvm.vp.sdiv.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
18074       declare <vscale x 4 x i32>  @llvm.vp.sdiv.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
18075       declare <256 x i64>  @llvm.vp.sdiv.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
18076
18077 Overview:
18078 """""""""
18079
18080 Predicated, signed division of two vectors of integers.
18081
18082
18083 Arguments:
18084 """"""""""
18085
18086 The first two operands and the result have the same vector of integer type. The
18087 third operand is the vector mask and has the same number of elements as the
18088 result vector type. The fourth operand is the explicit vector length of the
18089 operation.
18090
18091 Semantics:
18092 """"""""""
18093
18094 The '``llvm.vp.sdiv``' intrinsic performs signed division (:ref:`sdiv <i_sdiv>`)
18095 of the first and second vector operand on each enabled lane.  The result on
18096 disabled lanes is undefined.
18097
18098 Examples:
18099 """""""""
18100
18101 .. code-block:: llvm
18102
18103       %r = call <4 x i32> @llvm.vp.sdiv.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
18104       ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
18105
18106       %t = sdiv <4 x i32> %a, %b
18107       %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef
18108
18109
18110 .. _int_vp_udiv:
18111
18112 '``llvm.vp.udiv.*``' Intrinsics
18113 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18114
18115 Syntax:
18116 """""""
18117 This is an overloaded intrinsic.
18118
18119 ::
18120
18121       declare <16 x i32>  @llvm.vp.udiv.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
18122       declare <vscale x 4 x i32>  @llvm.vp.udiv.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
18123       declare <256 x i64>  @llvm.vp.udiv.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
18124
18125 Overview:
18126 """""""""
18127
18128 Predicated, unsigned division of two vectors of integers.
18129
18130
18131 Arguments:
18132 """"""""""
18133
18134 The first two operands and the result have the same vector of integer type. The third operand is the vector mask and has the same number of elements as the result vector type. The fourth operand is the explicit vector length of the operation.
18135
18136 Semantics:
18137 """"""""""
18138
18139 The '``llvm.vp.udiv``' intrinsic performs unsigned division
18140 (:ref:`udiv <i_udiv>`) of the first and second vector operand on each enabled
18141 lane. The result on disabled lanes is undefined.
18142
18143 Examples:
18144 """""""""
18145
18146 .. code-block:: llvm
18147
18148       %r = call <4 x i32> @llvm.vp.udiv.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
18149       ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
18150
18151       %t = udiv <4 x i32> %a, %b
18152       %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef
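
The plain ``udiv`` in the expansion above runs on every lane, and a
:ref:`udiv <i_udiv>` by zero is undefined behavior. One way to keep disabled
lanes harmless is to substitute a divisor of 1 on those lanes before dividing.
This is a sketch and an assumption about one possible expansion, not a required
lowering. The function name is hypothetical and, as in the example above,
``%evl`` is not folded into the mask.

.. code-block:: llvm

      define <4 x i32> @vp_udiv_expanded(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask) {
        ; Disabled lanes divide by 1, so a zero in %b cannot cause a trap there.
        %safe.b = select <4 x i1> %mask, <4 x i32> %b, <4 x i32> <i32 1, i32 1, i32 1, i32 1>
        %t = udiv <4 x i32> %a, %safe.b
        ; The result on disabled lanes is still left undefined.
        %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef
        ret <4 x i32> %also.r
      }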
18153
18154
18155
18156 .. _int_vp_srem:
18157
18158 '``llvm.vp.srem.*``' Intrinsics
18159 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18160
18161 Syntax:
18162 """""""
18163 This is an overloaded intrinsic.
18164
18165 ::
18166
18167       declare <16 x i32>  @llvm.vp.srem.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
18168       declare <vscale x 4 x i32>  @llvm.vp.srem.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
18169       declare <256 x i64>  @llvm.vp.srem.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
18170
18171 Overview:
18172 """""""""
18173
18174 Predicated computation of the signed remainder of two integer vectors.
18175
18176
18177 Arguments:
18178 """"""""""
18179
18180 The first two operands and the result have the same vector of integer type. The
18181 third operand is the vector mask and has the same number of elements as the
18182 result vector type. The fourth operand is the explicit vector length of the
18183 operation.
18184
18185 Semantics:
18186 """"""""""
18187
18188 The '``llvm.vp.srem``' intrinsic computes the remainder of the signed division
18189 (:ref:`srem <i_srem>`) of the first and second vector operand on each enabled
18190 lane.  The result on disabled lanes is undefined.
18191
18192 Examples:
18193 """""""""
18194
18195 .. code-block:: llvm
18196
18197       %r = call <4 x i32> @llvm.vp.srem.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
18198       ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
18199
18200       %t = srem <4 x i32> %a, %b
18201       %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef
18202
18203
18204
18205 .. _int_vp_urem:
18206
18207 '``llvm.vp.urem.*``' Intrinsics
18208 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18209
18210 Syntax:
18211 """""""
18212 This is an overloaded intrinsic.
18213
18214 ::
18215
18216       declare <16 x i32>  @llvm.vp.urem.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
18217       declare <vscale x 4 x i32>  @llvm.vp.urem.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
18218       declare <256 x i64>  @llvm.vp.urem.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
18219
18220 Overview:
18221 """""""""
18222
18223 Predicated computation of the unsigned remainder of two integer vectors.
18224
18225
18226 Arguments:
18227 """"""""""
18228
18229 The first two operands and the result have the same vector of integer type. The
18230 third operand is the vector mask and has the same number of elements as the
18231 result vector type. The fourth operand is the explicit vector length of the
18232 operation.
18233
18234 Semantics:
18235 """"""""""
18236
18237 The '``llvm.vp.urem``' intrinsic computes the remainder of the unsigned division
18238 (:ref:`urem <i_urem>`) of the first and second vector operand on each enabled
18239 lane.  The result on disabled lanes is undefined.
18240
18241 Examples:
18242 """""""""
18243
18244 .. code-block:: llvm
18245
18246       %r = call <4 x i32> @llvm.vp.urem.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
18247       ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
18248
18249       %t = urem <4 x i32> %a, %b
18250       %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef
18251
18252
18253 .. _int_vp_ashr:
18254
18255 '``llvm.vp.ashr.*``' Intrinsics
18256 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18257
18258 Syntax:
18259 """""""
18260 This is an overloaded intrinsic.
18261
18262 ::
18263
18264       declare <16 x i32>  @llvm.vp.ashr.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
18265       declare <vscale x 4 x i32>  @llvm.vp.ashr.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
18266       declare <256 x i64>  @llvm.vp.ashr.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
18267
18268 Overview:
18269 """""""""
18270
18271 Vector-predicated arithmetic right-shift.
18272
18273
18274 Arguments:
18275 """"""""""
18276
18277 The first two operands and the result have the same vector of integer type. The
18278 third operand is the vector mask and has the same number of elements as the
18279 result vector type. The fourth operand is the explicit vector length of the
18280 operation.
18281
18282 Semantics:
18283 """"""""""
18284
18285 The '``llvm.vp.ashr``' intrinsic computes the arithmetic right shift
18286 (:ref:`ashr <i_ashr>`) of the first operand by the second operand on each
18287 enabled lane. The result on disabled lanes is undefined.
18288
18289 Examples:
18290 """""""""
18291
18292 .. code-block:: llvm
18293
18294       %r = call <4 x i32> @llvm.vp.ashr.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
18295       ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
18296
18297       %t = ashr <4 x i32> %a, %b
18298       %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef
18299
18300
18301 .. _int_vp_lshr:
18302
18303
18304 '``llvm.vp.lshr.*``' Intrinsics
18305 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18306
18307 Syntax:
18308 """""""
18309 This is an overloaded intrinsic.
18310
18311 ::
18312
18313       declare <16 x i32>  @llvm.vp.lshr.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
18314       declare <vscale x 4 x i32>  @llvm.vp.lshr.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
18315       declare <256 x i64>  @llvm.vp.lshr.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
18316
18317 Overview:
18318 """""""""
18319
18320 Vector-predicated logical right-shift.
18321
18322
18323 Arguments:
18324 """"""""""
18325
18326 The first two operands and the result have the same vector of integer type. The
18327 third operand is the vector mask and has the same number of elements as the
18328 result vector type. The fourth operand is the explicit vector length of the
18329 operation.
18330
18331 Semantics:
18332 """"""""""
18333
18334 The '``llvm.vp.lshr``' intrinsic computes the logical right shift
18335 (:ref:`lshr <i_lshr>`) of the first operand by the second operand on each
18336 enabled lane. The result on disabled lanes is undefined.
18337
18338 Examples:
18339 """""""""
18340
18341 .. code-block:: llvm
18342
18343       %r = call <4 x i32> @llvm.vp.lshr.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
18344       ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
18345
18346       %t = lshr <4 x i32> %a, %b
18347       %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef
18348
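The equivalence above only describes lanes below ``%evl``. As a rough,
non-normative sketch, the explicit vector length can be folded into the mask by
comparing each lane index against a splat of ``%evl``. This assumes ``%evl``
does not exceed the number of vector elements; the names ``%evl.ins``,
``%evl.splat``, ``%in.range`` and ``%eff.mask`` are illustrative only.

.. code-block:: llvm

      ;; Splat %evl and compare it against the lane indices 0..3.
      %evl.ins = insertelement <4 x i32> undef, i32 %evl, i32 0
      %evl.splat = shufflevector <4 x i32> %evl.ins, <4 x i32> undef, <4 x i32> zeroinitializer
      %in.range = icmp ult <4 x i32> <i32 0, i32 1, i32 2, i32 3>, %evl.splat

      ;; A lane is enabled only if its mask bit is set and its index is below %evl.
      %eff.mask = and <4 x i1> %mask, %in.range
      %t = lshr <4 x i32> %a, %b
      %also.r = select <4 x i1> %eff.mask, <4 x i32> %t, <4 x i32> undef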
18349
18350 .. _int_vp_shl:
18351
18352 '``llvm.vp.shl.*``' Intrinsics
18353 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18354
18355 Syntax:
18356 """""""
18357 This is an overloaded intrinsic.
18358
18359 ::
18360
18361       declare <16 x i32>  @llvm.vp.shl.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
18362       declare <vscale x 4 x i32>  @llvm.vp.shl.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
18363       declare <256 x i64>  @llvm.vp.shl.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
18364
18365 Overview:
18366 """""""""
18367
18368 Vector-predicated left shift.
18369
18370
18371 Arguments:
18372 """"""""""
18373
18374 The first two operands and the result have the same vector of integer type. The
18375 third operand is the vector mask and has the same number of elements as the
18376 result vector type. The fourth operand is the explicit vector length of the
18377 operation.
18378
18379 Semantics:
18380 """"""""""
18381
18382 The '``llvm.vp.shl``' intrinsic computes the left shift (:ref:`shl <i_shl>`) of
18383 the first operand by the second operand on each enabled lane.  The result on
18384 disabled lanes is undefined.
18385
18386 Examples:
18387 """""""""
18388
18389 .. code-block:: llvm
18390
18391       %r = call <4 x i32> @llvm.vp.shl.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
18392       ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
18393
18394       %t = shl <4 x i32> %a, %b
18395       %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef
18396
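The intrinsic is used the same way with scalable vectors. A minimal sketch,
assuming the caller wants every lane below ``%evl`` enabled and therefore
builds an all-true mask (``%true.ins`` and ``%all.true`` are illustrative
names):

.. code-block:: llvm

      ;; Splat i1 true across a scalable mask vector.
      %true.ins = insertelement <vscale x 4 x i1> undef, i1 true, i32 0
      %all.true = shufflevector <vscale x 4 x i1> %true.ins, <vscale x 4 x i1> undef, <vscale x 4 x i32> zeroinitializer

      ;; With an all-true mask, only %evl limits which lanes are shifted.
      %r = call <vscale x 4 x i32> @llvm.vp.shl.nxv4i32(<vscale x 4 x i32> %a, <vscale x 4 x i32> %b, <vscale x 4 x i1> %all.true, i32 %evl)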
18397
18398 .. _int_vp_or:
18399
18400 '``llvm.vp.or.*``' Intrinsics
18401 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18402
18403 Syntax:
18404 """""""
18405 This is an overloaded intrinsic.
18406
18407 ::
18408
18409       declare <16 x i32>  @llvm.vp.or.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
18410       declare <vscale x 4 x i32>  @llvm.vp.or.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
18411       declare <256 x i64>  @llvm.vp.or.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
18412
18413 Overview:
18414 """""""""
18415
18416 Vector-predicated or.
18417
18418
18419 Arguments:
18420 """"""""""
18421
18422 The first two operands and the result have the same vector of integer type. The
18423 third operand is the vector mask and has the same number of elements as the
18424 result vector type. The fourth operand is the explicit vector length of the
18425 operation.
18426
18427 Semantics:
18428 """"""""""
18429
18430 The '``llvm.vp.or``' intrinsic performs a bitwise or (:ref:`or <i_or>`) of the
18431 first two operands on each enabled lane.  The result on disabled lanes is
18432 undefined.
18433
18434 Examples:
18435 """""""""
18436
18437 .. code-block:: llvm
18438
18439       %r = call <4 x i32> @llvm.vp.or.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
18440       ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
18441
18442       %t = or <4 x i32> %a, %b
18443       %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef
18444
18445
18446 .. _int_vp_and:
18447
18448 '``llvm.vp.and.*``' Intrinsics
18449 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18450
18451 Syntax:
18452 """""""
18453 This is an overloaded intrinsic.
18454
18455 ::
18456
18457       declare <16 x i32>  @llvm.vp.and.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
18458       declare <vscale x 4 x i32>  @llvm.vp.and.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
18459       declare <256 x i64>  @llvm.vp.and.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
18460
18461 Overview:
18462 """""""""
18463
18464 Vector-predicated and.
18465
18466
18467 Arguments:
18468 """"""""""
18469
18470 The first two operands and the result have the same vector of integer type. The
18471 third operand is the vector mask and has the same number of elements as the
18472 result vector type. The fourth operand is the explicit vector length of the
18473 operation.
18474
18475 Semantics:
18476 """"""""""
18477
18478 The '``llvm.vp.and``' intrinsic performs a bitwise and (:ref:`and <i_and>`) of
18479 the first two operands on each enabled lane.  The result on disabled lanes is
18480 undefined.
18481
18482 Examples:
18483 """""""""
18484
18485 .. code-block:: llvm
18486
18487       %r = call <4 x i32> @llvm.vp.and.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
18488       ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
18489
18490       %t = and <4 x i32> %a, %b
18491       %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef
18492
18493
18494 .. _int_vp_xor:
18495
18496 '``llvm.vp.xor.*``' Intrinsics
18497 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18498
18499 Syntax:
18500 """""""
18501 This is an overloaded intrinsic.
18502
18503 ::
18504
18505       declare <16 x i32>  @llvm.vp.xor.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
18506       declare <vscale x 4 x i32>  @llvm.vp.xor.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
18507       declare <256 x i64>  @llvm.vp.xor.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
18508
18509 Overview:
18510 """""""""
18511
18512 Vector-predicated bitwise xor.
18513
18514
18515 Arguments:
18516 """"""""""
18517
18518 The first two operands and the result have the same vector of integer type. The
18519 third operand is the vector mask and has the same number of elements as the
18520 result vector type. The fourth operand is the explicit vector length of the
18521 operation.
18522
18523 Semantics:
18524 """"""""""
18525
18526 The '``llvm.vp.xor``' intrinsic performs a bitwise xor (:ref:`xor <i_xor>`) of
18527 the first two operands on each enabled lane.
18528 The result on disabled lanes is undefined.
18529
18530 Examples:
18531 """""""""
18532
18533 .. code-block:: llvm
18534
18535       %r = call <4 x i32> @llvm.vp.xor.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
18536       ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
18537
18538       %t = xor <4 x i32> %a, %b
18539       %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef
18540
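Because a bitwise xor with an all-ones vector inverts every bit, a predicated
bitwise NOT can be sketched as follows (``%not.a`` is an illustrative name):

.. code-block:: llvm

      %not.a = call <4 x i32> @llvm.vp.xor.v4i32(<4 x i32> %a, <4 x i32> <i32 -1, i32 -1, i32 -1, i32 -1>, <4 x i1> %mask, i32 %evl)
      ;; On enabled lanes, each element of %not.a is the bitwise complement of
      ;; the corresponding element of %a.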
18541
18542 .. _int_vp_fadd:
18543
18544 '``llvm.vp.fadd.*``' Intrinsics
18545 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18546
18547 Syntax:
18548 """""""
18549 This is an overloaded intrinsic.
18550
18551 ::
18552
18553       declare <16 x float>  @llvm.vp.fadd.v16f32 (<16 x float> <left_op>, <16 x float> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
18554       declare <vscale x 4 x float>  @llvm.vp.fadd.nxv4f32 (<vscale x 4 x float> <left_op>, <vscale x 4 x float> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
18555       declare <256 x double>  @llvm.vp.fadd.v256f64 (<256 x double> <left_op>, <256 x double> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
18556
18557 Overview:
18558 """""""""
18559
18560 Predicated floating-point addition of two vectors of floating-point values.
18561
18562
18563 Arguments:
18564 """"""""""
18565
18566 The first two operands and the result have the same vector of floating-point type. The
18567 third operand is the vector mask and has the same number of elements as the
18568 result vector type. The fourth operand is the explicit vector length of the
18569 operation.
18570
18571 Semantics:
18572 """"""""""
18573
18574 The '``llvm.vp.fadd``' intrinsic performs floating-point addition (:ref:`fadd <i_fadd>`)
18575 of the first and second vector operands on each enabled lane.  The result on
18576 disabled lanes is undefined.  The operation is performed in the default
18577 floating-point environment.
18578
18579 Examples:
18580 """""""""
18581
18582 .. code-block:: llvm
18583
18584       %r = call <4 x float> @llvm.vp.fadd.v4f32(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 %evl)
18585       ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
18586
18587       %t = fadd <4 x float> %a, %b
18588       %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> undef
18589
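When the mask is all-true and the explicit vector length equals the number of
elements, every lane is enabled. As an illustration (``%r.all`` and ``%t.all``
are illustrative names):

.. code-block:: llvm

      %r.all = call <4 x float> @llvm.vp.fadd.v4f32(<4 x float> %a, <4 x float> %b, <4 x i1> <i1 true, i1 true, i1 true, i1 true>, i32 4)
      ;; Every lane is enabled, so %r.all is lane-wise equivalent to %t.all

      %t.all = fadd <4 x float> %a, %b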
18590
18591 .. _int_vp_fsub:
18592
18593 '``llvm.vp.fsub.*``' Intrinsics
18594 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18595
18596 Syntax:
18597 """""""
18598 This is an overloaded intrinsic.
18599
18600 ::
18601
18602       declare <16 x float>  @llvm.vp.fsub.v16f32 (<16 x float> <left_op>, <16 x float> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
18603       declare <vscale x 4 x float>  @llvm.vp.fsub.nxv4f32 (<vscale x 4 x float> <left_op>, <vscale x 4 x float> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
18604       declare <256 x double>  @llvm.vp.fsub.v256f64 (<256 x double> <left_op>, <256 x double> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
18605
18606 Overview:
18607 """""""""
18608
18609 Predicated floating-point subtraction of two vectors of floating-point values.
18610
18611
18612 Arguments:
18613 """"""""""
18614
18615 The first two operands and the result have the same vector of floating-point type. The
18616 third operand is the vector mask and has the same number of elements as the
18617 result vector type. The fourth operand is the explicit vector length of the
18618 operation.
18619
18620 Semantics:
18621 """"""""""
18622
18623 The '``llvm.vp.fsub``' intrinsic performs floating-point subtraction (:ref:`fsub <i_fsub>`)
18624 of the first and second vector operands on each enabled lane.  The result on
18625 disabled lanes is undefined.  The operation is performed in the default
18626 floating-point environment.
18627
18628 Examples:
18629 """""""""
18630
18631 .. code-block:: llvm
18632
18633       %r = call <4 x float> @llvm.vp.fsub.v4f32(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 %evl)
18634       ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
18635
18636       %t = fsub <4 x float> %a, %b
18637       %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> undef
18638
18639
18640 .. _int_vp_fmul:
18641
18642 '``llvm.vp.fmul.*``' Intrinsics
18643 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18644
18645 Syntax:
18646 """""""
18647 This is an overloaded intrinsic.
18648
18649 ::
18650
18651       declare <16 x float>  @llvm.vp.fmul.v16f32 (<16 x float> <left_op>, <16 x float> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
18652       declare <vscale x 4 x float>  @llvm.vp.fmul.nxv4f32 (<vscale x 4 x float> <left_op>, <vscale x 4 x float> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
18653       declare <256 x double>  @llvm.vp.fmul.v256f64 (<256 x double> <left_op>, <256 x double> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
18654
18655 Overview:
18656 """""""""
18657
18658 Predicated floating-point multiplication of two vectors of floating-point values.
18659
18660
18661 Arguments:
18662 """"""""""
18663
18664 The first two operands and the result have the same vector of floating-point type. The
18665 third operand is the vector mask and has the same number of elements as the
18666 result vector type. The fourth operand is the explicit vector length of the
18667 operation.
18668
18669 Semantics:
18670 """"""""""
18671
18672 The '``llvm.vp.fmul``' intrinsic performs floating-point multiplication (:ref:`fmul <i_fmul>`)
18673 of the first and second vector operands on each enabled lane.  The result on
18674 disabled lanes is undefined.  The operation is performed in the default
18675 floating-point environment.
18676
18677 Examples:
18678 """""""""
18679
18680 .. code-block:: llvm
18681
18682       %r = call <4 x float> @llvm.vp.fmul.v4f32(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 %evl)
18683       ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
18684
18685       %t = fmul <4 x float> %a, %b
18686       %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> undef
18687
18688
18689 .. _int_vp_fdiv:
18690
18691 '``llvm.vp.fdiv.*``' Intrinsics
18692 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18693
18694 Syntax:
18695 """""""
18696 This is an overloaded intrinsic.
18697
18698 ::
18699
18700       declare <16 x float>  @llvm.vp.fdiv.v16f32 (<16 x float> <left_op>, <16 x float> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
18701       declare <vscale x 4 x float>  @llvm.vp.fdiv.nxv4f32 (<vscale x 4 x float> <left_op>, <vscale x 4 x float> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
18702       declare <256 x double>  @llvm.vp.fdiv.v256f64 (<256 x double> <left_op>, <256 x double> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
18703
18704 Overview:
18705 """""""""
18706
18707 Predicated floating-point division of two vectors of floating-point values.
18708
18709
18710 Arguments:
18711 """"""""""
18712
18713 The first two operands and the result have the same vector of floating-point type. The
18714 third operand is the vector mask and has the same number of elements as the
18715 result vector type. The fourth operand is the explicit vector length of the
18716 operation.
18717
18718 Semantics:
18719 """"""""""
18720
18721 The '``llvm.vp.fdiv``' intrinsic performs floating-point division (:ref:`fdiv <i_fdiv>`)
18722 of the first and second vector operands on each enabled lane.  The result on
18723 disabled lanes is undefined.  The operation is performed in the default
18724 floating-point environment.
18725
18726 Examples:
18727 """""""""
18728
18729 .. code-block:: llvm
18730
18731       %r = call <4 x float> @llvm.vp.fdiv.v4f32(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 %evl)
18732       ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
18733
18734       %t = fdiv <4 x float> %a, %b
18735       %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> undef
18736
18737
18738 .. _int_vp_frem:
18739
18740 '``llvm.vp.frem.*``' Intrinsics
18741 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18742
18743 Syntax:
18744 """""""
18745 This is an overloaded intrinsic.
18746
18747 ::
18748
18749       declare <16 x float>  @llvm.vp.frem.v16f32 (<16 x float> <left_op>, <16 x float> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
18750       declare <vscale x 4 x float>  @llvm.vp.frem.nxv4f32 (<vscale x 4 x float> <left_op>, <vscale x 4 x float> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
18751       declare <256 x double>  @llvm.vp.frem.v256f64 (<256 x double> <left_op>, <256 x double> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
18752
18753 Overview:
18754 """""""""
18755
18756 Predicated floating-point remainder of two vectors of floating-point values.
18757
18758
18759 Arguments:
18760 """"""""""
18761
18762 The first two operands and the result have the same vector of floating-point type. The
18763 third operand is the vector mask and has the same number of elements as the
18764 result vector type. The fourth operand is the explicit vector length of the
18765 operation.
18766
18767 Semantics:
18768 """"""""""
18769
18770 The '``llvm.vp.frem``' intrinsic performs floating-point remainder (:ref:`frem <i_frem>`)
18771 of the first and second vector operands on each enabled lane.  The result on
18772 disabled lanes is undefined.  The operation is performed in the default
18773 floating-point environment.
18774
18775 Examples:
18776 """""""""
18777
18778 .. code-block:: llvm
18779
18780       %r = call <4 x float> @llvm.vp.frem.v4f32(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 %evl)
18781       ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
18782
18783       %t = frem <4 x float> %a, %b
18784       %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> undef
18785
18786
18787
18788 .. _int_vp_reduce_add:
18789
18790 '``llvm.vp.reduce.add.*``' Intrinsics
18791 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18792
18793 Syntax:
18794 """""""
18795 This is an overloaded intrinsic.
18796
18797 ::
18798
18799       declare i32 @llvm.vp.reduce.add.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
18800       declare i16 @llvm.vp.reduce.add.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
18801
18802 Overview:
18803 """""""""
18804
18805 Predicated integer ``ADD`` reduction of a vector and a scalar starting value,
18806 returning the result as a scalar.
18807
18808 Arguments:
18809 """"""""""
18810
18811 The first operand is the start value of the reduction, which must be a scalar
18812 integer type equal to the result type. The second operand is the vector on
18813 which the reduction is performed and must be a vector of integer values whose
18814 element type is the result/start type. The third operand is the vector mask and
18815 is a vector of boolean values with the same number of elements as the vector
18816 operand. The fourth operand is the explicit vector length of the operation.
18817
18818 Semantics:
18819 """"""""""
18820
18821 The '``llvm.vp.reduce.add``' intrinsic performs the integer ``ADD`` reduction
18822 (:ref:`llvm.vector.reduce.add <int_vector_reduce_add>`) of the vector operand
18823 ``val`` on each enabled lane, adding it to the scalar ``start_value``. Disabled
18824 lanes are treated as containing the neutral value ``0`` (i.e. having no effect
18825 on the reduction operation). If the vector length is zero, the result is equal
18826 to ``start_value``.
18827
18828 To ignore the start value, the neutral value can be used.
18829
18830 Examples:
18831 """""""""
18832
18833 .. code-block:: llvm
18834
18835       %r = call i32 @llvm.vp.reduce.add.v4i32(i32 %start, <4 x i32> %a, <4 x i1> %mask, i32 %evl)
18836       ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
18837       ; are treated as though %mask were false for those lanes.
18838
18839       %masked.a = select <4 x i1> %mask, <4 x i32> %a, <4 x i32> zeroinitializer
18840       %reduction = call i32 @llvm.vector.reduce.add.v4i32(<4 x i32> %masked.a)
18841       %also.r = add i32 %reduction, %start
18842
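As a brief illustration of ignoring the start value, passing the neutral value
``0`` yields the sum of the enabled lanes only (``%sum`` is an illustrative
name):

.. code-block:: llvm

      %sum = call i32 @llvm.vp.reduce.add.v4i32(i32 0, <4 x i32> %a, <4 x i1> %mask, i32 %evl)
      ;; %sum is the sum of the enabled lanes of %a; if no lane is enabled,
      ;; %sum is 0.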
18843
18844 .. _int_vp_reduce_fadd:
18845
18846 '``llvm.vp.reduce.fadd.*``' Intrinsics
18847 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18848
18849 Syntax:
18850 """""""
18851 This is an overloaded intrinsic.
18852
18853 ::
18854
18855       declare float @llvm.vp.reduce.fadd.v4f32(float <start_value>, <4 x float> <val>, <4 x i1> <mask>, i32 <vector_length>)
18856       declare double @llvm.vp.reduce.fadd.nxv8f64(double <start_value>, <vscale x 8 x double> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
18857
18858 Overview:
18859 """""""""
18860
18861 Predicated floating-point ``ADD`` reduction of a vector and a scalar starting
18862 value, returning the result as a scalar.
18863
18864 Arguments:
18865 """"""""""
18866
18867 The first operand is the start value of the reduction, which must be a scalar
18868 floating-point type equal to the result type. The second operand is the vector
18869 on which the reduction is performed and must be a vector of floating-point
18870 values whose element type is the result/start type. The third operand is the
18871 vector mask and is a vector of boolean values with the same number of elements
18872 as the vector operand. The fourth operand is the explicit vector length of the
18873 operation.
18874
18875 Semantics:
18876 """"""""""
18877
18878 The '``llvm.vp.reduce.fadd``' intrinsic performs the floating-point ``ADD``
18879 reduction (:ref:`llvm.vector.reduce.fadd <int_vector_reduce_fadd>`) of the
18880 vector operand ``val`` on each enabled lane, adding it to the scalar
18881 ``start_value``. Disabled lanes are treated as containing the neutral value
18882 ``-0.0`` (i.e. having no effect on the reduction operation). If no lanes are
18883 enabled, the resulting value will be equal to ``start_value``.
18884
18885 To ignore the start value, the neutral value can be used.
18886
18887 See the unpredicated version (:ref:`llvm.vector.reduce.fadd
18888 <int_vector_reduce_fadd>`) for more detail on the semantics of the reduction.
18889
18890 Examples:
18891 """""""""
18892
18893 .. code-block:: llvm
18894
18895       %r = call float @llvm.vp.reduce.fadd.v4f32(float %start, <4 x float> %a, <4 x i1> %mask, i32 %evl)
18896       ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
18897       ; are treated as though %mask were false for those lanes.
18898
18899       %masked.a = select <4 x i1> %mask, <4 x float> %a, <4 x float> <float -0.0, float -0.0, float -0.0, float -0.0>
18900       %also.r = call float @llvm.vector.reduce.fadd.v4f32(float %start, <4 x float> %masked.a)
18901
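The neutral value is ``-0.0`` rather than ``+0.0`` because, in the default
floating-point environment, adding ``-0.0`` to any value leaves that value
unchanged, whereas adding ``+0.0`` to ``-0.0`` gives ``+0.0``. A sketch of a
reduction that ignores the start value (``%fsum`` is an illustrative name):

.. code-block:: llvm

      %fsum = call float @llvm.vp.reduce.fadd.v4f32(float -0.0, <4 x float> %a, <4 x i1> %mask, i32 %evl)
      ;; %fsum is the floating-point sum of the enabled lanes of %a; if no lane
      ;; is enabled, %fsum is -0.0.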
18902
18903 .. _int_vp_reduce_mul:
18904
18905 '``llvm.vp.reduce.mul.*``' Intrinsics
18906 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18907
18908 Syntax:
18909 """""""
18910 This is an overloaded intrinsic.
18911
18912 ::
18913
18914       declare i32 @llvm.vp.reduce.mul.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
18915       declare i16 @llvm.vp.reduce.mul.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
18916
18917 Overview:
18918 """""""""
18919
18920 Predicated integer ``MUL`` reduction of a vector and a scalar starting value,
18921 returning the result as a scalar.
18922
18923
18924 Arguments:
18925 """"""""""
18926
18927 The first operand is the start value of the reduction, which must be a scalar
18928 integer type equal to the result type. The second operand is the vector on
18929 which the reduction is performed and must be a vector of integer values whose
18930 element type is the result/start type. The third operand is the vector mask and
18931 is a vector of boolean values with the same number of elements as the vector
18932 operand. The fourth operand is the explicit vector length of the operation.
18933
18934 Semantics:
18935 """"""""""
18936
18937 The '``llvm.vp.reduce.mul``' intrinsic performs the integer ``MUL`` reduction
18938 (:ref:`llvm.vector.reduce.mul <int_vector_reduce_mul>`) of the vector operand ``val``
18939 on each enabled lane, multiplying it by the scalar ``start_value``. Disabled
18940 lanes are treated as containing the neutral value ``1`` (i.e. having no effect
18941 on the reduction operation). If the vector length is zero, the result is the
18942 start value.
18943
18944 To ignore the start value, the neutral value can be used.
18945
18946 Examples:
18947 """""""""
18948
18949 .. code-block:: llvm
18950
18951       %r = call i32 @llvm.vp.reduce.mul.v4i32(i32 %start, <4 x i32> %a, <4 x i1> %mask, i32 %evl)
18952       ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
18953       ; are treated as though %mask were false for those lanes.
18954
18955       %masked.a = select <4 x i1> %mask, <4 x i32> %a, <4 x i32> <i32 1, i32 1, i32 1, i32 1>
18956       %reduction = call i32 @llvm.vector.reduce.mul.v4i32(<4 x i32> %masked.a)
18957       %also.r = mul i32 %reduction, %start
18958
18959 .. _int_vp_reduce_fmul:
18960
18961 '``llvm.vp.reduce.fmul.*``' Intrinsics
18962 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18963
18964 Syntax:
18965 """""""
18966 This is an overloaded intrinsic.
18967
18968 ::
18969
18970       declare float @llvm.vp.reduce.fmul.v4f32(float <start_value>, <4 x float> <val>, <4 x i1> <mask>, i32 <vector_length>)
18971       declare double @llvm.vp.reduce.fmul.nxv8f64(double <start_value>, <vscale x 8 x double> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
18972
18973 Overview:
18974 """""""""
18975
18976 Predicated floating-point ``MUL`` reduction of a vector and a scalar starting
18977 value, returning the result as a scalar.
18978
18979
18980 Arguments:
18981 """"""""""
18982
18983 The first operand is the start value of the reduction, which must be a scalar
18984 floating-point type equal to the result type. The second operand is the vector
18985 on which the reduction is performed and must be a vector of floating-point
18986 values whose element type is the result/start type. The third operand is the
18987 vector mask and is a vector of boolean values with the same number of elements
18988 as the vector operand. The fourth operand is the explicit vector length of the
18989 operation.
18990
18991 Semantics:
18992 """"""""""
18993
18994 The '``llvm.vp.reduce.fmul``' intrinsic performs the floating-point ``MUL``
18995 reduction (:ref:`llvm.vector.reduce.fmul <int_vector_reduce_fmul>`) of the
18996 vector operand ``val`` on each enabled lane, multiplying it by the scalar
18997 ``start_value``. Disabled lanes are treated as containing the neutral value
18998 ``1.0`` (i.e. having no effect on the reduction operation). If no lanes are
18999 enabled, the resulting value will be equal to the starting value.
19000
19001 To ignore the start value, the neutral value can be used.
19002
19003 See the unpredicated version (:ref:`llvm.vector.reduce.fmul
19004 <int_vector_reduce_fmul>`) for more detail on the semantics.
19005
19006 Examples:
19007 """""""""
19008
19009 .. code-block:: llvm
19010
19011       %r = call float @llvm.vp.reduce.fmul.v4f32(float %start, <4 x float> %a, <4 x i1> %mask, i32 %evl)
19012       ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
19013       ; are treated as though %mask were false for those lanes.
19014
19015       %masked.a = select <4 x i1> %mask, <4 x float> %a, <4 x float> <float 1.0, float 1.0, float 1.0, float 1.0>
19016       %also.r = call float @llvm.vector.reduce.fmul.v4f32(float %start, <4 x float> %masked.a)
19017
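The start value also lets a reduction be carried across several vector chunks,
for example across loop iterations. A sketch, assuming two hypothetical chunks
``%a0`` and ``%a1``, each with its own mask and explicit vector length:

.. code-block:: llvm

      ;; First chunk: start from the neutral value 1.0.
      %acc0 = call float @llvm.vp.reduce.fmul.v4f32(float 1.0, <4 x float> %a0, <4 x i1> %mask0, i32 %evl0)
      ;; Second chunk: continue from the partial product of the first chunk.
      %acc1 = call float @llvm.vp.reduce.fmul.v4f32(float %acc0, <4 x float> %a1, <4 x i1> %mask1, i32 %evl1)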
19018
19019 .. _int_vp_reduce_and:
19020
19021 '``llvm.vp.reduce.and.*``' Intrinsics
19022 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19023
19024 Syntax:
19025 """""""
19026 This is an overloaded intrinsic.
19027
19028 ::
19029
19030       declare i32 @llvm.vp.reduce.and.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
19031       declare i16 @llvm.vp.reduce.and.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
19032
19033 Overview:
19034 """""""""
19035
19036 Predicated integer ``AND`` reduction of a vector and a scalar starting value,
19037 returning the result as a scalar.
19038
19039
19040 Arguments:
19041 """"""""""
19042
19043 The first operand is the start value of the reduction, which must be a scalar
19044 integer type equal to the result type. The second operand is the vector on
19045 which the reduction is performed and must be a vector of integer values whose
19046 element type is the result/start type. The third operand is the vector mask and
19047 is a vector of boolean values with the same number of elements as the vector
19048 operand. The fourth operand is the explicit vector length of the operation.
19049
19050 Semantics:
19051 """"""""""
19052
19053 The '``llvm.vp.reduce.and``' intrinsic performs the integer ``AND`` reduction
19054 (:ref:`llvm.vector.reduce.and <int_vector_reduce_and>`) of the vector operand
19055 ``val`` on each enabled lane, performing an '``and``' of that with the
19056 scalar ``start_value``. Disabled lanes are treated as containing the neutral
19057 value ``UINT_MAX``, or ``-1`` (i.e. having no effect on the reduction
19058 operation). If the vector length is zero, the result is the start value.
19059
19060 To ignore the start value, the neutral value can be used.
19061
19062 Examples:
19063 """""""""
19064
19065 .. code-block:: llvm
19066
19067       %r = call i32 @llvm.vp.reduce.and.v4i32(i32 %start, <4 x i32> %a, <4 x i1> %mask, i32 %evl)
19068       ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
19069       ; are treated as though %mask were false for those lanes.
19070
19071       %masked.a = select <4 x i1> %mask, <4 x i32> %a, <4 x i32> <i32 -1, i32 -1, i32 -1, i32 -1>
19072       %reduction = call i32 @llvm.vector.reduce.and.v4i32(<4 x i32> %masked.a)
19073       %also.r = and i32 %reduction, %start
19074
19075
19076 .. _int_vp_reduce_or:
19077
19078 '``llvm.vp.reduce.or.*``' Intrinsics
19079 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19080
19081 Syntax:
19082 """""""
19083 This is an overloaded intrinsic.
19084
19085 ::
19086
19087       declare i32 @llvm.vp.reduce.or.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
19088       declare i16 @llvm.vp.reduce.or.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
19089
19090 Overview:
19091 """""""""
19092
19093 Predicated integer ``OR`` reduction of a vector and a scalar starting value,
19094 returning the result as a scalar.
19095
19096
19097 Arguments:
19098 """"""""""
19099
19100 The first operand is the start value of the reduction, which must be a scalar
19101 integer type equal to the result type. The second operand is the vector on
19102 which the reduction is performed and must be a vector of integer values whose
19103 element type is the result/start type. The third operand is the vector mask and
19104 is a vector of boolean values with the same number of elements as the vector
19105 operand. The fourth operand is the explicit vector length of the operation.
19106
19107 Semantics:
19108 """"""""""
19109
19110 The '``llvm.vp.reduce.or``' intrinsic performs the integer ``OR`` reduction
19111 (:ref:`llvm.vector.reduce.or <int_vector_reduce_or>`) of the vector operand
19112 ``val`` on each enabled lane, performing an '``or``' of that with the scalar
19113 ``start_value``. Disabled lanes are treated as containing the neutral value
19114 ``0`` (i.e. having no effect on the reduction operation). If the vector length
19115 is zero, the result is the start value.
19116
19117 To ignore the start value, the neutral value can be used.
19118
19119 Examples:
19120 """""""""
19121
19122 .. code-block:: llvm
19123
19124       %r = call i32 @llvm.vp.reduce.or.v4i32(i32 %start, <4 x i32> %a, <4 x i1> %mask, i32 %evl)
19125       ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
19126       ; are treated as though %mask were false for those lanes.
19127
19128       %masked.a = select <4 x i1> %mask, <4 x i32> %a, <4 x i32> <i32 0, i32 0, i32 0, i32 0>
19129       %reduction = call i32 @llvm.vector.reduce.or.v4i32(<4 x i32> %masked.a)
19130       %also.r = or i32 %reduction, %start
19131
19132 .. _int_vp_reduce_xor:
19133
19134 '``llvm.vp.reduce.xor.*``' Intrinsics
19135 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19136
19137 Syntax:
19138 """""""
19139 This is an overloaded intrinsic.
19140
19141 ::
19142
19143       declare i32 @llvm.vp.reduce.xor.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
19144       declare i16 @llvm.vp.reduce.xor.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
19145
19146 Overview:
19147 """""""""
19148
19149 Predicated integer ``XOR`` reduction of a vector and a scalar starting value,
19150 returning the result as a scalar.
19151
19152
19153 Arguments:
19154 """"""""""
19155
19156 The first operand is the start value of the reduction, which must be a scalar
19157 integer type equal to the result type. The second operand is the vector on
19158 which the reduction is performed and must be a vector of integer values whose
19159 element type is the result/start type. The third operand is the vector mask and
19160 is a vector of boolean values with the same number of elements as the vector
19161 operand. The fourth operand is the explicit vector length of the operation.
19162
19163 Semantics:
19164 """"""""""
19165
19166 The '``llvm.vp.reduce.xor``' intrinsic performs the integer ``XOR`` reduction
19167 (:ref:`llvm.vector.reduce.xor <int_vector_reduce_xor>`) of the vector operand
19168 ``val`` on each enabled lane, performing an '``xor``' of that with the scalar
19169 ``start_value``. Disabled lanes are treated as containing the neutral value
19170 ``0`` (i.e. having no effect on the reduction operation). If the vector length
19171 is zero, the result is the start value.
19172
19173 To ignore the start value, the neutral value can be used.
19174
19175 Examples:
19176 """""""""
19177
19178 .. code-block:: llvm
19179
19180       %r = call i32 @llvm.vp.reduce.xor.v4i32(i32 %start, <4 x i32> %a, <4 x i1> %mask, i32 %evl)
19181       ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
19182       ; are treated as though %mask were false for those lanes.
19183
19184       %masked.a = select <4 x i1> %mask, <4 x i32> %a, <4 x i32> <i32 0, i32 0, i32 0, i32 0>
19185       %reduction = call i32 @llvm.vector.reduce.xor.v4i32(<4 x i32> %masked.a)
19186       %also.r = xor i32 %reduction, %start
19187
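As a concrete illustration of how the mask and the explicit vector length
combine, the following sketch uses constant operands chosen purely for this
example; the function name ``@xor_example`` is hypothetical. The mask disables
lane 2, and a vector length of 3 additionally disables lane 3, so only lanes 0
and 1 contribute.

.. code-block:: llvm

      declare i32 @llvm.vp.reduce.xor.v4i32(i32, <4 x i32>, <4 x i1>, i32)

      define i32 @xor_example() {
        ; Enabled lanes: 0 and 1. Lane 2 is masked off; lane 3 is at or beyond %evl = 3.
        %r = call i32 @llvm.vp.reduce.xor.v4i32(i32 5, <4 x i32> <i32 1, i32 2, i32 4, i32 8>, <4 x i1> <i1 true, i1 true, i1 false, i1 true>, i32 3)
        ; Per the semantics above: 5 xor 1 xor 2 = 6.
        ret i32 %r
      }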
19188
19189 .. _int_vp_reduce_smax:
19190
19191 '``llvm.vp.reduce.smax.*``' Intrinsics
19192 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19193
19194 Syntax:
19195 """""""
19196 This is an overloaded intrinsic.
19197
19198 ::
19199
19200       declare i32 @llvm.vp.reduce.smax.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
19201       declare i16 @llvm.vp.reduce.smax.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
19202
19203 Overview:
19204 """""""""
19205
19206 Predicated signed-integer ``MAX`` reduction of a vector and a scalar starting
19207 value, returning the result as a scalar.
19208
19209
19210 Arguments:
19211 """"""""""
19212
19213 The first operand is the start value of the reduction, which must be a scalar
19214 integer type equal to the result type. The second operand is the vector on
19215 which the reduction is performed and must be a vector of integer values whose
19216 element type is the result/start type. The third operand is the vector mask and
19217 is a vector of boolean values with the same number of elements as the vector
19218 operand. The fourth operand is the explicit vector length of the operation.
19219
19220 Semantics:
19221 """"""""""
19222
19223 The '``llvm.vp.reduce.smax``' intrinsic performs the signed-integer ``MAX``
19224 reduction (:ref:`llvm.vector.reduce.smax <int_vector_reduce_smax>`) of the
vector operand ``val`` on each enabled lane, taking the maximum of that and
19226 the scalar ``start_value``. Disabled lanes are treated as containing the
19227 neutral value ``INT_MIN`` (i.e. having no effect on the reduction operation).
19228 If the vector length is zero, the result is the start value.
19229
19230 To ignore the start value, the neutral value can be used.
19231
19232 Examples:
19233 """""""""
19234
19235 .. code-block:: llvm
19236
19237       %r = call i8 @llvm.vp.reduce.smax.v4i8(i8 %start, <4 x i8> %a, <4 x i1> %mask, i32 %evl)
19238       ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
19239       ; are treated as though %mask were false for those lanes.
19240
19241       %masked.a = select <4 x i1> %mask, <4 x i8> %a, <4 x i8> <i8 -128, i8 -128, i8 -128, i8 -128>
19242       %reduction = call i8 @llvm.vector.reduce.smax.v4i8(<4 x i8> %masked.a)
19243       %also.r = call i8 @llvm.smax.i8(i8 %reduction, i8 %start)
19244
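The choice of ``INT_MIN`` as the neutral value matters when every enabled lane
is negative. The following sketch (constants and the function name
``@smax_example`` are assumptions made for illustration) disables the two
lanes that hold positive values; they must not influence the result.

.. code-block:: llvm

      declare i8 @llvm.vp.reduce.smax.v4i8(i8, <4 x i8>, <4 x i1>, i32)

      define i8 @smax_example() {
        ; Lanes 1 and 2 are enabled and hold -7 and -3. Lanes 0 and 3 are masked off.
        %r = call i8 @llvm.vp.reduce.smax.v4i8(i8 -100, <4 x i8> <i8 50, i8 -7, i8 -3, i8 20>, <4 x i1> <i1 false, i1 true, i1 true, i1 false>, i32 4)
        ; Per the semantics above: smax(-100, -7, -3) = -3.
        ret i8 %r
      }

Had the disabled lanes been treated as ``0`` rather than ``-128``, the result
would have been ``0``, which is not an element of any enabled lane.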
19245
19246 .. _int_vp_reduce_smin:
19247
19248 '``llvm.vp.reduce.smin.*``' Intrinsics
19249 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19250
19251 Syntax:
19252 """""""
19253 This is an overloaded intrinsic.
19254
19255 ::
19256
19257       declare i32 @llvm.vp.reduce.smin.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
19258       declare i16 @llvm.vp.reduce.smin.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
19259
19260 Overview:
19261 """""""""
19262
19263 Predicated signed-integer ``MIN`` reduction of a vector and a scalar starting
19264 value, returning the result as a scalar.
19265
19266
19267 Arguments:
19268 """"""""""
19269
19270 The first operand is the start value of the reduction, which must be a scalar
19271 integer type equal to the result type. The second operand is the vector on
19272 which the reduction is performed and must be a vector of integer values whose
19273 element type is the result/start type. The third operand is the vector mask and
19274 is a vector of boolean values with the same number of elements as the vector
19275 operand. The fourth operand is the explicit vector length of the operation.
19276
19277 Semantics:
19278 """"""""""
19279
19280 The '``llvm.vp.reduce.smin``' intrinsic performs the signed-integer ``MIN``
19281 reduction (:ref:`llvm.vector.reduce.smin <int_vector_reduce_smin>`) of the
vector operand ``val`` on each enabled lane, taking the minimum of that and
19283 the scalar ``start_value``. Disabled lanes are treated as containing the
19284 neutral value ``INT_MAX`` (i.e. having no effect on the reduction operation).
19285 If the vector length is zero, the result is the start value.
19286
19287 To ignore the start value, the neutral value can be used.
19288
19289 Examples:
19290 """""""""
19291
19292 .. code-block:: llvm
19293
19294       %r = call i8 @llvm.vp.reduce.smin.v4i8(i8 %start, <4 x i8> %a, <4 x i1> %mask, i32 %evl)
19295       ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
19296       ; are treated as though %mask were false for those lanes.
19297
19298       %masked.a = select <4 x i1> %mask, <4 x i8> %a, <4 x i8> <i8 127, i8 127, i8 127, i8 127>
19299       %reduction = call i8 @llvm.vector.reduce.smin.v4i8(<4 x i8> %masked.a)
19300       %also.r = call i8 @llvm.smin.i8(i8 %reduction, i8 %start)
19301
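A minimal sketch of ignoring the start value, assuming an ``i8`` element type
so that the neutral value ``INT_MAX`` is ``127``. The constants and the
function name ``@smin_example`` are illustrative only.

.. code-block:: llvm

      declare i8 @llvm.vp.reduce.smin.v4i8(i8, <4 x i8>, <4 x i1>, i32)

      define i8 @smin_example() {
        ; The start value 127 is neutral for an i8 smin, so only the vector lanes matter.
        ; All lanes are enabled by the mask, but %evl = 3 disables lane 3 (value -60).
        %r = call i8 @llvm.vp.reduce.smin.v4i8(i8 127, <4 x i8> <i8 9, i8 -4, i8 30, i8 -60>, <4 x i1> <i1 true, i1 true, i1 true, i1 true>, i32 3)
        ; Per the semantics above: smin(127, 9, -4, 30) = -4.
        ret i8 %r
      }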
19302
19303 .. _int_vp_reduce_umax:
19304
19305 '``llvm.vp.reduce.umax.*``' Intrinsics
19306 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19307
19308 Syntax:
19309 """""""
19310 This is an overloaded intrinsic.
19311
19312 ::
19313
19314       declare i32 @llvm.vp.reduce.umax.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
19315       declare i16 @llvm.vp.reduce.umax.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
19316
19317 Overview:
19318 """""""""
19319
19320 Predicated unsigned-integer ``MAX`` reduction of a vector and a scalar starting
19321 value, returning the result as a scalar.
19322
19323
19324 Arguments:
19325 """"""""""
19326
19327 The first operand is the start value of the reduction, which must be a scalar
19328 integer type equal to the result type. The second operand is the vector on
19329 which the reduction is performed and must be a vector of integer values whose
19330 element type is the result/start type. The third operand is the vector mask and
19331 is a vector of boolean values with the same number of elements as the vector
19332 operand. The fourth operand is the explicit vector length of the operation.
19333
19334 Semantics:
19335 """"""""""
19336
19337 The '``llvm.vp.reduce.umax``' intrinsic performs the unsigned-integer ``MAX``
19338 reduction (:ref:`llvm.vector.reduce.umax <int_vector_reduce_umax>`) of the
vector operand ``val`` on each enabled lane, taking the maximum of that and
19340 the scalar ``start_value``. Disabled lanes are treated as containing the
19341 neutral value ``0`` (i.e. having no effect on the reduction operation). If the
19342 vector length is zero, the result is the start value.
19343
19344 To ignore the start value, the neutral value can be used.
19345
19346 Examples:
19347 """""""""
19348
19349 .. code-block:: llvm
19350
19351       %r = call i32 @llvm.vp.reduce.umax.v4i32(i32 %start, <4 x i32> %a, <4 x i1> %mask, i32 %evl)
19352       ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
19353       ; are treated as though %mask were false for those lanes.
19354
19355       %masked.a = select <4 x i1> %mask, <4 x i32> %a, <4 x i32> <i32 0, i32 0, i32 0, i32 0>
19356       %reduction = call i32 @llvm.vector.reduce.umax.v4i32(<4 x i32> %masked.a)
19357       %also.r = call i32 @llvm.umax.i32(i32 %reduction, i32 %start)
19358
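The zero-length case can be sketched as follows, with constants and the
function name ``@umax_zero_length`` chosen only for illustration. With a
vector length of zero every lane is disabled regardless of the mask, so the
call yields the start value.

.. code-block:: llvm

      declare i32 @llvm.vp.reduce.umax.v4i32(i32, <4 x i32>, <4 x i1>, i32)

      define i32 @umax_zero_length() {
        ; %evl = 0 disables all lanes, even though the mask is all-true.
        %r = call i32 @llvm.vp.reduce.umax.v4i32(i32 42, <4 x i32> <i32 100, i32 200, i32 300, i32 400>, <4 x i1> <i1 true, i1 true, i1 true, i1 true>, i32 0)
        ; Per the semantics above, the result is the start value: 42.
        ret i32 %r
      }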
19359
19360 .. _int_vp_reduce_umin:
19361
19362 '``llvm.vp.reduce.umin.*``' Intrinsics
19363 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19364
19365 Syntax:
19366 """""""
19367 This is an overloaded intrinsic.
19368
19369 ::
19370
19371       declare i32 @llvm.vp.reduce.umin.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
19372       declare i16 @llvm.vp.reduce.umin.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
19373
19374 Overview:
19375 """""""""
19376
19377 Predicated unsigned-integer ``MIN`` reduction of a vector and a scalar starting
19378 value, returning the result as a scalar.
19379
19380
19381 Arguments:
19382 """"""""""
19383
19384 The first operand is the start value of the reduction, which must be a scalar
19385 integer type equal to the result type. The second operand is the vector on
19386 which the reduction is performed and must be a vector of integer values whose
19387 element type is the result/start type. The third operand is the vector mask and
19388 is a vector of boolean values with the same number of elements as the vector
19389 operand. The fourth operand is the explicit vector length of the operation.
19390
19391 Semantics:
19392 """"""""""
19393
19394 The '``llvm.vp.reduce.umin``' intrinsic performs the unsigned-integer ``MIN``
19395 reduction (:ref:`llvm.vector.reduce.umin <int_vector_reduce_umin>`) of the
19396 vector operand ``val`` on each enabled lane, taking the minimum of that and the
19397 scalar ``start_value``. Disabled lanes are treated as containing the neutral
19398 value ``UINT_MAX``, or ``-1`` (i.e. having no effect on the reduction
19399 operation). If the vector length is zero, the result is the start value.
19400
19401 To ignore the start value, the neutral value can be used.
19402
19403 Examples:
19404 """""""""
19405
19406 .. code-block:: llvm
19407
19408       %r = call i32 @llvm.vp.reduce.umin.v4i32(i32 %start, <4 x i32> %a, <4 x i1> %mask, i32 %evl)
19409       ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
19410       ; are treated as though %mask were false for those lanes.
19411
19412       %masked.a = select <4 x i1> %mask, <4 x i32> %a, <4 x i32> <i32 -1, i32 -1, i32 -1, i32 -1>
19413       %reduction = call i32 @llvm.vector.reduce.umin.v4i32(<4 x i32> %masked.a)
19414       %also.r = call i32 @llvm.umin.i32(i32 %reduction, i32 %start)
19415
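Because the comparison is unsigned, an element written as ``-1`` is the
largest possible value and never wins a ``umin``. The sketch below uses
illustrative constants and a hypothetical function name ``@umin_example`` to
show both this and the effect of the mask.

.. code-block:: llvm

      declare i32 @llvm.vp.reduce.umin.v4i32(i32, <4 x i32>, <4 x i1>, i32)

      define i32 @umin_example() {
        ; Lanes 0 and 1 are enabled and hold 4294967295 (written as -1) and 7.
        ; Lanes 2 and 3 hold smaller values but are masked off.
        %r = call i32 @llvm.vp.reduce.umin.v4i32(i32 10, <4 x i32> <i32 -1, i32 7, i32 3, i32 0>, <4 x i1> <i1 true, i1 true, i1 false, i1 false>, i32 4)
        ; Per the semantics above: umin(10, 4294967295, 7) = 7.
        ret i32 %r
      }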
19416
19417 .. _int_vp_reduce_fmax:
19418
19419 '``llvm.vp.reduce.fmax.*``' Intrinsics
19420 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19421
19422 Syntax:
19423 """""""
19424 This is an overloaded intrinsic.
19425
19426 ::
19427
      declare float @llvm.vp.reduce.fmax.v4f32(float <start_value>, <4 x float> <val>, <4 x i1> <mask>, i32 <vector_length>)
19429       declare double @llvm.vp.reduce.fmax.nxv8f64(double <start_value>, <vscale x 8 x double> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
19430
19431 Overview:
19432 """""""""
19433
19434 Predicated floating-point ``MAX`` reduction of a vector and a scalar starting
19435 value, returning the result as a scalar.
19436
19437
19438 Arguments:
19439 """"""""""
19440
19441 The first operand is the start value of the reduction, which must be a scalar
19442 floating-point type equal to the result type. The second operand is the vector
19443 on which the reduction is performed and must be a vector of floating-point
19444 values whose element type is the result/start type. The third operand is the
19445 vector mask and is a vector of boolean values with the same number of elements
19446 as the vector operand. The fourth operand is the explicit vector length of the
19447 operation.
19448
19449 Semantics:
19450 """"""""""
19451
19452 The '``llvm.vp.reduce.fmax``' intrinsic performs the floating-point ``MAX``
19453 reduction (:ref:`llvm.vector.reduce.fmax <int_vector_reduce_fmax>`) of the
19454 vector operand ``val`` on each enabled lane, taking the maximum of that and the
19455 scalar ``start_value``. Disabled lanes are treated as containing the neutral
19456 value (i.e. having no effect on the reduction operation). If the vector length
19457 is zero, the result is the start value.
19458
19459 The neutral value is dependent on the :ref:`fast-math flags <fastmath>`. If no
19460 flags are set, the neutral value is ``-QNAN``. If ``nnan``  and ``ninf`` are
19461 both set, then the neutral value is the smallest floating-point value for the
19462 result type. If only ``nnan`` is set then the neutral value is ``-Infinity``.
19463
19464 This instruction has the same comparison semantics as the
19465 :ref:`llvm.vector.reduce.fmax <int_vector_reduce_fmax>` intrinsic (and thus the
19466 '``llvm.maxnum.*``' intrinsic). That is, the result will always be a number
19467 unless all elements of the vector and the starting value are ``NaN``. For a
19468 vector with maximum element magnitude ``0.0`` and containing both ``+0.0`` and
19469 ``-0.0`` elements, the sign of the result is unspecified.
19470
19471 To ignore the start value, the neutral value can be used.
19472
19473 Examples:
19474 """""""""
19475
19476 .. code-block:: llvm
19477
      %r = call float @llvm.vp.reduce.fmax.v4f32(float %start, <4 x float> %a, <4 x i1> %mask, i32 %evl)
19479       ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
19480       ; are treated as though %mask were false for those lanes.
19481
19482       %masked.a = select <4 x i1> %mask, <4 x float> %a, <4 x float> <float QNAN, float QNAN, float QNAN, float QNAN>
19483       %reduction = call float @llvm.vector.reduce.fmax.v4f32(<4 x float> %masked.a)
19484       %also.r = call float @llvm.maxnum.f32(float %reduction, float %start)
19485
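The ``NaN`` behavior described above can be sketched with constant operands.
The values and the function name ``@fmax_example`` are assumptions made for
illustration, no fast-math flags are set, and ``0x7FF8000000000000`` is the
hexadecimal spelling of a quiet ``NaN``.

.. code-block:: llvm

      declare float @llvm.vp.reduce.fmax.v4f32(float, <4 x float>, <4 x i1>, i32)

      define float @fmax_example() {
        ; Lane 0 holds a quiet NaN and is enabled. Lane 2 (value 9.0) is masked off.
        %r = call float @llvm.vp.reduce.fmax.v4f32(float 1.0, <4 x float> <float 0x7FF8000000000000, float 2.5, float 9.0, float -1.0>, <4 x i1> <i1 true, i1 true, i1 false, i1 true>, i32 4)
        ; With the maxnum-style comparison described above, the NaN lane does not
        ; propagate, so the expected result is the maximum of 1.0, 2.5 and -1.0: 2.5.
        ret float %r
      }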
19486
19487 .. _int_vp_reduce_fmin:
19488
19489 '``llvm.vp.reduce.fmin.*``' Intrinsics
19490 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19491
19492 Syntax:
19493 """""""
19494 This is an overloaded intrinsic.
19495
19496 ::
19497
      declare float @llvm.vp.reduce.fmin.v4f32(float <start_value>, <4 x float> <val>, <4 x i1> <mask>, i32 <vector_length>)
19499       declare double @llvm.vp.reduce.fmin.nxv8f64(double <start_value>, <vscale x 8 x double> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
19500
19501 Overview:
19502 """""""""
19503
19504 Predicated floating-point ``MIN`` reduction of a vector and a scalar starting
19505 value, returning the result as a scalar.
19506
19507
19508 Arguments:
19509 """"""""""
19510
19511 The first operand is the start value of the reduction, which must be a scalar
19512 floating-point type equal to the result type. The second operand is the vector
19513 on which the reduction is performed and must be a vector of floating-point
19514 values whose element type is the result/start type. The third operand is the
19515 vector mask and is a vector of boolean values with the same number of elements
19516 as the vector operand. The fourth operand is the explicit vector length of the
19517 operation.
19518
19519 Semantics:
19520 """"""""""
19521
19522 The '``llvm.vp.reduce.fmin``' intrinsic performs the floating-point ``MIN``
19523 reduction (:ref:`llvm.vector.reduce.fmin <int_vector_reduce_fmin>`) of the
19524 vector operand ``val`` on each enabled lane, taking the minimum of that and the
19525 scalar ``start_value``. Disabled lanes are treated as containing the neutral
19526 value (i.e. having no effect on the reduction operation). If the vector length
19527 is zero, the result is the start value.
19528
19529 The neutral value is dependent on the :ref:`fast-math flags <fastmath>`. If no
19530 flags are set, the neutral value is ``+QNAN``. If ``nnan``  and ``ninf`` are
19531 both set, then the neutral value is the largest floating-point value for the
19532 result type. If only ``nnan`` is set then the neutral value is ``+Infinity``.
19533
19534 This instruction has the same comparison semantics as the
19535 :ref:`llvm.vector.reduce.fmin <int_vector_reduce_fmin>` intrinsic (and thus the
19536 '``llvm.minnum.*``' intrinsic). That is, the result will always be a number
19537 unless all elements of the vector and the starting value are ``NaN``. For a
19538 vector with maximum element magnitude ``0.0`` and containing both ``+0.0`` and
19539 ``-0.0`` elements, the sign of the result is unspecified.
19540
19541 To ignore the start value, the neutral value can be used.
19542
19543 Examples:
19544 """""""""
19545
19546 .. code-block:: llvm
19547
19548       %r = call float @llvm.vp.reduce.fmin.v4f32(float %start, <4 x float> %a, <4 x i1> %mask, i32 %evl)
19549       ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
19550       ; are treated as though %mask were false for those lanes.
19551
19552       %masked.a = select <4 x i1> %mask, <4 x float> %a, <4 x float> <float QNAN, float QNAN, float QNAN, float QNAN>
19553       %reduction = call float @llvm.vector.reduce.fmin.v4f32(<4 x float> %masked.a)
19554       %also.r = call float @llvm.minnum.f32(float %reduction, float %start)
19555
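A minimal sketch of ignoring the start value when no fast-math flags are set,
in which case the neutral value is a quiet ``NaN``. The constants and the
function name ``@fmin_ignore_start`` are illustrative only.

.. code-block:: llvm

      declare float @llvm.vp.reduce.fmin.v4f32(float, <4 x float>, <4 x i1>, i32)

      define float @fmin_ignore_start() {
        ; The start value is a quiet NaN, i.e. the neutral value without fast-math flags.
        ; All lanes are enabled by the mask, but %evl = 2 disables lanes 2 and 3.
        %r = call float @llvm.vp.reduce.fmin.v4f32(float 0x7FF8000000000000, <4 x float> <float 3.0, float -2.0, float 8.0, float -5.0>, <4 x i1> <i1 true, i1 true, i1 true, i1 true>, i32 2)
        ; With the minnum-style comparison described above, the expected result is
        ; the minimum of lanes 0 and 1: -2.0.
        ret float %r
      }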
19556
19557 .. _int_get_active_lane_mask:
19558
19559 '``llvm.get.active.lane.mask.*``' Intrinsics
19560 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19561
19562 Syntax:
19563 """""""
19564 This is an overloaded intrinsic.
19565
19566 ::
19567
19568       declare <4 x i1> @llvm.get.active.lane.mask.v4i1.i32(i32 %base, i32 %n)
19569       declare <8 x i1> @llvm.get.active.lane.mask.v8i1.i64(i64 %base, i64 %n)
19570       declare <16 x i1> @llvm.get.active.lane.mask.v16i1.i64(i64 %base, i64 %n)
19571       declare <vscale x 16 x i1> @llvm.get.active.lane.mask.nxv16i1.i64(i64 %base, i64 %n)
19572
19573
19574 Overview:
19575 """""""""
19576
19577 Create a mask representing active and inactive vector lanes.
19578
19579
19580 Arguments:
19581 """"""""""
19582
19583 Both operands have the same scalar integer type. The result is a vector with
19584 the i1 element type.
19585
19586 Semantics:
19587 """"""""""
19588
19589 The '``llvm.get.active.lane.mask.*``' intrinsics are semantically equivalent
19590 to:
19591
19592 ::
19593
19594       %m[i] = icmp ult (%base + i), %n
19595
19596 where ``%m`` is a vector (mask) of active/inactive lanes with its elements
19597 indexed by ``i``,  and ``%base``, ``%n`` are the two arguments to
``llvm.get.active.lane.mask.*``, ``icmp`` is an integer compare and ``ult``
19599 the unsigned less-than comparison operator.  Overflow cannot occur in
19600 ``(%base + i)`` and its comparison against ``%n`` as it is performed in integer
19601 numbers and not in machine numbers.  If ``%n`` is ``0``, then the result is a
19602 poison value. The above is equivalent to:
19603
19604 ::
19605
19606       %m = @llvm.get.active.lane.mask(%base, %n)
19607
19608 This can, for example, be emitted by the loop vectorizer in which case
19609 ``%base`` is the first element of the vector induction variable (VIV) and
19610 ``%n`` is the loop tripcount. Thus, these intrinsics perform an element-wise
19611 less than comparison of VIV with the loop tripcount, producing a mask of
19612 true/false values representing active/inactive vector lanes, except if the VIV
19613 overflows in which case they return false in the lanes where the VIV overflows.
The arguments are scalar types to accommodate scalable vector types, for which
it is unknown what type a step vector would need in order to enumerate its
lanes without overflow.
19617
19618 This mask ``%m`` can e.g. be used in masked load/store instructions. These
19619 intrinsics provide a hint to the backend. I.e., for a vector loop, the
19620 back-edge taken count of the original scalar loop is explicit as the second
19621 argument.
19622
19623
19624 Examples:
19625 """""""""
19626
19627 .. code-block:: llvm
19628
19629       %active.lane.mask = call <4 x i1> @llvm.get.active.lane.mask.v4i1.i64(i64 %elem0, i64 429)
19630       %wide.masked.load = call <4 x i32> @llvm.masked.load.v4i32.p0v4i32(<4 x i32>* %3, i32 4, <4 x i1> %active.lane.mask, <4 x i32> undef)
19631
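To make the per-lane comparison concrete, the following sketch evaluates the
formula from the semantics section for the final iteration of a loop with a
trip count of 6 and a vector width of 4. The constants and the function name
``@tail_mask`` are assumptions made for illustration.

.. code-block:: llvm

      declare <4 x i1> @llvm.get.active.lane.mask.v4i1.i32(i32, i32)

      define <4 x i1> @tail_mask() {
        ; Lane i is active when (4 + i) is unsigned-less-than 6:
        ;   lane 0: 4 < 6, lane 1: 5 < 6, lane 2: 6 < 6 is false, lane 3: 7 < 6 is false.
        %m = call <4 x i1> @llvm.get.active.lane.mask.v4i1.i32(i32 4, i32 6)
        ; Per the semantics above: <i1 true, i1 true, i1 false, i1 false>.
        ret <4 x i1> %m
      }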
19632
19633 .. _int_experimental_vp_splice:
19634
19635 '``llvm.experimental.vp.splice``' Intrinsic
19636 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19637
19638 Syntax:
19639 """""""
19640 This is an overloaded intrinsic.
19641
19642 ::
19643
19644       declare <2 x double> @llvm.experimental.vp.splice.v2f64(<2 x double> %vec1, <2 x double> %vec2, i32 %imm, <2 x i1> %mask, i32 %evl1, i32 %evl2)
19645       declare <vscale x 4 x i32> @llvm.experimental.vp.splice.nxv4i32(<vscale x 4 x i32> %vec1, <vscale x 4 x i32> %vec2, i32 %imm, <vscale x 4 x i1> %mask, i32 %evl1, i32 %evl2)
19646
19647 Overview:
19648 """""""""
19649
19650 The '``llvm.experimental.vp.splice.*``' intrinsic is the vector length
19651 predicated version of the '``llvm.experimental.vector.splice.*``' intrinsic.
19652
19653 Arguments:
19654 """"""""""
19655
19656 The result and the first two arguments ``vec1`` and ``vec2`` are vectors with
19657 the same type.  The third argument ``imm`` is an immediate signed integer that
19658 indicates the offset index.  The fourth argument ``mask`` is a vector mask and
19659 has the same number of elements as the result.  The last two arguments ``evl1``
19660 and ``evl2`` are unsigned integers indicating the explicit vector lengths of
19661 ``vec1`` and ``vec2`` respectively.  ``imm``, ``evl1`` and ``evl2`` should
19662 respect the following constraints: ``-evl1 <= imm < evl1``, ``0 <= evl1 <= VL``
19663 and ``0 <= evl2 <= VL``, where ``VL`` is the runtime vector factor. If these
19664 constraints are not satisfied the intrinsic has undefined behaviour.
19665
19666 Semantics:
19667 """"""""""
19668
19669 Effectively, this intrinsic concatenates ``vec1[0..evl1-1]`` and
19670 ``vec2[0..evl2-1]`` and creates the result vector by selecting the elements in a
19671 window of size ``evl2``, starting at index ``imm`` (for a positive immediate) of
19672 the concatenated vector. Elements in the result vector beyond ``evl2`` are
19673 ``undef``.  If ``imm`` is negative the starting index is ``evl1 + imm``.  The result
19674 vector of active vector length ``evl2`` contains ``evl1 - imm`` (``-imm`` for
19675 negative ``imm``) elements from indices ``[imm..evl1 - 1]``
19676 (``[evl1 + imm..evl1 -1]`` for negative ``imm``) of ``vec1`` followed by the
19677 first ``evl2 - (evl1 - imm)`` (``evl2 + imm`` for negative ``imm``) elements of
19678 ``vec2``. If ``evl1 - imm`` (``-imm``) >= ``evl2``, only the first ``evl2``
19679 elements are considered and the remaining are ``undef``.  The lanes in the result
19680 vector disabled by ``mask`` are ``undef``.
19681
19682 Examples:
19683 """""""""
19684
19685 .. code-block:: text
19686
 ; Operands are shown as (vec1, vec2, imm, evl1, evl2); the mask operand is omitted.
 llvm.experimental.vp.splice(<A,B,C,D>, <E,F,G,H>, 1, 2, 3)  ==> <B, E, F, undef> ; index
 llvm.experimental.vp.splice(<A,B,C,D>, <E,F,G,H>, -2, 3, 2) ==> <B, C, undef, undef> ; trailing elements
19689
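For reference, a full call including the mask operand might look like the
following sketch. It assumes a ``<4 x i32>`` overload named after the pattern
in the syntax section, an all-true mask, and a hypothetical function name
``@splice_example``; the operand values match the first example above.

.. code-block:: llvm

      declare <4 x i32> @llvm.experimental.vp.splice.v4i32(<4 x i32>, <4 x i32>, i32, <4 x i1>, i32, i32)

      define <4 x i32> @splice_example(<4 x i32> %vec1, <4 x i32> %vec2) {
        ; imm = 1, evl1 = 2, evl2 = 3, all lanes enabled by the mask.
        ; The concatenation is vec1[0..1] followed by vec2[0..2]; the window of
        ; size 3 starts at index 1.
        %r = call <4 x i32> @llvm.experimental.vp.splice.v4i32(<4 x i32> %vec1, <4 x i32> %vec2, i32 1, <4 x i1> <i1 true, i1 true, i1 true, i1 true>, i32 2, i32 3)
        ; Per the semantics above: <vec1[1], vec2[0], vec2[1], undef>.
        ret <4 x i32> %r
      }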
19690
19691 .. _int_vp_load:
19692
19693 '``llvm.vp.load``' Intrinsic
19694 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19695
19696 Syntax:
19697 """""""
19698 This is an overloaded intrinsic.
19699
19700 ::
19701
19702     declare <4 x float> @llvm.vp.load.v4f32.p0v4f32(<4 x float>* %ptr, <4 x i1> %mask, i32 %evl)
19703     declare <vscale x 2 x i16> @llvm.vp.load.nxv2i16.p0nxv2i16(<vscale x 2 x i16>* %ptr, <vscale x 2 x i1> %mask, i32 %evl)
19704     declare <8 x float> @llvm.vp.load.v8f32.p1v8f32(<8 x float> addrspace(1)* %ptr, <8 x i1> %mask, i32 %evl)
19705     declare <vscale x 1 x i64> @llvm.vp.load.nxv1i64.p6nxv1i64(<vscale x 1 x i64> addrspace(6)* %ptr, <vscale x 1 x i1> %mask, i32 %evl)
19706
19707 Overview:
19708 """""""""
19709
19710 The '``llvm.vp.load.*``' intrinsic is the vector length predicated version of
19711 the :ref:`llvm.masked.load <int_mload>` intrinsic.
19712
19713 Arguments:
19714 """"""""""
19715
19716 The first operand is the base pointer for the load. The second operand is a
19717 vector of boolean values with the same number of elements as the return type.
19718 The third is the explicit vector length of the operation. The return type and
19719 underlying type of the base pointer are the same vector types.
19720
19721 The :ref:`align <attr_align>` parameter attribute can be provided for the first
19722 operand.
19723
19724 Semantics:
19725 """"""""""
19726
19727 The '``llvm.vp.load``' intrinsic reads a vector from memory in the same way as
19728 the '``llvm.masked.load``' intrinsic, where the mask is taken from the
19729 combination of the '``mask``' and '``evl``' operands in the usual VP way.
19730 Certain '``llvm.masked.load``' operands do not have corresponding operands in
19731 '``llvm.vp.load``': the '``passthru``' operand is implicitly ``undef``; the
19732 '``alignment``' operand is taken as the ``align`` parameter attribute, if
19733 provided. The default alignment is taken as the ABI alignment of the return
19734 type as specified by the :ref:`datalayout string<langref_datalayout>`.
19735
19736 Examples:
19737 """""""""
19738
19739 .. code-block:: text
19740
19741      %r = call <8 x i8> @llvm.vp.load.v8i8.p0v8i8(<8 x i8>* align 2 %ptr, <8 x i1> %mask, i32 %evl)
19742      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
19743
19744      %also.r = call <8 x i8> @llvm.masked.load.v8i8.p0v8i8(<8 x i8>* %ptr, i32 2, <8 x i1> %mask, <8 x i8> undef)
19745
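The sketch below wraps the intrinsic in a complete function so that the
declaration and the ``align`` attribute placement are visible together. The
function name ``@load_first_n`` and the 16-byte alignment are assumptions made
for illustration, and ``%n`` is assumed to be at most 4.

.. code-block:: llvm

      declare <4 x float> @llvm.vp.load.v4f32.p0v4f32(<4 x float>*, <4 x i1>, i32)

      define <4 x float> @load_first_n(<4 x float>* %ptr, i32 %n) {
        ; The mask is all-true, so only the explicit vector length %n limits which
        ; lanes are read. Lanes at or beyond %n are undef in the result.
        %r = call <4 x float> @llvm.vp.load.v4f32.p0v4f32(<4 x float>* align 16 %ptr, <4 x i1> <i1 true, i1 true, i1 true, i1 true>, i32 %n)
        ret <4 x float> %r
      }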
19746
19747 .. _int_vp_store:
19748
19749 '``llvm.vp.store``' Intrinsic
19750 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19751
19752 Syntax:
19753 """""""
19754 This is an overloaded intrinsic.
19755
19756 ::
19757
19758     declare void @llvm.vp.store.v4f32.p0v4f32(<4 x float> %val, <4 x float>* %ptr, <4 x i1> %mask, i32 %evl)
19759     declare void @llvm.vp.store.nxv2i16.p0nxv2i16(<vscale x 2 x i16> %val, <vscale x 2 x i16>* %ptr, <vscale x 2 x i1> %mask, i32 %evl)
19760     declare void @llvm.vp.store.v8f32.p1v8f32(<8 x float> %val, <8 x float> addrspace(1)* %ptr, <8 x i1> %mask, i32 %evl)
19761     declare void @llvm.vp.store.nxv1i64.p6nxv1i64(<vscale x 1 x i64> %val, <vscale x 1 x i64> addrspace(6)* %ptr, <vscale x 1 x i1> %mask, i32 %evl)
19762
19763 Overview:
19764 """""""""
19765
19766 The '``llvm.vp.store.*``' intrinsic is the vector length predicated version of
19767 the :ref:`llvm.masked.store <int_mstore>` intrinsic.
19768
19769 Arguments:
19770 """"""""""
19771
19772 The first operand is the vector value to be written to memory. The second
19773 operand is the base pointer for the store. It has the same underlying type as
the value operand. The third operand is a vector of boolean values with the
same number of elements as the value operand. The fourth is the explicit vector
length of the operation.
19777
19778 The :ref:`align <attr_align>` parameter attribute can be provided for the
19779 second operand.
19780
19781 Semantics:
19782 """"""""""
19783
The '``llvm.vp.store``' intrinsic writes a vector to memory in the same way as
19785 the '``llvm.masked.store``' intrinsic, where the mask is taken from the
19786 combination of the '``mask``' and '``evl``' operands in the usual VP way. The
19787 alignment of the operation (corresponding to the '``alignment``' operand of
19788 '``llvm.masked.store``') is specified by the ``align`` parameter attribute (see
19789 above). If it is not provided then the ABI alignment of the type of the
19790 '``value``' operand as specified by the :ref:`datalayout
19791 string<langref_datalayout>` is used instead.
19792
19793 Examples:
19794 """""""""
19795
19796 .. code-block:: text
19797
19798      call void @llvm.vp.store.v8i8.p0v8i8(<8 x i8> %val, <8 x i8>* align 4 %ptr, <8 x i1> %mask, i32 %evl)
19799      ;; For all lanes below %evl, the call above is lane-wise equivalent to the call below.
19800
19801      call void @llvm.masked.store.v8i8.p0v8i8(<8 x i8> %val, <8 x i8>* %ptr, i32 4, <8 x i1> %mask)
19802
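The following sketch omits the ``align`` attribute, in which case the ABI
alignment of the value type is used as described above. The function name
``@store_even_lanes`` and the mask constant are illustrative only.

.. code-block:: llvm

      declare void @llvm.vp.store.v4f32.p0v4f32(<4 x float>, <4 x float>*, <4 x i1>, i32)

      define void @store_even_lanes(<4 x float> %val, <4 x float>* %ptr) {
        ; No align attribute is given, so the ABI alignment of <4 x float> applies.
        ; Only lanes 0 and 2 are enabled; the memory for lanes 1 and 3 is not written.
        call void @llvm.vp.store.v4f32.p0v4f32(<4 x float> %val, <4 x float>* %ptr, <4 x i1> <i1 true, i1 false, i1 true, i1 false>, i32 4)
        ret void
      }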
19803
19804 .. _int_vp_gather:
19805
19806 '``llvm.vp.gather``' Intrinsic
19807 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19808
19809 Syntax:
19810 """""""
19811 This is an overloaded intrinsic.
19812
19813 ::
19814
19815     declare <4 x double> @llvm.vp.gather.v4f64.v4p0f64(<4 x double*> %ptrs, <4 x i1> %mask, i32 %evl)
19816     declare <vscale x 2 x i8> @llvm.vp.gather.nxv2i8.nxv2p0i8(<vscale x 2 x i8*> %ptrs, <vscale x 2 x i1> %mask, i32 %evl)
19817     declare <2 x float> @llvm.vp.gather.v2f32.v2p2f32(<2 x float addrspace(2)*> %ptrs, <2 x i1> %mask, i32 %evl)
19818     declare <vscale x 4 x i32> @llvm.vp.gather.nxv4i32.nxv4p4i32(<vscale x 4 x i32 addrspace(4)*> %ptrs, <vscale x 4 x i1> %mask, i32 %evl)
19819
19820 Overview:
19821 """""""""
19822
19823 The '``llvm.vp.gather.*``' intrinsic is the vector length predicated version of
19824 the :ref:`llvm.masked.gather <int_mgather>` intrinsic.
19825
19826 Arguments:
19827 """"""""""
19828
19829 The first operand is a vector of pointers which holds all memory addresses to
19830 read. The second operand is a vector of boolean values with the same number of
19831 elements as the return type. The third is the explicit vector length of the
19832 operation. The return type and underlying type of the vector of pointers are
19833 the same vector types.
19834
19835 The :ref:`align <attr_align>` parameter attribute can be provided for the first
19836 operand.
19837
19838 Semantics:
19839 """"""""""
19840
19841 The '``llvm.vp.gather``' intrinsic reads multiple scalar values from memory in
19842 the same way as the '``llvm.masked.gather``' intrinsic, where the mask is taken
19843 from the combination of the '``mask``' and '``evl``' operands in the usual VP
19844 way. Certain '``llvm.masked.gather``' operands do not have corresponding
19845 operands in '``llvm.vp.gather``': the '``passthru``' operand is implicitly
``undef``; the '``alignment``' operand is taken as the ``align`` parameter
attribute, if provided. The default alignment is taken as the ABI alignment of
the source addresses as specified by the :ref:`datalayout
string<langref_datalayout>`.
19849
19850 Examples:
19851 """""""""
19852
19853 .. code-block:: text
19854
19855      %r = call <8 x i8> @llvm.vp.gather.v8i8.v8p0i8(<8 x i8*>  align 8 %ptrs, <8 x i1> %mask, i32 %evl)
19856      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
19857
19858      %also.r = call <8 x i8> @llvm.masked.gather.v8i8.v8p0i8(<8 x i8*> %ptrs, i32 8, <8 x i1> %mask, <8 x i8> undef)
19859
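One way to form the vector of pointers is a ``getelementptr`` with a vector
index. The sketch below gathers every second ``i32`` starting at a base
address; the function name ``@gather_strided``, the stride, and the 4-byte
alignment are assumptions made for illustration.

.. code-block:: llvm

      declare <4 x i32> @llvm.vp.gather.v4i32.v4p0i32(<4 x i32*>, <4 x i1>, i32)

      define <4 x i32> @gather_strided(i32* %base, <4 x i1> %mask, i32 %evl) {
        ; One address per lane: %base plus 0, 2, 4 and 6 elements.
        %ptrs = getelementptr i32, i32* %base, <4 x i64> <i64 0, i64 2, i64 4, i64 6>
        ; Only addresses in lanes enabled by %mask and below %evl are read.
        %r = call <4 x i32> @llvm.vp.gather.v4i32.v4p0i32(<4 x i32*> align 4 %ptrs, <4 x i1> %mask, i32 %evl)
        ret <4 x i32> %r
      }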
19860
19861 .. _int_vp_scatter:
19862
19863 '``llvm.vp.scatter``' Intrinsic
19864 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19865
19866 Syntax:
19867 """""""
19868 This is an overloaded intrinsic.
19869
19870 ::
19871
19872     declare void @llvm.vp.scatter.v4f64.v4p0f64(<4 x double> %val, <4 x double*> %ptrs, <4 x i1> %mask, i32 %evl)
19873     declare void @llvm.vp.scatter.nxv2i8.nxv2p0i8(<vscale x 2 x i8> %val, <vscale x 2 x i8*> %ptrs, <vscale x 2 x i1> %mask, i32 %evl)
19874     declare void @llvm.vp.scatter.v2f32.v2p2f32(<2 x float> %val, <2 x float addrspace(2)*> %ptrs, <2 x i1> %mask, i32 %evl)
19875     declare void @llvm.vp.scatter.nxv4i32.nxv4p4i32(<vscale x 4 x i32> %val, <vscale x 4 x i32 addrspace(4)*> %ptrs, <vscale x 4 x i1> %mask, i32 %evl)
19876
19877 Overview:
19878 """""""""
19879
19880 The '``llvm.vp.scatter.*``' intrinsic is the vector length predicated version of
19881 the :ref:`llvm.masked.scatter <int_mscatter>` intrinsic.
19882
19883 Arguments:
19884 """"""""""
19885
19886 The first operand is a vector value to be written to memory. The second operand
19887 is a vector of pointers, pointing to where the value elements should be stored.
The third operand is a vector of boolean values with the same number of
elements as the value operand. The fourth is the explicit vector length of the
operation.
19891
19892 The :ref:`align <attr_align>` parameter attribute can be provided for the
19893 second operand.
19894
19895 Semantics:
19896 """"""""""
19897
19898 The '``llvm.vp.scatter``' intrinsic writes multiple scalar values to memory in
19899 the same way as the '``llvm.masked.scatter``' intrinsic, where the mask is
19900 taken from the combination of the '``mask``' and '``evl``' operands in the
19901 usual VP way. The '``alignment``' operand of the '``llvm.masked.scatter``' does
19902 not have a corresponding operand in '``llvm.vp.scatter``': it is instead
19903 provided via the optional ``align`` parameter attribute on the
19904 vector-of-pointers operand. Otherwise it is taken as the ABI alignment of the
19905 destination addresses as specified by the :ref:`datalayout
19906 string<langref_datalayout>`.
19907
19908 Examples:
19909 """""""""
19910
19911 .. code-block:: text
19912
19913      call void @llvm.vp.scatter.v8i8.v8p0i8(<8 x i8> %val, <8 x i8*> align 1 %ptrs, <8 x i1> %mask, i32 %evl)
19914      ;; For all lanes below %evl, the call above is lane-wise equivalent to the call below.
19915
19916      call void @llvm.masked.scatter.v8i8.v8p0i8(<8 x i8> %val, <8 x i8*> %ptrs, i32 1, <8 x i1> %mask)
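
The way '``mask``' and '``evl``' combine can be sketched in C as a scalar
reference loop. This is an illustrative model only, not part of LLVM: the name
``vp_scatter_ref``, the ``bool`` array used for the mask, and the fixed ``i8``
element type are assumptions made for the example. It assumes that a lane is
active only when its mask bit is set and its index is below ``evl``.

.. code-block:: c

    #include <assert.h>
    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    /* Hypothetical scalar model of llvm.vp.scatter on <n x i8> data. */
    void vp_scatter_ref(const uint8_t *val, uint8_t *const *ptrs,
                        const bool *mask, uint32_t evl, size_t n)
    {
        /* Assumption of this sketch: evl never exceeds the lane count. */
        assert(evl <= n);
        for (size_t i = 0; i < n; ++i) {
            /* A lane is active only if its mask bit is set AND i < evl. */
            bool active = mask[i] && i < evl;
            if (active)
                *ptrs[i] = val[i]; /* inactive lanes never touch memory */
        }
    }

With four lanes, a mask of ``{1, 0, 1, 1}`` and ``evl`` equal to 3, only lanes
0 and 2 are stored in this model.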
19917
19918
19919 .. _int_mload_mstore:
19920
19921 Masked Vector Load and Store Intrinsics
19922 ---------------------------------------
19923
19924 LLVM provides intrinsics for predicated vector load and store operations. The predicate is specified by a mask operand, which holds one bit per vector element, switching the associated vector lane on or off. The memory addresses corresponding to the "off" lanes are not accessed. When all bits of the mask are on, the intrinsic is identical to a regular vector load or store. When all bits are off, no memory is accessed.
19925
19926 .. _int_mload:
19927
19928 '``llvm.masked.load.*``' Intrinsics
19929 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19930
19931 Syntax:
19932 """""""
19933 This is an overloaded intrinsic. The loaded data is a vector of any integer, floating-point or pointer data type.
19934
19935 ::
19936
19937       declare <16 x float>  @llvm.masked.load.v16f32.p0v16f32 (<16 x float>* <ptr>, i32 <alignment>, <16 x i1> <mask>, <16 x float> <passthru>)
19938       declare <2 x double>  @llvm.masked.load.v2f64.p0v2f64  (<2 x double>* <ptr>, i32 <alignment>, <2 x i1>  <mask>, <2 x double> <passthru>)
19939       ;; The data is a vector of pointers to double
19940       declare <8 x double*> @llvm.masked.load.v8p0f64.p0v8p0f64    (<8 x double*>* <ptr>, i32 <alignment>, <8 x i1> <mask>, <8 x double*> <passthru>)
19941       ;; The data is a vector of function pointers
19942       declare <8 x i32 ()*> @llvm.masked.load.v8p0f_i32f.p0v8p0f_i32f (<8 x i32 ()*>* <ptr>, i32 <alignment>, <8 x i1> <mask>, <8 x i32 ()*> <passthru>)
19943
19944 Overview:
19945 """""""""
19946
19947 Reads a vector from memory according to the provided mask. The mask holds a bit for each vector lane, and is used to prevent memory accesses to the masked-off lanes. The masked-off lanes in the result vector are taken from the corresponding lanes of the '``passthru``' operand.
19948
19949
19950 Arguments:
19951 """"""""""
19952
19953 The first operand is the base pointer for the load. The second operand is the alignment of the source location. It must be a power of two constant integer value. The third operand, mask, is a vector of boolean values with the same number of elements as the return type. The fourth is a pass-through value that is used to fill the masked-off lanes of the result. The return type, underlying type of the base pointer and the type of the '``passthru``' operand are the same vector types.
19954
19955 Semantics:
19956 """"""""""
19957
19958 The '``llvm.masked.load``' intrinsic is designed for conditional reading of selected vector elements in a single IR operation. It is useful for targets that support vector masked loads and allows vectorizing predicated basic blocks on these targets. Other targets may support this intrinsic differently, for example by lowering it into a sequence of branches that guard scalar load operations.
19959 The result of this operation is equivalent to a regular vector load instruction followed by a 'select' between the loaded and the passthru values, predicated on the same mask. However, using this intrinsic prevents exceptions on memory access to masked-off lanes.
19960
19961
19962 ::
19963
       %res = call <16 x float> @llvm.masked.load.v16f32.p0v16f32 (<16 x float>* %ptr, i32 4, <16 x i1> %mask, <16 x float> %passthru)

       ;; The two following instructions produce an identical result, aside from potential memory access exceptions
       %loaded = load <16 x float>, <16 x float>* %ptr, align 4
       %res = select <16 x i1> %mask, <16 x float> %loaded, <16 x float> %passthru
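
The lane-wise behavior can also be sketched as a scalar reference loop in C.
This is a minimal model under stated assumptions, not the way any target
lowers the intrinsic: the name ``masked_load_ref``, the ``bool`` mask array,
and the ``float`` element type are chosen for the example only.

.. code-block:: c

    #include <assert.h>
    #include <stdbool.h>
    #include <stddef.h>

    /* Hypothetical scalar model of llvm.masked.load on <n x float>. */
    void masked_load_ref(const float *ptr, const bool *mask,
                         const float *passthru, float *result, size_t n)
    {
        assert(mask && passthru && result);
        for (size_t i = 0; i < n; ++i) {
            /* ptr[i] is read only when the lane is enabled, so a
             * masked-off lane cannot fault in this model. */
            result[i] = mask[i] ? ptr[i] : passthru[i];
        }
    }

Unlike the ``load`` plus ``select`` sequence, this loop never dereferences the
address of a masked-off lane, which is the property the intrinsic provides.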
19969
19970 .. _int_mstore:
19971
19972 '``llvm.masked.store.*``' Intrinsics
19973 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19974
19975 Syntax:
19976 """""""
19977 This is an overloaded intrinsic. The data stored in memory is a vector of any integer, floating-point or pointer data type.
19978
19979 ::
19980
19981        declare void @llvm.masked.store.v8i32.p0v8i32  (<8  x i32>   <value>, <8  x i32>*   <ptr>, i32 <alignment>,  <8  x i1> <mask>)
19982        declare void @llvm.masked.store.v16f32.p0v16f32 (<16 x float> <value>, <16 x float>* <ptr>, i32 <alignment>,  <16 x i1> <mask>)
19983        ;; The data is a vector of pointers to double
19984        declare void @llvm.masked.store.v8p0f64.p0v8p0f64    (<8 x double*> <value>, <8 x double*>* <ptr>, i32 <alignment>, <8 x i1> <mask>)
19985        ;; The data is a vector of function pointers
19986        declare void @llvm.masked.store.v4p0f_i32f.p0v4p0f_i32f (<4 x i32 ()*> <value>, <4 x i32 ()*>* <ptr>, i32 <alignment>, <4 x i1> <mask>)
19987
19988 Overview:
19989 """""""""
19990
19991 Writes a vector to memory according to the provided mask. The mask holds a bit for each vector lane, and is used to prevent memory accesses to the masked-off lanes.
19992
19993 Arguments:
19994 """"""""""
19995
The first operand is the vector value to be written to memory. The second operand is the base pointer for the store; it has the same underlying type as the value operand. The third operand is the alignment of the destination location. It must be a power of two constant integer value. The fourth operand, mask, is a vector of boolean values. The types of the mask and the value operand must have the same number of vector elements.
19997
19998
19999 Semantics:
20000 """"""""""
20001
The '``llvm.masked.store``' intrinsic is designed for conditional writing of selected vector elements in a single IR operation. It is useful for targets that support vector masked stores and allows vectorizing predicated basic blocks on these targets. Other targets may support this intrinsic differently, for example by lowering it into a sequence of branches that guard scalar store operations.
20003 The result of this operation is equivalent to a load-modify-store sequence. However, using this intrinsic prevents exceptions and data races on memory access to masked-off lanes.
20004
20005 ::
20006
20007        call void @llvm.masked.store.v16f32.p0v16f32(<16 x float> %value, <16 x float>* %ptr, i32 4,  <16 x i1> %mask)
20008
20009        ;; The result of the following instructions is identical aside from potential data races and memory access exceptions
20010        %oldval = load <16 x float>, <16 x float>* %ptr, align 4
20011        %res = select <16 x i1> %mask, <16 x float> %value, <16 x float> %oldval
20012        store <16 x float> %res, <16 x float>* %ptr, align 4
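
As a rough illustration, the same behavior can be written as a scalar loop in
C. The name ``masked_store_ref``, the ``bool`` mask array, and the ``float``
element type are assumptions of this sketch rather than anything defined by
LLVM.

.. code-block:: c

    #include <assert.h>
    #include <stdbool.h>
    #include <stddef.h>

    /* Hypothetical scalar model of llvm.masked.store on <n x float>. */
    void masked_store_ref(const float *value, float *ptr,
                          const bool *mask, size_t n)
    {
        assert(value && mask);
        for (size_t i = 0; i < n; ++i) {
            if (mask[i])
                ptr[i] = value[i]; /* masked-off lanes: no read, no write */
        }
    }

Because masked-off lanes are neither read nor written here, the model has no
counterpart to the data race that the load-modify-store sequence can
introduce.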
20013
20014
20015 Masked Vector Gather and Scatter Intrinsics
20016 -------------------------------------------
20017
20018 LLVM provides intrinsics for vector gather and scatter operations. They are similar to :ref:`Masked Vector Load and Store <int_mload_mstore>`, except they are designed for arbitrary memory accesses, rather than sequential memory accesses. Gather and scatter also employ a mask operand, which holds one bit per vector element, switching the associated vector lane on or off. The memory addresses corresponding to the "off" lanes are not accessed. When all bits are off, no memory is accessed.
20019
20020 .. _int_mgather:
20021
20022 '``llvm.masked.gather.*``' Intrinsics
20023 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20024
20025 Syntax:
20026 """""""
20027 This is an overloaded intrinsic. The loaded data are multiple scalar values of any integer, floating-point or pointer data type gathered together into one vector.
20028
20029 ::
20030
20031       declare <16 x float> @llvm.masked.gather.v16f32.v16p0f32   (<16 x float*> <ptrs>, i32 <alignment>, <16 x i1> <mask>, <16 x float> <passthru>)
20032       declare <2 x double> @llvm.masked.gather.v2f64.v2p1f64     (<2 x double addrspace(1)*> <ptrs>, i32 <alignment>, <2 x i1>  <mask>, <2 x double> <passthru>)
20033       declare <8 x float*> @llvm.masked.gather.v8p0f32.v8p0p0f32 (<8 x float**> <ptrs>, i32 <alignment>, <8 x i1>  <mask>, <8 x float*> <passthru>)
20034
20035 Overview:
20036 """""""""
20037
20038 Reads scalar values from arbitrary memory locations and gathers them into one vector. The memory locations are provided in the vector of pointers '``ptrs``'. The memory is accessed according to the provided mask. The mask holds a bit for each vector lane, and is used to prevent memory accesses to the masked-off lanes. The masked-off lanes in the result vector are taken from the corresponding lanes of the '``passthru``' operand.
20039
20040
20041 Arguments:
20042 """"""""""
20043
The first operand is a vector of pointers which holds all memory addresses to read. The second operand is the alignment of the source addresses. It must be 0 or a power of two constant integer value. The third operand, mask, is a vector of boolean values with the same number of elements as the return type. The fourth is a pass-through value that is used to fill the masked-off lanes of the result. The return type, underlying type of the vector of pointers and the type of the '``passthru``' operand are the same vector types.
20045
20046 Semantics:
20047 """"""""""
20048
20049 The '``llvm.masked.gather``' intrinsic is designed for conditional reading of multiple scalar values from arbitrary memory locations in a single IR operation. It is useful for targets that support vector masked gathers and allows vectorizing basic blocks with data and control divergence. Other targets may support this intrinsic differently, for example by lowering it into a sequence of scalar load operations.
The semantics of this operation are equivalent to a sequence of conditional scalar loads followed by gathering all loaded values into a single vector. The mask restricts memory access to certain lanes and facilitates vectorization of predicated basic blocks.
20051
20052
20053 ::
20054
20055        %res = call <4 x double> @llvm.masked.gather.v4f64.v4p0f64 (<4 x double*> %ptrs, i32 8, <4 x i1> <i1 true, i1 true, i1 true, i1 true>, <4 x double> undef)
20056
20057        ;; The gather with all-true mask is equivalent to the following instruction sequence
20058        %ptr0 = extractelement <4 x double*> %ptrs, i32 0
20059        %ptr1 = extractelement <4 x double*> %ptrs, i32 1
20060        %ptr2 = extractelement <4 x double*> %ptrs, i32 2
20061        %ptr3 = extractelement <4 x double*> %ptrs, i32 3
20062
20063        %val0 = load double, double* %ptr0, align 8
20064        %val1 = load double, double* %ptr1, align 8
20065        %val2 = load double, double* %ptr2, align 8
20066        %val3 = load double, double* %ptr3, align 8
20067
       %vec0    = insertelement <4 x double> undef, double %val0, i32 0
       %vec01   = insertelement <4 x double> %vec0, double %val1, i32 1
       %vec012  = insertelement <4 x double> %vec01, double %val2, i32 2
       %vec0123 = insertelement <4 x double> %vec012, double %val3, i32 3
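
The example above covers only the all-true mask. A scalar sketch in C of the
general, masked case might look as follows. The name ``masked_gather_ref``,
the ``bool`` mask array, and the ``double`` element type are assumptions made
for illustration.

.. code-block:: c

    #include <assert.h>
    #include <stdbool.h>
    #include <stddef.h>

    /* Hypothetical scalar model of llvm.masked.gather on <n x double>. */
    void masked_gather_ref(const double *const *ptrs, const bool *mask,
                           const double *passthru, double *result, size_t n)
    {
        assert(ptrs && mask && passthru && result);
        for (size_t i = 0; i < n; ++i) {
            /* Each enabled lane loads from its own address; a disabled
             * lane takes its passthru value and ptrs[i] is not read. */
            result[i] = mask[i] ? *ptrs[i] : passthru[i];
        }
    }

In this model the pointer of a masked-off lane may even be null, since it is
never dereferenced.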
20072
20073 .. _int_mscatter:
20074
20075 '``llvm.masked.scatter.*``' Intrinsics
20076 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20077
20078 Syntax:
20079 """""""
20080 This is an overloaded intrinsic. The data stored in memory is a vector of any integer, floating-point or pointer data type. Each vector element is stored in an arbitrary memory address. Scatter with overlapping addresses is guaranteed to be ordered from least-significant to most-significant element.
20081
20082 ::
20083
20084        declare void @llvm.masked.scatter.v8i32.v8p0i32     (<8 x i32>     <value>, <8 x i32*>     <ptrs>, i32 <alignment>, <8 x i1>  <mask>)
20085        declare void @llvm.masked.scatter.v16f32.v16p1f32   (<16 x float>  <value>, <16 x float addrspace(1)*>  <ptrs>, i32 <alignment>, <16 x i1> <mask>)
20086        declare void @llvm.masked.scatter.v4p0f64.v4p0p0f64 (<4 x double*> <value>, <4 x double**> <ptrs>, i32 <alignment>, <4 x i1>  <mask>)
20087
20088 Overview:
20089 """""""""
20090
20091 Writes each element from the value vector to the corresponding memory address. The memory addresses are represented as a vector of pointers. Writing is done according to the provided mask. The mask holds a bit for each vector lane, and is used to prevent memory accesses to the masked-off lanes.
20092
20093 Arguments:
20094 """"""""""
20095
The first operand is a vector value to be written to memory. The second operand is a vector of pointers, pointing to where the value elements should be stored. It has the same underlying type as the value operand. The third operand is the alignment of the destination addresses. It must be 0 or a power of two constant integer value. The fourth operand, mask, is a vector of boolean values. The types of the mask and the value operand must have the same number of vector elements.
20097
20098 Semantics:
20099 """"""""""
20100
The '``llvm.masked.scatter``' intrinsic is designed for writing selected vector elements to arbitrary memory addresses in a single IR operation. The operation may be conditional, when not all bits in the mask are switched on. It is useful for targets that support vector masked scatter and allows vectorizing basic blocks with data and control divergence. Other targets may support this intrinsic differently, for example by lowering it into a sequence of branches that guard scalar store operations.
20102
20103 ::
20104
       ;; This instruction unconditionally stores the data vector to multiple addresses
       call void @llvm.masked.scatter.v8i32.v8p0i32 (<8 x i32> %value, <8 x i32*> %ptrs, i32 4,  <8 x i1>  <true, true, .. true>)
20107
20108        ;; It is equivalent to a list of scalar stores
20109        %val0 = extractelement <8 x i32> %value, i32 0
20110        %val1 = extractelement <8 x i32> %value, i32 1
20111        ..
20112        %val7 = extractelement <8 x i32> %value, i32 7
20113        %ptr0 = extractelement <8 x i32*> %ptrs, i32 0
20114        %ptr1 = extractelement <8 x i32*> %ptrs, i32 1
20115        ..
20116        %ptr7 = extractelement <8 x i32*> %ptrs, i32 7
20117        ;; Note: the order of the following stores is important when they overlap:
20118        store i32 %val0, i32* %ptr0, align 4
20119        store i32 %val1, i32* %ptr1, align 4
20120        ..
20121        store i32 %val7, i32* %ptr7, align 4
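
The masked case, including the ordering guarantee for overlapping addresses,
can be sketched as a scalar loop in C. The name ``masked_scatter_ref``, the
``bool`` mask array, and the ``i32`` element type are assumptions of this
example.

.. code-block:: c

    #include <assert.h>
    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    /* Hypothetical scalar model of llvm.masked.scatter on <n x i32>. */
    void masked_scatter_ref(const int32_t *value, int32_t *const *ptrs,
                            const bool *mask, size_t n)
    {
        assert(value && ptrs && mask);
        /* Lanes are visited from the lowest index upward, so when two
         * enabled lanes share an address the higher lane's value is the
         * one left in memory. */
        for (size_t i = 0; i < n; ++i) {
            if (mask[i])
                *ptrs[i] = value[i];
        }
    }

If lanes 0 and 2 both point at the same location and both are enabled, the
location ends up holding the value of lane 2.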
20122
20123
20124 Masked Vector Expanding Load and Compressing Store Intrinsics
20125 -------------------------------------------------------------
20126
LLVM provides intrinsics for expanding load and compressing store operations. Data selected from a vector according to a mask is stored in consecutive memory addresses (compressing store), and vice versa (expanding load). These operations effectively map to "if (cond.i) a[j++] = v.i" and "if (cond.i) v.i = a[j++]" patterns, respectively. Note that when the mask starts with '1' bits followed by '0' bits, these operations are identical to :ref:`llvm.masked.store <int_mstore>` and :ref:`llvm.masked.load <int_mload>`.
20128
20129 .. _int_expandload:
20130
20131 '``llvm.masked.expandload.*``' Intrinsics
20132 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20133
20134 Syntax:
20135 """""""
20136 This is an overloaded intrinsic. Several values of integer, floating point or pointer data type are loaded from consecutive memory addresses and stored into the elements of a vector according to the mask.
20137
20138 ::
20139
20140       declare <16 x float>  @llvm.masked.expandload.v16f32 (float* <ptr>, <16 x i1> <mask>, <16 x float> <passthru>)
20141       declare <2 x i64>     @llvm.masked.expandload.v2i64 (i64* <ptr>, <2 x i1>  <mask>, <2 x i64> <passthru>)
20142
20143 Overview:
20144 """""""""
20145
20146 Reads a number of scalar values sequentially from memory location provided in '``ptr``' and spreads them in a vector. The '``mask``' holds a bit for each vector lane. The number of elements read from memory is equal to the number of '1' bits in the mask. The loaded elements are positioned in the destination vector according to the sequence of '1' and '0' bits in the mask. E.g., if the mask vector is '10010001', "expandload" reads 3 values from memory addresses ptr, ptr+1, ptr+2 and places them in lanes 0, 3 and 7 accordingly. The masked-off lanes are filled by elements from the corresponding lanes of the '``passthru``' operand.
20147
20148
20149 Arguments:
20150 """"""""""
20151
20152 The first operand is the base pointer for the load. It has the same underlying type as the element of the returned vector. The second operand, mask, is a vector of boolean values with the same number of elements as the return type. The third is a pass-through value that is used to fill the masked-off lanes of the result. The return type and the type of the '``passthru``' operand have the same vector type.
20153
20154 Semantics:
20155 """"""""""
20156
The '``llvm.masked.expandload``' intrinsic is designed for reading multiple scalar values from adjacent memory addresses into possibly non-adjacent vector lanes. It is useful for targets that support vector expanding loads and allows vectorizing loops with cross-iteration dependencies, as in the following example:
20158
20159 .. code-block:: c
20160
20161     // In this loop we load from B and spread the elements into array A.
    double *A, *B; int *C;
    int j = 0;
20163     for (int i = 0; i < size; ++i) {
20164       if (C[i] != 0)
20165         A[i] = B[j++];
20166     }
20167
20168
20169 .. code-block:: llvm
20170
20171     ; Load several elements from array B and expand them in a vector.
20172     ; The number of loaded elements is equal to the number of '1' elements in the Mask.
20173     %Tmp = call <8 x double> @llvm.masked.expandload.v8f64(double* %Bptr, <8 x i1> %Mask, <8 x double> undef)
20174     ; Store the result in A
20175     call void @llvm.masked.store.v8f64.p0v8f64(<8 x double> %Tmp, <8 x double>* %Aptr, i32 8, <8 x i1> %Mask)
20176
20177     ; %Bptr should be increased on each iteration according to the number of '1' elements in the Mask.
20178     %MaskI = bitcast <8 x i1> %Mask to i8
20179     %MaskIPopcnt = call i8 @llvm.ctpop.i8(i8 %MaskI)
20180     %MaskI64 = zext i8 %MaskIPopcnt to i64
20181     %BNextInd = add i64 %BInd, %MaskI64
20182
20183
20184 Other targets may support this intrinsic differently, for example, by lowering it into a sequence of conditional scalar load operations and shuffles.
20185 If all mask elements are '1', the intrinsic behavior is equivalent to the regular unmasked vector load.
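
The lane placement described above can be sketched as a scalar reference loop
in C. The name ``expandload_ref``, the ``bool`` mask array, the ``float``
element type, and the returned element count are assumptions of this sketch;
the intrinsic itself returns only the vector.

.. code-block:: c

    #include <assert.h>
    #include <stdbool.h>
    #include <stddef.h>

    /* Hypothetical scalar model of llvm.masked.expandload on <n x float>.
     * Returns the number of elements consumed from memory, which equals
     * the number of enabled mask lanes. */
    size_t expandload_ref(const float *ptr, const bool *mask,
                          const float *passthru, float *result, size_t n)
    {
        assert(mask && passthru && result);
        size_t j = 0; /* index of the next consecutive element in memory */
        for (size_t i = 0; i < n; ++i) {
            if (mask[i])
                result[i] = ptr[j++];     /* enabled lane: next element */
            else
                result[i] = passthru[i];  /* disabled lane: passthru */
        }
        return j;
    }

For the mask '10010001' from the overview, this model reads three elements
and places them in lanes 0, 3 and 7. The returned count corresponds to the
``ctpop`` value used to advance the pointer in the IR example.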
20186
20187 .. _int_compressstore:
20188
20189 '``llvm.masked.compressstore.*``' Intrinsics
20190 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20191
20192 Syntax:
20193 """""""
20194 This is an overloaded intrinsic. A number of scalar values of integer, floating point or pointer data type are collected from an input vector and stored into adjacent memory addresses. A mask defines which elements to collect from the vector.
20195
20196 ::
20197
20198       declare void @llvm.masked.compressstore.v8i32  (<8  x i32>   <value>, i32*   <ptr>, <8  x i1> <mask>)
20199       declare void @llvm.masked.compressstore.v16f32 (<16 x float> <value>, float* <ptr>, <16 x i1> <mask>)
20200
20201 Overview:
20202 """""""""
20203
Selects elements from the input vector '``value``' according to the '``mask``'. All selected elements are written into adjacent memory addresses starting at address '``ptr``', from lower to higher. The mask holds a bit for each vector lane, and is used to select elements to be stored. The number of elements to be stored is equal to the number of active bits in the mask.
20205
20206 Arguments:
20207 """"""""""
20208
The first operand is the input vector, from which elements are collected and written to memory. The second operand is the base pointer for the store; it has the same underlying type as the element of the input vector operand. The third operand is the mask, a vector of boolean values. The mask and the input vector must have the same number of vector elements.
20210
20211
20212 Semantics:
20213 """"""""""
20214
The '``llvm.masked.compressstore``' intrinsic is designed for compressing data in memory. It allows collecting elements from possibly non-adjacent lanes of a vector and storing them contiguously in memory in one IR operation. It is useful for targets that support compressing store operations and allows vectorizing loops with cross-iteration dependences, as in the following example:
20216
20217 .. code-block:: c
20218
20219     // In this loop we load elements from A and store them consecutively in B
    double *A, *B; int *C;
    int j = 0;
    for (int i = 0; i < size; ++i) {
      if (C[i] != 0)
        B[j++] = A[i];
20224     }
20225
20226
20227 .. code-block:: llvm
20228
20229     ; Load elements from A.
20230     %Tmp = call <8 x double> @llvm.masked.load.v8f64.p0v8f64(<8 x double>* %Aptr, i32 8, <8 x i1> %Mask, <8 x double> undef)
20231     ; Store all selected elements consecutively in array B
    call void @llvm.masked.compressstore.v8f64(<8 x double> %Tmp, double* %Bptr, <8 x i1> %Mask)
20233
20234     ; %Bptr should be increased on each iteration according to the number of '1' elements in the Mask.
20235     %MaskI = bitcast <8 x i1> %Mask to i8
20236     %MaskIPopcnt = call i8 @llvm.ctpop.i8(i8 %MaskI)
20237     %MaskI64 = zext i8 %MaskIPopcnt to i64
20238     %BNextInd = add i64 %BInd, %MaskI64
20239
20240
20241 Other targets may support this intrinsic differently, for example, by lowering it into a sequence of branches that guard scalar store operations.
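
A scalar sketch of the compressing store in C might look as follows. The name
``compressstore_ref``, the ``bool`` mask array, the ``float`` element type,
and the returned element count are assumptions made for illustration; the
intrinsic itself returns ``void``.

.. code-block:: c

    #include <assert.h>
    #include <stdbool.h>
    #include <stddef.h>

    /* Hypothetical scalar model of llvm.masked.compressstore on
     * <n x float>. Returns the number of elements written, which equals
     * the number of enabled mask lanes. */
    size_t compressstore_ref(const float *value, float *ptr,
                             const bool *mask, size_t n)
    {
        assert(value && mask);
        size_t j = 0; /* index of the next consecutive slot in memory */
        for (size_t i = 0; i < n; ++i) {
            if (mask[i])
                ptr[j++] = value[i]; /* selected lanes are packed together */
        }
        return j;
    }

Memory beyond the last written slot is left untouched, so only as many
elements as there are enabled lanes need to be writable at ``ptr``.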
20242
20243
20244 Memory Use Markers
20245 ------------------
20246
20247 This class of intrinsics provides information about the
20248 :ref:`lifetime of memory objects <objectlifetime>` and ranges where variables
20249 are immutable.
20250
20251 .. _int_lifestart:
20252
20253 '``llvm.lifetime.start``' Intrinsic
20254 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20255
20256 Syntax:
20257 """""""
20258
20259 ::
20260
20261       declare void @llvm.lifetime.start(i64 <size>, i8* nocapture <ptr>)
20262
20263 Overview:
20264 """""""""
20265
20266 The '``llvm.lifetime.start``' intrinsic specifies the start of a memory
20267 object's lifetime.
20268
20269 Arguments:
20270 """"""""""
20271
20272 The first argument is a constant integer representing the size of the
20273 object, or -1 if it is variable sized. The second argument is a pointer
20274 to the object.
20275
20276 Semantics:
20277 """"""""""
20278
20279 If ``ptr`` is a stack-allocated object and it points to the first byte of
20280 the object, the object is initially marked as dead.
20281 ``ptr`` is conservatively considered as a non-stack-allocated object if
20282 the stack coloring algorithm that is used in the optimization pipeline cannot
20283 conclude that ``ptr`` is a stack-allocated object.
20284
After '``llvm.lifetime.start``', the stack object that ``ptr`` points to is
marked as alive and has an uninitialized value.
20287 The stack object is marked as dead when either
20288 :ref:`llvm.lifetime.end <int_lifeend>` to the alloca is executed or the
20289 function returns.
20290
20291 After :ref:`llvm.lifetime.end <int_lifeend>` is called,
20292 '``llvm.lifetime.start``' on the stack object can be called again.
20293 The second '``llvm.lifetime.start``' call marks the object as alive, but it
20294 does not change the address of the object.
20295
If ``ptr`` is a non-stack-allocated object, does not point to the first byte of
the object, or is a stack object that is already alive, the intrinsic simply
fills all bytes of the object with ``poison``.
20299
20300
20301 .. _int_lifeend:
20302
20303 '``llvm.lifetime.end``' Intrinsic
20304 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20305
20306 Syntax:
20307 """""""
20308
20309 ::
20310
20311       declare void @llvm.lifetime.end(i64 <size>, i8* nocapture <ptr>)
20312
20313 Overview:
20314 """""""""
20315
20316 The '``llvm.lifetime.end``' intrinsic specifies the end of a memory object's
20317 lifetime.
20318
20319 Arguments:
20320 """"""""""
20321
20322 The first argument is a constant integer representing the size of the
20323 object, or -1 if it is variable sized. The second argument is a pointer
20324 to the object.
20325
20326 Semantics:
20327 """"""""""
20328
20329 If ``ptr`` is a stack-allocated object and it points to the first byte of the
20330 object, the object is dead.
20331 ``ptr`` is conservatively considered as a non-stack-allocated object if
20332 the stack coloring algorithm that is used in the optimization pipeline cannot
20333 conclude that ``ptr`` is a stack-allocated object.
20334
Calling ``llvm.lifetime.end`` on an already dead alloca is a no-op.
20336
20337 If ``ptr`` is a non-stack-allocated object or it does not point to the first
20338 byte of the object, it is equivalent to simply filling all bytes of the object
20339 with ``poison``.
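
The state transitions that the two lifetime markers describe for a single
stack object can be summarized with a small C model. This is only a sketch for
reasoning about the rules above. The types and function names are
hypothetical, and the model covers only a stack object whose ``ptr`` points to
its first byte.

.. code-block:: c

    #include <assert.h>
    #include <stdbool.h>

    /* Hypothetical model of one stack object tracked by the markers. */
    enum lifetime_state { OBJ_DEAD, OBJ_ALIVE };

    struct stack_object {
        enum lifetime_state state;
        bool contents_poison; /* true once all bytes were set to poison */
    };

    void model_lifetime_start(struct stack_object *obj)
    {
        assert(obj);
        if (obj->state == OBJ_ALIVE) {
            /* Start on an already-alive object only poisons its bytes. */
            obj->contents_poison = true;
            return;
        }
        obj->state = OBJ_ALIVE;       /* alive, contents uninitialized */
        obj->contents_poison = false;
    }

    void model_lifetime_end(struct stack_object *obj)
    {
        assert(obj);
        obj->state = OBJ_DEAD; /* has no effect if the object is dead */
    }

Starting from a dead object, a start marker makes it alive, a second start
marker poisons its contents, and any number of end markers leave it dead.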
20340
20341
20342 '``llvm.invariant.start``' Intrinsic
20343 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20344
20345 Syntax:
20346 """""""
20347 This is an overloaded intrinsic. The memory object can belong to any address space.
20348
20349 ::
20350
20351       declare {}* @llvm.invariant.start.p0i8(i64 <size>, i8* nocapture <ptr>)
20352
20353 Overview:
20354 """""""""
20355
20356 The '``llvm.invariant.start``' intrinsic specifies that the contents of
20357 a memory object will not change.
20358
20359 Arguments:
20360 """"""""""
20361
20362 The first argument is a constant integer representing the size of the
20363 object, or -1 if it is variable sized. The second argument is a pointer
20364 to the object.
20365
20366 Semantics:
20367 """"""""""
20368
20369 This intrinsic indicates that until an ``llvm.invariant.end`` that uses
20370 the return value, the referenced memory location is constant and
20371 unchanging.
20372
20373 '``llvm.invariant.end``' Intrinsic
20374 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20375
20376 Syntax:
20377 """""""
20378 This is an overloaded intrinsic. The memory object can belong to any address space.
20379
20380 ::
20381
20382       declare void @llvm.invariant.end.p0i8({}* <start>, i64 <size>, i8* nocapture <ptr>)
20383
20384 Overview:
20385 """""""""
20386
20387 The '``llvm.invariant.end``' intrinsic specifies that the contents of a
20388 memory object are mutable.
20389
20390 Arguments:
20391 """"""""""
20392
20393 The first argument is the matching ``llvm.invariant.start`` intrinsic.
20394 The second argument is a constant integer representing the size of the
object, or -1 if it is variable sized. The third argument is a
20396 pointer to the object.
20397
20398 Semantics:
20399 """"""""""
20400
20401 This intrinsic indicates that the memory is mutable again.
20402
20403 '``llvm.launder.invariant.group``' Intrinsic
20404 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20405
20406 Syntax:
20407 """""""
20408 This is an overloaded intrinsic. The memory object can belong to any address
20409 space. The returned pointer must belong to the same address space as the
20410 argument.
20411
20412 ::
20413
20414       declare i8* @llvm.launder.invariant.group.p0i8(i8* <ptr>)
20415
20416 Overview:
20417 """""""""
20418
20419 The '``llvm.launder.invariant.group``' intrinsic can be used when an invariant
20420 established by ``invariant.group`` metadata no longer holds, to obtain a new
20421 pointer value that carries fresh invariant group information. It is an
20422 experimental intrinsic, which means that its semantics might change in the
20423 future.
20424
20425
20426 Arguments:
20427 """"""""""
20428
20429 The ``llvm.launder.invariant.group`` takes only one argument, which is a pointer
20430 to the memory.
20431
20432 Semantics:
20433 """"""""""
20434
20435 Returns another pointer that aliases its argument but which is considered different
20436 for the purposes of ``load``/``store`` ``invariant.group`` metadata.
20437 It does not read any accessible memory and the execution can be speculated.
20438
20439 '``llvm.strip.invariant.group``' Intrinsic
20440 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20441
20442 Syntax:
20443 """""""
20444 This is an overloaded intrinsic. The memory object can belong to any address
20445 space. The returned pointer must belong to the same address space as the
20446 argument.
20447
20448 ::
20449
20450       declare i8* @llvm.strip.invariant.group.p0i8(i8* <ptr>)
20451
20452 Overview:
20453 """""""""
20454
20455 The '``llvm.strip.invariant.group``' intrinsic can be used when an invariant
20456 established by ``invariant.group`` metadata no longer holds, to obtain a new pointer
20457 value that does not carry the invariant information. It is an experimental
20458 intrinsic, which means that its semantics might change in the future.
20459
20460
20461 Arguments:
20462 """"""""""
20463
20464 The ``llvm.strip.invariant.group`` takes only one argument, which is a pointer
20465 to the memory.
20466
20467 Semantics:
20468 """"""""""
20469
20470 Returns another pointer that aliases its argument but which has no associated
20471 ``invariant.group`` metadata.
20472 It does not read any memory and can be speculated.
20473
20474
20475
20476 .. _constrainedfp:
20477
20478 Constrained Floating-Point Intrinsics
20479 -------------------------------------
20480
20481 These intrinsics are used to provide special handling of floating-point
operations when a specific rounding mode or floating-point exception behavior is
20483 required.  By default, LLVM optimization passes assume that the rounding mode is
20484 round-to-nearest and that floating-point exceptions will not be monitored.
20485 Constrained FP intrinsics are used to support non-default rounding modes and
20486 accurately preserve exception behavior without compromising LLVM's ability to
20487 optimize FP code when the default behavior is used.
20488
20489 If any FP operation in a function is constrained then they all must be
20490 constrained. This is required for correct LLVM IR. Optimizations that
20491 move code around can create miscompiles if mixing of constrained and normal
20492 operations is done. The correct way to mix constrained and less constrained
20493 operations is to use the rounding mode and exception handling metadata to
20494 mark constrained intrinsics as having LLVM's default behavior.
20495
20496 Each of these intrinsics corresponds to a normal floating-point operation. The
20497 data arguments and the return value are the same as the corresponding FP
20498 operation.
20499
20500 The rounding mode argument is a metadata string specifying what
20501 assumptions, if any, the optimizer can make when transforming constant
20502 values. Some constrained FP intrinsics omit this argument. If required
20503 by the intrinsic, this argument must be one of the following strings:
20504
20505 ::
20506
20507       "round.dynamic"
20508       "round.tonearest"
20509       "round.downward"
20510       "round.upward"
20511       "round.towardzero"
20512       "round.tonearestaway"
20513
20514 If this argument is "round.dynamic" optimization passes must assume that the
20515 rounding mode is unknown and may change at runtime.  No transformations that
20516 depend on rounding mode may be performed in this case.
20517
20518 The other possible values for the rounding mode argument correspond to the
20519 similarly named IEEE rounding modes.  If the argument is any of these values
20520 optimization passes may perform transformations as long as they are consistent
20521 with the specified rounding mode.
20522
20523 For example, 'x-0'->'x' is not a valid transformation if the rounding mode is
20524 "round.downward" or "round.dynamic" because if the value of 'x' is +0 then
20525 'x-0' should evaluate to '-0' when rounding downward.  However, this
20526 transformation is legal for all other rounding modes.
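
The case described above might look as follows in IR. This is only a minimal
sketch: the function name ``@sub_zero`` is hypothetical, and it assumes the
runtime rounding mode has already been set to round downward.

.. code-block:: llvm

      declare double @llvm.experimental.constrained.fsub.f64(double, double,
                                                             metadata, metadata)

      ; Under "round.downward", (+0.0) - (+0.0) evaluates to -0.0, so the
      ; call below must not be simplified to %x.
      define double @sub_zero(double %x) #0 {
        %r = call double @llvm.experimental.constrained.fsub.f64(
                 double %x, double 0.0,
                 metadata !"round.downward",
                 metadata !"fpexcept.ignore") #0
        ret double %r
      }

      attributes #0 = { strictfp }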
20527
20528 For values other than "round.dynamic" optimization passes may assume that the
20529 actual runtime rounding mode (as defined in a target-specific manner) matches
20530 the specified rounding mode, but this is not guaranteed.  Using a specific
20531 non-dynamic rounding mode which does not match the actual rounding mode at
20532 runtime results in undefined behavior.
20533
The exception behavior argument is a metadata string describing the
floating-point exception semantics that are required for the intrinsic. This
argument must be one of the following strings:
20537
20538 ::
20539
20540       "fpexcept.ignore"
20541       "fpexcept.maytrap"
20542       "fpexcept.strict"
20543
20544 If this argument is "fpexcept.ignore" optimization passes may assume that the
20545 exception status flags will not be read and that floating-point exceptions will
20546 be masked.  This allows transformations to be performed that may change the
20547 exception semantics of the original code.  For example, FP operations may be
20548 speculatively executed in this case whereas they must not be for either of the
20549 other possible values of this argument.
20550
20551 If the exception behavior argument is "fpexcept.maytrap" optimization passes
20552 must avoid transformations that may raise exceptions that would not have been
20553 raised by the original code (such as speculatively executing FP operations), but
20554 passes are not required to preserve all exceptions that are implied by the
20555 original code.  For example, exceptions may be potentially hidden by constant
20556 folding.
20557
20558 If the exception behavior argument is "fpexcept.strict" all transformations must
20559 strictly preserve the floating-point exception semantics of the original code.
20560 Any FP exception that would have been raised by the original code must be raised
20561 by the transformed code, and the transformed code must not raise any FP
20562 exceptions that would not have been raised by the original code.  This is the
20563 exception behavior argument that will be used if the code being compiled reads
20564 the FP exception status flags, but this mode can also be used with code that
20565 unmasks FP exceptions.
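
As a rough illustration of why speculation is restricted, consider the sketch
below (the function ``@guarded_div`` and its block names are hypothetical).
With "fpexcept.strict" or "fpexcept.maytrap", the division should stay in its
guarded block, because executing it unconditionally could raise an exception
that the original code never raised.

.. code-block:: llvm

      declare double @llvm.experimental.constrained.fdiv.f64(double, double,
                                                             metadata, metadata)

      define double @guarded_div(double %a, double %b, i1 %do_div) #0 {
      entry:
        br i1 %do_div, label %div, label %done

      div:
        ; Hoisting this call into %entry would be speculative execution and
        ; could raise a divide-by-zero or invalid exception.
        %q = call double @llvm.experimental.constrained.fdiv.f64(
                 double %a, double %b,
                 metadata !"round.dynamic",
                 metadata !"fpexcept.strict") #0
        br label %done

      done:
        %r = phi double [ %q, %div ], [ 0.0, %entry ]
        ret double %r
      }

      attributes #0 = { strictfp }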
20566
20567 The number and order of floating-point exceptions is NOT guaranteed.  For
20568 example, a series of FP operations that each may raise exceptions may be
20569 vectorized into a single instruction that raises each unique exception a single
20570 time.
20571
20572 Proper :ref:`function attributes <fnattrs>` usage is required for the
20573 constrained intrinsics to function correctly.
20574
20575 All function *calls* done in a function that uses constrained floating
20576 point intrinsics must have the ``strictfp`` attribute.
20577
20578 All function *definitions* that use constrained floating point intrinsics
20579 must have the ``strictfp`` attribute.
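
A minimal sketch of both requirements, assuming a hypothetical external
function ``@helper``: the definition carries ``strictfp``, and so does every
call inside it, including the call to the ordinary function.

.. code-block:: llvm

      declare double @llvm.experimental.constrained.fadd.f64(double, double,
                                                             metadata, metadata)
      declare double @helper(double)

      ; The definition is marked strictfp through attribute group #0.
      define double @add_then_call(double %a, double %b) #0 {
        %sum = call double @llvm.experimental.constrained.fadd.f64(
                   double %a, double %b,
                   metadata !"round.dynamic",
                   metadata !"fpexcept.strict") #0
        ; Calls to non-intrinsic functions carry strictfp as well.
        %r = call double @helper(double %sum) #0
        ret double %r
      }

      attributes #0 = { strictfp }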
20580
20581 '``llvm.experimental.constrained.fadd``' Intrinsic
20582 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20583
20584 Syntax:
20585 """""""
20586
20587 ::
20588
20589       declare <type>
20590       @llvm.experimental.constrained.fadd(<type> <op1>, <type> <op2>,
20591                                           metadata <rounding mode>,
20592                                           metadata <exception behavior>)
20593
20594 Overview:
20595 """""""""
20596
20597 The '``llvm.experimental.constrained.fadd``' intrinsic returns the sum of its
20598 two operands.
20599
20600
20601 Arguments:
20602 """"""""""
20603
20604 The first two arguments to the '``llvm.experimental.constrained.fadd``'
20605 intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
20606 of floating-point values. Both arguments must have identical types.
20607
20608 The third and fourth arguments specify the rounding mode and exception
20609 behavior as described above.
20610
20611 Semantics:
20612 """"""""""
20613
20614 The value produced is the floating-point sum of the two value operands and has
20615 the same type as the operands.
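
For illustration, a vector use might be written as in the sketch below. The
function name ``@vadd`` is hypothetical; the ``.v4f32`` suffix follows the
usual naming of overloaded intrinsics.

.. code-block:: llvm

      declare <4 x float> @llvm.experimental.constrained.fadd.v4f32(
                  <4 x float>, <4 x float>, metadata, metadata)

      define <4 x float> @vadd(<4 x float> %a, <4 x float> %b) #0 {
        ; Element-wise sum; both operands and the result share one type.
        %sum = call <4 x float> @llvm.experimental.constrained.fadd.v4f32(
                   <4 x float> %a, <4 x float> %b,
                   metadata !"round.tonearest",
                   metadata !"fpexcept.maytrap") #0
        ret <4 x float> %sum
      }

      attributes #0 = { strictfp }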
20616
20617
20618 '``llvm.experimental.constrained.fsub``' Intrinsic
20619 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20620
20621 Syntax:
20622 """""""
20623
20624 ::
20625
20626       declare <type>
20627       @llvm.experimental.constrained.fsub(<type> <op1>, <type> <op2>,
20628                                           metadata <rounding mode>,
20629                                           metadata <exception behavior>)
20630
20631 Overview:
20632 """""""""
20633
20634 The '``llvm.experimental.constrained.fsub``' intrinsic returns the difference
20635 of its two operands.
20636
20637
20638 Arguments:
20639 """"""""""
20640
20641 The first two arguments to the '``llvm.experimental.constrained.fsub``'
20642 intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
20643 of floating-point values. Both arguments must have identical types.
20644
20645 The third and fourth arguments specify the rounding mode and exception
20646 behavior as described above.
20647
20648 Semantics:
20649 """"""""""
20650
20651 The value produced is the floating-point difference of the two value operands
20652 and has the same type as the operands.
20653
20654
20655 '``llvm.experimental.constrained.fmul``' Intrinsic
20656 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20657
20658 Syntax:
20659 """""""
20660
20661 ::
20662
20663       declare <type>
20664       @llvm.experimental.constrained.fmul(<type> <op1>, <type> <op2>,
20665                                           metadata <rounding mode>,
20666                                           metadata <exception behavior>)
20667
20668 Overview:
20669 """""""""
20670
20671 The '``llvm.experimental.constrained.fmul``' intrinsic returns the product of
20672 its two operands.
20673
20674
20675 Arguments:
20676 """"""""""
20677
20678 The first two arguments to the '``llvm.experimental.constrained.fmul``'
20679 intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
20680 of floating-point values. Both arguments must have identical types.
20681
20682 The third and fourth arguments specify the rounding mode and exception
20683 behavior as described above.
20684
20685 Semantics:
20686 """"""""""
20687
20688 The value produced is the floating-point product of the two value operands and
20689 has the same type as the operands.
20690
20691
20692 '``llvm.experimental.constrained.fdiv``' Intrinsic
20693 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20694
20695 Syntax:
20696 """""""
20697
20698 ::
20699
20700       declare <type>
20701       @llvm.experimental.constrained.fdiv(<type> <op1>, <type> <op2>,
20702                                           metadata <rounding mode>,
20703                                           metadata <exception behavior>)
20704
20705 Overview:
20706 """""""""
20707
20708 The '``llvm.experimental.constrained.fdiv``' intrinsic returns the quotient of
20709 its two operands.
20710
20711
20712 Arguments:
20713 """"""""""
20714
20715 The first two arguments to the '``llvm.experimental.constrained.fdiv``'
20716 intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
20717 of floating-point values. Both arguments must have identical types.
20718
20719 The third and fourth arguments specify the rounding mode and exception
20720 behavior as described above.
20721
20722 Semantics:
20723 """"""""""
20724
20725 The value produced is the floating-point quotient of the two value operands and
20726 has the same type as the operands.
20727
20728
20729 '``llvm.experimental.constrained.frem``' Intrinsic
20730 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20731
20732 Syntax:
20733 """""""
20734
20735 ::
20736
20737       declare <type>
20738       @llvm.experimental.constrained.frem(<type> <op1>, <type> <op2>,
20739                                           metadata <rounding mode>,
20740                                           metadata <exception behavior>)
20741
20742 Overview:
20743 """""""""
20744
20745 The '``llvm.experimental.constrained.frem``' intrinsic returns the remainder
20746 from the division of its two operands.
20747
20748
20749 Arguments:
20750 """"""""""
20751
20752 The first two arguments to the '``llvm.experimental.constrained.frem``'
20753 intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
20754 of floating-point values. Both arguments must have identical types.
20755
20756 The third and fourth arguments specify the rounding mode and exception
20757 behavior as described above.  The rounding mode argument has no effect, since
20758 the result of frem is never rounded, but the argument is included for
20759 consistency with the other constrained floating-point intrinsics.
20760
20761 Semantics:
20762 """"""""""
20763
20764 The value produced is the floating-point remainder from the division of the two
20765 value operands and has the same type as the operands.  The remainder has the
20766 same sign as the dividend.
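
The sign rule can be seen in a small sketch (the function name
``@rem_example`` is hypothetical): the dividend -5.5 is negative, so the
remainder of dividing it by 2.0 is -1.5.

.. code-block:: llvm

      declare double @llvm.experimental.constrained.frem.f64(double, double,
                                                             metadata, metadata)

      define double @rem_example() #0 {
        ; -5.5 = (-2 * 2.0) + (-1.5), so the result is -1.5.
        %r = call double @llvm.experimental.constrained.frem.f64(
                 double -5.5, double 2.0,
                 metadata !"round.dynamic",
                 metadata !"fpexcept.strict") #0
        ret double %r
      }

      attributes #0 = { strictfp }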
20767
20768 '``llvm.experimental.constrained.fma``' Intrinsic
20769 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20770
20771 Syntax:
20772 """""""
20773
20774 ::
20775
      declare <type>
      @llvm.experimental.constrained.fma(<type> <op1>, <type> <op2>, <type> <op3>,
                                         metadata <rounding mode>,
                                         metadata <exception behavior>)
20780
20781 Overview:
20782 """""""""
20783
20784 The '``llvm.experimental.constrained.fma``' intrinsic returns the result of a
20785 fused-multiply-add operation on its operands.
20786
20787 Arguments:
20788 """"""""""
20789
20790 The first three arguments to the '``llvm.experimental.constrained.fma``'
20791 intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector
20792 <t_vector>` of floating-point values. All arguments must have identical types.
20793
20794 The fourth and fifth arguments specify the rounding mode and exception behavior
20795 as described above.
20796
20797 Semantics:
20798 """"""""""
20799
20800 The result produced is the product of the first two operands added to the third
20801 operand computed with infinite precision, and then rounded to the target
20802 precision.
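
A call might look like the following sketch (the function name ``@fused`` is
hypothetical):

.. code-block:: llvm

      declare float @llvm.experimental.constrained.fma.f32(float, float, float,
                                                           metadata, metadata)

      define float @fused(float %a, float %b, float %c) #0 {
        ; Computes (%a * %b) + %c with a single rounding at the end.
        %r = call float @llvm.experimental.constrained.fma.f32(
                 float %a, float %b, float %c,
                 metadata !"round.dynamic",
                 metadata !"fpexcept.strict") #0
        ret float %r
      }

      attributes #0 = { strictfp }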
20803
20804 '``llvm.experimental.constrained.fptoui``' Intrinsic
20805 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20806
20807 Syntax:
20808 """""""
20809
20810 ::
20811
      declare <ty2>
      @llvm.experimental.constrained.fptoui(<type> <value>,
                                            metadata <exception behavior>)
20815
20816 Overview:
20817 """""""""
20818
20819 The '``llvm.experimental.constrained.fptoui``' intrinsic converts a
20820 floating-point ``value`` to its unsigned integer equivalent of type ``ty2``.
20821
20822 Arguments:
20823 """"""""""
20824
20825 The first argument to the '``llvm.experimental.constrained.fptoui``'
20826 intrinsic must be :ref:`floating point <t_floating>` or :ref:`vector
20827 <t_vector>` of floating point values.
20828
20829 The second argument specifies the exception behavior as described above.
20830
20831 Semantics:
20832 """"""""""
20833
20834 The result produced is an unsigned integer converted from the floating
20835 point operand. The value is truncated, so it is rounded towards zero.
20836
20837 '``llvm.experimental.constrained.fptosi``' Intrinsic
20838 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20839
20840 Syntax:
20841 """""""
20842
20843 ::
20844
      declare <ty2>
      @llvm.experimental.constrained.fptosi(<type> <value>,
                                            metadata <exception behavior>)
20848
20849 Overview:
20850 """""""""
20851
The '``llvm.experimental.constrained.fptosi``' intrinsic converts a
floating-point ``value`` to its signed integer equivalent of type ``ty2``.
20854
20855 Arguments:
20856 """"""""""
20857
20858 The first argument to the '``llvm.experimental.constrained.fptosi``'
20859 intrinsic must be :ref:`floating point <t_floating>` or :ref:`vector
20860 <t_vector>` of floating point values.
20861
20862 The second argument specifies the exception behavior as described above.
20863
20864 Semantics:
20865 """"""""""
20866
20867 The result produced is a signed integer converted from the floating
20868 point operand. The value is truncated, so it is rounded towards zero.
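
As a sketch, a conversion from ``double`` to ``i32`` might be written as
follows. The function name ``@to_int`` is hypothetical; the overload suffix
names the result type first and the operand type second.

.. code-block:: llvm

      declare i32 @llvm.experimental.constrained.fptosi.i32.f64(double, metadata)

      define i32 @to_int(double %x) #0 {
        ; Truncates toward zero, e.g. 2.75 -> 2 and -2.75 -> -2.
        %i = call i32 @llvm.experimental.constrained.fptosi.i32.f64(
                 double %x,
                 metadata !"fpexcept.strict") #0
        ret i32 %i
      }

      attributes #0 = { strictfp }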
20869
20870 '``llvm.experimental.constrained.uitofp``' Intrinsic
20871 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20872
20873 Syntax:
20874 """""""
20875
20876 ::
20877
      declare <ty2>
      @llvm.experimental.constrained.uitofp(<type> <value>,
                                            metadata <rounding mode>,
                                            metadata <exception behavior>)
20882
20883 Overview:
20884 """""""""
20885
The '``llvm.experimental.constrained.uitofp``' intrinsic converts an
unsigned integer ``value`` to a floating-point value of type ``ty2``.
20888
20889 Arguments:
20890 """"""""""
20891
20892 The first argument to the '``llvm.experimental.constrained.uitofp``'
20893 intrinsic must be an :ref:`integer <t_integer>` or :ref:`vector
20894 <t_vector>` of integer values.
20895
20896 The second and third arguments specify the rounding mode and exception
20897 behavior as described above.
20898
20899 Semantics:
20900 """"""""""
20901
20902 An inexact floating-point exception will be raised if rounding is required.
20903 Any result produced is a floating point value converted from the input
20904 integer operand.
20905
20906 '``llvm.experimental.constrained.sitofp``' Intrinsic
20907 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20908
20909 Syntax:
20910 """""""
20911
20912 ::
20913
      declare <ty2>
      @llvm.experimental.constrained.sitofp(<type> <value>,
                                            metadata <rounding mode>,
                                            metadata <exception behavior>)
20918
20919 Overview:
20920 """""""""
20921
The '``llvm.experimental.constrained.sitofp``' intrinsic converts a
signed integer ``value`` to a floating-point value of type ``ty2``.
20924
20925 Arguments:
20926 """"""""""
20927
20928 The first argument to the '``llvm.experimental.constrained.sitofp``'
20929 intrinsic must be an :ref:`integer <t_integer>` or :ref:`vector
20930 <t_vector>` of integer values.
20931
20932 The second and third arguments specify the rounding mode and exception
20933 behavior as described above.
20934
20935 Semantics:
20936 """"""""""
20937
20938 An inexact floating-point exception will be raised if rounding is required.
20939 Any result produced is a floating point value converted from the input
20940 integer operand.
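
A case where rounding is required can be sketched as follows (the function
name ``@to_float`` is hypothetical). The integer 16777217 needs 25 significant
bits, one more than ``float`` provides.

.. code-block:: llvm

      declare float @llvm.experimental.constrained.sitofp.f32.i32(i32, metadata,
                                                                  metadata)

      define float @to_float() #0 {
        ; 16777217 (2^24 + 1) has no exact float representation, so the
        ; conversion rounds and raises the inexact exception.
        %f = call float @llvm.experimental.constrained.sitofp.f32.i32(
                 i32 16777217,
                 metadata !"round.dynamic",
                 metadata !"fpexcept.strict") #0
        ret float %f
      }

      attributes #0 = { strictfp }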
20941
20942 '``llvm.experimental.constrained.fptrunc``' Intrinsic
20943 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20944
20945 Syntax:
20946 """""""
20947
20948 ::
20949
      declare <ty2>
      @llvm.experimental.constrained.fptrunc(<type> <value>,
                                             metadata <rounding mode>,
                                             metadata <exception behavior>)
20954
20955 Overview:
20956 """""""""
20957
20958 The '``llvm.experimental.constrained.fptrunc``' intrinsic truncates ``value``
20959 to type ``ty2``.
20960
20961 Arguments:
20962 """"""""""
20963
20964 The first argument to the '``llvm.experimental.constrained.fptrunc``'
20965 intrinsic must be :ref:`floating point <t_floating>` or :ref:`vector
20966 <t_vector>` of floating point values. This argument must be larger in size
20967 than the result.
20968
20969 The second and third arguments specify the rounding mode and exception
20970 behavior as described above.
20971
20972 Semantics:
20973 """"""""""
20974
20975 The result produced is a floating point value truncated to be smaller in size
20976 than the operand.
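
A truncation from ``double`` to ``float`` might be written as in this sketch
(the function name ``@narrow`` is hypothetical):

.. code-block:: llvm

      declare float @llvm.experimental.constrained.fptrunc.f32.f64(double, metadata,
                                                                   metadata)

      define float @narrow(double %x) #0 {
        ; The result is rounded according to the runtime rounding mode,
        ; which "round.dynamic" leaves unknown to the optimizer.
        %f = call float @llvm.experimental.constrained.fptrunc.f32.f64(
                 double %x,
                 metadata !"round.dynamic",
                 metadata !"fpexcept.strict") #0
        ret float %f
      }

      attributes #0 = { strictfp }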
20977
20978 '``llvm.experimental.constrained.fpext``' Intrinsic
20979 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20980
20981 Syntax:
20982 """""""
20983
20984 ::
20985
      declare <ty2>
      @llvm.experimental.constrained.fpext(<type> <value>,
                                           metadata <exception behavior>)
20989
20990 Overview:
20991 """""""""
20992
20993 The '``llvm.experimental.constrained.fpext``' intrinsic extends a
20994 floating-point ``value`` to a larger floating-point value.
20995
20996 Arguments:
20997 """"""""""
20998
20999 The first argument to the '``llvm.experimental.constrained.fpext``'
21000 intrinsic must be :ref:`floating point <t_floating>` or :ref:`vector
21001 <t_vector>` of floating point values. This argument must be smaller in size
21002 than the result.
21003
21004 The second argument specifies the exception behavior as described above.
21005
21006 Semantics:
21007 """"""""""
21008
21009 The result produced is a floating point value extended to be larger in size
21010 than the operand. All restrictions that apply to the fpext instruction also
21011 apply to this intrinsic.
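
An extension from ``float`` to ``double`` might look like the sketch below
(the function name ``@widen`` is hypothetical). No rounding mode argument is
passed, matching the syntax given above.

.. code-block:: llvm

      declare double @llvm.experimental.constrained.fpext.f64.f32(float, metadata)

      define double @widen(float %x) #0 {
        %d = call double @llvm.experimental.constrained.fpext.f64.f32(
                 float %x,
                 metadata !"fpexcept.strict") #0
        ret double %d
      }

      attributes #0 = { strictfp }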
21012
21013 '``llvm.experimental.constrained.fcmp``' and '``llvm.experimental.constrained.fcmps``' Intrinsics
21014 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21015
21016 Syntax:
21017 """""""
21018
21019 ::
21020
21021       declare <ty2>
21022       @llvm.experimental.constrained.fcmp(<type> <op1>, <type> <op2>,
21023                                           metadata <condition code>,
21024                                           metadata <exception behavior>)
21025       declare <ty2>
21026       @llvm.experimental.constrained.fcmps(<type> <op1>, <type> <op2>,
21027                                            metadata <condition code>,
21028                                            metadata <exception behavior>)
21029
21030 Overview:
21031 """""""""
21032
The '``llvm.experimental.constrained.fcmp``' and
'``llvm.experimental.constrained.fcmps``' intrinsics return a boolean
value or vector of boolean values based on a comparison of their operands.
21036
21037 If the operands are floating-point scalars, then the result type is a
21038 boolean (:ref:`i1 <t_integer>`).
21039
21040 If the operands are floating-point vectors, then the result type is a
21041 vector of boolean with the same number of elements as the operands being
21042 compared.
21043
21044 The '``llvm.experimental.constrained.fcmp``' intrinsic performs a quiet
21045 comparison operation while the '``llvm.experimental.constrained.fcmps``'
21046 intrinsic performs a signaling comparison operation.
21047
21048 Arguments:
21049 """"""""""
21050
21051 The first two arguments to the '``llvm.experimental.constrained.fcmp``'
21052 and '``llvm.experimental.constrained.fcmps``' intrinsics must be
21053 :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
21054 of floating-point values. Both arguments must have identical types.
21055
21056 The third argument is the condition code indicating the kind of comparison
21057 to perform. It must be a metadata string with one of the following values:
21058
21059 - "``oeq``": ordered and equal
21060 - "``ogt``": ordered and greater than
21061 - "``oge``": ordered and greater than or equal
21062 - "``olt``": ordered and less than
21063 - "``ole``": ordered and less than or equal
21064 - "``one``": ordered and not equal
21065 - "``ord``": ordered (no nans)
21066 - "``ueq``": unordered or equal
21067 - "``ugt``": unordered or greater than
21068 - "``uge``": unordered or greater than or equal
21069 - "``ult``": unordered or less than
21070 - "``ule``": unordered or less than or equal
21071 - "``une``": unordered or not equal
21072 - "``uno``": unordered (either nans)
21073
21074 *Ordered* means that neither operand is a NAN while *unordered* means
21075 that either operand may be a NAN.
21076
21077 The fourth argument specifies the exception behavior as described above.
21078
21079 Semantics:
21080 """"""""""
21081
21082 ``op1`` and ``op2`` are compared according to the condition code given
21083 as the third argument. If the operands are vectors, then the
21084 vectors are compared element by element. Each comparison performed
21085 always yields an :ref:`i1 <t_integer>` result, as follows:
21086
21087 - "``oeq``": yields ``true`` if both operands are not a NAN and ``op1``
21088   is equal to ``op2``.
21089 - "``ogt``": yields ``true`` if both operands are not a NAN and ``op1``
21090   is greater than ``op2``.
21091 - "``oge``": yields ``true`` if both operands are not a NAN and ``op1``
21092   is greater than or equal to ``op2``.
21093 - "``olt``": yields ``true`` if both operands are not a NAN and ``op1``
21094   is less than ``op2``.
21095 - "``ole``": yields ``true`` if both operands are not a NAN and ``op1``
21096   is less than or equal to ``op2``.
21097 - "``one``": yields ``true`` if both operands are not a NAN and ``op1``
21098   is not equal to ``op2``.
21099 - "``ord``": yields ``true`` if both operands are not a NAN.
21100 - "``ueq``": yields ``true`` if either operand is a NAN or ``op1`` is
21101   equal to ``op2``.
21102 - "``ugt``": yields ``true`` if either operand is a NAN or ``op1`` is
21103   greater than ``op2``.
21104 - "``uge``": yields ``true`` if either operand is a NAN or ``op1`` is
21105   greater than or equal to ``op2``.
21106 - "``ult``": yields ``true`` if either operand is a NAN or ``op1`` is
21107   less than ``op2``.
21108 - "``ule``": yields ``true`` if either operand is a NAN or ``op1`` is
21109   less than or equal to ``op2``.
21110 - "``une``": yields ``true`` if either operand is a NAN or ``op1`` is
21111   not equal to ``op2``.
21112 - "``uno``": yields ``true`` if either operand is a NAN.
21113
21114 The quiet comparison operation performed by
21115 '``llvm.experimental.constrained.fcmp``' will only raise an exception
21116 if either operand is a SNAN.  The signaling comparison operation
21117 performed by '``llvm.experimental.constrained.fcmps``' will raise an
21118 exception if either operand is a NAN (QNAN or SNAN). Such an exception
21119 does not preclude a result being produced (e.g. exception might only
21120 set a flag), therefore the distinction between ordered and unordered
21121 comparisons is also relevant for the
21122 '``llvm.experimental.constrained.fcmps``' intrinsic.
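
The two forms might be used as in the following sketch (the function names
``@less_quiet`` and ``@less_signaling`` are hypothetical). Both perform an
ordered less-than comparison and differ only in when an exception is raised.

.. code-block:: llvm

      declare i1 @llvm.experimental.constrained.fcmp.f64(double, double,
                                                         metadata, metadata)
      declare i1 @llvm.experimental.constrained.fcmps.f64(double, double,
                                                          metadata, metadata)

      define i1 @less_quiet(double %a, double %b) #0 {
        ; Quiet comparison: raises an exception only for SNAN operands.
        %c = call i1 @llvm.experimental.constrained.fcmp.f64(
                 double %a, double %b,
                 metadata !"olt",
                 metadata !"fpexcept.strict") #0
        ret i1 %c
      }

      define i1 @less_signaling(double %a, double %b) #0 {
        ; Signaling comparison: raises an exception for any NAN operand.
        %c = call i1 @llvm.experimental.constrained.fcmps.f64(
                 double %a, double %b,
                 metadata !"olt",
                 metadata !"fpexcept.strict") #0
        ret i1 %c
      }

      attributes #0 = { strictfp }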
21123
21124 '``llvm.experimental.constrained.fmuladd``' Intrinsic
21125 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21126
21127 Syntax:
21128 """""""
21129
21130 ::
21131
21132       declare <type>
21133       @llvm.experimental.constrained.fmuladd(<type> <op1>, <type> <op2>,
21134                                              <type> <op3>,
21135                                              metadata <rounding mode>,
21136                                              metadata <exception behavior>)
21137
21138 Overview:
21139 """""""""
21140
21141 The '``llvm.experimental.constrained.fmuladd``' intrinsic represents
21142 multiply-add expressions that can be fused if the code generator determines
21143 that (a) the target instruction set has support for a fused operation,
21144 and (b) that the fused operation is more efficient than the equivalent,
21145 separate pair of mul and add instructions.
21146
21147 Arguments:
21148 """"""""""
21149
21150 The first three arguments to the '``llvm.experimental.constrained.fmuladd``'
21151 intrinsic must be floating-point or vector of floating-point values.
21152 All three arguments must have identical types.
21153
21154 The fourth and fifth arguments specify the rounding mode and exception behavior
21155 as described above.
21156
21157 Semantics:
21158 """"""""""
21159
21160 The expression:
21161
21162 ::
21163
      %0 = call float @llvm.experimental.constrained.fmuladd.f32(
               float %a, float %b, float %c,
               metadata <rounding mode>,
               metadata <exception behavior>)
21167
21168 is equivalent to the expression:
21169
21170 ::
21171
      %0 = call float @llvm.experimental.constrained.fmul.f32(
               float %a, float %b,
               metadata <rounding mode>,
               metadata <exception behavior>)
      %1 = call float @llvm.experimental.constrained.fadd.f32(
               float %0, float %c,
               metadata <rounding mode>,
               metadata <exception behavior>)
21178
21179 except that it is unspecified whether rounding will be performed between the
21180 multiplication and addition steps. Fusion is not guaranteed, even if the target
21181 platform supports it.
If a fused multiply-add is required, the corresponding
:ref:`llvm.experimental.constrained.fma <int_fma>` intrinsic function should be
used instead.
Like '``llvm.experimental.constrained.fma.*``', this intrinsic never sets
``errno``.
21186
21187 Constrained libm-equivalent Intrinsics
21188 --------------------------------------
21189
21190 In addition to the basic floating-point operations for which constrained
21191 intrinsics are described above, there are constrained versions of various
21192 operations which provide equivalent behavior to a corresponding libm function.
21193 These intrinsics allow the precise behavior of these operations with respect to
21194 rounding mode and exception behavior to be controlled.
21195
21196 As with the basic constrained floating-point intrinsics, the rounding mode
21197 and exception behavior arguments only control the behavior of the optimizer.
21198 They do not change the runtime floating-point environment.
21199
21200
21201 '``llvm.experimental.constrained.sqrt``' Intrinsic
21202 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21203
21204 Syntax:
21205 """""""
21206
21207 ::
21208
21209       declare <type>
21210       @llvm.experimental.constrained.sqrt(<type> <op1>,
21211                                           metadata <rounding mode>,
21212                                           metadata <exception behavior>)
21213
21214 Overview:
21215 """""""""
21216
21217 The '``llvm.experimental.constrained.sqrt``' intrinsic returns the square root
21218 of the specified value, returning the same value as the libm '``sqrt``'
21219 functions would, but without setting ``errno``.
21220
21221 Arguments:
21222 """"""""""
21223
21224 The first argument and the return type are floating-point numbers of the same
21225 type.
21226
21227 The second and third arguments specify the rounding mode and exception
21228 behavior as described above.
21229
21230 Semantics:
21231 """"""""""
21232
21233 This function returns the nonnegative square root of the specified value.
21234 If the value is less than negative zero, a floating-point exception occurs
21235 and the return value is architecture specific.
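
A call might look like the following sketch (the function name ``@root`` is
hypothetical):

.. code-block:: llvm

      declare double @llvm.experimental.constrained.sqrt.f64(double, metadata,
                                                             metadata)

      define double @root(double %x) #0 {
        ; With "fpexcept.strict", an exception raised for a negative %x
        ; must be preserved by every transformation.
        %r = call double @llvm.experimental.constrained.sqrt.f64(
                 double %x,
                 metadata !"round.dynamic",
                 metadata !"fpexcept.strict") #0
        ret double %r
      }

      attributes #0 = { strictfp }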
21236
21237
21238 '``llvm.experimental.constrained.pow``' Intrinsic
21239 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21240
21241 Syntax:
21242 """""""
21243
21244 ::
21245
21246       declare <type>
21247       @llvm.experimental.constrained.pow(<type> <op1>, <type> <op2>,
21248                                          metadata <rounding mode>,
21249                                          metadata <exception behavior>)
21250
21251 Overview:
21252 """""""""
21253
21254 The '``llvm.experimental.constrained.pow``' intrinsic returns the first operand
21255 raised to the (positive or negative) power specified by the second operand.
21256
21257 Arguments:
21258 """"""""""
21259
21260 The first two arguments and the return value are floating-point numbers of the
21261 same type.  The second argument specifies the power to which the first argument
21262 should be raised.
21263
21264 The third and fourth arguments specify the rounding mode and exception
21265 behavior as described above.
21266
21267 Semantics:
21268 """"""""""
21269
21270 This function returns the first value raised to the second power,
21271 returning the same values as the libm ``pow`` functions would, and
21272 handles error conditions in the same way.
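
For illustration, a scalar ``double`` call might be written as in this sketch
(the function name ``@power`` is hypothetical):

.. code-block:: llvm

      declare double @llvm.experimental.constrained.pow.f64(double, double,
                                                            metadata, metadata)

      define double @power(double %base, double %exponent) #0 {
        ; Raises %base to the power %exponent.
        %r = call double @llvm.experimental.constrained.pow.f64(
                 double %base, double %exponent,
                 metadata !"round.dynamic",
                 metadata !"fpexcept.strict") #0
        ret double %r
      }

      attributes #0 = { strictfp }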
21273
21274
21275 '``llvm.experimental.constrained.powi``' Intrinsic
21276 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21277
21278 Syntax:
21279 """""""
21280
21281 ::
21282
21283       declare <type>
21284       @llvm.experimental.constrained.powi(<type> <op1>, i32 <op2>,
21285                                           metadata <rounding mode>,
21286                                           metadata <exception behavior>)
21287
21288 Overview:
21289 """""""""
21290
21291 The '``llvm.experimental.constrained.powi``' intrinsic returns the first operand
21292 raised to the (positive or negative) power specified by the second operand. The
21293 order of evaluation of multiplications is not defined. When a vector of
21294 floating-point type is used, the second argument remains a scalar integer value.
21295
21296
21297 Arguments:
21298 """"""""""
21299
21300 The first argument and the return value are floating-point numbers of the same
21301 type.  The second argument is a 32-bit signed integer specifying the power to
21302 which the first argument should be raised.
21303
21304 The third and fourth arguments specify the rounding mode and exception
21305 behavior as described above.
21306
21307 Semantics:
21308 """"""""""
21309
21310 This function returns the first value raised to the second power with an
21311 unspecified sequence of rounding operations.
21312
21313
21314 '``llvm.experimental.constrained.sin``' Intrinsic
21315 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21316
21317 Syntax:
21318 """""""
21319
21320 ::
21321
21322       declare <type>
21323       @llvm.experimental.constrained.sin(<type> <op1>,
21324                                          metadata <rounding mode>,
21325                                          metadata <exception behavior>)
21326
21327 Overview:
21328 """""""""
21329
21330 The '``llvm.experimental.constrained.sin``' intrinsic returns the sine of the
21331 first operand.
21332
21333 Arguments:
21334 """"""""""
21335
21336 The first argument and the return type are floating-point numbers of the same
21337 type.
21338
21339 The second and third arguments specify the rounding mode and exception
21340 behavior as described above.
21341
21342 Semantics:
21343 """"""""""
21344
21345 This function returns the sine of the specified operand, returning the
21346 same values as the libm ``sin`` functions would, and handles error
21347 conditions in the same way.
21348
21349
21350 '``llvm.experimental.constrained.cos``' Intrinsic
21351 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21352
21353 Syntax:
21354 """""""
21355
21356 ::
21357
21358       declare <type>
21359       @llvm.experimental.constrained.cos(<type> <op1>,
21360                                          metadata <rounding mode>,
21361                                          metadata <exception behavior>)
21362
21363 Overview:
21364 """""""""
21365
21366 The '``llvm.experimental.constrained.cos``' intrinsic returns the cosine of the
21367 first operand.
21368
21369 Arguments:
21370 """"""""""
21371
The first argument and the return value are floating-point numbers of the same
21373 type.
21374
21375 The second and third arguments specify the rounding mode and exception
21376 behavior as described above.
21377
21378 Semantics:
21379 """"""""""
21380
21381 This function returns the cosine of the specified operand, returning the
21382 same values as the libm ``cos`` functions would, and handles error
21383 conditions in the same way.
21384
21385
21386 '``llvm.experimental.constrained.exp``' Intrinsic
21387 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21388
21389 Syntax:
21390 """""""
21391
21392 ::
21393
21394       declare <type>
21395       @llvm.experimental.constrained.exp(<type> <op1>,
21396                                          metadata <rounding mode>,
21397                                          metadata <exception behavior>)
21398
21399 Overview:
21400 """""""""
21401
21402 The '``llvm.experimental.constrained.exp``' intrinsic computes the base-e
21403 exponential of the specified value.
21404
21405 Arguments:
21406 """"""""""
21407
21408 The first argument and the return value are floating-point numbers of the same
21409 type.
21410
21411 The second and third arguments specify the rounding mode and exception
21412 behavior as described above.
21413
21414 Semantics:
21415 """"""""""
21416
21417 This function returns the same values as the libm ``exp`` functions
21418 would, and handles error conditions in the same way.
21419
21420
21421 '``llvm.experimental.constrained.exp2``' Intrinsic
21422 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21423
21424 Syntax:
21425 """""""
21426
21427 ::
21428
21429       declare <type>
21430       @llvm.experimental.constrained.exp2(<type> <op1>,
21431                                           metadata <rounding mode>,
21432                                           metadata <exception behavior>)
21433
21434 Overview:
21435 """""""""
21436
21437 The '``llvm.experimental.constrained.exp2``' intrinsic computes the base-2
21438 exponential of the specified value.
21439
21440
21441 Arguments:
21442 """"""""""
21443
21444 The first argument and the return value are floating-point numbers of the same
21445 type.
21446
21447 The second and third arguments specify the rounding mode and exception
21448 behavior as described above.
21449
21450 Semantics:
21451 """"""""""
21452
21453 This function returns the same values as the libm ``exp2`` functions
21454 would, and handles error conditions in the same way.
21455
21456
21457 '``llvm.experimental.constrained.log``' Intrinsic
21458 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21459
21460 Syntax:
21461 """""""
21462
21463 ::
21464
21465       declare <type>
21466       @llvm.experimental.constrained.log(<type> <op1>,
21467                                          metadata <rounding mode>,
21468                                          metadata <exception behavior>)
21469
21470 Overview:
21471 """""""""
21472
21473 The '``llvm.experimental.constrained.log``' intrinsic computes the base-e
21474 logarithm of the specified value.
21475
21476 Arguments:
21477 """"""""""
21478
21479 The first argument and the return value are floating-point numbers of the same
21480 type.
21481
21482 The second and third arguments specify the rounding mode and exception
21483 behavior as described above.
21484
21485
21486 Semantics:
21487 """"""""""
21488
21489 This function returns the same values as the libm ``log`` functions
21490 would, and handles error conditions in the same way.
21491
21492
21493 '``llvm.experimental.constrained.log10``' Intrinsic
21494 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21495
21496 Syntax:
21497 """""""
21498
21499 ::
21500
21501       declare <type>
21502       @llvm.experimental.constrained.log10(<type> <op1>,
21503                                            metadata <rounding mode>,
21504                                            metadata <exception behavior>)
21505
21506 Overview:
21507 """""""""
21508
21509 The '``llvm.experimental.constrained.log10``' intrinsic computes the base-10
21510 logarithm of the specified value.
21511
21512 Arguments:
21513 """"""""""
21514
21515 The first argument and the return value are floating-point numbers of the same
21516 type.
21517
21518 The second and third arguments specify the rounding mode and exception
21519 behavior as described above.
21520
21521 Semantics:
21522 """"""""""
21523
21524 This function returns the same values as the libm ``log10`` functions
21525 would, and handles error conditions in the same way.
21526
21527
21528 '``llvm.experimental.constrained.log2``' Intrinsic
21529 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21530
21531 Syntax:
21532 """""""
21533
21534 ::
21535
21536       declare <type>
21537       @llvm.experimental.constrained.log2(<type> <op1>,
21538                                           metadata <rounding mode>,
21539                                           metadata <exception behavior>)
21540
21541 Overview:
21542 """""""""
21543
21544 The '``llvm.experimental.constrained.log2``' intrinsic computes the base-2
21545 logarithm of the specified value.
21546
21547 Arguments:
21548 """"""""""
21549
21550 The first argument and the return value are floating-point numbers of the same
21551 type.
21552
21553 The second and third arguments specify the rounding mode and exception
21554 behavior as described above.
21555
21556 Semantics:
21557 """"""""""
21558
21559 This function returns the same values as the libm ``log2`` functions
21560 would, and handles error conditions in the same way.
21561
21562
21563 '``llvm.experimental.constrained.rint``' Intrinsic
21564 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21565
21566 Syntax:
21567 """""""
21568
21569 ::
21570
21571       declare <type>
21572       @llvm.experimental.constrained.rint(<type> <op1>,
21573                                           metadata <rounding mode>,
21574                                           metadata <exception behavior>)
21575
21576 Overview:
21577 """""""""
21578
21579 The '``llvm.experimental.constrained.rint``' intrinsic returns the first
21580 operand rounded to the nearest integer. It may raise an inexact floating-point
21581 exception if the operand is not an integer.
21582
21583 Arguments:
21584 """"""""""
21585
21586 The first argument and the return value are floating-point numbers of the same
21587 type.
21588
21589 The second and third arguments specify the rounding mode and exception
21590 behavior as described above.
21591
21592 Semantics:
21593 """"""""""
21594
21595 This function returns the same values as the libm ``rint`` functions
21596 would, and handles error conditions in the same way.  The rounding mode is
21597 described, not determined, by the rounding mode argument.  The actual rounding
21598 mode is determined by the runtime floating-point environment.  The rounding
21599 mode argument is only intended as information to the compiler.
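
The distinction between describing and determining the rounding mode can be
sketched as follows. The function name ``@rint_nearest`` is hypothetical, and
the ``.f64`` suffix and metadata strings are assumed to follow the conventions
described earlier in this document.

.. code-block:: llvm

  declare double @llvm.experimental.constrained.rint.f64(double, metadata, metadata)

  define double @rint_nearest(double %x) #0 {
    ; "round.tonearest" is a statement about the environment the caller
    ; expects; this call does not itself change the rounding mode.
    %r = call double @llvm.experimental.constrained.rint.f64(
             double %x,
             metadata !"round.tonearest",
             metadata !"fpexcept.strict") #0
    ret double %r
  }

  attributes #0 = { strictfp }

If the environment might be in a different mode at run time,
``!"round.dynamic"`` would be the conservative choice for the second argument.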
21600
21601
21602 '``llvm.experimental.constrained.lrint``' Intrinsic
21603 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21604
21605 Syntax:
21606 """""""
21607
21608 ::
21609
21610       declare <inttype>
21611       @llvm.experimental.constrained.lrint(<fptype> <op1>,
21612                                            metadata <rounding mode>,
21613                                            metadata <exception behavior>)
21614
21615 Overview:
21616 """""""""
21617
21618 The '``llvm.experimental.constrained.lrint``' intrinsic returns the first
21619 operand rounded to the nearest integer. An inexact floating-point exception
21620 will be raised if the operand is not an integer. An invalid exception is
21621 raised if the result is too large to fit into a supported integer type,
21622 and in this case the result is undefined.
21623
21624 Arguments:
21625 """"""""""
21626
21627 The first argument is a floating-point number. The return value is an
21628 integer type. Not all types are supported on all targets. The supported
21629 types are the same as the ``llvm.lrint`` intrinsic and the ``lrint``
21630 libm functions.
21631
21632 The second and third arguments specify the rounding mode and exception
21633 behavior as described above.
21634
21635 Semantics:
21636 """"""""""
21637
21638 This function returns the same values as the libm ``lrint`` functions
21639 would, and handles error conditions in the same way.
21640
21641 The rounding mode is described, not determined, by the rounding mode
21642 argument.  The actual rounding mode is determined by the runtime floating-point
21643 environment.  The rounding mode argument is only intended as information
21644 to the compiler.
21645
If the runtime floating-point environment is using the default rounding mode,
then the results will be the same as those of the ``llvm.lrint`` intrinsic.
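
A minimal sketch of a call that converts a ``double`` to an ``i64`` is shown
below. The function name ``@strict_lrint`` is hypothetical, and the
``.i64.f64`` suffix (return type followed by operand type) is an assumption
about how the overloaded name is mangled.

.. code-block:: llvm

  declare i64 @llvm.experimental.constrained.lrint.i64.f64(double, metadata, metadata)

  define i64 @strict_lrint(double %x) #0 {
    ; The result is an integer type, unlike rint, which returns a
    ; floating-point value.
    %r = call i64 @llvm.experimental.constrained.lrint.i64.f64(
             double %x,
             metadata !"round.dynamic",
             metadata !"fpexcept.strict") #0
    ret i64 %r
  }

  attributes #0 = { strictfp }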
21648
21649
21650 '``llvm.experimental.constrained.llrint``' Intrinsic
21651 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21652
21653 Syntax:
21654 """""""
21655
21656 ::
21657
21658       declare <inttype>
21659       @llvm.experimental.constrained.llrint(<fptype> <op1>,
21660                                             metadata <rounding mode>,
21661                                             metadata <exception behavior>)
21662
21663 Overview:
21664 """""""""
21665
21666 The '``llvm.experimental.constrained.llrint``' intrinsic returns the first
21667 operand rounded to the nearest integer. An inexact floating-point exception
21668 will be raised if the operand is not an integer. An invalid exception is
21669 raised if the result is too large to fit into a supported integer type,
21670 and in this case the result is undefined.
21671
21672 Arguments:
21673 """"""""""
21674
21675 The first argument is a floating-point number. The return value is an
21676 integer type. Not all types are supported on all targets. The supported
21677 types are the same as the ``llvm.llrint`` intrinsic and the ``llrint``
21678 libm functions.
21679
21680 The second and third arguments specify the rounding mode and exception
21681 behavior as described above.
21682
21683 Semantics:
21684 """"""""""
21685
21686 This function returns the same values as the libm ``llrint`` functions
21687 would, and handles error conditions in the same way.
21688
21689 The rounding mode is described, not determined, by the rounding mode
21690 argument.  The actual rounding mode is determined by the runtime floating-point
21691 environment.  The rounding mode argument is only intended as information
21692 to the compiler.
21693
If the runtime floating-point environment is using the default rounding mode,
then the results will be the same as those of the ``llvm.llrint`` intrinsic.
21696
21697
21698 '``llvm.experimental.constrained.nearbyint``' Intrinsic
21699 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21700
21701 Syntax:
21702 """""""
21703
21704 ::
21705
21706       declare <type>
21707       @llvm.experimental.constrained.nearbyint(<type> <op1>,
21708                                                metadata <rounding mode>,
21709                                                metadata <exception behavior>)
21710
21711 Overview:
21712 """""""""
21713
21714 The '``llvm.experimental.constrained.nearbyint``' intrinsic returns the first
21715 operand rounded to the nearest integer. It will not raise an inexact
21716 floating-point exception if the operand is not an integer.
21717
21718
21719 Arguments:
21720 """"""""""
21721
21722 The first argument and the return value are floating-point numbers of the same
21723 type.
21724
21725 The second and third arguments specify the rounding mode and exception
21726 behavior as described above.
21727
21728 Semantics:
21729 """"""""""
21730
21731 This function returns the same values as the libm ``nearbyint`` functions
21732 would, and handles error conditions in the same way.  The rounding mode is
21733 described, not determined, by the rounding mode argument.  The actual rounding
21734 mode is determined by the runtime floating-point environment.  The rounding
21735 mode argument is only intended as information to the compiler.
21736
21737
21738 '``llvm.experimental.constrained.maxnum``' Intrinsic
21739 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21740
21741 Syntax:
21742 """""""
21743
21744 ::
21745
      declare <type>
      @llvm.experimental.constrained.maxnum(<type> <op1>, <type> <op2>,
                                            metadata <exception behavior>)
21749
21750 Overview:
21751 """""""""
21752
21753 The '``llvm.experimental.constrained.maxnum``' intrinsic returns the maximum
21754 of the two arguments.
21755
21756 Arguments:
21757 """"""""""
21758
21759 The first two arguments and the return value are floating-point numbers
21760 of the same type.
21761
21762 The third argument specifies the exception behavior as described above.
21763
21764 Semantics:
21765 """"""""""
21766
21767 This function follows the IEEE-754 semantics for maxNum.
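
Because no rounding is involved, the call carries only an exception-behavior
argument. A minimal sketch, with the hypothetical function name
``@strict_max`` and an assumed ``.f32`` suffix:

.. code-block:: llvm

  declare float @llvm.experimental.constrained.maxnum.f32(float, float, metadata)

  define float @strict_max(float %a, float %b) #0 {
    ; Two floating-point operands followed by the exception behavior;
    ; there is no rounding-mode argument.
    %r = call float @llvm.experimental.constrained.maxnum.f32(
             float %a, float %b,
             metadata !"fpexcept.strict") #0
    ret float %r
  }

  attributes #0 = { strictfp }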
21768
21769
21770 '``llvm.experimental.constrained.minnum``' Intrinsic
21771 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21772
21773 Syntax:
21774 """""""
21775
21776 ::
21777
      declare <type>
      @llvm.experimental.constrained.minnum(<type> <op1>, <type> <op2>,
                                            metadata <exception behavior>)
21781
21782 Overview:
21783 """""""""
21784
21785 The '``llvm.experimental.constrained.minnum``' intrinsic returns the minimum
21786 of the two arguments.
21787
21788 Arguments:
21789 """"""""""
21790
21791 The first two arguments and the return value are floating-point numbers
21792 of the same type.
21793
21794 The third argument specifies the exception behavior as described above.
21795
21796 Semantics:
21797 """"""""""
21798
21799 This function follows the IEEE-754 semantics for minNum.
21800
21801
21802 '``llvm.experimental.constrained.maximum``' Intrinsic
21803 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21804
21805 Syntax:
21806 """""""
21807
21808 ::
21809
      declare <type>
      @llvm.experimental.constrained.maximum(<type> <op1>, <type> <op2>,
                                             metadata <exception behavior>)
21813
21814 Overview:
21815 """""""""
21816
21817 The '``llvm.experimental.constrained.maximum``' intrinsic returns the maximum
21818 of the two arguments, propagating NaNs and treating -0.0 as less than +0.0.
21819
21820 Arguments:
21821 """"""""""
21822
21823 The first two arguments and the return value are floating-point numbers
21824 of the same type.
21825
21826 The third argument specifies the exception behavior as described above.
21827
21828 Semantics:
21829 """"""""""
21830
This function follows the semantics specified in the draft of IEEE 754-2018.
21832
21833
21834 '``llvm.experimental.constrained.minimum``' Intrinsic
21835 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21836
21837 Syntax:
21838 """""""
21839
21840 ::
21841
      declare <type>
      @llvm.experimental.constrained.minimum(<type> <op1>, <type> <op2>,
                                             metadata <exception behavior>)
21845
21846 Overview:
21847 """""""""
21848
21849 The '``llvm.experimental.constrained.minimum``' intrinsic returns the minimum
21850 of the two arguments, propagating NaNs and treating -0.0 as less than +0.0.
21851
21852 Arguments:
21853 """"""""""
21854
21855 The first two arguments and the return value are floating-point numbers
21856 of the same type.
21857
21858 The third argument specifies the exception behavior as described above.
21859
21860 Semantics:
21861 """"""""""
21862
This function follows the semantics specified in the draft of IEEE 754-2018.
21864
21865
21866 '``llvm.experimental.constrained.ceil``' Intrinsic
21867 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21868
21869 Syntax:
21870 """""""
21871
21872 ::
21873
21874       declare <type>
21875       @llvm.experimental.constrained.ceil(<type> <op1>,
21876                                           metadata <exception behavior>)
21877
21878 Overview:
21879 """""""""
21880
21881 The '``llvm.experimental.constrained.ceil``' intrinsic returns the ceiling of the
21882 first operand.
21883
21884 Arguments:
21885 """"""""""
21886
21887 The first argument and the return value are floating-point numbers of the same
21888 type.
21889
21890 The second argument specifies the exception behavior as described above.
21891
21892 Semantics:
21893 """"""""""
21894
21895 This function returns the same values as the libm ``ceil`` functions
21896 would and handles error conditions in the same way.
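
The rounding direction of ``ceil`` is fixed, so the call takes no
rounding-mode argument. A minimal sketch, with the hypothetical function name
``@strict_ceil`` and an assumed ``.f64`` suffix:

.. code-block:: llvm

  declare double @llvm.experimental.constrained.ceil.f64(double, metadata)

  define double @strict_ceil(double %x) #0 {
    ; Only the exception behavior follows the operand.
    %r = call double @llvm.experimental.constrained.ceil.f64(
             double %x,
             metadata !"fpexcept.strict") #0
    ret double %r
  }

  attributes #0 = { strictfp }

The ``floor``, ``round``, ``roundeven``, and ``trunc`` variants described
below take the same two-argument form.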
21897
21898
21899 '``llvm.experimental.constrained.floor``' Intrinsic
21900 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21901
21902 Syntax:
21903 """""""
21904
21905 ::
21906
21907       declare <type>
21908       @llvm.experimental.constrained.floor(<type> <op1>,
21909                                            metadata <exception behavior>)
21910
21911 Overview:
21912 """""""""
21913
21914 The '``llvm.experimental.constrained.floor``' intrinsic returns the floor of the
21915 first operand.
21916
21917 Arguments:
21918 """"""""""
21919
21920 The first argument and the return value are floating-point numbers of the same
21921 type.
21922
21923 The second argument specifies the exception behavior as described above.
21924
21925 Semantics:
21926 """"""""""
21927
21928 This function returns the same values as the libm ``floor`` functions
21929 would and handles error conditions in the same way.
21930
21931
21932 '``llvm.experimental.constrained.round``' Intrinsic
21933 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21934
21935 Syntax:
21936 """""""
21937
21938 ::
21939
21940       declare <type>
21941       @llvm.experimental.constrained.round(<type> <op1>,
21942                                            metadata <exception behavior>)
21943
21944 Overview:
21945 """""""""
21946
21947 The '``llvm.experimental.constrained.round``' intrinsic returns the first
21948 operand rounded to the nearest integer.
21949
21950 Arguments:
21951 """"""""""
21952
21953 The first argument and the return value are floating-point numbers of the same
21954 type.
21955
21956 The second argument specifies the exception behavior as described above.
21957
21958 Semantics:
21959 """"""""""
21960
21961 This function returns the same values as the libm ``round`` functions
21962 would and handles error conditions in the same way.
21963
21964
21965 '``llvm.experimental.constrained.roundeven``' Intrinsic
21966 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21967
21968 Syntax:
21969 """""""
21970
21971 ::
21972
21973       declare <type>
21974       @llvm.experimental.constrained.roundeven(<type> <op1>,
21975                                                metadata <exception behavior>)
21976
21977 Overview:
21978 """""""""
21979
21980 The '``llvm.experimental.constrained.roundeven``' intrinsic returns the first
21981 operand rounded to the nearest integer in floating-point format, rounding
21982 halfway cases to even (that is, to the nearest value that is an even integer),
21983 regardless of the current rounding direction.
21984
21985 Arguments:
21986 """"""""""
21987
21988 The first argument and the return value are floating-point numbers of the same
21989 type.
21990
21991 The second argument specifies the exception behavior as described above.
21992
21993 Semantics:
21994 """"""""""
21995
This function implements the IEEE-754 operation ``roundToIntegralTiesToEven``.
It also behaves in the same way as the C standard function ``roundeven`` and
can signal the invalid operation exception for a signaling NaN operand.
21999
22000
22001 '``llvm.experimental.constrained.lround``' Intrinsic
22002 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22003
22004 Syntax:
22005 """""""
22006
22007 ::
22008
22009       declare <inttype>
22010       @llvm.experimental.constrained.lround(<fptype> <op1>,
22011                                             metadata <exception behavior>)
22012
22013 Overview:
22014 """""""""
22015
22016 The '``llvm.experimental.constrained.lround``' intrinsic returns the first
22017 operand rounded to the nearest integer with ties away from zero.  It will
22018 raise an inexact floating-point exception if the operand is not an integer.
22019 An invalid exception is raised if the result is too large to fit into a
22020 supported integer type, and in this case the result is undefined.
22021
22022 Arguments:
22023 """"""""""
22024
22025 The first argument is a floating-point number. The return value is an
22026 integer type. Not all types are supported on all targets. The supported
22027 types are the same as the ``llvm.lround`` intrinsic and the ``lround``
22028 libm functions.
22029
22030 The second argument specifies the exception behavior as described above.
22031
22032 Semantics:
22033 """"""""""
22034
22035 This function returns the same values as the libm ``lround`` functions
22036 would and handles error conditions in the same way.
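
A minimal sketch of a call that converts a ``float`` to an ``i32`` is shown
below. The function name ``@strict_lround`` is hypothetical, and the
``.i32.f32`` suffix (return type followed by operand type) is an assumption
about how the overloaded name is mangled.

.. code-block:: llvm

  declare i32 @llvm.experimental.constrained.lround.i32.f32(float, metadata)

  define i32 @strict_lround(float %x) #0 {
    ; Ties round away from zero regardless of the current rounding mode,
    ; so only the exception behavior is passed.
    %r = call i32 @llvm.experimental.constrained.lround.i32.f32(
             float %x,
             metadata !"fpexcept.strict") #0
    ret i32 %r
  }

  attributes #0 = { strictfp }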
22037
22038
22039 '``llvm.experimental.constrained.llround``' Intrinsic
22040 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22041
22042 Syntax:
22043 """""""
22044
22045 ::
22046
22047       declare <inttype>
22048       @llvm.experimental.constrained.llround(<fptype> <op1>,
22049                                              metadata <exception behavior>)
22050
22051 Overview:
22052 """""""""
22053
22054 The '``llvm.experimental.constrained.llround``' intrinsic returns the first
22055 operand rounded to the nearest integer with ties away from zero. It will
22056 raise an inexact floating-point exception if the operand is not an integer.
22057 An invalid exception is raised if the result is too large to fit into a
22058 supported integer type, and in this case the result is undefined.
22059
22060 Arguments:
22061 """"""""""
22062
22063 The first argument is a floating-point number. The return value is an
22064 integer type. Not all types are supported on all targets. The supported
22065 types are the same as the ``llvm.llround`` intrinsic and the ``llround``
22066 libm functions.
22067
22068 The second argument specifies the exception behavior as described above.
22069
22070 Semantics:
22071 """"""""""
22072
22073 This function returns the same values as the libm ``llround`` functions
22074 would and handles error conditions in the same way.
22075
22076
22077 '``llvm.experimental.constrained.trunc``' Intrinsic
22078 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22079
22080 Syntax:
22081 """""""
22082
22083 ::
22084
22085       declare <type>
22086       @llvm.experimental.constrained.trunc(<type> <op1>,
22087                                            metadata <exception behavior>)
22088
22089 Overview:
22090 """""""""
22091
22092 The '``llvm.experimental.constrained.trunc``' intrinsic returns the first
22093 operand rounded to the nearest integer not larger in magnitude than the
22094 operand.
22095
22096 Arguments:
22097 """"""""""
22098
22099 The first argument and the return value are floating-point numbers of the same
22100 type.
22101
22102 The second argument specifies the exception behavior as described above.
22103
22104 Semantics:
22105 """"""""""
22106
22107 This function returns the same values as the libm ``trunc`` functions
22108 would and handles error conditions in the same way.
22109
22110 .. _int_experimental_noalias_scope_decl:
22111
22112 '``llvm.experimental.noalias.scope.decl``' Intrinsic
22113 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22114
22115 Syntax:
22116 """""""
22117
22118
22119 ::
22120
22121       declare void @llvm.experimental.noalias.scope.decl(metadata !id.scope.list)
22122
22123 Overview:
22124 """""""""
22125
22126 The ``llvm.experimental.noalias.scope.decl`` intrinsic identifies where a
22127 noalias scope is declared. When the intrinsic is duplicated, a decision must
also be made about the scope: depending on the reason for the duplication,
22129 the scope might need to be duplicated as well.
22130
22131
22132 Arguments:
22133 """"""""""
22134
22135 The ``!id.scope.list`` argument is metadata that is a list of ``noalias``
22136 metadata references. The format is identical to that required for ``noalias``
22137 metadata. This list must have exactly one element.
22138
22139 Semantics:
22140 """"""""""
22141
22142 The ``llvm.experimental.noalias.scope.decl`` intrinsic identifies where a
22143 noalias scope is declared. When the intrinsic is duplicated, a decision must
also be made about the scope: depending on the reason for the duplication,
22145 the scope might need to be duplicated as well.
22146
22147 For example, when the intrinsic is used inside a loop body, and that loop is
22148 unrolled, the associated noalias scope must also be duplicated. Otherwise, the
22149 noalias property it signifies would spill across loop iterations, whereas it
22150 was only valid within a single iteration.
22151
.. code-block:: llvm

  ; This example shows two possible positions for noalias.decl and how they impact the semantics:
  ; If it is outside the loop (Version 1), then %a and %b are noalias across *all* iterations.
  ; If it is inside the loop (Version 2), then %a and %b are noalias only within *one* iteration.
  declare i1 @cond()

  define void @decl_in_loop(i8* %a.base, i8* %b.base) {
  entry:
    ; call void @llvm.experimental.noalias.scope.decl(metadata !2) ; Version 1: noalias decl outside loop
    br label %loop

  loop:
    %a = phi i8* [ %a.base, %entry ], [ %a.inc, %loop ]
    %b = phi i8* [ %b.base, %entry ], [ %b.inc, %loop ]
    ; call void @llvm.experimental.noalias.scope.decl(metadata !2) ; Version 2: noalias decl inside loop
    %val = load i8, i8* %a, !alias.scope !2
    store i8 %val, i8* %b, !noalias !2
    %a.inc = getelementptr inbounds i8, i8* %a, i64 1
    %b.inc = getelementptr inbounds i8, i8* %b, i64 1
    %cond = call i1 @cond()
    br i1 %cond, label %loop, label %exit

  exit:
    ret void
  }

  !0 = !{!0} ; domain
  !1 = !{!1, !0} ; scope
  !2 = !{!1} ; scope list
22180
Multiple calls to ``@llvm.experimental.noalias.scope.decl`` for the same scope
22182 are possible, but one should never dominate another. Violations are pointed out
22183 by the verifier as they indicate a problem in either a transformation pass or
22184 the input.
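
As a sketch of a placement this rule permits, the two declarations below name
the same scope but sit in sibling blocks, so neither dominates the other. The
function name ``@decl_in_branches`` is hypothetical.

.. code-block:: llvm

  declare void @llvm.experimental.noalias.scope.decl(metadata)

  define void @decl_in_branches(i1 %c) {
  entry:
    br i1 %c, label %then, label %else

  then:
    ; Reached only when %c is true.
    call void @llvm.experimental.noalias.scope.decl(metadata !2)
    br label %exit

  else:
    ; Reached only when %c is false; not dominated by the call in %then.
    call void @llvm.experimental.noalias.scope.decl(metadata !2)
    br label %exit

  exit:
    ret void
  }

  !0 = !{!0} ; domain
  !1 = !{!1, !0} ; scope
  !2 = !{!1} ; scope list

Moving either call into ``entry`` would make it dominate the other, which is
the situation the rule above rules out.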
22185
22186
22187 Floating Point Environment Manipulation intrinsics
22188 --------------------------------------------------
22189
These functions read or write the floating-point environment, such as the
rounding mode or the state of floating-point exceptions. Altering the
floating-point environment requires special care. See
:ref:`Floating Point Environment <floatenv>`.
22193
22194 '``llvm.flt.rounds``' Intrinsic
22195 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22196
22197 Syntax:
22198 """""""
22199
22200 ::
22201
22202       declare i32 @llvm.flt.rounds()
22203
22204 Overview:
22205 """""""""
22206
22207 The '``llvm.flt.rounds``' intrinsic reads the current rounding mode.
22208
22209 Semantics:
22210 """"""""""
22211
The '``llvm.flt.rounds``' intrinsic returns the current rounding mode.
The encoding of the returned value is the same as the result of ``FLT_ROUNDS``,
as specified by the C standard:
22215
22216 ::
22217
22218     0  - toward zero
22219     1  - to nearest, ties to even
22220     2  - toward positive infinity
22221     3  - toward negative infinity
22222     4  - to nearest, ties away from zero
22223
Other values may be used to represent additional rounding modes supported by a
target. These values are target-specific.
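
A minimal sketch of reading the mode and testing it against the encoding
above; the function name ``@is_round_to_nearest_even`` is hypothetical.

.. code-block:: llvm

  declare i32 @llvm.flt.rounds()

  define i1 @is_round_to_nearest_even() {
    %mode = call i32 @llvm.flt.rounds()
    ; 1 encodes "to nearest, ties to even" in the table above.
    %is.rne = icmp eq i32 %mode, 1
    ret i1 %is.rne
  }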
22226
22227
22228 '``llvm.set.rounding``' Intrinsic
22229 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22230
22231 Syntax:
22232 """""""
22233
22234 ::
22235
22236       declare void @llvm.set.rounding(i32 <val>)
22237
22238 Overview:
22239 """""""""
22240
The '``llvm.set.rounding``' intrinsic sets the current rounding mode.
22242
22243 Arguments:
22244 """"""""""
22245
The argument is the required rounding mode. The encoding of the rounding mode
is the same as that used by '``llvm.flt.rounds``'.
22248
22249 Semantics:
22250 """"""""""
22251
The '``llvm.set.rounding``' intrinsic sets the current rounding mode. It is
similar to the C library function ``fesetround``; however, this intrinsic does
not return any value and uses a platform-independent representation of IEEE
rounding modes.
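
One common pattern is to save the current mode, switch modes for a region of
code, and then restore the saved value. The sketch below assumes a
hypothetical callee ``@compute`` and a hypothetical wrapper name; the
``strictfp`` attribute is assumed to be appropriate because the code depends
on a non-default floating-point environment.

.. code-block:: llvm

  declare i32 @llvm.flt.rounds()
  declare void @llvm.set.rounding(i32)
  declare double @compute(double) ; hypothetical callee

  define double @with_round_toward_zero(double %x) #0 {
    ; Remember the mode that is active on entry.
    %saved = call i32 @llvm.flt.rounds() #0
    ; 0 encodes "toward zero" in the llvm.flt.rounds encoding.
    call void @llvm.set.rounding(i32 0) #0
    %r = call double @compute(double %x) #0
    ; Restore the caller's rounding mode before returning.
    call void @llvm.set.rounding(i32 %saved) #0
    ret double %r
  }

  attributes #0 = { strictfp }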
22256
22257
22258 General Intrinsics
22259 ------------------
22260
22261 This class of intrinsics is designed to be generic and has no specific
22262 purpose.
22263
22264 '``llvm.var.annotation``' Intrinsic
22265 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22266
22267 Syntax:
22268 """""""
22269
22270 ::
22271
22272       declare void @llvm.var.annotation(i8* <val>, i8* <str>, i8* <str>, i32  <int>)
22273
22274 Overview:
22275 """""""""
22276
The '``llvm.var.annotation``' intrinsic annotates a local variable with
arbitrary strings.
22278
22279 Arguments:
22280 """"""""""
22281
22282 The first argument is a pointer to a value, the second is a pointer to a
22283 global string, the third is a pointer to a global string which is the
22284 source file name, and the last argument is the line number.
22285
22286 Semantics:
22287 """"""""""
22288
22289 This intrinsic allows annotation of local variables with arbitrary
22290 strings. This can be useful for special purpose optimizations that want
22291 to look for these annotations. These have no other defined use; they are
22292 ignored by code generation and optimization.
22293
22294 '``llvm.ptr.annotation.*``' Intrinsic
22295 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22296
22297 Syntax:
22298 """""""
22299
22300 This is an overloaded intrinsic. You can use '``llvm.ptr.annotation``' on a
22301 pointer to an integer of any width. *NOTE* you must specify an address space for
22302 the pointer. The identifier for the default address space is the integer
22303 '``0``'.
22304
22305 ::
22306
22307       declare i8*   @llvm.ptr.annotation.p<address space>i8(i8* <val>, i8* <str>, i8* <str>, i32  <int>)
22308       declare i16*  @llvm.ptr.annotation.p<address space>i16(i16* <val>, i8* <str>, i8* <str>, i32  <int>)
22309       declare i32*  @llvm.ptr.annotation.p<address space>i32(i32* <val>, i8* <str>, i8* <str>, i32  <int>)
22310       declare i64*  @llvm.ptr.annotation.p<address space>i64(i64* <val>, i8* <str>, i8* <str>, i32  <int>)
22311       declare i256* @llvm.ptr.annotation.p<address space>i256(i256* <val>, i8* <str>, i8* <str>, i32  <int>)
22312
22313 Overview:
22314 """""""""
22315
The '``llvm.ptr.annotation``' intrinsic annotates a pointer to an integer with
arbitrary strings and returns that pointer.
22317
22318 Arguments:
22319 """"""""""
22320
22321 The first argument is a pointer to an integer value of arbitrary bitwidth
22322 (result of some expression), the second is a pointer to a global string, the
22323 third is a pointer to a global string which is the source file name, and the
22324 last argument is the line number. It returns the value of the first argument.
22325
22326 Semantics:
22327 """"""""""
22328
22329 This intrinsic allows annotation of a pointer to an integer with arbitrary
22330 strings. This can be useful for special purpose optimizations that want to look
22331 for these annotations. These have no other defined use; they are ignored by code
22332 generation and optimization.
22333
22334 '``llvm.annotation.*``' Intrinsic
22335 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22336
22337 Syntax:
22338 """""""
22339
22340 This is an overloaded intrinsic. You can use '``llvm.annotation``' on
22341 any integer bit width.
22342
22343 ::
22344
22345       declare i8 @llvm.annotation.i8(i8 <val>, i8* <str>, i8* <str>, i32  <int>)
22346       declare i16 @llvm.annotation.i16(i16 <val>, i8* <str>, i8* <str>, i32  <int>)
22347       declare i32 @llvm.annotation.i32(i32 <val>, i8* <str>, i8* <str>, i32  <int>)
22348       declare i64 @llvm.annotation.i64(i64 <val>, i8* <str>, i8* <str>, i32  <int>)
22349       declare i256 @llvm.annotation.i256(i256 <val>, i8* <str>, i8* <str>, i32  <int>)
22350
22351 Overview:
22352 """""""""
22353
The '``llvm.annotation``' intrinsic annotates an integer value with an
arbitrary string and returns the value unchanged.
22355
22356 Arguments:
22357 """"""""""
22358
22359 The first argument is an integer value (result of some expression), the
22360 second is a pointer to a global string, the third is a pointer to a
22361 global string which is the source file name, and the last argument is
22362 the line number. It returns the value of the first argument.
22363
22364 Semantics:
22365 """"""""""
22366
22367 This intrinsic allows annotations to be put on arbitrary expressions
22368 with arbitrary strings. This can be useful for special purpose
22369 optimizations that want to look for these annotations. These have no
22370 other defined use; they are ignored by code generation and optimization.
22371
22372 '``llvm.codeview.annotation``' Intrinsic
22373 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22374
22375 Syntax:
22376 """""""
22377
22378 This annotation emits a label at its program point and an associated
22379 ``S_ANNOTATION`` codeview record with some additional string metadata. This is
22380 used to implement MSVC's ``__annotation`` intrinsic. It is marked
22381 ``noduplicate``, so calls to this intrinsic prevent inlining and should be
22382 considered expensive.
22383
22384 ::
22385
22386       declare void @llvm.codeview.annotation(metadata)
22387
22388 Arguments:
22389 """"""""""
22390
22391 The argument should be an MDTuple containing any number of MDStrings.
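
A minimal sketch of a call site is shown below. The function name ``@f`` and the
two strings are hypothetical and chosen only for illustration.

.. code-block:: llvm

      declare void @llvm.codeview.annotation(metadata)

      define void @f() {
        ; Emits a label here plus an S_ANNOTATION record carrying both strings.
        call void @llvm.codeview.annotation(metadata !0)
        ret void
      }

      ; An MDTuple whose operands are all MDStrings.
      !0 = !{!"tag", !"payload"}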
22392
22393 '``llvm.trap``' Intrinsic
22394 ^^^^^^^^^^^^^^^^^^^^^^^^^
22395
22396 Syntax:
22397 """""""
22398
22399 ::
22400
22401       declare void @llvm.trap() cold noreturn nounwind
22402
22403 Overview:
22404 """""""""
22405
The '``llvm.trap``' intrinsic causes an execution trap. As its ``noreturn``
attribute indicates, it does not return to its caller.
22407
22408 Arguments:
22409 """"""""""
22410
22411 None.
22412
22413 Semantics:
22414 """"""""""
22415
22416 This intrinsic is lowered to the target dependent trap instruction. If
22417 the target does not have a trap instruction, this intrinsic will be
22418 lowered to a call of the ``abort()`` function.
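
As an illustration, the sketch below traps on a zero divisor. The function
``@checked_div`` and its block names are hypothetical. Because the intrinsic is
``noreturn``, the call is followed by ``unreachable``.

.. code-block:: llvm

      declare void @llvm.trap() cold noreturn nounwind

      define i32 @checked_div(i32 %a, i32 %b) {
      entry:
        %is_zero = icmp eq i32 %b, 0
        br i1 %is_zero, label %fail, label %ok

      fail:
        ; Control never continues past the trap.
        call void @llvm.trap()
        unreachable

      ok:
        %q = udiv i32 %a, %b
        ret i32 %q
      }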
22419
22420 '``llvm.debugtrap``' Intrinsic
22421 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22422
22423 Syntax:
22424 """""""
22425
22426 ::
22427
22428       declare void @llvm.debugtrap() nounwind
22429
22430 Overview:
22431 """""""""
22432
The '``llvm.debugtrap``' intrinsic causes an execution trap intended to request
the attention of a debugger.
22434
22435 Arguments:
22436 """"""""""
22437
22438 None.
22439
22440 Semantics:
22441 """"""""""
22442
22443 This intrinsic is lowered to code which is intended to cause an
22444 execution trap with the intention of requesting the attention of a
22445 debugger.
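
A small sketch follows, with the hypothetical function name ``@break_here``.
Unlike ``llvm.trap``, the declaration above is not ``noreturn``, so the call is
followed by ordinary code.

.. code-block:: llvm

      declare void @llvm.debugtrap() nounwind

      define void @break_here() {
        ; Request a debugger's attention at this point.
        call void @llvm.debugtrap()
        ret void
      }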
22446
22447 '``llvm.ubsantrap``' Intrinsic
22448 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22449
22450 Syntax:
22451 """""""
22452
22453 ::
22454
22455       declare void @llvm.ubsantrap(i8 immarg) cold noreturn nounwind
22456
22457 Overview:
22458 """""""""
22459
The '``llvm.ubsantrap``' intrinsic causes an execution trap that, where the
target supports it, encodes the kind of failure detected.
22461
22462 Arguments:
22463 """"""""""
22464
22465 An integer describing the kind of failure detected.
22466
22467 Semantics:
22468 """"""""""
22469
This intrinsic is lowered to code which is intended to cause an execution trap.
Where possible, the argument is embedded in the encoding of that trap so that
different kinds of crash can be told apart.
22473
22474 Equivalent to ``@llvm.trap`` for targets that do not support this behaviour.
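
The sketch below traps on signed overflow. The function ``@add_checked`` is
hypothetical, and the failure kind ``0`` is an arbitrary value chosen for
illustration, not a documented encoding.

.. code-block:: llvm

      declare void @llvm.ubsantrap(i8 immarg) cold noreturn nounwind
      declare { i32, i1 } @llvm.sadd.with.overflow.i32(i32, i32)

      define i32 @add_checked(i32 %a, i32 %b) {
      entry:
        %res = call { i32, i1 } @llvm.sadd.with.overflow.i32(i32 %a, i32 %b)
        %ovf = extractvalue { i32, i1 } %res, 1
        br i1 %ovf, label %trap, label %cont

      trap:
        ; The argument must be an immediate constant (immarg).
        call void @llvm.ubsantrap(i8 0)
        unreachable

      cont:
        %sum = extractvalue { i32, i1 } %res, 0
        ret i32 %sum
      }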
22475
22476 '``llvm.stackprotector``' Intrinsic
22477 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22478
22479 Syntax:
22480 """""""
22481
22482 ::
22483
22484       declare void @llvm.stackprotector(i8* <guard>, i8** <slot>)
22485
22486 Overview:
22487 """""""""
22488
22489 The ``llvm.stackprotector`` intrinsic takes the ``guard`` and stores it
22490 onto the stack at ``slot``. The stack slot is adjusted to ensure that it
22491 is placed on the stack before local variables.
22492
22493 Arguments:
22494 """"""""""
22495
22496 The ``llvm.stackprotector`` intrinsic requires two pointer arguments.
22497 The first argument is the value loaded from the stack guard
``@__stack_chk_guard``. The second argument is an ``alloca`` that has
enough space to hold the value of the guard.
22500
22501 Semantics:
22502 """"""""""
22503
22504 This intrinsic causes the prologue/epilogue inserter to force the position of
22505 the ``AllocaInst`` stack slot to be before local variables on the stack. This is
22506 to ensure that if a local variable on the stack is overwritten, it will destroy
22507 the value of the guard. When the function exits, the guard on the stack is
22508 checked against the original guard by ``llvm.stackprotectorcheck``. If they are
22509 different, then ``llvm.stackprotectorcheck`` causes the program to abort by
22510 calling the ``__stack_chk_fail()`` function.
22511
22512 '``llvm.stackguard``' Intrinsic
22513 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22514
22515 Syntax:
22516 """""""
22517
22518 ::
22519
22520       declare i8* @llvm.stackguard()
22521
22522 Overview:
22523 """""""""
22524
22525 The ``llvm.stackguard`` intrinsic returns the system stack guard value.
22526
It should not be generated by frontends, since it is intended for internal use
only. It exists because the IR form of the stack protector is still supported
in FastISel.
22530
22531 Arguments:
22532 """"""""""
22533
22534 None.
22535
22536 Semantics:
22537 """"""""""
22538
22539 On some platforms, the value returned by this intrinsic remains unchanged
22540 between loads in the same thread. On other platforms, it returns the same
22541 global variable value, if any, e.g. ``@__stack_chk_guard``.
22542
Currently, some platforms (e.g. X86 Linux) have customized IR-level stack guard
loading that is not handled by ``llvm.stackguard()``, although it should be in
the future.
22546
22547 '``llvm.objectsize``' Intrinsic
22548 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22549
22550 Syntax:
22551 """""""
22552
22553 ::
22554
22555       declare i32 @llvm.objectsize.i32(i8* <object>, i1 <min>, i1 <nullunknown>, i1 <dynamic>)
22556       declare i64 @llvm.objectsize.i64(i8* <object>, i1 <min>, i1 <nullunknown>, i1 <dynamic>)
22557
22558 Overview:
22559 """""""""
22560
22561 The ``llvm.objectsize`` intrinsic is designed to provide information to the
22562 optimizer to determine whether a) an operation (like memcpy) will overflow a
22563 buffer that corresponds to an object, or b) that a runtime check for overflow
22564 isn't necessary. An object in this context means an allocation of a specific
22565 class, structure, array, or other object.
22566
22567 Arguments:
22568 """"""""""
22569
22570 The ``llvm.objectsize`` intrinsic takes four arguments. The first argument is a
22571 pointer to or into the ``object``. The second argument determines whether
22572 ``llvm.objectsize`` returns 0 (if true) or -1 (if false) when the object size is
22573 unknown. The third argument controls how ``llvm.objectsize`` acts when ``null``
22574 in address space 0 is used as its pointer argument. If it's ``false``,
22575 ``llvm.objectsize`` reports 0 bytes available when given ``null``. Otherwise, if
22576 the ``null`` is in a non-zero address space or if ``true`` is given for the
22577 third argument of ``llvm.objectsize``, we assume its size is unknown. The fourth
22578 argument to ``llvm.objectsize`` determines if the value should be evaluated at
22579 runtime.
22580
22581 The second, third, and fourth arguments only accept constants.
22582
22583 Semantics:
22584 """"""""""
22585
22586 The ``llvm.objectsize`` intrinsic is lowered to a value representing the size of
22587 the object concerned. If the size cannot be determined, ``llvm.objectsize``
22588 returns ``i32/i64 -1 or 0`` (depending on the ``min`` argument).
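
For illustration, the sketch below queries a pointer into a 16-byte ``alloca``.
The function ``@size_of_local`` is hypothetical. It uses the intrinsic name as
declared above, and the three ``i1`` arguments are constants, as required.

.. code-block:: llvm

      declare i64 @llvm.objectsize.i64(i8*, i1, i1, i1)

      define i64 @size_of_local() {
        %buf = alloca [16 x i8]
        ; A pointer into the object, four bytes past its start.
        %p = getelementptr inbounds [16 x i8], [16 x i8]* %buf, i64 0, i64 4
        ; min = false: report -1 if the size is unknown.
        ; nullunknown = true: treat a null pointer as having unknown size.
        ; dynamic = false: do not request evaluation at runtime.
        %size = call i64 @llvm.objectsize.i64(i8* %p, i1 false, i1 true, i1 false)
        ret i64 %size
      }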
22589
22590 '``llvm.expect``' Intrinsic
22591 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
22592
22593 Syntax:
22594 """""""
22595
22596 This is an overloaded intrinsic. You can use ``llvm.expect`` on any
22597 integer bit width.
22598
22599 ::
22600
22601       declare i1 @llvm.expect.i1(i1 <val>, i1 <expected_val>)
22602       declare i32 @llvm.expect.i32(i32 <val>, i32 <expected_val>)
22603       declare i64 @llvm.expect.i64(i64 <val>, i64 <expected_val>)
22604
22605 Overview:
22606 """""""""
22607
The ``llvm.expect`` intrinsic provides information about the expected (most
probable) value of ``val``, which can be used by optimizers.
22610
22611 Arguments:
22612 """"""""""
22613
22614 The ``llvm.expect`` intrinsic takes two arguments. The first argument is
22615 a value. The second argument is an expected value.
22616
22617 Semantics:
22618 """"""""""
22619
This intrinsic is lowered to ``val``.
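
A minimal sketch of a branch hint follows. The function ``@f`` and its block
names are hypothetical. The comparison is expected to be false, so the
``%common`` block is the likely successor.

.. code-block:: llvm

      declare i1 @llvm.expect.i1(i1, i1)

      define i32 @f(i32 %x) {
      entry:
        %cmp = icmp eq i32 %x, 0
        ; Tell the optimizer that %cmp is most probably false.
        %hinted = call i1 @llvm.expect.i1(i1 %cmp, i1 false)
        br i1 %hinted, label %rare, label %common

      rare:
        ret i32 -1

      common:
        ret i32 %x
      }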
22621
22622 '``llvm.expect.with.probability``' Intrinsic
22623 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22624
22625 Syntax:
22626 """""""
22627
22628 This intrinsic is similar to ``llvm.expect``. This is an overloaded intrinsic.
22629 You can use ``llvm.expect.with.probability`` on any integer bit width.
22630
22631 ::
22632
22633       declare i1 @llvm.expect.with.probability.i1(i1 <val>, i1 <expected_val>, double <prob>)
22634       declare i32 @llvm.expect.with.probability.i32(i32 <val>, i32 <expected_val>, double <prob>)
22635       declare i64 @llvm.expect.with.probability.i64(i64 <val>, i64 <expected_val>, double <prob>)
22636
22637 Overview:
22638 """""""""
22639
The ``llvm.expect.with.probability`` intrinsic provides information about the
expected value of ``val`` with probability (or confidence) ``prob``, which can
be used by optimizers.
22643
22644 Arguments:
22645 """"""""""
22646
22647 The ``llvm.expect.with.probability`` intrinsic takes three arguments. The first
22648 argument is a value. The second argument is an expected value. The third
22649 argument is a probability.
22650
22651 Semantics:
22652 """"""""""
22653
This intrinsic is lowered to ``val``.
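
The sketch below states that a comparison is true with probability 0.9. The
function ``@g`` and the chosen probability are assumptions made for
illustration.

.. code-block:: llvm

      declare i1 @llvm.expect.with.probability.i1(i1, i1, double)

      define i32 @g(i32 %x) {
      entry:
        %cmp = icmp sgt i32 %x, 0
        ; %cmp is expected to be true with probability 0.9.
        %hinted = call i1 @llvm.expect.with.probability.i1(i1 %cmp, i1 true, double 9.000000e-01)
        br i1 %hinted, label %likely, label %unlikely

      likely:
        ret i32 %x

      unlikely:
        ret i32 0
      }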
22655
22656 .. _int_assume:
22657
22658 '``llvm.assume``' Intrinsic
22659 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22660
22661 Syntax:
22662 """""""
22663
22664 ::
22665
22666       declare void @llvm.assume(i1 %cond)
22667
22668 Overview:
22669 """""""""
22670
The ``llvm.assume`` intrinsic allows the optimizer to assume that the provided
condition is true. This information can then be used in simplifying other parts
of the code.
22674
22675 More complex assumptions can be encoded as
22676 :ref:`assume operand bundles <assume_opbundles>`.
22677
22678 Arguments:
22679 """"""""""
22680
22681 The argument of the call is the condition which the optimizer may assume is
22682 always true.
22683
22684 Semantics:
22685 """"""""""
22686
22687 The intrinsic allows the optimizer to assume that the provided condition is
22688 always true whenever the control flow reaches the intrinsic call. No code is
22689 generated for this intrinsic, and instructions that contribute only to the
22690 provided condition are not used for code generation. If the condition is
22691 violated during execution, the behavior is undefined.
22692
22693 Note that the optimizer might limit the transformations performed on values
22694 used by the ``llvm.assume`` intrinsic in order to preserve the instructions
22695 only used to form the intrinsic's input argument. This might prove undesirable
22696 if the extra information provided by the ``llvm.assume`` intrinsic does not cause
22697 sufficient overall improvement in code quality. For this reason,
22698 ``llvm.assume`` should not be used to document basic mathematical invariants
22699 that the optimizer can otherwise deduce or facts that are of little use to the
22700 optimizer.
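
As a sketch, the hypothetical function below tells the optimizer that ``%n`` is
positive before dividing by it. If ``%n`` were not positive at run time, the
behavior would be undefined.

.. code-block:: llvm

      declare void @llvm.assume(i1)

      define i32 @div_by_positive(i32 %x, i32 %n) {
        %positive = icmp sgt i32 %n, 0
        ; From here on, the optimizer may treat %positive as true.
        call void @llvm.assume(i1 %positive)
        %q = udiv i32 %x, %n
        ret i32 %q
      }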
22701
22702 .. _int_ssa_copy:
22703
22704 '``llvm.ssa.copy``' Intrinsic
22705 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22706
22707 Syntax:
22708 """""""
22709
22710 ::
22711
22712       declare type @llvm.ssa.copy(type %operand) returned(1) readnone
22713
22714 Arguments:
22715 """"""""""
22716
22717 The first argument is an operand which is used as the returned value.
22718
22719 Overview:
22720 """"""""""
22721
22722 The ``llvm.ssa.copy`` intrinsic can be used to attach information to
22723 operations by copying them and giving them new names.  For example,
22724 the PredicateInfo utility uses it to build Extended SSA form, and
22725 attach various forms of information to operands that dominate specific
22726 uses.  It is not meant for general use, only for building temporary
22727 renaming forms that require value splits at certain points.
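
The following sketch shows the kind of renaming this enables. The function
``@f`` and the value name ``%x.0`` are hypothetical, and the example is written
by hand rather than taken from PredicateInfo output. In the ``%is_zero`` block
the copy gives ``%x`` a new name to which the fact "equal to zero" could be
attached.

.. code-block:: llvm

      declare i32 @llvm.ssa.copy.i32(i32)

      define i32 @f(i32 %x) {
      entry:
        %cmp = icmp eq i32 %x, 0
        br i1 %cmp, label %is_zero, label %not_zero

      is_zero:
        ; A renamed copy of %x that is only live where %cmp is known true.
        %x.0 = call i32 @llvm.ssa.copy.i32(i32 %x)
        ret i32 %x.0

      not_zero:
        ret i32 %x
      }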
22728
22729 .. _type.test:
22730
22731 '``llvm.type.test``' Intrinsic
22732 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22733
22734 Syntax:
22735 """""""
22736
22737 ::
22738
22739       declare i1 @llvm.type.test(i8* %ptr, metadata %type) nounwind readnone
22740
22741
22742 Arguments:
22743 """"""""""
22744
22745 The first argument is a pointer to be tested. The second argument is a
22746 metadata object representing a :doc:`type identifier <TypeMetadata>`.
22747
22748 Overview:
22749 """""""""
22750
22751 The ``llvm.type.test`` intrinsic tests whether the given pointer is associated
22752 with the given type identifier.
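
A minimal sketch of a checked use follows. The function ``@check`` and the type
identifier ``!"typeid_A"`` are hypothetical.

.. code-block:: llvm

      declare i1 @llvm.type.test(i8*, metadata)
      declare void @llvm.trap()

      define void @check(i8* %vtable) {
      entry:
        ; Test whether %vtable is associated with the type identifier.
        %ok = call i1 @llvm.type.test(i8* %vtable, metadata !"typeid_A")
        br i1 %ok, label %cont, label %fail

      fail:
        call void @llvm.trap()
        unreachable

      cont:
        ret void
      }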
22753
22754 .. _type.checked.load:
22755
22756 '``llvm.type.checked.load``' Intrinsic
22757 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22758
22759 Syntax:
22760 """""""
22761
22762 ::
22763
22764       declare {i8*, i1} @llvm.type.checked.load(i8* %ptr, i32 %offset, metadata %type) argmemonly nounwind readonly
22765
22766
22767 Arguments:
22768 """"""""""
22769
22770 The first argument is a pointer from which to load a function pointer. The
22771 second argument is the byte offset from which to load the function pointer. The
22772 third argument is a metadata object representing a :doc:`type identifier
22773 <TypeMetadata>`.
22774
22775 Overview:
22776 """""""""
22777
22778 The ``llvm.type.checked.load`` intrinsic safely loads a function pointer from a
22779 virtual table pointer using type metadata. This intrinsic is used to implement
22780 control flow integrity in conjunction with virtual call optimization. The
22781 virtual call optimization pass will optimize away ``llvm.type.checked.load``
22782 intrinsics associated with devirtualized calls, thereby removing the type
22783 check in cases where it is not needed to enforce the control flow integrity
22784 constraint.
22785
22786 If the given pointer is associated with a type metadata identifier, this
22787 function returns true as the second element of its return value. (Note that
22788 the function may also return true if the given pointer is not associated
22789 with a type metadata identifier.) If the function's return value's second
22790 element is true, the following rules apply to the first element:
22791
22792 - If the given pointer is associated with the given type metadata identifier,
22793   it is the function pointer loaded from the given byte offset from the given
22794   pointer.
22795
22796 - If the given pointer is not associated with the given type metadata
22797   identifier, it is one of the following (the choice of which is unspecified):
22798
22799   1. The function pointer that would have been loaded from an arbitrarily chosen
22800      (through an unspecified mechanism) pointer associated with the type
22801      metadata.
22802
22803   2. If the function has a non-void return type, a pointer to a function that
22804      returns an unspecified value without causing side effects.
22805
22806 If the function's return value's second element is false, the value of the
22807 first element is undefined.
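
The rules above can be sketched as follows. The function ``@load_vfn`` and the
type identifier ``!"typeid_A"`` are hypothetical. The loaded pointer is only
used when the second element of the result is true.

.. code-block:: llvm

      declare {i8*, i1} @llvm.type.checked.load(i8*, i32, metadata)
      declare void @llvm.trap()

      define i8* @load_vfn(i8* %vtable) {
      entry:
        ; Load the function pointer at byte offset 0 of the virtual table.
        %pair = call {i8*, i1} @llvm.type.checked.load(i8* %vtable, i32 0, metadata !"typeid_A")
        %ok = extractvalue {i8*, i1} %pair, 1
        br i1 %ok, label %cont, label %fail

      fail:
        call void @llvm.trap()
        unreachable

      cont:
        %fn = extractvalue {i8*, i1} %pair, 0
        ret i8* %fn
      }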
22808
22809
22810 '``llvm.arithmetic.fence``' Intrinsic
22811 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22812
22813 Syntax:
22814 """""""
22815
22816 ::
22817
      declare <type> @llvm.arithmetic.fence(<type> <op>)
22820
22821 Overview:
22822 """""""""
22823
The purpose of the ``llvm.arithmetic.fence`` intrinsic is to prevent the
optimizer from performing fast-math optimizations, particularly reassociation,
between the argument and the expression that contains the argument. It can be
used to preserve the parentheses in the source language.
22829
22830 Arguments:
22831 """"""""""
22832
22833 The ``llvm.arithmetic.fence`` intrinsic takes only one argument.
22834 The argument and the return value are floating-point numbers,
22835 or vector floating-point numbers, of the same type.
22836
22837 Semantics:
22838 """"""""""
22839
22840 This intrinsic returns the value of its operand. The optimizer can optimize
22841 the argument, but the optimizer cannot hoist any component of the operand
22842 to the containing context, and the optimizer cannot move the calculation of
22843 any expression in the containing context into the operand.
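
For example, a source expression ``(a + b) + c`` compiled with reassociation
enabled might be represented as in the sketch below. The function ``@sum`` is
hypothetical. The fence keeps ``%a + %b`` as a unit, so ``%c`` cannot be
reassociated into it.

.. code-block:: llvm

      declare float @llvm.arithmetic.fence.f32(float)

      define float @sum(float %a, float %b, float %c) {
        %ab = fadd reassoc float %a, %b
        ; Preserve the parenthesized subexpression (a + b).
        %fenced = call float @llvm.arithmetic.fence.f32(float %ab)
        %r = fadd reassoc float %fenced, %c
        ret float %r
      }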
22844
22845
22846 '``llvm.donothing``' Intrinsic
22847 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22848
22849 Syntax:
22850 """""""
22851
22852 ::
22853
22854       declare void @llvm.donothing() nounwind readnone
22855
22856 Overview:
22857 """""""""
22858
22859 The ``llvm.donothing`` intrinsic doesn't perform any operation. It's one of only
22860 three intrinsics (besides ``llvm.experimental.patchpoint`` and
22861 ``llvm.experimental.gc.statepoint``) that can be called with an invoke
22862 instruction.
22863
22864 Arguments:
22865 """"""""""
22866
22867 None.
22868
22869 Semantics:
22870 """"""""""
22871
22872 This intrinsic does nothing, and it's removed by optimizers and ignored
22873 by codegen.
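
A sketch of calling the intrinsic with ``invoke`` is shown below. The function
``@f`` and the choice of ``@__gxx_personality_v0`` as the personality function
are assumptions made for illustration.

.. code-block:: llvm

      declare void @llvm.donothing() nounwind readnone
      declare i32 @__gxx_personality_v0(...)

      define void @f() personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*) {
      entry:
        invoke void @llvm.donothing()
                to label %cont unwind label %lpad

      cont:
        ret void

      lpad:
        %lp = landingpad { i8*, i32 }
                cleanup
        ret void
      }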
22874
22875 '``llvm.experimental.deoptimize``' Intrinsic
22876 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22877
22878 Syntax:
22879 """""""
22880
22881 ::
22882
22883       declare type @llvm.experimental.deoptimize(...) [ "deopt"(...) ]
22884
22885 Overview:
22886 """""""""
22887
This intrinsic, together with :ref:`deoptimization operand bundles
<deopt_opbundles>`, allows frontends to express transfer of control and
22890 frame-local state from the currently executing (typically more specialized,
22891 hence faster) version of a function into another (typically more generic, hence
22892 slower) version.
22893
22894 In languages with a fully integrated managed runtime like Java and JavaScript
22895 this intrinsic can be used to implement "uncommon trap" or "side exit" like
22896 functionality.  In unmanaged languages like C and C++, this intrinsic can be
22897 used to represent the slow paths of specialized functions.
22898
22899
22900 Arguments:
22901 """"""""""
22902
22903 The intrinsic takes an arbitrary number of arguments, whose meaning is
22904 decided by the :ref:`lowering strategy<deoptimize_lowering>`.
22905
22906 Semantics:
22907 """"""""""
22908
22909 The ``@llvm.experimental.deoptimize`` intrinsic executes an attached
22910 deoptimization continuation (denoted using a :ref:`deoptimization
22911 operand bundle <deopt_opbundles>`) and returns the value returned by
22912 the deoptimization continuation.  Defining the semantic properties of
22913 the continuation itself is out of scope of the language reference --
22914 as far as LLVM is concerned, the deoptimization continuation can
22915 invoke arbitrary side effects, including reading from and writing to
22916 the entire heap.
22917
22918 Deoptimization continuations expressed using ``"deopt"`` operand bundles always
22919 continue execution to the end of the physical frame containing them, so all
22920 calls to ``@llvm.experimental.deoptimize`` must be in "tail position":
22921
22922    - ``@llvm.experimental.deoptimize`` cannot be invoked.
22923    - The call must immediately precede a :ref:`ret <i_ret>` instruction.
22924    - The ``ret`` instruction must return the value produced by the
22925      ``@llvm.experimental.deoptimize`` call if there is one, or void.
22926
22927 Note that the above restrictions imply that the return type for a call to
22928 ``@llvm.experimental.deoptimize`` will match the return type of its immediate
22929 caller.
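
The restrictions above can be sketched as follows. The function ``@f``, the
range check, and the contents of the ``"deopt"`` bundle are hypothetical. The
call is not an ``invoke``, it immediately precedes the ``ret``, and the ``ret``
returns the value the call produced.

.. code-block:: llvm

      declare i32 @llvm.experimental.deoptimize.i32(...)

      define i32 @f(i32 %x) {
      entry:
        %fast = icmp ult i32 %x, 100
        br i1 %fast, label %ok, label %slow

      ok:
        ret i32 %x

      slow:
        ; Tail position: the call is directly followed by a ret of its result.
        %v = call i32 (...) @llvm.experimental.deoptimize.i32(i32 %x) [ "deopt"(i32 %x) ]
        ret i32 %v
      }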
22930
22931 The inliner composes the ``"deopt"`` continuations of the caller into the
22932 ``"deopt"`` continuations present in the inlinee, and also updates calls to this
22933 intrinsic to return directly from the frame of the function it inlined into.
22934
22935 All declarations of ``@llvm.experimental.deoptimize`` must share the
22936 same calling convention.
22937
22938 .. _deoptimize_lowering:
22939
22940 Lowering:
22941 """""""""
22942
22943 Calls to ``@llvm.experimental.deoptimize`` are lowered to calls to the
22944 symbol ``__llvm_deoptimize`` (it is the frontend's responsibility to
22945 ensure that this symbol is defined).  The call arguments to
22946 ``@llvm.experimental.deoptimize`` are lowered as if they were formal
22947 arguments of the specified types, and not as varargs.
22948
22949
22950 '``llvm.experimental.guard``' Intrinsic
22951 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22952
22953 Syntax:
22954 """""""
22955
22956 ::
22957
22958       declare void @llvm.experimental.guard(i1, ...) [ "deopt"(...) ]
22959
22960 Overview:
22961 """""""""
22962
22963 This intrinsic, together with :ref:`deoptimization operand bundles
22964 <deopt_opbundles>`, allows frontends to express guards or checks on
22965 optimistic assumptions made during compilation.  The semantics of
22966 ``@llvm.experimental.guard`` is defined in terms of
22967 ``@llvm.experimental.deoptimize`` -- its body is defined to be
22968 equivalent to:
22969
22970 .. code-block:: text
22971
22972   define void @llvm.experimental.guard(i1 %pred, <args...>) {
22973     %realPred = and i1 %pred, undef
22974     br i1 %realPred, label %continue, label %leave [, !make.implicit !{}]
22975
22976   leave:
22977     call void @llvm.experimental.deoptimize(<args...>) [ "deopt"() ]
22978     ret void
22979
22980   continue:
22981     ret void
22982   }
22983
22984
22985 with the optional ``[, !make.implicit !{}]`` present if and only if it
22986 is present on the call site.  For more details on ``!make.implicit``,
22987 see :doc:`FaultMaps`.
22988
22989 In words, ``@llvm.experimental.guard`` executes the attached
22990 ``"deopt"`` continuation if (but **not** only if) its first argument
22991 is ``false``.  Since the optimizer is allowed to replace the ``undef``
22992 with an arbitrary value, it can optimize guard to fail "spuriously",
22993 i.e. without the original condition being false (hence the "not only
22994 if"); and this allows for "check widening" type optimizations.
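
As a sketch, a bounds check expressed as a guard might look like the code below.
The function ``@load_checked`` and the ``"deopt"`` state are hypothetical. The
load is only reached if the guard does not deoptimize.

.. code-block:: llvm

      declare void @llvm.experimental.guard(i1, ...)

      define i32 @load_checked(i32* %arr, i32 %i, i32 %len) {
        %in_bounds = icmp ult i32 %i, %len
        ; Deoptimizes if %in_bounds is false (and possibly spuriously).
        call void (i1, ...) @llvm.experimental.guard(i1 %in_bounds) [ "deopt"(i32 %i) ]
        %idx = zext i32 %i to i64
        %p = getelementptr inbounds i32, i32* %arr, i64 %idx
        %v = load i32, i32* %p
        ret i32 %v
      }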
22995
22996 ``@llvm.experimental.guard`` cannot be invoked.
22997
22998 After ``@llvm.experimental.guard`` was first added, a more general
22999 formulation was found in ``@llvm.experimental.widenable.condition``.
23000 Support for ``@llvm.experimental.guard`` is slowly being rephrased in
23001 terms of this alternate.
23002
23003 '``llvm.experimental.widenable.condition``' Intrinsic
23004 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23005
23006 Syntax:
23007 """""""
23008
23009 ::
23010
23011       declare i1 @llvm.experimental.widenable.condition()
23012
23013 Overview:
23014 """""""""
23015
This intrinsic represents a "widenable condition", which is a
boolean expression with the following property: whether the
expression is ``true`` or ``false``, the program is correct and
well-defined.
23020
23021 Together with :ref:`deoptimization operand bundles <deopt_opbundles>`,
23022 ``@llvm.experimental.widenable.condition`` allows frontends to
23023 express guards or checks on optimistic assumptions made during
23024 compilation and represent them as branch instructions on special
23025 conditions.
23026
While this may appear similar in semantics to ``undef``, it is very
different in that an invocation produces a particular, singular
value. It is also intended to be lowered late, and remain available
for specific optimizations and transforms that can benefit from its
special properties.
23032
23033 Arguments:
23034 """"""""""
23035
23036 None.
23037
23038 Semantics:
23039 """"""""""
23040
The intrinsic ``@llvm.experimental.widenable.condition()``
returns either ``true`` or ``false``. For each evaluation of a call
to this intrinsic, the program must be valid and correct both if
it returns ``true`` and if it returns ``false``. This allows
transformation passes to replace evaluations of this intrinsic
with either value whenever one is beneficial.

When used in a branch condition, it allows us to choose between
two alternative correct solutions for the same problem, as in
the example below:
23051
23052 .. code-block:: text
23053
    %cond = call i1 @llvm.experimental.widenable.condition()
    br i1 %cond, label %fast_path, label %slow_path

  fast_path:
    ; Apply memory-consuming but fast solution for a task.

  slow_path:
    ; Cheap in memory but slow solution.
23062
Whether the result of the intrinsic call is ``true`` or ``false``,
it should be correct to pick either solution. We can switch
between them by replacing the result of
``@llvm.experimental.widenable.condition`` with different
``i1`` expressions.
23068
The intrinsic can be used to represent guards as widenable branches. The
following guard:
23070
23071 .. code-block:: text
23072
23073   block:
23074     ; Unguarded instructions
23075     call void @llvm.experimental.guard(i1 %cond, <args...>) ["deopt"(<deopt_args...>)]
23076     ; Guarded instructions
23077
This can be expressed in an alternative, equivalent form as an explicit branch
using ``@llvm.experimental.widenable.condition``:
23080
23081 .. code-block:: text
23082
23083   block:
23084     ; Unguarded instructions
23085     %widenable_condition = call i1 @llvm.experimental.widenable.condition()
23086     %guard_condition = and i1 %cond, %widenable_condition
23087     br i1 %guard_condition, label %guarded, label %deopt
23088
23089   guarded:
23090     ; Guarded instructions
23091
23092   deopt:
23093     call type @llvm.experimental.deoptimize(<args...>) [ "deopt"(<deopt_args...>) ]
23094
So the block ``guarded`` is only reachable when ``%cond`` is ``true``,
and it should be valid to go to the block ``deopt`` whenever ``%cond``
is ``true`` or ``false``.
23098
23099 ``@llvm.experimental.widenable.condition`` will never throw, thus
23100 it cannot be invoked.
23101
23102 Guard widening:
23103 """""""""""""""
23104
When ``@llvm.experimental.widenable.condition()`` is used in the
condition of a guard represented as an explicit branch, it is
legal to widen the guard's condition with any additional
conditions.

Guard widening amounts to replacing
23111
23112 .. code-block:: text
23113
23114   %widenable_cond = call i1 @llvm.experimental.widenable.condition()
23115   %guard_cond = and i1 %cond, %widenable_cond
23116   br i1 %guard_cond, label %guarded, label %deopt
23117
23118 with
23119
23120 .. code-block:: text
23121
23122   %widenable_cond = call i1 @llvm.experimental.widenable.condition()
23123   %new_cond = and i1 %any_other_cond, %widenable_cond
23124   %new_guard_cond = and i1 %cond, %new_cond
23125   br i1 %new_guard_cond, label %guarded, label %deopt
23126
for this branch. Here ``%any_other_cond`` is an arbitrarily chosen
well-defined ``i1`` value. By widening the guard, we may
impose stricter conditions on the ``guarded`` block and bail out to
the ``deopt`` block when the new condition is not met.
23131
23132 Lowering:
23133 """""""""
23134
The default lowering strategy is to replace the result of a
call to ``@llvm.experimental.widenable.condition`` with the
constant ``true``. However, it is always correct to replace
it with any other ``i1`` value. Any pass can
freely do so if it can benefit from non-default lowering.
23140
23141
23142 '``llvm.load.relative``' Intrinsic
23143 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23144
23145 Syntax:
23146 """""""
23147
23148 ::
23149
23150       declare i8* @llvm.load.relative.iN(i8* %ptr, iN %offset) argmemonly nounwind readonly
23151
23152 Overview:
23153 """""""""
23154
23155 This intrinsic loads a 32-bit value from the address ``%ptr + %offset``,
23156 adds ``%ptr`` to that value and returns it. The constant folder specifically
23157 recognizes the form of this intrinsic and the constant initializers it may
23158 load from; if a loaded constant initializer is known to have the form
23159 ``i32 trunc(x - %ptr)``, the intrinsic call is folded to ``x``.
23160
LLVM guarantees that the calculation of such a constant initializer will
23162 not overflow at link time under the medium code model if ``x`` is an
23163 ``unnamed_addr`` function. However, it does not provide this guarantee for
23164 a constant initializer folded into a function body. This intrinsic can be
23165 used to avoid the possibility of overflows when loading from such a constant.
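
A sketch of a one-entry relative table and a lookup through it follows. The
names ``@callee``, ``@table`` and ``@lookup`` are hypothetical. The entry has
the ``i32 trunc(x - %ptr)`` form described above, so the constant folder may
fold the call to ``@callee``.

.. code-block:: llvm

      define void @callee() unnamed_addr {
        ret void
      }

      ; Each entry stores the 32-bit offset of a function from the table itself.
      @table = private constant [1 x i32] [
        i32 trunc (i64 sub (i64 ptrtoint (void ()* @callee to i64),
                            i64 ptrtoint ([1 x i32]* @table to i64)) to i32)
      ]

      declare i8* @llvm.load.relative.i32(i8*, i32)

      define i8* @lookup() {
        ; Loads the offset at @table + 0 and adds @table back to it.
        %fn = call i8* @llvm.load.relative.i32(i8* bitcast ([1 x i32]* @table to i8*), i32 0)
        ret i8* %fn
      }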
23166
23167 .. _llvm_sideeffect:
23168
23169 '``llvm.sideeffect``' Intrinsic
23170 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23171
23172 Syntax:
23173 """""""
23174
23175 ::
23176
23177       declare void @llvm.sideeffect() inaccessiblememonly nounwind willreturn
23178
23179 Overview:
23180 """""""""
23181
23182 The ``llvm.sideeffect`` intrinsic doesn't perform any operation. Optimizers
23183 treat it as having side effects, so it can be inserted into a loop to
23184 indicate that the loop shouldn't be assumed to terminate (which could
23185 potentially lead to the loop being optimized away entirely), even if it's
23186 an infinite loop with no other side effects.
23187
23188 Arguments:
23189 """"""""""
23190
23191 None.
23192
23193 Semantics:
23194 """"""""""
23195
23196 This intrinsic actually does nothing, but optimizers must assume that it
23197 has externally observable side effects.
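
For instance, the infinite loop in the hypothetical function below has no other
side effects. The call keeps optimizers from assuming that the loop terminates.

.. code-block:: llvm

      declare void @llvm.sideeffect() inaccessiblememonly nounwind willreturn

      define void @spin() {
      entry:
        br label %loop

      loop:
        ; Treated as a side effect, although no code is generated for it.
        call void @llvm.sideeffect()
        br label %loop
      }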
23198
23199 '``llvm.is.constant.*``' Intrinsic
23200 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23201
23202 Syntax:
23203 """""""
23204
This is an overloaded intrinsic. You can use ``llvm.is.constant`` with any
argument type.
23206
23207 ::
23208
23209       declare i1 @llvm.is.constant.i32(i32 %operand) nounwind readnone
23210       declare i1 @llvm.is.constant.f32(float %operand) nounwind readnone
23211       declare i1 @llvm.is.constant.TYPENAME(TYPE %operand) nounwind readnone
23212
23213 Overview:
23214 """""""""
23215
23216 The '``llvm.is.constant``' intrinsic will return true if the argument
23217 is known to be a manifest compile-time constant. It is guaranteed to
23218 fold to either true or false before generating machine code.
23219
23220 Semantics:
23221 """"""""""
23222
23223 This intrinsic generates no code. If its argument is known to be a
23224 manifest compile-time constant value, then the intrinsic will be
23225 converted to a constant true value. Otherwise, it will be converted to
23226 a constant false value.
23227
In particular, note that if the argument is a constant expression
which refers to a global (the address of which *is* a constant, but
not manifest during the compile), then the intrinsic evaluates to
false.
23232
23233 The result also intentionally depends on the result of optimization
23234 passes -- e.g., the result can change depending on whether a
23235 function gets inlined or not. A function's parameters are
23236 obviously not constant. However, a call like
23237 ``llvm.is.constant.i32(i32 %param)`` *can* return true after the
23238 function is inlined, if the value passed to the function parameter was
23239 a constant.
23240
23241 On the other hand, if constant folding is not run, it will never
23242 evaluate to true, even in simple cases.
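
The dependence on inlining can be sketched as follows. The functions ``@pick``
and ``@caller`` are hypothetical. Inside ``@pick`` the argument is a parameter,
so on its own the intrinsic folds to false. If ``@pick`` is inlined into
``@caller``, the argument becomes the constant ``42`` and the intrinsic can fold
to true.

.. code-block:: llvm

      declare i1 @llvm.is.constant.i32(i32)

      define i32 @pick(i32 %n) {
        %c = call i1 @llvm.is.constant.i32(i32 %n)
        ; Yields 1 if %n turned out to be a manifest constant, 0 otherwise.
        %r = select i1 %c, i32 1, i32 0
        ret i32 %r
      }

      define i32 @caller() {
        %r = call i32 @pick(i32 42)
        ret i32 %r
      }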
23243
23244 .. _int_ptrmask:
23245
23246 '``llvm.ptrmask``' Intrinsic
23247 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23248
23249 Syntax:
23250 """""""
23251
23252 ::
23253
23254       declare ptrty @llvm.ptrmask(ptrty %ptr, intty %mask) readnone speculatable
23255
23256 Arguments:
23257 """"""""""
23258
23259 The first argument is a pointer. The second argument is an integer.
23260
23261 Overview:
23262 """""""""
23263
23264 The ``llvm.ptrmask`` intrinsic masks out bits of the pointer according to a mask.
23265 This allows stripping data from tagged pointers without converting them to an
23266 integer (ptrtoint/inttoptr). As a consequence, we can preserve more information
23267 to facilitate alias analysis and underlying-object detection.
23268
23269 Semantics:
23270 """"""""""
23271
23272 The result of ``ptrmask(ptr, mask)`` is equivalent to
23273 ``getelementptr ptr, (ptrtoint(ptr) & mask) - ptrtoint(ptr)``. Both the returned
23274 pointer and the first argument are based on the same underlying object (for more
23275 information on the *based on* terminology see
23276 :ref:`the pointer aliasing rules <pointeraliasing>`). If the bitwidth of the
23277 mask argument does not match the pointer size of the target, the mask is
23278 zero-extended or truncated accordingly.
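
A minimal sketch of stripping a tag from a pointer follows. It assumes 64-bit
pointers in address space 0, a tag stored in the three low bits, and the
mangled suffix ``.p0i8.i64`` for the overload; the function name
``@strip_tag`` is hypothetical:

.. code-block:: llvm

      declare i8* @llvm.ptrmask.p0i8.i64(i8*, i64)

      define i8* @strip_tag(i8* %tagged) {
        ; -8 has every bit set except the three lowest, so the call clears
        ; the assumed tag bits while the result stays based on %tagged.
        %untagged = call i8* @llvm.ptrmask.p0i8.i64(i8* %tagged, i64 -8)
        ret i8* %untagged
      }

Compared with a ``ptrtoint``/``and``/``inttoptr`` sequence, this form keeps the
result visibly derived from ``%tagged``.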
23279
23280 .. _int_vscale:
23281
23282 '``llvm.vscale``' Intrinsic
23283 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
23284
23285 Syntax:
23286 """""""
23287
23288 ::
23289
23290       declare i32 @llvm.vscale.i32()
23291       declare i64 @llvm.vscale.i64()
23292
23293 Overview:
23294 """""""""
23295
23296 The ``llvm.vscale`` intrinsic returns the value for ``vscale`` in scalable
23297 vectors such as ``<vscale x 16 x i8>``.
23298
23299 Semantics:
23300 """"""""""
23301
23302 ``vscale`` is a positive value that is constant throughout program
23303 execution, but is unknown at compile time.
23304 If the result value does not fit in the result type, then the result is
23305 a :ref:`poison value <poisonvalues>`.
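
As a sketch, the runtime element count of a ``<vscale x 4 x i32>`` vector can
be computed from ``vscale`` as shown below. The function name is hypothetical:

.. code-block:: llvm

      declare i64 @llvm.vscale.i64()

      define i64 @element_count_nxv4i32() {
        ; A <vscale x 4 x i32> vector holds vscale * 4 elements.
        %vscale = call i64 @llvm.vscale.i64()
        %count = mul i64 %vscale, 4
        ret i64 %count
      }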
23306
23307
23308 Stack Map Intrinsics
23309 --------------------
23310
23311 LLVM provides experimental intrinsics to support runtime patching
23312 mechanisms commonly desired in dynamic language JITs. These intrinsics
23313 are described in :doc:`StackMaps`.
23314
23315 Element Wise Atomic Memory Intrinsics
23316 -------------------------------------
23317
23318 These intrinsics are similar to the standard library memory intrinsics except
23319 that they perform memory transfer as a sequence of atomic memory accesses.
23320
23321 .. _int_memcpy_element_unordered_atomic:
23322
23323 '``llvm.memcpy.element.unordered.atomic``' Intrinsic
23324 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23325
23326 Syntax:
23327 """""""
23328
23329 This is an overloaded intrinsic. You can use ``llvm.memcpy.element.unordered.atomic`` on
23330 any integer bit width and for different address spaces. Not all targets
23331 support all bit widths, however.
23332
23333 ::
23334
23335       declare void @llvm.memcpy.element.unordered.atomic.p0i8.p0i8.i32(i8* <dest>,
23336                                                                        i8* <src>,
23337                                                                        i32 <len>,
23338                                                                        i32 <element_size>)
23339       declare void @llvm.memcpy.element.unordered.atomic.p0i8.p0i8.i64(i8* <dest>,
23340                                                                        i8* <src>,
23341                                                                        i64 <len>,
23342                                                                        i32 <element_size>)
23343
23344 Overview:
23345 """""""""
23346
23347 The '``llvm.memcpy.element.unordered.atomic.*``' intrinsic is a specialization of the
23348 '``llvm.memcpy.*``' intrinsic. It differs in that the ``dest`` and ``src`` are treated
23349 as arrays with elements that are exactly ``element_size`` bytes, and the copy between
23350 buffers uses a sequence of :ref:`unordered atomic <ordering>` load/store operations
23351 that are a positive integer multiple of the ``element_size`` in size.
23352
23353 Arguments:
23354 """"""""""
23355
23356 The first three arguments are the same as they are in the :ref:`@llvm.memcpy <int_memcpy>`
23357 intrinsic, with the added constraint that ``len`` is required to be a positive integer
23358 multiple of the ``element_size``. If ``len`` is not a positive integer multiple of
23359 ``element_size``, then the behaviour of the intrinsic is undefined.
23360
23361 ``element_size`` must be a compile-time constant positive power of two no greater than
23362 a target-specific atomic access size limit.
23363
23364 For each of the input pointers, the ``align`` parameter attribute must be specified. It
23365 must be a power of two no less than the ``element_size``. The caller guarantees that
23366 both the source and destination pointers are aligned to that boundary.
23367
23368 Semantics:
23369 """"""""""
23370
23371 The '``llvm.memcpy.element.unordered.atomic.*``' intrinsic copies ``len`` bytes of
23372 memory from the source location to the destination location. These locations are not
23373 allowed to overlap. The memory copy is performed as a sequence of load/store operations
23374 where each access is guaranteed to be a multiple of ``element_size`` bytes wide and
23375 aligned at an ``element_size`` boundary.
23376
23377 The order of the copy is unspecified. The same value may be read from the source
23378 buffer many times, but only one write is issued to the destination buffer per
23379 element. It is well defined to have concurrent reads and writes to both source and
23380 destination provided those reads and writes are unordered atomic when specified.
23381
23382 This intrinsic does not provide any additional ordering guarantees over those
23383 provided by a set of unordered loads from the source location and stores to the
23384 destination.
23385
23386 Lowering:
23387 """""""""
23388
23389 In the most general case, a call to '``llvm.memcpy.element.unordered.atomic.*``' is
23390 lowered to a call to the symbol ``__llvm_memcpy_element_unordered_atomic_*``, where '*'
23391 is replaced with the actual element size. See :ref:`RewriteStatepointsForGC intrinsic
23392 lowering <RewriteStatepointsForGC_intrinsic_lowering>` for details on GC-specific
23393 lowering.
23394
23395 The optimizer is allowed to inline the memory copy when it's profitable to do so.
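
The following sketch shows one possible call that satisfies the constraints
above. It assumes that 4-byte atomic accesses are supported by the target and
that both buffers are at least 64 bytes long and 4-byte aligned; the function
name is hypothetical:

.. code-block:: llvm

      declare void @llvm.memcpy.element.unordered.atomic.p0i8.p0i8.i64(i8*, i8*, i64, i32)

      define void @copy_16_elements(i8* %dest, i8* %src) {
        ; len = 64 bytes is a positive multiple of element_size = 4, and each
        ; pointer carries an align attribute no less than element_size.
        call void @llvm.memcpy.element.unordered.atomic.p0i8.p0i8.i64(
            i8* align 4 %dest, i8* align 4 %src, i64 64, i32 4)
        ret void
      }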
23396
23397 '``llvm.memmove.element.unordered.atomic``' Intrinsic
23398 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23399
23400 Syntax:
23401 """""""
23402
23403 This is an overloaded intrinsic. You can use
23404 ``llvm.memmove.element.unordered.atomic`` on any integer bit width and for
23405 different address spaces. Not all targets support all bit widths, however.
23406
23407 ::
23408
23409       declare void @llvm.memmove.element.unordered.atomic.p0i8.p0i8.i32(i8* <dest>,
23410                                                                         i8* <src>,
23411                                                                         i32 <len>,
23412                                                                         i32 <element_size>)
23413       declare void @llvm.memmove.element.unordered.atomic.p0i8.p0i8.i64(i8* <dest>,
23414                                                                         i8* <src>,
23415                                                                         i64 <len>,
23416                                                                         i32 <element_size>)
23417
23418 Overview:
23419 """""""""
23420
23421 The '``llvm.memmove.element.unordered.atomic.*``' intrinsic is a specialization
23422 of the '``llvm.memmove.*``' intrinsic. It differs in that the ``dest`` and
23423 ``src`` are treated as arrays with elements that are exactly ``element_size``
23424 bytes, and the copy between buffers uses a sequence of
23425 :ref:`unordered atomic <ordering>` load/store operations that are a positive
23426 integer multiple of the ``element_size`` in size.
23427
23428 Arguments:
23429 """"""""""
23430
23431 The first three arguments are the same as they are in the
23432 :ref:`@llvm.memmove <int_memmove>` intrinsic, with the added constraint that
23433 ``len`` is required to be a positive integer multiple of the ``element_size``.
23434 If ``len`` is not a positive integer multiple of ``element_size``, then the
23435 behaviour of the intrinsic is undefined.
23436
23437 ``element_size`` must be a compile-time constant positive power of two no
23438 greater than a target-specific atomic access size limit.
23439
23440 For each of the input pointers, the ``align`` parameter attribute must be
23441 specified. It must be a power of two no less than the ``element_size``. The
23442 caller guarantees that both the source and destination pointers are aligned to
23443 that boundary.
23444
23445 Semantics:
23446 """"""""""
23447
23448 The '``llvm.memmove.element.unordered.atomic.*``' intrinsic copies ``len`` bytes
23449 of memory from the source location to the destination location. These locations
23450 are allowed to overlap. The memory copy is performed as a sequence of load/store
23451 operations where each access is guaranteed to be a multiple of ``element_size``
23452 bytes wide and aligned at an ``element_size`` boundary.
23453
23454 The order of the copy is unspecified. The same value may be read from the source
23455 buffer many times, but only one write is issued to the destination buffer per
23456 element. It is well defined to have concurrent reads and writes to both source
23457 and destination provided those reads and writes are unordered atomic when
23458 specified.
23459
23460 This intrinsic does not provide any additional ordering guarantees over those
23461 provided by a set of unordered loads from the source location and stores to the
23462 destination.
23463
23464 Lowering:
23465 """""""""
23466
23467 In the most general case, a call to
23468 '``llvm.memmove.element.unordered.atomic.*``' is lowered to a call to the symbol
23469 ``__llvm_memmove_element_unordered_atomic_*``, where '*' is replaced with the
23470 actual element size. See :ref:`RewriteStatepointsForGC intrinsic lowering
23471 <RewriteStatepointsForGC_intrinsic_lowering>` for details on GC-specific
23472 lowering.
23473
23474 The optimizer is allowed to inline the memory copy when it's profitable to do so.
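
As a sketch of the overlapping case, the call below shifts the first seven
8-byte elements of a buffer up by one element. It assumes that ``%buf`` points
to at least 64 bytes, is 8-byte aligned, and that 8-byte atomic accesses are
supported by the target; the function name is hypothetical:

.. code-block:: llvm

      declare void @llvm.memmove.element.unordered.atomic.p0i8.p0i8.i64(i8*, i8*, i64, i32)

      define void @shift_up_one_element(i8* %buf) {
        ; The destination starts one element (8 bytes) past the source, so
        ; the 56-byte source and destination ranges overlap.
        %dest = getelementptr inbounds i8, i8* %buf, i64 8
        call void @llvm.memmove.element.unordered.atomic.p0i8.p0i8.i64(
            i8* align 8 %dest, i8* align 8 %buf, i64 56, i32 8)
        ret void
      }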
23475
23476 .. _int_memset_element_unordered_atomic:
23477
23478 '``llvm.memset.element.unordered.atomic``' Intrinsic
23479 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23480
23481 Syntax:
23482 """""""
23483
23484 This is an overloaded intrinsic. You can use ``llvm.memset.element.unordered.atomic`` on
23485 any integer bit width and for different address spaces. Not all targets
23486 support all bit widths, however.
23487
23488 ::
23489
23490       declare void @llvm.memset.element.unordered.atomic.p0i8.i32(i8* <dest>,
23491                                                                   i8 <value>,
23492                                                                   i32 <len>,
23493                                                                   i32 <element_size>)
23494       declare void @llvm.memset.element.unordered.atomic.p0i8.i64(i8* <dest>,
23495                                                                   i8 <value>,
23496                                                                   i64 <len>,
23497                                                                   i32 <element_size>)
23498
23499 Overview:
23500 """""""""
23501
23502 The '``llvm.memset.element.unordered.atomic.*``' intrinsic is a specialization of the
23503 '``llvm.memset.*``' intrinsic. It differs in that the ``dest`` is treated as an array
23504 with elements that are exactly ``element_size`` bytes, and the assignment to that array
23505 uses a sequence of :ref:`unordered atomic <ordering>` store operations
23506 that are a positive integer multiple of the ``element_size`` in size.
23507
23508 Arguments:
23509 """"""""""
23510
23511 The first three arguments are the same as they are in the :ref:`@llvm.memset <int_memset>`
23512 intrinsic, with the added constraint that ``len`` is required to be a positive integer
23513 multiple of the ``element_size``. If ``len`` is not a positive integer multiple of
23514 ``element_size``, then the behaviour of the intrinsic is undefined.
23515
23516 ``element_size`` must be a compile-time constant positive power of two no greater than
23517 a target-specific atomic access size limit.
23518
23519 The ``dest`` input pointer must have the ``align`` parameter attribute specified. It
23520 must be a power of two no less than the ``element_size``. The caller guarantees that
23521 the destination pointer is aligned to that boundary.
23522
23523 Semantics:
23524 """"""""""
23525
23526 The '``llvm.memset.element.unordered.atomic.*``' intrinsic sets the ``len`` bytes of
23527 memory starting at the destination location to the given ``value``. The memory is
23528 set with a sequence of store operations where each access is guaranteed to be a
23529 multiple of ``element_size`` bytes wide and aligned at an ``element_size`` boundary.
23530
23531 The order of the assignment is unspecified. Only one write is issued to the
23532 destination buffer per element. It is well defined to have concurrent reads and
23533 writes to the destination provided those reads and writes are unordered atomic
23534 when specified.
23535
23536 This intrinsic does not provide any additional ordering guarantees over those
23537 provided by a set of unordered stores to the destination.
23538
23539 Lowering:
23540 """""""""
23541
23542 In the most general case, a call to '``llvm.memset.element.unordered.atomic.*``' is
23543 lowered to a call to the symbol ``__llvm_memset_element_unordered_atomic_*``, where '*'
23544 is replaced with the actual element size.
23545
23546 The optimizer is allowed to inline the memory assignment when it's profitable to do so.
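
A minimal sketch of a conforming call follows. It assumes that ``%dest``
points to at least 32 bytes, is 4-byte aligned, and that 4-byte atomic stores
are supported by the target; the function name is hypothetical:

.. code-block:: llvm

      declare void @llvm.memset.element.unordered.atomic.p0i8.i64(i8*, i8, i64, i32)

      define void @zero_8_elements(i8* %dest) {
        ; len = 32 bytes is a positive multiple of element_size = 4, so the
        ; buffer is zeroed as eight 4-byte elements.
        call void @llvm.memset.element.unordered.atomic.p0i8.i64(
            i8* align 4 %dest, i8 0, i64 32, i32 4)
        ret void
      }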
23547
23548 Objective-C ARC Runtime Intrinsics
23549 ----------------------------------
23550
23551 LLVM provides intrinsics that lower to Objective-C ARC runtime entry points.
23552 LLVM is aware of the semantics of these functions, and optimizes based on that
23553 knowledge. You can read more about the details of Objective-C ARC `here
23554 <https://clang.llvm.org/docs/AutomaticReferenceCounting.html>`_.
23555
23556 '``llvm.objc.autorelease``' Intrinsic
23557 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23558
23559 Syntax:
23560 """""""
23561 ::
23562
23563       declare i8* @llvm.objc.autorelease(i8*)
23564
23565 Lowering:
23566 """""""""
23567
23568 Lowers to a call to `objc_autorelease <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-autorelease>`_.
23569
23570 '``llvm.objc.autoreleasePoolPop``' Intrinsic
23571 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23572
23573 Syntax:
23574 """""""
23575 ::
23576
23577       declare void @llvm.objc.autoreleasePoolPop(i8*)
23578
23579 Lowering:
23580 """""""""
23581
23582 Lowers to a call to `objc_autoreleasePoolPop <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-autoreleasepoolpop-void-pool>`_.
23583
23584 '``llvm.objc.autoreleasePoolPush``' Intrinsic
23585 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23586
23587 Syntax:
23588 """""""
23589 ::
23590
23591       declare i8* @llvm.objc.autoreleasePoolPush()
23592
23593 Lowering:
23594 """""""""
23595
23596 Lowers to a call to `objc_autoreleasePoolPush <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-autoreleasepoolpush-void>`_.
23597
23598 '``llvm.objc.autoreleaseReturnValue``' Intrinsic
23599 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23600
23601 Syntax:
23602 """""""
23603 ::
23604
23605       declare i8* @llvm.objc.autoreleaseReturnValue(i8*)
23606
23607 Lowering:
23608 """""""""
23609
23610 Lowers to a call to `objc_autoreleaseReturnValue <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-autoreleasereturnvalue>`_.
23611
23612 '``llvm.objc.copyWeak``' Intrinsic
23613 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23614
23615 Syntax:
23616 """""""
23617 ::
23618
23619       declare void @llvm.objc.copyWeak(i8**, i8**)
23620
23621 Lowering:
23622 """""""""
23623
23624 Lowers to a call to `objc_copyWeak <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-copyweak-id-dest-id-src>`_.
23625
23626 '``llvm.objc.destroyWeak``' Intrinsic
23627 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23628
23629 Syntax:
23630 """""""
23631 ::
23632
23633       declare void @llvm.objc.destroyWeak(i8**)
23634
23635 Lowering:
23636 """""""""
23637
23638 Lowers to a call to `objc_destroyWeak <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-destroyweak-id-object>`_.
23639
23640 '``llvm.objc.initWeak``' Intrinsic
23641 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23642
23643 Syntax:
23644 """""""
23645 ::
23646
23647       declare i8* @llvm.objc.initWeak(i8**, i8*)
23648
23649 Lowering:
23650 """""""""
23651
23652 Lowers to a call to `objc_initWeak <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-initweak>`_.
23653
23654 '``llvm.objc.loadWeak``' Intrinsic
23655 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23656
23657 Syntax:
23658 """""""
23659 ::
23660
23661       declare i8* @llvm.objc.loadWeak(i8**)
23662
23663 Lowering:
23664 """""""""
23665
23666 Lowers to a call to `objc_loadWeak <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-loadweak>`_.
23667
23668 '``llvm.objc.loadWeakRetained``' Intrinsic
23669 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23670
23671 Syntax:
23672 """""""
23673 ::
23674
23675       declare i8* @llvm.objc.loadWeakRetained(i8**)
23676
23677 Lowering:
23678 """""""""
23679
23680 Lowers to a call to `objc_loadWeakRetained <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-loadweakretained>`_.
23681
23682 '``llvm.objc.moveWeak``' Intrinsic
23683 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23684
23685 Syntax:
23686 """""""
23687 ::
23688
23689       declare void @llvm.objc.moveWeak(i8**, i8**)
23690
23691 Lowering:
23692 """""""""
23693
23694 Lowers to a call to `objc_moveWeak <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-moveweak-id-dest-id-src>`_.
23695
23696 '``llvm.objc.release``' Intrinsic
23697 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23698
23699 Syntax:
23700 """""""
23701 ::
23702
23703       declare void @llvm.objc.release(i8*)
23704
23705 Lowering:
23706 """""""""
23707
23708 Lowers to a call to `objc_release <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-release-id-value>`_.
23709
23710 '``llvm.objc.retain``' Intrinsic
23711 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23712
23713 Syntax:
23714 """""""
23715 ::
23716
23717       declare i8* @llvm.objc.retain(i8*)
23718
23719 Lowering:
23720 """""""""
23721
23722 Lowers to a call to `objc_retain <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-retain>`_.
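
As an illustration, a front end might pair the retain and release intrinsics
around a use of an object as sketched below. The callee ``@use`` and the
function name are hypothetical, and the sketch makes no claim about which
pairs the ARC optimizer would remove:

.. code-block:: llvm

      declare i8* @llvm.objc.retain(i8*)
      declare void @llvm.objc.release(i8*)
      declare void @use(i8*)

      define void @hold_while_using(i8* %obj) {
        ; Retain the object, use it, then balance the retain with a release.
        %retained = call i8* @llvm.objc.retain(i8* %obj)
        call void @use(i8* %retained)
        call void @llvm.objc.release(i8* %retained)
        ret void
      }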
23723
23724 '``llvm.objc.retainAutorelease``' Intrinsic
23725 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23726
23727 Syntax:
23728 """""""
23729 ::
23730
23731       declare i8* @llvm.objc.retainAutorelease(i8*)
23732
23733 Lowering:
23734 """""""""
23735
23736 Lowers to a call to `objc_retainAutorelease <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-retainautorelease>`_.
23737
23738 '``llvm.objc.retainAutoreleaseReturnValue``' Intrinsic
23739 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23740
23741 Syntax:
23742 """""""
23743 ::
23744
23745       declare i8* @llvm.objc.retainAutoreleaseReturnValue(i8*)
23746
23747 Lowering:
23748 """""""""
23749
23750 Lowers to a call to `objc_retainAutoreleaseReturnValue <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-retainautoreleasereturnvalue>`_.
23751
23752 '``llvm.objc.retainAutoreleasedReturnValue``' Intrinsic
23753 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23754
23755 Syntax:
23756 """""""
23757 ::
23758
23759       declare i8* @llvm.objc.retainAutoreleasedReturnValue(i8*)
23760
23761 Lowering:
23762 """""""""
23763
23764 Lowers to a call to `objc_retainAutoreleasedReturnValue <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-retainautoreleasedreturnvalue>`_.
23765
23766 '``llvm.objc.retainBlock``' Intrinsic
23767 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23768
23769 Syntax:
23770 """""""
23771 ::
23772
23773       declare i8* @llvm.objc.retainBlock(i8*)
23774
23775 Lowering:
23776 """""""""
23777
23778 Lowers to a call to `objc_retainBlock <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-retainblock>`_.
23779
23780 '``llvm.objc.storeStrong``' Intrinsic
23781 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23782
23783 Syntax:
23784 """""""
23785 ::
23786
23787       declare void @llvm.objc.storeStrong(i8**, i8*)
23788
23789 Lowering:
23790 """""""""
23791
23792 Lowers to a call to `objc_storeStrong <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-storestrong-id-object-id-value>`_.
23793
23794 '``llvm.objc.storeWeak``' Intrinsic
23795 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23796
23797 Syntax:
23798 """""""
23799 ::
23800
23801       declare i8* @llvm.objc.storeWeak(i8**, i8*)
23802
23803 Lowering:
23804 """""""""
23805
23806 Lowers to a call to `objc_storeWeak <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-storeweak>`_.
23807
23808 Preserving Debug Information Intrinsics
23809 ---------------------------------------
23810
23811 These intrinsics are used to carry certain debuginfo together with
23812 IR-level operations. For example, it may be desirable to
23813 know the structure/union name and the original user-level field
23814 indices. Such information is lost in the IR ``getelementptr`` instruction
23815 because the IR types differ from the debuginfo types and unions
23816 are converted to structs in IR.
23817
23818 '``llvm.preserve.array.access.index``' Intrinsic
23819 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23820
23821 Syntax:
23822 """""""
23823 ::
23824
23825       declare <ret_type>
23826       @llvm.preserve.array.access.index.p0s_union.anons.p0a10s_union.anons(<type> base,
23827                                                                            i32 dim,
23828                                                                            i32 index)
23829
23830 Overview:
23831 """""""""
23832
23833 The '``llvm.preserve.array.access.index``' intrinsic returns the getelementptr address
23834 based on array base ``base``, array dimension ``dim`` and the last access index ``index``
23835 into the array. The return type ``ret_type`` is a pointer type to the array element.
23836 The array ``dim`` and ``index`` are preserved, which makes the access more robust than a
23837 ``getelementptr`` instruction, which may be rewritten by compiler transformations.
23838 The ``llvm.preserve.access.index`` type of metadata is attached to this call instruction
23839 to provide array or pointer debuginfo type.
23840 The metadata is a ``DICompositeType`` or ``DIDerivedType`` representing the
23841 debuginfo version of ``type``.
23842
23843 Arguments:
23844 """"""""""
23845
23846 The ``base`` is the array base address.  The ``dim`` is the array dimension.
23847 The ``base`` is a pointer if ``dim`` equals 0.
23848 The ``index`` is the last access index into the array or pointer.
23849
23850 The ``base`` argument must be annotated with an :ref:`elementtype
23851 <attr_elementtype>` attribute at the call-site. This attribute specifies the
23852 getelementptr element type.
23853
23854 Semantics:
23855 """"""""""
23856
23857 The '``llvm.preserve.array.access.index``' intrinsic produces the same result as a
23858 getelementptr with base ``base`` and, as access operands, ``dim`` zeros followed by ``index``.
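
The sketch below shows a call with ``dim`` equal to 1 and ``index`` equal to 2.
It assumes the typed-pointer mangling ``.p0i32.p0a10i32`` for this overload and
omits the ``llvm.preserve.access.index`` metadata that a real producer would
attach; the function name is hypothetical:

.. code-block:: llvm

      declare i32* @llvm.preserve.array.access.index.p0i32.p0a10i32([10 x i32]*, i32, i32)

      define i32* @third_element([10 x i32]* %arr) {
        ; Produces the same address as:
        ;   getelementptr [10 x i32], [10 x i32]* %arr, i32 0, i32 2
        %elt = call i32* @llvm.preserve.array.access.index.p0i32.p0a10i32(
            [10 x i32]* elementtype([10 x i32]) %arr, i32 1, i32 2)
        ret i32* %elt
      }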
23859
23860 '``llvm.preserve.union.access.index``' Intrinsic
23861 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23862
23863 Syntax:
23864 """""""
23865 ::
23866
23867       declare <type>
23868       @llvm.preserve.union.access.index.p0s_union.anons.p0s_union.anons(<type> base,
23869                                                                         i32 di_index)
23870
23871 Overview:
23872 """""""""
23873
23874 The '``llvm.preserve.union.access.index``' intrinsic carries the debuginfo field index
23875 ``di_index`` and returns the ``base`` address.
23876 The ``llvm.preserve.access.index`` type of metadata is attached to this call instruction
23877 to provide union debuginfo type.
23878 The metadata is a ``DICompositeType`` representing the debuginfo version of ``type``.
23879 The return type ``type`` is the same as the ``base`` type.
23880
23881 Arguments:
23882 """"""""""
23883
23884 The ``base`` is the union base address. The ``di_index`` is the field index in debuginfo.
23885
23886 Semantics:
23887 """"""""""
23888
23889 The '``llvm.preserve.union.access.index``' intrinsic returns the ``base`` address.
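
As a sketch, the call below records an access to the debuginfo field with
index 1 of a union. The type ``%union.u``, the function name, and the mangled
suffix are assumptions made for illustration, and the
``llvm.preserve.access.index`` metadata is omitted:

.. code-block:: llvm

      %union.u = type { i32 }

      declare %union.u* @llvm.preserve.union.access.index.p0s_union.us.p0s_union.us(%union.u*, i32)

      define %union.u* @second_member(%union.u* %base) {
        ; The returned pointer is %base itself; only di_index = 1 is carried.
        %p = call %union.u* @llvm.preserve.union.access.index.p0s_union.us.p0s_union.us(
            %union.u* %base, i32 1)
        ret %union.u* %p
      }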
23890
23891 '``llvm.preserve.struct.access.index``' Intrinsic
23892 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23893
23894 Syntax:
23895 """""""
23896 ::
23897
23898       declare <ret_type>
23899       @llvm.preserve.struct.access.index.p0i8.p0s_struct.anon.0s(<type> base,
23900                                                                  i32 gep_index,
23901                                                                  i32 di_index)
23902
23903 Overview:
23904 """""""""
23905
23906 The '``llvm.preserve.struct.access.index``' intrinsic returns the getelementptr address
23907 based on struct base ``base`` and IR struct member index ``gep_index``.
23908 The ``llvm.preserve.access.index`` type of metadata is attached to this call instruction
23909 to provide struct debuginfo type.
23910 The metadata is a ``DICompositeType`` representing the debuginfo version of ``type``.
23911 The return type ``ret_type`` is a pointer type to the structure member.
23912
23913 Arguments:
23914 """"""""""
23915
23916 The ``base`` is the structure base address. The ``gep_index`` is the struct member index
23917 based on IR structures. The ``di_index`` is the struct member index based on debuginfo.
23918
23919 The ``base`` argument must be annotated with an :ref:`elementtype
23920 <attr_elementtype>` attribute at the call-site. This attribute specifies the
23921 getelementptr element type.
23922
23923 Semantics:
23924 """"""""""
23925
23926 The '``llvm.preserve.struct.access.index``' intrinsic produces the same result
23927 as a getelementptr with base ``base`` and access operands ``{0, gep_index}``.
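
The sketch below accesses the second member of a two-field structure. The type
``%struct.s``, the function name, and the mangled suffix are assumptions made
for illustration, and the ``llvm.preserve.access.index`` metadata is omitted:

.. code-block:: llvm

      %struct.s = type { i32, i32 }

      declare i32* @llvm.preserve.struct.access.index.p0i32.p0s_struct.ss(%struct.s*, i32, i32)

      define i32* @second_field(%struct.s* %base) {
        ; Produces the same address as:
        ;   getelementptr %struct.s, %struct.s* %base, i32 0, i32 1
        ; Here gep_index and di_index are both 1.
        %field = call i32* @llvm.preserve.struct.access.index.p0i32.p0s_struct.ss(
            %struct.s* elementtype(%struct.s) %base, i32 1, i32 1)
        ret i32* %field
      }

The two indices can differ when the IR structure layout does not match the
debuginfo member order, for example when the front end inserts padding fields.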