llvm/docs/LangRef.rst

   1 ==============================
   2 LLVM Language Reference Manual
   3 ==============================
   4
   5 .. contents::
   6    :local:
   7    :depth: 4
   8
   9 Abstract
  10 ========
  11
  12 This document is a reference manual for the LLVM assembly language. LLVM
  13 is a Static Single Assignment (SSA) based representation that provides
  14 type safety, low-level operations, flexibility, and the capability of
  15 representing 'all' high-level languages cleanly. It is the common code
  16 representation used throughout all phases of the LLVM compilation
  17 strategy.
  18
  19 Introduction
  20 ============
  21
  22 The LLVM code representation is designed to be used in three different
  23 forms: as an in-memory compiler IR, as an on-disk bitcode representation
  24 (suitable for fast loading by a Just-In-Time compiler), and as a human
  25 readable assembly language representation. This allows LLVM to provide a
  26 powerful intermediate representation for efficient compiler
  27 transformations and analysis, while providing a natural means to debug
  28 and visualize the transformations. The three different forms of LLVM are
  29 all equivalent. This document describes the human readable
  30 representation and notation.
  31
  32 The LLVM representation aims to be light-weight and low-level while
  33 being expressive, typed, and extensible at the same time. It aims to be
  34 a "universal IR" of sorts, by being at a low enough level that
  35 high-level ideas may be cleanly mapped to it (similar to how
  36 microprocessors are "universal IR's", allowing many source languages to
  37 be mapped to them). By providing type information, LLVM can be used as
  38 the target of optimizations: for example, through pointer analysis, it
  39 can be proven that a C automatic variable is never accessed outside of
  40 the current function, allowing it to be promoted to a simple SSA value
  41 instead of a memory location.
  42
  43 .. _wellformed:
  44
  45 Well-Formedness
  46 ---------------
  47
  48 It is important to note that this document describes 'well formed' LLVM
  49 assembly language. There is a difference between what the parser accepts
  50 and what is considered 'well formed'. For example, the following
  51 instruction is syntactically okay, but not well formed:
  52
  53 .. code-block:: llvm
  54
  55     %x = add i32 1, %x
  56
  57 because the definition of ``%x`` does not dominate all of its uses. The
  58 LLVM infrastructure provides a verification pass that may be used to
  59 verify that an LLVM module is well formed. This pass is automatically
  60 run by the parser after parsing input assembly and by the optimizer
  61 before it outputs bitcode. The violations pointed out by the verifier
  62 pass indicate bugs in transformation passes or input to the parser.
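
      As a complementary sketch (the function and value names here are
      illustrative, not taken from the LLVM sources), the following loop is well
      formed even though ``%next`` appears textually before its definition. The
      use sits in a ``phi`` whose incoming edge comes from ``%loop``, and the
      definition of ``%next`` dominates the end of that block:

      .. code-block:: llvm

          define i32 @count(i32 %n) {
          entry:
            br label %loop
          loop:
            ; %next is defined below, but reaches this phi only via the back edge
            %i = phi i32 [ 0, %entry ], [ %next, %loop ]
            %next = add i32 %i, 1
            %done = icmp eq i32 %next, %n
            br i1 %done, label %exit, label %loop
          exit:
            ret i32 %next
          }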
  63
  64 .. _identifiers:
  65
  66 Identifiers
  67 ===========
  68
  69 LLVM identifiers come in two basic types: global and local. Global
  70 identifiers (functions, global variables) begin with the ``'@'``
  71 character. Local identifiers (register names, types) begin with the
  72 ``'%'`` character. Additionally, there are three different formats for
  73 identifiers, for different purposes:
  74
  75 #. Named values are represented as a string of characters with their
  76    prefix. For example, ``%foo``, ``@DivisionByZero``,
  77    ``%a.really.long.identifier``. The actual regular expression used is
  78    '``[%@][-a-zA-Z$._][-a-zA-Z$._0-9]*``'. Identifiers that require other
  79    characters in their names can be surrounded with quotes. Special
  80    characters may be escaped using ``"\xx"`` where ``xx`` is the ASCII
  81    code for the character in hexadecimal. In this way, any character can
   82    be used in a name, even quotes themselves. The ``"\01"`` prefix
  83    can be used on global values to suppress mangling.
  84 #. Unnamed values are represented as an unsigned numeric value with
  85    their prefix. For example, ``%12``, ``@2``, ``%44``.
  86 #. Constants, which are described in the section Constants_ below.
  87
  88 LLVM requires that values start with a prefix for two reasons: Compilers
  89 don't need to worry about name clashes with reserved words, and the set
  90 of reserved words may be expanded in the future without penalty.
  91 Additionally, unnamed identifiers allow a compiler to quickly come up
  92 with a temporary variable without having to avoid symbol table
  93 conflicts.
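
      The identifier forms described above can be sketched as follows. All names
      are hypothetical and chosen only to illustrate the syntax:

      .. code-block:: llvm

          @counter = global i32 0                 ; named global
          @"name with spaces" = global i32 1      ; quoted name
          @"say\22hi\22" = global i32 2           ; \22 is the hex escape for a quote
          @0 = global i32 3                       ; unnamed global

          define i32 @example(i32 %arg) {
            %sum = add i32 %arg, 1                ; named local
            %1 = mul i32 %sum, 2                  ; unnamed local (entry block is %0)
            ret i32 %1
          }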
  94
  95 Reserved words in LLVM are very similar to reserved words in other
  96 languages. There are keywords for different opcodes ('``add``',
  97 '``bitcast``', '``ret``', etc...), for primitive type names ('``void``',
  98 '``i32``', etc...), and others. These reserved words cannot conflict
  99 with variable names, because none of them start with a prefix character
 100 (``'%'`` or ``'@'``).
 101
 102 Here is an example of LLVM code to multiply the integer variable
 103 '``%X``' by 8:
 104
 105 The easy way:
 106
 107 .. code-block:: llvm
 108
 109     %result = mul i32 %X, 8
 110
 111 After strength reduction:
 112
 113 .. code-block:: llvm
 114
 115     %result = shl i32 %X, 3
 116
 117 And the hard way:
 118
 119 .. code-block:: llvm
 120
 121     %0 = add i32 %X, %X           ; yields i32:%0
 122     %1 = add i32 %0, %0           ; yields i32:%1
 123     %result = add i32 %1, %1
 124
 125 This last way of multiplying ``%X`` by 8 illustrates several important
 126 lexical features of LLVM:
 127
 128 #. Comments are delimited with a '``;``' and go until the end of line.
 129 #. Unnamed temporaries are created when the result of a computation is
 130    not assigned to a named value.
 131 #. Unnamed temporaries are numbered sequentially (using a per-function
 132    incrementing counter, starting with 0). Note that basic blocks and unnamed
 133    function parameters are included in this numbering. For example, if the
 134    entry basic block is not given a label name and all function parameters are
 135    named, then it will get number 0.
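
      As a rough illustration of the numbering rule, here is the "hard way"
      sequence wrapped in complete functions. The function names ``@times8`` and
      ``@twice`` are assumptions made for this sketch:

      .. code-block:: llvm

          define i32 @times8(i32 %X) {
            ; the unlabelled entry block implicitly takes number %0
            %1 = add i32 %X, %X           ; yields i32:%1
            %2 = add i32 %1, %1           ; yields i32:%2
            %result = add i32 %2, %2
            ret i32 %result
          }

          define i32 @twice(i32) {
            ; the unnamed parameter is %0, so the entry block is %1
            %2 = add i32 %0, %0           ; yields i32:%2
            ret i32 %2
          }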
 136
 137 It also shows a convention that we follow in this document. When
 138 demonstrating instructions, we will follow an instruction with a comment
  139 that defines the type and name of the value produced.
 140
 141 High Level Structure
 142 ====================
 143
 144 Module Structure
 145 ----------------
 146
 147 LLVM programs are composed of ``Module``'s, each of which is a
 148 translation unit of the input programs. Each module consists of
 149 functions, global variables, and symbol table entries. Modules may be
 150 combined together with the LLVM linker, which merges function (and
 151 global variable) definitions, resolves forward declarations, and merges
 152 symbol table entries. Here is an example of the "hello world" module:
 153
 154 .. code-block:: llvm
 155
 156     ; Declare the string constant as a global constant.
 157     @.str = private unnamed_addr constant [13 x i8] c"hello world\0A\00"
 158
 159     ; External declaration of the puts function
 160     declare i32 @puts(i8* nocapture) nounwind
 161
 162     ; Definition of main function
 163     define i32 @main() {   ; i32()*
 164       ; Convert [13 x i8]* to i8*...
 165       %cast210 = getelementptr [13 x i8], [13 x i8]* @.str, i64 0, i64 0
 166
 167       ; Call puts function to write out the string to stdout.
 168       call i32 @puts(i8* %cast210)
 169       ret i32 0
 170     }
 171
 172     ; Named metadata
 173     !0 = !{i32 42, null, !"string"}
 174     !foo = !{!0}
 175
 176 This example is made up of a :ref:`global variable <globalvars>` named
 177 "``.str``", an external declaration of the "``puts``" function, a
 178 :ref:`function definition <functionstructure>` for "``main``" and
 179 :ref:`named metadata <namedmetadatastructure>` "``foo``".
 180
 181 In general, a module is made up of a list of global values (where both
 182 functions and global variables are global values). Global values are
 183 represented by a pointer to a memory location (in this case, a pointer
 184 to an array of char, and a pointer to a function), and have one of the
 185 following :ref:`linkage types <linkage>`.
 186
 187 .. _linkage:
 188
 189 Linkage Types
 190 -------------
 191
 192 All Global Variables and Functions have one of the following types of
 193 linkage:
 194
 195 ``private``
 196     Global values with "``private``" linkage are only directly
 197     accessible by objects in the current module. In particular, linking
 198     code into a module with a private global value may cause the
 199     private to be renamed as necessary to avoid collisions. Because the
 200     symbol is private to the module, all references can be updated. This
 201     doesn't show up in any symbol table in the object file.
 202 ``internal``
 203     Similar to private, but the value shows as a local symbol
 204     (``STB_LOCAL`` in the case of ELF) in the object file. This
 205     corresponds to the notion of the '``static``' keyword in C.
 206 ``available_externally``
 207     Globals with "``available_externally``" linkage are never emitted into
 208     the object file corresponding to the LLVM module. From the linker's
 209     perspective, an ``available_externally`` global is equivalent to
 210     an external declaration. They exist to allow inlining and other
 211     optimizations to take place given knowledge of the definition of the
 212     global, which is known to be somewhere outside the module. Globals
 213     with ``available_externally`` linkage are allowed to be discarded at
 214     will, and allow inlining and other optimizations. This linkage type is
 215     only allowed on definitions, not declarations.
 216 ``linkonce``
 217     Globals with "``linkonce``" linkage are merged with other globals of
 218     the same name when linkage occurs. This can be used to implement
 219     some forms of inline functions, templates, or other code which must
 220     be generated in each translation unit that uses it, but where the
 221     body may be overridden with a more definitive definition later.
 222     Unreferenced ``linkonce`` globals are allowed to be discarded. Note
 223     that ``linkonce`` linkage does not actually allow the optimizer to
 224     inline the body of this function into callers because it doesn't
 225     know if this definition of the function is the definitive definition
 226     within the program or whether it will be overridden by a stronger
 227     definition. To enable inlining and other optimizations, use
 228     "``linkonce_odr``" linkage.
 229 ``weak``
 230     "``weak``" linkage has the same merging semantics as ``linkonce``
 231     linkage, except that unreferenced globals with ``weak`` linkage may
 232     not be discarded. This is used for globals that are declared "weak"
 233     in C source code.
 234 ``common``
 235     "``common``" linkage is most similar to "``weak``" linkage, but they
 236     are used for tentative definitions in C, such as "``int X;``" at
 237     global scope. Symbols with "``common``" linkage are merged in the
  238     same way as ``weak`` symbols, and they may not be deleted if
 239     unreferenced. ``common`` symbols may not have an explicit section,
 240     must have a zero initializer, and may not be marked
 241     ':ref:`constant <globalvars>`'. Functions and aliases may not have
 242     common linkage.
 243
 244 .. _linkage_appending:
 245
 246 ``appending``
 247     "``appending``" linkage may only be applied to global variables of
 248     pointer to array type. When two global variables with appending
 249     linkage are linked together, the two global arrays are appended
 250     together. This is the LLVM, typesafe, equivalent of having the
 251     system linker append together "sections" with identical names when
 252     .o files are linked.
 253
 254     Unfortunately this doesn't correspond to any feature in .o files, so it
  255     can only be used for variables like ``llvm.global_ctors`` which LLVM
 256     interprets specially.
 257
 258 ``extern_weak``
 259     The semantics of this linkage follow the ELF object file model: the
  260     symbol is weak until linked; if not linked, the symbol becomes null
 261     instead of being an undefined reference.
 262 ``linkonce_odr``, ``weak_odr``
 263     Some languages allow differing globals to be merged, such as two
 264     functions with different semantics. Other languages, such as
 265     ``C++``, ensure that only equivalent globals are ever merged (the
 266     "one definition rule" --- "ODR"). Such languages can use the
 267     ``linkonce_odr`` and ``weak_odr`` linkage types to indicate that the
 268     global will only be merged with equivalent globals. These linkage
 269     types are otherwise the same as their non-``odr`` versions.
 270 ``external``
 271     If none of the above identifiers are used, the global is externally
 272     visible, meaning that it participates in linkage and can be used to
 273     resolve external symbol references.
 274
 275 It is illegal for a global variable or function *declaration* to have any
 276 linkage type other than ``external`` or ``extern_weak``.
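
      A minimal sketch of how several of these linkage types are spelled in a
      module. The symbol names are hypothetical. Note that the two declarations
      use only ``external`` (the default) or ``extern_weak``, as required above:

      .. code-block:: llvm

          @secret = private constant i32 7          ; not in any symbol table
          @file_static = internal global i32 0      ; like C 'static'
          @tentative = common global i32 0          ; like C 'int tentative;'
          @maybe = extern_weak global i32           ; null if never defined

          declare extern_weak void @optional_hook()

          define linkonce_odr i32 @inline_helper(i32 %x) {
            %r = add i32 %x, 1
            ret i32 %r
          }

          define weak void @default_handler() {
            ret void
          }

          define available_externally i32 @known_elsewhere() {
            ret i32 0
          }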
 277
 278 .. _callingconv:
 279
 280 Calling Conventions
 281 -------------------
 282
 283 LLVM :ref:`functions <functionstructure>`, :ref:`calls <i_call>` and
 284 :ref:`invokes <i_invoke>` can all have an optional calling convention
 285 specified for the call. The calling convention of any pair of dynamic
 286 caller/callee must match, or the behavior of the program is undefined.
 287 The following calling conventions are supported by LLVM, and more may be
 288 added in the future:
 289
 290 "``ccc``" - The C calling convention
 291     This calling convention (the default if no other calling convention
 292     is specified) matches the target C calling conventions. This calling
 293     convention supports varargs function calls and tolerates some
 294     mismatch in the declared prototype and implemented declaration of
 295     the function (as does normal C).
 296 "``fastcc``" - The fast calling convention
 297     This calling convention attempts to make calls as fast as possible
 298     (e.g. by passing things in registers). This calling convention
 299     allows the target to use whatever tricks it wants to produce fast
 300     code for the target, without having to conform to an externally
 301     specified ABI (Application Binary Interface). `Tail calls can only
 302     be optimized when this, the tailcc, the GHC or the HiPE convention is
 303     used. <CodeGenerator.html#id80>`_ This calling convention does not
 304     support varargs and requires the prototype of all callees to exactly
 305     match the prototype of the function definition.
 306 "``coldcc``" - The cold calling convention
 307     This calling convention attempts to make code in the caller as
 308     efficient as possible under the assumption that the call is not
 309     commonly executed. As such, these calls often preserve all registers
 310     so that the call does not break any live ranges in the caller side.
 311     This calling convention does not support varargs and requires the
 312     prototype of all callees to exactly match the prototype of the
 313     function definition. Furthermore the inliner doesn't consider such function
 314     calls for inlining.
 315 "``cc 10``" - GHC convention
 316     This calling convention has been implemented specifically for use by
 317     the `Glasgow Haskell Compiler (GHC) <http://www.haskell.org/ghc>`_.
 318     It passes everything in registers, going to extremes to achieve this
 319     by disabling callee save registers. This calling convention should
 320     not be used lightly but only for specific situations such as an
 321     alternative to the *register pinning* performance technique often
 322     used when implementing functional programming languages. At the
 323     moment only X86 supports this convention and it has the following
 324     limitations:
 325
  326     -  On *X86-32*, only up to 4 bit type parameters are supported. No
  327        floating-point types are supported.
  328     -  On *X86-64*, only up to 10 bit type parameters and 6
  329        floating-point parameters are supported.
 330
 331     This calling convention supports `tail call
  332     optimization <CodeGenerator.html#id80>`_ but requires that both the
  333     caller and the callee use it.
 334 "``cc 11``" - The HiPE calling convention
 335     This calling convention has been implemented specifically for use by
 336     the `High-Performance Erlang
 337     (HiPE) <http://www.it.uu.se/research/group/hipe/>`_ compiler, *the*
  338     native code compiler of `Ericsson's Open Source Erlang/OTP
 339     system <http://www.erlang.org/download.shtml>`_. It uses more
 340     registers for argument passing than the ordinary C calling
 341     convention and defines no callee-saved registers. The calling
 342     convention properly supports `tail call
 343     optimization <CodeGenerator.html#id80>`_ but requires that both the
 344     caller and the callee use it. It uses a *register pinning*
 345     mechanism, similar to GHC's convention, for keeping frequently
 346     accessed runtime components pinned to specific hardware registers.
 347     At the moment only X86 supports this convention (both 32 and 64
 348     bit).
 349 "``webkit_jscc``" - WebKit's JavaScript calling convention
 350     This calling convention has been implemented for `WebKit FTL JIT
 351     <https://trac.webkit.org/wiki/FTLJIT>`_. It passes arguments on the
 352     stack right to left (as cdecl does), and returns a value in the
 353     platform's customary return register.
 354 "``anyregcc``" - Dynamic calling convention for code patching
 355     This is a special convention that supports patching an arbitrary code
 356     sequence in place of a call site. This convention forces the call
 357     arguments into registers but allows them to be dynamically
 358     allocated. This can currently only be used with calls to
  359     ``llvm.experimental.patchpoint`` because only this intrinsic records
 360     the location of its arguments in a side table. See :doc:`StackMaps`.
 361 "``preserve_mostcc``" - The `PreserveMost` calling convention
 362     This calling convention attempts to make the code in the caller as
 363     unintrusive as possible. This convention behaves identically to the `C`
 364     calling convention on how arguments and return values are passed, but it
 365     uses a different set of caller/callee-saved registers. This alleviates the
 366     burden of saving and recovering a large register set before and after the
 367     call in the caller. If the arguments are passed in callee-saved registers,
 368     then they will be preserved by the callee across the call. This doesn't
 369     apply for values returned in callee-saved registers.
 370
 371     - On X86-64 the callee preserves all general purpose registers, except for
 372       R11. R11 can be used as a scratch register. Floating-point registers
 373       (XMMs/YMMs) are not preserved and need to be saved by the caller.
 374
 375     The idea behind this convention is to support calls to runtime functions
 376     that have a hot path and a cold path. The hot path is usually a small piece
 377     of code that doesn't use many registers. The cold path might need to call out to
 378     another function and therefore only needs to preserve the caller-saved
 379     registers, which haven't already been saved by the caller. The
 380     `PreserveMost` calling convention is very similar to the `cold` calling
 381     convention in terms of caller/callee-saved registers, but they are used for
 382     different types of function calls. `coldcc` is for function calls that are
 383     rarely executed, whereas `preserve_mostcc` function calls are intended to be
 384     on the hot path and definitely executed a lot. Furthermore `preserve_mostcc`
 385     doesn't prevent the inliner from inlining the function call.
 386
 387     This calling convention will be used by a future version of the ObjectiveC
 388     runtime and should therefore still be considered experimental at this time.
 389     Although this convention was created to optimize certain runtime calls to
 390     the ObjectiveC runtime, it is not limited to this runtime and might be used
 391     by other runtimes in the future too. The current implementation only
 392     supports X86-64, but the intention is to support more architectures in the
 393     future.
 394 "``preserve_allcc``" - The `PreserveAll` calling convention
 395     This calling convention attempts to make the code in the caller even less
 396     intrusive than the `PreserveMost` calling convention. This calling
  397     convention also behaves identically to the `C` calling convention on how
 398     arguments and return values are passed, but it uses a different set of
 399     caller/callee-saved registers. This removes the burden of saving and
 400     recovering a large register set before and after the call in the caller. If
 401     the arguments are passed in callee-saved registers, then they will be
 402     preserved by the callee across the call. This doesn't apply for values
 403     returned in callee-saved registers.
 404
 405     - On X86-64 the callee preserves all general purpose registers, except for
 406       R11. R11 can be used as a scratch register. Furthermore it also preserves
 407       all floating-point registers (XMMs/YMMs).
 408
 409     The idea behind this convention is to support calls to runtime functions
 410     that don't need to call out to any other functions.
 411
 412     This calling convention, like the `PreserveMost` calling convention, will be
 413     used by a future version of the ObjectiveC runtime and should be considered
 414     experimental at this time.
 415 "``cxx_fast_tlscc``" - The `CXX_FAST_TLS` calling convention for access functions
 416     Clang generates an access function to access C++-style TLS. The access
 417     function generally has an entry block, an exit block and an initialization
  418     block that is run the first time. The entry and exit blocks can access
  419     a few TLS IR variables; each access will be lowered to a platform-specific
 420     sequence.
 421
 422     This calling convention aims to minimize overhead in the caller by
 423     preserving as many registers as possible (all the registers that are
 424     preserved on the fast path, composed of the entry and exit blocks).
 425
  426     This calling convention behaves identically to the `C` calling convention on
 427     how arguments and return values are passed, but it uses a different set of
 428     caller/callee-saved registers.
 429
 430     Given that each platform has its own lowering sequence, hence its own set
 431     of preserved registers, we can't use the existing `PreserveMost`.
 432
 433     - On X86-64 the callee preserves all general purpose registers, except for
 434       RDI and RAX.
 435 "``tailcc``" - Tail callable calling convention
 436     This calling convention ensures that calls in tail position will always be
 437     tail call optimized. This calling convention is equivalent to fastcc,
 438     except for an additional guarantee that tail calls will be produced
 439     whenever possible. `Tail calls can only be optimized when this, the fastcc,
 440     the GHC or the HiPE convention is used. <CodeGenerator.html#id80>`_ This
 441     calling convention does not support varargs and requires the prototype of
 442     all callees to exactly match the prototype of the function definition.
  443 "``swiftcc``" - This calling convention is used for the Swift language.
 444     - On X86-64 RCX and R8 are available for additional integer returns, and
 445       XMM2 and XMM3 are available for additional FP/vector returns.
  446     - On iOS platforms, we use the AAPCS-VFP calling convention.
 447 "``swifttailcc``"
 448     This calling convention is like ``swiftcc`` in most respects, but also the
 449     callee pops the argument area of the stack so that mandatory tail calls are
 450     possible as in ``tailcc``.
 451 "``cfguard_checkcc``" - Windows Control Flow Guard (Check mechanism)
 452     This calling convention is used for the Control Flow Guard check function,
 453     calls to which can be inserted before indirect calls to check that the call
 454     target is a valid function address. The check function has no return value,
 455     but it will trigger an OS-level error if the address is not a valid target.
 456     The set of registers preserved by the check function, and the register
 457     containing the target address are architecture-specific.
 458
 459     - On X86 the target address is passed in ECX.
 460     - On ARM the target address is passed in R0.
 461     - On AArch64 the target address is passed in X15.
 462 "``cc <n>``" - Numbered convention
 463     Any calling convention may be specified by number, allowing
  464     target-specific calling conventions to be used. Target-specific
 465     calling conventions start at 64.
 466
 467 More calling conventions can be added/defined on an as-needed basis, to
 468 support Pascal conventions or any other well-known target-independent
 469 convention.
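
      The sketch below shows where the calling convention keyword appears on
      definitions, declarations and call sites. All function names are
      hypothetical. The convention written on each call matches the one on the
      callee, since a mismatch makes the behavior undefined:

      .. code-block:: llvm

          define fastcc i32 @add_fast(i32 %a, i32 %b) {
            %sum = add i32 %a, %b
            ret i32 %sum
          }

          define i32 @caller(i32 %x) {
            ; no keyword on @caller itself, so it uses the default ccc
            %r = call fastcc i32 @add_fast(i32 %x, i32 1)
            ret i32 %r
          }

          define tailcc i32 @countdown(i32 %n) {
            %done = icmp eq i32 %n, 0
            br i1 %done, label %exit, label %again
          again:
            %m = sub i32 %n, 1
            %r = tail call tailcc i32 @countdown(i32 %m)
            ret i32 %r
          exit:
            ret i32 0
          }

          declare coldcc void @report_error(i32)
          declare cc 10 void @ghc_style(i64)      ; GHC convention, by number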
 470
 471 .. _visibilitystyles:
 472
 473 Visibility Styles
 474 -----------------
 475
 476 All Global Variables and Functions have one of the following visibility
 477 styles:
 478
 479 "``default``" - Default style
 480     On targets that use the ELF object file format, default visibility
 481     means that the declaration is visible to other modules and, in
 482     shared libraries, means that the declared entity may be overridden.
 483     On Darwin, default visibility means that the declaration is visible
 484     to other modules. Default visibility corresponds to "external
 485     linkage" in the language.
 486 "``hidden``" - Hidden style
 487     Two declarations of an object with hidden visibility refer to the
 488     same object if they are in the same shared object. Usually, hidden
 489     visibility indicates that the symbol will not be placed into the
 490     dynamic symbol table, so no other module (executable or shared
 491     library) can reference it directly.
 492 "``protected``" - Protected style
 493     On ELF, protected visibility indicates that the symbol will be
 494     placed in the dynamic symbol table, but that references within the
 495     defining module will bind to the local symbol. That is, the symbol
 496     cannot be overridden by another module.
 497
 498 A symbol with ``internal`` or ``private`` linkage must have ``default``
 499 visibility.
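
      For illustration, the visibility keyword follows the (optional) linkage on
      a global. The names in this sketch are assumptions:

      .. code-block:: llvm

          @exported = default global i32 0        ; same as omitting the keyword
          @lib_private = hidden global i32 0      ; kept out of the dynamic symbol table

          define protected void @api_entry() {    ; cannot be overridden by another module
            ret void
          }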
 500
 501 .. _dllstorageclass:
 502
 503 DLL Storage Classes
 504 -------------------
 505
 506 All Global Variables, Functions and Aliases can have one of the following
  507 DLL storage classes:
 508
 509 ``dllimport``
 510     "``dllimport``" causes the compiler to reference a function or variable via
 511     a global pointer to a pointer that is set up by the DLL exporting the
 512     symbol. On Microsoft Windows targets, the pointer name is formed by
 513     combining ``__imp_`` and the function or variable name.
 514 ``dllexport``
 515     "``dllexport``" causes the compiler to provide a global pointer to a pointer
 516     in a DLL, so that it can be referenced with the ``dllimport`` attribute. On
 517     Microsoft Windows targets, the pointer name is formed by combining
 518     ``__imp_`` and the function or variable name. Since this storage class
 519     exists for defining a dll interface, the compiler, assembler and linker know
 520     it is externally referenced and must refrain from deleting the symbol.
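
      A small sketch of both storage classes, using hypothetical symbol names.
      The imported symbols are declarations, while the exported ones are
      definitions:

      .. code-block:: llvm

          declare dllimport void @imported_fn()
          @imported_var = external dllimport global i32

          define dllexport void @exported_fn() {
            ret void
          }
          @exported_var = dllexport global i32 0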
 521
 522 .. _tls_model:
 523
 524 Thread Local Storage Models
 525 ---------------------------
 526
 527 A variable may be defined as ``thread_local``, which means that it will
  528 not be shared by threads (each thread will have a separate copy of the
 529 variable). Not all targets support thread-local variables. Optionally, a
 530 TLS model may be specified:
 531
 532 ``localdynamic``
 533     For variables that are only used within the current shared library.
 534 ``initialexec``
 535     For variables in modules that will not be loaded dynamically.
 536 ``localexec``
 537     For variables defined in the executable and only used within it.
 538
 539 If no explicit model is given, the "general dynamic" model is used.
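
      As a sketch (the variable names are illustrative), the optional model is
      written in parentheses after ``thread_local``:

      .. code-block:: llvm

          @tls_default = thread_local global i32 0               ; general dynamic
          @tls_ld = thread_local(localdynamic) global i32 0
          @tls_ie = thread_local(initialexec) global i32 0
          @tls_le = thread_local(localexec) global i32 0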
 540
 541 The models correspond to the ELF TLS models; see `ELF Handling For
 542 Thread-Local Storage <http://people.redhat.com/drepper/tls.pdf>`_ for
  543 more information on the circumstances under which the different models
  544 may be used. The target may choose a different TLS model if the specified
 545 model is not supported, or if a better choice of model can be made.
 546
 547 A model can also be specified in an alias, but then it only governs how
  548 the alias is accessed. It will not have any effect on the aliasee.
 549
  550 For platforms without linker support for the ELF TLS model, the
  551 ``-femulated-tls`` flag can be used to generate GCC-compatible emulated TLS code.
 552
 553 .. _runtime_preemption_model:
 554
 555 Runtime Preemption Specifiers
 556 -----------------------------
 557
 558 Global variables, functions and aliases may have an optional runtime preemption
 559 specifier. If a preemption specifier isn't given explicitly, then a
 560 symbol is assumed to be ``dso_preemptable``.
 561
 562 ``dso_preemptable``
 563     Indicates that the function or variable may be replaced by a symbol from
 564     outside the linkage unit at runtime.
 565
 566 ``dso_local``
 567     The compiler may assume that a function or variable marked as ``dso_local``
 568     will resolve to a symbol within the same linkage unit. Direct access will
 569     be generated even if the definition is not within this compilation unit.
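
      A brief sketch with hypothetical names. The specifier is written right
      after the (optional) linkage:

      .. code-block:: llvm

          @local_data = dso_local global i32 0
          @interposable = dso_preemptable global i32 0    ; same as the default

          ; Declared here, but assumed to resolve within the same linkage unit.
          declare dso_local void @defined_in_same_dso()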
 570
 571 .. _namedtypes:
 572
 573 Structure Types
 574 ---------------
 575
 576 LLVM IR allows you to specify both "identified" and "literal" :ref:`structure
 577 types <t_struct>`. Literal types are uniqued structurally, but identified types
 578 are never uniqued. An :ref:`opaque structural type <t_opaque>` can also be used
 579 to forward declare a type that is not yet available.
 580
 581 An example of an identified structure specification is:
 582
 583 .. code-block:: llvm
 584
 585     %mytype = type { %mytype*, i32 }
 586
 587 Prior to the LLVM 3.0 release, identified types were structurally uniqued. Only
 588 literal types are uniqued in recent versions of LLVM.
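
      To contrast the three forms side by side, here is a short sketch. The type
      and symbol names are assumptions made for illustration:

      .. code-block:: llvm

          %node = type { %node*, i32 }            ; identified, may be recursive
          %handle = type opaque                   ; opaque forward declaration

          ; A literal structure type is written inline wherever a type is expected.
          @pair = global { i32, float } { i32 1, float 2.0 }

          ; An opaque type can still be used behind a pointer.
          declare void @use_handle(%handle*)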
 589
 590 .. _nointptrtype:
 591
 592 Non-Integral Pointer Type
 593 -------------------------
 594
 595 Note: non-integral pointer types are a work in progress, and they should be
 596 considered experimental at this time.
 597
 598 LLVM IR optionally allows the frontend to denote pointers in certain address
 599 spaces as "non-integral" via the :ref:`datalayout string<langref_datalayout>`.
 600 Non-integral pointer types represent pointers that have an *unspecified* bitwise
 601 representation; that is, the integral representation may be target dependent or
 602 unstable (not backed by a fixed integer).
 603
 604 ``inttoptr`` and ``ptrtoint`` instructions have the same semantics as for
 605 integral (i.e. normal) pointers in that they convert integers to and from
 606 corresponding pointer types, but there are additional implications to be
 607 aware of.  Because the bit-representation of a non-integral pointer may
 608 not be stable, two identical casts of the same operand may or may not
 609 return the same value.  Said differently, the conversion to or from the
 610 non-integral type depends on environmental state in an implementation
 611 defined manner.
 612
 613 If the frontend wishes to observe a *particular* value following a cast, the
 614 generated IR must fence with the underlying environment in an implementation
 615 defined manner. (In practice, this tends to require ``noinline`` routines for
 616 such operations.)
 617
 618 From the perspective of the optimizer, ``inttoptr`` and ``ptrtoint`` for
 619 non-integral types are analogous to ones on integral types with one
 620 key exception: the optimizer may not, in general, insert new dynamic
 621 occurrences of such casts.  If a new cast is inserted, the optimizer would
 622 need to either ensure that a) all possible values are valid, or b)
 623 appropriate fencing is inserted.  Since the appropriate fencing is
 624 implementation defined, the optimizer can't do the latter.  The former is
 625 challenging as many commonly expected properties, such as
 626 ``ptrtoint(v)-ptrtoint(v) == 0``, don't hold for non-integral types.
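
     The following sketch illustrates that last property. It assumes a
     hypothetical datalayout that marks address space 1 as non-integral; the
     function name is illustrative.

     .. code-block:: llvm

         target datalayout = "e-ni:1"

         define i64 @diff(i8 addrspace(1)* %v) {
           %a = ptrtoint i8 addrspace(1)* %v to i64
           %b = ptrtoint i8 addrspace(1)* %v to i64
           ; For an integral pointer this difference would always be 0. For a
           ; non-integral pointer the two casts may observe different bits.
           %d = sub i64 %a, %b
           ret i64 %d
         }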
 627
 628 .. _globalvars:
 629
 630 Global Variables
 631 ----------------
 632
 633 Global variables define regions of memory allocated at compilation time
 634 instead of run-time.
 635
 636 Global variable definitions must be initialized.
 637
 638 Global variables in other translation units can also be declared, in which
 639 case they don't have an initializer.
 640
 641 Global variables can optionally specify a :ref:`linkage type <linkage>`.
 642
 643 Either global variable definitions or declarations may have an explicit section
 644 to be placed in and may have an optional explicit alignment specified. If there
 645 is a mismatch between the explicit or inferred section information for the
 646 variable declaration and its definition the resulting behavior is undefined.
 647
 648 A variable may be defined as a global ``constant``, which indicates that
 649 the contents of the variable will **never** be modified (enabling better
 650 optimization, allowing the global data to be placed in the read-only
 651 section of an executable, etc). Note that variables that need runtime
 652 initialization cannot be marked ``constant`` as there is a store to the
 653 variable.
 654
 655 LLVM explicitly allows *declarations* of global variables to be marked
 656 constant, even if the final definition of the global is not. This
 657 capability can be used to enable slightly better optimization of the
 658 program, but requires the language definition to guarantee that
 659 optimizations based on the 'constantness' are valid for the translation
 660 units that do not include the definition.
 661
 662 As SSA values, global variables define pointer values that are in scope
 663 (i.e. they dominate) all basic blocks in the program. Global variables
 664 always define a pointer to their "content" type because they describe a
 665 region of memory, and all memory objects in LLVM are accessed through
 666 pointers.
 667
 668 Global variables can be marked with ``unnamed_addr`` which indicates
 669 that the address is not significant, only the content. Constants marked
 670 like this can be merged with other constants if they have the same
 671 initializer. Note that a constant with significant address *can* be
 672 merged with an ``unnamed_addr`` constant, the result being a constant
 673 whose address is significant.
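
     For instance, in the following sketch (the global names are illustrative)
     both constants are ``unnamed_addr`` and have identical initializers, so an
     optimizer or linker is permitted, though not required, to fold them into a
     single object:

     .. code-block:: llvm

         @.str.a = private unnamed_addr constant [3 x i8] c"hi\00"
         @.str.b = private unnamed_addr constant [3 x i8] c"hi\00"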
 674
 675 If the ``local_unnamed_addr`` attribute is given, the address is known to
 676 not be significant within the module.
 677
 678 A global variable may be declared to reside in a target-specific
 679 numbered address space. For targets that support them, address spaces
 680 may affect how optimizations are performed and/or what target
 681 instructions are used to access the variable. The default address space
 682 is zero. The address space qualifier must precede any other attributes.
 683
 684 LLVM allows an explicit section to be specified for globals. If the
 685 target supports it, it will emit globals to the section specified.
 686 Additionally, the global can be placed in a comdat if the target has the necessary
 687 support.
 688
 689 External declarations may have an explicit section specified. Section
 690 information is retained in LLVM IR for targets that make use of this
 691 information. Attaching section information to an external declaration is an
 692 assertion that its definition is located in the specified section. If the
 693 definition is located in a different section, the behavior is undefined.
 694
 695 By default, global initializers are optimized by assuming that global
 696 variables defined within the module are not modified from their
 697 initial values before the start of the global initializer. This is
 698 true even for variables potentially accessible from outside the
 699 module, including those with external linkage, those appearing in
 700 ``@llvm.used``, and dllexported variables. This assumption may be suppressed
 701 by marking the variable with ``externally_initialized``.
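
     A minimal sketch, assuming a hypothetical global ``@device_flag`` whose value
     is written by something outside the module (for example a loader or runtime)
     before any code in the module runs:

     .. code-block:: llvm

         @device_flag = internal externally_initialized global i32 0

         define i32 @read_flag() {
           ; Because of externally_initialized, this load may not be assumed
           ; to yield the initializer value 0.
           %v = load i32, i32* @device_flag
           ret i32 %v
         }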
 702
 703 An explicit alignment may be specified for a global, which must be a
 704 power of 2. If not present, or if the alignment is set to zero, the
 705 alignment of the global is set by the target to whatever it feels
 706 convenient. If an explicit alignment is specified, the global is forced
 707 to have exactly that alignment. Targets and optimizers are not allowed
 708 to over-align the global if the global has an assigned section. In this
 709 case, the extra alignment could be observable: for example, code could
 710 assume that the globals are densely packed in their section and try to
 711 iterate over them as an array; alignment padding would break this
 712 iteration. The maximum alignment is ``1 << 29``.
 713
 714 For global variable declarations, as well as definitions that may be
 715 replaced at link time (``linkonce``, ``weak``, ``extern_weak`` and ``common``
 716 linkage types), LLVM makes no assumptions about the allocation size of the
 717 variables, except that they may not overlap. The alignment of a global variable
 718 declaration or replaceable definition must not be greater than the alignment of
 719 the definition it resolves to.
 720
 721 Globals can also have a :ref:`DLL storage class <dllstorageclass>`,
 722 an optional :ref:`runtime preemption specifier <runtime_preemption_model>`,
 723 optional :ref:`global attributes <glattrs>` and
 724 an optional list of attached :ref:`metadata <metadata>`.
 725
 726 Variables and aliases can have a
 727 :ref:`Thread Local Storage Model <tls_model>`.
 728
 729 :ref:`Scalable vectors <t_vector>` cannot be global variables or members of
 730 arrays because their size is unknown at compile time. They are allowed in
 731 structs to facilitate intrinsics returning multiple values. Structs containing
 732 scalable vectors cannot be used in loads, stores, allocas, or GEPs.
 733
 734 Syntax::
 735
 736       @<GlobalVarName> = [Linkage] [PreemptionSpecifier] [Visibility]
 737                          [DLLStorageClass] [ThreadLocal]
 738                          [(unnamed_addr|local_unnamed_addr)] [AddrSpace]
 739                          [ExternallyInitialized]
 740                          <global | constant> <Type> [<InitializerConstant>]
 741                          [, section "name"] [, comdat [($name)]]
 742                          [, align <Alignment>] (, !name !N)*
 743
 744 For example, the following defines a global in a numbered address space
 745 with an initializer, section, and alignment:
 746
 747 .. code-block:: llvm
 748
 749     @G = addrspace(5) constant float 1.0, section "foo", align 4
 750
 751 The following example just declares a global variable:
 752
 753 .. code-block:: llvm
 754
 755    @G = external global i32
 756
 757 The following example defines a thread-local global with the
 758 ``initialexec`` TLS model:
 759
 760 .. code-block:: llvm
 761
 762     @G = thread_local(initialexec) global i32 0, align 4
 763
 764 .. _functionstructure:
 765
 766 Functions
 767 ---------
 768
 769 LLVM function definitions consist of the "``define``" keyword, an
 770 optional :ref:`linkage type <linkage>`, an optional :ref:`runtime preemption
 771 specifier <runtime_preemption_model>`,  an optional :ref:`visibility
 772 style <visibility>`, an optional :ref:`DLL storage class <dllstorageclass>`,
 773 an optional :ref:`calling convention <callingconv>`,
 774 an optional ``unnamed_addr`` attribute, a return type, an optional
 775 :ref:`parameter attribute <paramattrs>` for the return type, a function
 776 name, a (possibly empty) argument list (each with optional :ref:`parameter
 777 attributes <paramattrs>`), optional :ref:`function attributes <fnattrs>`,
 778 an optional address space, an optional section, an optional alignment,
 779 an optional :ref:`comdat <langref_comdats>`,
 780 an optional :ref:`garbage collector name <gc>`, an optional :ref:`prefix <prefixdata>`,
 781 an optional :ref:`prologue <prologuedata>`,
 782 an optional :ref:`personality <personalityfn>`,
 783 an optional list of attached :ref:`metadata <metadata>`,
 784 an opening curly brace, a list of basic blocks, and a closing curly brace.
 785
 786 LLVM function declarations consist of the "``declare``" keyword, an
 787 optional :ref:`linkage type <linkage>`, an optional :ref:`visibility style
 788 <visibility>`, an optional :ref:`DLL storage class <dllstorageclass>`, an
 789 optional :ref:`calling convention <callingconv>`, an optional ``unnamed_addr``
 790 or ``local_unnamed_addr`` attribute, an optional address space, a return type,
 791 an optional :ref:`parameter attribute <paramattrs>` for the return type, a function name, a possibly
 792 empty list of arguments, an optional alignment, an optional :ref:`garbage
 793 collector name <gc>`, an optional :ref:`prefix <prefixdata>`, and an optional
 794 :ref:`prologue <prologuedata>`.
 795
 796 A function definition contains a list of basic blocks, forming the CFG (Control
 797 Flow Graph) for the function. Each basic block may optionally start with a label
 798 (giving the basic block a symbol table entry), contains a list of instructions,
 799 and ends with a :ref:`terminator <terminators>` instruction (such as a branch or
 800 function return). If an explicit label name is not provided, a block is assigned
 801 an implicit numbered label, using the next value from the same counter as used
 802 for unnamed temporaries (:ref:`see above<identifiers>`). For example, if a
 803 function entry block does not have an explicit label, it will be assigned label
 804 "%0", then the first unnamed temporary in that block will be "%1", etc. If a
 805 numeric label is explicitly specified, it must match the numeric label that
 806 would be used implicitly.
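
     The numbering rule can be sketched as follows (the function is
     illustrative). The argument is named, so the unlabeled entry block takes
     ``%0`` and the remaining unnamed values continue from there:

     .. code-block:: llvm

         define i32 @f(i32 %x) {
           ; Implicit entry label: %0
           %1 = add i32 %x, 1
           br label %2

         2:                      ; explicit numeric label, must be 2 here
           %3 = mul i32 %1, 2
           ret i32 %3
         }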
 807
 808 The first basic block in a function is special in two ways: it is
 809 immediately executed on entrance to the function, and it is not allowed
 810 to have predecessor basic blocks (i.e. there can not be any branches to
 811 the entry block of a function). Because the block can have no
 812 predecessors, it also cannot have any :ref:`PHI nodes <i_phi>`.
 813
 814 LLVM allows an explicit section to be specified for functions. If the
 815 target supports it, it will emit functions to the section specified.
 816 Additionally, the function can be placed in a COMDAT.
 817
 818 An explicit alignment may be specified for a function. If not present,
 819 or if the alignment is set to zero, the alignment of the function is set
 820 by the target to whatever it feels convenient. If an explicit alignment
 821 is specified, the function is forced to have at least that much
 822 alignment. All alignments must be a power of 2.
 823
 824 If the ``unnamed_addr`` attribute is given, the address is known to not
 825 be significant and two identical functions can be merged.
 826
 827 If the ``local_unnamed_addr`` attribute is given, the address is known to
 828 not be significant within the module.
 829
 830 If an explicit address space is not given, it will default to the program
 831 address space from the :ref:`datalayout string<langref_datalayout>`.
 832
 833 Syntax::
 834
 835     define [linkage] [PreemptionSpecifier] [visibility] [DLLStorageClass]
 836            [cconv] [ret attrs]
 837            <ResultType> @<FunctionName> ([argument list])
 838            [(unnamed_addr|local_unnamed_addr)] [AddrSpace] [fn Attrs]
 839            [section "name"] [comdat [($name)]] [align N] [gc] [prefix Constant]
 840            [prologue Constant] [personality Constant] (!name !N)* { ... }
 841
 842 The argument list is a comma separated sequence of arguments where each
 843 argument is of the following form:
 844
 845 Syntax::
 846
 847    <type> [parameter Attrs] [name]
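
     Putting the pieces together, a small definition that uses several of the
     optional components might look like the sketch below. The function name,
     section name, and attribute choices are illustrative only.

     .. code-block:: llvm

         define internal fastcc i32 @add_one(i32 %x) unnamed_addr nounwind
             section ".text.hot" align 16 {
         entry:
           %r = add i32 %x, 1
           ret i32 %r
         }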
 848
 849
 850 .. _langref_aliases:
 851
 852 Aliases
 853 -------
 854
 855 Aliases, unlike functions or variables, don't create any new data. They
 856 are just a new symbol and metadata for an existing position.
 857
 858 Aliases have a name and an aliasee that is either a global value or a
 859 constant expression.
 860
 861 Aliases may have an optional :ref:`linkage type <linkage>`, an optional
 862 :ref:`runtime preemption specifier <runtime_preemption_model>`, an optional
 863 :ref:`visibility style <visibility>`, an optional :ref:`DLL storage class
 864 <dllstorageclass>` and an optional :ref:`tls model <tls_model>`.
 865
 866 Syntax::
 867
 868     @<Name> = [Linkage] [PreemptionSpecifier] [Visibility] [DLLStorageClass] [ThreadLocal] [(unnamed_addr|local_unnamed_addr)] alias <AliaseeTy>, <AliaseeTy>* @<Aliasee>
 869
 870 The linkage must be one of ``private``, ``internal``, ``linkonce``, ``weak``,
 871 ``linkonce_odr``, ``weak_odr``, ``external``. Note that some system linkers
 872 might not correctly handle dropping a weak symbol that is aliased.
 873
 874 Aliases that are not ``unnamed_addr`` are guaranteed to have the same address as
 875 the aliasee expression. ``unnamed_addr`` ones are only guaranteed to point
 876 to the same content.
 877
 878 If the ``local_unnamed_addr`` attribute is given, the address is known to
 879 not be significant within the module.
 880
 881 Since aliases are only a second name, some restrictions apply, of which
 882 some can only be checked when producing an object file:
 883
 884 * The expression defining the aliasee must be computable at assembly
 885   time. Since it is just a name, no relocations can be used.
 886
 887 * No alias in the expression can be weak as the possibility of the
 888   intermediate alias being overridden cannot be represented in an
 889   object file.
 890
 891 * No global value in the expression can be a declaration, since that
 892   would require a relocation, which is not possible.
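
     As a sketch of the syntax and restrictions above (all names are
     illustrative), both aliasees below are definitions in the same module, so
     no relocation is needed:

     .. code-block:: llvm

         @counter = global i32 0

         define void @impl() {
           ret void
         }

         ; A second name for the same object; it has the same address as @counter.
         @counter_alias = alias i32, i32* @counter

         ; A weak alias for a function defined in this module.
         @entry = weak alias void (), void ()* @impl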
 893
 894 .. _langref_ifunc:
 895
 896 IFuncs
 897 -------
 898
 899 IFuncs, like aliases, don't create any new data or functions. They are just a new
 900 symbol that the dynamic linker resolves at runtime by calling a resolver function.
 901
 902 IFuncs have a name and a resolver, a function called by the dynamic linker
 903 that returns the address of another function associated with the name.
 904
 905 IFuncs may have an optional :ref:`linkage type <linkage>` and an optional
 906 :ref:`visibility style <visibility>`.
 907
 908 Syntax::
 909
 910     @<Name> = [Linkage] [Visibility] ifunc <IFuncTy>, <ResolverTy>* @<Resolver>
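
     A minimal sketch of an IFunc and its resolver follows. The names are
     hypothetical, and the resolver here always returns the same implementation;
     a realistic resolver would typically choose among several.

     .. code-block:: llvm

         define internal i32 @impl_generic(i32 %x) {
           ret i32 %x
         }

         ; Called by the dynamic linker; returns the function to bind @f to.
         define internal i32 (i32)* @resolve_f() {
           ret i32 (i32)* @impl_generic
         }

         @f = ifunc i32 (i32), i32 (i32)* ()* @resolve_f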
 911
 912
 913 .. _langref_comdats:
 914
 915 Comdats
 916 -------
 917
 918 Comdat IR provides access to object file COMDAT/section group functionality
 919 which represents interrelated sections.
 920
 921 Comdats have a name which represents the COMDAT key and a selection kind to
 922 provide input on how the linker deduplicates comdats with the same key in two
 923 different object files. A comdat must be included or omitted as a unit.
 924 Discarding the whole comdat is allowed but discarding a subset is not.
 925
 926 A global object may be a member of at most one comdat. Aliases are placed in the
 927 same COMDAT that their aliasee computes to, if any.
 928
 929 Syntax::
 930
 931     $<Name> = comdat SelectionKind
 932
 933 For selection kinds other than ``nodeduplicate``, only one of the duplicate
 934 comdats may be retained by the linker and the members of the remaining comdats
 935 must be discarded. The following selection kinds are supported:
 936
 937 ``any``
 938     The linker may choose any COMDAT key; the choice is arbitrary.
 939 ``exactmatch``
 940     The linker may choose any COMDAT key but the sections must contain the
 941     same data.
 942 ``largest``
 943     The linker will choose the section containing the largest COMDAT key.
 944 ``nodeduplicate``
 945     No deduplication is performed.
 946 ``samesize``
 947     The linker may choose any COMDAT key but the sections must contain the
 948     same amount of data.
 949
 950 - XCOFF and Mach-O don't support COMDATs.
 951 - COFF supports all selection kinds. Non-``nodeduplicate`` selection kinds need
 952   a non-local linkage COMDAT symbol.
 953 - ELF supports ``any`` and ``nodeduplicate``.
 954 - WebAssembly only supports ``any``.
 955
 956 Here is an example of a COFF COMDAT where a function will only be selected if
 957 the COMDAT key's section is the largest:
 958
 959 .. code-block:: text
 960
 961    $foo = comdat largest
 962    @foo = global i32 2, comdat($foo)
 963
 964    define void @bar() comdat($foo) {
 965      ret void
 966    }
 967
 968 In a COFF object file, this will create a COMDAT section with selection kind
 969 ``IMAGE_COMDAT_SELECT_LARGEST`` containing the contents of the ``@foo`` symbol
 970 and another COMDAT section with selection kind
 971 ``IMAGE_COMDAT_SELECT_ASSOCIATIVE`` which is associated with the first COMDAT
 972 section and contains the contents of the ``@bar`` symbol.
 973
 974 As syntactic sugar, the ``$name`` can be omitted if the name is the same as
 975 the global name:
 976
 977 .. code-block:: llvm
 978
 979   $foo = comdat any
 980   @foo = global i32 2, comdat
 981   @bar = global i32 3, comdat($foo)
 982
 983 There are some restrictions on the properties of the global object.
 984 It, or an alias to it, must have the same name as the COMDAT group when
 985 targeting COFF.
 986 The contents and size of this object may be used during link-time to determine
 987 which COMDAT groups get selected depending on the selection kind.
 988 Because the name of the object must match the name of the COMDAT group, the
 989 linkage of the global object must not be local; local symbols can get renamed
 990 if a collision occurs in the symbol table.
 991
 992 The combined use of COMDATs and section attributes may yield surprising results.
 993 For example:
 994
 995 .. code-block:: llvm
 996
 997    $foo = comdat any
 998    $bar = comdat any
 999    @g1 = global i32 42, section "sec", comdat($foo)
1000    @g2 = global i32 42, section "sec", comdat($bar)
1001
1002 From the object file perspective, this requires the creation of two sections
1003 with the same name. This is necessary because both globals belong to different
1004 COMDAT groups and COMDATs, at the object file level, are represented by
1005 sections.
1006
1007 Note that certain IR constructs like global variables and functions may
1008 create COMDATs in the object file in addition to any which are specified using
1009 COMDAT IR. This arises when the code generator is configured to emit globals
1010 in individual sections (e.g. when ``-data-sections`` or ``-function-sections``
1011 is supplied to ``llc``).
1012
1013 .. _namedmetadatastructure:
1014
1015 Named Metadata
1016 --------------
1017
1018 Named metadata is a collection of metadata. :ref:`Metadata
1019 nodes <metadata>` (but not metadata strings) are the only valid
1020 operands for a named metadata.
1021
1022 #. Named metadata are represented as a string of characters with the
1023    metadata prefix. The rules for metadata names are the same as for
1024    identifiers, but quoted names are not allowed. ``"\xx"`` type escapes
1025    are still valid, which allows any character to be part of a name.
1026
1027 Syntax::
1028
1029     ; Some unnamed metadata nodes, which are referenced by the named metadata.
1030     !0 = !{!"zero"}
1031     !1 = !{!"one"}
1032     !2 = !{!"two"}
1033     ; A named metadata.
1034     !name = !{!0, !1, !2}
1035
1036 .. _paramattrs:
1037
1038 Parameter Attributes
1039 --------------------
1040
1041 The return type and each parameter of a function type may have a set of
1042 *parameter attributes* associated with them. Parameter attributes are
1043 used to communicate additional information about the result or
1044 parameters of a function. Parameter attributes are considered to be part
1045 of the function, not of the function type, so functions with different
1046 parameter attributes can have the same function type.
1047
1048 Parameter attributes are simple keywords that follow the type specified.
1049 If multiple parameter attributes are needed, they are space separated.
1050 For example:
1051
1052 .. code-block:: llvm
1053
1054     declare i32 @printf(i8* noalias nocapture, ...)
1055     declare i32 @atoi(i8 zeroext)
1056     declare signext i8 @returns_signed_char()
1057
1058 Note that any attributes for the function itself (such as ``nounwind`` and
1059 ``readonly``) come immediately after the argument list.
1060
1061 Currently, only the following parameter attributes are defined:
1062
1063 ``zeroext``
1064     This indicates to the code generator that the parameter or return
1065     value should be zero-extended to the extent required by the target's
1066     ABI by the caller (for a parameter) or the callee (for a return value).
1067 ``signext``
1068     This indicates to the code generator that the parameter or return
1069     value should be sign-extended to the extent required by the target's
1070     ABI (which is usually 32 bits) by the caller (for a parameter) or
1071     the callee (for a return value).
1072 ``inreg``
1073     This indicates that this parameter or return value should be treated
1074     in a special target-dependent fashion while emitting code for
1075     a function call or return (usually, by putting it in a register as
1076     opposed to memory, though some targets use it to distinguish between
1077     two different kinds of registers). Use of this attribute is
1078     target-specific.
1079 ``byval(<ty>)``
1080     This indicates that the pointer parameter should really be passed by
1081     value to the function. The attribute implies that a hidden copy of
1082     the pointee is made between the caller and the callee, so the callee
1083     is unable to modify the value in the caller. This attribute is only
1084     valid on LLVM pointer arguments. It is generally used to pass
1085     structs and arrays by value, but is also valid on pointers to
1086     scalars. The copy is considered to belong to the caller not the
1087     callee (for example, ``readonly`` functions should not write to
1088     ``byval`` parameters). This is not a valid attribute for return
1089     values.
1090
1091     The byval type argument indicates the in-memory value type, and
1092     must be the same as the pointee type of the argument.
1093
1094     The byval attribute also supports specifying an alignment with the
1095     align attribute. It indicates the alignment of the stack slot to
1096     form and the known alignment of the pointer specified to the call
1097     site. If the alignment is not specified, then the code generator
1098     makes a target-specific assumption.
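
         As a sketch (the struct and function names are illustrative), a
         struct passed by value with an explicit alignment could be written as:

         .. code-block:: llvm

             %struct.Point = type { i32, i32 }

             declare void @draw(%struct.Point* byval(%struct.Point) align 4)

             define void @caller(%struct.Point* %p) {
               ; The callee receives its own copy; writes inside @draw do not
               ; affect the memory that %p points to.
               call void @draw(%struct.Point* byval(%struct.Point) align 4 %p)
               ret void
             }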
1099
1100 .. _attr_byref:
1101
1102 ``byref(<ty>)``
1103
1104     The ``byref`` argument attribute allows specifying the pointee
1105     memory type of an argument. This is similar to ``byval``, but does
1106     not imply a copy is made anywhere, or that the argument is passed
1107     on the stack. This implies the pointer is dereferenceable up to
1108     the storage size of the type.
1109
1110     It is not generally permissible to introduce a write to a
1111     ``byref`` pointer. The pointer may have any address space and may
1112     be read only.
1113
1114     This is not a valid attribute for return values.
1115
1116     The alignment for a ``byref`` parameter can be explicitly
1117     specified by combining it with the ``align`` attribute, similar to
1118     ``byval``. If the alignment is not specified, then the code generator
1119     makes a target-specific assumption.
1120
1121     This is intended for representing ABI constraints, and is not
1122     intended to be inferred for optimization use.
1123
1124 .. _attr_preallocated:
1125
1126 ``preallocated(<ty>)``
1127     This indicates that the pointer parameter should really be passed by
1128     value to the function, and that the pointer parameter's pointee has
1129     already been initialized before the call instruction. This attribute
1130     is only valid on LLVM pointer arguments. The argument must be the value
1131     returned by the appropriate
1132     :ref:`llvm.call.preallocated.arg<int_call_preallocated_arg>` on non
1133     ``musttail`` calls, or the corresponding caller parameter in ``musttail``
1134     calls, although it is ignored during codegen.
1135
1136     A non ``musttail`` function call with a ``preallocated`` attribute in
1137     any parameter must have a ``"preallocated"`` operand bundle. A ``musttail``
1138     function call cannot have a ``"preallocated"`` operand bundle.
1139
1140     The preallocated attribute requires a type argument, which must be
1141     the same as the pointee type of the argument.
1142
1143     The preallocated attribute also supports specifying an alignment with the
1144     align attribute. It indicates the alignment of the stack slot to
1145     form and the known alignment of the pointer specified to the call
1146     site. If the alignment is not specified, then the code generator
1147     makes a target-specific assumption.
1148
1149 .. _attr_inalloca:
1150
1151 ``inalloca(<ty>)``
1152
1153     The ``inalloca`` argument attribute allows the caller to take the
1154     address of outgoing stack arguments. An ``inalloca`` argument must
1155     be a pointer to stack memory produced by an ``alloca`` instruction.
1156     The alloca, or argument allocation, must also be tagged with the
1157     inalloca keyword. Only the last argument may have the ``inalloca``
1158     attribute, and that argument is guaranteed to be passed in memory.
1159
1160     An argument allocation may be used by a call at most once because
1161     the call may deallocate it. The ``inalloca`` attribute cannot be
1162     used in conjunction with other attributes that affect argument
1163     storage, like ``inreg``, ``nest``, ``sret``, or ``byval``. The
1164     ``inalloca`` attribute also disables LLVM's implicit lowering of
1165     large aggregate return values, which means that frontend authors
1166     must lower them with ``sret`` pointers.
1167
1168     When the call site is reached, the argument allocation must have
1169     been the most recent stack allocation that is still live, or the
1170     behavior is undefined. It is possible to allocate additional stack
1171     space after an argument allocation and before its call site, but it
1172     must be cleared off with :ref:`llvm.stackrestore
1173     <int_stackrestore>`.
1174
1175     The inalloca attribute requires a type argument, which must be the
1176     same as the pointee type of the argument.
1177
1178     See :doc:`InAlloca` for more information on how to use this
1179     attribute.
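
         The rules above can be sketched as follows. The ``%frame`` type and
         the function names are illustrative; the stack is saved before the
         argument allocation and restored after the call.

         .. code-block:: llvm

             %frame = type { i32, i32 }

             declare i8* @llvm.stacksave()
             declare void @llvm.stackrestore(i8*)
             declare void @callee(%frame* inalloca(%frame))

             define void @caller() {
               %sp = call i8* @llvm.stacksave()
               ; The argument allocation is tagged with the inalloca keyword.
               %args = alloca inalloca %frame
               %a = getelementptr inbounds %frame, %frame* %args, i32 0, i32 0
               store i32 1, i32* %a
               %b = getelementptr inbounds %frame, %frame* %args, i32 0, i32 1
               store i32 2, i32* %b
               ; %args is the most recent live stack allocation at this point.
               call void @callee(%frame* inalloca(%frame) %args)
               call void @llvm.stackrestore(i8* %sp)
               ret void
             }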
1180
1181 ``sret(<ty>)``
1182     This indicates that the pointer parameter specifies the address of a
1183     structure that is the return value of the function in the source
1184     program. This pointer must be guaranteed by the caller to be valid:
1185     loads and stores to the structure may be assumed by the callee not
1186     to trap and to be properly aligned. This is not a valid attribute
1187     for return values.
1188
1189     The sret type argument specifies the in memory type, which must be
1190     the same as the pointee type of the argument.
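
         A minimal sketch, with hypothetical names, of a caller that provides
         valid, aligned storage for a struct returned through an ``sret``
         pointer:

         .. code-block:: llvm

             %struct.R = type { i64, i64, i64 }

             declare void @make(%struct.R* sret(%struct.R) align 8)

             define i64 @use() {
               %r = alloca %struct.R, align 8
               call void @make(%struct.R* sret(%struct.R) align 8 %r)
               %p = getelementptr inbounds %struct.R, %struct.R* %r, i32 0, i32 0
               %v = load i64, i64* %p, align 8
               ret i64 %v
             }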
1191
1192 .. _attr_elementtype:
1193
1194 ``elementtype(<ty>)``
1195
1196     The ``elementtype`` argument attribute can be used to specify a pointer
1197     element type in a way that is compatible with `opaque pointers
1198     <OpaquePointers.html>`__.
1199
1200     The ``elementtype`` attribute by itself does not carry any specific
1201     semantics. However, certain intrinsics may require this attribute to be
1202     present and assign it particular semantics. This will be documented on
1203     individual intrinsics.
1204
1205     The attribute may only be applied to pointer typed arguments of intrinsic
1206     calls. It cannot be applied to non-intrinsic calls, and cannot be applied
1207     to parameters on function declarations. For non-opaque pointers, the type
1208     passed to ``elementtype`` must match the pointer element type.
1209
1210 .. _attr_align:
1211
1212 ``align <n>`` or ``align(<n>)``
1213     This indicates that the pointer value has the specified alignment.
1214     If the pointer value does not have the specified alignment,
1215     a :ref:`poison value <poisonvalues>` is returned or passed instead. The
1216     ``align`` attribute should be combined with the ``noundef`` attribute to
1217     ensure a pointer is aligned, or otherwise the behavior is undefined. Note
1218     that ``align 1`` has no effect on non-byval, non-preallocated arguments.
1219
1220     Note that this attribute has additional semantics when combined with the
1221     ``byval`` or ``preallocated`` attribute, which are documented there.
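
         For example, in the sketch below (the function name is illustrative)
         the combination with ``noundef`` means that passing a pointer that is
         not 4-byte aligned is undefined behavior, rather than merely making
         ``%p`` a poison value:

         .. code-block:: llvm

             define i32 @load_aligned(i32* noundef align 4 %p) {
               %v = load i32, i32* %p, align 4
               ret i32 %v
             }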
1222
1223 .. _noalias:
1224
1225 ``noalias``
1226     This indicates that memory locations accessed via pointer values
1227     :ref:`based <pointeraliasing>` on the argument or return value are not also
1228     accessed, during the execution of the function, via pointer values not
1229     *based* on the argument or return value. This guarantee only holds for
1230     memory locations that are *modified*, by any means, during the execution of
1231     the function. The attribute on a return value also has additional semantics
1232     described below. The caller shares the responsibility with the callee for
1233     ensuring that these requirements are met.  For further details, please see
1234     the discussion of the NoAlias response in :ref:`alias analysis <Must, May,
1235     or No>`.
1236
1237     Note that this definition of ``noalias`` is intentionally similar
1238     to the definition of ``restrict`` in C99 for function arguments.
1239
1240     For function return values, C99's ``restrict`` is not meaningful,
1241     while LLVM's ``noalias`` is. Furthermore, the semantics of the ``noalias``
1242     attribute on return values are stronger than the semantics of the attribute
1243     when used on function arguments. On function return values, the ``noalias``
1244     attribute indicates that the function acts like a system memory allocation
1245     function, returning a pointer to allocated storage disjoint from the
1246     storage for any other object accessible to the caller.
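
         Both uses can be sketched as follows (the function names are
         illustrative):

         .. code-block:: llvm

             ; On arguments: memory written through %dst is not also accessed
             ; through %src during the call.
             define void @copy(i32* noalias %dst, i32* noalias %src) {
               %v = load i32, i32* %src
               store i32 %v, i32* %dst
               ret void
             }

             ; On a return value: the function behaves like an allocator.
             declare noalias i8* @my_alloc(i64)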
1247
1248 .. _nocapture:
1249
1250 ``nocapture``
1251     This indicates that the callee does not :ref:`capture <pointercapture>` the
1252     pointer. This is not a valid attribute for return values.
1253     This attribute applies only to the particular copy of the pointer passed in
1254     this argument. A caller could pass two copies of the same pointer with one
1255     being annotated ``nocapture`` and the other not, and the callee could validly
1256     capture through the non-annotated parameter.
1257
1258 .. code-block:: llvm
1259
1260     define void @f(i8* nocapture %a, i8* %b) {
1261       ; (capture %b)
1262     }
1263
1264     call void @f(i8* @glb, i8* @glb) ; well-defined
1265
1266 ``nofree``
1267     This indicates that the callee does not free the pointer argument. This is not
1268     a valid attribute for return values.
1269
1270 .. _nest:
1271
1272 ``nest``
1273     This indicates that the pointer parameter can be excised using the
1274     :ref:`trampoline intrinsics <int_trampoline>`. This is not a valid
1275     attribute for return values and can only be applied to one parameter.
1276
1277 ``returned``
1278     This indicates that the function always returns the argument as its return
1279     value. This is a hint to the optimizer and code generator used when
1280     generating the caller, allowing value propagation, tail call optimization,
1281     and omission of register saves and restores in some cases; it is not
1282     checked or enforced when generating the callee. The parameter and the
1283     function return type must be valid operands for the
1284     :ref:`bitcast instruction <i_bitcast>`. This is not a valid attribute for
1285     return values and can only be applied to one parameter.
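
    A minimal sketch, assuming a hypothetical function ``@store_and_return``
    that always hands its first argument back to the caller:

    .. code-block:: llvm

        ; @store_and_return is a hypothetical name used only for illustration.
        define i8* @store_and_return(i8* returned %dst, i8 %v) {
          store i8 %v, i8* %dst
          ret i8* %dst
        }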
1286
1287 ``nonnull``
1288     This indicates that the parameter or return pointer is not null. This
1289     attribute may only be applied to pointer-typed parameters. This is not
1290     checked or enforced by LLVM; if the parameter or return pointer is null,
1291     a :ref:`poison value <poisonvalues>` is returned or passed instead.
1292     The ``nonnull`` attribute should be combined with the ``noundef`` attribute
1293     to ensure a pointer is not null or otherwise the behavior is undefined.
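
    As an illustration, a hypothetical declaration that combines the two
    attributes, so that passing a null pointer is undefined behavior rather
    than merely producing poison:

    .. code-block:: llvm

        ; @use is a hypothetical function name.
        declare void @use(i8* nonnull noundef)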
1294
1295 ``dereferenceable(<n>)``
1296     This indicates that the parameter or return pointer is dereferenceable. This
1297     attribute may only be applied to pointer typed parameters. A pointer that
1298     is dereferenceable can be loaded from speculatively without a risk of
1299     trapping. The number of bytes known to be dereferenceable must be provided
1300     in parentheses. It is legal for the number of bytes to be less than the
1301     size of the pointee type. The ``nonnull`` attribute does not imply
1302     dereferenceability (consider a pointer to one element past the end of an
1303     array); however, ``dereferenceable(<n>)`` does imply ``nonnull`` in
1304     ``addrspace(0)`` (which is the default address space), except if the
1305     ``null_pointer_is_valid`` function attribute is present.
1306     ``n`` should be a positive number. The pointer should be well defined,
1307     otherwise it is undefined behavior. This means ``dereferenceable(<n>)``
1308     implies ``noundef``.
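
    A sketch of why this matters for speculation, using a hypothetical function
    ``@load_if``. The ``align 4`` attribute is an added assumption here, since
    an optimizer would typically also need to know that the access is
    sufficiently aligned before moving it:

    .. code-block:: llvm

        ; Because %p is known dereferenceable for 4 bytes, an optimizer could
        ; execute the load in %then speculatively (for example, hoist it into
        ; %entry) without risking a trap.
        define i32 @load_if(i32* dereferenceable(4) align 4 %p, i1 %c) {
        entry:
          br i1 %c, label %then, label %done
        then:
          %v = load i32, i32* %p, align 4
          br label %done
        done:
          %r = phi i32 [ %v, %then ], [ 0, %entry ]
          ret i32 %r
        }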
1309
1310 ``dereferenceable_or_null(<n>)``
1311     This indicates that the parameter or return value isn't both
1312     non-null and non-dereferenceable (up to ``<n>`` bytes) at the same
1313     time. All non-null pointers tagged with
1314     ``dereferenceable_or_null(<n>)`` are ``dereferenceable(<n>)``.
1315     For address space 0 ``dereferenceable_or_null(<n>)`` implies that
1316     a pointer is exactly one of ``dereferenceable(<n>)`` or ``null``,
1317     and in other address spaces ``dereferenceable_or_null(<n>)``
1318     implies that a pointer is at least one of ``dereferenceable(<n>)``
1319     or ``null`` (i.e. it may be both ``null`` and
1320     ``dereferenceable(<n>)``). This attribute may only be applied to
1321     pointer typed parameters.
1322
1323 ``swiftself``
1324     This indicates that the parameter is the self/context parameter. This is not
1325     a valid attribute for return values and can only be applied to one
1326     parameter.
1327
1328 ``swiftasync``
1329     This indicates that the parameter is the asynchronous context parameter and
1330     triggers the creation of a target-specific extended frame record to store
1331     this pointer. This is not a valid attribute for return values and can only
1332     be applied to one parameter.
1333
1334 ``swifterror``
1335     This attribute is intended to model and optimize Swift error handling. It
1336     can be applied to a parameter with pointer to pointer type or a
1337     pointer-sized alloca. At the call site, the actual argument that corresponds
1338     to a ``swifterror`` parameter has to come from a ``swifterror`` alloca or
1339     the ``swifterror`` parameter of the caller. A ``swifterror`` value (either
1340     the parameter or the alloca) can only be loaded from and stored to, or used as
1341     a ``swifterror`` argument. This is not a valid attribute for return values
1342     and can only be applied to one parameter.
1343
1344     These constraints allow the calling convention to optimize access to
1345     ``swifterror`` variables by associating them with a specific register at
1346     call boundaries rather than placing them in memory. Since this does change
1347     the calling convention, a function which uses the ``swifterror`` attribute
1348     on a parameter is not ABI-compatible with one which does not.
1349
1350     These constraints also allow LLVM to assume that a ``swifterror`` argument
1351     does not alias any other memory visible within a function and that a
1352     ``swifterror`` alloca passed as an argument does not escape.
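
    The constraints above can be sketched as follows. The function names
    ``@may_fail`` and ``@caller`` are hypothetical, and the calling convention
    is left unspecified for brevity:

    .. code-block:: llvm

        ; Hypothetical callee taking a swifterror parameter.
        declare void @may_fail(i8** swifterror)

        define void @caller() {
          ; The argument must come from a swifterror alloca (or from the
          ; caller's own swifterror parameter).
          %err = alloca swifterror i8*
          store i8* null, i8** %err
          call void @may_fail(i8** swifterror %err)
          ; The swifterror value is only loaded from, stored to, or passed
          ; as a swifterror argument.
          %e = load i8*, i8** %err
          ret void
        }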
1353
1354 ``immarg``
1355     This indicates the parameter is required to be an immediate
1356     value. This must be a trivial immediate integer or floating-point
1357     constant. Undef or constant expressions are not valid. This is
1358     only valid on intrinsic declarations and cannot be applied to a
1359     call site or arbitrary function.
1360
1361 ``noundef``
1362     This attribute applies to parameters and return values. If the value
1363     representation contains any undefined or poison bits, the behavior is
1364     undefined. Note that this does not refer to padding introduced by the
1365     type's storage representation.
1366
1367 ``alignstack(<n>)``
1368     This indicates the alignment that should be considered by the backend when
1369     assigning this parameter to a stack slot during calling convention
1370     lowering. The enforcement of the specified alignment is target-dependent,
1371     as target-specific calling convention rules may override this value. This
1372     attribute serves the purpose of carrying language specific alignment
1373     information that is not mapped to base types in the backend (for example,
1374     over-alignment specification through language attributes).
1375
1376 .. _gc:
1377
1378 Garbage Collector Strategy Names
1379 --------------------------------
1380
1381 Each function may specify a garbage collector strategy name, which is simply a
1382 string:
1383
1384 .. code-block:: llvm
1385
1386     define void @f() gc "name" { ... }
1387
1388 The supported values of *name* include those :ref:`built in to LLVM
1389 <builtin-gc-strategies>` and any provided by loaded plugins. Specifying a GC
1390 strategy will cause the compiler to alter its output in order to support the
1391 named garbage collection algorithm. Note that LLVM itself does not contain a
1392 garbage collector; this functionality is restricted to generating machine code
1393 which can interoperate with a collector provided externally.
1394
1395 .. _prefixdata:
1396
1397 Prefix Data
1398 -----------
1399
1400 Prefix data is data associated with a function which the code
1401 generator will emit immediately before the function's entrypoint.
1402 The purpose of this feature is to allow frontends to associate
1403 language-specific runtime metadata with specific functions and make it
1404 available through the function pointer while still allowing the
1405 function pointer to be called.
1406
1407 To access the data for a given function, a program may bitcast the
1408 function pointer to a pointer to the constant's type and dereference
1409 index -1. This implies that the IR symbol points just past the end of
1410 the prefix data. For instance, take the example of a function annotated
1411 with a single ``i32``:
1412
1413 .. code-block:: llvm
1414
1415     define void @f() prefix i32 123 { ... }
1416
1417 The prefix data can be referenced as:
1418
1419 .. code-block:: llvm
1420
1421     %0 = bitcast void ()* @f to i32*
1422     %a = getelementptr inbounds i32, i32* %0, i32 -1
1423     %b = load i32, i32* %a
1424
1425 Prefix data is laid out as if it were an initializer for a global variable
1426 of the prefix data's type. The function will be placed such that the
1427 beginning of the prefix data is aligned. This means that if the size
1428 of the prefix data is not a multiple of the alignment size, the
1429 function's entrypoint will not be aligned. If alignment of the
1430 function's entrypoint is desired, padding must be added to the prefix
1431 data.
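
As a sketch of the padding approach, assume the entry point should be 16-byte
aligned (the alignment value and the struct layout below are illustrative
assumptions). Placing the padding first keeps the ``i32`` immediately before
the entry point, so it is still found at index -1:

.. code-block:: llvm

    ; 12 bytes of padding + 4 bytes of data = 16 bytes, a multiple of the
    ; assumed 16-byte alignment.
    define void @f() align 16 prefix <{ [12 x i8], i32 }> <{ [12 x i8] zeroinitializer, i32 123 }> { ... }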
1432
1433 A function may have prefix data but no body. This has similar semantics
1434 to the ``available_externally`` linkage in that the data may be used by the
1435 optimizers but will not be emitted in the object file.
1436
1437 .. _prologuedata:
1438
1439 Prologue Data
1440 -------------
1441
1442 The ``prologue`` attribute allows arbitrary code (encoded as bytes) to
1443 be inserted prior to the function body. This can be used for enabling
1444 function hot-patching and instrumentation.
1445
1446 To maintain the semantics of ordinary function calls, the prologue data must
1447 have a particular format. Specifically, it must begin with a sequence of
1448 bytes which decode to a sequence of machine instructions, valid for the
1449 module's target, which transfer control to the point immediately succeeding
1450 the prologue data, without performing any other visible action. This allows
1451 the inliner and other passes to reason about the semantics of the function
1452 definition without needing to reason about the prologue data. Obviously this
1453 makes the format of the prologue data highly target dependent.
1454
1455 A trivial example of valid prologue data for the x86 architecture is ``i8 144``,
1456 which encodes the ``nop`` instruction:
1457
1458 .. code-block:: text
1459
1460     define void @f() prologue i8 144 { ... }
1461
1462 Generally prologue data can be formed by encoding a relative branch instruction
1463 which skips the metadata, as in this example of valid prologue data for the
1464 x86_64 architecture, where the first two bytes encode ``jmp .+10``:
1465
1466 .. code-block:: text
1467
1468     %0 = type <{ i8, i8, i8* }>
1469
1470     define void @f() prologue %0 <{ i8 235, i8 8, i8* @md}> { ... }
1471
1472 A function may have prologue data but no body. This has similar semantics
1473 to the ``available_externally`` linkage in that the data may be used by the
1474 optimizers but will not be emitted in the object file.
1475
1476 .. _personalityfn:
1477
1478 Personality Function
1479 --------------------
1480
1481 The ``personality`` attribute permits functions to specify what function
1482 to use for exception handling.
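
For example, C++ code compiled for the Itanium ABI commonly names
``__gxx_personality_v0``. The sketch below assumes that this runtime function
is provided at link time:

.. code-block:: llvm

    declare i32 @__gxx_personality_v0(...)

    define void @f() personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*) { ... }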
1483
1484 .. _attrgrp:
1485
1486 Attribute Groups
1487 ----------------
1488
1489 Attribute groups are groups of attributes that are referenced by objects within
1490 the IR. They are important for keeping ``.ll`` files readable, because a lot of
1491 functions will use the same set of attributes. In the degenerate case of a
1492 ``.ll`` file that corresponds to a single ``.c`` file, the single attribute
1493 group will capture the important command line flags used to build that file.
1494
1495 An attribute group is a module-level object. To use an attribute group, an
1496 object references the attribute group's ID (e.g. ``#37``). An object may refer
1497 to more than one attribute group. In that situation, the attributes from the
1498 different groups are merged.
1499
1500 Here is an example of attribute groups for a function that should always be
1501 inlined, has a stack alignment of 4, and which shouldn't use SSE instructions:
1502
1503 .. code-block:: llvm
1504
1505    ; Target-independent attributes:
1506    attributes #0 = { alwaysinline alignstack=4 }
1507
1508    ; Target-dependent attributes:
1509    attributes #1 = { "no-sse" }
1510
1511    ; Function @f has attributes: alwaysinline, alignstack=4, and "no-sse".
1512    define void @f() #0 #1 { ... }
1513
1514 .. _fnattrs:
1515
1516 Function Attributes
1517 -------------------
1518
1519 Function attributes are set to communicate additional information about
1520 a function. Function attributes are considered to be part of the
1521 function, not of the function type, so functions with different function
1522 attributes can have the same function type.
1523
1524 Function attributes are simple keywords that follow the type specified.
1525 If multiple attributes are needed, they are space separated. For
1526 example:
1527
1528 .. code-block:: llvm
1529
1530     define void @f() noinline { ... }
1531     define void @f() alwaysinline { ... }
1532     define void @f() alwaysinline optsize { ... }
1533     define void @f() optsize { ... }
1534
1535 ``alignstack(<n>)``
1536     This attribute indicates that, when emitting the prologue and
1537     epilogue, the backend should forcibly align the stack pointer.
1538     Specify the desired alignment, which must be a power of two, in
1539     parentheses.
1540 ``allocsize(<EltSizeParam>[, <NumEltsParam>])``
1541     This attribute indicates that the annotated function will always return at
1542     least a given number of bytes (or null). Its arguments are zero-indexed
1543     parameter numbers; if one argument is provided, then it's assumed that at
1544     least ``CallSite.Args[EltSizeParam]`` bytes will be available at the
1545     returned pointer. If two are provided, then it's assumed that
1546     ``CallSite.Args[EltSizeParam] * CallSite.Args[NumEltsParam]`` bytes are
1547     available. The referenced parameters must be integer types. No assumptions
1548     are made about the contents of the returned block of memory.
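
    A sketch with two hypothetical allocator declarations (``@my_malloc`` and
    ``@my_calloc`` are illustrative names, not functions known to LLVM):

    .. code-block:: llvm

        ; One index: the returned block has at least as many bytes as the
        ; value passed for parameter 0.
        declare i8* @my_malloc(i64) allocsize(0)

        ; Two indices: the size is the product of parameters 0 and 1.
        declare i8* @my_calloc(i64, i64) allocsize(0, 1)

    With these declarations, ``call i8* @my_calloc(i64 4, i64 8)`` may be
    assumed to return either null or a pointer to at least 32 bytes.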
1549 ``alwaysinline``
1550     This attribute indicates that the inliner should attempt to inline
1551     this function into callers whenever possible, ignoring any active
1552     inlining size threshold for this caller.
1553 ``builtin``
1554     This indicates that the callee function at a call site should be
1555     recognized as a built-in function, even though the function's declaration
1556     uses the ``nobuiltin`` attribute. This is only valid at call sites for
1557     direct calls to functions that are declared with the ``nobuiltin``
1558     attribute.
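
    As a sketch of how ``builtin`` and ``nobuiltin`` interact, the example
    below uses the Itanium-mangled name of C++ ``operator new(unsigned long)``;
    the function ``@make`` is a hypothetical caller:

    .. code-block:: llvm

        ; The declaration opts out of built-in treatment by default.
        declare noalias i8* @_Znwm(i64) nobuiltin

        define i8* @make() {
          ; This particular direct call opts back in.
          %p = call noalias i8* @_Znwm(i64 4) builtin
          ret i8* %p
        }

    A call to ``@_Znwm`` without the ``builtin`` attribute would be retained as
    an ordinary call.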
1559 ``cold``
1560     This attribute indicates that this function is rarely called. When
1561     computing edge weights, basic blocks post-dominated by a cold
1562     function call are also considered to be cold and are thus given low
1563     weight.
1564 ``convergent``
1565     In some parallel execution models, there exist operations that cannot be
1566     made control-dependent on any additional values.  We call such operations
1567     ``convergent``, and mark them with this attribute.
1568
1569     The ``convergent`` attribute may appear on functions or call/invoke
1570     instructions.  When it appears on a function, it indicates that calls to
1571     this function should not be made control-dependent on additional values.
1572     For example, the intrinsic ``llvm.nvvm.barrier0`` is ``convergent``, so
1573     calls to this intrinsic cannot be made control-dependent on additional
1574     values.
1575
1576     When it appears on a call/invoke, the ``convergent`` attribute indicates
1577     that we should treat the call as though we're calling a convergent
1578     function.  This is particularly useful on indirect calls; without this we
1579     may treat such calls as though the target is non-convergent.
1580
1581     The optimizer may remove the ``convergent`` attribute on functions when it
1582     can prove that the function does not execute any convergent operations.
1583     Similarly, the optimizer may remove ``convergent`` on calls/invokes when it
1584     can prove that the call/invoke cannot call a convergent function.
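
    A sketch of both placements; the function ``@run`` and its function-pointer
    parameter are hypothetical:

    .. code-block:: llvm

        declare void @llvm.nvvm.barrier0()

        define void @run(void ()* %fp) {
          ; Direct call to an intrinsic that is itself convergent.
          call void @llvm.nvvm.barrier0()
          ; Indirect call: the target is unknown, so the call site is marked.
          call void %fp() convergent
          ret void
        }

    In this sketch, neither call may be made control-dependent on additional
    values.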
1585 ``disable_sanitizer_instrumentation``
1586     When instrumenting code with sanitizers, it can be important to skip certain
1587     functions to ensure no instrumentation is applied to them.
1588
1589     This attribute is not always similar to absent ``sanitize_<name>``
1590     attributes: depending on the specific sanitizer, code can be inserted into
1591     functions regardless of the ``sanitize_<name>`` attribute to prevent false
1592     positive reports.
1593
1594     ``disable_sanitizer_instrumentation`` disables all kinds of instrumentation,
1595     taking precedence over the ``sanitize_<name>`` attributes and other compiler
1596     flags.
1597
1598 ``"frame-pointer"``
1599     This attribute tells the code generator whether the function
1600     should keep the frame pointer. The code generator may emit the frame pointer
1601     even if this attribute says the frame pointer can be eliminated.
1602     The allowed string values are:
1603
1604      * ``"none"`` (default) - the frame pointer can be eliminated.
1605      * ``"non-leaf"`` - the frame pointer should be kept if the function calls
1606        other functions.
1607      * ``"all"`` - the frame pointer should be kept.
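
    A sketch of the syntax, using a hypothetical function ``@f``:

    .. code-block:: llvm

        ; Keep the frame pointer only if @f calls other functions.
        define void @f() "frame-pointer"="non-leaf" { ... }

    Like other string attributes, it is written as a quoted key followed by
    ``=`` and a quoted value.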
1608 ``hot``
1609     This attribute indicates that this function is a hot spot of the program
1610     execution. The function will be optimized more aggressively and will be
1611     placed into a special subsection of the text section to improve locality.
1612
1613     When profile feedback is enabled, this attribute takes precedence over
1614     the profile information. By marking a function ``hot``, users can work
1615     around the cases where the training input does not have good coverage
1616     on all the hot functions.
1617 ``inaccessiblememonly``
1618     This attribute indicates that the function may only access memory that
1619     is not accessible by the module being compiled. This is a weaker form
1620     of ``readnone``. If the function reads or writes other memory, the
1621     behavior is undefined.
1622 ``inaccessiblemem_or_argmemonly``
1623     This attribute indicates that the function may only access memory that is
1624     either not accessible by the module being compiled, or is pointed to
1625     by its pointer arguments. This is a weaker form of ``argmemonly``. If the
1626     function reads or writes other memory, the behavior is undefined.
1627 ``inlinehint``
1628     This attribute indicates that the source code contained a hint that
1629     inlining this function is desirable (such as the "inline" keyword in
1630     C/C++). It is just a hint; it imposes no requirements on the
1631     inliner.
1632 ``jumptable``
1633     This attribute indicates that the function should be added to a
1634     jump-instruction table at code-generation time, and that all address-taken
1635     references to this function should be replaced with a reference to the
1636     appropriate jump-instruction-table function pointer. Note that this creates
1637     a new pointer for the original function, which means that code that depends
1638     on function-pointer identity can break. So, any function annotated with
1639     ``jumptable`` must also be ``unnamed_addr``.
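
    A sketch of the required pairing, using a hypothetical function
    ``@handler``:

    .. code-block:: llvm

        ; unnamed_addr states that the address of @handler is not significant,
        ; which is what makes the jump-table indirection acceptable.
        define void @handler() unnamed_addr jumptable {
          ret void
        }

    Omitting ``unnamed_addr`` here would violate the requirement stated above.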
1640 ``minsize``
1641     This attribute suggests that optimization passes and code generator
1642     passes make choices that keep the code size of this function as small
1643     as possible and perform optimizations that may sacrifice runtime
1644     performance in order to minimize the size of the generated code.
1645 ``naked``
1646     This attribute disables prologue / epilogue emission for the
1647     function. This can have very system-specific consequences.
1648 ``"no-inline-line-tables"``
1649     When this attribute is set to true, the inliner discards source locations
1650     when inlining code and instead uses the source location of the call site.
1651     Breakpoints set on code that was inlined into the current function will
1652     not fire during the execution of the inlined call sites. If the debugger
1653     stops inside an inlined call site, it will appear to be stopped at the
1654     outermost inlined call site.
1655 ``"no-jump-tables"``
1656     When this attribute is set to true, the jump tables and lookup tables that
1657     can be generated from a switch case lowering are disabled.
1658 ``nobuiltin``
1659     This indicates that the callee function at a call site is not recognized as
1660     a built-in function. LLVM will retain the original call and not replace it
1661     with equivalent code based on the semantics of the built-in function, unless
1662     the call site uses the ``builtin`` attribute. This is valid at call sites
1663     and on function declarations and definitions.
1664 ``noduplicate``
1665     This attribute indicates that calls to the function cannot be
1666     duplicated. A call to a ``noduplicate`` function may be moved
1667     within its parent function, but may not be duplicated within
1668     its parent function.
1669
1670     A function containing a ``noduplicate`` call may still
1671     be an inlining candidate, provided that the call is not
1672     duplicated by inlining. That implies that the function has
1673     internal linkage and only has one call site, so the original
1674     call is dead after inlining.
1675 ``nofree``
1676     This function attribute indicates that the function does not, directly or
1677     transitively, call a memory-deallocation function (``free``, for example)
1678     on a memory allocation which existed before the call.
1679
1680     As a result, uncaptured pointers that are known to be dereferenceable
1681     prior to a call to a function with the ``nofree`` attribute are still
1682     known to be dereferenceable after the call. The capturing condition is
1683     necessary in environments where the function might communicate the
1684     pointer to another thread which then deallocates the memory.  Alternatively,
1685     ``nosync`` would ensure such communication cannot happen and even captured
1686     pointers cannot be freed by the function.
1687
1688     A ``nofree`` function is explicitly allowed to free memory which it
1689     allocated or (if not ``nosync``) arrange for another thread to free
1690     memory on its behalf.  As a result, perhaps surprisingly, a ``nofree``
1691     function can return a pointer to a previously deallocated memory object.
1692 ``noimplicitfloat``
1693     Disallows implicit floating-point code. This inhibits optimizations that
1694     use floating-point code and floating-point/SIMD/vector registers for
1695     operations that are not nominally floating-point. LLVM instructions that
1696     perform floating-point operations or require access to floating-point
1697     registers may still cause floating-point code to be generated.
1698 ``noinline``
1699     This attribute indicates that the inliner should never inline this
1700     function in any situation. This attribute may not be used together
1701     with the ``alwaysinline`` attribute.
1702 ``nomerge``
1703     This attribute indicates that calls to this function should never be merged
1704     during optimization. For example, it will prevent tail merging otherwise
1705     identical code sequences that raise an exception or terminate the program.
1706     Tail merging normally reduces the precision of source location information,
1707     making stack traces less useful for debugging. This attribute gives the
1708     user control over the tradeoff between code size and debug information
1709     precision.
1710 ``nonlazybind``
1711     This attribute suppresses lazy symbol binding for the function. This
1712     may make calls to the function faster, at the cost of extra program
1713     startup time if the function is not called during program startup.
1714 ``noprofile``
1715     This function attribute prevents instrumentation based profiling, used for
1716     coverage or profile based optimization, from being added to a function,
1717     even when inlined.
1718 ``noredzone``
1719     This attribute indicates that the code generator should not use a
1720     red zone, even if the target-specific ABI normally permits it.
1721 ``"indirect-tls-seg-refs"``
1722     This attribute indicates that the code generator should not use
1723     direct TLS access through segment registers, even if the
1724     target-specific ABI normally permits it.
1725 ``noreturn``
1726     This function attribute indicates that the function never returns
1727     normally, that is, through a return instruction. This produces undefined
1728     behavior at runtime if the function ever does dynamically return. Annotated
1729     functions may still raise an exception, i.e., ``nounwind`` is not implied.
1730 ``norecurse``
1731     This function attribute indicates that the function does not call itself
1732     either directly or indirectly down any possible call path. This produces
1733     undefined behavior at runtime if the function ever does recurse.
1734 ``willreturn``
1735     This function attribute indicates that a call of this function will
1736     either exhibit undefined behavior or come back and continue execution
1737     at a point in the existing call stack that includes the current invocation.
1738     Annotated functions may still raise an exception, i.e., ``nounwind`` is not implied.
1739     If an invocation of an annotated function does not return control back
1740     to a point in the call stack, the behavior is undefined.
1741 ``nosync``
1742     This function attribute indicates that the function does not communicate
1743     (synchronize) with another thread through memory or other well-defined means.
1744     Synchronization is considered possible in the presence of ``atomic`` accesses
1745     that enforce an order (thus not "unordered" or "monotonic"), ``volatile``
1746     accesses, and ``convergent`` function calls. Note that ``convergent``
1747     function calls allow non-memory communication, e.g., cross-lane operations,
1748     which is also considered synchronization. However, ``convergent`` does not
1749     contradict ``nosync``. If an annotated function does ever synchronize with
1750     another thread, the behavior is undefined.
1751 ``nounwind``
1752     This function attribute indicates that the function never raises an
1753     exception. If the function does raise an exception, its runtime
1754     behavior is undefined. However, functions marked ``nounwind`` may still
1755     trap or generate asynchronous exceptions. Exception handling schemes
1756     that are recognized by LLVM to handle asynchronous exceptions, such
1757     as SEH, will still provide their implementation defined semantics.
1758 ``nosanitize_coverage``
1759     This attribute indicates that SanitizerCoverage instrumentation is disabled
1760     for this function.
1761 ``null_pointer_is_valid``
1762    If ``null_pointer_is_valid`` is set, then the ``null`` address
1763    in address-space 0 is considered to be a valid address for memory loads and
1764    stores. Any analysis or optimization should not treat dereferencing a
1765    pointer to ``null`` as undefined behavior in this function.
1766    Note: Comparing the address of a global variable to ``null`` may still
1767    evaluate to false because of a limitation in querying this attribute inside
1768    constant expressions.
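
   A minimal sketch, using a hypothetical function ``@read_address_zero``:

   .. code-block:: llvm

       ; With the attribute, the load from null in address space 0 is an
       ; ordinary memory access rather than undefined behavior.
       define i32 @read_address_zero() null_pointer_is_valid {
         %v = load i32, i32* null
         ret i32 %v
       }

   Without the attribute, an optimizer would be free to treat this load as
   undefined behavior.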
1769 ``optforfuzzing``
1770     This attribute indicates that this function should be optimized
1771     for maximum fuzzing signal.
1772 ``optnone``
1773     This function attribute indicates that most optimization passes will skip
1774     this function, with the exception of interprocedural optimization passes.
1775     Code generation defaults to the "fast" instruction selector.
1776     This attribute cannot be used together with the ``alwaysinline``
1777     attribute; this attribute is also incompatible
1778     with the ``minsize`` attribute and the ``optsize`` attribute.
1779
1780     This attribute requires the ``noinline`` attribute to be specified on
1781     the function as well, so the function is never inlined into any caller.
1782     Only functions with the ``alwaysinline`` attribute are valid
1783     candidates for inlining into the body of this function.
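
    A sketch of the required combination, using a hypothetical function
    ``@debug_me``:

    .. code-block:: llvm

        ; optnone must be accompanied by noinline.
        define void @debug_me() optnone noinline {
          ret void
        }

    Adding ``alwaysinline``, ``minsize`` or ``optsize`` to this definition
    would conflict with the rules above.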
1784 ``optsize``
1785     This attribute suggests that optimization passes and code generator
1786     passes make choices that keep the code size of this function low,
1787     and otherwise do optimizations specifically to reduce code size as
1788     long as they do not significantly impact runtime performance.
1789 ``"patchable-function"``
1790     This attribute tells the code generator that the code
1791     generated for this function needs to follow certain conventions that
1792     make it possible for a runtime function to patch over it later.
1793     The exact effect of this attribute depends on its string value,
1794     for which there currently is one legal possibility:
1795
1796      * ``"prologue-short-redirect"`` - This style of patchable
1797        function is intended to support patching a function prologue to
1798        redirect control away from the function in a thread safe
1799        manner.  It guarantees that the first instruction of the
1800        function will be large enough to accommodate a short jump
1801        instruction, and will be sufficiently aligned to allow being
1802        fully changed via an atomic compare-and-swap instruction.
1803        While the first requirement can be satisfied by inserting a large
1804        enough NOP, LLVM can and will try to re-purpose an existing
1805        instruction (i.e. one that would have to be emitted anyway) as
1806        the patchable instruction larger than a short jump.
1807
1808        ``"prologue-short-redirect"`` is currently only supported on
1809        x86-64.
1810
1811     This attribute by itself does not imply restrictions on
1812     inter-procedural optimizations.  All of the semantic effects the
1813     patching may have must be separately conveyed via the linkage type.
1814 ``"probe-stack"``
1815     This attribute indicates that the function will trigger a guard region
1816     at the end of the stack. It ensures that accesses to the stack must be
1817     no further apart than the size of the guard region to a previous
1818     access of the stack. It takes one required string value, the name of
1819     the stack probing function that will be called.
1820
1821     If a function that has a ``"probe-stack"`` attribute is inlined into
1822     a function with another ``"probe-stack"`` attribute, the resulting
1823     function has the ``"probe-stack"`` attribute of the caller. If a
1824     function that has a ``"probe-stack"`` attribute is inlined into a
1825     function that has no ``"probe-stack"`` attribute at all, the resulting
1826     function has the ``"probe-stack"`` attribute of the callee.
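
    A sketch of the syntax. The function ``@big_frame`` and the probe function
    name ``__probestack`` are placeholders chosen for illustration; the actual
    name depends on the runtime in use:

    .. code-block:: llvm

        ; The string value names the stack probing function to call.
        define void @big_frame() "probe-stack"="__probestack" { ... }

    If ``@big_frame`` were inlined into a caller with no ``"probe-stack"``
    attribute, the resulting function would carry
    ``"probe-stack"="__probestack"``, following the rule above.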
1827 ``readnone``
1828     On a function, this attribute indicates that the function computes its
1829     result (or decides to unwind an exception) based strictly on its arguments,
1830     without dereferencing any pointer arguments or otherwise accessing
1831     any mutable state (e.g. memory, control registers, etc) visible to
1832     caller functions. It does not write through any pointer arguments
1833     (including ``byval`` arguments) and never changes any state visible
1834     to callers. This means while it cannot unwind exceptions by calling
1835     the ``C++`` exception throwing methods (since they write to memory), there may
1836     be non-``C++`` mechanisms that throw exceptions without writing to LLVM
1837     visible memory.
1838
1839     On an argument, this attribute indicates that the function does not
1840     dereference that pointer argument, even though it may read or write the
1841     memory that the pointer points to if accessed through other pointers.
1842
1843     If a ``readnone`` function reads or writes memory visible to the program, or
1844     has other side-effects, the behavior is undefined. If a function reads from
1845     or writes to a readnone pointer argument, the behavior is undefined.
1846 ``readonly``
1847     On a function, this attribute indicates that the function does not write
1848     through any pointer arguments (including ``byval`` arguments) or otherwise
1849     modify any state (e.g. memory, control registers, etc) visible to
1850     caller functions. It may dereference pointer arguments and read
1851     state that may be set in the caller. A readonly function always
1852     returns the same value (or unwinds an exception identically) when
1853     called with the same set of arguments and global state.  This means while it
1854     cannot unwind exceptions by calling the ``C++`` exception throwing methods
1855     (since they write to memory), there may be non-``C++`` mechanisms that throw
1856     exceptions without writing to LLVM visible memory.
1857
1858     On an argument, this attribute indicates that the function does not write
1859     through this pointer argument, even though it may write to the memory that
1860     the pointer points to.
1861
1862     If a readonly function writes memory visible to the program, or
1863     has other side-effects, the behavior is undefined. If a function writes to
1864     a readonly pointer argument, the behavior is undefined.
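
    As an illustrative sketch of ``readonly`` next to ``readnone`` (the
    functions ``@hash``, ``@length`` and ``@twice`` are hypothetical and not
    part of LLVM), the declarations below use the attributes on functions and
    on a pointer argument:

    .. code-block:: llvm

        ; Computes its result from %x alone and accesses no memory.
        declare i32 @hash(i32 %x) readnone

        ; May read through %s, but never writes through it or to any other
        ; caller-visible state.
        declare i64 @length(i8* readonly %s) readonly

        define i32 @twice(i32 %x) {
          ; Both calls pass the same argument to a readnone function, so
          ; they compute the same value.
          %a = call i32 @hash(i32 %x)
          %b = call i32 @hash(i32 %x)
          %sum = add i32 %a, %b
          ret i32 %sum
        }

    Under these declarations a call to ``@length`` cannot modify memory
    visible to its caller, so values loaded before the call remain valid
    after it.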
1865 ``"stack-probe-size"``
1866     This attribute controls the behavior of stack probes: either
1867     the ``"probe-stack"`` attribute, or ABI-required stack probes, if any.
1868     It defines the size of the guard region. It ensures that if the function
1869     may use more stack space than the size of the guard region, a stack
1870     probing sequence will be emitted. It takes one required integer value,
1871     which is 4096 by default.
1872
1873     If a function that has a ``"stack-probe-size"`` attribute is inlined into
1874     a function with another ``"stack-probe-size"`` attribute, the resulting
1875     function has the ``"stack-probe-size"`` attribute that has the lower
1876     numeric value. If a function that has a ``"stack-probe-size"`` attribute is
1877     inlined into a function that has no ``"stack-probe-size"`` attribute
1878     at all, the resulting function has the ``"stack-probe-size"`` attribute
1879     of the callee.
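
    A minimal sketch of how this attribute might be combined with
    ``"probe-stack"``. The probe function name ``__my_probestack`` and the
    function ``@big_frame`` are hypothetical, and whether a probing sequence
    is actually emitted depends on the target:

    .. code-block:: llvm

        define void @big_frame() #0 {
          ; This 16384-byte local is larger than the 8192-byte guard
          ; region configured below.
          %buf = alloca [16384 x i8]
          ret void
        }

        attributes #0 = { "probe-stack"="__my_probestack"
                          "stack-probe-size"="8192" }

    If ``@big_frame`` were inlined into a caller carrying
    ``"stack-probe-size"="4096"``, the rule above would leave the resulting
    function with the lower value, 4096.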
1880 ``"no-stack-arg-probe"``
1881     This attribute disables ABI-required stack probes, if any.
1882 ``writeonly``
1883     On a function, this attribute indicates that the function may write to but
1884     does not read from memory.
1885
1886     On an argument, this attribute indicates that the function may write to but
1887     does not read through this pointer argument (even though it may read from
1888     the memory that the pointer points to).
1889
1890     If a writeonly function reads memory visible to the program, or
1891     has other side-effects, the behavior is undefined. If a function reads
1892     from a writeonly pointer argument, the behavior is undefined.
1893 ``argmemonly``
1894     This attribute indicates that the only memory accesses inside the function
1895     are loads from and stores to objects pointed to by its pointer-typed
1896     arguments, with arbitrary offsets. In other words, all memory operations in
1897     the function can refer to memory only using pointers based on its function
1898     arguments.
1899
1900     Note that ``argmemonly`` can be used together with the ``readonly``
1901     attribute to specify that the function only reads from its arguments.
1902
1903     If an argmemonly function reads or writes memory other than the pointer
1904     arguments, or has other side-effects, the behavior is undefined.
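
    The following sketch shows the kind of reasoning ``argmemonly`` permits.
    The names ``@fill``, ``@counter`` and ``@caller`` are hypothetical:

    .. code-block:: llvm

        ; Accesses memory only through pointers based on %p.
        declare void @fill(i32* %p, i64 %n) argmemonly

        @counter = global i32 0

        define i32 @caller() {
          %buf = alloca [4 x i32]
          %p = getelementptr inbounds [4 x i32], [4 x i32]* %buf, i64 0, i64 0
          store i32 1, i32* @counter
          call void @fill(i32* %p, i64 4)
          ; %p points into a local alloca, which cannot alias @counter, so
          ; the call cannot have changed @counter and this load may be
          ; replaced by the stored value 1.
          %c = load i32, i32* @counter
          ret i32 %c
        }

    Declaring ``@fill`` as ``argmemonly readonly`` instead would further
    state that it never writes through ``%p``.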
1905 ``returns_twice``
1906     This attribute indicates that this function can return twice. The C
1907     ``setjmp`` is an example of such a function. The compiler disables
1908     some optimizations (like tail calls) in the caller of these
1909     functions.
1910 ``safestack``
1911     This attribute indicates that
1912     `SafeStack <https://clang.llvm.org/docs/SafeStack.html>`_
1913     protection is enabled for this function.
1914
1915     If a function that has a ``safestack`` attribute is inlined into a
1916     function that doesn't have a ``safestack`` attribute or which has an
1917     ``ssp``, ``sspstrong`` or ``sspreq`` attribute, then the resulting
1918     function will have a ``safestack`` attribute.
1919 ``sanitize_address``
1920     This attribute indicates that AddressSanitizer checks
1921     (dynamic address safety analysis) are enabled for this function.
1922 ``sanitize_memory``
1923     This attribute indicates that MemorySanitizer checks (dynamic detection
1924     of accesses to uninitialized memory) are enabled for this function.
1925 ``sanitize_thread``
1926     This attribute indicates that ThreadSanitizer checks
1927     (dynamic thread safety analysis) are enabled for this function.
1928 ``sanitize_hwaddress``
1929     This attribute indicates that HWAddressSanitizer checks
1930     (dynamic address safety analysis based on tagged pointers) are enabled for
1931     this function.
1932 ``sanitize_memtag``
1933     This attribute indicates that MemTagSanitizer checks
1934     (dynamic address safety analysis based on Armv8 MTE) are enabled for
1935     this function.
1936 ``speculative_load_hardening``
1937     This attribute indicates that
1938     `Speculative Load Hardening <https://llvm.org/docs/SpeculativeLoadHardening.html>`_
1939     should be enabled for the function body.
1940
1941     Speculative Load Hardening is a best-effort mitigation against
1942     information leak attacks that make use of control flow
1943     misspeculation, specifically misspeculation of whether a branch
1944     is taken or not. Typically vulnerabilities enabling such attacks are
1945     classified as "Spectre variant #1". Notably, this does not attempt to
1946     mitigate misspeculation of the branch target, classified as
1947     "Spectre variant #2" vulnerabilities.
1948
1949     When inlining, the attribute is sticky. Inlining a function that carries
1950     this attribute will cause the caller to gain the attribute. This is intended
1951     to provide a maximally conservative model where the code in a function
1952     annotated with this attribute will always (even after inlining) end up
1953     hardened.
1954 ``speculatable``
1955     This function attribute indicates that the function does not have any
1956     effects besides calculating its result and does not have undefined behavior.
1957     Note that ``speculatable`` is not enough to conclude that along any
1958     particular execution path the number of calls to this function will not be
1959     externally observable. This attribute is only valid on functions
1960     and declarations, not on individual call sites. If a function is
1961     incorrectly marked as speculatable and really does exhibit
1962     undefined behavior, the undefined behavior may be observed even
1963     if the call site is dead code.
1964
1965 ``ssp``
1966     This attribute indicates that the function should emit a stack
1967     smashing protector. It is in the form of a "canary" --- a random value
1968     placed on the stack before the local variables that's checked upon
1969     return from the function to see if it has been overwritten. A
1970     heuristic is used to determine if a function needs stack protectors
1971     or not. The heuristic used will enable protectors for functions with:
1972
1973     - Character arrays larger than ``ssp-buffer-size`` (default 8).
1974     - Aggregates containing character arrays larger than ``ssp-buffer-size``.
1975     - Calls to alloca() with variable sizes or constant sizes greater than
1976       ``ssp-buffer-size``.
1977
1978     Variables that are identified as requiring a protector will be arranged
1979     on the stack such that they are adjacent to the stack protector guard.
1980
1981     A function with the ``ssp`` attribute but without the ``alwaysinline``
1982     attribute cannot be inlined into a function without a
1983     ``ssp/sspreq/sspstrong`` attribute. If inlined, the caller will get the
1984     ``ssp`` attribute. ``call``, ``invoke``, and ``callbr`` instructions with
1985     the ``alwaysinline`` attribute force inlining.
1986 ``sspstrong``
1987     This attribute indicates that the function should emit a stack smashing
1988     protector. This attribute causes a strong heuristic to be used when
1989     determining if a function needs stack protectors. The strong heuristic
1990     will enable protectors for functions with:
1991
1992     - Arrays of any size and type
1993     - Aggregates containing an array of any size and type.
1994     - Calls to alloca().
1995     - Local variables that have had their address taken.
1996
1997     Variables that are identified as requiring a protector will be arranged
1998     on the stack such that they are adjacent to the stack protector guard.
1999     The specific layout rules are:
2000
2001     #. Large arrays and structures containing large arrays
2002        (``>= ssp-buffer-size``) are closest to the stack protector.
2003     #. Small arrays and structures containing small arrays
2004        (``< ssp-buffer-size``) are 2nd closest to the protector.
2005     #. Variables that have had their address taken are 3rd closest to the
2006        protector.
2007
2008     This overrides the ``ssp`` function attribute.
2009
2010     A function with the ``sspstrong`` attribute but without the
2011     ``alwaysinline`` attribute cannot be inlined into a function without a
2012     ``ssp/sspstrong/sspreq`` attribute. If inlined, the caller will get the
2013     ``sspstrong`` attribute unless the ``sspreq`` attribute exists.  ``call``,
2014     ``invoke``, and ``callbr`` instructions with the ``alwaysinline`` attribute
2015     force inlining.
2016 ``sspreq``
2017     This attribute indicates that the function should *always* emit a stack
2018     smashing protector. This overrides the ``ssp`` and ``sspstrong`` function
2019     attributes.
2020
2021     Variables that are identified as requiring a protector will be arranged
2022     on the stack such that they are adjacent to the stack protector guard.
2023     The specific layout rules are:
2024
2025     #. Large arrays and structures containing large arrays
2026        (``>= ssp-buffer-size``) are closest to the stack protector.
2027     #. Small arrays and structures containing small arrays
2028        (``< ssp-buffer-size``) are 2nd closest to the protector.
2029     #. Variables that have had their address taken are 3rd closest to the
2030        protector.
2031
2032     A function with the ``sspreq`` attribute but without the ``alwaysinline``
2033     attribute cannot be inlined into a function without a
2034     ``ssp/sspstrong/sspreq`` attribute. If inlined, the caller will get the
2035     ``sspreq`` attribute.  ``call``, ``invoke``, and ``callbr`` instructions
2036     with the ``alwaysinline`` attribute force inlining.
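
    As a sketch contrasting the heuristic-driven ``ssp`` with the
    unconditional ``sspreq`` (the functions ``@use``, ``@with_buffer`` and
    ``@no_buffer`` are hypothetical):

    .. code-block:: llvm

        declare void @use(i8*)

        ; The 16-byte character array is larger than the default
        ; ssp-buffer-size of 8, so the ssp heuristic enables a protector.
        define void @with_buffer() ssp {
          %buf = alloca [16 x i8]
          %p = getelementptr inbounds [16 x i8], [16 x i8]* %buf, i64 0, i64 0
          call void @use(i8* %p)
          ret void
        }

        ; No arrays or allocas, but sspreq always requests a protector.
        define i32 @no_buffer(i32 %x) sspreq {
          ret i32 %x
        }

    With plain ``ssp``, a function shaped like ``@no_buffer`` would not be
    selected by the heuristic described above.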
2037
2038 ``strictfp``
2039     This attribute indicates that the function was called from a scope that
2040     requires strict floating-point semantics.  LLVM will not attempt any
2041     optimizations that require assumptions about the floating-point rounding
2042     mode or that might alter the state of floating-point status flags that
2043     might otherwise be set or cleared by calling this function. LLVM will
2044     not introduce any new floating-point instructions that may trap.
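
    A minimal sketch of a ``strictfp`` function, assuming the frontend emits
    constrained floating-point intrinsics for its arithmetic. The function
    name ``@add`` is hypothetical:

    .. code-block:: llvm

        declare double @llvm.experimental.constrained.fadd.f64(
                          double, double, metadata, metadata)

        define double @add(double %a, double %b) strictfp {
          ; The rounding mode is read at run time and floating-point
          ; exceptions are treated as observable.
          %r = call double @llvm.experimental.constrained.fadd.f64(
                          double %a, double %b,
                          metadata !"round.dynamic",
                          metadata !"fpexcept.strict") strictfp
          ret double %r
        }

    Here both the function and the call site carry ``strictfp``.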
2045
2046 ``"denormal-fp-math"``
2047     This indicates the denormal (subnormal) handling that may be
2048     assumed for the default floating-point environment. This is a
2049     comma separated pair. The elements may be one of ``"ieee"``,
2050     ``"preserve-sign"``, or ``"positive-zero"``. The first entry
2051     indicates the flushing mode for the result of floating point
2052     operations. The second indicates the handling of denormal inputs
2053     to floating point instructions. For compatibility with older
2054     bitcode, if the second value is omitted, both input and output
2055     modes will assume the same mode.
2056
2057     If this attribute is not specified, the default is
2058     ``"ieee,ieee"``.
2059
2060     If the output mode is ``"preserve-sign"``, or ``"positive-zero"``,
2061     denormal outputs may be flushed to zero by standard floating-point
2062     operations. It is not mandated that flushing to zero occurs, but if
2063     a denormal output is flushed to zero, it must respect the sign
2064     mode. Not all targets support all modes. While this indicates the
2065     expected floating point mode the function will be executed with,
2066     this does not make any attempt to ensure the mode is
2067     consistent. User or platform code is expected to set the floating
2068     point mode appropriately before function entry.
2069
2070     If the input mode is ``"preserve-sign"``, or ``"positive-zero"``, a
2071     floating-point operation must treat any input denormal value as
2072     zero. In some situations, if an instruction does not respect this
2073     mode, the input may need to be converted to 0 as if by
2074     ``@llvm.canonicalize`` during lowering for correctness.
2075
2076 ``"denormal-fp-math-f32"``
2077     Same as ``"denormal-fp-math"``, but only controls the behavior of
2078     the 32-bit float type (or vectors of 32-bit floats). If both are
2079     present, this overrides ``"denormal-fp-math"``. Not all targets
2080     support separately setting the denormal mode per type, and no
2081     attempt is made to diagnose unsupported uses. Currently this
2082     attribute is respected by the AMDGPU and NVPTX backends.
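
    As a sketch of how the two attributes might be combined (the function
    ``@scale`` is hypothetical), the attribute group below keeps IEEE
    denormal handling by default but allows 32-bit float operations to flush
    denormal outputs and to treat denormal inputs as zero, preserving the
    sign:

    .. code-block:: llvm

        define float @scale(float %x) #0 {
          %r = fmul float %x, 5.000000e-01
          ret float %r
        }

        ; First element: output mode. Second element: input mode.
        attributes #0 = { "denormal-fp-math"="ieee,ieee"
                          "denormal-fp-math-f32"="preserve-sign,preserve-sign" }

    On a target that does not support a separate mode per type, the second
    attribute is not diagnosed and may have no effect.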
2083
2084 ``"thunk"``
2085     This attribute indicates that the function will delegate to some other
2086     function with a tail call. The prototype of a thunk should not be used for
2087     optimization purposes. The caller is expected to cast the thunk prototype to
2088     match the thunk target prototype.
2089 ``uwtable``
2090     This attribute indicates that the ABI being targeted requires that
2091     an unwind table entry be produced for this function even if we can
2092     show that no exceptions pass through it. This is normally the case for
2093     the ELF x86-64 ABI, but it can be disabled for some compilation
2094     units.
2095 ``nocf_check``
2096     This attribute indicates that no control-flow check will be performed on
2097     the attributed entity. It disables ``-fcf-protection=<>`` for a specific
2098     entity, giving finer-grained control over the hardware control-flow
2099     protection mechanism. The flag is target independent and currently applies
2100     to a function or function pointer.
2101 ``shadowcallstack``
2102     This attribute indicates that the ShadowCallStack checks are enabled for
2103     the function. The instrumentation checks that the return address for the
2104     function has not changed between the function prolog and epilog. It is
2105     currently x86_64-specific.
2106 ``mustprogress``
2107     This attribute indicates that the function is required to return, unwind,
2108     or interact with the environment in an observable way, e.g. via a volatile
2109     memory access, I/O, or other synchronization.  The ``mustprogress``
2110     attribute is intended to model the requirements of the first section of
2111     [intro.progress] of the C++ Standard. As a consequence, a loop in a
2112     function with the ``mustprogress`` attribute can be assumed to terminate if
2113     it does not interact with the environment in an observable way, and
2114     terminating loops without side-effects can be removed. If a ``mustprogress``
2115     function does not satisfy this contract, the behavior is undefined.  This
2116     attribute does not apply transitively to callees, but does apply to call
2117     sites within the function. Note that ``willreturn`` implies ``mustprogress``.
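
    The sketch below (with a hypothetical function ``@count``) shows a loop
    that has no observable effects and does not terminate for every input:

    .. code-block:: llvm

        define void @count(i32 %n) mustprogress {
        entry:
          br label %loop

        loop:
          ; %i only takes even values, so for an odd %n the exit
          ; condition is never met.
          %i = phi i32 [ 0, %entry ], [ %next, %loop ]
          %next = add i32 %i, 2
          %done = icmp eq i32 %next, %n
          br i1 %done, label %exit, label %loop

        exit:
          ret void
        }

    Because the loop does not interact with the environment, it can be
    assumed to terminate under this attribute, and an optimizer may then
    remove it. Calling ``@count`` with an odd argument violates the contract
    and is undefined behavior.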
2118 ``"warn-stack-size"="<threshold>"``
2119     This attribute sets a threshold to emit diagnostics once the frame size is
2120     known should the frame size exceed the specified value.  It takes one
2121     required integer value, which should be a non-negative integer, and less
2122     than ``UINT_MAX``.  It's unspecified which threshold will be used when
2123     duplicate definitions are linked together with differing values.
2124 ``vscale_range(<min>[, <max>])``
2125     This attribute indicates the minimum and maximum vscale value for the given
2126     function. A value of 0 means unbounded. If the optional max value is omitted
2127     then max is set to the value of min. If the attribute is not present, no
2128     assumptions are made about the range of vscale.
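
    A short sketch of both forms of the attribute (the functions ``@bounded``
    and ``@fixed`` are hypothetical):

    .. code-block:: llvm

        declare i64 @llvm.vscale.i64()

        ; vscale may be assumed to be between 1 and 16 in this function.
        define i64 @bounded() vscale_range(1,16) {
          %v = call i64 @llvm.vscale.i64()
          ret i64 %v
        }

        ; The max value is omitted, so it is set to the min value, 2.
        define i64 @fixed() vscale_range(2) {
          %v = call i64 @llvm.vscale.i64()
          ret i64 %v
        }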
2129
2130 Call Site Attributes
2131 --------------------
2132
2133 In addition to function attributes the following call site only
2134 attributes are supported:
2135
2136 ``vector-function-abi-variant``
2137     This attribute can be attached to a :ref:`call <i_call>` to list
2138     the vector functions associated with the function. Notice that the
2139     attribute cannot be attached to an :ref:`invoke <i_invoke>` or a
2140     :ref:`callbr <i_callbr>` instruction. The attribute consists of a
2141     comma separated list of mangled names. The order of the list does
2142     not imply preference (it is logically a set). The compiler is free
2143     to pick any listed vector function of its choosing.
2144
2145     The syntax for the mangled names is as follows::
2146
2147         _ZGV<isa><mask><vlen><parameters>_<scalar_name>[(<vector_redirection>)]
2148
2149     When present, the attribute informs the compiler that the function
2150     ``<scalar_name>`` has a corresponding vector variant that can be
2151     used to perform the concurrent invocation of ``<scalar_name>`` on
2152     vectors. The shape of the vector function is described by the
2153     tokens between the prefix ``_ZGV`` and the ``<scalar_name>``
2154     token. The standard name of the vector function is
2155     ``_ZGV<isa><mask><vlen><parameters>_<scalar_name>``. When present,
2156     the optional token ``(<vector_redirection>)`` informs the compiler
2157     that a custom name is provided in addition to the standard one
2158     (custom names can be provided for example via the use of ``declare
2159     variant`` in OpenMP 5.0). The declaration of the variant must be
2160     present in the IR Module. The signature of the vector variant is
2161     determined by the rules of the Vector Function ABI (VFABI)
2162     specifications of the target. For Arm and X86, the VFABI can be
2163     found at https://github.com/ARM-software/abi-aa and
2164     https://software.intel.com/content/www/us/en/develop/download/vector-simd-function-abi.html,
2165     respectively.
2166
2167     For X86 and Arm targets, the values of the tokens in the standard
2168     name are those that are defined in the VFABI. LLVM has an internal
2169     ``<isa>`` token that can be used to create scalar-to-vector
2170     mappings for functions that are not directly associated with any of
2171     the target ISAs (for example, some of the mappings stored in the
2172     TargetLibraryInfo). Valid values for the ``<isa>`` token are::
2173
2174         <isa>:= b | c | d | e  -> X86 SSE, AVX, AVX2, AVX512
2175               | n | s          -> Armv8 Advanced SIMD, SVE
2176               | __LLVM__       -> Internal LLVM Vector ISA
2177
2178     For all targets currently supported (X86, Arm and Internal LLVM),
2179     the remaining tokens can have the following values::
2180
2181         <mask>:= M | N         -> mask | no mask
2182
2183         <vlen>:= number        -> number of lanes
2184                | x             -> VLA (Vector Length Agnostic)
2185
2186         <parameters>:= v              -> vector
2187                      | l | l <number> -> linear
2188                      | R | R <number> -> linear with ref modifier
2189                      | L | L <number> -> linear with val modifier
2190                      | U | U <number> -> linear with uval modifier
2191                      | ls <pos>       -> runtime linear
2192                      | Rs <pos>       -> runtime linear with ref modifier
2193                      | Ls <pos>       -> runtime linear with val modifier
2194                      | Us <pos>       -> runtime linear with uval modifier
2195                      | u              -> uniform
2196
2197         <scalar_name>:= name of the scalar function
2198
2199         <vector_redirection>:= optional, custom name of the vector function
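
    As a worked example of the grammar above, the sketch below attaches the
    attribute to a scalar call. The choice of ``@sin`` and the caller ``@f``
    are illustrative assumptions:

    .. code-block:: llvm

        declare double @sin(double)

        ; _ZGV + b (X86 SSE) + N (no mask) + 2 (lanes) + v (one vector
        ; parameter) + _sin (scalar name)
        declare <2 x double> @_ZGVbN2v_sin(<2 x double>)

        define double @f(double %x) {
          %r = call double @sin(double %x) #0
          ret double %r
        }

        attributes #0 = { "vector-function-abi-variant"="_ZGVbN2v_sin" }

    The declaration of ``@_ZGVbN2v_sin`` is included because the variant must
    be present in the module.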
2200
2201 ``preallocated(<ty>)``
2202     This attribute is required on calls to ``llvm.call.preallocated.arg``
2203     and cannot be used on any other call. See
2204     :ref:`llvm.call.preallocated.arg<int_call_preallocated_arg>` for more
2205     details.
2206
2207 .. _glattrs:
2208
2209 Global Attributes
2210 -----------------
2211
2212 Attributes may be set to communicate additional information about a global variable.
2213 Unlike :ref:`function attributes <fnattrs>`, attributes on a global variable
2214 are grouped into a single :ref:`attribute group <attrgrp>`.
2215
2216 .. _opbundles:
2217
2218 Operand Bundles
2219 ---------------
2220
2221 Operand bundles are tagged sets of SSA values that can be associated
2222 with certain LLVM instructions (currently only ``call`` s and
2223 ``invoke`` s).  In a way they are like metadata, but dropping them is
2224 incorrect and will change program semantics.
2225
2226 Syntax::
2227
2228     operand bundle set ::= '[' operand bundle (, operand bundle )* ']'
2229     operand bundle ::= tag '(' [ bundle operand ] (, bundle operand )* ')'
2230     bundle operand ::= SSA value
2231     tag ::= string constant
2232
2233 Operand bundles are **not** part of a function's signature, and a
2234 given function may be called from multiple places with different kinds
2235 of operand bundles.  This reflects the fact that the operand bundles
2236 are conceptually a part of the ``call`` (or ``invoke``), not the
2237 callee being dispatched to.
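
As a sketch of this point, the same callee is called below with no bundle,
with one bundle, and with two bundles. The tags ``"tag-a"`` and ``"tag-b"``
and both functions are hypothetical:

.. code-block:: llvm

    declare void @callee(i32)

    define void @caller(i32 %x, i8* %p) {
      ; The signature of @callee is the same at every call site.
      call void @callee(i32 %x)
      call void @callee(i32 %x) [ "tag-a"(i32 %x) ]
      call void @callee(i32 %x) [ "tag-a"(i32 1), "tag-b"(i8* %p, i32 %x) ]
      ret void
    }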
2238
2239 Operand bundles are a generic mechanism intended to support
2240 runtime-introspection-like functionality for managed languages.  While
2241 the exact semantics of an operand bundle depend on the bundle tag,
2242 there are certain limitations to how much the presence of an operand
2243 bundle can influence the semantics of a program.  These restrictions
2244 are described as the semantics of an "unknown" operand bundle.  As
2245 long as the behavior of an operand bundle is describable within these
2246 restrictions, LLVM does not need to have special knowledge of the
2247 operand bundle to not miscompile programs containing it.
2248
2249 - The bundle operands for an unknown operand bundle escape in unknown
2250   ways before control is transferred to the callee or invokee.
2251 - Calls and invokes with operand bundles have unknown read / write
2252   effect on the heap on entry and exit (even if the call target is
2253   ``readnone`` or ``readonly``), unless they're overridden with
2254   callsite specific attributes.
2255 - An operand bundle at a call site cannot change the implementation
2256   of the called function.  Inter-procedural optimizations work as
2257   usual as long as they take into account the first two properties.
2258
2259 More specific types of operand bundles are described below.
2260
2261 .. _deopt_opbundles:
2262
2263 Deoptimization Operand Bundles
2264 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2265
2266 Deoptimization operand bundles are characterized by the ``"deopt"``
2267 operand bundle tag.  These operand bundles represent an alternate
2268 "safe" continuation for the call site they're attached to, and can be
2269 used by a suitable runtime to deoptimize the compiled frame at the
2270 specified call site.  There can be at most one ``"deopt"`` operand
2271 bundle attached to a call site.  Exact details of deoptimization are
2272 out of scope for the language reference, but it usually involves
2273 rewriting a compiled frame into a set of interpreted frames.
2274
2275 From the compiler's perspective, deoptimization operand bundles make
2276 the call sites they're attached to at least ``readonly``.  They read
2277 through all of their pointer typed operands (even if they're not
2278 otherwise escaped) and the entire visible heap.  Deoptimization
2279 operand bundles do not capture their operands except during
2280 deoptimization, in which case control will not be returned to the
2281 compiled frame.
2282
2283 The inliner knows how to inline through calls that have deoptimization
2284 operand bundles.  Just like inlining through a normal call site
2285 involves composing the normal and exceptional continuations, inlining
2286 through a call site with a deoptimization operand bundle needs to
2287 appropriately compose the "safe" deoptimization continuation.  The
2288 inliner does this by prepending the parent's deoptimization
2289 continuation to every deoptimization continuation in the inlined body.
2290 E.g. inlining ``@f`` into ``@g`` in the following example
2291
2292 .. code-block:: llvm
2293
2294     define void @f() {
2295       call void @x()  ;; no deopt state
2296       call void @y() [ "deopt"(i32 10) ]
2297       call void @y() [ "deopt"(i32 10), "unknown"(i8* null) ]
2298       ret void
2299     }
2300
2301     define void @g() {
2302       call void @f() [ "deopt"(i32 20) ]
2303       ret void
2304     }
2305
2306 will result in
2307
2308 .. code-block:: llvm
2309
2310     define void @g() {
2311       call void @x()  ;; still no deopt state
2312       call void @y() [ "deopt"(i32 20, i32 10) ]
2313       call void @y() [ "deopt"(i32 20, i32 10), "unknown"(i8* null) ]
2314       ret void
2315     }
2316
2317 It is the frontend's responsibility to structure or encode the
2318 deoptimization state in a way that syntactically prepending the
2319 caller's deoptimization state to the callee's deoptimization state is
2320 semantically equivalent to composing the caller's deoptimization
2321 continuation after the callee's deoptimization continuation.
2322
2323 .. _ob_funclet:
2324
2325 Funclet Operand Bundles
2326 ^^^^^^^^^^^^^^^^^^^^^^^
2327
2328 Funclet operand bundles are characterized by the ``"funclet"``
2329 operand bundle tag.  These operand bundles indicate that a call site
2330 is within a particular funclet.  There can be at most one
2331 ``"funclet"`` operand bundle attached to a call site and it must have
2332 exactly one bundle operand.
2333
2334 If any funclet EH pads have been "entered" but not "exited" (per the
2335 `description in the EH doc\ <ExceptionHandling.html#wineh-constraints>`_),
2336 it is undefined behavior to execute a ``call`` or ``invoke`` which:
2337
2338 * does not have a ``"funclet"`` bundle and is not a ``call`` to a nounwind
2339   intrinsic, or
2340 * has a ``"funclet"`` bundle whose operand is not the most-recently-entered
2341   not-yet-exited funclet EH pad.
2342
2343 Similarly, if no funclet EH pads have been entered-but-not-yet-exited,
2344 executing a ``call`` or ``invoke`` with a ``"funclet"`` bundle is undefined behavior.
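
A minimal sketch of a call inside a cleanup funclet follows. The functions
``@may_throw``, ``@cleanup`` and ``@f`` are hypothetical, and the personality
routine is only one possible choice:

.. code-block:: llvm

    declare void @may_throw()
    declare void @cleanup()
    declare i32 @__CxxFrameHandler3(...)

    define void @f() personality i8* bitcast (i32 (...)* @__CxxFrameHandler3 to i8*) {
    entry:
      invoke void @may_throw()
              to label %exit unwind label %pad

    pad:
      %cp = cleanuppad within none []
      ; %cp has been entered but not exited, so the call names it in a
      ; "funclet" bundle.
      call void @cleanup() [ "funclet"(token %cp) ]
      cleanupret from %cp unwind to caller

    exit:
      ret void
    }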
2345
2346 GC Transition Operand Bundles
2347 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2348
2349 GC transition operand bundles are characterized by the
2350 ``"gc-transition"`` operand bundle tag. These operand bundles mark a
2351 call as a transition between a function with one GC strategy to a
2352 function with a different GC strategy. If coordinating the transition
2353 between GC strategies requires additional code generation at the call
2354 site, these bundles may contain any values that are needed by the
2355 generated code.  For more details, see :ref:`GC Transitions
2356 <gc_transition_args>`.
2357
2358 The bundle contains an arbitrary list of values which need to be passed
2359 to GC transition code. They will be lowered and passed as operands to
2360 the appropriate GC_TRANSITION nodes in the selection DAG. It is assumed
2361 that these arguments must be available before and after (but not
2362 necessarily during) the execution of the callee.
2363
2364 .. _assume_opbundles:
2365
2366 Assume Operand Bundles
2367 ^^^^^^^^^^^^^^^^^^^^^^
2368
2369 Operand bundles on an :ref:`llvm.assume <int_assume>` allow representing
2370 assumptions that a :ref:`parameter attribute <paramattrs>` or a
2371 :ref:`function attribute <fnattrs>` holds for a certain value at a certain
2372 location. Operand bundles enable assumptions that are either hard or impossible
2373 to represent as a boolean argument of an :ref:`llvm.assume <int_assume>`.
2374
2375 An assume operand bundle has the form:
2376
2377 ::
2378
2379       "<tag>"([ <holds for value> [, <attribute argument>] ])
2380
2381 * The tag of the operand bundle is usually the name of the attribute that can
2382   be assumed to hold. It can also be ``ignore``; this tag doesn't contain any
2383   information and should be ignored.
2384 * The first argument, if present, is the value for which the attribute holds.
2385 * The second argument, if present, is an argument of the attribute.
2386
2387 If there are no arguments the attribute is a property of the call location.
2388
2389 If the represented attribute expects a constant argument, the argument provided
2390 to the operand bundle should be a constant as well.
2391
2392 For example:
2393
2394 .. code-block:: llvm
2395
2396       call void @llvm.assume(i1 true) ["align"(i32* %val, i32 8)]
2397
2398 allows the optimizer to assume that, at the location of the call to
2399 :ref:`llvm.assume <int_assume>`, ``%val`` has an alignment of at least 8.
2400
2401 .. code-block:: llvm
2402
2403       call void @llvm.assume(i1 %cond) ["cold"(), "nonnull"(i64* %val)]
2404
2405 allows the optimizer to assume that the :ref:`llvm.assume <int_assume>`
2406 call location is cold and that ``%val`` is not null.
2407
2408 Just like for the argument of :ref:`llvm.assume <int_assume>`, if any of the
2409 provided guarantees are violated at runtime the behavior is undefined.
2410
2411 Even if the assumed property can be encoded as a boolean value, like
2412 ``nonnull``, using operand bundles to express the property can still have
2413 benefits:
2414
2415 * Attributes that can be expressed via operand bundles are directly the
2416   property that the optimizer uses and cares about. Encoding attributes as
2417   operand bundles removes the need for an instruction sequence that represents
2418   the property (e.g., ``icmp ne i32* %p, null`` for ``nonnull``) and for the
2419   optimizer to deduce the property from that instruction sequence.
2420 * Expressing the property using operand bundles makes it easy to identify the
2421   use of the value as a use in an :ref:`llvm.assume <int_assume>`. This then
2422   simplifies and improves heuristics, e.g., for "use-sensitive"
2423   optimizations.
2424
2425 .. _ob_preallocated:
2426
2427 Preallocated Operand Bundles
2428 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2429
2430 Preallocated operand bundles are characterized by the ``"preallocated"``
2431 operand bundle tag.  These operand bundles allow separation of the allocation
2432 of the call argument memory from the call site.  This is necessary to pass
2433 non-trivially copyable objects by value in a way that is compatible with MSVC
2434 on some targets.  There can be at most one ``"preallocated"`` operand bundle
2435 attached to a call site and it must have exactly one bundle operand, which is
2436 a token generated by ``@llvm.call.preallocated.setup``.  A call with this
2437 operand bundle should not adjust the stack before entering the function, as
2438 that will have been done by one of the ``@llvm.call.preallocated.*`` intrinsics.
2439
2440 .. code-block:: llvm
2441
2442       %foo = type { i64, i32 }
2443
2444       ...
2445
2446       %t = call token @llvm.call.preallocated.setup(i32 1)
2447       %a = call i8* @llvm.call.preallocated.arg(token %t, i32 0) preallocated(%foo)
2448       %b = bitcast i8* %a to %foo*
2449       ; initialize %b
2450       call void @bar(i32 42, %foo* preallocated(%foo) %b) ["preallocated"(token %t)]
2451
2452 .. _ob_gc_live:
2453
2454 GC Live Operand Bundles
2455 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2456
A ``"gc-live"`` operand bundle is only valid on a :ref:`gc.statepoint <gc_statepoint>`
2458 intrinsic. The operand bundle must contain every pointer to a garbage collected
2459 object which potentially needs to be updated by the garbage collector.
2460
2461 When lowered, any relocated value will be recorded in the corresponding
2462 :ref:`stackmap entry <statepoint-stackmap-format>`.  See the intrinsic description
2463 for further details.
2464
2465 ObjC ARC Attached Call Operand Bundles
2466 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2467
A ``"clang.arc.attachedcall"`` operand bundle on a call indicates the call is
2469 implicitly followed by a marker instruction and a call to an ObjC runtime
2470 function that uses the result of the call. If the argument passed to the operand
2471 bundle is 0, ``@objc_retainAutoreleasedReturnValue`` is called. If 1 is passed,
2472 ``@objc_unsafeClaimAutoreleasedReturnValue`` is called. The return value of a
2473 call with this bundle is used by a call to ``@llvm.objc.clang.arc.noop.use``
2474 unless the called function's return type is void, in which case the operand
2475 bundle is ignored.
2476
2477 The operand bundle is needed to ensure the call is immediately followed by the
2478 marker instruction or the ObjC runtime call in the final output.
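
As a rough illustration, a call carrying this bundle might look as follows.
``@foo`` and ``@caller`` are hypothetical, and the ``i64`` type of the bundle
operand is an assumption of this sketch rather than something stated above:

.. code-block:: llvm

    declare i8* @foo()
    declare void @llvm.objc.clang.arc.noop.use(...)

    define void @caller() {
      ; Operand 0 selects @objc_retainAutoreleasedReturnValue.
      %r = call i8* @foo() [ "clang.arc.attachedcall"(i64 0) ]
      ; The result of the call is used by the no-op use intrinsic.
      call void (...) @llvm.objc.clang.arc.noop.use(i8* %r)
      ret void
    }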
2479
2480 .. _moduleasm:
2481
2482 Module-Level Inline Assembly
2483 ----------------------------
2484
Modules may contain "module-level inline asm" blocks, which correspond
2486 to the GCC "file scope inline asm" blocks. These blocks are internally
2487 concatenated by LLVM and treated as a single unit, but may be separated
2488 in the ``.ll`` file if desired. The syntax is very simple:
2489
2490 .. code-block:: llvm
2491
2492     module asm "inline asm code goes here"
2493     module asm "more can go here"
2494
The strings can contain any character by escaping non-printable
characters. The escape sequence used is simply "\\xx" where "xx" is the
two-digit hex code of the character.
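
For instance, a leading tab character (hex code ``09``) could be written as
shown below. The directive and the symbol name are purely illustrative:

.. code-block:: llvm

    ; "\09" is the escape for a tab character.
    module asm "\09.globl my_symbol"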
2498
2499 Note that the assembly string *must* be parseable by LLVM's integrated assembler
2500 (unless it is disabled), even when emitting a ``.s`` file.
2501
2502 .. _langref_datalayout:
2503
2504 Data Layout
2505 -----------
2506
2507 A module may specify a target specific data layout string that specifies
2508 how data is to be laid out in memory. The syntax for the data layout is
2509 simply:
2510
2511 .. code-block:: llvm
2512
2513     target datalayout = "layout specification"
2514
2515 The *layout specification* consists of a list of specifications
2516 separated by the minus sign character ('-'). Each specification starts
2517 with a letter and may include other information after the letter to
2518 define some aspect of the data layout. The specifications accepted are
2519 as follows:
2520
2521 ``E``
2522     Specifies that the target lays out data in big-endian form. That is,
2523     the bits with the most significance have the lowest address
2524     location.
2525 ``e``
2526     Specifies that the target lays out data in little-endian form. That
2527     is, the bits with the least significance have the lowest address
2528     location.
2529 ``S<size>``
2530     Specifies the natural alignment of the stack in bits. Alignment
2531     promotion of stack variables is limited to the natural stack
2532     alignment to avoid dynamic stack realignment. The stack alignment
2533     must be a multiple of 8-bits. If omitted, the natural stack
2534     alignment defaults to "unspecified", which does not prevent any
2535     alignment promotions.
2536 ``P<address space>``
2537     Specifies the address space that corresponds to program memory.
2538     Harvard architectures can use this to specify what space LLVM
2539     should place things such as functions into. If omitted, the
2540     program memory space defaults to the default address space of 0,
2541     which corresponds to a Von Neumann architecture that has code
2542     and data in the same space.
2543 ``G<address space>``
2544     Specifies the address space to be used by default when creating global
2545     variables. If omitted, the globals address space defaults to the default
2546     address space 0.
2547     Note: variable declarations without an address space are always created in
2548     address space 0, this property only affects the default value to be used
2549     when creating globals without additional contextual information (e.g. in
2550     LLVM passes).
2551 ``A<address space>``
2552     Specifies the address space of objects created by '``alloca``'.
2553     Defaults to the default address space of 0.
2554 ``p[n]:<size>:<abi>:<pref>:<idx>``
2555     This specifies the *size* of a pointer and its ``<abi>`` and
2556     ``<pref>``\erred alignments for address space ``n``. The fourth parameter
    ``<idx>`` is the size of the index that is used for address calculation. If not
2558     specified, the default index size is equal to the pointer size. All sizes
2559     are in bits. The address space, ``n``, is optional, and if not specified,
2560     denotes the default address space 0. The value of ``n`` must be
2561     in the range [1,2^23).
2562 ``i<size>:<abi>:<pref>``
2563     This specifies the alignment for an integer type of a given bit
2564     ``<size>``. The value of ``<size>`` must be in the range [1,2^23).
2565 ``v<size>:<abi>:<pref>``
2566     This specifies the alignment for a vector type of a given bit
2567     ``<size>``.
2568 ``f<size>:<abi>:<pref>``
2569     This specifies the alignment for a floating-point type of a given bit
2570     ``<size>``. Only values of ``<size>`` that are supported by the target
2571     will work. 32 (float) and 64 (double) are supported on all targets; 80
2572     or 128 (different flavors of long double) are also supported on some
2573     targets.
2574 ``a:<abi>:<pref>``
2575     This specifies the alignment for an object of aggregate type.
2576 ``F<type><abi>``
2577     This specifies the alignment for function pointers.
2578     The options for ``<type>`` are:
2579
2580     * ``i``: The alignment of function pointers is independent of the alignment
2581       of functions, and is a multiple of ``<abi>``.
2582     * ``n``: The alignment of function pointers is a multiple of the explicit
2583       alignment specified on the function, and is a multiple of ``<abi>``.
2584 ``m:<mangling>``
2585     If present, specifies that llvm names are mangled in the output. Symbols
2586     prefixed with the mangling escape character ``\01`` are passed through
2587     directly to the assembler without the escape character. The mangling style
2588     options are
2589
2590     * ``e``: ELF mangling: Private symbols get a ``.L`` prefix.
2591     * ``m``: Mips mangling: Private symbols get a ``$`` prefix.
    * ``o``: Mach-O mangling: Private symbols get an ``L`` prefix. Other
2593       symbols get a ``_`` prefix.
2594     * ``x``: Windows x86 COFF mangling: Private symbols get the usual prefix.
2595       Regular C symbols get a ``_`` prefix. Functions with ``__stdcall``,
2596       ``__fastcall``, and ``__vectorcall`` have custom mangling that appends
2597       ``@N`` where N is the number of bytes used to pass parameters. C++ symbols
2598       starting with ``?`` are not mangled in any way.
2599     * ``w``: Windows COFF mangling: Similar to ``x``, except that normal C
2600       symbols do not receive a ``_`` prefix.
    * ``a``: XCOFF mangling: Private symbols get an ``L..`` prefix.
2602 ``n<size1>:<size2>:<size3>...``
2603     This specifies a set of native integer widths for the target CPU in
2604     bits. For example, it might contain ``n32`` for 32-bit PowerPC,
2605     ``n32:64`` for PowerPC 64, or ``n8:16:32:64`` for X86-64. Elements of
2606     this set are considered to support most general arithmetic operations
2607     efficiently.
2608 ``ni:<address space0>:<address space1>:<address space2>...``
2609     This specifies pointer types with the specified address spaces
    as :ref:`Non-Integral Pointer Types <nointptrtype>`.  The ``0``
2611     address space cannot be specified as non-integral.
2612
2613 On every specification that takes a ``<abi>:<pref>``, specifying the
2614 ``<pref>`` alignment is optional. If omitted, the preceding ``:``
2615 should be omitted too and ``<pref>`` will be equal to ``<abi>``.
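
To make the syntax concrete, the string below resembles the layout commonly
used for x86-64 Linux targets. It is shown as an illustration only; the
authoritative string for any target is the one its code generator expects:

.. code-block:: llvm

    ; e            little-endian
    ; m:e          ELF mangling
    ; p270:32:32   32-bit pointers in address space 270, 32-bit aligned
    ; p271:32:32   likewise for address space 271
    ; p272:64:64   64-bit pointers in address space 272, 64-bit aligned
    ; i64:64       i64 is 64-bit aligned (<pref> omitted, so equal to <abi>)
    ; f80:128      80-bit floating point is 128-bit aligned
    ; n8:16:32:64  native integer widths
    ; S128         natural stack alignment of 128 bits
    target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"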
2616
2617 When constructing the data layout for a given target, LLVM starts with a
2618 default set of specifications which are then (possibly) overridden by
2619 the specifications in the ``datalayout`` keyword. The default
2620 specifications are given in this list:
2621
2622 -  ``E`` - big endian
2623 -  ``p:64:64:64`` - 64-bit pointers with 64-bit alignment.
2624 -  ``p[n]:64:64:64`` - Other address spaces are assumed to be the
2625    same as the default address space.
2626 -  ``S0`` - natural stack alignment is unspecified
2627 -  ``i1:8:8`` - i1 is 8-bit (byte) aligned
2628 -  ``i8:8:8`` - i8 is 8-bit (byte) aligned
2629 -  ``i16:16:16`` - i16 is 16-bit aligned
2630 -  ``i32:32:32`` - i32 is 32-bit aligned
2631 -  ``i64:32:64`` - i64 has ABI alignment of 32-bits but preferred
2632    alignment of 64-bits
2633 -  ``f16:16:16`` - half is 16-bit aligned
2634 -  ``f32:32:32`` - float is 32-bit aligned
2635 -  ``f64:64:64`` - double is 64-bit aligned
2636 -  ``f128:128:128`` - quad is 128-bit aligned
2637 -  ``v64:64:64`` - 64-bit vector is 64-bit aligned
2638 -  ``v128:128:128`` - 128-bit vector is 128-bit aligned
2639 -  ``a:0:64`` - aggregates are 64-bit aligned
2640
2641 When LLVM is determining the alignment for a given type, it uses the
2642 following rules:
2643
2644 #. If the type sought is an exact match for one of the specifications,
2645    that specification is used.
2646 #. If no match is found, and the type sought is an integer type, then
2647    the smallest integer type that is larger than the bitwidth of the
2648    sought type is used. If none of the specifications are larger than
2649    the bitwidth then the largest integer type is used. For example,
2650    given the default specifications above, the i7 type will use the
2651    alignment of i8 (next largest) while both i65 and i256 will use the
2652    alignment of i64 (largest specified).
2653 #. If no match is found, and the type sought is a vector type, then the
2654    largest vector type that is smaller than the sought vector type will
   be used as a fallback. This happens because ``<128 x double>`` can be
   implemented in terms of 64 ``<2 x double>``, for example.
2657
2658 The function of the data layout string may not be what you expect.
2659 Notably, this is not a specification from the frontend of what alignment
2660 the code generator should use.
2661
2662 Instead, if specified, the target data layout is required to match what
2663 the ultimate *code generator* expects. This string is used by the
2664 mid-level optimizers to improve code, and this only works if it matches
2665 what the ultimate code generator uses. There is no way to generate IR
2666 that does not embed this target-specific detail into the IR. If you
2667 don't specify the string, the default specifications will be used to
2668 generate a Data Layout and the optimization phases will operate
2669 accordingly and introduce target specificity into the IR with respect to
2670 these default specifications.
2671
2672 .. _langref_triple:
2673
2674 Target Triple
2675 -------------
2676
2677 A module may specify a target triple string that describes the target
2678 host. The syntax for the target triple is simply:
2679
2680 .. code-block:: llvm
2681
2682     target triple = "x86_64-apple-macosx10.7.0"
2683
2684 The *target triple* string consists of a series of identifiers delimited
2685 by the minus sign character ('-'). The canonical forms are:
2686
2687 ::
2688
2689     ARCHITECTURE-VENDOR-OPERATING_SYSTEM
2690     ARCHITECTURE-VENDOR-OPERATING_SYSTEM-ENVIRONMENT
2691
2692 This information is passed along to the backend so that it generates
2693 code for the proper architecture. It's possible to override this on the
2694 command line with the ``-mtriple`` command line option.
2695
2696 .. _objectlifetime:
2697
Object Lifetime
---------------
2700
2701 A memory object, or simply object, is a region of a memory space that is
2702 reserved by a memory allocation such as :ref:`alloca <i_alloca>`, heap
2703 allocation calls, and global variable definitions.
2704 Once it is allocated, the bytes stored in the region can only be read or written
2705 through a pointer that is :ref:`based on <pointeraliasing>` the allocation
2706 value.
Reading or writing the object through a pointer that is not based on the
object is undefined behavior.
2709
The lifetime of a memory object is a property that decides its accessibility.
Unless stated otherwise, a memory object is alive from its allocation until
its deallocation, and dead afterwards.
2713 It is undefined behavior to access a memory object that isn't alive, but
2714 operations that don't dereference it such as
2715 :ref:`getelementptr <i_getelementptr>`, :ref:`ptrtoint <i_ptrtoint>` and
2716 :ref:`icmp <i_icmp>` return a valid result.
This is why these instructions may be moved across operations that
impact the object's lifetime.
2719 A stack object's lifetime can be explicitly specified using
2720 :ref:`llvm.lifetime.start <int_lifestart>` and
2721 :ref:`llvm.lifetime.end <int_lifeend>` intrinsic function calls.
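
A minimal sketch of these rules, using a hypothetical function ``@f`` and the
``p0i8`` overloads of the lifetime intrinsics:

.. code-block:: llvm

    declare void @llvm.lifetime.start.p0i8(i64, i8* nocapture)
    declare void @llvm.lifetime.end.p0i8(i64, i8* nocapture)

    define i32 @f() {
    entry:
      %x = alloca i32
      %p = bitcast i32* %x to i8*
      call void @llvm.lifetime.start.p0i8(i64 4, i8* %p)
      ; %x is alive here, so it may be read and written.
      store i32 7, i32* %x
      %v = load i32, i32* %x
      call void @llvm.lifetime.end.p0i8(i64 4, i8* %p)
      ; %x is dead here: a load or store would be undefined behavior,
      ; but address computations still return a valid result.
      %q = getelementptr i8, i8* %p, i64 1
      %same = icmp eq i8* %q, %p
      ret i32 %v
    }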
2722
2723 .. _pointeraliasing:
2724
2725 Pointer Aliasing Rules
2726 ----------------------
2727
2728 Any memory access must be done through a pointer value associated with
2729 an address range of the memory access, otherwise the behavior is
2730 undefined. Pointer values are associated with address ranges according
2731 to the following rules:
2732
2733 -  A pointer value is associated with the addresses associated with any
2734    value it is *based* on.
2735 -  An address of a global variable is associated with the address range
2736    of the variable's storage.
2737 -  The result value of an allocation instruction is associated with the
2738    address range of the allocated storage.
2739 -  A null pointer in the default address-space is associated with no
2740    address.
2741 -  An :ref:`undef value <undefvalues>` in *any* address-space is
2742    associated with no address.
2743 -  An integer constant other than zero or a pointer value returned from
2744    a function not defined within LLVM may be associated with address
2745    ranges allocated through mechanisms other than those provided by
2746    LLVM. Such ranges shall not overlap with any ranges of addresses
2747    allocated by mechanisms provided by LLVM.
2748
2749 A pointer value is *based* on another pointer value according to the
2750 following rules:
2751
2752 -  A pointer value formed from a scalar ``getelementptr`` operation is *based* on
2753    the pointer-typed operand of the ``getelementptr``.
2754 -  The pointer in lane *l* of the result of a vector ``getelementptr`` operation
2755    is *based* on the pointer in lane *l* of the vector-of-pointers-typed operand
2756    of the ``getelementptr``.
2757 -  The result value of a ``bitcast`` is *based* on the operand of the
2758    ``bitcast``.
2759 -  A pointer value formed by an ``inttoptr`` is *based* on all pointer
2760    values that contribute (directly or indirectly) to the computation of
2761    the pointer's value.
2762 -  The "*based* on" relationship is transitive.
2763
2764 Note that this definition of *"based"* is intentionally similar to the
2765 definition of *"based"* in C99, though it is slightly weaker.
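
The rules for *based* can be traced through a short sketch. The function and
value names are hypothetical:

.. code-block:: llvm

    define i8 @based(i32* %p, i64 %off) {
      ; %q is based on %p (scalar getelementptr).
      %q = getelementptr i32, i32* %p, i64 1
      ; %r is based on %q (bitcast) and, by transitivity, on %p.
      %r = bitcast i32* %q to i8*
      %i = ptrtoint i8* %r to i64
      %j = add i64 %i, %off
      ; %s is based on %r, which contributes to %j through %i.
      %s = inttoptr i64 %j to i8*
      %v = load i8, i8* %s
      ret i8 %v
    }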
2766
2767 LLVM IR does not associate types with memory. The result type of a
2768 ``load`` merely indicates the size and alignment of the memory from
2769 which to load, as well as the interpretation of the value. The first
2770 operand type of a ``store`` similarly only indicates the size and
2771 alignment of the store.
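
For example, the following sketch (with a hypothetical function name) stores a
``float`` and reloads the same bytes as an ``i32``. Because memory carries no
type, the load simply reinterprets the stored bits:

.. code-block:: llvm

    define i32 @float_bits(float %f) {
      %slot = alloca float
      store float %f, float* %slot
      ; Same address, different access type: still the same 4 bytes.
      %ip = bitcast float* %slot to i32*
      %bits = load i32, i32* %ip
      ret i32 %bits
    }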
2772
2773 Consequently, type-based alias analysis, aka TBAA, aka
2774 ``-fstrict-aliasing``, is not applicable to general unadorned LLVM IR.
2775 :ref:`Metadata <metadata>` may be used to encode additional information
2776 which specialized optimization passes may use to implement type-based
2777 alias analysis.
2778
2779 .. _pointercapture:
2780
2781 Pointer Capture
2782 ---------------
2783
Given a function call and a pointer that is passed as an argument or stored in
memory before the call, the pointer is *captured* by the call if the call makes
a copy of any part of the pointer that outlives the call.
2787 To be precise, a pointer is captured if one or more of the following conditions
2788 hold:
2789
2790 1. The call stores any bit of the pointer carrying information into a place,
2791    and the stored bits can be read from the place by the caller after this call
2792    exits.
2793
2794 .. code-block:: llvm
2795
2796     @glb  = global i8* null
2797     @glb2 = global i8* null
2798     @glb3 = global i8* null
2799     @glbi = global i32 0
2800
2801     define i8* @f(i8* %a, i8* %b, i8* %c, i8* %d, i8* %e) {
2802       store i8* %a, i8** @glb ; %a is captured by this call
2803
2804       store i8* %b,   i8** @glb2 ; %b isn't captured because the stored value is overwritten by the store below
2805       store i8* null, i8** @glb2
2806
2807       store i8* %c,   i8** @glb3
2808       call void @g() ; If @g makes a copy of %c that outlives this call (@f), %c is captured
2809       store i8* null, i8** @glb3
2810
2811       %i = ptrtoint i8* %d to i64
2812       %j = trunc i64 %i to i32
2813       store i32 %j, i32* @glbi ; %d is captured
2814
2815       ret i8* %e ; %e is captured
2816     }
2817
2818 2. The call stores any bit of the pointer carrying information into a place,
2819    and the stored bits can be safely read from the place by another thread via
2820    synchronization.
2821
2822 .. code-block:: llvm
2823
2824     @lock = global i1 true
2825
2826     define void @f(i8* %a) {
2827       store i8* %a, i8** @glb
2828       store atomic i1 false, i1* @lock release ; %a is captured because another thread can safely read @glb
2829       store i8* null, i8** @glb
2830       ret void
2831     }
2832
2833 3. The call's behavior depends on any bit of the pointer carrying information.
2834
2835 .. code-block:: llvm
2836
2837     @glb = global i8 0
2838
2839     define void @f(i8* %a) {
2840       %c = icmp eq i8* %a, @glb
      br i1 %c, label %BB_EXIT, label %BB_CONTINUE ; captures %a
2842     BB_EXIT:
2843       call void @exit()
2844       unreachable
2845     BB_CONTINUE:
2846       ret void
2847     }
2848
2849 4. The pointer is used in a volatile access as its address.
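
A minimal sketch of this last case, in the style of the examples above:

.. code-block:: llvm

    define void @f(i8* %a) {
      %v = load volatile i8, i8* %a ; %a is captured: it is the address of a volatile access
      ret void
    }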
2850
2851
2852 .. _volatile:
2853
2854 Volatile Memory Accesses
2855 ------------------------
2856
2857 Certain memory accesses, such as :ref:`load <i_load>`'s,
2858 :ref:`store <i_store>`'s, and :ref:`llvm.memcpy <int_memcpy>`'s may be
2859 marked ``volatile``. The optimizers must not change the number of
2860 volatile operations or change their order of execution relative to other
2861 volatile operations. The optimizers *may* change the order of volatile
2862 operations relative to non-volatile operations. This is not Java's
2863 "volatile" and has no cross-thread synchronization behavior.
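
A small sketch of these ordering rules; ``@poll``, ``%reg``, and ``%buf`` are
hypothetical names:

.. code-block:: llvm

    define i32 @poll(i32* %reg, i32* %buf) {
      ; Volatile access 1: may not be removed or duplicated.
      store volatile i32 1, i32* %reg
      ; Non-volatile: may be reordered relative to the volatile accesses,
      ; subject to the usual dependence constraints.
      store i32 0, i32* %buf
      ; Volatile access 2: must stay ordered after volatile access 1.
      %s = load volatile i32, i32* %reg
      ret i32 %s
    }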
2864
2865 A volatile load or store may have additional target-specific semantics.
2866 Any volatile operation can have side effects, and any volatile operation
2867 can read and/or modify state which is not accessible via a regular load
2868 or store in this module. Volatile operations may use addresses which do
2869 not point to memory (like MMIO registers). This means the compiler may
2870 not use a volatile operation to prove a non-volatile access to that
2871 address has defined behavior.
2872
2873 The allowed side-effects for volatile accesses are limited.  If a
2874 non-volatile store to a given address would be legal, a volatile
2875 operation may modify the memory at that address. A volatile operation
2876 may not modify any other memory accessible by the module being compiled.
2877 A volatile operation may not call any code in the current module.
2878
2879 The compiler may assume execution will continue after a volatile operation,
2880 so operations which modify memory or may have undefined behavior can be
2881 hoisted past a volatile operation.
2882
2883 As an exception to the preceding rule, the compiler may not assume execution
2884 will continue after a volatile store operation. This restriction is necessary
2885 to support the somewhat common pattern in C of intentionally storing to an
2886 invalid pointer to crash the program. In the future, it might make sense to
2887 allow frontends to control this behavior.
2888
IR-level volatile loads and stores cannot safely be optimized into ``llvm.memcpy``
or ``llvm.memmove`` intrinsics even when those intrinsics are flagged volatile.
2891 Likewise, the backend should never split or merge target-legal volatile
2892 load/store instructions. Similarly, IR-level volatile loads and stores cannot
2893 change from integer to floating-point or vice versa.
2894
2895 .. admonition:: Rationale
2896
2897  Platforms may rely on volatile loads and stores of natively supported
 data width to be executed as a single instruction. For example, in C
2899  this holds for an l-value of volatile primitive type with native
2900  hardware support, but not necessarily for aggregate types. The
2901  frontend upholds these expectations, which are intentionally
2902  unspecified in the IR. The rules above ensure that IR transformations
2903  do not violate the frontend's contract with the language.
2904
2905 .. _memmodel:
2906
2907 Memory Model for Concurrent Operations
2908 --------------------------------------
2909
2910 The LLVM IR does not define any way to start parallel threads of
2911 execution or to register signal handlers. Nonetheless, there are
2912 platform-specific ways to create them, and we define LLVM IR's behavior
2913 in their presence. This model is inspired by the C++0x memory model.
2914
2915 For a more informal introduction to this model, see the :doc:`Atomics`.
2916
2917 We define a *happens-before* partial order as the least partial order
2918 that
2919
2920 -  Is a superset of single-thread program order, and
-  When ``a`` *synchronizes-with* ``b``, includes an edge from ``a`` to
2922    ``b``. *Synchronizes-with* pairs are introduced by platform-specific
2923    techniques, like pthread locks, thread creation, thread joining,
2924    etc., and by atomic instructions. (See also :ref:`Atomic Memory Ordering
2925    Constraints <ordering>`).
2926
2927 Note that program order does not introduce *happens-before* edges
2928 between a thread and signals executing inside that thread.
2929
2930 Every (defined) read operation (load instructions, memcpy, atomic
2931 loads/read-modify-writes, etc.) R reads a series of bytes written by
2932 (defined) write operations (store instructions, atomic
2933 stores/read-modify-writes, memcpy, etc.). For the purposes of this
2934 section, initialized globals are considered to have a write of the
2935 initializer which is atomic and happens before any other read or write
2936 of the memory in question. For each byte of a read R, R\ :sub:`byte`
2937 may see any write to the same byte, except:
2938
2939 -  If write\ :sub:`1`  happens before write\ :sub:`2`, and
2940    write\ :sub:`2` happens before R\ :sub:`byte`, then
2941    R\ :sub:`byte` does not see write\ :sub:`1`.
2942 -  If R\ :sub:`byte` happens before write\ :sub:`3`, then
2943    R\ :sub:`byte` does not see write\ :sub:`3`.
2944
2945 Given that definition, R\ :sub:`byte` is defined as follows:
2946
2947 -  If R is volatile, the result is target-dependent. (Volatile is
2948    supposed to give guarantees which can support ``sig_atomic_t`` in
2949    C/C++, and may be used for accesses to addresses that do not behave
2950    like normal memory. It does not generally provide cross-thread
2951    synchronization.)
2952 -  Otherwise, if there is no write to the same byte that happens before
2953    R\ :sub:`byte`, R\ :sub:`byte` returns ``undef`` for that byte.
2954 -  Otherwise, if R\ :sub:`byte` may see exactly one write,
2955    R\ :sub:`byte` returns the value written by that write.
2956 -  Otherwise, if R is atomic, and all the writes R\ :sub:`byte` may
2957    see are atomic, it chooses one of the values written. See the :ref:`Atomic
2958    Memory Ordering Constraints <ordering>` section for additional
2959    constraints on how the choice is made.
2960 -  Otherwise R\ :sub:`byte` returns ``undef``.
2961
2962 R returns the value composed of the series of bytes it read. This
2963 implies that some bytes within the value may be ``undef`` **without**
2964 the entire value being ``undef``. Note that this only defines the
2965 semantics of the operation; it doesn't mean that targets will emit more
2966 than one instruction to read the series of bytes.
2967
2968 Note that in cases where none of the atomic intrinsics are used, this
2969 model places only one restriction on IR transformations on top of what
2970 is required for single-threaded execution: introducing a store to a byte
2971 which might not otherwise be stored is not allowed in general.
2972 (Specifically, in the case where another thread might write to and read
2973 from an address, introducing a store can change a load that may see
2974 exactly one write into a load that may see multiple writes.)
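
To make the restriction concrete, consider this sketch with hypothetical
names. ``@f`` stores to ``@x`` only when ``%c`` is true, whereas ``@f_bad``
shows a rewrite that is not allowed in general:

.. code-block:: llvm

    @x = global i32 0

    define void @f(i1 %c) {
    entry:
      br i1 %c, label %then, label %done
    then:
      store i32 1, i32* @x
      br label %done
    done:
      ret void
    }

    ; Not a valid replacement for @f in general: it stores to @x even when
    ; %c is false. A load of @x in another thread that could previously see
    ; exactly one write may now see several.
    define void @f_bad(i1 %c) {
      %old = load i32, i32* @x
      %new = select i1 %c, i32 1, i32 %old
      store i32 %new, i32* @x
      ret void
    }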
2975
2976 .. _ordering:
2977
2978 Atomic Memory Ordering Constraints
2979 ----------------------------------
2980
2981 Atomic instructions (:ref:`cmpxchg <i_cmpxchg>`,
2982 :ref:`atomicrmw <i_atomicrmw>`, :ref:`fence <i_fence>`,
2983 :ref:`atomic load <i_load>`, and :ref:`atomic store <i_store>`) take
2984 ordering parameters that determine which other atomic instructions on
2985 the same address they *synchronize with*. These semantics are borrowed
2986 from Java and C++0x, but are somewhat more colloquial. If these
2987 descriptions aren't precise enough, check those specs (see spec
2988 references in the :doc:`atomics guide <Atomics>`).
2989 :ref:`fence <i_fence>` instructions treat these orderings somewhat
2990 differently since they don't take an address. See that instruction's
2991 documentation for details.
2992
2993 For a simpler introduction to the ordering constraints, see the
2994 :doc:`Atomics`.
2995
2996 ``unordered``
2997     The set of values that can be read is governed by the happens-before
2998     partial order. A value cannot be read unless some operation wrote
2999     it. This is intended to provide a guarantee strong enough to model
3000     Java's non-volatile shared variables. This ordering cannot be
3001     specified for read-modify-write operations; it is not strong enough
3002     to make them atomic in any interesting way.
3003 ``monotonic``
3004     In addition to the guarantees of ``unordered``, there is a single
3005     total order for modifications by ``monotonic`` operations on each
3006     address. All modification orders must be compatible with the
3007     happens-before order. There is no guarantee that the modification
3008     orders can be combined to a global total order for the whole program
3009     (and this often will not be possible). The read in an atomic
3010     read-modify-write operation (:ref:`cmpxchg <i_cmpxchg>` and
3011     :ref:`atomicrmw <i_atomicrmw>`) reads the value in the modification
3012     order immediately before the value it writes. If one atomic read
3013     happens before another atomic read of the same address, the later
3014     read must see the same value or a later value in the address's
3015     modification order. This disallows reordering of ``monotonic`` (or
3016     stronger) operations on the same address. If an address is written
3017     ``monotonic``-ally by one thread, and other threads ``monotonic``-ally
3018     read that address repeatedly, the other threads must eventually see
3019     the write. This corresponds to the C++0x/C1x
3020     ``memory_order_relaxed``.
3021 ``acquire``
3022     In addition to the guarantees of ``monotonic``, a
3023     *synchronizes-with* edge may be formed with a ``release`` operation.
3024     This is intended to model C++'s ``memory_order_acquire``.
3025 ``release``
3026     In addition to the guarantees of ``monotonic``, if this operation
3027     writes a value which is subsequently read by an ``acquire``
3028     operation, it *synchronizes-with* that operation. (This isn't a
3029     complete description; see the C++0x definition of a release
3030     sequence.) This corresponds to the C++0x/C1x
3031     ``memory_order_release``.
3032 ``acq_rel`` (acquire+release)
3033     Acts as both an ``acquire`` and ``release`` operation on its
3034     address. This corresponds to the C++0x/C1x ``memory_order_acq_rel``.
3035 ``seq_cst`` (sequentially consistent)
3036     In addition to the guarantees of ``acq_rel`` (``acquire`` for an
3037     operation that only reads, ``release`` for an operation that only
3038     writes), there is a global total order on all
3039     sequentially-consistent operations on all addresses, which is
3040     consistent with the *happens-before* partial order and with the
3041     modification orders of all the affected addresses. Each
3042     sequentially-consistent read sees the last preceding write to the
3043     same address in this global order. This corresponds to the C++0x/C1x
3044     ``memory_order_seq_cst`` and Java volatile.
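
The ``release``/``acquire`` pairing can be sketched with a simple
message-passing pattern. The globals and function names are hypothetical:

.. code-block:: llvm

    @data = global i32 0
    @flag = global i32 0

    define void @producer() {
      store i32 42, i32* @data
      ; Publishes the preceding store to any acquire load that reads this 1.
      store atomic i32 1, i32* @flag release, align 4
      ret void
    }

    define i32 @consumer() {
    entry:
      br label %spin
    spin:
      %f = load atomic i32, i32* @flag acquire, align 4
      %ready = icmp eq i32 %f, 1
      br i1 %ready, label %done, label %spin
    done:
      ; The release store synchronizes-with the acquire load that read 1,
      ; so the store to @data happens before this load.
      %v = load i32, i32* @data
      ret i32 %v
    }

With ``monotonic`` in place of ``release`` and ``acquire``, no
*synchronizes-with* edge would be formed, and the final load of ``@data``
would race with the store in ``@producer``.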
3045
3046 .. _syncscope:
3047
3048 If an atomic operation is marked ``syncscope("singlethread")``, it only
3049 *synchronizes with* and only participates in the seq\_cst total orderings of
3050 other operations running in the same thread (for example, in signal handlers).
3051
3052 If an atomic operation is marked ``syncscope("<target-scope>")``, where
``<target-scope>`` is a target-specific synchronization scope, then it is
target-dependent whether it *synchronizes with* and participates in the
seq\_cst total orderings of other operations.
3056
3057 Otherwise, an atomic operation that is not marked ``syncscope("singlethread")``
3058 or ``syncscope("<target-scope>")`` *synchronizes with* and participates in the
3059 seq\_cst total orderings of other operations that are not marked
3060 ``syncscope("singlethread")`` or ``syncscope("<target-scope>")``.
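
A brief sketch of the syntax, with hypothetical names. The ``singlethread``
scope below restricts synchronization to code running in the same thread, such
as a signal handler:

.. code-block:: llvm

    @flag = global i32 0

    define void @set_flag_for_handler() {
      ; Only synchronizes with operations running in the same thread.
      store atomic i32 1, i32* @flag syncscope("singlethread") release, align 4
      fence syncscope("singlethread") seq_cst
      ret void
    }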
3061
3062 .. _floatenv:
3063
3064 Floating-Point Environment
3065 --------------------------
3066
3067 The default LLVM floating-point environment assumes that floating-point
3068 instructions do not have side effects. Results assume the round-to-nearest
3069 rounding mode. No floating-point exception state is maintained in this
3070 environment. Therefore, there is no attempt to create or preserve invalid
3071 operation (SNaN) or division-by-zero exceptions.
3072
3073 The benefit of this exception-free assumption is that floating-point
3074 operations may be speculated freely without any other fast-math relaxations
3075 to the floating-point model.
3076
3077 Code that requires different behavior than this should use the
3078 :ref:`Constrained Floating-Point Intrinsics <constrainedfp>`.
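
As a sketch of what such code might look like, the following hypothetical
function performs an addition that honors the dynamic rounding mode and
preserves exception behavior:

.. code-block:: llvm

    declare double @llvm.experimental.constrained.fadd.f64(double, double, metadata, metadata)

    define double @add_strict(double %a, double %b) strictfp {
      %r = call double @llvm.experimental.constrained.fadd.f64(
               double %a, double %b,
               metadata !"round.dynamic",
               metadata !"fpexcept.strict") strictfp
      ret double %r
    }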
3079
3080 .. _fastmath:
3081
3082 Fast-Math Flags
3083 ---------------
3084
3085 LLVM IR floating-point operations (:ref:`fneg <i_fneg>`, :ref:`fadd <i_fadd>`,
3086 :ref:`fsub <i_fsub>`, :ref:`fmul <i_fmul>`, :ref:`fdiv <i_fdiv>`,
3087 :ref:`frem <i_frem>`, :ref:`fcmp <i_fcmp>`), :ref:`phi <i_phi>`,
3088 :ref:`select <i_select>` and :ref:`call <i_call>`
3089 may use the following flags to enable otherwise unsafe
3090 floating-point transformations.
3091
3092 ``nnan``
3093    No NaNs - Allow optimizations to assume the arguments and result are not
   NaN. If an argument is a NaN, or the result would be a NaN, it produces
3095    a :ref:`poison value <poisonvalues>` instead.
3096
3097 ``ninf``
3098    No Infs - Allow optimizations to assume the arguments and result are not
3099    +/-Inf. If an argument is +/-Inf, or the result would be +/-Inf, it
3100    produces a :ref:`poison value <poisonvalues>` instead.
3101
3102 ``nsz``
3103    No Signed Zeros - Allow optimizations to treat the sign of a zero
3104    argument or result as insignificant. This does not imply that -0.0
3105    is poison and/or guaranteed to not exist in the operation.
3106
3107 ``arcp``
3108    Allow Reciprocal - Allow optimizations to use the reciprocal of an
3109    argument rather than perform division.
3110
3111 ``contract``
3112    Allow floating-point contraction (e.g. fusing a multiply followed by an
3113    addition into a fused multiply-and-add). This does not enable reassociating
   to form arbitrary contractions. For example, ``(a*b) + (c*d) + e`` cannot
3115    be transformed into ``(a*b) + ((c*d) + e)`` to create two fma operations.
3116
3117 ``afn``
3118    Approximate functions - Allow substitution of approximate calculations for
3119    functions (sin, log, sqrt, etc). See floating-point intrinsic definitions
3120    for places where this can apply to LLVM's intrinsic math functions.
3121
3122 ``reassoc``
3123    Allow reassociation transformations for floating-point instructions.
3124    This may dramatically change results in floating-point.
3125
3126 ``fast``
3127    This flag implies all of the others.
3128
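A small sketch of how the flags above are attached to individual instructions.
The function and value names are illustrative only:

.. code-block:: llvm

    define float @fmf(float %a, float %b, float %c) {
      ; Poison if %a, %b, or the sum is NaN or +/-Inf.
      %sum = fadd nnan ninf float %a, %b
      ; With contract on both operations, the multiply and add may be fused.
      %mul = fmul contract float %a, %b
      %mad = fadd contract float %mul, %c
      ; fast implies all of the other flags.
      %q = fdiv fast float %sum, %mad
      ret float %q
    }
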
3129 .. _uselistorder:
3130
3131 Use-list Order Directives
3132 -------------------------
3133
3134 Use-list directives encode the in-memory order of each use-list, allowing the
3135 order to be recreated. ``<order-indexes>`` is a comma-separated list of
3136 indexes that are assigned to the referenced value's uses. The referenced
3137 value's use-list is immediately sorted by these indexes.
3138
3139 Use-list directives may appear at function scope or global scope. They are not
3140 instructions, and have no effect on the semantics of the IR. When they're at
3141 function scope, they must appear after the terminator of the final basic block.
3142
3143 If basic blocks have their address taken via ``blockaddress()`` expressions,
3144 ``uselistorder_bb`` can be used to reorder their use-lists from outside their
3145 function's scope.
3146
3147 :Syntax:
3148
3149 ::
3150
3151     uselistorder <ty> <value>, { <order-indexes> }
    uselistorder_bb @function, %block, { <order-indexes> }
3153
3154 :Examples:
3155
3156 ::
3157
3158     define void @foo(i32 %arg1, i32 %arg2) {
3159     entry:
3160       ; ... instructions ...
3161     bb:
3162       ; ... instructions ...
3163
3164       ; At function scope.
3165       uselistorder i32 %arg1, { 1, 0, 2 }
3166       uselistorder label %bb, { 1, 0 }
3167     }
3168
3169     ; At global scope.
3170     uselistorder i32* @global, { 1, 2, 0 }
3171     uselistorder i32 7, { 1, 0 }
    uselistorder i32 (i32)* @bar, { 1, 0 }
3173     uselistorder_bb @foo, %bb, { 5, 1, 3, 2, 0, 4 }
3174
3175 .. _source_filename:
3176
3177 Source Filename
3178 ---------------
3179
3180 The *source filename* string is set to the original module identifier,
3181 which will be the name of the compiled source file when compiling from
3182 source through the clang front end, for example. It is then preserved through
3183 the IR and bitcode.
3184
3185 This is currently necessary to generate a consistent unique global
3186 identifier for local functions used in profile data, which prepends the
3187 source file name to the local function name.
3188
3189 The syntax for the source file name is simply:
3190
3191 .. code-block:: text
3192
3193     source_filename = "/path/to/source.c"
3194
3195 .. _typesystem:
3196
3197 Type System
3198 ===========
3199
3200 The LLVM type system is one of the most important features of the
3201 intermediate representation. Being typed enables a number of
3202 optimizations to be performed on the intermediate representation
3203 directly, without having to do extra analyses on the side before the
3204 transformation. A strong type system makes it easier to read the
3205 generated code and enables novel analyses and transformations that are
3206 not feasible to perform on normal three address code representations.
3207
3208 .. _t_void:
3209
3210 Void Type
3211 ---------
3212
3213 :Overview:
3214
3215
3216 The void type does not represent any value and has no size.
3217
3218 :Syntax:
3219
3220
3221 ::
3222
3223       void
3224
3225
3226 .. _t_function:
3227
3228 Function Type
3229 -------------
3230
3231 :Overview:
3232
3233
3234 The function type can be thought of as a function signature. It consists of a
3235 return type and a list of formal parameter types. The return type of a function
3236 type is a void type or first class type --- except for :ref:`label <t_label>`
3237 and :ref:`metadata <t_metadata>` types.
3238
3239 :Syntax:
3240
3241 ::
3242
3243       <returntype> (<parameter list>)
3244
3245 ...where '``<parameter list>``' is a comma-separated list of type
3246 specifiers. Optionally, the parameter list may include a type ``...``, which
3247 indicates that the function takes a variable number of arguments. Variable
3248 argument functions can access their arguments with the :ref:`variable argument
3249 handling intrinsic <int_varargs>` functions. '``<returntype>``' is any type
3250 except :ref:`label <t_label>` and :ref:`metadata <t_metadata>`.
3251
3252 :Examples:
3253
3254 +---------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------+
3255 | ``i32 (i32)``                   | function taking an ``i32``, returning an ``i32``                                                                                                                    |
3256 +---------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------+
3257 | ``float (i16, i32 *) *``        | :ref:`Pointer <t_pointer>` to a function that takes an ``i16`` and a :ref:`pointer <t_pointer>` to ``i32``, returning ``float``.                                    |
3258 +---------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------+
3259 | ``i32 (i8*, ...)``              | A vararg function that takes at least one :ref:`pointer <t_pointer>` to ``i8`` (char in C), which returns an integer. This is the signature for ``printf`` in LLVM. |
3260 +---------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------+
3261 | ``{i32, i32} (i32)``            | A function taking an ``i32``, returning a :ref:`structure <t_struct>` containing two ``i32`` values                                                                 |
3262 +---------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------+
3263
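A short sketch of how such function types show up in declarations and
definitions. ``@divmod`` is a hypothetical helper, not part of LLVM:

.. code-block:: llvm

    ; Function type: i32 (i8*, ...)
    declare i32 @printf(i8*, ...)

    ; Function type: {i32, i32} (i32, i32)
    define { i32, i32 } @divmod(i32 %n, i32 %d) {
      %q = sdiv i32 %n, %d
      %r = srem i32 %n, %d
      %t0 = insertvalue { i32, i32 } undef, i32 %q, 0
      %t1 = insertvalue { i32, i32 } %t0, i32 %r, 1
      ret { i32, i32 } %t1
    }
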
3264 .. _t_firstclass:
3265
3266 First Class Types
3267 -----------------
3268
3269 The :ref:`first class <t_firstclass>` types are perhaps the most important.
3270 Values of these types are the only ones which can be produced by
3271 instructions.
3272
3273 .. _t_single_value:
3274
3275 Single Value Types
3276 ^^^^^^^^^^^^^^^^^^
3277
3278 These are the types that are valid in registers from CodeGen's perspective.
3279
3280 .. _t_integer:
3281
3282 Integer Type
3283 """"""""""""
3284
3285 :Overview:
3286
The integer type is a simple type that specifies an arbitrary bit width
for the desired integer. Any bit width from 1 bit to
2\ :sup:`23`\ -1 (about 8 million) can be specified.
3290
3291 :Syntax:
3292
3293 ::
3294
3295       iN
3296
3297 The number of bits the integer will occupy is specified by the ``N``
3298 value.
3299
:Examples:
3302
3303 +----------------+------------------------------------------------+
3304 | ``i1``         | a single-bit integer.                          |
3305 +----------------+------------------------------------------------+
3306 | ``i32``        | a 32-bit integer.                              |
3307 +----------------+------------------------------------------------+
3308 | ``i1942652``   | a really big integer of over 1 million bits.   |
3309 +----------------+------------------------------------------------+
3310
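Because the bit width is arbitrary, widths that do not match a machine register
can be used directly. A minimal sketch, with a hypothetical function name:

.. code-block:: llvm

    define i32 @low_bit_of_i7(i7 %x) {
      ; Narrow the 7-bit value to a single bit, then widen it to 32 bits.
      %bit = trunc i7 %x to i1
      %wide = zext i1 %bit to i32
      ret i32 %wide
    }
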
3311 .. _t_floating:
3312
3313 Floating-Point Types
3314 """"""""""""""""""""
3315
3316 .. list-table::
3317    :header-rows: 1
3318
3319    * - Type
3320      - Description
3321
3322    * - ``half``
3323      - 16-bit floating-point value
3324
3325    * - ``bfloat``
3326      - 16-bit "brain" floating-point value (7-bit significand).  Provides the
3327        same number of exponent bits as ``float``, so that it matches its dynamic
3328        range, but with greatly reduced precision.  Used in Intel's AVX-512 BF16
3329        extensions and Arm's ARMv8.6-A extensions, among others.
3330
3331    * - ``float``
3332      - 32-bit floating-point value
3333
3334    * - ``double``
3335      - 64-bit floating-point value
3336
3337    * - ``fp128``
3338      - 128-bit floating-point value (113-bit significand)
3339
3340    * - ``x86_fp80``
3341      -  80-bit floating-point value (X87)
3342
3343    * - ``ppc_fp128``
     - 128-bit floating-point value (two 64-bit doubles)
3345
The binary formats of half, float, double, and fp128 correspond to the
IEEE-754-2008 specifications for binary16, binary32, binary64, and binary128
respectively.
3349
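As an illustrative sketch, values can be converted between these types with
``fpext`` and ``fptrunc``. The function name is hypothetical:

.. code-block:: llvm

    define half @round_trip(half %h) {
      ; Widen binary16 -> binary32 -> binary64, then narrow back to binary16.
      %f = fpext half %h to float
      %d = fpext float %f to double
      %r = fptrunc double %d to half
      ret half %r
    }
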
3350 X86_amx Type
3351 """"""""""""
3352
3353 :Overview:
3354
3355 The x86_amx type represents a value held in an AMX tile register on an x86
machine. The operations allowed on it are quite limited. Only a few intrinsics
3357 are allowed: stride load and store, zero and dot product. No instruction is
3358 allowed for this type. There are no arguments, arrays, pointers, vectors
3359 or constants of this type.
3360
3361 :Syntax:
3362
3363 ::
3364
3365       x86_amx
3366
3367
3368 X86_mmx Type
3369 """"""""""""
3370
3371 :Overview:
3372
3373 The x86_mmx type represents a value held in an MMX register on an x86
3374 machine. The operations allowed on it are quite limited: parameters and
3375 return values, load and store, and bitcast. User-specified MMX
3376 instructions are represented as intrinsic or asm calls with arguments
3377 and/or results of this type. There are no arrays, vectors or constants
3378 of this type.
3379
3380 :Syntax:
3381
3382 ::
3383
3384       x86_mmx
3385
3386
3387 .. _t_pointer:
3388
3389 Pointer Type
3390 """"""""""""
3391
3392 :Overview:
3393
3394 The pointer type is used to specify memory locations. Pointers are
3395 commonly used to reference objects in memory.
3396
3397 Pointer types may have an optional address space attribute defining the
3398 numbered address space where the pointed-to object resides. The default
3399 address space is number zero. The semantics of non-zero address spaces
3400 are target-specific.
3401
3402 Note that LLVM does not permit pointers to void (``void*``) nor does it
3403 permit pointers to labels (``label*``). Use ``i8*`` instead.
3404
3405 LLVM is in the process of transitioning to
3406 `opaque pointers <OpaquePointers.html#opaque-pointers>`_.
3407 Opaque pointers do not have a pointee type. Rather, instructions
3408 interacting through pointers specify the type of the underlying memory
3409 they are interacting with. Opaque pointers are still in the process of
3410 being worked on and are not complete.
3411
3412 :Syntax:
3413
3414 ::
3415
3416       <type> *
3417       ptr
3418
3419 :Examples:
3420
3421 +-------------------------+--------------------------------------------------------------------------------------------------------------+
3422 | ``[4 x i32]*``          | A :ref:`pointer <t_pointer>` to :ref:`array <t_array>` of four ``i32`` values.                               |
3423 +-------------------------+--------------------------------------------------------------------------------------------------------------+
3424 | ``i32 (i32*) *``        | A :ref:`pointer <t_pointer>` to a :ref:`function <t_function>` that takes an ``i32*``, returning an ``i32``. |
3425 +-------------------------+--------------------------------------------------------------------------------------------------------------+
3426 | ``i32 addrspace(5)*``   | A :ref:`pointer <t_pointer>` to an ``i32`` value that resides in address space 5.                            |
3427 +-------------------------+--------------------------------------------------------------------------------------------------------------+
3428 | ``ptr``                 | An opaque pointer type to a value that resides in address space 0.                                           |
3429 +-------------------------+--------------------------------------------------------------------------------------------------------------+
3430 | ``ptr addrspace(5)``    | An opaque pointer type to a value that resides in address space 5.                                           |
3431 +-------------------------+--------------------------------------------------------------------------------------------------------------+
3432
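A minimal sketch using the typed pointer syntax shown in the examples above.
The function and parameter names are illustrative:

.. code-block:: llvm

    define i32 @sum(i32 addrspace(5)* %q, [4 x i32]* %arr) {
      ; Address of element 1 of the array, then load through it.
      %p = getelementptr inbounds [4 x i32], [4 x i32]* %arr, i64 0, i64 1
      %a = load i32, i32* %p, align 4
      ; Load from a pointer into address space 5 (semantics are target-specific).
      %b = load i32, i32 addrspace(5)* %q, align 4
      %s = add i32 %a, %b
      ret i32 %s
    }
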
3433 .. _t_vector:
3434
3435 Vector Type
3436 """""""""""
3437
3438 :Overview:
3439
3440 A vector type is a simple derived type that represents a vector of
elements. Vector types are used when multiple primitive data are
operated on in parallel using a single instruction (SIMD). A vector type
3443 requires a size (number of elements), an underlying primitive data type,
3444 and a scalable property to represent vectors where the exact hardware
3445 vector length is unknown at compile time. Vector types are considered
3446 :ref:`first class <t_firstclass>`.
3447
3448 :Memory Layout:
3449
In general, vector elements are laid out in memory in the same way as
:ref:`array types <t_array>`. Such an analogy works fine as long as the vector
elements are byte sized. However, when the elements of the vector aren't byte
sized it gets a bit more complicated. One way to describe the layout is by
describing what happens when a vector such as <N x iM> is bitcast to an
integer type with N*M bits, and then following the rules for storing such an
integer to memory.

A bitcast from a vector type to a scalar integer type will see the elements
being packed together (without padding). The order in which elements are
inserted in the integer depends on endianness. For little endian element zero
is put in the least significant bits of the integer, and for big endian
element zero is put in the most significant bits.
3463
3464 Using a vector such as ``<i4 1, i4 2, i4 3, i4 5>`` as an example, together
3465 with the analogy that we can replace a vector store by a bitcast followed by
3466 an integer store, we get this for big endian:
3467
3468 .. code-block:: llvm
3469
3470       %val = bitcast <4 x i4> <i4 1, i4 2, i4 3, i4 5> to i16
3471
3472       ; Bitcasting from a vector to an integral type can be seen as
3473       ; concatenating the values:
3474       ;   %val now has the hexadecimal value 0x1235.
3475
3476       store i16 %val, i16* %ptr
3477
3478       ; In memory the content will be (8-bit addressing):
3479       ;
3480       ;    [%ptr + 0]: 00010010  (0x12)
3481       ;    [%ptr + 1]: 00110101  (0x35)
3482
3483 The same example for little endian:
3484
3485 .. code-block:: llvm
3486
3487       %val = bitcast <4 x i4> <i4 1, i4 2, i4 3, i4 5> to i16
3488
3489       ; Bitcasting from a vector to an integral type can be seen as
3490       ; concatenating the values:
3491       ;   %val now has the hexadecimal value 0x5321.
3492
3493       store i16 %val, i16* %ptr
3494
3495       ; In memory the content will be (8-bit addressing):
3496       ;
      ;    [%ptr + 0]: 00100001  (0x21)
      ;    [%ptr + 1]: 01010011  (0x53)
3499
3500 When ``<N*M>`` isn't evenly divisible by the byte size the exact memory layout
3501 is unspecified (just like it is for an integral type of the same size). This
3502 is because different targets could put the padding at different positions when
3503 the type size is smaller than the type's store size.
3504
3505 :Syntax:
3506
3507 ::
3508
3509       < <# elements> x <elementtype> >          ; Fixed-length vector
3510       < vscale x <# elements> x <elementtype> > ; Scalable vector
3511
3512 The number of elements is a constant integer value larger than 0;
3513 elementtype may be any integer, floating-point or pointer type. Vectors
3514 of size zero are not allowed. For scalable vectors, the total number of
3515 elements is a constant multiple (called vscale) of the specified number
3516 of elements; vscale is a positive integer that is unknown at compile time
3517 and the same hardware-dependent constant for all scalable vectors at run
3518 time. The size of a specific scalable vector type is thus constant within
3519 IR, even if the exact size in bytes cannot be determined until run time.
3520
3521 :Examples:
3522
3523 +------------------------+----------------------------------------------------+
3524 | ``<4 x i32>``          | Vector of 4 32-bit integer values.                 |
3525 +------------------------+----------------------------------------------------+
3526 | ``<8 x float>``        | Vector of 8 32-bit floating-point values.          |
3527 +------------------------+----------------------------------------------------+
3528 | ``<2 x i64>``          | Vector of 2 64-bit integer values.                 |
3529 +------------------------+----------------------------------------------------+
3530 | ``<4 x i64*>``         | Vector of 4 pointers to 64-bit integer values.     |
3531 +------------------------+----------------------------------------------------+
3532 | ``<vscale x 4 x i32>`` | Vector with a multiple of 4 32-bit integer values. |
3533 +------------------------+----------------------------------------------------+
3534
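A brief sketch of fixed-length and scalable vectors in use. The function names
are hypothetical:

.. code-block:: llvm

    define i32 @lane0_of_sum(<4 x i32> %a, <4 x i32> %b) {
      ; Element-wise add of all four lanes, then extract lane 0.
      %s = add <4 x i32> %a, %b
      %e = extractelement <4 x i32> %s, i32 0
      ret i32 %e
    }

    define <vscale x 4 x i32> @scalable_sum(<vscale x 4 x i32> %a,
                                            <vscale x 4 x i32> %b) {
      ; The same operation on vscale * 4 lanes; vscale is fixed at run time.
      %s = add <vscale x 4 x i32> %a, %b
      ret <vscale x 4 x i32> %s
    }
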
3535 .. _t_label:
3536
3537 Label Type
3538 ^^^^^^^^^^
3539
3540 :Overview:
3541
3542 The label type represents code labels.
3543
3544 :Syntax:
3545
3546 ::
3547
3548       label
3549
3550 .. _t_token:
3551
3552 Token Type
3553 ^^^^^^^^^^
3554
3555 :Overview:
3556
3557 The token type is used when a value is associated with an instruction
3558 but all uses of the value must not attempt to introspect or obscure it.
3559 As such, it is not appropriate to have a :ref:`phi <i_phi>` or
3560 :ref:`select <i_select>` of type token.
3561
3562 :Syntax:
3563
3564 ::
3565
3566       token
3567
3568
3569
3570 .. _t_metadata:
3571
3572 Metadata Type
3573 ^^^^^^^^^^^^^
3574
3575 :Overview:
3576
3577 The metadata type represents embedded metadata. No derived types may be
3578 created from metadata except for :ref:`function <t_function>` arguments.
3579
3580 :Syntax:
3581
3582 ::
3583
3584       metadata
3585
3586 .. _t_aggregate:
3587
3588 Aggregate Types
3589 ^^^^^^^^^^^^^^^
3590
3591 Aggregate Types are a subset of derived types that can contain multiple
3592 member types. :ref:`Arrays <t_array>` and :ref:`structs <t_struct>` are
3593 aggregate types. :ref:`Vectors <t_vector>` are not considered to be
3594 aggregate types.
3595
3596 .. _t_array:
3597
3598 Array Type
3599 """"""""""
3600
3601 :Overview:
3602
3603 The array type is a very simple derived type that arranges elements
3604 sequentially in memory. The array type requires a size (number of
3605 elements) and an underlying data type.
3606
3607 :Syntax:
3608
3609 ::
3610
3611       [<# elements> x <elementtype>]
3612
3613 The number of elements is a constant integer value; ``elementtype`` may
3614 be any type with a size.
3615
3616 :Examples:
3617
3618 +------------------+--------------------------------------+
3619 | ``[40 x i32]``   | Array of 40 32-bit integer values.   |
3620 +------------------+--------------------------------------+
3621 | ``[41 x i32]``   | Array of 41 32-bit integer values.   |
3622 +------------------+--------------------------------------+
3623 | ``[4 x i8]``     | Array of 4 8-bit integer values.     |
3624 +------------------+--------------------------------------+
3625
3626 Here are some examples of multidimensional arrays:
3627
3628 +-----------------------------+----------------------------------------------------------+
3629 | ``[3 x [4 x i32]]``         | 3x4 array of 32-bit integer values.                      |
3630 +-----------------------------+----------------------------------------------------------+
3631 | ``[12 x [10 x float]]``     | 12x10 array of single precision floating-point values.   |
3632 +-----------------------------+----------------------------------------------------------+
3633 | ``[2 x [3 x [4 x i16]]]``   | 2x3x4 array of 16-bit integer values.                    |
3634 +-----------------------------+----------------------------------------------------------+
3635
3636 There is no restriction on indexing beyond the end of the array implied
3637 by a static type (though there are restrictions on indexing beyond the
3638 bounds of an allocated object in some cases). This means that
3639 single-dimension 'variable sized array' addressing can be implemented in
3640 LLVM with a zero length array type. An implementation of 'pascal style
3641 arrays' in LLVM could use the type "``{ i32, [0 x float]}``", for
3642 example.
3643
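A sketch of the 'pascal style array' idea described above, assuming the ``i32``
field holds the length. The type and function names are hypothetical:

.. code-block:: llvm

    %pascal = type { i32, [0 x float] }

    define float @element(%pascal* %arr, i64 %i) {
      ; Index past the zero-length static bound to reach element %i.
      %p = getelementptr %pascal, %pascal* %arr, i64 0, i32 1, i64 %i
      %v = load float, float* %p, align 4
      ret float %v
    }
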
3644 .. _t_struct:
3645
3646 Structure Type
3647 """"""""""""""
3648
3649 :Overview:
3650
3651 The structure type is used to represent a collection of data members
3652 together in memory. The elements of a structure may be any type that has
3653 a size.
3654
3655 Structures in memory are accessed using '``load``' and '``store``' by
3656 getting a pointer to a field with the '``getelementptr``' instruction.
3657 Structures in registers are accessed using the '``extractvalue``' and
3658 '``insertvalue``' instructions.
3659
3660 Structures may optionally be "packed" structures, which indicate that
3661 the alignment of the struct is one byte, and that there is no padding
3662 between the elements. In non-packed structs, padding between field types
3663 is inserted as defined by the DataLayout string in the module, which is
3664 required to match what the underlying code generator expects.
3665
3666 Structures can either be "literal" or "identified". A literal structure
3667 is defined inline with other types (e.g. ``{i32, i32}*``) whereas
3668 identified types are always defined at the top level with a name.
3669 Literal types are uniqued by their contents and can never be recursive
3670 or opaque since there is no way to write one. Identified types can be
recursive, can be opaque, and are never uniqued.
3672
3673 :Syntax:
3674
3675 ::
3676
3677       %T1 = type { <type list> }     ; Identified normal struct type
3678       %T2 = type <{ <type list> }>   ; Identified packed struct type
3679
3680 :Examples:
3681
3682 +------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
3683 | ``{ i32, i32, i32 }``        | A triple of three ``i32`` values                                                                                                                                                      |
3684 +------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
3685 | ``{ float, i32 (i32) * }``   | A pair, where the first element is a ``float`` and the second element is a :ref:`pointer <t_pointer>` to a :ref:`function <t_function>` that takes an ``i32``, returning an ``i32``.  |
3686 +------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
3687 | ``<{ i8, i32 }>``            | A packed struct known to be 5 bytes in size.                                                                                                                                          |
3688 +------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
3689
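The following sketch shows an identified recursive struct accessed in memory
and a literal struct accessed in a register. All names are illustrative:

.. code-block:: llvm

    %node = type { i32, %node* }

    ; In memory: getelementptr to a field, then load.
    define i32 @next_value(%node* %n) {
      %next.addr = getelementptr inbounds %node, %node* %n, i32 0, i32 1
      %next = load %node*, %node** %next.addr, align 8
      %val.addr = getelementptr inbounds %node, %node* %next, i32 0, i32 0
      %val = load i32, i32* %val.addr, align 4
      ret i32 %val
    }

    ; In a register: extractvalue on the aggregate itself.
    define i32 @first({ i32, i32 } %pair) {
      %a = extractvalue { i32, i32 } %pair, 0
      ret i32 %a
    }
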
3690 .. _t_opaque:
3691
3692 Opaque Structure Types
3693 """"""""""""""""""""""
3694
3695 :Overview:
3696
3697 Opaque structure types are used to represent structure types that
3698 do not have a body specified. This corresponds (for example) to the C
3699 notion of a forward declared structure. They can be named (``%X``) or
3700 unnamed (``%52``).
3701
3702 :Syntax:
3703
3704 ::
3705
3706       %X = type opaque
3707       %52 = type opaque
3708
3709 :Examples:
3710
3711 +--------------+-------------------+
3712 | ``opaque``   | An opaque type.   |
3713 +--------------+-------------------+
3714
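As a sketch, an opaque type is typically used only behind a pointer, much like a
forward-declared C struct. The names here are hypothetical:

.. code-block:: llvm

    %handle = type opaque

    declare %handle* @open_handle()
    declare void @close_handle(%handle*)

    define void @use() {
      ; The body of %handle is unknown, so it is only passed around by pointer.
      %h = call %handle* @open_handle()
      call void @close_handle(%handle* %h)
      ret void
    }
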
3715 .. _constants:
3716
3717 Constants
3718 =========
3719
3720 LLVM has several different basic types of constants. This section
3721 describes them all and their syntax.
3722
3723 Simple Constants
3724 ----------------
3725
3726 **Boolean constants**
3727     The two strings '``true``' and '``false``' are both valid constants
3728     of the ``i1`` type.
3729 **Integer constants**
3730     Standard integers (such as '4') are constants of the
3731     :ref:`integer <t_integer>` type. Negative numbers may be used with
3732     integer types.
3733 **Floating-point constants**
3734     Floating-point constants use standard decimal notation (e.g.
3735     123.421), exponential notation (e.g. 1.23421e+2), or a more precise
3736     hexadecimal notation (see below). The assembler requires the exact
3737     decimal value of a floating-point constant. For example, the
3738     assembler accepts 1.25 but rejects 1.3 because 1.3 is a repeating
3739     decimal in binary. Floating-point constants must have a
3740     :ref:`floating-point <t_floating>` type.
3741 **Null pointer constants**
3742     The identifier '``null``' is recognized as a null pointer constant
3743     and must be of :ref:`pointer type <t_pointer>`.
3744 **Token constants**
3745     The identifier '``none``' is recognized as an empty token constant
3746     and must be of :ref:`token type <t_token>`.
3747
3748 The one non-intuitive notation for constants is the hexadecimal form of
3749 floating-point constants. For example, the form
3750 '``double    0x432ff973cafa8000``' is equivalent to (but harder to read
3751 than) '``double 4.5e+15``'. The only time hexadecimal floating-point
3752 constants are required (and the only time that they are generated by the
3753 disassembler) is when a floating-point constant must be emitted but it
3754 cannot be represented as a decimal floating-point number in a reasonable
number of digits. For example, NaNs, infinities, and other special
3756 values are represented in their IEEE hexadecimal format so that assembly
3757 and disassembly do not cause any bits to change in the constants.
3758
3759 When using the hexadecimal form, constants of types bfloat, half, float, and
3760 double are represented using the 16-digit form shown above (which matches the
3761 IEEE754 representation for double); bfloat, half and float values must, however,
3762 be exactly representable as bfloat, IEEE 754 half, and IEEE 754 single
3763 precision respectively. Hexadecimal format is always used for long double, and
3764 there are three forms of long double. The 80-bit format used by x86 is
3765 represented as ``0xK`` followed by 20 hexadecimal digits. The 128-bit format
3766 used by PowerPC (two adjacent doubles) is represented by ``0xM`` followed by 32
3767 hexadecimal digits. The IEEE 128-bit format is represented by ``0xL`` followed
3768 by 32 hexadecimal digits. Long doubles will only work if they match the long
3769 double format on your target.  The IEEE 16-bit format (half precision) is
3770 represented by ``0xH`` followed by 4 hexadecimal digits. The bfloat 16-bit
3771 format is represented by ``0xR`` followed by 4 hexadecimal digits. All
3772 hexadecimal formats are big-endian (sign bit at the left).
3773
3774 There are no constants of type x86_mmx and x86_amx.
3775
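A few illustrative global initializers using the notations described above.
The global names are hypothetical:

.. code-block:: llvm

    @flag    = global i1 true
    @count   = global i32 -7
    @ratio   = global double 1.25
    ; Hexadecimal form of double 4.5e+15.
    @big     = global double 0x432FF973CAFA8000
    ; +Inf as a float, written in the 16-digit double form.
    @inf     = global float 0x7FF0000000000000
    ; 1.0 as an IEEE half.
    @one     = global half 0xH3C00
    @nothing = global i32* null
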
3776 .. _complexconstants:
3777
3778 Complex Constants
3779 -----------------
3780
3781 Complex constants are a (potentially recursive) combination of simple
3782 constants and smaller complex constants.
3783
3784 **Structure constants**
3785     Structure constants are represented with notation similar to
    structure type definitions (a comma-separated list of elements,
3787     surrounded by braces (``{}``)). For example:
3788     "``{ i32 4, float 17.0, i32* @G }``", where "``@G``" is declared as
3789     "``@G = external global i32``". Structure constants must have
3790     :ref:`structure type <t_struct>`, and the number and types of elements
3791     must match those specified by the type.
3792 **Array constants**
3793     Array constants are represented with notation similar to array type
    definitions (a comma-separated list of elements, surrounded by
3795     square brackets (``[]``)). For example:
3796     "``[ i32 42, i32 11, i32 74 ]``". Array constants must have
3797     :ref:`array type <t_array>`, and the number and types of elements must
3798     match those specified by the type. As a special case, character array
3799     constants may also be represented as a double-quoted string using the ``c``
3800     prefix. For example: "``c"Hello World\0A\00"``".
3801 **Vector constants**
3802     Vector constants are represented with notation similar to vector
    type definitions (a comma-separated list of elements, surrounded by
    angle brackets (``<>``)). For example:
3805     "``< i32 42, i32 11, i32 74, i32 100 >``". Vector constants
3806     must have :ref:`vector type <t_vector>`, and the number and types of
3807     elements must match those specified by the type.
3808 **Zero initialization**
    The string '``zeroinitializer``' can be used to zero initialize a
    value of *any* type, including scalar and
3811     :ref:`aggregate <t_aggregate>` types. This is often used to avoid
3812     having to print large zero initializers (e.g. for large arrays) and
3813     is always exactly equivalent to using explicit zero initializers.
3814 **Metadata node**
3815     A metadata node is a constant tuple without types. For example:
3816     "``!{!0, !{!2, !0}, !"test"}``". Metadata can reference constant values,
3817     for example: "``!{!0, i32 0, i8* @global, i64 (i64)* @function, !"str"}``".
3818     Unlike other typed constants that are meant to be interpreted as part of
3819     the instruction stream, metadata is a place to attach additional
3820     information such as debug info.
3821
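The constant forms above can be combined in global initializers. A sketch with
hypothetical global names:

.. code-block:: llvm

    @G   = external global i32
    @s   = global { i32, float, i32* } { i32 4, float 17.0, i32* @G }
    @a   = global [3 x i32] [ i32 42, i32 11, i32 74 ]
    ; 11 characters plus a newline and a terminating NUL: 13 bytes.
    @str = constant [13 x i8] c"Hello World\0A\00"
    @v   = global <4 x i32> < i32 42, i32 11, i32 74, i32 100 >
    @z   = global [1024 x i32] zeroinitializer
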
3822 Global Variable and Function Addresses
3823 --------------------------------------
3824
3825 The addresses of :ref:`global variables <globalvars>` and
3826 :ref:`functions <functionstructure>` are always implicitly valid
3827 (link-time) constants. These constants are explicitly referenced when
3828 the :ref:`identifier for the global <identifiers>` is used and always have
3829 :ref:`pointer <t_pointer>` type. For example, the following is a legal LLVM
3830 file:
3831
3832 .. code-block:: llvm
3833
3834     @X = global i32 17
3835     @Y = global i32 42
3836     @Z = global [2 x i32*] [ i32* @X, i32* @Y ]
3837
3838 .. _undefvalues:
3839
3840 Undefined Values
3841 ----------------
3842
3843 The string '``undef``' can be used anywhere a constant is expected, and
3844 indicates that the user of the value may receive an unspecified
3845 bit-pattern. Undefined values may be of any type (other than '``label``'
3846 or '``void``') and be used anywhere a constant is permitted.
3847
3848 Undefined values are useful because they indicate to the compiler that
3849 the program is well defined no matter what value is used. This gives the
3850 compiler more freedom to optimize. Here are some examples of
3851 (potentially surprising) transformations that are valid (in pseudo IR):
3852
3853 .. code-block:: llvm
3854
3855       %A = add %X, undef
3856       %B = sub %X, undef
3857       %C = xor %X, undef
3858     Safe:
3859       %A = undef
3860       %B = undef
3861       %C = undef
3862
3863 This is safe because all of the output bits are affected by the undef
3864 bits. Any output bit can have a zero or one depending on the input bits.
3865
3866 .. code-block:: llvm
3867
3868       %A = or %X, undef
3869       %B = and %X, undef
3870     Safe:
3871       %A = -1
3872       %B = 0
3873     Safe:
3874       %A = %X  ;; By choosing undef as 0
3875       %B = %X  ;; By choosing undef as -1
3876     Unsafe:
3877       %A = undef
3878       %B = undef
3879
3880 These logical operations have bits that are not always affected by the
3881 input. For example, if ``%X`` has a zero bit, then the output of the
3882 '``and``' operation will always be a zero for that bit, no matter what
3883 the corresponding bit from the '``undef``' is. As such, it is unsafe to
3884 optimize or assume that the result of the '``and``' is '``undef``'.
3885 However, it is safe to assume that all bits of the '``undef``' could be
3886 0, and optimize the '``and``' to 0. Likewise, it is safe to assume that
3887 all the bits of the '``undef``' operand to the '``or``' could be set,
3888 allowing the '``or``' to be folded to -1.
3889
3890 .. code-block:: llvm
3891
3892       %A = select undef, %X, %Y
3893       %B = select undef, 42, %Y
3894       %C = select %X, %Y, undef
3895     Safe:
3896       %A = %X     (or %Y)
3897       %B = 42     (or %Y)
3898       %C = %Y
3899     Unsafe:
3900       %A = undef
3901       %B = undef
3902       %C = undef
3903
3904 This set of examples shows that undefined '``select``' (and conditional
3905 branch) conditions can go *either way*, but they have to come from one
3906 of the two operands. In the ``%A`` example, if ``%X`` and ``%Y`` were
3907 both known to have a clear low bit, then ``%A`` would have to have a
3908 cleared low bit. However, in the ``%C`` example, the optimizer is
3909 allowed to assume that the '``undef``' operand could be the same as
3910 ``%Y``, allowing the whole '``select``' to be eliminated.
3911
3912 .. code-block:: llvm
3913
      %A = xor undef, undef

      %B = undef
      %C = xor %B, %B

      %D = undef
      %E = icmp slt %D, 4
      %F = icmp sge %D, 4

    Safe:
      %A = undef
      %B = undef
      %C = undef
      %D = undef
      %E = undef
      %F = undef
3930
This example points out that two '``undef``' operands are not
necessarily the same. This can surprise people who assume that
"``X^X``" is always zero, even if ``X`` is undefined (the behavior also
matches C semantics). This isn't true for a number of reasons, but the
short answer is that an '``undef``' "variable" can arbitrarily change
its value over its "live range". This is true because the variable
doesn't actually *have a live range*. Instead, the value is logically
read from arbitrary registers that happen to be around when needed, so
the value is not necessarily consistent over time. In fact, ``%A`` and
``%C`` need to have the same semantics or the core LLVM "replace all
uses with" concept would not hold.
3942
3943 To ensure all uses of a given register observe the same value (even if
3944 '``undef``'), the :ref:`freeze instruction <i_freeze>` can be used.
3945
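A minimal sketch of this use of ``freeze`` (the function name
``@xor_frozen`` is made up for illustration): both operands of the
'``xor``' read the same frozen value, so the result is zero.

.. code-block:: llvm

    define i32 @xor_frozen() {
      ; %f is some arbitrary i32, but every use of %f sees the same bits.
      %f = freeze i32 undef
      ; Both operands are identical, so this can be folded to 0.
      %c = xor i32 %f, %f
      ret i32 %c
    }
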
3946 .. code-block:: llvm
3947
      %A = sdiv undef, %X
      %B = sdiv %X, undef
    Safe:
      %A = 0
      %B: unreachable
3953
3954 These examples show the crucial difference between an *undefined value*
3955 and *undefined behavior*. An undefined value (like '``undef``') is
3956 allowed to have an arbitrary bit-pattern. This means that the ``%A``
3957 operation can be constant folded to '``0``', because the '``undef``'
3958 could be zero, and zero divided by any value is zero.
3959 However, in the second example, we can make a more aggressive
3960 assumption: because the ``undef`` is allowed to be an arbitrary value,
3961 we are allowed to assume that it could be zero. Since a divide by zero
3962 has *undefined behavior*, we are allowed to assume that the operation
3963 does not execute at all. This allows us to delete the divide and all
3964 code after it. Because the undefined operation "can't happen", the
3965 optimizer can assume that it occurs in dead code.
3966
3967 .. code-block:: text
3968
3969     a:  store undef -> %X
3970     b:  store %X -> undef
3971     Safe:
3972     a: <deleted>
3973     b: unreachable
3974
3975 A store *of* an undefined value can be assumed to not have any effect;
3976 we can assume that the value is overwritten with bits that happen to
3977 match what was already there. However, a store *to* an undefined
3978 location could clobber arbitrary memory, therefore, it has undefined
3979 behavior.
3980
Branching on an undefined value is undefined behavior.
This rule is what justifies optimizations that use branch conditions to
construct predicates, such as Correlated Value Propagation and Global Value
Numbering. In the case of a switch instruction, the condition should be
frozen; otherwise, it is undefined behavior.
3986
3987 .. code-block:: llvm
3988
3989     Unsafe:
3990       br undef, BB1, BB2 ; UB
3991
3992       %X = and i32 undef, 255
3993       switch %X, label %ret [ .. ] ; UB
3994
3995       store undef, i8* %ptr
3996       %X = load i8* %ptr ; %X is undef
3997       switch i8 %X, label %ret [ .. ] ; UB
3998
3999     Safe:
4000       %X = or i8 undef, 255 ; always 255
4001       switch i8 %X, label %ret [ .. ] ; Well-defined
4002
4003       %X = freeze i1 undef
4004       br %X, BB1, BB2 ; Well-defined (non-deterministic jump)
4005
4006
This is also consistent with the behavior of MemorySanitizer.
MemorySanitizer, a detector of uses of uninitialized memory,
defines a branch whose condition depends on an undef value (or
certain other values, e.g. the result of a load from heap-allocated
memory that has never been stored to) to have an externally visible
side effect. For this reason, functions with the *sanitize_memory*
attribute are not allowed to produce such branches "out of thin
air". More strictly, an optimization that inserts a conditional branch
is only valid if in all executions where the branch condition has at
least one undefined bit, the same branch condition is evaluated in the
input IR as well.
4018
4019 .. _poisonvalues:
4020
4021 Poison Values
4022 -------------
4023
4024 A poison value is a result of an erroneous operation.
4025 In order to facilitate speculative execution, many instructions do not
4026 invoke immediate undefined behavior when provided with illegal operands,
4027 and return a poison value instead.
4028 The string '``poison``' can be used anywhere a constant is expected, and
4029 operations such as :ref:`add <i_add>` with the ``nsw`` flag can produce
4030 a poison value.
4031
4032 Poison value behavior is defined in terms of value *dependence*:
4033
4034 -  Values other than :ref:`phi <i_phi>` nodes, :ref:`select <i_select>`, and
4035    :ref:`freeze <i_freeze>` instructions depend on their operands.
4036 -  :ref:`Phi <i_phi>` nodes depend on the operand corresponding to
4037    their dynamic predecessor basic block.
4038 -  :ref:`Select <i_select>` instructions depend on their condition operand and
4039    their selected operand.
4040 -  Function arguments depend on the corresponding actual argument values
4041    in the dynamic callers of their functions.
4042 -  :ref:`Call <i_call>` instructions depend on the :ref:`ret <i_ret>`
4043    instructions that dynamically transfer control back to them.
4044 -  :ref:`Invoke <i_invoke>` instructions depend on the
4045    :ref:`ret <i_ret>`, :ref:`resume <i_resume>`, or exception-throwing
4046    call instructions that dynamically transfer control back to them.
-  Non-volatile loads and stores depend on the most recent stores to all
   of the referenced memory addresses, following the order in the IR
   (including loads and stores implied by intrinsics such as
   :ref:`@llvm.memcpy <int_memcpy>`).
4051 -  An instruction with externally visible side effects depends on the
4052    most recent preceding instruction with externally visible side
4053    effects, following the order in the IR. (This includes :ref:`volatile
4054    operations <volatile>`.)
4055 -  An instruction *control-depends* on a :ref:`terminator
4056    instruction <terminators>` if the terminator instruction has
4057    multiple successors and the instruction is always executed when
4058    control transfers to one of the successors, and may not be executed
4059    when control is transferred to another.
4060 -  Additionally, an instruction also *control-depends* on a terminator
4061    instruction if the set of instructions it otherwise depends on would
4062    be different if the terminator had transferred control to a different
4063    successor.
4064 -  Dependence is transitive.
4065 -  Vector elements may be independently poisoned. Therefore, transforms
4066    on instructions such as shufflevector must be careful to propagate
4067    poison across values or elements only as allowed by the original code.
4068
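As a small sketch of the :ref:`select <i_select>` rule above (the function
name ``@select_dep`` is hypothetical), the result depends only on the
condition and on the operand that is actually selected:

.. code-block:: llvm

    define i32 @select_dep(i1 %c, i32 %x) {
      ; %p is poison when %x is the largest signed i32 value.
      %p = add nsw i32 %x, 1
      ; %s depends on %c and on whichever operand is selected. If %c is
      ; true, %s is 0 even when %p is poison.
      %s = select i1 %c, i32 0, i32 %p
      ret i32 %s
    }
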
An instruction that *depends* on a poison value produces a poison value
itself. A poison value may be relaxed into an
:ref:`undef value <undefvalues>`, which takes an arbitrary bit-pattern.
Propagation of poison can be stopped with the
:ref:`freeze instruction <i_freeze>`.
4074
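The following sketch (with a hypothetical function ``@stop_poison``) shows
``freeze`` cutting off poison propagation, so that later instructions
operate on an ordinary value:

.. code-block:: llvm

    define i32 @stop_poison(i32 %x) {
      ; Poison on signed overflow.
      %p = add nsw i32 %x, 1
      ; Equal to %p unless %p is poison, in which case %f is an
      ; arbitrary but fixed i32.
      %f = freeze i32 %p
      ; Never poison: the result is 0 or 1.
      %r = and i32 %f, 1
      ret i32 %r
    }
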
This means that immediate undefined behavior occurs if a poison value is
used as an instruction operand for which some values trigger undefined
behavior. Notably this includes (but is not limited to):
4078
4079 -  The pointer operand of a :ref:`load <i_load>`, :ref:`store <i_store>` or
4080    any other pointer dereferencing instruction (independent of address
4081    space).
4082 -  The divisor operand of a ``udiv``, ``sdiv``, ``urem`` or ``srem``
4083    instruction.
4084 -  The condition operand of a :ref:`br <i_br>` instruction.
4085 -  The callee operand of a :ref:`call <i_call>` or :ref:`invoke <i_invoke>`
4086    instruction.
4087 -  The parameter operand of a :ref:`call <i_call>` or :ref:`invoke <i_invoke>`
4088    instruction, when the function or invoking call site has a ``noundef``
4089    attribute in the corresponding position.
-  The operand of a :ref:`ret <i_ret>` instruction if the function or invoking
   call site has a ``noundef`` attribute in the return value position.
4092
4093 Here are some examples:
4094
4095 .. code-block:: llvm
4096
4097     entry:
4098       %poison = sub nuw i32 0, 1           ; Results in a poison value.
4099       %poison2 = sub i32 poison, 1         ; Also results in a poison value.
4100       %still_poison = and i32 %poison, 0   ; 0, but also poison.
4101       %poison_yet_again = getelementptr i32, i32* @h, i32 %still_poison
4102       store i32 0, i32* %poison_yet_again  ; Undefined behavior due to
4103                                            ; store to poison.
4104
4105       store i32 %poison, i32* @g           ; Poison value stored to memory.
4106       %poison3 = load i32, i32* @g         ; Poison value loaded back from memory.
4107
4108       %narrowaddr = bitcast i32* @g to i16*
4109       %wideaddr = bitcast i32* @g to i64*
4110       %poison4 = load i16, i16* %narrowaddr ; Returns a poison value.
4111       %poison5 = load i64, i64* %wideaddr   ; Returns a poison value.
4112
4113       %cmp = icmp slt i32 %poison, 0       ; Returns a poison value.
4114       br i1 %cmp, label %end, label %end   ; undefined behavior
4115
4116     end:
4117
4118 .. _welldefinedvalues:
4119
4120 Well-Defined Values
4121 -------------------
4122
4123 Given a program execution, a value is *well defined* if the value does not
4124 have an undef bit and is not poison in the execution.
4125 An aggregate value or vector is well defined if its elements are well defined.
4126 The padding of an aggregate isn't considered, since it isn't visible
4127 without storing it into memory and loading it with a different type.
4128
A constant of a :ref:`single value <t_single_value>`, non-vector type is well
defined if it is neither an '``undef``' constant nor a '``poison``' constant.
The result of a :ref:`freeze instruction <i_freeze>` is well defined regardless
of its operand.
4133
4134 .. _blockaddress:
4135
4136 Addresses of Basic Blocks
4137 -------------------------
4138
4139 ``blockaddress(@function, %block)``
4140
4141 The '``blockaddress``' constant computes the address of the specified
4142 basic block in the specified function.
4143
4144 It always has an ``i8 addrspace(P)*`` type, where ``P`` is the address space
4145 of the function containing ``%block`` (usually ``addrspace(0)``).
4146
4147 Taking the address of the entry block is illegal.
4148
This value only has defined behavior when used as an operand to the
':ref:`indirectbr <i_indirectbr>`' or ':ref:`callbr <i_callbr>`' instruction, or
for comparisons against null. Pointer equality tests between label addresses
result in undefined behavior, although comparison against null is still ok
and no label is equal to the null pointer. This may be passed around as an
opaque pointer sized value as long as the bits are not inspected. This
allows ``ptrtoint`` and arithmetic to be performed on these values so
long as the original value is reconstituted before the ``indirectbr`` or
``callbr`` instruction.
4158
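A sketch of the typical use, assuming a hypothetical function ``@dispatch``
and jump table ``@targets``: block addresses are stored in a constant
table and only ever consumed by ``indirectbr``.

.. code-block:: llvm

    @targets = private constant [2 x i8*] [
      i8* blockaddress(@dispatch, %even),
      i8* blockaddress(@dispatch, %odd)
    ]

    define i32 @dispatch(i32 %n) {
    entry:
      ; Select a table slot from the low bit of %n.
      %bit = and i32 %n, 1
      %slot = getelementptr inbounds [2 x i8*], [2 x i8*]* @targets, i32 0, i32 %bit
      %dest = load i8*, i8** %slot
      ; Every possible destination must be listed here.
      indirectbr i8* %dest, [label %even, label %odd]

    even:
      ret i32 0

    odd:
      ret i32 1
    }
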
4159 Finally, some targets may provide defined semantics when using the value
4160 as the operand to an inline assembly, but that is target specific.
4161
4162 .. _dso_local_equivalent:
4163
4164 DSO Local Equivalent
4165 --------------------
4166
4167 ``dso_local_equivalent @func``
4168
4169 A '``dso_local_equivalent``' constant represents a function which is
4170 functionally equivalent to a given function, but is always defined in the
4171 current linkage unit. The resulting pointer has the same type as the underlying
4172 function. The resulting pointer is permitted, but not required, to be different
4173 from a pointer to the function, and it may have different values in different
4174 translation units.
4175
4176 The target function may not have ``extern_weak`` linkage.
4177
4178 ``dso_local_equivalent`` can be implemented as such:
4179
4180 - If the function has local linkage, hidden visibility, or is
4181   ``dso_local``, ``dso_local_equivalent`` can be implemented as simply a pointer
4182   to the function.
4183 - ``dso_local_equivalent`` can be implemented with a stub that tail-calls the
4184   function. Many targets support relocations that resolve at link time to either
4185   a function or a stub for it, depending on if the function is defined within the
4186   linkage unit; LLVM will use this when available. (This is commonly called a
4187   "PLT stub".) On other targets, the stub may need to be emitted explicitly.
4188
4189 This can be used wherever a ``dso_local`` instance of a function is needed without
4190 needing to explicitly make the original function ``dso_local``. An instance where
4191 this can be used is for static offset calculations between a function and some other
4192 ``dso_local`` symbol. This is especially useful for the Relative VTables C++ ABI,
4193 where dynamic relocations for function pointers in VTables can be replaced with
4194 static relocations for offsets between the VTable and virtual functions which
4195 may not be ``dso_local``.
4196
4197 This is currently only supported for ELF binary formats.
4198
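As a hedged illustration of the offset calculation described above (the
names ``@extern_func`` and ``@vtable`` are hypothetical, and the layout is
only a sketch of a relative-vtable-style entry), a 32-bit offset between a
``dso_local`` global and a possibly non-``dso_local`` function could be
written as:

.. code-block:: llvm

    declare void @extern_func()

    ; The entry holds the distance from @vtable to a dso-local equivalent
    ; of @extern_func, which can be resolved with a static relocation.
    @vtable = dso_local unnamed_addr constant [1 x i32] [
      i32 trunc (i64 sub (i64 ptrtoint (void ()* dso_local_equivalent @extern_func to i64),
                          i64 ptrtoint ([1 x i32]* @vtable to i64)) to i32)
    ]
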
4199 .. _constantexprs:
4200
4201 Constant Expressions
4202 --------------------
4203
4204 Constant expressions are used to allow expressions involving other
4205 constants to be used as constants. Constant expressions may be of any
4206 :ref:`first class <t_firstclass>` type and may involve any LLVM operation
4207 that does not have side effects (e.g. load and call are not supported).
4208 The following is the syntax for constant expressions:
4209
4210 ``trunc (CST to TYPE)``
4211     Perform the :ref:`trunc operation <i_trunc>` on constants.
4212 ``zext (CST to TYPE)``
4213     Perform the :ref:`zext operation <i_zext>` on constants.
4214 ``sext (CST to TYPE)``
4215     Perform the :ref:`sext operation <i_sext>` on constants.
4216 ``fptrunc (CST to TYPE)``
4217     Truncate a floating-point constant to another floating-point type.
4218     The size of CST must be larger than the size of TYPE. Both types
4219     must be floating-point.
4220 ``fpext (CST to TYPE)``
4221     Floating-point extend a constant to another type. The size of CST
4222     must be smaller or equal to the size of TYPE. Both types must be
4223     floating-point.
4224 ``fptoui (CST to TYPE)``
4225     Convert a floating-point constant to the corresponding unsigned
4226     integer constant. TYPE must be a scalar or vector integer type. CST
4227     must be of scalar or vector floating-point type. Both CST and TYPE
4228     must be scalars, or vectors of the same number of elements. If the
4229     value won't fit in the integer type, the result is a
4230     :ref:`poison value <poisonvalues>`.
4231 ``fptosi (CST to TYPE)``
4232     Convert a floating-point constant to the corresponding signed
4233     integer constant. TYPE must be a scalar or vector integer type. CST
4234     must be of scalar or vector floating-point type. Both CST and TYPE
4235     must be scalars, or vectors of the same number of elements. If the
4236     value won't fit in the integer type, the result is a
4237     :ref:`poison value <poisonvalues>`.
4238 ``uitofp (CST to TYPE)``
4239     Convert an unsigned integer constant to the corresponding
4240     floating-point constant. TYPE must be a scalar or vector floating-point
4241     type.  CST must be of scalar or vector integer type. Both CST and TYPE must
4242     be scalars, or vectors of the same number of elements.
4243 ``sitofp (CST to TYPE)``
4244     Convert a signed integer constant to the corresponding floating-point
4245     constant. TYPE must be a scalar or vector floating-point type.
4246     CST must be of scalar or vector integer type. Both CST and TYPE must
4247     be scalars, or vectors of the same number of elements.
4248 ``ptrtoint (CST to TYPE)``
4249     Perform the :ref:`ptrtoint operation <i_ptrtoint>` on constants.
4250 ``inttoptr (CST to TYPE)``
4251     Perform the :ref:`inttoptr operation <i_inttoptr>` on constants.
4252     This one is *really* dangerous!
4253 ``bitcast (CST to TYPE)``
4254     Convert a constant, CST, to another TYPE.
4255     The constraints of the operands are the same as those for the
4256     :ref:`bitcast instruction <i_bitcast>`.
4257 ``addrspacecast (CST to TYPE)``
    Convert a constant pointer or constant vector of pointers, CST, to another
    TYPE in a different address space. The constraints of the operands are the
    same as those for the :ref:`addrspacecast instruction <i_addrspacecast>`.
4261 ``getelementptr (TY, CSTPTR, IDX0, IDX1, ...)``, ``getelementptr inbounds (TY, CSTPTR, IDX0, IDX1, ...)``
4262     Perform the :ref:`getelementptr operation <i_getelementptr>` on
4263     constants. As with the :ref:`getelementptr <i_getelementptr>`
4264     instruction, the index list may have one or more indexes, which are
4265     required to make sense for the type of "pointer to TY".
4266 ``select (COND, VAL1, VAL2)``
4267     Perform the :ref:`select operation <i_select>` on constants.
4268 ``icmp COND (VAL1, VAL2)``
4269     Perform the :ref:`icmp operation <i_icmp>` on constants.
4270 ``fcmp COND (VAL1, VAL2)``
4271     Perform the :ref:`fcmp operation <i_fcmp>` on constants.
4272 ``extractelement (VAL, IDX)``
4273     Perform the :ref:`extractelement operation <i_extractelement>` on
4274     constants.
4275 ``insertelement (VAL, ELT, IDX)``
4276     Perform the :ref:`insertelement operation <i_insertelement>` on
4277     constants.
4278 ``shufflevector (VEC1, VEC2, IDXMASK)``
4279     Perform the :ref:`shufflevector operation <i_shufflevector>` on
4280     constants.
4281 ``extractvalue (VAL, IDX0, IDX1, ...)``
4282     Perform the :ref:`extractvalue operation <i_extractvalue>` on
4283     constants. The index list is interpreted in a similar manner as
4284     indices in a ':ref:`getelementptr <i_getelementptr>`' operation. At
4285     least one index value must be specified.
4286 ``insertvalue (VAL, ELT, IDX0, IDX1, ...)``
4287     Perform the :ref:`insertvalue operation <i_insertvalue>` on constants.
4288     The index list is interpreted in a similar manner as indices in a
4289     ':ref:`getelementptr <i_getelementptr>`' operation. At least one index
4290     value must be specified.
4291 ``OPCODE (LHS, RHS)``
    Perform the specified operation on the LHS and RHS constants. OPCODE
    may be any of the :ref:`binary <binaryops>` or :ref:`bitwise
    binary <bitwiseops>` operations. The constraints on operands are
    the same as those for the corresponding instruction (e.g. no bitwise
    operations on floating-point values are allowed).
4297
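For illustration, a few of these forms used as global initializers (the
global names are hypothetical; this is a sketch rather than an exhaustive
list):

.. code-block:: llvm

    @arr = global [4 x i32] [ i32 1, i32 2, i32 3, i32 4 ]

    ; Address of the third element of @arr.
    @third = global i32* getelementptr inbounds ([4 x i32], [4 x i32]* @arr, i64 0, i64 2)

    ; The address of @arr converted to an integer.
    @addr = global i64 ptrtoint ([4 x i32]* @arr to i64)

    ; A binary operation on two integer constants.
    @sum = global i32 add (i32 40, i32 2)
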
4298 Other Values
4299 ============
4300
4301 .. _inlineasmexprs:
4302
4303 Inline Assembler Expressions
4304 ----------------------------
4305
4306 LLVM supports inline assembler expressions (as opposed to :ref:`Module-Level
4307 Inline Assembly <moduleasm>`) through the use of a special value. This value
4308 represents the inline assembler as a template string (containing the
4309 instructions to emit), a list of operand constraints (stored as a string), a
4310 flag that indicates whether or not the inline asm expression has side effects,
4311 and a flag indicating whether the function containing the asm needs to align its
4312 stack conservatively.
4313
4314 The template string supports argument substitution of the operands using "``$``"
4315 followed by a number, to indicate substitution of the given register/memory
4316 location, as specified by the constraint string. "``${NUM:MODIFIER}``" may also
4317 be used, where ``MODIFIER`` is a target-specific annotation for how to print the
4318 operand (See :ref:`inline-asm-modifiers`).
4319
4320 A literal "``$``" may be included by using "``$$``" in the template. To include
4321 other special characters into the output, the usual "``\XX``" escapes may be
4322 used, just as in other strings. Note that after template substitution, the
4323 resulting assembly string is parsed by LLVM's integrated assembler unless it is
4324 disabled -- even when emitting a ``.s`` file -- and thus must contain assembly
4325 syntax known to LLVM.
4326
4327 LLVM also supports a few more substitutions useful for writing inline assembly:
4328
4329 - ``${:uid}``: Expands to a decimal integer unique to this inline assembly blob.
4330   This substitution is useful when declaring a local label. Many standard
4331   compiler optimizations, such as inlining, may duplicate an inline asm blob.
4332   Adding a blob-unique identifier ensures that the two labels will not conflict
4333   during assembly. This is used to implement `GCC's %= special format
4334   string <https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html>`_.
4335 - ``${:comment}``: Expands to the comment character of the current target's
4336   assembly dialect. This is usually ``#``, but many targets use other strings,
4337   such as ``;``, ``//``, or ``!``.
4338 - ``${:private}``: Expands to the assembler private label prefix. Labels with
4339   this prefix will not appear in the symbol table of the assembled object.
4340   Typically the prefix is ``L``, but targets may use other strings. ``.L`` is
4341   relatively popular.
4342
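A sketch combining the three substitutions above, assuming an x86 target
(the ``jmp`` and ``nop`` mnemonics and the function name ``@skip_nop`` are
assumptions made for this example). The label stays unique even if the asm
blob is duplicated by inlining:

.. code-block:: llvm

    define void @skip_nop() {
      ; Jump over a nop to a private label that is unique to this blob.
      call void asm sideeffect "jmp ${:private}skip${:uid}\0A\09nop ${:comment} never executed\0A${:private}skip${:uid}:", ""()
      ret void
    }
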
4343 LLVM's support for inline asm is modeled closely on the requirements of Clang's
4344 GCC-compatible inline-asm support. Thus, the feature-set and the constraint and
4345 modifier codes listed here are similar or identical to those in GCC's inline asm
4346 support. However, to be clear, the syntax of the template and constraint strings
4347 described here is *not* the same as the syntax accepted by GCC and Clang, and,
4348 while most constraint letters are passed through as-is by Clang, some get
4349 translated to other codes when converting from the C source to the LLVM
4350 assembly.
4351
4352 An example inline assembler expression is:
4353
4354 .. code-block:: llvm
4355
4356     i32 (i32) asm "bswap $0", "=r,r"
4357
4358 Inline assembler expressions may **only** be used as the callee operand
4359 of a :ref:`call <i_call>` or an :ref:`invoke <i_invoke>` instruction.
4360 Thus, typically we have:
4361
4362 .. code-block:: llvm
4363
4364     %X = call i32 asm "bswap $0", "=r,r"(i32 %Y)
4365
4366 Inline asms with side effects not visible in the constraint list must be
4367 marked as having side effects. This is done through the use of the
4368 '``sideeffect``' keyword, like so:
4369
4370 .. code-block:: llvm
4371
4372     call void asm sideeffect "eieio", ""()
4373
4374 In some cases inline asms will contain code that will not work unless
4375 the stack is aligned in some way, such as calls or SSE instructions on
4376 x86, yet will not contain code that does that alignment within the asm.
4377 The compiler should make conservative assumptions about what the asm
4378 might contain and should generate its usual stack alignment code in the
4379 prologue if the '``alignstack``' keyword is present:
4380
4381 .. code-block:: llvm
4382
4383     call void asm alignstack "eieio", ""()
4384
4385 Inline asms also support using non-standard assembly dialects. The
4386 assumed dialect is ATT. When the '``inteldialect``' keyword is present,
4387 the inline asm is using the Intel dialect. Currently, ATT and Intel are
4388 the only supported dialects. An example is:
4389
4390 .. code-block:: llvm
4391
4392     call void asm inteldialect "eieio", ""()
4393
4394 In the case that the inline asm might unwind the stack,
4395 the '``unwind``' keyword must be used, so that the compiler emits
4396 unwinding information:
4397
4398 .. code-block:: llvm
4399
4400     call void asm unwind "call func", ""()
4401
4402 If the inline asm unwinds the stack and isn't marked with
4403 the '``unwind``' keyword, the behavior is undefined.
4404
4405 If multiple keywords appear, the '``sideeffect``' keyword must come
4406 first, the '``alignstack``' keyword second, the '``inteldialect``' keyword
4407 third and the '``unwind``' keyword last.
4408
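For example, an inline asm that uses all four keywords would be written in
this order (a fragment in the style of the examples above; ``func`` is a
placeholder symbol):

.. code-block:: llvm

    call void asm sideeffect alignstack inteldialect unwind "call func", ""()
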
4409 Inline Asm Constraint String
4410 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
4411
4412 The constraint list is a comma-separated string, each element containing one or
4413 more constraint codes.
4414
4415 For each element in the constraint list an appropriate register or memory
4416 operand will be chosen, and it will be made available to assembly template
4417 string expansion as ``$0`` for the first constraint in the list, ``$1`` for the
4418 second, etc.
4419
4420 There are three different types of constraints, which are distinguished by a
4421 prefix symbol in front of the constraint code: Output, Input, and Clobber. The
4422 constraints must always be given in that order: outputs first, then inputs, then
4423 clobbers. They cannot be intermingled.
4424
4425 There are also three different categories of constraint codes:
4426
4427 - Register constraint. This is either a register class, or a fixed physical
4428   register. This kind of constraint will allocate a register, and if necessary,
4429   bitcast the argument or result to the appropriate type.
4430 - Memory constraint. This kind of constraint is for use with an instruction
4431   taking a memory operand. Different constraints allow for different addressing
4432   modes used by the target.
4433 - Immediate value constraint. This kind of constraint is for an integer or other
4434   immediate value which can be rendered directly into an instruction. The
4435   various target-specific constraints allow the selection of a value in the
4436   proper range for the instruction you wish to use it with.
4437
4438 Output constraints
4439 """"""""""""""""""
4440
Output constraints are specified by an "``=``" prefix (e.g. "``=r``"). This
indicates that the assembly will write to this operand, and the operand will
then be made available as a return value of the ``asm`` expression. Output
constraints do not consume an argument from the call instruction (except for
indirect outputs, described below).
4446
4447 Normally, it is expected that no output locations are written to by the assembly
4448 expression until *all* of the inputs have been read. As such, LLVM may assign
4449 the same register to an output and an input. If this is not safe (e.g. if the
4450 assembly contains two instructions, where the first writes to one output, and
4451 the second reads an input and writes to a second output), then the "``&``"
4452 modifier must be used (e.g. "``=&r``") to specify that the output is an
4453 "early-clobber" output. Marking an output as "early-clobber" ensures that LLVM
4454 will not use the same register for any inputs (other than an input tied to this
4455 output).
4456
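A sketch of a case that needs an early-clobber output, assuming x86 with
the default ATT dialect (the function name ``@add_copy`` is hypothetical,
and the ``~{flags}`` clobber is an assumption modeled on what Clang
typically emits for x86). The first instruction writes ``$0`` before the
second one reads ``$2``, so ``$0`` must not share a register with ``$2``:

.. code-block:: llvm

    define i32 @add_copy(i32 %a, i32 %b) {
      ; $0 = output (early-clobber), $1 = %a, $2 = %b
      %r = call i32 asm "movl $1, $0\0A\09addl $2, $0", "=&r,r,r,~{flags}"(i32 %a, i32 %b)
      ret i32 %r
    }
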
4457 Input constraints
4458 """""""""""""""""
4459
4460 Input constraints do not have a prefix -- just the constraint codes. Each input
4461 constraint will consume one argument from the call instruction. It is not
4462 permitted for the asm to write to any input register or memory location (unless
4463 that input is tied to an output). Note also that multiple inputs may all be
4464 assigned to the same register, if LLVM can determine that they necessarily all
4465 contain the same value.
4466
4467 Instead of providing a Constraint Code, input constraints may also "tie"
4468 themselves to an output constraint, by providing an integer as the constraint
4469 string. Tied inputs still consume an argument from the call instruction, and
4470 take up a position in the asm template numbering as is usual -- they will simply
4471 be constrained to always use the same register as the output they've been tied
4472 to. For example, a constraint string of "``=r,0``" says to assign a register for
4473 output, and use that register as an input as well (it being the 0'th
4474 constraint).
4475
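A sketch of a tied input, again assuming x86 with the ATT dialect (the
function name ``@add_tied`` and the ``~{flags}`` clobber are assumptions
for this example). The tied input still occupies position ``$1`` and
consumes the first call argument:

.. code-block:: llvm

    define i32 @add_tied(i32 %a, i32 %b) {
      ; $0 = output, $1 = %a (tied to $0, same register), $2 = %b
      %r = call i32 asm "addl $2, $0", "=r,0,r,~{flags}"(i32 %a, i32 %b)
      ret i32 %r
    }
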
4476 It is permitted to tie an input to an "early-clobber" output. In that case, no
4477 *other* input may share the same register as the input tied to the early-clobber
4478 (even when the other input has the same value).
4479
4480 You may only tie an input to an output which has a register constraint, not a
4481 memory constraint. Only a single input may be tied to an output.
4482
4483 There is also an "interesting" feature which deserves a bit of explanation: if a
4484 register class constraint allocates a register which is too small for the value
4485 type operand provided as input, the input value will be split into multiple
4486 registers, and all of them passed to the inline asm.
4487
4488 However, this feature is often not as useful as you might think.
4489
Firstly, the registers are *not* guaranteed to be consecutive. So, on those
architectures that have instructions which operate on multiple consecutive
registers, this is not an appropriate way to support them. (e.g. the 32-bit
SparcV8 has a 64-bit load instruction which takes a single 32-bit register
operand. The hardware then loads into both the named register and the next
register. This feature of inline asm would not be useful to support that.)
4496
4497 A few of the targets provide a template string modifier allowing explicit access
4498 to the second register of a two-register operand (e.g. MIPS ``L``, ``M``, and
4499 ``D``). On such an architecture, you can actually access the second allocated
4500 register (yet, still, not any subsequent ones). But, in that case, you're still
4501 probably better off simply splitting the value into two separate operands, for
clarity. (E.g. see the description of the ``A`` constraint on X86, which,
despite existing only for use with this feature, is not really a good idea to
use.)
4505
4506 Indirect inputs and outputs
4507 """""""""""""""""""""""""""
4508
4509 Indirect output or input constraints can be specified by the "``*``" modifier
4510 (which goes after the "``=``" in case of an output). This indicates that the asm
4511 will write to or read from the contents of an *address* provided as an input
4512 argument. (Note that in this way, indirect outputs act more like an *input* than
4513 an output: just like an input, they consume an argument of the call expression,
4514 rather than producing a return value. An indirect output constraint is an
4515 "output" only in that the asm is expected to write to the contents of the input
4516 memory location, instead of just read from it).
4517
This is most typically used for a memory constraint, e.g. "``=*m``", to pass
the address of a variable as a value.
4520
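A sketch of the common memory form, assuming an x86 target and typed-pointer IR
(``@store_one`` is a hypothetical name). The pointer is passed as a call
argument, the call itself returns ``void``, and the asm writes through the
address:

.. code-block:: llvm

    define void @store_one(i32* %p) {
      ; "=*m": indirect memory output; it consumes %p instead of producing a result.
      ; "$$" prints a literal '$', so the instruction reads "movl $1, <mem>".
      call void asm sideeffect "movl $$1, $0", "=*m"(i32* %p)
      ret void
    }

In releases that use opaque pointers, the argument would instead be written as
``ptr elementtype(i32) %p``.
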
4521 It is also possible to use an indirect *register* constraint, but only on output
4522 (e.g. "``=*r``"). This will cause LLVM to allocate a register for an output
4523 value normally, and then, separately emit a store to the address provided as
input, after the provided inline asm. (It's not clear what value this
functionality provides, compared to writing the store explicitly after the asm
statement, and it can only produce worse code, since it bypasses many
optimization passes. Using it is not recommended.)
4528
4529
4530 Clobber constraints
4531 """""""""""""""""""
4532
4533 A clobber constraint is indicated by a "``~``" prefix. A clobber does not
4534 consume an input operand, nor generate an output. Clobbers cannot use any of the
4535 general constraint code letters -- they may use only explicit register
4536 constraints, e.g. "``~{eax}``". The one exception is that a clobber string of
4537 "``~{memory}``" indicates that the assembly writes to arbitrary undeclared
4538 memory locations -- not only the memory pointed to by a declared indirect
4539 output.
4540
4541 Note that clobbering named registers that are also present in output
4542 constraints is not legal.
4543
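Two small sketches, assuming an x86 target (the function names are
hypothetical). The first uses an empty template with a ``~{memory}`` clobber, a
common idiom for keeping memory accesses from being reordered across the asm.
The second declares the registers that its template overwrites:

.. code-block:: llvm

    define void @compiler_barrier() {
      ; No instructions are emitted, but memory is treated as possibly written.
      call void asm sideeffect "", "~{memory}"()
      ret void
    }

    define void @zero_eax() {
      ; The template overwrites eax and the flags, so both are listed as clobbers.
      call void asm sideeffect "xorl %eax, %eax", "~{eax},~{flags}"()
      ret void
    }
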
4544
4545 Constraint Codes
4546 """"""""""""""""
After a potential prefix comes the constraint code, or codes.
4548
4549 A Constraint Code is either a single letter (e.g. "``r``"), a "``^``" character
4550 followed by two letters (e.g. "``^wc``"), or "``{``" register-name "``}``"
4551 (e.g. "``{eax}``").
4552
4553 The one and two letter constraint codes are typically chosen to be the same as
4554 GCC's constraint codes.
4555
A single constraint may include one or more constraint codes, leaving it up to
LLVM to choose which one to use. This is included mainly for compatibility with
the translation of GCC inline asm coming from clang.
4559
4560 There are two ways to specify alternatives, and either or both may be used in an
4561 inline asm constraint list:
4562
4563 1) Append the codes to each other, making a constraint code set. E.g. "``im``"
4564    or "``{eax}m``". This means "choose any of the options in the set". The
4565    choice of constraint is made independently for each constraint in the
4566    constraint list.
4567
4568 2) Use "``|``" between constraint code sets, creating alternatives. Every
4569    constraint in the constraint list must have the same number of alternative
4570    sets. With this syntax, the same alternative in *all* of the items in the
4571    constraint list will be chosen together.
4572
Putting those together, you might have a two operand constraint string like
``"rm|r,ri|rm"``. This indicates that if operand 0 is ``r`` or ``m`` (the first
alternative), then operand 1 may be one of ``r`` or ``i``. If operand 0 is ``r``
(the second alternative), then operand 1 may be one of ``r`` or ``m``. But
operands 0 and 1 cannot both be of type ``m``.
4577
However, the use of either of the alternatives features is *NOT* recommended, as
LLVM is not able to make an intelligent choice about which one to use. (At the
point it currently needs to choose, not enough information is available to do so
in a smart way.) Thus, it simply tries to make a choice that's most likely to
compile, not one that will give optimal performance (e.g., given "``rm``", it'll
always choose to use memory, not registers). And, if given multiple registers,
or multiple register classes, it will simply choose the first one. (In fact, it
doesn't currently even ensure explicitly specified physical registers are
unique, so specifying multiple physical registers as alternatives, like
``{r11}{r12},{r11}{r12}``, will assign r11 to both operands, which is not at all
what was intended.)
4589
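To illustrate the first form, here is a sketch assuming an x86 target
(``@add_alt`` is a hypothetical name). Operand 2 uses the code set ``rm``;
going by the description above, LLVM would be expected to pick the memory form
for it rather than a register:

.. code-block:: llvm

    define i32 @add_alt(i32 %a, i32 %b) {
      ; $0 = output, $1 = input tied to $0, $2 = input that may be "r" or "m".
      %r = call i32 asm "addl $2, $0", "=r,0,rm,~{flags}"(i32 %a, i32 %b)
      ret i32 %r
    }

Writing ``r`` alone for operand 2 avoids leaving that choice to LLVM.
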
4590 Supported Constraint Code List
4591 """"""""""""""""""""""""""""""
4592
4593 The constraint codes are, in general, expected to behave the same way they do in
4594 GCC. LLVM's support is often implemented on an 'as-needed' basis, to support C
4595 inline asm code which was supported by GCC. A mismatch in behavior between LLVM
4596 and GCC likely indicates a bug in LLVM.
4597
4598 Some constraint codes are typically supported by all targets:
4599
4600 - ``r``: A register in the target's general purpose register class.
- ``m``: A memory address operand. It is target-specific what addressing modes
  are supported; typical examples are register, or register + register offset,
  or register + immediate offset (of some target-specific size).
4604 - ``i``: An integer constant (of target-specific width). Allows either a simple
4605   immediate, or a relocatable value.
4606 - ``n``: An integer constant -- *not* including relocatable values.
4607 - ``s``: An integer constant, but allowing *only* relocatable values.
4608 - ``X``: Allows an operand of any kind, no constraint whatsoever. Typically
4609   useful to pass a label for an asm branch or call.
4610
4611   .. FIXME: but that surely isn't actually okay to jump out of an asm
4612      block without telling llvm about the control transfer???)
4613
4614 - ``{register-name}``: Requires exactly the named physical register.
4615
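The following sketch, assuming an x86 target with the AT&T dialect (both
function names are hypothetical), shows two of the common codes in use: ``i``
for an integer constant and ``{register-name}`` for a fixed physical register:

.. code-block:: llvm

    define i32 @shift_left_3(i32 %x) {
      ; "i" accepts the constant 3 as an immediate operand of the shift.
      %r = call i32 asm "shll $2, $0", "=r,0,i,~{flags}"(i32 %x, i32 3)
      ret i32 %r
    }

    define i32 @copy_via_fixed_regs(i32 %x) {
      ; The input is pinned to ecx and the output to eax.
      %r = call i32 asm "movl %ecx, %eax", "={eax},{ecx}"(i32 %x)
      ret i32 %r
    }
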
4616 Other constraints are target-specific:
4617
4618 AArch64:
4619
4620 - ``z``: An immediate integer 0. Outputs ``WZR`` or ``XZR``, as appropriate.
4621 - ``I``: An immediate integer valid for an ``ADD`` or ``SUB`` instruction,
4622   i.e. 0 to 4095 with optional shift by 12.
4623 - ``J``: An immediate integer that, when negated, is valid for an ``ADD`` or
4624   ``SUB`` instruction, i.e. -1 to -4095 with optional left shift by 12.
4625 - ``K``: An immediate integer that is valid for the 'bitmask immediate 32' of a
4626   logical instruction like ``AND``, ``EOR``, or ``ORR`` with a 32-bit register.
4627 - ``L``: An immediate integer that is valid for the 'bitmask immediate 64' of a
4628   logical instruction like ``AND``, ``EOR``, or ``ORR`` with a 64-bit register.
- ``M``: An immediate integer for use with the ``MOV`` assembly alias on a
  32-bit register. This is a superset of ``K``: in addition to the bitmask
  immediate, also allows immediate integers which can be loaded with a single
  ``MOVZ`` or ``MOVN`` instruction.
4633 - ``N``: An immediate integer for use with the ``MOV`` assembly alias on a
4634   64-bit register. This is a superset of ``L``.
4635 - ``Q``: Memory address operand must be in a single register (no
4636   offsets). (However, LLVM currently does this for the ``m`` constraint as
4637   well.)
4638 - ``r``: A 32 or 64-bit integer register (W* or X*).
4639 - ``w``: A 32, 64, or 128-bit floating-point, SIMD or SVE vector register.
4640 - ``x``: Like w, but restricted to registers 0 to 15 inclusive.
4641 - ``y``: Like w, but restricted to SVE vector registers Z0 to Z7 inclusive.
4642 - ``Upl``: One of the low eight SVE predicate registers (P0 to P7)
4643 - ``Upa``: Any of the SVE predicate registers (P0 to P15)
4644
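As a brief sketch of an AArch64 immediate constraint (``@add_4095`` is a
hypothetical name), the ``I`` code below accepts 4095 because that value lies
in the range listed above for ``ADD``:

.. code-block:: llvm

    define i64 @add_4095(i64 %x) {
      ; $0 and $1 are X registers ("r" with i64); $2 is the "I" immediate.
      %r = call i64 asm "add $0, $1, $2", "=r,r,I"(i64 %x, i64 4095)
      ret i64 %r
    }
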
4645 AMDGPU:
4646
4647 - ``r``: A 32 or 64-bit integer register.
4648 - ``[0-9]v``: The 32-bit VGPR register, number 0-9.
4649 - ``[0-9]s``: The 32-bit SGPR register, number 0-9.
4650 - ``[0-9]a``: The 32-bit AGPR register, number 0-9.
4651 - ``I``: An integer inline constant in the range from -16 to 64.
4652 - ``J``: A 16-bit signed integer constant.
4653 - ``A``: An integer or a floating-point inline constant.
4654 - ``B``: A 32-bit signed integer constant.
- ``C``: A 32-bit unsigned integer constant or an integer inline constant in the
  range from -16 to 64.
4656 - ``DA``: A 64-bit constant that can be split into two "A" constants.
4657 - ``DB``: A 64-bit constant that can be split into two "B" constants.
4658
4659 All ARM modes:
4660
4661 - ``Q``, ``Um``, ``Un``, ``Uq``, ``Us``, ``Ut``, ``Uv``, ``Uy``: Memory address
4662   operand. Treated the same as operand ``m``, at the moment.
4663 - ``Te``: An even general-purpose 32-bit integer register: ``r0,r2,...,r12,r14``
4664 - ``To``: An odd general-purpose 32-bit integer register: ``r1,r3,...,r11``
4665
4666 ARM and ARM's Thumb2 mode:
4667
4668 - ``j``: An immediate integer between 0 and 65535 (valid for ``MOVW``)
4669 - ``I``: An immediate integer valid for a data-processing instruction.
4670 - ``J``: An immediate integer between -4095 and 4095.
4671 - ``K``: An immediate integer whose bitwise inverse is valid for a
4672   data-processing instruction. (Can be used with template modifier "``B``" to
4673   print the inverted value).
4674 - ``L``: An immediate integer whose negation is valid for a data-processing
4675   instruction. (Can be used with template modifier "``n``" to print the negated
4676   value).
4677 - ``M``: A power of two or an integer between 0 and 32.
4678 - ``N``: Invalid immediate constraint.
4679 - ``O``: Invalid immediate constraint.
4680 - ``r``: A general-purpose 32-bit integer register (``r0-r15``).
4681 - ``l``: In Thumb2 mode, low 32-bit GPR registers (``r0-r7``). In ARM mode, same
4682   as ``r``.
4683 - ``h``: In Thumb2 mode, a high 32-bit GPR register (``r8-r15``). In ARM mode,
4684   invalid.
4685 - ``w``: A 32, 64, or 128-bit floating-point/SIMD register in the ranges
4686   ``s0-s31``, ``d0-d31``, or ``q0-q15``, respectively.
4687 - ``t``: A 32, 64, or 128-bit floating-point/SIMD register in the ranges
4688   ``s0-s31``, ``d0-d15``, or ``q0-q7``, respectively.
4689 - ``x``: A 32, 64, or 128-bit floating-point/SIMD register in the ranges
4690   ``s0-s15``, ``d0-d7``, or ``q0-q3``, respectively.
4691
4692 ARM's Thumb1 mode:
4693
4694 - ``I``: An immediate integer between 0 and 255.
4695 - ``J``: An immediate integer between -255 and -1.
4696 - ``K``: An immediate integer between 0 and 255, with optional left-shift by
4697   some amount.
4698 - ``L``: An immediate integer between -7 and 7.
4699 - ``M``: An immediate integer which is a multiple of 4 between 0 and 1020.
4700 - ``N``: An immediate integer between 0 and 31.
4701 - ``O``: An immediate integer which is a multiple of 4 between -508 and 508.
4702 - ``r``: A low 32-bit GPR register (``r0-r7``).
4703 - ``l``: A low 32-bit GPR register (``r0-r7``).
- ``h``: A high GPR register (``r8-r15``).
4705 - ``w``: A 32, 64, or 128-bit floating-point/SIMD register in the ranges
4706   ``s0-s31``, ``d0-d31``, or ``q0-q15``, respectively.
4707 - ``t``: A 32, 64, or 128-bit floating-point/SIMD register in the ranges
4708   ``s0-s31``, ``d0-d15``, or ``q0-q7``, respectively.
4709 - ``x``: A 32, 64, or 128-bit floating-point/SIMD register in the ranges
4710   ``s0-s15``, ``d0-d7``, or ``q0-q3``, respectively.
4711
4712
4713 Hexagon:
4714
4715 - ``o``, ``v``: A memory address operand, treated the same as constraint ``m``,
4716   at the moment.
4717 - ``r``: A 32 or 64-bit register.
4718
4719 MSP430:
4720
4721 - ``r``: An 8 or 16-bit register.
4722
4723 MIPS:
4724
4725 - ``I``: An immediate signed 16-bit integer.
4726 - ``J``: An immediate integer zero.
4727 - ``K``: An immediate unsigned 16-bit integer.
4728 - ``L``: An immediate 32-bit integer, where the lower 16 bits are 0.
4729 - ``N``: An immediate integer between -65535 and -1.
4730 - ``O``: An immediate signed 15-bit integer.
4731 - ``P``: An immediate integer between 1 and 65535.
4732 - ``m``: A memory address operand. In MIPS-SE mode, allows a base address
4733   register plus 16-bit immediate offset. In MIPS mode, just a base register.
4734 - ``R``: A memory address operand. In MIPS-SE mode, allows a base address
4735   register plus a 9-bit signed offset. In MIPS mode, the same as constraint
4736   ``m``.
4737 - ``ZC``: A memory address operand, suitable for use in a ``pref``, ``ll``, or
4738   ``sc`` instruction on the given subtarget (details vary).
4739 - ``r``, ``d``,  ``y``: A 32 or 64-bit GPR register.
4740 - ``f``: A 32 or 64-bit FPU register (``F0-F31``), or a 128-bit MSA register
4741   (``W0-W31``). In the case of MSA registers, it is recommended to use the ``w``
4742   argument modifier for compatibility with GCC.
4743 - ``c``: A 32-bit or 64-bit GPR register suitable for indirect jump (always
4744   ``25``).
4745 - ``l``: The ``lo`` register, 32 or 64-bit.
4746 - ``x``: Invalid.
4747
4748 NVPTX:
4749
4750 - ``b``: A 1-bit integer register.
4751 - ``c`` or ``h``: A 16-bit integer register.
4752 - ``r``: A 32-bit integer register.
4753 - ``l`` or ``N``: A 64-bit integer register.
4754 - ``f``: A 32-bit float register.
4755 - ``d``: A 64-bit float register.
4756
4757
4758 PowerPC:
4759
4760 - ``I``: An immediate signed 16-bit integer.
4761 - ``J``: An immediate unsigned 16-bit integer, shifted left 16 bits.
4762 - ``K``: An immediate unsigned 16-bit integer.
4763 - ``L``: An immediate signed 16-bit integer, shifted left 16 bits.
4764 - ``M``: An immediate integer greater than 31.
4765 - ``N``: An immediate integer that is an exact power of 2.
4766 - ``O``: The immediate integer constant 0.
4767 - ``P``: An immediate integer constant whose negation is a signed 16-bit
4768   constant.
4769 - ``es``, ``o``, ``Q``, ``Z``, ``Zy``: A memory address operand, currently
4770   treated the same as ``m``.
4771 - ``r``: A 32 or 64-bit integer register.
4772 - ``b``: A 32 or 64-bit integer register, excluding ``R0`` (that is:
4773   ``R1-R31``).
- ``f``: A 32 or 64-bit float register (``F0-F31``).
- ``v``: For ``4 x f32`` or ``4 x f64`` types, a 128-bit altivec vector
  register (``V0-V31``).
4778 - ``y``: Condition register (``CR0-CR7``).
4779 - ``wc``: An individual CR bit in a CR register.
4780 - ``wa``, ``wd``, ``wf``: Any 128-bit VSX vector register, from the full VSX
4781   register set (overlapping both the floating-point and vector register files).
4782 - ``ws``: A 32 or 64-bit floating-point register, from the full VSX register
4783   set.
4784
4785 RISC-V:
4786
4787 - ``A``: An address operand (using a general-purpose register, without an
4788   offset).
4789 - ``I``: A 12-bit signed integer immediate operand.
4790 - ``J``: A zero integer immediate operand.
4791 - ``K``: A 5-bit unsigned integer immediate operand.
4792 - ``f``: A 32- or 64-bit floating-point register (requires F or D extension).
4793 - ``r``: A 32- or 64-bit general-purpose register (depending on the platform
4794   ``XLEN``).
4795 - ``vr``: A vector register. (requires V extension).
4796 - ``vm``: A vector mask register. (requires V extension).
4797
4798 Sparc:
4799
4800 - ``I``: An immediate 13-bit signed integer.
4801 - ``r``: A 32-bit integer register.
4802 - ``f``: Any floating-point register on SparcV8, or a floating-point
4803   register in the "low" half of the registers on SparcV9.
4804 - ``e``: Any floating-point register. (Same as ``f`` on SparcV8.)
4805
4806 SystemZ:
4807
4808 - ``I``: An immediate unsigned 8-bit integer.
4809 - ``J``: An immediate unsigned 12-bit integer.
4810 - ``K``: An immediate signed 16-bit integer.
4811 - ``L``: An immediate signed 20-bit integer.
4812 - ``M``: An immediate integer 0x7fffffff.
4813 - ``Q``: A memory address operand with a base address and a 12-bit immediate
4814   unsigned displacement.
4815 - ``R``: A memory address operand with a base address, a 12-bit immediate
4816   unsigned displacement, and an index register.
4817 - ``S``: A memory address operand with a base address and a 20-bit immediate
4818   signed displacement.
4819 - ``T``: A memory address operand with a base address, a 20-bit immediate
4820   signed displacement, and an index register.
4821 - ``r`` or ``d``: A 32, 64, or 128-bit integer register.
4822 - ``a``: A 32, 64, or 128-bit integer address register (excludes R0, which in an
4823   address context evaluates as zero).
- ``h``: A 32-bit value in the high part of a 64-bit data register
  (LLVM-specific).
4826 - ``f``: A 32, 64, or 128-bit floating-point register.
4827
4828 X86:
4829
4830 - ``I``: An immediate integer between 0 and 31.
- ``J``: An immediate integer between 0 and 63.
4832 - ``K``: An immediate signed 8-bit integer.
4833 - ``L``: An immediate integer, 0xff or 0xffff or (in 64-bit mode only)
4834   0xffffffff.
4835 - ``M``: An immediate integer between 0 and 3.
4836 - ``N``: An immediate unsigned 8-bit integer.
4837 - ``O``: An immediate integer between 0 and 127.
4838 - ``e``: An immediate 32-bit signed integer.
4839 - ``Z``: An immediate 32-bit unsigned integer.
4840 - ``o``, ``v``: Treated the same as ``m``, at the moment.
4841 - ``q``: An 8, 16, 32, or 64-bit register which can be accessed as an 8-bit
4842   ``l`` integer register. On X86-32, this is the ``a``, ``b``, ``c``, and ``d``
4843   registers, and on X86-64, it is all of the integer registers.
4844 - ``Q``: An 8, 16, 32, or 64-bit register which can be accessed as an 8-bit
4845   ``h`` integer register. This is the ``a``, ``b``, ``c``, and ``d`` registers.
4846 - ``r`` or ``l``: An 8, 16, 32, or 64-bit integer register.
4847 - ``R``: An 8, 16, 32, or 64-bit "legacy" integer register -- one which has
4848   existed since i386, and can be accessed without the REX prefix.
4849 - ``f``: A 32, 64, or 80-bit '387 FPU stack pseudo-register.
4850 - ``y``: A 64-bit MMX register, if MMX is enabled.
- ``x``: If SSE is enabled: a 32 or 64-bit scalar operand, or 128-bit vector
  operand in an SSE register. If AVX is also enabled, can also be a 256-bit
  vector operand in an AVX register. If AVX-512 is also enabled, can also be a
  512-bit vector operand in an AVX512 register. Otherwise, an error.
4855 - ``Y``: The same as ``x``, if *SSE2* is enabled, otherwise an error.
4856 - ``A``: Special case: allocates EAX first, then EDX, for a single operand (in
4857   32-bit mode, a 64-bit integer operand will get split into two registers). It
4858   is not recommended to use this constraint, as in 64-bit mode, the 64-bit
4859   operand will get allocated only to RAX -- if two 32-bit operands are needed,
4860   you're better off splitting it yourself, before passing it to the asm
4861   statement.
4862
4863 XCore:
4864
4865 - ``r``: A 32-bit integer register.
4866
4867
4868 .. _inline-asm-modifiers:
4869
4870 Asm template argument modifiers
4871 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
4872
4873 In the asm template string, modifiers can be used on the operand reference, like
4874 "``${0:n}``".
4875
4876 The modifiers are, in general, expected to behave the same way they do in
4877 GCC. LLVM's support is often implemented on an 'as-needed' basis, to support C
4878 inline asm code which was supported by GCC. A mismatch in behavior between LLVM
4879 and GCC likely indicates a bug in LLVM.
4880
4881 Target-independent:
4882
4883 - ``c``: Print an immediate integer constant unadorned, without
4884   the target-specific immediate punctuation (e.g. no ``$`` prefix).
4885 - ``n``: Negate and print immediate integer constant unadorned, without the
4886   target-specific immediate punctuation (e.g. no ``$`` prefix).
4887 - ``l``: Print as an unadorned label, without the target-specific label
4888   punctuation (e.g. no ``$`` prefix).
4889
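A sketch of the ``c`` modifier, assuming an x86 ELF target with the AT&T
dialect, where an immediate operand is normally printed with a ``$`` prefix
(``@emit_constant`` is a hypothetical name). The data directive needs the bare
number, so the operand is referenced as ``${0:c}``:

.. code-block:: llvm

    define void @emit_constant() {
      ; Emits ".long 42" into .rodata; the lines of the template are joined by \0A.
      call void asm sideeffect ".pushsection .rodata\0A.long ${0:c}\0A.popsection", "i"(i32 42)
      ret void
    }
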
4890 AArch64:
4891
4892 - ``w``: Print a GPR register with a ``w*`` name instead of ``x*`` name. E.g.,
4893   instead of ``x30``, print ``w30``.
- ``x``: Print a GPR register with an ``x*`` name. (This is the default.)
4895 - ``b``, ``h``, ``s``, ``d``, ``q``: Print a floating-point/SIMD register with a
4896   ``b*``, ``h*``, ``s*``, ``d*``, or ``q*`` name, rather than the default of
4897   ``v*``.
4898
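For instance, a 32-bit add can be sketched as follows (``@add32`` is a
hypothetical name). Each operand is referenced with the ``w`` modifier so that
the 32-bit register names are printed:

.. code-block:: llvm

    define i32 @add32(i32 %a, i32 %b) {
      ; ":w" prints each GPR with its w* name instead of the default x* name.
      %r = call i32 asm "add ${0:w}, ${1:w}, ${2:w}", "=r,r,r"(i32 %a, i32 %b)
      ret i32 %r
    }
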
4899 AMDGPU:
4900
4901 - ``r``: No effect.
4902
4903 ARM:
4904
4905 - ``a``: Print an operand as an address (with ``[`` and ``]`` surrounding a
4906   register).
4907 - ``P``: No effect.
4908 - ``q``: No effect.
4909 - ``y``: Print a VFP single-precision register as an indexed double (e.g. print
4910   as ``d4[1]`` instead of ``s9``)
4911 - ``B``: Bitwise invert and print an immediate integer constant without ``#``
4912   prefix.
4913 - ``L``: Print the low 16-bits of an immediate integer constant.
4914 - ``M``: Print as a register set suitable for ldm/stm. Also prints *all*
4915   register operands subsequent to the specified one (!), so use carefully.
4916 - ``Q``: Print the low-order register of a register-pair, or the low-order
4917   register of a two-register operand.
4918 - ``R``: Print the high-order register of a register-pair, or the high-order
4919   register of a two-register operand.
- ``H``: Print the second register of a register-pair. (On a big-endian system,
  ``H`` is equivalent to ``Q``, and on a little-endian system, ``H`` is
  equivalent to ``R``.)
4923
4924   .. FIXME: H doesn't currently support printing the second register
4925      of a two-register operand.
4926
4927 - ``e``: Print the low doubleword register of a NEON quad register.
4928 - ``f``: Print the high doubleword register of a NEON quad register.
4929 - ``m``: Print the base register of a memory operand without the ``[`` and ``]``
4930   adornment.
4931
4932 Hexagon:
4933
4934 - ``L``: Print the second register of a two-register operand. Requires that it
4935   has been allocated consecutively to the first.
4936
4937   .. FIXME: why is it restricted to consecutive ones? And there's
4938      nothing that ensures that happens, is there?
4939
4940 - ``I``: Print the letter 'i' if the operand is an integer constant, otherwise
4941   nothing. Used to print 'addi' vs 'add' instructions.
4942
4943 MSP430:
4944
4945 No additional modifiers.
4946
4947 MIPS:
4948
4949 - ``X``: Print an immediate integer as hexadecimal
4950 - ``x``: Print the low 16 bits of an immediate integer as hexadecimal.
4951 - ``d``: Print an immediate integer as decimal.
4952 - ``m``: Subtract one and print an immediate integer as decimal.
4953 - ``z``: Print $0 if an immediate zero, otherwise print normally.
- ``L``: Print the low-order register of a two-register operand, or print the
  address of the low-order word of a double-word memory operand.
4956
4957   .. FIXME: L seems to be missing memory operand support.
4958
- ``M``: Print the high-order register of a two-register operand, or print the
  address of the high-order word of a double-word memory operand.
4961
4962   .. FIXME: M seems to be missing memory operand support.
4963
- ``D``: Print the second register of a two-register operand, or print the
  second word of a double-word memory operand. (On a big-endian system, ``D`` is
  equivalent to ``L``, and on a little-endian system, ``D`` is equivalent to
  ``M``.)
4968 - ``w``: No effect. Provided for compatibility with GCC which requires this
4969   modifier in order to print MSA registers (``W0-W31``) with the ``f``
4970   constraint.
4971
4972 NVPTX:
4973
4974 - ``r``: No effect.
4975
4976 PowerPC:
4977
4978 - ``L``: Print the second register of a two-register operand. Requires that it
4979   has been allocated consecutively to the first.
4980
4981   .. FIXME: why is it restricted to consecutive ones? And there's
4982      nothing that ensures that happens, is there?
4983
4984 - ``I``: Print the letter 'i' if the operand is an integer constant, otherwise
4985   nothing. Used to print 'addi' vs 'add' instructions.
- ``y``: For a memory operand, print it formatted for a two-register X-form
  instruction. (Currently always prints ``r0,OPERAND``.)
- ``U``: Print 'u' if the memory operand is an update form, and nothing
  otherwise. (NOTE: LLVM does not support update form, so this will currently
  always print nothing.)
- ``X``: Print 'x' if the memory operand is an indexed form. (NOTE: LLVM does
  not support indexed form, so this will currently always print nothing.)
4993
4994 RISC-V:
4995
4996 - ``i``: Print the letter 'i' if the operand is not a register, otherwise print
4997   nothing. Used to print 'addi' vs 'add' instructions, etc.
4998 - ``z``: Print the register ``zero`` if an immediate zero, otherwise print
4999   normally.
5000
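A sketch of the ``i`` modifier (``@add_reg_or_imm`` is a hypothetical name).
The same template is used twice: once with a register as operand 2 and once
with an immediate, so that the mnemonic is completed accordingly:

.. code-block:: llvm

    define i32 @add_reg_or_imm(i32 %x, i32 %y) {
      ; Operand 2 is a register, so ${2:i} prints nothing: "add ...".
      %a = call i32 asm "add${2:i} $0, $1, $2", "=r,r,r"(i32 %x, i32 %y)
      ; Operand 2 is an immediate, so ${2:i} prints 'i': "addi ...".
      %b = call i32 asm "add${2:i} $0, $1, $2", "=r,r,I"(i32 %a, i32 7)
      ret i32 %b
    }
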
5001 Sparc:
5002
5003 - ``r``: No effect.
5004
5005 SystemZ:
5006
5007 SystemZ implements only ``n``, and does *not* support any of the other
5008 target-independent modifiers.
5009
5010 X86:
5011
5012 - ``c``: Print an unadorned integer or symbol name. (The latter is
5013   target-specific behavior for this typically target-independent modifier).
5014 - ``A``: Print a register name with a '``*``' before it.
5015 - ``b``: Print an 8-bit register name (e.g. ``al``); do nothing on a memory
5016   operand.
5017 - ``h``: Print the upper 8-bit register name (e.g. ``ah``); do nothing on a
5018   memory operand.
5019 - ``w``: Print the 16-bit register name (e.g. ``ax``); do nothing on a memory
5020   operand.
5021 - ``k``: Print the 32-bit register name (e.g. ``eax``); do nothing on a memory
5022   operand.
5023 - ``q``: Print the 64-bit register name (e.g. ``rax``), if 64-bit registers are
5024   available, otherwise the 32-bit register name; do nothing on a memory operand.
5025 - ``n``: Negate and print an unadorned integer, or, for operands other than an
5026   immediate integer (e.g. a relocatable symbol expression), print a '-' before
5027   the operand. (The behavior for relocatable symbol expressions is a
5028   target-specific behavior for this typically target-independent modifier)
5029 - ``H``: Print a memory reference with additional offset +8.
5030 - ``P``: Print a memory reference or operand for use as the argument of a call
5031   instruction. (E.g. omit ``(rip)``, even though it's PC-relative.)
5032
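As an illustration of the register-size modifiers, the sketch below
(``@zero_extend_low_byte`` is a hypothetical name) asks for a register with an
addressable low byte through the ``q`` constraint and then prints its 8-bit
name with ``b``:

.. code-block:: llvm

    define i32 @zero_extend_low_byte(i32 %x) {
      ; ${1:b} prints the 8-bit name (e.g. al) of the register holding %x.
      %r = call i32 asm "movzbl ${1:b}, $0", "=r,q"(i32 %x)
      ret i32 %r
    }
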
5033 XCore:
5034
5035 No additional modifiers.
5036
5037
5038 Inline Asm Metadata
5039 ^^^^^^^^^^^^^^^^^^^
5040
The call instructions that wrap inline asm nodes may have a
"``!srcloc``" MDNode attached to them that contains a list of constant
integers. If present, the code generator will use the integer as the
location cookie value when reporting errors through the ``LLVMContext``
error reporting mechanisms. This allows a front-end to correlate backend
errors that occur with inline asm back to the source code that produced
it. For example:
5048
5049 .. code-block:: llvm
5050
5051     call void asm sideeffect "something bad", ""(), !srcloc !42
5052     ...
5053     !42 = !{ i32 1234567 }
5054
5055 It is up to the front-end to make sense of the magic numbers it places
5056 in the IR. If the MDNode contains multiple constants, the code generator
5057 will use the one that corresponds to the line of the asm that the error
5058 occurs on.
5059
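A sketch of the multi-constant case (the cookie values and ``@two_line_asm``
are made up for illustration). The asm string has two lines, separated by
``\0A``, and the MDNode supplies one cookie per line, so an error on the second
line would be expected to be reported with the second cookie:

.. code-block:: llvm

    define void @two_line_asm() {
      call void asm sideeffect "nop\0Asomething bad", ""(), !srcloc !0
      ret void
    }

    ; One location cookie per line of the asm string.
    !0 = !{ i32 1234567, i32 1234568 }
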
5060 .. _metadata:
5061
5062 Metadata
5063 ========
5064
5065 LLVM IR allows metadata to be attached to instructions and global objects in the
5066 program that can convey extra information about the code to the optimizers and
5067 code generator. One example application of metadata is source-level
5068 debug information. There are two metadata primitives: strings and nodes.
5069
5070 Metadata does not have a type, and is not a value. If referenced from a
5071 ``call`` instruction, it uses the ``metadata`` type.
5072
5073 All metadata are identified in syntax by an exclamation point ('``!``').
5074
5075 .. _metadata-string:
5076
5077 Metadata Nodes and Metadata Strings
5078 -----------------------------------
5079
5080 A metadata string is a string surrounded by double quotes. It can
5081 contain any character by escaping non-printable characters with
5082 "``\xx``" where "``xx``" is the two digit hex code. For example:
5083 "``!"test\00"``".
5084
Metadata nodes are represented with notation similar to structure
constants (a comma separated list of elements, surrounded by braces and
preceded by an exclamation point). Metadata nodes can have any values as
their operands. For example:
5089
5090 .. code-block:: llvm
5091
5092     !{ !"test\00", i32 10}
5093
5094 Metadata nodes that aren't uniqued use the ``distinct`` keyword. For example:
5095
5096 .. code-block:: text
5097
5098     !0 = distinct !{!"test\00", i32 10}
5099
5100 ``distinct`` nodes are useful when nodes shouldn't be merged based on their
5101 content. They can also occur when transformations cause uniquing collisions
5102 when metadata operands change.
5103
5104 A :ref:`named metadata <namedmetadatastructure>` is a collection of
5105 metadata nodes, which can be looked up in the module symbol table. For
5106 example:
5107
5108 .. code-block:: llvm
5109
5110     !foo = !{!4, !3}
5111
5112 Metadata can be used as function arguments. Here the ``llvm.dbg.value``
5113 intrinsic is using three metadata arguments:
5114
5115 .. code-block:: llvm
5116
5117     call void @llvm.dbg.value(metadata !24, metadata !25, metadata !26)
5118
5119 Metadata can be attached to an instruction. Here metadata ``!21`` is attached
5120 to the ``add`` instruction using the ``!dbg`` identifier:
5121
5122 .. code-block:: llvm
5123
5124     %indvar.next = add i64 %indvar, 1, !dbg !21
5125
5126 Instructions may not have multiple metadata attachments with the same
5127 identifier.
5128
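Attachments with *different* identifiers can be combined on one instruction.
A minimal sketch, using the made-up kinds ``!example.a`` and ``!example.b``
and a hypothetical function ``@next``:

.. code-block:: llvm

    define i64 @next(i64 %i) {
      ; Two attachments, each with its own identifier.
      %n = add i64 %i, 1, !example.a !0, !example.b !1
      ret i64 %n
    }

    !0 = !{!"first"}
    !1 = !{i32 2}
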
5129 Metadata can also be attached to a function or a global variable. Here metadata
5130 ``!22`` is attached to the ``f1`` and ``f2`` functions, and the globals ``g1``
5131 and ``g2`` using the ``!dbg`` identifier:
5132
5133 .. code-block:: llvm
5134
5135     declare !dbg !22 void @f1()
5136     define void @f2() !dbg !22 {
5137       ret void
5138     }
5139
5140     @g1 = global i32 0, !dbg !22
5141     @g2 = external global i32, !dbg !22
5142
5143 Unlike instructions, global objects (functions and global variables) may have
5144 multiple metadata attachments with the same identifier.
5145
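For example, a global may carry two ``!type`` attachments at once. This is
only a sketch: the global ``@vtable`` and the type identifiers are
hypothetical.

.. code-block:: llvm

    @vtable = constant [2 x i8] zeroinitializer, !type !0, !type !1

    ; Each node pairs a byte offset with a type identifier.
    !0 = !{i64 0, !"typeA"}
    !1 = !{i64 1, !"typeB"}
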
A transformation is required to drop any metadata attachment that it does not
know, or that it knows it can't preserve. Currently there is an exception for
metadata attachments to globals for ``!type`` and ``!absolute_symbol``, which
can't be unconditionally dropped unless the global is itself deleted.
5150
5151 Metadata attached to a module using named metadata may not be dropped, with
5152 the exception of debug metadata (named metadata with the name ``!llvm.dbg.*``).
5153
5154 More information about specific metadata nodes recognized by the
5155 optimizers and code generator is found below.
5156
5157 .. _specialized-metadata:
5158
5159 Specialized Metadata Nodes
5160 ^^^^^^^^^^^^^^^^^^^^^^^^^^
5161
5162 Specialized metadata nodes are custom data structures in metadata (as opposed
5163 to generic tuples). Their fields are labelled, and can be specified in any
5164 order.
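
As a small sketch of what "any order" means, the two spellings below list the
same fields in a different order and describe the same type. The node numbers
are arbitrary:

.. code-block:: text

    !0 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
    !1 = !DIBasicType(encoding: DW_ATE_signed, size: 32, name: "int")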
5165
5166 These aren't inherently debug info centric, but currently all the specialized
5167 metadata nodes are related to debug info.
5168
5169 .. _DICompileUnit:
5170
5171 DICompileUnit
5172 """""""""""""
5173
5174 ``DICompileUnit`` nodes represent a compile unit. The ``enums:``,
5175 ``retainedTypes:``, ``globals:``, ``imports:`` and ``macros:`` fields are tuples
5176 containing the debug info to be emitted along with the compile unit, regardless
5177 of code optimizations (some nodes are only emitted if there are references to
5178 them from instructions). The ``debugInfoForProfiling:`` field is a boolean
5179 indicating whether or not line-table discriminators are updated to provide
5180 more-accurate debug info for profiling results.
5181
5182 .. code-block:: text
5183
5184     !0 = !DICompileUnit(language: DW_LANG_C99, file: !1, producer: "clang",
5185                         isOptimized: true, flags: "-O2", runtimeVersion: 2,
5186                         splitDebugFilename: "abc.debug", emissionKind: FullDebug,
5187                         enums: !2, retainedTypes: !3, globals: !4, imports: !5,
5188                         macros: !6, dwoId: 0x0abcd)
5189
5190 Compile unit descriptors provide the root scope for objects declared in a
5191 specific compilation unit. File descriptors are defined using this scope.  These
5192 descriptors are collected by a named metadata node ``!llvm.dbg.cu``. They keep
5193 track of global variables, type information, and imported entities (declarations
5194 and namespaces).
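
A minimal sketch of how a compile unit is typically registered, assuming a
single C99 source file; the file name and directory are made up:

.. code-block:: text

    !llvm.dbg.cu = !{!0}

    !0 = distinct !DICompileUnit(language: DW_LANG_C99, file: !1,
                                 producer: "clang", emissionKind: FullDebug)
    !1 = !DIFile(filename: "example.c", directory: "/path/to/dir")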
5195
5196 .. _DIFile:
5197
5198 DIFile
5199 """"""
5200
5201 ``DIFile`` nodes represent files. The ``filename:`` can include slashes.
5202
5203 .. code-block:: none
5204
5205     !0 = !DIFile(filename: "path/to/file", directory: "/path/to/dir",
5206                  checksumkind: CSK_MD5,
5207                  checksum: "000102030405060708090a0b0c0d0e0f")
5208
Files are sometimes used in ``scope:`` fields, and are the only valid target
for ``file:`` fields.

Valid values for the ``checksumkind:`` field are ``CSK_None``, ``CSK_MD5``,
``CSK_SHA1`` and ``CSK_SHA256``.
5212
5213 .. _DIBasicType:
5214
5215 DIBasicType
5216 """""""""""
5217
5218 ``DIBasicType`` nodes represent primitive types, such as ``int``, ``bool`` and
5219 ``float``. ``tag:`` defaults to ``DW_TAG_base_type``.
5220
5221 .. code-block:: text
5222
5223     !0 = !DIBasicType(name: "unsigned char", size: 8, align: 8,
5224                       encoding: DW_ATE_unsigned_char)
5225     !1 = !DIBasicType(tag: DW_TAG_unspecified_type, name: "decltype(nullptr)")
5226
5227 The ``encoding:`` describes the details of the type. Usually it's one of the
5228 following:
5229
5230 .. code-block:: text
5231
5232   DW_ATE_address       = 1
5233   DW_ATE_boolean       = 2
5234   DW_ATE_float         = 4
5235   DW_ATE_signed        = 5
5236   DW_ATE_signed_char   = 6
5237   DW_ATE_unsigned      = 7
5238   DW_ATE_unsigned_char = 8
5239
5240 .. _DISubroutineType:
5241
5242 DISubroutineType
5243 """"""""""""""""
5244
5245 ``DISubroutineType`` nodes represent subroutine types. Their ``types:`` field
5246 refers to a tuple; the first operand is the return type, while the rest are the
5247 types of the formal arguments in order. If the first operand is ``null``, that
5248 represents a function with no return value (such as ``void foo() {}`` in C++).
5249
.. code-block:: text

    !0 = !DIBasicType(name: "int", size: 32, align: 32, encoding: DW_ATE_signed)
    !1 = !DIBasicType(name: "char", size: 8, align: 8,
                      encoding: DW_ATE_signed_char)
    !2 = !DISubroutineType(types: !{null, !0, !1}) ; void (int, char)
5255
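For comparison, a function that does return a value puts the return type in
the first operand. The sketch below is an assumed example describing the C type
``int (char)``:

.. code-block:: text

    !0 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
    !1 = !DIBasicType(name: "char", size: 8, encoding: DW_ATE_signed_char)
    !2 = !DISubroutineType(types: !{!0, !1}) ; int (char)
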
5256 .. _DIDerivedType:
5257
5258 DIDerivedType
5259 """""""""""""
5260
5261 ``DIDerivedType`` nodes represent types derived from other types, such as
5262 qualified types.
5263
5264 .. code-block:: text
5265
5266     !0 = !DIBasicType(name: "unsigned char", size: 8, align: 8,
5267                       encoding: DW_ATE_unsigned_char)
5268     !1 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !0, size: 32,
5269                         align: 32)
5270
5271 The following ``tag:`` values are valid:
5272
5273 .. code-block:: text
5274
5275   DW_TAG_member             = 13
5276   DW_TAG_pointer_type       = 15
5277   DW_TAG_reference_type     = 16
5278   DW_TAG_typedef            = 22
5279   DW_TAG_inheritance        = 28
5280   DW_TAG_ptr_to_member_type = 31
5281   DW_TAG_const_type         = 38
5282   DW_TAG_friend             = 42
5283   DW_TAG_volatile_type      = 53
5284   DW_TAG_restrict_type      = 55
5285   DW_TAG_atomic_type        = 71
5286
5287 .. _DIDerivedTypeMember:
5288
``DW_TAG_member`` is used to define a member of a :ref:`composite type
<DICompositeType>`. The type of the member is the ``baseType:``. The
``offset:`` is the member's bit offset. If the composite type has an ODR
``identifier:`` and does not set ``flags: DIFlagFwdDecl``, then the member is
uniqued based only on its ``name:`` and ``scope:``.
5294
5295 ``DW_TAG_inheritance`` and ``DW_TAG_friend`` are used in the ``elements:``
5296 field of :ref:`composite types <DICompositeType>` to describe parents and
5297 friends.
5298
5299 ``DW_TAG_typedef`` is used to provide a name for the ``baseType:``.
5300
5301 ``DW_TAG_pointer_type``, ``DW_TAG_reference_type``, ``DW_TAG_const_type``,
5302 ``DW_TAG_volatile_type``, ``DW_TAG_restrict_type`` and ``DW_TAG_atomic_type``
5303 are used to qualify the ``baseType:``.
5304
5305 Note that the ``void *`` type is expressed as a type derived from NULL.
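
A short sketch of that convention, assuming 64-bit pointers: ``void *`` is a
pointer type whose ``baseType:`` is ``null``, and ``const void *`` first
qualifies a ``null`` base type and then applies the pointer:

.. code-block:: text

    !0 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: null, size: 64)

    !1 = !DIDerivedType(tag: DW_TAG_const_type, baseType: null)
    !2 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !1, size: 64)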
5306
5307 .. _DICompositeType:
5308
5309 DICompositeType
5310 """""""""""""""
5311
5312 ``DICompositeType`` nodes represent types composed of other types, like
5313 structures and unions. ``elements:`` points to a tuple of the composed types.
5314
5315 If the source language supports ODR, the ``identifier:`` field gives the unique
5316 identifier used for type merging between modules.  When specified,
5317 :ref:`subprogram declarations <DISubprogramDeclaration>` and :ref:`member
5318 derived types <DIDerivedTypeMember>` that reference the ODR-type in their
5319 ``scope:`` change uniquing rules.
5320
5321 For a given ``identifier:``, there should only be a single composite type that
5322 does not have  ``flags: DIFlagFwdDecl`` set.  LLVM tools that link modules
5323 together will unique such definitions at parse time via the ``identifier:``
5324 field, even if the nodes are ``distinct``.
5325
.. code-block:: text

    !0 = !DIEnumerator(name: "SixKind", value: 6)
    !1 = !DIEnumerator(name: "SevenKind", value: 7)
    !2 = !DIEnumerator(name: "NegEightKind", value: -8)
    !3 = !DICompositeType(tag: DW_TAG_enumeration_type, name: "Enum", file: !12,
                          line: 2, size: 32, align: 32, identifier: "_M4Enum",
                          elements: !{!0, !1, !2})
5334
5335 The following ``tag:`` values are valid:
5336
5337 .. code-block:: text
5338
5339   DW_TAG_array_type       = 1
5340   DW_TAG_class_type       = 2
5341   DW_TAG_enumeration_type = 4
5342   DW_TAG_structure_type   = 19
5343   DW_TAG_union_type       = 23
5344
For ``DW_TAG_array_type``, the ``elements:`` should be :ref:`subrange
descriptors <DISubrange>`, each representing the range of subscripts at that
level of indexing. The ``DIFlagVector`` flag to ``flags:`` indicates that an
array type is a native packed vector.

The optional ``dataLocation`` is a DIExpression that describes how to get from
an object's address to the actual raw data, if they aren't equivalent. This is
only supported for array types, particularly to describe Fortran arrays, which
have an array descriptor in addition to the array data. Alternatively, it can
be a DIVariable that holds the address of the actual raw data.

Fortran supports pointer arrays that can be attached to actual arrays; this
attachment between pointer and pointee is called association. The optional
``associated`` is a DIExpression that describes whether the pointer array is
currently associated. The optional ``allocated`` is a DIExpression that
describes whether the allocatable array is currently allocated. The optional
``rank`` is a DIExpression that describes the rank (number of dimensions) of a
Fortran assumed-rank array, whose rank is only known at run time.
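
As an illustrative sketch of the subrange rule, assuming a 32-bit ``int``, a
one-dimensional and a two-dimensional array could be described as follows (the
node numbers and source declarations are made up):

.. code-block:: text

    !0 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)

    ; int a[5]
    !1 = !DICompositeType(tag: DW_TAG_array_type, baseType: !0, size: 160,
                          elements: !{!2})
    !2 = !DISubrange(count: 5)

    ; int m[2][3]: one subrange per level of indexing
    !3 = !DICompositeType(tag: DW_TAG_array_type, baseType: !0, size: 192,
                          elements: !{!4, !5})
    !4 = !DISubrange(count: 2)
    !5 = !DISubrange(count: 3)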
5361
5362 For ``DW_TAG_enumeration_type``, the ``elements:`` should be :ref:`enumerator
5363 descriptors <DIEnumerator>`, each representing the definition of an enumeration
5364 value for the set. All enumeration type descriptors are collected in the
5365 ``enums:`` field of the :ref:`compile unit <DICompileUnit>`.
5366
5367 For ``DW_TAG_structure_type``, ``DW_TAG_class_type``, and
5368 ``DW_TAG_union_type``, the ``elements:`` should be :ref:`derived types
5369 <DIDerivedType>` with ``tag: DW_TAG_member``, ``tag: DW_TAG_inheritance``, or
5370 ``tag: DW_TAG_friend``; or :ref:`subprograms <DISubprogram>` with
5371 ``isDefinition: false``.
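
A minimal sketch of a structure with two members, assuming the C declaration
``struct Pair { int first; int second; };`` and a 32-bit ``int``. Names, line
numbers and node numbers are illustrative only:

.. code-block:: text

    !0 = !DIFile(filename: "pair.c", directory: "/path/to/dir")
    !1 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
    !2 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "Pair",
                                   file: !0, line: 1, size: 64, elements: !3)
    !3 = !{!4, !5}
    !4 = !DIDerivedType(tag: DW_TAG_member, name: "first", scope: !2, file: !0,
                        line: 2, baseType: !1, size: 32)
    !5 = !DIDerivedType(tag: DW_TAG_member, name: "second", scope: !2, file: !0,
                        line: 3, baseType: !1, size: 32, offset: 32)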
5372
5373 .. _DISubrange:
5374
5375 DISubrange
5376 """"""""""
5377
5378 ``DISubrange`` nodes are the elements for ``DW_TAG_array_type`` variants of
5379 :ref:`DICompositeType`.
5380
- ``count: -1`` indicates an empty array.
- ``count: !10`` describes the count with a :ref:`DILocalVariable`.
- ``count: !12`` describes the count with a :ref:`DIGlobalVariable`.
5384
5385 .. code-block:: text
5386
5387     !0 = !DISubrange(count: 5, lowerBound: 0) ; array counting from 0
5388     !1 = !DISubrange(count: 5, lowerBound: 1) ; array counting from 1
5389     !2 = !DISubrange(count: -1) ; empty array.
5390
5391     ; Scopes used in rest of example
5392     !6 = !DIFile(filename: "vla.c", directory: "/path/to/file")
5393     !7 = distinct !DICompileUnit(language: DW_LANG_C99, file: !6)
5394     !8 = distinct !DISubprogram(name: "foo", scope: !7, file: !6, line: 5)
5395
5396     ; Use of local variable as count value
5397     !9 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
5398     !10 = !DILocalVariable(name: "count", scope: !8, file: !6, line: 42, type: !9)
5399     !11 = !DISubrange(count: !10, lowerBound: 0)
5400
5401     ; Use of global variable as count value
5402     !12 = !DIGlobalVariable(name: "count", scope: !8, file: !6, line: 22, type: !9)
5403     !13 = !DISubrange(count: !12, lowerBound: 0)
5404
5405 .. _DIEnumerator:
5406
5407 DIEnumerator
5408 """"""""""""
5409
5410 ``DIEnumerator`` nodes are the elements for ``DW_TAG_enumeration_type``
5411 variants of :ref:`DICompositeType`.
5412
.. code-block:: text

    !0 = !DIEnumerator(name: "SixKind", value: 6)
    !1 = !DIEnumerator(name: "SevenKind", value: 7)
    !2 = !DIEnumerator(name: "NegEightKind", value: -8)
5418
5419 DITemplateTypeParameter
5420 """""""""""""""""""""""
5421
5422 ``DITemplateTypeParameter`` nodes represent type parameters to generic source
5423 language constructs. They are used (optionally) in :ref:`DICompositeType` and
5424 :ref:`DISubprogram` ``templateParams:`` fields.
5425
5426 .. code-block:: text
5427
5428     !0 = !DITemplateTypeParameter(name: "Ty", type: !1)
5429
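To show where such a node is referenced, the sketch below attaches a type
parameter to a composite type through ``templateParams:``. It assumes a C++
instantiation ``Box<int>``; the file node ``!4`` is a placeholder that is not
defined here:

.. code-block:: text

    !0 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
    !1 = !DITemplateTypeParameter(name: "T", type: !0)
    !2 = !{!1}
    !3 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "Box<int>",
                                   file: !4, line: 1, size: 32,
                                   templateParams: !2,
                                   identifier: "_ZTS3BoxIiE")
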
5430 DITemplateValueParameter
5431 """"""""""""""""""""""""
5432
5433 ``DITemplateValueParameter`` nodes represent value parameters to generic source
5434 language constructs. ``tag:`` defaults to ``DW_TAG_template_value_parameter``,
5435 but if specified can also be set to ``DW_TAG_GNU_template_template_param`` or
5436 ``DW_TAG_GNU_template_param_pack``. They are used (optionally) in
5437 :ref:`DICompositeType` and :ref:`DISubprogram` ``templateParams:`` fields.
5438
5439 .. code-block:: text
5440
5441     !0 = !DITemplateValueParameter(name: "Ty", type: !1, value: i32 7)
5442
5443 DINamespace
5444 """""""""""
5445
5446 ``DINamespace`` nodes represent namespaces in the source language.
5447
5448 .. code-block:: text
5449
5450     !0 = !DINamespace(name: "myawesomeproject", scope: !1, file: !2, line: 7)
5451
5452 .. _DIGlobalVariable:
5453
5454 DIGlobalVariable
5455 """"""""""""""""
5456
5457 ``DIGlobalVariable`` nodes represent global variables in the source language.
5458
5459 .. code-block:: text
5460
5461     @foo = global i32, !dbg !0
5462     !0 = !DIGlobalVariableExpression(var: !1, expr: !DIExpression())
5463     !1 = !DIGlobalVariable(name: "foo", linkageName: "foo", scope: !2,
5464                            file: !3, line: 7, type: !4, isLocal: true,
5465                            isDefinition: false, declaration: !5)
5466
5467
5468 DIGlobalVariableExpression
5469 """"""""""""""""""""""""""
5470
5471 ``DIGlobalVariableExpression`` nodes tie a :ref:`DIGlobalVariable` together
5472 with a :ref:`DIExpression`.
5473
5474 .. code-block:: text
5475
5476     @lower = global i32, !dbg !0
5477     @upper = global i32, !dbg !1
5478     !0 = !DIGlobalVariableExpression(
5479              var: !2,
5480              expr: !DIExpression(DW_OP_LLVM_fragment, 0, 32)
5481              )
5482     !1 = !DIGlobalVariableExpression(
5483              var: !2,
5484              expr: !DIExpression(DW_OP_LLVM_fragment, 32, 32)
5485              )
5486     !2 = !DIGlobalVariable(name: "split64", linkageName: "split64", scope: !3,
5487                            file: !4, line: 8, type: !5, declaration: !6)
5488
All global variable expressions should be referenced by the ``globals:`` field of
a :ref:`compile unit <DICompileUnit>`.
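
A minimal sketch of that relationship, assuming one global named ``counter``;
all names and node numbers are illustrative:

.. code-block:: text

    @counter = global i32 0, !dbg !2

    !llvm.dbg.cu = !{!0}
    !0 = distinct !DICompileUnit(language: DW_LANG_C99, file: !4,
                                 emissionKind: FullDebug, globals: !1)
    !1 = !{!2}
    !2 = !DIGlobalVariableExpression(var: !3, expr: !DIExpression())
    !3 = distinct !DIGlobalVariable(name: "counter", scope: !0, file: !4,
                                    line: 1, type: !5, isLocal: false,
                                    isDefinition: true)
    !4 = !DIFile(filename: "counter.c", directory: "/path/to/dir")
    !5 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)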
5491
5492 .. _DISubprogram:
5493
5494 DISubprogram
5495 """"""""""""
5496
5497 ``DISubprogram`` nodes represent functions from the source language. A distinct
5498 ``DISubprogram`` may be attached to a function definition using ``!dbg``
5499 metadata. A unique ``DISubprogram`` may be attached to a function declaration
5500 used for call site debug info. The ``retainedNodes:`` field is a list of
5501 :ref:`variables <DILocalVariable>` and :ref:`labels <DILabel>` that must be
5502 retained, even if their IR counterparts are optimized out of the IR. The
``type:`` field must point at a :ref:`DISubroutineType`.
5504
5505 .. _DISubprogramDeclaration:
5506
When ``isDefinition: false``, subprograms describe a declaration in the type
tree as opposed to a definition of a function. If the scope is a composite
type that has an ODR ``identifier:`` and does not set ``flags: DIFlagFwdDecl``,
then the subprogram declaration is uniqued based only on its ``linkageName:``
and ``scope:``.
5512
.. code-block:: text

    define void @_Z3foov() !dbg !0 {
      ...
    }

    !0 = distinct !DISubprogram(name: "foo", linkageName: "_Z3foov", scope: !1,
                                file: !2, line: 7, type: !3, isLocal: true,
                                isDefinition: true, scopeLine: 8,
                                containingType: !4,
                                virtuality: DW_VIRTUALITY_pure_virtual,
                                virtualIndex: 10, flags: DIFlagPrototyped,
                                isOptimized: true, unit: !5, templateParams: !6,
                                declaration: !7, retainedNodes: !8,
                                thrownTypes: !9)
5528
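By contrast, a declaration in the type tree might look like the sketch below.
It assumes a C++ member function ``void Box::reset()`` and 64-bit pointers; the
names, mangled strings and node numbers are illustrative only:

.. code-block:: text

    !0 = !DIFile(filename: "box.cpp", directory: "/path/to/dir")
    !1 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "Box",
                                   file: !0, line: 1, size: 32, elements: !2,
                                   identifier: "_ZTS3Box")
    !2 = !{!3}
    ; Declaration of "void Box::reset()" inside the type tree.
    !3 = !DISubprogram(name: "reset", linkageName: "_ZN3Box5resetEv", scope: !1,
                       file: !0, line: 3, type: !4, isLocal: false,
                       isDefinition: false, flags: DIFlagPrototyped)
    !4 = !DISubroutineType(types: !{null, !5})
    !5 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !1, size: 64,
                        flags: DIFlagArtificial | DIFlagObjectPointer)
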
5529 .. _DILexicalBlock:
5530
5531 DILexicalBlock
5532 """"""""""""""
5533
``DILexicalBlock`` nodes describe nested blocks within a :ref:`subprogram
<DISubprogram>`. The line and column numbers are used to distinguish
two lexical blocks at the same depth. They are valid targets for ``scope:``
fields.
5538
5539 .. code-block:: text
5540
5541     !0 = distinct !DILexicalBlock(scope: !1, file: !2, line: 7, column: 35)
5542
5543 Usually lexical blocks are ``distinct`` to prevent node merging based on
5544 operands.
5545
5546 .. _DILexicalBlockFile:
5547
5548 DILexicalBlockFile
5549 """"""""""""""""""
5550
5551 ``DILexicalBlockFile`` nodes are used to discriminate between sections of a
5552 :ref:`lexical block <DILexicalBlock>`. The ``file:`` field can be changed to
5553 indicate textual inclusion, or the ``discriminator:`` field can be used to
5554 discriminate between control flow within a single block in the source language.
5555
5556 .. code-block:: text
5557
5558     !0 = !DILexicalBlock(scope: !3, file: !4, line: 7, column: 35)
5559     !1 = !DILexicalBlockFile(scope: !0, file: !4, discriminator: 0)
5560     !2 = !DILexicalBlockFile(scope: !0, file: !4, discriminator: 1)
5561
5562 .. _DILocation:
5563
5564 DILocation
5565 """"""""""
5566
``DILocation`` nodes represent source debug locations. The ``scope:`` field is
mandatory, and points at a :ref:`DILexicalBlockFile`, a
:ref:`DILexicalBlock`, or a :ref:`DISubprogram`.
5570
5571 .. code-block:: text
5572
5573     !0 = !DILocation(line: 2900, column: 42, scope: !1, inlinedAt: !2)
5574
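As a sketch of how ``inlinedAt:`` is used, the instruction below is assumed to
come from a callee that was inlined at line 9 of its caller. The scope nodes
``!11`` and ``!13`` are placeholders for the callee and caller subprograms:

.. code-block:: text

    %sum = add i32 %a, %b, !dbg !10

    ; Location inside the inlined callee.
    !10 = !DILocation(line: 3, column: 12, scope: !11, inlinedAt: !12)
    ; Location of the call site in the caller.
    !12 = distinct !DILocation(line: 9, column: 5, scope: !13)
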
5575 .. _DILocalVariable:
5576
5577 DILocalVariable
5578 """""""""""""""
5579
5580 ``DILocalVariable`` nodes represent local variables in the source language. If
5581 the ``arg:`` field is set to non-zero, then this variable is a subprogram
5582 parameter, and it will be included in the ``retainedNodes:`` field of its
5583 :ref:`DISubprogram`.
5584
5585 .. code-block:: text
5586
5587     !0 = !DILocalVariable(name: "this", arg: 1, scope: !3, file: !2, line: 7,
5588                           type: !3, flags: DIFlagArtificial)
5589     !1 = !DILocalVariable(name: "x", arg: 2, scope: !4, file: !2, line: 7,
5590                           type: !3)
5591     !2 = !DILocalVariable(name: "y", scope: !5, file: !2, line: 7, type: !3)
5592
5593 .. _DIExpression:
5594
5595 DIExpression
5596 """"""""""""
5597
5598 ``DIExpression`` nodes represent expressions that are inspired by the DWARF
5599 expression language. They are used in :ref:`debug intrinsics<dbg_intrinsics>`
5600 (such as ``llvm.dbg.declare`` and ``llvm.dbg.value``) to describe how the
5601 referenced LLVM variable relates to the source language variable. Debug
5602 intrinsics are interpreted left-to-right: start by pushing the value/address
5603 operand of the intrinsic onto a stack, then repeatedly push and evaluate
5604 opcodes from the DIExpression until the final variable description is produced.
5605
The currently supported opcode vocabulary is limited:
5607
5608 - ``DW_OP_deref`` dereferences the top of the expression stack.
5609 - ``DW_OP_plus`` pops the last two entries from the expression stack, adds
5610   them together and appends the result to the expression stack.
5611 - ``DW_OP_minus`` pops the last two entries from the expression stack, subtracts
5612   the last entry from the second last entry and appends the result to the
5613   expression stack.
5614 - ``DW_OP_plus_uconst, 93`` adds ``93`` to the working expression.
- ``DW_OP_LLVM_fragment, 16, 8`` specifies the offset and size (``16`` and ``8``
  here, respectively) of the variable fragment from the working expression. Note
  that contrary to ``DW_OP_bit_piece``, the offset is describing the location
  within the described source variable.
5619 - ``DW_OP_LLVM_convert, 16, DW_ATE_signed`` specifies a bit size and encoding
5620   (``16`` and ``DW_ATE_signed`` here, respectively) to which the top of the
5621   expression stack is to be converted. Maps into a ``DW_OP_convert`` operation
5622   that references a base type constructed from the supplied values.
5623 - ``DW_OP_LLVM_tag_offset, tag_offset`` specifies that a memory tag should be
5624   optionally applied to the pointer. The memory tag is derived from the
5625   given tag offset in an implementation-defined manner.
5626 - ``DW_OP_swap`` swaps top two stack entries.
5627 - ``DW_OP_xderef`` provides extended dereference mechanism. The entry at the top
5628   of the stack is treated as an address. The second stack entry is treated as an
5629   address space identifier.
5630 - ``DW_OP_stack_value`` marks a constant value.
- ``DW_OP_LLVM_entry_value, N`` may only appear in MIR and at the
  beginning of a ``DIExpression``. When DWARF is emitted, a ``DBG_VALUE``
  instruction that binds a ``DIExpression`` beginning with
  ``DW_OP_LLVM_entry_value`` to a register is lowered to
  ``DW_OP_entry_value [reg]``, pushing the value the register had upon
  function entry onto the stack. The next ``(N - 1)`` operations will be
  part of the ``DW_OP_entry_value`` block argument. For example,
  ``!DIExpression(DW_OP_LLVM_entry_value, 1, DW_OP_plus_uconst, 123,
  DW_OP_stack_value)`` specifies an expression where the entry value of the
  debug value instruction's value/address operand is pushed to the stack,
  and 123 is added to it. Due to framework limitations ``N`` can currently
  only be 1.

  The operation is introduced by the ``LiveDebugValues`` pass, which
  applies it only to function parameters that are unmodified
  throughout the function. Support is limited to simple register
  location descriptions and indirect locations (e.g., when a struct
  is passed by value to a callee via a pointer to a temporary copy
  made in the caller). The entry value op is also introduced by the
  ``AsmPrinter`` pass when a call site parameter value
  (``DW_AT_call_site_parameter_value``) is represented as the entry value
  of the parameter.
- ``DW_OP_LLVM_arg, N`` is used in debug intrinsics that refer to more than one
  value, such as one that calculates the sum of two registers. This is always
  used in combination with an ordered list of values, such that
  ``DW_OP_LLVM_arg, N`` refers to element ``N`` of that list. For
  example, ``!DIExpression(DW_OP_LLVM_arg, 0, DW_OP_LLVM_arg, 1, DW_OP_minus,
  DW_OP_stack_value)`` used with the list ``(%reg1, %reg2)`` would evaluate to
  ``%reg1 - %reg2``. This list of values should be provided by the containing
  intrinsic/instruction.
- ``DW_OP_breg`` (or ``DW_OP_bregx``) represents the contents at the given
  signed offset from the specified register. The opcode is only generated by
  the ``AsmPrinter`` pass to describe a call site parameter value that requires
  an expression over two registers.
- ``DW_OP_push_object_address`` pushes the address of the object, which can then
  serve as a descriptor in subsequent calculations. This opcode can be used to
  calculate the bounds of a Fortran allocatable array, which has an array
  descriptor.
- ``DW_OP_over`` duplicates the entry currently second in the stack at the top
  of the stack. This opcode can be used to calculate the bounds of a Fortran
  assumed-rank array, whose rank is known at run time and whose current
  dimension number is implicitly the first element of the stack.
- ``DW_OP_LLVM_implicit_pointer`` specifies the dereferenced value. It can
  be used to represent pointer variables that are optimized out but whose
  pointee value is known. This operator is needed because it differs from the
  DWARF operator ``DW_OP_implicit_pointer`` in representation and specification
  (number and types of operands), and the latter cannot be used for multiple
  levels of indirection.
5677
.. code-block:: text

    IR for "*ptr = 4;"
    --------------
    call void @llvm.dbg.value(metadata i32 4, metadata !17, metadata !20)
    !17 = !DILocalVariable(name: "ptr1", scope: !12, file: !3, line: 5,
                           type: !18)
    !18 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !19, size: 64)
    !19 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
    !20 = !DIExpression(DW_OP_LLVM_implicit_pointer)

    IR for "**ptr = 4;"
    --------------
    call void @llvm.dbg.value(metadata i32 4, metadata !17, metadata !21)
    !17 = !DILocalVariable(name: "ptr1", scope: !12, file: !3, line: 5,
                           type: !18)
    !18 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !19, size: 64)
    !19 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !20, size: 64)
    !20 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
    !21 = !DIExpression(DW_OP_LLVM_implicit_pointer,
                        DW_OP_LLVM_implicit_pointer)
5699
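The left-to-right, stack-based evaluation of a ``DIExpression`` can be traced
by hand. The sketch below is an informal walk-through, not tool output; the
operand names ``%base`` and ``%v`` are assumptions made for the example:

.. code-block:: text

    ; !DIExpression(DW_OP_plus_uconst, 8, DW_OP_deref) applied to %base
    ;
    ;   step                    stack (top on the right)
    ;   push intrinsic operand  [ %base ]
    ;   DW_OP_plus_uconst, 8    [ %base + 8 ]
    ;   DW_OP_deref             [ *(%base + 8) ]

    ; !DIExpression(DW_OP_constu, 3, DW_OP_minus, DW_OP_stack_value)
    ; applied to %v
    ;
    ;   step                    stack (top on the right)
    ;   push intrinsic operand  [ %v ]
    ;   DW_OP_constu, 3         [ %v, 3 ]
    ;   DW_OP_minus             [ %v - 3 ]
    ;   DW_OP_stack_value       [ %v - 3 ]   (result is a value, not a location)
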
DWARF specifies three kinds of simple location descriptions: register, memory,
and implicit location descriptions. Note that a location description is
defined over certain ranges of a program, i.e., the location of a variable may
change over the course of the program. Register and memory location
descriptions describe the *concrete location* of a source variable (in the
sense that a debugger might modify its value), whereas *implicit locations*
describe merely the actual *value* of a source variable which might not exist
in registers or in memory (see ``DW_OP_stack_value``).
5708
5709 A ``llvm.dbg.addr`` or ``llvm.dbg.declare`` intrinsic describes an indirect
5710 value (the address) of a source variable. The first operand of the intrinsic
5711 must be an address of some kind. A DIExpression attached to the intrinsic
5712 refines this address to produce a concrete location for the source variable.
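
A minimal sketch of the address-based form, assuming a local ``int`` stored in
an ``alloca``. The variable node ``!12`` and location ``!13`` are placeholders:

.. code-block:: llvm

    %x.addr = alloca i32, align 4
    ; The first operand is the address of the variable; the empty expression
    ; leaves that address unchanged.
    call void @llvm.dbg.declare(metadata i32* %x.addr, metadata !12,
                                metadata !DIExpression()), !dbg !13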
5713
5714 A ``llvm.dbg.value`` intrinsic describes the direct value of a source variable.
5715 The first operand of the intrinsic may be a direct or indirect value. A
5716 DIExpression attached to the intrinsic refines the first operand to produce a
5717 direct value. For example, if the first operand is an indirect value, it may be
5718 necessary to insert ``DW_OP_deref`` into the DIExpression in order to produce a
5719 valid debug intrinsic.
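
As an illustrative sketch of the two cases, assume ``%x`` holds the variable's
value directly while ``%p`` only holds a pointer to it. The nodes ``!12`` and
``!13`` are placeholders:

.. code-block:: llvm

    ; Direct value: no refinement needed.
    call void @llvm.dbg.value(metadata i32 %x, metadata !12,
                              metadata !DIExpression()), !dbg !13

    ; Indirect value: DW_OP_deref loads through %p to obtain the direct value.
    call void @llvm.dbg.value(metadata i32* %p, metadata !12,
                              metadata !DIExpression(DW_OP_deref)), !dbg !13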
5720
5721 .. note::
5722
5723    A DIExpression is interpreted in the same way regardless of which kind of
5724    debug intrinsic it's attached to.
5725
.. code-block:: text

    !0 = !DIExpression(DW_OP_deref)
    !1 = !DIExpression(DW_OP_plus_uconst, 3)
    !2 = !DIExpression(DW_OP_constu, 3, DW_OP_plus)
    !3 = !DIExpression(DW_OP_bit_piece, 3, 7)
    !4 = !DIExpression(DW_OP_deref, DW_OP_constu, 3, DW_OP_plus, DW_OP_LLVM_fragment, 3, 7)
    !5 = !DIExpression(DW_OP_constu, 2, DW_OP_swap, DW_OP_xderef)
    !6 = !DIExpression(DW_OP_constu, 42, DW_OP_stack_value)
5735
DIArgList
"""""""""
5738
5739 ``DIArgList`` nodes hold a list of constant or SSA value references. These are
5740 used in :ref:`debug intrinsics<dbg_intrinsics>` (currently only in
5741 ``llvm.dbg.value``) in combination with a ``DIExpression`` that uses the
5742 ``DW_OP_LLVM_arg`` operator. Because a DIArgList may refer to local values
5743 within a function, it must only be used as a function argument, must always be
5744 inlined, and cannot appear in named metadata.
5745
5746 .. code-block:: text
5747
5748     llvm.dbg.value(metadata !DIArgList(i32 %a, i32 %b),
5749                    metadata !16,
5750                    metadata !DIExpression(DW_OP_LLVM_arg, 0, DW_OP_LLVM_arg, 1, DW_OP_plus))
5751
DIFlags
"""""""
5754
5755 These flags encode various properties of DINodes.
5756
The ``ExportSymbols`` flag marks a class, struct or union whose members
may be referenced as if they were defined in the containing class or
union. This flag is used to decide whether ``DW_AT_export_symbols`` can
be used for the structure type.
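
As a sketch, assuming a C++ anonymous struct nested in ``Outer`` (so that its
member ``x`` can be named as ``Outer::x``), the flag might appear as follows.
The scope, file and element nodes are placeholders:

.. code-block:: text

    ; struct Outer { struct { int x; }; };
    !0 = distinct !DICompositeType(tag: DW_TAG_structure_type, scope: !1,
                                   file: !2, line: 2, size: 32,
                                   flags: DIFlagExportSymbols, elements: !3)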
5761
5762 DIObjCProperty
5763 """"""""""""""
5764
5765 ``DIObjCProperty`` nodes represent Objective-C property nodes.
5766
5767 .. code-block:: text
5768
5769     !3 = !DIObjCProperty(name: "foo", file: !1, line: 7, setter: "setFoo",
5770                          getter: "getFoo", attributes: 7, type: !2)
5771
5772 DIImportedEntity
5773 """"""""""""""""
5774
5775 ``DIImportedEntity`` nodes represent entities (such as modules) imported into a
5776 compile unit.
5777
5778 .. code-block:: text
5779
5780    !2 = !DIImportedEntity(tag: DW_TAG_imported_module, name: "foo", scope: !0,
5781                           entity: !1, line: 7)
5782
5783 DIMacro
5784 """""""
5785
``DIMacro`` nodes represent the definition or undefinition of a macro
identifier. The ``name:`` field is the macro identifier, followed by macro
parameters when defining a function-like macro, and the ``value:`` field is
the token-string used to expand the macro identifier.
5790
.. code-block:: text

   !2 = !DIMacro(type: DW_MACINFO_define, line: 7, name: "foo(x)",
                 value: "((x) + 1)")
   !3 = !DIMacro(type: DW_MACINFO_undef, line: 30, name: "foo")
5796
5797 DIMacroFile
5798 """""""""""
5799
5800 ``DIMacroFile`` nodes represent inclusion of source files.
5801 The ``nodes:`` field is a list of ``DIMacro`` and ``DIMacroFile`` nodes that
5802 appear in the included source file.
5803
.. code-block:: text

   !2 = !DIMacroFile(type: DW_MACINFO_start_file, line: 7, file: !1,
                     nodes: !3)
5808
5809 .. _DILabel:
5810
5811 DILabel
5812 """""""
5813
5814 ``DILabel`` nodes represent labels within a :ref:`DISubprogram`. All fields of
5815 a ``DILabel`` are mandatory. The ``scope:`` field must be one of either a
5816 :ref:`DILexicalBlockFile`, a :ref:`DILexicalBlock`, or a :ref:`DISubprogram`.
5817 The ``name:`` field is the label identifier. The ``file:`` field is the
5818 :ref:`DIFile` the label is present in. The ``line:`` field is the source line
5819 within the file where the label is declared.
5820
5821 .. code-block:: text
5822
5823   !2 = !DILabel(scope: !0, name: "foo", file: !1, line: 7)
5824
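A label node is normally referenced from the ``llvm.dbg.label`` intrinsic. The
sketch below assumes a source label named ``retry`` on line 12; the scope node
``!0`` and file node ``!1`` are placeholders:

.. code-block:: text

    call void @llvm.dbg.label(metadata !2), !dbg !3

    !2 = !DILabel(scope: !0, name: "retry", file: !1, line: 12)
    !3 = !DILocation(line: 12, column: 1, scope: !0)
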
5825 '``tbaa``' Metadata
5826 ^^^^^^^^^^^^^^^^^^^
5827
5828 In LLVM IR, memory does not have types, so LLVM's own type system is not
5829 suitable for doing type based alias analysis (TBAA). Instead, metadata is
5830 added to the IR to describe a type system of a higher level language. This
5831 can be used to implement C/C++ strict type aliasing rules, but it can also
5832 be used to implement custom alias analysis behavior for other languages.
5833
5834 This description of LLVM's TBAA system is broken into two parts:
5835 :ref:`Semantics<tbaa_node_semantics>` talks about high level issues, and
5836 :ref:`Representation<tbaa_node_representation>` talks about the metadata
5837 encoding of various entities.
5838
5839 It is always possible to trace any TBAA node to a "root" TBAA node (details
5840 in the :ref:`Representation<tbaa_node_representation>` section).  TBAA
5841 nodes with different roots have an unknown aliasing relationship, and LLVM
5842 conservatively infers ``MayAlias`` between them.  The rules mentioned in
5843 this section only pertain to TBAA nodes living under the same root.
5844
5845 .. _tbaa_node_semantics:
5846
5847 Semantics
5848 """""""""
5849
5850 The TBAA metadata system, referred to as "struct path TBAA" (not to be
5851 confused with ``tbaa.struct``), consists of the following high level
5852 concepts: *Type Descriptors*, further subdivided into scalar type
5853 descriptors and struct type descriptors; and *Access Tags*.
5854
5855 **Type descriptors** describe the type system of the higher level language
5856 being compiled.  **Scalar type descriptors** describe types that do not
5857 contain other types.  Each scalar type has a parent type, which must also
5858 be a scalar type or the TBAA root.  Via this parent relation, scalar types
5859 within a TBAA root form a tree.  **Struct type descriptors** denote types
5860 that contain a sequence of other type descriptors, at known offsets.  These
5861 contained type descriptors can either be struct type descriptors themselves
5862 or scalar type descriptors.
5863
5864 **Access tags** are metadata nodes attached to load and store instructions.
5865 Access tags use type descriptors to describe the *location* being accessed
5866 in terms of the type system of the higher level language.  Access tags are
5867 tuples consisting of a base type, an access type and an offset.  The base
5868 type is a scalar type descriptor or a struct type descriptor, the access
5869 type is a scalar type descriptor, and the offset is a constant integer.
5870
5871 The access tag ``(BaseTy, AccessTy, Offset)`` can describe one of two
5872 things:
5873
5874  * If ``BaseTy`` is a struct type, the tag describes a memory access (load
5875    or store) of a value of type ``AccessTy`` contained in the struct type
5876    ``BaseTy`` at offset ``Offset``.
5877
5878  * If ``BaseTy`` is a scalar type, ``Offset`` must be 0 and ``BaseTy`` and
5879    ``AccessTy`` must be the same; and the access tag describes a scalar
5880    access with scalar type ``AccessTy``.
5881
5882 We first define an ``ImmediateParent`` relation on ``(BaseTy, Offset)``
5883 tuples this way:
5884
5885  * If ``BaseTy`` is a scalar type then ``ImmediateParent(BaseTy, 0)`` is
5886    ``(ParentTy, 0)`` where ``ParentTy`` is the parent of the scalar type as
5887    described in the TBAA metadata.  ``ImmediateParent(BaseTy, Offset)`` is
5888    undefined if ``Offset`` is non-zero.
5889
5890  * If ``BaseTy`` is a struct type then ``ImmediateParent(BaseTy, Offset)``
5891    is ``(NewTy, NewOffset)`` where ``NewTy`` is the type contained in
5892    ``BaseTy`` at offset ``Offset`` and ``NewOffset`` is ``Offset`` adjusted
5893    to be relative within that inner type.
5894
5895 A memory access with an access tag ``(BaseTy1, AccessTy1, Offset1)``
5896 aliases a memory access with an access tag ``(BaseTy2, AccessTy2,
5897 Offset2)`` if either ``(BaseTy1, Offset1)`` is reachable from ``(BaseTy2,
5898 Offset2)`` via the ``ImmediateParent`` relation or vice versa.
5899
5900 As a concrete example, the type descriptor graph for the following program
5901
5902 .. code-block:: c
5903
5904     struct Inner {
5905       int i;    // offset 0
5906       float f;  // offset 4
5907     };
5908
5909     struct Outer {
5910       float f;  // offset 0
5911       double d; // offset 4
5912       struct Inner inner_a;  // offset 12
5913     };
5914
5915     void f(struct Outer* outer, struct Inner* inner, float* f, int* i, char* c) {
5916       outer->f = 0;            // tag0: (OuterStructTy, FloatScalarTy, 0)
5917       outer->inner_a.i = 0;    // tag1: (OuterStructTy, IntScalarTy, 12)
5918       outer->inner_a.f = 0.0;  // tag2: (OuterStructTy, FloatScalarTy, 16)
5919       *f = 0.0;                // tag3: (FloatScalarTy, FloatScalarTy, 0)
5920     }
5921
5922 is (note that in C and C++, ``char`` can be used to access any arbitrary
5923 type):
5924
5925 .. code-block:: text
5926
5927     Root = "TBAA Root"
5928     CharScalarTy = ("char", Root, 0)
5929     FloatScalarTy = ("float", CharScalarTy, 0)
5930     DoubleScalarTy = ("double", CharScalarTy, 0)
5931     IntScalarTy = ("int", CharScalarTy, 0)
5932     InnerStructTy = {"Inner", (IntScalarTy, 0), (FloatScalarTy, 4)}
5933     OuterStructTy = {"Outer", (FloatScalarTy, 0), (DoubleScalarTy, 4),
5934                      (InnerStructTy, 12)}
5935
5936
5937 with (e.g.) ``ImmediateParent(OuterStructTy, 12)`` = ``(InnerStructTy,
5938 0)``, ``ImmediateParent(InnerStructTy, 0)`` = ``(IntScalarTy, 0)``, and
5939 ``ImmediateParent(IntScalarTy, 0)`` = ``(CharScalarTy, 0)``.
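
As a worked illustration of the aliasing rule (a sketch derived only from the
definitions above, writing ``->`` for one application of ``ImmediateParent``),
the ``(BaseTy, Offset)`` pairs of the four access tags in the example expand
as follows:

.. code-block:: text

    tag0: (OuterStructTy, 0)  -> (FloatScalarTy, 0) -> (CharScalarTy, 0)
    tag1: (OuterStructTy, 12) -> (InnerStructTy, 0) -> (IntScalarTy, 0) -> (CharScalarTy, 0)
    tag2: (OuterStructTy, 16) -> (InnerStructTy, 4) -> (FloatScalarTy, 0) -> (CharScalarTy, 0)
    tag3: (FloatScalarTy, 0)  -> (CharScalarTy, 0)

The chains of ``tag0`` and ``tag2`` both pass through ``(FloatScalarTy, 0)``,
which is the starting pair of ``tag3``, so each of them aliases ``tag3``.  The
starting pair of ``tag1`` does not appear in the chain of ``tag3`` and vice
versa, so ``tag1`` and ``tag3`` do not alias.  Likewise, ``tag0``, ``tag1`` and
``tag2`` are pairwise non-aliasing because none of their starting pairs appears
in the chain of another.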
5940
5941 .. _tbaa_node_representation:
5942
5943 Representation
5944 """"""""""""""
5945
5946 The root node of a TBAA type hierarchy is an ``MDNode`` with 0 operands or
5947 with exactly one ``MDString`` operand.
5948
5949 Scalar type descriptors are represented as ``MDNode`` s with two
5950 operands.  The first operand is an ``MDString`` denoting the name of the
5951 scalar type.  LLVM does not assign meaning to the value of this operand; it
5952 only cares about it being an ``MDString``.  The second operand is an
5953 ``MDNode`` which points to the parent for said scalar type descriptor,
5954 which is either another scalar type descriptor or the TBAA root.  Scalar
5955 type descriptors can have an optional third operand, but that must be the
5956 constant integer zero.
5957
5958 Struct type descriptors are represented as ``MDNode`` s with an odd number
5959 of operands greater than 1.  The first operand is an ``MDString`` denoting
5960 the name of the struct type.  As in scalar type descriptors, the actual
5961 value of this name operand is irrelevant to LLVM.  After the name operand,
5962 the struct type descriptors have a sequence of alternating ``MDNode`` and
5963 ``ConstantInt`` operands.  With N starting from 1, the 2N - 1 th operand,
5964 an ``MDNode``, denotes a contained field, and the 2N th operand, a
5965 ``ConstantInt``, is the offset of the said contained field.  The offsets
5966 must be in non-decreasing order.
5967
5968 Access tags are represented as ``MDNode`` s with either 3 or 4 operands.
5969 The first operand is an ``MDNode`` pointing to the node representing the
5970 base type.  The second operand is an ``MDNode`` pointing to the node
5971 representing the access type.  The third operand is a ``ConstantInt`` that
5972 states the offset of the access.  If a fourth operand is present, it must be
5973 a ``ConstantInt`` valued at 0 or 1.  If it is 1 then the access tag states
5974 that the location being accessed is "constant" (meaning
5975 ``pointsToConstantMemory`` should return true; see `other useful
5976 AliasAnalysis methods <AliasAnalysis.html#OtherItfs>`_).  The TBAA root of
5977 the access type and the base type of an access tag must be the same, and
5978 that is the TBAA root of the access tag.
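
The type descriptors and access tags from the example in the
:ref:`Semantics<tbaa_node_semantics>` section could be encoded as shown below.
This is an illustrative sketch: the metadata numbering, the function ``@f`` and
its parameter names are assumptions chosen for the example, not the output of
any particular frontend.

.. code-block:: llvm

    define void @f(float* %outer_f, float* %f) {
      store float 0.000000e+00, float* %outer_f, align 4, !tbaa !7  ; tag0
      store float 0.000000e+00, float* %f, align 4, !tbaa !10       ; tag3
      ret void
    }

    ; Root: a single MDString operand.
    !0 = !{!"TBAA Root"}

    ; Scalar type descriptors: name, parent, optional constant zero.
    !1 = !{!"char", !0, i64 0}
    !2 = !{!"float", !1, i64 0}
    !3 = !{!"double", !1, i64 0}
    !4 = !{!"int", !1, i64 0}

    ; Struct type descriptors: name, then (field type, offset) pairs.
    !5 = !{!"Inner", !4, i64 0, !2, i64 4}
    !6 = !{!"Outer", !2, i64 0, !3, i64 4, !5, i64 12}

    ; Access tags: base type, access type, offset.
    !7 = !{!6, !2, i64 0}    ; tag0: outer->f
    !8 = !{!6, !4, i64 12}   ; tag1: outer->inner_a.i
    !9 = !{!6, !2, i64 16}   ; tag2: outer->inner_a.f
    !10 = !{!2, !2, i64 0}   ; tag3: *f

Note that the struct descriptors have 5 and 7 operands respectively, matching
the requirement of an odd operand count greater than 1.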
5979
5980 '``tbaa.struct``' Metadata
5981 ^^^^^^^^^^^^^^^^^^^^^^^^^^
5982
5983 The :ref:`llvm.memcpy <int_memcpy>` intrinsic is often used to implement
5984 aggregate assignment operations in C and similar languages. However, it
5985 is defined to copy a contiguous region of memory, which is more than
5986 strictly necessary for aggregate types which contain holes due to
5987 padding. Also, it doesn't contain any TBAA information about the fields
5988 of the aggregate.
5989
5990 ``!tbaa.struct`` metadata can describe which memory subregions in a
5991 memcpy are padding and what the TBAA tags of the struct are.
5992
5993 The current metadata format is very simple. ``!tbaa.struct`` metadata
5994 nodes are a list of operands which are in conceptual groups of three.
5995 For each group of three, the first operand gives the byte offset of a
5996 field, the second gives its size in bytes, and the third gives its TBAA
5997 tag. For example:
5998
5999 .. code-block:: llvm
6000
6001     !4 = !{ i64 0, i64 4, !1, i64 8, i64 4, !2 }
6002
6003 This describes a struct with two fields. The first is at offset 0 bytes
6004 with size 4 bytes, and has TBAA tag ``!1``. The second is at offset 8 bytes
6005 with size 4 bytes, and has TBAA tag ``!2``.
6006
6007 Note that the fields need not be contiguous. In this example, there is a
6008 4 byte gap between the two fields. This gap represents padding which
6009 does not carry useful data and need not be preserved.
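
A sketch of how such a node might be attached to the copy of a 12-byte struct
is shown below. The function ``@copy``, the access tags ``!1`` and ``!2``, and
the type descriptors they refer to are assumptions made up for this
illustration.

.. code-block:: llvm

    declare void @llvm.memcpy.p0i8.p0i8.i64(i8*, i8*, i64, i1)

    define void @copy(i8* %dst, i8* %src) {
      ; Copies 12 bytes; only bytes [0,4) and [8,12) carry field data.
      call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 4 %dst, i8* align 4 %src,
                                           i64 12, i1 false), !tbaa.struct !4
      ret void
    }

    !4 = !{ i64 0, i64 4, !1, i64 8, i64 4, !2 }

    ; Hypothetical TBAA access tags and type descriptors for the two fields.
    !1 = !{!6, !6, i64 0}          ; scalar access to an int
    !2 = !{!7, !7, i64 0}          ; scalar access to a float
    !3 = !{!"TBAA Root"}
    !5 = !{!"char", !3, i64 0}
    !6 = !{!"int", !5, i64 0}
    !7 = !{!"float", !5, i64 0}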
6010
6011 '``noalias``' and '``alias.scope``' Metadata
6012 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6013
6014 ``noalias`` and ``alias.scope`` metadata provide the ability to specify generic
6015 noalias memory-access sets. This means that some collection of memory access
6016 instructions (loads, stores, memory-accessing calls, etc.) that carry
6017 ``noalias`` metadata can be explicitly declared not to alias with some other
6018 collection of memory access instructions that carry ``alias.scope`` metadata.
6019 Each type of metadata specifies a list of scopes where each scope has an id and
6020 a domain.
6021
6022 When evaluating an aliasing query, if for some domain, the set
6023 of scopes with that domain in one instruction's ``alias.scope`` list is a
6024 subset of (or equal to) the set of scopes for that domain in another
6025 instruction's ``noalias`` list, then the two memory accesses are assumed not to
6026 alias.
6027
6028 Because scopes in one domain don't affect scopes in other domains, separate
6029 domains can be used to compose multiple independent noalias sets.  This is
6030 used for example during inlining.  As the noalias function parameters are
6031 turned into noalias scope metadata, a new domain is used every time the
6032 function is inlined.
6033
6034 The metadata identifying each domain is itself a list containing one or two
6035 entries. The first entry is the name of the domain. Note that if the name is a
6036 string then it can be combined across functions and translation units. A
6037 self-reference can be used to create globally unique domain names. A
6038 descriptive string may optionally be provided as a second list entry.
6039
6040 The metadata identifying each scope is also itself a list containing two or
6041 three entries. The first entry is the name of the scope. Note that if the name
6042 is a string then it can be combined across functions and translation units. A
6043 self-reference can be used to create globally unique scope names. A metadata
6044 reference to the scope's domain is the second entry. A descriptive string may
6045 optionally be provided as a third list entry.
6046
6047 For example,
6048
6049 .. code-block:: llvm
6050
6051     ; Two scope domains:
6052     !0 = !{!0}
6053     !1 = !{!1}
6054
6055     ; Some scopes in these domains:
6056     !2 = !{!2, !0}
6057     !3 = !{!3, !0}
6058     !4 = !{!4, !1}
6059
6060     ; Some scope lists:
6061     !5 = !{!4} ; A list containing only scope !4
6062     !6 = !{!4, !3, !2}
6063     !7 = !{!3}
6064
6065     ; These two instructions don't alias:
6066     %0 = load float, float* %c, align 4, !alias.scope !5
6067     store float %0, float* %arrayidx.i, align 4, !noalias !5
6068
6069     ; These two instructions also don't alias (for domain !1, the set of scopes
6070     ; in the !alias.scope equals that in the !noalias list):
6071     %2 = load float, float* %c, align 4, !alias.scope !5
6072     store float %2, float* %arrayidx.i2, align 4, !noalias !6
6073
6074     ; These two instructions may alias (for domain !0, the set of scopes in
6075     ; the !noalias list is not a superset of, or equal to, the scopes in the
6076     ; !alias.scope list):
6077     %2 = load float, float* %c, align 4, !alias.scope !6
6078     store float %0, float* %arrayidx.i, align 4, !noalias !7
6079
6080 '``fpmath``' Metadata
6081 ^^^^^^^^^^^^^^^^^^^^^
6082
6083 ``fpmath`` metadata may be attached to any instruction of floating-point
6084 type. It can be used to express the maximum acceptable error in the
6085 result of that instruction, in ULPs, thus potentially allowing the
6086 compiler to use a more efficient but less accurate method of computing
6087 it. ULP is defined as follows:
6088
6089     If ``x`` is a real number that lies between two finite consecutive
6090     floating-point numbers ``a`` and ``b``, without being equal to one
6091     of them, then ``ulp(x) = |b - a|``, otherwise ``ulp(x)`` is the
6092     distance between the two non-equal finite floating-point numbers
6093     nearest ``x``. Moreover, ``ulp(NaN)`` is ``NaN``.
6094
6095 The metadata node shall consist of a single positive ``float`` constant
6096 representing the maximum relative error, for example:
6097
6098 .. code-block:: llvm
6099
6100     !0 = !{ float 2.5 } ; maximum acceptable inaccuracy is 2.5 ULPs
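
As a minimal sketch of how the node is attached, the following function allows
its division to be computed with up to 2.5 ULPs of error. The function name
``@approx_div`` is an assumption chosen for illustration.

.. code-block:: llvm

    define float @approx_div(float %x, float %y) {
      ; The result of this fdiv may deviate by up to 2.5 ULPs.
      %q = fdiv float %x, %y, !fpmath !0
      ret float %q
    }

    !0 = !{ float 2.5 }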
6101
6102 .. _range-metadata:
6103
6104 '``range``' Metadata
6105 ^^^^^^^^^^^^^^^^^^^^
6106
6107 ``range`` metadata may be attached only to ``load``, ``call`` and ``invoke`` of
6108 integer types. It expresses the possible ranges the loaded value or the value
6109 returned by the called function at this call site is in. If the loaded or
6110 returned value is not in the specified range, the behavior is undefined. The
6111 ranges are represented with a flattened list of integers. The loaded value or
6112 the value returned is known to be in the union of the ranges defined by each
6113 consecutive pair. Each pair has the following properties:
6114
6115 -  The type must match the type loaded by the instruction.
6116 -  The pair ``a,b`` represents the range ``[a,b)``.
6117 -  Both ``a`` and ``b`` are constants.
6118 -  The range is allowed to wrap.
6119 -  The range should not represent the full or empty set. That is,
6120    ``a!=b``.
6121
6122 In addition, the pairs must be in signed order of the lower bound and
6123 they must be non-contiguous.
6124
6125 Examples:
6126
6127 .. code-block:: llvm
6128
6129       %a = load i8, i8* %x, align 1, !range !0 ; Can only be 0 or 1
6130       %b = load i8, i8* %y, align 1, !range !1 ; Can only be 255 (-1), 0 or 1
6131       %c = call i8 @foo(),       !range !2 ; Can only be 0, 1, 3, 4 or 5
6132       %d = invoke i8 @bar() to label %cont
6133              unwind label %lpad, !range !3 ; Can only be -2, -1, 3, 4 or 5
6134     ...
6135     !0 = !{ i8 0, i8 2 }
6136     !1 = !{ i8 255, i8 2 }
6137     !2 = !{ i8 0, i8 2, i8 3, i8 6 }
6138     !3 = !{ i8 -2, i8 0, i8 3, i8 6 }
6139
6140 '``absolute_symbol``' Metadata
6141 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6142
6143 ``absolute_symbol`` metadata may be attached to a global variable
6144 declaration. It marks the declaration as a reference to an absolute symbol,
6145 which causes the backend to use absolute relocations for the symbol even
6146 in position independent code, and expresses the possible ranges that the
6147 global variable's *address* (not its value) is in, in the same format as
6148 ``range`` metadata, with the extension that the pair ``all-ones,all-ones``
6149 may be used to represent the full set.
6150
6151 Example (assuming 64-bit pointers):
6152
6153 .. code-block:: llvm
6154
6155       @a = external global i8, !absolute_symbol !0 ; Absolute symbol in range [0,256)
6156       @b = external global i8, !absolute_symbol !1 ; Absolute symbol in range [0,2^64)
6157
6158     ...
6159     !0 = !{ i64 0, i64 256 }
6160     !1 = !{ i64 -1, i64 -1 }
6161
6162 '``callees``' Metadata
6163 ^^^^^^^^^^^^^^^^^^^^^^
6164
6165 ``callees`` metadata may be attached to indirect call sites. If ``callees``
6166 metadata is attached to a call site, and any callee is not among the set of
6167 functions provided by the metadata, the behavior is undefined. The intent of
6168 this metadata is to facilitate optimizations such as indirect-call promotion.
6169 For example, in the code below, the call instruction may only target the
6170 ``add`` or ``sub`` functions:
6171
6172 .. code-block:: llvm
6173
6174     %result = call i64 %binop(i64 %x, i64 %y), !callees !0
6175
6176     ...
6177     !0 = !{i64 (i64, i64)* @add, i64 (i64, i64)* @sub}
6178
6179 '``callback``' Metadata
6180 ^^^^^^^^^^^^^^^^^^^^^^^
6181
6182 ``callback`` metadata may be attached to a function declaration or definition.
6183 (Call sites are excluded only due to the lack of a use case.) For ease of
6184 exposition, we'll refer to the function annotated with the metadata as a broker
6185 function. The metadata describes how the arguments of a call to the broker are
6186 in turn passed to the callback function specified by the metadata. Thus, the
6187 ``callback`` metadata provides a partial description of a call site inside the
6188 broker function with regard to the arguments of a call to the broker. The only
6189 semantic restriction on the broker function itself is that it is not allowed to
6190 inspect or modify arguments referenced in the ``callback`` metadata as
6191 pass-through to the callback function.
6192
6193 The broker is not required to actually invoke the callback function at runtime.
6194 However, the assumptions about not inspecting or modifying arguments that would
6195 be passed to the specified callback function still hold, even if the callback
6196 function is not dynamically invoked. The broker is allowed to invoke the
6197 callback function more than once per invocation of the broker. The broker is
6198 also allowed to invoke (directly or indirectly) the function passed as a
6199 callback through another use. Finally, the broker is also allowed to relay the
6200 callback callee invocation to a different thread.
6201
6202 The metadata is structured as follows: At the outer level, ``callback``
6203 metadata is a list of ``callback`` encodings. Each encoding starts with a
6204 constant ``i64`` which describes the argument position of the callback function
6205 in the call to the broker. The following elements, except the last, describe
6206 what arguments are passed to the callback function. Each element is again an
6207 ``i64`` constant identifying the argument of the broker that is passed through,
6208 or ``i64 -1`` to indicate an unknown or inspected argument. The order in which
6209 they are listed has to be the same in which they are passed to the callback
6210 callee. The last element of the encoding is a boolean which specifies how
6211 variadic arguments of the broker are handled. If it is true, all variadic
6212 arguments of the broker are passed through to the callback function *after* the
6213 arguments encoded explicitly before.
6214
6215 In the code below, the ``pthread_create`` function is marked as a broker
6216 through the ``!callback !1`` metadata. In the example, there is only one
6217 callback encoding, namely ``!2``, associated with the broker. This encoding
6218 identifies the callback function as broker argument number 2 (``i64 2``,
6219 counting from zero) and the sole argument of the callback function as broker
6220 argument number 3 (``i64 3``).
6221
6222 .. FIXME why does the llvm-sphinx-docs builder give a highlighting
6223    error if the below is set to highlight as 'llvm', despite that we
6224    have misc.highlighting_failure set?
6225
6226 .. code-block:: text
6227
6228     declare !callback !1 dso_local i32 @pthread_create(i64*, %union.pthread_attr_t*, i8* (i8*)*, i8*)
6229
6230     ...
6231     !2 = !{i64 2, i64 3, i1 false}
6232     !1 = !{!2}
6233
6234 Another example is shown below. The callback callee is argument number 2,
6235 counting from zero, of the ``__kmpc_fork_call`` function (``i64 2``). The
6236 callee is given two unknown values (each identified by an ``i64 -1``) and
6237 afterwards all variadic arguments that are passed to the
6238 ``__kmpc_fork_call`` call (due to the final ``i1 true``).
6239
6240 .. FIXME why does the llvm-sphinx-docs builder give a highlighting
6241    error if the below is set to highlight as 'llvm', despite that we
6242    have misc.highlighting_failure set?
6243
6244 .. code-block:: text
6245
6246     declare !callback !0 dso_local void @__kmpc_fork_call(%struct.ident_t*, i32, void (i32*, i32*, ...)*, ...)
6247
6248     ...
6249     !1 = !{i64 2, i64 -1, i64 -1, i1 true}
6250     !0 = !{!1}
6251
6252
6253 '``unpredictable``' Metadata
6254 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6255
6256 ``unpredictable`` metadata may be attached to any branch or switch
6257 instruction. It can be used to express the unpredictability of control
6258 flow. Similar to the ``llvm.expect`` intrinsic, it may be used to alter
6259 optimizations related to compare and branch instructions. The metadata
6260 is treated as a boolean value; if it exists, it signals that the branch
6261 or switch that it is attached to is completely unpredictable.
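
A minimal sketch of an annotated branch follows. Because only the presence of
the metadata matters, an empty node is used here; the function and label names
are assumptions for illustration.

.. code-block:: llvm

    define i32 @select_path(i1 %cond) {
    entry:
      ; The direction of this branch is declared unpredictable.
      br i1 %cond, label %then, label %else, !unpredictable !0

    then:
      ret i32 1

    else:
      ret i32 0
    }

    !0 = !{}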
6262
6263 .. _md_dereferenceable:
6264
6265 '``dereferenceable``' Metadata
6266 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6267
6268 The existence of the ``!dereferenceable`` metadata on the instruction
6269 tells the optimizer that the value loaded is known to be dereferenceable.
6270 The number of bytes known to be dereferenceable is specified by the integer
6271 value in the metadata node. This is analogous to the ``dereferenceable``
6272 attribute on parameters and return values.
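
For example, in the sketch below the loaded pointer ``%p`` is declared to point
to at least 4 dereferenceable bytes. The function name ``@read`` is an
assumption chosen for illustration.

.. code-block:: llvm

    define i32 @read(i32** %pp) {
      ; %p is known to be dereferenceable for 4 bytes.
      %p = load i32*, i32** %pp, align 8, !dereferenceable !0
      %v = load i32, i32* %p, align 4
      ret i32 %v
    }

    !0 = !{i64 4}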
6273
6274 .. _md_dereferenceable_or_null:
6275
6276 '``dereferenceable_or_null``' Metadata
6277 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6278
6279 The existence of the ``!dereferenceable_or_null`` metadata on the
6280 instruction tells the optimizer that the value loaded is known to be either
6281 dereferenceable or null.
6282 The number of bytes known to be dereferenceable is specified by the integer
6283 value in the metadata node. This is analogous to the ``dereferenceable_or_null``
6284 attribute on parameters and return values.
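
In the sketch below, the loaded pointer is either null or points to at least 4
dereferenceable bytes, so the code checks for null before using it. The
function and label names are assumptions for illustration.

.. code-block:: llvm

    define i32 @read_or_zero(i32** %pp) {
    entry:
      %p = load i32*, i32** %pp, align 8, !dereferenceable_or_null !0
      %isnull = icmp eq i32* %p, null
      br i1 %isnull, label %done, label %deref

    deref:
      ; On this path %p is non-null, hence dereferenceable for 4 bytes.
      %v = load i32, i32* %p, align 4
      br label %done

    done:
      %r = phi i32 [ 0, %entry ], [ %v, %deref ]
      ret i32 %r
    }

    !0 = !{i64 4}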
6285
6286 .. _llvm.loop:
6287
6288 '``llvm.loop``'
6289 ^^^^^^^^^^^^^^^
6290
6291 It is sometimes useful to attach information to loop constructs. Currently,
6292 loop metadata is implemented as metadata attached to the branch instruction
6293 in the loop latch block. The loop metadata node is a list of
6294 other metadata nodes, each representing a property of the loop. Usually,
6295 the first item of the property node is a string. For example, the
6296 ``llvm.loop.unroll.count`` suggests an unroll factor to the loop
6297 unroller:
6298
6299 .. code-block:: llvm
6300
6301       br i1 %exitcond, label %._crit_edge, label %.lr.ph, !llvm.loop !0
6302     ...
6303     !0 = !{!0, !1, !2}
6304     !1 = !{!"llvm.loop.unroll.enable"}
6305     !2 = !{!"llvm.loop.unroll.count", i32 4}
6306
6307 For legacy reasons, the first item of a loop metadata node must be a
6308 reference to itself. Before the advent of the 'distinct' keyword, this
6309 forced the preservation of otherwise identical metadata nodes. Since
6310 the loop-metadata node can be attached to multiple nodes, the 'distinct'
6311 keyword has become unnecessary.
6312
6313 Prior to the property nodes, one or two ``DILocation`` (debug location)
6314 nodes can be present in the list. The first, if present, identifies the
6315 source-code location where the loop begins. The second, if present,
6316 identifies the source-code location where the loop ends.
6317
6318 Loop metadata nodes cannot be used as unique identifiers. They are
6319 neither persistent for the same loop through transformations nor
6320 necessarily unique to just one loop.
6321
6322 '``llvm.loop.disable_nonforced``'
6323 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6324
6325 This metadata disables all optional loop transformations unless
6326 explicitly instructed using other transformation metadata such as
6327 ``llvm.loop.unroll.enable``. That is, no heuristic will try to determine
6328 whether a transformation is profitable. The purpose is to prevent the
6329 loop from being transformed into a different loop before an explicitly
6330 requested (forced) transformation is applied. For instance, loop fusion can make
6331 other transformations impossible. Mandatory loop canonicalizations such
6332 as loop rotation are still applied.
6333
6334 It is recommended to use this metadata in addition to any ``llvm.loop.*``
6335 transformation directive. Also, any loop should have at most one
6336 directive applied to it (and a sequence of transformations built using
6337 followup-attributes). Otherwise, which transformation will be applied
6338 depends on implementation details such as the pass pipeline order.
6339
6340 See :ref:`transformation-metadata` for details.
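
As a sketch, a loop that should be unrolled by a factor of 4 and otherwise left
untouched by heuristic transformations might carry the following metadata. The
label names and metadata numbering are assumptions for illustration.

.. code-block:: llvm

      br i1 %exitcond, label %exit, label %loop, !llvm.loop !0
    ...
    !0 = distinct !{!0, !1, !2}
    !1 = !{!"llvm.loop.disable_nonforced"}
    !2 = !{!"llvm.loop.unroll.count", i32 4}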
6341
6342 '``llvm.loop.vectorize``' and '``llvm.loop.interleave``'
6343 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6344
6345 Metadata prefixed with ``llvm.loop.vectorize`` or ``llvm.loop.interleave`` are
6346 used to control per-loop vectorization and interleaving parameters such as
6347 vectorization width and interleave count. These metadata should be used in
6348 conjunction with ``llvm.loop`` loop identification metadata. The
6349 ``llvm.loop.vectorize`` and ``llvm.loop.interleave`` metadata are only
6350 optimization hints and the optimizer will only interleave and vectorize loops if
6351 it believes it is safe to do so. The ``llvm.loop.parallel_accesses`` metadata
6352 which contains information about loop-carried memory dependencies can be helpful
6353 in determining the safety of these transformations.
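
As a sketch, several of these hints can be combined in one loop metadata node
attached to the latch branch. The label names and the chosen width and
interleave count are assumptions for illustration; the individual properties
are described in the sections that follow.

.. code-block:: llvm

      br i1 %exitcond, label %exit, label %loop, !llvm.loop !0
    ...
    !0 = distinct !{!0, !1, !2, !3}
    !1 = !{!"llvm.loop.vectorize.enable", i1 1}
    !2 = !{!"llvm.loop.vectorize.width", i32 4}
    !3 = !{!"llvm.loop.interleave.count", i32 2}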
6354
6355 '``llvm.loop.interleave.count``' Metadata
6356 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6357
6358 This metadata suggests an interleave count to the loop interleaver.
6359 The first operand is the string ``llvm.loop.interleave.count`` and the
6360 second operand is an integer specifying the interleave count. For
6361 example:
6362
6363 .. code-block:: llvm
6364
6365    !0 = !{!"llvm.loop.interleave.count", i32 4}
6366
6367 Note that setting ``llvm.loop.interleave.count`` to 1 disables interleaving
6368 multiple iterations of the loop. If ``llvm.loop.interleave.count`` is set to 0
6369 then the interleave count will be determined automatically.
6370
6371 '``llvm.loop.vectorize.enable``' Metadata
6372 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6373
6374 This metadata selectively enables or disables vectorization for the loop. The
6375 first operand is the string ``llvm.loop.vectorize.enable`` and the second operand
6376 is a bit. If the bit operand value is 1 vectorization is enabled. A value of
6377 0 disables vectorization:
6378
6379 .. code-block:: llvm
6380
6381    !0 = !{!"llvm.loop.vectorize.enable", i1 0}
6382    !1 = !{!"llvm.loop.vectorize.enable", i1 1}
6383
6384 '``llvm.loop.vectorize.predicate.enable``' Metadata
6385 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6386
6387 This metadata selectively enables or disables creating predicated instructions
6388 for the loop, which can enable folding of the scalar epilogue loop into the
6389 main loop. The first operand is the string
6390 ``llvm.loop.vectorize.predicate.enable`` and the second operand is a bit. If
6391 the bit operand value is 1, predication is enabled. A value of 0 disables
6392 predication:
6393
6394 .. code-block:: llvm
6395
6396    !0 = !{!"llvm.loop.vectorize.predicate.enable", i1 0}
6397    !1 = !{!"llvm.loop.vectorize.predicate.enable", i1 1}
6398
6399 '``llvm.loop.vectorize.scalable.enable``' Metadata
6400 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6401
6402 This metadata selectively enables or disables scalable vectorization for the
6403 loop, and only has any effect if vectorization for the loop is already enabled.
6404 The first operand is the string ``llvm.loop.vectorize.scalable.enable``
6405 and the second operand is a bit. If the bit operand value is 1 scalable
6406 vectorization is enabled, whereas a value of 0 reverts to the default fixed
6407 width vectorization:
6408
6409 .. code-block:: llvm
6410
6411    !0 = !{!"llvm.loop.vectorize.scalable.enable", i1 0}
6412    !1 = !{!"llvm.loop.vectorize.scalable.enable", i1 1}
6413
6414 '``llvm.loop.vectorize.width``' Metadata
6415 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6416
6417 This metadata sets the target width of the vectorizer. The first
6418 operand is the string ``llvm.loop.vectorize.width`` and the second
6419 operand is an integer specifying the width. For example:
6420
6421 .. code-block:: llvm
6422
6423    !0 = !{!"llvm.loop.vectorize.width", i32 4}
6424
6425 Note that setting ``llvm.loop.vectorize.width`` to 1 disables
6426 vectorization of the loop. If ``llvm.loop.vectorize.width`` is set to
6427 0 or if the loop does not have this metadata the width will be
6428 determined automatically.
6429
6430 '``llvm.loop.vectorize.followup_vectorized``' Metadata
6431 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6432
6433 This metadata defines which loop attributes the vectorized loop will
6434 have. See :ref:`transformation-metadata` for details.
6435
6436 '``llvm.loop.vectorize.followup_epilogue``' Metadata
6437 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6438
6439 This metadata defines which loop attributes the epilogue will have. The
6440 epilogue is not vectorized and is executed when either the vectorized
6441 loop is not known to preserve semantics (because e.g., it processes two
6442 arrays that are found to alias by a runtime check) or for the last
6443 iterations that do not fill a complete set of vector lanes. See
6444 :ref:`Transformation Metadata <transformation-metadata>` for details.
6445
6446 '``llvm.loop.vectorize.followup_all``' Metadata
6447 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6448
6449 Attributes in the metadata will be added to both the vectorized and
6450 epilogue loop.
6451 See :ref:`Transformation Metadata <transformation-metadata>` for details.
6452
6453 '``llvm.loop.unroll``'
6454 ^^^^^^^^^^^^^^^^^^^^^^
6455
6456 Metadata prefixed with ``llvm.loop.unroll`` are loop unrolling
6457 optimization hints such as the unroll factor. ``llvm.loop.unroll``
6458 metadata should be used in conjunction with ``llvm.loop`` loop
6459 identification metadata. The ``llvm.loop.unroll`` metadata are only
6460 optimization hints and the unrolling will only be performed if the
6461 optimizer believes it is safe to do so.
6462
6463 '``llvm.loop.unroll.count``' Metadata
6464 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6465
6466 This metadata suggests an unroll factor to the loop unroller. The
6467 first operand is the string ``llvm.loop.unroll.count`` and the second
6468 operand is a positive integer specifying the unroll factor. For
6469 example:
6470
6471 .. code-block:: llvm
6472
6473    !0 = !{!"llvm.loop.unroll.count", i32 4}
6474
6475 If the trip count of the loop is less than the unroll count the loop
6476 will be partially unrolled.
6477
6478 '``llvm.loop.unroll.disable``' Metadata
6479 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6480
6481 This metadata disables loop unrolling. The metadata has a single operand
6482 which is the string ``llvm.loop.unroll.disable``. For example:
6483
6484 .. code-block:: llvm
6485
6486    !0 = !{!"llvm.loop.unroll.disable"}
6487
6488 '``llvm.loop.unroll.runtime.disable``' Metadata
6489 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6490
6491 This metadata disables runtime loop unrolling. The metadata has a single
6492 operand which is the string ``llvm.loop.unroll.runtime.disable``. For example:
6493
6494 .. code-block:: llvm
6495
6496    !0 = !{!"llvm.loop.unroll.runtime.disable"}
6497
6498 '``llvm.loop.unroll.enable``' Metadata
6499 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6500
6501 This metadata suggests that the loop should be fully unrolled if the trip count
6502 is known at compile time and partially unrolled if the trip count is not known
6503 at compile time. The metadata has a single operand which is the string
6504 ``llvm.loop.unroll.enable``.  For example:
6505
6506 .. code-block:: llvm
6507
6508    !0 = !{!"llvm.loop.unroll.enable"}
6509
6510 '``llvm.loop.unroll.full``' Metadata
6511 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6512
6513 This metadata suggests that the loop should be unrolled fully. The
6514 metadata has a single operand which is the string ``llvm.loop.unroll.full``.
6515 For example:
6516
6517 .. code-block:: llvm
6518
6519    !0 = !{!"llvm.loop.unroll.full"}
6520
6521 '``llvm.loop.unroll.followup``' Metadata
6522 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6523
6524 This metadata defines which loop attributes the unrolled loop will have.
6525 See :ref:`Transformation Metadata <transformation-metadata>` for details.
6526
6527 '``llvm.loop.unroll.followup_remainder``' Metadata
6528 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6529
6530 This metadata defines which loop attributes the remainder loop after
6531 partial/runtime unrolling will have. See
6532 :ref:`Transformation Metadata <transformation-metadata>` for details.
6533
6534 '``llvm.loop.unroll_and_jam``'
6535 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6536
6537 This metadata is treated very similarly to the ``llvm.loop.unroll`` metadata
6538 above, but affects the unroll and jam pass. In addition, any loop with
6539 ``llvm.loop.unroll`` metadata but no ``llvm.loop.unroll_and_jam`` metadata will
6540 disable unroll and jam (so ``llvm.loop.unroll`` metadata will be left to the
6541 unroller, and ``llvm.loop.unroll.disable`` metadata will disable unroll and jam
6542 too).
6543
6544 The metadata for unroll and jam otherwise is the same as for ``unroll``.
6545 ``llvm.loop.unroll_and_jam.enable``, ``llvm.loop.unroll_and_jam.disable`` and
6546 ``llvm.loop.unroll_and_jam.count`` do the same as for unroll.
6547 ``llvm.loop.unroll_and_jam.full`` is not supported. Again these are only hints
6548 and the normal safety checks will still be performed.
6549
6550 '``llvm.loop.unroll_and_jam.count``' Metadata
6551 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6552
6553 This metadata suggests an unroll and jam factor to use, similarly to
6554 ``llvm.loop.unroll.count``. The first operand is the string
6555 ``llvm.loop.unroll_and_jam.count`` and the second operand is a positive integer
6556 specifying the unroll factor. For example:
6557
6558 .. code-block:: llvm
6559
6560    !0 = !{!"llvm.loop.unroll_and_jam.count", i32 4}
6561
6562 If the trip count of the loop is less than the unroll count, the loop
6563 will be partially unrolled and jammed.
6564
6565 '``llvm.loop.unroll_and_jam.disable``' Metadata
6566 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6567
6568 This metadata disables loop unroll and jam. The metadata has a single
6569 operand which is the string ``llvm.loop.unroll_and_jam.disable``. For example:
6570
6571 .. code-block:: llvm
6572
6573    !0 = !{!"llvm.loop.unroll_and_jam.disable"}
6574
6575 '``llvm.loop.unroll_and_jam.enable``' Metadata
6576 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6577
6578 This metadata suggests that the loop should be fully unrolled and jammed if the
6579 trip count is known at compile time and partially unrolled if the trip count is
6580 not known at compile time. The metadata has a single operand which is the
6581 string ``llvm.loop.unroll_and_jam.enable``.  For example:
6582
6583 .. code-block:: llvm
6584
6585    !0 = !{!"llvm.loop.unroll_and_jam.enable"}
6586
6587 '``llvm.loop.unroll_and_jam.followup_outer``' Metadata
6588 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6589
6590 This metadata defines which loop attributes the outer unrolled loop will
6591 have. See :ref:`Transformation Metadata <transformation-metadata>` for
6592 details.
6593
6594 '``llvm.loop.unroll_and_jam.followup_inner``' Metadata
6595 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6596
6597 This metadata defines which loop attributes the inner jammed loop will
6598 have. See :ref:`Transformation Metadata <transformation-metadata>` for
6599 details.
6600
6601 '``llvm.loop.unroll_and_jam.followup_remainder_outer``' Metadata
6602 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6603
6604 This metadata defines which attributes the epilogue of the outer loop
6605 will have. This loop is usually unrolled, meaning there is no such
6606 loop. This attribute will be ignored in this case. See
6607 :ref:`Transformation Metadata <transformation-metadata>` for details.
6608
6609 '``llvm.loop.unroll_and_jam.followup_remainder_inner``' Metadata
6610 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6611
6612 This metadata defines which attributes the inner loop of the epilogue
6613 will have. The outer epilogue will usually be unrolled, meaning there
6614 can be multiple inner remainder loops. See
6615 :ref:`Transformation Metadata <transformation-metadata>` for details.
6616
6617 '``llvm.loop.unroll_and_jam.followup_all``' Metadata
6618 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6619
6620 Attributes specified in the metadata are added to all
6621 ``llvm.loop.unroll_and_jam.*`` loops. See
6622 :ref:`Transformation Metadata <transformation-metadata>` for details.
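
The unroll-and-jam followup attributes above could be combined on one loop ID
roughly as follows. This is only a sketch: the attributes chosen for the outer
and inner loops are assumptions made for illustration, not defaults applied by
the pass.

.. code-block:: llvm

   !0 = distinct !{!0, !1, !2, !3}
   !1 = !{!"llvm.loop.unroll_and_jam.count", i32 4}
   ; Attributes for the outer unrolled loop.
   !2 = !{!"llvm.loop.unroll_and_jam.followup_outer", !{!"llvm.loop.unroll_and_jam.disable"}}
   ; Attributes for the inner jammed loop.
   !3 = !{!"llvm.loop.unroll_and_jam.followup_inner", !{!"llvm.loop.unroll.disable"}}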
6623
6624 '``llvm.loop.licm_versioning.disable``' Metadata
6625 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6626
6627 This metadata indicates that the loop should not be versioned for the purpose
6628 of enabling loop-invariant code motion (LICM). The metadata has a single operand
6629 which is the string ``llvm.loop.licm_versioning.disable``. For example:
6630
6631 .. code-block:: llvm
6632
6633    !0 = !{!"llvm.loop.licm_versioning.disable"}
6634
6635 '``llvm.loop.distribute.enable``' Metadata
6636 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6637
6638 Loop distribution allows splitting a loop into multiple loops.  Currently,
6639 this is only performed if the entire loop cannot be vectorized due to unsafe
6640 memory dependencies.  The transformation will attempt to isolate the unsafe
6641 dependencies into their own loop.
6642
6643 This metadata can be used to selectively enable or disable distribution of the
6644 loop.  The first operand is the string ``llvm.loop.distribute.enable`` and the
6645 second operand is a bit. If the bit operand value is 1, distribution is
6646 enabled. A value of 0 disables distribution:
6647
6648 .. code-block:: llvm
6649
6650    !0 = !{!"llvm.loop.distribute.enable", i1 0}
6651    !1 = !{!"llvm.loop.distribute.enable", i1 1}
6652
6653 This metadata should be used in conjunction with ``llvm.loop`` loop
6654 identification metadata.
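
A hedged sketch of that combination: the enabling node is listed as an operand
of the loop's distinct ``llvm.loop`` identifier, which is attached to the loop
latch branch. The block and value names here are hypothetical.

.. code-block:: llvm

   for.body:
     ...
     br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !0

   for.end:
   ...
   !0 = distinct !{!0, !1}
   !1 = !{!"llvm.loop.distribute.enable", i1 1}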
6655
6656 '``llvm.loop.distribute.followup_coincident``' Metadata
6657 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6658
6659 This metadata defines which attributes extracted loops with no cyclic
6660 dependencies will have (i.e. can be vectorized). See
6661 :ref:`Transformation Metadata <transformation-metadata>` for details.
6662
6663 '``llvm.loop.distribute.followup_sequential``' Metadata
6664 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6665
6666 This metadata defines which attributes the isolated loops with unsafe
6667 memory dependencies will have. See
6668 :ref:`Transformation Metadata <transformation-metadata>` for details.
6669
6670 '``llvm.loop.distribute.followup_fallback``' Metadata
6671 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6672
6673 If loop versioning is necessary, this metadata defines the attributes
6674 the non-distributed fallback version will have. See
6675 :ref:`Transformation Metadata <transformation-metadata>` for details.
6676
6677 '``llvm.loop.distribute.followup_all``' Metadata
6678 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6679
6680 The attributes in this metadata are added to all followup loops of the
6681 loop distribution pass. See
6682 :ref:`Transformation Metadata <transformation-metadata>` for details.
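
For illustration, a loop ID using the distribution followups might look like
the sketch below. It assumes the intent is to vectorize the extracted loops
that have no cyclic dependencies and to leave the sequential ones
unvectorized; the attribute choices are an example only.

.. code-block:: llvm

   !0 = distinct !{!0, !1, !2, !3}
   !1 = !{!"llvm.loop.distribute.enable", i1 1}
   ; Loops without cyclic dependencies: ask for vectorization.
   !2 = !{!"llvm.loop.distribute.followup_coincident", !{!"llvm.loop.vectorize.enable", i1 1}}
   ; Loops that carry the unsafe dependencies: do not vectorize.
   !3 = !{!"llvm.loop.distribute.followup_sequential", !{!"llvm.loop.vectorize.enable", i1 0}}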
6683
6684 '``llvm.licm.disable``' Metadata
6685 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6686
6687 This metadata indicates that loop-invariant code motion (LICM) should not be
6688 performed on this loop. The metadata has a single operand which is the string
6689 ``llvm.licm.disable``. For example:
6690
6691 .. code-block:: llvm
6692
6693    !0 = !{!"llvm.licm.disable"}
6694
6695 Note that although it operates per loop, it isn't given the ``llvm.loop`` prefix
6696 as it is not affected by the ``llvm.loop.disable_nonforced`` metadata.
6697
6698 '``llvm.access.group``' Metadata
6699 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6700
6701 ``llvm.access.group`` metadata can be attached to any instruction that
6702 potentially accesses memory. It can point to a single distinct metadata
6703 node, which we call an access group. This node represents all memory access
6704 instructions referring to it via ``llvm.access.group``. When an
6705 instruction belongs to multiple access groups, it can also point to a
6706 list of access groups, as illustrated by the following example.
6707
6708 .. code-block:: llvm
6709
6710    %val = load i32, i32* %arrayidx, !llvm.access.group !0
6711    ...
6712    !0 = !{!1, !2}
6713    !1 = distinct !{}
6714    !2 = distinct !{}
6715
6716 It is illegal for the list node to be empty since it might be confused
6717 with an access group.
6718
6719 The access group metadata node must be 'distinct' to avoid collapsing
6720 multiple access groups by content. An access group metadata node must
6721 always be empty, which can be used to distinguish an access group
6722 metadata node from a list of access groups. Being empty avoids the
6723 situation that the content must be updated which, because metadata is
6724 immutable by design, would require finding and updating all references
6725 to the access group node.
6726
6727 The access group can be used to refer to a memory access instruction
6728 without pointing to it directly (which is not possible in global
6729 metadata). Currently, the only metadata making use of it is
6730 ``llvm.loop.parallel_accesses``.
6731
6732 '``llvm.loop.parallel_accesses``' Metadata
6733 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6734
6735 The ``llvm.loop.parallel_accesses`` metadata refers to one or more
6736 access group metadata nodes (see ``llvm.access.group``). It denotes that
6737 no loop-carried memory dependence exists between it and other instructions
6738 in the loop with this metadata.
6739
6740 Let ``m1`` and ``m2`` be two instructions that both have the
6741 ``llvm.access.group`` metadata to the access group ``g1``, respectively
6742 ``g2`` (which might be identical). If a loop contains both access groups
6743 in its ``llvm.loop.parallel_accesses`` metadata, then the compiler can
6744 assume that there is no dependency between ``m1`` and ``m2`` carried by
6745 this loop. Instructions that belong to multiple access groups are
6746 considered having this property if at least one of the access groups
6747 matches the ``llvm.loop.parallel_accesses`` list.
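
The multiple-group rule can be sketched as follows. The load belongs to two
access groups, only one of which is listed by the loop; under the rule above
that is enough for the load to have the property. All value and block names
are hypothetical.

.. code-block:: llvm

   for.body:
     ...
     %val = load i32, i32* %arrayidx, !llvm.access.group !1
     ...
     store i32 %val, i32* %arrayidx1, !llvm.access.group !2
     ...
     br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !0

   for.end:
   ...
   !0 = distinct !{!0, !{!"llvm.loop.parallel_accesses", !2}}
   !1 = !{!2, !3}    ; list of access groups: the load is in both !2 and !3
   !2 = distinct !{} ; access group listed by the loop
   !3 = distinct !{} ; access group not listed by this loop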
6748
6749 If all memory-accessing instructions in a loop have
6750 ``llvm.access.group`` metadata that each refer to one of the access
6751 groups of a loop's ``llvm.loop.parallel_accesses`` metadata, then the
6752 loop has no loop-carried memory dependences and is considered to be a
6753 parallel loop.
6754
6755 Note that if not all memory access instructions belong to an access
6756 group referred to by ``llvm.loop.parallel_accesses``, then the loop must
6757 not be considered trivially parallel. Additional
6758 memory dependence analysis is required to make that determination. As a
6759 fail-safe mechanism, this causes loops that were originally parallel to be considered
6760 sequential (if optimization passes that are unaware of the parallel semantics
6761 insert new memory instructions into the loop body).
6762
6763 Example of a loop that is considered parallel due to its correct use of
6764 both ``llvm.access.group`` and ``llvm.loop.parallel_accesses``
6765 metadata types.
6766
6767 .. code-block:: llvm
6768
6769    for.body:
6770      ...
6771      %val0 = load i32, i32* %arrayidx, !llvm.access.group !1
6772      ...
6773      store i32 %val0, i32* %arrayidx1, !llvm.access.group !1
6774      ...
6775      br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !0
6776
6777    for.end:
6778    ...
6779    !0 = distinct !{!0, !{!"llvm.loop.parallel_accesses", !1}}
6780    !1 = distinct !{}
6781
6782 It is also possible to have nested parallel loops:
6783
6784 .. code-block:: llvm
6785
6786    outer.for.body:
6787      ...
6788      %val1 = load i32, i32* %arrayidx3, !llvm.access.group !4
6789      ...
6790      br label %inner.for.body
6791
6792    inner.for.body:
6793      ...
6794      %val0 = load i32, i32* %arrayidx1, !llvm.access.group !3
6795      ...
6796      store i32 %val0, i32* %arrayidx2, !llvm.access.group !3
6797      ...
6798      br i1 %exitcond, label %inner.for.end, label %inner.for.body, !llvm.loop !1
6799
6800    inner.for.end:
6801      ...
6802      store i32 %val1, i32* %arrayidx4, !llvm.access.group !4
6803      ...
6804      br i1 %exitcond, label %outer.for.end, label %outer.for.body, !llvm.loop !2
6805
6806    outer.for.end:                                          ; preds = %for.body
6807    ...
6808    !1 = distinct !{!1, !{!"llvm.loop.parallel_accesses", !3}}     ; metadata for the inner loop
6809    !2 = distinct !{!2, !{!"llvm.loop.parallel_accesses", !3, !4}} ; metadata for the outer loop
6810    !3 = distinct !{} ; access group for instructions in the inner loop (which are implicitly contained in outer loop as well)
6811    !4 = distinct !{} ; access group for instructions in the outer, but not the inner loop
6812
6813 '``llvm.loop.mustprogress``' Metadata
6814 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6815
6816 The ``llvm.loop.mustprogress`` metadata indicates that this loop is required to
6817 terminate, unwind, or interact with the environment in an observable way, e.g.,
6818 via a volatile memory access, I/O, or other synchronization. If such a loop is
6819 not found to interact with the environment in an observable way, the loop may
6820 be removed. This corresponds to the ``mustprogress`` function attribute.
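
A minimal sketch of a loop carrying this metadata, assuming the usual
``llvm.loop`` attachment on the latch branch (block and value names are
hypothetical):

.. code-block:: llvm

   loop:
     ...
     br i1 %cond, label %loop, label %exit, !llvm.loop !0

   exit:
   ...
   !0 = distinct !{!0, !1}
   !1 = !{!"llvm.loop.mustprogress"}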
6821
6822 '``irr_loop``' Metadata
6823 ^^^^^^^^^^^^^^^^^^^^^^^
6824
6825 ``irr_loop`` metadata may be attached to the terminator instruction of a basic
6826 block that's an irreducible loop header (note that an irreducible loop has more
6827 than one header basic block). If ``irr_loop`` metadata is attached to the
6828 terminator instruction of a basic block that is not really an irreducible loop
6829 header, the behavior is undefined. The intent of this metadata is to improve the
6830 accuracy of the block frequency propagation. For example, in the code below, the
6831 block ``header0`` may have a loop header weight (relative to the other headers of
6832 the irreducible loop) of 100:
6833
6834 .. code-block:: llvm
6835
6836     header0:
6837     ...
6838     br i1 %cmp, label %t1, label %t2, !irr_loop !0
6839
6840     ...
6841     !0 = !{!"loop_header_weight", i64 100}
6842
6843 Irreducible loop header weights are typically based on profile data.
6844
6845 .. _md_invariant.group:
6846
6847 '``invariant.group``' Metadata
6848 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6849
6850 The experimental ``invariant.group`` metadata may be attached to
6851 ``load``/``store`` instructions referencing a single metadata with no entries.
6852 The existence of the ``invariant.group`` metadata on the instruction tells
6853 the optimizer that every ``load`` and ``store`` to the same pointer operand
6854 can be assumed to load or store the same
6855 value (but see the ``llvm.launder.invariant.group`` intrinsic which affects
6856 when two pointers are considered the same). Pointers returned by bitcast or
6857 getelementptr with only zero indices are considered the same.
6858
6859 Examples:
6860
6861 .. code-block:: llvm
6862
6863    @unknownPtr = external global i8
6864    ...
6865    %ptr = alloca i8
6866    store i8 42, i8* %ptr, !invariant.group !0
6867    call void @foo(i8* %ptr)
6868
6869    %a = load i8, i8* %ptr, !invariant.group !0 ; Can assume that value under %ptr didn't change
6870    call void @foo(i8* %ptr)
6871
6872    %newPtr = call i8* @getPointer(i8* %ptr)
6873    %c = load i8, i8* %newPtr, !invariant.group !0 ; Can't assume anything, because we only have information about %ptr
6874
6875    %unknownValue = load i8, i8* @unknownPtr
6876    store i8 %unknownValue, i8* %ptr, !invariant.group !0 ; Can assume that %unknownValue == 42
6877
6878    call void @foo(i8* %ptr)
6879    %newPtr2 = call i8* @llvm.launder.invariant.group(i8* %ptr)
6880    %d = load i8, i8* %newPtr2, !invariant.group !0  ; Can't step through launder.invariant.group to get value of %ptr
6881
6882    ...
6883    declare void @foo(i8*)
6884    declare i8* @getPointer(i8*)
6885    declare i8* @llvm.launder.invariant.group(i8*)
6886
6887    !0 = !{}
6888
6889 The ``invariant.group`` metadata must be dropped when replacing one pointer by
6890 another based on aliasing information. This is because ``invariant.group`` is tied
6891 to the SSA value of the pointer operand.
6892
6893 .. code-block:: llvm
6894
6895   %v = load i8, i8* %x, !invariant.group !0
6896   ; if %x mustalias %y then we can replace the above instruction with
6897   %v = load i8, i8* %y
6898
6899 Note that this is an experimental feature, which means that its semantics might
6900 change in the future.
6901
6902 '``type``' Metadata
6903 ^^^^^^^^^^^^^^^^^^^
6904
6905 See :doc:`TypeMetadata`.
6906
6907 '``associated``' Metadata
6908 ^^^^^^^^^^^^^^^^^^^^^^^^^
6909
6910 The ``associated`` metadata may be attached to a global variable definition with
6911 a single argument that references a global object (optionally through an alias).
6912
6913 This metadata lowers to the ELF section flag ``SHF_LINK_ORDER`` which prevents
6914 discarding of the global variable in linker GC unless the referenced object is
6915 also discarded. The linker support for this feature is spotty. For best
6916 compatibility, globals carrying this metadata should:
6917
6918 - Be in ``@llvm.compiler.used``.
6919 - If the referenced global variable is in a comdat, be in the same comdat.
6920
6921 ``!associated`` cannot express a many-to-one relationship. A global variable with
6922 the metadata should generally not be referenced by a function: the function may
6923 be inlined into other functions, leading to more references to the metadata.
6924 Ideally we would want to keep metadata alive as long as any inline location is
6925 alive, but this many-to-one relationship is not representable. Moreover, if the
6926 metadata is retained while the function is discarded, the linker will report an
6927 error of a relocation referencing a discarded section.
6928
6929 The metadata is often used with an explicit section consisting of valid C
6930 identifiers so that the runtime can find the metadata section with
6931 linker-defined encapsulation symbols ``__start_<section_name>`` and
6932 ``__stop_<section_name>``.
6933
6934 It does not have any effect on non-ELF targets.
6935
6936 Example:
6937
6938 .. code-block:: text
6939
6940     $a = comdat any
6941     @a = global i32 1, comdat $a
6942     @b = internal global i32 2, comdat $a, section "abc", !associated !0
6943     !0 = !{i32* @a}
6944
6945
6946 '``prof``' Metadata
6947 ^^^^^^^^^^^^^^^^^^^
6948
6949 The ``prof`` metadata is used to record profile data in the IR.
6950 The first operand of the metadata node indicates the profile metadata
6951 type. There are currently 3 types:
6952 :ref:`branch_weights<prof_node_branch_weights>`,
6953 :ref:`function_entry_count<prof_node_function_entry_count>`, and
6954 :ref:`VP<prof_node_VP>`.
6955
6956 .. _prof_node_branch_weights:
6957
6958 branch_weights
6959 """"""""""""""
6960
6961 Branch weight metadata attached to a branch, select, switch or call instruction
6962 represents the likelihood of the associated branch being taken.
6963 For more information, see :doc:`BranchWeightMetadata`.
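
As a short sketch, a conditional branch annotated so that its first successor
is considered much more likely than its second could look like this. The
weights and label names are made up for illustration.

.. code-block:: llvm

   br i1 %cond, label %likely, label %unlikely, !prof !0
   ...
   !0 = !{!"branch_weights", i32 64, i32 4}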
6964
6965 .. _prof_node_function_entry_count:
6966
6967 function_entry_count
6968 """"""""""""""""""""
6969
6970 Function entry count metadata can be attached to function definitions
6971 to record the number of times the function is called. Together with block
6972 frequency information (BFI), it is also used to derive the basic block profile count.
6973 For more information, see :doc:`BranchWeightMetadata`.
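
For instance, a function definition recorded as having been entered 1000
times might be annotated as in this sketch (the function name and count are
hypothetical):

.. code-block:: llvm

   define void @f() !prof !0 {
     ret void
   }

   !0 = !{!"function_entry_count", i64 1000}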
6974
6975 .. _prof_node_VP:
6976
6977 VP
6978 ""
6979
6980 VP (value profile) metadata can be attached to instructions that have
6981 value profile information. Currently this is indirect calls (where it
6982 records the hottest callees) and calls to memory intrinsics such as memcpy,
6983 memmove, and memset (where it records the hottest byte lengths).
6984
6985 Each VP metadata node contains the string "VP", then a uint32_t value for the value
6986 profiling kind, a uint64_t value for the total number of times the instruction
6987 is executed, followed by uint64_t value and execution count pairs.
6988 The value profiling kind is 0 for indirect call targets and 1 for memory
6989 operations. For indirect call targets, each profile value is a hash
6990 of the callee function name, and for memory operations each value is the
6991 byte length.
6992
6993 Note that the value counts do not need to add up to the total count
6994 listed in the third operand (in practice only the top hottest values
6995 are tracked and reported).
6996
6997 Indirect call example:
6998
6999 .. code-block:: llvm
7000
7001     call void %f(), !prof !1
7002     !1 = !{!"VP", i32 0, i64 1600, i64 7651369219802541373, i64 1030, i64 -4377547752858689819, i64 410}
7003
7004 Note that the VP type is 0 (the second operand), which indicates that this is
7005 indirect call value profile data. The third operand indicates that the
7006 indirect call executed 1600 times. The 4th and 6th operands give the
7007 hashes of the 2 hottest target functions' names (this is the same hash used
7008 to represent function names in the profile database), and the 5th and 7th
7009 operands give the execution count that each of the respective prior target
7010 functions was called.
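
A memory operation profile follows the same layout with a value profiling
kind of 1. The sketch below uses invented numbers: the call executed 1000
times, with a byte length of 8 seen 600 times and a byte length of 16 seen
300 times. The counts need not add up to the total.

.. code-block:: llvm

    call void @llvm.memcpy.p0i8.p0i8.i64(i8* %dst, i8* %src, i64 %len, i1 false), !prof !1
    !1 = !{!"VP", i32 1, i64 1000, i64 8, i64 600, i64 16, i64 300}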
7011
7012 .. _md_annotation:
7013
7014 '``annotation``' Metadata
7015 ^^^^^^^^^^^^^^^^^^^^^^^^^
7016
7017 The ``annotation`` metadata can be used to attach a tuple of annotation strings
7018 to any instruction. This metadata does not impact the semantics of the program
7019 and may only be used to provide additional insight about the program and
7020 transformations to users.
7021
7022 Example:
7023
7024 .. code-block:: text
7025
7026     %a.addr = alloca float*, align 8, !annotation !0
7027     !0 = !{!"auto-init"}
7028
7029 Module Flags Metadata
7030 =====================
7031
7032 Information about the module as a whole is difficult to convey to LLVM's
7033 subsystems. The LLVM IR isn't sufficient to transmit this information.
7034 The ``llvm.module.flags`` named metadata exists in order to facilitate
7035 this. These flags are in the form of key / value pairs --- much like a
7036 dictionary --- making it easy for any subsystem that cares about a flag to
7037 look it up.
7038
7039 The ``llvm.module.flags`` metadata contains a list of metadata triplets.
7040 Each triplet has the following form:
7041
7042 -  The first element is a *behavior* flag, which specifies the behavior
7043    when two (or more) modules are merged together, and it encounters two
7044    (or more) metadata with the same ID. The supported behaviors are
7045    described below.
7046 -  The second element is a metadata string that is a unique ID for the
7047    metadata. Each module may only have one flag entry for each unique ID (not
7048    including entries with the **Require** behavior).
7049 -  The third element is the value of the flag.
7050
7051 When two (or more) modules are merged together, the resulting
7052 ``llvm.module.flags`` metadata is the union of the modules' flags. That is, for
7053 each unique metadata ID string, there will be exactly one entry in the merged
7054 module's ``llvm.module.flags`` metadata table, and the value for that entry will
7055 be determined by the merge behavior flag, as described below. The only exception
7056 is that entries with the *Require* behavior are always preserved.
7057
7058 The following behaviors are supported:
7059
7060 .. list-table::
7061    :header-rows: 1
7062    :widths: 10 90
7063
7064    * - Value
7065      - Behavior
7066
7067    * - 1
7068      - **Error**
7069            Emits an error if two values disagree, otherwise the resulting value
7070            is that of the operands.
7071
7072    * - 2
7073      - **Warning**
7074            Emits a warning if two values disagree. The result value will be the
7075            operand for the flag from the first module being linked, or the max
7076            if the other module uses **Max** (in which case the resulting flag
7077            will be **Max**).
7078
7079    * - 3
7080      - **Require**
7081            Adds a requirement that another module flag be present and have a
7082            specified value after linking is performed. The value must be a
7083            metadata pair, where the first element of the pair is the ID of the
7084            module flag to be restricted, and the second element of the pair is
7085            the value the module flag should be restricted to. This behavior can
7086            be used to restrict the allowable results (via triggering of an
7087            error) of linking IDs with the **Override** behavior.
7088
7089    * - 4
7090      - **Override**
7091            Uses the specified value, regardless of the behavior or value of the
7092            other module. If both modules specify **Override**, but the values
7093            differ, an error will be emitted.
7094
7095    * - 5
7096      - **Append**
7097            Appends the two values, which are required to be metadata nodes.
7098
7099    * - 6
7100      - **AppendUnique**
7101            Appends the two values, which are required to be metadata
7102            nodes. However, duplicate entries in the second list are dropped
7103            during the append operation.
7104
7105    * - 7
7106      - **Max**
7107            Takes the max of the two values, which are required to be integers.
7108
7109 It is an error for a particular unique flag ID to have multiple behaviors,
7110 except in the case of **Require** (which adds restrictions on another metadata
7111 value) or **Override**.
7112
7113 An example of module flags:
7114
7115 .. code-block:: llvm
7116
7117     !0 = !{ i32 1, !"foo", i32 1 }
7118     !1 = !{ i32 4, !"bar", i32 37 }
7119     !2 = !{ i32 2, !"qux", i32 42 }
7120     !3 = !{ i32 3, !"qux",
7121       !{
7122         !"foo", i32 1
7123       }
7124     }
7125     !llvm.module.flags = !{ !0, !1, !2, !3 }
7126
7127 -  Metadata ``!0`` has the ID ``!"foo"`` and the value '1'. The behavior
7128    if two or more ``!"foo"`` flags are seen is to emit an error if their
7129    values are not equal.
7130
7131 -  Metadata ``!1`` has the ID ``!"bar"`` and the value '37'. The
7132    behavior if two or more ``!"bar"`` flags are seen is to use the value
7133    '37'.
7134
7135 -  Metadata ``!2`` has the ID ``!"qux"`` and the value '42'. The
7136    behavior if two or more ``!"qux"`` flags are seen is to emit a
7137    warning if their values are not equal.
7138
7139 -  Metadata ``!3`` has the ID ``!"qux"`` and the value:
7140
7141    ::
7142
7143        !{ !"foo", i32 1 }
7144
7145    The behavior is to emit an error if the ``llvm.module.flags`` does not
7146    contain a flag with the ID ``!"foo"`` that has the value '1' after linking is
7147    performed.
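
To illustrate merging, the sketch below shows the flags of two modules and
the flags one would expect after linking them, based on the **Max** and
**AppendUnique** behaviors described above. The three fragments belong to
separate modules, and the flag IDs ``example_level`` and ``example_opts`` are
hypothetical.

.. code-block:: llvm

    ; Module A
    !llvm.module.flags = !{ !0, !1 }
    !0 = !{ i32 7, !"example_level", i32 1 }
    !1 = !{ i32 6, !"example_opts", !{ !"a", !"b" } }

    ; Module B
    !llvm.module.flags = !{ !0, !1 }
    !0 = !{ i32 7, !"example_level", i32 2 }
    !1 = !{ i32 6, !"example_opts", !{ !"b", !"c" } }

    ; Expected result of linking A and B:
    ; Max keeps the larger integer, AppendUnique drops the duplicate "b".
    !llvm.module.flags = !{ !0, !1 }
    !0 = !{ i32 7, !"example_level", i32 2 }
    !1 = !{ i32 6, !"example_opts", !{ !"a", !"b", !"c" } }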
7148
7149 Synthesized Functions Module Flags Metadata
7150 -------------------------------------------
7151
7152 These metadata specify the default attributes synthesized functions should have.
7153 These metadata are currently respected by a few instrumentation passes, such as
7154 sanitizers.
7155
7156 These metadata correspond to a few function attributes with significant code
7157 generation behaviors. Function attributes with just optimization purposes
7158 should not be listed because the performance impact of these synthesized
7159 functions is small.
7160
7161 - "frame-pointer": **Max**. The value can be 0, 1, or 2. A synthesized function
7162   will get the "frame-pointer" function attribute, with value being "none",
7163   "non-leaf", or "all", respectively.
7164 - "uwtable": **Max**. The value can be 0 or 1. If the value is 1, a synthesized
7165   function will get the ``uwtable`` function attribute.
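
As a sketch, a module asking for synthesized functions to keep all frame
pointers and to have unwind tables could carry the following flags. Behavior
value 7 is **Max**, as listed for both keys above.

.. code-block:: llvm

    !llvm.module.flags = !{!0, !1}
    !0 = !{i32 7, !"frame-pointer", i32 2}
    !1 = !{i32 7, !"uwtable", i32 1}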
7166
7167 Objective-C Garbage Collection Module Flags Metadata
7168 ----------------------------------------------------
7169
7170 On the Mach-O platform, Objective-C stores metadata about garbage
7171 collection in a special section called "image info". The metadata
7172 consists of a version number and a bitmask specifying what types of
7173 garbage collection are supported (if any) by the file. If two or more
7174 modules are linked together their garbage collection metadata needs to
7175 be merged rather than appended together.
7176
7177 The Objective-C garbage collection module flags metadata consists of the
7178 following key-value pairs:
7179
7180 .. list-table::
7181    :header-rows: 1
7182    :widths: 30 70
7183
7184    * - Key
7185      - Value
7186
7187    * - ``Objective-C Version``
7188      - **[Required]** --- The Objective-C ABI version. Valid values are 1 and 2.
7189
7190    * - ``Objective-C Image Info Version``
7191      - **[Required]** --- The version of the image info section. Currently
7192        always 0.
7193
7194    * - ``Objective-C Image Info Section``
7195      - **[Required]** --- The section to place the metadata. Valid values are
7196        ``"__OBJC, __image_info, regular"`` for Objective-C ABI version 1, and
7197        ``"__DATA,__objc_imageinfo, regular, no_dead_strip"`` for
7198        Objective-C ABI version 2.
7199
7200    * - ``Objective-C Garbage Collection``
7201      - **[Required]** --- Specifies whether garbage collection is supported or
7202        not. Valid values are 0, for no garbage collection, and 2, for garbage
7203        collection supported.
7204
7205    * - ``Objective-C GC Only``
7206      - **[Optional]** --- Specifies that only garbage collection is supported.
7207        If present, its value must be 6. This flag requires that the
7208        ``Objective-C Garbage Collection`` flag have the value 2.
7209
7210 Some important flag interactions:
7211
7212 -  If a module with ``Objective-C Garbage Collection`` set to 0 is
7213    merged with a module with ``Objective-C Garbage Collection`` set to
7214    2, then the resulting module has the
7215    ``Objective-C Garbage Collection`` flag set to 0.
7216 -  A module with ``Objective-C Garbage Collection`` set to 0 cannot be
7217    merged with a module with ``Objective-C GC Only`` set to 6.
7218
7219 C type width Module Flags Metadata
7220 ----------------------------------
7221
7222 The ARM backend emits a section into each generated object file describing the
7223 options that it was compiled with (in a compiler-independent way) to prevent
7224 linking incompatible objects, and to allow automatic library selection. Some
7225 of these options are not visible at the IR level, namely wchar_t width and enum
7226 width.
7227
7228 To pass this information to the backend, these options are encoded in module
7229 flags metadata, using the following key-value pairs:
7230
7231 .. list-table::
7232    :header-rows: 1
7233    :widths: 30 70
7234
7235    * - Key
7236      - Value
7237
7238    * - short_wchar
7239      - * 0 --- sizeof(wchar_t) == 4
7240        * 1 --- sizeof(wchar_t) == 2
7241
7242    * - short_enum
7243      - * 0 --- Enums are at least as large as an ``int``.
7244        * 1 --- Enums are stored in the smallest integer type which can
7245          represent all of its values.
7246
7247 For example, the following metadata section specifies that the module was
7248 compiled with a ``wchar_t`` width of 2 bytes, and that enums are at least as
7249 large as an ``int``::
7250
7251     !llvm.module.flags = !{!0, !1}
7252     !0 = !{i32 1, !"short_wchar", i32 1}
7253     !1 = !{i32 1, !"short_enum", i32 0}
7254
7255 LTO Post-Link Module Flags Metadata
7256 -----------------------------------
7257
7258 Some optimisations are only possible when the entire LTO unit is present in the current
7259 module. This is represented by the ``LTOPostLink`` module flags metadata, which
7260 will be created with a value of ``1`` when LTO linking occurs.
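
A sketch of what such a flag could look like in the linked module. The
behavior value shown (1, **Error**) is an assumption made for illustration
and is not stated by this section.

.. code-block:: llvm

    !llvm.module.flags = !{!0}
    !0 = !{i32 1, !"LTOPostLink", i32 1}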
7261
7262 Automatic Linker Flags Named Metadata
7263 =====================================
7264
7265 Some targets support embedding of flags to the linker inside individual object
7266 files. Typically this is used in conjunction with language extensions which
7267 allow source files to contain linker command line options, and have these
7268 automatically be transmitted to the linker via object files.
7269
7270 These flags are encoded in the IR using named metadata with the name
7271 ``!llvm.linker.options``. Each operand is expected to be a metadata node
7272 which should be a list of other metadata nodes, each of which should be a
7273 list of metadata strings defining linker options.
7274
7275 For example, the following metadata section specifies two separate sets of
7276 linker options, presumably to link against ``libz`` and the ``Cocoa``
7277 framework::
7278
7279     !0 = !{ !"-lz" }
7280     !1 = !{ !"-framework", !"Cocoa" }
7281     !llvm.linker.options = !{ !0, !1 }
7282
7283 The metadata encoding as lists of lists of options, as opposed to a collapsed
7284 list of options, is chosen so that the IR encoding can use multiple option
7285 strings to specify e.g., a single library, while still having that specifier be
7286 preserved as an atomic element that can be recognized by a target specific
7287 assembly writer or object file emitter.
7288
7289 Each individual option is required to be either a valid option for the target's
7290 linker, or an option that is reserved by the target specific assembly writer or
7291 object file emitter. No other aspect of these options is defined by the IR.
7292
7293 Dependent Libs Named Metadata
7294 =============================
7295
7296 Some targets support embedding of strings into object files to indicate
7297 a set of libraries to add to the link. Typically this is used in conjunction
7298 with language extensions which allow source files to explicitly declare the
7299 libraries they depend on, and have these automatically be transmitted to the
7300 linker via object files.
7301
7302 The list is encoded in the IR using named metadata with the name
7303 ``!llvm.dependent-libraries``. Each operand is expected to be a metadata node
7304 which should contain a single string operand.
7305
7306 For example, the following metadata section contains two library specifiers::
7307
7308     !0 = !{!"a library specifier"}
7309     !1 = !{!"another library specifier"}
7310     !llvm.dependent-libraries = !{ !0, !1 }
7311
7312 Each library specifier will be handled independently by the consuming linker.
The effect of the library specifiers is defined by the consuming linker.
7314
7315 .. _summary:
7316
7317 ThinLTO Summary
7318 ===============
7319
7320 Compiling with `ThinLTO <https://clang.llvm.org/docs/ThinLTO.html>`_
7321 causes the building of a compact summary of the module that is emitted into
7322 the bitcode. The summary is emitted into the LLVM assembly and identified
7323 in syntax by a caret ('``^``').
7324
7325 The summary is parsed into a bitcode output, along with the Module
7326 IR, via the "``llvm-as``" tool. Tools that parse the Module IR for the purposes
7327 of optimization (e.g. "``clang -x ir``" and "``opt``"), will ignore the
7328 summary entries (just as they currently ignore summary entries in a bitcode
7329 input file).
7330
Eventually, the summary will be parsed into a ModuleSummaryIndex object under
the same conditions where the summary index is currently built from bitcode:
specifically, in tools that test the Thin Link portion of a ThinLTO compile
(i.e. llvm-lto and llvm-lto2), or when parsing a combined index
for a distributed ThinLTO backend via clang's "``-fthinlto-index=<>``" flag.
This part is not yet implemented; for now, use llvm-as to create a bitcode
object before feeding it into thin link tools.
7338
7339 There are currently 3 types of summary entries in the LLVM assembly:
7340 :ref:`module paths<module_path_summary>`,
7341 :ref:`global values<gv_summary>`, and
7342 :ref:`type identifiers<typeid_summary>`.
7343
7344 .. _module_path_summary:
7345
7346 Module Path Summary Entry
7347 -------------------------
7348
7349 Each module path summary entry lists a module containing global values included
7350 in the summary. For a single IR module there will be one such entry, but
7351 in a combined summary index produced during the thin link, there will be
7352 one module path entry per linked module with summary.
7353
7354 Example:
7355
7356 .. code-block:: text
7357
7358     ^0 = module: (path: "/path/to/file.o", hash: (2468601609, 1329373163, 1565878005, 638838075, 3148790418))
7359
7360 The ``path`` field is a string path to the bitcode file, and the ``hash``
7361 field is the 160-bit SHA-1 hash of the IR bitcode contents, used for
7362 incremental builds and caching.
7363
7364 .. _gv_summary:
7365
7366 Global Value Summary Entry
7367 --------------------------
7368
7369 Each global value summary entry corresponds to a global value defined or
7370 referenced by a summarized module.
7371
7372 Example:
7373
7374 .. code-block:: text
7375
7376     ^4 = gv: (name: "f"[, summaries: (Summary)[, (Summary)]*]?) ; guid = 14740650423002898831
7377
7378 For declarations, there will not be a summary list. For definitions, a
7379 global value will contain a list of summaries, one per module containing
7380 a definition. There can be multiple entries in a combined summary index
7381 for symbols with weak linkage.
7382
7383 Each ``Summary`` format will depend on whether the global value is a
7384 :ref:`function<function_summary>`, :ref:`variable<variable_summary>`, or
7385 :ref:`alias<alias_summary>`.
7386
7387 .. _function_summary:
7388
7389 Function Summary
7390 ^^^^^^^^^^^^^^^^
7391
7392 If the global value is a function, the ``Summary`` entry will look like:
7393
7394 .. code-block:: text
7395
    function: (module: ^0, flags: (linkage: external, notEligibleToImport: 0, live: 0, dsoLocal: 0), insts: 2[, FuncFlags]?[, Calls]?[, TypeIdInfo]?[, Params]?[, Refs]?)
7397
7398 The ``module`` field includes the summary entry id for the module containing
7399 this definition, and the ``flags`` field contains information such as
7400 the linkage type, a flag indicating whether it is legal to import the
7401 definition, whether it is globally live and whether the linker resolved it
7402 to a local definition (the latter two are populated during the thin link).
7403 The ``insts`` field contains the number of IR instructions in the function.
7404 Finally, there are several optional fields: :ref:`FuncFlags<funcflags_summary>`,
7405 :ref:`Calls<calls_summary>`, :ref:`TypeIdInfo<typeidinfo_summary>`,
7406 :ref:`Params<params_summary>`, :ref:`Refs<refs_summary>`.
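
For illustration, a function summary with none of the optional fields could sit
inside a complete summary as in the following sketch. The entry ids ``^0`` and
``^1`` are hypothetical; the module path, hash, and ``guid`` comment are reused
from the module path and global value examples above:

.. code-block:: text

    ^0 = module: (path: "/path/to/file.o", hash: (2468601609, 1329373163, 1565878005, 638838075, 3148790418))
    ^1 = gv: (name: "f", summaries: (function: (module: ^0, flags: (linkage: external, notEligibleToImport: 0, live: 0, dsoLocal: 0), insts: 2))) ; guid = 14740650423002898831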
7407
7408 .. _variable_summary:
7409
7410 Global Variable Summary
7411 ^^^^^^^^^^^^^^^^^^^^^^^
7412
7413 If the global value is a variable, the ``Summary`` entry will look like:
7414
7415 .. code-block:: text
7416
    variable: (module: ^0, flags: (linkage: external, notEligibleToImport: 0, live: 0, dsoLocal: 0)[, Refs]?)
7418
7419 The variable entry contains a subset of the fields in a
7420 :ref:`function summary <function_summary>`, see the descriptions there.
7421
7422 .. _alias_summary:
7423
7424 Alias Summary
7425 ^^^^^^^^^^^^^
7426
7427 If the global value is an alias, the ``Summary`` entry will look like:
7428
7429 .. code-block:: text
7430
7431     alias: (module: ^0, flags: (linkage: external, notEligibleToImport: 0, live: 0, dsoLocal: 0), aliasee: ^2)
7432
7433 The ``module`` and ``flags`` fields are as described for a
7434 :ref:`function summary <function_summary>`. The ``aliasee`` field
7435 contains a reference to the global value summary entry of the aliasee.
7436
7437 .. _funcflags_summary:
7438
7439 Function Flags
7440 ^^^^^^^^^^^^^^
7441
7442 The optional ``FuncFlags`` field looks like:
7443
7444 .. code-block:: text
7445
7446     funcFlags: (readNone: 0, readOnly: 0, noRecurse: 0, returnDoesNotAlias: 0)
7447
7448 If unspecified, flags are assumed to hold the conservative ``false`` value of
7449 ``0``.
7450
7451 .. _calls_summary:
7452
7453 Calls
7454 ^^^^^
7455
7456 The optional ``Calls`` field looks like:
7457
7458 .. code-block:: text
7459
7460     calls: ((Callee)[, (Callee)]*)
7461
7462 where each ``Callee`` looks like:
7463
7464 .. code-block:: text
7465
7466     callee: ^1[, hotness: None]?[, relbf: 0]?
7467
7468 The ``callee`` refers to the summary entry id of the callee. At most one
7469 of ``hotness`` (which can take the values ``Unknown``, ``Cold``, ``None``,
7470 ``Hot``, and ``Critical``), and ``relbf`` (which holds the integer
7471 branch frequency relative to the entry frequency, scaled down by 2^8)
7472 may be specified. The defaults are ``Unknown`` and ``0``, respectively.
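
As a worked illustration of the scaling, and assuming ``relbf`` is read as a
fixed-point value with 8 fractional bits, a call site that executes twice as
often as the function entry would be recorded as 2 * 2^8 = 512. The callee id
``^2`` is hypothetical:

.. code-block:: text

    calls: ((callee: ^2, relbf: 512))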
7473
7474 .. _params_summary:
7475
7476 Params
7477 ^^^^^^
7478
The optional ``Params`` field is used by ``StackSafety`` and looks like:

.. code-block:: text

    params: ((Param)[, (Param)]*)
7484
where each ``Param`` describes how a pointer parameter is accessed inside the
function and looks like:

.. code-block:: text

    param: 4, offset: [0, 5][, calls: ((Callee)[, (Callee)]*)]?

Here ``param`` is the number of the parameter being described, and ``offset``
is the inclusive range of byte offsets from the pointer parameter that the
function itself can access. This range does not include accesses made by the
function calls in the ``calls`` list.

Each ``Callee`` describes how the parameter is forwarded into another
function and looks like:
7499
7500 .. code-block:: text
7501
7502     callee: ^3, param: 5, offset: [-3, 3]
7503
The ``callee`` refers to the summary entry id of the callee, and ``param`` is
the number of the callee parameter which points into the caller's parameter,
with an offset known to be inside the ``offset`` range. ``calls`` will be
consumed and removed by the thin link stage to update ``Param::offset`` so that
it covers all accesses possible by ``calls``.

A pointer parameter without a corresponding ``Param`` is considered unsafe, and
we assume that access with any offset is possible.
7512
7513 Example:
7514
7515 If we have the following function:
7516
7517 .. code-block:: text
7518
7519     define i64 @foo(i64* %0, i32* %1, i8* %2, i8 %3) {
7520       store i32* %1, i32** @x
7521       %5 = getelementptr inbounds i8, i8* %2, i64 5
7522       %6 = load i8, i8* %5
7523       %7 = getelementptr inbounds i8, i8* %2, i8 %3
7524       tail call void @bar(i8 %3, i8* %7)
7525       %8 = load i64, i64* %0
7526       ret i64 %8
7527     }
7528
We can expect a record like this:
7530
7531 .. code-block:: text
7532
7533     params: ((param: 0, offset: [0, 7]),(param: 2, offset: [5, 5], calls: ((callee: ^3, param: 1, offset: [-128, 127]))))
7534
The function may access just 8 bytes of the parameter ``%0``. ``calls`` is empty,
so the parameter is either not used for function calls or ``offset`` already
covers all accesses from nested function calls.
Parameter ``%1`` escapes, so access is unknown.
The function itself can access just a single byte of the parameter ``%2``.
Additional access is possible inside of ``@bar``, that is ``^3``. The function
adds a signed offset to the pointer and passes the result as the argument ``%1``
into ``^3``. This record itself does not tell us how ``^3`` will access the
parameter.
Parameter ``%3`` is not a pointer.
7544
7545 .. _refs_summary:
7546
7547 Refs
7548 ^^^^
7549
7550 The optional ``Refs`` field looks like:
7551
7552 .. code-block:: text
7553
7554     refs: ((Ref)[, (Ref)]*)
7555
7556 where each ``Ref`` contains a reference to the summary id of the referenced
7557 value (e.g. ``^1``).
7558
7559 .. _typeidinfo_summary:
7560
7561 TypeIdInfo
7562 ^^^^^^^^^^
7563
7564 The optional ``TypeIdInfo`` field, used for
7565 `Control Flow Integrity <https://clang.llvm.org/docs/ControlFlowIntegrity.html>`_,
7566 looks like:
7567
7568 .. code-block:: text
7569
7570     typeIdInfo: [(TypeTests)]?[, (TypeTestAssumeVCalls)]?[, (TypeCheckedLoadVCalls)]?[, (TypeTestAssumeConstVCalls)]?[, (TypeCheckedLoadConstVCalls)]?
7571
7572 These optional fields have the following forms:
7573
7574 TypeTests
7575 """""""""
7576
7577 .. code-block:: text
7578
7579     typeTests: (TypeIdRef[, TypeIdRef]*)
7580
7581 Where each ``TypeIdRef`` refers to a :ref:`type id<typeid_summary>`
7582 by summary id or ``GUID``.
7583
7584 TypeTestAssumeVCalls
7585 """"""""""""""""""""
7586
7587 .. code-block:: text
7588
7589     typeTestAssumeVCalls: (VFuncId[, VFuncId]*)
7590
7591 Where each VFuncId has the format:
7592
7593 .. code-block:: text
7594
7595     vFuncId: (TypeIdRef, offset: 16)
7596
7597 Where each ``TypeIdRef`` refers to a :ref:`type id<typeid_summary>`
7598 by summary id or ``GUID`` preceded by a ``guid:`` tag.
7599
7600 TypeCheckedLoadVCalls
7601 """""""""""""""""""""
7602
7603 .. code-block:: text
7604
7605     typeCheckedLoadVCalls: (VFuncId[, VFuncId]*)
7606
7607 Where each VFuncId has the format described for ``TypeTestAssumeVCalls``.
7608
7609 TypeTestAssumeConstVCalls
7610 """""""""""""""""""""""""
7611
7612 .. code-block:: text
7613
7614     typeTestAssumeConstVCalls: (ConstVCall[, ConstVCall]*)
7615
7616 Where each ConstVCall has the format:
7617
7618 .. code-block:: text
7619
7620     (VFuncId, args: (Arg[, Arg]*))
7621
7622 and where each VFuncId has the format described for ``TypeTestAssumeVCalls``,
7623 and each Arg is an integer argument number.
7624
7625 TypeCheckedLoadConstVCalls
7626 """"""""""""""""""""""""""
7627
7628 .. code-block:: text
7629
7630     typeCheckedLoadConstVCalls: (ConstVCall[, ConstVCall]*)
7631
7632 Where each ConstVCall has the format described for
7633 ``TypeTestAssumeConstVCalls``.
7634
7635 .. _typeid_summary:
7636
7637 Type ID Summary Entry
7638 ---------------------
7639
7640 Each type id summary entry corresponds to a type identifier resolution
7641 which is generated during the LTO link portion of the compile when building
7642 with `Control Flow Integrity <https://clang.llvm.org/docs/ControlFlowIntegrity.html>`_,
7643 so these are only present in a combined summary index.
7644
7645 Example:
7646
7647 .. code-block:: text
7648
7649     ^4 = typeid: (name: "_ZTS1A", summary: (typeTestRes: (kind: allOnes, sizeM1BitWidth: 7[, alignLog2: 0]?[, sizeM1: 0]?[, bitMask: 0]?[, inlineBits: 0]?)[, WpdResolutions]?)) ; guid = 7004155349499253778
7650
7651 The ``typeTestRes`` gives the type test resolution ``kind`` (which may
7652 be ``unsat``, ``byteArray``, ``inline``, ``single``, or ``allOnes``), and
7653 the ``size-1`` bit width. It is followed by optional flags, which default to 0,
7654 and an optional WpdResolutions (whole program devirtualization resolution)
7655 field that looks like:
7656
7657 .. code-block:: text
7658
    wpdResolutions: ((offset: 0, WpdRes)[, (offset: 1, WpdRes)]*)
7660
7661 where each entry is a mapping from the given byte offset to the whole-program
7662 devirtualization resolution WpdRes, that has one of the following formats:
7663
7664 .. code-block:: text
7665
7666     wpdRes: (kind: branchFunnel)
7667     wpdRes: (kind: singleImpl, singleImplName: "_ZN1A1nEi")
7668     wpdRes: (kind: indir)
7669
7670 Additionally, each wpdRes has an optional ``resByArg`` field, which
7671 describes the resolutions for calls with all constant integer arguments:
7672
7673 .. code-block:: text
7674
7675     resByArg: (ResByArg[, ResByArg]*)
7676
7677 where ResByArg is:
7678
7679 .. code-block:: text
7680
7681     args: (Arg[, Arg]*), byArg: (kind: UniformRetVal[, info: 0][, byte: 0][, bit: 0])
7682
7683 Where the ``kind`` can be ``Indir``, ``UniformRetVal``, ``UniqueRetVal``
7684 or ``VirtualConstProp``. The ``info`` field is only used if the kind
7685 is ``UniformRetVal`` (indicates the uniform return value), or
7686 ``UniqueRetVal`` (holds the return value associated with the unique vtable
7687 (0 or 1)). The ``byte`` and ``bit`` fields are only used if the target does
7688 not support the use of absolute symbols to store constants.
7689
7690 .. _intrinsicglobalvariables:
7691
7692 Intrinsic Global Variables
7693 ==========================
7694
7695 LLVM has a number of "magic" global variables that contain data that
7696 affect code generation or other IR semantics. These are documented here.
7697 All globals of this sort should have a section specified as
7698 "``llvm.metadata``". This section and all globals that start with
7699 "``llvm.``" are reserved for use by LLVM.
7700
7701 .. _gv_llvmused:
7702
7703 The '``llvm.used``' Global Variable
7704 -----------------------------------
7705
7706 The ``@llvm.used`` global is an array which has
7707 :ref:`appending linkage <linkage_appending>`. This array contains a list of
7708 pointers to named global variables, functions and aliases which may optionally
7709 have a pointer cast formed of bitcast or getelementptr. For example, a legal
7710 use of it is:
7711
7712 .. code-block:: llvm
7713
7714     @X = global i8 4
7715     @Y = global i32 123
7716
7717     @llvm.used = appending global [2 x i8*] [
7718        i8* @X,
7719        i8* bitcast (i32* @Y to i8*)
7720     ], section "llvm.metadata"
7721
7722 If a symbol appears in the ``@llvm.used`` list, then the compiler, assembler,
7723 and linker are required to treat the symbol as if there is a reference to the
7724 symbol that it cannot see (which is why they have to be named). For example, if
7725 a variable has internal linkage and no references other than that from the
7726 ``@llvm.used`` list, it cannot be deleted. This is commonly used to represent
7727 references from inline asms and other things the compiler cannot "see", and
corresponds to "``__attribute__((used))``" in GNU C.
7729
7730 On some targets, the code generator must emit a directive to the
7731 assembler or object file to prevent the assembler and linker from
7732 removing the symbol.
7733
7734 .. _gv_llvmcompilerused:
7735
7736 The '``llvm.compiler.used``' Global Variable
7737 --------------------------------------------
7738
7739 The ``@llvm.compiler.used`` directive is the same as the ``@llvm.used``
7740 directive, except that it only prevents the compiler from touching the
7741 symbol. On targets that support it, this allows an intelligent linker to
7742 optimize references to the symbol without being impeded as it would be
7743 by ``@llvm.used``.
7744
This construct should only be used in rare circumstances, and should not be
exposed to source languages.
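
By analogy with the ``@llvm.used`` example, a minimal sketch of this variable
might look as follows. The global ``@Z`` is hypothetical; the array shape and
section are assumed to mirror those of ``@llvm.used``:

.. code-block:: llvm

    ; Hypothetical internal global that the compiler must not delete.
    @Z = internal global i32 0

    @llvm.compiler.used = appending global [1 x i8*] [
       i8* bitcast (i32* @Z to i8*)
    ], section "llvm.metadata"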
7747
7748 .. _gv_llvmglobalctors:
7749
7750 The '``llvm.global_ctors``' Global Variable
7751 -------------------------------------------
7752
7753 .. code-block:: llvm
7754
7755     %0 = type { i32, void ()*, i8* }
7756     @llvm.global_ctors = appending global [1 x %0] [%0 { i32 65535, void ()* @ctor, i8* @data }]
7757
7758 The ``@llvm.global_ctors`` array contains a list of constructor
7759 functions, priorities, and an associated global or function.
7760 The functions referenced by this array will be called in ascending order
7761 of priority (i.e. lowest first) when the module is loaded. The order of
7762 functions with the same priority is not defined.
7763
7764 If the third field is non-null, and points to a global variable
7765 or function, the initializer function will only run if the associated
7766 data from the current module is not discarded.
7767 On ELF the referenced global variable or function must be in a comdat.
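
To illustrate the ordering rule, the following sketch registers two
hypothetical constructors, ``@early`` and ``@late``. Under the rule above,
``@early`` (priority 100) is called before ``@late`` (priority 200), regardless
of their order in the array. Both entries pass ``null`` as the third field, so
neither is tied to associated data:

.. code-block:: llvm

    %0 = type { i32, void ()*, i8* }

    ; Listed out of order on purpose: the priority field, not the array
    ; position, determines the call order.
    @llvm.global_ctors = appending global [2 x %0] [
      %0 { i32 200, void ()* @late, i8* null },
      %0 { i32 100, void ()* @early, i8* null }
    ]

    define internal void @early() {
      ret void
    }

    define internal void @late() {
      ret void
    }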
7768
7769 .. _llvmglobaldtors:
7770
7771 The '``llvm.global_dtors``' Global Variable
7772 -------------------------------------------
7773
7774 .. code-block:: llvm
7775
7776     %0 = type { i32, void ()*, i8* }
7777     @llvm.global_dtors = appending global [1 x %0] [%0 { i32 65535, void ()* @dtor, i8* @data }]
7778
7779 The ``@llvm.global_dtors`` array contains a list of destructor
7780 functions, priorities, and an associated global or function.
7781 The functions referenced by this array will be called in descending
7782 order of priority (i.e. highest first) when the module is unloaded. The
7783 order of functions with the same priority is not defined.
7784
7785 If the third field is non-null, and points to a global variable
7786 or function, the destructor function will only run if the associated
7787 data from the current module is not discarded.
7788 On ELF the referenced global variable or function must be in a comdat.
7789
7790 Instruction Reference
7791 =====================
7792
7793 The LLVM instruction set consists of several different classifications
7794 of instructions: :ref:`terminator instructions <terminators>`, :ref:`binary
7795 instructions <binaryops>`, :ref:`bitwise binary
7796 instructions <bitwiseops>`, :ref:`memory instructions <memoryops>`, and
7797 :ref:`other instructions <otherops>`.
7798
7799 .. _terminators:
7800
7801 Terminator Instructions
7802 -----------------------
7803
7804 As mentioned :ref:`previously <functionstructure>`, every basic block in a
7805 program ends with a "Terminator" instruction, which indicates which
7806 block should be executed after the current block is finished. These
7807 terminator instructions typically yield a '``void``' value: they produce
7808 control flow, not values (the one exception being the
7809 ':ref:`invoke <i_invoke>`' instruction).
7810
7811 The terminator instructions are: ':ref:`ret <i_ret>`',
7812 ':ref:`br <i_br>`', ':ref:`switch <i_switch>`',
7813 ':ref:`indirectbr <i_indirectbr>`', ':ref:`invoke <i_invoke>`',
':ref:`callbr <i_callbr>`',
7815 ':ref:`resume <i_resume>`', ':ref:`catchswitch <i_catchswitch>`',
7816 ':ref:`catchret <i_catchret>`',
7817 ':ref:`cleanupret <i_cleanupret>`',
7818 and ':ref:`unreachable <i_unreachable>`'.
7819
7820 .. _i_ret:
7821
7822 '``ret``' Instruction
7823 ^^^^^^^^^^^^^^^^^^^^^
7824
7825 Syntax:
7826 """""""
7827
7828 ::
7829
7830       ret <type> <value>       ; Return a value from a non-void function
7831       ret void                 ; Return from void function
7832
7833 Overview:
7834 """""""""
7835
7836 The '``ret``' instruction is used to return control flow (and optionally
7837 a value) from a function back to the caller.
7838
7839 There are two forms of the '``ret``' instruction: one that returns a
7840 value and then causes control flow, and one that just causes control
7841 flow to occur.
7842
7843 Arguments:
7844 """"""""""
7845
7846 The '``ret``' instruction optionally accepts a single argument, the
7847 return value. The type of the return value must be a ':ref:`first
7848 class <t_firstclass>`' type.
7849
7850 A function is not :ref:`well formed <wellformed>` if it has a non-void
7851 return type and contains a '``ret``' instruction with no return value or
7852 a return value with a type that does not match its type, or if it has a
7853 void return type and contains a '``ret``' instruction with a return
7854 value.
7855
7856 Semantics:
7857 """"""""""
7858
7859 When the '``ret``' instruction is executed, control flow returns back to
7860 the calling function's context. If the caller is a
7861 ":ref:`call <i_call>`" instruction, execution continues at the
7862 instruction after the call. If the caller was an
7863 ":ref:`invoke <i_invoke>`" instruction, execution continues at the
7864 beginning of the "normal" destination block. If the instruction returns
7865 a value, that value shall set the call or invoke instruction's return
7866 value.
7867
7868 Example:
7869 """"""""
7870
7871 .. code-block:: llvm
7872
7873       ret i32 5                       ; Return an integer value of 5
7874       ret void                        ; Return from a void function
7875       ret { i32, i8 } { i32 4, i8 2 } ; Return a struct of values 4 and 2
7876
7877 .. _i_br:
7878
7879 '``br``' Instruction
7880 ^^^^^^^^^^^^^^^^^^^^
7881
7882 Syntax:
7883 """""""
7884
7885 ::
7886
7887       br i1 <cond>, label <iftrue>, label <iffalse>
7888       br label <dest>          ; Unconditional branch
7889
7890 Overview:
7891 """""""""
7892
7893 The '``br``' instruction is used to cause control flow to transfer to a
7894 different basic block in the current function. There are two forms of
7895 this instruction, corresponding to a conditional branch and an
7896 unconditional branch.
7897
7898 Arguments:
7899 """"""""""
7900
7901 The conditional branch form of the '``br``' instruction takes a single
7902 '``i1``' value and two '``label``' values. The unconditional form of the
7903 '``br``' instruction takes a single '``label``' value as a target.
7904
7905 Semantics:
7906 """"""""""
7907
7908 Upon execution of a conditional '``br``' instruction, the '``i1``'
7909 argument is evaluated. If the value is ``true``, control flows to the
'``iftrue``' ``label`` argument. If '``cond``' is ``false``, control flows
7911 to the '``iffalse``' ``label`` argument.
7912 If '``cond``' is ``poison`` or ``undef``, this instruction has undefined
7913 behavior.
7914
7915 Example:
7916 """"""""
7917
7918 .. code-block:: llvm
7919
7920     Test:
7921       %cond = icmp eq i32 %a, %b
7922       br i1 %cond, label %IfEqual, label %IfUnequal
7923     IfEqual:
7924       ret i32 1
7925     IfUnequal:
7926       ret i32 0
7927
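Because branching on ``undef`` or ``poison`` is undefined behavior, a condition
that may carry such a value can first be passed through the '``freeze``'
instruction. The following is a minimal sketch; the function and value names
are hypothetical:

.. code-block:: llvm

    define i32 @pick(i1 %maybe_undef) {
    entry:
      ; freeze pins an undef or poison input to a fixed but arbitrary i1 value,
      ; so the branch below is well defined.
      %cond = freeze i1 %maybe_undef
      br i1 %cond, label %IfTrue, label %IfFalse
    IfTrue:
      ret i32 1
    IfFalse:
      ret i32 0
    }
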
7928 .. _i_switch:
7929
7930 '``switch``' Instruction
7931 ^^^^^^^^^^^^^^^^^^^^^^^^
7932
7933 Syntax:
7934 """""""
7935
7936 ::
7937
7938       switch <intty> <value>, label <defaultdest> [ <intty> <val>, label <dest> ... ]
7939
7940 Overview:
7941 """""""""
7942
7943 The '``switch``' instruction is used to transfer control flow to one of
7944 several different places. It is a generalization of the '``br``'
7945 instruction, allowing a branch to occur to one of many possible
7946 destinations.
7947
7948 Arguments:
7949 """"""""""
7950
7951 The '``switch``' instruction uses three parameters: an integer
7952 comparison value '``value``', a default '``label``' destination, and an
7953 array of pairs of comparison value constants and '``label``'s. The table
7954 is not allowed to contain duplicate constant entries.
7955
7956 Semantics:
7957 """"""""""
7958
7959 The ``switch`` instruction specifies a table of values and destinations.
7960 When the '``switch``' instruction is executed, this table is searched
7961 for the given value. If the value is found, control flow is transferred
7962 to the corresponding destination; otherwise, control flow is transferred
7963 to the default destination.
7964 If '``value``' is ``poison`` or ``undef``, this instruction has undefined
7965 behavior.
7966
7967 Implementation:
7968 """""""""""""""
7969
7970 Depending on properties of the target machine and the particular
7971 ``switch`` instruction, this instruction may be code generated in
7972 different ways. For example, it could be generated as a series of
7973 chained conditional branches or with a lookup table.
7974
7975 Example:
7976 """"""""
7977
7978 .. code-block:: llvm
7979
7980      ; Emulate a conditional br instruction
7981      %Val = zext i1 %value to i32
7982      switch i32 %Val, label %truedest [ i32 0, label %falsedest ]
7983
7984      ; Emulate an unconditional br instruction
7985      switch i32 0, label %dest [ ]
7986
7987      ; Implement a jump table:
7988      switch i32 %val, label %otherwise [ i32 0, label %onzero
7989                                          i32 1, label %onone
7990                                          i32 2, label %ontwo ]
7991
7992 .. _i_indirectbr:
7993
7994 '``indirectbr``' Instruction
7995 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7996
7997 Syntax:
7998 """""""
7999
8000 ::
8001
8002       indirectbr <somety>* <address>, [ label <dest1>, label <dest2>, ... ]
8003
8004 Overview:
8005 """""""""
8006
8007 The '``indirectbr``' instruction implements an indirect branch to a
8008 label within the current function, whose address is specified by
8009 "``address``". Address must be derived from a
8010 :ref:`blockaddress <blockaddress>` constant.
8011
8012 Arguments:
8013 """"""""""
8014
8015 The '``address``' argument is the address of the label to jump to. The
8016 rest of the arguments indicate the full set of possible destinations
8017 that the address may point to. Blocks are allowed to occur multiple
8018 times in the destination list, though this isn't particularly useful.
8019
8020 This destination list is required so that dataflow analysis has an
8021 accurate understanding of the CFG.
8022
8023 Semantics:
8024 """"""""""
8025
8026 Control transfers to the block specified in the address argument. All
8027 possible destination blocks must be listed in the label list, otherwise
8028 this instruction has undefined behavior. This implies that jumps to
8029 labels defined in other functions have undefined behavior as well.
8030 If '``address``' is ``poison`` or ``undef``, this instruction has undefined
8031 behavior.
8032
8033 Implementation:
8034 """""""""""""""
8035
8036 This is typically implemented with a jump through a register.
8037
8038 Example:
8039 """"""""
8040
8041 .. code-block:: llvm
8042
8043      indirectbr i8* %Addr, [ label %bb1, label %bb2, label %bb3 ]
8044
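For a more complete picture, the following sketch shows an address derived from
``blockaddress`` constants feeding an ``indirectbr`` whose destination list
covers every block the address may point to. The function ``@dispatch`` and its
labels are hypothetical:

.. code-block:: llvm

    define i32 @dispatch(i1 %which) {
    entry:
      ; Both candidate addresses come from blockaddress constants of this function.
      %addr = select i1 %which, i8* blockaddress(@dispatch, %bb1),
                                i8* blockaddress(@dispatch, %bb2)
      ; Every possible target is listed, as required.
      indirectbr i8* %addr, [ label %bb1, label %bb2 ]
    bb1:
      ret i32 1
    bb2:
      ret i32 2
    }
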
8045 .. _i_invoke:
8046
8047 '``invoke``' Instruction
8048 ^^^^^^^^^^^^^^^^^^^^^^^^
8049
8050 Syntax:
8051 """""""
8052
8053 ::
8054
8055       <result> = invoke [cconv] [ret attrs] [addrspace(<num>)] <ty>|<fnty> <fnptrval>(<function args>) [fn attrs]
8056                     [operand bundles] to label <normal label> unwind label <exception label>
8057
8058 Overview:
8059 """""""""
8060
8061 The '``invoke``' instruction causes control to transfer to a specified
8062 function, with the possibility of control flow transfer to either the
8063 '``normal``' label or the '``exception``' label. If the callee function
8064 returns with the "``ret``" instruction, control flow will return to the
8065 "normal" label. If the callee (or any indirect callees) returns via the
8066 ":ref:`resume <i_resume>`" instruction or other exception handling
8067 mechanism, control is interrupted and continued at the dynamically
8068 nearest "exception" label.
8069
The '``exception``' label is a `landing
pad <ExceptionHandling.html#overview>`_ for the exception. As such, the
'``exception``' label is required to have the
":ref:`landingpad <i_landingpad>`" instruction, which contains the
information about the behavior of the program after unwinding happens,
as its first non-PHI instruction. The restrictions on the
"``landingpad``" instruction tightly couple it to the "``invoke``"
instruction, so that the important information contained within the
"``landingpad``" instruction can't be lost through normal code motion.
8079
8080 Arguments:
8081 """"""""""
8082
8083 This instruction requires several arguments:
8084
8085 #. The optional "cconv" marker indicates which :ref:`calling
8086    convention <callingconv>` the call should use. If none is
8087    specified, the call defaults to using C calling conventions.
8088 #. The optional :ref:`Parameter Attributes <paramattrs>` list for return
8089    values. Only '``zeroext``', '``signext``', and '``inreg``' attributes
8090    are valid here.
8091 #. The optional addrspace attribute can be used to indicate the address space
8092    of the called function. If it is not specified, the program address space
8093    from the :ref:`datalayout string<langref_datalayout>` will be used.
8094 #. '``ty``': the type of the call instruction itself which is also the
8095    type of the return value. Functions that return no value are marked
8096    ``void``.
8097 #. '``fnty``': shall be the signature of the function being invoked. The
8098    argument types must match the types implied by this signature. This
8099    type can be omitted if the function is not varargs.
8100 #. '``fnptrval``': An LLVM value containing a pointer to a function to
8101    be invoked. In most cases, this is a direct function invocation, but
8102    indirect ``invoke``'s are just as possible, calling an arbitrary pointer
8103    to function value.
8104 #. '``function args``': argument list whose types match the function
8105    signature argument types and parameter attributes. All arguments must
8106    be of :ref:`first class <t_firstclass>` type. If the function signature
8107    indicates the function accepts a variable number of arguments, the
8108    extra arguments can be specified.
8109 #. '``normal label``': the label reached when the called function
8110    executes a '``ret``' instruction.
8111 #. '``exception label``': the label reached when a callee returns via
8112    the :ref:`resume <i_resume>` instruction or other exception handling
8113    mechanism.
8114 #. The optional :ref:`function attributes <fnattrs>` list.
8115 #. The optional :ref:`operand bundles <opbundles>` list.
8116
8117 Semantics:
8118 """"""""""
8119
8120 This instruction is designed to operate as a standard '``call``'
8121 instruction in most regards. The primary difference is that it
8122 establishes an association with a label, which is used by the runtime
8123 library to unwind the stack.
8124
8125 This instruction is used in languages with destructors to ensure that
8126 proper cleanup is performed in the case of either a ``longjmp`` or a
8127 thrown exception. Additionally, this is important for implementation of
8128 '``catch``' clauses in high-level languages that support them.
8129
8130 For the purposes of the SSA form, the definition of the value returned
8131 by the '``invoke``' instruction is deemed to occur on the edge from the
8132 current block to the "normal" label. If the callee unwinds then no
8133 return value is available.
8134
8135 Example:
8136 """"""""
8137
8138 .. code-block:: llvm
8139
8140       %retval = invoke i32 @Test(i32 15) to label %Continue
8141                   unwind label %TestCleanup              ; i32:retval set
8142       %retval = invoke coldcc i32 %Testfnptr(i32 15) to label %Continue
8143                   unwind label %TestCleanup              ; i32:retval set
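
The SSA rule described under Semantics can be sketched as a complete
function. This is an illustrative example only: ``@caller`` is a hypothetical
name, and the Itanium C++ personality routine ``@__gxx_personality_v0`` is
assumed purely so that the function carries a personality.

.. code-block:: llvm

      declare i32 @Test(i32)
      declare i32 @__gxx_personality_v0(...)

      define i32 @caller() personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*) {
      entry:
        %retval = invoke i32 @Test(i32 15)
                    to label %Continue unwind label %TestCleanup

      Continue:                          ; %retval is defined on this edge
        ret i32 %retval

      TestCleanup:                       ; %retval is not available here
        %exn = landingpad { i8*, i32 }
                    cleanup
        resume { i8*, i32 } %exn
      }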
8144
8145 .. _i_callbr:
8146
8147 '``callbr``' Instruction
8148 ^^^^^^^^^^^^^^^^^^^^^^^^
8149
8150 Syntax:
8151 """""""
8152
8153 ::
8154
8155       <result> = callbr [cconv] [ret attrs] [addrspace(<num>)] <ty>|<fnty> <fnptrval>(<function args>) [fn attrs]
8156                     [operand bundles] to label <fallthrough label> [indirect labels]
8157
8158 Overview:
8159 """""""""
8160
8161 The '``callbr``' instruction causes control to transfer to a specified
8162 function, with the possibility of control flow transfer to either the
8163 '``fallthrough``' label or one of the '``indirect``' labels.
8164
This instruction should only be used to implement the "goto" feature of
GCC-style inline assembly. Any other usage is reported as an error by the
IR verifier.
8167
8168 Arguments:
8169 """"""""""
8170
8171 This instruction requires several arguments:
8172
8173 #. The optional "cconv" marker indicates which :ref:`calling
8174    convention <callingconv>` the call should use. If none is
8175    specified, the call defaults to using C calling conventions.
8176 #. The optional :ref:`Parameter Attributes <paramattrs>` list for return
8177    values. Only '``zeroext``', '``signext``', and '``inreg``' attributes
8178    are valid here.
8179 #. The optional addrspace attribute can be used to indicate the address space
8180    of the called function. If it is not specified, the program address space
8181    from the :ref:`datalayout string<langref_datalayout>` will be used.
8182 #. '``ty``': the type of the call instruction itself which is also the
8183    type of the return value. Functions that return no value are marked
8184    ``void``.
8185 #. '``fnty``': shall be the signature of the function being called. The
8186    argument types must match the types implied by this signature. This
8187    type can be omitted if the function is not varargs.
8188 #. '``fnptrval``': An LLVM value containing a pointer to a function to
8189    be called. In most cases, this is a direct function call, but
8190    other ``callbr``'s are just as possible, calling an arbitrary pointer
8191    to function value.
8192 #. '``function args``': argument list whose types match the function
8193    signature argument types and parameter attributes. All arguments must
8194    be of :ref:`first class <t_firstclass>` type. If the function signature
8195    indicates the function accepts a variable number of arguments, the
8196    extra arguments can be specified.
8197 #. '``fallthrough label``': the label reached when the inline assembly's
8198    execution exits the bottom.
8199 #. '``indirect labels``': the labels reached when a callee transfers control
8200    to a location other than the '``fallthrough label``'. The blockaddress
8201    constant for these should also be in the list of '``function args``'.
8202 #. The optional :ref:`function attributes <fnattrs>` list.
8203 #. The optional :ref:`operand bundles <opbundles>` list.
8204
8205 Semantics:
8206 """"""""""
8207
8208 This instruction is designed to operate as a standard '``call``'
8209 instruction in most regards. The primary difference is that it
8210 establishes an association with additional labels to define where control
8211 flow goes after the call.
8212
The output values of a '``callbr``' instruction are available only to
the '``fallthrough``' block, not to any '``indirect``' block(s).
8215
8216 The only use of this today is to implement the "goto" feature of gcc inline
8217 assembly where additional labels can be provided as locations for the inline
8218 assembly to jump to.
8219
8220 Example:
8221 """"""""
8222
8223 .. code-block:: llvm
8224
8225       ; "asm goto" without output constraints.
8226       callbr void asm "", "r,X"(i32 %x, i8 *blockaddress(@foo, %indirect))
8227                   to label %fallthrough [label %indirect]
8228
8229       ; "asm goto" with output constraints.
8230       <result> = callbr i32 asm "", "=r,r,X"(i32 %x, i8 *blockaddress(@foo, %indirect))
8231                   to label %fallthrough [label %indirect]
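
As a minimal sketch of the rule that outputs are only available on the
fallthrough path, the second form can be placed in a complete function. The
function body and the returned constants are hypothetical, and the empty
assembly string stands in for real "asm goto" code.

.. code-block:: llvm

      define i32 @foo(i32 %x) {
      entry:
        %out = callbr i32 asm "", "=r,r,X"(i32 %x, i8* blockaddress(@foo, %indirect))
                    to label %fallthrough [label %indirect]

      fallthrough:                       ; %out may be used here
        ret i32 %out

      indirect:                          ; %out must not be used here
        ret i32 -1
      }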
8232
8233 .. _i_resume:
8234
8235 '``resume``' Instruction
8236 ^^^^^^^^^^^^^^^^^^^^^^^^
8237
8238 Syntax:
8239 """""""
8240
8241 ::
8242
8243       resume <type> <value>
8244
8245 Overview:
8246 """""""""
8247
8248 The '``resume``' instruction is a terminator instruction that has no
8249 successors.
8250
8251 Arguments:
8252 """"""""""
8253
8254 The '``resume``' instruction requires one argument, which must have the
8255 same type as the result of any '``landingpad``' instruction in the same
8256 function.
8257
8258 Semantics:
8259 """"""""""
8260
8261 The '``resume``' instruction resumes propagation of an existing
8262 (in-flight) exception whose unwinding was interrupted with a
8263 :ref:`landingpad <i_landingpad>` instruction.
8264
8265 Example:
8266 """"""""
8267
8268 .. code-block:: llvm
8269
8270       resume { i8*, i32 } %exn
8271
8272 .. _i_catchswitch:
8273
8274 '``catchswitch``' Instruction
8275 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
8276
8277 Syntax:
8278 """""""
8279
8280 ::
8281
8282       <resultval> = catchswitch within <parent> [ label <handler1>, label <handler2>, ... ] unwind to caller
8283       <resultval> = catchswitch within <parent> [ label <handler1>, label <handler2>, ... ] unwind label <default>
8284
8285 Overview:
8286 """""""""
8287
8288 The '``catchswitch``' instruction is used by `LLVM's exception handling system
8289 <ExceptionHandling.html#overview>`_ to describe the set of possible catch handlers
8290 that may be executed by the :ref:`EH personality routine <personalityfn>`.
8291
8292 Arguments:
8293 """"""""""
8294
8295 The ``parent`` argument is the token of the funclet that contains the
8296 ``catchswitch`` instruction. If the ``catchswitch`` is not inside a funclet,
8297 this operand may be the token ``none``.
8298
8299 The ``default`` argument is the label of another basic block beginning with
8300 either a ``cleanuppad`` or ``catchswitch`` instruction.  This unwind destination
8301 must be a legal target with respect to the ``parent`` links, as described in
8302 the `exception handling documentation\ <ExceptionHandling.html#wineh-constraints>`_.
8303
8304 The ``handlers`` are a nonempty list of successor blocks that each begin with a
8305 :ref:`catchpad <i_catchpad>` instruction.
8306
8307 Semantics:
8308 """"""""""
8309
8310 Executing this instruction transfers control to one of the successors in
8311 ``handlers``, if appropriate, or continues to unwind via the unwind label if
8312 present.
8313
8314 The ``catchswitch`` is both a terminator and a "pad" instruction, meaning that
8315 it must be both the first non-phi instruction and last instruction in the basic
8316 block. Therefore, it must be the only non-phi instruction in the block.
8317
8318 Example:
8319 """"""""
8320
8321 .. code-block:: text
8322
8323     dispatch1:
8324       %cs1 = catchswitch within none [label %handler0, label %handler1] unwind to caller
8325     dispatch2:
8326       %cs2 = catchswitch within %parenthandler [label %handler0] unwind label %cleanup
8327
8328 .. _i_catchret:
8329
8330 '``catchret``' Instruction
8331 ^^^^^^^^^^^^^^^^^^^^^^^^^^
8332
8333 Syntax:
8334 """""""
8335
8336 ::
8337
8338       catchret from <token> to label <normal>
8339
8340 Overview:
8341 """""""""
8342
8343 The '``catchret``' instruction is a terminator instruction that has a
8344 single successor.
8345
8346
8347 Arguments:
8348 """"""""""
8349
8350 The first argument to a '``catchret``' indicates which ``catchpad`` it
8351 exits.  It must be a :ref:`catchpad <i_catchpad>`.
8352 The second argument to a '``catchret``' specifies where control will
8353 transfer to next.
8354
8355 Semantics:
8356 """"""""""
8357
8358 The '``catchret``' instruction ends an existing (in-flight) exception whose
8359 unwinding was interrupted with a :ref:`catchpad <i_catchpad>` instruction.  The
8360 :ref:`personality function <personalityfn>` gets a chance to execute arbitrary
8361 code to, for example, destroy the active exception.  Control then transfers to
8362 ``normal``.
8363
8364 The ``token`` argument must be a token produced by a ``catchpad`` instruction.
8365 If the specified ``catchpad`` is not the most-recently-entered not-yet-exited
8366 funclet pad (as described in the `EH documentation\ <ExceptionHandling.html#wineh-constraints>`_),
8367 the ``catchret``'s behavior is undefined.
8368
8369 Example:
8370 """"""""
8371
8372 .. code-block:: text
8373
      catchret from %catch to label %continue
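
The following sketch shows how '``catchswitch``', '``catchpad``' and
'``catchret``' might fit together in one function. It assumes the MSVC C++
personality ``@__CxxFrameHandler3`` and its usual catch-all operand list;
``@f`` and ``@may_throw`` are hypothetical names.

.. code-block:: llvm

      declare void @may_throw()
      declare i32 @__CxxFrameHandler3(...)

      define void @f() personality i8* bitcast (i32 (...)* @__CxxFrameHandler3 to i8*) {
      entry:
        invoke void @may_throw()
                    to label %done unwind label %dispatch

      dispatch:                          ; only non-phi instruction in the block
        %cs = catchswitch within none [label %handler] unwind to caller

      handler:                           ; assumed catch-all operands
        %cp = catchpad within %cs [i8* null, i32 64, i8* null]
        catchret from %cp to label %done

      done:
        ret void
      }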
8375
8376 .. _i_cleanupret:
8377
8378 '``cleanupret``' Instruction
8379 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
8380
8381 Syntax:
8382 """""""
8383
8384 ::
8385
8386       cleanupret from <value> unwind label <continue>
8387       cleanupret from <value> unwind to caller
8388
8389 Overview:
8390 """""""""
8391
8392 The '``cleanupret``' instruction is a terminator instruction that has
8393 an optional successor.
8394
8395
8396 Arguments:
8397 """"""""""
8398
8399 The '``cleanupret``' instruction requires one argument, which indicates
8400 which ``cleanuppad`` it exits, and must be a :ref:`cleanuppad <i_cleanuppad>`.
8401 If the specified ``cleanuppad`` is not the most-recently-entered not-yet-exited
8402 funclet pad (as described in the `EH documentation\ <ExceptionHandling.html#wineh-constraints>`_),
8403 the ``cleanupret``'s behavior is undefined.
8404
8405 The '``cleanupret``' instruction also has an optional successor, ``continue``,
8406 which must be the label of another basic block beginning with either a
8407 ``cleanuppad`` or ``catchswitch`` instruction.  This unwind destination must
8408 be a legal target with respect to the ``parent`` links, as described in the
8409 `exception handling documentation\ <ExceptionHandling.html#wineh-constraints>`_.
8410
8411 Semantics:
8412 """"""""""
8413
8414 The '``cleanupret``' instruction indicates to the
8415 :ref:`personality function <personalityfn>` that one
8416 :ref:`cleanuppad <i_cleanuppad>` it transferred control to has ended.
8417 It transfers control to ``continue`` or unwinds out of the function.
8418
8419 Example:
8420 """"""""
8421
8422 .. code-block:: text
8423
8424       cleanupret from %cleanup unwind to caller
8425       cleanupret from %cleanup unwind label %continue
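
A cleanup funclet that exits with '``cleanupret``' could look like the sketch
below. The personality ``@__CxxFrameHandler3`` is an assumption, and ``@g``,
``@may_throw`` and ``@release`` are hypothetical names.

.. code-block:: llvm

      declare void @may_throw()
      declare void @release()
      declare i32 @__CxxFrameHandler3(...)

      define void @g() personality i8* bitcast (i32 (...)* @__CxxFrameHandler3 to i8*) {
      entry:
        invoke void @may_throw()
                    to label %done unwind label %cleanup

      cleanup:
        %cl = cleanuppad within none []
        call void @release() [ "funclet"(token %cl) ]
        cleanupret from %cl unwind to caller   ; keep unwinding out of @g

      done:
        ret void
      }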
8426
8427 .. _i_unreachable:
8428
8429 '``unreachable``' Instruction
8430 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
8431
8432 Syntax:
8433 """""""
8434
8435 ::
8436
8437       unreachable
8438
8439 Overview:
8440 """""""""
8441
8442 The '``unreachable``' instruction has no defined semantics. This
8443 instruction is used to inform the optimizer that a particular portion of
8444 the code is not reachable. This can be used to indicate that the code
8445 after a no-return function cannot be reached, and other facts.
8446
8447 Semantics:
8448 """"""""""
8449
8450 The '``unreachable``' instruction has no defined semantics.
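
A typical use, sketched here with a hypothetical wrapper ``@fail``, is to
terminate the block that follows a call to a ``noreturn`` function:

.. code-block:: llvm

      declare void @abort() noreturn

      define void @fail() {
        call void @abort()
        unreachable                      ; control never gets past the call
      }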
8451
8452 .. _unaryops:
8453
8454 Unary Operations
8455 -----------------
8456
8457 Unary operators require a single operand, execute an operation on
8458 it, and produce a single value. The operand might represent multiple
8459 data, as is the case with the :ref:`vector <t_vector>` data type. The
8460 result value has the same type as its operand.
8461
8462 .. _i_fneg:
8463
8464 '``fneg``' Instruction
8465 ^^^^^^^^^^^^^^^^^^^^^^
8466
8467 Syntax:
8468 """""""
8469
8470 ::
8471
8472       <result> = fneg [fast-math flags]* <ty> <op1>   ; yields ty:result
8473
8474 Overview:
8475 """""""""
8476
8477 The '``fneg``' instruction returns the negation of its operand.
8478
8479 Arguments:
8480 """"""""""
8481
8482 The argument to the '``fneg``' instruction must be a
8483 :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
8484 floating-point values.
8485
8486 Semantics:
8487 """"""""""
8488
The value produced is a copy of the operand with its sign bit flipped.
This instruction can also take any number of :ref:`fast-math
flags <fastmath>`, which are optimization hints to enable otherwise
unsafe floating-point optimizations.
8493
8494 Example:
8495 """"""""
8496
8497 .. code-block:: text
8498
      <result> = fneg float %val          ; yields float:result = -%val
8500
8501 .. _binaryops:
8502
8503 Binary Operations
8504 -----------------
8505
8506 Binary operators are used to do most of the computation in a program.
8507 They require two operands of the same type, execute an operation on
8508 them, and produce a single value. The operands might represent multiple
8509 data, as is the case with the :ref:`vector <t_vector>` data type. The
8510 result value has the same type as its operands.
8511
8512 There are several different binary operators:
8513
8514 .. _i_add:
8515
8516 '``add``' Instruction
8517 ^^^^^^^^^^^^^^^^^^^^^
8518
8519 Syntax:
8520 """""""
8521
8522 ::
8523
8524       <result> = add <ty> <op1>, <op2>          ; yields ty:result
8525       <result> = add nuw <ty> <op1>, <op2>      ; yields ty:result
8526       <result> = add nsw <ty> <op1>, <op2>      ; yields ty:result
8527       <result> = add nuw nsw <ty> <op1>, <op2>  ; yields ty:result
8528
8529 Overview:
8530 """""""""
8531
8532 The '``add``' instruction returns the sum of its two operands.
8533
8534 Arguments:
8535 """"""""""
8536
8537 The two arguments to the '``add``' instruction must be
8538 :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
8539 arguments must have identical types.
8540
8541 Semantics:
8542 """"""""""
8543
8544 The value produced is the integer sum of the two operands.
8545
8546 If the sum has unsigned overflow, the result returned is the
8547 mathematical result modulo 2\ :sup:`n`\ , where n is the bit width of
8548 the result.
8549
8550 Because LLVM integers use a two's complement representation, this
8551 instruction is appropriate for both signed and unsigned integers.
8552
8553 ``nuw`` and ``nsw`` stand for "No Unsigned Wrap" and "No Signed Wrap",
8554 respectively. If the ``nuw`` and/or ``nsw`` keywords are present, the
8555 result value of the ``add`` is a :ref:`poison value <poisonvalues>` if
8556 unsigned and/or signed overflow, respectively, occurs.
8557
8558 Example:
8559 """"""""
8560
8561 .. code-block:: text
8562
8563       <result> = add i32 4, %var          ; yields i32:result = 4 + %var
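
The wrapping and poison rules can be illustrated on ``i8``, where the
arithmetic is easy to check by hand. The function name ``@add_demo`` is
hypothetical; note that ``i8 -1`` has the same bit pattern as unsigned 255.

.. code-block:: llvm

      define i8 @add_demo() {
        %wrapped = add i8 -1, 1          ; unsigned 255 + 1 = 256 mod 2^8 = 0
        %p1      = add nuw i8 -1, 1      ; poison: unsigned overflow (255 + 1)
        %p2      = add nsw i8 127, 1     ; poison: signed overflow (127 + 1)
        %ok      = add nsw i8 -1, 1      ; 0: -1 + 1 does not overflow as signed
        ret i8 %wrapped
      }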
8564
8565 .. _i_fadd:
8566
8567 '``fadd``' Instruction
8568 ^^^^^^^^^^^^^^^^^^^^^^
8569
8570 Syntax:
8571 """""""
8572
8573 ::
8574
8575       <result> = fadd [fast-math flags]* <ty> <op1>, <op2>   ; yields ty:result
8576
8577 Overview:
8578 """""""""
8579
8580 The '``fadd``' instruction returns the sum of its two operands.
8581
8582 Arguments:
8583 """"""""""
8584
8585 The two arguments to the '``fadd``' instruction must be
8586 :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
8587 floating-point values. Both arguments must have identical types.
8588
8589 Semantics:
8590 """"""""""
8591
The value produced is the floating-point sum of the two operands.
This instruction is assumed to execute in the default :ref:`floating-point
environment <floatenv>`.
This instruction can also take any number of :ref:`fast-math
flags <fastmath>`, which are optimization hints to enable otherwise
unsafe floating-point optimizations.
8598
8599 Example:
8600 """"""""
8601
8602 .. code-block:: text
8603
8604       <result> = fadd float 4.0, %var          ; yields float:result = 4.0 + %var
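
Fast-math flags are written between the opcode and the type. The sketch
below uses a hypothetical function ``@sum3`` and picks ``reassoc`` and
``nsz`` only as examples of flags that may be combined:

.. code-block:: llvm

      define float @sum3(float %a, float %b, float %c) {
        %t = fadd reassoc nsz float %a, %b
        %r = fadd reassoc nsz float %t, %c
        ret float %r
      }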
8605
8606 .. _i_sub:
8607
8608 '``sub``' Instruction
8609 ^^^^^^^^^^^^^^^^^^^^^
8610
8611 Syntax:
8612 """""""
8613
8614 ::
8615
8616       <result> = sub <ty> <op1>, <op2>          ; yields ty:result
8617       <result> = sub nuw <ty> <op1>, <op2>      ; yields ty:result
8618       <result> = sub nsw <ty> <op1>, <op2>      ; yields ty:result
8619       <result> = sub nuw nsw <ty> <op1>, <op2>  ; yields ty:result
8620
8621 Overview:
8622 """""""""
8623
8624 The '``sub``' instruction returns the difference of its two operands.
8625
8626 Note that the '``sub``' instruction is used to represent the '``neg``'
8627 instruction present in most other intermediate representations.
8628
8629 Arguments:
8630 """"""""""
8631
8632 The two arguments to the '``sub``' instruction must be
8633 :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
8634 arguments must have identical types.
8635
8636 Semantics:
8637 """"""""""
8638
8639 The value produced is the integer difference of the two operands.
8640
8641 If the difference has unsigned overflow, the result returned is the
8642 mathematical result modulo 2\ :sup:`n`\ , where n is the bit width of
8643 the result.
8644
8645 Because LLVM integers use a two's complement representation, this
8646 instruction is appropriate for both signed and unsigned integers.
8647
8648 ``nuw`` and ``nsw`` stand for "No Unsigned Wrap" and "No Signed Wrap",
8649 respectively. If the ``nuw`` and/or ``nsw`` keywords are present, the
8650 result value of the ``sub`` is a :ref:`poison value <poisonvalues>` if
8651 unsigned and/or signed overflow, respectively, occurs.
8652
8653 Example:
8654 """"""""
8655
8656 .. code-block:: text
8657
      <result> = sub i32 4, %var          ; yields i32:result = 4 - %var
      <result> = sub i32 0, %val          ; yields i32:result = -%val
8660
8661 .. _i_fsub:
8662
8663 '``fsub``' Instruction
8664 ^^^^^^^^^^^^^^^^^^^^^^
8665
8666 Syntax:
8667 """""""
8668
8669 ::
8670
8671       <result> = fsub [fast-math flags]* <ty> <op1>, <op2>   ; yields ty:result
8672
8673 Overview:
8674 """""""""
8675
8676 The '``fsub``' instruction returns the difference of its two operands.
8677
8678 Arguments:
8679 """"""""""
8680
8681 The two arguments to the '``fsub``' instruction must be
8682 :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
8683 floating-point values. Both arguments must have identical types.
8684
8685 Semantics:
8686 """"""""""
8687
The value produced is the floating-point difference of the two operands.
This instruction is assumed to execute in the default :ref:`floating-point
environment <floatenv>`.
This instruction can also take any number of :ref:`fast-math
flags <fastmath>`, which are optimization hints to enable otherwise
unsafe floating-point optimizations.
8694
8695 Example:
8696 """"""""
8697
8698 .. code-block:: text
8699
      <result> = fsub float 4.0, %var           ; yields float:result = 4.0 - %var
      <result> = fsub float -0.0, %val          ; yields float:result = -%val
8702
8703 .. _i_mul:
8704
8705 '``mul``' Instruction
8706 ^^^^^^^^^^^^^^^^^^^^^
8707
8708 Syntax:
8709 """""""
8710
8711 ::
8712
8713       <result> = mul <ty> <op1>, <op2>          ; yields ty:result
8714       <result> = mul nuw <ty> <op1>, <op2>      ; yields ty:result
8715       <result> = mul nsw <ty> <op1>, <op2>      ; yields ty:result
8716       <result> = mul nuw nsw <ty> <op1>, <op2>  ; yields ty:result
8717
8718 Overview:
8719 """""""""
8720
8721 The '``mul``' instruction returns the product of its two operands.
8722
8723 Arguments:
8724 """"""""""
8725
8726 The two arguments to the '``mul``' instruction must be
8727 :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
8728 arguments must have identical types.
8729
8730 Semantics:
8731 """"""""""
8732
8733 The value produced is the integer product of the two operands.
8734
8735 If the result of the multiplication has unsigned overflow, the result
8736 returned is the mathematical result modulo 2\ :sup:`n`\ , where n is the
8737 bit width of the result.
8738
8739 Because LLVM integers use a two's complement representation, and the
8740 result is the same width as the operands, this instruction returns the
8741 correct result for both signed and unsigned integers. If a full product
8742 (e.g. ``i32`` * ``i32`` -> ``i64``) is needed, the operands should be
8743 sign-extended or zero-extended as appropriate to the width of the full
8744 product.
8745
8746 ``nuw`` and ``nsw`` stand for "No Unsigned Wrap" and "No Signed Wrap",
8747 respectively. If the ``nuw`` and/or ``nsw`` keywords are present, the
8748 result value of the ``mul`` is a :ref:`poison value <poisonvalues>` if
8749 unsigned and/or signed overflow, respectively, occurs.
8750
8751 Example:
8752 """"""""
8753
8754 .. code-block:: text
8755
8756       <result> = mul i32 4, %var          ; yields i32:result = 4 * %var
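
The full-product case mentioned under Semantics can be sketched as follows.
The function names are hypothetical. Because both operands are extended from
``i32``, the ``i64`` product cannot overflow, so the ``nsw`` and ``nuw``
flags used here are safe to add.

.. code-block:: llvm

      ; Signed i32 * i32 -> i64
      define i64 @smul_full(i32 %a, i32 %b) {
        %a.ext = sext i32 %a to i64
        %b.ext = sext i32 %b to i64
        %prod = mul nsw i64 %a.ext, %b.ext
        ret i64 %prod
      }

      ; Unsigned i32 * i32 -> i64
      define i64 @umul_full(i32 %a, i32 %b) {
        %a.ext = zext i32 %a to i64
        %b.ext = zext i32 %b to i64
        %prod = mul nuw i64 %a.ext, %b.ext
        ret i64 %prod
      }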
8757
8758 .. _i_fmul:
8759
8760 '``fmul``' Instruction
8761 ^^^^^^^^^^^^^^^^^^^^^^
8762
8763 Syntax:
8764 """""""
8765
8766 ::
8767
8768       <result> = fmul [fast-math flags]* <ty> <op1>, <op2>   ; yields ty:result
8769
8770 Overview:
8771 """""""""
8772
8773 The '``fmul``' instruction returns the product of its two operands.
8774
8775 Arguments:
8776 """"""""""
8777
8778 The two arguments to the '``fmul``' instruction must be
8779 :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
8780 floating-point values. Both arguments must have identical types.
8781
8782 Semantics:
8783 """"""""""
8784
The value produced is the floating-point product of the two operands.
This instruction is assumed to execute in the default :ref:`floating-point
environment <floatenv>`.
This instruction can also take any number of :ref:`fast-math
flags <fastmath>`, which are optimization hints to enable otherwise
unsafe floating-point optimizations.
8791
8792 Example:
8793 """"""""
8794
8795 .. code-block:: text
8796
8797       <result> = fmul float 4.0, %var          ; yields float:result = 4.0 * %var
8798
8799 .. _i_udiv:
8800
8801 '``udiv``' Instruction
8802 ^^^^^^^^^^^^^^^^^^^^^^
8803
8804 Syntax:
8805 """""""
8806
8807 ::
8808
8809       <result> = udiv <ty> <op1>, <op2>         ; yields ty:result
8810       <result> = udiv exact <ty> <op1>, <op2>   ; yields ty:result
8811
8812 Overview:
8813 """""""""
8814
8815 The '``udiv``' instruction returns the quotient of its two operands.
8816
8817 Arguments:
8818 """"""""""
8819
8820 The two arguments to the '``udiv``' instruction must be
8821 :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
8822 arguments must have identical types.
8823
8824 Semantics:
8825 """"""""""
8826
8827 The value produced is the unsigned integer quotient of the two operands.
8828
8829 Note that unsigned integer division and signed integer division are
8830 distinct operations; for signed integer division, use '``sdiv``'.
8831
8832 Division by zero is undefined behavior. For vectors, if any element
8833 of the divisor is zero, the operation has undefined behavior.
8834
8835
If the ``exact`` keyword is present, the result value of the ``udiv`` is
a :ref:`poison value <poisonvalues>` if ``op1`` is not a multiple of ``op2`` (as
such, "((a udiv exact b) mul b) == a").
8839
8840 Example:
8841 """"""""
8842
8843 .. code-block:: text
8844
8845       <result> = udiv i32 4, %var          ; yields i32:result = 4 / %var
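
A few constant cases, wrapped in a hypothetical function ``@udiv_demo``,
illustrate the ``exact`` keyword and the difference from '``sdiv``':

.. code-block:: llvm

      define void @udiv_demo() {
        %q = udiv i32 12, 4              ; 3
        %e = udiv exact i32 12, 4        ; 3: 12 is a multiple of 4
        %p = udiv exact i32 13, 4        ; poison: 13 is not a multiple of 4
        %u = udiv i8 -8, 2               ; 124: i8 -8 is 248 when read as unsigned
        %s = sdiv i8 -8, 2               ; -4: same bits, signed division
        ret void
      }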
8846
8847 .. _i_sdiv:
8848
8849 '``sdiv``' Instruction
8850 ^^^^^^^^^^^^^^^^^^^^^^
8851
8852 Syntax:
8853 """""""
8854
8855 ::
8856
8857       <result> = sdiv <ty> <op1>, <op2>         ; yields ty:result
8858       <result> = sdiv exact <ty> <op1>, <op2>   ; yields ty:result
8859
8860 Overview:
8861 """""""""
8862
8863 The '``sdiv``' instruction returns the quotient of its two operands.
8864
8865 Arguments:
8866 """"""""""
8867
8868 The two arguments to the '``sdiv``' instruction must be
8869 :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
8870 arguments must have identical types.
8871
8872 Semantics:
8873 """"""""""
8874
8875 The value produced is the signed integer quotient of the two operands
8876 rounded towards zero.
8877
8878 Note that signed integer division and unsigned integer division are
8879 distinct operations; for unsigned integer division, use '``udiv``'.
8880
8881 Division by zero is undefined behavior. For vectors, if any element
8882 of the divisor is zero, the operation has undefined behavior.
8883 Overflow also leads to undefined behavior; this is a rare case, but can
8884 occur, for example, by doing a 32-bit division of -2147483648 by -1.
8885
8886 If the ``exact`` keyword is present, the result value of the ``sdiv`` is
8887 a :ref:`poison value <poisonvalues>` if the result would be rounded.
8888
8889 Example:
8890 """"""""
8891
8892 .. code-block:: text
8893
8894       <result> = sdiv i32 4, %var          ; yields i32:result = 4 / %var
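
Rounding towards zero and the ``exact`` keyword can be checked on small
constants. The function name ``@sdiv_demo`` is hypothetical.

.. code-block:: llvm

      define void @sdiv_demo() {
        %q1 = sdiv i32 -7, 2             ; -3: rounded towards zero, not -4
        %q2 = sdiv i32 7, -2             ; -3
        %e  = sdiv exact i32 -8, 2       ; -4: no rounding needed
        %p  = sdiv exact i32 -7, 2       ; poison: the result would be rounded
        ret void
      }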
8895
8896 .. _i_fdiv:
8897
8898 '``fdiv``' Instruction
8899 ^^^^^^^^^^^^^^^^^^^^^^
8900
8901 Syntax:
8902 """""""
8903
8904 ::
8905
8906       <result> = fdiv [fast-math flags]* <ty> <op1>, <op2>   ; yields ty:result
8907
8908 Overview:
8909 """""""""
8910
8911 The '``fdiv``' instruction returns the quotient of its two operands.
8912
8913 Arguments:
8914 """"""""""
8915
8916 The two arguments to the '``fdiv``' instruction must be
8917 :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
8918 floating-point values. Both arguments must have identical types.
8919
8920 Semantics:
8921 """"""""""
8922
The value produced is the floating-point quotient of the two operands.
This instruction is assumed to execute in the default :ref:`floating-point
environment <floatenv>`.
This instruction can also take any number of :ref:`fast-math
flags <fastmath>`, which are optimization hints to enable otherwise
unsafe floating-point optimizations.
8929
8930 Example:
8931 """"""""
8932
8933 .. code-block:: text
8934
8935       <result> = fdiv float 4.0, %var          ; yields float:result = 4.0 / %var
8936
8937 .. _i_urem:
8938
8939 '``urem``' Instruction
8940 ^^^^^^^^^^^^^^^^^^^^^^
8941
8942 Syntax:
8943 """""""
8944
8945 ::
8946
8947       <result> = urem <ty> <op1>, <op2>   ; yields ty:result
8948
8949 Overview:
8950 """""""""
8951
8952 The '``urem``' instruction returns the remainder from the unsigned
8953 division of its two arguments.
8954
8955 Arguments:
8956 """"""""""
8957
8958 The two arguments to the '``urem``' instruction must be
8959 :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
8960 arguments must have identical types.
8961
8962 Semantics:
8963 """"""""""
8964
8965 This instruction returns the unsigned integer *remainder* of a division.
8966 This instruction always performs an unsigned division to get the
8967 remainder.
8968
8969 Note that unsigned integer remainder and signed integer remainder are
8970 distinct operations; for signed integer remainder, use '``srem``'.
8971
8972 Taking the remainder of a division by zero is undefined behavior.
8973 For vectors, if any element of the divisor is zero, the operation has
8974 undefined behavior.
8975
8976 Example:
8977 """"""""
8978
8979 .. code-block:: text
8980
8981       <result> = urem i32 4, %var          ; yields i32:result = 4 % %var
8982
8983 .. _i_srem:
8984
8985 '``srem``' Instruction
8986 ^^^^^^^^^^^^^^^^^^^^^^
8987
8988 Syntax:
8989 """""""
8990
8991 ::
8992
8993       <result> = srem <ty> <op1>, <op2>   ; yields ty:result
8994
8995 Overview:
8996 """""""""
8997
8998 The '``srem``' instruction returns the remainder from the signed
8999 division of its two operands. This instruction can also take
9000 :ref:`vector <t_vector>` versions of the values in which case the elements
9001 must be integers.
9002
9003 Arguments:
9004 """"""""""
9005
9006 The two arguments to the '``srem``' instruction must be
9007 :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
9008 arguments must have identical types.
9009
9010 Semantics:
9011 """"""""""
9012
9013 This instruction returns the *remainder* of a division (where the result
9014 is either zero or has the same sign as the dividend, ``op1``), not the
9015 *modulo* operator (where the result is either zero or has the same sign
9016 as the divisor, ``op2``) of a value. For more information about the
9017 difference, see `The Math
9018 Forum <http://mathforum.org/dr.math/problems/anne.4.28.99.html>`_. For a
9019 table of how this is implemented in various languages, please see
9020 `Wikipedia: modulo
9021 operation <http://en.wikipedia.org/wiki/Modulo_operation>`_.
9022
9023 Note that signed integer remainder and unsigned integer remainder are
9024 distinct operations; for unsigned integer remainder, use '``urem``'.
9025
Taking the remainder of a division by zero is undefined behavior.
For vectors, if any element of the divisor is zero, the operation has
undefined behavior.
Overflow also leads to undefined behavior; this is a rare case, but can
occur, for example, by taking the remainder of a 32-bit division of
-2147483648 by -1. (The remainder doesn't actually overflow, but this
rule lets ``srem`` be implemented using instructions that return both the
result of the division and the remainder.)
9034
9035 Example:
9036 """"""""
9037
9038 .. code-block:: text
9039
9040       <result> = srem i32 4, %var          ; yields i32:result = 4 % %var
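
The sign rule, and the contrast with '``urem``', can be seen on small
constants. The function name ``@rem_demo`` is hypothetical. In each signed
case, ``(op1 sdiv op2) * op2 + (op1 srem op2)`` equals ``op1``.

.. code-block:: llvm

      define void @rem_demo() {
        %r1 = srem i32 -7, 2             ; -1: sign follows the dividend
        %r2 = srem i32 7, -2             ;  1: sign follows the dividend
        %r3 = urem i32 -7, 2             ;  1: -7 is 4294967289 when read as unsigned
        ret void
      }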
9041
9042 .. _i_frem:
9043
9044 '``frem``' Instruction
9045 ^^^^^^^^^^^^^^^^^^^^^^
9046
9047 Syntax:
9048 """""""
9049
9050 ::
9051
9052       <result> = frem [fast-math flags]* <ty> <op1>, <op2>   ; yields ty:result
9053
9054 Overview:
9055 """""""""
9056
9057 The '``frem``' instruction returns the remainder from the division of
9058 its two operands.
9059
9060 Arguments:
9061 """"""""""
9062
9063 The two arguments to the '``frem``' instruction must be
9064 :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
9065 floating-point values. Both arguments must have identical types.
9066
9067 Semantics:
9068 """"""""""
9069
The value produced is the floating-point remainder of the two operands.
This is the same output as a libm '``fmod``' function, but without any
possibility of setting ``errno``. The remainder has the same sign as the
dividend.
This instruction is assumed to execute in the default :ref:`floating-point
environment <floatenv>`.
This instruction can also take any number of :ref:`fast-math
flags <fastmath>`, which are optimization hints to enable otherwise
unsafe floating-point optimizations.
9079
9080 Example:
9081 """"""""
9082
9083 .. code-block:: text
9084
9085       <result> = frem float 4.0, %var          ; yields float:result = 4.0 % %var
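
As with '``fmod``', the sign of the result follows the dividend. A minimal
sketch with exactly representable constants (the function name
``@frem_demo`` is hypothetical):

.. code-block:: llvm

      define void @frem_demo() {
        %r1 = frem float 5.5, 2.0        ;  1.5
        %r2 = frem float -5.5, 2.0       ; -1.5: sign follows the dividend
        %r3 = frem float 5.5, -2.0       ;  1.5: sign follows the dividend
        ret void
      }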
9086
9087 .. _bitwiseops:
9088
9089 Bitwise Binary Operations
9090 -------------------------
9091
9092 Bitwise binary operators are used to do various forms of bit-twiddling
9093 in a program. They are generally very efficient instructions and can
9094 commonly be strength reduced from other instructions. They require two
9095 operands of the same type, execute an operation on them, and produce a
9096 single value. The resulting value is the same type as its operands.
9097
9098 .. _i_shl:
9099
9100 '``shl``' Instruction
9101 ^^^^^^^^^^^^^^^^^^^^^
9102
9103 Syntax:
9104 """""""
9105
9106 ::
9107
9108       <result> = shl <ty> <op1>, <op2>           ; yields ty:result
9109       <result> = shl nuw <ty> <op1>, <op2>       ; yields ty:result
9110       <result> = shl nsw <ty> <op1>, <op2>       ; yields ty:result
9111       <result> = shl nuw nsw <ty> <op1>, <op2>   ; yields ty:result
9112
9113 Overview:
9114 """""""""
9115
9116 The '``shl``' instruction returns the first operand shifted to the left
9117 a specified number of bits.
9118
9119 Arguments:
9120 """"""""""
9121
9122 Both arguments to the '``shl``' instruction must be the same
9123 :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer type.
9124 '``op2``' is treated as an unsigned value.
9125
9126 Semantics:
9127 """"""""""
9128
9129 The value produced is ``op1`` \* 2\ :sup:`op2` mod 2\ :sup:`n`,
9130 where ``n`` is the width of the result. If ``op2`` is (statically or
9131 dynamically) equal to or larger than the number of bits in
9132 ``op1``, this instruction returns a :ref:`poison value <poisonvalues>`.
9133 If the arguments are vectors, each vector element of ``op1`` is shifted
9134 by the corresponding shift amount in ``op2``.
9135
9136 If the ``nuw`` keyword is present, then the shift produces a poison
9137 value if it shifts out any non-zero bits.
9138 If the ``nsw`` keyword is present, then the shift produces a poison
9139 value if it shifts out any bits that disagree with the resultant sign bit.
9140
9141 Example:
9142 """"""""
9143
9144 .. code-block:: text
9145
9146       <result> = shl i32 4, %var   ; yields i32: 4 << %var
9147       <result> = shl i32 4, 2      ; yields i32: 16
9148       <result> = shl i32 1, 10     ; yields i32: 1024
9149       <result> = shl i32 1, 32     ; poison (shift amount equals the bit width)
9150       <result> = shl <2 x i32> < i32 1, i32 1>, < i32 1, i32 2>   ; yields: result=<2 x i32> < i32 2, i32 4>
9151
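The ``nuw`` and ``nsw`` rules above can be illustrated on ``i8``. This is a
sketch only: ``@shl_flags_demo`` is a hypothetical function name, and the
commented results are derived from the semantics stated above.

.. code-block:: llvm

      define i8 @shl_flags_demo() {
        %a = shl i8 64, 1          ; yields i8: -128 (0x80), i.e. 128 mod 2^8
        %b = shl nuw i8 64, 1      ; yields i8: -128, the only bit shifted out is zero
        %c = shl nsw i8 64, 1      ; poison: shifted-out bit (0) disagrees with the new sign bit (1)
        %d = shl nuw i8 -128, 1    ; poison: a non-zero bit is shifted out
        %e = shl i8 1, 8           ; poison: shift amount equals the bit width
        ret i8 %a
      }
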
9152 .. _i_lshr:
9153
9154
9155 '``lshr``' Instruction
9156 ^^^^^^^^^^^^^^^^^^^^^^
9157
9158 Syntax:
9159 """""""
9160
9161 ::
9162
9163       <result> = lshr <ty> <op1>, <op2>         ; yields ty:result
9164       <result> = lshr exact <ty> <op1>, <op2>   ; yields ty:result
9165
9166 Overview:
9167 """""""""
9168
9169 The '``lshr``' instruction (logical shift right) returns the first
9170 operand shifted to the right a specified number of bits with zero fill.
9171
9172 Arguments:
9173 """"""""""
9174
9175 Both arguments to the '``lshr``' instruction must be the same
9176 :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer type.
9177 '``op2``' is treated as an unsigned value.
9178
9179 Semantics:
9180 """"""""""
9181
9182 This instruction always performs a logical shift right operation. The
9183 most significant bits of the result will be filled with zero bits after
9184 the shift. If ``op2`` is (statically or dynamically) equal to or larger
9185 than the number of bits in ``op1``, this instruction returns a :ref:`poison
9186 value <poisonvalues>`. If the arguments are vectors, each vector element
9187 of ``op1`` is shifted by the corresponding shift amount in ``op2``.
9188
9189 If the ``exact`` keyword is present, the result value of the ``lshr`` is
9190 a poison value if any of the bits shifted out are non-zero.
9191
9192 Example:
9193 """"""""
9194
9195 .. code-block:: text
9196
9197       <result> = lshr i32 4, 1   ; yields i32:result = 2
9198       <result> = lshr i32 4, 2   ; yields i32:result = 1
9199       <result> = lshr i8  4, 3   ; yields i8:result = 0
9200       <result> = lshr i8 -2, 1   ; yields i8:result = 0x7F
9201       <result> = lshr i32 1, 32  ; poison (shift amount equals the bit width)
9202       <result> = lshr <2 x i32> < i32 -2, i32 4>, < i32 1, i32 2>   ; yields: result=<2 x i32> < i32 0x7FFFFFFF, i32 1>
9203
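The effect of ``exact`` can be seen by shifting values whose low bits differ.
A minimal sketch, with ``@lshr_exact_demo`` as a hypothetical function name:

.. code-block:: llvm

      define i8 @lshr_exact_demo() {
        %a = lshr exact i8 8, 2    ; yields i8: 2, both shifted-out bits are zero
        %b = lshr exact i8 9, 2    ; poison: bit 0 of 9 is shifted out and is non-zero
        %c = lshr i8 9, 2          ; yields i8: 2, without exact the low bits are discarded
        ret i8 %a
      }
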
9204 .. _i_ashr:
9205
9206 '``ashr``' Instruction
9207 ^^^^^^^^^^^^^^^^^^^^^^
9208
9209 Syntax:
9210 """""""
9211
9212 ::
9213
9214       <result> = ashr <ty> <op1>, <op2>         ; yields ty:result
9215       <result> = ashr exact <ty> <op1>, <op2>   ; yields ty:result
9216
9217 Overview:
9218 """""""""
9219
9220 The '``ashr``' instruction (arithmetic shift right) returns the first
9221 operand shifted to the right a specified number of bits with sign
9222 extension.
9223
9224 Arguments:
9225 """"""""""
9226
9227 Both arguments to the '``ashr``' instruction must be the same
9228 :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer type.
9229 '``op2``' is treated as an unsigned value.
9230
9231 Semantics:
9232 """"""""""
9233
9234 This instruction always performs an arithmetic shift right operation.
9235 The most significant bits of the result will be filled with the sign bit
9236 of ``op1``. If ``op2`` is (statically or dynamically) equal to or larger
9237 than the number of bits in ``op1``, this instruction returns a :ref:`poison
9238 value <poisonvalues>`. If the arguments are vectors, each vector element
9239 of ``op1`` is shifted by the corresponding shift amount in ``op2``.
9240
9241 If the ``exact`` keyword is present, the result value of the ``ashr`` is
9242 a poison value if any of the bits shifted out are non-zero.
9243
9244 Example:
9245 """"""""
9246
9247 .. code-block:: text
9248
9249       <result> = ashr i32 4, 1   ; yields i32:result = 2
9250       <result> = ashr i32 4, 2   ; yields i32:result = 1
9251       <result> = ashr i8  4, 3   ; yields i8:result = 0
9252       <result> = ashr i8 -2, 1   ; yields i8:result = -1
9253       <result> = ashr i32 1, 32  ; poison (shift amount equals the bit width)
9254       <result> = ashr <2 x i32> < i32 -2, i32 4>, < i32 1, i32 3>   ; yields: result=<2 x i32> < i32 -1, i32 0>
9255
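The difference between sign fill and zero fill shows up only for negative
first operands. The following sketch contrasts ``ashr`` with ``lshr`` on the
same ``i8`` value; ``@ashr_vs_lshr_demo`` is a hypothetical function name.

.. code-block:: llvm

      define i8 @ashr_vs_lshr_demo() {
        %a = ashr i8 -16, 2          ; yields i8: -4  (0xF0 becomes 0xFC, sign bit replicated)
        %l = lshr i8 -16, 2          ; yields i8: 60  (0xF0 becomes 0x3C, zero fill)
        %e = ashr exact i8 -16, 4    ; yields i8: -1, the four shifted-out bits are zero
        %p = ashr exact i8 -15, 1    ; poison: the shifted-out low bit is 1
        ret i8 %a
      }
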
9256 .. _i_and:
9257
9258 '``and``' Instruction
9259 ^^^^^^^^^^^^^^^^^^^^^
9260
9261 Syntax:
9262 """""""
9263
9264 ::
9265
9266       <result> = and <ty> <op1>, <op2>   ; yields ty:result
9267
9268 Overview:
9269 """""""""
9270
9271 The '``and``' instruction returns the bitwise logical and of its two
9272 operands.
9273
9274 Arguments:
9275 """"""""""
9276
9277 The two arguments to the '``and``' instruction must be
9278 :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
9279 arguments must have identical types.
9280
9281 Semantics:
9282 """"""""""
9283
9284 The truth table used for the '``and``' instruction is:
9285
9286 +-----+-----+-----+
9287 | In0 | In1 | Out |
9288 +-----+-----+-----+
9289 |   0 |   0 |   0 |
9290 +-----+-----+-----+
9291 |   0 |   1 |   0 |
9292 +-----+-----+-----+
9293 |   1 |   0 |   0 |
9294 +-----+-----+-----+
9295 |   1 |   1 |   1 |
9296 +-----+-----+-----+
9297
9298 Example:
9299 """"""""
9300
9301 .. code-block:: text
9302
9303       <result> = and i32 4, %var         ; yields i32:result = 4 & %var
9304       <result> = and i32 15, 40          ; yields i32:result = 8
9305       <result> = and i32 4, 8            ; yields i32:result = 0
9306
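The truth table is applied to each bit position independently, which makes
``and`` the usual way to mask bits. A minimal sketch, assuming hypothetical
function names ``@low_byte`` and ``@and_lanes``:

.. code-block:: llvm

      define i32 @low_byte(i32 %x) {
        %lo = and i32 %x, 255                              ; keeps bits 0..7, clears bits 8..31
        ret i32 %lo
      }

      define <2 x i8> @and_lanes() {
        %r = and <2 x i8> <i8 12, i8 10>, <i8 10, i8 6>    ; yields <2 x i8> <i8 8, i8 2>
        ret <2 x i8> %r
      }
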
9307 .. _i_or:
9308
9309 '``or``' Instruction
9310 ^^^^^^^^^^^^^^^^^^^^
9311
9312 Syntax:
9313 """""""
9314
9315 ::
9316
9317       <result> = or <ty> <op1>, <op2>   ; yields ty:result
9318
9319 Overview:
9320 """""""""
9321
9322 The '``or``' instruction returns the bitwise logical inclusive or of its
9323 two operands.
9324
9325 Arguments:
9326 """"""""""
9327
9328 The two arguments to the '``or``' instruction must be
9329 :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
9330 arguments must have identical types.
9331
9332 Semantics:
9333 """"""""""
9334
9335 The truth table used for the '``or``' instruction is:
9336
9337 +-----+-----+-----+
9338 | In0 | In1 | Out |
9339 +-----+-----+-----+
9340 |   0 |   0 |   0 |
9341 +-----+-----+-----+
9342 |   0 |   1 |   1 |
9343 +-----+-----+-----+
9344 |   1 |   0 |   1 |
9345 +-----+-----+-----+
9346 |   1 |   1 |   1 |
9347 +-----+-----+-----+
9348
9349 Example:
9350 """"""""
9351
9352 .. code-block:: text
9353
9354       <result> = or i32 4, %var         ; yields i32:result = 4 | %var
9355       <result> = or i32 15, 40          ; yields i32:result = 47
9356       <result> = or i32 4, 8            ; yields i32:result = 12
9357
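Because ``or`` sets a result bit whenever either input bit is set, it is a
common way to combine fields that occupy disjoint bit ranges. The sketch below
packs two 16-bit quantities into one ``i32``; ``@pack_halves`` and its value
names are illustrative only.

.. code-block:: llvm

      define i32 @pack_halves(i32 %hi, i32 %lo) {
        %hi.shifted = shl i32 %hi, 16                  ; move %hi into bits 16..31
        %lo.masked  = and i32 %lo, 65535               ; keep only bits 0..15 of %lo
        %packed     = or i32 %hi.shifted, %lo.masked   ; the two ranges do not overlap
        ret i32 %packed
      }
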
9358 .. _i_xor:
9359
9360 '``xor``' Instruction
9361 ^^^^^^^^^^^^^^^^^^^^^
9362
9363 Syntax:
9364 """""""
9365
9366 ::
9367
9368       <result> = xor <ty> <op1>, <op2>   ; yields ty:result
9369
9370 Overview:
9371 """""""""
9372
9373 The '``xor``' instruction returns the bitwise logical exclusive or of
9374 its two operands. The ``xor`` is used to implement the "one's
9375 complement" operation, which is the "~" operator in C.
9376
9377 Arguments:
9378 """"""""""
9379
9380 The two arguments to the '``xor``' instruction must be
9381 :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
9382 arguments must have identical types.
9383
9384 Semantics:
9385 """"""""""
9386
9387 The truth table used for the '``xor``' instruction is:
9388
9389 +-----+-----+-----+
9390 | In0 | In1 | Out |
9391 +-----+-----+-----+
9392 |   0 |   0 |   0 |
9393 +-----+-----+-----+
9394 |   0 |   1 |   1 |
9395 +-----+-----+-----+
9396 |   1 |   0 |   1 |
9397 +-----+-----+-----+
9398 |   1 |   1 |   0 |
9399 +-----+-----+-----+
9400
9401 Example:
9402 """"""""
9403
9404 .. code-block:: text
9405
9406       <result> = xor i32 4, %var         ; yields i32:result = 4 ^ %var
9407       <result> = xor i32 15, 40          ; yields i32:result = 39
9408       <result> = xor i32 4, 8            ; yields i32:result = 12
9409       <result> = xor i32 %V, -1          ; yields i32:result = ~%V
9410
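The complement idiom mentioned in the overview extends to ``i1`` and to
vectors: xor against an all-ones constant flips every bit. A minimal sketch,
with hypothetical function names:

.. code-block:: llvm

      define i1 @logical_not(i1 %b) {
        %not = xor i1 %b, true                                        ; flips the single bit
        ret i1 %not
      }

      define <4 x i32> @vector_not(<4 x i32> %v) {
        %not = xor <4 x i32> %v, <i32 -1, i32 -1, i32 -1, i32 -1>     ; ~%v in every lane
        ret <4 x i32> %not
      }
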
9411 Vector Operations
9412 -----------------
9413
9414 LLVM supports several instructions to represent vector operations in a
9415 target-independent manner. These instructions cover the element-access
9416 and vector-specific operations needed to process vectors effectively.
9417 While LLVM does directly support these vector operations, many
9418 sophisticated algorithms will want to use target-specific intrinsics to
9419 take full advantage of a specific target.
9420
9421 .. _i_extractelement:
9422
9423 '``extractelement``' Instruction
9424 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
9425
9426 Syntax:
9427 """""""
9428
9429 ::
9430
9431       <result> = extractelement <n x <ty>> <val>, <ty2> <idx>  ; yields <ty>
9432       <result> = extractelement <vscale x n x <ty>> <val>, <ty2> <idx> ; yields <ty>
9433
9434 Overview:
9435 """""""""
9436
9437 The '``extractelement``' instruction extracts a single scalar element
9438 from a vector at a specified index.
9439
9440 Arguments:
9441 """"""""""
9442
9443 The first operand of an '``extractelement``' instruction is a value of
9444 :ref:`vector <t_vector>` type. The second operand is an index indicating
9445 the position from which to extract the element. The index may be a
9446 variable of any integer type.
9447
9448 Semantics:
9449 """"""""""
9450
9451 The result is a scalar of the same type as the element type of ``val``.
9452 Its value is the value at position ``idx`` of ``val``. If ``idx`` is
9453 out of bounds (equal to or greater than the length of a fixed-length
9454 ``val``), the result is a :ref:`poison value <poisonvalues>`. For a
9455 scalable vector, the bound is the runtime length of the vector; an
9456 ``idx`` at or beyond it also yields a poison value.
9457
9458 Example:
9459 """"""""
9460
9461 .. code-block:: text
9462
9463       <result> = extractelement <4 x i32> %vec, i32 0    ; yields i32
9464
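The index need not be a constant, and its integer type is independent of the
element type. The following sketch shows a constant index, a variable ``i64``
index, and a scalable vector; the function names are hypothetical.

.. code-block:: llvm

      define i32 @get_lane(<4 x i32> %vec, i64 %i) {
        %first = extractelement <4 x i32> %vec, i32 0     ; lane 0, constant i32 index
        %any   = extractelement <4 x i32> %vec, i64 %i    ; lane %i, poison if %i is out of bounds
        %sum   = add i32 %first, %any
        ret i32 %sum
      }

      define float @get_lane_scalable(<vscale x 4 x float> %vec, i32 %i) {
        ; The bound here is the runtime length of the vector.
        %elt = extractelement <vscale x 4 x float> %vec, i32 %i
        ret float %elt
      }
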
9465 .. _i_insertelement:
9466
9467 '``insertelement``' Instruction
9468 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
9469
9470 Syntax:
9471 """""""
9472
9473 ::
9474
9475       <result> = insertelement <n x <ty>> <val>, <ty> <elt>, <ty2> <idx>    ; yields <n x <ty>>
9476       <result> = insertelement <vscale x n x <ty>> <val>, <ty> <elt>, <ty2> <idx> ; yields <vscale x n x <ty>>
9477
9478 Overview:
9479 """""""""
9480
9481 The '``insertelement``' instruction inserts a scalar element into a
9482 vector at a specified index.
9483
9484 Arguments:
9485 """"""""""
9486
9487 The first operand of an '``insertelement``' instruction is a value of
9488 :ref:`vector <t_vector>` type. The second operand is a scalar value whose
9489 type must equal the element type of the first operand. The third operand
9490 is an index indicating the position at which to insert the value. The
9491 index may be a variable of any integer type.
9492
9493 Semantics:
9494 """"""""""
9495
9496 The result is a vector of the same type as ``val``. Its element values
9497 are those of ``val`` except at position ``idx``, where it gets the value
9498 ``elt``. If ``idx`` is out of bounds (equal to or greater than the length
9499 of a fixed-length ``val``), the result is a :ref:`poison value
9500 <poisonvalues>`. For a scalable vector, the bound is the runtime length of
9501 the vector; an ``idx`` at or beyond it also yields a poison value.
9502
9503 Example:
9504 """"""""
9505
9506 .. code-block:: text
9507
9508       <result> = insertelement <4 x i32> %vec, i32 1, i32 0    ; yields <4 x i32>
9509
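Since each ``insertelement`` yields a new vector value, a vector is typically
built from scalars by chaining the instruction. A minimal sketch, with
``@make_pair`` as a hypothetical function name:

.. code-block:: llvm

      define <2 x float> @make_pair(float %x, float %y) {
        %v0 = insertelement <2 x float> undef, float %x, i32 0    ; <%x, undef>
        %v1 = insertelement <2 x float> %v0, float %y, i32 1      ; <%x, %y>
        ret <2 x float> %v1
      }
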
9510 .. _i_shufflevector:
9511
9512 '``shufflevector``' Instruction
9513 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
9514
9515 Syntax:
9516 """""""
9517
9518 ::
9519
9520       <result> = shufflevector <n x <ty>> <v1>, <n x <ty>> <v2>, <m x i32> <mask>    ; yields <m x <ty>>
9521       <result> = shufflevector <vscale x n x <ty>> <v1>, <vscale x n x <ty>> <v2>, <vscale x m x i32> <mask>  ; yields <vscale x m x <ty>>
9522
9523 Overview:
9524 """""""""
9525
9526 The '``shufflevector``' instruction constructs a permutation of elements
9527 from two input vectors, returning a vector with the same element type as
9528 the input and length that is the same as the shuffle mask.
9529
9530 Arguments:
9531 """"""""""
9532
9533 The first two operands of a '``shufflevector``' instruction are vectors
9534 with the same type. The third argument is a shuffle mask vector constant
9535 whose element type is ``i32``. The mask vector elements must be constant
9536 integers or ``undef`` values. The result of the instruction is a vector
9537 whose length is the same as the shuffle mask and whose element type is the
9538 same as the element type of the first two operands.
9539
9540 Semantics:
9541 """"""""""
9542
9543 The elements of the two input vectors are numbered from left to right
9544 across both of the vectors. For each element of the result vector, the
9545 shuffle mask selects an element from one of the input vectors to copy
9546 to the result. Non-negative elements in the mask represent an index
9547 into the concatenated pair of input vectors.
9548
9549 If the shuffle mask is undefined, the result vector is undefined. If
9550 the shuffle mask selects an undefined element from one of the input
9551 vectors, the resulting element is undefined. An undefined element
9552 in the mask vector specifies that the resulting element is undefined.
9553 An undefined element in the mask vector prevents a poisoned vector
9554 element from propagating.
9555
9556 For scalable vectors, the only valid mask values at present are
9557 ``zeroinitializer`` and ``undef``, since we cannot write all indices as
9558 literals for a vector with a length unknown at compile time.
9559
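A ``zeroinitializer`` mask selects element 0 of the first input for every
result element, which is how a scalar is broadcast across a scalable vector.
The sketch below assumes a hypothetical function name ``@splat_scalable``.

.. code-block:: llvm

      define <vscale x 4 x i32> @splat_scalable(i32 %x) {
        %ins   = insertelement <vscale x 4 x i32> undef, i32 %x, i32 0
        %splat = shufflevector <vscale x 4 x i32> %ins, <vscale x 4 x i32> undef,
                               <vscale x 4 x i32> zeroinitializer
        ret <vscale x 4 x i32> %splat
      }
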
9560 Example:
9561 """"""""
9562
9563 .. code-block:: text
9564
9565       <result> = shufflevector <4 x i32> %v1, <4 x i32> %v2,
9566                               <4 x i32> <i32 0, i32 4, i32 1, i32 5>  ; yields <4 x i32>
9567       <result> = shufflevector <4 x i32> %v1, <4 x i32> undef,
9568                               <4 x i32> <i32 0, i32 1, i32 2, i32 3>  ; yields <4 x i32> - Identity shuffle.
9569       <result> = shufflevector <8 x i32> %v1, <8 x i32> undef,
9570                               <4 x i32> <i32 0, i32 1, i32 2, i32 3>  ; yields <4 x i32>
9571       <result> = shufflevector <4 x i32> %v1, <4 x i32> %v2,
9572                               <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7>  ; yields <8 x i32>
9573
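The numbering of mask indices across the concatenated inputs is easiest to
follow with constant operands. In this sketch the function names are
hypothetical, and the commented result is worked out from the indexing rule in
the semantics.

.. code-block:: llvm

      define <4 x i32> @interleave_low() {
        ; Indices 0..1 name the elements of <10, 11>; indices 2..3 name those of <20, 21>.
        %r = shufflevector <2 x i32> <i32 10, i32 11>, <2 x i32> <i32 20, i32 21>,
                           <4 x i32> <i32 0, i32 2, i32 1, i32 3>    ; yields <4 x i32> <i32 10, i32 20, i32 11, i32 21>
        ret <4 x i32> %r
      }

      define <4 x i32> @reverse(<4 x i32> %v) {
        %r = shufflevector <4 x i32> %v, <4 x i32> undef,
                           <4 x i32> <i32 3, i32 2, i32 1, i32 0>    ; elements of %v in reverse order
        ret <4 x i32> %r
      }
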
9574 Aggregate Operations
9575 --------------------
9576
9577 LLVM supports several instructions for working with
9578 :ref:`aggregate <t_aggregate>` values.
9579
9580 .. _i_extractvalue:
9581
9582 '``extractvalue``' Instruction
9583 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
9584
9585 Syntax:
9586 """""""
9587
9588 ::
9589
9590       <result> = extractvalue <aggregate type> <val>, <idx>{, <idx>}*
9591
9592 Overview:
9593 """""""""
9594
9595 The '``extractvalue``' instruction extracts the value of a member field
9596 from an :ref:`aggregate <t_aggregate>` value.
9597
9598 Arguments:
9599 """"""""""
9600
9601 The first operand of an '``extractvalue``' instruction is a value of
9602 :ref:`struct <t_struct>` or :ref:`array <t_array>` type. The other operands are
9603 constant indices to specify which value to extract in a similar manner
9604 as indices in a '``getelementptr``' instruction.
9605
9606 The major differences to ``getelementptr`` indexing are:
9607
9608 -  Since the value being indexed is not a pointer, the first index is
9609    omitted and assumed to be zero.
9610 -  At least one index must be specified.
9611 -  Not only struct indices but also array indices must be in bounds.
9612
9613 Semantics:
9614 """"""""""
9615
9616 The result is the value at the position in the aggregate specified by
9617 the index operands.
9618
9619 Example:
9620 """"""""
9621
9622 .. code-block:: text
9623
9624       <result> = extractvalue {i32, float} %agg, 0    ; yields i32
9625
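The omitted first index is the main practical difference from
``getelementptr``. The sketch below walks the same path through a nested
aggregate, once by value and once by pointer; the function names are
hypothetical.

.. code-block:: llvm

      define float @field_from_value({i32, [2 x float]} %agg) {
        %f = extractvalue {i32, [2 x float]} %agg, 1, 0      ; field 1, then array element 0
        ret float %f
      }

      define float* @field_from_pointer({i32, [2 x float]}* %p) {
        ; The same path through getelementptr needs the extra leading index.
        %addr = getelementptr {i32, [2 x float]}, {i32, [2 x float]}* %p, i32 0, i32 1, i32 0
        ret float* %addr
      }
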
9626 .. _i_insertvalue:
9627
9628 '``insertvalue``' Instruction
9629 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
9630
9631 Syntax:
9632 """""""
9633
9634 ::
9635
9636       <result> = insertvalue <aggregate type> <val>, <ty> <elt>, <idx>{, <idx>}*    ; yields <aggregate type>
9637
9638 Overview:
9639 """""""""
9640
9641 The '``insertvalue``' instruction inserts a value into a member field in
9642 an :ref:`aggregate <t_aggregate>` value.
9643
9644 Arguments:
9645 """"""""""
9646
9647 The first operand of an '``insertvalue``' instruction is a value of
9648 :ref:`struct <t_struct>` or :ref:`array <t_array>` type. The second operand is
9649 a first-class value to insert. The following operands are constant
9650 indices indicating the position at which to insert the value in a
9651 similar manner as indices in an '``extractvalue``' instruction. The value
9652 to insert must have the same type as the value identified by the
9653 indices.
9654
9655 Semantics:
9656 """"""""""
9657
9658 The result is an aggregate of the same type as ``val``. Its value is
9659 that of ``val`` except that the value at the position specified by the
9660 indices is that of ``elt``.
9661
9662 Example:
9663 """"""""
9664
9665 .. code-block:: llvm
9666
9667       %agg1 = insertvalue {i32, float} undef, i32 1, 0              ; yields {i32 1, float undef}
9668       %agg2 = insertvalue {i32, float} %agg1, float %val, 1         ; yields {i32 1, float %val}
9669       %agg3 = insertvalue {i32, {float}} undef, float %val, 1, 0    ; yields {i32 undef, {float %val}}
9670
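A common use of ``insertvalue`` is to assemble a struct that a function
returns by value. This is a minimal sketch; ``@inc_with_flag`` is a
hypothetical function that returns ``%x + 1`` together with a flag telling
whether ``%x`` was the largest signed ``i32``.

.. code-block:: llvm

      define {i32, i1} @inc_with_flag(i32 %x) {
        %sum = add i32 %x, 1
        %max = icmp eq i32 %x, 2147483647
        %r0  = insertvalue {i32, i1} undef, i32 %sum, 0     ; {%sum, undef}
        %r1  = insertvalue {i32, i1} %r0, i1 %max, 1        ; {%sum, %max}
        ret {i32, i1} %r1
      }
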
9671 .. _memoryops:
9672
9673 Memory Access and Addressing Operations
9674 ---------------------------------------
9675
9676 A key design point of an SSA-based representation is how it represents
9677 memory. In LLVM, no memory locations are in SSA form, which makes things
9678 very simple. This section describes how to read, write, and allocate
9679 memory in LLVM.
9680
9681 .. _i_alloca:
9682
9683 '``alloca``' Instruction
9684 ^^^^^^^^^^^^^^^^^^^^^^^^
9685
9686 Syntax:
9687 """""""
9688
9689 ::
9690
9691       <result> = alloca [inalloca] <type> [, <ty> <NumElements>] [, align <alignment>] [, addrspace(<num>)]     ; yields type addrspace(num)*:result
9692
9693 Overview:
9694 """""""""
9695
9696 The '``alloca``' instruction allocates memory on the stack frame of the
9697 currently executing function, to be automatically released when this
9698 function returns to its caller.  If the address space is not explicitly
9699 specified, the object is allocated in the alloca address space from the
9700 :ref:`datalayout string<langref_datalayout>`.
9701
9702 Arguments:
9703 """"""""""
9704
9705 The '``alloca``' instruction allocates ``sizeof(<type>)*NumElements``
9706 bytes of memory on the runtime stack, returning a pointer of the
9707 appropriate type to the program. If ``NumElements`` is specified, it is
9708 the number of elements allocated; otherwise ``NumElements`` defaults
9709 to one. If a constant alignment is specified, the result of the
9710 allocation is guaranteed to be aligned to at least that boundary. The
9711 alignment may not be greater than ``1 << 29``. If not specified, or if
9712 zero, the target can choose to align the allocation on any convenient
9713 boundary compatible with the type.
9714
9715 '``type``' may be any sized type.
9716
9717 Semantics:
9718 """"""""""
9719
9720 Memory is allocated; a pointer is returned. The allocated memory is
9721 uninitialized, and loading from uninitialized memory produces an undefined
9722 value. The operation itself is undefined if there is insufficient stack
9723 space for the allocation. '``alloca``'d memory is automatically released
9724 when the function returns. The '``alloca``' instruction is commonly used
9725 to represent automatic variables that must have an address available. When
9726 the function returns (either with the ``ret`` or ``resume`` instructions),
9727 the memory is reclaimed. Allocating zero bytes is legal, but the returned
9728 pointer may not be unique. The order in which memory is allocated (i.e.,
9729 which way the stack grows) is not specified.
9730
9731 Note that '``alloca``' outside of the alloca address space from the
9732 :ref:`datalayout string<langref_datalayout>` is meaningful only if the
9733 target has assigned it a semantics.
9734
9735 If the returned pointer is used by :ref:`llvm.lifetime.start <int_lifestart>`,
9736 the returned object is initially dead.
9737 See :ref:`llvm.lifetime.start <int_lifestart>` and
9738 :ref:`llvm.lifetime.end <int_lifeend>` for the precise semantics of
9739 lifetime-manipulating intrinsics.
9740
9741 Example:
9742 """"""""
9743
9744 .. code-block:: llvm
9745
9746       %ptr = alloca i32                             ; yields i32*:ptr
9747       %ptr = alloca i32, i32 4                      ; yields i32*:ptr
9748       %ptr = alloca i32, i32 4, align 1024          ; yields i32*:ptr
9749       %ptr = alloca i32, align 1024                 ; yields i32*:ptr
9750
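The fragments above can be combined into a complete function. The following
sketch, using the typed-pointer syntax of this section and a hypothetical
function name ``@alloca_demo``, allocates a four-element array, writes its last
element, and reads it back.

.. code-block:: llvm

      define i32 @alloca_demo(i32 %x) {
        %one  = alloca i32                              ; a single i32, NumElements defaults to one
        %four = alloca i32, i32 4, align 16             ; 4 * sizeof(i32) bytes, at least 16-byte aligned
        %last = getelementptr i32, i32* %four, i32 3    ; address of element 3
        store i32 %x, i32* %last
        %v = load i32, i32* %last                       ; yields %x
        ret i32 %v                                      ; both allocations are released here
      }
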
9751 .. _i_load:
9752
9753 '``load``' Instruction
9754 ^^^^^^^^^^^^^^^^^^^^^^
9755
9756 Syntax:
9757 """""""
9758
9759 ::
9760
9761       <result> = load [volatile] <ty>, <ty>* <pointer>[, align <alignment>][, !nontemporal !<nontemp_node>][, !invariant.load !<empty_node>][, !invariant.group !<empty_node>][, !nonnull !<empty_node>][, !dereferenceable !<deref_bytes_node>][, !dereferenceable_or_null !<deref_bytes_node>][, !align !<align_node>][, !noundef !<empty_node>]
9762       <result> = load atomic [volatile] <ty>, <ty>* <pointer> [syncscope("<target-scope>")] <ordering>, align <alignment> [, !invariant.group !<empty_node>]
9763       !<nontemp_node> = !{ i32 1 }
9764       !<empty_node> = !{}
9765       !<deref_bytes_node> = !{ i64 <dereferenceable_bytes> }
9766       !<align_node> = !{ i64 <value_alignment> }
9767
9768 Overview:
9769 """""""""
9770
9771 The '``load``' instruction is used to read from memory.
9772
9773 Arguments:
9774 """"""""""
9775
9776 The argument to the ``load`` instruction specifies the memory address from which
9777 to load. The type specified must be a :ref:`first class <t_firstclass>` type of
9778 known size (i.e. not containing an :ref:`opaque structural type <t_opaque>`). If
9779 the ``load`` is marked as ``volatile``, then the optimizer is not allowed to
9780 modify the number or order of execution of this ``load`` with other
9781 :ref:`volatile operations <volatile>`.
9782
9783 If the ``load`` is marked as ``atomic``, it takes an extra :ref:`ordering
9784 <ordering>` and optional ``syncscope("<target-scope>")`` argument. The
9785 ``release`` and ``acq_rel`` orderings are not valid on ``load`` instructions.
9786 Atomic loads produce :ref:`defined <memmodel>` results when they may see
9787 multiple atomic stores. The type of the pointee must be an integer, pointer, or
9788 floating-point type whose bit width is a power of two greater than or equal to
9789 eight and less than or equal to a target-specific size limit.  ``align`` must be
9790 explicitly specified on atomic loads, and the load has undefined behavior if the
9791 alignment is not set to a value which is at least the size in bytes of the
9792 pointee. ``!nontemporal`` does not have any defined semantics for atomic loads.
9793
9794 The optional constant ``align`` argument specifies the alignment of the
9795 operation (that is, the alignment of the memory address). A value of 0
9796 or an omitted ``align`` argument means that the operation has the ABI
9797 alignment for the target. It is the responsibility of the code emitter
9798 to ensure that the alignment information is correct. Overestimating the
9799 alignment results in undefined behavior. Underestimating the alignment
9800 may produce less efficient code. An alignment of 1 is always safe. The
9801 maximum possible alignment is ``1 << 29``. An alignment value higher
9802 than the size of the loaded type implies memory up to the alignment
9803 value bytes can be safely loaded without trapping in the default
9804 address space. Access of the high bytes can interfere with debugging
9805 tools, so they should not be accessed if the function has the
9806 ``sanitize_thread`` or ``sanitize_address`` attributes.
9807
9808 The optional ``!nontemporal`` metadata must reference a single
9809 metadata name ``<nontemp_node>`` corresponding to a metadata node with one
9810 ``i32`` entry of value 1. The existence of the ``!nontemporal``
9811 metadata on the instruction tells the optimizer and code generator
9812 that this load is not expected to be reused in the cache. The code
9813 generator may select special instructions to save cache bandwidth, such
9814 as the ``MOVNT`` instruction on x86.
9815
9816 The optional ``!invariant.load`` metadata must reference a single
9817 metadata name ``<empty_node>`` corresponding to a metadata node with no
9818 entries. If a load instruction tagged with the ``!invariant.load``
9819 metadata is executed, the memory location referenced by the load has
9820 to contain the same value at all points in the program where the
9821 memory location is dereferenceable; otherwise, the behavior is
9822 undefined.
9823
9824 The optional ``!invariant.group`` metadata must reference a single metadata name
9825 ``<empty_node>`` corresponding to a metadata node with no entries.
9826 See the :ref:`invariant.group <md_invariant.group>` metadata.
9827
9828 The optional ``!nonnull`` metadata must reference a single
9829 metadata name ``<empty_node>`` corresponding to a metadata node with no
9830 entries. The existence of the ``!nonnull`` metadata on the
9831 instruction tells the optimizer that the value loaded is known to
9832 never be null. If the value is null at runtime, the behavior is undefined.
9833 This is analogous to the ``nonnull`` attribute on parameters and return
9834 values. This metadata can only be applied to loads of a pointer type.
9835
9836 The optional ``!dereferenceable`` metadata must reference a single metadata
9837 name ``<deref_bytes_node>`` corresponding to a metadata node with one ``i64``
9838 entry.
9839 See the :ref:`dereferenceable <md_dereferenceable>` metadata.
9840
9841 The optional ``!dereferenceable_or_null`` metadata must reference a single
9842 metadata name ``<deref_bytes_node>`` corresponding to a metadata node with one
9843 ``i64`` entry.
9844 See the :ref:`dereferenceable_or_null
9845 <md_dereferenceable_or_null>` metadata.
9846
9847 The optional ``!align`` metadata must reference a single metadata name
9848 ``<align_node>`` corresponding to a metadata node with one ``i64`` entry.
9849 The existence of the ``!align`` metadata on the instruction tells the
9850 optimizer that the value loaded is known to be aligned to a boundary specified
9851 by the integer value in the metadata node. The alignment must be a power of 2.
9852 This is analogous to the ``align`` attribute on parameters and return values.
9853 This metadata can only be applied to loads of a pointer type. If the returned
9854 value is not appropriately aligned at runtime, the behavior is undefined.
9855
9856 The optional ``!noundef`` metadata must reference a single metadata name
9857 ``<empty_node>`` corresponding to a node with no entries. The existence of
9858 ``!noundef`` metadata on the instruction tells the optimizer that the value
9859 loaded is known to be :ref:`well defined <welldefinedvalues>`.
9860 If the value isn't well defined, the behavior is undefined.
9861
9862 Semantics:
9863 """"""""""
9864
9865 The location of memory pointed to is loaded. If the value being loaded
9866 is of scalar type then the number of bytes read does not exceed the
9867 minimum number of bytes needed to hold all bits of the type. For
9868 example, loading an ``i24`` reads at most three bytes. When loading a
9869 value of a type like ``i20`` with a size that is not an integral number
9870 of bytes, the result is undefined if the value was not originally
9871 written using a store of the same type.
9872 If the value being loaded is of aggregate type, the bytes that correspond to
9873 padding may be accessed but are ignored, because it is impossible to observe
9874 padding from the loaded aggregate value.
9875 If ``<pointer>`` is not a well-defined value, the behavior is undefined.
9876
9877 Examples:
9878 """""""""
9879
9880 .. code-block:: llvm
9881
9882       %ptr = alloca i32                               ; yields i32*:ptr
9883       store i32 3, i32* %ptr                          ; yields void
9884       %val = load i32, i32* %ptr                      ; yields i32:val = i32 3
9885
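The optional parts of the syntax can be combined as in the sketch below. It
uses the typed-pointer syntax of this section; ``@load_forms`` and the
metadata numbering are illustrative, and the byte counts in ``!1`` and ``!2``
are assumptions chosen to match an ``i32`` pointee.

.. code-block:: llvm

      define i32* @load_forms(i32** %pp, i32* %p) {
        %a = load atomic i32, i32* %p acquire, align 4    ; ordering precedes the mandatory align
        %v = load volatile i32, i32* %p, align 4          ; not reordered with other volatile operations
        ; Metadata describing the loaded pointer value; valid only on loads of pointer type.
        %q = load i32*, i32** %pp, align 8, !nonnull !0, !dereferenceable !1, !align !2
        ret i32* %q
      }

      !0 = !{}
      !1 = !{i64 4}
      !2 = !{i64 4}
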
9886 .. _i_store:
9887
9888 '``store``' Instruction
9889 ^^^^^^^^^^^^^^^^^^^^^^^
9890
9891 Syntax:
9892 """""""
9893
9894 ::
9895
9896       store [volatile] <ty> <value>, <ty>* <pointer>[, align <alignment>][, !nontemporal !<nontemp_node>][, !invariant.group !<empty_node>]        ; yields void
9897       store atomic [volatile] <ty> <value>, <ty>* <pointer> [syncscope("<target-scope>")] <ordering>, align <alignment> [, !invariant.group !<empty_node>] ; yields void
9898       !<nontemp_node> = !{ i32 1 }
9899       !<empty_node> = !{}
9900
9901 Overview:
9902 """""""""
9903
9904 The '``store``' instruction is used to write to memory.
9905
9906 Arguments:
9907 """"""""""
9908
9909 There are two arguments to the ``store`` instruction: a value to store and an
9910 address at which to store it. The type of the ``<pointer>`` operand must be a
9911 pointer to the :ref:`first class <t_firstclass>` type of the ``<value>``
9912 operand. If the ``store`` is marked as ``volatile``, then the optimizer is not
9913 allowed to modify the number or order of execution of this ``store`` with other
9914 :ref:`volatile operations <volatile>`.  Only values of :ref:`first class
9915 <t_firstclass>` types of known size (i.e. not containing an :ref:`opaque
9916 structural type <t_opaque>`) can be stored.
9917
9918 If the ``store`` is marked as ``atomic``, it takes an extra :ref:`ordering
9919 <ordering>` and optional ``syncscope("<target-scope>")`` argument. The
9920 ``acquire`` and ``acq_rel`` orderings aren't valid on ``store`` instructions.
9921 Atomic loads produce :ref:`defined <memmodel>` results when they may see
9922 multiple atomic stores. The type of the pointee must be an integer, pointer, or
9923 floating-point type whose bit width is a power of two greater than or equal to
9924 eight and less than or equal to a target-specific size limit.  ``align`` must be
9925 explicitly specified on atomic stores, and the store has undefined behavior if
9926 the alignment is not set to a value which is at least the size in bytes of the
9927 pointee. ``!nontemporal`` does not have any defined semantics for atomic stores.
9928
9929 The optional constant ``align`` argument specifies the alignment of the
9930 operation (that is, the alignment of the memory address). A value of 0
9931 or an omitted ``align`` argument means that the operation has the ABI
9932 alignment for the target. It is the responsibility of the code emitter
9933 to ensure that the alignment information is correct. Overestimating the
9934 alignment results in undefined behavior. Underestimating the
9935 alignment may produce less efficient code. An alignment of 1 is always
9936 safe. The maximum possible alignment is ``1 << 29``. An alignment
9937 value higher than the size of the stored type implies memory up to the
9938 alignment value bytes can be stored to without trapping in the default
9939 address space. Storing to the higher bytes however may result in data
9940 races if another thread can access the same address. Introducing a
9941 data race is not allowed. Storing to the extra bytes is not allowed
9942 even in situations where a data race is known to not exist if the
9943 function has the ``sanitize_address`` attribute.
9944
9945 The optional ``!nontemporal`` metadata must reference a single metadata
9946 name ``<nontemp_node>`` corresponding to a metadata node with one ``i32`` entry
9947 of value 1. The existence of the ``!nontemporal`` metadata on the instruction
9948 tells the optimizer and code generator that this store is not expected to
9949 be reused in the cache. The code generator may select special
9950 instructions to save cache bandwidth, such as the ``MOVNT`` instruction on
9951 x86.
9952
9953 The optional ``!invariant.group`` metadata must reference a
9954 single metadata name ``<empty_node>``. See ``invariant.group`` metadata.
9955
9956 Semantics:
9957 """"""""""
9958
9959 The contents of memory are updated to contain ``<value>`` at the
9960 location specified by the ``<pointer>`` operand. If ``<value>`` is
9961 of scalar type then the number of bytes written does not exceed the
9962 minimum number of bytes needed to hold all bits of the type. For
9963 example, storing an ``i24`` writes at most three bytes. When writing a
9964 value of a type like ``i20`` with a size that is not an integral number
9965 of bytes, it is unspecified what happens to the extra bits that do not
9966 belong to the type, but they will typically be overwritten.
9967 If ``<value>`` is of aggregate type, padding is filled with
9968 :ref:`undef <undefvalues>`.
9969 If ``<pointer>`` is not a well-defined value, the behavior is undefined.
9970
9971 Example:
9972 """"""""
9973
9974 .. code-block:: llvm
9975
9976       %ptr = alloca i32                               ; yields i32*:ptr
9977       store i32 3, i32* %ptr                          ; yields void
9978       %val = load i32, i32* %ptr                      ; yields i32:val = i32 3
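
The optional markers described above can be combined with the basic form. The
following is a minimal sketch; the function name ``@store_examples`` and its
parameters are hypothetical and chosen only for illustration.

.. code-block:: llvm

    define void @store_examples(i32* %p, float* %q) {
      ; Atomic store: an ordering and an explicit alignment of at least the
      ; size of the stored type (4 bytes for i32) are given.
      store atomic i32 7, i32* %p release, align 4
      ; Volatile store: not reordered with other volatile operations.
      store volatile i32 0, i32* %p, align 4
      ; Non-temporal store: !0 is a metadata node with one i32 entry of value 1.
      store float 1.0, float* %q, align 4, !nontemporal !0
      ret void
    }

    !0 = !{i32 1}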
9979
9980 .. _i_fence:
9981
9982 '``fence``' Instruction
9983 ^^^^^^^^^^^^^^^^^^^^^^^
9984
9985 Syntax:
9986 """""""
9987
9988 ::
9989
9990       fence [syncscope("<target-scope>")] <ordering>  ; yields void
9991
9992 Overview:
9993 """""""""
9994
9995 The '``fence``' instruction is used to introduce happens-before edges
9996 between operations.
9997
9998 Arguments:
9999 """"""""""
10000
10001 '``fence``' instructions take an :ref:`ordering <ordering>` argument which
10002 defines what *synchronizes-with* edges they add. They can only be given
10003 ``acquire``, ``release``, ``acq_rel``, and ``seq_cst`` orderings.
10004
10005 Semantics:
10006 """"""""""
10007
10008 A fence A which has (at least) ``release`` ordering semantics
10009 *synchronizes with* a fence B with (at least) ``acquire`` ordering
10010 semantics if and only if there exist atomic operations X and Y, both
10011 operating on some atomic object M, such that A is sequenced before X, X
10012 modifies M (either directly or through some side effect of a sequence
10013 headed by X), Y is sequenced before B, and Y observes M. This provides a
10014 *happens-before* dependency between A and B. Rather than an explicit
10015 ``fence``, one (but not both) of the atomic operations X or Y might
10016 provide a ``release`` or ``acquire`` (resp.) ordering constraint and
10017 still *synchronize-with* the explicit ``fence`` and establish the
10018 *happens-before* edge.
10019
10020 A ``fence`` which has ``seq_cst`` ordering, in addition to having both
10021 ``acquire`` and ``release`` semantics specified above, participates in
10022 the global program order of other ``seq_cst`` operations and/or fences.
10023
10024 A ``fence`` instruction can also take an optional
10025 ":ref:`syncscope <syncscope>`" argument.
10026
10027 Example:
10028 """"""""
10029
10030 .. code-block:: text
10031
10032       fence acquire                                        ; yields void
10033       fence syncscope("singlethread") seq_cst              ; yields void
10034       fence syncscope("agent") seq_cst                     ; yields void
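
The fence-to-fence rule in the semantics above can be sketched as a simple
message-passing pattern. The globals ``@data`` and ``@flag`` and the function
names are hypothetical. Here the ``release`` fence plays the role of A, the
atomic store to ``@flag`` is X, the atomic load of ``@flag`` is Y, and the
``acquire`` fence is B.

.. code-block:: llvm

    @data = global i32 0
    @flag = global i32 0

    define void @producer() {
      store i32 42, i32* @data, align 4                  ; plain store
      fence release                                      ; fence A
      store atomic i32 1, i32* @flag monotonic, align 4  ; X modifies @flag
      ret void
    }

    define i32 @consumer() {
    entry:
      br label %spin

    spin:
      %f = load atomic i32, i32* @flag monotonic, align 4 ; Y observes @flag
      %ready = icmp eq i32 %f, 1
      br i1 %ready, label %done, label %spin

    done:
      fence acquire                                      ; fence B
      %v = load i32, i32* @data, align 4                 ; sees the store of 42
      ret i32 %v
    }

Once the consumer observes the value 1 in ``@flag``, fence A synchronizes with
fence B, so the plain store to ``@data`` happens before the plain load.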
10035
10036 .. _i_cmpxchg:
10037
10038 '``cmpxchg``' Instruction
10039 ^^^^^^^^^^^^^^^^^^^^^^^^^
10040
10041 Syntax:
10042 """""""
10043
10044 ::
10045
10046       cmpxchg [weak] [volatile] <ty>* <pointer>, <ty> <cmp>, <ty> <new> [syncscope("<target-scope>")] <success ordering> <failure ordering>[, align <alignment>] ; yields  { ty, i1 }
10047
10048 Overview:
10049 """""""""
10050
10051 The '``cmpxchg``' instruction is used to atomically modify memory. It
10052 loads a value in memory and compares it to a given value. If they are
10053 equal, it tries to store a new value into the memory.
10054
10055 Arguments:
10056 """"""""""
10057
10058 There are three arguments to the '``cmpxchg``' instruction: an address
10059 to operate on, a value to compare to the value currently at that
10060 address, and a new value to place at that address if the compared values
10061 are equal. The type of '<cmp>' must be an integer or pointer type whose
10062 bit width is a power of two greater than or equal to eight and less
10063 than or equal to a target-specific size limit. '<cmp>' and '<new>' must
10064 have the same type, and the type of '<pointer>' must be a pointer to
10065 that type. If the ``cmpxchg`` is marked as ``volatile``, then the
10066 optimizer is not allowed to modify the number or order of execution of
10067 this ``cmpxchg`` with other :ref:`volatile operations <volatile>`.
10068
10069 The success and failure :ref:`ordering <ordering>` arguments specify how this
10070 ``cmpxchg`` synchronizes with other atomic operations. Both ordering parameters
10071 must be at least ``monotonic``, and the failure ordering cannot be either
10072 ``release`` or ``acq_rel``.
10073
10074 A ``cmpxchg`` instruction can also take an optional
10075 ":ref:`syncscope <syncscope>`" argument.
10076
10077 The instruction can take an optional ``align`` attribute.
10078 The alignment must be a power of two greater than or equal to the size of the
10079 ``<cmp>`` type. If unspecified, the alignment is assumed to be equal to the
10080 size of the ``<cmp>`` type. Note that this default alignment assumption is
10081 different from the alignment used for the load/store instructions when ``align``
10082 isn't specified.
10083
10084 The pointer passed into cmpxchg must have alignment greater than or
10085 equal to the size in memory of the operand.
10086
10087 Semantics:
10088 """"""""""
10089
10090 The contents of memory at the location specified by the '``<pointer>``' operand
10091 are read and compared to '``<cmp>``'; if the values are equal, '``<new>``' is
10092 written to the location. The original value at the location is returned,
10093 together with a flag indicating success (true) or failure (false).
10094
10095 If the cmpxchg operation is marked as ``weak`` then a spurious failure is
10096 permitted: the operation may not write ``<new>`` even if the comparison
10097 matched.
10098
10099 If the cmpxchg operation is strong (the default), the i1 value is 1 if and only
10100 if the value loaded equals ``cmp``.
10101
10102 A successful ``cmpxchg`` is a read-modify-write instruction for the purpose of
10103 identifying release sequences. A failed ``cmpxchg`` is equivalent to an atomic
10104 load with an ordering parameter determined by the second ordering parameter.
10105
10106 Example:
10107 """"""""
10108
10109 .. code-block:: llvm
10110
10111     entry:
10112       %orig = load atomic i32, i32* %ptr unordered, align 4                      ; yields i32
10113       br label %loop
10114
10115     loop:
10116       %cmp = phi i32 [ %orig, %entry ], [%value_loaded, %loop]
10117       %squared = mul i32 %cmp, %cmp
10118       %val_success = cmpxchg i32* %ptr, i32 %cmp, i32 %squared acq_rel monotonic ; yields  { i32, i1 }
10119       %value_loaded = extractvalue { i32, i1 } %val_success, 0
10120       %success = extractvalue { i32, i1 } %val_success, 1
10121       br i1 %success, label %done, label %loop
10122
10123     done:
10124       ...
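
A single strong ``cmpxchg`` with no retry loop is also a common pattern. The
sketch below is a try-lock, assuming a hypothetical convention in which 0 means
unlocked and 1 means locked; the name ``@try_lock`` is illustrative only.

.. code-block:: llvm

    define i1 @try_lock(i32* %lock) {
      ; Success ordering is acquire; failure ordering is monotonic, which
      ; satisfies the rule that it cannot be release or acq_rel.
      %pair = cmpxchg i32* %lock, i32 0, i32 1 acquire monotonic, align 4
      ; Element 1 of the result is the success flag.
      %ok = extractvalue { i32, i1 } %pair, 1
      ret i1 %ok
    }

Because the operation is strong, the returned flag is true exactly when the
value loaded from ``%lock`` was 0.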
10125
10126 .. _i_atomicrmw:
10127
10128 '``atomicrmw``' Instruction
10129 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
10130
10131 Syntax:
10132 """""""
10133
10134 ::
10135
10136       atomicrmw [volatile] <operation> <ty>* <pointer>, <ty> <value> [syncscope("<target-scope>")] <ordering>[, align <alignment>]  ; yields ty
10137
10138 Overview:
10139 """""""""
10140
10141 The '``atomicrmw``' instruction is used to atomically modify memory.
10142
10143 Arguments:
10144 """"""""""
10145
10146 There are three arguments to the '``atomicrmw``' instruction: an
10147 operation to apply, an address whose value to modify, and an argument to the
10148 operation. The operation must be one of the following keywords:
10149
10150 -  xchg
10151 -  add
10152 -  sub
10153 -  and
10154 -  nand
10155 -  or
10156 -  xor
10157 -  max
10158 -  min
10159 -  umax
10160 -  umin
10161 -  fadd
10162 -  fsub
10163
10164 For most of these operations, the type of '<value>' must be an integer
10165 type whose bit width is a power of two greater than or equal to eight
10166 and less than or equal to a target-specific size limit. For xchg, this
10167 may also be a floating point type with the same size constraints as
10168 integers.  For fadd/fsub, this must be a floating point type.  The
10169 type of the '``<pointer>``' operand must be a pointer to that type. If
10170 the ``atomicrmw`` is marked as ``volatile``, then the optimizer is not
10171 allowed to modify the number or order of execution of this
10172 ``atomicrmw`` with other :ref:`volatile operations <volatile>`.
10173
10174 The instruction can take an optional ``align`` attribute.
10175 The alignment must be a power of two greater than or equal to the size of the
10176 ``<value>`` type. If unspecified, the alignment is assumed to be equal to the
10177 size of the ``<value>`` type. Note that this default alignment assumption is
10178 different from the alignment used for the load/store instructions when ``align``
10179 isn't specified.
10180
10181 An ``atomicrmw`` instruction can also take an optional
10182 ":ref:`syncscope <syncscope>`" argument.
10183
10184 Semantics:
10185 """"""""""
10186
10187 The contents of memory at the location specified by the '``<pointer>``'
10188 operand are atomically read, modified, and written back. The original
10189 value at the location is returned. The modification is specified by the
10190 operation argument:
10191
10192 -  xchg: ``*ptr = val``
10193 -  add: ``*ptr = *ptr + val``
10194 -  sub: ``*ptr = *ptr - val``
10195 -  and: ``*ptr = *ptr & val``
10196 -  nand: ``*ptr = ~(*ptr & val)``
10197 -  or: ``*ptr = *ptr | val``
10198 -  xor: ``*ptr = *ptr ^ val``
10199 -  max: ``*ptr = *ptr > val ? *ptr : val`` (using a signed comparison)
10200 -  min: ``*ptr = *ptr < val ? *ptr : val`` (using a signed comparison)
10201 -  umax: ``*ptr = *ptr > val ? *ptr : val`` (using an unsigned comparison)
10202 -  umin: ``*ptr = *ptr < val ? *ptr : val`` (using an unsigned comparison)
10203 -  fadd: ``*ptr = *ptr + val`` (using floating point arithmetic)
10204 -  fsub: ``*ptr = *ptr - val`` (using floating point arithmetic)
10205
10206 Example:
10207 """"""""
10208
10209 .. code-block:: llvm
10210
10211       %old = atomicrmw add i32* %ptr, i32 1 acquire                        ; yields i32
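
A few more forms, shown as a minimal sketch. The function name
``@rmw_examples`` and its parameters are hypothetical.

.. code-block:: llvm

    define float @rmw_examples(i32* %counter, float* %acc) {
      ; Unsigned maximum; %prev receives the value that was in memory before.
      %prev = atomicrmw umax i32* %counter, i32 10 monotonic
      ; Volatile decrement with an explicit alignment.
      %old = atomicrmw volatile sub i32* %counter, i32 1 seq_cst, align 4
      ; Floating-point add restricted to the "singlethread" scope.
      %sum = atomicrmw fadd float* %acc, float 1.5 syncscope("singlethread") acq_rel
      ret float %sum
    }

In each case the result is the original value at the location, not the
updated one.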
10212
10213 .. _i_getelementptr:
10214
10215 '``getelementptr``' Instruction
10216 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10217
10218 Syntax:
10219 """""""
10220
10221 ::
10222
10223       <result> = getelementptr <ty>, <ty>* <ptrval>{, [inrange] <ty> <idx>}*
10224       <result> = getelementptr inbounds <ty>, <ty>* <ptrval>{, [inrange] <ty> <idx>}*
10225       <result> = getelementptr <ty>, <ptr vector> <ptrval>, [inrange] <vector index type> <idx>
10226
10227 Overview:
10228 """""""""
10229
10230 The '``getelementptr``' instruction is used to get the address of a
10231 subelement of an :ref:`aggregate <t_aggregate>` data structure. It performs
10232 address calculation only and does not access memory. The instruction can also
10233 be used to calculate a vector of such addresses.
10234
10235 Arguments:
10236 """"""""""
10237
10238 The first argument is always a type used as the basis for the calculations.
10239 The second argument is always a pointer or a vector of pointers, and is the
10240 base address to start from. The remaining arguments are indices
10241 that indicate which of the elements of the aggregate object are indexed.
10242 The interpretation of each index is dependent on the type being indexed
10243 into. The first index always indexes the pointer value given as the
10244 second argument, the second index indexes a value of the type pointed to
10245 (not necessarily the value directly pointed to, since the first index
10246 can be non-zero), etc. The first type indexed into must be a pointer
10247 value, subsequent types can be arrays, vectors, and structs. Note that
10248 subsequent types being indexed into can never be pointers, since that
10249 would require loading the pointer before continuing calculation.
10250
10251 The type of each index argument depends on the type it is indexing into.
10252 When indexing into an (optionally packed) structure, only ``i32`` integer
10253 **constants** are allowed (when using a vector of indices they must all
10254 be the **same** ``i32`` integer constant). When indexing into an array,
10255 pointer or vector, integers of any width are allowed, and they are not
10256 required to be constant. These integers are treated as signed values
10257 where relevant.
10258
10259 For example, let's consider a C code fragment and how it gets compiled
10260 to LLVM:
10261
10262 .. code-block:: c
10263
10264     struct RT {
10265       char A;
10266       int B[10][20];
10267       char C;
10268     };
10269     struct ST {
10270       int X;
10271       double Y;
10272       struct RT Z;
10273     };
10274
10275     int *foo(struct ST *s) {
10276       return &s[1].Z.B[5][13];
10277     }
10278
10279 The LLVM code generated by Clang is:
10280
10281 .. code-block:: llvm
10282
10283     %struct.RT = type { i8, [10 x [20 x i32]], i8 }
10284     %struct.ST = type { i32, double, %struct.RT }
10285
10286     define i32* @foo(%struct.ST* %s) nounwind uwtable readnone optsize ssp {
10287     entry:
10288       %arrayidx = getelementptr inbounds %struct.ST, %struct.ST* %s, i64 1, i32 2, i32 1, i64 5, i64 13
10289       ret i32* %arrayidx
10290     }
10291
10292 Semantics:
10293 """"""""""
10294
10295 In the example above, the first index is indexing into the
10296 '``%struct.ST*``' type, which is a pointer, yielding a '``%struct.ST``'
10297 = '``{ i32, double, %struct.RT }``' type, a structure. The second index
10298 indexes into the third element of the structure, yielding a
10299 '``%struct.RT``' = '``{ i8 , [10 x [20 x i32]], i8 }``' type, another
10300 structure. The third index indexes into the second element of the
10301 structure, yielding a '``[10 x [20 x i32]]``' type, an array. The two
10302 dimensions of the array are subscripted into, yielding an '``i32``'
10303 type. The '``getelementptr``' instruction returns a pointer to this
10304 element, thus computing a value of '``i32*``' type.
10305
10306 Note that it is perfectly legal to index partially through a structure,
10307 returning a pointer to an inner element. Because of this, the LLVM code
10308 for the given testcase is equivalent to:
10309
10310 .. code-block:: llvm
10311
10312     define i32* @foo(%struct.ST* %s) {
10313       %t1 = getelementptr %struct.ST, %struct.ST* %s, i32 1                        ; yields %struct.ST*:%t1
10314       %t2 = getelementptr %struct.ST, %struct.ST* %t1, i32 0, i32 2                ; yields %struct.RT*:%t2
10315       %t3 = getelementptr %struct.RT, %struct.RT* %t2, i32 0, i32 1                ; yields [10 x [20 x i32]]*:%t3
10316       %t4 = getelementptr [10 x [20 x i32]], [10 x [20 x i32]]* %t3, i32 0, i32 5  ; yields [20 x i32]*:%t4
10317       %t5 = getelementptr [20 x i32], [20 x i32]* %t4, i32 0, i32 13               ; yields i32*:%t5
10318       ret i32* %t5
10319     }
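
The same address can also be written as a single byte offset. The sketch below
assumes a data layout in which ``i8`` is 1 byte, ``i32`` is 4 bytes with 4-byte
alignment, and ``double`` is 8 bytes with 8-byte alignment; other layouts give
different numbers. The function name ``@foo_bytes`` is hypothetical.

.. code-block:: llvm

    %struct.RT = type { i8, [10 x [20 x i32]], i8 }
    %struct.ST = type { i32, double, %struct.RT }

    define i32* @foo_bytes(%struct.ST* %s) {
      ; Under the assumed layout:
      ;   1 * sizeof(%struct.ST)       = 824   (index i64 1)
      ;   offset of field 2 in ST (Z)  =  16   (index i32 2)
      ;   offset of field 1 in RT (B)  =   4   (index i32 1)
      ;   5 * sizeof([20 x i32])       = 400   (index i64 5)
      ;   13 * sizeof(i32)             =  52   (index i64 13)
      ;   total                        = 1296 bytes
      %raw = bitcast %struct.ST* %s to i8*
      %addr = getelementptr inbounds i8, i8* %raw, i64 1296
      %res = bitcast i8* %addr to i32*
      ret i32* %res
    }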
10320
10321 If the ``inbounds`` keyword is present, the result value of the
10322 ``getelementptr`` is a :ref:`poison value <poisonvalues>` if one of the
10323 following rules is violated:
10324
10325 *  The base pointer has an *in bounds* address of an allocated object, which
10326    means that it points into an allocated object, or to its end. The only
10327    *in bounds* address for a null pointer in the default address-space is the
10328    null pointer itself.
10329 *  If the type of an index is larger than the pointer index type, the
10330    truncation to the pointer index type preserves the signed value.
10331 *  The multiplication of an index by the type size does not wrap the pointer
10332    index type in a signed sense (``nsw``).
10333 *  The successive addition of offsets (without adding the base address) does
10334    not wrap the pointer index type in a signed sense (``nsw``).
10335 *  The successive addition of the current address, interpreted as an unsigned
10336    number, and an offset, interpreted as a signed number, does not wrap the
10337    unsigned address space and remains *in bounds* of the allocated object.
10338    As a corollary, if the added offset is non-negative, the addition does not
10339    wrap in an unsigned sense (``nuw``).
10340 *  In cases where the base is a vector of pointers, the ``inbounds`` keyword
10341    applies to each of the computations element-wise.
10342
10343 These rules are based on the assumption that no allocated object may cross
10344 the unsigned address space boundary, and no allocated object may be larger
10345 than half the pointer index type space.
10346
10347 If the ``inbounds`` keyword is not present, the offsets are added to the
10348 base address with silently-wrapping two's complement arithmetic. If the
10349 offsets have a different width from the pointer, they are sign-extended
10350 or truncated to the width of the pointer. The result value of the
10351 ``getelementptr`` may be outside the object pointed to by the base
10352 pointer. The result value may not necessarily be used to access memory
10353 though, even if it happens to point into allocated storage. See the
10354 :ref:`Pointer Aliasing Rules <pointeraliasing>` section for more
10355 information.
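
The difference between the two forms can be sketched with a small global array.
The global ``@small`` and the function names are hypothetical.

.. code-block:: llvm

    @small = global [4 x i32] zeroinitializer

    define i32* @one_past_end() {
      ; In bounds: the address one past the end of @small is allowed.
      %end = getelementptr inbounds [4 x i32], [4 x i32]* @small, i64 1, i64 0
      ret i32* %end
    }

    define i32* @past_the_end_inbounds() {
      ; Element 5 lies beyond the one-past-the-end address, so the result
      ; is a poison value.
      %bad = getelementptr inbounds [4 x i32], [4 x i32]* @small, i64 0, i64 5
      ret i32* %bad
    }

    define i32* @past_the_end_plain() {
      ; Without inbounds the address computation itself is well defined,
      ; but the result may not be used to access memory.
      %far = getelementptr [4 x i32], [4 x i32]* @small, i64 0, i64 5
      ret i32* %far
    }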
10356
10357 If the ``inrange`` keyword is present before any index, loading from or
10358 storing to any pointer derived from the ``getelementptr`` has undefined
10359 behavior if the load or store would access memory outside of the bounds of
10360 the element selected by the index marked as ``inrange``. The result of a
10361 pointer comparison or ``ptrtoint`` (including ``ptrtoint``-like operations
10362 involving memory) involving a pointer derived from a ``getelementptr`` with
10363 the ``inrange`` keyword is undefined, with the exception of comparisons
10364 in the case where both operands are in the range of the element selected
10365 by the ``inrange`` keyword, inclusive of the address one past the end of
10366 that element. Note that the ``inrange`` keyword is currently only allowed
10367 in constant ``getelementptr`` expressions.
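
As a sketch, a constant expression using ``inrange`` might look as follows. The
globals ``@table`` and ``@slot`` are hypothetical.

.. code-block:: llvm

    @table = constant { [2 x i8*], [2 x i8*] } zeroinitializer

    ; The index marked inrange selects the second [2 x i8*] member. Loads and
    ; stores through pointers derived from this expression must stay within
    ; that member.
    @slot = global i8** getelementptr inbounds ({ [2 x i8*], [2 x i8*] }, { [2 x i8*], [2 x i8*] }* @table, i32 0, inrange i32 1, i32 0)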
10368
10369 The getelementptr instruction is often confusing. For some more insight
10370 into how it works, see :doc:`the getelementptr FAQ <GetElementPtr>`.
10371
10372 Example:
10373 """"""""
10374
10375 .. code-block:: llvm
10376
10377         ; yields [12 x i8]*:aptr
10378         %aptr = getelementptr {i32, [12 x i8]}, {i32, [12 x i8]}* %saptr, i64 0, i32 1
10379         ; yields i8*:vptr
10380         %vptr = getelementptr {i32, <2 x i8>}, {i32, <2 x i8>}* %svptr, i64 0, i32 1, i32 1
10381         ; yields i8*:eptr
10382         %eptr = getelementptr [12 x i8], [12 x i8]* %aptr, i64 0, i32 1
10383         ; yields i32*:iptr
10384         %iptr = getelementptr [10 x i32], [10 x i32]* @arr, i16 0, i16 0
10385
10386 Vector of pointers:
10387 """""""""""""""""""
10388
10389 The ``getelementptr`` returns a vector of pointers, instead of a single address,
10390 when one or more of its arguments is a vector. In such cases, all vector
10391 arguments should have the same number of elements, and every scalar argument
10392 will be effectively broadcast into a vector during address calculation.
10393
10394 .. code-block:: llvm
10395
10396      ; All arguments are vectors:
10397      ;   A[i] = ptrs[i] + offsets[i]*sizeof(i8)
10398      %A = getelementptr i8, <4 x i8*> %ptrs, <4 x i64> %offsets
10399
10400      ; Add the same scalar offset to each pointer of a vector:
10401      ;   A[i] = ptrs[i] + offset*sizeof(i8)
10402      %A = getelementptr i8, <4 x i8*> %ptrs, i64 %offset
10403
10404      ; Add distinct offsets to the same pointer:
10405      ;   A[i] = ptr + offsets[i]*sizeof(i8)
10406      %A = getelementptr i8, i8* %ptr, <4 x i64> %offsets
10407
10408      ; In all cases described above the type of the result is <4 x i8*>
10409
10410 The two following instructions are equivalent:
10411
10412 .. code-block:: llvm
10413
10414      getelementptr  %struct.ST, <4 x %struct.ST*> %s, <4 x i64> %ind1,
10415        <4 x i32> <i32 2, i32 2, i32 2, i32 2>,
10416        <4 x i32> <i32 1, i32 1, i32 1, i32 1>,
10417        <4 x i32> %ind4,
10418        <4 x i64> <i64 13, i64 13, i64 13, i64 13>
10419
10420      getelementptr  %struct.ST, <4 x %struct.ST*> %s, <4 x i64> %ind1,
10421        i32 2, i32 1, <4 x i32> %ind4, i64 13
10422
10423 Consider the following C code, where the vector version of ``getelementptr``
10424 makes sense:
10425
10426 .. code-block:: c
10427
10428     // Let's assume that we vectorize the following loop:
10429     double *A, *B; int *C;
10430     for (int i = 0; i < size; ++i) {
10431       A[i] = B[C[i]];
10432     }
10433
10434 .. code-block:: llvm
10435
10436     ; get pointers for 8 elements from array B
10437     %ptrs = getelementptr double, double* %B, <8 x i32> %C
10438     ; load 8 elements from array B into A
10439     %A = call <8 x double> @llvm.masked.gather.v8f64.v8p0f64(<8 x double*> %ptrs,
10440          i32 8, <8 x i1> %mask, <8 x double> %passthru)
10441
10442 Conversion Operations
10443 ---------------------
10444
10445 The instructions in this category are the conversion instructions
10446 (casting) which all take a single operand and a type. They perform
10447 various bit conversions on the operand.
10448
10449 .. _i_trunc:
10450
10451 '``trunc .. to``' Instruction
10452 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10453
10454 Syntax:
10455 """""""
10456
10457 ::
10458
10459       <result> = trunc <ty> <value> to <ty2>             ; yields ty2
10460
10461 Overview:
10462 """""""""
10463
10464 The '``trunc``' instruction truncates its operand to the type ``ty2``.
10465
10466 Arguments:
10467 """"""""""
10468
10469 The '``trunc``' instruction takes a value to truncate and a type to truncate
10470 it to. Both types must be :ref:`integer <t_integer>` types, or vectors
10471 of the same number of integers. The bit size of the ``value`` must be
10472 larger than the bit size of the destination type, ``ty2``. Equal sized
10473 types are not allowed.
10474
10475 Semantics:
10476 """"""""""
10477
10478 The '``trunc``' instruction truncates the high order bits in ``value``
10479 and converts the remaining bits to ``ty2``. Since the source size must
10480 be larger than the destination size, ``trunc`` cannot be a *no-op cast*.
10481 It will always truncate bits.
10482
10483 Example:
10484 """"""""
10485
10486 .. code-block:: llvm
10487
10488       %X = trunc i32 257 to i8                        ; yields i8:1
10489       %Y = trunc i32 123 to i1                        ; yields i1:true
10490       %Z = trunc i32 122 to i1                        ; yields i1:false
10491       %W = trunc <2 x i16> <i16 8, i16 7> to <2 x i8> ; yields <i8 8, i8 7>
10492
10493 .. _i_zext:
10494
10495 '``zext .. to``' Instruction
10496 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10497
10498 Syntax:
10499 """""""
10500
10501 ::
10502
10503       <result> = zext <ty> <value> to <ty2>             ; yields ty2
10504
10505 Overview:
10506 """""""""
10507
10508 The '``zext``' instruction zero extends its operand to type ``ty2``.
10509
10510 Arguments:
10511 """"""""""
10512
10513 The '``zext``' instruction takes a value to cast, and a type to cast it
10514 to. Both types must be :ref:`integer <t_integer>` types, or vectors of
10515 the same number of integers. The bit size of the ``value`` must be
10516 smaller than the bit size of the destination type, ``ty2``.
10517
10518 Semantics:
10519 """"""""""
10520
10521 The ``zext`` fills the high order bits of the ``value`` with zero bits
10522 until it reaches the size of the destination type, ``ty2``.
10523
10524 When zero extending from i1, the result will always be either 0 or 1.
10525
10526 Example:
10527 """"""""
10528
10529 .. code-block:: llvm
10530
10531       %X = zext i32 257 to i64              ; yields i64:257
10532       %Y = zext i1 true to i32              ; yields i32:1
10533       %Z = zext <2 x i16> <i16 8, i16 7> to <2 x i32> ; yields <i32 8, i32 7>
10534
10535 .. _i_sext:
10536
10537 '``sext .. to``' Instruction
10538 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10539
10540 Syntax:
10541 """""""
10542
10543 ::
10544
10545       <result> = sext <ty> <value> to <ty2>             ; yields ty2
10546
10547 Overview:
10548 """""""""
10549
10550 The '``sext``' sign extends ``value`` to the type ``ty2``.
10551
10552 Arguments:
10553 """"""""""
10554
10555 The '``sext``' instruction takes a value to cast, and a type to cast it
10556 to. Both types must be :ref:`integer <t_integer>` types, or vectors of
10557 the same number of integers. The bit size of the ``value`` must be
10558 smaller than the bit size of the destination type, ``ty2``.
10559
10560 Semantics:
10561 """"""""""
10562
10563 The '``sext``' instruction performs a sign extension by copying the sign
10564 bit (highest order bit) of the ``value`` until it reaches the bit size
10565 of the type ``ty2``.
10566
10567 When sign extending from i1, the extension always results in -1 or 0.
10568
10569 Example:
10570 """"""""
10571
10572 .. code-block:: llvm
10573
10574       %X = sext i8  -1 to i16              ; yields i16:-1 (65535 if read as unsigned)
10575       %Y = sext i1 true to i32             ; yields i32:-1
10576       %Z = sext <2 x i16> <i16 8, i16 7> to <2 x i32> ; yields <i32 8, i32 7>
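
The difference between ``zext`` and ``sext`` is easiest to see when each is
applied to the result of a ``trunc``. This is a minimal sketch with
hypothetical function names.

.. code-block:: llvm

    define i32 @low_byte_unsigned(i32 %x) {
      %lo = trunc i32 %x to i8        ; keep the low 8 bits
      %z = zext i8 %lo to i32         ; high 24 bits are zero
      ret i32 %z
    }

    define i32 @low_byte_signed(i32 %x) {
      %lo = trunc i32 %x to i8        ; keep the low 8 bits
      %s = sext i8 %lo to i32         ; high 24 bits copy bit 7 of %lo
      ret i32 %s
    }

For an input of 511 the truncated byte has all bits set, so the first function
returns 255 and the second returns -1.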
10577
10578 '``fptrunc .. to``' Instruction
10579 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10580
10581 Syntax:
10582 """""""
10583
10584 ::
10585
10586       <result> = fptrunc <ty> <value> to <ty2>             ; yields ty2
10587
10588 Overview:
10589 """""""""
10590
10591 The '``fptrunc``' instruction truncates ``value`` to type ``ty2``.
10592
10593 Arguments:
10594 """"""""""
10595
10596 The '``fptrunc``' instruction takes a :ref:`floating-point <t_floating>`
10597 value to cast and a :ref:`floating-point <t_floating>` type to cast it to.
10598 The size of ``value`` must be larger than the size of ``ty2``. This
10599 implies that ``fptrunc`` cannot be used to make a *no-op cast*.
10600
10601 Semantics:
10602 """"""""""
10603
10604 The '``fptrunc``' instruction casts a ``value`` from a larger
10605 :ref:`floating-point <t_floating>` type to a smaller :ref:`floating-point
10606 <t_floating>` type.
10607 This instruction is assumed to execute in the default :ref:`floating-point
10608 environment <floatenv>`.
10609
10610 Example:
10611 """"""""
10612
10613 .. code-block:: llvm
10614
10615       %X = fptrunc double 16777217.0 to float    ; yields float:16777216.0
10616       %Y = fptrunc double 1.0E+300 to half       ; yields half:+infinity
10617
10618 '``fpext .. to``' Instruction
10619 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10620
10621 Syntax:
10622 """""""
10623
10624 ::
10625
10626       <result> = fpext <ty> <value> to <ty2>             ; yields ty2
10627
10628 Overview:
10629 """""""""
10630
10631 The '``fpext``' extends a floating-point ``value`` to a larger floating-point
10632 value.
10633
10634 Arguments:
10635 """"""""""
10636
10637 The '``fpext``' instruction takes a :ref:`floating-point <t_floating>`
10638 ``value`` to cast, and a :ref:`floating-point <t_floating>` type to cast it
10639 to. The source type must be smaller than the destination type.
10640
10641 Semantics:
10642 """"""""""
10643
10644 The '``fpext``' instruction extends the ``value`` from a smaller
10645 :ref:`floating-point <t_floating>` type to a larger :ref:`floating-point
10646 <t_floating>` type. The ``fpext`` cannot be used to make a
10647 *no-op cast* because it always changes bits. Use ``bitcast`` to make a
10648 *no-op cast* for a floating-point cast.
10649
10650 Example:
10651 """"""""
10652
10653 .. code-block:: llvm
10654
10655       %X = fpext float 3.125 to double         ; yields double:3.125000e+00
10656       %Y = fpext double %X to fp128            ; yields fp128:0xL00000000000000004000900000000000
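
Since ``fpext`` never loses precision, extending and then truncating back is
expected to reproduce the original value for non-NaN inputs. A minimal sketch,
with a hypothetical function name:

.. code-block:: llvm

    define float @widen_then_narrow(float %f) {
      %d = fpext float %f to double       ; exact: every float is a double
      %r = fptrunc double %d to float     ; same value as %f for non-NaN input
      ret float %r
    }

The reverse order, ``fptrunc`` followed by ``fpext``, may round and therefore
does not generally return the original value.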
10657
10658 '``fptoui .. to``' Instruction
10659 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10660
10661 Syntax:
10662 """""""
10663
10664 ::
10665
10666       <result> = fptoui <ty> <value> to <ty2>             ; yields ty2
10667
10668 Overview:
10669 """""""""
10670
10671 The '``fptoui``' converts a floating-point ``value`` to its unsigned
10672 integer equivalent of type ``ty2``.
10673
10674 Arguments:
10675 """"""""""
10676
10677 The '``fptoui``' instruction takes a value to cast, which must be a
10678 scalar or vector :ref:`floating-point <t_floating>` value, and a type to
10679 cast it to ``ty2``, which must be an :ref:`integer <t_integer>` type. If
10680 ``ty`` is a vector floating-point type, ``ty2`` must be a vector integer
10681 type with the same number of elements as ``ty``.
10682
10683 Semantics:
10684 """"""""""
10685
10686 The '``fptoui``' instruction converts its :ref:`floating-point
10687 <t_floating>` operand into the nearest (rounding towards zero)
10688 unsigned integer value. If the value cannot fit in ``ty2``, the result
10689 is a :ref:`poison value <poisonvalues>`.
10690
10691 Example:
10692 """"""""
10693
10694 .. code-block:: llvm
10695
10696       %X = fptoui double 123.0 to i32      ; yields i32:123
10697       %Y = fptoui double 1.0E+300 to i1    ; yields poison (out of range for i1)
10698       %Z = fptoui double 1.04E+17 to i8    ; yields poison (out of range for i8)
10699
10700 '``fptosi .. to``' Instruction
10701 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10702
10703 Syntax:
10704 """""""
10705
10706 ::
10707
10708       <result> = fptosi <ty> <value> to <ty2>             ; yields ty2
10709
10710 Overview:
10711 """""""""
10712
10713 The '``fptosi``' instruction converts :ref:`floating-point <t_floating>`
10714 ``value`` to type ``ty2``.
10715
10716 Arguments:
10717 """"""""""
10718
10719 The '``fptosi``' instruction takes a value to cast, which must be a
10720 scalar or vector :ref:`floating-point <t_floating>` value, and a type to
10721 cast it to ``ty2``, which must be an :ref:`integer <t_integer>` type. If
10722 ``ty`` is a vector floating-point type, ``ty2`` must be a vector integer
10723 type with the same number of elements as ``ty``.
10724
10725 Semantics:
10726 """"""""""
10727
10728 The '``fptosi``' instruction converts its :ref:`floating-point
10729 <t_floating>` operand into the nearest (rounding towards zero)
10730 signed integer value. If the value cannot fit in ``ty2``, the result
10731 is a :ref:`poison value <poisonvalues>`.
10732
10733 Example:
10734 """"""""
10735
10736 .. code-block:: llvm
10737
10738       %X = fptosi double -123.0 to i32      ; yields i32:-123
10739       %Y = fptosi double 1.0E-247 to i1     ; yields i1:0 (rounds toward zero)
10740       %Z = fptosi double 1.04E+17 to i8     ; yields poison (out of range for i8)
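
The rounding direction and the range rule for ``fptoui`` and ``fptosi`` can be
seen side by side in the following sketch, which uses only constants:

.. code-block:: llvm

      %a = fptoui double 2.75 to i32       ; yields i32:2 (rounds toward zero)
      %b = fptosi double -2.75 to i32      ; yields i32:-2 (rounds toward zero)
      %c = fptosi double 127.5 to i8       ; yields i8:127 (fits after rounding)
      %d = fptosi double 128.0 to i8       ; yields poison (i8 maximum is 127)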
10741
10742 '``uitofp .. to``' Instruction
10743 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10744
10745 Syntax:
10746 """""""
10747
10748 ::
10749
10750       <result> = uitofp <ty> <value> to <ty2>             ; yields ty2
10751
10752 Overview:
10753 """""""""
10754
10755 The '``uitofp``' instruction regards ``value`` as an unsigned integer
10756 and converts that value to the ``ty2`` type.
10757
10758 Arguments:
10759 """"""""""
10760
10761 The '``uitofp``' instruction takes a value to cast, which must be a
10762 scalar or vector :ref:`integer <t_integer>` value, and a type to cast it to,
10763 ``ty2``, which must be a :ref:`floating-point <t_floating>` type. If
10764 ``ty`` is a vector integer type, ``ty2`` must be a vector floating-point
10765 type with the same number of elements as ``ty``.
10766
10767 Semantics:
10768 """"""""""
10769
10770 The '``uitofp``' instruction interprets its operand as an unsigned
10771 integer quantity and converts it to the corresponding floating-point
10772 value. If the value cannot be exactly represented, it is rounded using
10773 the default rounding mode.
10774
10775
10776 Example:
10777 """"""""
10778
10779 .. code-block:: llvm
10780
10781       %X = uitofp i32 257 to float         ; yields float:257.0
10782       %Y = uitofp i8 -1 to double          ; yields double:255.0
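
The rounding case can be illustrated with a value that needs more
significant bits than ``float`` provides. This is a sketch that assumes the
default rounding mode is round-to-nearest with ties to even; the function
name is hypothetical:

.. code-block:: llvm

      define float @rounded() {
        ; 16777217 is 2^24 + 1 and needs 25 significant bits; float has 24.
        ; It lies exactly halfway between 16777216 and 16777218.
        %f = uitofp i32 16777217 to float       ; 16777216.0 (tie goes to even)
        ret float %f
      }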
10783
10784 '``sitofp .. to``' Instruction
10785 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10786
10787 Syntax:
10788 """""""
10789
10790 ::
10791
10792       <result> = sitofp <ty> <value> to <ty2>             ; yields ty2
10793
10794 Overview:
10795 """""""""
10796
10797 The '``sitofp``' instruction regards ``value`` as a signed integer and
10798 converts that value to the ``ty2`` type.
10799
10800 Arguments:
10801 """"""""""
10802
10803 The '``sitofp``' instruction takes a value to cast, which must be a
10804 scalar or vector :ref:`integer <t_integer>` value, and a type to cast it to,
10805 ``ty2``, which must be a :ref:`floating-point <t_floating>` type. If
10806 ``ty`` is a vector integer type, ``ty2`` must be a vector floating-point
10807 type with the same number of elements as ``ty``.
10808
10809 Semantics:
10810 """"""""""
10811
10812 The '``sitofp``' instruction interprets its operand as a signed integer
10813 quantity and converts it to the corresponding floating-point value. If the
10814 value cannot be exactly represented, it is rounded using the default rounding
10815 mode.
10816
10817 Example:
10818 """"""""
10819
10820 .. code-block:: llvm
10821
10822       %X = sitofp i32 257 to float         ; yields float:257.0
10823       %Y = sitofp i8 -1 to double          ; yields double:-1.0
10824
10825 .. _i_ptrtoint:
10826
10827 '``ptrtoint .. to``' Instruction
10828 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10829
10830 Syntax:
10831 """""""
10832
10833 ::
10834
10835       <result> = ptrtoint <ty> <value> to <ty2>             ; yields ty2
10836
10837 Overview:
10838 """""""""
10839
10840 The '``ptrtoint``' instruction converts the pointer or a vector of
10841 pointers ``value`` to the integer (or vector of integers) type ``ty2``.
10842
10843 Arguments:
10844 """"""""""
10845
10846 The '``ptrtoint``' instruction takes a ``value`` to cast, which must be
10847 a value of type :ref:`pointer <t_pointer>` or a vector of pointers, and a
10848 type to cast it to, ``ty2``, which must be an :ref:`integer <t_integer>` or
10849 a vector of integers type.
10850
10851 Semantics:
10852 """"""""""
10853
10854 The '``ptrtoint``' instruction converts ``value`` to integer type
10855 ``ty2`` by interpreting the pointer value as an integer and either
10856 truncating or zero extending that value to the size of the integer type.
10857 If ``value`` is smaller than ``ty2`` then a zero extension is done. If
10858 ``value`` is larger than ``ty2`` then a truncation is done. If they are
10859 the same size, then nothing is done (*no-op cast*) other than a type
10860 change.
10861
10862 Example:
10863 """"""""
10864
10865 .. code-block:: llvm
10866
10867       %X = ptrtoint i32* %P to i8                         ; yields truncation on 32-bit architecture
10868       %Y = ptrtoint i32* %P to i64                        ; yields zero extension on 32-bit architecture
10869       %Z = ptrtoint <4 x i32*> %P to <4 x i64>            ; yields vector zero extension for a vector of addresses on 32-bit architecture
10870
10871 .. _i_inttoptr:
10872
10873 '``inttoptr .. to``' Instruction
10874 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10875
10876 Syntax:
10877 """""""
10878
10879 ::
10880
10881       <result> = inttoptr <ty> <value> to <ty2>[, !dereferenceable !<deref_bytes_node>][, !dereferenceable_or_null !<deref_bytes_node>]             ; yields ty2
10882
10883 Overview:
10884 """""""""
10885
10886 The '``inttoptr``' instruction converts an integer ``value`` to a
10887 pointer type, ``ty2``.
10888
10889 Arguments:
10890 """"""""""
10891
10892 The '``inttoptr``' instruction takes an :ref:`integer <t_integer>` value to
10893 cast, and a type to cast it to, which must be a :ref:`pointer <t_pointer>`
10894 type.
10895
10896 The optional ``!dereferenceable`` metadata must reference a single metadata
10897 name ``<deref_bytes_node>`` corresponding to a metadata node with one ``i64``
10898 entry.
10899 See ``dereferenceable`` metadata.
10900
10901 The optional ``!dereferenceable_or_null`` metadata must reference a single
10902 metadata name ``<deref_bytes_node>`` corresponding to a metadata node with one
10903 ``i64`` entry.
10904 See ``dereferenceable_or_null`` metadata.
10905
10906 Semantics:
10907 """"""""""
10908
10909 The '``inttoptr``' instruction converts ``value`` to type ``ty2`` by
10910 applying either a zero extension or a truncation depending on the size
10911 of the integer ``value``. If ``value`` is larger than the size of a
10912 pointer then a truncation is done. If ``value`` is smaller than the size
10913 of a pointer then a zero extension is done. If they are the same size,
10914 nothing is done (*no-op cast*).
10915
10916 Example:
10917 """"""""
10918
10919 .. code-block:: llvm
10920
10921       %X = inttoptr i32 255 to i32*          ; yields zero extension on 64-bit architecture
10922       %Y = inttoptr i32 255 to i32*          ; yields no-op on 32-bit architecture
10923       %Z = inttoptr i64 0 to i32*            ; yields truncation on 32-bit architecture
10924       %W = inttoptr <4 x i32> %G to <4 x i8*>; yields a vector of four pointers, each element of %G converted separately
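
The two conversions are often used as a pair. The sketch below uses a
hypothetical function name and assumes that pointers are no wider than 64
bits, so that no address bits are lost in the intermediate ``i64``:

.. code-block:: llvm

      define i32* @roundtrip(i32* %p) {
        %addr = ptrtoint i32* %p to i64         ; zero-extended if pointers are narrower than 64 bits
        %q = inttoptr i64 %addr to i32*         ; truncated back to the pointer width
        ret i32* %q
      }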
10925
10926 .. _i_bitcast:
10927
10928 '``bitcast .. to``' Instruction
10929 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10930
10931 Syntax:
10932 """""""
10933
10934 ::
10935
10936       <result> = bitcast <ty> <value> to <ty2>             ; yields ty2
10937
10938 Overview:
10939 """""""""
10940
10941 The '``bitcast``' instruction converts ``value`` to type ``ty2`` without
10942 changing any bits.
10943
10944 Arguments:
10945 """"""""""
10946
10947 The '``bitcast``' instruction takes a value to cast, which must be a
10948 non-aggregate first class value, and a type to cast it to, which must
10949 also be a non-aggregate :ref:`first class <t_firstclass>` type. The
10950 bit sizes of ``value`` and the destination type, ``ty2``, must be
10951 identical. If the source type is a pointer, the destination type must
10952 also be a pointer of the same size. This instruction supports bitwise
10953 conversion of vectors to integers and to vectors of other types (as
10954 long as they have the same size).
10955
10956 Semantics:
10957 """"""""""
10958
10959 The '``bitcast``' instruction converts ``value`` to type ``ty2``. It
10960 is always a *no-op cast* because no bits change with this
10961 conversion. The conversion is done as if the ``value`` had been stored
10962 to memory and read back as type ``ty2``. Pointer (or vector of
10963 pointers) types may only be converted to other pointer (or vector of
10964 pointers) types with the same address space through this instruction.
10965 To convert pointers to other types, use the :ref:`inttoptr <i_inttoptr>`
10966 or :ref:`ptrtoint <i_ptrtoint>` instructions first.
10967
10968 There is a caveat for bitcasts involving vector types in relation to
10969 endianness. For example, ``bitcast <2 x i8> <value> to i16`` puts element zero
10970 of the vector in the least significant bits of the i16 for little-endian while
10971 element zero ends up in the most significant bits for big-endian.
10972
10973 Example:
10974 """"""""
10975
10976 .. code-block:: text
10977
10978       %X = bitcast i8 255 to i8                ; yields i8:-1
10979       %Y = bitcast i32* %x to i16*             ; yields i16*:%x
10980       %Z = bitcast <2 x i32> %V to i64         ; yields i64: %V (depends on endianness)
10981       %W = bitcast <2 x i32*> %P to <2 x i64*> ; yields <2 x i64*>
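
As a sketch of the "same bits, different type" rule and of the vector
caveat described above (function names are illustrative only):

.. code-block:: llvm

      define i32 @float_bits() {
        %bits = bitcast float 1.0 to i32        ; 0x3F800000, the IEEE-754 encoding of 1.0
        ret i32 %bits
      }

      define i16 @pack(<2 x i8> %v) {
        ; For %v = <i8 1, i8 2>:
        ;   little-endian targets give 0x0201 (element zero in the low byte)
        ;   big-endian targets give 0x0102 (element zero in the high byte)
        %r = bitcast <2 x i8> %v to i16
        ret i16 %r
      }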
10982
10983 .. _i_addrspacecast:
10984
10985 '``addrspacecast .. to``' Instruction
10986 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10987
10988 Syntax:
10989 """""""
10990
10991 ::
10992
10993       <result> = addrspacecast <pty> <ptrval> to <pty2>       ; yields pty2
10994
10995 Overview:
10996 """""""""
10997
10998 The '``addrspacecast``' instruction converts ``ptrval`` from ``pty`` in
10999 address space ``n`` to type ``pty2`` in address space ``m``.
11000
11001 Arguments:
11002 """"""""""
11003
11004 The '``addrspacecast``' instruction takes a pointer or vector of pointers
11005 value to cast and a pointer type to cast it to, which must have a different
11006 address space.
11007
11008 Semantics:
11009 """"""""""
11010
11011 The '``addrspacecast``' instruction converts the pointer value
11012 ``ptrval`` to type ``pty2``. It can be a *no-op cast* or a complex
11013 value modification, depending on the target and the address space
11014 pair. Pointer conversions within the same address space must be
11015 performed with the ``bitcast`` instruction. Note that if the address space
11016 conversion is legal then both result and operand refer to the same memory
11017 location.
11018
11019 Example:
11020 """"""""
11021
11022 .. code-block:: llvm
11023
11024       %X = addrspacecast i32* %x to i32 addrspace(1)*    ; yields i32 addrspace(1)*:%x
11025       %Y = addrspacecast i32 addrspace(1)* %y to i64 addrspace(2)*    ; yields i64 addrspace(2)*:%y
11026       %Z = addrspacecast <4 x i32*> %z to <4 x float addrspace(3)*>   ; yields <4 x float addrspace(3)*>:%z
11027
11028 .. _otherops:
11029
11030 Other Operations
11031 ----------------
11032
11033 The instructions in this category are the "miscellaneous" instructions,
11034 which defy better classification.
11035
11036 .. _i_icmp:
11037
11038 '``icmp``' Instruction
11039 ^^^^^^^^^^^^^^^^^^^^^^
11040
11041 Syntax:
11042 """""""
11043
11044 ::
11045
11046       <result> = icmp <cond> <ty> <op1>, <op2>   ; yields i1 or <N x i1>:result
11047
11048 Overview:
11049 """""""""
11050
11051 The '``icmp``' instruction returns a boolean value or a vector of
11052 boolean values based on comparison of its two integer, integer vector,
11053 pointer, or pointer vector operands.
11054
11055 Arguments:
11056 """"""""""
11057
11058 The '``icmp``' instruction takes three operands. The first operand is
11059 the condition code indicating the kind of comparison to perform. It is
11060 not a value, just a keyword. The possible condition codes are:
11061
11062 #. ``eq``: equal
11063 #. ``ne``: not equal
11064 #. ``ugt``: unsigned greater than
11065 #. ``uge``: unsigned greater or equal
11066 #. ``ult``: unsigned less than
11067 #. ``ule``: unsigned less or equal
11068 #. ``sgt``: signed greater than
11069 #. ``sge``: signed greater or equal
11070 #. ``slt``: signed less than
11071 #. ``sle``: signed less or equal
11072
11073 The remaining two arguments must be of :ref:`integer <t_integer>`,
11074 :ref:`pointer <t_pointer>`, or integer :ref:`vector <t_vector>` type. They
11075 must also have identical types.
11076
11077 Semantics:
11078 """"""""""
11079
11080 The '``icmp``' compares ``op1`` and ``op2`` according to the condition
11081 code given as ``cond``. The comparison performed always yields either an
11082 :ref:`i1 <t_integer>` or vector of ``i1`` result, as follows:
11083
11084 #. ``eq``: yields ``true`` if the operands are equal, ``false``
11085    otherwise. No sign interpretation is necessary or performed.
11086 #. ``ne``: yields ``true`` if the operands are unequal, ``false``
11087    otherwise. No sign interpretation is necessary or performed.
11088 #. ``ugt``: interprets the operands as unsigned values and yields
11089    ``true`` if ``op1`` is greater than ``op2``.
11090 #. ``uge``: interprets the operands as unsigned values and yields
11091    ``true`` if ``op1`` is greater than or equal to ``op2``.
11092 #. ``ult``: interprets the operands as unsigned values and yields
11093    ``true`` if ``op1`` is less than ``op2``.
11094 #. ``ule``: interprets the operands as unsigned values and yields
11095    ``true`` if ``op1`` is less than or equal to ``op2``.
11096 #. ``sgt``: interprets the operands as signed values and yields ``true``
11097    if ``op1`` is greater than ``op2``.
11098 #. ``sge``: interprets the operands as signed values and yields ``true``
11099    if ``op1`` is greater than or equal to ``op2``.
11100 #. ``slt``: interprets the operands as signed values and yields ``true``
11101    if ``op1`` is less than ``op2``.
11102 #. ``sle``: interprets the operands as signed values and yields ``true``
11103    if ``op1`` is less than or equal to ``op2``.
11104
11105 If the operands are :ref:`pointer <t_pointer>` typed, the pointer values
11106 are compared as if they were integers.
11107
11108 If the operands are integer vectors, then they are compared element by
11109 element. The result is an ``i1`` vector with the same number of elements
11110 as the values being compared. Otherwise, the result is an ``i1``.
11111
11112 Example:
11113 """"""""
11114
11115 .. code-block:: text
11116
11117       <result> = icmp eq i32 4, 5          ; yields: result=false
11118       <result> = icmp ne float* %X, %X     ; yields: result=false
11119       <result> = icmp ult i16  4, 5        ; yields: result=true
11120       <result> = icmp sgt i16  4, 5        ; yields: result=false
11121       <result> = icmp ule i16 -4, 5        ; yields: result=false
11122       <result> = icmp sge i16  4, 5        ; yields: result=false
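
The difference between the signed and unsigned condition codes, and the
element-wise vector form, can be sketched as follows (function names are
hypothetical):

.. code-block:: llvm

      define i1 @signedness_matters() {
        %u = icmp ult i8 -1, 1                  ; false: -1 is 255 when read as unsigned
        %s = icmp slt i8 -1, 1                  ; true: -1 < 1 when read as signed
        %differ = xor i1 %u, %s                 ; true
        ret i1 %differ
      }

      define <4 x i1> @lanes_less_than(<4 x i32> %a, <4 x i32> %b) {
        %m = icmp slt <4 x i32> %a, %b          ; one i1 result per lane
        ret <4 x i1> %m
      }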
11123
11124 .. _i_fcmp:
11125
11126 '``fcmp``' Instruction
11127 ^^^^^^^^^^^^^^^^^^^^^^
11128
11129 Syntax:
11130 """""""
11131
11132 ::
11133
11134       <result> = fcmp [fast-math flags]* <cond> <ty> <op1>, <op2>     ; yields i1 or <N x i1>:result
11135
11136 Overview:
11137 """""""""
11138
11139 The '``fcmp``' instruction returns a boolean value or vector of boolean
11140 values based on comparison of its operands.
11141
11142 If the operands are floating-point scalars, then the result type is a
11143 boolean (:ref:`i1 <t_integer>`).
11144
11145 If the operands are floating-point vectors, then the result type is a
11146 vector of boolean with the same number of elements as the operands being
11147 compared.
11148
11149 Arguments:
11150 """"""""""
11151
11152 The '``fcmp``' instruction takes three operands. The first operand is
11153 the condition code indicating the kind of comparison to perform. It is
11154 not a value, just a keyword. The possible condition codes are:
11155
11156 #. ``false``: no comparison, always returns false
11157 #. ``oeq``: ordered and equal
11158 #. ``ogt``: ordered and greater than
11159 #. ``oge``: ordered and greater than or equal
11160 #. ``olt``: ordered and less than
11161 #. ``ole``: ordered and less than or equal
11162 #. ``one``: ordered and not equal
11163 #. ``ord``: ordered (no nans)
11164 #. ``ueq``: unordered or equal
11165 #. ``ugt``: unordered or greater than
11166 #. ``uge``: unordered or greater than or equal
11167 #. ``ult``: unordered or less than
11168 #. ``ule``: unordered or less than or equal
11169 #. ``une``: unordered or not equal
11170 #. ``uno``: unordered (either nans)
11171 #. ``true``: no comparison, always returns true
11172
11173 *Ordered* means that neither operand is a QNAN while *unordered* means
11174 that either operand may be a QNAN.
11175
11176 Each of the ``op1`` and ``op2`` arguments must be of either a :ref:`floating-point
11177 <t_floating>` type or a :ref:`vector <t_vector>` of floating-point type.
11178 They must have identical types.
11179
11180 Semantics:
11181 """"""""""
11182
11183 The '``fcmp``' instruction compares ``op1`` and ``op2`` according to the
11184 condition code given as ``cond``. If the operands are vectors, then the
11185 vectors are compared element by element. Each comparison performed
11186 always yields an :ref:`i1 <t_integer>` result, as follows:
11187
11188 #. ``false``: always yields ``false``, regardless of operands.
11189 #. ``oeq``: yields ``true`` if both operands are not a QNAN and ``op1``
11190    is equal to ``op2``.
11191 #. ``ogt``: yields ``true`` if both operands are not a QNAN and ``op1``
11192    is greater than ``op2``.
11193 #. ``oge``: yields ``true`` if both operands are not a QNAN and ``op1``
11194    is greater than or equal to ``op2``.
11195 #. ``olt``: yields ``true`` if both operands are not a QNAN and ``op1``
11196    is less than ``op2``.
11197 #. ``ole``: yields ``true`` if both operands are not a QNAN and ``op1``
11198    is less than or equal to ``op2``.
11199 #. ``one``: yields ``true`` if both operands are not a QNAN and ``op1``
11200    is not equal to ``op2``.
11201 #. ``ord``: yields ``true`` if both operands are not a QNAN.
11202 #. ``ueq``: yields ``true`` if either operand is a QNAN or ``op1`` is
11203    equal to ``op2``.
11204 #. ``ugt``: yields ``true`` if either operand is a QNAN or ``op1`` is
11205    greater than ``op2``.
11206 #. ``uge``: yields ``true`` if either operand is a QNAN or ``op1`` is
11207    greater than or equal to ``op2``.
11208 #. ``ult``: yields ``true`` if either operand is a QNAN or ``op1`` is
11209    less than ``op2``.
11210 #. ``ule``: yields ``true`` if either operand is a QNAN or ``op1`` is
11211    less than or equal to ``op2``.
11212 #. ``une``: yields ``true`` if either operand is a QNAN or ``op1`` is
11213    not equal to ``op2``.
11214 #. ``uno``: yields ``true`` if either operand is a QNAN.
11215 #. ``true``: always yields ``true``, regardless of operands.
11216
11217 The ``fcmp`` instruction can also optionally take any number of
11218 :ref:`fast-math flags <fastmath>`, which are optimization hints to enable
11219 otherwise unsafe floating-point optimizations.
11220
11221 Any set of fast-math flags is legal on an ``fcmp`` instruction, but the
11222 only flags that have any effect on its semantics are those that allow
11223 assumptions to be made about the values of input arguments; namely
11224 ``nnan``, ``ninf``, and ``reassoc``. See :ref:`fastmath` for more information.
11225
11226 Example:
11227 """"""""
11228
11229 .. code-block:: text
11230
11231       <result> = fcmp oeq float 4.0, 5.0    ; yields: result=false
11232       <result> = fcmp one float 4.0, 5.0    ; yields: result=true
11233       <result> = fcmp olt float 4.0, 5.0    ; yields: result=true
11234       <result> = fcmp ueq double 1.0, 2.0   ; yields: result=false
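
The ordered and unordered predicates differ only when a NaN is involved.
A minimal sketch (function names are illustrative; ``0x7FF8000000000000``
is the IEEE-754 double encoding of a quiet NaN):

.. code-block:: llvm

      define i1 @is_nan(double %x) {
        %r = fcmp uno double %x, %x             ; true only when %x is a NaN
        ret i1 %r
      }

      define i1 @nan_compare() {
        %o = fcmp oeq double 0x7FF8000000000000, 0x7FF8000000000000 ; false: ordered fails on NaN
        %u = fcmp ueq double 0x7FF8000000000000, 0x7FF8000000000000 ; true: unordered succeeds on NaN
        %r = xor i1 %o, %u                      ; true
        ret i1 %r
      }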
11235
11236 .. _i_phi:
11237
11238 '``phi``' Instruction
11239 ^^^^^^^^^^^^^^^^^^^^^
11240
11241 Syntax:
11242 """""""
11243
11244 ::
11245
11246       <result> = phi [fast-math-flags] <ty> [ <val0>, <label0>], ...
11247
11248 Overview:
11249 """""""""
11250
11251 The '``phi``' instruction is used to implement the φ node in the SSA
11252 graph representing the function.
11253
11254 Arguments:
11255 """"""""""
11256
11257 The type of the incoming values is specified with the first type field.
11258 After this, the '``phi``' instruction takes a list of pairs as
11259 arguments, with one pair for each predecessor basic block of the current
11260 block. Only values of :ref:`first class <t_firstclass>` type may be used as
11261 the value arguments to the PHI node. Only labels may be used as the
11262 label arguments.
11263
11264 There must be no non-phi instructions between the start of a basic block
11265 and the PHI instructions: i.e. PHI instructions must be first in a basic
11266 block.
11267
11268 For the purposes of the SSA form, the use of each incoming value is
11269 deemed to occur on the edge from the corresponding predecessor block to
11270 the current block (but after any definition of an '``invoke``'
11271 instruction's return value on the same edge).
11272
11273 The optional ``fast-math-flags`` marker indicates that the phi has one
11274 or more :ref:`fast-math-flags <fastmath>`. These are optimization hints
11275 to enable otherwise unsafe floating-point optimizations. Fast-math-flags
11276 are only valid for phis that return a floating-point scalar or vector
11277 type, or an array (nested to any depth) of floating-point scalar or vector
11278 types.
11279
11280 Semantics:
11281 """"""""""
11282
11283 At runtime, the '``phi``' instruction logically takes on the value
11284 specified by the pair corresponding to the predecessor basic block that
11285 executed just prior to the current block.
11286
11287 Example:
11288 """"""""
11289
11290 .. code-block:: llvm
11291
11292     Loop:       ; Infinite loop that counts from 0 on up...
11293       %indvar = phi i32 [ 0, %LoopHeader ], [ %nextindvar, %Loop ]
11294       %nextindvar = add i32 %indvar, 1
11295       br label %Loop
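
A ``phi`` is also what merges values at the join point of a conditional.
The following sketch (with hypothetical function and label names) has one
incoming pair per predecessor of ``merge``:

.. code-block:: llvm

      define i32 @abs_diff(i32 %a, i32 %b) {
      entry:
        %cmp = icmp sgt i32 %a, %b
        br i1 %cmp, label %if.then, label %if.else

      if.then:
        %d1 = sub i32 %a, %b
        br label %merge

      if.else:
        %d2 = sub i32 %b, %a
        br label %merge

      merge:
        ; %d is %d1 when control arrives from %if.then, %d2 when from %if.else
        %d = phi i32 [ %d1, %if.then ], [ %d2, %if.else ]
        ret i32 %d
      }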
11296
11297 .. _i_select:
11298
11299 '``select``' Instruction
11300 ^^^^^^^^^^^^^^^^^^^^^^^^
11301
11302 Syntax:
11303 """""""
11304
11305 ::
11306
11307       <result> = select [fast-math flags] selty <cond>, <ty> <val1>, <ty> <val2>             ; yields ty
11308
11309       selty is either i1 or {<N x i1>}
11310
11311 Overview:
11312 """""""""
11313
11314 The '``select``' instruction is used to choose one value based on a
11315 condition, without IR-level branching.
11316
11317 Arguments:
11318 """"""""""
11319
11320 The '``select``' instruction requires an 'i1' value or a vector of 'i1'
11321 values indicating the condition, and two values of the same :ref:`first
11322 class <t_firstclass>` type.
11323
11324 #. The optional ``fast-math flags`` marker indicates that the select has one or more
11325    :ref:`fast-math flags <fastmath>`. These are optimization hints to enable
11326    otherwise unsafe floating-point optimizations. Fast-math flags are only valid
11327    for selects that return a floating-point scalar or vector type, or an array
11328    (nested to any depth) of floating-point scalar or vector types.
11329
11330 Semantics:
11331 """"""""""
11332
11333 If the condition is an i1 and it evaluates to 1, the instruction returns
11334 the first value argument; otherwise, it returns the second value
11335 argument.
11336
11337 If the condition is a vector of i1, then the value arguments must be
11338 vectors of the same size, and the selection is done element by element.
11339
11340 If the condition is an i1 and the value arguments are vectors of the
11341 same size, then an entire vector is selected.
11342
11343 Example:
11344 """"""""
11345
11346 .. code-block:: llvm
11347
11348       %X = select i1 true, i8 17, i8 42          ; yields i8:17
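
The scalar and vector forms of the condition can be sketched as follows
(function names are illustrative only):

.. code-block:: llvm

      define i32 @smax(i32 %a, i32 %b) {
        %c = icmp sgt i32 %a, %b
        %m = select i1 %c, i32 %a, i32 %b       ; %a if %c is true, otherwise %b
        ret i32 %m
      }

      define <2 x i32> @blend(<2 x i1> %mask, <2 x i32> %x, <2 x i32> %y) {
        ; lane i of the result is lane i of %x where %mask is true, else lane i of %y
        %r = select <2 x i1> %mask, <2 x i32> %x, <2 x i32> %y
        ret <2 x i32> %r
      }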
11349
11350
11351 .. _i_freeze:
11352
11353 '``freeze``' Instruction
11354 ^^^^^^^^^^^^^^^^^^^^^^^^
11355
11356 Syntax:
11357 """""""
11358
11359 ::
11360
11361       <result> = freeze ty <val>    ; yields ty:result
11362
11363 Overview:
11364 """""""""
11365
11366 The '``freeze``' instruction is used to stop propagation of
11367 :ref:`undef <undefvalues>` and :ref:`poison <poisonvalues>` values.
11368
11369 Arguments:
11370 """"""""""
11371
11372 The '``freeze``' instruction takes a single argument.
11373
11374 Semantics:
11375 """"""""""
11376
11377 If the argument is ``undef`` or ``poison``, '``freeze``' returns an
11378 arbitrary, but fixed, value of type '``ty``'.
11379 Otherwise, this instruction is a no-op and returns the input argument.
11380 All uses of a value returned by the same '``freeze``' instruction are
11381 guaranteed to always observe the same value, while different '``freeze``'
11382 instructions may yield different values.
11383
11384 While ``undef`` and ``poison`` pointers can be frozen, the result is a
11385 non-dereferenceable pointer. See the
11386 :ref:`Pointer Aliasing Rules <pointeraliasing>` section for more information.
11387 If an aggregate value or vector is frozen, the operand is frozen element-wise.
11388 The padding of an aggregate isn't considered, since it isn't visible
11389 without storing it into memory and loading it with a different type.
11390
11391
11392 Example:
11393 """"""""
11394
11395 .. code-block:: text
11396
11397       %w = i32 undef
11398       %x = freeze i32 %w
11399       %y = add i32 %w, %w         ; undef
11400       %z = add i32 %x, %x         ; even number because all uses of %x observe
11401                                   ; the same value
11402       %x2 = freeze i32 %w
11403       %cmp = icmp eq i32 %x, %x2  ; can be true or false
11404
11405       ; example with vectors
11406       %v = <2 x i32> <i32 undef, i32 poison>
11407       %a = extractelement <2 x i32> %v, i32 0    ; undef
11408       %b = extractelement <2 x i32> %v, i32 1    ; poison
11409       %add = add i32 %a, %a                      ; undef
11410
11411       %v.fr = freeze <2 x i32> %v                ; element-wise freeze
11412       %d = extractelement <2 x i32> %v.fr, i32 0 ; not undef
11413       %add.f = add i32 %d, %d                    ; even number
11414
11415       ; branching on frozen value
11416       %poison = add nsw i1 %k, undef   ; poison
11417       %c = freeze i1 %poison
11418       br i1 %c, label %foo, label %bar ; non-deterministic branch to %foo or %bar
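
The last case in the example above, branching on a frozen value, can be
written as a complete function. This is a sketch with hypothetical names;
``%c`` stands for a condition that may be ``undef`` or ``poison``:

.. code-block:: llvm

      define i32 @pick(i1 %c, i32 %a, i32 %b) {
      entry:
        %c.fr = freeze i1 %c                    ; arbitrary but fixed if %c is undef or poison
        br i1 %c.fr, label %first, label %second

      first:
        ret i32 %a

      second:
        ret i32 %b
      }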
11419
11420
11421 .. _i_call:
11422
11423 '``call``' Instruction
11424 ^^^^^^^^^^^^^^^^^^^^^^
11425
11426 Syntax:
11427 """""""
11428
11429 ::
11430
11431       <result> = [tail | musttail | notail ] call [fast-math flags] [cconv] [ret attrs] [addrspace(<num>)]
11432                  <ty>|<fnty> <fnptrval>(<function args>) [fn attrs] [ operand bundles ]
11433
11434 Overview:
11435 """""""""
11436
11437 The '``call``' instruction represents a simple function call.
11438
11439 Arguments:
11440 """"""""""
11441
11442 This instruction requires several arguments:
11443
11444 #. The optional ``tail`` and ``musttail`` markers indicate that the optimizers
11445    should perform tail call optimization. The ``tail`` marker is a hint that
11446    `can be ignored <CodeGenerator.html#sibcallopt>`_. The ``musttail`` marker
11447    means that the call must be tail call optimized in order for the program to
11448    be correct. The ``musttail`` marker provides these guarantees:
11449
11450    #. The call will not cause unbounded stack growth if it is part of a
11451       recursive cycle in the call graph.
11452    #. Arguments with the :ref:`inalloca <attr_inalloca>` or
11453       :ref:`preallocated <attr_preallocated>` attribute are forwarded in place.
11454    #. If the musttail call appears in a function with the ``"thunk"`` attribute
11455       and the caller and callee both have varargs, then any unprototyped
11456       arguments in register or memory are forwarded to the callee. Similarly,
11457       the return value of the callee is returned to the caller's caller, even
11458       if a void return type is in use.
11459
11460    Both markers imply that the callee does not access allocas from the caller.
11461    The ``tail`` marker additionally implies that the callee does not access
11462    varargs from the caller. Calls marked ``musttail`` must obey the following
11463    additional rules:
11464
11465    - The call must immediately precede a :ref:`ret <i_ret>` instruction,
11466      or a pointer bitcast followed by a ret instruction.
11467    - The ret instruction must return the (possibly bitcasted) value
11468      produced by the call, undef, or void.
11469    - The calling conventions of the caller and callee must match.
11470    - The callee must be varargs iff the caller is varargs. Bitcasting a
11471      non-varargs function to the appropriate varargs type is legal so
11472      long as the non-varargs prefixes obey the other rules.
11473    - The return type must not undergo automatic conversion to an ``sret`` pointer.
11474
11475    In addition, if the calling convention is not ``swifttailcc`` or ``tailcc``:
11476
11477    - All ABI-impacting function attributes, such as sret, byval, inreg,
11478      returned, and inalloca, must match.
11479    - The caller and callee prototypes must match. Pointer types of parameters
11480      or return types may differ in pointee type, but not in address space.
11481
11482    On the other hand, if the calling convention is ``swifttailcc`` or ``tailcc``:
11483
11484    - Only these ABI-impacting attributes are allowed: sret, byval,
11485      swiftself, and swiftasync.
11486    - Prototypes are not required to match.
11487
11488    Tail call optimization for calls marked ``tail`` is guaranteed to occur if
11489    the following conditions are met:
11490
11491    -  Caller and callee both have the calling convention ``fastcc`` or ``tailcc``.
11492    -  The call is in tail position (ret immediately follows call and ret
11493       uses value of call or is void).
11494    -  Option ``-tailcallopt`` is enabled,
11495       ``llvm::GuaranteedTailCallOpt`` is ``true``, or the calling convention
11496       is ``tailcc``
11497    -  `Platform-specific constraints are
11498       met. <CodeGenerator.html#tailcallopt>`_
11499
11500 #. The optional ``notail`` marker indicates that the optimizers should not add
11501    ``tail`` or ``musttail`` markers to the call. It is used to prevent tail
11502    call optimization from being performed on the call.
11503
11504 #. The optional ``fast-math flags`` marker indicates that the call has one or more
11505    :ref:`fast-math flags <fastmath>`, which are optimization hints to enable
11506    otherwise unsafe floating-point optimizations. Fast-math flags are only valid
11507    for calls that return a floating-point scalar or vector type, or an array
11508    (nested to any depth) of floating-point scalar or vector types.
11509
11510 #. The optional "cconv" marker indicates which :ref:`calling
11511    convention <callingconv>` the call should use. If none is
11512    specified, the call defaults to using C calling conventions. The
11513    calling convention of the call must match the calling convention of
11514    the target function, or else the behavior is undefined.
11515 #. The optional :ref:`Parameter Attributes <paramattrs>` list for return
11516    values. Only '``zeroext``', '``signext``', and '``inreg``' attributes
11517    are valid here.
11518 #. The optional addrspace attribute can be used to indicate the address space
11519    of the called function. If it is not specified, the program address space
11520    from the :ref:`datalayout string<langref_datalayout>` will be used.
11521 #. '``ty``': the type of the call instruction itself which is also the
11522    type of the return value. Functions that return no value are marked
11523    ``void``.
11524 #. '``fnty``': shall be the signature of the function being called. The
11525    argument types must match the types implied by this signature. This
11526    type can be omitted if the function is not varargs.
11527 #. '``fnptrval``': An LLVM value containing a pointer to a function to
11528    be called. In most cases, this is a direct function call, but
11529    indirect ``call``'s are just as possible, calling an arbitrary pointer
11530    to function value.
11531 #. '``function args``': argument list whose types match the function
11532    signature argument types and parameter attributes. All arguments must
11533    be of :ref:`first class <t_firstclass>` type. If the function signature
11534    indicates the function accepts a variable number of arguments, the
11535    extra arguments can be specified.
11536 #. The optional :ref:`function attributes <fnattrs>` list.
11537 #. The optional :ref:`operand bundles <opbundles>` list.
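
A minimal sketch of a call that satisfies the ``musttail`` rules listed
under the first item (``@callee`` and ``@caller`` are hypothetical names).
Both functions use the default C calling convention and have matching
prototypes, and the call is immediately followed by a ``ret`` of its
result:

.. code-block:: llvm

      declare i32 @callee(i32)

      define i32 @caller(i32 %x) {
        %r = musttail call i32 @callee(i32 %x)  ; same prototype and calling convention
        ret i32 %r                              ; ret directly follows and returns %r
      }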
11538
11539 Semantics:
11540 """"""""""
11541
11542 The '``call``' instruction is used to cause control flow to transfer to
11543 a specified function, with its incoming arguments bound to the specified
11544 values. Upon a '``ret``' instruction in the called function, control
11545 flow continues with the instruction after the function call, and the
11546 return value of the function is bound to the result argument.
11547
11548 Example:
11549 """"""""
11550
11551 .. code-block:: llvm
11552
      %retval = call i32 @test(i32 %argc)
      call i32 (i8*, ...) @printf(i8* %msg, i32 12, i8 42)         ; yields i32
      %X = tail call i32 @foo()                                    ; yields i32
      %Y = tail call fastcc i32 @foo()                             ; yields i32
      call void %foo(i8 signext 97)

      %struct.A = type { i32, i8 }
      %r = call %struct.A @foo()                        ; yields { i32, i8 }
      %gr = extractvalue %struct.A %r, 0                ; yields i32
      %gr1 = extractvalue %struct.A %r, 1               ; yields i8
      call void @foo() noreturn                         ; indicates that @foo never returns normally
      %ZZ = call zeroext i32 @bar()                     ; return value is zero extended
11565
LLVM treats calls to some functions with names and arguments that match
11567 the standard C99 library as being the C99 library functions, and may
11568 perform optimizations or generate code for them under that assumption.
11569 This is something we'd like to change in the future to provide better
11570 support for freestanding environments and non-C-based languages.
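
To illustrate, the sketch below declares a function whose name and signature
match the C library's ``strlen`` and calls it on a constant string. Under the
assumption described above, an optimizer may replace the call with the
constant ``5``. Whether it does depends on the passes that run and on the
target's library information. The names ``@.str`` and ``@len`` are
illustrative only.

.. code-block:: llvm

      @.str = private unnamed_addr constant [6 x i8] c"hello\00"

      ; Same name and signature as the C library function.
      declare i64 @strlen(i8*)

      define i64 @len() {
      entry:
        %p = getelementptr inbounds [6 x i8], [6 x i8]* @.str, i64 0, i64 0
        ; May be folded to 'i64 5' if treated as the C library strlen.
        %n = call i64 @strlen(i8* %p)
        ret i64 %n
      }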
11571
11572 .. _i_va_arg:
11573
11574 '``va_arg``' Instruction
11575 ^^^^^^^^^^^^^^^^^^^^^^^^
11576
11577 Syntax:
11578 """""""
11579
11580 ::
11581
11582       <resultval> = va_arg <va_list*> <arglist>, <argty>
11583
11584 Overview:
11585 """""""""
11586
11587 The '``va_arg``' instruction is used to access arguments passed through
11588 the "variable argument" area of a function call. It is used to implement
11589 the ``va_arg`` macro in C.
11590
11591 Arguments:
11592 """"""""""
11593
11594 This instruction takes a ``va_list*`` value and the type of the
11595 argument. It returns a value of the specified argument type and
11596 increments the ``va_list`` to point to the next argument. The actual
11597 type of ``va_list`` is target specific.
11598
11599 Semantics:
11600 """"""""""
11601
11602 The '``va_arg``' instruction loads an argument of the specified type
11603 from the specified ``va_list`` and causes the ``va_list`` to point to
11604 the next argument. For more information, see the variable argument
11605 handling :ref:`Intrinsic Functions <int_varargs>`.
11606
11607 It is legal for this instruction to be called in a function which does
11608 not take a variable number of arguments, for example, the ``vfprintf``
11609 function.
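
A minimal sketch of that case follows: a non-varargs helper that receives an
already initialized ``va_list`` by pointer, much as ``vfprintf`` does, and
reads one ``i32`` from it. The function name ``@next_int`` is an illustrative
assumption, and the ``i8*`` operand type follows the example in the variable
argument handling section.

.. code-block:: llvm

      ; Not a varargs function itself; the caller owns the va_list.
      define i32 @next_int(i8* %ap) {
      entry:
        %v = va_arg i8* %ap, i32
        ret i32 %v
      }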
11610
11611 ``va_arg`` is an LLVM instruction instead of an :ref:`intrinsic
11612 function <intrinsics>` because it takes a type as an argument.
11613
11614 Example:
11615 """"""""
11616
11617 See the :ref:`variable argument processing <int_varargs>` section.
11618
Note that the code generator does not yet fully support ``va_arg`` on many
targets. Also, it does not currently support ``va_arg`` with aggregate
types on any target.
11622
11623 .. _i_landingpad:
11624
11625 '``landingpad``' Instruction
11626 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
11627
11628 Syntax:
11629 """""""
11630
11631 ::
11632
11633       <resultval> = landingpad <resultty> <clause>+
11634       <resultval> = landingpad <resultty> cleanup <clause>*
11635
11636       <clause> := catch <type> <value>
11637       <clause> := filter <array constant type> <array constant>
11638
11639 Overview:
11640 """""""""
11641
11642 The '``landingpad``' instruction is used by `LLVM's exception handling
11643 system <ExceptionHandling.html#overview>`_ to specify that a basic block
11644 is a landing pad --- one where the exception lands, and corresponds to the
11645 code found in the ``catch`` portion of a ``try``/``catch`` sequence. It
11646 defines values supplied by the :ref:`personality function <personalityfn>` upon
11647 re-entry to the function. The ``resultval`` has the type ``resultty``.
11648
11649 Arguments:
11650 """"""""""
11651
The optional ``cleanup`` flag indicates that the landing pad block is a
cleanup.
11654
11655 A ``clause`` begins with the clause type --- ``catch`` or ``filter`` --- and
11656 contains the global variable representing the "type" that may be caught
11657 or filtered respectively. Unlike the ``catch`` clause, the ``filter``
11658 clause takes an array constant as its argument. Use
11659 "``[0 x i8**] undef``" for a filter which cannot throw. The
11660 '``landingpad``' instruction must contain *at least* one ``clause`` or
11661 the ``cleanup`` flag.
11662
11663 Semantics:
11664 """"""""""
11665
11666 The '``landingpad``' instruction defines the values which are set by the
11667 :ref:`personality function <personalityfn>` upon re-entry to the function, and
11668 therefore the "result type" of the ``landingpad`` instruction. As with
11669 calling conventions, how the personality function results are
11670 represented in LLVM IR is target specific.
11671
11672 The clauses are applied in order from top to bottom. If two
11673 ``landingpad`` instructions are merged together through inlining, the
11674 clauses from the calling function are appended to the list of clauses.
11675 When the call stack is being unwound due to an exception being thrown,
11676 the exception is compared against each ``clause`` in turn. If it doesn't
11677 match any of the clauses, and the ``cleanup`` flag is not set, then
11678 unwinding continues further up the call stack.
11679
11680 The ``landingpad`` instruction has several restrictions:
11681
11682 -  A landing pad block is a basic block which is the unwind destination
11683    of an '``invoke``' instruction.
11684 -  A landing pad block must have a '``landingpad``' instruction as its
11685    first non-PHI instruction.
11686 -  There can be only one '``landingpad``' instruction within the landing
11687    pad block.
11688 -  A basic block that is not a landing pad block may not include a
11689    '``landingpad``' instruction.
11690
11691 Example:
11692 """"""""
11693
11694 .. code-block:: llvm
11695
11696       ;; A landing pad which can catch an integer.
11697       %res = landingpad { i8*, i32 }
11698                catch i8** @_ZTIi
11699       ;; A landing pad that is a cleanup.
11700       %res = landingpad { i8*, i32 }
11701                cleanup
11702       ;; A landing pad which can catch an integer and can only throw a double.
11703       %res = landingpad { i8*, i32 }
11704                catch i8** @_ZTIi
11705                filter [1 x i8**] [@_ZTId]
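
The fragments above omit their surrounding function. As a sketch of how the
restrictions fit together, the following function uses a cleanup landing pad
as the unwind destination of an ``invoke`` and then continues unwinding with
``resume``. The Itanium C++ personality ``@__gxx_personality_v0`` is assumed,
and ``@may_throw``, ``@release``, and ``@guarded`` are hypothetical names.

.. code-block:: llvm

      declare void @may_throw()
      declare void @release()
      declare i32 @__gxx_personality_v0(...)

      define void @guarded() personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*) {
      entry:
        invoke void @may_throw()
                to label %done unwind label %lpad

      lpad:
        ; The landingpad is the first non-PHI instruction of the unwind
        ; destination.
        %res = landingpad { i8*, i32 }
                 cleanup
        call void @release()
        ; No clause matched here, so hand the exception back to the unwinder.
        resume { i8*, i32 } %res

      done:
        ret void
      }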
11706
11707 .. _i_catchpad:
11708
11709 '``catchpad``' Instruction
11710 ^^^^^^^^^^^^^^^^^^^^^^^^^^
11711
11712 Syntax:
11713 """""""
11714
11715 ::
11716
11717       <resultval> = catchpad within <catchswitch> [<args>*]
11718
11719 Overview:
11720 """""""""
11721
11722 The '``catchpad``' instruction is used by `LLVM's exception handling
11723 system <ExceptionHandling.html#overview>`_ to specify that a basic block
11724 begins a catch handler --- one where a personality routine attempts to transfer
11725 control to catch an exception.
11726
11727 Arguments:
11728 """"""""""
11729
11730 The ``catchswitch`` operand must always be a token produced by a
11731 :ref:`catchswitch <i_catchswitch>` instruction in a predecessor block. This
11732 ensures that each ``catchpad`` has exactly one predecessor block, and it always
11733 terminates in a ``catchswitch``.
11734
11735 The ``args`` correspond to whatever information the personality routine
11736 requires to know if this is an appropriate handler for the exception. Control
11737 will transfer to the ``catchpad`` if this is the first appropriate handler for
11738 the exception.
11739
11740 The ``resultval`` has the type :ref:`token <t_token>` and is used to match the
11741 ``catchpad`` to corresponding :ref:`catchrets <i_catchret>` and other nested EH
11742 pads.
11743
11744 Semantics:
11745 """"""""""
11746
11747 When the call stack is being unwound due to an exception being thrown, the
11748 exception is compared against the ``args``. If it doesn't match, control will
11749 not reach the ``catchpad`` instruction.  The representation of ``args`` is
11750 entirely target and personality function-specific.
11751
11752 Like the :ref:`landingpad <i_landingpad>` instruction, the ``catchpad``
11753 instruction must be the first non-phi of its parent basic block.
11754
11755 The meaning of the tokens produced and consumed by ``catchpad`` and other "pad"
11756 instructions is described in the
11757 `Windows exception handling documentation\ <ExceptionHandling.html#wineh>`_.
11758
11759 When a ``catchpad`` has been "entered" but not yet "exited" (as
11760 described in the `EH documentation\ <ExceptionHandling.html#wineh-constraints>`_),
11761 it is undefined behavior to execute a :ref:`call <i_call>` or :ref:`invoke <i_invoke>`
11762 that does not carry an appropriate :ref:`"funclet" bundle <ob_funclet>`.
11763
11764 Example:
11765 """"""""
11766
11767 .. code-block:: text
11768
11769     dispatch:
11770       %cs = catchswitch within none [label %handler0] unwind to caller
11771       ;; A catch block which can catch an integer.
11772     handler0:
11773       %tok = catchpad within %cs [i8** @_ZTIi]
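
A fuller sketch of the same pattern follows, showing the ``"funclet"`` bundle
on a call made inside the handler and the matching ``catchret``. It assumes
the MSVC C++ personality ``@__CxxFrameHandler3`` and the catch-all argument
list commonly emitted for it. The functions ``@may_throw``, ``@log_caught``,
and ``@f`` are hypothetical.

.. code-block:: text

    declare void @may_throw()
    declare void @log_caught()
    declare i32 @__CxxFrameHandler3(...)

    define void @f() personality i8* bitcast (i32 (...)* @__CxxFrameHandler3 to i8*) {
    entry:
      invoke void @may_throw()
              to label %done unwind label %dispatch

    dispatch:
      %cs = catchswitch within none [label %handler] unwind to caller

    handler:
      ;; The args are interpreted by the personality (assumed catch-all here).
      %tok = catchpad within %cs [i8* null, i32 64, i8* null]
      ;; Calls made while the pad is entered carry a "funclet" bundle.
      call void @log_caught() [ "funclet"(token %tok) ]
      catchret from %tok to label %done

    done:
      ret void
    }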
11774
11775 .. _i_cleanuppad:
11776
11777 '``cleanuppad``' Instruction
11778 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
11779
11780 Syntax:
11781 """""""
11782
11783 ::
11784
11785       <resultval> = cleanuppad within <parent> [<args>*]
11786
11787 Overview:
11788 """""""""
11789
11790 The '``cleanuppad``' instruction is used by `LLVM's exception handling
11791 system <ExceptionHandling.html#overview>`_ to specify that a basic block
11792 is a cleanup block --- one where a personality routine attempts to
11793 transfer control to run cleanup actions.
11794 The ``args`` correspond to whatever additional
11795 information the :ref:`personality function <personalityfn>` requires to
11796 execute the cleanup.
11797 The ``resultval`` has the type :ref:`token <t_token>` and is used to
11798 match the ``cleanuppad`` to corresponding :ref:`cleanuprets <i_cleanupret>`.
11799 The ``parent`` argument is the token of the funclet that contains the
11800 ``cleanuppad`` instruction. If the ``cleanuppad`` is not inside a funclet,
11801 this operand may be the token ``none``.
11802
11803 Arguments:
11804 """"""""""
11805
11806 The instruction takes a list of arbitrary values which are interpreted
11807 by the :ref:`personality function <personalityfn>`.
11808
11809 Semantics:
11810 """"""""""
11811
11812 When the call stack is being unwound due to an exception being thrown,
11813 the :ref:`personality function <personalityfn>` transfers control to the
11814 ``cleanuppad`` with the aid of the personality-specific arguments.
11815 As with calling conventions, how the personality function results are
11816 represented in LLVM IR is target specific.
11817
11818 The ``cleanuppad`` instruction has several restrictions:
11819
11820 -  A cleanup block is a basic block which is the unwind destination of
11821    an exceptional instruction.
11822 -  A cleanup block must have a '``cleanuppad``' instruction as its
11823    first non-PHI instruction.
11824 -  There can be only one '``cleanuppad``' instruction within the
11825    cleanup block.
11826 -  A basic block that is not a cleanup block may not include a
11827    '``cleanuppad``' instruction.
11828
11829 When a ``cleanuppad`` has been "entered" but not yet "exited" (as
11830 described in the `EH documentation\ <ExceptionHandling.html#wineh-constraints>`_),
11831 it is undefined behavior to execute a :ref:`call <i_call>` or :ref:`invoke <i_invoke>`
11832 that does not carry an appropriate :ref:`"funclet" bundle <ob_funclet>`.
11833
11834 Example:
11835 """"""""
11836
11837 .. code-block:: text
11838
11839       %tok = cleanuppad within %cs []
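
As a sketch in context, the function below uses a ``cleanuppad`` that is not
nested inside another funclet, so its parent is ``none``. The cleanup calls a
function with the required ``"funclet"`` bundle and exits through
``cleanupret``. The personality ``@__CxxFrameHandler3`` is assumed, and
``@may_throw``, ``@release``, and ``@g`` are hypothetical names.

.. code-block:: text

    declare void @may_throw()
    declare void @release()
    declare i32 @__CxxFrameHandler3(...)

    define void @g() personality i8* bitcast (i32 (...)* @__CxxFrameHandler3 to i8*) {
    entry:
      invoke void @may_throw()
              to label %done unwind label %cleanup

    cleanup:
      ;; First non-PHI instruction of the unwind destination.
      %cp = cleanuppad within none []
      call void @release() [ "funclet"(token %cp) ]
      ;; Exit the cleanup and continue unwinding into the caller.
      cleanupret from %cp unwind to caller

    done:
      ret void
    }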
11840
11841 .. _intrinsics:
11842
11843 Intrinsic Functions
11844 ===================
11845
11846 LLVM supports the notion of an "intrinsic function". These functions
11847 have well known names and semantics and are required to follow certain
11848 restrictions. Overall, these intrinsics represent an extension mechanism
11849 for the LLVM language that does not require changing all of the
11850 transformations in LLVM when adding to the language (or the bitcode
11851 reader/writer, the parser, etc...).
11852
11853 Intrinsic function names must all start with an "``llvm.``" prefix. This
11854 prefix is reserved in LLVM for intrinsic names; thus, function names may
11855 not begin with this prefix. Intrinsic functions must always be external
11856 functions: you cannot define the body of intrinsic functions. Intrinsic
11857 functions may only be used in call or invoke instructions: it is illegal
11858 to take the address of an intrinsic function. Additionally, because
intrinsic functions are part of the LLVM language, any that are added
must be documented here.
11861
11862 Some intrinsic functions can be overloaded, i.e., the intrinsic
11863 represents a family of functions that perform the same operation but on
11864 different data types. Because LLVM can represent over 8 million
11865 different integer types, overloading is used commonly to allow an
11866 intrinsic function to operate on any integer type. One or more of the
11867 argument types or the result type can be overloaded to accept any
11868 integer type. Argument types may also be defined as exactly matching a
11869 previous argument's type or the result type. This allows an intrinsic
11870 function which accepts multiple arguments, but needs all of them to be
11871 of the same type, to only be overloaded with respect to a single
11872 argument or the result.
11873
Overloaded intrinsics will have the names of their overloaded argument
types encoded into their function names, each preceded by a period. Only
11876 those types which are overloaded result in a name suffix. Arguments
11877 whose type is matched against another type do not. For example, the
11878 ``llvm.ctpop`` function can take an integer of any width and returns an
11879 integer of exactly the same integer width. This leads to a family of
11880 functions such as ``i8 @llvm.ctpop.i8(i8 %val)`` and
11881 ``i29 @llvm.ctpop.i29(i29 %val)``. Only one type, the return type, is
11882 overloaded, and only one type suffix is required. Because the argument's
11883 type is matched against the return type, it does not require its own
11884 name suffix.
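
The naming scheme can be sketched with a few declarations. The ``llvm.ctpop``
lines show one suffix for the single overloaded type. The ``llvm.memcpy``
line is included as an example of an intrinsic with several overloaded types,
each contributing its own suffix. The function ``@popcount8`` is an
illustrative name.

.. code-block:: llvm

      ; One overloaded type, one suffix.
      declare i8 @llvm.ctpop.i8(i8)
      declare i29 @llvm.ctpop.i29(i29)
      declare <4 x i32> @llvm.ctpop.v4i32(<4 x i32>)

      ; Three overloaded types (destination, source, length), three suffixes.
      declare void @llvm.memcpy.p0i8.p0i8.i64(i8*, i8*, i64, i1)

      define i8 @popcount8(i8 %x) {
      entry:
        %n = call i8 @llvm.ctpop.i8(i8 %x)
        ret i8 %n
      }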
11885
:ref:`Unnamed types <t_opaque>` are encoded as ``s_s``. Overloaded intrinsics
that depend on an unnamed type in one of their overloaded argument types get an
11888 additional ``.<number>`` suffix. This allows differentiating intrinsics with
11889 different unnamed types as arguments. (For example:
11890 ``llvm.ssa.copy.p0s_s.2(%42*)``) The number is tracked in the LLVM module and
11891 it ensures unique names in the module. While linking together two modules, it is
11892 still possible to get a name clash. In that case one of the names will be
11893 changed by getting a new number.
11894
11895 For target developers who are defining intrinsics for back-end code
generation, any intrinsic overloads based solely on the distinction between
11897 integer or floating point types should not be relied upon for correct
11898 code generation. In such cases, the recommended approach for target
11899 maintainers when defining intrinsics is to create separate integer and
11900 FP intrinsics rather than rely on overloading. For example, if different
11901 codegen is required for ``llvm.target.foo(<4 x i32>)`` and
11902 ``llvm.target.foo(<4 x float>)`` then these should be split into
11903 different intrinsics.
11904
11905 To learn how to add an intrinsic function, please see the `Extending
11906 LLVM Guide <ExtendingLLVM.html>`_.
11907
11908 .. _int_varargs:
11909
11910 Variable Argument Handling Intrinsics
11911 -------------------------------------
11912
11913 Variable argument support is defined in LLVM with the
11914 :ref:`va_arg <i_va_arg>` instruction and these three intrinsic
11915 functions. These functions are related to the similarly named macros
11916 defined in the ``<stdarg.h>`` header file.
11917
11918 All of these functions operate on arguments that use a target-specific
11919 value type "``va_list``". The LLVM assembly language reference manual
11920 does not define what this type is, so all transformations should be
11921 prepared to handle these functions regardless of the type used.
11922
11923 This example shows how the :ref:`va_arg <i_va_arg>` instruction and the
11924 variable argument handling intrinsic functions are used.
11925
11926 .. code-block:: llvm
11927
11928     ; This struct is different for every platform. For most platforms,
11929     ; it is merely an i8*.
11930     %struct.va_list = type { i8* }
11931
11932     ; For Unix x86_64 platforms, va_list is the following struct:
11933     ; %struct.va_list = type { i32, i32, i8*, i8* }
11934
11935     define i32 @test(i32 %X, ...) {
11936       ; Initialize variable argument processing
11937       %ap = alloca %struct.va_list
11938       %ap2 = bitcast %struct.va_list* %ap to i8*
11939       call void @llvm.va_start(i8* %ap2)
11940
11941       ; Read a single integer argument
11942       %tmp = va_arg i8* %ap2, i32
11943
11944       ; Demonstrate usage of llvm.va_copy and llvm.va_end
11945       %aq = alloca i8*
11946       %aq2 = bitcast i8** %aq to i8*
11947       call void @llvm.va_copy(i8* %aq2, i8* %ap2)
11948       call void @llvm.va_end(i8* %aq2)
11949
11950       ; Stop processing of arguments.
11951       call void @llvm.va_end(i8* %ap2)
11952       ret i32 %tmp
11953     }
11954
11955     declare void @llvm.va_start(i8*)
11956     declare void @llvm.va_copy(i8*, i8*)
11957     declare void @llvm.va_end(i8*)
11958
11959 .. _int_va_start:
11960
11961 '``llvm.va_start``' Intrinsic
11962 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
11963
11964 Syntax:
11965 """""""
11966
11967 ::
11968
11969       declare void @llvm.va_start(i8* <arglist>)
11970
11971 Overview:
11972 """""""""
11973
11974 The '``llvm.va_start``' intrinsic initializes ``*<arglist>`` for
11975 subsequent use by ``va_arg``.
11976
11977 Arguments:
11978 """"""""""
11979
11980 The argument is a pointer to a ``va_list`` element to initialize.
11981
11982 Semantics:
11983 """"""""""
11984
11985 The '``llvm.va_start``' intrinsic works just like the ``va_start`` macro
11986 available in C. In a target-dependent way, it initializes the
11987 ``va_list`` element to which the argument points, so that the next call
11988 to ``va_arg`` will produce the first variable argument passed to the
11989 function. Unlike the C ``va_start`` macro, this intrinsic does not need
11990 to know the last argument of the function as the compiler can figure
11991 that out.
11992
11993 '``llvm.va_end``' Intrinsic
11994 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
11995
11996 Syntax:
11997 """""""
11998
11999 ::
12000
12001       declare void @llvm.va_end(i8* <arglist>)
12002
12003 Overview:
12004 """""""""
12005
12006 The '``llvm.va_end``' intrinsic destroys ``*<arglist>``, which has been
12007 initialized previously with ``llvm.va_start`` or ``llvm.va_copy``.
12008
12009 Arguments:
12010 """"""""""
12011
12012 The argument is a pointer to a ``va_list`` to destroy.
12013
12014 Semantics:
12015 """"""""""
12016
12017 The '``llvm.va_end``' intrinsic works just like the ``va_end`` macro
12018 available in C. In a target-dependent way, it destroys the ``va_list``
12019 element to which the argument points. Calls to
12020 :ref:`llvm.va_start <int_va_start>` and
12021 :ref:`llvm.va_copy <int_va_copy>` must be matched exactly with calls to
12022 ``llvm.va_end``.
12023
12024 .. _int_va_copy:
12025
12026 '``llvm.va_copy``' Intrinsic
12027 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12028
12029 Syntax:
12030 """""""
12031
12032 ::
12033
12034       declare void @llvm.va_copy(i8* <destarglist>, i8* <srcarglist>)
12035
12036 Overview:
12037 """""""""
12038
12039 The '``llvm.va_copy``' intrinsic copies the current argument position
12040 from the source argument list to the destination argument list.
12041
12042 Arguments:
12043 """"""""""
12044
12045 The first argument is a pointer to a ``va_list`` element to initialize.
12046 The second argument is a pointer to a ``va_list`` element to copy from.
12047
12048 Semantics:
12049 """"""""""
12050
12051 The '``llvm.va_copy``' intrinsic works just like the ``va_copy`` macro
12052 available in C. In a target-dependent way, it copies the source
12053 ``va_list`` element into the destination ``va_list`` element. This
intrinsic is necessary because the ``llvm.va_start`` intrinsic may be
12055 arbitrarily complex and require, for example, memory allocation.
12056
12057 Accurate Garbage Collection Intrinsics
12058 --------------------------------------
12059
12060 LLVM's support for `Accurate Garbage Collection <GarbageCollection.html>`_
12061 (GC) requires the frontend to generate code containing appropriate intrinsic
12062 calls and select an appropriate GC strategy which knows how to lower these
12063 intrinsics in a manner which is appropriate for the target collector.
12064
12065 These intrinsics allow identification of :ref:`GC roots on the
12066 stack <int_gcroot>`, as well as garbage collector implementations that
12067 require :ref:`read <int_gcread>` and :ref:`write <int_gcwrite>` barriers.
12068 Frontends for type-safe garbage collected languages should generate
12069 these intrinsics to make use of the LLVM garbage collectors. For more
12070 details, see `Garbage Collection with LLVM <GarbageCollection.html>`_.
12071
LLVM provides a second experimental set of intrinsics for describing garbage
12073 collection safepoints in compiled code. These intrinsics are an alternative
12074 to the ``llvm.gcroot`` intrinsics, but are compatible with the ones for
12075 :ref:`read <int_gcread>` and :ref:`write <int_gcwrite>` barriers. The
12076 differences in approach are covered in the `Garbage Collection with LLVM
12077 <GarbageCollection.html>`_ documentation. The intrinsics themselves are
12078 described in :doc:`Statepoints`.
12079
12080 .. _int_gcroot:
12081
12082 '``llvm.gcroot``' Intrinsic
12083 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
12084
12085 Syntax:
12086 """""""
12087
12088 ::
12089
12090       declare void @llvm.gcroot(i8** %ptrloc, i8* %metadata)
12091
12092 Overview:
12093 """""""""
12094
12095 The '``llvm.gcroot``' intrinsic declares the existence of a GC root to
12096 the code generator, and allows some metadata to be associated with it.
12097
12098 Arguments:
12099 """"""""""
12100
The first argument specifies the address of a stack object that contains
the root pointer. The second argument (which must be either a constant or
a global value address) contains the metadata to be associated with the
root.
12105
12106 Semantics:
12107 """"""""""
12108
12109 At runtime, a call to this intrinsic stores a null pointer into the
12110 "ptrloc" location. At compile-time, the code generator generates
12111 information to allow the runtime to find the pointer at GC safe points.
12112 The '``llvm.gcroot``' intrinsic may only be used in a function which
12113 :ref:`specifies a GC algorithm <gc>`.
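
A minimal sketch of declaring a root follows. It assumes the built-in
``shadow-stack`` GC strategy and passes ``null`` as the metadata. The function
name ``@uses_root`` is illustrative.

.. code-block:: llvm

      declare void @llvm.gcroot(i8** %ptrloc, i8* %metadata)

      define void @uses_root(i8* %obj) gc "shadow-stack" {
      entry:
        ; The root lives in a stack slot allocated in the entry block.
        %root = alloca i8*
        ; At runtime this stores null into %root; at compile time it
        ; records %root as a GC root with no associated metadata.
        call void @llvm.gcroot(i8** %root, i8* null)
        store i8* %obj, i8** %root
        ret void
      }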
12114
12115 .. _int_gcread:
12116
12117 '``llvm.gcread``' Intrinsic
12118 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
12119
12120 Syntax:
12121 """""""
12122
12123 ::
12124
12125       declare i8* @llvm.gcread(i8* %ObjPtr, i8** %Ptr)
12126
12127 Overview:
12128 """""""""
12129
12130 The '``llvm.gcread``' intrinsic identifies reads of references from heap
12131 locations, allowing garbage collector implementations that require read
12132 barriers.
12133
12134 Arguments:
12135 """"""""""
12136
The first argument is a pointer to the start of the referenced object, if
needed by the language runtime (otherwise null). The second argument is
the address to read from, which should be an address allocated from the
garbage collector.
12141
12142 Semantics:
12143 """"""""""
12144
12145 The '``llvm.gcread``' intrinsic has the same semantics as a load
12146 instruction, but may be replaced with substantially more complex code by
12147 the garbage collector runtime, as needed. The '``llvm.gcread``'
12148 intrinsic may only be used in a function which :ref:`specifies a GC
12149 algorithm <gc>`.
12150
12151 .. _int_gcwrite:
12152
12153 '``llvm.gcwrite``' Intrinsic
12154 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12155
12156 Syntax:
12157 """""""
12158
12159 ::
12160
12161       declare void @llvm.gcwrite(i8* %P1, i8* %Obj, i8** %P2)
12162
12163 Overview:
12164 """""""""
12165
12166 The '``llvm.gcwrite``' intrinsic identifies writes of references to heap
12167 locations, allowing garbage collector implementations that require write
12168 barriers (such as generational or reference counting collectors).
12169
12170 Arguments:
12171 """"""""""
12172
12173 The first argument is the reference to store, the second is the start of
12174 the object to store it to, and the third is the address of the field of
12175 Obj to store to. If the runtime does not require a pointer to the
12176 object, Obj may be null.
12177
12178 Semantics:
12179 """"""""""
12180
12181 The '``llvm.gcwrite``' intrinsic has the same semantics as a store
12182 instruction, but may be replaced with substantially more complex code by
12183 the garbage collector runtime, as needed. The '``llvm.gcwrite``'
12184 intrinsic may only be used in a function which :ref:`specifies a GC
12185 algorithm <gc>`.
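
The two barrier intrinsics can be sketched together. The function below reads
a reference from a field of one heap object and writes it to a field of
another, passing the owning object to each barrier. It assumes the built-in
``shadow-stack`` strategy. The name ``@copy_field`` and its parameter names
are illustrative.

.. code-block:: llvm

      declare i8* @llvm.gcread(i8* %ObjPtr, i8** %Ptr)
      declare void @llvm.gcwrite(i8* %P1, i8* %Obj, i8** %P2)

      define void @copy_field(i8* %src_obj, i8** %src_field,
                              i8* %dst_obj, i8** %dst_field) gc "shadow-stack" {
      entry:
        ; Behaves like a load from %src_field, with a read barrier if the
        ; collector needs one.
        %ref = call i8* @llvm.gcread(i8* %src_obj, i8** %src_field)
        ; Behaves like a store of %ref to %dst_field, with a write barrier
        ; if the collector needs one.
        call void @llvm.gcwrite(i8* %ref, i8* %dst_obj, i8** %dst_field)
        ret void
      }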
12186
12187
12188 .. _gc_statepoint:
12189
12190 'llvm.experimental.gc.statepoint' Intrinsic
12191 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12192
12193 Syntax:
12194 """""""
12195
12196 ::
12197
12198       declare token
12199         @llvm.experimental.gc.statepoint(i64 <id>, i32 <num patch bytes>,
12200                        func_type <target>,
12201                        i64 <#call args>, i64 <flags>,
12202                        ... (call parameters),
12203                        i64 0, i64 0)
12204
12205 Overview:
12206 """""""""
12207
12208 The statepoint intrinsic represents a call which is parse-able by the
12209 runtime.
12210
12211 Operands:
12212 """""""""
12213
12214 The 'id' operand is a constant integer that is reported as the ID
12215 field in the generated stackmap.  LLVM does not interpret this
12216 parameter in any way and its meaning is up to the statepoint user to
12217 decide.  Note that LLVM is free to duplicate code containing
12218 statepoint calls, and this may transform IR that had a unique 'id' per
12219 lexical call to statepoint to IR that does not.
12220
12221 If 'num patch bytes' is non-zero then the call instruction
12222 corresponding to the statepoint is not emitted and LLVM emits 'num
12223 patch bytes' bytes of nops in its place.  LLVM will emit code to
12224 prepare the function arguments and retrieve the function return value
in accordance with the calling convention; the former before the nop
12226 sequence and the latter after the nop sequence.  It is expected that
12227 the user will patch over the 'num patch bytes' bytes of nops with a
12228 calling sequence specific to their runtime before executing the
12229 generated machine code.  There are no guarantees with respect to the
alignment of the nop sequence.  Unlike :doc:`StackMaps`, statepoints do
12231 not have a concept of shadow bytes.  Note that semantically the
12232 statepoint still represents a call or invoke to 'target', and the nop
12233 sequence after patching is expected to represent an operation
12234 equivalent to a call or invoke to 'target'.
12235
12236 The 'target' operand is the function actually being called.  The
12237 target can be specified as either a symbolic LLVM function, or as an
12238 arbitrary Value of appropriate function type.  Note that the function
12239 type must match the signature of the callee and the types of the 'call
12240 parameters' arguments.
12241
12242 The '#call args' operand is the number of arguments to the actual
12243 call.  It must exactly match the number of arguments passed in the
12244 'call parameters' variable length section.
12245
12246 The 'flags' operand is used to specify extra information about the
12247 statepoint. This is currently only used to mark certain statepoints
12248 as GC transitions. This operand is a 64-bit integer with the following
12249 layout, where bit 0 is the least significant bit:
12250
12251   +-------+---------------------------------------------------+
12252   | Bit # | Usage                                             |
12253   +=======+===================================================+
12254   |     0 | Set if the statepoint is a GC transition, cleared |
12255   |       | otherwise.                                        |
12256   +-------+---------------------------------------------------+
12257   |  1-63 | Reserved for future use; must be cleared.         |
12258   +-------+---------------------------------------------------+
12259
12260 The 'call parameters' arguments are simply the arguments which need to
12261 be passed to the call target.  They will be lowered according to the
12262 specified calling convention and otherwise handled like a normal call
12263 instruction.  The number of arguments must exactly match what is
specified in '#call args'.  The types must match the signature of
12265 'target'.
12266
The 'call parameters' arguments must be followed by two 'i64 0' constants.
These were originally the length prefixes for 'gc transition parameter' and
'deopt parameter' arguments, but the role of these parameter sets has been
entirely replaced by the corresponding operand bundles.  In a future
revision, these now-redundant arguments will be removed.
12272
12273 Semantics:
12274 """"""""""
12275
12276 A statepoint is assumed to read and write all memory.  As a result,
memory operations cannot be reordered past a statepoint.  It is
12278 illegal to mark a statepoint as being either 'readonly' or 'readnone'.
12279
Note that legal IR cannot perform any memory operation on a 'gc
12281 pointer' argument of the statepoint in a location statically reachable
12282 from the statepoint.  Instead, the explicitly relocated value (from a
12283 ``gc.relocate``) must be used.
12284
12285 'llvm.experimental.gc.result' Intrinsic
12286 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12287
12288 Syntax:
12289 """""""
12290
12291 ::
12292
12293       declare type*
12294         @llvm.experimental.gc.result(token %statepoint_token)
12295
12296 Overview:
12297 """""""""
12298
12299 ``gc.result`` extracts the result of the original call instruction
12300 which was replaced by the ``gc.statepoint``.  The ``gc.result``
12301 intrinsic is actually a family of three intrinsics due to an
12302 implementation limitation.  Other than the type of the return value,
12303 the semantics are the same.
12304
12305 Operands:
12306 """""""""
12307
12308 The first and only argument is the ``gc.statepoint`` which starts
12309 the safepoint sequence of which this ``gc.result`` is a part.
12310 Despite the typing of this as a generic token, *only* the value defined
12311 by a ``gc.statepoint`` is legal here.
12312
12313 Semantics:
12314 """"""""""
12315
12316 The ``gc.result`` represents the return value of the call target of
12317 the ``statepoint``.  The type of the ``gc.result`` must exactly match
12318 the type of the target.  If the call target returns void, there will
12319 be no ``gc.result``.
12320
12321 A ``gc.result`` is modeled as a 'readnone' pure function.  It has no
12322 side effects since it is just a projection of the return value of the
12323 previous call represented by the ``gc.statepoint``.
12324
12325 'llvm.experimental.gc.relocate' Intrinsic
12326 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12327
12328 Syntax:
12329 """""""
12330
12331 ::
12332
12333       declare <pointer type>
12334         @llvm.experimental.gc.relocate(token %statepoint_token,
12335                                        i32 %base_offset,
12336                                        i32 %pointer_offset)
12337
12338 Overview:
12339 """""""""
12340
12341 A ``gc.relocate`` returns the potentially relocated value of a pointer
12342 at the safepoint.
12343
12344 Operands:
12345 """""""""
12346
12347 The first argument is the ``gc.statepoint`` which starts the
12348 safepoint sequence of which this ``gc.relocate`` is a part.
12349 Despite the typing of this as a generic token, *only* the value defined
12350 by a ``gc.statepoint`` is legal here.
12351
12352 The second and third arguments are both indices into operands of the
12353 corresponding statepoint's :ref:`gc-live <ob_gc_live>` operand bundle.
12354
12355 The second argument is an index which specifies the allocation for the pointer
12356 being relocated. The associated value must be within the object with which the
12357 pointer being relocated is associated. The optimizer is free to change *which*
12358 interior derived pointer is reported, provided that it does not replace an
12359 actual base pointer with another interior derived pointer. Collectors are
12360 allowed to rely on the base pointer operand remaining an actual base pointer if
12361 so constructed.
12362
12363 The third argument is an index which specifies the (potentially) derived pointer
12364 being relocated.  It is legal for this index to be the same as the second
12365 argument if-and-only-if a base pointer is being relocated.
12366
12367 Semantics:
12368 """"""""""
12369
12370 The return value of ``gc.relocate`` is the potentially relocated value
12371 of the pointer specified by its arguments.  It is unspecified how the
12372 value of the returned pointer relates to the argument to the
12373 ``gc.statepoint`` other than that a) it points to the same source
12374 language object with the same offset, and b) the 'based-on'
12375 relationship of the newly relocated pointers is a projection of the
12376 unrelocated pointers.  In particular, the integer value of the pointer
12377 returned is unspecified.
12378
12379 A ``gc.relocate`` is modeled as a ``readnone`` pure function.  It has no
12380 side effects since it is just a way to extract information about work
12381 done during the actual call modeled by the ``gc.statepoint``.
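
As an illustration of how ``gc.result`` and ``gc.relocate`` consume the token
produced by a ``gc.statepoint``, the following is a minimal sketch rather than
a normative example. The callee ``@bar``, the function ``@example``, the value
names, the ``statepoint-example`` GC strategy, and the mangled intrinsic
suffixes are assumptions chosen for illustration; the suffixes in particular
depend on the types involved and on the LLVM release.

.. code-block:: llvm

      declare i32 @bar()
      declare token @llvm.experimental.gc.statepoint.p0f_i32f(i64, i32, i32 ()*, i32, i32, ...)
      declare i32 @llvm.experimental.gc.result.i32(token)
      declare i8 addrspace(1)* @llvm.experimental.gc.relocate.p1i8(token, i32, i32)

      define i8 addrspace(1)* @example(i8 addrspace(1)* %base) gc "statepoint-example" {
      entry:
        ; A derived (interior) pointer into the same object as %base.
        %derived = getelementptr i8, i8 addrspace(1)* %base, i64 8
        ; In the "gc-live" bundle, index 0 is %base and index 1 is %derived.
        %tok = call token (i64, i32, i32 ()*, i32, i32, ...)
            @llvm.experimental.gc.statepoint.p0f_i32f(
                i64 0, i32 0, i32 ()* @bar, i32 0, i32 0, i32 0, i32 0)
            ["gc-live"(i8 addrspace(1)* %base, i8 addrspace(1)* %derived)]
        ; Projection of the return value of the wrapped call to @bar.
        %ret = call i32 @llvm.experimental.gc.result.i32(token %tok)
        ; Relocating a base pointer: both indices name the same operand.
        %base.relocated = call i8 addrspace(1)*
            @llvm.experimental.gc.relocate.p1i8(token %tok, i32 0, i32 0)
        ; Relocating a derived pointer: base index 0, derived index 1.
        %derived.relocated = call i8 addrspace(1)*
            @llvm.experimental.gc.relocate.p1i8(token %tok, i32 0, i32 1)
        ret i8 addrspace(1)* %derived.relocated
      }

After the statepoint, only ``%base.relocated`` and ``%derived.relocated`` are
used; the unrelocated ``%base`` and ``%derived`` are not touched again.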
12382
12383 .. _gc.get.pointer.base:
12384
12385 'llvm.experimental.gc.get.pointer.base' Intrinsic
12386 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12387
12388 Syntax:
12389 """""""
12390
12391 ::
12392
12393       declare <pointer type>
12394         @llvm.experimental.gc.get.pointer.base(
12395           <pointer type> readnone nocapture %derived_ptr)
12396           nounwind readnone willreturn
12397
12398 Overview:
12399 """""""""
12400
12401 ``gc.get.pointer.base`` for a derived pointer returns its base pointer.
12402
12403 Operands:
12404 """""""""
12405
12406 The only argument is a pointer which is based on some object with
12407 an unknown offset from the base of said object.
12408
12409 Semantics:
12410 """"""""""
12411
12412 This intrinsic is used in the abstract machine model for GC to represent
12413 the base pointer for an arbitrary derived pointer.
12414
12415 This intrinsic is inlined by the :ref:`RewriteStatepointsForGC` pass by
12416 replacing all uses of this callsite with the base pointer value of the
12417 derived pointer. The replacement is done as part of the lowering to the
12418 explicit statepoint model.
12419
12420 The return pointer type must be the same as the type of the parameter.
12421
12422
12423 'llvm.experimental.gc.get.pointer.offset' Intrinsic
12424 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12425
12426 Syntax:
12427 """""""
12428
12429 ::
12430
12431       declare i64
12432         @llvm.experimental.gc.get.pointer.offset(
12433           <pointer type> readnone nocapture %derived_ptr)
12434           nounwind readnone willreturn
12435
12436 Overview:
12437 """""""""
12438
12439 ``gc.get.pointer.offset`` for a derived pointer returns the offset from its
12440 base pointer.
12441
12442 Operands:
12443 """""""""
12444
12445 The only argument is a pointer which is based on some object with
12446 an unknown offset from the base of said object.
12447
12448 Semantics:
12449 """"""""""
12450
12451 This intrinsic is used in the abstract machine model for GC to represent
12452 the offset of an arbitrary derived pointer from its base pointer.
12453
12454 This intrinsic is inlined by the :ref:`RewriteStatepointsForGC` pass by
12455 replacing all uses of this callsite with the offset of a derived pointer from
12456 its base pointer value. The replacement is done as part of the lowering to the
12457 explicit statepoint model.
12458
12459 In effect, this call computes the difference between the derived pointer and
12460 its base pointer (see :ref:`gc.get.pointer.base`), both cast with ``ptrtoint``.
12461 However, performing these casts outside the :ref:`RewriteStatepointsForGC`
12462 pass could cause the pointers to be lost during the later lowering from the
12463 abstract model to the explicit physical one.
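
The two abstract-model queries can be sketched as follows. This is only an
illustration: the function ``@base_and_offset``, the use of address space 1
for GC pointers, and the mangled suffixes on the intrinsic names are
assumptions and may differ between LLVM releases.

.. code-block:: llvm

      declare i8 addrspace(1)* @llvm.experimental.gc.get.pointer.base.p1i8.p1i8(
          i8 addrspace(1)* readnone nocapture) nounwind readnone willreturn
      declare i64 @llvm.experimental.gc.get.pointer.offset.p1i8(
          i8 addrspace(1)* readnone nocapture) nounwind readnone willreturn

      define i64 @base_and_offset(i8 addrspace(1)* %derived) gc "statepoint-example" {
      entry:
        ; Base pointer of the object that %derived points into.
        %base = call i8 addrspace(1)*
            @llvm.experimental.gc.get.pointer.base.p1i8.p1i8(i8 addrspace(1)* %derived)
        ; Conceptually ptrtoint(%derived) - ptrtoint(%base), without exposing
        ; the pointers as integers before RewriteStatepointsForGC runs.
        %offset = call i64
            @llvm.experimental.gc.get.pointer.offset.p1i8(i8 addrspace(1)* %derived)
        ret i64 %offset
      }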
12464
12465 Code Generator Intrinsics
12466 -------------------------
12467
12468 These intrinsics are provided by LLVM to expose special features that
12469 may only be implemented with code generator support.
12470
12471 '``llvm.returnaddress``' Intrinsic
12472 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12473
12474 Syntax:
12475 """""""
12476
12477 ::
12478
12479       declare i8* @llvm.returnaddress(i32 <level>)
12480
12481 Overview:
12482 """""""""
12483
12484 The '``llvm.returnaddress``' intrinsic attempts to compute a
12485 target-specific value indicating the return address of the current
12486 function or one of its callers.
12487
12488 Arguments:
12489 """"""""""
12490
12491 The argument to this intrinsic indicates which function to return the
12492 address for. Zero indicates the calling function, one indicates its
12493 caller, etc. The argument is **required** to be a constant integer
12494 value.
12495
12496 Semantics:
12497 """"""""""
12498
12499 The '``llvm.returnaddress``' intrinsic either returns a pointer
12500 indicating the return address of the specified call frame, or zero if it
12501 cannot be identified. The value returned by this intrinsic is likely to
12502 be incorrect or 0 for arguments other than zero, so it should only be
12503 used for debugging purposes.
12504
12505 Note that calling this intrinsic does not prevent function inlining or
12506 other aggressive transformations, so the value returned may not be that
12507 of the obvious source-language caller.
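
A minimal usage sketch, written in the unsuffixed form shown in the Syntax
block above; the function name ``@get_return_address`` is hypothetical:

.. code-block:: llvm

      declare i8* @llvm.returnaddress(i32)

      define i8* @get_return_address() {
      entry:
        ; Level 0 asks for the return address of this function itself.
        ; The level must be a constant integer.
        %ra = call i8* @llvm.returnaddress(i32 0)
        ret i8* %ra
      }

If ``@get_return_address`` is inlined into its caller, the value obtained is
that of the enclosing function, as noted above.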
12508
12509 '``llvm.addressofreturnaddress``' Intrinsic
12510 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12511
12512 Syntax:
12513 """""""
12514
12515 ::
12516
12517       declare i8* @llvm.addressofreturnaddress()
12518
12519 Overview:
12520 """""""""
12521
12522 The '``llvm.addressofreturnaddress``' intrinsic returns a target-specific
12523 pointer to the place in the stack frame where the return address of the
12524 current function is stored.
12525
12526 Semantics:
12527 """"""""""
12528
12529 Note that calling this intrinsic does not prevent function inlining or
12530 other aggressive transformations, so the value returned may not be that
12531 of the obvious source-language caller.
12532
12533 This intrinsic is only implemented for x86 and aarch64.
12534
12535 '``llvm.sponentry``' Intrinsic
12536 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12537
12538 Syntax:
12539 """""""
12540
12541 ::
12542
12543       declare i8* @llvm.sponentry()
12544
12545 Overview:
12546 """""""""
12547
12548 The '``llvm.sponentry``' intrinsic returns the stack pointer value at
12549 the entry of the current function calling this intrinsic.
12550
12551 Semantics:
12552 """"""""""
12553
12554 Note this intrinsic is only verified on AArch64.
12555
12556 '``llvm.frameaddress``' Intrinsic
12557 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12558
12559 Syntax:
12560 """""""
12561
12562 ::
12563
12564       declare i8* @llvm.frameaddress(i32 <level>)
12565
12566 Overview:
12567 """""""""
12568
12569 The '``llvm.frameaddress``' intrinsic attempts to return the
12570 target-specific frame pointer value for the specified stack frame.
12571
12572 Arguments:
12573 """"""""""
12574
12575 The argument to this intrinsic indicates which function to return the
12576 frame pointer for. Zero indicates the calling function, one indicates
12577 its caller, etc. The argument is **required** to be a constant integer
12578 value.
12579
12580 Semantics:
12581 """"""""""
12582
12583 The '``llvm.frameaddress``' intrinsic either returns a pointer
12584 indicating the frame address of the specified call frame, or zero if it
12585 cannot be identified. The value returned by this intrinsic is likely to
12586 be incorrect or 0 for arguments other than zero, so it should only be
12587 used for debugging purposes.
12588
12589 Note that calling this intrinsic does not prevent function inlining or
12590 other aggressive transformations, so the value returned may not be that
12591 of the obvious source-language caller.
12592
12593 '``llvm.swift.async.context.addr``' Intrinsic
12594 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12595
12596 Syntax:
12597 """""""
12598
12599 ::
12600
12601       declare i8** @llvm.swift.async.context.addr()
12602
12603 Overview:
12604 """""""""
12605
12606 The '``llvm.swift.async.context.addr``' intrinsic returns a pointer to
12607 the part of the extended frame record containing the asynchronous
12608 context of a Swift execution.
12609
12610 Semantics:
12611 """"""""""
12612
12613 If the caller has a ``swiftasync`` parameter, that argument will initially
12614 be stored at the returned address. If not, it will be initialized to null.
12615
12616 '``llvm.localescape``' and '``llvm.localrecover``' Intrinsics
12617 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12618
12619 Syntax:
12620 """""""
12621
12622 ::
12623
12624       declare void @llvm.localescape(...)
12625       declare i8* @llvm.localrecover(i8* %func, i8* %fp, i32 %idx)
12626
12627 Overview:
12628 """""""""
12629
12630 The '``llvm.localescape``' intrinsic escapes offsets of a collection of static
12631 allocas, and the '``llvm.localrecover``' intrinsic applies those offsets to a
12632 live frame pointer to recover the address of the allocation. The offset is
12633 computed during frame layout of the caller of ``llvm.localescape``.
12634
12635 Arguments:
12636 """"""""""
12637
12638 All arguments to '``llvm.localescape``' must be pointers to static allocas or
12639 casts of static allocas. Each function can only call '``llvm.localescape``'
12640 once, and it can only do so from the entry block.
12641
12642 The ``func`` argument to '``llvm.localrecover``' must be a constant
12643 bitcasted pointer to a function defined in the current module. The code
12644 generator cannot determine the frame allocation offset of functions defined in
12645 other modules.
12646
12647 The ``fp`` argument to '``llvm.localrecover``' must be a frame pointer of a
12648 call frame that is currently live. The return value of '``llvm.localaddress``'
12649 is one way to produce such a value, but various runtimes also expose a suitable
12650 pointer in platform-specific ways.
12651
12652 The ``idx`` argument to '``llvm.localrecover``' indicates which alloca passed to
12653 '``llvm.localescape``' to recover. It is zero-indexed.
12654
12655 Semantics:
12656 """"""""""
12657
12658 These intrinsics allow a group of functions to share access to a set of local
12659 stack allocations of one parent function. The parent function may call the
12660 '``llvm.localescape``' intrinsic once from the function entry block, and the
12661 child functions can use '``llvm.localrecover``' to access the escaped allocas.
12662 The '``llvm.localescape``' intrinsic blocks inlining, as inlining changes where
12663 the escaped allocas are allocated, which would break attempts to use
12664 '``llvm.localrecover``'.
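
The parent/child pattern described above might look like the following
sketch. The functions ``@parent`` and ``@child`` and the value names are
hypothetical, and the frame pointer is obtained with ``llvm.localaddress``,
which the text above names as one possible source.

.. code-block:: llvm

      declare void @llvm.localescape(...)
      declare i8* @llvm.localrecover(i8*, i8*, i32)
      declare i8* @llvm.localaddress()

      define void @parent() {
      entry:
        %a = alloca i32
        %b = alloca i32
        ; Called once, from the entry block, with static allocas only.
        call void (...) @llvm.localescape(i32* %a, i32* %b)
        %fp = call i8* @llvm.localaddress()
        call void @child(i8* %fp)
        ret void
      }

      define internal void @child(i8* %fp) {
      entry:
        ; Index 1 selects the second escaped alloca, %b in @parent.
        %b.i8 = call i8* @llvm.localrecover(
            i8* bitcast (void ()* @parent to i8*), i8* %fp, i32 1)
        %b = bitcast i8* %b.i8 to i32*
        store i32 42, i32* %b
        ret void
      }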
12665
12666 '``llvm.seh.try.begin``' and '``llvm.seh.try.end``' Intrinsics
12667 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12668
12669 Syntax:
12670 """""""
12671
12672 ::
12673
12674       declare void @llvm.seh.try.begin()
12675       declare void @llvm.seh.try.end()
12676
12677 Overview:
12678 """""""""
12679
12680 The '``llvm.seh.try.begin``' and '``llvm.seh.try.end``' intrinsics mark
12681 the boundary of a _try region for Windows SEH Asynchronous Exception Handling.
12682
12683 Semantics:
12684 """"""""""
12685
12686 When a C function is compiled with the Windows SEH Asynchronous Exception option,
12687 -feh_asynch (aka MSVC -EHa), these two intrinsics are injected to mark the _try
12688 boundary and to prevent potential exceptions from being moved across the boundary.
12689 Any set of operations can then be confined to the region by reading their leaf
12690 inputs via volatile loads and writing their root outputs via volatile stores.
12691
12692 '``llvm.seh.scope.begin``' and '``llvm.seh.scope.end``' Intrinsics
12693 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12694
12695 Syntax:
12696 """""""
12697
12698 ::
12699
12700       declare void @llvm.seh.scope.begin()
12701       declare void @llvm.seh.scope.end()
12702
12703 Overview:
12704 """""""""
12705
12706 The '``llvm.seh.scope.begin``' and '``llvm.seh.scope.end``' intrinsics mark
12707 the boundary of a C++ object lifetime for Windows SEH Asynchronous Exception
12708 Handling (MSVC option -EHa).
12709
12710 Semantics:
12711 """"""""""
12712
12713 LLVM's ordinary exception-handling representation associates EH cleanups and
12714 handlers only with ``invoke``\ s, which normally correspond only to call sites.  To
12715 support arbitrary faulting instructions, it must be possible to recover the current
12716 EH scope for any instruction.  Turning every operation in LLVM that could fault
12717 into an ``invoke`` of a new, potentially-throwing intrinsic would require adding a
12718 large number of intrinsics, impede optimization of those operations, and make
12719 compilation slower by introducing many extra basic blocks.  These intrinsics can
12720 be used instead to mark the region protected by a cleanup, such as for a local
12721 C++ object with a non-trivial destructor.  ``llvm.seh.scope.begin`` is used to mark
12722 the start of the region; it is always called with ``invoke``, with the unwind block
12723 being the desired unwind destination for any potentially-throwing instructions
12724 within the region.  ``llvm.seh.scope.end`` is used to mark when the scope ends
12725 and the EH cleanup is no longer required (e.g. because the destructor is being
12726 called).
12727
12728 .. _int_read_register:
12729 .. _int_read_volatile_register:
12730 .. _int_write_register:
12731
12732 '``llvm.read_register``', '``llvm.read_volatile_register``', and '``llvm.write_register``' Intrinsics
12733 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12734
12735 Syntax:
12736 """""""
12737
12738 ::
12739
12740       declare i32 @llvm.read_register.i32(metadata)
12741       declare i64 @llvm.read_register.i64(metadata)
12742       declare i32 @llvm.read_volatile_register.i32(metadata)
12743       declare i64 @llvm.read_volatile_register.i64(metadata)
12744       declare void @llvm.write_register.i32(metadata, i32 %value)
12745       declare void @llvm.write_register.i64(metadata, i64 %value)
12746       !0 = !{!"sp\00"}
12747
12748 Overview:
12749 """""""""
12750
12751 The '``llvm.read_register``', '``llvm.read_volatile_register``', and
12752 '``llvm.write_register``' intrinsics provide access to the named register.
12753 The register must be valid on the architecture being compiled to. The type
12754 needs to be compatible with the register being read.
12755
12756 Semantics:
12757 """"""""""
12758
12759 The '``llvm.read_register``' and '``llvm.read_volatile_register``' intrinsics
12760 return the current value of the register, where possible. The
12761 '``llvm.write_register``' intrinsic sets the current value of the register,
12762 where possible.
12763
12764 A call to '``llvm.read_volatile_register``' is assumed to have side-effects
12765 and possibly return a different value each time (e.g. for a timer register).
12766
12767 This is useful to implement named register global variables that need
12768 to always be mapped to a specific register, as is common practice on
12769 bare-metal programs including OS kernels.
12770
12771 The compiler doesn't check for register availability or use of the used
12772 register in surrounding code, including inline assembly. Because of that,
12773 allocatable registers are not supported.
12774
12775 Warning: So far it only works with the stack pointer on selected
12776 architectures (ARM, AArch64, PowerPC and x86_64). A significant amount of
12777 work is needed to support other registers and even more so, allocatable
12778 registers.
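
For instance, reading the stack pointer through a named register could be
sketched as below. The function name ``@get_stack_pointer`` is hypothetical,
and the register name ``sp`` assumes a target on which that name is valid
(other targets use different names).

.. code-block:: llvm

      declare i64 @llvm.read_register.i64(metadata)

      define i64 @get_stack_pointer() {
      entry:
        ; The register is identified by a metadata node holding its name.
        %sp = call i64 @llvm.read_register.i64(metadata !0)
        ret i64 %sp
      }

      !0 = !{!"sp\00"}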
12779
12780 .. _int_stacksave:
12781
12782 '``llvm.stacksave``' Intrinsic
12783 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12784
12785 Syntax:
12786 """""""
12787
12788 ::
12789
12790       declare i8* @llvm.stacksave()
12791
12792 Overview:
12793 """""""""
12794
12795 The '``llvm.stacksave``' intrinsic is used to remember the current state
12796 of the function stack, for use with
12797 :ref:`llvm.stackrestore <int_stackrestore>`. This is useful for
12798 implementing language features like scoped automatic variable sized
12799 arrays in C99.
12800
12801 Semantics:
12802 """"""""""
12803
12804 This intrinsic returns an opaque pointer value that can be passed to
12805 :ref:`llvm.stackrestore <int_stackrestore>`. When an
12806 ``llvm.stackrestore`` intrinsic is executed with a value saved from
12807 ``llvm.stacksave``, it effectively restores the state of the stack to
12808 the state it was in when the ``llvm.stacksave`` intrinsic executed. In
12809 practice, this pops any :ref:`alloca <i_alloca>` blocks from the stack that
12810 were allocated after the ``llvm.stacksave`` was executed.
12811
12812 .. _int_stackrestore:
12813
12814 '``llvm.stackrestore``' Intrinsic
12815 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12816
12817 Syntax:
12818 """""""
12819
12820 ::
12821
12822       declare void @llvm.stackrestore(i8* %ptr)
12823
12824 Overview:
12825 """""""""
12826
12827 The '``llvm.stackrestore``' intrinsic is used to restore the state of
12828 the function stack to the state it was in when the corresponding
12829 :ref:`llvm.stacksave <int_stacksave>` intrinsic executed. This is
12830 useful for implementing language features like scoped automatic variable
12831 sized arrays in C99.
12832
12833 Semantics:
12834 """"""""""
12835
12836 See the description for :ref:`llvm.stacksave <int_stacksave>`.
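
A sketch of how a frontend might bracket a scoped variable-length array with
this pair of intrinsics; ``@scoped_vla`` and the external function ``@use``
are hypothetical names introduced for illustration:

.. code-block:: llvm

      declare i8* @llvm.stacksave()
      declare void @llvm.stackrestore(i8*)
      declare void @use(i32*)

      define void @scoped_vla(i32 %n) {
      entry:
        ; Remember the stack state on entry to the scope.
        %saved = call i8* @llvm.stacksave()
        ; Dynamically sized allocation that lives only inside the scope.
        %vla = alloca i32, i32 %n
        call void @use(i32* %vla)
        ; Leaving the scope pops %vla off the stack again.
        call void @llvm.stackrestore(i8* %saved)
        ret void
      }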
12837
12838 .. _int_get_dynamic_area_offset:
12839
12840 '``llvm.get.dynamic.area.offset``' Intrinsic
12841 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12842
12843 Syntax:
12844 """""""
12845
12846 ::
12847
12848       declare i32 @llvm.get.dynamic.area.offset.i32()
12849       declare i64 @llvm.get.dynamic.area.offset.i64()
12850
12851 Overview:
12852 """""""""
12853
12854       The '``llvm.get.dynamic.area.offset.*``' intrinsic family is used to
12855       get the offset from native stack pointer to the address of the most
12856       recent dynamic alloca on the caller's stack. These intrinsics are
12857       intended for use in combination with
12858       :ref:`llvm.stacksave <int_stacksave>` to get a
12859       pointer to the most recent dynamic alloca. This is useful, for example,
12860       for AddressSanitizer's stack unpoisoning routines.
12861
12862 Semantics:
12863 """"""""""
12864
12865       These intrinsics return a non-negative integer value that can be used to
12866       get the address of the most recent dynamic alloca, allocated by :ref:`alloca <i_alloca>`
12867       on the caller's stack. In particular, for targets where stack grows downwards,
12868       adding this offset to the native stack pointer would get the address of the most
12869       recent dynamic alloca. For targets where stack grows upwards, the situation is a bit more
12870       complicated, because subtracting this value from stack pointer would get the address
12871       one past the end of the most recent dynamic alloca.
12872
12873       Although for most targets :ref:`llvm.get.dynamic.area.offset <int_get_dynamic_area_offset>`
12874       returns just a zero, for others, such as PowerPC and PowerPC64, it returns a
12875       compile-time-known constant value.
12876
12877       The return value type of :ref:`llvm.get.dynamic.area.offset <int_get_dynamic_area_offset>`
12878       must match the target's default address space's (address space 0) pointer type.
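
The combination with ``llvm.stacksave`` mentioned in the overview can be
sketched as follows. The sketch assumes a target with 64-bit pointers in
address space 0 and a stack that grows downwards; the function name
``@most_recent_dynamic_alloca`` is hypothetical.

.. code-block:: llvm

      declare i8* @llvm.stacksave()
      declare i64 @llvm.get.dynamic.area.offset.i64()

      define i8* @most_recent_dynamic_alloca(i64 %n) {
      entry:
        ; A dynamic alloca whose address is to be recomputed below.
        %buf = alloca i8, i64 %n
        ; Current native stack pointer.
        %sp = call i8* @llvm.stacksave()
        ; Offset from the stack pointer to the most recent dynamic alloca.
        %off = call i64 @llvm.get.dynamic.area.offset.i64()
        %top = getelementptr i8, i8* %sp, i64 %off
        ret i8* %top
      }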
12879
12880 '``llvm.prefetch``' Intrinsic
12881 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12882
12883 Syntax:
12884 """""""
12885
12886 ::
12887
12888       declare void @llvm.prefetch(i8* <address>, i32 <rw>, i32 <locality>, i32 <cache type>)
12889
12890 Overview:
12891 """""""""
12892
12893 The '``llvm.prefetch``' intrinsic is a hint to the code generator to
12894 insert a prefetch instruction if supported; otherwise, it is a noop.
12895 Prefetches have no effect on the behavior of the program but can change
12896 its performance characteristics.
12897
12898 Arguments:
12899 """"""""""
12900
12901 ``address`` is the address to be prefetched, ``rw`` is the specifier
12902 determining if the fetch should be for a read (0) or write (1), and
12903 ``locality`` is a temporal locality specifier ranging from (0) - no
12904 locality, to (3) - extremely local keep in cache. The ``cache type``
12905 specifies whether the prefetch is performed on the data (1) or
12906 instruction (0) cache. The ``rw``, ``locality`` and ``cache type``
12907 arguments must be constant integers.
12908
12909 Semantics:
12910 """"""""""
12911
12912 This intrinsic does not modify the behavior of the program. In
12913 particular, prefetches cannot trap and do not produce a value. On
12914 targets that support this intrinsic, the prefetch can provide hints to
12915 the processor cache for better performance.
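
As a brief illustration of the argument encoding, the following sketch
requests a read prefetch into the data cache with maximal temporal locality.
It uses the unsuffixed form shown in the Syntax block above, and the function
name ``@prefetch_for_read`` is hypothetical.

.. code-block:: llvm

      declare void @llvm.prefetch(i8*, i32, i32, i32)

      define void @prefetch_for_read(i8* %p) {
      entry:
        ; rw = 0 (read), locality = 3 (keep in cache), cache type = 1 (data).
        call void @llvm.prefetch(i8* %p, i32 0, i32 3, i32 1)
        ret void
      }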
12916
12917 '``llvm.pcmarker``' Intrinsic
12918 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12919
12920 Syntax:
12921 """""""
12922
12923 ::
12924
12925       declare void @llvm.pcmarker(i32 <id>)
12926
12927 Overview:
12928 """""""""
12929
12930 The '``llvm.pcmarker``' intrinsic is a method to export a Program
12931 Counter (PC) in a region of code to simulators and other tools. The
12932 method is target specific, but it is expected that the marker will use
12933 exported symbols to transmit the PC of the marker. The marker makes no
12934 guarantees that it will remain with any specific instruction after
12935 optimizations. It is possible that the presence of a marker will inhibit
12936 optimizations. The intended use is to be inserted after optimizations to
12937 allow correlations of simulation runs.
12938
12939 Arguments:
12940 """"""""""
12941
12942 ``id`` is a numerical id identifying the marker.
12943
12944 Semantics:
12945 """"""""""
12946
12947 This intrinsic does not modify the behavior of the program. Backends
12948 that do not support this intrinsic may ignore it.
12949
12950 '``llvm.readcyclecounter``' Intrinsic
12951 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12952
12953 Syntax:
12954 """""""
12955
12956 ::
12957
12958       declare i64 @llvm.readcyclecounter()
12959
12960 Overview:
12961 """""""""
12962
12963 The '``llvm.readcyclecounter``' intrinsic provides access to the cycle
12964 counter register (or similar low latency, high accuracy clocks) on those
12965 targets that support it. On X86, it should map to RDTSC. On Alpha, it
12966 should map to RPCC. As the backing counters overflow quickly (on the
12967 order of 9 seconds on Alpha), this should only be used for small
12968 timings.
12969
12970 Semantics:
12971 """"""""""
12972
12973 When directly supported, reading the cycle counter should not modify any
12974 memory. Implementations are allowed to either return an application
12975 specific value or a system wide value. On backends without support, this
12976 is lowered to a constant 0.
12977
12978 Note that runtime support may be conditional on the privilege level the code is
12979 running at and on the host platform.
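
A typical small-timing use is to read the counter before and after a region
and subtract. In the sketch below, ``@time_work`` and the external function
``@work`` are hypothetical names:

.. code-block:: llvm

      declare i64 @llvm.readcyclecounter()
      declare void @work()

      define i64 @time_work() {
      entry:
        %start = call i64 @llvm.readcyclecounter()
        call void @work()
        %end = call i64 @llvm.readcyclecounter()
        ; Elapsed cycles; this is always 0 on backends without support.
        %elapsed = sub i64 %end, %start
        ret i64 %elapsed
      }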
12980
12981 '``llvm.clear_cache``' Intrinsic
12982 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12983
12984 Syntax:
12985 """""""
12986
12987 ::
12988
12989       declare void @llvm.clear_cache(i8*, i8*)
12990
12991 Overview:
12992 """""""""
12993
12994 The '``llvm.clear_cache``' intrinsic ensures visibility of modifications
12995 in the specified range to the execution unit of the processor. On
12996 targets with non-unified instruction and data cache, the implementation
12997 flushes the instruction cache.
12998
12999 Semantics:
13000 """"""""""
13001
13002 On platforms with coherent instruction and data caches (e.g. x86), this
13003 intrinsic is a nop. On platforms with non-coherent instruction and data
13004 cache (e.g. ARM, MIPS), the intrinsic is lowered either to appropriate
13005 instructions or a system call, if cache flushing requires special
13006 privileges.
13007
13008 The default behavior is to emit a call to ``__clear_cache`` from the run
13009 time library.
13010
13011 This intrinsic does *not* empty the instruction pipeline. Modifications
13012 of the current function are outside the scope of the intrinsic.
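
A minimal sketch of a call site, for example after a JIT has written code
into a buffer. The function name ``@publish_code`` and the parameter names
are hypothetical, and the two pointers are taken to delimit the modified
range:

.. code-block:: llvm

      declare void @llvm.clear_cache(i8*, i8*)

      define void @publish_code(i8* %begin, i8* %end) {
      entry:
        ; Make the bytes written between %begin and %end visible to
        ; instruction fetch before they are executed.
        call void @llvm.clear_cache(i8* %begin, i8* %end)
        ret void
      }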
13013
13014 '``llvm.instrprof.increment``' Intrinsic
13015 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13016
13017 Syntax:
13018 """""""
13019
13020 ::
13021
13022       declare void @llvm.instrprof.increment(i8* <name>, i64 <hash>,
13023                                              i32 <num-counters>, i32 <index>)
13024
13025 Overview:
13026 """""""""
13027
13028 The '``llvm.instrprof.increment``' intrinsic can be emitted by a
13029 frontend for use with instrumentation based profiling. These will be
13030 lowered by the ``-instrprof`` pass to generate execution counts of a
13031 program at runtime.
13032
13033 Arguments:
13034 """"""""""
13035
13036 The first argument is a pointer to a global variable containing the
13037 name of the entity being instrumented. This should generally be the
13038 (mangled) function name for a set of counters.
13039
13040 The second argument is a hash value that can be used by the consumer
13041 of the profile data to detect changes to the instrumented source, and
13042 the third is the number of counters associated with ``name``. It is an
13043 error if ``hash`` or ``num-counters`` differ between two instances of
13044 ``instrprof.increment`` that refer to the same name.
13045
13046 The last argument refers to which of the counters for ``name`` should
13047 be incremented. It should be a value between 0 and ``num-counters``.
13048
13049 Semantics:
13050 """"""""""
13051
13052 This intrinsic represents an increment of a profiling counter. It will
13053 cause the ``-instrprof`` pass to generate the appropriate data
13054 structures and the code to increment the appropriate value, in a
13055 format that can be written out by a compiler runtime and consumed via
13056 the ``llvm-profdata`` tool.
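
A frontend-emitted increment might look like the following sketch. The name
variable ``@__profn_foo``, the function ``@foo``, and the placeholder hash
value ``0`` are assumptions made for illustration:

.. code-block:: llvm

      @__profn_foo = private constant [3 x i8] c"foo"

      declare void @llvm.instrprof.increment(i8*, i64, i32, i32)

      define void @foo() {
      entry:
        ; name = "foo", hash = 0 (placeholder), one counter, increment index 0.
        call void @llvm.instrprof.increment(
            i8* getelementptr inbounds ([3 x i8], [3 x i8]* @__profn_foo, i32 0, i32 0),
            i64 0, i32 1, i32 0)
        ret void
      }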
13057
13058 '``llvm.instrprof.increment.step``' Intrinsic
13059 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13060
13061 Syntax:
13062 """""""
13063
13064 ::
13065
13066       declare void @llvm.instrprof.increment.step(i8* <name>, i64 <hash>,
13067                                                   i32 <num-counters>,
13068                                                   i32 <index>, i64 <step>)
13069
13070 Overview:
13071 """""""""
13072
13073 The '``llvm.instrprof.increment.step``' intrinsic is an extension to
13074 the '``llvm.instrprof.increment``' intrinsic with an additional fifth
13075 argument to specify the step of the increment.
13076
13077 Arguments:
13078 """"""""""
13079 The first four arguments are the same as those of the '``llvm.instrprof.increment``'
13080 intrinsic.
13081
13082 The last argument specifies the value of the increment of the counter variable.
13083
13084 Semantics:
13085 """"""""""
13086 See the description of the '``llvm.instrprof.increment``' intrinsic.
13087
13088
13089 '``llvm.instrprof.value.profile``' Intrinsic
13090 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13091
13092 Syntax:
13093 """""""
13094
13095 ::
13096
13097       declare void @llvm.instrprof.value.profile(i8* <name>, i64 <hash>,
13098                                                  i64 <value>, i32 <value_kind>,
13099                                                  i32 <index>)
13100
13101 Overview:
13102 """""""""
13103
13104 The '``llvm.instrprof.value.profile``' intrinsic can be emitted by a
13105 frontend for use with instrumentation based profiling. This will be
13106 lowered by the ``-instrprof`` pass to find out the target values that
13107 instrumented expressions take in a program at runtime.
13108
13109 Arguments:
13110 """"""""""
13111
13112 The first argument is a pointer to a global variable containing the
13113 name of the entity being instrumented. ``name`` should generally be the
13114 (mangled) function name for a set of counters.
13115
13116 The second argument is a hash value that can be used by the consumer
13117 of the profile data to detect changes to the instrumented source. It
13118 is an error if ``hash`` differs between two instances of
13119 ``llvm.instrprof.*`` that refer to the same name.
13120
13121 The third argument is the value of the expression being profiled. The profiled
13122 expression's value should be representable as an unsigned 64-bit value. The
13123 fourth argument represents the kind of value profiling that is being done. The
13124 supported value profiling kinds are enumerated through the
13125 ``InstrProfValueKind`` type declared in the
13126 ``<include/llvm/ProfileData/InstrProf.h>`` header file. The last argument is the
13127 index of the instrumented expression within ``name``. It should be >= 0.
13128
13129 Semantics:
13130 """"""""""
13131
13132 This intrinsic represents the point where a call to a runtime routine
13133 should be inserted for value profiling of target expressions. The ``-instrprof``
13134 pass will generate the appropriate data structures and replace the
13135 ``llvm.instrprof.value.profile`` intrinsic with the call to the profile
13136 runtime library with proper arguments.
13137
13138 '``llvm.thread.pointer``' Intrinsic
13139 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13140
13141 Syntax:
13142 """""""
13143
13144 ::
13145
13146       declare i8* @llvm.thread.pointer()
13147
13148 Overview:
13149 """""""""
13150
13151 The '``llvm.thread.pointer``' intrinsic returns the value of the thread
13152 pointer.
13153
13154 Semantics:
13155 """"""""""
13156
13157 The '``llvm.thread.pointer``' intrinsic returns a pointer to the TLS area
13158 for the current thread.  The exact semantics of this value are target
13159 specific: it may point to the start of TLS area, to the end, or somewhere
13160 in the middle.  Depending on the target, this intrinsic may read a register,
13161 call a helper function, read from an alternate memory space, or perform
13162 other operations necessary to locate the TLS area.  Not all targets support
13163 this intrinsic.
13164
13165 '``llvm.call.preallocated.setup``' Intrinsic
13166 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13167
13168 Syntax:
13169 """""""
13170
13171 ::
13172
13173       declare token @llvm.call.preallocated.setup(i32 %num_args)
13174
13175 Overview:
13176 """""""""
13177
13178 The '``llvm.call.preallocated.setup``' intrinsic returns a token which can
13179 be used with a call's ``"preallocated"`` operand bundle to indicate that
13180 certain arguments are allocated and initialized before the call.
13181
13182 Semantics:
13183 """"""""""
13184
13185 The '``llvm.call.preallocated.setup``' intrinsic returns a token which is
13186 associated with at most one call. The token can be passed to
'``@llvm.call.preallocated.arg``' to get a pointer to the
13188 corresponding argument. The token must be the parameter to a
13189 ``"preallocated"`` operand bundle for the corresponding call.
13190
13191 Nested calls to '``llvm.call.preallocated.setup``' are allowed, but must
13192 be properly nested. e.g.
13193
.. code-block:: llvm

      %t1 = call token @llvm.call.preallocated.setup(i32 0)
      %t2 = call token @llvm.call.preallocated.setup(i32 0)
      call void @foo() ["preallocated"(token %t2)]
      call void @foo() ["preallocated"(token %t1)]
13200
13201 is allowed, but not
13202
.. code-block:: llvm

      %t1 = call token @llvm.call.preallocated.setup(i32 0)
      %t2 = call token @llvm.call.preallocated.setup(i32 0)
      call void @foo() ["preallocated"(token %t1)]
      call void @foo() ["preallocated"(token %t2)]
13209
13210 .. _int_call_preallocated_arg:
13211
13212 '``llvm.call.preallocated.arg``' Intrinsic
13213 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13214
13215 Syntax:
13216 """""""
13217
13218 ::
13219
13220       declare i8* @llvm.call.preallocated.arg(token %setup_token, i32 %arg_index)
13221
13222 Overview:
13223 """""""""
13224
13225 The '``llvm.call.preallocated.arg``' intrinsic returns a pointer to the
13226 corresponding preallocated argument for the preallocated call.
13227
13228 Semantics:
13229 """"""""""
13230
The '``llvm.call.preallocated.arg``' intrinsic returns a pointer to the
argument at index ``%arg_index`` among the arguments with the
``preallocated`` attribute for the call associated with ``%setup_token``,
which must be the result of a call to '``llvm.call.preallocated.setup``'.
13235
13236 A call to '``llvm.call.preallocated.arg``' must have a call site
13237 ``preallocated`` attribute. The type of the ``preallocated`` attribute must
13238 match the type used by the ``preallocated`` attribute of the corresponding
13239 argument at the preallocated call. The type is used in the case that an
13240 ``llvm.call.preallocated.setup`` does not have a corresponding call (e.g. due
13241 to DCE), where otherwise we cannot know how large the arguments are.
13242
It is undefined behavior to call this intrinsic with a token from an
'``llvm.call.preallocated.setup``' if another
'``llvm.call.preallocated.setup``' has already been called or if the
preallocated call corresponding to the '``llvm.call.preallocated.setup``'
has already been called.
13248
13249 .. _int_call_preallocated_teardown:
13250
13251 '``llvm.call.preallocated.teardown``' Intrinsic
13252 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13253
13254 Syntax:
13255 """""""
13256
13257 ::
13258
      declare void @llvm.call.preallocated.teardown(token %setup_token)
13260
13261 Overview:
13262 """""""""
13263
13264 The '``llvm.call.preallocated.teardown``' intrinsic cleans up the stack
13265 created by a '``llvm.call.preallocated.setup``'.
13266
13267 Semantics:
13268 """"""""""
13269
The token argument must be the result of a call to
'``llvm.call.preallocated.setup``'.

The '``llvm.call.preallocated.teardown``' intrinsic cleans up the stack
allocated by the corresponding '``llvm.call.preallocated.setup``'. Exactly
one of this intrinsic or the preallocated call must be called to prevent
stack leaks. It is undefined behavior to call both a
'``llvm.call.preallocated.teardown``' and the preallocated call for a given
'``llvm.call.preallocated.setup``'.
13277
For example, if the stack is allocated for a preallocated call by a
'``llvm.call.preallocated.setup``' and an initializer function called on an
allocated argument then throws an exception, there should be a
'``llvm.call.preallocated.teardown``' in the exception handler to prevent
stack leaks.
13283
13284 Following the nesting rules in '``llvm.call.preallocated.setup``', nested
13285 calls to '``llvm.call.preallocated.setup``' and
13286 '``llvm.call.preallocated.teardown``' are allowed but must be properly
13287 nested.
13288
13289 Example:
13290 """"""""
13291
13292 .. code-block:: llvm
13293
13294         %cs = call token @llvm.call.preallocated.setup(i32 1)
13295         %x = call i8* @llvm.call.preallocated.arg(token %cs, i32 0) preallocated(i32)
13296         %y = bitcast i8* %x to i32*
13297         invoke void @constructor(i32* %y) to label %conta unwind label %contb
13298     conta:
13299         call void @foo1(i32* preallocated(i32) %y) ["preallocated"(token %cs)]
13300         ret void
13301     contb:
13302         %s = catchswitch within none [label %catch] unwind to caller
13303     catch:
13304         %p = catchpad within %s []
13305         call void @llvm.call.preallocated.teardown(token %cs)
13306         ret void
13307
13308 Standard C/C++ Library Intrinsics
13309 ---------------------------------
13310
13311 LLVM provides intrinsics for a few important standard C/C++ library
13312 functions. These intrinsics allow source-language front-ends to pass
13313 information about the alignment of the pointer arguments to the code
13314 generator, providing opportunity for more efficient code generation.
13315
13316
13317 '``llvm.abs.*``' Intrinsic
13318 ^^^^^^^^^^^^^^^^^^^^^^^^^^
13319
13320 Syntax:
13321 """""""
13322
13323 This is an overloaded intrinsic. You can use ``llvm.abs`` on any
13324 integer bit width or any vector of integer elements.
13325
13326 ::
13327
13328       declare i32 @llvm.abs.i32(i32 <src>, i1 <is_int_min_poison>)
13329       declare <4 x i32> @llvm.abs.v4i32(<4 x i32> <src>, i1 <is_int_min_poison>)
13330
13331 Overview:
13332 """""""""
13333
13334 The '``llvm.abs``' family of intrinsic functions returns the absolute value
13335 of an argument.
13336
13337 Arguments:
13338 """"""""""
13339
13340 The first argument is the value for which the absolute value is to be returned.
13341 This argument may be of any integer type or a vector with integer element type.
13342 The return type must match the first argument type.
13343
13344 The second argument must be a constant and is a flag to indicate whether the
13345 result value of the '``llvm.abs``' intrinsic is a
13346 :ref:`poison value <poisonvalues>` if the argument is statically or dynamically
13347 an ``INT_MIN`` value.
13348
13349 Semantics:
13350 """"""""""
13351
13352 The '``llvm.abs``' intrinsic returns the magnitude (always positive) of the
argument or each element of a vector argument. If the argument is ``INT_MIN``,
13354 then the result is also ``INT_MIN`` if ``is_int_min_poison == 0`` and
13355 ``poison`` otherwise.
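
The following sketch shows both settings of the flag. The function names
``@abs_wrapping`` and ``@abs_poison`` are hypothetical and serve only as
illustration.

.. code-block:: llvm

      declare i32 @llvm.abs.i32(i32, i1)

      define i32 @abs_wrapping(i32 %x) {
        ; is_int_min_poison is false: an INT_MIN input yields INT_MIN.
        %r = call i32 @llvm.abs.i32(i32 %x, i1 false)
        ret i32 %r
      }

      define i32 @abs_poison(i32 %x) {
        ; is_int_min_poison is true: an INT_MIN input yields poison.
        %r = call i32 @llvm.abs.i32(i32 %x, i1 true)
        ret i32 %r
      }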
13356
13357
13358 '``llvm.smax.*``' Intrinsic
13359 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
13360
13361 Syntax:
13362 """""""
13363
13364 This is an overloaded intrinsic. You can use ``@llvm.smax`` on any
13365 integer bit width or any vector of integer elements.
13366
13367 ::
13368
13369       declare i32 @llvm.smax.i32(i32 %a, i32 %b)
13370       declare <4 x i32> @llvm.smax.v4i32(<4 x i32> %a, <4 x i32> %b)
13371
13372 Overview:
13373 """""""""
13374
13375 Return the larger of ``%a`` and ``%b`` comparing the values as signed integers.
13376 Vector intrinsics operate on a per-element basis. The larger element of ``%a``
13377 and ``%b`` at a given index is returned for that index.
13378
13379 Arguments:
13380 """"""""""
13381
13382 The arguments (``%a`` and ``%b``) may be of any integer type or a vector with
13383 integer element type. The argument types must match each other, and the return
13384 type must match the argument type.
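
As a sketch, for well-defined inputs the scalar form computes the same
result as a signed comparison followed by a ``select``. The function names
below are hypothetical.

.. code-block:: llvm

      declare i32 @llvm.smax.i32(i32, i32)

      define i32 @smax_intrinsic(i32 %a, i32 %b) {
        %r = call i32 @llvm.smax.i32(i32 %a, i32 %b)
        ret i32 %r
      }

      define i32 @smax_expanded(i32 %a, i32 %b) {
        ; Pick %a when it is greater than %b as a signed integer.
        %cmp = icmp sgt i32 %a, %b
        %r = select i1 %cmp, i32 %a, i32 %b
        ret i32 %r
      }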
13385
13386
13387 '``llvm.smin.*``' Intrinsic
13388 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
13389
13390 Syntax:
13391 """""""
13392
13393 This is an overloaded intrinsic. You can use ``@llvm.smin`` on any
13394 integer bit width or any vector of integer elements.
13395
13396 ::
13397
13398       declare i32 @llvm.smin.i32(i32 %a, i32 %b)
13399       declare <4 x i32> @llvm.smin.v4i32(<4 x i32> %a, <4 x i32> %b)
13400
13401 Overview:
13402 """""""""
13403
13404 Return the smaller of ``%a`` and ``%b`` comparing the values as signed integers.
13405 Vector intrinsics operate on a per-element basis. The smaller element of ``%a``
13406 and ``%b`` at a given index is returned for that index.
13407
13408 Arguments:
13409 """"""""""
13410
13411 The arguments (``%a`` and ``%b``) may be of any integer type or a vector with
13412 integer element type. The argument types must match each other, and the return
13413 type must match the argument type.
13414
13415
13416 '``llvm.umax.*``' Intrinsic
13417 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
13418
13419 Syntax:
13420 """""""
13421
13422 This is an overloaded intrinsic. You can use ``@llvm.umax`` on any
13423 integer bit width or any vector of integer elements.
13424
13425 ::
13426
13427       declare i32 @llvm.umax.i32(i32 %a, i32 %b)
13428       declare <4 x i32> @llvm.umax.v4i32(<4 x i32> %a, <4 x i32> %b)
13429
13430 Overview:
13431 """""""""
13432
13433 Return the larger of ``%a`` and ``%b`` comparing the values as unsigned
13434 integers. Vector intrinsics operate on a per-element basis. The larger element
13435 of ``%a`` and ``%b`` at a given index is returned for that index.
13436
13437 Arguments:
13438 """"""""""
13439
13440 The arguments (``%a`` and ``%b``) may be of any integer type or a vector with
13441 integer element type. The argument types must match each other, and the return
13442 type must match the argument type.
13443
13444
13445 '``llvm.umin.*``' Intrinsic
13446 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
13447
13448 Syntax:
13449 """""""
13450
13451 This is an overloaded intrinsic. You can use ``@llvm.umin`` on any
13452 integer bit width or any vector of integer elements.
13453
13454 ::
13455
13456       declare i32 @llvm.umin.i32(i32 %a, i32 %b)
13457       declare <4 x i32> @llvm.umin.v4i32(<4 x i32> %a, <4 x i32> %b)
13458
13459 Overview:
13460 """""""""
13461
13462 Return the smaller of ``%a`` and ``%b`` comparing the values as unsigned
13463 integers. Vector intrinsics operate on a per-element basis. The smaller element
13464 of ``%a`` and ``%b`` at a given index is returned for that index.
13465
13466 Arguments:
13467 """"""""""
13468
13469 The arguments (``%a`` and ``%b``) may be of any integer type or a vector with
13470 integer element type. The argument types must match each other, and the return
13471 type must match the argument type.
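
The signed and unsigned variants can disagree on the same bit patterns. In
the sketch below (function names are hypothetical, and the ``i8`` overloads
are assumed to follow the same naming scheme as the declarations above),
``i8 -1`` has the bit pattern ``0xFF``, which is 255 when read as unsigned.

.. code-block:: llvm

      declare i8 @llvm.smin.i8(i8, i8)
      declare i8 @llvm.umin.i8(i8, i8)

      define i8 @signed_min() {
        ; As signed integers, -1 is less than 1, so the result is -1.
        %r = call i8 @llvm.smin.i8(i8 -1, i8 1)
        ret i8 %r
      }

      define i8 @unsigned_min() {
        ; As unsigned integers, -1 is 255, so the result is 1.
        %r = call i8 @llvm.umin.i8(i8 -1, i8 1)
        ret i8 %r
      }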
13472
13473
13474 .. _int_memcpy:
13475
13476 '``llvm.memcpy``' Intrinsic
13477 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
13478
13479 Syntax:
13480 """""""
13481
13482 This is an overloaded intrinsic. You can use ``llvm.memcpy`` on any
13483 integer bit width and for different address spaces. Not all targets
13484 support all bit widths however.
13485
13486 ::
13487
13488       declare void @llvm.memcpy.p0i8.p0i8.i32(i8* <dest>, i8* <src>,
13489                                               i32 <len>, i1 <isvolatile>)
13490       declare void @llvm.memcpy.p0i8.p0i8.i64(i8* <dest>, i8* <src>,
13491                                               i64 <len>, i1 <isvolatile>)
13492
13493 Overview:
13494 """""""""
13495
13496 The '``llvm.memcpy.*``' intrinsics copy a block of memory from the
13497 source location to the destination location.
13498
Note that, unlike the standard libc function, the ``llvm.memcpy.*``
intrinsics do not return a value, take an extra ``isvolatile``
argument, and the pointers can be in specified address spaces.
13502
13503 Arguments:
13504 """"""""""
13505
13506 The first argument is a pointer to the destination, the second is a
13507 pointer to the source. The third argument is an integer argument
13508 specifying the number of bytes to copy, and the fourth is a
13509 boolean indicating a volatile access.
13510
13511 The :ref:`align <attr_align>` parameter attribute can be provided
13512 for the first and second arguments.
13513
13514 If the ``isvolatile`` parameter is ``true``, the ``llvm.memcpy`` call is
13515 a :ref:`volatile operation <volatile>`. The detailed access behavior is not
13516 very cleanly specified and it is unwise to depend on it.
13517
13518 Semantics:
13519 """"""""""
13520
13521 The '``llvm.memcpy.*``' intrinsics copy a block of memory from the source
13522 location to the destination location, which must either be equal or
13523 non-overlapping. It copies "len" bytes of memory over. If the argument is known
13524 to be aligned to some boundary, this can be specified as an attribute on the
13525 argument.
13526
If ``<len>`` is 0, it is a no-op modulo the behavior of attributes attached to
13528 the arguments.
13529 If ``<len>`` is not a well-defined value, the behavior is undefined.
13530 If ``<len>`` is not zero, both ``<dest>`` and ``<src>`` should be well-defined,
13531 otherwise the behavior is undefined.
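
A minimal sketch of a call with alignment attributes follows. The function
name ``@copy16`` and the 4-byte alignment are assumptions chosen for
illustration.

.. code-block:: llvm

      declare void @llvm.memcpy.p0i8.p0i8.i64(i8*, i8*, i64, i1)

      define void @copy16(i8* %dst, i8* %src) {
        ; Copy 16 bytes, non-volatile. Both pointers are assumed to be
        ; 4-byte aligned, and the regions must be equal or non-overlapping.
        call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 4 %dst, i8* align 4 %src,
                                             i64 16, i1 false)
        ret void
      }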
13532
13533 .. _int_memcpy_inline:
13534
13535 '``llvm.memcpy.inline``' Intrinsic
13536 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13537
13538 Syntax:
13539 """""""
13540
13541 This is an overloaded intrinsic. You can use ``llvm.memcpy.inline`` on any
13542 integer bit width and for different address spaces. Not all targets
13543 support all bit widths however.
13544
13545 ::
13546
13547       declare void @llvm.memcpy.inline.p0i8.p0i8.i32(i8* <dest>, i8* <src>,
13548                                                      i32 <len>, i1 <isvolatile>)
13549       declare void @llvm.memcpy.inline.p0i8.p0i8.i64(i8* <dest>, i8* <src>,
13550                                                      i64 <len>, i1 <isvolatile>)
13551
13552 Overview:
13553 """""""""
13554
The '``llvm.memcpy.inline.*``' intrinsics copy a block of memory from the
source location to the destination location and guarantee that no external
functions are called.
13558
Note that, unlike the standard libc function, the ``llvm.memcpy.inline.*``
intrinsics do not return a value, take an extra ``isvolatile``
argument, and the pointers can be in specified address spaces.
13562
13563 Arguments:
13564 """"""""""
13565
13566 The first argument is a pointer to the destination, the second is a
13567 pointer to the source. The third argument is a constant integer argument
13568 specifying the number of bytes to copy, and the fourth is a
13569 boolean indicating a volatile access.
13570
13571 The :ref:`align <attr_align>` parameter attribute can be provided
13572 for the first and second arguments.
13573
13574 If the ``isvolatile`` parameter is ``true``, the ``llvm.memcpy.inline`` call is
13575 a :ref:`volatile operation <volatile>`. The detailed access behavior is not
13576 very cleanly specified and it is unwise to depend on it.
13577
13578 Semantics:
13579 """"""""""
13580
13581 The '``llvm.memcpy.inline.*``' intrinsics copy a block of memory from the
13582 source location to the destination location, which are not allowed to
13583 overlap. It copies "len" bytes of memory over. If the argument is known
13584 to be aligned to some boundary, this can be specified as an attribute on
13585 the argument.
13586 The behavior of '``llvm.memcpy.inline.*``' is equivalent to the behavior of
13587 '``llvm.memcpy.*``', but the generated code is guaranteed not to call any
13588 external functions.
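
A minimal sketch follows; the function name ``@copy8_inline`` is
hypothetical. Note that the length operand is a constant, as required above.

.. code-block:: llvm

      declare void @llvm.memcpy.inline.p0i8.p0i8.i64(i8*, i8*, i64, i1)

      define void @copy8_inline(i8* %dst, i8* %src) {
        ; Copy a constant 8 bytes; the two regions must not overlap.
        call void @llvm.memcpy.inline.p0i8.p0i8.i64(i8* %dst, i8* %src,
                                                    i64 8, i1 false)
        ret void
      }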
13589
13590 .. _int_memmove:
13591
13592 '``llvm.memmove``' Intrinsic
13593 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13594
13595 Syntax:
13596 """""""
13597
This is an overloaded intrinsic. You can use ``llvm.memmove`` on any integer
bit width and for different address spaces. Not all targets support all
bit widths however.
13601
13602 ::
13603
13604       declare void @llvm.memmove.p0i8.p0i8.i32(i8* <dest>, i8* <src>,
13605                                                i32 <len>, i1 <isvolatile>)
13606       declare void @llvm.memmove.p0i8.p0i8.i64(i8* <dest>, i8* <src>,
13607                                                i64 <len>, i1 <isvolatile>)
13608
13609 Overview:
13610 """""""""
13611
13612 The '``llvm.memmove.*``' intrinsics move a block of memory from the
13613 source location to the destination location. It is similar to the
13614 '``llvm.memcpy``' intrinsic but allows the two memory locations to
13615 overlap.
13616
Note that, unlike the standard libc function, the ``llvm.memmove.*``
intrinsics do not return a value, take an extra ``isvolatile``
argument, and the pointers can be in specified address spaces.
13620
13621 Arguments:
13622 """"""""""
13623
13624 The first argument is a pointer to the destination, the second is a
13625 pointer to the source. The third argument is an integer argument
13626 specifying the number of bytes to copy, and the fourth is a
13627 boolean indicating a volatile access.
13628
13629 The :ref:`align <attr_align>` parameter attribute can be provided
13630 for the first and second arguments.
13631
13632 If the ``isvolatile`` parameter is ``true``, the ``llvm.memmove`` call
13633 is a :ref:`volatile operation <volatile>`. The detailed access behavior is
13634 not very cleanly specified and it is unwise to depend on it.
13635
13636 Semantics:
13637 """"""""""
13638
13639 The '``llvm.memmove.*``' intrinsics copy a block of memory from the
13640 source location to the destination location, which may overlap. It
13641 copies "len" bytes of memory over. If the argument is known to be
13642 aligned to some boundary, this can be specified as an attribute on
13643 the argument.
13644
If ``<len>`` is 0, it is a no-op modulo the behavior of attributes attached to
13646 the arguments.
13647 If ``<len>`` is not a well-defined value, the behavior is undefined.
13648 If ``<len>`` is not zero, both ``<dest>`` and ``<src>`` should be well-defined,
13649 otherwise the behavior is undefined.
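
As a sketch of an overlapping move, the function below shifts the contents
of a buffer down by 4 bytes. The name ``@shift_down`` is hypothetical, and
``%buf`` is assumed to point to at least 16 valid bytes.

.. code-block:: llvm

      declare void @llvm.memmove.p0i8.p0i8.i64(i8*, i8*, i64, i1)

      define void @shift_down(i8* %buf) {
        ; The source begins 4 bytes into the buffer, so it overlaps the
        ; destination.
        %src = getelementptr inbounds i8, i8* %buf, i64 4
        call void @llvm.memmove.p0i8.p0i8.i64(i8* %buf, i8* %src, i64 12, i1 false)
        ret void
      }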
13650
13651 .. _int_memset:
13652
13653 '``llvm.memset.*``' Intrinsics
13654 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13655
13656 Syntax:
13657 """""""
13658
This is an overloaded intrinsic. You can use ``llvm.memset`` on any integer
bit width and for different address spaces. However, not all targets
support all bit widths.
13662
13663 ::
13664
13665       declare void @llvm.memset.p0i8.i32(i8* <dest>, i8 <val>,
13666                                          i32 <len>, i1 <isvolatile>)
13667       declare void @llvm.memset.p0i8.i64(i8* <dest>, i8 <val>,
13668                                          i64 <len>, i1 <isvolatile>)
13669
13670 Overview:
13671 """""""""
13672
13673 The '``llvm.memset.*``' intrinsics fill a block of memory with a
13674 particular byte value.
13675
13676 Note that, unlike the standard libc function, the ``llvm.memset``
13677 intrinsic does not return a value and takes an extra volatile
13678 argument. Also, the destination can be in an arbitrary address space.
13679
13680 Arguments:
13681 """"""""""
13682
13683 The first argument is a pointer to the destination to fill, the second
13684 is the byte value with which to fill it, the third argument is an
13685 integer argument specifying the number of bytes to fill, and the fourth
13686 is a boolean indicating a volatile access.
13687
13688 The :ref:`align <attr_align>` parameter attribute can be provided
for the first argument.
13690
13691 If the ``isvolatile`` parameter is ``true``, the ``llvm.memset`` call is
13692 a :ref:`volatile operation <volatile>`. The detailed access behavior is not
13693 very cleanly specified and it is unwise to depend on it.
13694
13695 Semantics:
13696 """"""""""
13697
13698 The '``llvm.memset.*``' intrinsics fill "len" bytes of memory starting
13699 at the destination location. If the argument is known to be
13700 aligned to some boundary, this can be specified as an attribute on
13701 the argument.
13702
If ``<len>`` is 0, it is a no-op modulo the behavior of attributes attached to
the arguments.
If ``<len>`` is not a well-defined value, the behavior is undefined.
If ``<len>`` is not zero, ``<dest>`` should be well-defined, otherwise the
behavior is undefined.
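
A minimal sketch of zeroing a buffer follows. The function name ``@zero32``
and the 8-byte alignment are assumptions chosen for illustration.

.. code-block:: llvm

      declare void @llvm.memset.p0i8.i64(i8*, i8, i64, i1)

      define void @zero32(i8* %dst) {
        ; Fill 32 bytes with the byte value 0, non-volatile. The destination
        ; is assumed to be 8-byte aligned.
        call void @llvm.memset.p0i8.i64(i8* align 8 %dst, i8 0, i64 32, i1 false)
        ret void
      }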
13708
13709 '``llvm.sqrt.*``' Intrinsic
13710 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
13711
13712 Syntax:
13713 """""""
13714
13715 This is an overloaded intrinsic. You can use ``llvm.sqrt`` on any
13716 floating-point or vector of floating-point type. Not all targets support
13717 all types however.
13718
13719 ::
13720
13721       declare float     @llvm.sqrt.f32(float %Val)
13722       declare double    @llvm.sqrt.f64(double %Val)
13723       declare x86_fp80  @llvm.sqrt.f80(x86_fp80 %Val)
13724       declare fp128     @llvm.sqrt.f128(fp128 %Val)
13725       declare ppc_fp128 @llvm.sqrt.ppcf128(ppc_fp128 %Val)
13726
13727 Overview:
13728 """""""""
13729
13730 The '``llvm.sqrt``' intrinsics return the square root of the specified value.
13731
13732 Arguments:
13733 """"""""""
13734
13735 The argument and return value are floating-point numbers of the same type.
13736
13737 Semantics:
13738 """"""""""
13739
13740 Return the same value as a corresponding libm '``sqrt``' function but without
13741 trapping or setting ``errno``. For types specified by IEEE-754, the result
13742 matches a conforming libm implementation.
13743
13744 When specified with the fast-math-flag 'afn', the result may be approximated
13745 using a less accurate calculation.
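
The sketch below contrasts a plain call with one that carries the ``afn``
flag. The function names are hypothetical.

.. code-block:: llvm

      declare float @llvm.sqrt.f32(float)

      define float @sqrt_plain(float %x) {
        ; No fast-math flags are set on this call.
        %r = call float @llvm.sqrt.f32(float %x)
        ret float %r
      }

      define float @sqrt_approx(float %x) {
        ; With 'afn', a less accurate approximation may be used.
        %r = call afn float @llvm.sqrt.f32(float %x)
        ret float %r
      }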
13746
13747 '``llvm.powi.*``' Intrinsic
13748 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
13749
13750 Syntax:
13751 """""""
13752
13753 This is an overloaded intrinsic. You can use ``llvm.powi`` on any
13754 floating-point or vector of floating-point type. Not all targets support
13755 all types however.
13756
Generally, the only supported type for the exponent is the one matching
the C type ``int``.
13759
13760 ::
13761
13762       declare float     @llvm.powi.f32.i32(float  %Val, i32 %power)
13763       declare double    @llvm.powi.f64.i16(double %Val, i16 %power)
13764       declare x86_fp80  @llvm.powi.f80.i32(x86_fp80  %Val, i32 %power)
13765       declare fp128     @llvm.powi.f128.i32(fp128 %Val, i32 %power)
13766       declare ppc_fp128 @llvm.powi.ppcf128.i32(ppc_fp128  %Val, i32 %power)
13767
13768 Overview:
13769 """""""""
13770
13771 The '``llvm.powi.*``' intrinsics return the first operand raised to the
13772 specified (positive or negative) power. The order of evaluation of
13773 multiplications is not defined. When a vector of floating-point type is
13774 used, the second argument remains a scalar integer value.
13775
13776 Arguments:
13777 """"""""""
13778
13779 The second argument is an integer power, and the first is a value to
13780 raise to that power.
13781
13782 Semantics:
13783 """"""""""
13784
13785 This function returns the first value raised to the second power with an
13786 unspecified sequence of rounding operations.
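
As a sketch, the scalar and vector forms can be called as follows. The
function names are hypothetical, and the vector overload name
``llvm.powi.v4f32.i32`` is assumed to follow the same naming scheme as the
scalar declarations above.

.. code-block:: llvm

      declare float @llvm.powi.f32.i32(float, i32)
      declare <4 x float> @llvm.powi.v4f32.i32(<4 x float>, i32)

      define float @cube(float %x) {
        ; Raise %x to the third power.
        %r = call float @llvm.powi.f32.i32(float %x, i32 3)
        ret float %r
      }

      define <4 x float> @vec_powi(<4 x float> %v, i32 %n) {
        ; The exponent remains a scalar even for a vector operand.
        %r = call <4 x float> @llvm.powi.v4f32.i32(<4 x float> %v, i32 %n)
        ret <4 x float> %r
      }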
13787
13788 '``llvm.sin.*``' Intrinsic
13789 ^^^^^^^^^^^^^^^^^^^^^^^^^^
13790
13791 Syntax:
13792 """""""
13793
13794 This is an overloaded intrinsic. You can use ``llvm.sin`` on any
13795 floating-point or vector of floating-point type. Not all targets support
13796 all types however.
13797
13798 ::
13799
13800       declare float     @llvm.sin.f32(float  %Val)
13801       declare double    @llvm.sin.f64(double %Val)
13802       declare x86_fp80  @llvm.sin.f80(x86_fp80  %Val)
13803       declare fp128     @llvm.sin.f128(fp128 %Val)
13804       declare ppc_fp128 @llvm.sin.ppcf128(ppc_fp128  %Val)
13805
13806 Overview:
13807 """""""""
13808
13809 The '``llvm.sin.*``' intrinsics return the sine of the operand.
13810
13811 Arguments:
13812 """"""""""
13813
13814 The argument and return value are floating-point numbers of the same type.
13815
13816 Semantics:
13817 """"""""""
13818
13819 Return the same value as a corresponding libm '``sin``' function but without
13820 trapping or setting ``errno``.
13821
13822 When specified with the fast-math-flag 'afn', the result may be approximated
13823 using a less accurate calculation.
13824
13825 '``llvm.cos.*``' Intrinsic
13826 ^^^^^^^^^^^^^^^^^^^^^^^^^^
13827
13828 Syntax:
13829 """""""
13830
13831 This is an overloaded intrinsic. You can use ``llvm.cos`` on any
13832 floating-point or vector of floating-point type. Not all targets support
13833 all types however.
13834
13835 ::
13836
13837       declare float     @llvm.cos.f32(float  %Val)
13838       declare double    @llvm.cos.f64(double %Val)
13839       declare x86_fp80  @llvm.cos.f80(x86_fp80  %Val)
13840       declare fp128     @llvm.cos.f128(fp128 %Val)
13841       declare ppc_fp128 @llvm.cos.ppcf128(ppc_fp128  %Val)
13842
13843 Overview:
13844 """""""""
13845
13846 The '``llvm.cos.*``' intrinsics return the cosine of the operand.
13847
13848 Arguments:
13849 """"""""""
13850
13851 The argument and return value are floating-point numbers of the same type.
13852
13853 Semantics:
13854 """"""""""
13855
13856 Return the same value as a corresponding libm '``cos``' function but without
13857 trapping or setting ``errno``.
13858
13859 When specified with the fast-math-flag 'afn', the result may be approximated
13860 using a less accurate calculation.
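
The declarations above list only scalar types. As a sketch of the vector
form, assuming the overload suffix ``v4f32`` follows the naming scheme used
by other overloaded intrinsics in this document, a call might look like the
following. The function name ``@vec_cos`` is hypothetical.

.. code-block:: llvm

      declare <4 x float> @llvm.cos.v4f32(<4 x float>)

      define <4 x float> @vec_cos(<4 x float> %v) {
        ; Compute the cosine of each element of %v.
        %r = call <4 x float> @llvm.cos.v4f32(<4 x float> %v)
        ret <4 x float> %r
      }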
13861
13862 '``llvm.pow.*``' Intrinsic
13863 ^^^^^^^^^^^^^^^^^^^^^^^^^^
13864
13865 Syntax:
13866 """""""
13867
13868 This is an overloaded intrinsic. You can use ``llvm.pow`` on any
13869 floating-point or vector of floating-point type. Not all targets support
13870 all types however.
13871
13872 ::
13873
13874       declare float     @llvm.pow.f32(float  %Val, float %Power)
13875       declare double    @llvm.pow.f64(double %Val, double %Power)
13876       declare x86_fp80  @llvm.pow.f80(x86_fp80  %Val, x86_fp80 %Power)
13877       declare fp128     @llvm.pow.f128(fp128 %Val, fp128 %Power)
      declare ppc_fp128 @llvm.pow.ppcf128(ppc_fp128  %Val, ppc_fp128 %Power)
13879
13880 Overview:
13881 """""""""
13882
13883 The '``llvm.pow.*``' intrinsics return the first operand raised to the
13884 specified (positive or negative) power.
13885
13886 Arguments:
13887 """"""""""
13888
13889 The arguments and return value are floating-point numbers of the same type.
13890
13891 Semantics:
13892 """"""""""
13893
13894 Return the same value as a corresponding libm '``pow``' function but without
13895 trapping or setting ``errno``.
13896
13897 When specified with the fast-math-flag 'afn', the result may be approximated
13898 using a less accurate calculation.
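
Unlike '``llvm.powi.*``', both operands are floating-point values, so the
exponent can be fractional. A minimal sketch, with a hypothetical function
name:

.. code-block:: llvm

      declare double @llvm.pow.f64(double, double)

      define double @pow_three_halves(double %x) {
        ; Raise %x to the power 1.5.
        %r = call double @llvm.pow.f64(double %x, double 1.500000e+00)
        ret double %r
      }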
13899
13900 '``llvm.exp.*``' Intrinsic
13901 ^^^^^^^^^^^^^^^^^^^^^^^^^^
13902
13903 Syntax:
13904 """""""
13905
13906 This is an overloaded intrinsic. You can use ``llvm.exp`` on any
13907 floating-point or vector of floating-point type. Not all targets support
13908 all types however.
13909
13910 ::
13911
13912       declare float     @llvm.exp.f32(float  %Val)
13913       declare double    @llvm.exp.f64(double %Val)
13914       declare x86_fp80  @llvm.exp.f80(x86_fp80  %Val)
13915       declare fp128     @llvm.exp.f128(fp128 %Val)
13916       declare ppc_fp128 @llvm.exp.ppcf128(ppc_fp128  %Val)
13917
13918 Overview:
13919 """""""""
13920
13921 The '``llvm.exp.*``' intrinsics compute the base-e exponential of the specified
13922 value.
13923
13924 Arguments:
13925 """"""""""
13926
13927 The argument and return value are floating-point numbers of the same type.
13928
13929 Semantics:
13930 """"""""""
13931
13932 Return the same value as a corresponding libm '``exp``' function but without
13933 trapping or setting ``errno``.
13934
13935 When specified with the fast-math-flag 'afn', the result may be approximated
13936 using a less accurate calculation.
13937
13938 '``llvm.exp2.*``' Intrinsic
13939 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
13940
13941 Syntax:
13942 """""""
13943
13944 This is an overloaded intrinsic. You can use ``llvm.exp2`` on any
13945 floating-point or vector of floating-point type. Not all targets support
13946 all types however.
13947
13948 ::
13949
13950       declare float     @llvm.exp2.f32(float  %Val)
13951       declare double    @llvm.exp2.f64(double %Val)
13952       declare x86_fp80  @llvm.exp2.f80(x86_fp80  %Val)
13953       declare fp128     @llvm.exp2.f128(fp128 %Val)
13954       declare ppc_fp128 @llvm.exp2.ppcf128(ppc_fp128  %Val)
13955
13956 Overview:
13957 """""""""
13958
13959 The '``llvm.exp2.*``' intrinsics compute the base-2 exponential of the
13960 specified value.
13961
13962 Arguments:
13963 """"""""""
13964
13965 The argument and return value are floating-point numbers of the same type.
13966
13967 Semantics:
13968 """"""""""
13969
13970 Return the same value as a corresponding libm '``exp2``' function but without
13971 trapping or setting ``errno``.
13972
13973 When specified with the fast-math-flag 'afn', the result may be approximated
13974 using a less accurate calculation.
13975
13976 '``llvm.log.*``' Intrinsic
13977 ^^^^^^^^^^^^^^^^^^^^^^^^^^
13978
13979 Syntax:
13980 """""""
13981
13982 This is an overloaded intrinsic. You can use ``llvm.log`` on any
13983 floating-point or vector of floating-point type. Not all targets support
13984 all types however.
13985
13986 ::
13987
13988       declare float     @llvm.log.f32(float  %Val)
13989       declare double    @llvm.log.f64(double %Val)
13990       declare x86_fp80  @llvm.log.f80(x86_fp80  %Val)
13991       declare fp128     @llvm.log.f128(fp128 %Val)
13992       declare ppc_fp128 @llvm.log.ppcf128(ppc_fp128  %Val)
13993
13994 Overview:
13995 """""""""
13996
13997 The '``llvm.log.*``' intrinsics compute the base-e logarithm of the specified
13998 value.
13999
14000 Arguments:
14001 """"""""""
14002
14003 The argument and return value are floating-point numbers of the same type.
14004
14005 Semantics:
14006 """"""""""
14007
14008 Return the same value as a corresponding libm '``log``' function but without
14009 trapping or setting ``errno``.
14010
14011 When specified with the fast-math-flag 'afn', the result may be approximated
14012 using a less accurate calculation.
14013
14014 '``llvm.log10.*``' Intrinsic
14015 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14016
14017 Syntax:
14018 """""""
14019
14020 This is an overloaded intrinsic. You can use ``llvm.log10`` on any
14021 floating-point or vector of floating-point type. Not all targets support
14022 all types however.
14023
14024 ::
14025
14026       declare float     @llvm.log10.f32(float  %Val)
14027       declare double    @llvm.log10.f64(double %Val)
14028       declare x86_fp80  @llvm.log10.f80(x86_fp80  %Val)
14029       declare fp128     @llvm.log10.f128(fp128 %Val)
14030       declare ppc_fp128 @llvm.log10.ppcf128(ppc_fp128  %Val)
14031
14032 Overview:
14033 """""""""
14034
14035 The '``llvm.log10.*``' intrinsics compute the base-10 logarithm of the
14036 specified value.
14037
14038 Arguments:
14039 """"""""""
14040
14041 The argument and return value are floating-point numbers of the same type.
14042
14043 Semantics:
14044 """"""""""
14045
14046 Return the same value as a corresponding libm '``log10``' function but without
14047 trapping or setting ``errno``.
14048
14049 When specified with the fast-math-flag 'afn', the result may be approximated
14050 using a less accurate calculation.
14051
14052 '``llvm.log2.*``' Intrinsic
14053 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
14054
14055 Syntax:
14056 """""""
14057
14058 This is an overloaded intrinsic. You can use ``llvm.log2`` on any
14059 floating-point or vector of floating-point type. Not all targets support
14060 all types however.
14061
14062 ::
14063
14064       declare float     @llvm.log2.f32(float  %Val)
14065       declare double    @llvm.log2.f64(double %Val)
14066       declare x86_fp80  @llvm.log2.f80(x86_fp80  %Val)
14067       declare fp128     @llvm.log2.f128(fp128 %Val)
14068       declare ppc_fp128 @llvm.log2.ppcf128(ppc_fp128  %Val)
14069
14070 Overview:
14071 """""""""
14072
14073 The '``llvm.log2.*``' intrinsics compute the base-2 logarithm of the specified
14074 value.
14075
14076 Arguments:
14077 """"""""""
14078
14079 The argument and return value are floating-point numbers of the same type.
14080
14081 Semantics:
14082 """"""""""
14083
14084 Return the same value as a corresponding libm '``log2``' function but without
14085 trapping or setting ``errno``.
14086
14087 When specified with the fast-math-flag 'afn', the result may be approximated
14088 using a less accurate calculation.
14089
14090 .. _int_fma:
14091
14092 '``llvm.fma.*``' Intrinsic
14093 ^^^^^^^^^^^^^^^^^^^^^^^^^^
14094
14095 Syntax:
14096 """""""
14097
14098 This is an overloaded intrinsic. You can use ``llvm.fma`` on any
14099 floating-point or vector of floating-point type. Not all targets support
14100 all types however.
14101
14102 ::
14103
14104       declare float     @llvm.fma.f32(float  %a, float  %b, float  %c)
14105       declare double    @llvm.fma.f64(double %a, double %b, double %c)
14106       declare x86_fp80  @llvm.fma.f80(x86_fp80 %a, x86_fp80 %b, x86_fp80 %c)
14107       declare fp128     @llvm.fma.f128(fp128 %a, fp128 %b, fp128 %c)
14108       declare ppc_fp128 @llvm.fma.ppcf128(ppc_fp128 %a, ppc_fp128 %b, ppc_fp128 %c)
14109
14110 Overview:
14111 """""""""
14112
14113 The '``llvm.fma.*``' intrinsics perform the fused multiply-add operation.
14114
14115 Arguments:
14116 """"""""""
14117
14118 The arguments and return value are floating-point numbers of the same type.
14119
14120 Semantics:
14121 """"""""""
14122
14123 Return the same value as a corresponding libm '``fma``' function but without
14124 trapping or setting ``errno``.
14125
14126 When specified with the fast-math-flag 'afn', the result may be approximated
14127 using a less accurate calculation.
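
Example:
""""""""

A minimal sketch, assuming the default floating-point environment. The
function name ``@fma_example`` and the constant operands are illustrative
and not part of this reference. As with the libm ``fma`` function, the
product is not rounded separately before the addend is applied:

.. code-block:: llvm

      declare float @llvm.fma.f32(float, float, float)

      ; Hypothetical wrapper, used only to make the snippet a complete module.
      define float @fma_example() {
        ; Computes 2.0 * 3.0 + 1.0 as one fused operation.
        %r = call float @llvm.fma.f32(float 2.0, float 3.0, float 1.0)  ; %r = 7.0
        ret float %r
      }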
14128
14129 '``llvm.fabs.*``' Intrinsic
14130 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
14131
14132 Syntax:
14133 """""""
14134
14135 This is an overloaded intrinsic. You can use ``llvm.fabs`` on any
14136 floating-point or vector of floating-point type. Not all targets support
14137 all types however.
14138
14139 ::
14140
14141       declare float     @llvm.fabs.f32(float  %Val)
14142       declare double    @llvm.fabs.f64(double %Val)
14143       declare x86_fp80  @llvm.fabs.f80(x86_fp80 %Val)
14144       declare fp128     @llvm.fabs.f128(fp128 %Val)
14145       declare ppc_fp128 @llvm.fabs.ppcf128(ppc_fp128 %Val)
14146
14147 Overview:
14148 """""""""
14149
14150 The '``llvm.fabs.*``' intrinsics return the absolute value of the
14151 operand.
14152
14153 Arguments:
14154 """"""""""
14155
14156 The argument and return value are floating-point numbers of the same
14157 type.
14158
14159 Semantics:
14160 """"""""""
14161
14162 This function returns the same values as the libm ``fabs`` functions
14163 would, and handles error conditions in the same way.
14164
14165 '``llvm.minnum.*``' Intrinsic
14166 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14167
14168 Syntax:
14169 """""""
14170
14171 This is an overloaded intrinsic. You can use ``llvm.minnum`` on any
14172 floating-point or vector of floating-point type. Not all targets support
14173 all types however.
14174
14175 ::
14176
14177       declare float     @llvm.minnum.f32(float %Val0, float %Val1)
14178       declare double    @llvm.minnum.f64(double %Val0, double %Val1)
14179       declare x86_fp80  @llvm.minnum.f80(x86_fp80 %Val0, x86_fp80 %Val1)
14180       declare fp128     @llvm.minnum.f128(fp128 %Val0, fp128 %Val1)
14181       declare ppc_fp128 @llvm.minnum.ppcf128(ppc_fp128 %Val0, ppc_fp128 %Val1)
14182
14183 Overview:
14184 """""""""
14185
14186 The '``llvm.minnum.*``' intrinsics return the minimum of the two
14187 arguments.
14188
14189
14190 Arguments:
14191 """"""""""
14192
14193 The arguments and return value are floating-point numbers of the same
14194 type.
14195
14196 Semantics:
14197 """"""""""
14198
14199 Follows the IEEE-754 semantics for minNum, except for handling of
14200 signaling NaNs. This matches the behavior of libm's fmin.
14201
14202 If either operand is a NaN, returns the other non-NaN operand. Returns
14203 NaN only if both operands are NaN. The returned NaN is always
14204 quiet. If the operands compare equal, returns a value that compares
14205 equal to both operands. This means that fmin(+/-0.0, +/-0.0) could
14206 return either -0.0 or 0.0.
14207
14208 Unlike the IEEE-754 2008 behavior, this does not distinguish between
14209 signaling and quiet NaN inputs. If a target's implementation follows
14210 the standard and returns a quiet NaN if either input is a signaling
14211 NaN, the intrinsic lowering is responsible for quieting the inputs to
14212 correctly return the non-NaN input (e.g. by using the equivalent of
14213 ``llvm.canonicalize``).
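
Example:
""""""""

A short sketch of the NaN handling described above. The function name
``@minnum_example`` is hypothetical, and ``0x7FF8000000000000`` is used here
as a quiet NaN constant:

.. code-block:: llvm

      declare float @llvm.minnum.f32(float, float)

      ; Hypothetical wrapper, used only to make the snippet a complete module.
      define float @minnum_example() {
        %a = call float @llvm.minnum.f32(float 1.0, float 2.0)                 ; %a = 1.0
        ; A single NaN operand is ignored and the other operand is returned.
        %b = call float @llvm.minnum.f32(float 1.0, float 0x7FF8000000000000)  ; %b = 1.0
        ret float %b
      }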
14214
14215
14216 '``llvm.maxnum.*``' Intrinsic
14217 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14218
14219 Syntax:
14220 """""""
14221
14222 This is an overloaded intrinsic. You can use ``llvm.maxnum`` on any
14223 floating-point or vector of floating-point type. Not all targets support
14224 all types however.
14225
14226 ::
14227
14228       declare float     @llvm.maxnum.f32(float  %Val0, float  %Val1)
14229       declare double    @llvm.maxnum.f64(double %Val0, double %Val1)
14230       declare x86_fp80  @llvm.maxnum.f80(x86_fp80  %Val0, x86_fp80  %Val1)
14231       declare fp128     @llvm.maxnum.f128(fp128 %Val0, fp128 %Val1)
14232       declare ppc_fp128 @llvm.maxnum.ppcf128(ppc_fp128  %Val0, ppc_fp128  %Val1)
14233
14234 Overview:
14235 """""""""
14236
14237 The '``llvm.maxnum.*``' intrinsics return the maximum of the two
14238 arguments.
14239
14240
14241 Arguments:
14242 """"""""""
14243
14244 The arguments and return value are floating-point numbers of the same
14245 type.
14246
14247 Semantics:
14248 """"""""""
14249 Follows the IEEE-754 semantics for maxNum except for the handling of
14250 signaling NaNs. This matches the behavior of libm's fmax.
14251
14252 If either operand is a NaN, returns the other non-NaN operand. Returns
14253 NaN only if both operands are NaN. The returned NaN is always
14254 quiet. If the operands compare equal, returns a value that compares
14255 equal to both operands. This means that fmax(+/-0.0, +/-0.0) could
14256 return either -0.0 or 0.0.
14257
14258 Unlike the IEEE-754 2008 behavior, this does not distinguish between
14259 signaling and quiet NaN inputs. If a target's implementation follows
14260 the standard and returns a quiet NaN if either input is a signaling
14261 NaN, the intrinsic lowering is responsible for quieting the inputs to
14262 correctly return the non-NaN input (e.g. by using the equivalent of
14263 ``llvm.canonicalize``).
14264
14265 '``llvm.minimum.*``' Intrinsic
14266 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14267
14268 Syntax:
14269 """""""
14270
14271 This is an overloaded intrinsic. You can use ``llvm.minimum`` on any
14272 floating-point or vector of floating-point type. Not all targets support
14273 all types however.
14274
14275 ::
14276
14277       declare float     @llvm.minimum.f32(float %Val0, float %Val1)
14278       declare double    @llvm.minimum.f64(double %Val0, double %Val1)
14279       declare x86_fp80  @llvm.minimum.f80(x86_fp80 %Val0, x86_fp80 %Val1)
14280       declare fp128     @llvm.minimum.f128(fp128 %Val0, fp128 %Val1)
14281       declare ppc_fp128 @llvm.minimum.ppcf128(ppc_fp128 %Val0, ppc_fp128 %Val1)
14282
14283 Overview:
14284 """""""""
14285
14286 The '``llvm.minimum.*``' intrinsics return the minimum of the two
14287 arguments, propagating NaNs and treating -0.0 as less than +0.0.
14288
14289
14290 Arguments:
14291 """"""""""
14292
14293 The arguments and return value are floating-point numbers of the same
14294 type.
14295
14296 Semantics:
14297 """"""""""
14298 If either operand is a NaN, returns NaN. Otherwise returns the lesser
14299 of the two arguments. -0.0 is considered to be less than +0.0 for this
14300 intrinsic. Note that these are the semantics specified in the draft of
14301 IEEE 754-2018.
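
Example:
""""""""

A short sketch contrasting this intrinsic with ``llvm.minnum``. The function
name ``@minimum_example`` is hypothetical, and ``0x7FF8000000000000`` is used
here as a quiet NaN constant:

.. code-block:: llvm

      declare float @llvm.minimum.f32(float, float)

      ; Hypothetical wrapper, used only to make the snippet a complete module.
      define float @minimum_example() {
        ; A NaN operand propagates to the result.
        %a = call float @llvm.minimum.f32(float 1.0, float 0x7FF8000000000000)  ; %a = NaN
        ; -0.0 is treated as less than +0.0.
        %b = call float @llvm.minimum.f32(float -0.0, float 0.0)                ; %b = -0.0
        ret float %b
      }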
14302
14303 '``llvm.maximum.*``' Intrinsic
14304 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14305
14306 Syntax:
14307 """""""
14308
14309 This is an overloaded intrinsic. You can use ``llvm.maximum`` on any
14310 floating-point or vector of floating-point type. Not all targets support
14311 all types however.
14312
14313 ::
14314
14315       declare float     @llvm.maximum.f32(float %Val0, float %Val1)
14316       declare double    @llvm.maximum.f64(double %Val0, double %Val1)
14317       declare x86_fp80  @llvm.maximum.f80(x86_fp80 %Val0, x86_fp80 %Val1)
14318       declare fp128     @llvm.maximum.f128(fp128 %Val0, fp128 %Val1)
14319       declare ppc_fp128 @llvm.maximum.ppcf128(ppc_fp128 %Val0, ppc_fp128 %Val1)
14320
14321 Overview:
14322 """""""""
14323
14324 The '``llvm.maximum.*``' intrinsics return the maximum of the two
14325 arguments, propagating NaNs and treating -0.0 as less than +0.0.
14326
14327
14328 Arguments:
14329 """"""""""
14330
14331 The arguments and return value are floating-point numbers of the same
14332 type.
14333
14334 Semantics:
14335 """"""""""
14336 If either operand is a NaN, returns NaN. Otherwise returns the greater
14337 of the two arguments. -0.0 is considered to be less than +0.0 for this
14338 intrinsic. Note that these are the semantics specified in the draft of
14339 IEEE 754-2018.
14340
14341 '``llvm.copysign.*``' Intrinsic
14342 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14343
14344 Syntax:
14345 """""""
14346
14347 This is an overloaded intrinsic. You can use ``llvm.copysign`` on any
14348 floating-point or vector of floating-point type. Not all targets support
14349 all types however.
14350
14351 ::
14352
14353       declare float     @llvm.copysign.f32(float  %Mag, float  %Sgn)
14354       declare double    @llvm.copysign.f64(double %Mag, double %Sgn)
14355       declare x86_fp80  @llvm.copysign.f80(x86_fp80  %Mag, x86_fp80  %Sgn)
14356       declare fp128     @llvm.copysign.f128(fp128 %Mag, fp128 %Sgn)
14357       declare ppc_fp128 @llvm.copysign.ppcf128(ppc_fp128  %Mag, ppc_fp128  %Sgn)
14358
14359 Overview:
14360 """""""""
14361
14362 The '``llvm.copysign.*``' intrinsics return a value with the magnitude of the
14363 first operand and the sign of the second operand.
14364
14365 Arguments:
14366 """"""""""
14367
14368 The arguments and return value are floating-point numbers of the same
14369 type.
14370
14371 Semantics:
14372 """"""""""
14373
14374 This function returns the same values as the libm ``copysign``
14375 functions would, and handles error conditions in the same way.
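
Example:
""""""""

A minimal sketch with illustrative constants; the function name
``@copysign_example`` is hypothetical. Only the sign of the second operand
is used, including the sign of a zero:

.. code-block:: llvm

      declare double @llvm.copysign.f64(double, double)

      ; Hypothetical wrapper, used only to make the snippet a complete module.
      define double @copysign_example() {
        %a = call double @llvm.copysign.f64(double 3.0, double -1.0)  ; %a = -3.0
        %b = call double @llvm.copysign.f64(double -3.0, double 0.0)  ; %b = 3.0
        %c = call double @llvm.copysign.f64(double 3.0, double -0.0)  ; %c = -3.0
        ret double %c
      }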
14376
14377 '``llvm.floor.*``' Intrinsic
14378 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14379
14380 Syntax:
14381 """""""
14382
14383 This is an overloaded intrinsic. You can use ``llvm.floor`` on any
14384 floating-point or vector of floating-point type. Not all targets support
14385 all types however.
14386
14387 ::
14388
14389       declare float     @llvm.floor.f32(float  %Val)
14390       declare double    @llvm.floor.f64(double %Val)
14391       declare x86_fp80  @llvm.floor.f80(x86_fp80  %Val)
14392       declare fp128     @llvm.floor.f128(fp128 %Val)
14393       declare ppc_fp128 @llvm.floor.ppcf128(ppc_fp128  %Val)
14394
14395 Overview:
14396 """""""""
14397
14398 The '``llvm.floor.*``' intrinsics return the floor of the operand.
14399
14400 Arguments:
14401 """"""""""
14402
14403 The argument and return value are floating-point numbers of the same
14404 type.
14405
14406 Semantics:
14407 """"""""""
14408
14409 This function returns the same values as the libm ``floor`` functions
14410 would, and handles error conditions in the same way.
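
Example:
""""""""

A minimal sketch with illustrative constants; the function name
``@floor_example`` is hypothetical. The result is the largest integral value
not greater than the operand, so negative inputs round away from zero:

.. code-block:: llvm

      declare float @llvm.floor.f32(float)

      ; Hypothetical wrapper, used only to make the snippet a complete module.
      define float @floor_example() {
        %a = call float @llvm.floor.f32(float 2.5)   ; %a = 2.0
        %b = call float @llvm.floor.f32(float -2.5)  ; %b = -3.0
        ret float %b
      }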
14411
14412 '``llvm.ceil.*``' Intrinsic
14413 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
14414
14415 Syntax:
14416 """""""
14417
14418 This is an overloaded intrinsic. You can use ``llvm.ceil`` on any
14419 floating-point or vector of floating-point type. Not all targets support
14420 all types however.
14421
14422 ::
14423
14424       declare float     @llvm.ceil.f32(float  %Val)
14425       declare double    @llvm.ceil.f64(double %Val)
14426       declare x86_fp80  @llvm.ceil.f80(x86_fp80  %Val)
14427       declare fp128     @llvm.ceil.f128(fp128 %Val)
14428       declare ppc_fp128 @llvm.ceil.ppcf128(ppc_fp128  %Val)
14429
14430 Overview:
14431 """""""""
14432
14433 The '``llvm.ceil.*``' intrinsics return the ceiling of the operand.
14434
14435 Arguments:
14436 """"""""""
14437
14438 The argument and return value are floating-point numbers of the same
14439 type.
14440
14441 Semantics:
14442 """"""""""
14443
14444 This function returns the same values as the libm ``ceil`` functions
14445 would, and handles error conditions in the same way.
14446
14447 '``llvm.trunc.*``' Intrinsic
14448 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14449
14450 Syntax:
14451 """""""
14452
14453 This is an overloaded intrinsic. You can use ``llvm.trunc`` on any
14454 floating-point or vector of floating-point type. Not all targets support
14455 all types however.
14456
14457 ::
14458
14459       declare float     @llvm.trunc.f32(float  %Val)
14460       declare double    @llvm.trunc.f64(double %Val)
14461       declare x86_fp80  @llvm.trunc.f80(x86_fp80  %Val)
14462       declare fp128     @llvm.trunc.f128(fp128 %Val)
14463       declare ppc_fp128 @llvm.trunc.ppcf128(ppc_fp128  %Val)
14464
14465 Overview:
14466 """""""""
14467
14468 The '``llvm.trunc.*``' intrinsics return the operand rounded to the
14469 nearest integer not larger in magnitude than the operand.
14470
14471 Arguments:
14472 """"""""""
14473
14474 The argument and return value are floating-point numbers of the same
14475 type.
14476
14477 Semantics:
14478 """"""""""
14479
14480 This function returns the same values as the libm ``trunc`` functions
14481 would, and handles error conditions in the same way.
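
Example:
""""""""

A minimal sketch with illustrative constants; the function name
``@trunc_example`` is hypothetical. The fractional part is discarded, so the
result is never larger in magnitude than the operand:

.. code-block:: llvm

      declare float @llvm.trunc.f32(float)

      ; Hypothetical wrapper, used only to make the snippet a complete module.
      define float @trunc_example() {
        %a = call float @llvm.trunc.f32(float 2.75)   ; %a = 2.0
        %b = call float @llvm.trunc.f32(float -2.75)  ; %b = -2.0
        ret float %b
      }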
14482
14483 '``llvm.rint.*``' Intrinsic
14484 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
14485
14486 Syntax:
14487 """""""
14488
14489 This is an overloaded intrinsic. You can use ``llvm.rint`` on any
14490 floating-point or vector of floating-point type. Not all targets support
14491 all types however.
14492
14493 ::
14494
14495       declare float     @llvm.rint.f32(float  %Val)
14496       declare double    @llvm.rint.f64(double %Val)
14497       declare x86_fp80  @llvm.rint.f80(x86_fp80  %Val)
14498       declare fp128     @llvm.rint.f128(fp128 %Val)
14499       declare ppc_fp128 @llvm.rint.ppcf128(ppc_fp128  %Val)
14500
14501 Overview:
14502 """""""""
14503
14504 The '``llvm.rint.*``' intrinsics return the operand rounded to the
14505 nearest integer. They may raise an inexact floating-point exception if the
14506 operand is not an integer.
14507
14508 Arguments:
14509 """"""""""
14510
14511 The argument and return value are floating-point numbers of the same
14512 type.
14513
14514 Semantics:
14515 """"""""""
14516
14517 This function returns the same values as the libm ``rint`` functions
14518 would, and handles error conditions in the same way.
14519
14520 '``llvm.nearbyint.*``' Intrinsic
14521 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14522
14523 Syntax:
14524 """""""
14525
14526 This is an overloaded intrinsic. You can use ``llvm.nearbyint`` on any
14527 floating-point or vector of floating-point type. Not all targets support
14528 all types however.
14529
14530 ::
14531
14532       declare float     @llvm.nearbyint.f32(float  %Val)
14533       declare double    @llvm.nearbyint.f64(double %Val)
14534       declare x86_fp80  @llvm.nearbyint.f80(x86_fp80  %Val)
14535       declare fp128     @llvm.nearbyint.f128(fp128 %Val)
14536       declare ppc_fp128 @llvm.nearbyint.ppcf128(ppc_fp128  %Val)
14537
14538 Overview:
14539 """""""""
14540
14541 The '``llvm.nearbyint.*``' intrinsics return the operand rounded to the
14542 nearest integer.
14543
14544 Arguments:
14545 """"""""""
14546
14547 The argument and return value are floating-point numbers of the same
14548 type.
14549
14550 Semantics:
14551 """"""""""
14552
14553 This function returns the same values as the libm ``nearbyint``
14554 functions would, and handles error conditions in the same way.
14555
14556 '``llvm.round.*``' Intrinsic
14557 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14558
14559 Syntax:
14560 """""""
14561
14562 This is an overloaded intrinsic. You can use ``llvm.round`` on any
14563 floating-point or vector of floating-point type. Not all targets support
14564 all types however.
14565
14566 ::
14567
14568       declare float     @llvm.round.f32(float  %Val)
14569       declare double    @llvm.round.f64(double %Val)
14570       declare x86_fp80  @llvm.round.f80(x86_fp80  %Val)
14571       declare fp128     @llvm.round.f128(fp128 %Val)
14572       declare ppc_fp128 @llvm.round.ppcf128(ppc_fp128  %Val)
14573
14574 Overview:
14575 """""""""
14576
14577 The '``llvm.round.*``' intrinsics return the operand rounded to the
14578 nearest integer, with halfway cases rounded away from zero.
14579
14580 Arguments:
14581 """"""""""
14582
14583 The argument and return value are floating-point numbers of the same
14584 type.
14585
14586 Semantics:
14587 """"""""""
14588
14589 This function returns the same values as the libm ``round``
14590 functions would, and handles error conditions in the same way.
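
Example:
""""""""

A minimal sketch with illustrative constants; the function name
``@round_example`` is hypothetical. Like the libm ``round`` function, halfway
cases are rounded away from zero:

.. code-block:: llvm

      declare float @llvm.round.f32(float)

      ; Hypothetical wrapper, used only to make the snippet a complete module.
      define float @round_example() {
        %a = call float @llvm.round.f32(float 2.25)  ; %a = 2.0
        %b = call float @llvm.round.f32(float 2.5)   ; %b = 3.0
        %c = call float @llvm.round.f32(float -2.5)  ; %c = -3.0
        ret float %c
      }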
14591
14592 '``llvm.roundeven.*``' Intrinsic
14593 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14594
14595 Syntax:
14596 """""""
14597
14598 This is an overloaded intrinsic. You can use ``llvm.roundeven`` on any
14599 floating-point or vector of floating-point type. Not all targets support
14600 all types however.
14601
14602 ::
14603
14604       declare float     @llvm.roundeven.f32(float  %Val)
14605       declare double    @llvm.roundeven.f64(double %Val)
14606       declare x86_fp80  @llvm.roundeven.f80(x86_fp80  %Val)
14607       declare fp128     @llvm.roundeven.f128(fp128 %Val)
14608       declare ppc_fp128 @llvm.roundeven.ppcf128(ppc_fp128  %Val)
14609
14610 Overview:
14611 """""""""
14612
14613 The '``llvm.roundeven.*``' intrinsics return the operand rounded to the nearest
14614 integer in floating-point format, rounding halfway cases to even (that is, to
14615 the nearest value that is an even integer).
14616
14617 Arguments:
14618 """"""""""
14619
14620 The argument and return value are floating-point numbers of the same type.
14621
14622 Semantics:
14623 """"""""""
14624
14625 This function implements IEEE-754 operation ``roundToIntegralTiesToEven``. It
14626 also behaves in the same way as C standard function ``roundeven``, except that
14627 it does not raise floating-point exceptions.
14628
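Example:
""""""""

A minimal sketch with illustrative constants; the function name
``@roundeven_example`` is hypothetical. Halfway cases go to the even
neighbor rather than away from zero:

.. code-block:: llvm

      declare float @llvm.roundeven.f32(float)

      ; Hypothetical wrapper, used only to make the snippet a complete module.
      define float @roundeven_example() {
        %a = call float @llvm.roundeven.f32(float 2.5)   ; %a = 2.0
        %b = call float @llvm.roundeven.f32(float 3.5)   ; %b = 4.0
        %c = call float @llvm.roundeven.f32(float -2.5)  ; %c = -2.0
        ret float %c
      }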
14629
14630 '``llvm.lround.*``' Intrinsic
14631 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14632
14633 Syntax:
14634 """""""
14635
14636 This is an overloaded intrinsic. You can use ``llvm.lround`` on any
14637 floating-point type. Not all targets support all types however.
14638
14639 ::
14640
14641       declare i32 @llvm.lround.i32.f32(float %Val)
14642       declare i32 @llvm.lround.i32.f64(double %Val)
14643       declare i32 @llvm.lround.i32.f80(x86_fp80 %Val)
14644       declare i32 @llvm.lround.i32.f128(fp128 %Val)
14645       declare i32 @llvm.lround.i32.ppcf128(ppc_fp128 %Val)
14646
14647       declare i64 @llvm.lround.i64.f32(float %Val)
14648       declare i64 @llvm.lround.i64.f64(double %Val)
14649       declare i64 @llvm.lround.i64.f80(x86_fp80 %Val)
14650       declare i64 @llvm.lround.i64.f128(fp128 %Val)
14651       declare i64 @llvm.lround.i64.ppcf128(ppc_fp128 %Val)
14652
14653 Overview:
14654 """""""""
14655
14656 The '``llvm.lround.*``' intrinsics return the operand rounded to the nearest
14657 integer with ties away from zero.
14658
14659
14660 Arguments:
14661 """"""""""
14662
14663 The argument is a floating-point number and the return value is an integer
14664 type.
14665
14666 Semantics:
14667 """"""""""
14668
14669 This function returns the same values as the libm ``lround``
14670 functions would, but without setting errno.
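
Example:
""""""""

A minimal sketch with illustrative constants; the function name
``@lround_example`` is hypothetical. The result is an integer, and halfway
cases are rounded away from zero:

.. code-block:: llvm

      declare i32 @llvm.lround.i32.f32(float)

      ; Hypothetical wrapper, used only to make the snippet a complete module.
      define i32 @lround_example() {
        %a = call i32 @llvm.lround.i32.f32(float 2.25)  ; %a = 2
        %b = call i32 @llvm.lround.i32.f32(float 2.5)   ; %b = 3
        %c = call i32 @llvm.lround.i32.f32(float -2.5)  ; %c = -3
        ret i32 %c
      }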
14671
14672 '``llvm.llround.*``' Intrinsic
14673 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14674
14675 Syntax:
14676 """""""
14677
14678 This is an overloaded intrinsic. You can use ``llvm.llround`` on any
14679 floating-point type. Not all targets support all types however.
14680
14681 ::
14682
14683       declare i64 @llvm.llround.i64.f32(float %Val)
14684       declare i64 @llvm.llround.i64.f64(double %Val)
14685       declare i64 @llvm.llround.i64.f80(x86_fp80 %Val)
14686       declare i64 @llvm.llround.i64.f128(fp128 %Val)
14687       declare i64 @llvm.llround.i64.ppcf128(ppc_fp128 %Val)
14688
14689 Overview:
14690 """""""""
14691
14692 The '``llvm.llround.*``' intrinsics return the operand rounded to the nearest
14693 integer with ties away from zero.
14694
14695 Arguments:
14696 """"""""""
14697
14698 The argument is a floating-point number and the return value is an integer
14699 type.
14700
14701 Semantics:
14702 """"""""""
14703
14704 This function returns the same values as the libm ``llround``
14705 functions would, but without setting errno.
14706
14707 '``llvm.lrint.*``' Intrinsic
14708 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14709
14710 Syntax:
14711 """""""
14712
14713 This is an overloaded intrinsic. You can use ``llvm.lrint`` on any
14714 floating-point type. Not all targets support all types however.
14715
14716 ::
14717
14718       declare i32 @llvm.lrint.i32.f32(float %Val)
14719       declare i32 @llvm.lrint.i32.f64(double %Val)
14720       declare i32 @llvm.lrint.i32.f80(x86_fp80 %Val)
14721       declare i32 @llvm.lrint.i32.f128(fp128 %Val)
14722       declare i32 @llvm.lrint.i32.ppcf128(ppc_fp128 %Val)
14723
14724       declare i64 @llvm.lrint.i64.f32(float %Val)
14725       declare i64 @llvm.lrint.i64.f64(double %Val)
14726       declare i64 @llvm.lrint.i64.f80(x86_fp80 %Val)
14727       declare i64 @llvm.lrint.i64.f128(fp128 %Val)
14728       declare i64 @llvm.lrint.i64.ppcf128(ppc_fp128 %Val)
14729
14730 Overview:
14731 """""""""
14732
14733 The '``llvm.lrint.*``' intrinsics return the operand rounded to the nearest
14734 integer.
14735
14736
14737 Arguments:
14738 """"""""""
14739
14740 The argument is a floating-point number and the return value is an integer
14741 type.
14742
14743 Semantics:
14744 """"""""""
14745
14746 This function returns the same values as the libm ``lrint``
14747 functions would, but without setting errno.
14748
14749 '``llvm.llrint.*``' Intrinsic
14750 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14751
14752 Syntax:
14753 """""""
14754
14755 This is an overloaded intrinsic. You can use ``llvm.llrint`` on any
14756 floating-point type. Not all targets support all types however.
14757
14758 ::
14759
14760       declare i64 @llvm.llrint.i64.f32(float %Val)
14761       declare i64 @llvm.llrint.i64.f64(double %Val)
14762       declare i64 @llvm.llrint.i64.f80(x86_fp80 %Val)
14763       declare i64 @llvm.llrint.i64.f128(fp128 %Val)
14764       declare i64 @llvm.llrint.i64.ppcf128(ppc_fp128 %Val)
14765
14766 Overview:
14767 """""""""
14768
14769 The '``llvm.llrint.*``' intrinsics return the operand rounded to the nearest
14770 integer.
14771
14772 Arguments:
14773 """"""""""
14774
14775 The argument is a floating-point number and the return value is an integer
14776 type.
14777
14778 Semantics:
14779 """"""""""
14780
14781 This function returns the same values as the libm ``llrint``
14782 functions would, but without setting errno.
14783
14784 Bit Manipulation Intrinsics
14785 ---------------------------
14786
14787 LLVM provides intrinsics for a few important bit manipulation
14788 operations. These allow efficient code generation for some algorithms.
14789
14790 '``llvm.bitreverse.*``' Intrinsics
14791 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14792
14793 Syntax:
14794 """""""
14795
14796 This is an overloaded intrinsic function. You can use ``llvm.bitreverse`` on any
14797 integer type.
14798
14799 ::
14800
14801       declare i16 @llvm.bitreverse.i16(i16 <id>)
14802       declare i32 @llvm.bitreverse.i32(i32 <id>)
14803       declare i64 @llvm.bitreverse.i64(i64 <id>)
14804       declare <4 x i32> @llvm.bitreverse.v4i32(<4 x i32> <id>)
14805
14806 Overview:
14807 """""""""
14808
14809 The '``llvm.bitreverse``' family of intrinsics is used to reverse the
14810 bitpattern of an integer value or vector of integer values; for example
14811 ``0b10110110`` becomes ``0b01101101``.
14812
14813 Semantics:
14814 """"""""""
14815
14816 The ``llvm.bitreverse.iN`` intrinsic returns an iN value that has bit
14817 ``M`` in the input moved to bit ``N-M-1`` in the output. The vector
14818 intrinsics, such as ``llvm.bitreverse.v4i32``, operate on a per-element
14819 basis and the element order is not affected.
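
Example:
""""""""

A minimal sketch with illustrative constants; the function name
``@bitreverse_example`` is hypothetical. Bits are numbered from 0 at the
least significant position, so bit 0 of an ``i8`` ends up in bit 7:

.. code-block:: llvm

      declare i8 @llvm.bitreverse.i8(i8)

      ; Hypothetical wrapper, used only to make the snippet a complete module.
      define i8 @bitreverse_example() {
        %a = call i8 @llvm.bitreverse.i8(i8 182)  ; 0b10110110 -> %a = 109 (0b01101101)
        %b = call i8 @llvm.bitreverse.i8(i8 1)    ; 0b00000001 -> %b = 128 (0b10000000)
        ret i8 %b
      }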
14820
14821 '``llvm.bswap.*``' Intrinsics
14822 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14823
14824 Syntax:
14825 """""""
14826
14827 This is an overloaded intrinsic function. You can use ``llvm.bswap`` on any
14828 integer type that is an even number of bytes (i.e. BitWidth % 16 == 0).
14829
14830 ::
14831
14832       declare i16 @llvm.bswap.i16(i16 <id>)
14833       declare i32 @llvm.bswap.i32(i32 <id>)
14834       declare i64 @llvm.bswap.i64(i64 <id>)
14835       declare <4 x i32> @llvm.bswap.v4i32(<4 x i32> <id>)
14836
14837 Overview:
14838 """""""""
14839
14840 The '``llvm.bswap``' family of intrinsics is used to byte swap an integer
14841 value or vector of integer values with an even number of bytes (positive
14842 multiple of 16 bits).
14843
14844 Semantics:
14845 """"""""""
14846
14847 The ``llvm.bswap.i16`` intrinsic returns an i16 value that has the high
14848 and low byte of the input i16 swapped. Similarly, the ``llvm.bswap.i32``
14849 intrinsic returns an i32 value that has the four bytes of the input i32
14850 swapped, so that if the input bytes are numbered 0, 1, 2, 3 then the
14851 returned i32 will have its bytes in 3, 2, 1, 0 order. The
14852 ``llvm.bswap.i48``, ``llvm.bswap.i64`` and other intrinsics extend this
14853 concept to additional even-byte lengths (6 bytes, 8 bytes and more,
14854 respectively). The vector intrinsics, such as ``llvm.bswap.v4i32``,
14855 operate on a per-element basis and the element order is not affected.
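
Example:
""""""""

A minimal sketch with illustrative constants; the function name
``@bswap_example`` is hypothetical. The operands are written in decimal, with
the byte pattern given in hexadecimal in the comments:

.. code-block:: llvm

      declare i16 @llvm.bswap.i16(i16)
      declare i32 @llvm.bswap.i32(i32)

      ; Hypothetical wrapper, used only to make the snippet a complete module.
      define i32 @bswap_example() {
        %a = call i16 @llvm.bswap.i16(i16 258)       ; 0x0102     -> %a = 513      (0x0201)
        %b = call i32 @llvm.bswap.i32(i32 16909060)  ; 0x01020304 -> %b = 67305985 (0x04030201)
        ret i32 %b
      }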
14856
14857 '``llvm.ctpop.*``' Intrinsic
14858 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14859
14860 Syntax:
14861 """""""
14862
14863 This is an overloaded intrinsic. You can use ``llvm.ctpop`` on any integer
14864 bit width, or on any vector with integer elements. Not all targets
14865 support all bit widths or vector types, however.
14866
14867 ::
14868
14869       declare i8 @llvm.ctpop.i8(i8  <src>)
14870       declare i16 @llvm.ctpop.i16(i16 <src>)
14871       declare i32 @llvm.ctpop.i32(i32 <src>)
14872       declare i64 @llvm.ctpop.i64(i64 <src>)
14873       declare i256 @llvm.ctpop.i256(i256 <src>)
14874       declare <2 x i32> @llvm.ctpop.v2i32(<2 x i32> <src>)
14875
14876 Overview:
14877 """""""""
14878
14879 The '``llvm.ctpop``' family of intrinsics counts the number of bits set
14880 in a value.
14881
14882 Arguments:
14883 """"""""""
14884
14885 The only argument is the value to be counted. The argument may be of any
14886 integer type, or a vector with integer elements. The return type must
14887 match the argument type.
14888
14889 Semantics:
14890 """"""""""
14891
14892 The '``llvm.ctpop``' intrinsic counts the 1's in a variable, or within
14893 each element of a vector.
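
Example:
""""""""

A minimal sketch with illustrative constants; the function name
``@ctpop_example`` is hypothetical. The vector form counts the set bits of
each element independently:

.. code-block:: llvm

      declare i8 @llvm.ctpop.i8(i8)
      declare <2 x i32> @llvm.ctpop.v2i32(<2 x i32>)

      ; Hypothetical wrapper, used only to make the snippet a complete module.
      define <2 x i32> @ctpop_example() {
        %a = call i8 @llvm.ctpop.i8(i8 182)                                    ; 0b10110110 -> %a = 5
        %b = call <2 x i32> @llvm.ctpop.v2i32(<2 x i32> <i32 1, i32 7>)        ; %b = <1, 3>
        ret <2 x i32> %b
      }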
14894
14895 '``llvm.ctlz.*``' Intrinsic
14896 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
14897
14898 Syntax:
14899 """""""
14900
14901 This is an overloaded intrinsic. You can use ``llvm.ctlz`` on any
14902 integer bit width, or any vector whose elements are integers. Not all
14903 targets support all bit widths or vector types, however.
14904
14905 ::
14906
14907       declare i8   @llvm.ctlz.i8  (i8   <src>, i1 <is_zero_undef>)
14908       declare i16  @llvm.ctlz.i16 (i16  <src>, i1 <is_zero_undef>)
14909       declare i32  @llvm.ctlz.i32 (i32  <src>, i1 <is_zero_undef>)
14910       declare i64  @llvm.ctlz.i64 (i64  <src>, i1 <is_zero_undef>)
14911       declare i256 @llvm.ctlz.i256(i256 <src>, i1 <is_zero_undef>)
14912       declare <2 x i32> @llvm.ctlz.v2i32(<2 x i32> <src>, i1 <is_zero_undef>)
14913
14914 Overview:
14915 """""""""
14916
14917 The '``llvm.ctlz``' family of intrinsic functions counts the number of
14918 leading zeros in a variable.
14919
14920 Arguments:
14921 """"""""""
14922
14923 The first argument is the value to be counted. This argument may be of
14924 any integer type, or a vector with integer element type. The return
14925 type must match the first argument type.
14926
14927 The second argument must be a constant and is a flag to indicate whether
14928 the intrinsic should ensure that a zero as the first argument produces a
14929 defined result. Historically some architectures did not provide a
14930 defined result for zero values as efficiently, and many algorithms are
14931 now predicated on avoiding zero-value inputs.
14932
14933 Semantics:
14934 """"""""""
14935
14936 The '``llvm.ctlz``' intrinsic counts the leading (most significant)
14937 zeros in a variable, or within each element of the vector. If
14938 ``src == 0`` then the result is the size in bits of the type of ``src``
14939 if ``is_zero_undef == 0`` and ``undef`` otherwise. For example,
14940 ``llvm.ctlz(i32 2) = 30``.
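
Example:
""""""""

A minimal sketch of the effect of the ``is_zero_undef`` flag, with
illustrative constants; the function name ``@ctlz_example`` is hypothetical:

.. code-block:: llvm

      declare i32 @llvm.ctlz.i32(i32, i1)

      ; Hypothetical wrapper, used only to make the snippet a complete module.
      define i32 @ctlz_example() {
        %a = call i32 @llvm.ctlz.i32(i32 2, i1 false)  ; %a = 30
        %b = call i32 @llvm.ctlz.i32(i32 0, i1 false)  ; %b = 32 (bit width of i32)
        %c = call i32 @llvm.ctlz.i32(i32 1, i1 true)   ; %c = 31 (input is non-zero)
        %d = call i32 @llvm.ctlz.i32(i32 0, i1 true)   ; %d is undef
        ret i32 %b
      }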
14941
14942 '``llvm.cttz.*``' Intrinsic
14943 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
14944
14945 Syntax:
14946 """""""
14947
14948 This is an overloaded intrinsic. You can use ``llvm.cttz`` on any
14949 integer bit width, or any vector of integer elements. Not all targets
14950 support all bit widths or vector types, however.
14951
14952 ::
14953
14954       declare i8   @llvm.cttz.i8  (i8   <src>, i1 <is_zero_undef>)
14955       declare i16  @llvm.cttz.i16 (i16  <src>, i1 <is_zero_undef>)
14956       declare i32  @llvm.cttz.i32 (i32  <src>, i1 <is_zero_undef>)
14957       declare i64  @llvm.cttz.i64 (i64  <src>, i1 <is_zero_undef>)
14958       declare i256 @llvm.cttz.i256(i256 <src>, i1 <is_zero_undef>)
14959       declare <2 x i32> @llvm.cttz.v2i32(<2 x i32> <src>, i1 <is_zero_undef>)
14960
14961 Overview:
14962 """""""""
14963
14964 The '``llvm.cttz``' family of intrinsic functions counts the number of
14965 trailing zeros.
14966
14967 Arguments:
14968 """"""""""
14969
14970 The first argument is the value to be counted. This argument may be of
14971 any integer type, or a vector with integer element type. The return
14972 type must match the first argument type.
14973
14974 The second argument must be a constant and is a flag to indicate whether
14975 the intrinsic should ensure that a zero as the first argument produces a
14976 defined result. Historically some architectures did not provide a
14977 defined result for zero values as efficiently, and many algorithms are
14978 now predicated on avoiding zero-value inputs.
14979
14980 Semantics:
14981 """"""""""
14982
14983 The '``llvm.cttz``' intrinsic counts the trailing (least significant)
14984 zeros in a variable, or within each element of a vector. If ``src == 0``
14985 then the result is the size in bits of the type of ``src`` if
14986 ``is_zero_undef == 0`` and ``undef`` otherwise. For example,
14987 ``llvm.cttz(2) = 1``.
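
Example:
""""""""

A minimal sketch of the effect of the ``is_zero_undef`` flag, with
illustrative constants; the function name ``@cttz_example`` is hypothetical:

.. code-block:: llvm

      declare i32 @llvm.cttz.i32(i32, i1)

      ; Hypothetical wrapper, used only to make the snippet a complete module.
      define i32 @cttz_example() {
        %a = call i32 @llvm.cttz.i32(i32 2, i1 false)  ; %a = 1
        %b = call i32 @llvm.cttz.i32(i32 0, i1 false)  ; %b = 32 (bit width of i32)
        %c = call i32 @llvm.cttz.i32(i32 8, i1 true)   ; %c = 3 (input is non-zero)
        %d = call i32 @llvm.cttz.i32(i32 0, i1 true)   ; %d is undef
        ret i32 %b
      }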
14988
14989 .. _int_overflow:
14990
14991 '``llvm.fshl.*``' Intrinsic
14992 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
14993
14994 Syntax:
14995 """""""
14996
14997 This is an overloaded intrinsic. You can use ``llvm.fshl`` on any
14998 integer bit width or any vector of integer elements. Not all targets
14999 support all bit widths or vector types, however.
15000
15001 ::
15002
15003       declare i8  @llvm.fshl.i8 (i8 %a, i8 %b, i8 %c)
15004       declare i67 @llvm.fshl.i67(i67 %a, i67 %b, i67 %c)
15005       declare <2 x i32> @llvm.fshl.v2i32(<2 x i32> %a, <2 x i32> %b, <2 x i32> %c)
15006
15007 Overview:
15008 """""""""
15009
15010 The '``llvm.fshl``' family of intrinsic functions performs a funnel shift left:
15011 the first two values are concatenated as { %a : %b } (%a is the most significant
15012 bits of the wide value), the combined value is shifted left, and the most
15013 significant bits are extracted to produce a result that is the same size as the
15014 original arguments. If the first 2 arguments are identical, this is equivalent
15015 to a rotate left operation. For vector types, the operation occurs for each
15016 element of the vector. The shift argument is treated as an unsigned amount
15017 modulo the element size of the arguments.
15018
15019 Arguments:
15020 """"""""""
15021
15022 The first two arguments are the values to be concatenated. The third
15023 argument is the shift amount. The arguments may be any integer type or a
15024 vector with integer element type. All arguments and the return value must
15025 have the same type.
15026
15027 Example:
15028 """"""""
15029
15030 .. code-block:: text
15031
15032       %r = call i8 @llvm.fshl.i8(i8 %x, i8 %y, i8 %z)  ; %r = i8: msb_extract((concat(x, y) << (z % 8)), 8)
15033       %r = call i8 @llvm.fshl.i8(i8 255, i8 0, i8 15)  ; %r = i8: 128 (0b10000000)
15034       %r = call i8 @llvm.fshl.i8(i8 15, i8 15, i8 11)  ; %r = i8: 120 (0b01111000)
15035       %r = call i8 @llvm.fshl.i8(i8 0, i8 255, i8 8)   ; %r = i8: 0   (0b00000000)
15036
15037 '``llvm.fshr.*``' Intrinsic
15038 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
15039
15040 Syntax:
15041 """""""
15042
15043 This is an overloaded intrinsic. You can use ``llvm.fshr`` on any
15044 integer bit width or any vector of integer elements. Not all targets
15045 support all bit widths or vector types, however.
15046
15047 ::
15048
15049       declare i8  @llvm.fshr.i8 (i8 %a, i8 %b, i8 %c)
15050       declare i67 @llvm.fshr.i67(i67 %a, i67 %b, i67 %c)
15051       declare <2 x i32> @llvm.fshr.v2i32(<2 x i32> %a, <2 x i32> %b, <2 x i32> %c)
15052
15053 Overview:
15054 """""""""
15055
15056 The '``llvm.fshr``' family of intrinsic functions performs a funnel shift right:
15057 the first two values are concatenated as { %a : %b } (%a is the most significant
15058 bits of the wide value), the combined value is shifted right, and the least
15059 significant bits are extracted to produce a result that is the same size as the
15060 original arguments. If the first 2 arguments are identical, this is equivalent
15061 to a rotate right operation. For vector types, the operation occurs for each
15062 element of the vector. The shift argument is treated as an unsigned amount
15063 modulo the element size of the arguments.
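
The rotate equivalence mentioned above can be sketched as follows. The helper
name ``@rotr32`` is hypothetical; passing the same value as both of the first
two arguments is what turns the funnel shift into a rotate.

.. code-block:: llvm

      declare i32 @llvm.fshr.i32(i32, i32, i32)

      ; Hypothetical helper: rotate %x right by %n bits (taken modulo 32).
      define i32 @rotr32(i32 %x, i32 %n) {
        %r = call i32 @llvm.fshr.i32(i32 %x, i32 %x, i32 %n)
        ret i32 %r
      }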
15064
15065 Arguments:
15066 """"""""""
15067
15068 The first two arguments are the values to be concatenated. The third
15069 argument is the shift amount. The arguments may be any integer type or a
15070 vector with integer element type. All arguments and the return value must
15071 have the same type.
15072
15073 Example:
15074 """"""""
15075
15076 .. code-block:: text
15077
15078       %r = call i8 @llvm.fshr.i8(i8 %x, i8 %y, i8 %z)  ; %r = i8: lsb_extract((concat(x, y) >> (z % 8)), 8)
15079       %r = call i8 @llvm.fshr.i8(i8 255, i8 0, i8 15)  ; %r = i8: 254 (0b11111110)
15080       %r = call i8 @llvm.fshr.i8(i8 15, i8 15, i8 11)  ; %r = i8: 225 (0b11100001)
15081       %r = call i8 @llvm.fshr.i8(i8 0, i8 255, i8 8)   ; %r = i8: 255 (0b11111111)
15082
15083 Arithmetic with Overflow Intrinsics
15084 -----------------------------------
15085
15086 LLVM provides intrinsics for fast arithmetic overflow checking.
15087
15088 Each of these intrinsics returns a two-element struct. The first
15089 element of this struct contains the result of the corresponding
15090 arithmetic operation modulo 2\ :sup:`n`\ , where n is the bit width of
15091 the result. Therefore, for example, the first element of the struct
15092 returned by ``llvm.sadd.with.overflow.i32`` is always the same as the
15093 result of a 32-bit ``add`` instruction with the same operands, where
15094 the ``add`` is *not* modified by an ``nsw`` or ``nuw`` flag.
15095
15096 The second element of the result is an ``i1`` that is 1 if the
15097 arithmetic operation overflowed and 0 otherwise. An operation
15098 overflows if, for any values of its operands ``A`` and ``B`` and for
15099 any ``N`` larger than the operands' width, ``ext(A op B) to iN`` is
15100 not equal to ``(ext(A) to iN) op (ext(B) to iN)`` where ``ext`` is
15101 ``sext`` for signed overflow and ``zext`` for unsigned overflow, and
15102 ``op`` is the underlying arithmetic operation.
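
As an illustrative sketch of this definition, the following function computes
the overflow bit of a signed 32-bit addition by comparing the two sides at
``N = 33``, which is wide enough for addition. The function name
``@sadd_overflows_i32`` and the value names are hypothetical.

.. code-block:: llvm

      ; Hypothetical check mirroring the definition with ext = sext, op = add.
      define i1 @sadd_overflows_i32(i32 %a, i32 %b) {
        %narrow = add i32 %a, %b
        %lhs = sext i32 %narrow to i33       ; ext(A op B) to iN
        %a.ext = sext i32 %a to i33
        %b.ext = sext i32 %b to i33
        %rhs = add i33 %a.ext, %b.ext        ; (ext(A) to iN) op (ext(B) to iN)
        %ov = icmp ne i33 %lhs, %rhs         ; overflow iff the two sides differ
        ret i1 %ov
      }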
15103
15104 The behavior of these intrinsics is well-defined for all argument
15105 values.
15106
15107 '``llvm.sadd.with.overflow.*``' Intrinsics
15108 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15109
15110 Syntax:
15111 """""""
15112
15113 This is an overloaded intrinsic. You can use ``llvm.sadd.with.overflow``
15114 on any integer bit width or vectors of integers.
15115
15116 ::
15117
15118       declare {i16, i1} @llvm.sadd.with.overflow.i16(i16 %a, i16 %b)
15119       declare {i32, i1} @llvm.sadd.with.overflow.i32(i32 %a, i32 %b)
15120       declare {i64, i1} @llvm.sadd.with.overflow.i64(i64 %a, i64 %b)
15121       declare {<4 x i32>, <4 x i1>} @llvm.sadd.with.overflow.v4i32(<4 x i32> %a, <4 x i32> %b)
15122
15123 Overview:
15124 """""""""
15125
15126 The '``llvm.sadd.with.overflow``' family of intrinsic functions perform
15127 a signed addition of the two arguments, and indicate whether an overflow
15128 occurred during the signed summation.
15129
15130 Arguments:
15131 """"""""""
15132
15133 The arguments (%a and %b) and the first element of the result structure
15134 may be of integer types of any bit width, but they must have the same
15135 bit width. The second element of the result structure must be of type
15136 ``i1``. ``%a`` and ``%b`` are the two values that will undergo signed
15137 addition.
15138
15139 Semantics:
15140 """"""""""
15141
15142 The '``llvm.sadd.with.overflow``' family of intrinsic functions perform
15143 a signed addition of the two variables. They return a structure --- the
15144 first element of which is the signed summation, and the second element
15145 of which is a bit specifying if the signed summation resulted in an
15146 overflow.
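
A complete function using the intrinsic might look like the following sketch.
The function name ``@checked_add`` and the choice to call ``llvm.trap`` on
overflow are assumptions made for illustration only.

.. code-block:: llvm

      declare {i32, i1} @llvm.sadd.with.overflow.i32(i32, i32)
      declare void @llvm.trap()

      ; Hypothetical helper: return %a + %b, trapping on signed overflow.
      define i32 @checked_add(i32 %a, i32 %b) {
      entry:
        %res = call {i32, i1} @llvm.sadd.with.overflow.i32(i32 %a, i32 %b)
        %sum = extractvalue {i32, i1} %res, 0
        %obit = extractvalue {i32, i1} %res, 1
        br i1 %obit, label %overflow, label %normal

      overflow:
        call void @llvm.trap()
        unreachable

      normal:
        ret i32 %sum
      }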
15147
15148 Examples:
15149 """""""""
15150
15151 .. code-block:: llvm
15152
15153       %res = call {i32, i1} @llvm.sadd.with.overflow.i32(i32 %a, i32 %b)
15154       %sum = extractvalue {i32, i1} %res, 0
15155       %obit = extractvalue {i32, i1} %res, 1
15156       br i1 %obit, label %overflow, label %normal
15157
15158 '``llvm.uadd.with.overflow.*``' Intrinsics
15159 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15160
15161 Syntax:
15162 """""""
15163
15164 This is an overloaded intrinsic. You can use ``llvm.uadd.with.overflow``
15165 on any integer bit width or vectors of integers.
15166
15167 ::
15168
15169       declare {i16, i1} @llvm.uadd.with.overflow.i16(i16 %a, i16 %b)
15170       declare {i32, i1} @llvm.uadd.with.overflow.i32(i32 %a, i32 %b)
15171       declare {i64, i1} @llvm.uadd.with.overflow.i64(i64 %a, i64 %b)
15172       declare {<4 x i32>, <4 x i1>} @llvm.uadd.with.overflow.v4i32(<4 x i32> %a, <4 x i32> %b)
15173
15174 Overview:
15175 """""""""
15176
15177 The '``llvm.uadd.with.overflow``' family of intrinsic functions perform
15178 an unsigned addition of the two arguments, and indicate whether a carry
15179 occurred during the unsigned summation.
15180
15181 Arguments:
15182 """"""""""
15183
15184 The arguments (%a and %b) and the first element of the result structure
15185 may be of integer types of any bit width, but they must have the same
15186 bit width. The second element of the result structure must be of type
15187 ``i1``. ``%a`` and ``%b`` are the two values that will undergo unsigned
15188 addition.
15189
15190 Semantics:
15191 """"""""""
15192
15193 The '``llvm.uadd.with.overflow``' family of intrinsic functions perform
15194 an unsigned addition of the two arguments. They return a structure --- the
15195 first element of which is the sum, and the second element of which is a
15196 bit specifying if the unsigned summation resulted in a carry.
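
One typical use of the carry bit is multi-word addition. The sketch below adds
two 128-bit values that are each split into two ``i64`` halves; the function
name ``@add128`` and the struct layout (low half first) are hypothetical.

.. code-block:: llvm

      declare {i64, i1} @llvm.uadd.with.overflow.i64(i64, i64)

      ; Hypothetical helper: returns { low half, high half } of the 128-bit sum.
      define {i64, i64} @add128(i64 %a.lo, i64 %a.hi, i64 %b.lo, i64 %b.hi) {
        %lo.res = call {i64, i1} @llvm.uadd.with.overflow.i64(i64 %a.lo, i64 %b.lo)
        %lo = extractvalue {i64, i1} %lo.res, 0
        %carry = extractvalue {i64, i1} %lo.res, 1
        %carry.ext = zext i1 %carry to i64     ; propagate the carry into the high half
        %hi.tmp = add i64 %a.hi, %b.hi
        %hi = add i64 %hi.tmp, %carry.ext
        %r0 = insertvalue {i64, i64} undef, i64 %lo, 0
        %r1 = insertvalue {i64, i64} %r0, i64 %hi, 1
        ret {i64, i64} %r1
      }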
15197
15198 Examples:
15199 """""""""
15200
15201 .. code-block:: llvm
15202
15203       %res = call {i32, i1} @llvm.uadd.with.overflow.i32(i32 %a, i32 %b)
15204       %sum = extractvalue {i32, i1} %res, 0
15205       %obit = extractvalue {i32, i1} %res, 1
15206       br i1 %obit, label %carry, label %normal
15207
15208 '``llvm.ssub.with.overflow.*``' Intrinsics
15209 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15210
15211 Syntax:
15212 """""""
15213
15214 This is an overloaded intrinsic. You can use ``llvm.ssub.with.overflow``
15215 on any integer bit width or vectors of integers.
15216
15217 ::
15218
15219       declare {i16, i1} @llvm.ssub.with.overflow.i16(i16 %a, i16 %b)
15220       declare {i32, i1} @llvm.ssub.with.overflow.i32(i32 %a, i32 %b)
15221       declare {i64, i1} @llvm.ssub.with.overflow.i64(i64 %a, i64 %b)
15222       declare {<4 x i32>, <4 x i1>} @llvm.ssub.with.overflow.v4i32(<4 x i32> %a, <4 x i32> %b)
15223
15224 Overview:
15225 """""""""
15226
15227 The '``llvm.ssub.with.overflow``' family of intrinsic functions perform
15228 a signed subtraction of the two arguments, and indicate whether an
15229 overflow occurred during the signed subtraction.
15230
15231 Arguments:
15232 """"""""""
15233
15234 The arguments (%a and %b) and the first element of the result structure
15235 may be of integer types of any bit width, but they must have the same
15236 bit width. The second element of the result structure must be of type
15237 ``i1``. ``%a`` and ``%b`` are the two values that will undergo signed
15238 subtraction.
15239
15240 Semantics:
15241 """"""""""
15242
15243 The '``llvm.ssub.with.overflow``' family of intrinsic functions perform
15244 a signed subtraction of the two arguments. They return a structure --- the
first element of which is the difference, and the second element of
15246 which is a bit specifying if the signed subtraction resulted in an
15247 overflow.
15248
15249 Examples:
15250 """""""""
15251
15252 .. code-block:: llvm
15253
15254       %res = call {i32, i1} @llvm.ssub.with.overflow.i32(i32 %a, i32 %b)
      %diff = extractvalue {i32, i1} %res, 0
15256       %obit = extractvalue {i32, i1} %res, 1
15257       br i1 %obit, label %overflow, label %normal
15258
15259 '``llvm.usub.with.overflow.*``' Intrinsics
15260 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15261
15262 Syntax:
15263 """""""
15264
15265 This is an overloaded intrinsic. You can use ``llvm.usub.with.overflow``
15266 on any integer bit width or vectors of integers.
15267
15268 ::
15269
15270       declare {i16, i1} @llvm.usub.with.overflow.i16(i16 %a, i16 %b)
15271       declare {i32, i1} @llvm.usub.with.overflow.i32(i32 %a, i32 %b)
15272       declare {i64, i1} @llvm.usub.with.overflow.i64(i64 %a, i64 %b)
15273       declare {<4 x i32>, <4 x i1>} @llvm.usub.with.overflow.v4i32(<4 x i32> %a, <4 x i32> %b)
15274
15275 Overview:
15276 """""""""
15277
15278 The '``llvm.usub.with.overflow``' family of intrinsic functions perform
15279 an unsigned subtraction of the two arguments, and indicate whether an
15280 overflow occurred during the unsigned subtraction.
15281
15282 Arguments:
15283 """"""""""
15284
15285 The arguments (%a and %b) and the first element of the result structure
15286 may be of integer types of any bit width, but they must have the same
15287 bit width. The second element of the result structure must be of type
15288 ``i1``. ``%a`` and ``%b`` are the two values that will undergo unsigned
15289 subtraction.
15290
15291 Semantics:
15292 """"""""""
15293
15294 The '``llvm.usub.with.overflow``' family of intrinsic functions perform
15295 an unsigned subtraction of the two arguments. They return a structure ---
the first element of which is the difference, and the second element of
15297 which is a bit specifying if the unsigned subtraction resulted in an
15298 overflow.
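
For unsigned subtraction, the overflow bit is set exactly when the first
operand is smaller than the second. The sketch below shows an equivalent
computation without the intrinsic; the function name ``@usub_expanded`` is
hypothetical.

.. code-block:: llvm

      ; Hypothetical equivalent of @llvm.usub.with.overflow.i32(%a, %b).
      define {i32, i1} @usub_expanded(i32 %a, i32 %b) {
        %diff = sub i32 %a, %b               ; wrapping difference modulo 2^32
        %ov = icmp ult i32 %a, %b            ; a borrow occurs iff %a < %b (unsigned)
        %r0 = insertvalue {i32, i1} undef, i32 %diff, 0
        %r1 = insertvalue {i32, i1} %r0, i1 %ov, 1
        ret {i32, i1} %r1
      }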
15299
15300 Examples:
15301 """""""""
15302
15303 .. code-block:: llvm
15304
15305       %res = call {i32, i1} @llvm.usub.with.overflow.i32(i32 %a, i32 %b)
      %diff = extractvalue {i32, i1} %res, 0
15307       %obit = extractvalue {i32, i1} %res, 1
15308       br i1 %obit, label %overflow, label %normal
15309
15310 '``llvm.smul.with.overflow.*``' Intrinsics
15311 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15312
15313 Syntax:
15314 """""""
15315
15316 This is an overloaded intrinsic. You can use ``llvm.smul.with.overflow``
15317 on any integer bit width or vectors of integers.
15318
15319 ::
15320
15321       declare {i16, i1} @llvm.smul.with.overflow.i16(i16 %a, i16 %b)
15322       declare {i32, i1} @llvm.smul.with.overflow.i32(i32 %a, i32 %b)
15323       declare {i64, i1} @llvm.smul.with.overflow.i64(i64 %a, i64 %b)
15324       declare {<4 x i32>, <4 x i1>} @llvm.smul.with.overflow.v4i32(<4 x i32> %a, <4 x i32> %b)
15325
15326 Overview:
15327 """""""""
15328
15329 The '``llvm.smul.with.overflow``' family of intrinsic functions perform
15330 a signed multiplication of the two arguments, and indicate whether an
15331 overflow occurred during the signed multiplication.
15332
15333 Arguments:
15334 """"""""""
15335
15336 The arguments (%a and %b) and the first element of the result structure
15337 may be of integer types of any bit width, but they must have the same
15338 bit width. The second element of the result structure must be of type
15339 ``i1``. ``%a`` and ``%b`` are the two values that will undergo signed
15340 multiplication.
15341
15342 Semantics:
15343 """"""""""
15344
15345 The '``llvm.smul.with.overflow``' family of intrinsic functions perform
15346 a signed multiplication of the two arguments. They return a structure ---
the first element of which is the product, and the second element
15348 of which is a bit specifying if the signed multiplication resulted in an
15349 overflow.
15350
15351 Examples:
15352 """""""""
15353
15354 .. code-block:: llvm
15355
15356       %res = call {i32, i1} @llvm.smul.with.overflow.i32(i32 %a, i32 %b)
      %prod = extractvalue {i32, i1} %res, 0
15358       %obit = extractvalue {i32, i1} %res, 1
15359       br i1 %obit, label %overflow, label %normal
15360
15361 '``llvm.umul.with.overflow.*``' Intrinsics
15362 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15363
15364 Syntax:
15365 """""""
15366
15367 This is an overloaded intrinsic. You can use ``llvm.umul.with.overflow``
15368 on any integer bit width or vectors of integers.
15369
15370 ::
15371
15372       declare {i16, i1} @llvm.umul.with.overflow.i16(i16 %a, i16 %b)
15373       declare {i32, i1} @llvm.umul.with.overflow.i32(i32 %a, i32 %b)
15374       declare {i64, i1} @llvm.umul.with.overflow.i64(i64 %a, i64 %b)
15375       declare {<4 x i32>, <4 x i1>} @llvm.umul.with.overflow.v4i32(<4 x i32> %a, <4 x i32> %b)
15376
15377 Overview:
15378 """""""""
15379
15380 The '``llvm.umul.with.overflow``' family of intrinsic functions perform
an unsigned multiplication of the two arguments, and indicate whether an
15382 overflow occurred during the unsigned multiplication.
15383
15384 Arguments:
15385 """"""""""
15386
15387 The arguments (%a and %b) and the first element of the result structure
15388 may be of integer types of any bit width, but they must have the same
15389 bit width. The second element of the result structure must be of type
15390 ``i1``. ``%a`` and ``%b`` are the two values that will undergo unsigned
15391 multiplication.
15392
15393 Semantics:
15394 """"""""""
15395
15396 The '``llvm.umul.with.overflow``' family of intrinsic functions perform
15397 an unsigned multiplication of the two arguments. They return a structure ---
the first element of which is the product, and the second
15399 element of which is a bit specifying if the unsigned multiplication
15400 resulted in an overflow.
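
One way to picture these semantics is to perform the multiplication at twice
the width and test the upper half. The following sketch does this for ``i32``;
the function name ``@umul_expanded`` is hypothetical, and real targets may
lower the intrinsic differently.

.. code-block:: llvm

      ; Hypothetical equivalent of @llvm.umul.with.overflow.i32(%a, %b).
      define {i32, i1} @umul_expanded(i32 %a, i32 %b) {
        %a.wide = zext i32 %a to i64
        %b.wide = zext i32 %b to i64
        %wide = mul nuw i64 %a.wide, %b.wide ; cannot wrap: both factors are < 2^32
        %prod = trunc i64 %wide to i32       ; product modulo 2^32
        %hi = lshr i64 %wide, 32
        %ov = icmp ne i64 %hi, 0             ; overflow iff any upper bit is set
        %r0 = insertvalue {i32, i1} undef, i32 %prod, 0
        %r1 = insertvalue {i32, i1} %r0, i1 %ov, 1
        ret {i32, i1} %r1
      }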
15401
15402 Examples:
15403 """""""""
15404
15405 .. code-block:: llvm
15406
15407       %res = call {i32, i1} @llvm.umul.with.overflow.i32(i32 %a, i32 %b)
      %prod = extractvalue {i32, i1} %res, 0
15409       %obit = extractvalue {i32, i1} %res, 1
15410       br i1 %obit, label %overflow, label %normal
15411
15412 Saturation Arithmetic Intrinsics
15413 ---------------------------------
15414
15415 Saturation arithmetic is a version of arithmetic in which operations are
15416 limited to a fixed range between a minimum and maximum value. If the result of
15417 an operation is greater than the maximum value, the result is set (or
15418 "clamped") to this maximum. If it is below the minimum, it is clamped to this
15419 minimum.
15420
15421
15422 '``llvm.sadd.sat.*``' Intrinsics
15423 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15424
15425 Syntax
15426 """""""
15427
15428 This is an overloaded intrinsic. You can use ``llvm.sadd.sat``
15429 on any integer bit width or vectors of integers.
15430
15431 ::
15432
15433       declare i16 @llvm.sadd.sat.i16(i16 %a, i16 %b)
15434       declare i32 @llvm.sadd.sat.i32(i32 %a, i32 %b)
15435       declare i64 @llvm.sadd.sat.i64(i64 %a, i64 %b)
15436       declare <4 x i32> @llvm.sadd.sat.v4i32(<4 x i32> %a, <4 x i32> %b)
15437
15438 Overview
15439 """""""""
15440
15441 The '``llvm.sadd.sat``' family of intrinsic functions perform signed
15442 saturating addition on the 2 arguments.
15443
15444 Arguments
15445 """"""""""
15446
15447 The arguments (%a and %b) and the result may be of integer types of any bit
15448 width, but they must have the same bit width. ``%a`` and ``%b`` are the two
15449 values that will undergo signed addition.
15450
15451 Semantics:
15452 """"""""""
15453
15454 The maximum value this operation can clamp to is the largest signed value
15455 representable by the bit width of the arguments. The minimum value is the
15456 smallest signed value representable by this bit width.
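
The clamping can be sketched in terms of ``llvm.sadd.with.overflow``. When a
signed addition overflows, both operands have the same sign, so the sign of
``%a`` selects the limit. The function name ``@sadd_sat_expanded`` is
hypothetical, and this is an illustration rather than a prescribed lowering.

.. code-block:: llvm

      declare {i32, i1} @llvm.sadd.with.overflow.i32(i32, i32)

      ; Hypothetical equivalent of @llvm.sadd.sat.i32(%a, %b).
      define i32 @sadd_sat_expanded(i32 %a, i32 %b) {
        %res = call {i32, i1} @llvm.sadd.with.overflow.i32(i32 %a, i32 %b)
        %sum = extractvalue {i32, i1} %res, 0
        %ov = extractvalue {i32, i1} %res, 1
        %neg = icmp slt i32 %a, 0
        ; Clamp towards the minimum for negative operands, the maximum otherwise.
        %limit = select i1 %neg, i32 -2147483648, i32 2147483647
        %sat = select i1 %ov, i32 %limit, i32 %sum
        ret i32 %sat
      }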
15457
15458
15459 Examples
15460 """""""""
15461
15462 .. code-block:: llvm
15463
15464       %res = call i4 @llvm.sadd.sat.i4(i4 1, i4 2)  ; %res = 3
15465       %res = call i4 @llvm.sadd.sat.i4(i4 5, i4 6)  ; %res = 7
15466       %res = call i4 @llvm.sadd.sat.i4(i4 -4, i4 2)  ; %res = -2
15467       %res = call i4 @llvm.sadd.sat.i4(i4 -4, i4 -5)  ; %res = -8
15468
15469
15470 '``llvm.uadd.sat.*``' Intrinsics
15471 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15472
15473 Syntax
15474 """""""
15475
15476 This is an overloaded intrinsic. You can use ``llvm.uadd.sat``
15477 on any integer bit width or vectors of integers.
15478
15479 ::
15480
15481       declare i16 @llvm.uadd.sat.i16(i16 %a, i16 %b)
15482       declare i32 @llvm.uadd.sat.i32(i32 %a, i32 %b)
15483       declare i64 @llvm.uadd.sat.i64(i64 %a, i64 %b)
15484       declare <4 x i32> @llvm.uadd.sat.v4i32(<4 x i32> %a, <4 x i32> %b)
15485
15486 Overview
15487 """""""""
15488
15489 The '``llvm.uadd.sat``' family of intrinsic functions perform unsigned
15490 saturating addition on the 2 arguments.
15491
15492 Arguments
15493 """"""""""
15494
15495 The arguments (%a and %b) and the result may be of integer types of any bit
15496 width, but they must have the same bit width. ``%a`` and ``%b`` are the two
15497 values that will undergo unsigned addition.
15498
15499 Semantics:
15500 """"""""""
15501
15502 The maximum value this operation can clamp to is the largest unsigned value
15503 representable by the bit width of the arguments. Because this is an unsigned
15504 operation, the result will never saturate towards zero.
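
A minimal sketch of an equivalent computation follows. It relies on the fact
that a wrapped unsigned sum is always smaller than either operand. The function
name ``@uadd_sat_expanded`` is hypothetical.

.. code-block:: llvm

      ; Hypothetical equivalent of @llvm.uadd.sat.i32(%a, %b).
      define i32 @uadd_sat_expanded(i32 %a, i32 %b) {
        %sum = add i32 %a, %b
        %ov = icmp ult i32 %sum, %a          ; the sum wrapped iff it is below %a
        %sat = select i1 %ov, i32 -1, i32 %sum ; -1 is the all-ones (largest) value
        ret i32 %sat
      }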
15505
15506
15507 Examples
15508 """""""""
15509
15510 .. code-block:: llvm
15511
15512       %res = call i4 @llvm.uadd.sat.i4(i4 1, i4 2)  ; %res = 3
15513       %res = call i4 @llvm.uadd.sat.i4(i4 5, i4 6)  ; %res = 11
15514       %res = call i4 @llvm.uadd.sat.i4(i4 8, i4 8)  ; %res = 15
15515
15516
15517 '``llvm.ssub.sat.*``' Intrinsics
15518 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15519
15520 Syntax
15521 """""""
15522
15523 This is an overloaded intrinsic. You can use ``llvm.ssub.sat``
15524 on any integer bit width or vectors of integers.
15525
15526 ::
15527
15528       declare i16 @llvm.ssub.sat.i16(i16 %a, i16 %b)
15529       declare i32 @llvm.ssub.sat.i32(i32 %a, i32 %b)
15530       declare i64 @llvm.ssub.sat.i64(i64 %a, i64 %b)
15531       declare <4 x i32> @llvm.ssub.sat.v4i32(<4 x i32> %a, <4 x i32> %b)
15532
15533 Overview
15534 """""""""
15535
15536 The '``llvm.ssub.sat``' family of intrinsic functions perform signed
15537 saturating subtraction on the 2 arguments.
15538
15539 Arguments
15540 """"""""""
15541
15542 The arguments (%a and %b) and the result may be of integer types of any bit
15543 width, but they must have the same bit width. ``%a`` and ``%b`` are the two
15544 values that will undergo signed subtraction.
15545
15546 Semantics:
15547 """"""""""
15548
15549 The maximum value this operation can clamp to is the largest signed value
15550 representable by the bit width of the arguments. The minimum value is the
15551 smallest signed value representable by this bit width.
15552
15553
15554 Examples
15555 """""""""
15556
15557 .. code-block:: llvm
15558
15559       %res = call i4 @llvm.ssub.sat.i4(i4 2, i4 1)  ; %res = 1
15560       %res = call i4 @llvm.ssub.sat.i4(i4 2, i4 6)  ; %res = -4
15561       %res = call i4 @llvm.ssub.sat.i4(i4 -4, i4 5)  ; %res = -8
15562       %res = call i4 @llvm.ssub.sat.i4(i4 4, i4 -5)  ; %res = 7
15563
15564
15565 '``llvm.usub.sat.*``' Intrinsics
15566 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15567
15568 Syntax
15569 """""""
15570
15571 This is an overloaded intrinsic. You can use ``llvm.usub.sat``
15572 on any integer bit width or vectors of integers.
15573
15574 ::
15575
15576       declare i16 @llvm.usub.sat.i16(i16 %a, i16 %b)
15577       declare i32 @llvm.usub.sat.i32(i32 %a, i32 %b)
15578       declare i64 @llvm.usub.sat.i64(i64 %a, i64 %b)
15579       declare <4 x i32> @llvm.usub.sat.v4i32(<4 x i32> %a, <4 x i32> %b)
15580
15581 Overview
15582 """""""""
15583
15584 The '``llvm.usub.sat``' family of intrinsic functions perform unsigned
15585 saturating subtraction on the 2 arguments.
15586
15587 Arguments
15588 """"""""""
15589
15590 The arguments (%a and %b) and the result may be of integer types of any bit
15591 width, but they must have the same bit width. ``%a`` and ``%b`` are the two
15592 values that will undergo unsigned subtraction.
15593
15594 Semantics:
15595 """"""""""
15596
15597 The minimum value this operation can clamp to is 0, which is the smallest
15598 unsigned value representable by the bit width of the unsigned arguments.
15599 Because this is an unsigned operation, the result will never saturate towards
15600 the largest possible value representable by this bit width.
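
A minimal sketch of an equivalent computation, assuming nothing beyond the
semantics stated above, is shown below. The function name
``@usub_sat_expanded`` is hypothetical.

.. code-block:: llvm

      ; Hypothetical equivalent of @llvm.usub.sat.i32(%a, %b).
      define i32 @usub_sat_expanded(i32 %a, i32 %b) {
        %lt = icmp ult i32 %a, %b            ; the true difference would be negative
        %diff = sub i32 %a, %b
        %sat = select i1 %lt, i32 0, i32 %diff
        ret i32 %sat
      }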
15601
15602
15603 Examples
15604 """""""""
15605
15606 .. code-block:: llvm
15607
15608       %res = call i4 @llvm.usub.sat.i4(i4 2, i4 1)  ; %res = 1
15609       %res = call i4 @llvm.usub.sat.i4(i4 2, i4 6)  ; %res = 0
15610
15611
15612 '``llvm.sshl.sat.*``' Intrinsics
15613 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15614
15615 Syntax
15616 """""""
15617
15618 This is an overloaded intrinsic. You can use ``llvm.sshl.sat``
15619 on integers or vectors of integers of any bit width.
15620
15621 ::
15622
15623       declare i16 @llvm.sshl.sat.i16(i16 %a, i16 %b)
15624       declare i32 @llvm.sshl.sat.i32(i32 %a, i32 %b)
15625       declare i64 @llvm.sshl.sat.i64(i64 %a, i64 %b)
15626       declare <4 x i32> @llvm.sshl.sat.v4i32(<4 x i32> %a, <4 x i32> %b)
15627
15628 Overview
15629 """""""""
15630
15631 The '``llvm.sshl.sat``' family of intrinsic functions perform signed
15632 saturating left shift on the first argument.
15633
15634 Arguments
15635 """"""""""
15636
15637 The arguments (``%a`` and ``%b``) and the result may be of integer types of any
15638 bit width, but they must have the same bit width. ``%a`` is the value to be
shifted, and ``%b`` is the amount to shift by. If ``%b`` is (statically or
dynamically) equal to or larger than the integer bit width of the arguments,
the result is a :ref:`poison value <poisonvalues>`. If the arguments are
vectors, each vector element of ``%a`` is shifted by the corresponding shift
amount in ``%b``.
15644
15645
15646 Semantics:
15647 """"""""""
15648
15649 The maximum value this operation can clamp to is the largest signed value
15650 representable by the bit width of the arguments. The minimum value is the
15651 smallest signed value representable by this bit width.
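
One possible way to express this, shown here only as a sketch, is to shift
left, shift back arithmetically, and saturate if the original value is not
recovered. The function name ``@sshl_sat_expanded`` is hypothetical, and the
sketch assumes ``%b`` is smaller than the bit width.

.. code-block:: llvm

      ; Hypothetical equivalent of @llvm.sshl.sat.i32(%a, %b) for %b < 32.
      define i32 @sshl_sat_expanded(i32 %a, i32 %b) {
        %shl = shl i32 %a, %b
        %back = ashr i32 %shl, %b
        %lost = icmp ne i32 %back, %a        ; significant bits were shifted out
        %neg = icmp slt i32 %a, 0
        %limit = select i1 %neg, i32 -2147483648, i32 2147483647
        %sat = select i1 %lost, i32 %limit, i32 %shl
        ret i32 %sat
      }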
15652
15653
15654 Examples
15655 """""""""
15656
15657 .. code-block:: llvm
15658
15659       %res = call i4 @llvm.sshl.sat.i4(i4 2, i4 1)  ; %res = 4
15660       %res = call i4 @llvm.sshl.sat.i4(i4 2, i4 2)  ; %res = 7
15661       %res = call i4 @llvm.sshl.sat.i4(i4 -5, i4 1)  ; %res = -8
15662       %res = call i4 @llvm.sshl.sat.i4(i4 -1, i4 1)  ; %res = -2
15663
15664
15665 '``llvm.ushl.sat.*``' Intrinsics
15666 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15667
15668 Syntax
15669 """""""
15670
15671 This is an overloaded intrinsic. You can use ``llvm.ushl.sat``
15672 on integers or vectors of integers of any bit width.
15673
15674 ::
15675
15676       declare i16 @llvm.ushl.sat.i16(i16 %a, i16 %b)
15677       declare i32 @llvm.ushl.sat.i32(i32 %a, i32 %b)
15678       declare i64 @llvm.ushl.sat.i64(i64 %a, i64 %b)
15679       declare <4 x i32> @llvm.ushl.sat.v4i32(<4 x i32> %a, <4 x i32> %b)
15680
15681 Overview
15682 """""""""
15683
15684 The '``llvm.ushl.sat``' family of intrinsic functions perform unsigned
15685 saturating left shift on the first argument.
15686
15687 Arguments
15688 """"""""""
15689
15690 The arguments (``%a`` and ``%b``) and the result may be of integer types of any
15691 bit width, but they must have the same bit width. ``%a`` is the value to be
shifted, and ``%b`` is the amount to shift by. If ``%b`` is (statically or
dynamically) equal to or larger than the integer bit width of the arguments,
the result is a :ref:`poison value <poisonvalues>`. If the arguments are
vectors, each vector element of ``%a`` is shifted by the corresponding shift
amount in ``%b``.
15697
15698 Semantics:
15699 """"""""""
15700
15701 The maximum value this operation can clamp to is the largest unsigned value
15702 representable by the bit width of the arguments.
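
As a sketch, the unsigned case can be expressed by shifting left, shifting back
logically, and saturating if set bits were lost. The function name
``@ushl_sat_expanded`` is hypothetical, and the sketch assumes ``%b`` is
smaller than the bit width.

.. code-block:: llvm

      ; Hypothetical equivalent of @llvm.ushl.sat.i32(%a, %b) for %b < 32.
      define i32 @ushl_sat_expanded(i32 %a, i32 %b) {
        %shl = shl i32 %a, %b
        %back = lshr i32 %shl, %b
        %lost = icmp ne i32 %back, %a        ; set bits were shifted out
        %sat = select i1 %lost, i32 -1, i32 %shl
        ret i32 %sat
      }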
15703
15704
15705 Examples
15706 """""""""
15707
15708 .. code-block:: llvm
15709
15710       %res = call i4 @llvm.ushl.sat.i4(i4 2, i4 1)  ; %res = 4
15711       %res = call i4 @llvm.ushl.sat.i4(i4 3, i4 3)  ; %res = 15
15712
15713
15714 Fixed Point Arithmetic Intrinsics
15715 ---------------------------------
15716
15717 A fixed point number represents a real data type for a number that has a fixed
15718 number of digits after a radix point (equivalent to the decimal point '.').
The number of digits after the radix point is referred to as the `scale`. These
15720 are useful for representing fractional values to a specific precision. The
15721 following intrinsics perform fixed point arithmetic operations on 2 operands
15722 of the same scale, specified as the third argument.
15723
The ``llvm.*mul.fix`` family of intrinsic functions represents a multiplication
of fixed point numbers through scaled integers. Fixed point multiplication can
be represented as:
15727
15728 .. code-block:: llvm
15729
15730         %result = call i4 @llvm.smul.fix.i4(i4 %a, i4 %b, i32 %scale)
15731
15732         ; Expands to
15733         %a2 = sext i4 %a to i8
15734         %b2 = sext i4 %b to i8
        %mul = mul nsw nuw i8 %a2, %b2
        %scale2 = trunc i32 %scale to i8
        %r = ashr i8 %mul, %scale2  ; this is for a target rounding down towards negative infinity
15738         %result = trunc i8 %r to i4
15739
15740 The ``llvm.*div.fix`` family of intrinsic functions represents a division of
15741 fixed point numbers through scaled integers. Fixed point division can be
15742 represented as:
15743
15744 .. code-block:: llvm
15745
        %result = call i4 @llvm.sdiv.fix.i4(i4 %a, i4 %b, i32 %scale)
15747
15748         ; Expands to
15749         %a2 = sext i4 %a to i8
15750         %b2 = sext i4 %b to i8
15751         %scale2 = trunc i32 %scale to i8
15752         %a3 = shl i8 %a2, %scale2
15753         %r = sdiv i8 %a3, %b2 ; this is for a target rounding towards zero
15754         %result = trunc i8 %r to i4
15755
For each of these functions, if the result cannot be represented exactly with
the provided scale, the result is rounded. The rounding direction is not fixed
by the IR, since the preferred rounding may vary between targets; instead, it
is specified through a target hook. Different pipelines should legalize or
optimize this using the rounding specified by this hook if it is provided.
Operations like constant folding, instruction combining, KnownBits, and
ValueTracking should also use this hook, if provided, and not assume the
direction of rounding. A rounded result must always be within one unit of
precision from the true result. That is, the error between the returned result
and the true result must be less than 1/2^(scale).
15766
15767
15768 '``llvm.smul.fix.*``' Intrinsics
15769 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15770
15771 Syntax
15772 """""""
15773
15774 This is an overloaded intrinsic. You can use ``llvm.smul.fix``
15775 on any integer bit width or vectors of integers.
15776
15777 ::
15778
15779       declare i16 @llvm.smul.fix.i16(i16 %a, i16 %b, i32 %scale)
15780       declare i32 @llvm.smul.fix.i32(i32 %a, i32 %b, i32 %scale)
15781       declare i64 @llvm.smul.fix.i64(i64 %a, i64 %b, i32 %scale)
15782       declare <4 x i32> @llvm.smul.fix.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)
15783
15784 Overview
15785 """""""""
15786
15787 The '``llvm.smul.fix``' family of intrinsic functions perform signed
15788 fixed point multiplication on 2 arguments of the same scale.
15789
15790 Arguments
15791 """"""""""
15792
The arguments (%a and %b) and the result may be of integer types of any bit
width, but they must have the same bit width. The arguments may also be
integer vectors of the same length and element width. ``%a`` and ``%b`` are the
two values that will undergo signed fixed point multiplication. The argument
``%scale`` represents the scale of both operands, and must be a constant
integer.
15799
15800 Semantics:
15801 """"""""""
15802
15803 This operation performs fixed point multiplication on the 2 arguments of a
15804 specified scale. The result will also be returned in the same scale specified
15805 in the third argument.
15806
15807 If the result value cannot be precisely represented in the given scale, the
15808 value is rounded up or down to the closest representable value. The rounding
15809 direction is unspecified.
15810
15811 It is undefined behavior if the result value does not fit within the range of
15812 the fixed point type.
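
As a worked illustration with a larger scale, the sketch below multiplies two
values in a 16-bit format with 8 fractional bits. The function name
``@q8_8_example`` is hypothetical. The operands are chosen so that the product
is exactly representable, so no rounding is involved.

.. code-block:: llvm

      declare i16 @llvm.smul.fix.i16(i16, i16, i32)

      define i16 @q8_8_example() {
        ; 384 / 2^8 = 1.5 and 576 / 2^8 = 2.25.
        ; The exact product is 3.375, which is 864 / 2^8.
        %res = call i16 @llvm.smul.fix.i16(i16 384, i16 576, i32 8)  ; %res = 864
        ret i16 %res
      }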
15813
15814
15815 Examples
15816 """""""""
15817
15818 .. code-block:: llvm
15819
15820       %res = call i4 @llvm.smul.fix.i4(i4 3, i4 2, i32 0)  ; %res = 6 (2 x 3 = 6)
15821       %res = call i4 @llvm.smul.fix.i4(i4 3, i4 2, i32 1)  ; %res = 3 (1.5 x 1 = 1.5)
15822       %res = call i4 @llvm.smul.fix.i4(i4 3, i4 -2, i32 1)  ; %res = -3 (1.5 x -1 = -1.5)
15823
15824       ; The result in the following could be rounded up to -2 or down to -2.5
15825       %res = call i4 @llvm.smul.fix.i4(i4 3, i4 -3, i32 1)  ; %res = -5 (or -4) (1.5 x -1.5 = -2.25)
15826
15827
15828 '``llvm.umul.fix.*``' Intrinsics
15829 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15830
15831 Syntax
15832 """""""
15833
15834 This is an overloaded intrinsic. You can use ``llvm.umul.fix``
15835 on any integer bit width or vectors of integers.
15836
15837 ::
15838
15839       declare i16 @llvm.umul.fix.i16(i16 %a, i16 %b, i32 %scale)
15840       declare i32 @llvm.umul.fix.i32(i32 %a, i32 %b, i32 %scale)
15841       declare i64 @llvm.umul.fix.i64(i64 %a, i64 %b, i32 %scale)
15842       declare <4 x i32> @llvm.umul.fix.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)
15843
15844 Overview
15845 """""""""
15846
15847 The '``llvm.umul.fix``' family of intrinsic functions perform unsigned
15848 fixed point multiplication on 2 arguments of the same scale.
15849
15850 Arguments
15851 """"""""""
15852
The arguments (%a and %b) and the result may be of integer types of any bit
width, but they must have the same bit width. The arguments may also be
integer vectors of the same length and element width. ``%a`` and ``%b`` are the
two values that will undergo unsigned fixed point multiplication. The argument
``%scale`` represents the scale of both operands, and must be a constant
integer.
15859
15860 Semantics:
15861 """"""""""
15862
15863 This operation performs unsigned fixed point multiplication on the 2 arguments of a
15864 specified scale. The result will also be returned in the same scale specified
15865 in the third argument.
15866
15867 If the result value cannot be precisely represented in the given scale, the
15868 value is rounded up or down to the closest representable value. The rounding
15869 direction is unspecified.
15870
15871 It is undefined behavior if the result value does not fit within the range of
15872 the fixed point type.
15873
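The rule above can be modeled with arbitrary-precision integers. The following
Python sketch is illustrative only and does not describe how LLVM lowers the
intrinsic; the name ``umul_fix_candidates`` and the ``bits`` parameter are
hypothetical. It returns every raw result that the unspecified rounding
direction permits. Treating a call as out of range only when no permitted
rounding fits is an assumption of the sketch.

.. code-block:: python

      def umul_fix_candidates(a, b, scale, bits):
          """Model llvm.umul.fix on unsigned operands that are `bits` wide."""
          limit = 1 << bits
          assert 0 <= a < limit and 0 <= b < limit
          product = a * b              # exact; carries 2 * scale fractional bits
          floor = product >> scale     # back to `scale` fractional bits, rounded down
          if (floor << scale) == product:
              candidates = {floor}     # exactly representable
          else:
              candidates = {floor, floor + 1}   # either direction is allowed
          fitting = {c for c in candidates if c < limit}
          if not fitting:
              # Assumption of this sketch: out of range means no rounding fits.
              raise OverflowError("result does not fit: undefined behavior")
          return fitting

For ``i4`` operands, ``umul_fix_candidates(15, 1, 1, 4)`` yields the two raw
results 7 and 8, which stand for 3.5 and 4 at scale 1.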
15874
15875 Examples
15876 """""""""
15877
15878 .. code-block:: llvm
15879
15880       %res = call i4 @llvm.umul.fix.i4(i4 3, i4 2, i32 0)  ; %res = 6 (2 x 3 = 6)
15881       %res = call i4 @llvm.umul.fix.i4(i4 3, i4 2, i32 1)  ; %res = 3 (1.5 x 1 = 1.5)
15882
15883       ; The result in the following could be rounded down to 3.5 or up to 4
15884       %res = call i4 @llvm.umul.fix.i4(i4 15, i4 1, i32 1)  ; %res = 7 (or 8) (7.5 x 0.5 = 3.75)
15885
15886
15887 '``llvm.smul.fix.sat.*``' Intrinsics
15888 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15889
15890 Syntax
15891 """""""
15892
15893 This is an overloaded intrinsic. You can use ``llvm.smul.fix.sat``
15894 on any integer bit width or vectors of integers.
15895
15896 ::
15897
15898       declare i16 @llvm.smul.fix.sat.i16(i16 %a, i16 %b, i32 %scale)
15899       declare i32 @llvm.smul.fix.sat.i32(i32 %a, i32 %b, i32 %scale)
15900       declare i64 @llvm.smul.fix.sat.i64(i64 %a, i64 %b, i32 %scale)
15901       declare <4 x i32> @llvm.smul.fix.sat.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)
15902
15903 Overview
15904 """""""""
15905
15906 The '``llvm.smul.fix.sat``' family of intrinsic functions perform signed
15907 fixed point saturating multiplication on 2 arguments of the same scale.
15908
15909 Arguments
15910 """"""""""
15911
15912 The arguments (%a and %b) and the result may be of integer types of any bit
15913 width, but they must have the same bit width. ``%a`` and ``%b`` are the two
15914 values that will undergo signed fixed point multiplication. The argument
15915 ``%scale`` represents the scale of both operands, and must be a constant
15916 integer.
15917
15918 Semantics:
15919 """"""""""
15920
15921 This operation performs fixed point multiplication on the 2 arguments of a
15922 specified scale. The result will also be returned in the same scale specified
15923 in the third argument.
15924
15925 If the result value cannot be precisely represented in the given scale, the
15926 value is rounded up or down to the closest representable value. The rounding
15927 direction is unspecified.
15928
15929 The maximum value this operation can clamp to is the largest signed value
15930 representable by the bit width of the first 2 arguments. The minimum value is the
15931 smallest signed value representable by this bit width.
15932
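A saturating multiply differs from the plain form only in the final clamp. The
Python sketch below is a model under stated assumptions, not LLVM's lowering:
``smul_fix_sat``, ``bits`` and ``round_up`` are hypothetical names, and
``round_up`` merely selects between the two roundings that the text permits.

.. code-block:: python

      def smul_fix_sat(a, b, scale, bits, round_up=False):
          """Model llvm.smul.fix.sat on signed operands that are `bits` wide."""
          lo = -(1 << (bits - 1))          # smallest signed value
          hi = (1 << (bits - 1)) - 1       # largest signed value
          assert lo <= a <= hi and lo <= b <= hi
          product = a * b                  # exact product at twice the scale
          result = product >> scale        # Python's >> rounds toward -infinity
          if round_up and (result << scale) != product:
              result += 1                  # the other permitted rounding
          return max(lo, min(hi, result))  # clamp instead of overflowing

For ``i4``, ``smul_fix_sat(-8, 5, 2, 4)`` clamps the out-of-range product to
``-8``, and ``smul_fix_sat(3, -3, 1, 4)`` gives ``-5`` or, with
``round_up=True``, ``-4``.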
15933
15934 Examples
15935 """""""""
15936
15937 .. code-block:: llvm
15938
15939       %res = call i4 @llvm.smul.fix.sat.i4(i4 3, i4 2, i32 0)  ; %res = 6 (2 x 3 = 6)
15940       %res = call i4 @llvm.smul.fix.sat.i4(i4 3, i4 2, i32 1)  ; %res = 3 (1.5 x 1 = 1.5)
15941       %res = call i4 @llvm.smul.fix.sat.i4(i4 3, i4 -2, i32 1)  ; %res = -3 (1.5 x -1 = -1.5)
15942
15943       ; The result in the following could be rounded up to -2 or down to -2.5
15944       %res = call i4 @llvm.smul.fix.sat.i4(i4 3, i4 -3, i32 1)  ; %res = -5 (or -4) (1.5 x -1.5 = -2.25)
15945
15946       ; Saturation
15947       %res = call i4 @llvm.smul.fix.sat.i4(i4 7, i4 2, i32 0)  ; %res = 7
15948       %res = call i4 @llvm.smul.fix.sat.i4(i4 7, i4 4, i32 2)  ; %res = 7
15949       %res = call i4 @llvm.smul.fix.sat.i4(i4 -8, i4 5, i32 2)  ; %res = -8
15950       %res = call i4 @llvm.smul.fix.sat.i4(i4 -8, i4 -2, i32 1)  ; %res = 7
15951
15952       ; Scale can affect the saturation result
15953       %res = call i4 @llvm.smul.fix.sat.i4(i4 2, i4 4, i32 0)  ; %res = 7 (2 x 4 -> clamped to 7)
15954       %res = call i4 @llvm.smul.fix.sat.i4(i4 2, i4 4, i32 1)  ; %res = 4 (1 x 2 = 2)
15955
15956
15957 '``llvm.umul.fix.sat.*``' Intrinsics
15958 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15959
15960 Syntax
15961 """""""
15962
15963 This is an overloaded intrinsic. You can use ``llvm.umul.fix.sat``
15964 on any integer bit width or vectors of integers.
15965
15966 ::
15967
15968       declare i16 @llvm.umul.fix.sat.i16(i16 %a, i16 %b, i32 %scale)
15969       declare i32 @llvm.umul.fix.sat.i32(i32 %a, i32 %b, i32 %scale)
15970       declare i64 @llvm.umul.fix.sat.i64(i64 %a, i64 %b, i32 %scale)
15971       declare <4 x i32> @llvm.umul.fix.sat.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)
15972
15973 Overview
15974 """""""""
15975
15976 The '``llvm.umul.fix.sat``' family of intrinsic functions perform unsigned
15977 fixed point saturating multiplication on 2 arguments of the same scale.
15978
15979 Arguments
15980 """"""""""
15981
15982 The arguments (%a and %b) and the result may be of integer types of any bit
15983 width, but they must have the same bit width. ``%a`` and ``%b`` are the two
15984 values that will undergo unsigned fixed point multiplication. The argument
15985 ``%scale`` represents the scale of both operands, and must be a constant
15986 integer.
15987
15988 Semantics:
15989 """"""""""
15990
15991 This operation performs fixed point multiplication on the 2 arguments of a
15992 specified scale. The result will also be returned in the same scale specified
15993 in the third argument.
15994
15995 If the result value cannot be precisely represented in the given scale, the
15996 value is rounded up or down to the closest representable value. The rounding
15997 direction is unspecified.
15998
15999 The maximum value this operation can clamp to is the largest unsigned value
16000 representable by the bit width of the first 2 arguments. The minimum value is the
16001 smallest unsigned value representable by this bit width (zero).
16002
16003
16004 Examples
16005 """""""""
16006
16007 .. code-block:: llvm
16008
16009       %res = call i4 @llvm.umul.fix.sat.i4(i4 3, i4 2, i32 0)  ; %res = 6 (2 x 3 = 6)
16010       %res = call i4 @llvm.umul.fix.sat.i4(i4 3, i4 2, i32 1)  ; %res = 3 (1.5 x 1 = 1.5)
16011
16012       ; The result in the following could be rounded down to 2 or up to 2.5
16013       %res = call i4 @llvm.umul.fix.sat.i4(i4 3, i4 3, i32 1)  ; %res = 4 (or 5) (1.5 x 1.5 = 2.25)
16014
16015       ; Saturation
16016       %res = call i4 @llvm.umul.fix.sat.i4(i4 8, i4 2, i32 0)  ; %res = 15 (8 x 2 -> clamped to 15)
16017       %res = call i4 @llvm.umul.fix.sat.i4(i4 8, i4 8, i32 2)  ; %res = 15 (2 x 2 -> clamped to 3.75)
16018
16019       ; Scale can affect the saturation result
      %res = call i4 @llvm.umul.fix.sat.i4(i4 4, i4 4, i32 0)  ; %res = 15 (4 x 4 -> clamped to 15)
      %res = call i4 @llvm.umul.fix.sat.i4(i4 4, i4 4, i32 1)  ; %res = 8 (2 x 2 = 4)
16022
16023
16024 '``llvm.sdiv.fix.*``' Intrinsics
16025 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16026
16027 Syntax
16028 """""""
16029
16030 This is an overloaded intrinsic. You can use ``llvm.sdiv.fix``
16031 on any integer bit width or vectors of integers.
16032
16033 ::
16034
16035       declare i16 @llvm.sdiv.fix.i16(i16 %a, i16 %b, i32 %scale)
16036       declare i32 @llvm.sdiv.fix.i32(i32 %a, i32 %b, i32 %scale)
16037       declare i64 @llvm.sdiv.fix.i64(i64 %a, i64 %b, i32 %scale)
16038       declare <4 x i32> @llvm.sdiv.fix.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)
16039
16040 Overview
16041 """""""""
16042
16043 The '``llvm.sdiv.fix``' family of intrinsic functions perform signed
16044 fixed point division on 2 arguments of the same scale.
16045
16046 Arguments
16047 """"""""""
16048
The arguments (``%a`` and ``%b``) and the result may be integer types of any bit
width, but they must all have the same bit width. They may also be vectors of
integers, provided that all of them have the same length and element width.
``%a`` and ``%b`` are the two values that will undergo signed fixed point
division. The argument ``%scale`` represents the scale of both operands, and
must be a constant integer.
16055
16056 Semantics:
16057 """"""""""
16058
16059 This operation performs fixed point division on the 2 arguments of a
16060 specified scale. The result will also be returned in the same scale specified
16061 in the third argument.
16062
16063 If the result value cannot be precisely represented in the given scale, the
16064 value is rounded up or down to the closest representable value. The rounding
16065 direction is unspecified.
16066
16067 It is undefined behavior if the result value does not fit within the range of
16068 the fixed point type, or if the second argument is zero.
16069
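Division keeps the scale by widening the dividend before dividing. The Python
sketch below models this with exact integers; it is an illustration rather than
LLVM's lowering, and ``sdiv_fix_candidates`` and ``bits`` are hypothetical
names. As an assumption of the sketch, a call counts as out of range only when
no permitted rounding fits.

.. code-block:: python

      def sdiv_fix_candidates(a, b, scale, bits):
          """Model llvm.sdiv.fix on signed operands that are `bits` wide."""
          lo = -(1 << (bits - 1))
          hi = (1 << (bits - 1)) - 1
          assert lo <= a <= hi and lo <= b <= hi
          if b == 0:
              raise ZeroDivisionError("second argument is zero: undefined behavior")
          numerator = a << scale           # pre-scale so the quotient keeps `scale`
          floor = numerator // b           # Python's // rounds toward -infinity
          if floor * b == numerator:
              candidates = {floor}         # exactly representable
          else:
              candidates = {floor, floor + 1}   # either direction is allowed
          fitting = {c for c in candidates if lo <= c <= hi}
          if not fitting:
              # Assumption of this sketch: out of range means no rounding fits.
              raise OverflowError("result does not fit: undefined behavior")
          return fitting

``sdiv_fix_candidates(3, 4, 1, 4)`` yields 1 and 2, the two raw results allowed
for 1.5 / 2 = 0.75.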
16070
16071 Examples
16072 """""""""
16073
16074 .. code-block:: llvm
16075
16076       %res = call i4 @llvm.sdiv.fix.i4(i4 6, i4 2, i32 0)  ; %res = 3 (6 / 2 = 3)
16077       %res = call i4 @llvm.sdiv.fix.i4(i4 6, i4 4, i32 1)  ; %res = 3 (3 / 2 = 1.5)
16078       %res = call i4 @llvm.sdiv.fix.i4(i4 3, i4 -2, i32 1) ; %res = -3 (1.5 / -1 = -1.5)
16079
16080       ; The result in the following could be rounded up to 1 or down to 0.5
16081       %res = call i4 @llvm.sdiv.fix.i4(i4 3, i4 4, i32 1)  ; %res = 2 (or 1) (1.5 / 2 = 0.75)
16082
16083
16084 '``llvm.udiv.fix.*``' Intrinsics
16085 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16086
16087 Syntax
16088 """""""
16089
16090 This is an overloaded intrinsic. You can use ``llvm.udiv.fix``
16091 on any integer bit width or vectors of integers.
16092
16093 ::
16094
16095       declare i16 @llvm.udiv.fix.i16(i16 %a, i16 %b, i32 %scale)
16096       declare i32 @llvm.udiv.fix.i32(i32 %a, i32 %b, i32 %scale)
16097       declare i64 @llvm.udiv.fix.i64(i64 %a, i64 %b, i32 %scale)
16098       declare <4 x i32> @llvm.udiv.fix.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)
16099
16100 Overview
16101 """""""""
16102
16103 The '``llvm.udiv.fix``' family of intrinsic functions perform unsigned
16104 fixed point division on 2 arguments of the same scale.
16105
16106 Arguments
16107 """"""""""
16108
The arguments (``%a`` and ``%b``) and the result may be integer types of any bit
width, but they must all have the same bit width. They may also be vectors of
integers, provided that all of them have the same length and element width.
``%a`` and ``%b`` are the two values that will undergo unsigned fixed point
division. The argument ``%scale`` represents the scale of both operands, and
must be a constant integer.
16115
16116 Semantics:
16117 """"""""""
16118
16119 This operation performs fixed point division on the 2 arguments of a
16120 specified scale. The result will also be returned in the same scale specified
16121 in the third argument.
16122
16123 If the result value cannot be precisely represented in the given scale, the
16124 value is rounded up or down to the closest representable value. The rounding
16125 direction is unspecified.
16126
16127 It is undefined behavior if the result value does not fit within the range of
16128 the fixed point type, or if the second argument is zero.
16129
16130
16131 Examples
16132 """""""""
16133
16134 .. code-block:: llvm
16135
16136       %res = call i4 @llvm.udiv.fix.i4(i4 6, i4 2, i32 0)  ; %res = 3 (6 / 2 = 3)
16137       %res = call i4 @llvm.udiv.fix.i4(i4 6, i4 4, i32 1)  ; %res = 3 (3 / 2 = 1.5)
16138       %res = call i4 @llvm.udiv.fix.i4(i4 1, i4 -8, i32 4) ; %res = 2 (0.0625 / 0.5 = 0.125)
16139
16140       ; The result in the following could be rounded up to 1 or down to 0.5
16141       %res = call i4 @llvm.udiv.fix.i4(i4 3, i4 4, i32 1)  ; %res = 2 (or 1) (1.5 / 2 = 0.75)
16142
16143
16144 '``llvm.sdiv.fix.sat.*``' Intrinsics
16145 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16146
16147 Syntax
16148 """""""
16149
16150 This is an overloaded intrinsic. You can use ``llvm.sdiv.fix.sat``
16151 on any integer bit width or vectors of integers.
16152
16153 ::
16154
16155       declare i16 @llvm.sdiv.fix.sat.i16(i16 %a, i16 %b, i32 %scale)
16156       declare i32 @llvm.sdiv.fix.sat.i32(i32 %a, i32 %b, i32 %scale)
16157       declare i64 @llvm.sdiv.fix.sat.i64(i64 %a, i64 %b, i32 %scale)
16158       declare <4 x i32> @llvm.sdiv.fix.sat.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)
16159
16160 Overview
16161 """""""""
16162
16163 The '``llvm.sdiv.fix.sat``' family of intrinsic functions perform signed
16164 fixed point saturating division on 2 arguments of the same scale.
16165
16166 Arguments
16167 """"""""""
16168
16169 The arguments (%a and %b) and the result may be of integer types of any bit
16170 width, but they must have the same bit width. ``%a`` and ``%b`` are the two
16171 values that will undergo signed fixed point division. The argument
16172 ``%scale`` represents the scale of both operands, and must be a constant
16173 integer.
16174
16175 Semantics:
16176 """"""""""
16177
16178 This operation performs fixed point division on the 2 arguments of a
16179 specified scale. The result will also be returned in the same scale specified
16180 in the third argument.
16181
16182 If the result value cannot be precisely represented in the given scale, the
16183 value is rounded up or down to the closest representable value. The rounding
16184 direction is unspecified.
16185
16186 The maximum value this operation can clamp to is the largest signed value
16187 representable by the bit width of the first 2 arguments. The minimum value is the
16188 smallest signed value representable by this bit width.
16189
16190 It is undefined behavior if the second argument is zero.
16191
16192
16193 Examples
16194 """""""""
16195
16196 .. code-block:: llvm
16197
16198       %res = call i4 @llvm.sdiv.fix.sat.i4(i4 6, i4 2, i32 0)  ; %res = 3 (6 / 2 = 3)
16199       %res = call i4 @llvm.sdiv.fix.sat.i4(i4 6, i4 4, i32 1)  ; %res = 3 (3 / 2 = 1.5)
16200       %res = call i4 @llvm.sdiv.fix.sat.i4(i4 3, i4 -2, i32 1) ; %res = -3 (1.5 / -1 = -1.5)
16201
16202       ; The result in the following could be rounded up to 1 or down to 0.5
16203       %res = call i4 @llvm.sdiv.fix.sat.i4(i4 3, i4 4, i32 1)  ; %res = 2 (or 1) (1.5 / 2 = 0.75)
16204
16205       ; Saturation
16206       %res = call i4 @llvm.sdiv.fix.sat.i4(i4 -8, i4 -1, i32 0)  ; %res = 7 (-8 / -1 = 8 => 7)
16207       %res = call i4 @llvm.sdiv.fix.sat.i4(i4 4, i4 2, i32 2)  ; %res = 7 (1 / 0.5 = 2 => 1.75)
16208       %res = call i4 @llvm.sdiv.fix.sat.i4(i4 -4, i4 1, i32 2)  ; %res = -8 (-1 / 0.25 = -4 => -2)
16209
16210
16211 '``llvm.udiv.fix.sat.*``' Intrinsics
16212 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16213
16214 Syntax
16215 """""""
16216
16217 This is an overloaded intrinsic. You can use ``llvm.udiv.fix.sat``
16218 on any integer bit width or vectors of integers.
16219
16220 ::
16221
16222       declare i16 @llvm.udiv.fix.sat.i16(i16 %a, i16 %b, i32 %scale)
16223       declare i32 @llvm.udiv.fix.sat.i32(i32 %a, i32 %b, i32 %scale)
16224       declare i64 @llvm.udiv.fix.sat.i64(i64 %a, i64 %b, i32 %scale)
16225       declare <4 x i32> @llvm.udiv.fix.sat.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)
16226
16227 Overview
16228 """""""""
16229
16230 The '``llvm.udiv.fix.sat``' family of intrinsic functions perform unsigned
16231 fixed point saturating division on 2 arguments of the same scale.
16232
16233 Arguments
16234 """"""""""
16235
16236 The arguments (%a and %b) and the result may be of integer types of any bit
16237 width, but they must have the same bit width. ``%a`` and ``%b`` are the two
16238 values that will undergo unsigned fixed point division. The argument
16239 ``%scale`` represents the scale of both operands, and must be a constant
16240 integer.
16241
16242 Semantics:
16243 """"""""""
16244
16245 This operation performs fixed point division on the 2 arguments of a
16246 specified scale. The result will also be returned in the same scale specified
16247 in the third argument.
16248
16249 If the result value cannot be precisely represented in the given scale, the
16250 value is rounded up or down to the closest representable value. The rounding
16251 direction is unspecified.
16252
16253 The maximum value this operation can clamp to is the largest unsigned value
16254 representable by the bit width of the first 2 arguments. The minimum value is the
16255 smallest unsigned value representable by this bit width (zero).
16256
16257 It is undefined behavior if the second argument is zero.
16258
16259 Examples
16260 """""""""
16261
16262 .. code-block:: llvm
16263
16264       %res = call i4 @llvm.udiv.fix.sat.i4(i4 6, i4 2, i32 0)  ; %res = 3 (6 / 2 = 3)
16265       %res = call i4 @llvm.udiv.fix.sat.i4(i4 6, i4 4, i32 1)  ; %res = 3 (3 / 2 = 1.5)
16266
16267       ; The result in the following could be rounded down to 0.5 or up to 1
16268       %res = call i4 @llvm.udiv.fix.sat.i4(i4 3, i4 4, i32 1)  ; %res = 1 (or 2) (1.5 / 2 = 0.75)
16269
16270       ; Saturation
16271       %res = call i4 @llvm.udiv.fix.sat.i4(i4 8, i4 2, i32 2)  ; %res = 15 (2 / 0.5 = 4 => 3.75)
16272
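The clamp applied by the saturating division can be modeled in a few lines of
Python. This is a sketch under stated assumptions rather than LLVM's lowering:
``udiv_fix_sat``, ``bits`` and ``round_up`` are hypothetical names, and
``round_up`` only selects between the two permitted roundings.

.. code-block:: python

      def udiv_fix_sat(a, b, scale, bits, round_up=False):
          """Model llvm.udiv.fix.sat on unsigned operands that are `bits` wide."""
          hi = (1 << bits) - 1             # largest unsigned value
          assert 0 <= a <= hi and 0 <= b <= hi
          if b == 0:
              raise ZeroDivisionError("second argument is zero: undefined behavior")
          numerator = a << scale           # pre-scale so the quotient keeps `scale`
          result = numerator // b          # rounded down
          if round_up and result * b != numerator:
              result += 1                  # the other permitted rounding
          return min(hi, result)           # unsigned results are never below zero

``udiv_fix_sat(8, 2, 2, 4)`` returns 15: the quotient 4 would need the raw
value 16, which is clamped to the ``i4`` maximum.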
16273
16274 Specialised Arithmetic Intrinsics
16275 ---------------------------------
16276
16277 .. _i_intr_llvm_canonicalize:
16278
16279 '``llvm.canonicalize.*``' Intrinsic
16280 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16281
16282 Syntax:
16283 """""""
16284
16285 ::
16286
16287       declare float @llvm.canonicalize.f32(float %a)
16288       declare double @llvm.canonicalize.f64(double %b)
16289
16290 Overview:
16291 """""""""
16292
The '``llvm.canonicalize.*``' intrinsic returns the platform-specific canonical
16294 encoding of a floating-point number. This canonicalization is useful for
16295 implementing certain numeric primitives such as frexp. The canonical encoding is
16296 defined by IEEE-754-2008 to be:
16297
16298 ::
16299
16300       2.1.8 canonical encoding: The preferred encoding of a floating-point
16301       representation in a format. Applied to declets, significands of finite
16302       numbers, infinities, and NaNs, especially in decimal formats.
16303
16304 This operation can also be considered equivalent to the IEEE-754-2008
16305 conversion of a floating-point value to the same format. NaNs are handled
16306 according to section 6.2.
16307
16308 Examples of non-canonical encodings:
16309
16310 - x87 pseudo denormals, pseudo NaNs, pseudo Infinity, Unnormals. These are
16311   converted to a canonical representation per hardware-specific protocol.
16312 - Many normal decimal floating-point numbers have non-canonical alternative
16313   encodings.
16314 - Some machines, like GPUs or ARMv7 NEON, do not support subnormal values.
16315   These are treated as non-canonical encodings of zero and will be flushed to
16316   a zero of the same sign by this operation.
16317
16318 Note that per IEEE-754-2008 6.2, systems that support signaling NaNs with
16319 default exception handling must signal an invalid exception, and produce a
16320 quiet NaN result.
16321
16322 This function should always be implementable as multiplication by 1.0, provided
16323 that the compiler does not constant fold the operation. Likewise, division by
16324 1.0 and ``llvm.minnum(x, x)`` are possible implementations. Addition with
16325 -0.0 is also sufficient provided that the rounding mode is not -Infinity.
16326
16327 ``@llvm.canonicalize`` must preserve the equality relation. That is:
16328
16329 - ``(@llvm.canonicalize(x) == x)`` is equivalent to ``(x == x)``
- ``(@llvm.canonicalize(x) == @llvm.canonicalize(y))`` is equivalent to
  ``(x == y)``
16332
16333 Additionally, the sign of zero must be conserved:
16334 ``@llvm.canonicalize(-0.0) = -0.0`` and ``@llvm.canonicalize(+0.0) = +0.0``
16335
16336 The payload bits of a NaN must be conserved, with two exceptions.
16337 First, environments which use only a single canonical representation of NaN
16338 must perform said canonicalization. Second, SNaNs must be quieted per the
16339 usual methods.
16340
16341 The canonicalization operation may be optimized away if:
16342
16343 - The input is known to be canonical. For example, it was produced by a
16344   floating-point operation that is required by the standard to be canonical.
16345 - The result is consumed only by (or fused with) other floating-point
16346   operations. That is, the bits of the floating-point value are not examined.
16347
16348 '``llvm.fmuladd.*``' Intrinsic
16349 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16350
16351 Syntax:
16352 """""""
16353
16354 ::
16355
16356       declare float @llvm.fmuladd.f32(float %a, float %b, float %c)
16357       declare double @llvm.fmuladd.f64(double %a, double %b, double %c)
16358
16359 Overview:
16360 """""""""
16361
16362 The '``llvm.fmuladd.*``' intrinsic functions represent multiply-add
16363 expressions that can be fused if the code generator determines that (a) the
16364 target instruction set has support for a fused operation, and (b) that the
16365 fused operation is more efficient than the equivalent, separate pair of mul
16366 and add instructions.
16367
16368 Arguments:
16369 """"""""""
16370
16371 The '``llvm.fmuladd.*``' intrinsics each take three arguments: two
16372 multiplicands, a and b, and an addend c.
16373
16374 Semantics:
16375 """"""""""
16376
16377 The expression:
16378
16379 ::
16380
      %0 = call float @llvm.fmuladd.f32(float %a, float %b, float %c)
16382
16383 is equivalent to the expression a \* b + c, except that it is unspecified
16384 whether rounding will be performed between the multiplication and addition
16385 steps. Fusion is not guaranteed, even if the target platform supports it.
16386 If a fused multiply-add is required, the corresponding
16387 :ref:`llvm.fma <int_fma>` intrinsic function should be used instead.
Like '``llvm.fma.*``', this intrinsic never sets errno.
16389
16390 Examples:
16391 """""""""
16392
16393 .. code-block:: llvm
16394
16395       %r2 = call float @llvm.fmuladd.f32(float %a, float %b, float %c) ; yields float:r2 = (a * b) + c
16396
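Whether the intermediate product is rounded can change the result. The Python
sketch below contrasts the two permitted evaluations. It is an illustration
only: Python floats are IEEE-754 doubles rather than ``float``, the function
names are hypothetical, and the fused form is emulated with exact rational
arithmetic followed by a single rounding.

.. code-block:: python

      from fractions import Fraction

      def fmuladd_unfused(a, b, c):
          # Rounds after the multiplication and again after the addition.
          return a * b + c

      def fmuladd_fused(a, b, c):
          # Computes a * b + c exactly, then rounds once to the nearest double.
          return float(Fraction(a) * Fraction(b) + Fraction(c))

      a = 1.0 + 2.0 ** -27
      b = 1.0 - 2.0 ** -27
      c = -1.0
      unfused = fmuladd_unfused(a, b, c)   # a * b rounds to 1.0, so this is 0.0
      fused = fmuladd_fused(a, b, c)       # keeps the exact product: -(2 ** -54)

Either value is an acceptable result for '``llvm.fmuladd.*``' with these
inputs, which is why code that depends on the fused result needs
'``llvm.fma.*``'.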
16397
16398 Hardware-Loop Intrinsics
16399 ------------------------
16400
LLVM supports several intrinsics to mark a loop as a hardware-loop. They are
hints to the backend, which is required either to lower these intrinsics further
to target-specific instructions, or to revert the hardware-loop to a normal loop
if target-specific restrictions are not met and a hardware-loop can't be
generated.
16405
16406 These intrinsics may be modified in the future and are not intended to be used
16407 outside the backend. Thus, front-end and mid-level optimizations should not be
16408 generating these intrinsics.
16409
16410
16411 '``llvm.set.loop.iterations.*``' Intrinsic
16412 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16413
16414 Syntax:
16415 """""""
16416
16417 This is an overloaded intrinsic.
16418
16419 ::
16420
16421       declare void @llvm.set.loop.iterations.i32(i32)
16422       declare void @llvm.set.loop.iterations.i64(i64)
16423
16424 Overview:
16425 """""""""
16426
16427 The '``llvm.set.loop.iterations.*``' intrinsics are used to specify the
16428 hardware-loop trip count. They are placed in the loop preheader basic block and
16429 are marked as ``IntrNoDuplicate`` to avoid optimizers duplicating these
16430 instructions.
16431
16432 Arguments:
16433 """"""""""
16434
16435 The integer operand is the loop trip count of the hardware-loop, and thus
16436 not e.g. the loop back-edge taken count.
16437
16438 Semantics:
16439 """"""""""
16440
The '``llvm.set.loop.iterations.*``' intrinsics do not perform any arithmetic
on their operand. They are a hint to the backend, which can use the operand to
set up the hardware-loop count with a target-specific instruction, usually a
move of this value to a special register or a hardware-loop instruction.
16445
16446
16447 '``llvm.start.loop.iterations.*``' Intrinsic
16448 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16449
16450 Syntax:
16451 """""""
16452
16453 This is an overloaded intrinsic.
16454
16455 ::
16456
16457       declare i32 @llvm.start.loop.iterations.i32(i32)
16458       declare i64 @llvm.start.loop.iterations.i64(i64)
16459
16460 Overview:
16461 """""""""
16462
The '``llvm.start.loop.iterations.*``' intrinsics are similar to the
'``llvm.set.loop.iterations.*``' intrinsics: they specify the hardware-loop
trip count, but they also produce a value identical to the input that can be
used as the input to the loop. They are placed in the loop preheader basic
block, and the output is expected to be the input to the phi for the induction
variable of the loop, which is decremented by the
'``llvm.loop.decrement.reg.*``' intrinsic.
16470
16471 Arguments:
16472 """"""""""
16473
16474 The integer operand is the loop trip count of the hardware-loop, and thus
16475 not e.g. the loop back-edge taken count.
16476
16477 Semantics:
16478 """"""""""
16479
The '``llvm.start.loop.iterations.*``' intrinsics do not perform any arithmetic
on their operand. They are a hint to the backend, which can use the operand to
set up the hardware-loop count with a target-specific instruction, usually a
move of this value to a special register or a hardware-loop instruction.
16484
16485 '``llvm.test.set.loop.iterations.*``' Intrinsic
16486 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16487
16488 Syntax:
16489 """""""
16490
16491 This is an overloaded intrinsic.
16492
16493 ::
16494
16495       declare i1 @llvm.test.set.loop.iterations.i32(i32)
16496       declare i1 @llvm.test.set.loop.iterations.i64(i64)
16497
16498 Overview:
16499 """""""""
16500
The '``llvm.test.set.loop.iterations.*``' intrinsics are used to specify the
loop trip count, and also test that the given count is not zero, allowing
the result to control entry to a while-loop.  They are placed in the loop preheader's
16504 predecessor basic block, and are marked as ``IntrNoDuplicate`` to avoid
16505 optimizers duplicating these instructions.
16506
16507 Arguments:
16508 """"""""""
16509
16510 The integer operand is the loop trip count of the hardware-loop, and thus
16511 not e.g. the loop back-edge taken count.
16512
16513 Semantics:
16514 """"""""""
16515
The '``llvm.test.set.loop.iterations.*``' intrinsics do not perform any
arithmetic on their operand. They are a hint to the backend, which can use the
operand to set up the hardware-loop count with a target-specific instruction,
usually a move of this value to a special register or a hardware-loop
instruction. The result is true if the given count is not zero, and false
otherwise.
16521
16522
16523 '``llvm.test.start.loop.iterations.*``' Intrinsic
16524 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16525
16526 Syntax:
16527 """""""
16528
16529 This is an overloaded intrinsic.
16530
16531 ::
16532
16533       declare {i32, i1} @llvm.test.start.loop.iterations.i32(i32)
16534       declare {i64, i1} @llvm.test.start.loop.iterations.i64(i64)
16535
16536 Overview:
16537 """""""""
16538
16539 The '``llvm.test.start.loop.iterations.*``' intrinsics are similar to the
16540 '``llvm.test.set.loop.iterations.*``' and '``llvm.start.loop.iterations.*``'
16541 intrinsics, used to specify the hardware-loop trip count, but also produce a
16542 value identical to the input that can be used as the input to the loop. The
16543 second i1 output controls entry to a while-loop.
16544
16545 Arguments:
16546 """"""""""
16547
16548 The integer operand is the loop trip count of the hardware-loop, and thus
16549 not e.g. the loop back-edge taken count.
16550
16551 Semantics:
16552 """"""""""
16553
The '``llvm.test.start.loop.iterations.*``' intrinsics do not perform any
arithmetic on their operand. They are a hint to the backend, which can use the
operand to set up the hardware-loop count with a target-specific instruction,
usually a move of this value to a special register or a hardware-loop
instruction. The result is a pair consisting of the input and an ``i1`` that is
true if the given count is not zero.
16560
16561
16562 '``llvm.loop.decrement.reg.*``' Intrinsic
16563 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16564
16565 Syntax:
16566 """""""
16567
16568 This is an overloaded intrinsic.
16569
16570 ::
16571
16572       declare i32 @llvm.loop.decrement.reg.i32(i32, i32)
16573       declare i64 @llvm.loop.decrement.reg.i64(i64, i64)
16574
16575 Overview:
16576 """""""""
16577
The '``llvm.loop.decrement.reg.*``' intrinsics are used to decrement the loop
iteration counter and return an updated value that will be used in the next
loop test check.
16581
16582 Arguments:
16583 """"""""""
16584
16585 Both arguments must have identical integer types. The first operand is the
16586 loop iteration counter. The second operand is the maximum number of elements
16587 processed in an iteration.
16588
16589 Semantics:
16590 """"""""""
16591
The '``llvm.loop.decrement.reg.*``' intrinsics do an integer ``SUB`` of their
two operands, which is not allowed to wrap. They return the remaining number of
iterations still to be executed, and can be used together with a ``PHI``,
``ICMP`` and ``BR`` to control the number of loop iterations executed. Any
optimisation is allowed to treat the intrinsic as a ``SUB``, and it is supported
by SCEV, so it's the backend's responsibility to handle cases where it may be
optimised. These intrinsics are marked as ``IntrNoDuplicate`` to avoid
optimizers duplicating these instructions.
16600
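The control flow that these intrinsics describe can be sketched as an ordinary
loop. The Python model below is only an illustration under stated assumptions:
counters are non-negative integers, the decrement is 1, and every function name
is a hypothetical stand-in for the intrinsic it is named after. The intrinsics
themselves are not meant to be written by hand.

.. code-block:: python

      def test_start_loop_iterations(trip_count):
          # Returns the count unchanged plus the "count is not zero" guard.
          return trip_count, trip_count != 0

      def loop_decrement_reg(counter, step):
          # A plain SUB that is not allowed to wrap.
          assert counter >= step, "the SUB is not allowed to wrap"
          return counter - step

      def run_hardware_loop(trip_count, body):
          counter, enter = test_start_loop_iterations(trip_count)
          if not enter:
              return 0                     # while-loop guard: skip the loop
          executed = 0
          while True:                      # `counter` plays the role of the PHI
              body(executed)
              executed += 1
              counter = loop_decrement_reg(counter, 1)
              if counter == 0:             # ICMP followed by BR
                  break
          return executed

Calling ``run_hardware_loop(3, body)`` invokes ``body`` three times, and a trip
count of zero never enters the loop.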
16601
16602 '``llvm.loop.decrement.*``' Intrinsic
16603 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16604
16605 Syntax:
16606 """""""
16607
16608 This is an overloaded intrinsic.
16609
16610 ::
16611
16612       declare i1 @llvm.loop.decrement.i32(i32)
16613       declare i1 @llvm.loop.decrement.i64(i64)
16614
16615 Overview:
16616 """""""""
16617
16618 The HardwareLoops pass allows the loop decrement value to be specified with an
16619 option. It defaults to a loop decrement value of 1, but it can be an unsigned
16620 integer value provided by this option.  The '``llvm.loop.decrement.*``'
16621 intrinsics decrement the loop iteration counter with this value, and return a
16622 false predicate if the loop should exit, and true otherwise.
16623 This is emitted if the loop counter is not updated via a ``PHI`` node, which
16624 can also be controlled with an option.
16625
16626 Arguments:
16627 """"""""""
16628
16629 The integer argument is the loop decrement value used to decrement the loop
16630 iteration counter.
16631
16632 Semantics:
16633 """"""""""
16634
The '``llvm.loop.decrement.*``' intrinsics do a ``SUB`` of the loop iteration
counter with the given loop decrement value, and return false if the loop
should exit. This ``SUB`` is not allowed to wrap. The result is a condition
that is used by the conditional branch controlling the loop.
16639
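Unlike the ``.reg`` form, this intrinsic has no counter operand, so the counter
lives in target state. The Python sketch below models that state as an object.
It is an illustration only: the class and method names are hypothetical, and it
assumes that the loop should exit once the counter reaches zero.

.. code-block:: python

      class HardwareLoopCounter:
          """Stands in for the target's loop-count register."""

          def __init__(self):
              self.value = 0

          def set_loop_iterations(self, trip_count):
              self.value = trip_count      # no arithmetic, only a move

          def loop_decrement(self, step=1):
              assert self.value >= step, "the SUB is not allowed to wrap"
              self.value -= step
              return self.value != 0       # false means the loop should exit

      counter = HardwareLoopCounter()
      counter.set_loop_iterations(4)
      iterations = 0
      while True:
          iterations += 1                  # loop body
          if not counter.loop_decrement():
              break                        # conditional branch leaves the loop

With a trip count of 4 and the default decrement of 1, the body runs four
times before the predicate becomes false.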
16640
16641 Vector Reduction Intrinsics
16642 ---------------------------
16643
16644 Horizontal reductions of vectors can be expressed using the following
16645 intrinsics. Each one takes a vector operand as an input and applies its
16646 respective operation across all elements of the vector, returning a single
16647 scalar result of the same element type.
16648
16649 .. _int_vector_reduce_add:
16650
16651 '``llvm.vector.reduce.add.*``' Intrinsic
16652 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16653
16654 Syntax:
16655 """""""
16656
16657 ::
16658
16659       declare i32 @llvm.vector.reduce.add.v4i32(<4 x i32> %a)
16660       declare i64 @llvm.vector.reduce.add.v2i64(<2 x i64> %a)
16661
16662 Overview:
16663 """""""""
16664
16665 The '``llvm.vector.reduce.add.*``' intrinsics do an integer ``ADD``
16666 reduction of a vector, returning the result as a scalar. The return type matches
16667 the element-type of the vector input.
16668
16669 Arguments:
16670 """"""""""
16671 The argument to this intrinsic must be a vector of integer values.
16672
16673 .. _int_vector_reduce_fadd:
16674
16675 '``llvm.vector.reduce.fadd.*``' Intrinsic
16676 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16677
16678 Syntax:
16679 """""""
16680
16681 ::
16682
16683       declare float @llvm.vector.reduce.fadd.v4f32(float %start_value, <4 x float> %a)
16684       declare double @llvm.vector.reduce.fadd.v2f64(double %start_value, <2 x double> %a)
16685
16686 Overview:
16687 """""""""
16688
16689 The '``llvm.vector.reduce.fadd.*``' intrinsics do a floating-point
16690 ``ADD`` reduction of a vector, returning the result as a scalar. The return type
16691 matches the element-type of the vector input.
16692
16693 If the intrinsic call has the 'reassoc' flag set, then the reduction will not
16694 preserve the associativity of an equivalent scalarized counterpart. Otherwise
16695 the reduction will be *sequential*, thus implying that the operation respects
16696 the associativity of a scalarized reduction. That is, the reduction begins with
16697 the start value and performs an fadd operation with consecutively increasing
16698 vector element indices. See the following pseudocode:
16699
16700 ::
16701
16702     float sequential_fadd(start_value, input_vector)
16703       result = start_value
16704       for i = 0 to length(input_vector)
16705         result = result + input_vector[i]
16706       return result
16707
16708
16709 Arguments:
16710 """"""""""
16711 The first argument to this intrinsic is a scalar start value for the reduction.
16712 The type of the start value matches the element-type of the vector input.
16713 The second argument must be a vector of floating-point values.
16714
To ignore the start value, negative zero (``-0.0``) can be used, as it is
the neutral value of floating-point addition.
16717
16718 Examples:
16719 """""""""
16720
16721 ::
16722
16723       %unord = call reassoc float @llvm.vector.reduce.fadd.v4f32(float -0.0, <4 x float> %input) ; relaxed reduction
16724       %ord = call float @llvm.vector.reduce.fadd.v4f32(float %start_value, <4 x float> %input) ; sequential reduction
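
For a ``<4 x float>`` input, the sequential (non-``reassoc``) form computes
the same value as the following chain of scalar operations. This is a sketch
of the semantics only, not necessarily the code a target generates, and the
intermediate value names are illustrative:

.. code-block:: llvm

      %e0 = extractelement <4 x float> %input, i32 0
      %e1 = extractelement <4 x float> %input, i32 1
      %e2 = extractelement <4 x float> %input, i32 2
      %e3 = extractelement <4 x float> %input, i32 3
      %r0 = fadd float %start_value, %e0      ; begin with the start value
      %r1 = fadd float %r0, %e1               ; then add lanes in increasing index order
      %r2 = fadd float %r1, %e2
      %r3 = fadd float %r2, %e3               ; same value as the sequential reduction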
16725
16726
16727 .. _int_vector_reduce_mul:
16728
16729 '``llvm.vector.reduce.mul.*``' Intrinsic
16730 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16731
16732 Syntax:
16733 """""""
16734
16735 ::
16736
16737       declare i32 @llvm.vector.reduce.mul.v4i32(<4 x i32> %a)
16738       declare i64 @llvm.vector.reduce.mul.v2i64(<2 x i64> %a)
16739
16740 Overview:
16741 """""""""
16742
16743 The '``llvm.vector.reduce.mul.*``' intrinsics do an integer ``MUL``
16744 reduction of a vector, returning the result as a scalar. The return type matches
16745 the element-type of the vector input.
16746
16747 Arguments:
16748 """"""""""
16749 The argument to this intrinsic must be a vector of integer values.
16750
16751 .. _int_vector_reduce_fmul:
16752
16753 '``llvm.vector.reduce.fmul.*``' Intrinsic
16754 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16755
16756 Syntax:
16757 """""""
16758
16759 ::
16760
16761       declare float @llvm.vector.reduce.fmul.v4f32(float %start_value, <4 x float> %a)
16762       declare double @llvm.vector.reduce.fmul.v2f64(double %start_value, <2 x double> %a)
16763
16764 Overview:
16765 """""""""
16766
16767 The '``llvm.vector.reduce.fmul.*``' intrinsics do a floating-point
16768 ``MUL`` reduction of a vector, returning the result as a scalar. The return type
16769 matches the element-type of the vector input.
16770
16771 If the intrinsic call has the 'reassoc' flag set, then the reduction will not
16772 preserve the associativity of an equivalent scalarized counterpart. Otherwise
16773 the reduction will be *sequential*, thus implying that the operation respects
16774 the associativity of a scalarized reduction. That is, the reduction begins with
16775 the start value and performs an fmul operation with consecutively increasing
16776 vector element indices. See the following pseudocode:
16777
16778 ::
16779
16780     float sequential_fmul(start_value, input_vector)
16781       result = start_value
16782       for i = 0 to length(input_vector)
16783         result = result * input_vector[i]
16784       return result
16785
16786
16787 Arguments:
16788 """"""""""
16789 The first argument to this intrinsic is a scalar start value for the reduction.
16790 The type of the start value matches the element-type of the vector input.
16791 The second argument must be a vector of floating-point values.
16792
To ignore the start value, one (``1.0``) can be used, as it is the neutral
value of floating-point multiplication.
16795
16796 Examples:
16797 """""""""
16798
16799 ::
16800
16801       %unord = call reassoc float @llvm.vector.reduce.fmul.v4f32(float 1.0, <4 x float> %input) ; relaxed reduction
16802       %ord = call float @llvm.vector.reduce.fmul.v4f32(float %start_value, <4 x float> %input) ; sequential reduction
16803
16804 .. _int_vector_reduce_and:
16805
16806 '``llvm.vector.reduce.and.*``' Intrinsic
16807 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16808
16809 Syntax:
16810 """""""
16811
16812 ::
16813
16814       declare i32 @llvm.vector.reduce.and.v4i32(<4 x i32> %a)
16815
16816 Overview:
16817 """""""""
16818
16819 The '``llvm.vector.reduce.and.*``' intrinsics do a bitwise ``AND``
16820 reduction of a vector, returning the result as a scalar. The return type matches
16821 the element-type of the vector input.
16822
16823 Arguments:
16824 """"""""""
16825 The argument to this intrinsic must be a vector of integer values.
16826
16827 .. _int_vector_reduce_or:
16828
16829 '``llvm.vector.reduce.or.*``' Intrinsic
16830 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16831
16832 Syntax:
16833 """""""
16834
16835 ::
16836
16837       declare i32 @llvm.vector.reduce.or.v4i32(<4 x i32> %a)
16838
16839 Overview:
16840 """""""""
16841
16842 The '``llvm.vector.reduce.or.*``' intrinsics do a bitwise ``OR`` reduction
16843 of a vector, returning the result as a scalar. The return type matches the
16844 element-type of the vector input.
16845
16846 Arguments:
16847 """"""""""
16848 The argument to this intrinsic must be a vector of integer values.
16849
16850 .. _int_vector_reduce_xor:
16851
16852 '``llvm.vector.reduce.xor.*``' Intrinsic
16853 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16854
16855 Syntax:
16856 """""""
16857
16858 ::
16859
16860       declare i32 @llvm.vector.reduce.xor.v4i32(<4 x i32> %a)
16861
16862 Overview:
16863 """""""""
16864
16865 The '``llvm.vector.reduce.xor.*``' intrinsics do a bitwise ``XOR``
16866 reduction of a vector, returning the result as a scalar. The return type matches
16867 the element-type of the vector input.
16868
16869 Arguments:
16870 """"""""""
16871 The argument to this intrinsic must be a vector of integer values.
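
Examples:
"""""""""

As an illustration covering the three bitwise reductions above, the calls
below reduce the same constant vector ``<7, 5, 13, 4>``. The operand values
are chosen only for this example:

.. code-block:: llvm

      %and = call i32 @llvm.vector.reduce.and.v4i32(<4 x i32> <i32 7, i32 5, i32 13, i32 4>) ; yields i32: 4
      %or  = call i32 @llvm.vector.reduce.or.v4i32(<4 x i32> <i32 7, i32 5, i32 13, i32 4>)  ; yields i32: 15
      %xor = call i32 @llvm.vector.reduce.xor.v4i32(<4 x i32> <i32 7, i32 5, i32 13, i32 4>) ; yields i32: 11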
16872
16873 .. _int_vector_reduce_smax:
16874
16875 '``llvm.vector.reduce.smax.*``' Intrinsic
16876 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16877
16878 Syntax:
16879 """""""
16880
16881 ::
16882
16883       declare i32 @llvm.vector.reduce.smax.v4i32(<4 x i32> %a)
16884
16885 Overview:
16886 """""""""
16887
16888 The '``llvm.vector.reduce.smax.*``' intrinsics do a signed integer
16889 ``MAX`` reduction of a vector, returning the result as a scalar. The return type
16890 matches the element-type of the vector input.
16891
16892 Arguments:
16893 """"""""""
16894 The argument to this intrinsic must be a vector of integer values.
16895
16896 .. _int_vector_reduce_smin:
16897
16898 '``llvm.vector.reduce.smin.*``' Intrinsic
16899 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16900
16901 Syntax:
16902 """""""
16903
16904 ::
16905
16906       declare i32 @llvm.vector.reduce.smin.v4i32(<4 x i32> %a)
16907
16908 Overview:
16909 """""""""
16910
16911 The '``llvm.vector.reduce.smin.*``' intrinsics do a signed integer
16912 ``MIN`` reduction of a vector, returning the result as a scalar. The return type
16913 matches the element-type of the vector input.
16914
16915 Arguments:
16916 """"""""""
16917 The argument to this intrinsic must be a vector of integer values.
16918
16919 .. _int_vector_reduce_umax:
16920
16921 '``llvm.vector.reduce.umax.*``' Intrinsic
16922 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16923
16924 Syntax:
16925 """""""
16926
16927 ::
16928
16929       declare i32 @llvm.vector.reduce.umax.v4i32(<4 x i32> %a)
16930
16931 Overview:
16932 """""""""
16933
16934 The '``llvm.vector.reduce.umax.*``' intrinsics do an unsigned
16935 integer ``MAX`` reduction of a vector, returning the result as a scalar. The
16936 return type matches the element-type of the vector input.
16937
16938 Arguments:
16939 """"""""""
16940 The argument to this intrinsic must be a vector of integer values.
16941
16942 .. _int_vector_reduce_umin:
16943
16944 '``llvm.vector.reduce.umin.*``' Intrinsic
16945 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16946
16947 Syntax:
16948 """""""
16949
16950 ::
16951
16952       declare i32 @llvm.vector.reduce.umin.v4i32(<4 x i32> %a)
16953
16954 Overview:
16955 """""""""
16956
16957 The '``llvm.vector.reduce.umin.*``' intrinsics do an unsigned
16958 integer ``MIN`` reduction of a vector, returning the result as a scalar. The
16959 return type matches the element-type of the vector input.
16960
16961 Arguments:
16962 """"""""""
16963 The argument to this intrinsic must be a vector of integer values.
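
Examples:
"""""""""

The signed and unsigned min/max reductions above differ only in how the lane
bits are interpreted. As a sketch, with an operand chosen for this example,
the lane ``-1`` is the smallest value when treated as signed but the largest
(all bits set) when treated as unsigned:

.. code-block:: llvm

      %smax = call i32 @llvm.vector.reduce.smax.v4i32(<4 x i32> <i32 -1, i32 0, i32 5, i32 2>) ; yields i32: 5
      %smin = call i32 @llvm.vector.reduce.smin.v4i32(<4 x i32> <i32 -1, i32 0, i32 5, i32 2>) ; yields i32: -1
      %umax = call i32 @llvm.vector.reduce.umax.v4i32(<4 x i32> <i32 -1, i32 0, i32 5, i32 2>) ; yields i32: -1 (4294967295 unsigned)
      %umin = call i32 @llvm.vector.reduce.umin.v4i32(<4 x i32> <i32 -1, i32 0, i32 5, i32 2>) ; yields i32: 0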
16964
16965 .. _int_vector_reduce_fmax:
16966
16967 '``llvm.vector.reduce.fmax.*``' Intrinsic
16968 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16969
16970 Syntax:
16971 """""""
16972
16973 ::
16974
16975       declare float @llvm.vector.reduce.fmax.v4f32(<4 x float> %a)
16976       declare double @llvm.vector.reduce.fmax.v2f64(<2 x double> %a)
16977
16978 Overview:
16979 """""""""
16980
16981 The '``llvm.vector.reduce.fmax.*``' intrinsics do a floating-point
16982 ``MAX`` reduction of a vector, returning the result as a scalar. The return type
16983 matches the element-type of the vector input.
16984
This intrinsic has the same comparison semantics as the '``llvm.maxnum.*``'
16986 intrinsic. That is, the result will always be a number unless all elements of
16987 the vector are NaN. For a vector with maximum element magnitude 0.0 and
16988 containing both +0.0 and -0.0 elements, the sign of the result is unspecified.
16989
16990 If the intrinsic call has the ``nnan`` fast-math flag, then the operation can
16991 assume that NaNs are not present in the input vector.
16992
16993 Arguments:
16994 """"""""""
16995 The argument to this intrinsic must be a vector of floating-point values.
16996
16997 .. _int_vector_reduce_fmin:
16998
16999 '``llvm.vector.reduce.fmin.*``' Intrinsic
17000 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17001
17002 Syntax:
17003 """""""
17004 This is an overloaded intrinsic.
17005
17006 ::
17007
17008       declare float @llvm.vector.reduce.fmin.v4f32(<4 x float> %a)
17009       declare double @llvm.vector.reduce.fmin.v2f64(<2 x double> %a)
17010
17011 Overview:
17012 """""""""
17013
17014 The '``llvm.vector.reduce.fmin.*``' intrinsics do a floating-point
17015 ``MIN`` reduction of a vector, returning the result as a scalar. The return type
17016 matches the element-type of the vector input.
17017
This intrinsic has the same comparison semantics as the '``llvm.minnum.*``'
17019 intrinsic. That is, the result will always be a number unless all elements of
17020 the vector are NaN. For a vector with minimum element magnitude 0.0 and
17021 containing both +0.0 and -0.0 elements, the sign of the result is unspecified.
17022
17023 If the intrinsic call has the ``nnan`` fast-math flag, then the operation can
17024 assume that NaNs are not present in the input vector.
17025
17026 Arguments:
17027 """"""""""
17028 The argument to this intrinsic must be a vector of floating-point values.
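
Examples:
"""""""""

As an illustration of the NaN handling described for the floating-point
``MAX`` and ``MIN`` reductions, the operand below (chosen for this example)
contains one quiet NaN lane, which does not affect either result because the
other lanes are numbers:

.. code-block:: llvm

      %max = call float @llvm.vector.reduce.fmax.v4f32(<4 x float> <float 1.0, float 0x7FF8000000000000, float -2.0, float 3.0>) ; yields float: 3.0
      %min = call float @llvm.vector.reduce.fmin.v4f32(<4 x float> <float 1.0, float 0x7FF8000000000000, float -2.0, float 3.0>) ; yields float: -2.0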
17029
17030 '``llvm.experimental.vector.insert``' Intrinsic
17031 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17032
17033 Syntax:
17034 """""""
17035 This is an overloaded intrinsic. You can use ``llvm.experimental.vector.insert``
17036 to insert a fixed-width vector into a scalable vector, but not the other way
17037 around.
17038
17039 ::
17040
17041       declare <vscale x 4 x float> @llvm.experimental.vector.insert.v4f32(<vscale x 4 x float> %vec, <4 x float> %subvec, i64 %idx)
17042       declare <vscale x 2 x double> @llvm.experimental.vector.insert.v2f64(<vscale x 2 x double> %vec, <2 x double> %subvec, i64 %idx)
17043
17044 Overview:
17045 """""""""
17046
The '``llvm.experimental.vector.insert.*``' intrinsics insert a vector into
another vector starting from a given index. The return type matches the type of
the vector we insert into. Conceptually, this can be used to build a scalable
vector out of non-scalable vectors.
17051
17052 Arguments:
17053 """"""""""
17054
17055 The ``vec`` is the vector which ``subvec`` will be inserted into.
17056 The ``subvec`` is the vector that will be inserted.
17057
17058 ``idx`` represents the starting element number at which ``subvec`` will be
17059 inserted. ``idx`` must be a constant multiple of ``subvec``'s known minimum
17060 vector length. If ``subvec`` is a scalable vector, ``idx`` is first scaled by
17061 the runtime scaling factor of ``subvec``. The elements of ``vec`` starting at
17062 ``idx`` are overwritten with ``subvec``. Elements ``idx`` through (``idx`` +
17063 num_elements(``subvec``) - 1) must be valid ``vec`` indices. If this condition
17064 cannot be determined statically but is false at runtime, then the result vector
17065 is undefined.
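
Examples:
"""""""""

A short sketch using the first declaration from the Syntax section. ``%vec``
and ``%subvec`` are placeholder values. Both indices are multiples of 4, the
known minimum length of ``%subvec``, but only index 0 is valid for every
runtime vector length:

.. code-block:: llvm

      ; Overwrites elements 0..3 of %vec; valid for any vscale.
      %lo = call <vscale x 4 x float> @llvm.experimental.vector.insert.v4f32(<vscale x 4 x float> %vec, <4 x float> %subvec, i64 0)
      ; Overwrites elements 4..7 of %vec; the result is undefined if vscale is 1 at runtime.
      %hi = call <vscale x 4 x float> @llvm.experimental.vector.insert.v4f32(<vscale x 4 x float> %vec, <4 x float> %subvec, i64 4)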
17066
17067
17068 '``llvm.experimental.vector.extract``' Intrinsic
17069 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17070
17071 Syntax:
17072 """""""
17073 This is an overloaded intrinsic. You can use
17074 ``llvm.experimental.vector.extract`` to extract a fixed-width vector from a
17075 scalable vector, but not the other way around.
17076
17077 ::
17078
17079       declare <4 x float> @llvm.experimental.vector.extract.v4f32(<vscale x 4 x float> %vec, i64 %idx)
17080       declare <2 x double> @llvm.experimental.vector.extract.v2f64(<vscale x 2 x double> %vec, i64 %idx)
17081
17082 Overview:
17083 """""""""
17084
17085 The '``llvm.experimental.vector.extract.*``' intrinsics extract a vector from
17086 within another vector starting from a given index. The return type must be
17087 explicitly specified. Conceptually, this can be used to decompose a scalable
17088 vector into non-scalable parts.
17089
17090 Arguments:
17091 """"""""""
17092
17093 The ``vec`` is the vector from which we will extract a subvector.
17094
17095 The ``idx`` specifies the starting element number within ``vec`` from which a
17096 subvector is extracted. ``idx`` must be a constant multiple of the known-minimum
17097 vector length of the result type. If the result type is a scalable vector,
17098 ``idx`` is first scaled by the result type's runtime scaling factor. Elements
17099 ``idx`` through (``idx`` + num_elements(result_type) - 1) must be valid vector
17100 indices. If this condition cannot be determined statically but is false at
17101 runtime, then the result vector is undefined. The ``idx`` parameter must be a
17102 vector index constant type (for most targets this will be an integer pointer
17103 type).
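
Examples:
"""""""""

A short sketch using the first declaration from the Syntax section, with
``%vec`` as a placeholder value. The index must be a multiple of 4, the
known minimum length of the result type:

.. code-block:: llvm

      ; Extracts elements 0..3 of %vec; valid for any vscale.
      %lo = call <4 x float> @llvm.experimental.vector.extract.v4f32(<vscale x 4 x float> %vec, i64 0)
      ; Extracts elements 4..7 of %vec; the result is undefined if vscale is 1 at runtime.
      %hi = call <4 x float> @llvm.experimental.vector.extract.v4f32(<vscale x 4 x float> %vec, i64 4)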
17104
17105 '``llvm.experimental.vector.reverse``' Intrinsic
17106 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17107
17108 Syntax:
17109 """""""
17110 This is an overloaded intrinsic.
17111
17112 ::
17113
17114       declare <2 x i8> @llvm.experimental.vector.reverse.v2i8(<2 x i8> %a)
17115       declare <vscale x 4 x i32> @llvm.experimental.vector.reverse.nxv4i32(<vscale x 4 x i32> %a)
17116
17117 Overview:
17118 """""""""
17119
17120 The '``llvm.experimental.vector.reverse.*``' intrinsics reverse a vector.
17121 The intrinsic takes a single vector and returns a vector of matching type but
17122 with the original lane order reversed. These intrinsics work for both fixed
and scalable vectors. While this intrinsic is marked as experimental, the
17124 recommended way to express reverse operations for fixed-width vectors is still
17125 to use a shufflevector, as that may allow for more optimization opportunities.
17126
17127 Arguments:
17128 """"""""""
17129
17130 The argument to this intrinsic must be a vector.
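
Examples:
"""""""""

For a fixed-width vector, the recommended ``shufflevector`` form of a reverse
can be sketched as follows. The ``<4 x i32>`` type and the value names are
chosen only for this example:

.. code-block:: llvm

      ; Lane i of %rev is lane (3 - i) of %a.
      %rev = shufflevector <4 x i32> %a, <4 x i32> undef, <4 x i32> <i32 3, i32 2, i32 1, i32 0>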
17131
17132 '``llvm.experimental.vector.splice``' Intrinsic
17133 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17134
17135 Syntax:
17136 """""""
17137 This is an overloaded intrinsic.
17138
17139 ::
17140
17141       declare <2 x double> @llvm.experimental.vector.splice.v2f64(<2 x double> %vec1, <2 x double> %vec2, i32 %imm)
17142       declare <vscale x 4 x i32> @llvm.experimental.vector.splice.nxv4i32(<vscale x 4 x i32> %vec1, <vscale x 4 x i32> %vec2, i32 %imm)
17143
17144 Overview:
17145 """""""""
17146
17147 The '``llvm.experimental.vector.splice.*``' intrinsics construct a vector by
17148 concatenating elements from the first input vector with elements of the second
17149 input vector, returning a vector of the same type as the input vectors. The
17150 signed immediate, modulo the number of elements in the vector, is the index
17151 into the first vector from which to extract the result value. This means
17152 conceptually that for a positive immediate, a vector is extracted from
17153 ``concat(%vec1, %vec2)`` starting at index ``imm``, whereas for a negative
17154 immediate, it extracts ``-imm`` trailing elements from the first vector, and
17155 the remaining elements from ``%vec2``.
17156
17157 These intrinsics work for both fixed and scalable vectors. While this intrinsic
17158 is marked as experimental, the recommended way to express this operation for
17159 fixed-width vectors is still to use a shufflevector, as that may allow for more
17160 optimization opportunities.
17161
17162 For example:
17163
17164 .. code-block:: text
17165
17166  llvm.experimental.vector.splice(<A,B,C,D>, <E,F,G,H>, 1)  ==> <B, C, D, E> ; index
17167  llvm.experimental.vector.splice(<A,B,C,D>, <E,F,G,H>, -3) ==> <B, C, D, E> ; trailing elements
17168
17169
17170 Arguments:
17171 """"""""""
17172
17173 The first two operands are vectors with the same type. The third argument
17174 ``imm`` is the start index, modulo VL, where VL is the runtime vector length of
17175 the source/result vector. The ``imm`` is a signed integer constant in the range
17176 ``-VL <= imm < VL``. For values outside of this range the result is poison.
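
For fixed-width vectors, the same operation can be written as a
``shufflevector`` whose mask indexes into ``concat(%vec1, %vec2)``. The sketch
below assumes ``<4 x i32>`` operands (so VL is 4) and covers both cases from
the example above, since ``imm = 1`` and ``imm = -3`` select the same lanes:

.. code-block:: llvm

      ; Lanes 1..3 come from %vec1, lane 4 is the first lane of %vec2.
      %res = shufflevector <4 x i32> %vec1, <4 x i32> %vec2, <4 x i32> <i32 1, i32 2, i32 3, i32 4>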
17177
17178 '``llvm.experimental.stepvector``' Intrinsic
17179 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17180
Syntax:
"""""""
This is an overloaded intrinsic. You can use ``llvm.experimental.stepvector``
to generate a vector whose lane values comprise the linear sequence
<0, 1, 2, ...>. It is primarily intended for scalable vectors.

::

      declare <vscale x 4 x i32> @llvm.experimental.stepvector.nxv4i32()
      declare <vscale x 8 x i16> @llvm.experimental.stepvector.nxv8i16()

Overview:
"""""""""

The '``llvm.experimental.stepvector``' intrinsics are used to create vectors
of integers whose elements contain a linear sequence of values starting from 0
with a step of 1.  This experimental intrinsic can only be used for vectors
with integer elements that are at least 8 bits in size. If the sequence value
exceeds the allowed limit for the element type, then the result for that lane is
undefined.

These intrinsics work for both fixed and scalable vectors. While this intrinsic
is marked as experimental, the recommended way to express this operation for
fixed-width vectors is still to generate a constant vector instead.
17200
17201
17202 Arguments:
17203 """"""""""
17204
17205 None.
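
Examples:
"""""""""

One typical use, sketched here with a hypothetical scalar ``%base``, is to
build a scalable vector of consecutive indices
``<%base, %base + 1, %base + 2, ...>`` by adding the step vector to a splat of
the base value:

.. code-block:: llvm

      %step  = call <vscale x 4 x i32> @llvm.experimental.stepvector.nxv4i32()
      ; Broadcast %base to every lane.
      %ins   = insertelement <vscale x 4 x i32> undef, i32 %base, i32 0
      %splat = shufflevector <vscale x 4 x i32> %ins, <vscale x 4 x i32> undef, <vscale x 4 x i32> zeroinitializer
      ; Lane i holds %base + i.
      %idx   = add <vscale x 4 x i32> %splat, %step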
17206
17207
17208 Matrix Intrinsics
17209 -----------------
17210
Operations on matrices requiring shape information (like number of rows/columns
or the memory layout) can be expressed using the matrix intrinsics. These
intrinsics require matrix dimensions to be passed as immediate arguments, and
matrices are passed and returned as vectors. This means that for an ``R`` x
``C`` matrix, element ``i`` of column ``j`` is at index ``j * R + i`` in the
corresponding vector, with indices starting at 0. Currently column-major layout
is assumed.  The intrinsics support both integer and floating-point matrices.
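
As a sketch of the indexing rule, assume a 2 x 3 matrix (``R`` = 2, ``C`` = 3)
held in a hypothetical ``<6 x float>`` value ``%m``. Element ``i`` = 1 of
column ``j`` = 2 is then at index ``2 * 2 + 1 = 5``:

.. code-block:: llvm

      ; Lanes of %m in column-major order: a00, a10, a01, a11, a02, a12
      %elt = extractelement <6 x float> %m, i32 5   ; a12: row 1, column 2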
17218
17219
17220 '``llvm.matrix.transpose.*``' Intrinsic
17221 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17222
17223 Syntax:
17224 """""""
17225 This is an overloaded intrinsic.
17226
17227 ::
17228
17229       declare vectorty @llvm.matrix.transpose.*(vectorty %In, i32 <Rows>, i32 <Cols>)
17230
17231 Overview:
17232 """""""""
17233
17234 The '``llvm.matrix.transpose.*``' intrinsics treat ``%In`` as a ``<Rows> x
17235 <Cols>`` matrix and return the transposed matrix in the result vector.
17236
17237 Arguments:
17238 """"""""""
17239
17240 The first argument ``%In`` is a vector that corresponds to a ``<Rows> x
17241 <Cols>`` matrix. Thus, arguments ``<Rows>`` and ``<Cols>`` correspond to the
17242 number of rows and columns, respectively, and must be positive, constant
17243 integers. The returned vector must have ``<Rows> * <Cols>`` elements, and have
17244 the same float or integer element type as ``%In``.
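
Examples:
"""""""""

A small worked example, with a constant operand chosen for illustration. The
``.v6i32`` suffix is the assumed name mangling for a ``<6 x i32>`` overload:

.. code-block:: llvm

      ; The operand is the 2 x 3 matrix  | 1 2 3 |  in column-major order.
      ;                                  | 4 5 6 |
      %t = call <6 x i32> @llvm.matrix.transpose.v6i32(<6 x i32> <i32 1, i32 4, i32 2, i32 5, i32 3, i32 6>, i32 2, i32 3)
      ; %t is <1, 2, 3, 4, 5, 6>: the 3 x 2 transpose, again in column-major order.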
17245
17246 '``llvm.matrix.multiply.*``' Intrinsic
17247 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17248
17249 Syntax:
17250 """""""
17251 This is an overloaded intrinsic.
17252
17253 ::
17254
17255       declare vectorty @llvm.matrix.multiply.*(vectorty %A, vectorty %B, i32 <OuterRows>, i32 <Inner>, i32 <OuterColumns>)
17256
17257 Overview:
17258 """""""""
17259
17260 The '``llvm.matrix.multiply.*``' intrinsics treat ``%A`` as a ``<OuterRows> x
17261 <Inner>`` matrix, ``%B`` as a ``<Inner> x <OuterColumns>`` matrix, and
multiply them. The result matrix is returned in the result vector.
17263
17264 Arguments:
17265 """"""""""
17266
17267 The first vector argument ``%A`` corresponds to a matrix with ``<OuterRows> *
17268 <Inner>`` elements, and the second argument ``%B`` to a matrix with
17269 ``<Inner> * <OuterColumns>`` elements. Arguments ``<OuterRows>``,
17270 ``<Inner>`` and ``<OuterColumns>`` must be positive, constant integers. The
17271 returned vector must have ``<OuterRows> * <OuterColumns>`` elements.
17272 Vectors ``%A``, ``%B``, and the returned vector all have the same float or
17273 integer element type.
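
Examples:
"""""""""

As a sketch, multiplying a 2 x 3 matrix by a 3 x 2 matrix gives a 2 x 2
result, so the operands have 6 elements each and the result has 4. The value
names and the mangled suffix (result type followed by both operand types) are
assumptions made for this example:

.. code-block:: llvm

      ; <OuterRows> = 2, <Inner> = 3, <OuterColumns> = 2
      %C = call <4 x double> @llvm.matrix.multiply.v4f64.v6f64.v6f64(<6 x double> %A, <6 x double> %B, i32 2, i32 3, i32 2)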
17274
17275
17276 '``llvm.matrix.column.major.load.*``' Intrinsic
17277 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17278
17279 Syntax:
17280 """""""
17281 This is an overloaded intrinsic.
17282
17283 ::
17284
17285       declare vectorty @llvm.matrix.column.major.load.*(
17286           ptrty %Ptr, i64 %Stride, i1 <IsVolatile>, i32 <Rows>, i32 <Cols>)
17287
17288 Overview:
17289 """""""""
17290
17291 The '``llvm.matrix.column.major.load.*``' intrinsics load a ``<Rows> x <Cols>``
17292 matrix using a stride of ``%Stride`` to compute the start address of the
17293 different columns.  The offset is computed using ``%Stride``'s bitwidth. This
allows for convenient loading of submatrices. If ``<IsVolatile>`` is true, the
17295 intrinsic is considered a :ref:`volatile memory access <volatile>`. The result
17296 matrix is returned in the result vector. If the ``%Ptr`` argument is known to
17297 be aligned to some boundary, this can be specified as an attribute on the
17298 argument.
17299
17300 Arguments:
17301 """"""""""
17302
The first argument ``%Ptr`` is a pointer type to the returned vector type, and
corresponds to the start address to load from. The second argument ``%Stride``
is a positive, constant integer with ``%Stride >= <Rows>``. ``%Stride`` is used
to compute the column memory addresses. I.e., for a column ``C``, its start
memory address is calculated with ``%Ptr + C * %Stride``. The third argument
``<IsVolatile>`` is a boolean value.  The fourth and fifth arguments,
``<Rows>`` and ``<Cols>``, correspond to the number of rows and columns,
respectively, and must be positive, constant integers. The returned vector must
have ``<Rows> * <Cols>`` elements.

The :ref:`align <attr_align>` parameter attribute can be provided for the
``%Ptr`` argument.
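
Examples:
"""""""""

As a sketch, assume ``%Ptr`` points to a 4 x 4 matrix of doubles stored in
column-major order. The top-left 2 x 2 submatrix can then be loaded with a
stride of 4, so that column 0 starts at ``%Ptr`` and column 1 starts 4
elements later. The ``.v4f64`` suffix, the ``double*`` pointer type and the
``align 8`` attribute are assumptions made for this example:

.. code-block:: llvm

      ; <Rows> = 2, <Cols> = 2, %Stride = 4, non-volatile
      %sub = call <4 x double> @llvm.matrix.column.major.load.v4f64(double* align 8 %Ptr, i64 4, i1 false, i32 2, i32 2)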
17315
17316
17317 '``llvm.matrix.column.major.store.*``' Intrinsic
17318 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17319
17320 Syntax:
17321 """""""
17322
17323 ::
17324
17325       declare void @llvm.matrix.column.major.store.*(
17326           vectorty %In, ptrty %Ptr, i64 %Stride, i1 <IsVolatile>, i32 <Rows>, i32 <Cols>)
17327
17328 Overview:
17329 """""""""
17330
17331 The '``llvm.matrix.column.major.store.*``' intrinsics store the ``<Rows> x
17332 <Cols>`` matrix in ``%In`` to memory using a stride of ``%Stride`` between
17333 columns. The offset is computed using ``%Stride``'s bitwidth. If
17334 ``<IsVolatile>`` is true, the intrinsic is considered a
17335 :ref:`volatile memory access <volatile>`.
17336
17337 If the ``%Ptr`` argument is known to be aligned to some boundary, this can be
17338 specified as an attribute on the argument.
17339
17340 Arguments:
17341 """"""""""
17342
17343 The first argument ``%In`` is a vector that corresponds to a ``<Rows> x
17344 <Cols>`` matrix to be stored to memory. The second argument ``%Ptr`` is a
17345 pointer to the vector type of ``%In``, and is the start address of the matrix
17346 in memory. The third argument ``%Stride`` is a positive, constant integer with
17347 ``%Stride >= <Rows>``.  ``%Stride`` is used to compute the column memory
addresses. I.e., for a column ``C``, its start memory address is calculated
17349 with ``%Ptr + C * %Stride``.  The fourth argument ``<IsVolatile>`` is a boolean
17350 value. The arguments ``<Rows>`` and ``<Cols>`` correspond to the number of rows
17351 and columns, respectively, and must be positive, constant integers.
17352
The :ref:`align <attr_align>` parameter attribute can be provided
for the ``%Ptr`` argument.
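
Examples:
"""""""""

As a sketch, the call below writes a 2 x 2 matrix held in a hypothetical value
``%sub`` into the top-left corner of a 4 x 4 column-major matrix of doubles at
``%Ptr``. With a stride of 4, the two stored columns start 4 elements apart.
The ``.v4f64`` suffix, the ``double*`` pointer type and the ``align 8``
attribute are assumptions made for this example:

.. code-block:: llvm

      ; <Rows> = 2, <Cols> = 2, %Stride = 4, non-volatile
      call void @llvm.matrix.column.major.store.v4f64(<4 x double> %sub, double* align 8 %Ptr, i64 4, i1 false, i32 2, i32 2)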
17355
17356
17357 Half Precision Floating-Point Intrinsics
17358 ----------------------------------------
17359
17360 For most target platforms, half precision floating-point is a
17361 storage-only format. This means that it is a dense encoding (in memory)
17362 but does not support computation in the format.
17363
This means that code must first load the half-precision floating-point
value as an i16, then convert it to float with
:ref:`llvm.convert.from.fp16 <int_convert_from_fp16>`. Computation can
then be performed on the float value (including extending to double
etc). To store the value back to memory, it is first converted to float
if needed, then converted to i16 with
:ref:`llvm.convert.to.fp16 <int_convert_to_fp16>`, and then stored as an
i16 value.
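
The whole round trip can be sketched as follows, assuming a global ``@x`` of
type ``i16`` that holds a half-precision value. The increment by ``1.0``
stands in for any computation on the float value:

.. code-block:: llvm

      %h   = load i16, i16* @x, align 2                        ; load the raw half-precision bits
      %f   = call float @llvm.convert.from.fp16.f32(i16 %h)    ; widen to float
      %sum = fadd float %f, 1.0                                ; compute in float
      %res = call i16 @llvm.convert.to.fp16.f32(float %sum)    ; narrow back to half precision
      store i16 %res, i16* @x, align 2                         ; store the raw bits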
17372
17373 .. _int_convert_to_fp16:
17374
17375 '``llvm.convert.to.fp16``' Intrinsic
17376 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17377
17378 Syntax:
17379 """""""
17380
17381 ::
17382
17383       declare i16 @llvm.convert.to.fp16.f32(float %a)
17384       declare i16 @llvm.convert.to.fp16.f64(double %a)
17385
17386 Overview:
17387 """""""""
17388
17389 The '``llvm.convert.to.fp16``' intrinsic function performs a conversion from a
17390 conventional floating-point type to half precision floating-point format.
17391
17392 Arguments:
17393 """"""""""
17394
The intrinsic function takes a single argument: the value to be
converted.
17397
17398 Semantics:
17399 """"""""""
17400
17401 The '``llvm.convert.to.fp16``' intrinsic function performs a conversion from a
17402 conventional floating-point format to half precision floating-point format. The
17403 return value is an ``i16`` which contains the converted number.
17404
17405 Examples:
17406 """""""""
17407
17408 .. code-block:: llvm
17409
17410       %res = call i16 @llvm.convert.to.fp16.f32(float %a)
17411       store i16 %res, i16* @x, align 2
17412
17413 .. _int_convert_from_fp16:
17414
17415 '``llvm.convert.from.fp16``' Intrinsic
17416 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17417
17418 Syntax:
17419 """""""
17420
17421 ::
17422
17423       declare float @llvm.convert.from.fp16.f32(i16 %a)
17424       declare double @llvm.convert.from.fp16.f64(i16 %a)
17425
17426 Overview:
17427 """""""""
17428
17429 The '``llvm.convert.from.fp16``' intrinsic function performs a
17430 conversion from half precision floating-point format to single precision
17431 floating-point format.
17432
17433 Arguments:
17434 """"""""""
17435
The intrinsic function takes a single argument: the value to be
converted.
17438
17439 Semantics:
17440 """"""""""
17441
The '``llvm.convert.from.fp16``' intrinsic function performs a
conversion from half precision floating-point format to single
precision floating-point format. The input half-float value is
represented by an ``i16`` value.
17446
17447 Examples:
17448 """""""""
17449
17450 .. code-block:: llvm
17451
17452       %a = load i16, i16* @x, align 2
      %res = call float @llvm.convert.from.fp16.f32(i16 %a)
17454
Saturating Floating-Point to Integer Conversions
------------------------------------------------
17457
17458 The ``fptoui`` and ``fptosi`` instructions return a
17459 :ref:`poison value <poisonvalues>` if the rounded-towards-zero value is not
17460 representable by the result type. These intrinsics provide an alternative
17461 conversion, which will saturate towards the smallest and largest representable
17462 integer values instead.
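
The difference can be sketched with an out-of-range input (the constant is
chosen for illustration): the plain instruction yields poison, while the
saturating intrinsic clamps to the largest representable value.

.. code-block:: llvm

      %p = fptoui float 377.0 to i8                       ; poison: 377 is not representable in i8
      %s = call i8 @llvm.fptoui.sat.i8.f32(float 377.0)   ; yields i8: 255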
17463
17464 '``llvm.fptoui.sat.*``' Intrinsic
17465 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17466
17467 Syntax:
17468 """""""
17469
17470 This is an overloaded intrinsic. You can use ``llvm.fptoui.sat`` on any
17471 floating-point argument type and any integer result type, or vectors thereof.
17472 Not all targets may support all types, however.
17473
17474 ::
17475
17476       declare i32 @llvm.fptoui.sat.i32.f32(float %f)
17477       declare i19 @llvm.fptoui.sat.i19.f64(double %f)
17478       declare <4 x i100> @llvm.fptoui.sat.v4i100.v4f128(<4 x fp128> %f)
17479
17480 Overview:
17481 """""""""
17482
17483 This intrinsic converts the argument into an unsigned integer using saturating
17484 semantics.
17485
17486 Arguments:
17487 """"""""""
17488
The argument may be any floating-point or vector of floating-point type. The
return value may be any integer or vector of integer type. The number of vector
elements in the argument and the return value must be the same.
17492
17493 Semantics:
17494 """"""""""
17495
17496 The conversion to integer is performed subject to the following rules:
17497
17498 - If the argument is any NaN, zero is returned.
17499 - If the argument is smaller than zero (this includes negative infinity),
17500   zero is returned.
17501 - If the argument is larger than the largest representable unsigned integer of
17502   the result type (this includes positive infinity), the largest representable
17503   unsigned integer is returned.
17504 - Otherwise, the result of rounding the argument towards zero is returned.
17505
17506 Example:
17507 """"""""
17508
17509 .. code-block:: text
17510
17511       %a = call i8 @llvm.fptoui.sat.i8.f32(float 123.9)              ; yields i8: 123
17512       %b = call i8 @llvm.fptoui.sat.i8.f32(float -5.7)               ; yields i8:   0
17513       %c = call i8 @llvm.fptoui.sat.i8.f32(float 377.0)              ; yields i8: 255
17514       %d = call i8 @llvm.fptoui.sat.i8.f32(float 0xFFF8000000000000) ; yields i8:   0
17515
17516 '``llvm.fptosi.sat.*``' Intrinsic
17517 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17518
17519 Syntax:
17520 """""""
17521
17522 This is an overloaded intrinsic. You can use ``llvm.fptosi.sat`` on any
17523 floating-point argument type and any integer result type, or vectors thereof.
17524 Not all targets may support all types, however.
17525
17526 ::
17527
17528       declare i32 @llvm.fptosi.sat.i32.f32(float %f)
17529       declare i19 @llvm.fptosi.sat.i19.f64(double %f)
17530       declare <4 x i100> @llvm.fptosi.sat.v4i100.v4f128(<4 x fp128> %f)
17531
17532 Overview:
17533 """""""""
17534
17535 This intrinsic converts the argument into a signed integer using saturating
17536 semantics.
17537
17538 Arguments:
17539 """"""""""
17540
The argument may be any floating-point or vector of floating-point type. The
return value may be any integer or vector of integer type. The number of vector
elements in the argument and the return value must be the same.
17544
17545 Semantics:
17546 """"""""""
17547
17548 The conversion to integer is performed subject to the following rules:
17549
17550 - If the argument is any NaN, zero is returned.
17551 - If the argument is smaller than the smallest representable signed integer of
17552   the result type (this includes negative infinity), the smallest
17553   representable signed integer is returned.
17554 - If the argument is larger than the largest representable signed integer of
17555   the result type (this includes positive infinity), the largest representable
17556   signed integer is returned.
17557 - Otherwise, the result of rounding the argument towards zero is returned.
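
As a rough sketch of the signed rules, the following Python model clamps to
the two's complement range of the result type. It is not LLVM's
implementation; ``fptosi_sat`` and ``bits`` are names assumed for this
illustration.

.. code-block:: python

      import math

      def fptosi_sat(x: float, bits: int) -> int:
          """Model of llvm.fptosi.sat for a signed result of `bits` bits."""
          min_val = -(1 << (bits - 1))
          max_val = (1 << (bits - 1)) - 1
          if math.isnan(x):
              return 0            # any NaN yields zero
          if x < min_val:
              return min_val      # includes negative infinity
          if x > max_val:
              return max_val      # includes positive infinity
          return math.trunc(x)    # round towards zero

      # A vector argument is converted element by element.
      lanes = [fptosi_sat(v, 8) for v in (23.9, -130.8, 999.0)]

Here ``lanes`` is ``[23, -128, 127]``.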
17558
17559 Example:
17560 """"""""
17561
17562 .. code-block:: text
17563
17564       %a = call i8 @llvm.fptosi.sat.i8.f32(float 23.9)               ; yields i8:   23
17565       %b = call i8 @llvm.fptosi.sat.i8.f32(float -130.8)             ; yields i8: -128
17566       %c = call i8 @llvm.fptosi.sat.i8.f32(float 999.0)              ; yields i8:  127
17567       %d = call i8 @llvm.fptosi.sat.i8.f32(float 0xFFF8000000000000) ; yields i8:    0
17568
17569 .. _dbg_intrinsics:
17570
17571 Debugger Intrinsics
17572 -------------------
17573
The LLVM debugger intrinsics (which all start with the ``llvm.dbg.``
prefix) are described in the `LLVM Source Level
Debugging <SourceLevelDebugging.html#format-common-intrinsics>`_
document.
17578
17579 Exception Handling Intrinsics
17580 -----------------------------
17581
The LLVM exception handling intrinsics (which all start with the
``llvm.eh.`` prefix) are described in the `LLVM Exception
Handling <ExceptionHandling.html#format-common-intrinsics>`_ document.
17585
17586 .. _int_trampoline:
17587
17588 Trampoline Intrinsics
17589 ---------------------
17590
17591 These intrinsics make it possible to excise one parameter, marked with
17592 the :ref:`nest <nest>` attribute, from a function. The result is a
17593 callable function pointer lacking the nest parameter - the caller does
17594 not need to provide a value for it. Instead, the value to use is stored
17595 in advance in a "trampoline", a block of memory usually allocated on the
17596 stack, which also contains code to splice the nest value into the
17597 argument list. This is used to implement the GCC nested function address
17598 extension.
17599
17600 For example, if the function is ``i32 f(i8* nest %c, i32 %x, i32 %y)``
17601 then the resulting function pointer has signature ``i32 (i32, i32)*``.
17602 It can be created as follows:
17603
17604 .. code-block:: llvm
17605
      %tramp = alloca [10 x i8], align 4 ; size and alignment only correct for X86
      %tramp1 = getelementptr [10 x i8], [10 x i8]* %tramp, i32 0, i32 0
      call void @llvm.init.trampoline(i8* %tramp1, i8* bitcast (i32 (i8*, i32, i32)* @f to i8*), i8* %nval)
      %p = call i8* @llvm.adjust.trampoline(i8* %tramp1)
      %fp = bitcast i8* %p to i32 (i32, i32)*
17611
The call ``%val = call i32 %fp(i32 %x, i32 %y)`` is then equivalent to
``%val = call i32 @f(i8* %nval, i32 %x, i32 %y)``.
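
The effect on the argument list resembles partial application. The Python
sketch below is a conceptual analogy only: a real trampoline is a block of
target-specific machine code in memory, not a closure object, and the names
``f``, ``nval`` and ``fp`` merely mirror the IR example above.

.. code-block:: python

      from functools import partial

      def f(nest, x, y):
          # Stand-in for @f; `nest` plays the role of the `nest` parameter.
          return nest["base"] + x * y

      nval = {"base": 100}    # stand-in for %nval
      fp = partial(f, nval)   # stand-in for the trampoline's function pointer

      # Calling fp(x, y) behaves like calling f(nval, x, y).
      result = fp(2, 3)

In this sketch ``result`` equals ``f(nval, 2, 3)``, which is 106.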
17614
17615 .. _int_it:
17616
17617 '``llvm.init.trampoline``' Intrinsic
17618 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17619
17620 Syntax:
17621 """""""
17622
17623 ::
17624
17625       declare void @llvm.init.trampoline(i8* <tramp>, i8* <func>, i8* <nval>)
17626
17627 Overview:
17628 """""""""
17629
17630 This fills the memory pointed to by ``tramp`` with executable code,
17631 turning it into a trampoline.
17632
17633 Arguments:
17634 """"""""""
17635
17636 The ``llvm.init.trampoline`` intrinsic takes three arguments, all
17637 pointers. The ``tramp`` argument must point to a sufficiently large and
17638 sufficiently aligned block of memory; this memory is written to by the
17639 intrinsic. Note that the size and the alignment are target-specific -
17640 LLVM currently provides no portable way of determining them, so a
17641 front-end that generates this intrinsic needs to have some
17642 target-specific knowledge. The ``func`` argument must hold a function
17643 bitcast to an ``i8*``.
17644
17645 Semantics:
17646 """"""""""
17647
The block of memory pointed to by ``tramp`` is filled with
target-dependent code, turning it into a function. Then ``tramp`` needs to be
17650 passed to :ref:`llvm.adjust.trampoline <int_at>` to get a pointer which can
17651 be :ref:`bitcast (to a new function) and called <int_trampoline>`. The new
17652 function's signature is the same as that of ``func`` with any arguments
17653 marked with the ``nest`` attribute removed. At most one such ``nest``
17654 argument is allowed, and it must be of pointer type. Calling the new
17655 function is equivalent to calling ``func`` with the same argument list,
17656 but with ``nval`` used for the missing ``nest`` argument. If, after
17657 calling ``llvm.init.trampoline``, the memory pointed to by ``tramp`` is
17658 modified, then the effect of any later call to the returned function
17659 pointer is undefined.
17660
17661 .. _int_at:
17662
17663 '``llvm.adjust.trampoline``' Intrinsic
17664 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17665
17666 Syntax:
17667 """""""
17668
17669 ::
17670
17671       declare i8* @llvm.adjust.trampoline(i8* <tramp>)
17672
17673 Overview:
17674 """""""""
17675
17676 This performs any required machine-specific adjustment to the address of
17677 a trampoline (passed as ``tramp``).
17678
17679 Arguments:
17680 """"""""""
17681
17682 ``tramp`` must point to a block of memory which already has trampoline
17683 code filled in by a previous call to
17684 :ref:`llvm.init.trampoline <int_it>`.
17685
17686 Semantics:
17687 """"""""""
17688
On some architectures the address of the code to be executed needs to be
different from the address where the trampoline is actually stored. This
intrinsic returns the executable address corresponding to ``tramp``
after performing the required machine-specific adjustments. The pointer
returned can then be :ref:`bitcast and executed <int_trampoline>`.
17694
17695
17696 .. _int_vp:
17697
17698 Vector Predication Intrinsics
17699 -----------------------------
17700 VP intrinsics are intended for predicated SIMD/vector code.  A typical VP
17701 operation takes a vector mask and an explicit vector length parameter as in:
17702
17703 ::
17704
17705       <W x T> llvm.vp.<opcode>.*(<W x T> %x, <W x T> %y, <W x i1> %mask, i32 %evl)
17706
The vector mask parameter (``%mask``) always has a vector of ``i1`` type, for
example ``<32 x i1>``.  The explicit vector length parameter (``%evl``) always
has the type ``i32`` and is an unsigned integer value in the range:
17711
17712 ::
17713
17714       0 <= %evl <= W,  where W is the number of vector elements
17715
17716 Note that for :ref:`scalable vector types <t_vector>` ``W`` is the runtime
17717 length of the vector.
17718
The VP intrinsic has undefined behavior if ``%evl > W``.  The explicit vector
length (``%evl``) creates a mask, ``%EVLmask``, with all elements
``0 <= i < %evl`` set to True, and all other lanes ``%evl <= i < W`` set to
False.  A new mask ``M`` is calculated with an element-wise AND of ``%mask``
and ``%EVLmask``:
17723
17724 ::
17725
17726       M = %mask AND %EVLmask
17727
17728 A vector operation ``<opcode>`` on vectors ``A`` and ``B`` calculates:
17729
17730 ::
17731
       (A <opcode> B)[i] =  {  A[i] <opcode> B[i]   if M[i] = True, and
                            {  undef                otherwise
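
The lane selection described above can be sketched in Python. This is a
simplified model under stated assumptions: vectors are plain lists, ``None``
stands in for an undefined lane, and the helper name ``vp_binop`` is invented
for this illustration. It does not model fixed-width integer wraparound.

.. code-block:: python

      import operator

      UNDEF = None  # stand-in for an undefined lane value

      def vp_binop(op, a, b, mask, evl):
          """Apply `op` lane-wise under %mask and %evl (illustrative model)."""
          w = len(a)
          if len(b) != w or len(mask) != w:
              raise ValueError("operands and mask must have W elements")
          if evl > w:
              raise ValueError("%evl > W is undefined behavior")
          evl_mask = [i < evl for i in range(w)]           # %EVLmask
          m = [mask[i] and evl_mask[i] for i in range(w)]  # M = %mask AND %EVLmask
          return [op(a[i], b[i]) if m[i] else UNDEF for i in range(w)]

      r = vp_binop(operator.add, [1, 2, 3, 4], [10, 20, 30, 40],
                   [True, False, True, True], 3)

Here ``r`` is ``[11, None, 33, None]``: lane 1 is disabled by the mask and
lane 3 by the explicit vector length.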
17734
17735 Optimization Hint
17736 ^^^^^^^^^^^^^^^^^
17737
Some targets, such as AVX512, do not support the ``%evl`` parameter in
hardware.  The use of an effective ``%evl`` (one that disables lanes, that is,
``%evl < W``) is discouraged for those targets.  The function
``TargetTransformInfo::hasActiveVectorLength()`` returns true when the target
has native support for ``%evl``.
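
One way to avoid an effective explicit vector length on such targets is to
fold it into the mask and pass ``W`` as the vector length, since the enabled
lanes are determined by ``%mask AND %EVLmask`` either way. The Python sketch
below illustrates the idea on plain lists; the helper name
``fold_evl_into_mask`` is an assumption of this example and does not name an
LLVM API.

.. code-block:: python

      def fold_evl_into_mask(mask, evl):
          """Return (new_mask, new_evl) selecting the same enabled lanes."""
          w = len(mask)
          if evl > w:
              raise ValueError("%evl > W is undefined behavior")
          # Lane i stays enabled only if it is set in %mask and lies below %evl.
          new_mask = [mask[i] and i < evl for i in range(w)]
          return new_mask, w  # the new vector length covers every lane

      new_mask, new_evl = fold_evl_into_mask([True, True, False, True], 2)

In this sketch ``new_mask`` is ``[True, True, False, False]`` and ``new_evl``
is 4.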
17742
17743
17744 .. _int_vp_add:
17745
17746 '``llvm.vp.add.*``' Intrinsics
17747 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17748
17749 Syntax:
17750 """""""
17751 This is an overloaded intrinsic.
17752
17753 ::
17754
17755       declare <16 x i32>  @llvm.vp.add.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
17756       declare <vscale x 4 x i32>  @llvm.vp.add.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
17757       declare <256 x i64>  @llvm.vp.add.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
17758
17759 Overview:
17760 """""""""
17761
17762 Predicated integer addition of two vectors of integers.
17763
17764
17765 Arguments:
17766 """"""""""
17767
17768 The first two operands and the result have the same vector of integer type. The
17769 third operand is the vector mask and has the same number of elements as the
17770 result vector type. The fourth operand is the explicit vector length of the
17771 operation.
17772
17773 Semantics:
17774 """"""""""
17775
17776 The '``llvm.vp.add``' intrinsic performs integer addition (:ref:`add <i_add>`)
17777 of the first and second vector operand on each enabled lane.  The result on
17778 disabled lanes is undefined.
17779
17780 Examples:
17781 """""""""
17782
17783 .. code-block:: llvm
17784
17785       %r = call <4 x i32> @llvm.vp.add.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
17786       ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
17787
17788       %t = add <4 x i32> %a, %b
17789       %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef
17790
17791 .. _int_vp_sub:
17792
17793 '``llvm.vp.sub.*``' Intrinsics
17794 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17795
17796 Syntax:
17797 """""""
17798 This is an overloaded intrinsic.
17799
17800 ::
17801
17802       declare <16 x i32>  @llvm.vp.sub.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
17803       declare <vscale x 4 x i32>  @llvm.vp.sub.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
17804       declare <256 x i64>  @llvm.vp.sub.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
17805
17806 Overview:
17807 """""""""
17808
17809 Predicated integer subtraction of two vectors of integers.
17810
17811
17812 Arguments:
17813 """"""""""
17814
17815 The first two operands and the result have the same vector of integer type. The
17816 third operand is the vector mask and has the same number of elements as the
17817 result vector type. The fourth operand is the explicit vector length of the
17818 operation.
17819
17820 Semantics:
17821 """"""""""
17822
17823 The '``llvm.vp.sub``' intrinsic performs integer subtraction
17824 (:ref:`sub <i_sub>`)  of the first and second vector operand on each enabled
17825 lane. The result on disabled lanes is undefined.
17826
17827 Examples:
17828 """""""""
17829
17830 .. code-block:: llvm
17831
17832       %r = call <4 x i32> @llvm.vp.sub.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
17833       ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
17834
17835       %t = sub <4 x i32> %a, %b
17836       %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef
17837
17838
17839
17840 .. _int_vp_mul:
17841
17842 '``llvm.vp.mul.*``' Intrinsics
17843 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17844
17845 Syntax:
17846 """""""
17847 This is an overloaded intrinsic.
17848
17849 ::
17850
      declare <16 x i32>  @llvm.vp.mul.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x i32>  @llvm.vp.mul.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x i64>  @llvm.vp.mul.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
17854
17855 Overview:
17856 """""""""
17857
17858 Predicated integer multiplication of two vectors of integers.
17859
17860
17861 Arguments:
17862 """"""""""
17863
17864 The first two operands and the result have the same vector of integer type. The
17865 third operand is the vector mask and has the same number of elements as the
17866 result vector type. The fourth operand is the explicit vector length of the
17867 operation.
17868
17869 Semantics:
17870 """"""""""
17871 The '``llvm.vp.mul``' intrinsic performs integer multiplication
17872 (:ref:`mul <i_mul>`) of the first and second vector operand on each enabled
17873 lane. The result on disabled lanes is undefined.
17874
17875 Examples:
17876 """""""""
17877
17878 .. code-block:: llvm
17879
17880       %r = call <4 x i32> @llvm.vp.mul.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
17881       ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
17882
17883       %t = mul <4 x i32> %a, %b
17884       %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef
17885
17886
17887 .. _int_vp_sdiv:
17888
17889 '``llvm.vp.sdiv.*``' Intrinsics
17890 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17891
17892 Syntax:
17893 """""""
17894 This is an overloaded intrinsic.
17895
17896 ::
17897
17898       declare <16 x i32>  @llvm.vp.sdiv.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
17899       declare <vscale x 4 x i32>  @llvm.vp.sdiv.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
17900       declare <256 x i64>  @llvm.vp.sdiv.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
17901
17902 Overview:
17903 """""""""
17904
17905 Predicated, signed division of two vectors of integers.
17906
17907
17908 Arguments:
17909 """"""""""
17910
17911 The first two operands and the result have the same vector of integer type. The
17912 third operand is the vector mask and has the same number of elements as the
17913 result vector type. The fourth operand is the explicit vector length of the
17914 operation.
17915
17916 Semantics:
17917 """"""""""
17918
17919 The '``llvm.vp.sdiv``' intrinsic performs signed division (:ref:`sdiv <i_sdiv>`)
17920 of the first and second vector operand on each enabled lane.  The result on
17921 disabled lanes is undefined.
17922
17923 Examples:
17924 """""""""
17925
17926 .. code-block:: llvm
17927
17928       %r = call <4 x i32> @llvm.vp.sdiv.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
17929       ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
17930
17931       %t = sdiv <4 x i32> %a, %b
17932       %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef
17933
17934
17935 .. _int_vp_udiv:
17936
17937 '``llvm.vp.udiv.*``' Intrinsics
17938 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17939
17940 Syntax:
17941 """""""
17942 This is an overloaded intrinsic.
17943
17944 ::
17945
17946       declare <16 x i32>  @llvm.vp.udiv.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
17947       declare <vscale x 4 x i32>  @llvm.vp.udiv.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
17948       declare <256 x i64>  @llvm.vp.udiv.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
17949
17950 Overview:
17951 """""""""
17952
17953 Predicated, unsigned division of two vectors of integers.
17954
17955
17956 Arguments:
17957 """"""""""
17958
The first two operands and the result have the same vector of integer type. The
third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.
17960
17961 Semantics:
17962 """"""""""
17963
17964 The '``llvm.vp.udiv``' intrinsic performs unsigned division
17965 (:ref:`udiv <i_udiv>`) of the first and second vector operand on each enabled
17966 lane. The result on disabled lanes is undefined.
17967
17968 Examples:
17969 """""""""
17970
17971 .. code-block:: llvm
17972
17973       %r = call <4 x i32> @llvm.vp.udiv.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
17974       ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
17975
17976       %t = udiv <4 x i32> %a, %b
17977       %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef
17978
17979
17980
17981 .. _int_vp_srem:
17982
17983 '``llvm.vp.srem.*``' Intrinsics
17984 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17985
17986 Syntax:
17987 """""""
17988 This is an overloaded intrinsic.
17989
17990 ::
17991
17992       declare <16 x i32>  @llvm.vp.srem.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
17993       declare <vscale x 4 x i32>  @llvm.vp.srem.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
17994       declare <256 x i64>  @llvm.vp.srem.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
17995
17996 Overview:
17997 """""""""
17998
Predicated computation of the signed remainder of two integer vectors.
18000
18001
18002 Arguments:
18003 """"""""""
18004
18005 The first two operands and the result have the same vector of integer type. The
18006 third operand is the vector mask and has the same number of elements as the
18007 result vector type. The fourth operand is the explicit vector length of the
18008 operation.
18009
18010 Semantics:
18011 """"""""""
18012
18013 The '``llvm.vp.srem``' intrinsic computes the remainder of the signed division
18014 (:ref:`srem <i_srem>`) of the first and second vector operand on each enabled
18015 lane.  The result on disabled lanes is undefined.
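
The signed remainder follows the :ref:`srem <i_srem>` instruction, whose
result is zero or has the sign of the dividend. That differs from Python's
``%`` operator, so the per-lane sketch below spells it out. The names ``srem``
and ``vp_srem`` are invented for this illustration, ``None`` stands in for an
undefined lane, and the model assumes a non-zero divisor and ignores overflow.

.. code-block:: python

      def srem(a: int, b: int) -> int:
          """Remainder of a division that truncates towards zero (b != 0)."""
          q = abs(a) // abs(b)
          if (a < 0) != (b < 0):
              q = -q
          return a - q * b

      def vp_srem(a, b, mask, evl):
          # A lane is enabled when its mask bit is set and its index is below evl.
          return [srem(x, y) if (m and i < evl) else None
                  for i, (x, y, m) in enumerate(zip(a, b, mask))]

      r = vp_srem([-7, 7, 5, 9], [2, -2, 3, 4], [True, True, False, True], 3)

Here ``r`` is ``[-1, 1, None, None]``, whereas Python's ``-7 % 2`` would give 1.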
18016
18017 Examples:
18018 """""""""
18019
18020 .. code-block:: llvm
18021
18022       %r = call <4 x i32> @llvm.vp.srem.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
18023       ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
18024
18025       %t = srem <4 x i32> %a, %b
18026       %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef
18027
18028
18029
18030 .. _int_vp_urem:
18031
18032 '``llvm.vp.urem.*``' Intrinsics
18033 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18034
18035 Syntax:
18036 """""""
18037 This is an overloaded intrinsic.
18038
18039 ::
18040
18041       declare <16 x i32>  @llvm.vp.urem.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
18042       declare <vscale x 4 x i32>  @llvm.vp.urem.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
18043       declare <256 x i64>  @llvm.vp.urem.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
18044
18045 Overview:
18046 """""""""
18047
18048 Predicated computation of the unsigned remainder of two integer vectors.
18049
18050
18051 Arguments:
18052 """"""""""
18053
18054 The first two operands and the result have the same vector of integer type. The
18055 third operand is the vector mask and has the same number of elements as the
18056 result vector type. The fourth operand is the explicit vector length of the
18057 operation.
18058
18059 Semantics:
18060 """"""""""
18061
18062 The '``llvm.vp.urem``' intrinsic computes the remainder of the unsigned division
18063 (:ref:`urem <i_urem>`) of the first and second vector operand on each enabled
18064 lane.  The result on disabled lanes is undefined.
18065
18066 Examples:
18067 """""""""
18068
18069 .. code-block:: llvm
18070
18071       %r = call <4 x i32> @llvm.vp.urem.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
18072       ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
18073
18074       %t = urem <4 x i32> %a, %b
18075       %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef
18076
18077
18078 .. _int_vp_ashr:
18079
18080 '``llvm.vp.ashr.*``' Intrinsics
18081 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18082
18083 Syntax:
18084 """""""
18085 This is an overloaded intrinsic.
18086
18087 ::
18088
18089       declare <16 x i32>  @llvm.vp.ashr.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
18090       declare <vscale x 4 x i32>  @llvm.vp.ashr.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
18091       declare <256 x i64>  @llvm.vp.ashr.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
18092
18093 Overview:
18094 """""""""
18095
18096 Vector-predicated arithmetic right-shift.
18097
18098
18099 Arguments:
18100 """"""""""
18101
18102 The first two operands and the result have the same vector of integer type. The
18103 third operand is the vector mask and has the same number of elements as the
18104 result vector type. The fourth operand is the explicit vector length of the
18105 operation.
18106
18107 Semantics:
18108 """"""""""
18109
18110 The '``llvm.vp.ashr``' intrinsic computes the arithmetic right shift
18111 (:ref:`ashr <i_ashr>`) of the first operand by the second operand on each
18112 enabled lane. The result on disabled lanes is undefined.
18113
18114 Examples:
18115 """""""""
18116
18117 .. code-block:: llvm
18118
18119       %r = call <4 x i32> @llvm.vp.ashr.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
18120       ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
18121
18122       %t = ashr <4 x i32> %a, %b
18123       %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef
18124
18125
18126 .. _int_vp_lshr:
18127
18128
18129 '``llvm.vp.lshr.*``' Intrinsics
18130 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18131
18132 Syntax:
18133 """""""
18134 This is an overloaded intrinsic.
18135
18136 ::
18137
18138       declare <16 x i32>  @llvm.vp.lshr.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
18139       declare <vscale x 4 x i32>  @llvm.vp.lshr.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
18140       declare <256 x i64>  @llvm.vp.lshr.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
18141
18142 Overview:
18143 """""""""
18144
18145 Vector-predicated logical right-shift.
18146
18147
18148 Arguments:
18149 """"""""""
18150
18151 The first two operands and the result have the same vector of integer type. The
18152 third operand is the vector mask and has the same number of elements as the
18153 result vector type. The fourth operand is the explicit vector length of the
18154 operation.
18155
18156 Semantics:
18157 """"""""""
18158
18159 The '``llvm.vp.lshr``' intrinsic computes the logical right shift
18160 (:ref:`lshr <i_lshr>`) of the first operand by the second operand on each
18161 enabled lane. The result on disabled lanes is undefined.
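
The difference between this logical shift and the arithmetic shift of
``llvm.vp.ashr`` can be sketched per lane for ``i32`` elements. The helper
names below are assumptions of this example, lanes are modeled as unsigned
32-bit bit patterns, and shift amounts are assumed to be smaller than the bit
width.

.. code-block:: python

      BITS = 32
      MASK32 = (1 << BITS) - 1

      def lshr32(x: int, n: int) -> int:
          """Logical right shift: vacated high bits are filled with zeros."""
          return (x & MASK32) >> n

      def ashr32(x: int, n: int) -> int:
          """Arithmetic right shift: vacated high bits copy the sign bit."""
          x &= MASK32
          if x & (1 << (BITS - 1)):
              x -= 1 << BITS          # reinterpret the bit pattern as negative
          return (x >> n) & MASK32    # convert back to a 32-bit bit pattern

      logical = lshr32(0x80000000, 4)
      arithmetic = ashr32(0x80000000, 4)

In this sketch ``logical`` is ``0x08000000`` and ``arithmetic`` is
``0xF8000000``. Both functions agree when the sign bit of the input is clear.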
18162
18163 Examples:
18164 """""""""
18165
18166 .. code-block:: llvm
18167
18168       %r = call <4 x i32> @llvm.vp.lshr.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
18169       ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
18170
18171       %t = lshr <4 x i32> %a, %b
18172       %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef
18173
18174
18175 .. _int_vp_shl:
18176
18177 '``llvm.vp.shl.*``' Intrinsics
18178 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18179
18180 Syntax:
18181 """""""
18182 This is an overloaded intrinsic.
18183
18184 ::
18185
18186       declare <16 x i32>  @llvm.vp.shl.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
18187       declare <vscale x 4 x i32>  @llvm.vp.shl.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
18188       declare <256 x i64>  @llvm.vp.shl.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
18189
18190 Overview:
18191 """""""""
18192
18193 Vector-predicated left shift.
18194
18195
18196 Arguments:
18197 """"""""""
18198
18199 The first two operands and the result have the same vector of integer type. The
18200 third operand is the vector mask and has the same number of elements as the
18201 result vector type. The fourth operand is the explicit vector length of the
18202 operation.
18203
18204 Semantics:
18205 """"""""""
18206
18207 The '``llvm.vp.shl``' intrinsic computes the left shift (:ref:`shl <i_shl>`) of
18208 the first operand by the second operand on each enabled lane.  The result on
18209 disabled lanes is undefined.
18210
18211 Examples:
18212 """""""""
18213
18214 .. code-block:: llvm
18215
18216       %r = call <4 x i32> @llvm.vp.shl.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
18217       ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
18218
18219       %t = shl <4 x i32> %a, %b
18220       %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef
18221
18222
18223 .. _int_vp_or:
18224
18225 '``llvm.vp.or.*``' Intrinsics
18226 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18227
18228 Syntax:
18229 """""""
18230 This is an overloaded intrinsic.
18231
18232 ::
18233
18234       declare <16 x i32>  @llvm.vp.or.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
18235       declare <vscale x 4 x i32>  @llvm.vp.or.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
18236       declare <256 x i64>  @llvm.vp.or.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
18237
18238 Overview:
18239 """""""""
18240
Vector-predicated bitwise or.
18242
18243
18244 Arguments:
18245 """"""""""
18246
18247 The first two operands and the result have the same vector of integer type. The
18248 third operand is the vector mask and has the same number of elements as the
18249 result vector type. The fourth operand is the explicit vector length of the
18250 operation.
18251
18252 Semantics:
18253 """"""""""
18254
18255 The '``llvm.vp.or``' intrinsic performs a bitwise or (:ref:`or <i_or>`) of the
18256 first two operands on each enabled lane.  The result on disabled lanes is
18257 undefined.
18258
18259 Examples:
18260 """""""""
18261
18262 .. code-block:: llvm
18263
18264       %r = call <4 x i32> @llvm.vp.or.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
18265       ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
18266
18267       %t = or <4 x i32> %a, %b
18268       %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef
18269
18270
18271 .. _int_vp_and:
18272
18273 '``llvm.vp.and.*``' Intrinsics
18274 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18275
18276 Syntax:
18277 """""""
18278 This is an overloaded intrinsic.
18279
18280 ::
18281
18282       declare <16 x i32>  @llvm.vp.and.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
18283       declare <vscale x 4 x i32>  @llvm.vp.and.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
18284       declare <256 x i64>  @llvm.vp.and.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
18285
18286 Overview:
18287 """""""""
18288
Vector-predicated bitwise and.
18290
18291
18292 Arguments:
18293 """"""""""
18294
18295 The first two operands and the result have the same vector of integer type. The
18296 third operand is the vector mask and has the same number of elements as the
18297 result vector type. The fourth operand is the explicit vector length of the
18298 operation.
18299
18300 Semantics:
18301 """"""""""
18302
The '``llvm.vp.and``' intrinsic performs a bitwise and (:ref:`and <i_and>`) of
the first two operands on each enabled lane.  The result on disabled lanes is
undefined.
18306
18307 Examples:
18308 """""""""
18309
18310 .. code-block:: llvm
18311
18312       %r = call <4 x i32> @llvm.vp.and.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
18313       ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
18314
18315       %t = and <4 x i32> %a, %b
18316       %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef
18317
18318
18319 .. _int_vp_xor:
18320
18321 '``llvm.vp.xor.*``' Intrinsics
18322 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18323
18324 Syntax:
18325 """""""
18326 This is an overloaded intrinsic.
18327
18328 ::
18329
18330       declare <16 x i32>  @llvm.vp.xor.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
18331       declare <vscale x 4 x i32>  @llvm.vp.xor.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
18332       declare <256 x i64>  @llvm.vp.xor.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
18333
18334 Overview:
18335 """""""""
18336
18337 Vector-predicated, bitwise xor.
18338
18339
18340 Arguments:
18341 """"""""""
18342
18343 The first two operands and the result have the same vector of integer type. The
18344 third operand is the vector mask and has the same number of elements as the
18345 result vector type. The fourth operand is the explicit vector length of the
18346 operation.
18347
18348 Semantics:
18349 """"""""""
18350
18351 The '``llvm.vp.xor``' intrinsic performs a bitwise xor (:ref:`xor <i_xor>`) of
18352 the first two operands on each enabled lane.
18353 The result on disabled lanes is undefined.
18354
18355 Examples:
18356 """""""""
18357
18358 .. code-block:: llvm
18359
18360       %r = call <4 x i32> @llvm.vp.xor.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
18361       ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
18362
18363       %t = xor <4 x i32> %a, %b
18364       %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef
18365
18366
18367 .. _int_vp_fadd:
18368
18369 '``llvm.vp.fadd.*``' Intrinsics
18370 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18371
18372 Syntax:
18373 """""""
18374 This is an overloaded intrinsic.
18375
18376 ::
18377
18378       declare <16 x float>  @llvm.vp.fadd.v16f32 (<16 x float> <left_op>, <16 x float> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
18379       declare <vscale x 4 x float>  @llvm.vp.fadd.nxv4f32 (<vscale x 4 x float> <left_op>, <vscale x 4 x float> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
18380       declare <256 x double>  @llvm.vp.fadd.v256f64 (<256 x double> <left_op>, <256 x double> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
18381
18382 Overview:
18383 """""""""
18384
18385 Predicated floating-point addition of two vectors of floating-point values.
18386
18387
18388 Arguments:
18389 """"""""""
18390
18391 The first two operands and the result have the same vector of floating-point type. The
18392 third operand is the vector mask and has the same number of elements as the
18393 result vector type. The fourth operand is the explicit vector length of the
18394 operation.
18395
18396 Semantics:
18397 """"""""""
18398
The '``llvm.vp.fadd``' intrinsic performs floating-point addition
(:ref:`fadd <i_fadd>`) of the first and second vector operand on each enabled
lane.  The result on disabled lanes is undefined.  The operation is performed
in the default floating-point environment.
18403
18404 Examples:
18405 """""""""
18406
18407 .. code-block:: llvm
18408
18409       %r = call <4 x float> @llvm.vp.fadd.v4f32(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 %evl)
18410       ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
18411
18412       %t = fadd <4 x float> %a, %b
18413       %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> undef
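
The ``%also.r`` expansion above leaves the effect of ``%evl`` to the comment.
As a minimal sketch, the lane-enable condition can be spelled out by comparing
each lane index against ``%evl`` and combining the outcome with ``%mask``. The
function name ``@vp_fadd_expanded`` and all value names are illustrative only,
and the sketch assumes that ``%evl`` does not exceed the vector length of 4.

.. code-block:: llvm

      define <4 x float> @vp_fadd_expanded(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 %evl) {
        ; Broadcast %evl to every lane.
        %evl.ins = insertelement <4 x i32> poison, i32 %evl, i64 0
        %evl.splat = shufflevector <4 x i32> %evl.ins, <4 x i32> poison, <4 x i32> zeroinitializer
        ; A lane is below %evl when its index is unsigned-less-than %evl.
        %below.evl = icmp ult <4 x i32> <i32 0, i32 1, i32 2, i32 3>, %evl.splat
        ; A lane is enabled only if it is below %evl and its mask bit is set.
        %enabled = and <4 x i1> %mask, %below.evl
        %t = fadd <4 x float> %a, %b
        ; Disabled lanes are left undefined, as in the example above.
        %r = select <4 x i1> %enabled, <4 x float> %t, <4 x float> undef
        ret <4 x float> %r
      }

On lanes where ``%enabled`` is false the ``select`` yields ``undef``, which
matches the undefined result on disabled lanes.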
18414
18415
18416 .. _int_vp_fsub:
18417
18418 '``llvm.vp.fsub.*``' Intrinsics
18419 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18420
18421 Syntax:
18422 """""""
18423 This is an overloaded intrinsic.
18424
18425 ::
18426
18427       declare <16 x float>  @llvm.vp.fsub.v16f32 (<16 x float> <left_op>, <16 x float> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
18428       declare <vscale x 4 x float>  @llvm.vp.fsub.nxv4f32 (<vscale x 4 x float> <left_op>, <vscale x 4 x float> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
18429       declare <256 x double>  @llvm.vp.fsub.v256f64 (<256 x double> <left_op>, <256 x double> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
18430
18431 Overview:
18432 """""""""
18433
18434 Predicated floating-point subtraction of two vectors of floating-point values.
18435
18436
18437 Arguments:
18438 """"""""""
18439
The first two operands and the result have the same floating-point vector type.
The third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.
18444
18445 Semantics:
18446 """"""""""
18447
The '``llvm.vp.fsub``' intrinsic performs floating-point subtraction (:ref:`fsub <i_fsub>`)
of the first and second vector operands on each enabled lane.  The result on
disabled lanes is undefined.  The operation is performed in the default
floating-point environment.
18452
18453 Examples:
18454 """""""""
18455
18456 .. code-block:: llvm
18457
18458       %r = call <4 x float> @llvm.vp.fsub.v4f32(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 %evl)
18459       ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
18460
18461       %t = fsub <4 x float> %a, %b
18462       %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> undef
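
The intrinsic also accepts scalable vectors, as the ``nxv4f32`` declaration in
the syntax section shows. The following sketch assumes a caller that wants to
process only the first ``%n`` lanes: it clamps ``%n`` to the run-time lane
count and passes an all-true mask. The function name ``@fsub_first_n`` and the
value names are hypothetical; ``llvm.vscale`` and ``llvm.umin`` are the
existing intrinsics of those names.

.. code-block:: llvm

      declare i32 @llvm.vscale.i32()
      declare i32 @llvm.umin.i32(i32, i32)
      declare <vscale x 4 x float> @llvm.vp.fsub.nxv4f32(<vscale x 4 x float>, <vscale x 4 x float>, <vscale x 4 x i1>, i32)

      define <vscale x 4 x float> @fsub_first_n(<vscale x 4 x float> %a, <vscale x 4 x float> %b, i32 %n) {
        ; Run-time number of lanes in <vscale x 4 x float>.
        %vscale = call i32 @llvm.vscale.i32()
        %lanes = mul i32 %vscale, 4
        ; Clamp the requested length so that %evl never exceeds the lane count.
        %evl = call i32 @llvm.umin.i32(i32 %n, i32 %lanes)
        ; Build an all-true mask so that only %evl limits the operation.
        %true.ins = insertelement <vscale x 4 x i1> poison, i1 true, i64 0
        %all.true = shufflevector <vscale x 4 x i1> %true.ins, <vscale x 4 x i1> poison, <vscale x 4 x i32> zeroinitializer
        %r = call <vscale x 4 x float> @llvm.vp.fsub.nxv4f32(<vscale x 4 x float> %a, <vscale x 4 x float> %b, <vscale x 4 x i1> %all.true, i32 %evl)
        ret <vscale x 4 x float> %r
      }

Lanes at index ``%evl`` and above are disabled, so their result is undefined
and must not be relied upon by the caller.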
18463
18464
18465 .. _int_vp_fmul:
18466
18467 '``llvm.vp.fmul.*``' Intrinsics
18468 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18469
18470 Syntax:
18471 """""""
18472 This is an overloaded intrinsic.
18473
18474 ::
18475
18476       declare <16 x float>  @llvm.vp.fmul.v16f32 (<16 x float> <left_op>, <16 x float> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
18477       declare <vscale x 4 x float>  @llvm.vp.fmul.nxv4f32 (<vscale x 4 x float> <left_op>, <vscale x 4 x float> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
18478       declare <256 x double>  @llvm.vp.fmul.v256f64 (<256 x double> <left_op>, <256 x double> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
18479
18480 Overview:
18481 """""""""
18482
18483 Predicated floating-point multiplication of two vectors of floating-point values.
18484
18485
18486 Arguments:
18487 """"""""""
18488
The first two operands and the result have the same floating-point vector type.
The third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.
18493
18494 Semantics:
18495 """"""""""
18496
The '``llvm.vp.fmul``' intrinsic performs floating-point multiplication (:ref:`fmul <i_fmul>`)
of the first and second vector operands on each enabled lane.  The result on
disabled lanes is undefined.  The operation is performed in the default
floating-point environment.
18501
18502 Examples:
18503 """""""""
18504
18505 .. code-block:: llvm
18506
18507       %r = call <4 x float> @llvm.vp.fmul.v4f32(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 %evl)
18508       ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
18509
18510       %t = fmul <4 x float> %a, %b
18511       %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> undef
18512
18513
18514 .. _int_vp_fdiv:
18515
18516 '``llvm.vp.fdiv.*``' Intrinsics
18517 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18518
18519 Syntax:
18520 """""""
18521 This is an overloaded intrinsic.
18522
18523 ::
18524
18525       declare <16 x float>  @llvm.vp.fdiv.v16f32 (<16 x float> <left_op>, <16 x float> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
18526       declare <vscale x 4 x float>  @llvm.vp.fdiv.nxv4f32 (<vscale x 4 x float> <left_op>, <vscale x 4 x float> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
18527       declare <256 x double>  @llvm.vp.fdiv.v256f64 (<256 x double> <left_op>, <256 x double> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
18528
18529 Overview:
18530 """""""""
18531
18532 Predicated floating-point division of two vectors of floating-point values.
18533
18534
18535 Arguments:
18536 """"""""""
18537
The first two operands and the result have the same floating-point vector type.
The third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.
18542
18543 Semantics:
18544 """"""""""
18545
The '``llvm.vp.fdiv``' intrinsic performs floating-point division (:ref:`fdiv <i_fdiv>`)
of the first and second vector operands on each enabled lane.  The result on
disabled lanes is undefined.  The operation is performed in the default
floating-point environment.
18550
18551 Examples:
18552 """""""""
18553
18554 .. code-block:: llvm
18555
18556       %r = call <4 x float> @llvm.vp.fdiv.v4f32(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 %evl)
18557       ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
18558
18559       %t = fdiv <4 x float> %a, %b
18560       %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> undef
18561
18562
18563 .. _int_vp_frem:
18564
18565 '``llvm.vp.frem.*``' Intrinsics
18566 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18567
18568 Syntax:
18569 """""""
18570 This is an overloaded intrinsic.
18571
18572 ::
18573
18574       declare <16 x float>  @llvm.vp.frem.v16f32 (<16 x float> <left_op>, <16 x float> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
18575       declare <vscale x 4 x float>  @llvm.vp.frem.nxv4f32 (<vscale x 4 x float> <left_op>, <vscale x 4 x float> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
18576       declare <256 x double>  @llvm.vp.frem.v256f64 (<256 x double> <left_op>, <256 x double> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
18577
18578 Overview:
18579 """""""""
18580
18581 Predicated floating-point remainder of two vectors of floating-point values.
18582
18583
18584 Arguments:
18585 """"""""""
18586
The first two operands and the result have the same floating-point vector type.
The third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.
18591
18592 Semantics:
18593 """"""""""
18594
The '``llvm.vp.frem``' intrinsic performs floating-point remainder (:ref:`frem <i_frem>`)
of the first and second vector operands on each enabled lane.  The result on
disabled lanes is undefined.  The operation is performed in the default
floating-point environment.
18599
18600 Examples:
18601 """""""""
18602
18603 .. code-block:: llvm
18604
18605       %r = call <4 x float> @llvm.vp.frem.v4f32(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 %evl)
18606       ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
18607
18608       %t = frem <4 x float> %a, %b
18609       %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> undef
18610
18611
18612
18613 .. _int_vp_reduce_add:
18614
18615 '``llvm.vp.reduce.add.*``' Intrinsics
18616 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18617
18618 Syntax:
18619 """""""
18620 This is an overloaded intrinsic.
18621
18622 ::
18623
18624       declare i32 @llvm.vp.reduce.add.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
18625       declare i16 @llvm.vp.reduce.add.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
18626
18627 Overview:
18628 """""""""
18629
18630 Predicated integer ``ADD`` reduction of a vector and a scalar starting value,
18631 returning the result as a scalar.
18632
18633 Arguments:
18634 """"""""""
18635
18636 The first operand is the start value of the reduction, which must be a scalar
18637 integer type equal to the result type. The second operand is the vector on
18638 which the reduction is performed and must be a vector of integer values whose
18639 element type is the result/start type. The third operand is the vector mask and
18640 is a vector of boolean values with the same number of elements as the vector
18641 operand. The fourth operand is the explicit vector length of the operation.
18642
18643 Semantics:
18644 """"""""""
18645
18646 The '``llvm.vp.reduce.add``' intrinsic performs the integer ``ADD`` reduction
18647 (:ref:`llvm.vector.reduce.add <int_vector_reduce_add>`) of the vector operand
18648 ``val`` on each enabled lane, adding it to the scalar ``start_value``. Disabled
18649 lanes are treated as containing the neutral value ``0`` (i.e. having no effect
18650 on the reduction operation). If the vector length is zero, the result is equal
18651 to ``start_value``.
18652
To ignore the start value, pass the neutral value ``0`` as ``start_value``.
18654
18655 Examples:
18656 """""""""
18657
18658 .. code-block:: llvm
18659
18660       %r = call i32 @llvm.vp.reduce.add.v4i32(i32 %start, <4 x i32> %a, <4 x i1> %mask, i32 %evl)
18661       ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
18662       ; are treated as though %mask were false for those lanes.
18663
18664       %masked.a = select <4 x i1> %mask, <4 x i32> %a, <4 x i32> zeroinitializer
18665       %reduction = call i32 @llvm.vector.reduce.add.v4i32(<4 x i32> %masked.a)
18666       %also.r = add i32 %reduction, %start
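
The start value also makes it possible to chain reductions, for example when a
longer array is processed in several vector-sized chunks. The sketch below is
illustrative only (the function and value names are not part of LLVM): it
reduces a first chunk with the neutral start value ``0`` and feeds the partial
sum into the reduction of a second chunk.

.. code-block:: llvm

      declare i32 @llvm.vp.reduce.add.v4i32(i32, <4 x i32>, <4 x i1>, i32)

      define i32 @sum_two_chunks(<4 x i32> %lo, <4 x i32> %hi,
                                 <4 x i1> %mask.lo, <4 x i1> %mask.hi,
                                 i32 %evl.lo, i32 %evl.hi) {
        ; Start from the neutral value 0, so only enabled lanes of %lo contribute.
        %partial = call i32 @llvm.vp.reduce.add.v4i32(i32 0, <4 x i32> %lo, <4 x i1> %mask.lo, i32 %evl.lo)
        ; Carry the partial sum into the second reduction as its start value.
        %total = call i32 @llvm.vp.reduce.add.v4i32(i32 %partial, <4 x i32> %hi, <4 x i1> %mask.hi, i32 %evl.hi)
        ret i32 %total
      }

If ``%evl.hi`` is zero, the second call returns ``%partial`` unchanged, which
is convenient for a final, possibly empty, chunk.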
18667
18668
18669 .. _int_vp_reduce_fadd:
18670
18671 '``llvm.vp.reduce.fadd.*``' Intrinsics
18672 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18673
18674 Syntax:
18675 """""""
18676 This is an overloaded intrinsic.
18677
18678 ::
18679
18680       declare float @llvm.vp.reduce.fadd.v4f32(float <start_value>, <4 x float> <val>, <4 x i1> <mask>, i32 <vector_length>)
18681       declare double @llvm.vp.reduce.fadd.nxv8f64(double <start_value>, <vscale x 8 x double> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
18682
18683 Overview:
18684 """""""""
18685
18686 Predicated floating-point ``ADD`` reduction of a vector and a scalar starting
18687 value, returning the result as a scalar.
18688
18689 Arguments:
18690 """"""""""
18691
18692 The first operand is the start value of the reduction, which must be a scalar
18693 floating-point type equal to the result type. The second operand is the vector
18694 on which the reduction is performed and must be a vector of floating-point
18695 values whose element type is the result/start type. The third operand is the
18696 vector mask and is a vector of boolean values with the same number of elements
18697 as the vector operand. The fourth operand is the explicit vector length of the
18698 operation.
18699
18700 Semantics:
18701 """"""""""
18702
18703 The '``llvm.vp.reduce.fadd``' intrinsic performs the floating-point ``ADD``
18704 reduction (:ref:`llvm.vector.reduce.fadd <int_vector_reduce_fadd>`) of the
18705 vector operand ``val`` on each enabled lane, adding it to the scalar
18706 ``start_value``. Disabled lanes are treated as containing the neutral value
18707 ``-0.0`` (i.e. having no effect on the reduction operation). If no lanes are
18708 enabled, the resulting value will be equal to ``start_value``.
18709
To ignore the start value, pass the neutral value ``-0.0`` as ``start_value``.
18711
18712 See the unpredicated version (:ref:`llvm.vector.reduce.fadd
18713 <int_vector_reduce_fadd>`) for more detail on the semantics of the reduction.
18714
18715 Examples:
18716 """""""""
18717
18718 .. code-block:: llvm
18719
18720       %r = call float @llvm.vp.reduce.fadd.v4f32(float %start, <4 x float> %a, <4 x i1> %mask, i32 %evl)
18721       ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
18722       ; are treated as though %mask were false for those lanes.
18723
18724       %masked.a = select <4 x i1> %mask, <4 x float> %a, <4 x float> <float -0.0, float -0.0, float -0.0, float -0.0>
18725       %also.r = call float @llvm.vector.reduce.fadd.v4f32(float %start, <4 x float> %masked.a)
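
The neutral value is ``-0.0`` rather than ``+0.0`` because adding ``-0.0``
leaves every value unchanged, whereas adding ``+0.0`` would turn a ``-0.0``
sum into ``+0.0``. The following sketch sums only the enabled lanes by passing
``-0.0`` as the start value. The second function additionally sets the
``reassoc`` flag, assuming that it relaxes the evaluation order in the same way
as described for the unpredicated ``llvm.vector.reduce.fadd``. The function
names are hypothetical.

.. code-block:: llvm

      declare float @llvm.vp.reduce.fadd.v4f32(float, <4 x float>, <4 x i1>, i32)

      ; Ordered sum of the enabled lanes; the start value -0.0 has no effect.
      define float @fsum_active(<4 x float> %a, <4 x i1> %mask, i32 %evl) {
        %r = call float @llvm.vp.reduce.fadd.v4f32(float -0.0, <4 x float> %a, <4 x i1> %mask, i32 %evl)
        ret float %r
      }

      ; Same sum, but with reassociation permitted (assumption: the flag is
      ; honored as for the unpredicated reduction).
      define float @fsum_active_reassoc(<4 x float> %a, <4 x i1> %mask, i32 %evl) {
        %r = call reassoc float @llvm.vp.reduce.fadd.v4f32(float -0.0, <4 x float> %a, <4 x i1> %mask, i32 %evl)
        ret float %r
      }

With no lanes enabled, both functions return ``-0.0``, the start value.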
18726
18727
18728 .. _int_vp_reduce_mul:
18729
18730 '``llvm.vp.reduce.mul.*``' Intrinsics
18731 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18732
18733 Syntax:
18734 """""""
18735 This is an overloaded intrinsic.
18736
18737 ::
18738
18739       declare i32 @llvm.vp.reduce.mul.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
18740       declare i16 @llvm.vp.reduce.mul.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
18741
18742 Overview:
18743 """""""""
18744
18745 Predicated integer ``MUL`` reduction of a vector and a scalar starting value,
18746 returning the result as a scalar.
18747
18748
18749 Arguments:
18750 """"""""""
18751
18752 The first operand is the start value of the reduction, which must be a scalar
18753 integer type equal to the result type. The second operand is the vector on
18754 which the reduction is performed and must be a vector of integer values whose
18755 element type is the result/start type. The third operand is the vector mask and
18756 is a vector of boolean values with the same number of elements as the vector
18757 operand. The fourth operand is the explicit vector length of the operation.
18758
18759 Semantics:
18760 """"""""""
18761
18762 The '``llvm.vp.reduce.mul``' intrinsic performs the integer ``MUL`` reduction
18763 (:ref:`llvm.vector.reduce.mul <int_vector_reduce_mul>`) of the vector operand ``val``
18764 on each enabled lane, multiplying it by the scalar ``start_value``. Disabled
18765 lanes are treated as containing the neutral value ``1`` (i.e. having no effect
18766 on the reduction operation). If the vector length is zero, the result is the
18767 start value.
18768
To ignore the start value, pass the neutral value ``1`` as ``start_value``.
18770
18771 Examples:
18772 """""""""
18773
18774 .. code-block:: llvm
18775
18776       %r = call i32 @llvm.vp.reduce.mul.v4i32(i32 %start, <4 x i32> %a, <4 x i1> %mask, i32 %evl)
18777       ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
18778       ; are treated as though %mask were false for those lanes.
18779
18780       %masked.a = select <4 x i1> %mask, <4 x i32> %a, <4 x i32> <i32 1, i32 1, i32 1, i32 1>
18781       %reduction = call i32 @llvm.vector.reduce.mul.v4i32(<4 x i32> %masked.a)
18782       %also.r = mul i32 %reduction, %start
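
As a short sketch of the scalable form listed in the syntax section, the
following function multiplies the enabled lanes of a ``<vscale x 8 x i16>``
vector and passes the neutral value ``1`` so that the start value does not
contribute. The name ``@product_of_active`` is illustrative only.

.. code-block:: llvm

      declare i16 @llvm.vp.reduce.mul.nxv8i16(i16, <vscale x 8 x i16>, <vscale x 8 x i1>, i32)

      define i16 @product_of_active(<vscale x 8 x i16> %v, <vscale x 8 x i1> %mask, i32 %evl) {
        ; The start value 1 is neutral for multiplication.
        %r = call i16 @llvm.vp.reduce.mul.nxv8i16(i16 1, <vscale x 8 x i16> %v, <vscale x 8 x i1> %mask, i32 %evl)
        ret i16 %r
      }
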
18783
18784 .. _int_vp_reduce_fmul:
18785
18786 '``llvm.vp.reduce.fmul.*``' Intrinsics
18787 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18788
18789 Syntax:
18790 """""""
18791 This is an overloaded intrinsic.
18792
18793 ::
18794
18795       declare float @llvm.vp.reduce.fmul.v4f32(float <start_value>, <4 x float> <val>, <4 x i1> <mask>, i32 <vector_length>)
18796       declare double @llvm.vp.reduce.fmul.nxv8f64(double <start_value>, <vscale x 8 x double> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
18797
18798 Overview:
18799 """""""""
18800
18801 Predicated floating-point ``MUL`` reduction of a vector and a scalar starting
18802 value, returning the result as a scalar.
18803
18804
18805 Arguments:
18806 """"""""""
18807
18808 The first operand is the start value of the reduction, which must be a scalar
18809 floating-point type equal to the result type. The second operand is the vector
18810 on which the reduction is performed and must be a vector of floating-point
18811 values whose element type is the result/start type. The third operand is the
18812 vector mask and is a vector of boolean values with the same number of elements
18813 as the vector operand. The fourth operand is the explicit vector length of the
18814 operation.
18815
18816 Semantics:
18817 """"""""""
18818
The '``llvm.vp.reduce.fmul``' intrinsic performs the floating-point ``MUL``
reduction (:ref:`llvm.vector.reduce.fmul <int_vector_reduce_fmul>`) of the
vector operand ``val`` on each enabled lane, multiplying it by the scalar
``start_value``. Disabled lanes are treated as containing the neutral value
``1.0`` (i.e. having no effect on the reduction operation). If no lanes are
enabled, the resulting value will be equal to the starting value.
18825
To ignore the start value, pass the neutral value ``1.0`` as ``start_value``.
18827
18828 See the unpredicated version (:ref:`llvm.vector.reduce.fmul
18829 <int_vector_reduce_fmul>`) for more detail on the semantics.
18830
18831 Examples:
18832 """""""""
18833
18834 .. code-block:: llvm
18835
18836       %r = call float @llvm.vp.reduce.fmul.v4f32(float %start, <4 x float> %a, <4 x i1> %mask, i32 %evl)
18837       ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
18838       ; are treated as though %mask were false for those lanes.
18839
18840       %masked.a = select <4 x i1> %mask, <4 x float> %a, <4 x float> <float 1.0, float 1.0, float 1.0, float 1.0>
18841       %also.r = call float @llvm.vector.reduce.fmul.v4f32(float %start, <4 x float> %masked.a)
18842
18843
18844 .. _int_vp_reduce_and:
18845
18846 '``llvm.vp.reduce.and.*``' Intrinsics
18847 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18848
18849 Syntax:
18850 """""""
18851 This is an overloaded intrinsic.
18852
18853 ::
18854
18855       declare i32 @llvm.vp.reduce.and.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
18856       declare i16 @llvm.vp.reduce.and.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
18857
18858 Overview:
18859 """""""""
18860
18861 Predicated integer ``AND`` reduction of a vector and a scalar starting value,
18862 returning the result as a scalar.
18863
18864
18865 Arguments:
18866 """"""""""
18867
18868 The first operand is the start value of the reduction, which must be a scalar
18869 integer type equal to the result type. The second operand is the vector on
18870 which the reduction is performed and must be a vector of integer values whose
18871 element type is the result/start type. The third operand is the vector mask and
18872 is a vector of boolean values with the same number of elements as the vector
18873 operand. The fourth operand is the explicit vector length of the operation.
18874
18875 Semantics:
18876 """"""""""
18877
The '``llvm.vp.reduce.and``' intrinsic performs the integer ``AND`` reduction
(:ref:`llvm.vector.reduce.and <int_vector_reduce_and>`) of the vector operand
``val`` on each enabled lane, performing an '``and``' of that with the
scalar ``start_value``. Disabled lanes are treated as containing the neutral
value ``UINT_MAX``, or ``-1`` (i.e. having no effect on the reduction
operation). If the vector length is zero, the result is the start value.
18884
To ignore the start value, pass the neutral value ``-1`` as ``start_value``.
18886
18887 Examples:
18888 """""""""
18889
18890 .. code-block:: llvm
18891
18892       %r = call i32 @llvm.vp.reduce.and.v4i32(i32 %start, <4 x i32> %a, <4 x i1> %mask, i32 %evl)
18893       ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
18894       ; are treated as though %mask were false for those lanes.
18895
18896       %masked.a = select <4 x i1> %mask, <4 x i32> %a, <4 x i32> <i32 -1, i32 -1, i32 -1, i32 -1>
18897       %reduction = call i32 @llvm.vector.reduce.and.v4i32(<4 x i32> %masked.a)
18898       %also.r = and i32 %reduction, %start
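
Because ``i1`` is an integer type, the reduction can also be applied to a
vector of comparison results to ask whether a condition holds on all enabled
lanes. The sketch below is illustrative (the function name is hypothetical) and
uses ``true``, which is the all-ones neutral value for ``i1``, as the start
value.

.. code-block:: llvm

      declare i1 @llvm.vp.reduce.and.v4i1(i1, <4 x i1>, <4 x i1>, i32)

      define i1 @all_active_positive(<4 x i32> %a, <4 x i1> %mask, i32 %evl) {
        ; Lane-wise test: is the element strictly greater than zero?
        %is.pos = icmp sgt <4 x i32> %a, zeroinitializer
        ; AND the test results of the enabled lanes; disabled lanes count as true.
        %all = call i1 @llvm.vp.reduce.and.v4i1(i1 true, <4 x i1> %is.pos, <4 x i1> %mask, i32 %evl)
        ret i1 %all
      }

Note that the function returns ``true`` when no lane is enabled, since the
result is then the start value.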
18899
18900
18901 .. _int_vp_reduce_or:
18902
18903 '``llvm.vp.reduce.or.*``' Intrinsics
18904 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18905
18906 Syntax:
18907 """""""
18908 This is an overloaded intrinsic.
18909
18910 ::
18911
18912       declare i32 @llvm.vp.reduce.or.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
18913       declare i16 @llvm.vp.reduce.or.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
18914
18915 Overview:
18916 """""""""
18917
18918 Predicated integer ``OR`` reduction of a vector and a scalar starting value,
18919 returning the result as a scalar.
18920
18921
18922 Arguments:
18923 """"""""""
18924
18925 The first operand is the start value of the reduction, which must be a scalar
18926 integer type equal to the result type. The second operand is the vector on
18927 which the reduction is performed and must be a vector of integer values whose
18928 element type is the result/start type. The third operand is the vector mask and
18929 is a vector of boolean values with the same number of elements as the vector
18930 operand. The fourth operand is the explicit vector length of the operation.
18931
18932 Semantics:
18933 """"""""""
18934
18935 The '``llvm.vp.reduce.or``' intrinsic performs the integer ``OR`` reduction
18936 (:ref:`llvm.vector.reduce.or <int_vector_reduce_or>`) of the vector operand
18937 ``val`` on each enabled lane, performing an '``or``' of that with the scalar
18938 ``start_value``. Disabled lanes are treated as containing the neutral value
18939 ``0`` (i.e. having no effect on the reduction operation). If the vector length
18940 is zero, the result is the start value.
18941
To ignore the start value, pass the neutral value ``0`` as ``start_value``.
18943
18944 Examples:
18945 """""""""
18946
18947 .. code-block:: llvm
18948
18949       %r = call i32 @llvm.vp.reduce.or.v4i32(i32 %start, <4 x i32> %a, <4 x i1> %mask, i32 %evl)
18950       ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
18951       ; are treated as though %mask were false for those lanes.
18952
18953       %masked.a = select <4 x i1> %mask, <4 x i32> %a, <4 x i32> <i32 0, i32 0, i32 0, i32 0>
18954       %reduction = call i32 @llvm.vector.reduce.or.v4i32(<4 x i32> %masked.a)
18955       %also.r = or i32 %reduction, %start
18956
18957 .. _int_vp_reduce_xor:
18958
18959 '``llvm.vp.reduce.xor.*``' Intrinsics
18960 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18961
18962 Syntax:
18963 """""""
18964 This is an overloaded intrinsic.
18965
18966 ::
18967
18968       declare i32 @llvm.vp.reduce.xor.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
18969       declare i16 @llvm.vp.reduce.xor.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
18970
18971 Overview:
18972 """""""""
18973
18974 Predicated integer ``XOR`` reduction of a vector and a scalar starting value,
18975 returning the result as a scalar.
18976
18977
18978 Arguments:
18979 """"""""""
18980
18981 The first operand is the start value of the reduction, which must be a scalar
18982 integer type equal to the result type. The second operand is the vector on
18983 which the reduction is performed and must be a vector of integer values whose
18984 element type is the result/start type. The third operand is the vector mask and
18985 is a vector of boolean values with the same number of elements as the vector
18986 operand. The fourth operand is the explicit vector length of the operation.
18987
18988 Semantics:
18989 """"""""""
18990
18991 The '``llvm.vp.reduce.xor``' intrinsic performs the integer ``XOR`` reduction
18992 (:ref:`llvm.vector.reduce.xor <int_vector_reduce_xor>`) of the vector operand
18993 ``val`` on each enabled lane, performing an '``xor``' of that with the scalar
18994 ``start_value``. Disabled lanes are treated as containing the neutral value
18995 ``0`` (i.e. having no effect on the reduction operation). If the vector length
18996 is zero, the result is the start value.
18997
To ignore the start value, pass the neutral value ``0`` as ``start_value``.
18999
19000 Examples:
19001 """""""""
19002
19003 .. code-block:: llvm
19004
19005       %r = call i32 @llvm.vp.reduce.xor.v4i32(i32 %start, <4 x i32> %a, <4 x i1> %mask, i32 %evl)
19006       ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
19007       ; are treated as though %mask were false for those lanes.
19008
19009       %masked.a = select <4 x i1> %mask, <4 x i32> %a, <4 x i32> <i32 0, i32 0, i32 0, i32 0>
19010       %reduction = call i32 @llvm.vector.reduce.xor.v4i32(<4 x i32> %masked.a)
19011       %also.r = xor i32 %reduction, %start
19012
19013
19014 .. _int_vp_reduce_smax:
19015
19016 '``llvm.vp.reduce.smax.*``' Intrinsics
19017 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19018
19019 Syntax:
19020 """""""
19021 This is an overloaded intrinsic.
19022
19023 ::
19024
19025       declare i32 @llvm.vp.reduce.smax.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
19026       declare i16 @llvm.vp.reduce.smax.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
19027
19028 Overview:
19029 """""""""
19030
19031 Predicated signed-integer ``MAX`` reduction of a vector and a scalar starting
19032 value, returning the result as a scalar.
19033
19034
19035 Arguments:
19036 """"""""""
19037
19038 The first operand is the start value of the reduction, which must be a scalar
19039 integer type equal to the result type. The second operand is the vector on
19040 which the reduction is performed and must be a vector of integer values whose
19041 element type is the result/start type. The third operand is the vector mask and
19042 is a vector of boolean values with the same number of elements as the vector
19043 operand. The fourth operand is the explicit vector length of the operation.
19044
19045 Semantics:
19046 """"""""""
19047
The '``llvm.vp.reduce.smax``' intrinsic performs the signed-integer ``MAX``
reduction (:ref:`llvm.vector.reduce.smax <int_vector_reduce_smax>`) of the
vector operand ``val`` on each enabled lane, taking the maximum of that and the
scalar ``start_value``. Disabled lanes are treated as containing the neutral
value ``INT_MIN`` (i.e. having no effect on the reduction operation). If the
vector length is zero, the result is the start value.
19054
To ignore the start value, pass the neutral value ``INT_MIN`` as ``start_value``.
19056
19057 Examples:
19058 """""""""
19059
19060 .. code-block:: llvm
19061
19062       %r = call i8 @llvm.vp.reduce.smax.v4i8(i8 %start, <4 x i8> %a, <4 x i1> %mask, i32 %evl)
19063       ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
19064       ; are treated as though %mask were false for those lanes.
19065
19066       %masked.a = select <4 x i1> %mask, <4 x i8> %a, <4 x i8> <i8 -128, i8 -128, i8 -128, i8 -128>
19067       %reduction = call i8 @llvm.vector.reduce.smax.v4i8(<4 x i8> %masked.a)
19068       %also.r = call i8 @llvm.smax.i8(i8 %reduction, i8 %start)
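
The concrete value of ``INT_MIN`` depends on the element width: it is ``-128``
for ``i8``, as in the example above, and ``-2147483648`` for ``i32``. A minimal
sketch for the ``i32`` case, with a hypothetical function name, that ignores
the start value:

.. code-block:: llvm

      declare i32 @llvm.vp.reduce.smax.v4i32(i32, <4 x i32>, <4 x i1>, i32)

      define i32 @smax_of_active(<4 x i32> %a, <4 x i1> %mask, i32 %evl) {
        ; -2147483648 is the smallest signed i32 value and therefore neutral for smax.
        %r = call i32 @llvm.vp.reduce.smax.v4i32(i32 -2147483648, <4 x i32> %a, <4 x i1> %mask, i32 %evl)
        ret i32 %r
      }

When no lane is enabled the function returns ``-2147483648``, so a caller
cannot tell an empty reduction apart from one whose maximum is ``INT_MIN``.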
19069
19070
19071 .. _int_vp_reduce_smin:
19072
19073 '``llvm.vp.reduce.smin.*``' Intrinsics
19074 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19075
19076 Syntax:
19077 """""""
19078 This is an overloaded intrinsic.
19079
19080 ::
19081
19082       declare i32 @llvm.vp.reduce.smin.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
19083       declare i16 @llvm.vp.reduce.smin.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
19084
19085 Overview:
19086 """""""""
19087
19088 Predicated signed-integer ``MIN`` reduction of a vector and a scalar starting
19089 value, returning the result as a scalar.
19090
19091
19092 Arguments:
19093 """"""""""
19094
19095 The first operand is the start value of the reduction, which must be a scalar
19096 integer type equal to the result type. The second operand is the vector on
19097 which the reduction is performed and must be a vector of integer values whose
19098 element type is the result/start type. The third operand is the vector mask and
19099 is a vector of boolean values with the same number of elements as the vector
19100 operand. The fourth operand is the explicit vector length of the operation.
19101
19102 Semantics:
19103 """"""""""
19104
The '``llvm.vp.reduce.smin``' intrinsic performs the signed-integer ``MIN``
reduction (:ref:`llvm.vector.reduce.smin <int_vector_reduce_smin>`) of the
vector operand ``val`` on each enabled lane, taking the minimum of that and the
scalar ``start_value``. Disabled lanes are treated as containing the neutral
value ``INT_MAX`` (i.e. having no effect on the reduction operation). If the
vector length is zero, the result is the start value.
19111
To ignore the start value, pass the neutral value ``INT_MAX`` as ``start_value``.
19113
19114 Examples:
19115 """""""""
19116
19117 .. code-block:: llvm
19118
19119       %r = call i8 @llvm.vp.reduce.smin.v4i8(i8 %start, <4 x i8> %a, <4 x i1> %mask, i32 %evl)
19120       ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
19121       ; are treated as though %mask were false for those lanes.
19122
19123       %masked.a = select <4 x i1> %mask, <4 x i8> %a, <4 x i8> <i8 127, i8 127, i8 127, i8 127>
19124       %reduction = call i8 @llvm.vector.reduce.smin.v4i8(<4 x i8> %masked.a)
19125       %also.r = call i8 @llvm.smin.i8(i8 %reduction, i8 %start)
19126
19127
19128 .. _int_vp_reduce_umax:
19129
19130 '``llvm.vp.reduce.umax.*``' Intrinsics
19131 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19132
19133 Syntax:
19134 """""""
19135 This is an overloaded intrinsic.
19136
19137 ::
19138
19139       declare i32 @llvm.vp.reduce.umax.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
19140       declare i16 @llvm.vp.reduce.umax.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
19141
19142 Overview:
19143 """""""""
19144
19145 Predicated unsigned-integer ``MAX`` reduction of a vector and a scalar starting
19146 value, returning the result as a scalar.
19147
19148
19149 Arguments:
19150 """"""""""
19151
19152 The first operand is the start value of the reduction, which must be a scalar
19153 integer type equal to the result type. The second operand is the vector on
19154 which the reduction is performed and must be a vector of integer values whose
19155 element type is the result/start type. The third operand is the vector mask and
19156 is a vector of boolean values with the same number of elements as the vector
19157 operand. The fourth operand is the explicit vector length of the operation.
19158
19159 Semantics:
19160 """"""""""
19161
The '``llvm.vp.reduce.umax``' intrinsic performs the unsigned-integer ``MAX``
reduction (:ref:`llvm.vector.reduce.umax <int_vector_reduce_umax>`) of the
vector operand ``val`` on each enabled lane, taking the maximum of that and the
scalar ``start_value``. Disabled lanes are treated as containing the neutral
value ``0`` (i.e. having no effect on the reduction operation). If the vector
length is zero, the result is the start value.
19168
To ignore the start value, pass the neutral value ``0`` as ``start_value``.
19170
19171 Examples:
19172 """""""""
19173
19174 .. code-block:: llvm
19175
19176       %r = call i32 @llvm.vp.reduce.umax.v4i32(i32 %start, <4 x i32> %a, <4 x i1> %mask, i32 %evl)
19177       ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
19178       ; are treated as though %mask were false for those lanes.
19179
19180       %masked.a = select <4 x i1> %mask, <4 x i32> %a, <4 x i32> <i32 0, i32 0, i32 0, i32 0>
19181       %reduction = call i32 @llvm.vector.reduce.umax.v4i32(<4 x i32> %masked.a)
19182       %also.r = call i32 @llvm.umax.i32(i32 %reduction, i32 %start)
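
A non-neutral start value can serve as a lower bound on the result. The sketch
below, whose function and parameter names are illustrative only, returns the
unsigned maximum of the enabled lanes but never less than ``%floor``.

.. code-block:: llvm

      declare i32 @llvm.vp.reduce.umax.v4i32(i32, <4 x i32>, <4 x i1>, i32)

      define i32 @umax_with_floor(i32 %floor, <4 x i32> %a, <4 x i1> %mask, i32 %evl) {
        ; %floor takes part in the maximum, so the result is at least %floor.
        %r = call i32 @llvm.vp.reduce.umax.v4i32(i32 %floor, <4 x i32> %a, <4 x i1> %mask, i32 %evl)
        ret i32 %r
      }

If no lane is enabled, the function returns ``%floor``.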
19183
19184
19185 .. _int_vp_reduce_umin:
19186
19187 '``llvm.vp.reduce.umin.*``' Intrinsics
19188 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19189
19190 Syntax:
19191 """""""
19192 This is an overloaded intrinsic.
19193
19194 ::
19195
19196       declare i32 @llvm.vp.reduce.umin.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
19197       declare i16 @llvm.vp.reduce.umin.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
19198
19199 Overview:
19200 """""""""
19201
19202 Predicated unsigned-integer ``MIN`` reduction of a vector and a scalar starting
19203 value, returning the result as a scalar.
19204
19205
19206 Arguments:
19207 """"""""""
19208
19209 The first operand is the start value of the reduction, which must be a scalar
19210 integer type equal to the result type. The second operand is the vector on
19211 which the reduction is performed and must be a vector of integer values whose
19212 element type is the result/start type. The third operand is the vector mask and
19213 is a vector of boolean values with the same number of elements as the vector
19214 operand. The fourth operand is the explicit vector length of the operation.
19215
19216 Semantics:
19217 """"""""""
19218
19219 The '``llvm.vp.reduce.umin``' intrinsic performs the unsigned-integer ``MIN``
19220 reduction (:ref:`llvm.vector.reduce.umin <int_vector_reduce_umin>`) of the
19221 vector operand ``val`` on each enabled lane, taking the minimum of that and the
19222 scalar ``start_value``. Disabled lanes are treated as containing the neutral
19223 value ``UINT_MAX``, or ``-1`` (i.e. having no effect on the reduction
19224 operation). If the vector length is zero, the result is the start value.
19225
19226 To ignore the start value, the neutral value can be used.
19227
19228 Examples:
19229 """""""""
19230
19231 .. code-block:: llvm
19232
19233       %r = call i32 @llvm.vp.reduce.umin.v4i32(i32 %start, <4 x i32> %a, <4 x i1> %mask, i32 %evl)
19234       ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
19235       ; are treated as though %mask were false for those lanes.
19236
19237       %masked.a = select <4 x i1> %mask, <4 x i32> %a, <4 x i32> <i32 -1, i32 -1, i32 -1, i32 -1>
19238       %reduction = call i32 @llvm.vector.reduce.umin.v4i32(<4 x i32> %masked.a)
19239       %also.r = call i32 @llvm.umin.i32(i32 %reduction, i32 %start)
19240
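As an informal illustration, the predicated reduction can be modelled in scalar
C. The sketch below is not part of LLVM: the function name
``vp_reduce_umin_u32`` and the ``num_lanes`` parameter are hypothetical, and it
assumes that ``evl`` does not exceed the number of lanes.

.. code-block:: c

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    /* Hypothetical scalar model of llvm.vp.reduce.umin on i32 lanes.
     * A lane takes part only if its index is below evl and its mask bit is
     * set; skipping a lane has the same effect as feeding in the neutral
     * value UINT32_MAX. */
    uint32_t vp_reduce_umin_u32(uint32_t start, const uint32_t *val,
                                const bool *mask, size_t num_lanes,
                                uint32_t evl)
    {
        uint32_t acc = start;
        for (size_t i = 0; i < num_lanes && i < evl; ++i) {
            if (mask[i] && val[i] < acc)
                acc = val[i];
        }
        return acc;
    }

With ``evl`` equal to zero the loop body never runs, so the model returns the
start value, matching the rule stated in the semantics above.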
19241
19242 .. _int_vp_reduce_fmax:
19243
19244 '``llvm.vp.reduce.fmax.*``' Intrinsics
19245 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19246
19247 Syntax:
19248 """""""
19249 This is an overloaded intrinsic.
19250
19251 ::
19252
      declare float @llvm.vp.reduce.fmax.v4f32(float <start_value>, <4 x float> <val>, <4 x i1> <mask>, i32 <vector_length>)
19254       declare double @llvm.vp.reduce.fmax.nxv8f64(double <start_value>, <vscale x 8 x double> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
19255
19256 Overview:
19257 """""""""
19258
19259 Predicated floating-point ``MAX`` reduction of a vector and a scalar starting
19260 value, returning the result as a scalar.
19261
19262
19263 Arguments:
19264 """"""""""
19265
19266 The first operand is the start value of the reduction, which must be a scalar
19267 floating-point type equal to the result type. The second operand is the vector
19268 on which the reduction is performed and must be a vector of floating-point
19269 values whose element type is the result/start type. The third operand is the
19270 vector mask and is a vector of boolean values with the same number of elements
19271 as the vector operand. The fourth operand is the explicit vector length of the
19272 operation.
19273
19274 Semantics:
19275 """"""""""
19276
19277 The '``llvm.vp.reduce.fmax``' intrinsic performs the floating-point ``MAX``
19278 reduction (:ref:`llvm.vector.reduce.fmax <int_vector_reduce_fmax>`) of the
19279 vector operand ``val`` on each enabled lane, taking the maximum of that and the
19280 scalar ``start_value``. Disabled lanes are treated as containing the neutral
19281 value (i.e. having no effect on the reduction operation). If the vector length
19282 is zero, the result is the start value.
19283
19284 The neutral value is dependent on the :ref:`fast-math flags <fastmath>`. If no
19285 flags are set, the neutral value is ``-QNAN``. If ``nnan``  and ``ninf`` are
19286 both set, then the neutral value is the smallest floating-point value for the
19287 result type. If only ``nnan`` is set then the neutral value is ``-Infinity``.
19288
19289 This instruction has the same comparison semantics as the
19290 :ref:`llvm.vector.reduce.fmax <int_vector_reduce_fmax>` intrinsic (and thus the
19291 '``llvm.maxnum.*``' intrinsic). That is, the result will always be a number
19292 unless all elements of the vector and the starting value are ``NaN``. For a
19293 vector with maximum element magnitude ``0.0`` and containing both ``+0.0`` and
19294 ``-0.0`` elements, the sign of the result is unspecified.
19295
19296 To ignore the start value, the neutral value can be used.
19297
19298 Examples:
19299 """""""""
19300
19301 .. code-block:: llvm
19302
      %r = call float @llvm.vp.reduce.fmax.v4f32(float %start, <4 x float> %a, <4 x i1> %mask, i32 %evl)
19304       ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
19305       ; are treated as though %mask were false for those lanes.
19306
19307       %masked.a = select <4 x i1> %mask, <4 x float> %a, <4 x float> <float QNAN, float QNAN, float QNAN, float QNAN>
19308       %reduction = call float @llvm.vector.reduce.fmax.v4f32(<4 x float> %masked.a)
19309       %also.r = call float @llvm.maxnum.f32(float %reduction, float %start)
19310
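The comparison semantics can be sketched in scalar C. This is an illustrative
model only, assuming default (non fast-math) floating-point behaviour: the
helper ``maxnum_f32``, the function ``vp_reduce_fmax_f32`` and the
``num_lanes`` parameter are hypothetical names, and ``evl`` is assumed not to
exceed the number of lanes.

.. code-block:: c

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    /* maxnum-style maximum: if exactly one input is NaN, return the other. */
    static float maxnum_f32(float x, float y)
    {
        if (x != x)
            return y; /* x is NaN */
        if (y != y)
            return x; /* y is NaN */
        return x > y ? x : y;
    }

    /* Hypothetical scalar model of llvm.vp.reduce.fmax on float lanes.
     * Disabled lanes are skipped, which has the same effect as substituting
     * a quiet NaN for them. */
    float vp_reduce_fmax_f32(float start, const float *val, const bool *mask,
                             size_t num_lanes, uint32_t evl)
    {
        float acc = start;
        for (size_t i = 0; i < num_lanes && i < evl; ++i) {
            if (mask[i])
                acc = maxnum_f32(acc, val[i]);
        }
        return acc;
    }

In this model the result is ``NaN`` only when the start value and every enabled
lane are ``NaN``, which is why a quiet ``NaN`` start value is ignored.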
19311
19312 .. _int_vp_reduce_fmin:
19313
19314 '``llvm.vp.reduce.fmin.*``' Intrinsics
19315 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19316
19317 Syntax:
19318 """""""
19319 This is an overloaded intrinsic.
19320
19321 ::
19322
      declare float @llvm.vp.reduce.fmin.v4f32(float <start_value>, <4 x float> <val>, <4 x i1> <mask>, i32 <vector_length>)
19324       declare double @llvm.vp.reduce.fmin.nxv8f64(double <start_value>, <vscale x 8 x double> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
19325
19326 Overview:
19327 """""""""
19328
19329 Predicated floating-point ``MIN`` reduction of a vector and a scalar starting
19330 value, returning the result as a scalar.
19331
19332
19333 Arguments:
19334 """"""""""
19335
19336 The first operand is the start value of the reduction, which must be a scalar
19337 floating-point type equal to the result type. The second operand is the vector
19338 on which the reduction is performed and must be a vector of floating-point
19339 values whose element type is the result/start type. The third operand is the
19340 vector mask and is a vector of boolean values with the same number of elements
19341 as the vector operand. The fourth operand is the explicit vector length of the
19342 operation.
19343
19344 Semantics:
19345 """"""""""
19346
19347 The '``llvm.vp.reduce.fmin``' intrinsic performs the floating-point ``MIN``
19348 reduction (:ref:`llvm.vector.reduce.fmin <int_vector_reduce_fmin>`) of the
19349 vector operand ``val`` on each enabled lane, taking the minimum of that and the
19350 scalar ``start_value``. Disabled lanes are treated as containing the neutral
19351 value (i.e. having no effect on the reduction operation). If the vector length
19352 is zero, the result is the start value.
19353
19354 The neutral value is dependent on the :ref:`fast-math flags <fastmath>`. If no
19355 flags are set, the neutral value is ``+QNAN``. If ``nnan``  and ``ninf`` are
19356 both set, then the neutral value is the largest floating-point value for the
19357 result type. If only ``nnan`` is set then the neutral value is ``+Infinity``.
19358
19359 This instruction has the same comparison semantics as the
19360 :ref:`llvm.vector.reduce.fmin <int_vector_reduce_fmin>` intrinsic (and thus the
19361 '``llvm.minnum.*``' intrinsic). That is, the result will always be a number
19362 unless all elements of the vector and the starting value are ``NaN``. For a
19363 vector with maximum element magnitude ``0.0`` and containing both ``+0.0`` and
19364 ``-0.0`` elements, the sign of the result is unspecified.
19365
19366 To ignore the start value, the neutral value can be used.
19367
19368 Examples:
19369 """""""""
19370
19371 .. code-block:: llvm
19372
19373       %r = call float @llvm.vp.reduce.fmin.v4f32(float %start, <4 x float> %a, <4 x i1> %mask, i32 %evl)
19374       ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
19375       ; are treated as though %mask were false for those lanes.
19376
19377       %masked.a = select <4 x i1> %mask, <4 x float> %a, <4 x float> <float QNAN, float QNAN, float QNAN, float QNAN>
19378       %reduction = call float @llvm.vector.reduce.fmin.v4f32(<4 x float> %masked.a)
19379       %also.r = call float @llvm.minnum.f32(float %reduction, float %start)
19380
19381
19382 .. _int_get_active_lane_mask:
19383
19384 '``llvm.get.active.lane.mask.*``' Intrinsics
19385 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19386
19387 Syntax:
19388 """""""
19389 This is an overloaded intrinsic.
19390
19391 ::
19392
19393       declare <4 x i1> @llvm.get.active.lane.mask.v4i1.i32(i32 %base, i32 %n)
19394       declare <8 x i1> @llvm.get.active.lane.mask.v8i1.i64(i64 %base, i64 %n)
19395       declare <16 x i1> @llvm.get.active.lane.mask.v16i1.i64(i64 %base, i64 %n)
19396       declare <vscale x 16 x i1> @llvm.get.active.lane.mask.nxv16i1.i64(i64 %base, i64 %n)
19397
19398
19399 Overview:
19400 """""""""
19401
19402 Create a mask representing active and inactive vector lanes.
19403
19404
19405 Arguments:
19406 """"""""""
19407
19408 Both operands have the same scalar integer type. The result is a vector with
19409 the i1 element type.
19410
19411 Semantics:
19412 """"""""""
19413
19414 The '``llvm.get.active.lane.mask.*``' intrinsics are semantically equivalent
19415 to:
19416
19417 ::
19418
19419       %m[i] = icmp ult (%base + i), %n
19420
19421 where ``%m`` is a vector (mask) of active/inactive lanes with its elements
19422 indexed by ``i``,  and ``%base``, ``%n`` are the two arguments to
``llvm.get.active.lane.mask.*``, ``icmp`` is an integer compare and ``ult`` is
the unsigned less-than comparison operator.  Overflow cannot occur in
19425 ``(%base + i)`` and its comparison against ``%n`` as it is performed in integer
19426 numbers and not in machine numbers.  If ``%n`` is ``0``, then the result is a
19427 poison value. The above is equivalent to:
19428
19429 ::
19430
19431       %m = @llvm.get.active.lane.mask(%base, %n)
19432
19433 This can, for example, be emitted by the loop vectorizer in which case
19434 ``%base`` is the first element of the vector induction variable (VIV) and
19435 ``%n`` is the loop tripcount. Thus, these intrinsics perform an element-wise
19436 less than comparison of VIV with the loop tripcount, producing a mask of
19437 true/false values representing active/inactive vector lanes, except if the VIV
19438 overflows in which case they return false in the lanes where the VIV overflows.
The arguments are scalar types to accommodate scalable vector types, for which
the type of a step vector that enumerates the lanes without overflow is
unknown.
19442
19443 This mask ``%m`` can e.g. be used in masked load/store instructions. These
19444 intrinsics provide a hint to the backend. I.e., for a vector loop, the
19445 back-edge taken count of the original scalar loop is explicit as the second
19446 argument.
19447
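The overflow-free comparison can be illustrated with a scalar C model. This is
a sketch under stated assumptions, not LLVM code: the name
``get_active_lane_mask_u32`` and the ``num_lanes`` parameter are hypothetical,
the arguments are taken to be 32-bit, and the sum is widened to 64 bits so that
``base + i`` cannot wrap.

.. code-block:: c

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    /* Hypothetical model: m[i] = (base + i) <u n, evaluated without
     * wraparound. Per the text above, the real intrinsic yields poison when
     * n is 0; this model simply produces an all-false mask in that case. */
    void get_active_lane_mask_u32(uint32_t base, uint32_t n, bool *m,
                                  size_t num_lanes)
    {
        for (size_t i = 0; i < num_lanes; ++i) {
            uint64_t index = (uint64_t)base + (uint64_t)i;
            m[i] = index < (uint64_t)n;
        }
    }

When ``base`` is close to the maximum 32-bit value, lanes whose index would
wrap in 32-bit arithmetic come out false, matching the description of the
overflowing VIV lanes.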
19448
19449 Examples:
19450 """""""""
19451
19452 .. code-block:: llvm
19453
19454       %active.lane.mask = call <4 x i1> @llvm.get.active.lane.mask.v4i1.i64(i64 %elem0, i64 429)
19455       %wide.masked.load = call <4 x i32> @llvm.masked.load.v4i32.p0v4i32(<4 x i32>* %3, i32 4, <4 x i1> %active.lane.mask, <4 x i32> undef)
19456
19457
19458 .. _int_mload_mstore:
19459
19460 Masked Vector Load and Store Intrinsics
19461 ---------------------------------------
19462
19463 LLVM provides intrinsics for predicated vector load and store operations. The predicate is specified by a mask operand, which holds one bit per vector element, switching the associated vector lane on or off. The memory addresses corresponding to the "off" lanes are not accessed. When all bits of the mask are on, the intrinsic is identical to a regular vector load or store. When all bits are off, no memory is accessed.
19464
19465 .. _int_mload:
19466
19467 '``llvm.masked.load.*``' Intrinsics
19468 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19469
19470 Syntax:
19471 """""""
19472 This is an overloaded intrinsic. The loaded data is a vector of any integer, floating-point or pointer data type.
19473
19474 ::
19475
19476       declare <16 x float>  @llvm.masked.load.v16f32.p0v16f32 (<16 x float>* <ptr>, i32 <alignment>, <16 x i1> <mask>, <16 x float> <passthru>)
19477       declare <2 x double>  @llvm.masked.load.v2f64.p0v2f64  (<2 x double>* <ptr>, i32 <alignment>, <2 x i1>  <mask>, <2 x double> <passthru>)
19478       ;; The data is a vector of pointers to double
19479       declare <8 x double*> @llvm.masked.load.v8p0f64.p0v8p0f64    (<8 x double*>* <ptr>, i32 <alignment>, <8 x i1> <mask>, <8 x double*> <passthru>)
19480       ;; The data is a vector of function pointers
19481       declare <8 x i32 ()*> @llvm.masked.load.v8p0f_i32f.p0v8p0f_i32f (<8 x i32 ()*>* <ptr>, i32 <alignment>, <8 x i1> <mask>, <8 x i32 ()*> <passthru>)
19482
19483 Overview:
19484 """""""""
19485
19486 Reads a vector from memory according to the provided mask. The mask holds a bit for each vector lane, and is used to prevent memory accesses to the masked-off lanes. The masked-off lanes in the result vector are taken from the corresponding lanes of the '``passthru``' operand.
19487
19488
19489 Arguments:
19490 """"""""""
19491
19492 The first operand is the base pointer for the load. The second operand is the alignment of the source location. It must be a power of two constant integer value. The third operand, mask, is a vector of boolean values with the same number of elements as the return type. The fourth is a pass-through value that is used to fill the masked-off lanes of the result. The return type, underlying type of the base pointer and the type of the '``passthru``' operand are the same vector types.
19493
19494 Semantics:
19495 """"""""""
19496
19497 The '``llvm.masked.load``' intrinsic is designed for conditional reading of selected vector elements in a single IR operation. It is useful for targets that support vector masked loads and allows vectorizing predicated basic blocks on these targets. Other targets may support this intrinsic differently, for example by lowering it into a sequence of branches that guard scalar load operations.
19498 The result of this operation is equivalent to a regular vector load instruction followed by a 'select' between the loaded and the passthru values, predicated on the same mask. However, using this intrinsic prevents exceptions on memory access to masked-off lanes.
19499
19500
19501 ::
19502
       %res = call <16 x float> @llvm.masked.load.v16f32.p0v16f32 (<16 x float>* %ptr, i32 4, <16 x i1> %mask, <16 x float> %passthru)
19504
19505        ;; The result of the two following instructions is identical aside from potential memory access exception
19506        %loadlal = load <16 x float>, <16 x float>* %ptr, align 4
19507        %res = select <16 x i1> %mask, <16 x float> %loadlal, <16 x float> %passthru
19508
19509 .. _int_mstore:
19510
19511 '``llvm.masked.store.*``' Intrinsics
19512 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19513
19514 Syntax:
19515 """""""
19516 This is an overloaded intrinsic. The data stored in memory is a vector of any integer, floating-point or pointer data type.
19517
19518 ::
19519
19520        declare void @llvm.masked.store.v8i32.p0v8i32  (<8  x i32>   <value>, <8  x i32>*   <ptr>, i32 <alignment>,  <8  x i1> <mask>)
19521        declare void @llvm.masked.store.v16f32.p0v16f32 (<16 x float> <value>, <16 x float>* <ptr>, i32 <alignment>,  <16 x i1> <mask>)
19522        ;; The data is a vector of pointers to double
19523        declare void @llvm.masked.store.v8p0f64.p0v8p0f64    (<8 x double*> <value>, <8 x double*>* <ptr>, i32 <alignment>, <8 x i1> <mask>)
19524        ;; The data is a vector of function pointers
19525        declare void @llvm.masked.store.v4p0f_i32f.p0v4p0f_i32f (<4 x i32 ()*> <value>, <4 x i32 ()*>* <ptr>, i32 <alignment>, <4 x i1> <mask>)
19526
19527 Overview:
19528 """""""""
19529
19530 Writes a vector to memory according to the provided mask. The mask holds a bit for each vector lane, and is used to prevent memory accesses to the masked-off lanes.
19531
19532 Arguments:
19533 """"""""""
19534
The first operand is the vector value to be written to memory. The second operand is the base pointer for the store; it has the same underlying type as the value operand. The third operand is the alignment of the destination location. It must be a power of two constant integer value. The fourth operand, mask, is a vector of boolean values. The types of the mask and the value operand must have the same number of vector elements.
19536
19537
19538 Semantics:
19539 """"""""""
19540
The '``llvm.masked.store``' intrinsic is designed for conditional writing of selected vector elements in a single IR operation. It is useful for targets that support vector masked stores and allows vectorizing predicated basic blocks on these targets. Other targets may support this intrinsic differently, for example by lowering it into a sequence of branches that guard scalar store operations.
19542 The result of this operation is equivalent to a load-modify-store sequence. However, using this intrinsic prevents exceptions and data races on memory access to masked-off lanes.
19543
19544 ::
19545
19546        call void @llvm.masked.store.v16f32.p0v16f32(<16 x float> %value, <16 x float>* %ptr, i32 4,  <16 x i1> %mask)
19547
19548        ;; The result of the following instructions is identical aside from potential data races and memory access exceptions
19549        %oldval = load <16 x float>, <16 x float>* %ptr, align 4
19550        %res = select <16 x i1> %mask, <16 x float> %value, <16 x float> %oldval
19551        store <16 x float> %res, <16 x float>* %ptr, align 4
19552
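The per-lane behaviour of the masked load and store intrinsics can be sketched
in scalar C. The functions ``masked_load_f32`` and ``masked_store_f32`` and the
``num_lanes`` parameter are hypothetical names used for illustration only. The
point of the sketch is that masked-off lanes never access memory.

.. code-block:: c

    #include <stdbool.h>
    #include <stddef.h>

    /* Hypothetical model of llvm.masked.load on float lanes: a masked-off
     * lane does not read ptr[i] and takes its value from passthru. */
    void masked_load_f32(const float *ptr, const bool *mask,
                         const float *passthru, float *result,
                         size_t num_lanes)
    {
        for (size_t i = 0; i < num_lanes; ++i)
            result[i] = mask[i] ? ptr[i] : passthru[i];
    }

    /* Hypothetical model of llvm.masked.store on float lanes: a masked-off
     * lane leaves ptr[i] untouched, with no read and no write. */
    void masked_store_f32(const float *value, float *ptr, const bool *mask,
                          size_t num_lanes)
    {
        for (size_t i = 0; i < num_lanes; ++i) {
            if (mask[i])
                ptr[i] = value[i];
        }
    }

Unlike the load, select and store sequences shown in the IR examples, this
guarded form touches only the enabled lanes, which is what avoids memory access
exceptions and data races on the masked-off lanes.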
19553
19554 Masked Vector Gather and Scatter Intrinsics
19555 -------------------------------------------
19556
19557 LLVM provides intrinsics for vector gather and scatter operations. They are similar to :ref:`Masked Vector Load and Store <int_mload_mstore>`, except they are designed for arbitrary memory accesses, rather than sequential memory accesses. Gather and scatter also employ a mask operand, which holds one bit per vector element, switching the associated vector lane on or off. The memory addresses corresponding to the "off" lanes are not accessed. When all bits are off, no memory is accessed.
19558
19559 .. _int_mgather:
19560
19561 '``llvm.masked.gather.*``' Intrinsics
19562 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19563
19564 Syntax:
19565 """""""
19566 This is an overloaded intrinsic. The loaded data are multiple scalar values of any integer, floating-point or pointer data type gathered together into one vector.
19567
19568 ::
19569
19570       declare <16 x float> @llvm.masked.gather.v16f32.v16p0f32   (<16 x float*> <ptrs>, i32 <alignment>, <16 x i1> <mask>, <16 x float> <passthru>)
19571       declare <2 x double> @llvm.masked.gather.v2f64.v2p1f64     (<2 x double addrspace(1)*> <ptrs>, i32 <alignment>, <2 x i1>  <mask>, <2 x double> <passthru>)
19572       declare <8 x float*> @llvm.masked.gather.v8p0f32.v8p0p0f32 (<8 x float**> <ptrs>, i32 <alignment>, <8 x i1>  <mask>, <8 x float*> <passthru>)
19573
19574 Overview:
19575 """""""""
19576
19577 Reads scalar values from arbitrary memory locations and gathers them into one vector. The memory locations are provided in the vector of pointers '``ptrs``'. The memory is accessed according to the provided mask. The mask holds a bit for each vector lane, and is used to prevent memory accesses to the masked-off lanes. The masked-off lanes in the result vector are taken from the corresponding lanes of the '``passthru``' operand.
19578
19579
19580 Arguments:
19581 """"""""""
19582
19583 The first operand is a vector of pointers which holds all memory addresses to read. The second operand is an alignment of the source addresses. It must be 0 or a power of two constant integer value. The third operand, mask, is a vector of boolean values with the same number of elements as the return type. The fourth is a pass-through value that is used to fill the masked-off lanes of the result. The return type, underlying type of the vector of pointers and the type of the '``passthru``' operand are the same vector types.
19584
19585 Semantics:
19586 """"""""""
19587
19588 The '``llvm.masked.gather``' intrinsic is designed for conditional reading of multiple scalar values from arbitrary memory locations in a single IR operation. It is useful for targets that support vector masked gathers and allows vectorizing basic blocks with data and control divergence. Other targets may support this intrinsic differently, for example by lowering it into a sequence of scalar load operations.
The semantics of this operation are equivalent to a sequence of conditional scalar loads followed by gathering all loaded values into a single vector. The mask restricts memory access to certain lanes and facilitates vectorization of predicated basic blocks.
19590
19591
19592 ::
19593
19594        %res = call <4 x double> @llvm.masked.gather.v4f64.v4p0f64 (<4 x double*> %ptrs, i32 8, <4 x i1> <i1 true, i1 true, i1 true, i1 true>, <4 x double> undef)
19595
19596        ;; The gather with all-true mask is equivalent to the following instruction sequence
19597        %ptr0 = extractelement <4 x double*> %ptrs, i32 0
19598        %ptr1 = extractelement <4 x double*> %ptrs, i32 1
19599        %ptr2 = extractelement <4 x double*> %ptrs, i32 2
19600        %ptr3 = extractelement <4 x double*> %ptrs, i32 3
19601
19602        %val0 = load double, double* %ptr0, align 8
19603        %val1 = load double, double* %ptr1, align 8
19604        %val2 = load double, double* %ptr2, align 8
19605        %val3 = load double, double* %ptr3, align 8
19606
       %vec0    = insertelement <4 x double> undef, double %val0, i32 0
       %vec01   = insertelement <4 x double> %vec0, double %val1, i32 1
       %vec012  = insertelement <4 x double> %vec01, double %val2, i32 2
       %vec0123 = insertelement <4 x double> %vec012, double %val3, i32 3
19611
19612 .. _int_mscatter:
19613
19614 '``llvm.masked.scatter.*``' Intrinsics
19615 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19616
19617 Syntax:
19618 """""""
19619 This is an overloaded intrinsic. The data stored in memory is a vector of any integer, floating-point or pointer data type. Each vector element is stored in an arbitrary memory address. Scatter with overlapping addresses is guaranteed to be ordered from least-significant to most-significant element.
19620
19621 ::
19622
19623        declare void @llvm.masked.scatter.v8i32.v8p0i32     (<8 x i32>     <value>, <8 x i32*>     <ptrs>, i32 <alignment>, <8 x i1>  <mask>)
19624        declare void @llvm.masked.scatter.v16f32.v16p1f32   (<16 x float>  <value>, <16 x float addrspace(1)*>  <ptrs>, i32 <alignment>, <16 x i1> <mask>)
19625        declare void @llvm.masked.scatter.v4p0f64.v4p0p0f64 (<4 x double*> <value>, <4 x double**> <ptrs>, i32 <alignment>, <4 x i1>  <mask>)
19626
19627 Overview:
19628 """""""""
19629
19630 Writes each element from the value vector to the corresponding memory address. The memory addresses are represented as a vector of pointers. Writing is done according to the provided mask. The mask holds a bit for each vector lane, and is used to prevent memory accesses to the masked-off lanes.
19631
19632 Arguments:
19633 """"""""""
19634
19635 The first operand is a vector value to be written to memory. The second operand is a vector of pointers, pointing to where the value elements should be stored. It has the same underlying type as the value operand. The third operand is an alignment of the destination addresses. It must be 0 or a power of two constant integer value. The fourth operand, mask, is a vector of boolean values. The types of the mask and the value operand must have the same number of vector elements.
19636
19637 Semantics:
19638 """"""""""
19639
The '``llvm.masked.scatter``' intrinsic is designed for writing selected vector elements to arbitrary memory addresses in a single IR operation. The operation may be conditional, when not all bits in the mask are switched on. It is useful for targets that support vector masked scatter and allows vectorizing basic blocks with data and control divergence. Other targets may support this intrinsic differently, for example by lowering it into a sequence of branches that guard scalar store operations.
19641
19642 ::
19643
       ;; This instruction unconditionally stores the data vector to multiple addresses
       call void @llvm.masked.scatter.v8i32.v8p0i32 (<8 x i32> %value, <8 x i32*> %ptrs, i32 4,  <8 x i1>  <true, true, .. true>)
19646
19647        ;; It is equivalent to a list of scalar stores
19648        %val0 = extractelement <8 x i32> %value, i32 0
19649        %val1 = extractelement <8 x i32> %value, i32 1
19650        ..
19651        %val7 = extractelement <8 x i32> %value, i32 7
19652        %ptr0 = extractelement <8 x i32*> %ptrs, i32 0
19653        %ptr1 = extractelement <8 x i32*> %ptrs, i32 1
19654        ..
19655        %ptr7 = extractelement <8 x i32*> %ptrs, i32 7
19656        ;; Note: the order of the following stores is important when they overlap:
19657        store i32 %val0, i32* %ptr0, align 4
19658        store i32 %val1, i32* %ptr1, align 4
19659        ..
19660        store i32 %val7, i32* %ptr7, align 4
19661
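Gather and scatter can be modelled the same way in scalar C. The sketch below
uses hypothetical names (``masked_gather_f64``, ``masked_scatter_i32``,
``num_lanes``) and is only meant to illustrate the lane ordering. The scatter
loop runs from the lowest lane to the highest, so when enabled lanes share an
address, the highest enabled lane is the one whose value remains.

.. code-block:: c

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    /* Hypothetical model of llvm.masked.gather on double lanes: each enabled
     * lane loads through its own pointer, disabled lanes take passthru. */
    void masked_gather_f64(double *const *ptrs, const bool *mask,
                           const double *passthru, double *result,
                           size_t num_lanes)
    {
        for (size_t i = 0; i < num_lanes; ++i)
            result[i] = mask[i] ? *ptrs[i] : passthru[i];
    }

    /* Hypothetical model of llvm.masked.scatter on i32 lanes: stores are
     * issued in lane order, which defines the result for overlapping
     * addresses. */
    void masked_scatter_i32(const int32_t *value, int32_t *const *ptrs,
                            const bool *mask, size_t num_lanes)
    {
        for (size_t i = 0; i < num_lanes; ++i) {
            if (mask[i])
                *ptrs[i] = value[i];
        }
    }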
19662
19663 Masked Vector Expanding Load and Compressing Store Intrinsics
19664 -------------------------------------------------------------
19665
LLVM provides intrinsics for expanding load and compressing store operations. Data selected from a vector according to a mask is stored in consecutive memory addresses (compressed store), and vice-versa (expanding load). These operations effectively map to "if (cond.i) a[j++] = v.i" and "if (cond.i) v.i = a[j++]" patterns, respectively. Note that when the mask starts with '1' bits followed by '0' bits, these operations are identical to :ref:`llvm.masked.store <int_mstore>` and :ref:`llvm.masked.load <int_mload>`.
19667
19668 .. _int_expandload:
19669
19670 '``llvm.masked.expandload.*``' Intrinsics
19671 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19672
19673 Syntax:
19674 """""""
19675 This is an overloaded intrinsic. Several values of integer, floating point or pointer data type are loaded from consecutive memory addresses and stored into the elements of a vector according to the mask.
19676
19677 ::
19678
19679       declare <16 x float>  @llvm.masked.expandload.v16f32 (float* <ptr>, <16 x i1> <mask>, <16 x float> <passthru>)
19680       declare <2 x i64>     @llvm.masked.expandload.v2i64 (i64* <ptr>, <2 x i1>  <mask>, <2 x i64> <passthru>)
19681
19682 Overview:
19683 """""""""
19684
Reads a number of scalar values sequentially from the memory location provided in '``ptr``' and spreads them in a vector. The '``mask``' holds a bit for each vector lane. The number of elements read from memory is equal to the number of '1' bits in the mask. The loaded elements are positioned in the destination vector according to the sequence of '1' and '0' bits in the mask. E.g., if the mask vector is '10010001', "expandload" reads 3 values from memory addresses ptr, ptr+1, ptr+2 and places them in lanes 0, 3 and 7 respectively. The masked-off lanes are filled by elements from the corresponding lanes of the '``passthru``' operand.
19686
19687
19688 Arguments:
19689 """"""""""
19690
19691 The first operand is the base pointer for the load. It has the same underlying type as the element of the returned vector. The second operand, mask, is a vector of boolean values with the same number of elements as the return type. The third is a pass-through value that is used to fill the masked-off lanes of the result. The return type and the type of the '``passthru``' operand have the same vector type.
19692
19693 Semantics:
19694 """"""""""
19695
The '``llvm.masked.expandload``' intrinsic is designed for reading multiple scalar values from adjacent memory addresses into possibly non-adjacent vector lanes. It is useful for targets that support vector expanding loads and allows vectorizing loops with cross-iteration dependencies like in the following example:
19697
19698 .. code-block:: c
19699
19700     // In this loop we load from B and spread the elements into array A.
    double *A, *B; int *C;
19702     for (int i = 0; i < size; ++i) {
19703       if (C[i] != 0)
19704         A[i] = B[j++];
19705     }
19706
19707
19708 .. code-block:: llvm
19709
19710     ; Load several elements from array B and expand them in a vector.
19711     ; The number of loaded elements is equal to the number of '1' elements in the Mask.
19712     %Tmp = call <8 x double> @llvm.masked.expandload.v8f64(double* %Bptr, <8 x i1> %Mask, <8 x double> undef)
19713     ; Store the result in A
19714     call void @llvm.masked.store.v8f64.p0v8f64(<8 x double> %Tmp, <8 x double>* %Aptr, i32 8, <8 x i1> %Mask)
19715
19716     ; %Bptr should be increased on each iteration according to the number of '1' elements in the Mask.
19717     %MaskI = bitcast <8 x i1> %Mask to i8
19718     %MaskIPopcnt = call i8 @llvm.ctpop.i8(i8 %MaskI)
19719     %MaskI64 = zext i8 %MaskIPopcnt to i64
19720     %BNextInd = add i64 %BInd, %MaskI64
19721
19722
19723 Other targets may support this intrinsic differently, for example, by lowering it into a sequence of conditional scalar load operations and shuffles.
19724 If all mask elements are '1', the intrinsic behavior is equivalent to the regular unmasked vector load.
19725
19726 .. _int_compressstore:
19727
19728 '``llvm.masked.compressstore.*``' Intrinsics
19729 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19730
19731 Syntax:
19732 """""""
19733 This is an overloaded intrinsic. A number of scalar values of integer, floating point or pointer data type are collected from an input vector and stored into adjacent memory addresses. A mask defines which elements to collect from the vector.
19734
19735 ::
19736
19737       declare void @llvm.masked.compressstore.v8i32  (<8  x i32>   <value>, i32*   <ptr>, <8  x i1> <mask>)
19738       declare void @llvm.masked.compressstore.v16f32 (<16 x float> <value>, float* <ptr>, <16 x i1> <mask>)
19739
19740 Overview:
19741 """""""""
19742
Selects elements from input vector '``value``' according to the '``mask``'. All selected elements are written into adjacent memory addresses starting at address '``ptr``', from lower to higher. The mask holds a bit for each vector lane, and is used to select elements to be stored. The number of elements to be stored is equal to the number of active bits in the mask.
19744
19745 Arguments:
19746 """"""""""
19747
The first operand is the input vector, from which elements are collected and written to memory. The second operand is the base pointer for the store; it has the same underlying type as the element of the input vector operand. The third operand is the mask, a vector of boolean values. The mask and the input vector must have the same number of vector elements.
19749
19750
19751 Semantics:
19752 """"""""""
19753
The '``llvm.masked.compressstore``' intrinsic is designed for compressing data in memory. It allows collecting elements from possibly non-adjacent lanes of a vector and storing them contiguously in memory in one IR operation. It is useful for targets that support compressing store operations and allows vectorizing loops with cross-iteration dependences like in the following example:
19755
19756 .. code-block:: c
19757
    // In this loop we load elements from A and store them consecutively in B
    double *A, *B; int *C;
    int j = 0;
    for (int i = 0; i < size; ++i) {
      if (C[i] != 0)
        B[j++] = A[i];
    }
19764
19765
19766 .. code-block:: llvm
19767
19768     ; Load elements from A.
19769     %Tmp = call <8 x double> @llvm.masked.load.v8f64.p0v8f64(<8 x double>* %Aptr, i32 8, <8 x i1> %Mask, <8 x double> undef)
19770     ; Store all selected elements consecutively in array B
    call void @llvm.masked.compressstore.v8f64(<8 x double> %Tmp, double* %Bptr, <8 x i1> %Mask)
19772
19773     ; %Bptr should be increased on each iteration according to the number of '1' elements in the Mask.
19774     %MaskI = bitcast <8 x i1> %Mask to i8
19775     %MaskIPopcnt = call i8 @llvm.ctpop.i8(i8 %MaskI)
19776     %MaskI64 = zext i8 %MaskIPopcnt to i64
19777     %BNextInd = add i64 %BInd, %MaskI64
19778
19779
19780 Other targets may support this intrinsic differently, for example, by lowering it into a sequence of branches that guard scalar store operations.
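
As a rough illustration of such a scalar lowering, the following C sketch models
the semantics of a compressing store for eight ``double`` lanes. The function
name ``compress_store_f64`` and the use of one byte per mask lane are
assumptions made for this example; they are not part of LLVM.

.. code-block:: c

    #include <stddef.h>

    /* Hypothetical scalar model of llvm.masked.compressstore for <8 x double>.
     * Lanes whose mask entry is nonzero are written to consecutive slots
     * starting at ptr, from lower to higher lanes. Returns the number of
     * elements stored, i.e. the amount by which ptr should advance. */
    size_t compress_store_f64(const double value[8], double *ptr,
                              const unsigned char mask[8]) {
        size_t stored = 0;
        for (size_t lane = 0; lane < 8; ++lane) {
            if (mask[lane])                 /* branch guarding a scalar store */
                ptr[stored++] = value[lane];
        }
        return stored;
    }

The return value plays the role of the population count that the IR example
above adds to ``%BInd``.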
19781
19782
19783 Memory Use Markers
19784 ------------------
19785
19786 This class of intrinsics provides information about the
19787 :ref:`lifetime of memory objects <objectlifetime>` and ranges where variables
19788 are immutable.
19789
19790 .. _int_lifestart:
19791
19792 '``llvm.lifetime.start``' Intrinsic
19793 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19794
19795 Syntax:
19796 """""""
19797
19798 ::
19799
19800       declare void @llvm.lifetime.start(i64 <size>, i8* nocapture <ptr>)
19801
19802 Overview:
19803 """""""""
19804
19805 The '``llvm.lifetime.start``' intrinsic specifies the start of a memory
19806 object's lifetime.
19807
19808 Arguments:
19809 """"""""""
19810
19811 The first argument is a constant integer representing the size of the
19812 object, or -1 if it is variable sized. The second argument is a pointer
19813 to the object.
19814
19815 Semantics:
19816 """"""""""
19817
19818 If ``ptr`` is a stack-allocated object and it points to the first byte of
19819 the object, the object is initially marked as dead.
19820 ``ptr`` is conservatively considered as a non-stack-allocated object if
19821 the stack coloring algorithm that is used in the optimization pipeline cannot
19822 conclude that ``ptr`` is a stack-allocated object.
19823
After '``llvm.lifetime.start``', the stack object that ``ptr`` points to is marked
as alive and has an uninitialized value.
19826 The stack object is marked as dead when either
19827 :ref:`llvm.lifetime.end <int_lifeend>` to the alloca is executed or the
19828 function returns.
19829
19830 After :ref:`llvm.lifetime.end <int_lifeend>` is called,
19831 '``llvm.lifetime.start``' on the stack object can be called again.
19832 The second '``llvm.lifetime.start``' call marks the object as alive, but it
19833 does not change the address of the object.
19834
If ``ptr`` is a non-stack-allocated object, does not point to the first
byte of the object, or is a stack object that is already alive, the intrinsic
simply fills all bytes of the object with ``poison``.
19838
19839
19840 .. _int_lifeend:
19841
19842 '``llvm.lifetime.end``' Intrinsic
19843 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19844
19845 Syntax:
19846 """""""
19847
19848 ::
19849
19850       declare void @llvm.lifetime.end(i64 <size>, i8* nocapture <ptr>)
19851
19852 Overview:
19853 """""""""
19854
19855 The '``llvm.lifetime.end``' intrinsic specifies the end of a memory object's
19856 lifetime.
19857
19858 Arguments:
19859 """"""""""
19860
19861 The first argument is a constant integer representing the size of the
19862 object, or -1 if it is variable sized. The second argument is a pointer
19863 to the object.
19864
19865 Semantics:
19866 """"""""""
19867
19868 If ``ptr`` is a stack-allocated object and it points to the first byte of the
19869 object, the object is dead.
19870 ``ptr`` is conservatively considered as a non-stack-allocated object if
19871 the stack coloring algorithm that is used in the optimization pipeline cannot
19872 conclude that ``ptr`` is a stack-allocated object.
19873
Calling ``llvm.lifetime.end`` on an already dead alloca is a no-op.
19875
19876 If ``ptr`` is a non-stack-allocated object or it does not point to the first
19877 byte of the object, it is equivalent to simply filling all bytes of the object
19878 with ``poison``.
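
As an informal illustration, consider how a C frontend might use the lifetime
markers. In the sketch below, the comments show where a frontend could place
``llvm.lifetime.start`` and ``llvm.lifetime.end`` calls for two block-scoped
arrays. Because the two lifetimes do not overlap, stack coloring may be able to
assign both arrays the same stack slot. The function ``scoped_buffers`` is a
made-up example, and the exact placement of markers is up to the frontend.

.. code-block:: c

    /* Hypothetical example: two arrays with disjoint scopes. */
    int scoped_buffers(int n) {
        int total = 0;
        {
            int a[16];                  /* lifetime.start for a could go here */
            for (int i = 0; i < 16; ++i)
                a[i] = i * n;
            total += a[15];
        }                               /* lifetime.end for a could go here */
        {
            int b[16];                  /* lifetime.start for b could go here */
            for (int i = 0; i < 16; ++i)
                b[i] = i + n;
            total += b[15];
        }                               /* lifetime.end for b could go here */
        return total;
    }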
19879
19880
19881 '``llvm.invariant.start``' Intrinsic
19882 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19883
19884 Syntax:
19885 """""""
19886 This is an overloaded intrinsic. The memory object can belong to any address space.
19887
19888 ::
19889
19890       declare {}* @llvm.invariant.start.p0i8(i64 <size>, i8* nocapture <ptr>)
19891
19892 Overview:
19893 """""""""
19894
19895 The '``llvm.invariant.start``' intrinsic specifies that the contents of
19896 a memory object will not change.
19897
19898 Arguments:
19899 """"""""""
19900
19901 The first argument is a constant integer representing the size of the
19902 object, or -1 if it is variable sized. The second argument is a pointer
19903 to the object.
19904
19905 Semantics:
19906 """"""""""
19907
19908 This intrinsic indicates that until an ``llvm.invariant.end`` that uses
19909 the return value, the referenced memory location is constant and
19910 unchanging.
19911
19912 '``llvm.invariant.end``' Intrinsic
19913 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19914
19915 Syntax:
19916 """""""
19917 This is an overloaded intrinsic. The memory object can belong to any address space.
19918
19919 ::
19920
19921       declare void @llvm.invariant.end.p0i8({}* <start>, i64 <size>, i8* nocapture <ptr>)
19922
19923 Overview:
19924 """""""""
19925
19926 The '``llvm.invariant.end``' intrinsic specifies that the contents of a
19927 memory object are mutable.
19928
19929 Arguments:
19930 """"""""""
19931
The first argument is the result of the matching ``llvm.invariant.start`` call.
The second argument is a constant integer representing the size of the
object, or -1 if it is variable sized. The third argument is a
pointer to the object.
19936
19937 Semantics:
19938 """"""""""
19939
19940 This intrinsic indicates that the memory is mutable again.
19941
19942 '``llvm.launder.invariant.group``' Intrinsic
19943 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19944
19945 Syntax:
19946 """""""
19947 This is an overloaded intrinsic. The memory object can belong to any address
19948 space. The returned pointer must belong to the same address space as the
19949 argument.
19950
19951 ::
19952
19953       declare i8* @llvm.launder.invariant.group.p0i8(i8* <ptr>)
19954
19955 Overview:
19956 """""""""
19957
19958 The '``llvm.launder.invariant.group``' intrinsic can be used when an invariant
19959 established by ``invariant.group`` metadata no longer holds, to obtain a new
19960 pointer value that carries fresh invariant group information. It is an
19961 experimental intrinsic, which means that its semantics might change in the
19962 future.
19963
19964
19965 Arguments:
19966 """"""""""
19967
The ``llvm.launder.invariant.group`` intrinsic takes only one argument, which is
a pointer to the memory.
19970
19971 Semantics:
19972 """"""""""
19973
19974 Returns another pointer that aliases its argument but which is considered different
19975 for the purposes of ``load``/``store`` ``invariant.group`` metadata.
19976 It does not read any accessible memory and the execution can be speculated.
19977
19978 '``llvm.strip.invariant.group``' Intrinsic
19979 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19980
19981 Syntax:
19982 """""""
19983 This is an overloaded intrinsic. The memory object can belong to any address
19984 space. The returned pointer must belong to the same address space as the
19985 argument.
19986
19987 ::
19988
19989       declare i8* @llvm.strip.invariant.group.p0i8(i8* <ptr>)
19990
19991 Overview:
19992 """""""""
19993
19994 The '``llvm.strip.invariant.group``' intrinsic can be used when an invariant
19995 established by ``invariant.group`` metadata no longer holds, to obtain a new pointer
19996 value that does not carry the invariant information. It is an experimental
19997 intrinsic, which means that its semantics might change in the future.
19998
19999
20000 Arguments:
20001 """"""""""
20002
The ``llvm.strip.invariant.group`` intrinsic takes only one argument, which is
a pointer to the memory.
20005
20006 Semantics:
20007 """"""""""
20008
20009 Returns another pointer that aliases its argument but which has no associated
20010 ``invariant.group`` metadata.
20011 It does not read any memory and can be speculated.
20012
20013
20014
20015 .. _constrainedfp:
20016
20017 Constrained Floating-Point Intrinsics
20018 -------------------------------------
20019
20020 These intrinsics are used to provide special handling of floating-point
20021 operations when specific rounding mode or floating-point exception behavior is
20022 required.  By default, LLVM optimization passes assume that the rounding mode is
20023 round-to-nearest and that floating-point exceptions will not be monitored.
20024 Constrained FP intrinsics are used to support non-default rounding modes and
20025 accurately preserve exception behavior without compromising LLVM's ability to
20026 optimize FP code when the default behavior is used.
20027
If any FP operation in a function is constrained then they all must be
constrained. This is required for correct LLVM IR. Optimizations that
move code around can cause miscompiles if constrained and normal
operations are mixed. The correct way to mix constrained and less constrained
operations is to use the rounding mode and exception handling metadata to
mark constrained intrinsics as having LLVM's default behavior.
20034
20035 Each of these intrinsics corresponds to a normal floating-point operation. The
20036 data arguments and the return value are the same as the corresponding FP
20037 operation.
20038
20039 The rounding mode argument is a metadata string specifying what
20040 assumptions, if any, the optimizer can make when transforming constant
20041 values. Some constrained FP intrinsics omit this argument. If required
20042 by the intrinsic, this argument must be one of the following strings:
20043
20044 ::
20045
20046       "round.dynamic"
20047       "round.tonearest"
20048       "round.downward"
20049       "round.upward"
20050       "round.towardzero"
20051       "round.tonearestaway"
20052
20053 If this argument is "round.dynamic" optimization passes must assume that the
20054 rounding mode is unknown and may change at runtime.  No transformations that
20055 depend on rounding mode may be performed in this case.
20056
20057 The other possible values for the rounding mode argument correspond to the
20058 similarly named IEEE rounding modes.  If the argument is any of these values
20059 optimization passes may perform transformations as long as they are consistent
20060 with the specified rounding mode.
20061
20062 For example, 'x-0'->'x' is not a valid transformation if the rounding mode is
20063 "round.downward" or "round.dynamic" because if the value of 'x' is +0 then
20064 'x-0' should evaluate to '-0' when rounding downward.  However, this
20065 transformation is legal for all other rounding modes.
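
The signed-zero behavior can be observed from C using the ``<fenv.h>``
rounding-mode controls. This is only a sketch: the helper name
``zero_difference_is_negative`` is invented here, ``volatile`` is used to keep
the compiler from folding or moving the subtraction, and the target is assumed
to implement the IEEE 754 rounding modes (in particular, ``FE_DOWNWARD`` must be
defined).

.. code-block:: c

    #include <fenv.h>
    #include <math.h>

    /* Hypothetical helper: returns 1 if (+0.0) - (+0.0) has its sign bit set
     * when evaluated under rounding mode `mode`, 0 if not, and -1 if the
     * rounding mode could not be installed. */
    int zero_difference_is_negative(int mode) {
        int saved = fegetround();
        if (fesetround(mode) != 0)
            return -1;
        volatile double x = 0.0, zero = 0.0;
        volatile double diff = x - zero;    /* evaluated at run time */
        fesetround(saved);                  /* restore the caller's mode */
        return signbit(diff) ? 1 : 0;
    }

Under these assumptions the helper returns 1 for ``FE_DOWNWARD`` and 0 for
``FE_TONEAREST``, which is why folding 'x-0' to 'x' would change the result
when rounding downward.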
20066
20067 For values other than "round.dynamic" optimization passes may assume that the
20068 actual runtime rounding mode (as defined in a target-specific manner) matches
20069 the specified rounding mode, but this is not guaranteed.  Using a specific
20070 non-dynamic rounding mode which does not match the actual rounding mode at
20071 runtime results in undefined behavior.
20072
The exception behavior argument is a metadata string describing the
floating-point exception semantics that are required for the intrinsic. This
argument must be one of the following strings:
20076
20077 ::
20078
20079       "fpexcept.ignore"
20080       "fpexcept.maytrap"
20081       "fpexcept.strict"
20082
20083 If this argument is "fpexcept.ignore" optimization passes may assume that the
20084 exception status flags will not be read and that floating-point exceptions will
20085 be masked.  This allows transformations to be performed that may change the
20086 exception semantics of the original code.  For example, FP operations may be
20087 speculatively executed in this case whereas they must not be for either of the
20088 other possible values of this argument.
20089
20090 If the exception behavior argument is "fpexcept.maytrap" optimization passes
20091 must avoid transformations that may raise exceptions that would not have been
20092 raised by the original code (such as speculatively executing FP operations), but
20093 passes are not required to preserve all exceptions that are implied by the
20094 original code.  For example, exceptions may be potentially hidden by constant
20095 folding.
20096
20097 If the exception behavior argument is "fpexcept.strict" all transformations must
20098 strictly preserve the floating-point exception semantics of the original code.
20099 Any FP exception that would have been raised by the original code must be raised
20100 by the transformed code, and the transformed code must not raise any FP
20101 exceptions that would not have been raised by the original code.  This is the
20102 exception behavior argument that will be used if the code being compiled reads
20103 the FP exception status flags, but this mode can also be used with code that
20104 unmasks FP exceptions.
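
For instance, C code that inspects the status flags through ``<fenv.h>``
depends on this guarantee. The following sketch shows the kind of source code
that reads the FP exception status flags. The helper name is invented, and the
example assumes IEEE 754 hardware that reports division by zero through
``FE_DIVBYZERO``.

.. code-block:: c

    #include <fenv.h>

    /* Hypothetical helper: returns 1 if dividing 1.0 by 0.0 sets the
     * divide-by-zero status flag, 0 otherwise. `volatile` keeps the
     * division between the two flag operations. */
    int division_sets_divbyzero_flag(void) {
        volatile double one = 1.0, zero = 0.0;
        feclearexcept(FE_ALL_EXCEPT);
        volatile double quotient = one / zero;
        (void)quotient;
        return fetestexcept(FE_DIVBYZERO) != 0;
    }

If the division were speculated, hoisted above the flag clearing, or folded
away, the flag test could observe a different result, which is what the strict
mode rules out.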
20105
20106 The number and order of floating-point exceptions is NOT guaranteed.  For
20107 example, a series of FP operations that each may raise exceptions may be
20108 vectorized into a single instruction that raises each unique exception a single
20109 time.
20110
20111 Proper :ref:`function attributes <fnattrs>` usage is required for the
20112 constrained intrinsics to function correctly.
20113
All function *calls* made in a function that uses constrained floating
point intrinsics must have the ``strictfp`` attribute.
20116
20117 All function *definitions* that use constrained floating point intrinsics
20118 must have the ``strictfp`` attribute.
20119
20120 '``llvm.experimental.constrained.fadd``' Intrinsic
20121 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20122
20123 Syntax:
20124 """""""
20125
20126 ::
20127
20128       declare <type>
20129       @llvm.experimental.constrained.fadd(<type> <op1>, <type> <op2>,
20130                                           metadata <rounding mode>,
20131                                           metadata <exception behavior>)
20132
20133 Overview:
20134 """""""""
20135
20136 The '``llvm.experimental.constrained.fadd``' intrinsic returns the sum of its
20137 two operands.
20138
20139
20140 Arguments:
20141 """"""""""
20142
20143 The first two arguments to the '``llvm.experimental.constrained.fadd``'
20144 intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
20145 of floating-point values. Both arguments must have identical types.
20146
20147 The third and fourth arguments specify the rounding mode and exception
20148 behavior as described above.
20149
20150 Semantics:
20151 """"""""""
20152
20153 The value produced is the floating-point sum of the two value operands and has
20154 the same type as the operands.
20155
20156
20157 '``llvm.experimental.constrained.fsub``' Intrinsic
20158 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20159
20160 Syntax:
20161 """""""
20162
20163 ::
20164
20165       declare <type>
20166       @llvm.experimental.constrained.fsub(<type> <op1>, <type> <op2>,
20167                                           metadata <rounding mode>,
20168                                           metadata <exception behavior>)
20169
20170 Overview:
20171 """""""""
20172
20173 The '``llvm.experimental.constrained.fsub``' intrinsic returns the difference
20174 of its two operands.
20175
20176
20177 Arguments:
20178 """"""""""
20179
20180 The first two arguments to the '``llvm.experimental.constrained.fsub``'
20181 intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
20182 of floating-point values. Both arguments must have identical types.
20183
20184 The third and fourth arguments specify the rounding mode and exception
20185 behavior as described above.
20186
20187 Semantics:
20188 """"""""""
20189
20190 The value produced is the floating-point difference of the two value operands
20191 and has the same type as the operands.
20192
20193
20194 '``llvm.experimental.constrained.fmul``' Intrinsic
20195 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20196
20197 Syntax:
20198 """""""
20199
20200 ::
20201
20202       declare <type>
20203       @llvm.experimental.constrained.fmul(<type> <op1>, <type> <op2>,
20204                                           metadata <rounding mode>,
20205                                           metadata <exception behavior>)
20206
20207 Overview:
20208 """""""""
20209
20210 The '``llvm.experimental.constrained.fmul``' intrinsic returns the product of
20211 its two operands.
20212
20213
20214 Arguments:
20215 """"""""""
20216
20217 The first two arguments to the '``llvm.experimental.constrained.fmul``'
20218 intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
20219 of floating-point values. Both arguments must have identical types.
20220
20221 The third and fourth arguments specify the rounding mode and exception
20222 behavior as described above.
20223
20224 Semantics:
20225 """"""""""
20226
20227 The value produced is the floating-point product of the two value operands and
20228 has the same type as the operands.
20229
20230
20231 '``llvm.experimental.constrained.fdiv``' Intrinsic
20232 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20233
20234 Syntax:
20235 """""""
20236
20237 ::
20238
20239       declare <type>
20240       @llvm.experimental.constrained.fdiv(<type> <op1>, <type> <op2>,
20241                                           metadata <rounding mode>,
20242                                           metadata <exception behavior>)
20243
20244 Overview:
20245 """""""""
20246
20247 The '``llvm.experimental.constrained.fdiv``' intrinsic returns the quotient of
20248 its two operands.
20249
20250
20251 Arguments:
20252 """"""""""
20253
20254 The first two arguments to the '``llvm.experimental.constrained.fdiv``'
20255 intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
20256 of floating-point values. Both arguments must have identical types.
20257
20258 The third and fourth arguments specify the rounding mode and exception
20259 behavior as described above.
20260
20261 Semantics:
20262 """"""""""
20263
20264 The value produced is the floating-point quotient of the two value operands and
20265 has the same type as the operands.
20266
20267
20268 '``llvm.experimental.constrained.frem``' Intrinsic
20269 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20270
20271 Syntax:
20272 """""""
20273
20274 ::
20275
20276       declare <type>
20277       @llvm.experimental.constrained.frem(<type> <op1>, <type> <op2>,
20278                                           metadata <rounding mode>,
20279                                           metadata <exception behavior>)
20280
20281 Overview:
20282 """""""""
20283
20284 The '``llvm.experimental.constrained.frem``' intrinsic returns the remainder
20285 from the division of its two operands.
20286
20287
20288 Arguments:
20289 """"""""""
20290
20291 The first two arguments to the '``llvm.experimental.constrained.frem``'
20292 intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
20293 of floating-point values. Both arguments must have identical types.
20294
20295 The third and fourth arguments specify the rounding mode and exception
20296 behavior as described above.  The rounding mode argument has no effect, since
20297 the result of frem is never rounded, but the argument is included for
20298 consistency with the other constrained floating-point intrinsics.
20299
20300 Semantics:
20301 """"""""""
20302
20303 The value produced is the floating-point remainder from the division of the two
20304 value operands and has the same type as the operands.  The remainder has the
20305 same sign as the dividend.
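
The C library function ``fmod`` follows the same sign convention, so it can
serve as an informal model of the result value. The wrapper name below is
invented for this example, and the sketch says nothing about rounding mode or
exception behavior.

.. code-block:: c

    #include <math.h>

    /* Hypothetical model: remainder of x / y carrying the sign of the
     * dividend x, computed with the C library's fmod. */
    double remainder_like_frem(double x, double y) {
        return fmod(x, y);
    }

For example, ``remainder_like_frem(-7.0, 3.0)`` yields ``-1.0`` while
``remainder_like_frem(7.0, -3.0)`` yields ``1.0``.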
20306
20307 '``llvm.experimental.constrained.fma``' Intrinsic
20308 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20309
20310 Syntax:
20311 """""""
20312
20313 ::
20314
20315       declare <type>
20316       @llvm.experimental.constrained.fma(<type> <op1>, <type> <op2>, <type> <op3>,
                                         metadata <rounding mode>,
                                         metadata <exception behavior>)
20319
20320 Overview:
20321 """""""""
20322
20323 The '``llvm.experimental.constrained.fma``' intrinsic returns the result of a
20324 fused-multiply-add operation on its operands.
20325
20326 Arguments:
20327 """"""""""
20328
20329 The first three arguments to the '``llvm.experimental.constrained.fma``'
20330 intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector
20331 <t_vector>` of floating-point values. All arguments must have identical types.
20332
20333 The fourth and fifth arguments specify the rounding mode and exception behavior
20334 as described above.
20335
20336 Semantics:
20337 """"""""""
20338
20339 The result produced is the product of the first two operands added to the third
20340 operand computed with infinite precision, and then rounded to the target
20341 precision.
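
The difference between a single rounding and two roundings can be seen with the
C library's ``fma``. This is a sketch with invented helper names, assuming
``double`` is IEEE 754 binary64. ``volatile`` is used in the unfused version so
that the compiler cannot contract the two operations into one.

.. code-block:: c

    #include <math.h>

    /* Fused: a * b + c is computed exactly and rounded once. */
    double fused_multiply_add(double a, double b, double c) {
        return fma(a, b, c);
    }

    /* Unfused: the product is rounded to double before the addition. */
    double separate_multiply_add(double a, double b, double c) {
        volatile double product = a * b;
        return product + c;
    }

With ``a = b = 1 + 2^-30`` and ``c = -(1 + 2^-29)``, the fused form returns
``2^-60`` while the separate form returns zero, because the low-order term of
the exact product is lost when the product is rounded on its own.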
20342
20343 '``llvm.experimental.constrained.fptoui``' Intrinsic
20344 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20345
20346 Syntax:
20347 """""""
20348
20349 ::
20350
20351       declare <ty2>
20352       @llvm.experimental.constrained.fptoui(<type> <value>,
20353                                           metadata <exception behavior>)
20354
20355 Overview:
20356 """""""""
20357
20358 The '``llvm.experimental.constrained.fptoui``' intrinsic converts a
20359 floating-point ``value`` to its unsigned integer equivalent of type ``ty2``.
20360
20361 Arguments:
20362 """"""""""
20363
20364 The first argument to the '``llvm.experimental.constrained.fptoui``'
20365 intrinsic must be :ref:`floating point <t_floating>` or :ref:`vector
20366 <t_vector>` of floating point values.
20367
20368 The second argument specifies the exception behavior as described above.
20369
20370 Semantics:
20371 """"""""""
20372
20373 The result produced is an unsigned integer converted from the floating
20374 point operand. The value is truncated, so it is rounded towards zero.
20375
20376 '``llvm.experimental.constrained.fptosi``' Intrinsic
20377 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20378
20379 Syntax:
20380 """""""
20381
20382 ::
20383
20384       declare <ty2>
20385       @llvm.experimental.constrained.fptosi(<type> <value>,
20386                                           metadata <exception behavior>)
20387
20388 Overview:
20389 """""""""
20390
The '``llvm.experimental.constrained.fptosi``' intrinsic converts a
:ref:`floating-point <t_floating>` ``value`` to its signed integer equivalent of
type ``ty2``.
20393
20394 Arguments:
20395 """"""""""
20396
20397 The first argument to the '``llvm.experimental.constrained.fptosi``'
20398 intrinsic must be :ref:`floating point <t_floating>` or :ref:`vector
20399 <t_vector>` of floating point values.
20400
20401 The second argument specifies the exception behavior as described above.
20402
20403 Semantics:
20404 """"""""""
20405
20406 The result produced is a signed integer converted from the floating
20407 point operand. The value is truncated, so it is rounded towards zero.
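
C's floating-point-to-integer casts truncate in the same direction, so they can
illustrate the rounding of both the signed and the unsigned conversion. The
helper names are invented, and the sketch only covers values that fit in the
destination type (out-of-range conversions are undefined in C).

.. code-block:: c

    /* Hypothetical helpers: the fractional part is discarded, so the
     * result is rounded towards zero. */
    int to_signed(double x) { return (int)x; }
    unsigned to_unsigned(double x) { return (unsigned)x; }

For example, ``to_signed(-2.7)`` yields ``-2`` rather than ``-3``, and
``to_unsigned(2.7)`` yields ``2``.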
20408
20409 '``llvm.experimental.constrained.uitofp``' Intrinsic
20410 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20411
20412 Syntax:
20413 """""""
20414
20415 ::
20416
20417       declare <ty2>
20418       @llvm.experimental.constrained.uitofp(<type> <value>,
20419                                           metadata <rounding mode>,
20420                                           metadata <exception behavior>)
20421
20422 Overview:
20423 """""""""
20424
The '``llvm.experimental.constrained.uitofp``' intrinsic converts an
unsigned integer ``value`` to a floating-point value of type ``ty2``.
20427
20428 Arguments:
20429 """"""""""
20430
20431 The first argument to the '``llvm.experimental.constrained.uitofp``'
20432 intrinsic must be an :ref:`integer <t_integer>` or :ref:`vector
20433 <t_vector>` of integer values.
20434
20435 The second and third arguments specify the rounding mode and exception
20436 behavior as described above.
20437
20438 Semantics:
20439 """"""""""
20440
20441 An inexact floating-point exception will be raised if rounding is required.
20442 Any result produced is a floating point value converted from the input
20443 integer operand.
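
Rounding is required whenever the integer has more significant bits than the
destination format can hold. The C sketch below detects such cases for a
32-bit unsigned input; the helper name is invented, and ``float`` is assumed to
be IEEE 754 binary32 with a 24-bit significand.

.. code-block:: c

    #include <stdint.h>

    /* Hypothetical helper: returns 1 if converting v to float is exact,
     * 0 if the conversion had to round. */
    int uint_to_float_is_exact(uint32_t v) {
        volatile float f = (float)v;    /* volatile forces rounding to float */
        return (double)f == (double)v;  /* double holds every uint32_t exactly */
    }

Under these assumptions ``16777216`` (2^24) converts exactly, while ``16777217``
does not and would be a case where the inexact exception is raised.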
20444
20445 '``llvm.experimental.constrained.sitofp``' Intrinsic
20446 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20447
20448 Syntax:
20449 """""""
20450
20451 ::
20452
20453       declare <ty2>
20454       @llvm.experimental.constrained.sitofp(<type> <value>,
20455                                           metadata <rounding mode>,
20456                                           metadata <exception behavior>)
20457
20458 Overview:
20459 """""""""
20460
The '``llvm.experimental.constrained.sitofp``' intrinsic converts a
signed integer ``value`` to a floating-point value of type ``ty2``.
20463
20464 Arguments:
20465 """"""""""
20466
20467 The first argument to the '``llvm.experimental.constrained.sitofp``'
20468 intrinsic must be an :ref:`integer <t_integer>` or :ref:`vector
20469 <t_vector>` of integer values.
20470
20471 The second and third arguments specify the rounding mode and exception
20472 behavior as described above.
20473
20474 Semantics:
20475 """"""""""
20476
20477 An inexact floating-point exception will be raised if rounding is required.
20478 Any result produced is a floating point value converted from the input
20479 integer operand.
20480
20481 '``llvm.experimental.constrained.fptrunc``' Intrinsic
20482 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20483
20484 Syntax:
20485 """""""
20486
20487 ::
20488
20489       declare <ty2>
20490       @llvm.experimental.constrained.fptrunc(<type> <value>,
20491                                           metadata <rounding mode>,
20492                                           metadata <exception behavior>)
20493
20494 Overview:
20495 """""""""
20496
20497 The '``llvm.experimental.constrained.fptrunc``' intrinsic truncates ``value``
20498 to type ``ty2``.
20499
20500 Arguments:
20501 """"""""""
20502
20503 The first argument to the '``llvm.experimental.constrained.fptrunc``'
20504 intrinsic must be :ref:`floating point <t_floating>` or :ref:`vector
20505 <t_vector>` of floating point values. This argument must be larger in size
20506 than the result.
20507
20508 The second and third arguments specify the rounding mode and exception
20509 behavior as described above.
20510
20511 Semantics:
20512 """"""""""
20513
20514 The result produced is a floating point value truncated to be smaller in size
20515 than the operand.
20516
20517 '``llvm.experimental.constrained.fpext``' Intrinsic
20518 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20519
20520 Syntax:
20521 """""""
20522
20523 ::
20524
20525       declare <ty2>
20526       @llvm.experimental.constrained.fpext(<type> <value>,
20527                                           metadata <exception behavior>)
20528
20529 Overview:
20530 """""""""
20531
20532 The '``llvm.experimental.constrained.fpext``' intrinsic extends a
20533 floating-point ``value`` to a larger floating-point value.
20534
20535 Arguments:
20536 """"""""""
20537
20538 The first argument to the '``llvm.experimental.constrained.fpext``'
20539 intrinsic must be :ref:`floating point <t_floating>` or :ref:`vector
20540 <t_vector>` of floating point values. This argument must be smaller in size
20541 than the result.
20542
20543 The second argument specifies the exception behavior as described above.
20544
20545 Semantics:
20546 """"""""""
20547
20548 The result produced is a floating point value extended to be larger in size
20549 than the operand. All restrictions that apply to the fpext instruction also
20550 apply to this intrinsic.
20551
20552 '``llvm.experimental.constrained.fcmp``' and '``llvm.experimental.constrained.fcmps``' Intrinsics
20553 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20554
20555 Syntax:
20556 """""""
20557
20558 ::
20559
20560       declare <ty2>
20561       @llvm.experimental.constrained.fcmp(<type> <op1>, <type> <op2>,
20562                                           metadata <condition code>,
20563                                           metadata <exception behavior>)
20564       declare <ty2>
20565       @llvm.experimental.constrained.fcmps(<type> <op1>, <type> <op2>,
20566                                            metadata <condition code>,
20567                                            metadata <exception behavior>)
20568
20569 Overview:
20570 """""""""
20571
The '``llvm.experimental.constrained.fcmp``' and
'``llvm.experimental.constrained.fcmps``' intrinsics return a boolean
value or vector of boolean values based on comparison of their operands.
20575
20576 If the operands are floating-point scalars, then the result type is a
20577 boolean (:ref:`i1 <t_integer>`).
20578
20579 If the operands are floating-point vectors, then the result type is a
20580 vector of boolean with the same number of elements as the operands being
20581 compared.
20582
20583 The '``llvm.experimental.constrained.fcmp``' intrinsic performs a quiet
20584 comparison operation while the '``llvm.experimental.constrained.fcmps``'
20585 intrinsic performs a signaling comparison operation.
20586
20587 Arguments:
20588 """"""""""
20589
20590 The first two arguments to the '``llvm.experimental.constrained.fcmp``'
20591 and '``llvm.experimental.constrained.fcmps``' intrinsics must be
20592 :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
20593 of floating-point values. Both arguments must have identical types.
20594
20595 The third argument is the condition code indicating the kind of comparison
20596 to perform. It must be a metadata string with one of the following values:
20597
20598 - "``oeq``": ordered and equal
20599 - "``ogt``": ordered and greater than
20600 - "``oge``": ordered and greater than or equal
20601 - "``olt``": ordered and less than
20602 - "``ole``": ordered and less than or equal
20603 - "``one``": ordered and not equal
20604 - "``ord``": ordered (no nans)
20605 - "``ueq``": unordered or equal
20606 - "``ugt``": unordered or greater than
20607 - "``uge``": unordered or greater than or equal
20608 - "``ult``": unordered or less than
20609 - "``ule``": unordered or less than or equal
20610 - "``une``": unordered or not equal
20611 - "``uno``": unordered (either nans)
20612
20613 *Ordered* means that neither operand is a NAN while *unordered* means
20614 that either operand may be a NAN.
20615
20616 The fourth argument specifies the exception behavior as described above.
20617
20618 Semantics:
20619 """"""""""
20620
20621 ``op1`` and ``op2`` are compared according to the condition code given
20622 as the third argument. If the operands are vectors, then the
20623 vectors are compared element by element. Each comparison performed
20624 always yields an :ref:`i1 <t_integer>` result, as follows:
20625
20626 - "``oeq``": yields ``true`` if both operands are not a NAN and ``op1``
20627   is equal to ``op2``.
20628 - "``ogt``": yields ``true`` if both operands are not a NAN and ``op1``
20629   is greater than ``op2``.
20630 - "``oge``": yields ``true`` if both operands are not a NAN and ``op1``
20631   is greater than or equal to ``op2``.
20632 - "``olt``": yields ``true`` if both operands are not a NAN and ``op1``
20633   is less than ``op2``.
20634 - "``ole``": yields ``true`` if both operands are not a NAN and ``op1``
20635   is less than or equal to ``op2``.
20636 - "``one``": yields ``true`` if both operands are not a NAN and ``op1``
20637   is not equal to ``op2``.
20638 - "``ord``": yields ``true`` if both operands are not a NAN.
20639 - "``ueq``": yields ``true`` if either operand is a NAN or ``op1`` is
20640   equal to ``op2``.
20641 - "``ugt``": yields ``true`` if either operand is a NAN or ``op1`` is
20642   greater than ``op2``.
20643 - "``uge``": yields ``true`` if either operand is a NAN or ``op1`` is
20644   greater than or equal to ``op2``.
20645 - "``ult``": yields ``true`` if either operand is a NAN or ``op1`` is
20646   less than ``op2``.
20647 - "``ule``": yields ``true`` if either operand is a NAN or ``op1`` is
20648   less than or equal to ``op2``.
20649 - "``une``": yields ``true`` if either operand is a NAN or ``op1`` is
20650   not equal to ``op2``.
20651 - "``uno``": yields ``true`` if either operand is a NAN.
20652
20653 The quiet comparison operation performed by
20654 '``llvm.experimental.constrained.fcmp``' will only raise an exception
20655 if either operand is a SNAN.  The signaling comparison operation
20656 performed by '``llvm.experimental.constrained.fcmps``' will raise an
20657 exception if either operand is a NAN (QNAN or SNAN). Such an exception
does not preclude a result being produced (e.g., the exception might only
set a flag); therefore, the distinction between ordered and unordered
20660 comparisons is also relevant for the
20661 '``llvm.experimental.constrained.fcmps``' intrinsic.
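
As a minimal sketch of how the two intrinsics might appear in IR, the
hypothetical function below (the name ``@is_less`` and its operands are
illustrative only) performs the same ``olt`` comparison in quiet and in
signaling form. It assumes the ``fpexcept.strict`` exception behavior and the
``strictfp`` attribute that constrained operations require:

::

      declare i1 @llvm.experimental.constrained.fcmp.f64(double, double,
                                                         metadata, metadata)
      declare i1 @llvm.experimental.constrained.fcmps.f64(double, double,
                                                          metadata, metadata)

      ; Hypothetical example function; not part of LLVM.
      define i1 @is_less(double %a, double %b) strictfp {
        ; Quiet form: raises an exception only if an operand is a SNAN.
        %quiet = call i1 @llvm.experimental.constrained.fcmp.f64(
                           double %a, double %b,
                           metadata !"olt",
                           metadata !"fpexcept.strict") strictfp
        ; Signaling form: raises an exception if an operand is any NAN.
        %sig = call i1 @llvm.experimental.constrained.fcmps.f64(
                         double %a, double %b,
                         metadata !"olt",
                         metadata !"fpexcept.strict") strictfp
        ; Both calls yield the same i1 value; only the exceptions differ.
        %r = and i1 %quiet, %sig
        ret i1 %r
      }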
20662
20663 '``llvm.experimental.constrained.fmuladd``' Intrinsic
20664 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20665
20666 Syntax:
20667 """""""
20668
20669 ::
20670
20671       declare <type>
20672       @llvm.experimental.constrained.fmuladd(<type> <op1>, <type> <op2>,
20673                                              <type> <op3>,
20674                                              metadata <rounding mode>,
20675                                              metadata <exception behavior>)
20676
20677 Overview:
20678 """""""""
20679
20680 The '``llvm.experimental.constrained.fmuladd``' intrinsic represents
20681 multiply-add expressions that can be fused if the code generator determines
20682 that (a) the target instruction set has support for a fused operation,
20683 and (b) that the fused operation is more efficient than the equivalent,
20684 separate pair of mul and add instructions.
20685
20686 Arguments:
20687 """"""""""
20688
20689 The first three arguments to the '``llvm.experimental.constrained.fmuladd``'
20690 intrinsic must be floating-point or vector of floating-point values.
20691 All three arguments must have identical types.
20692
20693 The fourth and fifth arguments specify the rounding mode and exception behavior
20694 as described above.
20695
20696 Semantics:
20697 """"""""""
20698
20699 The expression:
20700
20701 ::
20702
      %0 = call float @llvm.experimental.constrained.fmuladd.f32(float %a, float %b, float %c,
                                                                 metadata <rounding mode>,
                                                                 metadata <exception behavior>)
20706
20707 is equivalent to the expression:
20708
20709 ::
20710
      %0 = call float @llvm.experimental.constrained.fmul.f32(float %a, float %b,
                                                              metadata <rounding mode>,
                                                              metadata <exception behavior>)
      %1 = call float @llvm.experimental.constrained.fadd.f32(float %0, float %c,
                                                              metadata <rounding mode>,
                                                              metadata <exception behavior>)
20717
20718 except that it is unspecified whether rounding will be performed between the
20719 multiplication and addition steps. Fusion is not guaranteed, even if the target
20720 platform supports it.
20721 If a fused multiply-add is required, the corresponding
20722 :ref:`llvm.experimental.constrained.fma <int_fma>` intrinsic function should be
20723 used instead.
Like '``llvm.experimental.constrained.fma.*``', this intrinsic never sets
``errno``.
20725
20726 Constrained libm-equivalent Intrinsics
20727 --------------------------------------
20728
20729 In addition to the basic floating-point operations for which constrained
20730 intrinsics are described above, there are constrained versions of various
20731 operations which provide equivalent behavior to a corresponding libm function.
20732 These intrinsics allow the precise behavior of these operations with respect to
20733 rounding mode and exception behavior to be controlled.
20734
20735 As with the basic constrained floating-point intrinsics, the rounding mode
20736 and exception behavior arguments only control the behavior of the optimizer.
20737 They do not change the runtime floating-point environment.
20738
20739
20740 '``llvm.experimental.constrained.sqrt``' Intrinsic
20741 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20742
20743 Syntax:
20744 """""""
20745
20746 ::
20747
20748       declare <type>
20749       @llvm.experimental.constrained.sqrt(<type> <op1>,
20750                                           metadata <rounding mode>,
20751                                           metadata <exception behavior>)
20752
20753 Overview:
20754 """""""""
20755
20756 The '``llvm.experimental.constrained.sqrt``' intrinsic returns the square root
20757 of the specified value, returning the same value as the libm '``sqrt``'
20758 functions would, but without setting ``errno``.
20759
20760 Arguments:
20761 """"""""""
20762
The first argument and the return value are floating-point numbers of the same
type.
20765
20766 The second and third arguments specify the rounding mode and exception
20767 behavior as described above.
20768
20769 Semantics:
20770 """"""""""
20771
20772 This function returns the nonnegative square root of the specified value.
20773 If the value is less than negative zero, a floating-point exception occurs
20774 and the return value is architecture specific.
20775
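A call to this intrinsic might look like the following sketch. The function
name ``@strict_sqrt`` is hypothetical, and the ``round.dynamic`` and
``fpexcept.strict`` metadata values are just one possible choice of rounding
mode and exception behavior:

::

      declare double @llvm.experimental.constrained.sqrt.f64(double, metadata,
                                                             metadata)

      ; Hypothetical example function; not part of LLVM.
      define double @strict_sqrt(double %x) strictfp {
        ; The rounding mode is unknown at compile time and exceptions are
        ; observable, so the optimizer must not fold or reorder this call.
        %r = call double @llvm.experimental.constrained.sqrt.f64(
                           double %x,
                           metadata !"round.dynamic",
                           metadata !"fpexcept.strict") strictfp
        ret double %r
      }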
20776
20777 '``llvm.experimental.constrained.pow``' Intrinsic
20778 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20779
20780 Syntax:
20781 """""""
20782
20783 ::
20784
20785       declare <type>
20786       @llvm.experimental.constrained.pow(<type> <op1>, <type> <op2>,
20787                                          metadata <rounding mode>,
20788                                          metadata <exception behavior>)
20789
20790 Overview:
20791 """""""""
20792
20793 The '``llvm.experimental.constrained.pow``' intrinsic returns the first operand
20794 raised to the (positive or negative) power specified by the second operand.
20795
20796 Arguments:
20797 """"""""""
20798
20799 The first two arguments and the return value are floating-point numbers of the
20800 same type.  The second argument specifies the power to which the first argument
20801 should be raised.
20802
20803 The third and fourth arguments specify the rounding mode and exception
20804 behavior as described above.
20805
20806 Semantics:
20807 """"""""""
20808
20809 This function returns the first value raised to the second power,
20810 returning the same values as the libm ``pow`` functions would, and
20811 handles error conditions in the same way.
20812
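For illustration, a two-operand call could be written as in the sketch below.
The function name ``@strict_pow`` is hypothetical, and the metadata values
assume round-to-nearest with exceptions that may trap:

::

      declare double @llvm.experimental.constrained.pow.f64(double, double,
                                                            metadata, metadata)

      ; Hypothetical example function; not part of LLVM.
      define double @strict_pow(double %x, double %y) strictfp {
        ; Computes %x raised to the power %y.
        %r = call double @llvm.experimental.constrained.pow.f64(
                           double %x, double %y,
                           metadata !"round.tonearest",
                           metadata !"fpexcept.maytrap") strictfp
        ret double %r
      }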
20813
20814 '``llvm.experimental.constrained.powi``' Intrinsic
20815 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20816
20817 Syntax:
20818 """""""
20819
20820 ::
20821
20822       declare <type>
20823       @llvm.experimental.constrained.powi(<type> <op1>, i32 <op2>,
20824                                           metadata <rounding mode>,
20825                                           metadata <exception behavior>)
20826
20827 Overview:
20828 """""""""
20829
20830 The '``llvm.experimental.constrained.powi``' intrinsic returns the first operand
20831 raised to the (positive or negative) power specified by the second operand. The
20832 order of evaluation of multiplications is not defined. When a vector of
20833 floating-point type is used, the second argument remains a scalar integer value.
20834
20835
20836 Arguments:
20837 """"""""""
20838
20839 The first argument and the return value are floating-point numbers of the same
20840 type.  The second argument is a 32-bit signed integer specifying the power to
20841 which the first argument should be raised.
20842
20843 The third and fourth arguments specify the rounding mode and exception
20844 behavior as described above.
20845
20846 Semantics:
20847 """"""""""
20848
20849 This function returns the first value raised to the second power with an
20850 unspecified sequence of rounding operations.
20851
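The sketch below illustrates the vector case, where the exponent stays a
scalar ``i32`` and applies to every element. The function name ``@cube_all``
is hypothetical, and the ``.v4f32`` suffix assumes the usual overload name
mangling for a ``<4 x float>`` operand:

::

      declare <4 x float> @llvm.experimental.constrained.powi.v4f32(
                              <4 x float>, i32, metadata, metadata)

      ; Hypothetical example function; not part of LLVM.
      define <4 x float> @cube_all(<4 x float> %v) strictfp {
        ; Raises each of the four elements to the third power.
        %r = call <4 x float> @llvm.experimental.constrained.powi.v4f32(
                                <4 x float> %v, i32 3,
                                metadata !"round.tonearest",
                                metadata !"fpexcept.ignore") strictfp
        ret <4 x float> %r
      }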
20852
20853 '``llvm.experimental.constrained.sin``' Intrinsic
20854 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20855
20856 Syntax:
20857 """""""
20858
20859 ::
20860
20861       declare <type>
20862       @llvm.experimental.constrained.sin(<type> <op1>,
20863                                          metadata <rounding mode>,
20864                                          metadata <exception behavior>)
20865
20866 Overview:
20867 """""""""
20868
20869 The '``llvm.experimental.constrained.sin``' intrinsic returns the sine of the
20870 first operand.
20871
20872 Arguments:
20873 """"""""""
20874
The first argument and the return value are floating-point numbers of the same
type.
20877
20878 The second and third arguments specify the rounding mode and exception
20879 behavior as described above.
20880
20881 Semantics:
20882 """"""""""
20883
20884 This function returns the sine of the specified operand, returning the
20885 same values as the libm ``sin`` functions would, and handles error
20886 conditions in the same way.
20887
20888
20889 '``llvm.experimental.constrained.cos``' Intrinsic
20890 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20891
20892 Syntax:
20893 """""""
20894
20895 ::
20896
20897       declare <type>
20898       @llvm.experimental.constrained.cos(<type> <op1>,
20899                                          metadata <rounding mode>,
20900                                          metadata <exception behavior>)
20901
20902 Overview:
20903 """""""""
20904
20905 The '``llvm.experimental.constrained.cos``' intrinsic returns the cosine of the
20906 first operand.
20907
20908 Arguments:
20909 """"""""""
20910
The first argument and the return value are floating-point numbers of the same
type.
20913
20914 The second and third arguments specify the rounding mode and exception
20915 behavior as described above.
20916
20917 Semantics:
20918 """"""""""
20919
20920 This function returns the cosine of the specified operand, returning the
20921 same values as the libm ``cos`` functions would, and handles error
20922 conditions in the same way.
20923
20924
20925 '``llvm.experimental.constrained.exp``' Intrinsic
20926 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20927
20928 Syntax:
20929 """""""
20930
20931 ::
20932
20933       declare <type>
20934       @llvm.experimental.constrained.exp(<type> <op1>,
20935                                          metadata <rounding mode>,
20936                                          metadata <exception behavior>)
20937
20938 Overview:
20939 """""""""
20940
20941 The '``llvm.experimental.constrained.exp``' intrinsic computes the base-e
20942 exponential of the specified value.
20943
20944 Arguments:
20945 """"""""""
20946
20947 The first argument and the return value are floating-point numbers of the same
20948 type.
20949
20950 The second and third arguments specify the rounding mode and exception
20951 behavior as described above.
20952
20953 Semantics:
20954 """"""""""
20955
20956 This function returns the same values as the libm ``exp`` functions
20957 would, and handles error conditions in the same way.
20958
20959
20960 '``llvm.experimental.constrained.exp2``' Intrinsic
20961 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20962
20963 Syntax:
20964 """""""
20965
20966 ::
20967
20968       declare <type>
20969       @llvm.experimental.constrained.exp2(<type> <op1>,
20970                                           metadata <rounding mode>,
20971                                           metadata <exception behavior>)
20972
20973 Overview:
20974 """""""""
20975
20976 The '``llvm.experimental.constrained.exp2``' intrinsic computes the base-2
20977 exponential of the specified value.
20978
20979
20980 Arguments:
20981 """"""""""
20982
20983 The first argument and the return value are floating-point numbers of the same
20984 type.
20985
20986 The second and third arguments specify the rounding mode and exception
20987 behavior as described above.
20988
20989 Semantics:
20990 """"""""""
20991
20992 This function returns the same values as the libm ``exp2`` functions
20993 would, and handles error conditions in the same way.
20994
20995
20996 '``llvm.experimental.constrained.log``' Intrinsic
20997 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20998
20999 Syntax:
21000 """""""
21001
21002 ::
21003
21004       declare <type>
21005       @llvm.experimental.constrained.log(<type> <op1>,
21006                                          metadata <rounding mode>,
21007                                          metadata <exception behavior>)
21008
21009 Overview:
21010 """""""""
21011
21012 The '``llvm.experimental.constrained.log``' intrinsic computes the base-e
21013 logarithm of the specified value.
21014
21015 Arguments:
21016 """"""""""
21017
21018 The first argument and the return value are floating-point numbers of the same
21019 type.
21020
21021 The second and third arguments specify the rounding mode and exception
21022 behavior as described above.
21023
21024
21025 Semantics:
21026 """"""""""
21027
21028 This function returns the same values as the libm ``log`` functions
21029 would, and handles error conditions in the same way.
21030
21031
21032 '``llvm.experimental.constrained.log10``' Intrinsic
21033 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21034
21035 Syntax:
21036 """""""
21037
21038 ::
21039
21040       declare <type>
21041       @llvm.experimental.constrained.log10(<type> <op1>,
21042                                            metadata <rounding mode>,
21043                                            metadata <exception behavior>)
21044
21045 Overview:
21046 """""""""
21047
21048 The '``llvm.experimental.constrained.log10``' intrinsic computes the base-10
21049 logarithm of the specified value.
21050
21051 Arguments:
21052 """"""""""
21053
21054 The first argument and the return value are floating-point numbers of the same
21055 type.
21056
21057 The second and third arguments specify the rounding mode and exception
21058 behavior as described above.
21059
21060 Semantics:
21061 """"""""""
21062
21063 This function returns the same values as the libm ``log10`` functions
21064 would, and handles error conditions in the same way.
21065
21066
21067 '``llvm.experimental.constrained.log2``' Intrinsic
21068 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21069
21070 Syntax:
21071 """""""
21072
21073 ::
21074
21075       declare <type>
21076       @llvm.experimental.constrained.log2(<type> <op1>,
21077                                           metadata <rounding mode>,
21078                                           metadata <exception behavior>)
21079
21080 Overview:
21081 """""""""
21082
21083 The '``llvm.experimental.constrained.log2``' intrinsic computes the base-2
21084 logarithm of the specified value.
21085
21086 Arguments:
21087 """"""""""
21088
21089 The first argument and the return value are floating-point numbers of the same
21090 type.
21091
21092 The second and third arguments specify the rounding mode and exception
21093 behavior as described above.
21094
21095 Semantics:
21096 """"""""""
21097
21098 This function returns the same values as the libm ``log2`` functions
21099 would, and handles error conditions in the same way.
21100
21101
21102 '``llvm.experimental.constrained.rint``' Intrinsic
21103 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21104
21105 Syntax:
21106 """""""
21107
21108 ::
21109
21110       declare <type>
21111       @llvm.experimental.constrained.rint(<type> <op1>,
21112                                           metadata <rounding mode>,
21113                                           metadata <exception behavior>)
21114
21115 Overview:
21116 """""""""
21117
21118 The '``llvm.experimental.constrained.rint``' intrinsic returns the first
21119 operand rounded to the nearest integer. It may raise an inexact floating-point
21120 exception if the operand is not an integer.
21121
21122 Arguments:
21123 """"""""""
21124
21125 The first argument and the return value are floating-point numbers of the same
21126 type.
21127
21128 The second and third arguments specify the rounding mode and exception
21129 behavior as described above.
21130
21131 Semantics:
21132 """"""""""
21133
21134 This function returns the same values as the libm ``rint`` functions
21135 would, and handles error conditions in the same way.  The rounding mode is
21136 described, not determined, by the rounding mode argument.  The actual rounding
21137 mode is determined by the runtime floating-point environment.  The rounding
21138 mode argument is only intended as information to the compiler.
21139
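To illustrate that the rounding mode argument describes the environment and
does not set it, the sketch below passes ``round.dynamic``, which tells the
optimizer that the rounding mode is unknown at compile time. The function name
``@round_in_current_mode`` is hypothetical:

::

      declare float @llvm.experimental.constrained.rint.f32(float, metadata,
                                                            metadata)

      ; Hypothetical example function; not part of LLVM.
      define float @round_in_current_mode(float %x) strictfp {
        ; The result depends on the rounding mode in effect at run time;
        ; the metadata argument does not change that mode.
        %r = call float @llvm.experimental.constrained.rint.f32(
                          float %x,
                          metadata !"round.dynamic",
                          metadata !"fpexcept.strict") strictfp
        ret float %r
      }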
21140
21141 '``llvm.experimental.constrained.lrint``' Intrinsic
21142 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21143
21144 Syntax:
21145 """""""
21146
21147 ::
21148
21149       declare <inttype>
21150       @llvm.experimental.constrained.lrint(<fptype> <op1>,
21151                                            metadata <rounding mode>,
21152                                            metadata <exception behavior>)
21153
21154 Overview:
21155 """""""""
21156
21157 The '``llvm.experimental.constrained.lrint``' intrinsic returns the first
21158 operand rounded to the nearest integer. An inexact floating-point exception
21159 will be raised if the operand is not an integer. An invalid exception is
21160 raised if the result is too large to fit into a supported integer type,
21161 and in this case the result is undefined.
21162
21163 Arguments:
21164 """"""""""
21165
21166 The first argument is a floating-point number. The return value is an
21167 integer type. Not all types are supported on all targets. The supported
21168 types are the same as the ``llvm.lrint`` intrinsic and the ``lrint``
21169 libm functions.
21170
21171 The second and third arguments specify the rounding mode and exception
21172 behavior as described above.
21173
21174 Semantics:
21175 """"""""""
21176
21177 This function returns the same values as the libm ``lrint`` functions
21178 would, and handles error conditions in the same way.
21179
21180 The rounding mode is described, not determined, by the rounding mode
21181 argument.  The actual rounding mode is determined by the runtime floating-point
21182 environment.  The rounding mode argument is only intended as information
21183 to the compiler.
21184
If the runtime floating-point environment is using the default rounding mode
then the results will be the same as the ``llvm.lrint`` intrinsic.
21187
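Because the return type and the operand type are both overloaded, the
intrinsic name carries two type suffixes. The sketch below assumes an ``i64``
result from a ``double`` operand, which not every target supports; the function
name ``@to_long`` is hypothetical:

::

      declare i64 @llvm.experimental.constrained.lrint.i64.f64(double, metadata,
                                                               metadata)

      ; Hypothetical example function; not part of LLVM.
      define i64 @to_long(double %x) strictfp {
        ; Rounds %x to an integer using the runtime rounding mode.
        %r = call i64 @llvm.experimental.constrained.lrint.i64.f64(
                        double %x,
                        metadata !"round.dynamic",
                        metadata !"fpexcept.strict") strictfp
        ret i64 %r
      }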
21188
21189 '``llvm.experimental.constrained.llrint``' Intrinsic
21190 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21191
21192 Syntax:
21193 """""""
21194
21195 ::
21196
21197       declare <inttype>
21198       @llvm.experimental.constrained.llrint(<fptype> <op1>,
21199                                             metadata <rounding mode>,
21200                                             metadata <exception behavior>)
21201
21202 Overview:
21203 """""""""
21204
21205 The '``llvm.experimental.constrained.llrint``' intrinsic returns the first
21206 operand rounded to the nearest integer. An inexact floating-point exception
21207 will be raised if the operand is not an integer. An invalid exception is
21208 raised if the result is too large to fit into a supported integer type,
21209 and in this case the result is undefined.
21210
21211 Arguments:
21212 """"""""""
21213
21214 The first argument is a floating-point number. The return value is an
21215 integer type. Not all types are supported on all targets. The supported
21216 types are the same as the ``llvm.llrint`` intrinsic and the ``llrint``
21217 libm functions.
21218
21219 The second and third arguments specify the rounding mode and exception
21220 behavior as described above.
21221
21222 Semantics:
21223 """"""""""
21224
21225 This function returns the same values as the libm ``llrint`` functions
21226 would, and handles error conditions in the same way.
21227
21228 The rounding mode is described, not determined, by the rounding mode
21229 argument.  The actual rounding mode is determined by the runtime floating-point
21230 environment.  The rounding mode argument is only intended as information
21231 to the compiler.
21232
If the runtime floating-point environment is using the default rounding mode
then the results will be the same as the ``llvm.llrint`` intrinsic.
21235
21236
21237 '``llvm.experimental.constrained.nearbyint``' Intrinsic
21238 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21239
21240 Syntax:
21241 """""""
21242
21243 ::
21244
21245       declare <type>
21246       @llvm.experimental.constrained.nearbyint(<type> <op1>,
21247                                                metadata <rounding mode>,
21248                                                metadata <exception behavior>)
21249
21250 Overview:
21251 """""""""
21252
21253 The '``llvm.experimental.constrained.nearbyint``' intrinsic returns the first
21254 operand rounded to the nearest integer. It will not raise an inexact
21255 floating-point exception if the operand is not an integer.
21256
21257
21258 Arguments:
21259 """"""""""
21260
21261 The first argument and the return value are floating-point numbers of the same
21262 type.
21263
21264 The second and third arguments specify the rounding mode and exception
21265 behavior as described above.
21266
21267 Semantics:
21268 """"""""""
21269
21270 This function returns the same values as the libm ``nearbyint`` functions
21271 would, and handles error conditions in the same way.  The rounding mode is
21272 described, not determined, by the rounding mode argument.  The actual rounding
21273 mode is determined by the runtime floating-point environment.  The rounding
21274 mode argument is only intended as information to the compiler.
21275
21276
21277 '``llvm.experimental.constrained.maxnum``' Intrinsic
21278 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21279
21280 Syntax:
21281 """""""
21282
21283 ::
21284
21285       declare <type>
      @llvm.experimental.constrained.maxnum(<type> <op1>, <type> <op2>,
21287                                             metadata <exception behavior>)
21288
21289 Overview:
21290 """""""""
21291
21292 The '``llvm.experimental.constrained.maxnum``' intrinsic returns the maximum
21293 of the two arguments.
21294
21295 Arguments:
21296 """"""""""
21297
21298 The first two arguments and the return value are floating-point numbers
21299 of the same type.
21300
21301 The third argument specifies the exception behavior as described above.
21302
21303 Semantics:
21304 """"""""""
21305
21306 This function follows the IEEE-754 semantics for maxNum.
21307
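Unlike the arithmetic intrinsics, this intrinsic takes no rounding mode
argument, so a call carries only the exception behavior metadata. The sketch
below uses a hypothetical function name, ``@strict_max``:

::

      declare double @llvm.experimental.constrained.maxnum.f64(double, double,
                                                               metadata)

      ; Hypothetical example function; not part of LLVM.
      define double @strict_max(double %a, double %b) strictfp {
        ; Two floating-point operands followed by one metadata argument.
        %r = call double @llvm.experimental.constrained.maxnum.f64(
                           double %a, double %b,
                           metadata !"fpexcept.strict") strictfp
        ret double %r
      }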
21308
21309 '``llvm.experimental.constrained.minnum``' Intrinsic
21310 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21311
21312 Syntax:
21313 """""""
21314
21315 ::
21316
21317       declare <type>
      @llvm.experimental.constrained.minnum(<type> <op1>, <type> <op2>,
21319                                             metadata <exception behavior>)
21320
21321 Overview:
21322 """""""""
21323
21324 The '``llvm.experimental.constrained.minnum``' intrinsic returns the minimum
21325 of the two arguments.
21326
21327 Arguments:
21328 """"""""""
21329
21330 The first two arguments and the return value are floating-point numbers
21331 of the same type.
21332
21333 The third argument specifies the exception behavior as described above.
21334
21335 Semantics:
21336 """"""""""
21337
21338 This function follows the IEEE-754 semantics for minNum.
21339
21340
21341 '``llvm.experimental.constrained.maximum``' Intrinsic
21342 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21343
21344 Syntax:
21345 """""""
21346
21347 ::
21348
21349       declare <type>
      @llvm.experimental.constrained.maximum(<type> <op1>, <type> <op2>,
21351                                              metadata <exception behavior>)
21352
21353 Overview:
21354 """""""""
21355
21356 The '``llvm.experimental.constrained.maximum``' intrinsic returns the maximum
21357 of the two arguments, propagating NaNs and treating -0.0 as less than +0.0.
21358
21359 Arguments:
21360 """"""""""
21361
21362 The first two arguments and the return value are floating-point numbers
21363 of the same type.
21364
21365 The third argument specifies the exception behavior as described above.
21366
21367 Semantics:
21368 """"""""""
21369
This function follows the semantics specified in the draft of IEEE 754-2018.
21371
21372
21373 '``llvm.experimental.constrained.minimum``' Intrinsic
21374 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21375
21376 Syntax:
21377 """""""
21378
21379 ::
21380
21381       declare <type>
      @llvm.experimental.constrained.minimum(<type> <op1>, <type> <op2>,
21383                                              metadata <exception behavior>)
21384
21385 Overview:
21386 """""""""
21387
21388 The '``llvm.experimental.constrained.minimum``' intrinsic returns the minimum
21389 of the two arguments, propagating NaNs and treating -0.0 as less than +0.0.
21390
21391 Arguments:
21392 """"""""""
21393
21394 The first two arguments and the return value are floating-point numbers
21395 of the same type.
21396
21397 The third argument specifies the exception behavior as described above.
21398
21399 Semantics:
21400 """"""""""
21401
This function follows the semantics specified in the draft of IEEE 754-2018.
21403
21404
21405 '``llvm.experimental.constrained.ceil``' Intrinsic
21406 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21407
21408 Syntax:
21409 """""""
21410
21411 ::
21412
21413       declare <type>
21414       @llvm.experimental.constrained.ceil(<type> <op1>,
21415                                           metadata <exception behavior>)
21416
21417 Overview:
21418 """""""""
21419
21420 The '``llvm.experimental.constrained.ceil``' intrinsic returns the ceiling of the
21421 first operand.
21422
21423 Arguments:
21424 """"""""""
21425
21426 The first argument and the return value are floating-point numbers of the same
21427 type.
21428
21429 The second argument specifies the exception behavior as described above.
21430
21431 Semantics:
21432 """"""""""
21433
21434 This function returns the same values as the libm ``ceil`` functions
21435 would and handles error conditions in the same way.
21436
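Since the rounding direction is fixed by the operation itself, a call passes
only the operand and the exception behavior. The sketch below uses a
hypothetical function name, ``@strict_ceil``, and assumes ``fpexcept.strict``:

::

      declare double @llvm.experimental.constrained.ceil.f64(double, metadata)

      ; Hypothetical example function; not part of LLVM.
      define double @strict_ceil(double %x) strictfp {
        ; No rounding mode metadata: ceil always rounds toward +infinity.
        %r = call double @llvm.experimental.constrained.ceil.f64(
                           double %x,
                           metadata !"fpexcept.strict") strictfp
        ret double %r
      }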
21437
21438 '``llvm.experimental.constrained.floor``' Intrinsic
21439 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21440
21441 Syntax:
21442 """""""
21443
21444 ::
21445
21446       declare <type>
21447       @llvm.experimental.constrained.floor(<type> <op1>,
21448                                            metadata <exception behavior>)
21449
21450 Overview:
21451 """""""""
21452
21453 The '``llvm.experimental.constrained.floor``' intrinsic returns the floor of the
21454 first operand.
21455
21456 Arguments:
21457 """"""""""
21458
21459 The first argument and the return value are floating-point numbers of the same
21460 type.
21461
21462 The second argument specifies the exception behavior as described above.
21463
21464 Semantics:
21465 """"""""""
21466
21467 This function returns the same values as the libm ``floor`` functions
21468 would and handles error conditions in the same way.
21469
21470
21471 '``llvm.experimental.constrained.round``' Intrinsic
21472 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21473
21474 Syntax:
21475 """""""
21476
21477 ::
21478
21479       declare <type>
21480       @llvm.experimental.constrained.round(<type> <op1>,
21481                                            metadata <exception behavior>)
21482
21483 Overview:
21484 """""""""
21485
21486 The '``llvm.experimental.constrained.round``' intrinsic returns the first
21487 operand rounded to the nearest integer.
21488
21489 Arguments:
21490 """"""""""
21491
21492 The first argument and the return value are floating-point numbers of the same
21493 type.
21494
21495 The second argument specifies the exception behavior as described above.
21496
21497 Semantics:
21498 """"""""""
21499
21500 This function returns the same values as the libm ``round`` functions
21501 would and handles error conditions in the same way.
21502
21503
21504 '``llvm.experimental.constrained.roundeven``' Intrinsic
21505 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21506
21507 Syntax:
21508 """""""
21509
21510 ::
21511
21512       declare <type>
21513       @llvm.experimental.constrained.roundeven(<type> <op1>,
21514                                                metadata <exception behavior>)
21515
21516 Overview:
21517 """""""""
21518
21519 The '``llvm.experimental.constrained.roundeven``' intrinsic returns the first
21520 operand rounded to the nearest integer in floating-point format, rounding
21521 halfway cases to even (that is, to the nearest value that is an even integer),
21522 regardless of the current rounding direction.
21523
21524 Arguments:
21525 """"""""""
21526
21527 The first argument and the return value are floating-point numbers of the same
21528 type.
21529
21530 The second argument specifies the exception behavior as described above.
21531
21532 Semantics:
21533 """"""""""
21534
21535 This function implements IEEE-754 operation ``roundToIntegralTiesToEven``. It
21536 also behaves in the same way as C standard function ``roundeven`` and can signal
21537 the invalid operation exception for a SNAN operand.
21538
21539
21540 '``llvm.experimental.constrained.lround``' Intrinsic
21541 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21542
21543 Syntax:
21544 """""""
21545
21546 ::
21547
21548       declare <inttype>
21549       @llvm.experimental.constrained.lround(<fptype> <op1>,
21550                                             metadata <exception behavior>)
21551
21552 Overview:
21553 """""""""
21554
21555 The '``llvm.experimental.constrained.lround``' intrinsic returns the first
21556 operand rounded to the nearest integer with ties away from zero.  It will
21557 raise an inexact floating-point exception if the operand is not an integer.
21558 An invalid exception is raised if the result is too large to fit into a
21559 supported integer type, and in this case the result is undefined.
21560
21561 Arguments:
21562 """"""""""
21563
21564 The first argument is a floating-point number. The return value is an
21565 integer type. Not all types are supported on all targets. The supported
21566 types are the same as the ``llvm.lround`` intrinsic and the ``lround``
21567 libm functions.
21568
21569 The second argument specifies the exception behavior as described above.
21570
21571 Semantics:
21572 """"""""""
21573
21574 This function returns the same values as the libm ``lround`` functions
21575 would and handles error conditions in the same way.
21576
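As with ``lrint``, the intrinsic name carries both the result and operand type
suffixes, but there is no rounding mode argument. The sketch below assumes an
``i32`` result from a ``float`` operand, which may not be supported on every
target; the function name ``@to_nearest_int`` is hypothetical:

::

      declare i32 @llvm.experimental.constrained.lround.i32.f32(float, metadata)

      ; Hypothetical example function; not part of LLVM.
      define i32 @to_nearest_int(float %x) strictfp {
        ; Rounds to the nearest integer, with ties away from zero.
        %r = call i32 @llvm.experimental.constrained.lround.i32.f32(
                        float %x,
                        metadata !"fpexcept.strict") strictfp
        ret i32 %r
      }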
21577
21578 '``llvm.experimental.constrained.llround``' Intrinsic
21579 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21580
21581 Syntax:
21582 """""""
21583
21584 ::
21585
21586       declare <inttype>
21587       @llvm.experimental.constrained.llround(<fptype> <op1>,
21588                                              metadata <exception behavior>)
21589
21590 Overview:
21591 """""""""
21592
21593 The '``llvm.experimental.constrained.llround``' intrinsic returns the first
21594 operand rounded to the nearest integer with ties away from zero. It will
21595 raise an inexact floating-point exception if the operand is not an integer.
21596 An invalid exception is raised if the result is too large to fit into a
21597 supported integer type, and in this case the result is undefined.
21598
21599 Arguments:
21600 """"""""""
21601
21602 The first argument is a floating-point number. The return value is an
21603 integer type. Not all types are supported on all targets. The supported
21604 types are the same as the ``llvm.llround`` intrinsic and the ``llround``
21605 libm functions.
21606
21607 The second argument specifies the exception behavior as described above.
21608
21609 Semantics:
21610 """"""""""
21611
21612 This function returns the same values as the libm ``llround`` functions
21613 would and handles error conditions in the same way.
21614
21615
21616 '``llvm.experimental.constrained.trunc``' Intrinsic
21617 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21618
21619 Syntax:
21620 """""""
21621
21622 ::
21623
21624       declare <type>
21625       @llvm.experimental.constrained.trunc(<type> <op1>,
21626                                            metadata <exception behavior>)
21627
21628 Overview:
21629 """""""""
21630
21631 The '``llvm.experimental.constrained.trunc``' intrinsic returns the first
21632 operand rounded to the nearest integer not larger in magnitude than the
21633 operand.
21634
21635 Arguments:
21636 """"""""""
21637
21638 The first argument and the return value are floating-point numbers of the same
21639 type.
21640
21641 The second argument specifies the exception behavior as described above.
21642
21643 Semantics:
21644 """"""""""
21645
21646 This function returns the same values as the libm ``trunc`` functions
21647 would and handles error conditions in the same way.
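
The following sketch shows one possible use, assuming the ``.f64`` overload;
the function name ``@drop_fraction`` is purely illustrative:

.. code-block:: llvm

  declare double @llvm.experimental.constrained.trunc.f64(double, metadata)

  ; Hypothetical helper: discard the fractional part of %x (round toward zero).
  define double @drop_fraction(double %x) strictfp {
  entry:
    %t = call double @llvm.experimental.constrained.trunc.f64(
                        double %x, metadata !"fpexcept.strict") strictfp
    ret double %t
  }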
21648
21649 .. _int_experimental_noalias_scope_decl:
21650
21651 '``llvm.experimental.noalias.scope.decl``' Intrinsic
21652 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21653
21654 Syntax:
21655 """""""
21656
21657
21658 ::
21659
21660       declare void @llvm.experimental.noalias.scope.decl(metadata !id.scope.list)
21661
21662 Overview:
21663 """""""""
21664
21665 The ``llvm.experimental.noalias.scope.decl`` intrinsic identifies where a
21666 noalias scope is declared. When the intrinsic is duplicated, a decision must
21667 also be made about the scope: depending on the reason of the duplication,
21668 the scope might need to be duplicated as well.
21669
21670
21671 Arguments:
21672 """"""""""
21673
21674 The ``!id.scope.list`` argument is metadata that is a list of ``noalias``
21675 metadata references. The format is identical to that required for ``noalias``
21676 metadata. This list must have exactly one element.
21677
21678 Semantics:
21679 """"""""""
21680
21681 The ``llvm.experimental.noalias.scope.decl`` intrinsic identifies where a
21682 noalias scope is declared. When the intrinsic is duplicated, a decision must
21683 also be made about the scope: depending on the reason of the duplication,
21684 the scope might need to be duplicated as well.
21685
21686 For example, when the intrinsic is used inside a loop body, and that loop is
21687 unrolled, the associated noalias scope must also be duplicated. Otherwise, the
21688 noalias property it signifies would spill across loop iterations, whereas it
21689 was only valid within a single iteration.
21690
21691 .. code-block:: llvm
21692
21693   ; This example shows two possible positions for noalias.decl and how they impact the semantics:
21694   ; If it is outside the loop (Version 1), then %a and %b are noalias across *all* iterations.
21695   ; If it is inside the loop (Version 2), then %a and %b are noalias only within *one* iteration.
21696   define void @decl_in_loop(i8* %a.base, i8* %b.base) {
21697   entry:
21698     ; call void @llvm.experimental.noalias.scope.decl(metadata !2) ; Version 1: noalias decl outside loop
21699     br label %loop
21700
21701   loop:
21702     %a = phi i8* [ %a.base, %entry ], [ %a.inc, %loop ]
21703     %b = phi i8* [ %b.base, %entry ], [ %b.inc, %loop ]
21704     ; call void @llvm.experimental.noalias.scope.decl(metadata !2) ; Version 2: noalias decl inside loop
21705     %val = load i8, i8* %a, !alias.scope !2
21706     store i8 %val, i8* %b, !noalias !2
21707     %a.inc = getelementptr inbounds i8, i8* %a, i64 1
21708     %b.inc = getelementptr inbounds i8, i8* %b, i64 1
21709     %cond = call i1 @cond()
21710     br i1 %cond, label %loop, label %exit
21711
21712   exit:
21713     ret void
21714   }
21715
21716   !0 = !{!0} ; domain
21717   !1 = !{!1, !0} ; scope
21718   !2 = !{!1} ; scope list
21719
21720 Multiple calls to ``@llvm.experimental.noalias.scope.decl`` for the same scope
21721 are possible, but one should never dominate another. Violations are pointed out
21722 by the verifier as they indicate a problem in either a transformation pass or
21723 the input.
21724
21725
21726 Floating Point Environment Manipulation intrinsics
21727 --------------------------------------------------
21728
21729 These functions read or write the floating-point environment, such as the
21730 rounding mode or the state of floating-point exceptions. Altering the
21731 floating-point environment requires special care. See :ref:`Floating Point Environment <floatenv>`.
21732
21733 '``llvm.flt.rounds``' Intrinsic
21734 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21735
21736 Syntax:
21737 """""""
21738
21739 ::
21740
21741       declare i32 @llvm.flt.rounds()
21742
21743 Overview:
21744 """""""""
21745
21746 The '``llvm.flt.rounds``' intrinsic reads the current rounding mode.
21747
21748 Semantics:
21749 """"""""""
21750
21751 The '``llvm.flt.rounds``' intrinsic returns the current rounding mode.
21752 The encoding of the returned values is the same as the result of ``FLT_ROUNDS``,
21753 as specified by the C standard:
21754
21755 ::
21756
21757     0  - toward zero
21758     1  - to nearest, ties to even
21759     2  - toward positive infinity
21760     3  - toward negative infinity
21761     4  - to nearest, ties away from zero
21762
21763 Other values may be used to represent additional rounding modes supported by a
21764 target. These values are target-specific.
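
A small sketch of how the returned encoding can be consumed; the helper name
``@is_round_to_nearest_even`` is hypothetical:

.. code-block:: llvm

  declare i32 @llvm.flt.rounds()

  ; Hypothetical helper: true when the current mode is
  ; "to nearest, ties to even" (encoding 1 in the table above).
  define i1 @is_round_to_nearest_even() {
  entry:
    %mode = call i32 @llvm.flt.rounds()
    %is.rne = icmp eq i32 %mode, 1
    ret i1 %is.rne
  }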
21765
21766
21767 '``llvm.set.rounding``' Intrinsic
21768 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21769
21770 Syntax:
21771 """""""
21772
21773 ::
21774
21775       declare void @llvm.set.rounding(i32 <val>)
21776
21777 Overview:
21778 """""""""
21779
21780 The '``llvm.set.rounding``' intrinsic sets the current rounding mode.
21781
21782 Arguments:
21783 """"""""""
21784
21785 The argument is the required rounding mode. The encoding of the rounding mode
21786 is the same as that used by '``llvm.flt.rounds``'.
21787
21788 Semantics:
21789 """"""""""
21790
21791 The '``llvm.set.rounding``' intrinsic sets the current rounding mode. It is
21792 similar to the C library function ``fesetround``; however, this intrinsic does not
21793 return any value and uses a platform-independent representation of IEEE rounding
21794 modes.
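
As an illustrative sketch, a caller might save the current mode before
changing it so that it can be restored later. The helper name
``@enter_round_toward_zero`` is an assumption made for this example:

.. code-block:: llvm

  declare i32 @llvm.flt.rounds()
  declare void @llvm.set.rounding(i32)

  ; Hypothetical helper: switch to "toward zero" (encoding 0) and return
  ; the previous mode so that the caller can pass it back to
  ; @llvm.set.rounding when done.
  define i32 @enter_round_toward_zero() {
  entry:
    %old = call i32 @llvm.flt.rounds()
    call void @llvm.set.rounding(i32 0)
    ret i32 %old
  }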
21795
21796
21797 Floating Point Test Intrinsics
21798 ------------------------------
21799
21800 These functions get properties of floating point values.
21801
21802
21803 '``llvm.isnan``' Intrinsic
21804 ^^^^^^^^^^^^^^^^^^^^^^^^^^
21805
21806 Syntax:
21807 """""""
21808
21809 ::
21810
21811       declare i1 @llvm.isnan(<fptype> <op>)
21812       declare <N x i1> @llvm.isnan(<vector-fptype> <op>)
21813
21814 Overview:
21815 """""""""
21816
21817 The '``llvm.isnan``' intrinsic returns a boolean value or vector of boolean
21818 values depending on whether the value is NaN.
21819
21820 If the operand is a floating-point scalar, then the result type is a
21821 boolean (:ref:`i1 <t_integer>`).
21822
21823 If the operand is a floating-point vector, then the result type is a
21824 vector of boolean with the same number of elements as the operand.
21825
21826 Arguments:
21827 """"""""""
21828
21829 The argument to the '``llvm.isnan``' intrinsic must be
21830 :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
21831 of floating-point values.
21832
21833
21834 Semantics:
21835 """"""""""
21836
21837 The function tests if ``op`` is NaN. If ``op`` is a vector, then the
21838 check is made element by element. Each test yields an :ref:`i1 <t_integer>`
21839 result, which is ``true`` if the value is NaN. The function never raises
21840 floating point exceptions.
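
A brief sketch of the scalar and vector forms. The ``.f32`` and ``.v4f32``
suffixes are assumed to follow the usual name mangling for overloaded
intrinsics, and the function names are illustrative only:

.. code-block:: llvm

  declare i1 @llvm.isnan.f32(float)
  declare <4 x i1> @llvm.isnan.v4f32(<4 x float>)

  ; Scalar operand: a single i1 result.
  define i1 @scalar_is_nan(float %x) {
  entry:
    %r = call i1 @llvm.isnan.f32(float %x)
    ret i1 %r
  }

  ; Vector operand: one i1 per element, tested element by element.
  define <4 x i1> @lanes_are_nan(<4 x float> %v) {
  entry:
    %r = call <4 x i1> @llvm.isnan.v4f32(<4 x float> %v)
    ret <4 x i1> %r
  }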
21841
21842
21843 General Intrinsics
21844 ------------------
21845
21846 This class of intrinsics is designed to be generic and has no specific
21847 purpose.
21848
21849 '``llvm.var.annotation``' Intrinsic
21850 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21851
21852 Syntax:
21853 """""""
21854
21855 ::
21856
21857       declare void @llvm.var.annotation(i8* <val>, i8* <str>, i8* <str>, i32  <int>)
21858
21859 Overview:
21860 """""""""
21861
21862 The '``llvm.var.annotation``' intrinsic annotates a local variable with an arbitrary string.
21863
21864 Arguments:
21865 """"""""""
21866
21867 The first argument is a pointer to a value, the second is a pointer to a
21868 global string, the third is a pointer to a global string which is the
21869 source file name, and the last argument is the line number.
21870
21871 Semantics:
21872 """"""""""
21873
21874 This intrinsic allows annotation of local variables with arbitrary
21875 strings. This can be useful for special purpose optimizations that want
21876 to look for these annotations. These have no other defined use; they are
21877 ignored by code generation and optimization.
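
The sketch below follows the four-argument signature shown in the syntax
block above. The global names, the annotation text, the file name ``demo.c``
and the line number are all made up for illustration:

.. code-block:: llvm

  @.str.note = private unnamed_addr constant [5 x i8] c"note\00", section "llvm.metadata"
  @.str.file = private unnamed_addr constant [7 x i8] c"demo.c\00", section "llvm.metadata"

  declare void @llvm.var.annotation(i8*, i8*, i8*, i32)

  define void @annotated_local() {
  entry:
    %x = alloca i32
    %x.i8 = bitcast i32* %x to i8*
    ; Attach the string "note" to %x, recorded as coming from demo.c, line 3.
    call void @llvm.var.annotation(
        i8* %x.i8,
        i8* getelementptr inbounds ([5 x i8], [5 x i8]* @.str.note, i32 0, i32 0),
        i8* getelementptr inbounds ([7 x i8], [7 x i8]* @.str.file, i32 0, i32 0),
        i32 3)
    ret void
  }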
21878
21879 '``llvm.ptr.annotation.*``' Intrinsic
21880 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21881
21882 Syntax:
21883 """""""
21884
21885 This is an overloaded intrinsic. You can use '``llvm.ptr.annotation``' on a
21886 pointer to an integer of any width. *NOTE* you must specify an address space for
21887 the pointer. The identifier for the default address space is the integer
21888 '``0``'.
21889
21890 ::
21891
21892       declare i8*   @llvm.ptr.annotation.p<address space>i8(i8* <val>, i8* <str>, i8* <str>, i32  <int>)
21893       declare i16*  @llvm.ptr.annotation.p<address space>i16(i16* <val>, i8* <str>, i8* <str>, i32  <int>)
21894       declare i32*  @llvm.ptr.annotation.p<address space>i32(i32* <val>, i8* <str>, i8* <str>, i32  <int>)
21895       declare i64*  @llvm.ptr.annotation.p<address space>i64(i64* <val>, i8* <str>, i8* <str>, i32  <int>)
21896       declare i256* @llvm.ptr.annotation.p<address space>i256(i256* <val>, i8* <str>, i8* <str>, i32  <int>)
21897
21898 Overview:
21899 """""""""
21900
21901 The '``llvm.ptr.annotation``' intrinsic annotates a pointer to an integer with an arbitrary string.
21902
21903 Arguments:
21904 """"""""""
21905
21906 The first argument is a pointer to an integer value of arbitrary bitwidth
21907 (result of some expression), the second is a pointer to a global string, the
21908 third is a pointer to a global string which is the source file name, and the
21909 last argument is the line number. It returns the value of the first argument.
21910
21911 Semantics:
21912 """"""""""
21913
21914 This intrinsic allows annotation of a pointer to an integer with arbitrary
21915 strings. This can be useful for special purpose optimizations that want to look
21916 for these annotations. These have no other defined use; they are ignored by code
21917 generation and optimization.
21918
21919 '``llvm.annotation.*``' Intrinsic
21920 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21921
21922 Syntax:
21923 """""""
21924
21925 This is an overloaded intrinsic. You can use '``llvm.annotation``' on
21926 any integer bit width.
21927
21928 ::
21929
21930       declare i8 @llvm.annotation.i8(i8 <val>, i8* <str>, i8* <str>, i32  <int>)
21931       declare i16 @llvm.annotation.i16(i16 <val>, i8* <str>, i8* <str>, i32  <int>)
21932       declare i32 @llvm.annotation.i32(i32 <val>, i8* <str>, i8* <str>, i32  <int>)
21933       declare i64 @llvm.annotation.i64(i64 <val>, i8* <str>, i8* <str>, i32  <int>)
21934       declare i256 @llvm.annotation.i256(i256 <val>, i8* <str>, i8* <str>, i32  <int>)
21935
21936 Overview:
21937 """""""""
21938
21939 The '``llvm.annotation``' intrinsic annotates an arbitrary expression with an arbitrary string.
21940
21941 Arguments:
21942 """"""""""
21943
21944 The first argument is an integer value (result of some expression), the
21945 second is a pointer to a global string, the third is a pointer to a
21946 global string which is the source file name, and the last argument is
21947 the line number. It returns the value of the first argument.
21948
21949 Semantics:
21950 """"""""""
21951
21952 This intrinsic allows annotations to be put on arbitrary expressions
21953 with arbitrary strings. This can be useful for special purpose
21954 optimizations that want to look for these annotations. These have no
21955 other defined use; they are ignored by code generation and optimization.
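
A minimal sketch using the ``i32`` overload; the string contents, the file
name ``demo.c``, the line number and the function name are hypothetical:

.. code-block:: llvm

  @.str.tag = private unnamed_addr constant [4 x i8] c"tag\00", section "llvm.metadata"
  @.str.src = private unnamed_addr constant [7 x i8] c"demo.c\00", section "llvm.metadata"

  declare i32 @llvm.annotation.i32(i32, i8*, i8*, i32)

  define i32 @annotated_sum(i32 %a, i32 %b) {
  entry:
    %sum = add i32 %a, %b
    ; The call returns %sum unchanged; only the annotation is attached.
    %tagged = call i32 @llvm.annotation.i32(
        i32 %sum,
        i8* getelementptr inbounds ([4 x i8], [4 x i8]* @.str.tag, i32 0, i32 0),
        i8* getelementptr inbounds ([7 x i8], [7 x i8]* @.str.src, i32 0, i32 0),
        i32 7)
    ret i32 %tagged
  }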
21956
21957 '``llvm.codeview.annotation``' Intrinsic
21958 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21959
21960 Syntax:
21961 """""""
21962
21963 This annotation emits a label at its program point and an associated
21964 ``S_ANNOTATION`` codeview record with some additional string metadata. This is
21965 used to implement MSVC's ``__annotation`` intrinsic. It is marked
21966 ``noduplicate``, so calls to this intrinsic prevent inlining and should be
21967 considered expensive.
21968
21969 ::
21970
21971       declare void @llvm.codeview.annotation(metadata)
21972
21973 Arguments:
21974 """"""""""
21975
21976 The argument should be an MDTuple containing any number of MDStrings.
21977
21978 '``llvm.trap``' Intrinsic
21979 ^^^^^^^^^^^^^^^^^^^^^^^^^
21980
21981 Syntax:
21982 """""""
21983
21984 ::
21985
21986       declare void @llvm.trap() cold noreturn nounwind
21987
21988 Overview:
21989 """""""""
21990
21991 The '``llvm.trap``' intrinsic causes an execution trap.
21992
21993 Arguments:
21994 """"""""""
21995
21996 None.
21997
21998 Semantics:
21999 """"""""""
22000
22001 This intrinsic is lowered to the target dependent trap instruction. If
22002 the target does not have a trap instruction, this intrinsic will be
22003 lowered to a call of the ``abort()`` function.
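
One common pattern, sketched here with a hypothetical ``@checked_udiv``
function, is to guard an operation and trap on the failing path. Because the
intrinsic is ``noreturn``, the call is followed by ``unreachable``:

.. code-block:: llvm

  declare void @llvm.trap() cold noreturn nounwind

  define i32 @checked_udiv(i32 %a, i32 %b) {
  entry:
    %is.zero = icmp eq i32 %b, 0
    br i1 %is.zero, label %fail, label %ok

  fail:
    ; Division by zero would be undefined, so trap instead.
    call void @llvm.trap()
    unreachable

  ok:
    %q = udiv i32 %a, %b
    ret i32 %q
  }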
22004
22005 '``llvm.debugtrap``' Intrinsic
22006 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22007
22008 Syntax:
22009 """""""
22010
22011 ::
22012
22013       declare void @llvm.debugtrap() nounwind
22014
22015 Overview:
22016 """""""""
22017
22018 The '``llvm.debugtrap``' intrinsic causes an execution trap intended to request the attention of a debugger.
22019
22020 Arguments:
22021 """"""""""
22022
22023 None.
22024
22025 Semantics:
22026 """"""""""
22027
22028 This intrinsic is lowered to code which is intended to cause an
22029 execution trap with the intention of requesting the attention of a
22030 debugger.
22031
22032 '``llvm.ubsantrap``' Intrinsic
22033 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22034
22035 Syntax:
22036 """""""
22037
22038 ::
22039
22040       declare void @llvm.ubsantrap(i8 immarg) cold noreturn nounwind
22041
22042 Overview:
22043 """""""""
22044
22045 The '``llvm.ubsantrap``' intrinsic causes an execution trap that, where possible, encodes the kind of failure detected.
22046
22047 Arguments:
22048 """"""""""
22049
22050 An integer describing the kind of failure detected.
22051
22052 Semantics:
22053 """"""""""
22054
22055 This intrinsic is lowered to code which is intended to cause an execution trap,
22056 embedding the argument into the encoding of that trap, where possible, so that
22057 crashes can be told apart.
22058
22059 Equivalent to ``@llvm.trap`` for targets that do not support this behaviour.
22060
22061 '``llvm.stackprotector``' Intrinsic
22062 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22063
22064 Syntax:
22065 """""""
22066
22067 ::
22068
22069       declare void @llvm.stackprotector(i8* <guard>, i8** <slot>)
22070
22071 Overview:
22072 """""""""
22073
22074 The ``llvm.stackprotector`` intrinsic takes the ``guard`` and stores it
22075 onto the stack at ``slot``. The stack slot is adjusted to ensure that it
22076 is placed on the stack before local variables.
22077
22078 Arguments:
22079 """"""""""
22080
22081 The ``llvm.stackprotector`` intrinsic requires two pointer arguments.
22082 The first argument is the value loaded from the stack guard
22083 ``@__stack_chk_guard``. The second argument is an ``alloca`` that has
22084 enough space to hold the value of the guard.
22085
22086 Semantics:
22087 """"""""""
22088
22089 This intrinsic causes the prologue/epilogue inserter to force the position of
22090 the ``AllocaInst`` stack slot to be before local variables on the stack. This is
22091 to ensure that if a local variable on the stack is overwritten, it will destroy
22092 the value of the guard. When the function exits, the guard on the stack is
22093 checked against the original guard by ``llvm.stackprotectorcheck``. If they are
22094 different, then ``llvm.stackprotectorcheck`` causes the program to abort by
22095 calling the ``__stack_chk_fail()`` function.
22096
22097 '``llvm.stackguard``' Intrinsic
22098 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22099
22100 Syntax:
22101 """""""
22102
22103 ::
22104
22105       declare i8* @llvm.stackguard()
22106
22107 Overview:
22108 """""""""
22109
22110 The ``llvm.stackguard`` intrinsic returns the system stack guard value.
22111
22112 It should not be generated by frontends, since it is only for internal usage.
22113 This intrinsic exists because the IR form of Stack Protector is still
22114 supported in FastISel.
22115
22116 Arguments:
22117 """"""""""
22118
22119 None.
22120
22121 Semantics:
22122 """"""""""
22123
22124 On some platforms, the value returned by this intrinsic remains unchanged
22125 between loads in the same thread. On other platforms, it returns the same
22126 global variable value, if any, e.g. ``@__stack_chk_guard``.
22127
22128 Currently, some platforms (e.g. X86 Linux) have IR-level customized stack guard
22129 loading that is not handled by ``llvm.stackguard()``, although it should be
22130 in the future.
22131
22132 '``llvm.objectsize``' Intrinsic
22133 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22134
22135 Syntax:
22136 """""""
22137
22138 ::
22139
22140       declare i32 @llvm.objectsize.i32(i8* <object>, i1 <min>, i1 <nullunknown>, i1 <dynamic>)
22141       declare i64 @llvm.objectsize.i64(i8* <object>, i1 <min>, i1 <nullunknown>, i1 <dynamic>)
22142
22143 Overview:
22144 """""""""
22145
22146 The ``llvm.objectsize`` intrinsic is designed to provide information to the
22147 optimizer to determine whether a) an operation (like memcpy) will overflow a
22148 buffer that corresponds to an object, or b) that a runtime check for overflow
22149 isn't necessary. An object in this context means an allocation of a specific
22150 class, structure, array, or other object.
22151
22152 Arguments:
22153 """"""""""
22154
22155 The ``llvm.objectsize`` intrinsic takes four arguments. The first argument is a
22156 pointer to or into the ``object``. The second argument determines whether
22157 ``llvm.objectsize`` returns 0 (if true) or -1 (if false) when the object size is
22158 unknown. The third argument controls how ``llvm.objectsize`` acts when ``null``
22159 in address space 0 is used as its pointer argument. If it's ``false``,
22160 ``llvm.objectsize`` reports 0 bytes available when given ``null``. Otherwise, if
22161 the ``null`` is in a non-zero address space or if ``true`` is given for the
22162 third argument of ``llvm.objectsize``, we assume its size is unknown. The fourth
22163 argument to ``llvm.objectsize`` determines if the value should be evaluated at
22164 runtime.
22165
22166 The second, third, and fourth arguments only accept constants.
22167
22168 Semantics:
22169 """"""""""
22170
22171 The ``llvm.objectsize`` intrinsic is lowered to a value representing the size of
22172 the object concerned. If the size cannot be determined, ``llvm.objectsize``
22173 returns ``i32/i64 -1 or 0`` (depending on the ``min`` argument).
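
A short sketch of a query on a stack buffer. The ``.p0i8`` suffix on the
intrinsic name is an assumption based on the usual mangling of the pointer
operand type (the syntax block above shows the shorter form), and
``@bytes_left`` is a hypothetical name. The buffer is 16 bytes and ``%p``
points 4 bytes into it, so 12 bytes remain between ``%p`` and the end of the
object:

.. code-block:: llvm

  declare i64 @llvm.objectsize.i64.p0i8(i8*, i1, i1, i1)

  define i64 @bytes_left() {
  entry:
    %buf = alloca [16 x i8]
    %p = getelementptr inbounds [16 x i8], [16 x i8]* %buf, i64 0, i64 4
    ; min = false:        report -1 if the size is unknown
    ; nullunknown = true: treat a null pointer as "size unknown"
    ; dynamic = false:    do not request evaluation at runtime
    %size = call i64 @llvm.objectsize.i64.p0i8(i8* %p, i1 false, i1 true, i1 false)
    ret i64 %size
  }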
22174
22175 '``llvm.expect``' Intrinsic
22176 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
22177
22178 Syntax:
22179 """""""
22180
22181 This is an overloaded intrinsic. You can use ``llvm.expect`` on any
22182 integer bit width.
22183
22184 ::
22185
22186       declare i1 @llvm.expect.i1(i1 <val>, i1 <expected_val>)
22187       declare i32 @llvm.expect.i32(i32 <val>, i32 <expected_val>)
22188       declare i64 @llvm.expect.i64(i64 <val>, i64 <expected_val>)
22189
22190 Overview:
22191 """""""""
22192
22193 The ``llvm.expect`` intrinsic provides information about the expected (most
22194 probable) value of ``val``, which can be used by optimizers.
22195
22196 Arguments:
22197 """"""""""
22198
22199 The ``llvm.expect`` intrinsic takes two arguments. The first argument is
22200 a value. The second argument is an expected value.
22201
22202 Semantics:
22203 """"""""""
22204
22205 This intrinsic is lowered to the ``val``.
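
A typical use, sketched below with hypothetical ``@process`` and
``@handle_error`` functions, is to mark an error branch as unlikely:

.. code-block:: llvm

  declare i1 @llvm.expect.i1(i1, i1)
  declare void @handle_error()

  define void @process(i32 %status) {
  entry:
    %failed = icmp ne i32 %status, 0
    ; Hint that %failed is most likely false.
    %failed.exp = call i1 @llvm.expect.i1(i1 %failed, i1 false)
    br i1 %failed.exp, label %error, label %done

  error:
    call void @handle_error()
    br label %done

  done:
    ret void
  }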
22206
22207 '``llvm.expect.with.probability``' Intrinsic
22208 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22209
22210 Syntax:
22211 """""""
22212
22213 This intrinsic is similar to ``llvm.expect``. This is an overloaded intrinsic.
22214 You can use ``llvm.expect.with.probability`` on any integer bit width.
22215
22216 ::
22217
22218       declare i1 @llvm.expect.with.probability.i1(i1 <val>, i1 <expected_val>, double <prob>)
22219       declare i32 @llvm.expect.with.probability.i32(i32 <val>, i32 <expected_val>, double <prob>)
22220       declare i64 @llvm.expect.with.probability.i64(i64 <val>, i64 <expected_val>, double <prob>)
22221
22222 Overview:
22223 """""""""
22224
22225 The ``llvm.expect.with.probability`` intrinsic provides information about
22226 the expected value of ``val`` with probability (or confidence) ``prob``, which
22227 can be used by optimizers.
22228
22229 Arguments:
22230 """"""""""
22231
22232 The ``llvm.expect.with.probability`` intrinsic takes three arguments. The first
22233 argument is a value. The second argument is an expected value. The third
22234 argument is a probability.
22235
22236 Semantics:
22237 """"""""""
22238
22239 This intrinsic is lowered to the ``val``.
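
The sketch below attaches a 75% confidence to a comparison result. The
function name ``@pick``, the threshold and the probability value are
arbitrary choices for illustration:

.. code-block:: llvm

  declare i1 @llvm.expect.with.probability.i1(i1, i1, double)

  define i32 @pick(i32 %x) {
  entry:
    %is.small = icmp ult i32 %x, 100
    ; Hint that %is.small is true with probability 0.75.
    %hint = call i1 @llvm.expect.with.probability.i1(i1 %is.small, i1 true,
                                                     double 7.500000e-01)
    br i1 %hint, label %small, label %large

  small:
    ret i32 1

  large:
    ret i32 2
  }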
22240
22241 .. _int_assume:
22242
22243 '``llvm.assume``' Intrinsic
22244 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22245
22246 Syntax:
22247 """""""
22248
22249 ::
22250
22251       declare void @llvm.assume(i1 %cond)
22252
22253 Overview:
22254 """""""""
22255
22256 The ``llvm.assume`` intrinsic allows the optimizer to assume that the provided
22257 condition is true. This information can then be used in simplifying other parts
22258 of the code.
22259
22260 More complex assumptions can be encoded as
22261 :ref:`assume operand bundles <assume_opbundles>`.
22262
22263 Arguments:
22264 """"""""""
22265
22266 The argument of the call is the condition which the optimizer may assume is
22267 always true.
22268
22269 Semantics:
22270 """"""""""
22271
22272 The intrinsic allows the optimizer to assume that the provided condition is
22273 always true whenever the control flow reaches the intrinsic call. No code is
22274 generated for this intrinsic, and instructions that contribute only to the
22275 provided condition are not used for code generation. If the condition is
22276 violated during execution, the behavior is undefined.
22277
22278 Note that the optimizer might limit the transformations performed on values
22279 used by the ``llvm.assume`` intrinsic in order to preserve the instructions
22280 only used to form the intrinsic's input argument. This might prove undesirable
22281 if the extra information provided by the ``llvm.assume`` intrinsic does not cause
22282 sufficient overall improvement in code quality. For this reason,
22283 ``llvm.assume`` should not be used to document basic mathematical invariants
22284 that the optimizer can otherwise deduce or facts that are of little use to the
22285 optimizer.
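
As an illustrative sketch, the following hypothetical function tells the
optimizer that ``%p`` is 32-byte aligned. The ``ptrtoint``, ``and`` and
``icmp`` instructions contribute only to the assumed condition:

.. code-block:: llvm

  declare void @llvm.assume(i1)

  define i32 @load_aligned(i32* %p) {
  entry:
    %addr = ptrtoint i32* %p to i64
    %low = and i64 %addr, 31
    %is.aligned = icmp eq i64 %low, 0
    ; Undefined behavior results if %p is not actually 32-byte aligned here.
    call void @llvm.assume(i1 %is.aligned)
    %v = load i32, i32* %p
    ret i32 %v
  }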
22286
22287 .. _int_ssa_copy:
22288
22289 '``llvm.ssa.copy``' Intrinsic
22290 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22291
22292 Syntax:
22293 """""""
22294
22295 ::
22296
22297       declare type @llvm.ssa.copy(type %operand) returned(1) readnone
22298
22299 Arguments:
22300 """"""""""
22301
22302 The first argument is an operand which is used as the returned value.
22303
22304 Overview:
22305 """"""""""
22306
22307 The ``llvm.ssa.copy`` intrinsic can be used to attach information to
22308 operations by copying them and giving them new names.  For example,
22309 the PredicateInfo utility uses it to build Extended SSA form, and
22310 attach various forms of information to operands that dominate specific
22311 uses.  It is not meant for general use, only for building temporary
22312 renaming forms that require value splits at certain points.
22313
22314 .. _type.test:
22315
22316 '``llvm.type.test``' Intrinsic
22317 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22318
22319 Syntax:
22320 """""""
22321
22322 ::
22323
22324       declare i1 @llvm.type.test(i8* %ptr, metadata %type) nounwind readnone
22325
22326
22327 Arguments:
22328 """"""""""
22329
22330 The first argument is a pointer to be tested. The second argument is a
22331 metadata object representing a :doc:`type identifier <TypeMetadata>`.
22332
22333 Overview:
22334 """""""""
22335
22336 The ``llvm.type.test`` intrinsic tests whether the given pointer is associated
22337 with the given type identifier.
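
A minimal sketch of a check followed by a trap on failure. The type
identifier ``!"typeid1"`` and the function name ``@check`` are placeholders
chosen for this example:

.. code-block:: llvm

  declare i1 @llvm.type.test(i8*, metadata) nounwind readnone
  declare void @llvm.trap() cold noreturn nounwind

  define void @check(i8* %vtable) {
  entry:
    %ok = call i1 @llvm.type.test(i8* %vtable, metadata !"typeid1")
    br i1 %ok, label %cont, label %fail

  fail:
    call void @llvm.trap()
    unreachable

  cont:
    ret void
  }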
22338
22339 .. _type.checked.load:
22340
22341 '``llvm.type.checked.load``' Intrinsic
22342 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22343
22344 Syntax:
22345 """""""
22346
22347 ::
22348
22349       declare {i8*, i1} @llvm.type.checked.load(i8* %ptr, i32 %offset, metadata %type) argmemonly nounwind readonly
22350
22351
22352 Arguments:
22353 """"""""""
22354
22355 The first argument is a pointer from which to load a function pointer. The
22356 second argument is the byte offset from which to load the function pointer. The
22357 third argument is a metadata object representing a :doc:`type identifier
22358 <TypeMetadata>`.
22359
22360 Overview:
22361 """""""""
22362
22363 The ``llvm.type.checked.load`` intrinsic safely loads a function pointer from a
22364 virtual table pointer using type metadata. This intrinsic is used to implement
22365 control flow integrity in conjunction with virtual call optimization. The
22366 virtual call optimization pass will optimize away ``llvm.type.checked.load``
22367 intrinsics associated with devirtualized calls, thereby removing the type
22368 check in cases where it is not needed to enforce the control flow integrity
22369 constraint.
22370
22371 If the given pointer is associated with a type metadata identifier, this
22372 function returns true as the second element of its return value. (Note that
22373 the function may also return true if the given pointer is not associated
22374 with a type metadata identifier.) If the function's return value's second
22375 element is true, the following rules apply to the first element:
22376
22377 - If the given pointer is associated with the given type metadata identifier,
22378   it is the function pointer loaded from the given byte offset from the given
22379   pointer.
22380
22381 - If the given pointer is not associated with the given type metadata
22382   identifier, it is one of the following (the choice of which is unspecified):
22383
22384   1. The function pointer that would have been loaded from an arbitrarily chosen
22385      (through an unspecified mechanism) pointer associated with the type
22386      metadata.
22387
22388   2. If the function has a non-void return type, a pointer to a function that
22389      returns an unspecified value without causing side effects.
22390
22391 If the function's return value's second element is false, the value of the
22392 first element is undefined.
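
The rules above suggest the following usage pattern, sketched with a
placeholder type identifier ``!"typeid1"`` and a hypothetical function
``@call_first_slot``. The loaded pointer is used only after the second
element has been checked:

.. code-block:: llvm

  declare {i8*, i1} @llvm.type.checked.load(i8*, i32, metadata)
  declare void @llvm.trap() cold noreturn nounwind

  define void @call_first_slot(i8* %vtable) {
  entry:
    %pair = call {i8*, i1} @llvm.type.checked.load(i8* %vtable, i32 0, metadata !"typeid1")
    %ok = extractvalue {i8*, i1} %pair, 1
    br i1 %ok, label %cont, label %fail

  fail:
    ; The first element is undefined when the check fails, so do not use it.
    call void @llvm.trap()
    unreachable

  cont:
    %fptr.i8 = extractvalue {i8*, i1} %pair, 0
    %fptr = bitcast i8* %fptr.i8 to void ()*
    call void %fptr()
    ret void
  }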
22393
22394
22395 '``llvm.arithmetic.fence``' Intrinsic
22396 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22397
22398 Syntax:
22399 """""""
22400
22401 ::
22402
22403       declare <type>
22404       @llvm.arithmetic.fence(<type> <op>)
22405
22406 Overview:
22407 """""""""
22408
22409 The purpose of the ``llvm.arithmetic.fence`` intrinsic
22410 is to prevent the optimizer from performing fast-math optimizations,
22411 particularly reassociation,
22412 between the argument and the expression that contains the argument.
22413 It can be used to preserve the parentheses in the source language.
22414
22415 Arguments:
22416 """"""""""
22417
22418 The ``llvm.arithmetic.fence`` intrinsic takes only one argument.
22419 The argument and the return value are floating-point numbers,
22420 or vector floating-point numbers, of the same type.
22421
22422 Semantics:
22423 """"""""""
22424
22425 This intrinsic returns the value of its operand. The optimizer can optimize
22426 the argument, but the optimizer cannot hoist any component of the operand
22427 to the containing context, and the optimizer cannot move the calculation of
22428 any expression in the containing context into the operand.
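
For instance, a source expression ``(a + b) + c`` compiled with reassociation
enabled could be represented as in the sketch below. The ``.f32`` suffix is
assumed from the usual overload mangling, and ``@sum3`` is a hypothetical
name:

.. code-block:: llvm

  declare float @llvm.arithmetic.fence.f32(float)

  ; Computes (a + b) + c. The fence keeps the parenthesized sum from being
  ; reassociated with the outer addition.
  define float @sum3(float %a, float %b, float %c) {
  entry:
    %ab = fadd reassoc float %a, %b
    %ab.fenced = call float @llvm.arithmetic.fence.f32(float %ab)
    %r = fadd reassoc float %ab.fenced, %c
    ret float %r
  }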
22429
22430
22431 '``llvm.donothing``' Intrinsic
22432 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22433
22434 Syntax:
22435 """""""
22436
22437 ::
22438
22439       declare void @llvm.donothing() nounwind readnone
22440
22441 Overview:
22442 """""""""
22443
22444 The ``llvm.donothing`` intrinsic doesn't perform any operation. It's one of only
22445 three intrinsics (besides ``llvm.experimental.patchpoint`` and
22446 ``llvm.experimental.gc.statepoint``) that can be called with an invoke
22447 instruction.
22448
22449 Arguments:
22450 """"""""""
22451
22452 None.
22453
22454 Semantics:
22455 """"""""""
22456
22457 This intrinsic does nothing, and it's removed by optimizers and ignored
22458 by codegen.
22459
22460 '``llvm.experimental.deoptimize``' Intrinsic
22461 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22462
22463 Syntax:
22464 """""""
22465
22466 ::
22467
22468       declare type @llvm.experimental.deoptimize(...) [ "deopt"(...) ]
22469
22470 Overview:
22471 """""""""
22472
22473 This intrinsic, together with :ref:`deoptimization operand bundles
22474 <deopt_opbundles>`, allows frontends to express transfer of control and
22475 frame-local state from the currently executing (typically more specialized,
22476 hence faster) version of a function into another (typically more generic, hence
22477 slower) version.
22478
22479 In languages with a fully integrated managed runtime like Java and JavaScript
22480 this intrinsic can be used to implement "uncommon trap" or "side exit" like
22481 functionality.  In unmanaged languages like C and C++, this intrinsic can be
22482 used to represent the slow paths of specialized functions.
22483
22484
22485 Arguments:
22486 """"""""""
22487
22488 The intrinsic takes an arbitrary number of arguments, whose meaning is
22489 decided by the :ref:`lowering strategy<deoptimize_lowering>`.
22490
22491 Semantics:
22492 """"""""""
22493
22494 The ``@llvm.experimental.deoptimize`` intrinsic executes an attached
22495 deoptimization continuation (denoted using a :ref:`deoptimization
22496 operand bundle <deopt_opbundles>`) and returns the value returned by
22497 the deoptimization continuation.  Defining the semantic properties of
22498 the continuation itself is out of scope of the language reference --
22499 as far as LLVM is concerned, the deoptimization continuation can
22500 invoke arbitrary side effects, including reading from and writing to
22501 the entire heap.
22502
22503 Deoptimization continuations expressed using ``"deopt"`` operand bundles always
22504 continue execution to the end of the physical frame containing them, so all
22505 calls to ``@llvm.experimental.deoptimize`` must be in "tail position":
22506
22507    - ``@llvm.experimental.deoptimize`` cannot be invoked.
22508    - The call must immediately precede a :ref:`ret <i_ret>` instruction.
22509    - The ``ret`` instruction must return the value produced by the
22510      ``@llvm.experimental.deoptimize`` call if there is one, or void.
22511
22512 Note that the above restrictions imply that the return type for a call to
22513 ``@llvm.experimental.deoptimize`` will match the return type of its immediate
22514 caller.
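
As a minimal sketch of a call in tail position, the function below returns
directly on its fast path and deoptimizes on its slow path. The function name
``@f``, its arguments, and the contents of the ``"deopt"`` bundle are
illustrative assumptions, not a prescribed frame layout:

.. code-block:: llvm

  declare i32 @llvm.experimental.deoptimize.i32(...)

  define i32 @f(i32 %x, i1 %fast) {
  entry:
    br i1 %fast, label %fast_path, label %slow_path

  fast_path:
    %r = add i32 %x, 1
    ret i32 %r

  slow_path:
    ; The call is a plain call (not an invoke), immediately precedes the
    ; ret, and the ret returns the value the call produced.
    %v = call i32 (...) @llvm.experimental.deoptimize.i32(i32 %x) [ "deopt"(i32 %x) ]
    ret i32 %v
  }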
22515
22516 The inliner composes the ``"deopt"`` continuations of the caller into the
22517 ``"deopt"`` continuations present in the inlinee, and also updates calls to this
22518 intrinsic to return directly from the frame of the function it inlined into.
22519
22520 All declarations of ``@llvm.experimental.deoptimize`` must share the
22521 same calling convention.
22522
22523 .. _deoptimize_lowering:
22524
22525 Lowering:
22526 """""""""
22527
22528 Calls to ``@llvm.experimental.deoptimize`` are lowered to calls to the
22529 symbol ``__llvm_deoptimize`` (it is the frontend's responsibility to
22530 ensure that this symbol is defined).  The call arguments to
22531 ``@llvm.experimental.deoptimize`` are lowered as if they were formal
22532 arguments of the specified types, and not as varargs.
22533
22534
22535 '``llvm.experimental.guard``' Intrinsic
22536 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22537
22538 Syntax:
22539 """""""
22540
22541 ::
22542
22543       declare void @llvm.experimental.guard(i1, ...) [ "deopt"(...) ]
22544
22545 Overview:
22546 """""""""
22547
22548 This intrinsic, together with :ref:`deoptimization operand bundles
22549 <deopt_opbundles>`, allows frontends to express guards or checks on
22550 optimistic assumptions made during compilation.  The semantics of
22551 ``@llvm.experimental.guard`` is defined in terms of
22552 ``@llvm.experimental.deoptimize`` -- its body is defined to be
22553 equivalent to:
22554
22555 .. code-block:: text
22556
22557   define void @llvm.experimental.guard(i1 %pred, <args...>) {
22558     %realPred = and i1 %pred, undef
22559     br i1 %realPred, label %continue, label %leave [, !make.implicit !{}]
22560
22561   leave:
22562     call void @llvm.experimental.deoptimize(<args...>) [ "deopt"() ]
22563     ret void
22564
22565   continue:
22566     ret void
22567   }
22568
22569
22570 with the optional ``[, !make.implicit !{}]`` present if and only if it
22571 is present on the call site.  For more details on ``!make.implicit``,
22572 see :doc:`FaultMaps`.
22573
In words, ``@llvm.experimental.guard`` executes the attached
``"deopt"`` continuation if (but **not** only if) its first argument
is ``false``.  Since the optimizer is allowed to replace the ``undef``
with an arbitrary value, it can optimize the guard to fail "spuriously",
i.e. without the original condition being false (hence the "not only
if"); this allows for "check widening" type optimizations.
22580
22581 ``@llvm.experimental.guard`` cannot be invoked.
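
For illustration, a frontend might guard an array access with a bounds check
roughly as follows. This is only a sketch: the function ``@load_checked`` and
the values recorded in the ``"deopt"`` bundle are assumptions chosen for the
example.

.. code-block:: llvm

  declare void @llvm.experimental.guard(i1, ...)

  define i32 @load_checked(i32* %arr, i32 %idx, i32 %len) {
  entry:
    %in_bounds = icmp ult i32 %idx, %len
    ; If %in_bounds is false (or the guard fails spuriously), the attached
    ; "deopt" continuation runs instead of the code below.
    call void (i1, ...) @llvm.experimental.guard(i1 %in_bounds) [ "deopt"(i32 %idx, i32 %len) ]
    %idx.ext = zext i32 %idx to i64
    %addr = getelementptr inbounds i32, i32* %arr, i64 %idx.ext
    %val = load i32, i32* %addr
    ret i32 %val
  }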
22582
After ``@llvm.experimental.guard`` was first added, a more general
formulation was found in ``@llvm.experimental.widenable.condition``.
Support for ``@llvm.experimental.guard`` is slowly being rephrased in
terms of this alternative.
22587
22588 '``llvm.experimental.widenable.condition``' Intrinsic
22589 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22590
22591 Syntax:
22592 """""""
22593
22594 ::
22595
22596       declare i1 @llvm.experimental.widenable.condition()
22597
22598 Overview:
22599 """""""""
22600
This intrinsic represents a "widenable condition", which is a
boolean expression with the following property: whether this
expression is `true` or `false`, the program is correct and
well-defined.
22605
22606 Together with :ref:`deoptimization operand bundles <deopt_opbundles>`,
22607 ``@llvm.experimental.widenable.condition`` allows frontends to
22608 express guards or checks on optimistic assumptions made during
22609 compilation and represent them as branch instructions on special
22610 conditions.
22611
22612 While this may appear similar in semantics to `undef`, it is very
22613 different in that an invocation produces a particular, singular
22614 value. It is also intended to be lowered late, and remain available
22615 for specific optimizations and transforms that can benefit from its
22616 special properties.
22617
22618 Arguments:
22619 """"""""""
22620
22621 None.
22622
22623 Semantics:
22624 """"""""""
22625
22626 The intrinsic ``@llvm.experimental.widenable.condition()``
22627 returns either `true` or `false`. For each evaluation of a call
22628 to this intrinsic, the program must be valid and correct both if
22629 it returns `true` and if it returns `false`. This allows
22630 transformation passes to replace evaluations of this intrinsic
22631 with either value whenever one is beneficial.
22632
When used in a branch condition, it allows us to choose between
two alternative correct solutions to the same problem, as in
the example below:
22636
22637 .. code-block:: text
22638
    %cond = call i1 @llvm.experimental.widenable.condition()
    br i1 %cond, label %fast_path, label %slow_path

  fast_path:
    ; Apply memory-consuming but fast solution for a task.

  slow_path:
    ; Cheap in memory but slow solution.
22647
Whether the result of the intrinsic call is `true` or `false`,
it should be correct to pick either solution. We can switch
between them by replacing the result of
``@llvm.experimental.widenable.condition`` with different
`i1` expressions.
22653
22654 This is how it can be used to represent guards as widenable branches:
22655
22656 .. code-block:: text
22657
22658   block:
22659     ; Unguarded instructions
22660     call void @llvm.experimental.guard(i1 %cond, <args...>) ["deopt"(<deopt_args...>)]
22661     ; Guarded instructions
22662
This can be expressed in an equivalent form as an explicit branch on
``@llvm.experimental.widenable.condition``:
22665
22666 .. code-block:: text
22667
22668   block:
22669     ; Unguarded instructions
22670     %widenable_condition = call i1 @llvm.experimental.widenable.condition()
22671     %guard_condition = and i1 %cond, %widenable_condition
22672     br i1 %guard_condition, label %guarded, label %deopt
22673
22674   guarded:
22675     ; Guarded instructions
22676
22677   deopt:
22678     call type @llvm.experimental.deoptimize(<args...>) [ "deopt"(<deopt_args...>) ]
22679
So the block `guarded` is only reachable when `%cond` is `true`,
and it should be valid to go to the block `deopt` regardless of
whether `%cond` is `true` or `false`.
22683
22684 ``@llvm.experimental.widenable.condition`` will never throw, thus
22685 it cannot be invoked.
22686
22687 Guard widening:
22688 """""""""""""""
22689
When ``@llvm.experimental.widenable.condition()`` is used in the
condition of a guard represented as an explicit branch, it is
legal to widen the guard's condition with any additional
conditions.
22694
22695 Guard widening looks like replacement of
22696
22697 .. code-block:: text
22698
22699   %widenable_cond = call i1 @llvm.experimental.widenable.condition()
22700   %guard_cond = and i1 %cond, %widenable_cond
22701   br i1 %guard_cond, label %guarded, label %deopt
22702
22703 with
22704
22705 .. code-block:: text
22706
22707   %widenable_cond = call i1 @llvm.experimental.widenable.condition()
22708   %new_cond = and i1 %any_other_cond, %widenable_cond
22709   %new_guard_cond = and i1 %cond, %new_cond
22710   br i1 %new_guard_cond, label %guarded, label %deopt
22711
for this branch. Here `%any_other_cond` is an arbitrarily chosen
well-defined `i1` value. By widening the guard, we may
impose stricter conditions on the `guarded` block and bail out to the
`deopt` block when the new condition is not met.
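
As a concrete sketch, the function below shows a guard on ``%a.ok`` after it
has been widened with a second check, ``%b.ok``. The function name, the two
bounds checks, and the ``"deopt"`` bundle contents are assumptions made for
illustration only:

.. code-block:: llvm

  declare i1 @llvm.experimental.widenable.condition()
  declare i32 @llvm.experimental.deoptimize.i32(...)

  define i32 @widened(i32 %a, i32 %b, i32 %len) {
  entry:
    %a.ok = icmp ult i32 %a, %len
    %b.ok = icmp ult i32 %b, %len
    %wc = call i1 @llvm.experimental.widenable.condition()
    ; %b.ok plays the role of %any_other_cond in the text above.
    %new_cond = and i1 %b.ok, %wc
    %new_guard_cond = and i1 %a.ok, %new_cond
    br i1 %new_guard_cond, label %guarded, label %deopt

  guarded:
    ; Reachable only when both checks hold.
    %sum = add i32 %a, %b
    ret i32 %sum

  deopt:
    %r = call i32 (...) @llvm.experimental.deoptimize.i32() [ "deopt"(i32 %a, i32 %b) ]
    ret i32 %r
  }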
22716
22717 Lowering:
22718 """""""""
22719
The default lowering strategy is to replace the result of a
call to ``@llvm.experimental.widenable.condition`` with the
constant `true`. However, it is always correct to replace
it with any other `i1` value. Any pass may
do so if it can benefit from non-default lowering.
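
The effect of the default lowering can be sketched with two small functions;
their names are hypothetical and serve only to show the condition before and
after the call is replaced:

.. code-block:: llvm

  declare i1 @llvm.experimental.widenable.condition()

  ; Before lowering: the guard condition depends on the widenable condition.
  define i1 @guard_cond_before(i1 %cond) {
    %wc = call i1 @llvm.experimental.widenable.condition()
    %guard.cond = and i1 %cond, %wc
    ret i1 %guard.cond
  }

  ; After the default lowering: the call has been replaced with true, so
  ; the guard condition simplifies to %cond.
  define i1 @guard_cond_after(i1 %cond) {
    %guard.cond = and i1 %cond, true
    ret i1 %guard.cond
  }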
22725
22726
22727 '``llvm.load.relative``' Intrinsic
22728 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22729
22730 Syntax:
22731 """""""
22732
22733 ::
22734
22735       declare i8* @llvm.load.relative.iN(i8* %ptr, iN %offset) argmemonly nounwind readonly
22736
22737 Overview:
22738 """""""""
22739
22740 This intrinsic loads a 32-bit value from the address ``%ptr + %offset``,
22741 adds ``%ptr`` to that value and returns it. The constant folder specifically
22742 recognizes the form of this intrinsic and the constant initializers it may
22743 load from; if a loaded constant initializer is known to have the form
22744 ``i32 trunc(x - %ptr)``, the intrinsic call is folded to ``x``.
22745
22746 LLVM provides that the calculation of such a constant initializer will
22747 not overflow at link time under the medium code model if ``x`` is an
22748 ``unnamed_addr`` function. However, it does not provide this guarantee for
22749 a constant initializer folded into a function body. This intrinsic can be
22750 used to avoid the possibility of overflows when loading from such a constant.
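
A minimal sketch of a one-entry relative table follows. The names ``@table``,
``@callee`` and ``@lookup`` are assumptions for the example, and a 64-bit
pointer size is assumed. The entry stores the truncated difference between the
target and the table, so loading at offset 0 and adding the table address
yields the address of ``@callee``:

.. code-block:: llvm

  declare i8* @llvm.load.relative.i32(i8*, i32)
  declare void @callee()

  ; One 32-bit entry holding (@callee - @table).
  @table = private unnamed_addr constant [1 x i32] [
    i32 trunc (i64 sub (i64 ptrtoint (void ()* @callee to i64),
                        i64 ptrtoint ([1 x i32]* @table to i64)) to i32)
  ]

  define i8* @lookup() {
    %base = bitcast [1 x i32]* @table to i8*
    ; Loads the i32 at %base + 0 and adds %base to it.
    %fn = call i8* @llvm.load.relative.i32(i8* %base, i32 0)
    ret i8* %fn
  }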
22751
22752 .. _llvm_sideeffect:
22753
22754 '``llvm.sideeffect``' Intrinsic
22755 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22756
22757 Syntax:
22758 """""""
22759
22760 ::
22761
22762       declare void @llvm.sideeffect() inaccessiblememonly nounwind
22763
22764 Overview:
22765 """""""""
22766
22767 The ``llvm.sideeffect`` intrinsic doesn't perform any operation. Optimizers
22768 treat it as having side effects, so it can be inserted into a loop to
22769 indicate that the loop shouldn't be assumed to terminate (which could
22770 potentially lead to the loop being optimized away entirely), even if it's
22771 an infinite loop with no other side effects.
22772
22773 Arguments:
22774 """"""""""
22775
22776 None.
22777
22778 Semantics:
22779 """"""""""
22780
22781 This intrinsic actually does nothing, but optimizers must assume that it
22782 has externally observable side effects.
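
For example, a frontend for a language in which infinite loops are well
defined might emit something like the following sketch (the function name
``@spin`` is an assumption):

.. code-block:: llvm

  declare void @llvm.sideeffect()

  define void @spin() {
  entry:
    br label %loop

  loop:
    ; Without this call, the loop has no side effects of its own.
    call void @llvm.sideeffect()
    br label %loop
  }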
22783
22784 '``llvm.is.constant.*``' Intrinsic
22785 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22786
22787 Syntax:
22788 """""""
22789
This is an overloaded intrinsic. You can use ``llvm.is.constant`` with any argument type.
22791
22792 ::
22793
22794       declare i1 @llvm.is.constant.i32(i32 %operand) nounwind readnone
22795       declare i1 @llvm.is.constant.f32(float %operand) nounwind readnone
22796       declare i1 @llvm.is.constant.TYPENAME(TYPE %operand) nounwind readnone
22797
22798 Overview:
22799 """""""""
22800
22801 The '``llvm.is.constant``' intrinsic will return true if the argument
22802 is known to be a manifest compile-time constant. It is guaranteed to
22803 fold to either true or false before generating machine code.
22804
22805 Semantics:
22806 """"""""""
22807
22808 This intrinsic generates no code. If its argument is known to be a
22809 manifest compile-time constant value, then the intrinsic will be
22810 converted to a constant true value. Otherwise, it will be converted to
22811 a constant false value.
22812
In particular, note that if the argument is a constant expression
which refers to a global (the address of which *is* a constant, but
not manifest during the compile), then the intrinsic evaluates to
false.
22817
22818 The result also intentionally depends on the result of optimization
22819 passes -- e.g., the result can change depending on whether a
22820 function gets inlined or not. A function's parameters are
22821 obviously not constant. However, a call like
22822 ``llvm.is.constant.i32(i32 %param)`` *can* return true after the
22823 function is inlined, if the value passed to the function parameter was
22824 a constant.
22825
22826 On the other hand, if constant folding is not run, it will never
22827 evaluate to true, even in simple cases.
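
The dependence on inlining can be sketched as follows; ``@callee`` and
``@caller`` are hypothetical names. Within ``@callee`` on its own, ``%param``
is not a constant. If ``@callee`` is inlined into ``@caller`` and the constant
``42`` is propagated, the call may instead fold to true:

.. code-block:: llvm

  declare i1 @llvm.is.constant.i32(i32)

  define i1 @callee(i32 %param) {
    ; Folds to false here unless optimization makes %param a constant.
    %c = call i1 @llvm.is.constant.i32(i32 %param)
    ret i1 %c
  }

  define i1 @caller() {
    %r = call i1 @callee(i32 42)
    ret i1 %r
  }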
22828
22829 .. _int_ptrmask:
22830
22831 '``llvm.ptrmask``' Intrinsic
22832 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22833
22834 Syntax:
22835 """""""
22836
22837 ::
22838
      declare ptrty @llvm.ptrmask(ptrty %ptr, intty %mask) readnone speculatable
22840
22841 Arguments:
22842 """"""""""
22843
22844 The first argument is a pointer. The second argument is an integer.
22845
Overview:
"""""""""
22848
22849 The ``llvm.ptrmask`` intrinsic masks out bits of the pointer according to a mask.
22850 This allows stripping data from tagged pointers without converting them to an
22851 integer (ptrtoint/inttoptr). As a consequence, we can preserve more information
22852 to facilitate alias analysis and underlying-object detection.
22853
22854 Semantics:
22855 """"""""""
22856
22857 The result of ``ptrmask(ptr, mask)`` is equivalent to
22858 ``getelementptr ptr, (ptrtoint(ptr) & mask) - ptrtoint(ptr)``. Both the returned
22859 pointer and the first argument are based on the same underlying object (for more
22860 information on the *based on* terminology see
22861 :ref:`the pointer aliasing rules <pointeraliasing>`). If the bitwidth of the
22862 mask argument does not match the pointer size of the target, the mask is
22863 zero-extended or truncated accordingly.
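
As an illustrative sketch, the function below clears the low three bits of a
tagged pointer. The function name and the assumption that the tag occupies the
low three bits are made up for the example:

.. code-block:: llvm

  declare i8* @llvm.ptrmask.p0i8.i64(i8*, i64)

  define i8* @strip_tag(i8* %tagged) {
    ; -8 is ...11111000 in two's complement, so the low 3 bits are cleared.
    %untagged = call i8* @llvm.ptrmask.p0i8.i64(i8* %tagged, i64 -8)
    ret i8* %untagged
  }

Because no ``ptrtoint``/``inttoptr`` round trip is involved, ``%untagged``
remains based on the same underlying object as ``%tagged``.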
22864
22865 .. _int_vscale:
22866
22867 '``llvm.vscale``' Intrinsic
22868 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
22869
22870 Syntax:
22871 """""""
22872
22873 ::
22874
      declare i32 @llvm.vscale.i32()
      declare i64 @llvm.vscale.i64()
22877
22878 Overview:
22879 """""""""
22880
22881 The ``llvm.vscale`` intrinsic returns the value for ``vscale`` in scalable
22882 vectors such as ``<vscale x 16 x i8>``.
22883
22884 Semantics:
22885 """"""""""
22886
22887 ``vscale`` is a positive value that is constant throughout program
22888 execution, but is unknown at compile time.
22889 If the result value does not fit in the result type, then the result is
22890 a :ref:`poison value <poisonvalues>`.
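
A typical use is computing a size that depends on the runtime vector length.
The following sketch (the function name is an assumption) computes the number
of bytes in a ``<vscale x 16 x i8>`` vector:

.. code-block:: llvm

  declare i64 @llvm.vscale.i64()

  define i64 @bytes_in_vector() {
    %vscale = call i64 @llvm.vscale.i64()
    ; <vscale x 16 x i8> holds 16 * vscale one-byte elements.
    %bytes = mul i64 %vscale, 16
    ret i64 %bytes
  }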
22891
22892
22893 Stack Map Intrinsics
22894 --------------------
22895
22896 LLVM provides experimental intrinsics to support runtime patching
22897 mechanisms commonly desired in dynamic language JITs. These intrinsics
22898 are described in :doc:`StackMaps`.
22899
22900 Element Wise Atomic Memory Intrinsics
22901 -------------------------------------
22902
22903 These intrinsics are similar to the standard library memory intrinsics except
22904 that they perform memory transfer as a sequence of atomic memory accesses.
22905
22906 .. _int_memcpy_element_unordered_atomic:
22907
22908 '``llvm.memcpy.element.unordered.atomic``' Intrinsic
22909 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22910
22911 Syntax:
22912 """""""
22913
22914 This is an overloaded intrinsic. You can use ``llvm.memcpy.element.unordered.atomic`` on
22915 any integer bit width and for different address spaces. Not all targets
22916 support all bit widths however.
22917
22918 ::
22919
22920       declare void @llvm.memcpy.element.unordered.atomic.p0i8.p0i8.i32(i8* <dest>,
22921                                                                        i8* <src>,
22922                                                                        i32 <len>,
22923                                                                        i32 <element_size>)
22924       declare void @llvm.memcpy.element.unordered.atomic.p0i8.p0i8.i64(i8* <dest>,
22925                                                                        i8* <src>,
22926                                                                        i64 <len>,
22927                                                                        i32 <element_size>)
22928
22929 Overview:
22930 """""""""
22931
22932 The '``llvm.memcpy.element.unordered.atomic.*``' intrinsic is a specialization of the
22933 '``llvm.memcpy.*``' intrinsic. It differs in that the ``dest`` and ``src`` are treated
22934 as arrays with elements that are exactly ``element_size`` bytes, and the copy between
22935 buffers uses a sequence of :ref:`unordered atomic <ordering>` load/store operations
22936 that are a positive integer multiple of the ``element_size`` in size.
22937
22938 Arguments:
22939 """"""""""
22940
22941 The first three arguments are the same as they are in the :ref:`@llvm.memcpy <int_memcpy>`
22942 intrinsic, with the added constraint that ``len`` is required to be a positive integer
22943 multiple of the ``element_size``. If ``len`` is not a positive integer multiple of
22944 ``element_size``, then the behaviour of the intrinsic is undefined.
22945
``element_size`` must be a compile-time constant positive power of two no greater than
a target-specific atomic access size limit.

For each of the input pointers, the ``align`` parameter attribute must be specified. It
must be a power of two no less than the ``element_size``. The caller guarantees that
both the source and destination pointers are aligned to that boundary.
22952
22953 Semantics:
22954 """"""""""
22955
22956 The '``llvm.memcpy.element.unordered.atomic.*``' intrinsic copies ``len`` bytes of
22957 memory from the source location to the destination location. These locations are not
22958 allowed to overlap. The memory copy is performed as a sequence of load/store operations
22959 where each access is guaranteed to be a multiple of ``element_size`` bytes wide and
22960 aligned at an ``element_size`` boundary.
22961
22962 The order of the copy is unspecified. The same value may be read from the source
22963 buffer many times, but only one write is issued to the destination buffer per
22964 element. It is well defined to have concurrent reads and writes to both source and
22965 destination provided those reads and writes are unordered atomic when specified.
22966
22967 This intrinsic does not provide any additional ordering guarantees over those
22968 provided by a set of unordered loads from the source location and stores to the
22969 destination.
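
A call might look like the following sketch, which copies 64 bytes as sixteen
4-byte elements. The function name and the sizes are assumptions; note the
``align`` attribute on both pointer arguments and that ``len`` is a multiple
of ``element_size``:

.. code-block:: llvm

  declare void @llvm.memcpy.element.unordered.atomic.p0i8.p0i8.i32(i8*, i8*, i32, i32)

  define void @copy_words(i8* %dst, i8* %src) {
    ; %dst and %src are assumed to be 4-byte aligned and non-overlapping.
    call void @llvm.memcpy.element.unordered.atomic.p0i8.p0i8.i32(i8* align 4 %dst, i8* align 4 %src, i32 64, i32 4)
    ret void
  }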
22970
22971 Lowering:
22972 """""""""
22973
In the most general case, a call to '``llvm.memcpy.element.unordered.atomic.*``' is
lowered to a call to the symbol ``__llvm_memcpy_element_unordered_atomic_*``, where '*'
is replaced with the actual element size. See :ref:`RewriteStatepointsForGC intrinsic
lowering <RewriteStatepointsForGC_intrinsic_lowering>` for details on GC specific
lowering.

The optimizer is allowed to inline the memory copy when it's profitable to do so.
22981
22982 '``llvm.memmove.element.unordered.atomic``' Intrinsic
22983 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22984
22985 Syntax:
22986 """""""
22987
22988 This is an overloaded intrinsic. You can use
22989 ``llvm.memmove.element.unordered.atomic`` on any integer bit width and for
22990 different address spaces. Not all targets support all bit widths however.
22991
22992 ::
22993
22994       declare void @llvm.memmove.element.unordered.atomic.p0i8.p0i8.i32(i8* <dest>,
22995                                                                         i8* <src>,
22996                                                                         i32 <len>,
22997                                                                         i32 <element_size>)
22998       declare void @llvm.memmove.element.unordered.atomic.p0i8.p0i8.i64(i8* <dest>,
22999                                                                         i8* <src>,
23000                                                                         i64 <len>,
23001                                                                         i32 <element_size>)
23002
23003 Overview:
23004 """""""""
23005
23006 The '``llvm.memmove.element.unordered.atomic.*``' intrinsic is a specialization
23007 of the '``llvm.memmove.*``' intrinsic. It differs in that the ``dest`` and
23008 ``src`` are treated as arrays with elements that are exactly ``element_size``
23009 bytes, and the copy between buffers uses a sequence of
23010 :ref:`unordered atomic <ordering>` load/store operations that are a positive
23011 integer multiple of the ``element_size`` in size.
23012
23013 Arguments:
23014 """"""""""
23015
23016 The first three arguments are the same as they are in the
23017 :ref:`@llvm.memmove <int_memmove>` intrinsic, with the added constraint that
23018 ``len`` is required to be a positive integer multiple of the ``element_size``.
23019 If ``len`` is not a positive integer multiple of ``element_size``, then the
23020 behaviour of the intrinsic is undefined.
23021
23022 ``element_size`` must be a compile-time constant positive power of two no
23023 greater than a target-specific atomic access size limit.
23024
For each of the input pointers, the ``align`` parameter attribute must be
specified. It must be a power of two no less than the ``element_size``. The
caller guarantees that both the source and destination pointers are aligned to
that boundary.
23029
23030 Semantics:
23031 """"""""""
23032
23033 The '``llvm.memmove.element.unordered.atomic.*``' intrinsic copies ``len`` bytes
23034 of memory from the source location to the destination location. These locations
23035 are allowed to overlap. The memory copy is performed as a sequence of load/store
23036 operations where each access is guaranteed to be a multiple of ``element_size``
23037 bytes wide and aligned at an ``element_size`` boundary.
23038
23039 The order of the copy is unspecified. The same value may be read from the source
23040 buffer many times, but only one write is issued to the destination buffer per
23041 element. It is well defined to have concurrent reads and writes to both source
23042 and destination provided those reads and writes are unordered atomic when
23043 specified.
23044
23045 This intrinsic does not provide any additional ordering guarantees over those
23046 provided by a set of unordered loads from the source location and stores to the
23047 destination.
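
Because the regions may overlap, the intrinsic can be used to shift data
within a single buffer. In this sketch (the function name and sizes are
assumptions) eight 8-byte elements are moved down by one element:

.. code-block:: llvm

  declare void @llvm.memmove.element.unordered.atomic.p0i8.p0i8.i64(i8*, i8*, i64, i32)

  define void @shift_down_one(i8* %buf) {
    ; %buf is assumed to be 8-byte aligned and at least 72 bytes long.
    %src = getelementptr inbounds i8, i8* %buf, i64 8
    call void @llvm.memmove.element.unordered.atomic.p0i8.p0i8.i64(i8* align 8 %buf, i8* align 8 %src, i64 64, i32 8)
    ret void
  }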
23048
23049 Lowering:
23050 """""""""
23051
In the most general case, a call to
'``llvm.memmove.element.unordered.atomic.*``' is lowered to a call to the symbol
``__llvm_memmove_element_unordered_atomic_*``, where '*' is replaced with the
actual element size. See :ref:`RewriteStatepointsForGC intrinsic lowering
<RewriteStatepointsForGC_intrinsic_lowering>` for details on GC specific
lowering.
23058
23059 The optimizer is allowed to inline the memory copy when it's profitable to do so.
23060
23061 .. _int_memset_element_unordered_atomic:
23062
23063 '``llvm.memset.element.unordered.atomic``' Intrinsic
23064 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23065
23066 Syntax:
23067 """""""
23068
23069 This is an overloaded intrinsic. You can use ``llvm.memset.element.unordered.atomic`` on
23070 any integer bit width and for different address spaces. Not all targets
23071 support all bit widths however.
23072
23073 ::
23074
23075       declare void @llvm.memset.element.unordered.atomic.p0i8.i32(i8* <dest>,
23076                                                                   i8 <value>,
23077                                                                   i32 <len>,
23078                                                                   i32 <element_size>)
23079       declare void @llvm.memset.element.unordered.atomic.p0i8.i64(i8* <dest>,
23080                                                                   i8 <value>,
23081                                                                   i64 <len>,
23082                                                                   i32 <element_size>)
23083
23084 Overview:
23085 """""""""
23086
The '``llvm.memset.element.unordered.atomic.*``' intrinsic is a specialization of the
'``llvm.memset.*``' intrinsic. It differs in that the ``dest`` is treated as an array
with elements that are exactly ``element_size`` bytes, and the assignment to that array
uses a sequence of :ref:`unordered atomic <ordering>` store operations
that are a positive integer multiple of the ``element_size`` in size.
23092
23093 Arguments:
23094 """"""""""
23095
23096 The first three arguments are the same as they are in the :ref:`@llvm.memset <int_memset>`
23097 intrinsic, with the added constraint that ``len`` is required to be a positive integer
23098 multiple of the ``element_size``. If ``len`` is not a positive integer multiple of
23099 ``element_size``, then the behaviour of the intrinsic is undefined.
23100
``element_size`` must be a compile-time constant positive power of two no greater than
a target-specific atomic access size limit.

The ``dest`` input pointer must have the ``align`` parameter attribute specified. It
must be a power of two no less than the ``element_size``. The caller guarantees that
the destination pointer is aligned to that boundary.
23107
23108 Semantics:
23109 """"""""""
23110
23111 The '``llvm.memset.element.unordered.atomic.*``' intrinsic sets the ``len`` bytes of
23112 memory starting at the destination location to the given ``value``. The memory is
23113 set with a sequence of store operations where each access is guaranteed to be a
23114 multiple of ``element_size`` bytes wide and aligned at an ``element_size`` boundary.
23115
23116 The order of the assignment is unspecified. Only one write is issued to the
23117 destination buffer per element. It is well defined to have concurrent reads and
23118 writes to the destination provided those reads and writes are unordered atomic
23119 when specified.
23120
23121 This intrinsic does not provide any additional ordering guarantees over those
23122 provided by a set of unordered stores to the destination.
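
For illustration, the following sketch zeroes 32 bytes as four 8-byte
elements; the function name and sizes are assumptions:

.. code-block:: llvm

  declare void @llvm.memset.element.unordered.atomic.p0i8.i32(i8*, i8, i32, i32)

  define void @zero_four_words(i8* %dst) {
    ; %dst is assumed to be 8-byte aligned and at least 32 bytes long.
    call void @llvm.memset.element.unordered.atomic.p0i8.i32(i8* align 8 %dst, i8 0, i32 32, i32 8)
    ret void
  }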
23123
23124 Lowering:
23125 """""""""
23126
In the most general case, a call to '``llvm.memset.element.unordered.atomic.*``' is
lowered to a call to the symbol ``__llvm_memset_element_unordered_atomic_*``, where '*'
is replaced with the actual element size.
23130
23131 The optimizer is allowed to inline the memory assignment when it's profitable to do so.
23132
23133 Objective-C ARC Runtime Intrinsics
23134 ----------------------------------
23135
23136 LLVM provides intrinsics that lower to Objective-C ARC runtime entry points.
23137 LLVM is aware of the semantics of these functions, and optimizes based on that
23138 knowledge. You can read more about the details of Objective-C ARC `here
23139 <https://clang.llvm.org/docs/AutomaticReferenceCounting.html>`_.
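
As a rough sketch of how these intrinsics appear in IR, the function below
retains an object for the duration of a call and then releases it. The
function ``@with_retained`` and the external ``@use`` are hypothetical:

.. code-block:: llvm

  declare i8* @llvm.objc.retain(i8*)
  declare void @llvm.objc.release(i8*)
  declare void @use(i8*)

  define void @with_retained(i8* %obj) {
    %retained = call i8* @llvm.objc.retain(i8* %obj)
    call void @use(i8* %retained)
    call void @llvm.objc.release(i8* %retained)
    ret void
  }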
23140
23141 '``llvm.objc.autorelease``' Intrinsic
23142 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23143
23144 Syntax:
23145 """""""
23146 ::
23147
23148       declare i8* @llvm.objc.autorelease(i8*)
23149
23150 Lowering:
23151 """""""""
23152
23153 Lowers to a call to `objc_autorelease <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-autorelease>`_.
23154
23155 '``llvm.objc.autoreleasePoolPop``' Intrinsic
23156 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23157
23158 Syntax:
23159 """""""
23160 ::
23161
23162       declare void @llvm.objc.autoreleasePoolPop(i8*)
23163
23164 Lowering:
23165 """""""""
23166
23167 Lowers to a call to `objc_autoreleasePoolPop <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-autoreleasepoolpop-void-pool>`_.
23168
23169 '``llvm.objc.autoreleasePoolPush``' Intrinsic
23170 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23171
23172 Syntax:
23173 """""""
23174 ::
23175
23176       declare i8* @llvm.objc.autoreleasePoolPush()
23177
23178 Lowering:
23179 """""""""
23180
23181 Lowers to a call to `objc_autoreleasePoolPush <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-autoreleasepoolpush-void>`_.
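
The pool intrinsics are typically used as a matched pair. In the following
sketch the function name is hypothetical, and ``%obj`` is assumed to be a
reference that this function owns and may hand to the pool:

.. code-block:: llvm

  declare i8* @llvm.objc.autoreleasePoolPush()
  declare void @llvm.objc.autoreleasePoolPop(i8*)
  declare i8* @llvm.objc.autorelease(i8*)

  define void @scoped_autorelease(i8* %obj) {
    %pool = call i8* @llvm.objc.autoreleasePoolPush()
    ; %obj is assumed to be owned by this function.
    %tmp = call i8* @llvm.objc.autorelease(i8* %obj)
    ; Popping the pool releases the objects autoreleased since the push.
    call void @llvm.objc.autoreleasePoolPop(i8* %pool)
    ret void
  }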
23182
23183 '``llvm.objc.autoreleaseReturnValue``' Intrinsic
23184 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23185
23186 Syntax:
23187 """""""
23188 ::
23189
23190       declare i8* @llvm.objc.autoreleaseReturnValue(i8*)
23191
23192 Lowering:
23193 """""""""
23194
23195 Lowers to a call to `objc_autoreleaseReturnValue <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-autoreleasereturnvalue>`_.
23196
23197 '``llvm.objc.copyWeak``' Intrinsic
23198 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23199
23200 Syntax:
23201 """""""
23202 ::
23203
23204       declare void @llvm.objc.copyWeak(i8**, i8**)
23205
23206 Lowering:
23207 """""""""
23208
23209 Lowers to a call to `objc_copyWeak <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-copyweak-id-dest-id-src>`_.
23210
23211 '``llvm.objc.destroyWeak``' Intrinsic
23212 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23213
23214 Syntax:
23215 """""""
23216 ::
23217
23218       declare void @llvm.objc.destroyWeak(i8**)
23219
23220 Lowering:
23221 """""""""
23222
23223 Lowers to a call to `objc_destroyWeak <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-destroyweak-id-object>`_.
23224
23225 '``llvm.objc.initWeak``' Intrinsic
23226 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23227
23228 Syntax:
23229 """""""
23230 ::
23231
23232       declare i8* @llvm.objc.initWeak(i8**, i8*)
23233
23234 Lowering:
23235 """""""""
23236
23237 Lowers to a call to `objc_initWeak <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-initweak>`_.
23238
23239 '``llvm.objc.loadWeak``' Intrinsic
23240 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23241
23242 Syntax:
23243 """""""
23244 ::
23245
23246       declare i8* @llvm.objc.loadWeak(i8**)
23247
23248 Lowering:
23249 """""""""
23250
23251 Lowers to a call to `objc_loadWeak <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-loadweak>`_.
23252
23253 '``llvm.objc.loadWeakRetained``' Intrinsic
23254 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23255
23256 Syntax:
23257 """""""
23258 ::
23259
23260       declare i8* @llvm.objc.loadWeakRetained(i8**)
23261
23262 Lowering:
23263 """""""""
23264
23265 Lowers to a call to `objc_loadWeakRetained <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-loadweakretained>`_.
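
The weak-reference intrinsics operate on a memory slot of type ``i8**``. A
minimal sketch of initializing, reading, and destroying such a slot is shown
below; the function name and the stack-allocated slot are assumptions:

.. code-block:: llvm

  declare i8* @llvm.objc.initWeak(i8**, i8*)
  declare i8* @llvm.objc.loadWeakRetained(i8**)
  declare void @llvm.objc.destroyWeak(i8**)
  declare void @llvm.objc.release(i8*)

  define void @weak_roundtrip(i8* %obj) {
    %slot = alloca i8*
    %init = call i8* @llvm.objc.initWeak(i8** %slot, i8* %obj)
    ; Returns a retained reference, or null if the object is gone.
    %strong = call i8* @llvm.objc.loadWeakRetained(i8** %slot)
    call void @llvm.objc.release(i8* %strong)
    call void @llvm.objc.destroyWeak(i8** %slot)
    ret void
  }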
23266
23267 '``llvm.objc.moveWeak``' Intrinsic
23268 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23269
23270 Syntax:
23271 """""""
23272 ::
23273
23274       declare void @llvm.objc.moveWeak(i8**, i8**)
23275
23276 Lowering:
23277 """""""""
23278
23279 Lowers to a call to `objc_moveWeak <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-moveweak-id-dest-id-src>`_.
23280
23281 '``llvm.objc.release``' Intrinsic
23282 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23283
23284 Syntax:
23285 """""""
23286 ::
23287
23288       declare void @llvm.objc.release(i8*)
23289
23290 Lowering:
23291 """""""""
23292
23293 Lowers to a call to `objc_release <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-release-id-value>`_.
23294
23295 '``llvm.objc.retain``' Intrinsic
23296 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23297
23298 Syntax:
23299 """""""
23300 ::
23301
23302       declare i8* @llvm.objc.retain(i8*)
23303
23304 Lowering:
23305 """""""""
23306
23307 Lowers to a call to `objc_retain <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-retain>`_.
23308
23309 '``llvm.objc.retainAutorelease``' Intrinsic
23310 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23311
23312 Syntax:
23313 """""""
23314 ::
23315
23316       declare i8* @llvm.objc.retainAutorelease(i8*)
23317
23318 Lowering:
23319 """""""""
23320
23321 Lowers to a call to `objc_retainAutorelease <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-retainautorelease>`_.
23322
23323 '``llvm.objc.retainAutoreleaseReturnValue``' Intrinsic
23324 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23325
23326 Syntax:
23327 """""""
23328 ::
23329
23330       declare i8* @llvm.objc.retainAutoreleaseReturnValue(i8*)
23331
23332 Lowering:
23333 """""""""
23334
23335 Lowers to a call to `objc_retainAutoreleaseReturnValue <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-retainautoreleasereturnvalue>`_.
23336
23337 '``llvm.objc.retainAutoreleasedReturnValue``' Intrinsic
23338 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23339
23340 Syntax:
23341 """""""
23342 ::
23343
23344       declare i8* @llvm.objc.retainAutoreleasedReturnValue(i8*)
23345
23346 Lowering:
23347 """""""""
23348
23349 Lowers to a call to `objc_retainAutoreleasedReturnValue <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-retainautoreleasedreturnvalue>`_.
23350
23351 '``llvm.objc.retainBlock``' Intrinsic
23352 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23353
23354 Syntax:
23355 """""""
23356 ::
23357
23358       declare i8* @llvm.objc.retainBlock(i8*)
23359
23360 Lowering:
23361 """""""""
23362
23363 Lowers to a call to `objc_retainBlock <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-retainblock>`_.
23364
23365 '``llvm.objc.storeStrong``' Intrinsic
23366 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23367
23368 Syntax:
23369 """""""
23370 ::
23371
23372       declare void @llvm.objc.storeStrong(i8**, i8*)
23373
23374 Lowering:
23375 """""""""
23376
23377 Lowers to a call to `objc_storeStrong <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-storestrong-id-object-id-value>`_.
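
A minimal sketch of a call site follows; ``@assign_strong``, ``%slot`` and
``%value`` are hypothetical names. According to the linked ARC documentation,
the runtime function retains the new value, stores it into the slot, and
releases the previous contents of the slot.

.. code-block:: llvm

      declare void @llvm.objc.storeStrong(i8**, i8*)

      ; Hypothetical function, for illustration only.
      define void @assign_strong(i8** %slot, i8* %value) {
        ; Expected lowering: call void @objc_storeStrong(i8** %slot, i8* %value)
        call void @llvm.objc.storeStrong(i8** %slot, i8* %value)
        ret void
      }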
23378
23379 '``llvm.objc.storeWeak``' Intrinsic
23380 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23381
23382 Syntax:
23383 """""""
23384 ::
23385
23386       declare i8* @llvm.objc.storeWeak(i8**, i8*)
23387
23388 Lowering:
23389 """""""""
23390
23391 Lowers to a call to `objc_storeWeak <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-storeweak>`_.
23392
23393 Preserving Debug Information Intrinsics
23394 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23395
These intrinsics carry certain debuginfo together with IR-level
operations. For example, it may be desirable to know the structure/union
name and the original user-level field indices. Such information is lost
in the IR ``getelementptr`` instruction, because IR types differ from
debuginfo types and unions are converted to structs in IR.
23402
23403 '``llvm.preserve.array.access.index``' Intrinsic
23404 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23405
23406 Syntax:
23407 """""""
23408 ::
23409
23410       declare <ret_type>
23411       @llvm.preserve.array.access.index.p0s_union.anons.p0a10s_union.anons(<type> base,
23412                                                                            i32 dim,
23413                                                                            i32 index)
23414
23415 Overview:
23416 """""""""
23417
The '``llvm.preserve.array.access.index``' intrinsic returns the getelementptr address
based on the array base ``base``, the array dimension ``dim``, and the last access index
``index`` into the array. The return type ``ret_type`` is a pointer to the array element.
Because ``dim`` and ``index`` are preserved as call arguments, this form is more robust
than a getelementptr instruction, which may be rewritten by compiler transformations.
Metadata of kind ``llvm.preserve.access.index`` is attached to this call instruction
to provide the array or pointer debuginfo type.
The metadata is a ``DICompositeType`` or ``DIDerivedType`` representing the
debuginfo version of ``type``.
23427
23428 Arguments:
23429 """"""""""
23430
The ``base`` is the array base address. The ``dim`` is the array dimension.
If ``dim`` equals 0, ``base`` is a pointer rather than an array.
The ``index`` is the last access index into the array or pointer.
23434
23435 The ``base`` argument must be annotated with an :ref:`elementtype
23436 <attr_elementtype>` attribute at the call-site. This attribute specifies the
23437 getelementptr element type.
23438
23439 Semantics:
23440 """"""""""
23441
The '``llvm.preserve.array.access.index``' intrinsic produces the same result
as a getelementptr with base ``base`` whose index operands are ``dim`` zeros
followed by ``index``.
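
The following sketch illustrates the equivalence for a one-dimensional array.
It uses typed pointers to match the declarations in this section; the function
``@third_element``, the value ``%arr`` and the metadata node numbers are
hypothetical. With opaque pointers, the pointer types and the mangled name
suffix would be spelled differently.

.. code-block:: llvm

      declare i32* @llvm.preserve.array.access.index.p0i32.p0a10i32([10 x i32]*, i32, i32)

      ; Hypothetical function, for illustration only.
      define i32* @third_element([10 x i32]* %arr) {
        ; dim = 1 and index = 2, so the result is the same address as
        ;   getelementptr [10 x i32], [10 x i32]* %arr, i32 0, i32 2
        %p = call i32* @llvm.preserve.array.access.index.p0i32.p0a10i32(
                 [10 x i32]* elementtype([10 x i32]) %arr, i32 1, i32 2),
             !llvm.preserve.access.index !0
        ret i32* %p
      }

      ; Debuginfo description of "int[10]" (illustrative).
      !0 = !DICompositeType(tag: DW_TAG_array_type, baseType: !1, size: 320, elements: !2)
      !1 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
      !2 = !{!3}
      !3 = !DISubrange(count: 10)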
23444
23445 '``llvm.preserve.union.access.index``' Intrinsic
23446 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23447
23448 Syntax:
23449 """""""
23450 ::
23451
23452       declare <type>
23453       @llvm.preserve.union.access.index.p0s_union.anons.p0s_union.anons(<type> base,
23454                                                                         i32 di_index)
23455
23456 Overview:
23457 """""""""
23458
23459 The '``llvm.preserve.union.access.index``' intrinsic carries the debuginfo field index
23460 ``di_index`` and returns the ``base`` address.
23461 The ``llvm.preserve.access.index`` type of metadata is attached to this call instruction
23462 to provide union debuginfo type.
23463 The metadata is a ``DICompositeType`` representing the debuginfo version of ``type``.
23464 The return type ``type`` is the same as the ``base`` type.
23465
23466 Arguments:
23467 """"""""""
23468
23469 The ``base`` is the union base address. The ``di_index`` is the field index in debuginfo.
23470
23471 Semantics:
23472 """"""""""
23473
23474 The '``llvm.preserve.union.access.index``' intrinsic returns the ``base`` address.
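
As an illustration, consider a union whose IR type has a single field while
its debuginfo type has two members. In the sketch below, ``%union.U``,
``@second_member`` and the metadata nodes are hypothetical, and ``di_index``
is assumed to be zero-based. The call returns ``%u`` unchanged and only
records that debuginfo member 1 is being accessed.

.. code-block:: llvm

      %union.U = type { i32 }

      declare %union.U* @llvm.preserve.union.access.index.p0s_union.Us.p0s_union.Us(%union.U*, i32)

      ; Hypothetical function, for illustration only.
      define %union.U* @second_member(%union.U* %u) {
        ; di_index = 1 refers to member "f" in the debuginfo type !0.
        %p = call %union.U* @llvm.preserve.union.access.index.p0s_union.Us.p0s_union.Us(
                 %union.U* %u, i32 1),
             !llvm.preserve.access.index !0
        ret %union.U* %p
      }

      ; Debuginfo description of "union U { int i; float f; }" (illustrative).
      !0 = distinct !DICompositeType(tag: DW_TAG_union_type, name: "U", size: 32, elements: !1)
      !1 = !{!2, !3}
      !2 = !DIDerivedType(tag: DW_TAG_member, name: "i", scope: !0, baseType: !4, size: 32)
      !3 = !DIDerivedType(tag: DW_TAG_member, name: "f", scope: !0, baseType: !5, size: 32)
      !4 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
      !5 = !DIBasicType(name: "float", size: 32, encoding: DW_ATE_float)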
23475
23476 '``llvm.preserve.struct.access.index``' Intrinsic
23477 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23478
23479 Syntax:
23480 """""""
23481 ::
23482
23483       declare <ret_type>
23484       @llvm.preserve.struct.access.index.p0i8.p0s_struct.anon.0s(<type> base,
23485                                                                  i32 gep_index,
23486                                                                  i32 di_index)
23487
23488 Overview:
23489 """""""""
23490
23491 The '``llvm.preserve.struct.access.index``' intrinsic returns the getelementptr address
23492 based on struct base ``base`` and IR struct member index ``gep_index``.
23493 The ``llvm.preserve.access.index`` type of metadata is attached to this call instruction
23494 to provide struct debuginfo type.
23495 The metadata is a ``DICompositeType`` representing the debuginfo version of ``type``.
23496 The return type ``ret_type`` is a pointer type to the structure member.
23497
23498 Arguments:
23499 """"""""""
23500
23501 The ``base`` is the structure base address. The ``gep_index`` is the struct member index
23502 based on IR structures. The ``di_index`` is the struct member index based on debuginfo.
23503
23504 The ``base`` argument must be annotated with an :ref:`elementtype
23505 <attr_elementtype>` attribute at the call-site. This attribute specifies the
23506 getelementptr element type.
23507
23508 Semantics:
23509 """"""""""
23510
23511 The '``llvm.preserve.struct.access.index``' intrinsic produces the same result
23512 as a getelementptr with base ``base`` and access operands ``{0, gep_index}``.
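
The sketch below shows the equivalence for a two-field structure. It uses
typed pointers to match the declarations in this section; ``%struct.S``,
``@second_field`` and the metadata nodes are hypothetical. In this simple case
``gep_index`` and ``di_index`` happen to coincide, which is not guaranteed in
general.

.. code-block:: llvm

      %struct.S = type { i32, i32 }

      declare i32* @llvm.preserve.struct.access.index.p0i32.p0s_struct.Ss(%struct.S*, i32, i32)

      ; Hypothetical function, for illustration only.
      define i32* @second_field(%struct.S* %s) {
        ; gep_index = 1 and di_index = 1, so the result is the same address as
        ;   getelementptr %struct.S, %struct.S* %s, i32 0, i32 1
        %p = call i32* @llvm.preserve.struct.access.index.p0i32.p0s_struct.Ss(
                 %struct.S* elementtype(%struct.S) %s, i32 1, i32 1),
             !llvm.preserve.access.index !0
        ret i32* %p
      }

      ; Debuginfo description of "struct S { int a; int b; }" (illustrative).
      !0 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "S", size: 64, elements: !1)
      !1 = !{!2, !3}
      !2 = !DIDerivedType(tag: DW_TAG_member, name: "a", scope: !0, baseType: !4, size: 32)
      !3 = !DIDerivedType(tag: DW_TAG_member, name: "b", scope: !0, baseType: !4, size: 32, offset: 32)
      !4 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)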