native_client_sdk/src/doc/reference/pnacl-bitcode-abi.rst

   1 ==============================
   2 PNaCl Bitcode Reference Manual
   3 ==============================
   4
   5 .. contents::
   6    :local:
   7    :backlinks: none
   8    :depth: 3
   9
  10 Introduction
  11 ============
  12
  13 This document is a reference manual for the PNaCl bitcode format. It describes
  14 the bitcode on a *semantic* level; the physical encoding level will be described
  15 elsewhere. For the purpose of this document, the textual form of LLVM IR is
  16 used to describe instructions and other bitcode constructs.
  17
  18 Since the PNaCl bitcode is based to a large extent on LLVM IR as of
  19 version 3.3, many sections in this document point to a relevant section
  20 of the LLVM language reference manual. Only the changes, restrictions
  21 and variations specific to PNaCl are described---full semantic
  22 descriptions are not duplicated from the LLVM reference manual.
  23
  24 High Level Structure
  25 ====================
  26
  27 A PNaCl portable executable (**pexe** for short) is a single LLVM IR module.
  28
  29 Data Model
  30 ----------
  31
  32 The data model for PNaCl bitcode is fixed at little-endian ILP32: pointers are
  33 32 bits in size. 64-bit integer types are also supported natively via the i64
  34 type (for example, a front-end can generate these from the C/C++ type
  35 ``long long``).
  36
  37 Floating point support is fixed at IEEE 754 32-bit and 64-bit values (f32 and
  38 f64, respectively).
  39
  40 .. _bitcode_linkagetypes:
  41
  42 Linkage Types
  43 -------------
  44
  45 `LLVM LangRef: Linkage Types
  46 <http://llvm.org/releases/3.3/docs/LangRef.html#linkage>`_
  47
  48 The linkage types supported by PNaCl bitcode are ``internal`` and ``external``.
  49 A single function in the pexe, named ``_start``, has the linkage type
  50 ``external``. All the other functions and globals have the linkage type
  51 ``internal``.
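
     As a minimal sketch of these linkage rules, a module might be laid out as
     shown here. The helper function ``@helper`` and the ``i32`` parameter of
     ``@_start`` are assumptions made for illustration; they are not specified
     in this document.

     .. naclcode::
       :prettyprint: 0

         ; Every function other than _start is internal.
         define internal i32 @helper(i32 %x) {
           %r = mul i32 %x, 2
           ret i32 %r
         }

         ; No linkage keyword means external linkage in textual LLVM IR.
         define void @_start(i32 %info) {
           %unused = call i32 @helper(i32 21)
           ret void
         }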
  52
  53 Calling Conventions
  54 -------------------
  55
  56 `LLVM LangRef: Calling Conventions
  57 <http://llvm.org/releases/3.3/docs/LangRef.html#callingconv>`_
  58
  59 The only calling convention supported by PNaCl bitcode is ``ccc`` - the C
  60 calling convention.
  61
  62 Visibility Styles
  63 -----------------
  64
  65 `LLVM LangRef: Visibility Styles
  66 <http://llvm.org/releases/3.3/docs/LangRef.html#visibility-styles>`_
  67
  68 PNaCl bitcode does not support visibility styles.
  69
  70 .. _bitcode_globalvariables:
  71
  72 Global Variables
  73 ----------------
  74
  75 `LLVM LangRef: Global Variables
  76 <http://llvm.org/releases/3.3/docs/LangRef.html#globalvars>`_
  77
  78 Restrictions on global variables:
  79
  80 * PNaCl bitcode does not support LLVM IR TLS models. See
  81   :ref:`language_support_threading` for more details.
  82 * Restrictions on :ref:`linkage types <bitcode_linkagetypes>`.
  83 * The ``addrspace``, ``section``, ``unnamed_addr`` and
  84   ``externally_initialized`` attributes are not supported.
  85
  86 Every global variable must have an initializer. Each initializer must be
  87 either a *SimpleElement* or a *CompoundElement*, defined as follows.
  88
  89 A *SimpleElement* is one of the following:
  90
  91 1) An i8 array literal or ``zeroinitializer``:
  92
  93 .. naclcode::
  94   :prettyprint: 0
  95
  96      [SIZE x i8] c"DATA"
  97      [SIZE x i8] zeroinitializer
  98
  99 2) A reference to a *GlobalValue* (a function or global variable) with an
 100    optional 32-bit byte offset added to it (the addend, which may be
 101    negative):
 102
 103 .. naclcode::
 104   :prettyprint: 0
 105
 106      ptrtoint (TYPE* @GLOBAL to i32)
 107      add (i32 ptrtoint (TYPE* @GLOBAL to i32), i32 ADDEND)
 108
 109 A *CompoundElement* is an unnamed, packed struct containing more than one
 110 *SimpleElement*.
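
     The initializer forms above can be sketched as complete global
     definitions. The global names (``@msg``, ``@scratch``, ``@msg_tail``,
     ``@record``) and their contents are hypothetical and serve only to
     illustrate the rules.

     .. naclcode::
       :prettyprint: 0

         ; SimpleElement: an i8 array literal.
         @msg = internal constant [6 x i8] c"hello\00"

         ; SimpleElement: zero-filled storage.
         @scratch = internal global [16 x i8] zeroinitializer

         ; SimpleElement: the address of @msg plus a byte offset of 2.
         @msg_tail = internal global i32
             add (i32 ptrtoint ([6 x i8]* @msg to i32), i32 2)

         ; CompoundElement: an unnamed, packed struct of two SimpleElements.
         @record = internal global <{ [4 x i8], i32 }>
             <{ [4 x i8] c"\01\02\03\04",
                i32 ptrtoint ([16 x i8]* @scratch to i32) }>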
 111
 112 Functions
 113 ---------
 114
 115 `LLVM LangRef: Functions
 116 <http://llvm.org/releases/3.3/docs/LangRef.html#functionstructure>`_
 117
 118 The restrictions on :ref:`linkage types <bitcode_linkagetypes>`, calling
 119 conventions and visibility styles apply to functions. In addition, the following
 120 are not supported for functions:
 121
 122 * Function attributes (either for the function itself, its parameters or its
 123   return type).
 124 * Garbage collector name (``gc``).
 125 * Functions with a variable number of arguments (*vararg*).
 126 * Alignment (``align``).
 127
 128 Aliases
 129 -------
 130
 131 `LLVM LangRef: Aliases
 132 <http://llvm.org/releases/3.3/docs/LangRef.html#aliases>`_
 133
 134 PNaCl bitcode does not support aliases.
 135
 136 Named Metadata
 137 --------------
 138
 139 `LLVM LangRef: Named Metadata
 140 <http://llvm.org/releases/3.3/docs/LangRef.html#namedmetadatastructure>`_
 141
 142 While PNaCl bitcode has provisions for debugging metadata, it is not considered
 143 part of the stable ABI. It exists for tool support and should not appear in
 144 distributed pexes.
 145
 146 Other kinds of LLVM metadata are not supported.
 147
 148 Module-Level Inline Assembly
 149 ----------------------------
 150
 151 `LLVM LangRef: Module-Level Inline Assembly
 152 <http://llvm.org/releases/3.3/docs/LangRef.html#moduleasm>`_
 153
 154 PNaCl bitcode does not support inline assembly.
 155
 156 Volatile Memory Accesses
 157 ------------------------
 158
 159 `LLVM LangRef: Volatile Memory Accesses
 160 <http://llvm.org/releases/3.3/docs/LangRef.html#volatile>`_
 161
 162 PNaCl bitcode does not support volatile memory accesses. The
 163 ``volatile`` attribute on loads and stores is not supported. See the
 164 :doc:`pnacl-c-cpp-language-support` for more details.
 165
 166 Memory Model for Concurrent Operations
 167 --------------------------------------
 168
 169 `LLVM LangRef: Memory Model for Concurrent Operations
 170 <http://llvm.org/releases/3.3/docs/LangRef.html#memmodel>`_
 171
 172 See the :doc:`PNaCl C/C++ Language Support <pnacl-c-cpp-language-support>`
 173 for details.
 174
 175 Fast-Math Flags
 176 ---------------
 177
 178 `LLVM LangRef: Fast-Math Flags
 179 <http://llvm.org/releases/3.3/docs/LangRef.html#fastmath>`_
 180
 181 Fast-math mode is not currently supported by the PNaCl bitcode.
 182
 183 Type System
 184 ===========
 185
 186 `LLVM LangRef: Type System
 187 <http://llvm.org/releases/3.3/docs/LangRef.html#typesystem>`_
 188
 189 The LLVM types allowed in PNaCl bitcode are restricted, as follows:
 190
 191 Scalar types
 192 ------------
 193
 194 * The only scalar types allowed are integer, float (32-bit floating point),
 195   double (64-bit floating point) and void.
 196
 197   * The only integer sizes allowed are i1, i8, i16, i32 and i64.
 198   * The only integer sizes allowed for function arguments and function return
 199     values are i32 and i64.
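
     One consequence of the argument restriction is that narrower integers
     cross function boundaries as i32 and are narrowed or widened inside the
     function. A minimal sketch, with the hypothetical function name
     ``@add_bytes``:

     .. naclcode::
       :prettyprint: 0

         define internal i32 @add_bytes(i32 %a, i32 %b) {
           ; Narrow the i32 arguments to the i8 values they carry.
           %a8 = trunc i32 %a to i8
           %b8 = trunc i32 %b to i8
           ; i8 arithmetic is allowed inside the function body.
           %sum8 = add i8 %a8, %b8
           ; Widen back to i32 for the return value.
           %sum = zext i8 %sum8 to i32
           ret i32 %sum
         }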
 200
 201 Vector types
 202 ------------
 203
 204 The only vector types allowed are:
 205
 206 * 128-bit vectors of integers with element size i8, i16 or i32.
 207 * 128-bit vectors of float elements.
 208 * Vectors of i1 type with element counts corresponding to the allowed
 209   element counts listed previously (their width is therefore not
 210   128 bits).
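
     As an illustration, the following sketch operates on a 128-bit vector of
     four i32 elements. The function name ``@first_lane_doubled`` is
     hypothetical, and the address is assumed to refer to 16 readable bytes.

     .. naclcode::
       :prettyprint: 0

         define internal i32 @first_lane_doubled(i32 %addr) {
           %p = inttoptr i32 %addr to <4 x i32>*
           ; Vector accesses are aligned to the element width (4 bytes here).
           %v = load <4 x i32>* %p, align 4
           %s = add <4 x i32> %v, %v
           %x = extractelement <4 x i32> %s, i32 0
           ret i32 %x
         }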
 211
 212 Array and struct types
 213 ----------------------
 214
 215 Array and struct types are only allowed in
 216 :ref:`global variable initializers <bitcode_globalvariables>`.
 217
 218 .. _bitcode_pointertypes:
 219
 220 Pointer types
 221 -------------
 222
 223 Only the following pointer types are allowed:
 224
 225 * Pointers to valid PNaCl bitcode scalar types, as specified above, except for
 226   ``i1``.
 227 * Pointers to valid PNaCl bitcode vector types, as specified above, except for
 228   ``<? x i1>``.
 229 * Pointers to functions.
 230
 231 In addition, the address space for all pointers must be 0.
 232
 233 A pointer is *inherent* when it represents the return value of an ``alloca``
 234 instruction, or is an address of a global value.
 235
 236 A pointer is *normalized* if it is one of the following:
 237
 238 * An *inherent* pointer.
 239 * The return value of a ``bitcast`` instruction.
 240 * The return value of an ``inttoptr`` instruction.
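
     In practice, addresses travel between functions as i32 values and are
     turned into normalized pointers just before use. A minimal sketch, with
     the hypothetical function name ``@read_word``:

     .. naclcode::
       :prettyprint: 0

         define internal i32 @read_word(i32 %addr) {
           ; %ptr is normalized because it is the result of inttoptr.
           %ptr = inttoptr i32 %addr to i32*
           %val = load i32* %ptr, align 1
           ret i32 %val
         }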
 241
 242 Undefined Values
 243 ----------------
 244
 245 `LLVM LangRef: Undefined Values
 246 <http://llvm.org/releases/3.3/docs/LangRef.html#undefvalues>`_
 247
 248 ``undef`` is only allowed within functions, not in global variable initializers.
 249
 250 Constant Expressions
 251 --------------------
 252
 253 `LLVM LangRef: Constant Expressions
 254 <http://llvm.org/releases/3.3/docs/LangRef.html#constant-expressions>`_
 255
 256 Constant expressions are only allowed in
 257 :ref:`global variable initializers <bitcode_globalvariables>`.
 258
 259 Other Values
 260 ============
 261
 262 Metadata Nodes and Metadata Strings
 263 -----------------------------------
 264
 265 `LLVM LangRef: Metadata Nodes and Metadata Strings
 266 <http://llvm.org/releases/3.3/docs/LangRef.html#metadata>`_
 267
 268 While PNaCl bitcode has provisions for debugging metadata, it is not considered
 269 part of the stable ABI. It exists for tool support and should not appear in
 270 distributed pexes.
 271
 272 Other kinds of LLVM metadata are not supported.
 273
 274 Intrinsic Global Variables
 275 ==========================
 276
 277 `LLVM LangRef: Intrinsic Global Variables
 278 <http://llvm.org/releases/3.3/docs/LangRef.html#intrinsic-global-variables>`_
 279
 280 PNaCl bitcode does not support intrinsic global variables.
 281
 282 .. _ir_and_errno:
 283
 284 Errno and errors in arithmetic instructions
 285 ===========================================
 286
 287 Some arithmetic instructions and intrinsics have semantics similar to
 288 libc math functions, but differ in the treatment of ``errno``. While the
 289 libc functions may set ``errno`` for domain errors, the instructions and
 290 intrinsics do not. This is because the variable ``errno`` is not special
 291 and is not required to be part of the program.
 292
 293 Instruction Reference
 294 =====================
 295
 296 List of allowed instructions
 297 ----------------------------
 298
 299 This is a list of LLVM instructions supported by PNaCl bitcode. Where
 300 applicable, PNaCl-specific restrictions are provided.
 301
 302 .. TODO: explain instructions or link in the future
 303
 304 The following attributes are disallowed for all instructions:
 305
 306 * ``nsw`` and ``nuw``
 307 * ``exact``
 308
 309 Only the LLVM instructions listed here are supported by PNaCl bitcode.
 310
 311 * ``ret``
 312 * ``br``
 313 * ``switch``
 314
 315   i1 values are disallowed for ``switch``.
 316
 317 * ``add``, ``sub``, ``mul``, ``shl``,  ``udiv``, ``sdiv``, ``urem``, ``srem``,
 318   ``lshr``, ``ashr``
 319
 320   These arithmetic operations are disallowed on values of type ``i1``.
 321
 322   Integer division (``udiv``, ``sdiv``, ``urem``, ``srem``) by zero is
 323   guaranteed to trap in PNaCl bitcode.
 324
 325 * ``and``
 326 * ``or``
 327 * ``xor``
 328 * ``fadd``
 329 * ``fsub``
 330 * ``fmul``
 331 * ``fdiv``
 332 * ``frem``
 333
 334   The ``frem`` instruction has the semantics of the libc ``fmod`` function for
 335   computing the floating point remainder. If the numerator is infinity, or the
 336   denominator is zero, or either is NaN, then the result is NaN.
 337   Unlike the libc ``fmod`` function, this does not set ``errno`` when the
 338   result is NaN (see the :ref:`instructions and errno <ir_and_errno>`
 339   section).
 340
 341 * ``alloca``
 342
 343   See :ref:`alloca instructions <bitcode_allocainst>`.
 344
 345 * ``load``, ``store``
 346
 347   The pointer argument of these instructions must be a *normalized* pointer (see
 348   :ref:`pointer types <bitcode_pointertypes>`). The ``volatile`` and ``atomic``
 349   attributes are not supported. Loads and stores of the type ``i1`` and ``<? x
 350   i1>`` are not supported.
 351
 352   These instructions must obey the following alignment restrictions:
 353
 354   * On integer memory accesses: ``align 1``.
 355   * On ``float`` memory accesses: ``align 1`` or ``align 4``.
 356   * On ``double`` memory accesses: ``align 1`` or ``align 8``.
 357   * On vector memory accesses: alignment at the vector's element width, for
 358     example ``<4 x i32>`` must be ``align 4``.
 359
 360 * ``trunc``
 361 * ``zext``
 362 * ``sext``
 363 * ``fptrunc``
 364 * ``fpext``
 365 * ``fptoui``
 366 * ``fptosi``
 367 * ``uitofp``
 368 * ``sitofp``
 369
 370 * ``ptrtoint``
 371
 372   The pointer argument of a ``ptrtoint`` instruction must be a *normalized*
 373   pointer (see :ref:`pointer types <bitcode_pointertypes>`) and the integer
 374   argument must be an i32.
 375
 376 * ``inttoptr``
 377
 378   The integer argument of an ``inttoptr`` instruction must be an i32.
 379
 380 * ``bitcast``
 381
 382   The pointer argument of a ``bitcast`` instruction must be an *inherent* pointer
 383   (see :ref:`pointer types <bitcode_pointertypes>`).
 384
 385 * ``icmp``
 386 * ``fcmp``
 387 * ``phi``
 388 * ``select``
 389 * ``call``
 390 * ``unreachable``
 391 * ``insertelement``
 392 * ``extractelement``
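
     The alignment restrictions on ``load`` and ``store`` listed above can be
     sketched as follows. The function name ``@scale`` and its parameters are
     hypothetical; ``%src`` is assumed to be 4-byte aligned, which is what
     ``align 4`` asserts.

     .. naclcode::
       :prettyprint: 0

         define internal double @scale(i32 %src, i32 %dst, double %k) {
           %fp = inttoptr i32 %src to float*
           ; float accesses use align 1 or align 4.
           %f = load float* %fp, align 4
           %d = fpext float %f to double
           %r = fmul double %d, %k
           %dp = inttoptr i32 %dst to double*
           ; double accesses use align 1 or align 8.
           store double %r, double* %dp, align 1
           ret double %r
         }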
 393
 394 .. _bitcode_allocainst:
 395
 396 ``alloca``
 397 ----------
 398
 399 The only allowed type for ``alloca`` instructions in PNaCl bitcode is i8. The
 400 size argument must be an i32. For example:
 401
 402 .. naclcode::
 403   :prettyprint: 0
 404
 405     %buf = alloca i8, i32 8, align 4
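
     Because the result of ``alloca`` is an *inherent* pointer, it can be
     passed through ``bitcast`` to obtain a pointer of the desired type. A
     minimal sketch, with the hypothetical function name ``@roundtrip``:

     .. naclcode::
       :prettyprint: 0

         define internal i32 @roundtrip(i32 %v) {
           %buf = alloca i8, i32 4, align 4
           ; %buf is inherent, so it may be bitcast; %slot is normalized.
           %slot = bitcast i8* %buf to i32*
           store i32 %v, i32* %slot, align 1
           %out = load i32* %slot, align 1
           ret i32 %out
         }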
 406
 407 Intrinsic Functions
 408 ===================
 409
 410 `LLVM LangRef: Intrinsic Functions
 411 <http://llvm.org/releases/3.3/docs/LangRef.html#intrinsics>`_
 412
 413 List of allowed intrinsics
 414 --------------------------
 415
 416 The only intrinsics supported by PNaCl bitcode are the following.
 417
 418 * ``llvm.memcpy``
 419 * ``llvm.memmove``
 420 * ``llvm.memset``
 421
 422   These intrinsics are only supported with an i32 ``len`` argument.
 423
 424 * ``llvm.bswap``
 425
 426   The overloaded ``llvm.bswap`` intrinsic is only supported with the following
 427   argument types: i16, i32, i64 (the types supported by C-style GCC builtins).
 428
 429 * ``llvm.ctlz``
 430 * ``llvm.cttz``
 431 * ``llvm.ctpop``
 432
 433   The overloaded ``llvm.ctlz``, ``llvm.cttz``, and ``llvm.ctpop`` intrinsics
 434   are only supported with the i32 and i64 argument types (the types
 435   supported by C-style GCC builtins).
 436
 437 * ``llvm.fabs``
 438
 439   The overloaded ``llvm.fabs`` intrinsic is supported for float, double and
 440   ``<4 x float>`` argument types. It returns the absolute value of
 441   the argument. Some notable points: it returns ``+0.0`` when given ``-0.0``,
 442   ``+inf`` when given ``-inf``, and a positive ``NaN`` when given any
 443   signed ``NaN``.
 444
 445   NOTE: This intrinsic was introduced in the pepper_42 SDK.
 446
 447 * ``llvm.sqrt``
 448
 449   The overloaded ``llvm.sqrt`` intrinsic is only supported for float
 450   and double argument types. This has the same semantics as the libc
 451   sqrt function, returning ``NaN`` for values less than ``-0.0``.
 452   However, this does not set ``errno`` when the result is NaN (see the
 453   :ref:`instructions and errno <ir_and_errno>` section).
 454
 455 * ``llvm.stacksave``
 456 * ``llvm.stackrestore``
 457
 458   These intrinsics are used to implement language features such as scoped,
 459   automatic variable-length arrays in C99. ``llvm.stacksave`` returns a value
 460   that represents the current state of the stack. This value may only be used
 461   as the argument to ``llvm.stackrestore``, which restores the stack to the
 462   given state.
 463
 464 * ``llvm.trap``
 465
 466   This intrinsic is lowered to a target dependent trap instruction, which aborts
 467   execution.
 468
 469 * ``llvm.nacl.read.tp``
 470
 471   See :ref:`thread pointer related intrinsics
 472   <bitcode_threadpointerintrinsics>`.
 473
 474 * ``llvm.nacl.longjmp``
 475 * ``llvm.nacl.setjmp``
 476
 477   See :ref:`Setjmp and Longjmp <bitcode_setjmplongjmp>`.
 478
 479 * ``llvm.nacl.atomic.store``
 480 * ``llvm.nacl.atomic.load``
 481 * ``llvm.nacl.atomic.rmw``
 482 * ``llvm.nacl.atomic.cmpxchg``
 483 * ``llvm.nacl.atomic.fence``
 484 * ``llvm.nacl.atomic.fence.all``
 485 * ``llvm.nacl.atomic.is.lock.free``
 486
 487   See :ref:`atomic intrinsics <bitcode_atomicintrinsics>`.
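
     A few of the intrinsics above can be sketched in use as follows. The
     declarations use the LLVM 3.3 overload names. The function name
     ``@copy_then_inspect`` is hypothetical, and passing ``i1 false`` as the
     second argument of ``llvm.ctlz`` is an assumption made for this example.

     .. naclcode::
       :prettyprint: 0

         declare void @llvm.memcpy.p0i8.p0i8.i32(i8*, i8*, i32, i32, i1)
         declare i32 @llvm.bswap.i32(i32)
         declare i32 @llvm.ctlz.i32(i32, i1)

         define internal i32 @copy_then_inspect(i32 %dst, i32 %src, i32 %len) {
           %d = inttoptr i32 %dst to i8*
           %s = inttoptr i32 %src to i8*
           ; Arguments: destination, source, i32 length, alignment, is-volatile.
           call void @llvm.memcpy.p0i8.p0i8.i32(i8* %d, i8* %s, i32 %len,
                                                i32 1, i1 false)
           %w = inttoptr i32 %dst to i32*
           %v = load i32* %w, align 1
           %swapped = call i32 @llvm.bswap.i32(i32 %v)
           ; The i1 flag states whether a zero input gives an undefined result.
           %lz = call i32 @llvm.ctlz.i32(i32 %swapped, i1 false)
           ret i32 %lz
         }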
 488
 489 .. _bitcode_threadpointerintrinsics:
 490
 491 Thread pointer related intrinsics
 492 ---------------------------------
 493
 494 .. naclcode::
 495   :prettyprint: 0
 496
 497     declare i8* @llvm.nacl.read.tp()
 498
 499 Returns a read-only thread pointer. The value is controlled by the embedding
 500 sandbox's runtime.
 501
 502 .. _bitcode_setjmplongjmp:
 503
 504 Setjmp and Longjmp
 505 ------------------
 506
 507 .. naclcode::
 508   :prettyprint: 0
 509
 510     declare void @llvm.nacl.longjmp(i8* %jmpbuf, i32)
 511     declare i32 @llvm.nacl.setjmp(i8* %jmpbuf)
 512
 513 These intrinsics implement the semantics of C11 ``setjmp`` and ``longjmp``. The
 514 ``jmpbuf`` pointer must be 64-bit aligned and point to at least 1024 bytes of
 515 allocated memory.
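
     A minimal sketch of the buffer requirements, using the hypothetical
     function name ``@try_once``. The buffer is 1024 bytes with 8-byte (64-bit)
     alignment. Following the C semantics, ``setjmp`` returns 0 on the direct
     call and the value passed to ``longjmp`` afterwards.

     .. naclcode::
       :prettyprint: 0

         declare i32 @llvm.nacl.setjmp(i8*)
         declare void @llvm.nacl.longjmp(i8*, i32)

         define internal i32 @try_once() {
         entry:
           %jmpbuf = alloca i8, i32 1024, align 8
           %rc = call i32 @llvm.nacl.setjmp(i8* %jmpbuf)
           %first = icmp eq i32 %rc, 0
           br i1 %first, label %body, label %recovered

         body:
           ; Jump back to the setjmp call, which then returns 1.
           call void @llvm.nacl.longjmp(i8* %jmpbuf, i32 1)
           unreachable

         recovered:
           ret i32 %rc
         }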
 516
 517 .. _bitcode_atomicintrinsics:
 518
 519 Atomic intrinsics
 520 -----------------
 521
 522 .. naclcode::
 523   :prettyprint: 0
 524
 525     declare iN @llvm.nacl.atomic.load.<size>(
 526             iN* <source>, i32 <memory_order>)
 527     declare void @llvm.nacl.atomic.store.<size>(
 528             iN <operand>, iN* <destination>, i32 <memory_order>)
 529     declare iN @llvm.nacl.atomic.rmw.<size>(
 530             i32 <computation>, iN* <object>, iN <operand>, i32 <memory_order>)
 531     declare iN @llvm.nacl.atomic.cmpxchg.<size>(
 532             iN* <object>, iN <expected>, iN <desired>,
 533             i32 <memory_order_success>, i32 <memory_order_failure>)
 534     declare void @llvm.nacl.atomic.fence(i32 <memory_order>)
 535     declare void @llvm.nacl.atomic.fence.all()
 536
 537 Each of these intrinsics is overloaded on the ``iN`` argument, which is
 538 reflected through ``<size>`` in the overload's name. Integral types of
 539 8, 16, 32 and 64-bit width are supported for these arguments.
 540
 541 The ``@llvm.nacl.atomic.rmw`` intrinsic implements the following
 542 read-modify-write operations, from the general and arithmetic sections
 543 of the C11/C++11 standards:
 544
 545  - ``add``
 546  - ``sub``
 547  - ``or``
 548  - ``and``
 549  - ``xor``
 550  - ``exchange``
 551
 552 For all of these read-modify-write operations, the returned value is
 553 that at ``object`` before the computation. The ``computation`` argument
 554 must be a compile-time constant.
 555
 556 All atomic intrinsics also support C11/C++11 memory orderings, which
 557 must be compile-time constants.
 558
 559 Integer values for these computations and memory orderings are defined
 560 in ``"llvm/IR/NaClAtomicIntrinsics.h"``.
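
     As a sketch, an atomic fetch-and-add on a 32-bit object could look as
     follows. The function name ``@fetch_add`` is hypothetical. The constants
     ``1`` and ``6`` are assumed here to encode ``add`` and sequentially
     consistent ordering; the authoritative values are those in the header
     named above.

     .. naclcode::
       :prettyprint: 0

         declare i32 @llvm.nacl.atomic.rmw.i32(i32, i32*, i32, i32)
         declare i32 @llvm.nacl.atomic.load.i32(i32*, i32)

         define internal i32 @fetch_add(i32 %addr, i32 %delta) {
           %obj = inttoptr i32 %addr to i32*
           ; computation = 1 (assumed: add), memory_order = 6 (assumed: seq_cst)
           %old = call i32 @llvm.nacl.atomic.rmw.i32(i32 1, i32* %obj,
                                                     i32 %delta, i32 6)
           ; Re-read the object; %now is shown only to illustrate the load form.
           %now = call i32 @llvm.nacl.atomic.load.i32(i32* %obj, i32 6)
           ; The rmw intrinsic returns the value held before the addition.
           ret i32 %old
         }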
 561
 562 The ``@llvm.nacl.atomic.fence.all`` intrinsic is equivalent to the
 563 ``@llvm.nacl.atomic.fence`` intrinsic with sequentially consistent
 564 ordering and compiler barriers preventing most non-atomic memory
 565 accesses from reordering around it.
 566
 567 .. Note::
 568   :class: note
 569
 570     These intrinsics allow PNaCl to support C11/C++11 style atomic
 571     operations as well as some legacy GCC-style ``__sync_*`` builtins
 572     while remaining stable as the LLVM codebase changes. The user isn't
 573     expected to use these intrinsics directly.
 574
 575 .. naclcode::
 576   :prettyprint: 0
 577
 578     declare i1 @llvm.nacl.atomic.is.lock.free(i32 <byte_size>, i8* <address>)
 579
 580 The ``llvm.nacl.atomic.is.lock.free`` intrinsic is designed to
 581 determine at translation time whether atomic operations of a certain
 582 ``byte_size`` (a compile-time constant), at a particular ``address``,
 583 are lock-free or not. This reflects the C11 ``atomic_is_lock_free``
 584 function from header ``<stdatomic.h>`` and the C++11 ``is_lock_free``
 585 member function in header ``<atomic>``. It can be used through the
 586 ``__nacl_atomic_is_lock_free`` builtin.
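
     A minimal sketch of a query for 4-byte atomics, with the hypothetical
     function name ``@word_is_lock_free``. The i1 result is widened to i32
     because i1 is not an allowed return type.

     .. naclcode::
       :prettyprint: 0

         declare i1 @llvm.nacl.atomic.is.lock.free(i32, i8*)

         define internal i32 @word_is_lock_free(i32 %addr) {
           %p = inttoptr i32 %addr to i8*
           ; byte_size must be a compile-time constant.
           %free = call i1 @llvm.nacl.atomic.is.lock.free(i32 4, i8* %p)
           %r = zext i1 %free to i32
           ret i32 %r
         }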