===========================
 QBE Intermediate Language
===========================

  1. <@ Basic Concepts >
      * <@ Aggregate Types >
      * <@ Arithmetic and Bits >
  8. <@ Instructions Index >
The intermediate language (IL) is a higher-level language
than the machine's assembly language.  It smooths over most
of the irregularities of the underlying hardware and
allows an unlimited number of temporaries to be used.
This higher abstraction level lets frontend programmers
focus on language design issues.
The intermediate language is provided to QBE as text.
Usually, one file is generated per compilation unit of
the frontend input language.  An IL file is a sequence of
<@ Definitions > for data, functions, and types.  Once
processed by QBE, the resulting file can be assembled and
linked using a standard toolchain (e.g., GNU binutils).
58 Here is a complete "Hello World" IL file which defines a
59 function that prints to the screen. Since the string is
60 not a first class object (only the pointer is) it is
61 defined outside the function's body. Comments start with
62 a # character and finish with the end of the line.

    # Define the string constant.
    data $str = { b "hello world", b 0 }

    export function w $main() {
    @start
            # Call the puts function with $str as argument.
            %r =w call $puts(l $str)
            ret 0
    }

74 If you have read the LLVM language reference, you might
75 recognize the example above. In comparison, QBE makes a
76 much lighter use of types and the syntax is terser.
The language syntax is loosely described in the sections
below using BNF syntax.  The different BNF constructs used
are listed below.
85 * Keywords are enclosed between quotes;
86 * `... | ...` expresses alternatives;
87 * `( ... )` groups syntax;
88 * `[ ... ]` marks the nested syntax as optional;
* `( ... ),` designates a comma-separated list of the
enclosed syntax;
91 * `...*` and `...+` are used for arbitrary and
92 at-least-once repetition respectively.
The intermediate language makes heavy use of sigils: all
user-defined names are prefixed with a sigil.  This
avoids keyword conflicts and makes it easy to spot the
scope and nature of identifiers.
102 * `:` is for user-defined <@ Aggregate Types>
103 * `$` is for globals (represented by a pointer)
104 * `%` is for function-scope temporaries
105 * `@` is for block labels
107 In this BNF syntax, we use `?IDENT` to designate an identifier
108 starting with the sigil `?`.
116 Individual tokens in IL files must be separated by one or
117 more spacing characters. Both spaces and tabs are recognized
118 as spacing characters. In data and type definitions, newlines
119 may also be used as spaces to prevent overly long lines. When
120 exactly one of two consecutive tokens is a symbol (for example
121 `,` or `=` or `{`), spacing may be omitted.
130 BASETY := 'w' | 'l' | 's' | 'd' # Base types
131 EXTTY := BASETY | 'b' | 'h' # Extended types
The IL makes minimal use of types.  By design, the types
used are restricted to what is necessary for unambiguous
compilation to machine code and C interfacing.  Unlike LLVM,
QBE does not use types as a means to safety; they are only
here for semantic purposes.
The four base types are `w` (word), `l` (long), `s` (single),
and `d` (double).  They stand respectively for 32-bit and
64-bit integers, and 32-bit and 64-bit floating-point numbers.
142 There are no pointer types available; pointers are typed
143 by an integer type sufficiently wide to represent all memory
144 addresses (e.g., `l` on 64-bit architectures). Temporaries
145 in the IL can only have a base type.
147 Extended types contain base types plus `b` (byte) and `h`
148 (half word), respectively for 8-bit and 16-bit integers.
149 They are used in <@ Aggregate Types> and <@ Data> definitions.
151 For C interfacing, the IL also provides user-defined aggregate
152 types as well as signed and unsigned variants of the sub-word
153 extended types. Read more about these types in the
154 <@ Aggregate Types > and <@ Functions > sections.
The IL has a minimal subtyping feature, for integer types only.
Any value of type `l` can be used in a `w` context.  In that
case, only the 32 least significant bits of the long value
are used.

Note that this is the opposite of the usual subtyping on
integers (in C, we can safely use an `int` where a `long`
is expected).  A word value cannot be used in long context.
The rationale is that a word can be signed or unsigned, so
extending it to a long could be done in two ways, either
by zero-extension, or by sign-extension.
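
As an illustration only, the C sketch below models what
happens to the bits in each direction.  The helper names are
hypothetical and are not part of QBE; the sketch merely shows
why narrowing is unambiguous while widening is not.

```c
#include <assert.h>
#include <stdint.h>

/* A long (l) value used in a word (w) context: only the
   32 least significant bits are kept. */
static uint32_t long_in_word_context(uint64_t l) {
    return (uint32_t)(l & 0xffffffffu);
}

/* The two possible ways of widening a word to a long.
   They disagree as soon as bit 31 is set. */
static uint64_t zero_extend_word(uint32_t w) {
    return (uint64_t)w;
}

static int64_t sign_extend_word(uint32_t w) {
    /* Written arithmetically to avoid implementation-defined casts. */
    return (w & 0x80000000u) ? (int64_t)w - 4294967296LL : (int64_t)w;
}
```

For the word `0xffffffff`, zero-extension gives 4294967295
while sign-extension gives -1, which is why the frontend has
to pick one explicitly.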
176 ['-'] NUMBER # Decimal integer
177 | 's_' FP # Single-precision float
178 | 'd_' FP # Double-precision float
179 | $IDENT # Global symbol
183 | 'thread' $IDENT # Thread-local symbol
185 Constants come in two kinds: compile-time constants and
186 dynamic constants. Dynamic constants include compile-time
187 constants and other symbol variants that are only known at
188 program-load time or execution time. Consequently, dynamic
189 constants can only occur in function bodies.
The representation of integers is two's complement.
Floating-point numbers are represented using the
single-precision and double-precision formats of the
IEEE 754 standard.
196 Constants specify a sequence of bits and are untyped.
197 They are always parsed as 64-bit blobs. Depending on
198 the context surrounding a constant, only some of its
199 bits are used. For example, in the program below, the
200 two variables defined have the same value since the first
201 operand of the subtraction is a word (32-bit) context.
204 %y =w sub 4294967295, 0
Because specifying floating-point constants by their bits
makes the code less readable, syntactic sugar is provided
to express them.  Standard scientific notation is prefixed
with `s_` and `d_` for single and double precision numbers
respectively.  Once again, the following example defines
the same double-precision constant twice.
214 %y =d add d_0, -4616189618054758400
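
The equivalences claimed in this section can be checked
outside of QBE.  The C sketch below is only an illustration:
the function names are hypothetical, and it assumes a host
whose `double` is the IEEE 754 double-precision format.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Bits of a 64-bit constant that a word (w) context uses. */
static uint32_t word_bits(int64_t constant) {
    return (uint32_t)((uint64_t)constant & 0xffffffffu);
}

/* 64-bit pattern of a double, read back as a signed integer. */
static int64_t double_bits(double d) {
    int64_t bits;
    assert(sizeof bits == sizeof d);   /* assumption of this sketch */
    memcpy(&bits, &d, sizeof bits);    /* well-defined type punning */
    return bits;
}
```

With these helpers, `word_bits(-1)` equals
`word_bits(4294967295)`, and `double_bits(-1.0)` is
-4616189618054758400.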
216 Global symbols can also be used directly as constants;
217 they will be resolved and turned into actual numeric
218 constants by the linker.
220 When the `thread` keyword prefixes a symbol name, the
221 symbol's numeric value is resolved at runtime in the
222 thread-local storage.
231 | 'section' SECNAME [NL]
232 | 'section' SECNAME SECFLAGS [NL]
234 SECNAME := '"' .... '"'
235 SECFLAGS := '"' .... '"'
237 Function and data definitions (see below) can specify
238 linkage information to be passed to the assembler and
239 eventually to the linker.
241 The `export` linkage flag marks the defined item as
242 visible outside the current file's scope. If absent,
243 the symbol can only be referred to locally. Functions
244 compiled by QBE and called from C need to be exported.
246 The `thread` linkage flag can only qualify data
247 definitions. It mandates that the object defined is
248 stored in thread-local storage. Each time a runtime
249 thread starts, the supporting platform runtime is in
250 charge of making a new copy of the object for the
251 fresh thread. Objects in thread-local storage must
252 be accessed using the `thread $IDENT` syntax, as
253 specified in the <@ Constants > section.
255 A `section` flag can be specified to tell the linker to
256 put the defined item in a certain section. The use of
257 the section flag is platform dependent and we refer the
258 user to the documentation of their assembler and linker
259 for relevant information.
261 section ".init_array"
262 data $.init.f = { l $f }
264 The section flag can be used to add function pointers to
265 a global initialization list, as depicted above. Note
266 that some platforms provide a BSS section that can be
267 used to minimize the footprint of uniformly zeroed data.
268 When this section is available, QBE will automatically
269 make use of it and no section flag is required.
271 The section and export linkage flags should each appear
272 at most once in a definition. If multiple occurrences
273 are present, QBE is free to use any.
278 Definitions are the essential components of an IL file.
279 They can define three types of objects: aggregate types,
280 data, and functions. Aggregate types are never exported
281 and do not compile to any code. Data and function
282 definitions have file scope and are mutually recursive
(even across IL files).  Their visibility can be controlled
using linkage flags.
292 'type' :IDENT '=' ['align' NUMBER]
297 'type' :IDENT '=' 'align' NUMBER '{' NUMBER '}'
299 SUBTY := EXTTY | :IDENT
301 Aggregate type definitions start with the `type` keyword.
302 They have file scope, but types must be defined before being
303 referenced. The inner structure of a type is expressed by a
304 comma-separated list of types enclosed in curly braces.
306 type :fourfloats = { s, s, d, d }
For ease of IL generation, a trailing comma is tolerated by
the parser.  In case many items of the same type are
sequenced (like in a C array), the shorter array syntax
can be used.
313 type :abyteandmanywords = { b, w 100 }
315 By default, the alignment of an aggregate type is the
316 maximum alignment of its members. The alignment can be
317 explicitly specified by the programmer.
319 type :cryptovector = align 16 { w 4 }
321 Opaque types are used when the inner structure of an
322 aggregate cannot be specified; the alignment for opaque
323 types is mandatory. They are defined simply by enclosing
324 their size between curly braces.
326 type :opaque = align 16 { 32 }
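
For readers coming from C, the declarations below are one
plausible C counterpart of the aggregate types shown in this
section.  This is a sketch under assumptions: the struct and
member names are invented, and the layout claims hold only
for a C ABI that aligns members naturally.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption of this sketch: 4-byte float, 8-byte double. */
static_assert(sizeof(float) == 4 && sizeof(double) == 8,
              "sketch assumes IEEE 754 float and double sizes");

/* type :fourfloats = { s, s, d, d } */
struct fourfloats { float a, b; double c, d; };

/* type :abyteandmanywords = { b, w 100 } */
struct abyteandmanywords { uint8_t b; int32_t w[100]; };

/* type :cryptovector = align 16 { w 4 } */
struct cryptovector { _Alignas(16) int32_t w[4]; };

/* type :opaque = align 16 { 32 } */
struct opaque { _Alignas(16) unsigned char bytes[32]; };
```

Under those assumptions the word array of
`struct abyteandmanywords` starts at offset 4, since three
bytes of padding follow the leading byte.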
334 'data' $IDENT '=' ['align' NUMBER]
341 $IDENT ['+' NUMBER] # Symbol and offset
342 | '"' ... '"' # String
345 Data definitions express objects that will be emitted in the
346 compiled file. Their visibility and location in the compiled
347 artifact are controlled with linkage flags described in the
348 <@ Linkage > section.
They define a global identifier (starting with the sigil
`$`), that will contain a pointer to the object specified
by the definition.
354 Objects are described by a sequence of fields that start with
355 a type letter. This letter can either be an extended type,
356 or the `z` letter. If the letter used is an extended type,
357 the data item following specifies the bits to be stored in
358 the field. When several data items follow a letter, they
359 initialize multiple fields of the same size.
361 The members of a struct will be packed. This means that
362 padding has to be emitted by the frontend when necessary.
363 Alignment of the whole data objects can be manually specified,
364 and when no alignment is provided, the maximum alignment from
365 the platform is used.
367 When the `z` letter is used the number following indicates
368 the size of the field; the contents of the field are zero
369 initialized. It can be used to add padding between fields
370 or zero-initialize big arrays.
372 Here are various examples of data definitions.
374 # Three 32-bit values 1, 2, and 3
375 # followed by a 0 byte.
376 data $a = { w 1 2 3, b 0 }
378 # A thousand bytes 0 initialized.
381 # An object containing two 64-bit
382 # fields, one with all bits sets and the
383 # other containing a pointer to the
385 data $c = { l -1, l $c }
393 'function' [ABITY] $IDENT '(' (PARAM), ')' [NL]
399 ABITY %IDENT # Regular parameter
400 | 'env' %IDENT # Environment parameter (first)
401 | '...' # Variadic marker (last)
403 SUBWTY := 'sb' | 'ub' | 'sh' | 'uh' # Sub-word types
404 ABITY := BASETY | SUBWTY | :IDENT
406 Function definitions contain the actual code to emit in
407 the compiled file. They define a global symbol that
408 contains a pointer to the function code. This pointer
409 can be used in `call` instructions or stored in memory.
411 The type given right before the function name is the
412 return type of the function. All return values of this
413 function must have this return type. If the return
414 type is missing, the function must not return any value.
416 The parameter list is a comma separated list of
417 temporary names prefixed by types. The types are used
418 to correctly implement C compatibility. When an argument
419 has an aggregate type, a pointer to the aggregate is passed
420 by the caller. In the example below, we have to use a load
421 instruction to get the value of the first (and only)
422 member of the struct.
426 function w $getone(:one %p) {
If a function accepts or returns values that are smaller
than a word, such as `signed char` or `unsigned short` in C,
one of the sub-word types must be used.  The sub-word types
`sb`, `ub`, `sh`, and `uh` stand, respectively, for signed
and unsigned 8-bit values, and signed and unsigned 16-bit
values.  Parameters associated with a sub-word type of bit
width N only have their N least significant bits set and
have base type `w`.  For example, the function
441 function w $addbyte(w %a, sb %b) {
448 needs to sign-extend its second argument before the
449 addition. Dually, return values with sub-word types do
450 not need to be sign or zero extended.
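
The need for the explicit extension can be modelled in C.
The sketch below is illustrative only: `extsb` and `addbyte`
are C stand-ins written for this example, and the second
parameter is deliberately a full word whose upper 24 bits
hold garbage, as the IL permits for an `sb` parameter.

```c
#include <assert.h>
#include <stdint.h>

/* Model of the IL `extsb`: sign-extend the low 8 bits of a word. */
static uint32_t extsb(uint32_t w) {
    uint32_t low = w & 0xffu;
    return (low & 0x80u) ? (low | 0xffffff00u) : low;
}

/* Model of $addbyte: %b arrives as a word in which only the
   low 8 bits are meaningful, so it is extended before the add. */
static uint32_t addbyte(uint32_t a, uint32_t b_raw) {
    return a + extsb(b_raw);
}
```

Calling `addbyte(1, 0xabcdefff)` yields 0: the low byte
`0xff` is the signed byte -1, and the garbage above it is
ignored.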
452 If the parameter list ends with `...`, the function is
453 a variadic function: it can accept a variable number of
454 arguments. To access the extra arguments provided by
455 the caller, use the `vastart` and `vaarg` instructions
456 described in the <@ Variadic > section.
458 Optionally, the parameter list can start with an
459 environment parameter `env %e`. This special parameter is
460 a 64-bit integer temporary (i.e., of type `l`). If the
461 function does not use its environment parameter, callers
462 can safely omit it. This parameter is invisible to a C
463 caller: for example, the function
465 export function w $add(env %e, w %a, w %b) {
471 must be given the C prototype `int add(int, int)`.
472 The intended use of this feature is to pass the
473 environment pointer of closures while retaining a
474 very good compatibility with C. The <@ Call > section
475 explains how to pass an environment parameter.
Since global symbols are defined mutually recursively,
there is no need for function declarations: a function
can be referenced before its definition.
480 Similarly, functions from other modules can be used
481 without previous declaration. All the type information
482 necessary to compile a call is in the instruction itself.
484 The syntax and semantics for the body of functions
485 are described in the <@ Control > section.
490 The IL represents programs as textual transcriptions of
491 control flow graphs. The control flow is serialized as
492 a sequence of blocks of straight-line code which are
493 connected using jump instructions.
500 @IDENT NL # Block label
501 ( PHI NL )* # Phi instructions
502 ( INST NL )* # Regular instructions
503 JUMP NL # Jump or return
All blocks have a name that is specified by a label at
their beginning.  Then follows a sequence of instructions
that have "fall-through" flow.  Finally one jump terminates
the block.  The jump can either transfer control to another
block of the same function or return; jumps are described
further below.
512 The first block in a function must not be the target of
513 any jump in the program. If a jump to the function start
514 is needed, the frontend must insert an empty prelude block
515 at the beginning of the function.
When one block jumps to the next block in the IL file,
it is not necessary to write the jump instruction; it
will be automatically added by the parser.  For example,
the start block in the example below jumps directly
to the loop block.
526 %x =w phi @start 100, @loop %x1
538 'jmp' @IDENT # Unconditional
539 | 'jnz' VAL, @IDENT, @IDENT # Conditional
540 | 'ret' [VAL] # Return
541 | 'hlt' # Termination
A jump instruction ends every block and transfers the
control to another program location.  The target of
a jump must never be the first block in a function.
The four kinds of jumps available are described in
the following list.
549 1. Unconditional jump.
551 Simply jumps to another block of the same function.
555 When its word argument is non-zero, it jumps to its
556 first label argument; otherwise it jumps to the other
557 label. The argument must be of word type; because of
558 subtyping a long argument can be passed, but only its
559 least significant 32 bits will be compared to 0.
563 Terminates the execution of the current function,
564 optionally returning a value to the caller. The value
565 returned must be of the type given in the function
566 prototype. If the function prototype does not specify
567 a return type, no return value can be used.
569 4. Program termination.
571 Terminates the execution of the program with a
572 target-dependent error. This instruction can be used
573 when it is expected that the execution never reaches
574 the end of the block it closes; for example, after
575 having called a function such as `exit()`.
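
The subtyping remark in the conditional jump entry has a
consequence worth spelling out: a non-zero long whose low
32 bits are all zero is treated as zero.  The C sketch below
models only that test; the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 when `jnz` would jump to its first label and 0
   when it would jump to the second one.  Only the 32 least
   significant bits of the argument take part in the test. */
static int jnz_takes_first_label(uint64_t val) {
    return (uint32_t)(val & 0xffffffffu) != 0;
}
```

A frontend that branches on a long should therefore compare
it first (for example with a long comparison) and branch on
the word result.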
Instructions are the smallest piece of code in the IL; they
form the body of <@ Blocks >.  The IL uses a three-address
code, which means that one instruction computes an operation
between two operands and assigns the result to a third one.

An instruction has both a name and a return type.  This
return type is a base type that defines the size of the
instruction's result.  The type of the arguments can be
unambiguously inferred using the instruction name and the
return type.  For example, for all arithmetic instructions,
the type of the arguments is the same as the return type.
The two additions below are valid if `%y` is a word or a long
(because of <@ Subtyping >).
Some instructions, like comparisons and memory loads,
have operand types that differ from their return types.
For instance, two floating points can be compared to give a
word result (0 if the comparison fails, 1 if it succeeds).
604 In the example above, both operands have to have single type.
605 This is made explicit by the instruction suffix.
607 The types of instructions are described below using a short
608 type string. A type string specifies all the valid return
609 types an instruction can have, its arity, and the type of
610 its arguments depending on its return type.
Type strings begin with acceptable return types, then
follows, in parentheses, the possible types for the arguments.
If the N-th return type of the type string is used for an
instruction, the arguments must use the N-th type listed for
them in the type string.  When an instruction does not have a
return type, the type string only contains the types of the
arguments.
620 The following abbreviations are used.
622 * `T` stands for `wlsd`
623 * `I` stands for `wl`
624 * `F` stands for `sd`
625 * `m` stands for the type of pointers on the target; on
626 64-bit architectures it is the same as `l`
For example, consider the type string `wl(F)`.  It states
that the instruction has only one argument and that if the
return type used is long, the argument must be of type double.
632 ~ Arithmetic and Bits
633 ~~~~~~~~~~~~~~~~~~~~~
635 * `add`, `sub`, `div`, `mul` -- `T(T,T)`
637 * `udiv`, `rem`, `urem` -- `I(I,I)`
638 * `or`, `xor`, `and` -- `I(I,I)`
639 * `sar`, `shr`, `shl` -- `I(I,ww)`
641 The base arithmetic instructions in the first bullet are
642 available for all types, integers and floating points.
When `div` is used with word or long return type, the
arguments are treated as signed.  The unsigned integral
division is available as the `udiv` instruction.  When the
result of a division is not an integer, it is truncated
towards zero.

The signed and unsigned remainder operations are available
as `rem` and `urem`.  The sign of the remainder is the same
as that of the dividend.  Its magnitude is smaller than
that of the divisor.  These two instructions and `udiv` are
only available with integer arguments and result.
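
These rules coincide with those of C99 integer division, so
they can be modelled directly.  The sketch below is an
illustration on word-sized operands; the function names are
hypothetical, and the non-zero divisor check is a
precondition of the sketch rather than a documented IL rule.

```c
#include <assert.h>
#include <stdint.h>

/* Signed variants: the quotient truncates towards zero and
   the remainder takes the sign of the dividend.
   (INT32_MIN / -1 overflows in C and is left out here.) */
static int32_t div_w(int32_t a, int32_t b) { assert(b != 0); return a / b; }
static int32_t rem_w(int32_t a, int32_t b) { assert(b != 0); return a % b; }

/* Unsigned variants: the same bits read as unsigned numbers. */
static uint32_t udiv_w(uint32_t a, uint32_t b) { assert(b != 0); return a / b; }
static uint32_t urem_w(uint32_t a, uint32_t b) { assert(b != 0); return a % b; }
```

For instance `div_w(-7, 2)` is -3 and `rem_w(-7, 2)` is -1,
whereas the same bit pattern divided with `udiv_w` gives
`0x7ffffffc`.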
Bitwise OR, AND, and XOR operations are available for both
integer types.  Logical operations of typical programming
languages can be implemented using <@ Comparisons > and
<@ Jumps >.
661 Shift instructions `sar`, `shr`, and `shl`, shift right or
662 left their first operand by the amount from the second
663 operand. The shifting amount is taken modulo the size of
664 the result type. Shifting right can either preserve the
665 sign of the value (using `sar`), or fill the newly freed
666 bits with zeroes (using `shr`). Shifting left always
667 fills the freed bits with zeroes.
Note that an arithmetic shift right (`sar`) is only
equivalent to a division by a power of two for non-negative
numbers.  This is because the shift right "truncates"
towards minus infinity, while the division truncates
towards zero.
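
The difference can be seen on a small case.  The C sketch
below models the three shifts on words using unsigned
arithmetic only, so that it does not rely on how a given C
compiler shifts negative numbers; all names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* The shift amount is taken modulo the result size (32 here). */
static uint32_t shl_w(uint32_t x, uint32_t n) { return x << (n & 31u); }
static uint32_t shr_w(uint32_t x, uint32_t n) { return x >> (n & 31u); }

/* Arithmetic shift right: freed bits are copies of the sign bit. */
static uint32_t sar_w(uint32_t x, uint32_t n) {
    uint32_t result;
    n &= 31u;
    result = x >> n;
    if ((x & 0x80000000u) && n > 0)
        result |= ~(0xffffffffu >> n);
    return result;
}
```

Shifting -7 right by one with `sar_w` gives -4, while the
signed division of -7 by 2 gives -3.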
678 * Store instructions.
680 * `stored` -- `(d,m)`
681 * `stores` -- `(s,m)`
682 * `storel` -- `(l,m)`
683 * `storew` -- `(w,m)`
684 * `storeh` -- `(w,m)`
685 * `storeb` -- `(w,m)`
Store instructions exist to store a value of any base type
and any extended type.  Since halfwords and bytes are not
first class in the IL, `storeh` and `storeb` take a word
as argument.  Only the 16 or 8 least significant bits of
this word will be stored in memory at the address specified
in the second argument.
699 * `loadsw`, `loaduw` -- `I(mm)`
700 * `loadsh`, `loaduh` -- `I(mm)`
701 * `loadsb`, `loadub` -- `I(mm)`
For types smaller than long, two variants of the load
instruction are available: one will sign extend the loaded
value, while the other will zero extend it.  Note that
all loads smaller than long can load to either a long or
a word.

The two instructions `loadsw` and `loaduw` have the same
effect when they are used to define a word temporary.
A `loadw` instruction is provided as syntactic sugar for
`loadsw` to make explicit that the extension mechanism
used is irrelevant.
717 * `blit` -- `(m,m,w)`
719 The blit instruction copies in-memory data from its
720 first address argument to its second address argument.
721 The third argument is the number of bytes to copy. The
722 source and destination spans are required to be either
723 non-overlapping, or fully overlapping (source address
724 identical to the destination address). The byte count
725 argument must be a nonnegative numeric constant; it
726 cannot be a temporary.
One blit instruction may generate a number of
instructions proportional to its byte count argument.
Consequently, it is recommended to keep this argument
relatively small.  If large copies are necessary, it is
preferable that frontends generate calls to a supporting
`memcpy` function.
739 * `alloc16` -- `m(l)`
741 These instructions allocate a chunk of memory on the
742 stack. The number ending the instruction name is the
743 alignment required for the allocated slot. QBE will
744 make sure that the returned address is a multiple of
745 that alignment value.
747 Stack allocation instructions are used, for example,
748 when compiling the C local variables, because their
749 address can be taken. When compiling Fortran,
750 temporaries can be used directly instead, because
751 it is illegal to take the address of a variable.
753 The following example makes use of some of the memory
754 instructions. Pointers are stored in long temporaries.
756 %A0 =l alloc4 8 # stack allocate an array A of 2 words
758 storew 43, %A0 # A[0] <- 43
759 storew 255, %A1 # A[1] <- 255
760 %v1 =w loadw %A0 # %v1 <- A[0] as word
761 %v2 =w loadsb %A1 # %v2 <- A[1] as signed byte
762 %v3 =w add %v1, %v2 # %v3 is 42 here
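
The result 42 can be traced with a small C model of the
example.  This is a sketch under two assumptions that are
stated here rather than taken from the text above: the
target is little-endian, and `%A1` is the address `%A0 + 4`.
The helper names mirror the IL instructions but are plain C
functions written for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Little-endian model of `storew`. */
static void storew(uint32_t v, uint8_t *p) {
    p[0] = (uint8_t)(v & 0xffu);
    p[1] = (uint8_t)((v >> 8) & 0xffu);
    p[2] = (uint8_t)((v >> 16) & 0xffu);
    p[3] = (uint8_t)((v >> 24) & 0xffu);
}

/* Little-endian model of `loadw`. */
static uint32_t loadw(const uint8_t *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Model of `loadsb`: load one byte and sign-extend it. */
static uint32_t loadsb(const uint8_t *p) {
    uint32_t b = p[0];
    return (b & 0x80u) ? (b | 0xffffff00u) : b;
}

static uint32_t memory_example(void) {
    uint8_t A[8];                   /* alloc4 8 */
    uint8_t *A1 = A + 4;            /* assumed: %A1 is %A0 + 4 */
    storew(43, A);                  /* A[0] <- 43 */
    storew(255, A1);                /* A[1] <- 255 */
    uint32_t v1 = loadw(A);         /* 43 */
    uint32_t v2 = loadsb(A1);       /* byte 0xff read as -1 */
    return v1 + v2;                 /* 43 + (-1) */
}
```

The byte at `A1` is `0xff`, which `loadsb` reads as -1, hence
the sum of 42.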
767 Comparison instructions return an integer value (either a word
768 or a long), and compare values of arbitrary types. The returned
769 value is 1 if the two operands satisfy the comparison
770 relation, or 0 otherwise. The names of comparisons respect
771 a standard naming scheme in three parts.
773 1. All comparisons start with the letter `c`.
775 2. Then comes a comparison type. The following
776 types are available for integer comparisons:
779 * `ne` for inequality
780 * `sle` for signed lower or equal
781 * `slt` for signed lower
782 * `sge` for signed greater or equal
783 * `sgt` for signed greater
784 * `ule` for unsigned lower or equal
785 * `ult` for unsigned lower
786 * `uge` for unsigned greater or equal
787 * `ugt` for unsigned greater
789 Floating point comparisons use one of these types:
792 * `ne` for inequality
793 * `le` for lower or equal
795 * `ge` for greater or equal
797 * `o` for ordered (no operand is a NaN)
798 * `uo` for unordered (at least one operand is a NaN)
800 Because floating point types always have a sign bit,
801 all the comparisons available are signed.
3. Finally, the instruction name is terminated with a
basic type suffix specifying the type of the operands
to be compared.
807 For example, `cod` (`I(dd,dd)`) compares two double-precision
808 floating point numbers and returns 1 if the two floating points
809 are not NaNs, or 0 otherwise. The `csltw` (`I(ww,ww)`)
810 instruction compares two words representing signed numbers and
811 returns 1 when the first argument is smaller than the second one.
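
As a rough C model of these two instructions and of two
close relatives, consider the sketch below.  The function
names reuse the IL mnemonics for readability only; they are
ordinary C functions written for this illustration, and
words are modelled as `uint32_t` bit patterns.

```c
#include <assert.h>
#include <math.h>     /* only for the NAN macro used in the usage note */
#include <stdint.h>

/* Ordered: 1 when neither operand is a NaN.  A NaN is the
   only value that compares unequal to itself. */
static int cod(double a, double b)  { return a == a && b == b; }

/* Unordered: 1 when at least one operand is a NaN. */
static int cuod(double a, double b) { return !(a == a && b == b); }

/* Signed lower on words.  Flipping the sign bit lets an
   unsigned comparison order the values as signed numbers. */
static int csltw(uint32_t a, uint32_t b) {
    return (a ^ 0x80000000u) < (b ^ 0x80000000u);
}

/* Unsigned lower on words. */
static int cultw(uint32_t a, uint32_t b) { return a < b; }
```

For instance `cod(1.0, NAN)` is 0, and the word `0xffffffff`
is lower than 0 for `csltw` but not for `cultw`.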
816 Conversion operations change the representation of a value,
817 possibly modifying it if the target type cannot hold the value
818 of the source type. Conversions can extend the precision of a
819 temporary (e.g., from signed 8-bit to 32-bit), or convert a
820 floating point into an integer and vice versa.
822 * `extsw`, `extuw` -- `l(w)`
823 * `extsh`, `extuh` -- `I(ww)`
824 * `extsb`, `extub` -- `I(ww)`
Extending the precision of a temporary is done using the
`ext` family of instructions.  Because QBE types, like
LLVM's, do not specify signedness, extension instructions
exist to sign-extend and zero-extend a value.  For example,
`extsb` takes a word argument and sign-extends the 8
least-significant bits to a full word or long, depending on
the return type.
844 The instructions `exts` (extend single) and `truncd` (truncate
845 double) are provided to change the precision of a floating
846 point value. When the double argument of `truncd` cannot
847 be represented as a single-precision floating point, it is
848 truncated towards zero.
Converting between integers and floating points is done
using `stosi` (single to signed integer), `stoui` (single to
unsigned integer), `dtosi` (double to signed integer), `dtoui`
(double to unsigned integer), `swtof` (signed word to float),
`uwtof` (unsigned word to float), `sltof` (signed long to
float) and `ultof` (unsigned long to float).
857 Because of <@ Subtyping >, there is no need to have an
858 instruction to lower the precision of an integer temporary.
863 The `cast` and `copy` instructions return the bits of their
864 argument verbatim. However a `cast` will change an integer
865 into a floating point of the same width and vice versa.
867 * `cast` -- `wlsd(sdwl)`
870 Casts can be used to make bitwise operations on the
871 representation of floating point numbers. For example
872 the following program will compute the opposite of the
873 single-precision floating point number `%f` into `%rs`.
876 %b1 =w xor 2147483648, %b0 # flip the msb
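
The same trick can be written in C, where `memcpy` plays the
role of `cast`.  This is a sketch, assuming that `float` is
the IEEE 754 single-precision format; the function name is
hypothetical and the comments give the assumed IL
counterpart of each step.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static float negate_via_bits(float f) {
    uint32_t b0, b1;
    float rs;
    assert(sizeof b0 == sizeof f);   /* assumption of this sketch */
    memcpy(&b0, &f, sizeof b0);      /* cast: float bits to word */
    b1 = 2147483648u ^ b0;           /* xor: flip the msb */
    memcpy(&rs, &b1, sizeof rs);     /* cast: word bits to float */
    return rs;
}
```

No floating-point arithmetic is involved: only the sign bit
changes, so the operation also turns 0.0 into -0.0.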
883 CALL := [%IDENT '=' ABITY] 'call' VAL '(' (ARG), ')'
886 ABITY VAL # Regular argument
887 | 'env' VAL # Environment argument (first)
888 | '...' # Variadic marker
890 SUBWTY := 'sb' | 'ub' | 'sh' | 'uh' # Sub-word types
891 ABITY := BASETY | SUBWTY | :IDENT
The call instruction is special in several ways.  It is not
a three-address instruction and requires the type of all
its arguments to be given.  Also, the return type can be
either a base type or an aggregate type.  These specifics
are required to compile calls with C compatibility (i.e.,
to respect the ABI).

When an aggregate type is used as argument type or return
type, the value respectively passed or returned needs to be
a pointer to a memory location holding the value.  This is
because aggregate types are not first-class citizens of
the IL.

Sub-word types are used for arguments and return values
of width less than a word.  Details on these types are
presented in the <@ Functions > section.  Arguments with
sub-word types need not be sign or zero extended according
to their type.  Calls with a sub-word return type define
a temporary of base type `w` with its most significant bits
unspecified.

Unless the called function does not return a value, a
return temporary must be specified, even if it is never
used afterwards.
918 An environment parameter can be passed as first argument
919 using the `env` keyword. The passed value must be a 64-bit
920 integer. If the called function does not expect an environment
921 parameter, it will be safely discarded. See the <@ Functions >
922 section for more information about environment parameters.
924 When the called function is variadic, there must be a `...`
925 marker separating the named and variadic arguments.
930 The `vastart` and `vaarg` instructions provide a portable
931 way to access the extra parameters of a variadic function.
934 * `vaarg` -- `T(mmmm)`
936 The `vastart` instruction initializes a *variable argument
937 list* used to access the extra parameters of the enclosing
938 variadic function. It is safe to call it multiple times.
940 The `vaarg` instruction fetches the next argument from
941 a variable argument list. It is currently limited to
942 fetching arguments that have a base type. This instruction
943 is essentially effectful: calling it twice in a row will
944 return two consecutive arguments from the argument list.
946 Both instructions take a pointer to a variable argument
947 list as sole argument. The size and alignment of variable
948 argument lists depend on the target used. However, it
949 is possible to conservatively use the maximum size and
950 alignment required by all the targets.
952 type :valist = align 8 { 24 } # For amd64_sysv
953 type :valist = align 8 { 32 } # For arm64
954 type :valist = align 8 { 8 } # For rv64
956 The following example defines a variadic function adding
957 its first three arguments.
959 function s $add3(s %a, ...) {
963 %r =s call $vadd(s %a, l %ap)
967 function s $vadd(s %a, l %ap) {
980 PHI := %IDENT '=' BASETY 'phi' ( @IDENT VAL ),
First and foremost, phi instructions are NOT necessary when
writing a frontend to QBE.  One solution to avoid having to
deal with SSA form is to use stack allocated variables for
all source program variables and perform assignments and
lookups using <@ Memory > operations.  This is what LLVM
users typically do.

Another solution is to simply emit code that is not in SSA
form!  Contrary to LLVM, QBE is able to fix up programs not
in SSA form without requiring the boilerplate of loading
and storing in memory.  For example, the following program
will be correctly compiled by QBE.
1005 Now, if you want to know what phi instructions are and how
1006 to use them in QBE, you can read the following.
Phi instructions are specific to SSA form.  In SSA form,
values can only be assigned once; without phi instructions,
this requirement is too strong to represent many programs.
1011 For example consider the following C program.
The variable `y` is assigned twice.  The solution to
translate it to SSA form is to insert a phi instruction.
1032 %y =w phi @ift 1, @iff 2
Phi instructions return one of their arguments depending
on where the control came from.  In the example, `%y` is
set to 1 if the `@ift` branch is taken, or it is set to
2 otherwise.
1040 An important remark about phi instructions is that QBE
1041 assumes that if a variable is defined by a phi it respects
1042 all the SSA invariants. So it is critical to not use phi
1043 instructions unless you know exactly what you are doing.
1045 - 8. Instructions Index
1046 -----------------------
1048 * <@ Arithmetic and Bits >:
1146 * <@ Cast and Copy > :