This document concisely describes the subset of the amd64
ABI as it is implemented in QBE. The subset correctly
handles arbitrary standard C-like structs containing
float and integer types. Structs that have unaligned
members are also supported through opaque types; see
the IL description document for more information about

- ABI Subset Implemented
------------------------

Data classes of interest as defined by the ABI:
1. The size of each argument gets rounded up to eightbytes.
   (This keeps the stack 8-byte aligned at all times.)
2. _Bool, char, short, int, long, long long and pointers
   are in the INTEGER class. In the context of QBE, it
   means that 'l' and 'w' are in the INTEGER class.
3. float and double are in the SSE class. In the context
   of QBE, it means that 's' and 'd' are in the SSE class.
4. If the size of an object is larger than two eightbytes
   or if it contains unaligned fields, it has class MEMORY.
   In the context of QBE, those are big aggregate types
5. Otherwise, recursively classify fields and determine
   the class of the two eightbytes using the classes of
   their components. If any is INTEGER the result is
   INTEGER, otherwise the result is SSE.
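
The classification rules above can be sketched in a few lines
of Python. This is only an illustration, not the code QBE uses:
the `classify` function and its flat (offset, size, kind) field
encoding are hypothetical, and it assumes that the natural
alignment of a scalar field equals its size.

```python
# Hypothetical helper for illustration; not QBE's implementation.
INTEGER, SSE, MEMORY = "INTEGER", "SSE", "MEMORY"

def classify(fields, size):
    """Classify an aggregate of `size` bytes.

    `fields` is a flat list of (offset, size, kind) tuples for the
    scalar members (nested structs already flattened); kind is
    "int" or "float".  Returns one class per eightbyte, or [MEMORY].
    """
    # Rule 4: more than two eightbytes, or a field that is not
    # aligned on its own size (assumed alignment == size).
    if size > 16 or any(off % sz for off, sz, _ in fields):
        return [MEMORY]
    # Rule 1: round the size up to whole eightbytes.
    classes = [SSE] * ((size + 7) // 8)
    # Rule 5: an eightbyte holding any integer field is INTEGER,
    # otherwise it stays SSE.
    for off, _, kind in fields:
        if kind == "int":
            classes[off // 8] = INTEGER
    return classes
```

For instance, a struct made of an int followed by a float fits
in one eightbyte and is classified `["INTEGER"]`, while a double
followed by a long gives `["SSE", "INTEGER"]`.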
* Classify arguments in order.
* INTEGER arguments use in order `%rdi` `%rsi` `%rdx`
* SSE arguments use in order `%xmm0` - `%xmm7`.
* MEMORY arguments get passed on the stack. They are "pushed"
  in the right-to-left order, so from the callee's
  point of view, the left-most argument appears first
* When we run out of registers for an aggregate, revert
  the assignment for the first eightbytes and pass it
* When all registers are taken, write arguments on the
  stack from right to left.
* When calling a variadic function, `%al` stores the number
  of vector registers used to pass arguments (it must be
  an upper bound and does not have to be exact).
* Registers `%rbx`, `%r12` - `%r15` are callee-save.
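
The register assignment described above can be sketched as
follows. The `assign` function and its input encoding (one list
of eightbyte classes per argument) are hypothetical and only
meant as an illustration. The INTEGER register list is completed
from the System V ABI (`%rcx`, `%r8`, `%r9` follow `%rdx`).
Checking that a whole aggregate fits before taking any register
has the same effect as assigning and then reverting.

```python
# Hypothetical sketch of argument assignment; not QBE's code.
INT_REGS = ["rdi", "rsi", "rdx", "rcx", "r8", "r9"]
SSE_REGS = ["xmm%d" % i for i in range(8)]

def assign(args):
    """Assign locations to classified arguments.

    `args` holds one list of eightbyte classes per argument,
    e.g. ["INTEGER"], ["SSE", "INTEGER"] or ["MEMORY"].
    Returns (locations, al): each location is a list of register
    names or the string "stack"; al is the number of vector
    registers used, as %al needs for variadic calls.
    """
    ni = ns = 0  # INTEGER and SSE registers consumed so far
    locations = []
    for classes in args:
        need_int = classes.count("INTEGER")
        need_sse = classes.count("SSE")
        # MEMORY arguments, and aggregates that do not fit
        # entirely in the remaining registers, go on the stack.
        if ("MEMORY" in classes
                or ni + need_int > len(INT_REGS)
                or ns + need_sse > len(SSE_REGS)):
            locations.append("stack")
            continue
        regs = []
        for c in classes:
            if c == "INTEGER":
                regs.append(INT_REGS[ni])
                ni += 1
            else:
                regs.append(SSE_REGS[ns])
                ns += 1
        locations.append(regs)
    return locations, ns
```

With five INTEGER arguments followed by a two-eightbyte INTEGER
aggregate, the aggregate goes to the stack because only `%r9`
is left, and a later single INTEGER argument still gets `%r9`.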
* Classify the return type.
* Use `%rax` and `%rdx` in order for INTEGER return
* Use `%xmm0` and `%xmm1` in order for SSE return values.
* If the return value's class is MEMORY, the first
  argument of the function, `%rdi`, is a pointer to an
  area big enough to fit the return value. The function
  writes the return value there and returns the address
  (that was in `%rdi`) in `%rax`.
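
As an illustration of the rules above, the registers holding a
return value can be derived from its eightbyte classes. The
`return_regs` function is a hypothetical sketch and takes the
same class lists as used for classification.

```python
# Hypothetical sketch of return value placement; not QBE's code.
def return_regs(classes):
    """Map the eightbyte classes of a return type to registers.

    Returns None for class MEMORY: the value is then written to
    the area pointed to by %rdi and that address is returned
    in %rax.
    """
    if "MEMORY" in classes:
        return None
    int_regs = iter(["rax", "rdx"])
    sse_regs = iter(["xmm0", "xmm1"])
    # Each eightbyte takes the next free register of its class.
    return [next(int_regs) if c == "INTEGER" else next(sse_regs)
            for c in classes]
```

A struct with one INTEGER half and one SSE half comes back in
`%rax` and `%xmm0`, whichever half comes first.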

- Alignment on the Stack
------------------------

The ABI is unclear on the alignment requirement of the
stack. What must be ensured is that, right before
executing a 'call' instruction, the stack pointer `%rsp`
is aligned on 16 bytes. On entry to the called
function, the stack pointer is 8 modulo 16. Since most
functions have a prologue that pushes `%rbp`, the frame
pointer is also aligned on 16 bytes (== 0 mod 16) when
the body of the function starts executing.
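
The arithmetic behind this can be checked with a small sketch.
The `enter` function is hypothetical; it models a 'call'
followed by the usual `push %rbp` / `mov %rsp, %rbp` prologue,
which is an assumption about the callee.

```python
# Hypothetical model of the stack pointer around a call.
def enter(rsp):
    """Return (%rsp on entry to the callee, the callee's %rbp)."""
    assert rsp % 16 == 0  # required right before 'call'
    rsp -= 8              # 'call' pushes the return address
    on_entry = rsp        # 8 mod 16
    rsp -= 8              # prologue pushes the saved %rbp
    rbp = rsp             # mov %rsp, %rbp: 0 mod 16 again
    return on_entry, rbp
```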

Here is a diagram of the stack layout after a call from
          | |xxxxxxxxxxxxx|  | f()'s MEMORY
  growing | +-------------+  | arguments
addresses | | stack arg 1 |  ,
          | +-------------+ -> 0 mod 16
            +-------------+ -> f()'s %rbp
            | f() locals  | 0 mod 16
* `xxxxx` Optional padding.
* A struct can be returned in registers in one of three
  ways. Either `%rax`, `%rdx` are used, or `%xmm0`,
  `%xmm1`, or finally `%rax`, `%xmm0`. The last case
  happens when a struct is returned with one half
  classified as INTEGER and the other as SSE. This
  is a consequence of the <@Returning> section above.
* The size of the arguments area of the stack needs to
  be computed first, then arguments are packed starting
  from the bottom of the argument area, respecting
  alignment constraints. The ABI mentions "pushing"
  arguments in right-to-left order, but I think it's a
  mistaken view because of the alignment constraints.
  Example: If three 8-byte MEMORY arguments are passed
  to the callee and the caller's stack pointer is 16-byte
  aligned, the layout will be like this.
    |xxxxxxxxxxxxx| padding
    +-------------+ -> 0 mod 16
  The padding must not be at the end of the stack area.
  A "pushing" logic would put it at the end.
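
The packing described above can be sketched as follows. The
`layout_stack_args` function is hypothetical, and it assumes
that every stack argument only needs 8-byte alignment. Offsets
are relative to the bottom of the argument area, that is, the
value of `%rsp` right before the 'call'.

```python
# Hypothetical sketch of the stack argument area; not QBE's code.
def layout_stack_args(sizes, align=16):
    """Pack stack arguments, left-most first, from the bottom up.

    Returns (offsets, area, padding): the offset of each argument
    from the bottom of the area, the total area size rounded up
    to `align`, and the padding left at the top of the area.
    """
    offsets = []
    off = 0
    for size in sizes:
        offsets.append(off)
        off += (size + 7) // 8 * 8  # round up to eightbytes
    area = (off + align - 1) // align * align
    return offsets, area, area - off
```

For the example with three 8-byte arguments, the area is 32
bytes: the arguments sit at offsets 0, 8 and 16, and the 8 bytes
of padding are above the third argument.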