{{+bindTo:partials.standard_nacl_article}}
<section id="nacl-sfi-model-on-x86-64-systems">
<span id="x86-64-sandbox"></span><h1 id="nacl-sfi-model-on-x86-64-systems"><span id="x86-64-sandbox"></span>NaCl SFI model on x86-64 systems</h1>
<div class="contents local" id="contents" style="display: none">
<ul class="small-gap">
<li><a class="reference internal" href="#summary" id="id5">Summary</a></li>
<li><a class="reference internal" href="#binary-format" id="id6">Binary Format</a></li>
<li><a class="reference internal" href="#runtime-invariants" id="id7">Runtime Invariants</a></li>
<li><a class="reference internal" href="#text-segment-rules" id="id8">Text Segment Rules</a></li>
<li><a class="reference internal" href="#list-of-pseudo-instructions" id="id9">List of Pseudo-instructions</a></li>
</ul>
</div><h2 id="summary">Summary</h2>
<p>This document describes the details of the Software Fault Isolation
(SFI) model for executable code that can be run in Native Client on an
x86-64 system. An overview of this model can be found in the paper:
<a class="reference external" href="https://research.google.com/pubs/archive/35649.pdf">Adapting Software Fault Isolation to Contemporary CPU Architectures</a>.
The primary focus of the SFI model is a Windows x86-64 system, but the
same techniques can be applied to run identical x86-64 binaries on
other x86-64 systems such as Linux, Mac, FreeBSD, etc., so the
description of the SFI model tries to abstract away system
dependencies when possible.</p>
<p>Please note: throughout this document we use the AT&amp;T notation for
assembler syntax, in which the target operand appears last, e.g.
<code>mov src, dst</code>.</p>
<h2 id="binary-format">Binary Format</h2>
<p>The format of Native Client executable binaries is identical to the
x86-64 ELF binary format (<a class="reference external" href="http://en.wikipedia.org/wiki/Executable_and_Linkable_Format">[0]</a>, <a class="reference external" href="http://www.sco.com/developers/devspecs/gabi41.pdf">[1]</a>, <a class="reference external" href="http://www.sco.com/developers/gabi/latest/contents.html">[2]</a>, <a class="reference external" href="http://downloads.openwatcom.org/ftp/devel/docs/elf-64-gen.pdf">[3]</a>) for
Linux or BSD with a few extra requirements. The additional rules that
a Native Client ELF binary must follow are:</p>
<ul class="small-gap">
<li>The ELF magic OS ABI field must be 123.</li>
<li>The ELF magic OS ABI VERSION field must be 5.</li>
<li>The ELF e_flags field must be 0x200000 (32-byte alignment).</li>
<li>There must be exactly one PT_LOAD text segment. It must begin at
0x20000 (128 kB) and be marked RX (no W). The contents of the text
segment must follow <a class="reference internal" href="#x86-64-text-segment-rules"><em>Text Segment Rules</em></a>.</li>
<li>There can be at most one PT_LOAD data segment marked R.</li>
<li>There can be at most one PT_LOAD data segment marked RW.</li>
<li>There can be at most one PT_GNU_STACK segment. It must be marked RW.</li>
<li>All segments must end before the limit address (4 GiB).</li>
</ul>
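<p>The three header-field rules above can be checked with a few lines of
Python. This is only a minimal sketch, not the Native Client loader: the
function names <code>check_nacl_elf_header</code> and <code>make_header</code>
are hypothetical, the byte offsets are those of the standard little-endian
ELF64 header, and the segment rules are not checked at all.</p>

```python
import struct

NACL_OSABI = 123         # required OS ABI value (from the list above)
NACL_ABIVERSION = 5      # required OS ABI VERSION value
NACL_E_FLAGS = 0x200000  # required e_flags value (32-byte alignment)


def check_nacl_elf_header(header):
    """Return a list of problems found in a 64-byte ELF64 header (hypothetical helper)."""
    if len(header) < 64:
        return ["header shorter than 64 bytes"]
    if header[:4] != b"\x7fELF":
        return ["missing ELF magic"]
    problems = []
    if header[7] != NACL_OSABI:            # e_ident[EI_OSABI]
        problems.append("OS ABI is %d, expected %d" % (header[7], NACL_OSABI))
    if header[8] != NACL_ABIVERSION:       # e_ident[EI_ABIVERSION]
        problems.append("OS ABI VERSION is %d, expected %d" % (header[8], NACL_ABIVERSION))
    (e_flags,) = struct.unpack_from("<I", header, 48)  # e_flags sits at offset 48 in ELF64
    if e_flags != NACL_E_FLAGS:
        problems.append("e_flags is %#x, expected %#x" % (e_flags, NACL_E_FLAGS))
    return problems


def make_header(osabi, abiversion, e_flags):
    """Build a minimal little-endian ELF64 header, for demonstration only."""
    e_ident = b"\x7fELF" + bytes([2, 1, 1, osabi, abiversion]) + bytes(7)
    rest = struct.pack("<HHIQQQIHHHHHH",
                       2, 62, 1,        # e_type, e_machine (EM_X86_64), e_version
                       0x20000, 0, 0,   # e_entry, e_phoff, e_shoff
                       e_flags,
                       64, 0, 0, 0, 0, 0)
    return e_ident + rest


print(check_nacl_elf_header(make_header(123, 5, 0x200000)))
print(check_nacl_elf_header(make_header(0, 0, 0)))
```

<p>The first call reports no problems; the second reports one problem per
mismatching field.</p>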
<h2 id="runtime-invariants">Runtime Invariants</h2>
<p>To ensure fault isolation at runtime, the system must maintain a
number of runtime <em>invariants</em> across the lifetime of the running
program. Both the <em>Validator</em> and the <em>Service Runtime</em> are
responsible for maintaining the invariants. See the paper for the
rationale for the invariants:</p>
<ul class="small-gap">
<li><code>RIP</code> always points to a valid instruction boundary (the validator must
ensure this for direct jumps and direct calls).</li>
<li><code>R15</code> (aka <code>RBASE</code> and <code>RZP</code>) is never modified by code (the
validator must ensure this). The low 32 bits of <code>RZP</code> are all zero
(the loader must ensure this).</li>
<li><code>RIP</code>, <code>RBP</code> and <code>RSP</code> are always in the <strong>safe zone</strong>: between
<code>R15</code> and <code>R15+4GiB</code>.</li>
</ul>
<blockquote>
<div><ul class="small-gap">
<li>Exception: <code>RSP</code> and <code>RBP</code> are allowed to be in the range of
<code>0..4GiB</code> inside <em>pseudo-instructions</em>: <code>naclrestbp</code>,
<code>naclrestsp</code>, <code>naclspadj</code>, <code>naclasp</code>, <code>naclssp</code>.</li>
</ul>
</div></blockquote>
<ul class="small-gap">
<li>84GiB are allocated for the NaCl module (i.e. the <strong>untrusted region</strong>):</li>
</ul>
<blockquote>
<div><ul class="small-gap">
<li><code>R15-40GiB..R15</code> and <code>R15+4GiB..R15+44GiB</code> are buffer zones with
PROT_NONE flags.</li>
<li>The 4GiB <em>safe zone</em> has pages with either PROT_WRITE or PROT_EXEC
but must not have PROT_WRITE+PROT_EXEC pages.</li>
<li>All executable code in PROT_EXEC pages is validatable and
guaranteed to obey the invariants.</li>
</ul>
</div></blockquote>
<ul class="small-gap">
<li>Trampoline/springboard code is mapped to a non-writable region in
the <em>untrusted 84GiB region</em>; each trampoline/springboard is 32-byte
aligned and fits within a single <em>bundle</em>.</li>
<li>The OS must not put any internal structures/code into the untrusted
region at any time (this rules out using the OS dynamic linker, etc.).</li>
</ul>
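<p>The address-space layout described above can be written down as simple
arithmetic on the value of <code>R15</code>. The sketch below is illustrative
only: the helper names are hypothetical, and it assumes that each range
includes its lower bound and excludes its upper bound, which the text does
not state explicitly.</p>

```python
GiB = 1 << 30


def untrusted_layout(r15):
    """Return (name, start, end) tuples for the 84GiB untrusted region; end is exclusive."""
    if r15 & 0xFFFFFFFF:
        raise ValueError("the low 32 bits of R15 must be zero")
    return [
        ("lower buffer zone (PROT_NONE)", r15 - 40 * GiB, r15),
        ("safe zone", r15, r15 + 4 * GiB),
        ("upper buffer zone (PROT_NONE)", r15 + 4 * GiB, r15 + 44 * GiB),
    ]


def in_safe_zone(r15, addr):
    """True if addr lies in the 4GiB safe zone that starts at r15."""
    return r15 <= addr < r15 + 4 * GiB


base = 1 << 36  # an example base address whose low 32 bits are zero
for name, start, end in untrusted_layout(base):
    print("%-30s %#x..%#x (%d GiB)" % (name, start, end, (end - start) // GiB))
```

<p>The three regions add up to 40 + 4 + 40 = 84GiB, matching the figure given
in the list.</p>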
<h2 id="text-segment-rules"><span id="x86-64-text-segment-rules"></span>Text Segment Rules</h2>
<ul class="small-gap">
<li>The validation process must ensure that the text segment complies
with the following rules. The validation process must complete
successfully strictly before executing any instruction of the
untrusted code.</li>
<li>The following instructions are illegal and must be rejected by the
validator (the list is not exhaustive as the validator uses a
whitelist, not a blacklist; this means there is a large but finite
list of instructions the validator allows, not a small list of
instructions the validator rejects):</li>
</ul>
<blockquote>
<div><ul class="small-gap">
<li>any privileged instructions</li>
<li><code>mov</code> to/from segment registers</li>
<li><code>int</code></li>
<li><code>pusha</code>/<code>popa</code> (not dangerous but not needed for GCC)</li>
</ul>
</div></blockquote>
<ul class="small-gap">
<li>There must be space for at least 32 bytes after the text segment and
before the next segment in ELF (towards higher addresses) that ends
strictly at a 64K boundary (a minimum page size for untrusted
code). This space will be padded with HLT instructions as part of
the validation process, along with the optional 64K page.</li>
<li>Neither instructions nor <em>pseudo-instructions</em> are permitted to span
a 32-byte boundary.</li>
<li>The ELF entry address must be 32-byte aligned.</li>
<li>Direct <code>CALL</code>/<code>JMP</code> targets:</li>
</ul>
<blockquote>
<div><ul class="small-gap">
<li>must point to a valid instruction boundary</li>
<li>must not point into a <em>pseudo-instruction</em></li>
<li>must not point between a <em>restricted register</em> (see below for
definition) producer instruction and its corresponding restricted
register consumer instruction.</li>
</ul>
</div></blockquote>
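<p>The bundle rule and the rules for direct branch targets can be illustrated
with two small predicates. This is a sketch under stated assumptions rather
than the validator's actual algorithm: the function names are hypothetical,
instruction boundaries are assumed to be already known as a set of start
offsets, and every pseudo-instruction or producer/consumer pair is assumed to
be given as a half-open <code>(start, end)</code> offset range.</p>

```python
BUNDLE = 32  # bundle size in bytes


def spans_bundle(offset, length):
    """True if an instruction of `length` bytes starting at `offset` crosses a 32-byte boundary."""
    return offset // BUNDLE != (offset + length - 1) // BUNDLE


def valid_direct_target(target, starts, atomic_groups):
    """Check a direct CALL/JMP target against the three rules above.

    starts        -- set of offsets at which an instruction begins
    atomic_groups -- (start, end) ranges that must not be entered in the middle
    """
    if target not in starts:
        return False                 # not a valid instruction boundary
    for start, end in atomic_groups:
        if start < target < end:
            return False             # lands inside a pseudo-instruction or producer/consumer pair
    return True


# Hypothetical layout: a 2-byte producer at offset 0, a 4-byte consumer at
# offset 2, and an unrelated instruction at offset 6.
starts = {0, 2, 6}
groups = [(0, 6)]
print(valid_direct_target(0, starts, groups))  # start of the pair
print(valid_direct_target(2, starts, groups))  # between producer and consumer
print(spans_bundle(30, 3))
```

<p>Jumping to the first instruction of a group is accepted, because the whole
sequence then executes as a unit.</p>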
<ul class="small-gap">
<li><code>CALL</code> instructions must end exactly at a 32-byte boundary (a direct
<code>CALL</code> is 5 bytes long, so it must start 5 bytes before the boundary), so
that the return address will be 32-byte aligned.</li>
<li>Indirect call targets must be 32-byte aligned. Instead of indirect
<code>CALL</code>/<code>JMP</code> x, use <code>nacljmp</code> and <code>naclcall</code> (see below for
definitions of these <em>pseudo-instructions</em>).</li>
<li>All instructions that <strong>read</strong> from or <strong>write</strong> to memory must use
one of the four registers <code>RZP</code>, <code>RIP</code>, <code>RBP</code> or <code>RSP</code> as a
base, a restricted (see below) index register (multiplied by 0, 1, 2,
4 or 8) and an optional constant displacement.</li>
</ul>
<blockquote>
<div><ul class="small-gap">
<li><p class="first">Exception to this rule: string instructions are allowed if used in
the following sequences (the sequences must not cross <em>bundle</em>
boundaries; segment overrides are disallowed):</p>
<pre>
mov %edi, %edi
lea (%rZP,%rdi),%rdi
[rep] stos ; other string instructions can be used here
</pre>
<p>Note: this is identical to the <em>pseudo-instruction</em>: <code>[rep] stos
%?ax, %nacl:(%rdi),%rZP</code></p>
</li>
</ul>
</div></blockquote>
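<p>The effect of the two instructions that precede the string instruction can
be modelled with integer arithmetic. The following is a sketch, not an
emulator: <code>sandbox_rdi</code> is a hypothetical name, and the model relies
only on the x86-64 behaviour that a 32-bit register write clears the upper
32 bits of the full register.</p>

```python
MASK32 = 0xFFFFFFFF
GiB = 1 << 30


def sandbox_rdi(rdi, r15):
    """Model `mov %edi, %edi` followed by `lea (%rZP,%rdi),%rdi`."""
    rdi = rdi & MASK32   # mov %edi, %edi: the 32-bit write zero-extends into %rdi
    return r15 + rdi     # lea (%rZP,%rdi),%rdi: rebase the 32-bit offset onto R15


r15 = 1 << 36                   # example base with the low 32 bits zero
untrusted = 0xDEADBEEF12345678  # arbitrary value supplied by untrusted code
addr = sandbox_rdi(untrusted, r15)
print(hex(addr - r15))
```

<p>Whatever value <code>%rdi</code> held beforehand, the resulting address
starts inside the 4GiB safe zone.</p>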
<ul class="small-gap">
<li>An operand of a command is said to be a <strong>restricted register</strong> iff
it is a register that is the target of a 32-bit move in the
immediately-preceding command in the same <em>bundle</em> (consider the
previous command as an additional sandboxing prefix):</li>
</ul>
<blockquote>
<div><pre>
; any 32-bit register can be used here; the first operand is
; unrestricted but often is the same register
mov ..., %eXX
</pre>
</div></blockquote>
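<p>Because a restricted register holds a zero-extended 32-bit value, the
address computed by a sandboxed memory operand is bounded. The sketch below
works through the extreme cases; it assumes a signed 32-bit displacement (the
usual x86-64 encoding limit) and a base register anywhere in the safe zone,
and the helper names are hypothetical. The paper referenced in the Summary
remains the authoritative rationale.</p>

```python
GiB = 1 << 30
MASK32 = 0xFFFFFFFF


def effective_address(base, index, scale, disp):
    """base + (zero-extended 32-bit index) * scale + signed 32-bit displacement."""
    assert scale in (0, 1, 2, 4, 8)
    assert -(1 << 31) <= disp < (1 << 31)
    return base + (index & MASK32) * scale + disp


def in_untrusted_region(r15, addr):
    """True if addr falls inside the 84GiB region (safe zone plus both buffer zones)."""
    return r15 - 40 * GiB <= addr < r15 + 44 * GiB


r15 = 1 << 36
# Highest reachable address: top of the safe zone, maximum index, scale and displacement.
highest = effective_address(r15 + 4 * GiB - 1, MASK32, 8, (1 << 31) - 1)
# Lowest reachable address: bottom of the safe zone, most negative displacement.
lowest = effective_address(r15, 0, 1, -(1 << 31))
print(in_untrusted_region(r15, highest), in_untrusted_region(r15, lowest))
```

<p>Under these assumptions both extremes stay inside the untrusted region, so
an out-of-range access lands in a PROT_NONE buffer zone rather than in
trusted memory.</p>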
<ul class="small-gap">
<li>Instructions capable of changing <code>%RBP</code> and <code>%RSP</code> are
forbidden, except the instruction sequences in the whitelist below,
which must not cross <em>bundle</em> boundaries:</li>
</ul>
<blockquote>
<div><pre>
mov %rbp, %rsp
mov %rsp, %rbp
mov ..., %ebp
; restoration of %RBP from memory, register or stack - keeps the
; invariant intact
add %rZP, %rbp
mov ..., %esp
; restoration of %RSP from memory, register or stack - keeps the
; invariant intact
add %rZP, %rsp
lea xxx(%rbp), %esp
add %rZP, %rsp ; restoration of %RSP from %RBP with adjust
sub ..., %esp
add %rZP, %rsp ; stack space allocation
add ..., %esp
add %rZP, %rsp ; stack space deallocation
and $XX, %rsp ; alignment; XX must be between -128 and -1
pushq ...
popq ... ; except pop %RSP, pop %RBP
</pre>
</div></blockquote>
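<p>Two of the whitelisted sequences can be modelled numerically to show why
they keep <code>%RSP</code> in the safe zone. This is an illustrative sketch
with hypothetical function names; it assumes that the low 32 bits of
<code>R15</code> are zero, as the runtime invariants require.</p>

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF
GiB = 1 << 30


def add_esp_model(rsp, amount, r15):
    """Model `add amount, %esp` followed by `add %rZP, %rsp`."""
    esp = (rsp + amount) & MASK32  # the 32-bit add wraps and zero-extends into %rsp
    return r15 + esp               # rebasing restores the safe-zone invariant


def and_rsp_model(rsp, imm):
    """Model `and $imm, %rsp` with imm between -128 and -1 (sign-extended to 64 bits)."""
    assert -128 <= imm <= -1
    return rsp & (imm & MASK64)    # only some of the low 7 bits can be cleared


r15 = 1 << 36
rsp = r15 + 0x1000
print(hex(add_esp_model(rsp, 16, r15) - r15))
print(hex(add_esp_model(rsp, -0x2000, r15) - r15))  # wraps around within the 4GiB window
print(hex(and_rsp_model(r15 + 0x1238, -16) - r15))
```

<p>Between the two instructions of the first sequence <code>%RSP</code> holds
a value in <code>0..4GiB</code>, which is the exception noted under Runtime
Invariants and the reason the pair must stay within one bundle.</p>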
<h2 id="list-of-pseudo-instructions">List of Pseudo-instructions</h2>
<p>Pseudo-instructions were introduced to let the compiler maintain the
invariants without needing to know the code alignment rules. The
assembler guarantees 32-byte alignment for all <em>pseudo-instructions</em> in
the table below. In addition to the pseudo-instructions, one
pseudo-operand prefix is introduced: <code>%nacl</code>. Presence of the
<code>%nacl</code> operand prefix ensures that:</p>
<ul class="small-gap">
<li>The instruction <code>&quot;mov %eXX, %eXX&quot;</code> is added immediately before the
actual command using prefix <code>%nacl</code> (where <code>%eXX</code> is the 32-bit
part of the index register of the actual command, for example: in
operand <code>%nacl:(,%r11)</code>, the notation <code>%eXX</code> is referring to
<code>%r11d</code>)</li>
<li>The resulting sequence of two instructions does not cross the
<em>bundle</em> boundary.</li>
</ul>
<p>For example, the instruction:</p>
<pre>
mov %eax,%nacl:(%r15,%rdi,2)
</pre>
<p>is translated by the assembler to:</p>
<pre>
mov %edi,%edi
mov %eax,(%r15,%rdi,2)
</pre>
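<p>The textual rewrite shown above can be mimicked with a small string
transformation. This is a toy sketch, not the real assembler: the names
<code>low32</code> and <code>expand_nacl</code> are hypothetical, only operands
that contain an index register are handled, and bundle alignment of the
resulting pair is not modelled.</p>

```python
import re

NACL_OPERAND = re.compile(r"%nacl:\(([^)]*)\)")


def low32(reg64):
    """Map a 64-bit register name (without %) to the name of its 32-bit part."""
    if re.fullmatch(r"r(8|9|1[0-5])", reg64):
        return reg64 + "d"       # r8..r15 -> r8d..r15d
    if re.fullmatch(r"r(ax|bx|cx|dx|si|di|bp|sp)", reg64):
        return "e" + reg64[1:]   # rax -> eax, rdi -> edi, ...
    raise ValueError("not a 64-bit general-purpose register: " + reg64)


def expand_nacl(instruction):
    """Return the instruction sequence that replaces one %nacl-prefixed instruction."""
    match = NACL_OPERAND.search(instruction)
    if match is None:
        return [instruction]     # nothing to expand
    fields = [field.strip() for field in match.group(1).split(",")]
    if len(fields) < 2 or not fields[1]:
        raise ValueError("operand has no index register")
    reg32 = low32(fields[1].lstrip("%"))
    # Drop the %nacl: prefix and keep the memory operand itself unchanged.
    stripped = instruction[:match.start()] + "(" + match.group(1) + ")" + instruction[match.end():]
    return ["mov %{0},%{0}".format(reg32), stripped]


for line in expand_nacl("mov %eax,%nacl:(%r15,%rdi,2)"):
    print(line)
```

<p>For the example instruction this prints the same two-instruction sequence
as the translation shown above.</p>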
<p>The complete list of introduced <em>pseudo-instructions</em> is as follows:</p>
<table border=1>
<tbody>
<tr>
<td>Pseudo-instruction</td>
<td>Is translated to<br/>
</td>
</tr>
<tr>
<td>[rep] cmps %nacl:(%rsi),%nacl:(%rdi),%rZP<br/>
<i>(sandboxed cmps)</i><br/>
</td>
<td>mov %esi,%esi<br/>
lea (%rZP,%rsi,1),%rsi<br/>
mov %edi,%edi<br/>
lea (%rZP,%rdi,1),%rdi<br/>
[rep] cmps (%rsi),(%rdi)<i><br/>
</i>
</td>
</tr>
<tr>
<td>[rep] movs %nacl:(%rsi),%nacl:(%rdi),%rZP<br/>
<i>(sandboxed movs)</i><br/>
</td>
<td>mov %esi,%esi<br/>
lea (%rZP,%rsi,1),%rsi<br/>
mov %edi,%edi<br/>
lea (%rZP,%rdi,1),%rdi<br/>
[rep] movs (%rsi),(%rdi)<i><br/>
</i>
</td>
</tr>
<tr>
<td>naclasp ...,%rZP<br/>
<i>(sandboxed stack increment)</i></td>
<td>add ...,%esp<br/>
add %rZP,%rsp</td>
</tr>
<tr>
<td>naclcall %eXX,%rZP<br/>
<i>(sandboxed indirect call)</i></td>
<td>and $-32, %eXX<br/>
add %rZP, %rXX<br/>
call *%rXX<br/>
<i>Note: the assembler ensures all calls (including
naclcall) will end at the bundle boundary.</i></td>
</tr>
<tr>
<td>nacljmp %eXX,%rZP<br/>
<i>(sandboxed indirect jump)</i></td>
<td>and $-32,%eXX<br/>
add %rZP,%rXX<br/>
jmp *%rXX<br/>
</td>
</tr>
<tr>
<td>naclrestbp ...,%rZP<br/>
<i>(sandboxed %ebp/rbp restore)</i></td>
<td>mov ...,%ebp<br/>
add %rZP,%rbp</td>
</tr>
<tr>
<td>naclrestsp ...,%rZP<br/>
<i>(sandboxed %esp/rsp restore)</i></td>
<td>mov ...,%esp<br/>
add %rZP,%rsp</td>
</tr>
<tr>
<td>naclrestsp_noflags ...,%rZP<br/>
<i>(sandboxed %esp/rsp restore; does not modify flags)</i></td>
<td>mov ...,%esp<br/>
lea (%rsp,%rZP,1),%rsp</td>
</tr>
<tr>
<td>naclspadj $N,%rZP<br/>
<i>(sandboxed %esp/rsp restore from %rbp; includes $N offset)</i></td>
<td>lea N(%rbp),%esp<br/>
add %rZP,%rsp</td>
</tr>
<tr>
<td>naclssp ...,%rZP<br/>
<i>(sandboxed stack decrement)</i></td>
<td>sub ...,%esp<br/>
add %rZP,%rsp</td>
</tr>
<tr>
<td>[rep] scas %nacl:(%rdi),%?ax,%rZP<br/>
<i>(sandboxed scas)</i></td>
<td>mov %edi,%edi<br/>
lea (%rZP,%rdi,1),%rdi<br/>
[rep] scas (%rdi),%?ax<br/>
</td>
</tr>
<tr>
<td>[rep] stos %?ax,%nacl:(%rdi),%rZP<br/>
<i>(sandboxed stos)</i></td>
<td>mov %edi,%edi<br/>
lea (%rZP,%rdi,1),%rdi<br/>
[rep] stos %?ax,(%rdi)<br/>
</td>
</tr>
</tbody>
</table></section>
{{/partials.standard_nacl_article}}