<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
                      "http://www.w3.org/TR/html4/strict.dtd">

<html>
<head>
  <title>Kaleidoscope: Extending the Language: Mutable Variables / SSA
         construction</title>
  <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
  <meta name="author" content="Chris Lattner">
  <link rel="stylesheet" href="../llvm.css" type="text/css">
</head>

<body>

<div class="doc_title">Kaleidoscope: Extending the Language: Mutable Variables</div>

<ul>
<li><a href="index.html">Up to Tutorial Index</a></li>
<li>Chapter 7
  <ol>
    <li><a href="#intro">Chapter 7 Introduction</a></li>
    <li><a href="#why">Why is this a hard problem?</a></li>
    <li><a href="#memory">Memory in LLVM</a></li>
    <li><a href="#kalvars">Mutable Variables in Kaleidoscope</a></li>
    <li><a href="#adjustments">Adjusting Existing Variables for
     Mutation</a></li>
    <li><a href="#assignment">New Assignment Operator</a></li>
    <li><a href="#localvars">User-defined Local Variables</a></li>
    <li><a href="#code">Full Code Listing</a></li>
  </ol>
</li>
<li><a href="LangImpl8.html">Chapter 8</a>: Conclusion and other useful LLVM
 tidbits</li>
</ul>

<div class="doc_author">
  <p>Written by <a href="mailto:sabre@nondot.org">Chris Lattner</a></p>
</div>

<!-- *********************************************************************** -->
<div class="doc_section"><a name="intro">Chapter 7 Introduction</a></div>
<!-- *********************************************************************** -->

<div class="doc_text">

<p>Welcome to Chapter 7 of the "<a href="index.html">Implementing a language
with LLVM</a>" tutorial. In chapters 1 through 6, we've built a very
respectable, albeit simple, <a
href="http://en.wikipedia.org/wiki/Functional_programming">functional
programming language</a>. In our journey, we learned some parsing techniques,
how to build and represent an AST, how to build LLVM IR, and how to optimize
the resultant code as well as JIT compile it.</p>

<p>While Kaleidoscope is interesting as a functional language, the fact that it
is functional makes it "too easy" to generate LLVM IR for it. In particular, a
functional language makes it very easy to build LLVM IR directly in <a
href="http://en.wikipedia.org/wiki/Static_single_assignment_form">SSA form</a>.
Since LLVM requires that the input code be in SSA form, this is a very nice
property, and it is often unclear to newcomers how to generate code for an
imperative language with mutable variables.</p>

<p>The short (and happy) summary of this chapter is that there is no need for
your front-end to build SSA form: LLVM provides highly tuned and well-tested
support for this, though the way it works is a bit unexpected for some.</p>

</div>

<!-- *********************************************************************** -->
<div class="doc_section"><a name="why">Why is this a hard problem?</a></div>
<!-- *********************************************************************** -->

<div class="doc_text">

<p>
To understand why mutable variables cause complexities in SSA construction,
consider this extremely simple C example:
</p>

<div class="doc_code">
<pre>
int G, H;
int test(_Bool Condition) {
  int X;
  if (Condition)
    X = G;
  else
    X = H;
  return X;
}
</pre>
</div>

<p>In this case, we have the variable "X", whose value depends on the path
executed in the program. Because there are two different possible values for X
before the return instruction, a PHI node is inserted to merge the two values.
The LLVM IR that we want for this example looks like this:</p>

<div class="doc_code">
<pre>
@G = weak global i32 0   ; type of @G is i32*
@H = weak global i32 0   ; type of @H is i32*

define i32 @test(i1 %Condition) {
entry:
  br i1 %Condition, label %cond_true, label %cond_false

cond_true:
  %X.0 = load i32* @G
  br label %cond_next

cond_false:
  %X.1 = load i32* @H
  br label %cond_next

cond_next:
  %X.2 = phi i32 [ %X.1, %cond_false ], [ %X.0, %cond_true ]
  ret i32 %X.2
}
</pre>
</div>

<p>In this example, the loads from the G and H global variables are explicit in
the LLVM IR, and they live in the then/else branches of the if statement
(cond_true/cond_false). In order to merge the incoming values, the X.2 phi node
in the cond_next block selects the right value to use based on where control
flow is coming from: if control flow comes from the cond_false block, X.2 gets
the value of X.1. Alternatively, if control flow comes from cond_true, it gets
the value of X.0. The intent of this chapter is not to explain the details of
SSA form. For more information, see one of the many <a
href="http://en.wikipedia.org/wiki/Static_single_assignment_form">online
references</a>.</p>
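
<p>As a rough illustration of what the phi node does, the following C++ sketch
models it as a lookup keyed by the predecessor block. <tt>ToyPhi</tt> and
<tt>runTest</tt> are hypothetical names invented for this sketch; they are not
part of LLVM, and the globals G and H are passed in as plain parameters.</p>

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy model of a phi node (not LLVM's PHINode class): it records, for each
// predecessor block, the value that flows in along that edge.
struct ToyPhi {
  std::map<std::string, int> Incoming;

  void addIncoming(int Value, const std::string &PredBlock) {
    Incoming[PredBlock] = Value;
  }

  // Pick the value associated with the block control flow arrived from.
  int evaluate(const std::string &CameFrom) const {
    std::map<std::string, int>::const_iterator It = Incoming.find(CameFrom);
    assert(It != Incoming.end() && "no incoming value for this predecessor");
    return It->second;
  }
};

// Hypothetical stand-in for @test: G and H are parameters, not globals.
int runTest(bool Condition, int G, int H) {
  int X0 = 0, X1 = 0;
  std::string Pred;
  if (Condition) {
    X0 = G;                // cond_true:  %X.0 = load @G
    Pred = "cond_true";
  } else {
    X1 = H;                // cond_false: %X.1 = load @H
    Pred = "cond_false";
  }

  // cond_next: %X.2 = phi [ %X.1, %cond_false ], [ %X.0, %cond_true ]
  ToyPhi X2;
  X2.addIncoming(X1, "cond_false");
  X2.addIncoming(X0, "cond_true");
  return X2.evaluate(Pred);
}
```

<p>Calling <tt>runTest(true, 7, 9)</tt> follows the cond_true edge and yields
G's value, while <tt>runTest(false, 7, 9)</tt> yields H's value.</p>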

<p>The question for this article is "who places the phi nodes when lowering
assignments to mutable variables?". The issue here is that LLVM
<em>requires</em> that its IR be in SSA form: there is no "non-SSA" mode for it.
However, SSA construction requires non-trivial algorithms and data structures,
so it is inconvenient and wasteful for every front-end to have to reproduce this
logic.</p>

</div>

<!-- *********************************************************************** -->
<div class="doc_section"><a name="memory">Memory in LLVM</a></div>
<!-- *********************************************************************** -->

<div class="doc_text">

<p>The 'trick' here is that while LLVM does require all register values to be
in SSA form, it does not require (or permit) memory objects to be in SSA form.
In the example above, note that the loads from G and H are direct accesses to
G and H: they are not renamed or versioned. This differs from some other
compiler systems, which do try to version memory objects. In LLVM, instead of
encoding dataflow analysis of memory into the LLVM IR, it is handled with <a
href="../WritingAnLLVMPass.html">Analysis Passes</a> which are computed on
demand.</p>

<p>
With this in mind, the high-level idea is that we want to make a stack variable
(which lives in memory, because it is on the stack) for each mutable object in
a function. To take advantage of this trick, we need to talk about how LLVM
represents stack variables.
</p>

<p>In LLVM, all memory accesses are explicit with load/store instructions, and
it is carefully designed not to have (or need) an "address-of" operator. Notice
how the type of the @G/@H global variables is actually "i32*" even though the
variable is defined as "i32". What this means is that @G defines <em>space</em>
for an i32 in the global data area, but its <em>name</em> actually refers to the
address for that space. Stack variables work the same way, except that instead of
being declared with global variable definitions, they are declared with the
<a href="../LangRef.html#i_alloca">LLVM alloca instruction</a>:</p>

<div class="doc_code">
<pre>
define i32 @example() {
entry:
  %X = alloca i32           ; type of %X is i32*.
  ...
  %tmp = load i32* %X       ; load the stack value %X from the stack.
  %tmp2 = add i32 %tmp, 1   ; increment it
  store i32 %tmp2, i32* %X  ; store it back
  ...
</pre>
</div>

<p>This code shows an example of how you can declare and manipulate a stack
variable in the LLVM IR. Stack memory allocated with the alloca instruction is
fully general: you can pass the address of the stack slot to functions, you can
store it in other variables, etc. In our example above, we could rewrite the
example to use the alloca technique to avoid using a PHI node:</p>

<div class="doc_code">
<pre>
@G = weak global i32 0   ; type of @G is i32*
@H = weak global i32 0   ; type of @H is i32*

define i32 @test(i1 %Condition) {
entry:
  %X = alloca i32           ; type of %X is i32*.
  br i1 %Condition, label %cond_true, label %cond_false

cond_true:
  %X.0 = load i32* @G
  store i32 %X.0, i32* %X   ; Update X
  br label %cond_next

cond_false:
  %X.1 = load i32* @H
  store i32 %X.1, i32* %X   ; Update X
  br label %cond_next

cond_next:
  %X.2 = load i32* %X       ; Read X
  ret i32 %X.2
}
</pre>
</div>

<p>With this, we have discovered a way to handle arbitrary mutable variables
without the need to create Phi nodes at all:</p>

<ol>
<li>Each mutable variable becomes a stack allocation.</li>
<li>Each read of the variable becomes a load from the stack.</li>
<li>Each update of the variable becomes a store to the stack.</li>
<li>Taking the address of a variable just uses the stack address directly.</li>
</ol>
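
<p>The first three rules can be sketched in plain C++ with a toy stack frame.
<tt>ToyFrame</tt> and <tt>runTestWithAlloca</tt> are hypothetical names used
only for this illustration, not LLVM APIs; the counters exist purely to make the
memory operations visible.</p>

```cpp
#include <cassert>
#include <vector>

// Toy stack frame (an illustration, not an LLVM API). An "alloca" hands back
// a slot index, which plays the role of the variable's address.
struct ToyFrame {
  std::vector<int> Slots;
  unsigned NumLoads;
  unsigned NumStores;

  ToyFrame() : NumLoads(0), NumStores(0) {}

  // Rule 1: each mutable variable becomes a stack allocation.
  unsigned allocaSlot() {
    Slots.push_back(0);
    return static_cast<unsigned>(Slots.size() - 1);
  }

  // Rule 2: each read of the variable becomes a load.
  int load(unsigned Addr) {
    assert(Addr < Slots.size() && "load from unknown slot");
    ++NumLoads;
    return Slots[Addr];
  }

  // Rule 3: each update of the variable becomes a store.
  void store(int Value, unsigned Addr) {
    assert(Addr < Slots.size() && "store to unknown slot");
    ++NumStores;
    Slots[Addr] = Value;
  }
};

// Hypothetical stand-in for @test written in the alloca style.
int runTestWithAlloca(ToyFrame &Frame, bool Condition, int G, int H) {
  unsigned X = Frame.allocaSlot();  // %X = alloca i32
  if (Condition)
    Frame.store(G, X);              // cond_true:  update X
  else
    Frame.store(H, X);              // cond_false: update X
  return Frame.load(X);             // cond_next:  read X, no phi needed
}
```

<p>Each call performs one store and one load on the slot for X, which is the
memory activity that the alloca form of <tt>@test</tt> spells out
explicitly.</p>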

<p>While this solution has solved our immediate problem, it introduced another
one: we have now apparently introduced a lot of stack traffic for very simple
and common operations, a major performance problem. Fortunately for us, the
LLVM optimizer has a highly tuned optimization pass named "mem2reg" that handles
this case, promoting allocas like this into SSA registers and inserting Phi
nodes as appropriate. If you run this example through the pass, for example,
you'll get:</p>

<div class="doc_code">
<pre>
$ <b>llvm-as &lt; example.ll | opt -mem2reg | llvm-dis</b>
@G = weak global i32 0
@H = weak global i32 0

define i32 @test(i1 %Condition) {
entry:
  br i1 %Condition, label %cond_true, label %cond_false

cond_true:
  %X.0 = load i32* @G
  br label %cond_next

cond_false:
  %X.1 = load i32* @H
  br label %cond_next

cond_next:
  %X.01 = phi i32 [ %X.1, %cond_false ], [ %X.0, %cond_true ]
  ret i32 %X.01
}
</pre>
</div>

<p>The mem2reg pass implements the standard "iterated dominance frontier"
algorithm for constructing SSA form and has a number of optimizations that speed
up (very common) degenerate cases. The mem2reg optimization pass is the answer to dealing
with mutable variables, and we highly recommend that you depend on it. Note that
mem2reg only works on variables in certain circumstances:</p>

<ol>
<li>mem2reg is alloca-driven: it looks for allocas and if it can handle them, it
promotes them. It does not apply to global variables or heap allocations.</li>

<li>mem2reg only looks for alloca instructions in the entry block of the
function. Being in the entry block guarantees that the alloca is only executed
once, which makes analysis simpler.</li>

<li>mem2reg only promotes allocas whose uses are direct loads and stores. If
the address of the stack object is passed to a function, or if any funny pointer
arithmetic is involved, the alloca will not be promoted.</li>

<li>mem2reg only works on allocas of <a
href="../LangRef.html#t_classifications">first class</a>
values (such as pointers, scalars and vectors), and only if the array size
of the allocation is 1 (or missing in the .ll file). mem2reg is not capable of
promoting structs or arrays to registers. Note that the "scalarrepl" pass is
more powerful and can promote structs, "unions", and arrays in many cases.</li>

</ol>
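
<p>Read together, conditions 2 through 4 amount to a simple predicate over
facts about one alloca (condition 1 just says that allocas are the only
candidates). The sketch below is only a restatement of the list in C++;
<tt>ToyAlloca</tt>, <tt>ToyUseKind</tt> and <tt>looksPromotable</tt> are
hypothetical names, and this is not how the real pass is implemented.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// How a toy alloca's address is used (hypothetical classification).
enum ToyUseKind { UseLoad, UseStore, UsePassedToCall, UsePointerArithmetic };

// The facts the conditions ask about, gathered in one place.
struct ToyAlloca {
  bool InEntryBlock;             // condition 2
  bool IsFirstClassType;         // condition 4
  unsigned ArraySize;            // condition 4
  std::vector<ToyUseKind> Uses;  // condition 3

  ToyAlloca() : InEntryBlock(true), IsFirstClassType(true), ArraySize(1) {}
};

bool looksPromotable(const ToyAlloca &A) {
  if (!A.InEntryBlock) return false;
  if (!A.IsFirstClassType || A.ArraySize != 1) return false;
  // Every use must be a direct load or store of the slot.
  for (std::size_t i = 0, e = A.Uses.size(); i != e; ++i)
    if (A.Uses[i] != UseLoad && A.Uses[i] != UseStore)
      return false;
  return true;
}

// Small self-check that doubles as a usage example.
void checkLooksPromotable() {
  ToyAlloca Simple;
  Simple.Uses.push_back(UseLoad);
  Simple.Uses.push_back(UseStore);
  assert(looksPromotable(Simple));

  ToyAlloca Escaping = Simple;
  Escaping.Uses.push_back(UsePassedToCall);  // address escapes to a callee
  assert(!looksPromotable(Escaping));
}
```

<p>A front-end that creates one scalar alloca per variable in the entry block
and only ever loads from and stores to it passes every check in this
sketch.</p>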

<p>
All of these properties are easy to satisfy for most imperative languages, and
we'll illustrate this below with Kaleidoscope. The final question you may be
asking is: should I bother with this nonsense for my front-end? Wouldn't it be
better if I just did SSA construction directly, avoiding use of the mem2reg
optimization pass? In short, we strongly recommend that you use this technique
for building SSA form, unless there is an extremely good reason not to. Using
this technique is:</p>

<ul>
<li>Proven and well tested: llvm-gcc and clang both use this technique for local
mutable variables. As such, the most common clients of LLVM are using this to
handle the bulk of their variables. You can be sure that bugs are found fast and
fixed early.</li>

<li>Extremely fast: mem2reg has a number of special cases that make it fast in
common cases as well as fully general. For example, it has fast-paths for
variables that are only used in a single block, variables that only have one
assignment point, good heuristics to avoid insertion of unneeded phi nodes, etc.
</li>

<li>Needed for debug info generation: <a href="../SourceLevelDebugging.html">
Debug information in LLVM</a> relies on having the address of the variable
exposed so that debug info can be attached to it. This technique dovetails
very naturally with this style of debug info.</li>
</ul>

<p>If nothing else, this makes it much easier to get your front-end up and
running, and is very simple to implement. Let's extend Kaleidoscope with mutable
variables now!
</p>

</div>

<!-- *********************************************************************** -->
<div class="doc_section"><a name="kalvars">Mutable Variables in
Kaleidoscope</a></div>
<!-- *********************************************************************** -->

<div class="doc_text">

<p>Now that we know the sort of problem we want to tackle, let's see what this
looks like in the context of our little Kaleidoscope language. We're going to
add two features:</p>

<ol>
<li>The ability to mutate variables with the '=' operator.</li>
<li>The ability to define new variables.</li>
</ol>

<p>While the first item is really what this is about, we only have variables
for incoming arguments as well as for induction variables, and redefining those only
goes so far :). Also, the ability to define new variables is a
useful thing regardless of whether you will be mutating them. Here's a
motivating example that shows how we could use these:</p>

<div class="doc_code">
<pre>
# Define ':' for sequencing: as a low-precedence operator that ignores operands
# and just returns the RHS.
def binary : 1 (x y) y;

# Recursive fib, we could do this before.
def fib(x)
  if (x &lt; 3) then
    1
  else
    fib(x-1)+fib(x-2);

# Iterative fib.
def fibi(x)
  <b>var a = 1, b = 1, c in</b>
  (for i = 3, i &lt; x in
     <b>c = a + b</b> :
     <b>a = b</b> :
     <b>b = c</b>) :
  b;

# Call it.
fibi(10);
</pre>
</div>

<p>
In order to mutate variables, we have to change our existing variables to use
the "alloca trick". Once we have that, we'll add our new operator, then extend
Kaleidoscope to support new variable definitions.
</p>

</div>

<!-- *********************************************************************** -->
<div class="doc_section"><a name="adjustments">Adjusting Existing Variables for
Mutation</a></div>
<!-- *********************************************************************** -->

<div class="doc_text">

<p>
The symbol table in Kaleidoscope is managed at code generation time by the
'<tt>NamedValues</tt>' map. This map currently keeps track of the LLVM "Value*"
that holds the double value for the named variable. In order to support
mutation, we need to change this slightly, so that <tt>NamedValues</tt> holds
the <em>memory location</em> of the variable in question. Note that this
change is a refactoring: it changes the structure of the code, but does not
(by itself) change the behavior of the compiler. All of these changes are
isolated in the Kaleidoscope code generator.</p>

<p>
At this point in Kaleidoscope's development, it only supports variables for two
things: incoming arguments to functions and the induction variable of 'for'
loops. For consistency, we'll allow mutation of these variables in addition to
other user-defined variables. This means that these will both need memory
locations.
</p>

<p>To start our transformation of Kaleidoscope, we'll change the NamedValues
map so that it maps to AllocaInst* instead of Value*. Once we do this, the C++
compiler will tell us what parts of the code we need to update:</p>

<div class="doc_code">
<pre>
static std::map&lt;std::string, AllocaInst*&gt; NamedValues;
</pre>
</div>

<p>Also, since we will need to create these allocas, we'll use a helper
function that ensures that the allocas are created in the entry block of the
function:</p>

<div class="doc_code">
<pre>
/// CreateEntryBlockAlloca - Create an alloca instruction in the entry block of
/// the function. This is used for mutable variables etc.
static AllocaInst *CreateEntryBlockAlloca(Function *TheFunction,
                                          const std::string &amp;VarName) {
  IRBuilder&lt;&gt; TmpB(&amp;TheFunction-&gt;getEntryBlock(),
                 TheFunction-&gt;getEntryBlock().begin());
  return TmpB.CreateAlloca(Type::getDoubleTy(getGlobalContext()), 0,
                           VarName.c_str());
}
</pre>
</div>

<p>This funny-looking code creates an IRBuilder object that is pointing at
the first instruction (.begin()) of the entry block. It then creates an alloca
with the expected name and returns it. Because all values in Kaleidoscope are
doubles, there is no need to pass in a type to use.</p>
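
<p>The effect of pointing the builder at <tt>.begin()</tt> can be pictured with
a toy basic block that is just a list of textual instructions. This is only a
sketch of the idea: <tt>ToyBlock</tt> and <tt>insertEntryAlloca</tt> are
hypothetical names and do not use the IRBuilder API.</p>

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy basic block: just the textual instructions, in order.
typedef std::vector<std::string> ToyBlock;

// Insert "%Name = alloca double" at the very start of the entry block,
// regardless of how many instructions have already been emitted into it.
void insertEntryAlloca(ToyBlock &EntryBlock, const std::string &VarName) {
  assert(!VarName.empty() && "variable needs a name");
  EntryBlock.insert(EntryBlock.begin(), "%" + VarName + " = alloca double");
}
```

<p>Even when the entry block already contains instructions, the new alloca ends
up first, which keeps every alloca in the entry block, where mem2reg looks for
them.</p>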

<p>With this in place, the first functionality change we want to make is to
variable references. In our new scheme, variables live on the stack, so code
generating a reference to them actually needs to produce a load from the stack
slot:</p>

<div class="doc_code">
<pre>
Value *VariableExprAST::Codegen() {
  // Look this variable up in the function.
  Value *V = NamedValues[Name];
  if (V == 0) return ErrorV("Unknown variable name");

  <b>// Load the value.
  return Builder.CreateLoad(V, Name.c_str());</b>
}
</pre>
</div>

<p>As you can see, this is pretty straightforward. Now we need to update the
things that define the variables to set up the alloca. We'll start with
<tt>ForExprAST::Codegen</tt> (see the <a href="#code">full code listing</a> for
the unabridged code):</p>

<div class="doc_code">
<pre>
  Function *TheFunction = Builder.GetInsertBlock()-&gt;getParent();

  <b>// Create an alloca for the variable in the entry block.
  AllocaInst *Alloca = CreateEntryBlockAlloca(TheFunction, VarName);</b>

  // Emit the start code first, without 'variable' in scope.
  Value *StartVal = Start-&gt;Codegen();
  if (StartVal == 0) return 0;

  <b>// Store the value into the alloca.
  Builder.CreateStore(StartVal, Alloca);</b>
  ...

  // Compute the end condition.
  Value *EndCond = End-&gt;Codegen();
  if (EndCond == 0) return EndCond;

  <b>// Reload, increment, and restore the alloca. This handles the case where
  // the body of the loop mutates the variable.
  Value *CurVar = Builder.CreateLoad(Alloca);
  Value *NextVar = Builder.CreateAdd(CurVar, StepVal, "nextvar");
  Builder.CreateStore(NextVar, Alloca);</b>
  ...
</pre>
</div>

<p>This code is virtually identical to the code <a
href="LangImpl5.html#forcodegen">before we allowed mutable variables</a>. The
big difference is that we no longer have to construct a PHI node, and we use
load/store to access the variable as needed.</p>

<p>To support mutable argument variables, we need to also make allocas for them.
The code for this is also pretty simple:</p>

<div class="doc_code">
<pre>
/// CreateArgumentAllocas - Create an alloca for each argument and register the
/// argument in the symbol table so that references to it will succeed.
void PrototypeAST::CreateArgumentAllocas(Function *F) {
  Function::arg_iterator AI = F-&gt;arg_begin();
  for (unsigned Idx = 0, e = Args.size(); Idx != e; ++Idx, ++AI) {
    // Create an alloca for this variable.
    AllocaInst *Alloca = CreateEntryBlockAlloca(F, Args[Idx]);

    // Store the initial value into the alloca.
    Builder.CreateStore(AI, Alloca);

    // Add arguments to variable symbol table.
    NamedValues[Args[Idx]] = Alloca;
  }
}
</pre>
</div>

<p>For each argument, we make an alloca, store the input value to the function
into the alloca, and register the alloca as the memory location for the
argument. This method gets invoked by <tt>FunctionAST::Codegen</tt> right after
it sets up the entry block for the function.</p>

<p>The final missing piece is adding the mem2reg pass, which allows us to get
good codegen once again:</p>

<div class="doc_code">
<pre>
    // Set up the optimizer pipeline. Start with registering info about how the
    // target lays out data structures.
    OurFPM.add(new TargetData(*TheExecutionEngine-&gt;getTargetData()));
    <b>// Promote allocas to registers.
    OurFPM.add(createPromoteMemoryToRegisterPass());</b>
    // Do simple "peephole" optimizations and bit-twiddling optzns.
    OurFPM.add(createInstructionCombiningPass());
    // Reassociate expressions.
    OurFPM.add(createReassociatePass());
</pre>
</div>

<p>It is interesting to see what the code looks like before and after the
mem2reg optimization runs. For example, this is the before/after code for our
recursive fib function. Before the optimization:</p>

<div class="doc_code">
<pre>
define double @fib(double %x) {
entry:
  <b>%x1 = alloca double
  store double %x, double* %x1
  %x2 = load double* %x1</b>
  %cmptmp = fcmp ult double %x2, 3.000000e+00
  %booltmp = uitofp i1 %cmptmp to double
  %ifcond = fcmp one double %booltmp, 0.000000e+00
  br i1 %ifcond, label %then, label %else

then:    ; preds = %entry
  br label %ifcont

else:    ; preds = %entry
  <b>%x3 = load double* %x1</b>
  %subtmp = sub double %x3, 1.000000e+00
  %calltmp = call double @fib( double %subtmp )
  <b>%x4 = load double* %x1</b>
  %subtmp5 = sub double %x4, 2.000000e+00
  %calltmp6 = call double @fib( double %subtmp5 )
  %addtmp = add double %calltmp, %calltmp6
  br label %ifcont

ifcont:    ; preds = %else, %then
  %iftmp = phi double [ 1.000000e+00, %then ], [ %addtmp, %else ]
  ret double %iftmp
}
</pre>
</div>

<p>Here there is only one variable (x, the input argument) but you can still
see the extremely simple-minded code generation strategy we are using. In the
entry block, an alloca is created, and the initial input value is stored into
it. Each reference to the variable does a reload from the stack. Also, note
that we didn't modify the if/then/else expression, so it still inserts a PHI
node. While we could make an alloca for it, it is actually easier to create a
PHI node for it, so we still just make the PHI.</p>

<p>Here is the code after the mem2reg pass runs:</p>

<div class="doc_code">
<pre>
define double @fib(double %x) {
entry:
  %cmptmp = fcmp ult double <b>%x</b>, 3.000000e+00
  %booltmp = uitofp i1 %cmptmp to double
  %ifcond = fcmp one double %booltmp, 0.000000e+00
  br i1 %ifcond, label %then, label %else

then:
  br label %ifcont

else:
  %subtmp = sub double <b>%x</b>, 1.000000e+00
  %calltmp = call double @fib( double %subtmp )
  %subtmp5 = sub double <b>%x</b>, 2.000000e+00
  %calltmp6 = call double @fib( double %subtmp5 )
  %addtmp = add double %calltmp, %calltmp6
  br label %ifcont

ifcont:    ; preds = %else, %then
  %iftmp = phi double [ 1.000000e+00, %then ], [ %addtmp, %else ]
  ret double %iftmp
}
</pre>
</div>

<p>This is a trivial case for mem2reg, since there are no redefinitions of the
variable. The point of showing this is to calm your tension about inserting
such blatant inefficiencies :).</p>

<p>After the rest of the optimizers run, we get:</p>

<div class="doc_code">
<pre>
define double @fib(double %x) {
entry:
  %cmptmp = fcmp ult double %x, 3.000000e+00
  %booltmp = uitofp i1 %cmptmp to double
  %ifcond = fcmp ueq double %booltmp, 0.000000e+00
  br i1 %ifcond, label %else, label %ifcont

else:
  %subtmp = sub double %x, 1.000000e+00
  %calltmp = call double @fib( double %subtmp )
  %subtmp5 = sub double %x, 2.000000e+00
  %calltmp6 = call double @fib( double %subtmp5 )
  %addtmp = add double %calltmp, %calltmp6
  ret double %addtmp

ifcont:
  ret double 1.000000e+00
}
</pre>
</div>

<p>Here we see that the simplifycfg pass decided to clone the return instruction
into the end of the 'else' block. This allowed it to eliminate some branches
and the PHI node.</p>

<p>Now that all symbol table references are updated to use stack variables,
we'll add the assignment operator.</p>

</div>

<!-- *********************************************************************** -->
<div class="doc_section"><a name="assignment">New Assignment Operator</a></div>
<!-- *********************************************************************** -->

<div class="doc_text">

<p>With our current framework, adding a new assignment operator is really
simple. We will parse it just like any other binary operator, but handle it
internally (instead of allowing the user to define it). The first step is to
set a precedence:</p>

<div class="doc_code">
<pre>
 int main() {
   // Install standard binary operators.
   // 1 is lowest precedence.
   <b>BinopPrecedence['='] = 2;</b>
   BinopPrecedence['&lt;'] = 10;
   BinopPrecedence['+'] = 20;
   BinopPrecedence['-'] = 20;
</pre>
</div>

<p>Now that the parser knows the precedence of the binary operator, it takes
care of all the parsing and AST generation. We just need to implement codegen
for the assignment operator. This looks like:</p>

<div class="doc_code">
<pre>
Value *BinaryExprAST::Codegen() {
  // Special case '=' because we don't want to emit the LHS as an expression.
  if (Op == '=') {
    // Assignment requires the LHS to be an identifier.
    VariableExprAST *LHSE = dynamic_cast&lt;VariableExprAST*&gt;(LHS);
    if (!LHSE)
      return ErrorV("destination of '=' must be a variable");
</pre>
</div>

<p>Unlike the rest of the binary operators, our assignment operator doesn't
follow the "emit LHS, emit RHS, do computation" model. As such, it is handled
as a special case before the other binary operators are handled. The other
strange thing is that it requires the LHS to be a variable. It is invalid to
have "(x+1) = expr" - only things like "x = expr" are allowed.
</p>
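
<p>The "LHS must be a variable" check is just a downcast on the AST node. The
following stand-alone sketch shows the same idea on a tiny toy AST; the
<tt>Toy*</tt> classes and <tt>getAssignTarget</tt> are hypothetical and far
smaller than the tutorial's real AST classes.</p>

```cpp
#include <cassert>
#include <string>

// Toy AST (hypothetical names, much smaller than the tutorial's classes).
struct ToyExpr {
  virtual ~ToyExpr() {}
};

struct ToyVariable : ToyExpr {
  std::string Name;
  explicit ToyVariable(const std::string &N) : Name(N) {}
};

struct ToyBinary : ToyExpr {
  char Op;
  ToyExpr *LHS, *RHS;
  ToyBinary(char O, ToyExpr *L, ToyExpr *R) : Op(O), LHS(L), RHS(R) {}
};

// Returns the variable being assigned, or null when the left-hand side is
// not a plain variable reference (for example "(x+1)").
const ToyVariable *getAssignTarget(const ToyBinary &Assign) {
  assert(Assign.Op == '=' && "only meaningful for assignments");
  return dynamic_cast<const ToyVariable *>(Assign.LHS);
}
```

<p>For a node representing "x = y" the function returns the node for x; for
"(x+y) = y" it returns null, which is the case the real code reports as an
error.</p>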

<div class="doc_code">
<pre>
    // Codegen the RHS.
    Value *Val = RHS-&gt;Codegen();
    if (Val == 0) return 0;

    // Look up the name.
    Value *Variable = NamedValues[LHSE-&gt;getName()];
    if (Variable == 0) return ErrorV("Unknown variable name");

    Builder.CreateStore(Val, Variable);
    return Val;
  }
  ...
</pre>
</div>

<p>Once we have the variable, codegen'ing the assignment is straightforward:
we emit the RHS of the assignment, create a store, and return the computed
value. Returning a value allows for chained assignments like "X = (Y = Z)".</p>
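
<p>To see why returning the stored value enables chaining, here is a minimal
sketch that treats memory as a map from names to doubles. <tt>ToyMemory</tt>
and <tt>assignVar</tt> are hypothetical helpers for illustration; they evaluate
values directly instead of emitting IR.</p>

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy memory: variable name -> current value of its stack slot.
typedef std::map<std::string, double> ToyMemory;

// Sketch of assignment as an expression: store Val into the named slot and
// return the same value so an enclosing expression can use it.
double assignVar(ToyMemory &Mem, const std::string &Name, double Val) {
  ToyMemory::iterator Slot = Mem.find(Name);
  assert(Slot != Mem.end() && "Unknown variable name");
  Slot->second = Val;  // the "store"
  return Val;          // the value of the assignment expression
}
```

<p>With X and Y both present in the map, "X = (Y = 4)" corresponds to
<tt>assignVar(Mem, "X", assignVar(Mem, "Y", 4.0))</tt>: the inner call stores
into Y and hands 4 to the outer call, which stores it into X.</p>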

<p>Now that we have an assignment operator, we can mutate loop variables and
arguments. For example, we can now run code like this:</p>

<div class="doc_code">
<pre>
# Function to print a double.
extern printd(x);

# Define ':' for sequencing: as a low-precedence operator that ignores operands
# and just returns the RHS.
def binary : 1 (x y) y;

def test(x)
  printd(x) :
  x = 4 :
  printd(x);

test(123);
</pre>
</div>

<p>When run, this example prints "123" and then "4", showing that we did
actually mutate the value! Okay, we have now officially implemented our goal:
getting this to work requires SSA construction in the general case. However,
to be really useful, we want the ability to define our own local variables, so
let's add this next!
</p>

</div>

<!-- *********************************************************************** -->
<div class="doc_section"><a name="localvars">User-defined Local
Variables</a></div>
<!-- *********************************************************************** -->

<div class="doc_text">

<p>Adding var/in is just like any other extension we made to
Kaleidoscope: we extend the lexer, the parser, the AST and the code generator.
The first step for adding our new 'var/in' construct is to extend the lexer.
As before, this is pretty trivial; the code looks like this:</p>

<div class="doc_code">
<pre>
enum Token {
  ...
  <b>// var definition
  tok_var = -13</b>
...
}
...
static int gettok() {
...
    if (IdentifierStr == "in") return tok_in;
    if (IdentifierStr == "binary") return tok_binary;
    if (IdentifierStr == "unary") return tok_unary;
    <b>if (IdentifierStr == "var") return tok_var;</b>
    return tok_identifier;
...
</pre>
</div>
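
<p>Since the lexer excerpt above is abridged, here is a self-contained sketch
of just the keyword check. Only the value <tt>tok_var = -13</tt> comes from the
text; the other token values and the names <tt>ToyToken</tt> and
<tt>classifyIdentifier</tt> are assumptions made for this illustration.</p>

```cpp
#include <cassert>
#include <string>

// Token codes. Only toy_var = -13 mirrors the text above; the remaining
// values are assumptions made for this sketch.
enum ToyToken {
  toy_identifier = -4,
  toy_in = -10,
  toy_binary = -11,
  toy_unary = -12,
  toy_var = -13
};

// Classify an already-scanned identifier: keywords get their own token and
// everything else is an ordinary identifier.
int classifyIdentifier(const std::string &IdentifierStr) {
  assert(!IdentifierStr.empty() && "lexer never produces an empty identifier");
  if (IdentifierStr == "in") return toy_in;
  if (IdentifierStr == "binary") return toy_binary;
  if (IdentifierStr == "unary") return toy_unary;
  if (IdentifierStr == "var") return toy_var;
  return toy_identifier;
}
```

<p>Because whole identifiers are compared, a name such as "variable" is still
an ordinary identifier; only the exact string "var" becomes the keyword.</p>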

<p>The next step is to define the AST node that we will construct. For var/in,
it looks like this:</p>

<div class="doc_code">
<pre>
/// VarExprAST - Expression class for var/in
class VarExprAST : public ExprAST {
  std::vector&lt;std::pair&lt;std::string, ExprAST*&gt; &gt; VarNames;
  ExprAST *Body;
public:
  VarExprAST(const std::vector&lt;std::pair&lt;std::string, ExprAST*&gt; &gt; &amp;varnames,
             ExprAST *body)
  : VarNames(varnames), Body(body) {}

  virtual Value *Codegen();
};
</pre>
</div>

<p>var/in allows a list of names to be defined all at once, and each name can
optionally have an initializer value. As such, we capture this information in
the VarNames vector. Also, var/in has a body, which is allowed to access
the variables defined by the var/in.</p>

<p>With this in place, we can define the parser pieces. The first thing we do is add
it as a primary expression:</p>

<div class="doc_code">
<pre>
/// primary
///   ::= identifierexpr
///   ::= numberexpr
///   ::= parenexpr
///   ::= ifexpr
///   ::= forexpr
<b>///   ::= varexpr</b>
static ExprAST *ParsePrimary() {
  switch (CurTok) {
  default: return Error("unknown token when expecting an expression");
  case tok_identifier: return ParseIdentifierExpr();
  case tok_number:     return ParseNumberExpr();
  case '(':            return ParseParenExpr();
  case tok_if:         return ParseIfExpr();
  case tok_for:        return ParseForExpr();
  <b>case tok_var:        return ParseVarExpr();</b>
  }
}
</pre>
</div>

<p>Next we define ParseVarExpr:</p>

<div class="doc_code">
<pre>
/// varexpr ::= 'var' identifier ('=' expression)?
///                   (',' identifier ('=' expression)?)* 'in' expression
static ExprAST *ParseVarExpr() {
  getNextToken();  // eat the var.

  std::vector&lt;std::pair&lt;std::string, ExprAST*&gt; &gt; VarNames;

  // At least one variable name is required.
  if (CurTok != tok_identifier)
    return Error("expected identifier after var");
</pre>
</div>

<p>The first part of this code parses the list of identifier/expr pairs into the
local <tt>VarNames</tt> vector.</p>

<div class="doc_code">
<pre>
  while (1) {
    std::string Name = IdentifierStr;
    getNextToken();  // eat identifier.

    // Read the optional initializer.
    ExprAST *Init = 0;
    if (CurTok == '=') {
      getNextToken(); // eat the '='.

      Init = ParseExpression();
      if (Init == 0) return 0;
    }

    VarNames.push_back(std::make_pair(Name, Init));

    // End of var list, exit loop.
    if (CurTok != ',') break;
    getNextToken(); // eat the ','.

    if (CurTok != tok_identifier)
      return Error("expected identifier list after var");
  }
</pre>
</div>
<p>Once all the variables are parsed, we then parse the body and create the
AST node:</p>

<div class="doc_code">
<pre>
  // At this point, we have to have 'in'.
  if (CurTok != tok_in)
    return Error("expected 'in' keyword after 'var'");
  getNextToken();  // eat 'in'.

  ExprAST *Body = ParseExpression();
  if (Body == 0) return 0;

  return new VarExprAST(VarNames, Body);
}
</pre>
</div>
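<p>To see the shape of the var list grammar in isolation, here is a minimal
sketch that runs without the rest of the parser.  It is not the tutorial's
code: the names <tt>VarDecl</tt>, <tt>IsIdentifier</tt> and
<tt>ParseVarList</tt> are hypothetical, tokens are assumed to be plain strings,
and an initializer is assumed to be a single token rather than a full
expression.</p>

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a parsed declaration: (name, initializer or "").
typedef std::pair<std::string, std::string> VarDecl;

// A token counts as an identifier if it is alphanumeric, starts with a
// letter, and is not one of the two keywords used by this sketch.
static bool IsIdentifier(const std::string &Tok) {
  if (Tok.empty() || Tok == "var" || Tok == "in") return false;
  if (!std::isalpha(static_cast<unsigned char>(Tok[0]))) return false;
  for (std::size_t i = 1; i != Tok.size(); ++i)
    if (!std::isalnum(static_cast<unsigned char>(Tok[i]))) return false;
  return true;
}

// Parses: 'var' name ('=' init)? (',' name ('=' init)?)* 'in'
// On success, fills Decls and leaves Pos at the first token of the body.
static bool ParseVarList(const std::vector<std::string> &Toks,
                         std::size_t &Pos, std::vector<VarDecl> &Decls) {
  if (Pos >= Toks.size() || Toks[Pos] != "var") return false;
  ++Pos; // eat 'var'.

  while (true) {
    // A variable name is required at the start of every entry.
    if (Pos >= Toks.size() || !IsIdentifier(Toks[Pos])) return false;
    std::string Name = Toks[Pos++];

    // Read the optional initializer.
    std::string Init;
    if (Pos < Toks.size() && Toks[Pos] == "=") {
      ++Pos; // eat '='.
      if (Pos >= Toks.size()) return false;
      Init = Toks[Pos++];
    }
    Decls.push_back(std::make_pair(Name, Init));

    // End of var list, exit loop.
    if (Pos >= Toks.size() || Toks[Pos] != ",") break;
    ++Pos; // eat ','.
  }

  // At this point, we have to have 'in'.
  if (Pos >= Toks.size() || Toks[Pos] != "in") return false;
  ++Pos; // eat 'in'.
  return true;
}
```

<p>Feeding it the tokens of <tt>var a = 1, b in a</tt> yields two entries, the
second with an empty initializer, which corresponds to the null <tt>Init</tt>
pointer stored by ParseVarExpr.</p>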
<p>Now that we can parse and represent the code, we need to support emission of
LLVM IR for it.  This code starts out with:</p>

<div class="doc_code">
<pre>
Value *VarExprAST::Codegen() {
  std::vector&lt;AllocaInst *&gt; OldBindings;

  Function *TheFunction = Builder.GetInsertBlock()-&gt;getParent();

  // Register all variables and emit their initializer.
  for (unsigned i = 0, e = VarNames.size(); i != e; ++i) {
    const std::string &amp;VarName = VarNames[i].first;
    ExprAST *Init = VarNames[i].second;
</pre>
</div>
<p>The code loops over all the variables, installing them one at a time.
For each variable we put into the symbol table, we save the binding that it
replaces in <tt>OldBindings</tt>.</p>

<div class="doc_code">
<pre>
    // Emit the initializer before adding the variable to scope, this prevents
    // the initializer from referencing the variable itself, and permits stuff
    // like this:
    //  var a = 1 in
    //    var a = a in ...   # refers to outer 'a'.
    Value *InitVal;
    if (Init) {
      InitVal = Init-&gt;Codegen();
      if (InitVal == 0) return 0;
    } else { // If not specified, use 0.0.
      InitVal = ConstantFP::get(getGlobalContext(), APFloat(0.0));
    }

    AllocaInst *Alloca = CreateEntryBlockAlloca(TheFunction, VarName);
    Builder.CreateStore(InitVal, Alloca);

    // Remember the old variable binding so that we can restore the binding when
    // we unrecurse.
    OldBindings.push_back(NamedValues[VarName]);

    // Remember this binding.
    NamedValues[VarName] = Alloca;
  }
</pre>
</div>
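<p>The ordering matters: the initializer is evaluated while the <em>old</em>
binding is still visible, and only then does the new name enter scope.  The
following sketch illustrates that ordering with plain doubles instead of LLVM
values.  The names <tt>Env</tt>, <tt>OuterAPlusOne</tt> and
<tt>DeclareVar</tt> are hypothetical and exist only for this
illustration.</p>

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical scope for this sketch: name -> current value.
typedef std::map<std::string, double> Env;

// Hypothetical initializer expression "a + 1", evaluated in a given scope.
static double OuterAPlusOne(const Env &Scope) {
  Env::const_iterator It = Scope.find("a");
  return (It == Scope.end() ? 0.0 : It->second) + 1.0;
}

// Same order as the codegen loop: evaluate the initializer first, then bind.
static void DeclareVar(Env &Scope, const std::string &Name,
                       double (*Init)(const Env &)) {
  // With no initializer the variable defaults to 0.0.
  double InitVal = Init ? Init(Scope) : 0.0;
  // Only now does Name enter scope (or shadow an earlier binding).
  Scope[Name] = InitVal;
}
```

<p>With an outer <tt>a</tt> equal to 1, declaring a new <tt>a</tt> with the
initializer <tt>a + 1</tt> produces 2, because the initializer still sees the
outer variable.  Binding first and evaluating second would have read the
freshly created variable instead.</p>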
<p>There are more comments here than code.  The basic idea is that we emit the
initializer, create the alloca, then update the symbol table to point to it.
Once all the variables are installed in the symbol table, we evaluate the body
of the var/in expression:</p>

<div class="doc_code">
<pre>
  // Codegen the body, now that all vars are in scope.
  Value *BodyVal = Body-&gt;Codegen();
  if (BodyVal == 0) return 0;
</pre>
</div>
<p>Finally, before returning, we restore the previous variable bindings:</p>

<div class="doc_code">
<pre>
  // Pop all our variables from scope.
  for (unsigned i = 0, e = VarNames.size(); i != e; ++i)
    NamedValues[VarNames[i].first] = OldBindings[i];

  // Return the body computation.
  return BodyVal;
}
</pre>
</div>
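<p>The save/restore pattern can be tried out without LLVM.  The sketch below is
an illustration under stated assumptions, not the tutorial's implementation:
<tt>SymbolTable</tt>, <tt>WithBindings</tt> and <tt>SetXTo42</tt> are
hypothetical names, and pointers to doubles stand in for
<tt>AllocaInst*</tt>.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for NamedValues: name -> pointer to storage.
typedef std::map<std::string, double *> SymbolTable;

// Binds each name to fresh storage, runs Body, then restores old bindings.
static double WithBindings(SymbolTable &Table,
                           const std::vector<std::string> &Names,
                           double (*Body)(SymbolTable &)) {
  // Sized up front so the element addresses stay valid for the whole call.
  std::vector<double> Storage(Names.size(), 0.0);
  std::vector<double *> OldBindings;

  for (std::size_t i = 0; i != Names.size(); ++i) {
    // operator[] yields a null pointer if the name was previously unbound.
    OldBindings.push_back(Table[Names[i]]);
    Table[Names[i]] = &Storage[i];
  }

  double Result = Body(Table);

  // Pop all our variables from scope.
  for (std::size_t i = 0; i != Names.size(); ++i)
    Table[Names[i]] = OldBindings[i];
  return Result;
}

// Hypothetical body: assigns to whichever 'x' is currently in scope.
static double SetXTo42(SymbolTable &Table) {
  *Table["x"] = 42.0;
  return *Table["x"];
}
```

<p>A name that was unbound before the var/in ends up mapped to a null pointer
afterwards rather than being erased.  That is harmless here because the
variable lookup in the full listing treats a null entry as an unknown
variable.  One corner case worth noting: if the same name is listed twice in
one var list, restoring in forward order leaves the first of the two inner
bindings installed, so walking <tt>OldBindings</tt> in reverse would be the
safer choice.</p>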
<p>The end result of all of this is that we get properly scoped variable
definitions, and we even (trivially) allow mutation of them :).</p>

<p>With this, we have completed what we set out to do.  Our nice iterative fib
example from the intro compiles and runs just fine.  The mem2reg pass optimizes
all of our stack variables into SSA registers, inserting PHI nodes where needed,
and our front-end remains simple: no "iterated dominance frontier" computation
anywhere in sight.</p>
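<p>As a rough cross-check of what such a program computes, here is a C++
rendering of an iterative fib.  It assumes the example has the usual shape
(three mutable variables <tt>a</tt>, <tt>b</tt>, <tt>c</tt> and a loop starting
at 3 with the condition <tt>i &lt; x</tt>), and it mirrors the loop order that
ForExprAST::Codegen emits in the listing below: the body runs, the end
condition is evaluated, the variable is stepped, and only then is the branch
taken.</p>

```cpp
#include <cassert>

// Assumed shape of the iterative fib example, written as plain C++.
static double fibi(double x) {
  double a = 1, b = 1, c = 0; // var a = 1, b = 1, c in   (c defaults to 0.0)
  double i = 3;               // for i = 3, i < x in
  bool Again;
  do {
    // Loop body.
    c = a + b;
    a = b;
    b = c;
    // The end condition is computed before the increment is stored.
    Again = i < x;
    i = i + 1.0; // default step of 1.0
  } while (Again);
  return b;
}
```

<p>Under these assumptions <tt>fibi(10)</tt> evaluates to 55.  Because the loop
is emitted in do-while form, the body always runs at least once, so arguments
below 3 do not yield Fibonacci numbers.</p>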
</div>

<!-- *********************************************************************** -->
<div class="doc_section"><a name="code">Full Code Listing</a></div>
<!-- *********************************************************************** -->

<div class="doc_text">

<p>
Here is the complete code listing for our running example, enhanced with mutable
variables and var/in support.  To build this example, use:
</p>

<div class="doc_code">
<pre>
   # Compile
   g++ -g toy.cpp `llvm-config --cppflags --ldflags --libs core jit native` -O3 -o toy
   # Run
   ./toy
</pre>
</div>

<p>Here is the code:</p>

<div class="doc_code">
<pre>
#include "llvm/DerivedTypes.h"
#include "llvm/ExecutionEngine/ExecutionEngine.h"
#include "llvm/ExecutionEngine/Interpreter.h"
#include "llvm/ExecutionEngine/JIT.h"
#include "llvm/LLVMContext.h"
#include "llvm/Module.h"
#include "llvm/ModuleProvider.h"
#include "llvm/PassManager.h"
#include "llvm/Analysis/Verifier.h"
#include "llvm/Target/TargetData.h"
#include "llvm/Target/TargetSelect.h"
#include "llvm/Transforms/Scalar.h"
#include "llvm/Support/IRBuilder.h"
#include &lt;cstdio&gt;
#include &lt;string&gt;
#include &lt;map&gt;
#include &lt;vector&gt;
using namespace llvm;

//===----------------------------------------------------------------------===//
// Lexer
//===----------------------------------------------------------------------===//

// The lexer returns tokens [0-255] if it is an unknown character, otherwise one
// of these for known things.
enum Token {
  tok_eof = -1,

  // commands
  tok_def = -2, tok_extern = -3,

  // primary
  tok_identifier = -4, tok_number = -5,

  // control
  tok_if = -6, tok_then = -7, tok_else = -8,
  tok_for = -9, tok_in = -10,

  // operators
  tok_binary = -11, tok_unary = -12,

  // var definition
  tok_var = -13
};

static std::string IdentifierStr;  // Filled in if tok_identifier
static double NumVal;              // Filled in if tok_number

/// gettok - Return the next token from standard input.
static int gettok() {
  static int LastChar = ' ';

  // Skip any whitespace.
  while (isspace(LastChar))
    LastChar = getchar();

  if (isalpha(LastChar)) { // identifier: [a-zA-Z][a-zA-Z0-9]*
    IdentifierStr = LastChar;
    while (isalnum((LastChar = getchar())))
      IdentifierStr += LastChar;

    if (IdentifierStr == "def") return tok_def;
    if (IdentifierStr == "extern") return tok_extern;
    if (IdentifierStr == "if") return tok_if;
    if (IdentifierStr == "then") return tok_then;
    if (IdentifierStr == "else") return tok_else;
    if (IdentifierStr == "for") return tok_for;
    if (IdentifierStr == "in") return tok_in;
    if (IdentifierStr == "binary") return tok_binary;
    if (IdentifierStr == "unary") return tok_unary;
    if (IdentifierStr == "var") return tok_var;
    return tok_identifier;
  }

  if (isdigit(LastChar) || LastChar == '.') {   // Number: [0-9.]+
    std::string NumStr;
    do {
      NumStr += LastChar;
      LastChar = getchar();
    } while (isdigit(LastChar) || LastChar == '.');

    NumVal = strtod(NumStr.c_str(), 0);
    return tok_number;
  }

  if (LastChar == '#') {
    // Comment until end of line.
    do LastChar = getchar();
    while (LastChar != EOF &amp;&amp; LastChar != '\n' &amp;&amp; LastChar != '\r');

    if (LastChar != EOF)
      return gettok();
  }

  // Check for end of file.  Don't eat the EOF.
  if (LastChar == EOF)
    return tok_eof;

  // Otherwise, just return the character as its ascii value.
  int ThisChar = LastChar;
  LastChar = getchar();
  return ThisChar;
}

//===----------------------------------------------------------------------===//
// Abstract Syntax Tree (aka Parse Tree)
//===----------------------------------------------------------------------===//

/// ExprAST - Base class for all expression nodes.
class ExprAST {
public:
  virtual ~ExprAST() {}
  virtual Value *Codegen() = 0;
};

/// NumberExprAST - Expression class for numeric literals like "1.0".
class NumberExprAST : public ExprAST {
  double Val;
public:
  NumberExprAST(double val) : Val(val) {}
  virtual Value *Codegen();
};

/// VariableExprAST - Expression class for referencing a variable, like "a".
class VariableExprAST : public ExprAST {
  std::string Name;
public:
  VariableExprAST(const std::string &amp;name) : Name(name) {}
  const std::string &amp;getName() const { return Name; }
  virtual Value *Codegen();
};

/// UnaryExprAST - Expression class for a unary operator.
class UnaryExprAST : public ExprAST {
  char Opcode;
  ExprAST *Operand;
public:
  UnaryExprAST(char opcode, ExprAST *operand)
    : Opcode(opcode), Operand(operand) {}
  virtual Value *Codegen();
};

/// BinaryExprAST - Expression class for a binary operator.
class BinaryExprAST : public ExprAST {
  char Op;
  ExprAST *LHS, *RHS;
public:
  BinaryExprAST(char op, ExprAST *lhs, ExprAST *rhs)
    : Op(op), LHS(lhs), RHS(rhs) {}
  virtual Value *Codegen();
};

/// CallExprAST - Expression class for function calls.
class CallExprAST : public ExprAST {
  std::string Callee;
  std::vector&lt;ExprAST*&gt; Args;
public:
  CallExprAST(const std::string &amp;callee, std::vector&lt;ExprAST*&gt; &amp;args)
    : Callee(callee), Args(args) {}
  virtual Value *Codegen();
};

/// IfExprAST - Expression class for if/then/else.
class IfExprAST : public ExprAST {
  ExprAST *Cond, *Then, *Else;
public:
  IfExprAST(ExprAST *cond, ExprAST *then, ExprAST *_else)
    : Cond(cond), Then(then), Else(_else) {}
  virtual Value *Codegen();
};

/// ForExprAST - Expression class for for/in.
class ForExprAST : public ExprAST {
  std::string VarName;
  ExprAST *Start, *End, *Step, *Body;
public:
  ForExprAST(const std::string &amp;varname, ExprAST *start, ExprAST *end,
             ExprAST *step, ExprAST *body)
    : VarName(varname), Start(start), End(end), Step(step), Body(body) {}
  virtual Value *Codegen();
};

/// VarExprAST - Expression class for var/in
class VarExprAST : public ExprAST {
  std::vector&lt;std::pair&lt;std::string, ExprAST*&gt; &gt; VarNames;
  ExprAST *Body;
public:
  VarExprAST(const std::vector&lt;std::pair&lt;std::string, ExprAST*&gt; &gt; &amp;varnames,
             ExprAST *body)
    : VarNames(varnames), Body(body) {}

  virtual Value *Codegen();
};

/// PrototypeAST - This class represents the "prototype" for a function,
/// which captures its argument names as well as if it is an operator.
class PrototypeAST {
  std::string Name;
  std::vector&lt;std::string&gt; Args;
  bool isOperator;
  unsigned Precedence;  // Precedence if a binary op.
public:
  PrototypeAST(const std::string &amp;name, const std::vector&lt;std::string&gt; &amp;args,
               bool isoperator = false, unsigned prec = 0)
    : Name(name), Args(args), isOperator(isoperator), Precedence(prec) {}

  bool isUnaryOp() const { return isOperator &amp;&amp; Args.size() == 1; }
  bool isBinaryOp() const { return isOperator &amp;&amp; Args.size() == 2; }

  char getOperatorName() const {
    assert(isUnaryOp() || isBinaryOp());
    return Name[Name.size()-1];
  }

  unsigned getBinaryPrecedence() const { return Precedence; }

  Function *Codegen();

  void CreateArgumentAllocas(Function *F);
};

/// FunctionAST - This class represents a function definition itself.
class FunctionAST {
  PrototypeAST *Proto;
  ExprAST *Body;
public:
  FunctionAST(PrototypeAST *proto, ExprAST *body)
    : Proto(proto), Body(body) {}

  Function *Codegen();
};

//===----------------------------------------------------------------------===//
// Parser
//===----------------------------------------------------------------------===//

/// CurTok/getNextToken - Provide a simple token buffer.  CurTok is the current
/// token the parser is looking at.  getNextToken reads another token from the
/// lexer and updates CurTok with its results.
static int CurTok;
static int getNextToken() {
  return CurTok = gettok();
}

/// BinopPrecedence - This holds the precedence for each binary operator that is
/// defined.
static std::map&lt;char, int&gt; BinopPrecedence;

/// GetTokPrecedence - Get the precedence of the pending binary operator token.
static int GetTokPrecedence() {
  if (!isascii(CurTok))
    return -1;

  // Make sure it's a declared binop.
  int TokPrec = BinopPrecedence[CurTok];
  if (TokPrec &lt;= 0) return -1;
  return TokPrec;
}

/// Error* - These are little helper functions for error handling.
ExprAST *Error(const char *Str) { fprintf(stderr, "Error: %s\n", Str);return 0;}
PrototypeAST *ErrorP(const char *Str) { Error(Str); return 0; }
FunctionAST *ErrorF(const char *Str) { Error(Str); return 0; }

static ExprAST *ParseExpression();

/// identifierexpr
///   ::= identifier
///   ::= identifier '(' expression* ')'
static ExprAST *ParseIdentifierExpr() {
  std::string IdName = IdentifierStr;

  getNextToken();  // eat identifier.

  if (CurTok != '(') // Simple variable ref.
    return new VariableExprAST(IdName);

  // Call.
  getNextToken();  // eat (
  std::vector&lt;ExprAST*&gt; Args;
  if (CurTok != ')') {
    while (1) {
      ExprAST *Arg = ParseExpression();
      if (!Arg) return 0;
      Args.push_back(Arg);

      if (CurTok == ')') break;

      if (CurTok != ',')
        return Error("Expected ')' or ',' in argument list");
      getNextToken();
    }
  }

  // Eat the ')'.
  getNextToken();

  return new CallExprAST(IdName, Args);
}

/// numberexpr ::= number
static ExprAST *ParseNumberExpr() {
  ExprAST *Result = new NumberExprAST(NumVal);
  getNextToken(); // consume the number
  return Result;
}

/// parenexpr ::= '(' expression ')'
static ExprAST *ParseParenExpr() {
  getNextToken();  // eat (.
  ExprAST *V = ParseExpression();
  if (!V) return 0;

  if (CurTok != ')')
    return Error("expected ')'");
  getNextToken();  // eat ).
  return V;
}

/// ifexpr ::= 'if' expression 'then' expression 'else' expression
static ExprAST *ParseIfExpr() {
  getNextToken();  // eat the if.

  // condition.
  ExprAST *Cond = ParseExpression();
  if (!Cond) return 0;

  if (CurTok != tok_then)
    return Error("expected then");
  getNextToken();  // eat the then

  ExprAST *Then = ParseExpression();
  if (Then == 0) return 0;

  if (CurTok != tok_else)
    return Error("expected else");

  getNextToken();

  ExprAST *Else = ParseExpression();
  if (!Else) return 0;

  return new IfExprAST(Cond, Then, Else);
}

/// forexpr ::= 'for' identifier '=' expr ',' expr (',' expr)? 'in' expression
static ExprAST *ParseForExpr() {
  getNextToken();  // eat the for.

  if (CurTok != tok_identifier)
    return Error("expected identifier after for");

  std::string IdName = IdentifierStr;
  getNextToken();  // eat identifier.

  if (CurTok != '=')
    return Error("expected '=' after for");
  getNextToken();  // eat '='.


  ExprAST *Start = ParseExpression();
  if (Start == 0) return 0;
  if (CurTok != ',')
    return Error("expected ',' after for start value");
  getNextToken();

  ExprAST *End = ParseExpression();
  if (End == 0) return 0;

  // The step value is optional.
  ExprAST *Step = 0;
  if (CurTok == ',') {
    getNextToken();
    Step = ParseExpression();
    if (Step == 0) return 0;
  }

  if (CurTok != tok_in)
    return Error("expected 'in' after for");
  getNextToken();  // eat 'in'.

  ExprAST *Body = ParseExpression();
  if (Body == 0) return 0;

  return new ForExprAST(IdName, Start, End, Step, Body);
}

/// varexpr ::= 'var' identifier ('=' expression)?
///                   (',' identifier ('=' expression)?)* 'in' expression
static ExprAST *ParseVarExpr() {
  getNextToken();  // eat the var.

  std::vector&lt;std::pair&lt;std::string, ExprAST*&gt; &gt; VarNames;

  // At least one variable name is required.
  if (CurTok != tok_identifier)
    return Error("expected identifier after var");

  while (1) {
    std::string Name = IdentifierStr;
    getNextToken();  // eat identifier.

    // Read the optional initializer.
    ExprAST *Init = 0;
    if (CurTok == '=') {
      getNextToken(); // eat the '='.

      Init = ParseExpression();
      if (Init == 0) return 0;
    }

    VarNames.push_back(std::make_pair(Name, Init));

    // End of var list, exit loop.
    if (CurTok != ',') break;
    getNextToken(); // eat the ','.

    if (CurTok != tok_identifier)
      return Error("expected identifier list after var");
  }

  // At this point, we have to have 'in'.
  if (CurTok != tok_in)
    return Error("expected 'in' keyword after 'var'");
  getNextToken();  // eat 'in'.

  ExprAST *Body = ParseExpression();
  if (Body == 0) return 0;

  return new VarExprAST(VarNames, Body);
}


/// primary
///   ::= identifierexpr
///   ::= numberexpr
///   ::= parenexpr
///   ::= ifexpr
///   ::= forexpr
///   ::= varexpr
static ExprAST *ParsePrimary() {
  switch (CurTok) {
  default: return Error("unknown token when expecting an expression");
  case tok_identifier: return ParseIdentifierExpr();
  case tok_number:     return ParseNumberExpr();
  case '(':            return ParseParenExpr();
  case tok_if:         return ParseIfExpr();
  case tok_for:        return ParseForExpr();
  case tok_var:        return ParseVarExpr();
  }
}

/// unary
///   ::= primary
///   ::= '!' unary
static ExprAST *ParseUnary() {
  // If the current token is not an operator, it must be a primary expr.
  if (!isascii(CurTok) || CurTok == '(' || CurTok == ',')
    return ParsePrimary();

  // If this is a unary operator, read it.
  int Opc = CurTok;
  getNextToken();
  if (ExprAST *Operand = ParseUnary())
    return new UnaryExprAST(Opc, Operand);
  return 0;
}

/// binoprhs
///   ::= ('+' unary)*
static ExprAST *ParseBinOpRHS(int ExprPrec, ExprAST *LHS) {
  // If this is a binop, find its precedence.
  while (1) {
    int TokPrec = GetTokPrecedence();

    // If this is a binop that binds at least as tightly as the current binop,
    // consume it, otherwise we are done.
    if (TokPrec &lt; ExprPrec)
      return LHS;

    // Okay, we know this is a binop.
    int BinOp = CurTok;
    getNextToken();  // eat binop

    // Parse the unary expression after the binary operator.
    ExprAST *RHS = ParseUnary();
    if (!RHS) return 0;

    // If BinOp binds less tightly with RHS than the operator after RHS, let
    // the pending operator take RHS as its LHS.
    int NextPrec = GetTokPrecedence();
    if (TokPrec &lt; NextPrec) {
      RHS = ParseBinOpRHS(TokPrec+1, RHS);
      if (RHS == 0) return 0;
    }

    // Merge LHS/RHS.
    LHS = new BinaryExprAST(BinOp, LHS, RHS);
  }
}

/// expression
///   ::= unary binoprhs
///
static ExprAST *ParseExpression() {
  ExprAST *LHS = ParseUnary();
  if (!LHS) return 0;

  return ParseBinOpRHS(0, LHS);
}

/// prototype
///   ::= id '(' id* ')'
///   ::= binary LETTER number? (id, id)
///   ::= unary LETTER (id)
static PrototypeAST *ParsePrototype() {
  std::string FnName;

  int Kind = 0;  // 0 = identifier, 1 = unary, 2 = binary.
  unsigned BinaryPrecedence = 30;

  switch (CurTok) {
  default:
    return ErrorP("Expected function name in prototype");
  case tok_identifier:
    FnName = IdentifierStr;
    Kind = 0;
    getNextToken();
    break;
  case tok_unary:
    getNextToken();
    if (!isascii(CurTok))
      return ErrorP("Expected unary operator");
    FnName = "unary";
    FnName += (char)CurTok;
    Kind = 1;
    getNextToken();
    break;
  case tok_binary:
    getNextToken();
    if (!isascii(CurTok))
      return ErrorP("Expected binary operator");
    FnName = "binary";
    FnName += (char)CurTok;
    Kind = 2;
    getNextToken();

    // Read the precedence if present.
    if (CurTok == tok_number) {
      if (NumVal &lt; 1 || NumVal &gt; 100)
        return ErrorP("Invalid precedence: must be 1..100");
      BinaryPrecedence = (unsigned)NumVal;
      getNextToken();
    }
    break;
  }

  if (CurTok != '(')
    return ErrorP("Expected '(' in prototype");

  std::vector&lt;std::string&gt; ArgNames;
  while (getNextToken() == tok_identifier)
    ArgNames.push_back(IdentifierStr);
  if (CurTok != ')')
    return ErrorP("Expected ')' in prototype");

  // success.
  getNextToken();  // eat ')'.

  // Verify right number of names for operator.
  if (Kind &amp;&amp; ArgNames.size() != Kind)
    return ErrorP("Invalid number of operands for operator");

  return new PrototypeAST(FnName, ArgNames, Kind != 0, BinaryPrecedence);
}

/// definition ::= 'def' prototype expression
static FunctionAST *ParseDefinition() {
  getNextToken();  // eat def.
  PrototypeAST *Proto = ParsePrototype();
  if (Proto == 0) return 0;

  if (ExprAST *E = ParseExpression())
    return new FunctionAST(Proto, E);
  return 0;
}

/// toplevelexpr ::= expression
static FunctionAST *ParseTopLevelExpr() {
  if (ExprAST *E = ParseExpression()) {
    // Make an anonymous proto.
    PrototypeAST *Proto = new PrototypeAST("", std::vector&lt;std::string&gt;());
    return new FunctionAST(Proto, E);
  }
  return 0;
}

/// external ::= 'extern' prototype
static PrototypeAST *ParseExtern() {
  getNextToken();  // eat extern.
  return ParsePrototype();
}

//===----------------------------------------------------------------------===//
// Code Generation
//===----------------------------------------------------------------------===//

static Module *TheModule;
static IRBuilder&lt;&gt; Builder(getGlobalContext());
static std::map&lt;std::string, AllocaInst*&gt; NamedValues;
static FunctionPassManager *TheFPM;

Value *ErrorV(const char *Str) { Error(Str); return 0; }

/// CreateEntryBlockAlloca - Create an alloca instruction in the entry block of
/// the function.  This is used for mutable variables etc.
static AllocaInst *CreateEntryBlockAlloca(Function *TheFunction,
                                          const std::string &amp;VarName) {
  IRBuilder&lt;&gt; TmpB(&amp;TheFunction-&gt;getEntryBlock(),
                 TheFunction-&gt;getEntryBlock().begin());
  return TmpB.CreateAlloca(Type::getDoubleTy(getGlobalContext()), 0, VarName.c_str());
}


Value *NumberExprAST::Codegen() {
  return ConstantFP::get(getGlobalContext(), APFloat(Val));
}

Value *VariableExprAST::Codegen() {
  // Look this variable up in the function.
  Value *V = NamedValues[Name];
  if (V == 0) return ErrorV("Unknown variable name");

  // Load the value.
  return Builder.CreateLoad(V, Name.c_str());
}

Value *UnaryExprAST::Codegen() {
  Value *OperandV = Operand-&gt;Codegen();
  if (OperandV == 0) return 0;

  Function *F = TheModule-&gt;getFunction(std::string("unary")+Opcode);
  if (F == 0)
    return ErrorV("Unknown unary operator");

  return Builder.CreateCall(F, OperandV, "unop");
}


Value *BinaryExprAST::Codegen() {
  // Special case '=' because we don't want to emit the LHS as an expression.
  if (Op == '=') {
    // Assignment requires the LHS to be an identifier.
    VariableExprAST *LHSE = dynamic_cast&lt;VariableExprAST*&gt;(LHS);
    if (!LHSE)
      return ErrorV("destination of '=' must be a variable");
    // Codegen the RHS.
    Value *Val = RHS-&gt;Codegen();
    if (Val == 0) return 0;

    // Look up the name.
    Value *Variable = NamedValues[LHSE-&gt;getName()];
    if (Variable == 0) return ErrorV("Unknown variable name");

    Builder.CreateStore(Val, Variable);
    return Val;
  }


  Value *L = LHS-&gt;Codegen();
  Value *R = RHS-&gt;Codegen();
  if (L == 0 || R == 0) return 0;

  switch (Op) {
  case '+': return Builder.CreateAdd(L, R, "addtmp");
  case '-': return Builder.CreateSub(L, R, "subtmp");
  case '*': return Builder.CreateMul(L, R, "multmp");
  case '&lt;':
    L = Builder.CreateFCmpULT(L, R, "cmptmp");
    // Convert bool 0/1 to double 0.0 or 1.0
    return Builder.CreateUIToFP(L, Type::getDoubleTy(getGlobalContext()),
                                "booltmp");
  default: break;
  }

  // If it wasn't a builtin binary operator, it must be a user defined one. Emit
  // a call to it.
  Function *F = TheModule-&gt;getFunction(std::string("binary")+Op);
  assert(F &amp;&amp; "binary operator not found!");

  Value *Ops[] = { L, R };
  return Builder.CreateCall(F, Ops, Ops+2, "binop");
}

Value *CallExprAST::Codegen() {
  // Look up the name in the global module table.
  Function *CalleeF = TheModule-&gt;getFunction(Callee);
  if (CalleeF == 0)
    return ErrorV("Unknown function referenced");

  // If argument mismatch error.
  if (CalleeF-&gt;arg_size() != Args.size())
    return ErrorV("Incorrect # arguments passed");

  std::vector&lt;Value*&gt; ArgsV;
  for (unsigned i = 0, e = Args.size(); i != e; ++i) {
    ArgsV.push_back(Args[i]-&gt;Codegen());
    if (ArgsV.back() == 0) return 0;
  }

  return Builder.CreateCall(CalleeF, ArgsV.begin(), ArgsV.end(), "calltmp");
}

Value *IfExprAST::Codegen() {
  Value *CondV = Cond-&gt;Codegen();
  if (CondV == 0) return 0;

  // Convert condition to a bool by comparing not-equal to 0.0.
  CondV = Builder.CreateFCmpONE(CondV,
                              ConstantFP::get(getGlobalContext(), APFloat(0.0)),
                                "ifcond");

  Function *TheFunction = Builder.GetInsertBlock()-&gt;getParent();

  // Create blocks for the then and else cases.  Insert the 'then' block at the
  // end of the function.
  BasicBlock *ThenBB = BasicBlock::Create(getGlobalContext(), "then", TheFunction);
  BasicBlock *ElseBB = BasicBlock::Create(getGlobalContext(), "else");
  BasicBlock *MergeBB = BasicBlock::Create(getGlobalContext(), "ifcont");

  Builder.CreateCondBr(CondV, ThenBB, ElseBB);

  // Emit then value.
  Builder.SetInsertPoint(ThenBB);

  Value *ThenV = Then-&gt;Codegen();
  if (ThenV == 0) return 0;

  Builder.CreateBr(MergeBB);
  // Codegen of 'Then' can change the current block, update ThenBB for the PHI.
  ThenBB = Builder.GetInsertBlock();

  // Emit else block.
  TheFunction-&gt;getBasicBlockList().push_back(ElseBB);
  Builder.SetInsertPoint(ElseBB);

  Value *ElseV = Else-&gt;Codegen();
  if (ElseV == 0) return 0;

  Builder.CreateBr(MergeBB);
  // Codegen of 'Else' can change the current block, update ElseBB for the PHI.
  ElseBB = Builder.GetInsertBlock();

  // Emit merge block.
  TheFunction-&gt;getBasicBlockList().push_back(MergeBB);
  Builder.SetInsertPoint(MergeBB);
  PHINode *PN = Builder.CreatePHI(Type::getDoubleTy(getGlobalContext()),
                                  "iftmp");

  PN-&gt;addIncoming(ThenV, ThenBB);
  PN-&gt;addIncoming(ElseV, ElseBB);
  return PN;
}

Value *ForExprAST::Codegen() {
  // Output this as:
  //   var = alloca double
  //   ...
  //   start = startexpr
  //   store start -&gt; var
  //   goto loop
  // loop:
  //   ...
  //   bodyexpr
  //   ...
  // loopend:
  //   step = stepexpr
  //   endcond = endexpr
  //
  //   curvar = load var
  //   nextvar = curvar + step
  //   store nextvar -&gt; var
  //   br endcond, loop, outloop
  // outloop:

  Function *TheFunction = Builder.GetInsertBlock()-&gt;getParent();

  // Create an alloca for the variable in the entry block.
  AllocaInst *Alloca = CreateEntryBlockAlloca(TheFunction, VarName);

  // Emit the start code first, without 'variable' in scope.
  Value *StartVal = Start-&gt;Codegen();
  if (StartVal == 0) return 0;

  // Store the value into the alloca.
  Builder.CreateStore(StartVal, Alloca);

  // Make the new basic block for the loop header, inserting after current
  // block.
  BasicBlock *PreheaderBB = Builder.GetInsertBlock();
  BasicBlock *LoopBB = BasicBlock::Create(getGlobalContext(), "loop", TheFunction);

  // Insert an explicit fall through from the current block to the LoopBB.
  Builder.CreateBr(LoopBB);

  // Start insertion in LoopBB.
  Builder.SetInsertPoint(LoopBB);

  // Within the loop, the variable is bound to the alloca.  If it
  // shadows an existing variable, we have to restore it, so save it now.
  AllocaInst *OldVal = NamedValues[VarName];
  NamedValues[VarName] = Alloca;

  // Emit the body of the loop.  This, like any other expr, can change the
  // current BB.  Note that we ignore the value computed by the body, but don't
  // allow an error.
  if (Body-&gt;Codegen() == 0)
    return 0;

  // Emit the step value.
  Value *StepVal;
  if (Step) {
    StepVal = Step-&gt;Codegen();
    if (StepVal == 0) return 0;
  } else {
    // If not specified, use 1.0.
    StepVal = ConstantFP::get(getGlobalContext(), APFloat(1.0));
  }

  // Compute the end condition.
  Value *EndCond = End-&gt;Codegen();
  if (EndCond == 0) return EndCond;

  // Reload, increment, and restore the alloca.  This handles the case where
  // the body of the loop mutates the variable.
  Value *CurVar = Builder.CreateLoad(Alloca, VarName.c_str());
  Value *NextVar = Builder.CreateAdd(CurVar, StepVal, "nextvar");
  Builder.CreateStore(NextVar, Alloca);

  // Convert condition to a bool by comparing not-equal to 0.0.
  EndCond = Builder.CreateFCmpONE(EndCond,
                              ConstantFP::get(getGlobalContext(), APFloat(0.0)),
                                  "loopcond");

  // Create the "after loop" block and insert it.
  BasicBlock *LoopEndBB = Builder.GetInsertBlock();
  BasicBlock *AfterBB = BasicBlock::Create(getGlobalContext(), "afterloop", TheFunction);

  // Insert the conditional branch into the end of LoopEndBB.
  Builder.CreateCondBr(EndCond, LoopBB, AfterBB);

  // Any new code will be inserted in AfterBB.
  Builder.SetInsertPoint(AfterBB);

  // Restore the unshadowed variable.
  if (OldVal)
    NamedValues[VarName] = OldVal;
  else
    NamedValues.erase(VarName);


  // for expr always returns 0.0.
  return Constant::getNullValue(Type::getDoubleTy(getGlobalContext()));
}

1870 Value *VarExprAST::Codegen() {
1871 std::vector&lt;AllocaInst *&gt; OldBindings;
1873 Function *TheFunction = Builder.GetInsertBlock()-&gt;getParent();
1875 // Register all variables and emit their initializer.
1876 for (unsigned i = 0, e = VarNames.size(); i != e; ++i) {
1877 const std::string &amp;VarName = VarNames[i].first;
1878 ExprAST *Init = VarNames[i].second;
1880 // Emit the initializer before adding the variable to scope, this prevents
1881 // the initializer from referencing the variable itself, and permits stuff
1882 // like this:
1883 // var a = 1 in
1884 // var a = a in ... # refers to outer 'a'.
1885 Value *InitVal;
1886 if (Init) {
1887 InitVal = Init-&gt;Codegen();
1888 if (InitVal == 0) return 0;
1889 } else { // If not specified, use 0.0.
1890 InitVal = ConstantFP::get(getGlobalContext(), APFloat(0.0));
1893 AllocaInst *Alloca = CreateEntryBlockAlloca(TheFunction, VarName);
1894 Builder.CreateStore(InitVal, Alloca);
1896 // Remember the old variable binding so that we can restore the binding when
1897 // we unrecurse.
1898 OldBindings.push_back(NamedValues[VarName]);
1900 // Remember this binding.
1901 NamedValues[VarName] = Alloca;
1904 // Codegen the body, now that all vars are in scope.
1905 Value *BodyVal = Body-&gt;Codegen();
1906 if (BodyVal == 0) return 0;
1908 // Pop all our variables from scope.
1909 for (unsigned i = 0, e = VarNames.size(); i != e; ++i)
1910 NamedValues[VarNames[i].first] = OldBindings[i];
1912 // Return the body computation.
1913 return BodyVal;
Function *PrototypeAST::Codegen() {
  // Make the function type: double(double,double) etc.
  std::vector&lt;const Type*&gt; Doubles(Args.size(),
                                   Type::getDoubleTy(getGlobalContext()));
  FunctionType *FT = FunctionType::get(Type::getDoubleTy(getGlobalContext()),
                                       Doubles, false);

  Function *F = Function::Create(FT, Function::ExternalLinkage, Name, TheModule);

  // If F conflicted, there was already something named 'Name'.  If it has a
  // body, don't allow redefinition or reextern.
  if (F-&gt;getName() != Name) {
    // Delete the one we just made and get the existing one.
    F-&gt;eraseFromParent();
    F = TheModule-&gt;getFunction(Name);

    // If F already has a body, reject this.
    if (!F-&gt;empty()) {
      ErrorF("redefinition of function");
      return 0;
    }

    // If F took a different number of args, reject.
    if (F-&gt;arg_size() != Args.size()) {
      ErrorF("redefinition of function with different # args");
      return 0;
    }
  }

  // Set names for all arguments.
  unsigned Idx = 0;
  for (Function::arg_iterator AI = F-&gt;arg_begin(); Idx != Args.size();
       ++AI, ++Idx)
    AI-&gt;setName(Args[Idx]);

  return F;
}

/// CreateArgumentAllocas - Create an alloca for each argument and register the
/// argument in the symbol table so that references to it will succeed.
void PrototypeAST::CreateArgumentAllocas(Function *F) {
  Function::arg_iterator AI = F-&gt;arg_begin();
  for (unsigned Idx = 0, e = Args.size(); Idx != e; ++Idx, ++AI) {
    // Create an alloca for this variable.
    AllocaInst *Alloca = CreateEntryBlockAlloca(F, Args[Idx]);

    // Store the initial value into the alloca.
    Builder.CreateStore(AI, Alloca);

    // Add arguments to variable symbol table.
    NamedValues[Args[Idx]] = Alloca;
  }
}

Function *FunctionAST::Codegen() {
  NamedValues.clear();

  Function *TheFunction = Proto-&gt;Codegen();
  if (TheFunction == 0)
    return 0;

  // If this is an operator, install it.
  if (Proto-&gt;isBinaryOp())
    BinopPrecedence[Proto-&gt;getOperatorName()] = Proto-&gt;getBinaryPrecedence();

  // Create a new basic block to start insertion into.
  BasicBlock *BB = BasicBlock::Create(getGlobalContext(), "entry", TheFunction);
  Builder.SetInsertPoint(BB);

  // Add all arguments to the symbol table and create their allocas.
  Proto-&gt;CreateArgumentAllocas(TheFunction);

  if (Value *RetVal = Body-&gt;Codegen()) {
    // Finish off the function.
    Builder.CreateRet(RetVal);

    // Validate the generated code, checking for consistency.
    verifyFunction(*TheFunction);

    // Optimize the function.
    TheFPM-&gt;run(*TheFunction);

    return TheFunction;
  }

  // Error reading body, remove function.
  TheFunction-&gt;eraseFromParent();

  if (Proto-&gt;isBinaryOp())
    BinopPrecedence.erase(Proto-&gt;getOperatorName());
  return 0;
}

//===----------------------------------------------------------------------===//
// Top-Level parsing and JIT Driver
//===----------------------------------------------------------------------===//

static ExecutionEngine *TheExecutionEngine;

static void HandleDefinition() {
  if (FunctionAST *F = ParseDefinition()) {
    if (Function *LF = F-&gt;Codegen()) {
      fprintf(stderr, "Read function definition:");
      LF-&gt;dump();
    }
  } else {
    // Skip token for error recovery.
    getNextToken();
  }
}

static void HandleExtern() {
  if (PrototypeAST *P = ParseExtern()) {
    if (Function *F = P-&gt;Codegen()) {
      fprintf(stderr, "Read extern: ");
      F-&gt;dump();
    }
  } else {
    // Skip token for error recovery.
    getNextToken();
  }
}

static void HandleTopLevelExpression() {
  // Evaluate a top level expression into an anonymous function.
  if (FunctionAST *F = ParseTopLevelExpr()) {
    if (Function *LF = F-&gt;Codegen()) {
      // JIT the function, returning a function pointer.
      void *FPtr = TheExecutionEngine-&gt;getPointerToFunction(LF);

      // Cast it to the right type (takes no arguments, returns a double) so we
      // can call it as a native function.
      double (*FP)() = (double (*)())(intptr_t)FPtr;
      fprintf(stderr, "Evaluated to %f\n", FP());
    }
  } else {
    // Skip token for error recovery.
    getNextToken();
  }
}

/// top ::= definition | external | expression | ';'
static void MainLoop() {
  while (1) {
    fprintf(stderr, "ready&gt; ");
    switch (CurTok) {
    case tok_eof:    return;
    case ';':        getNextToken(); break;  // ignore top level semicolons.
    case tok_def:    HandleDefinition(); break;
    case tok_extern: HandleExtern(); break;
    default:         HandleTopLevelExpression(); break;
    }
  }
}

//===----------------------------------------------------------------------===//
// "Library" functions that can be "extern'd" from user code.
//===----------------------------------------------------------------------===//

/// putchard - putchar that takes a double and returns 0.
extern "C"
double putchard(double X) {
  putchar((char)X);
  return 0;
}

/// printd - printf that takes a double and prints it as "%f\n", returning 0.
extern "C"
double printd(double X) {
  printf("%f\n", X);
  return 0;
}

//===----------------------------------------------------------------------===//
// Main driver code.
//===----------------------------------------------------------------------===//

int main() {
  // Install standard binary operators.
  // 1 is lowest precedence.
  BinopPrecedence['='] = 2;
  BinopPrecedence['&lt;'] = 10;
  BinopPrecedence['+'] = 20;
  BinopPrecedence['-'] = 20;
  BinopPrecedence['*'] = 40;  // highest.

  // Prime the first token.
  fprintf(stderr, "ready&gt; ");
  getNextToken();

  // Make the module, which holds all the code.
  TheModule = new Module("my cool jit", getGlobalContext());

  ExistingModuleProvider *OurModuleProvider =
      new ExistingModuleProvider(TheModule);

  // Create the JIT.  This takes ownership of the module and module provider.
  TheExecutionEngine = EngineBuilder(OurModuleProvider).create();

  FunctionPassManager OurFPM(OurModuleProvider);

  // Set up the optimizer pipeline.  Start with registering info about how the
  // target lays out data structures.
  OurFPM.add(new TargetData(*TheExecutionEngine-&gt;getTargetData()));
  // Promote allocas to registers.
  OurFPM.add(createPromoteMemoryToRegisterPass());
  // Do simple "peephole" optimizations and bit-twiddling optzns.
  OurFPM.add(createInstructionCombiningPass());
  // Reassociate expressions.
  OurFPM.add(createReassociatePass());
  // Eliminate Common SubExpressions.
  OurFPM.add(createGVNPass());
  // Simplify the control flow graph (deleting unreachable blocks, etc).
  OurFPM.add(createCFGSimplificationPass());

  OurFPM.doInitialization();

  // Set the global so the code gen can use this.
  TheFPM = &amp;OurFPM;

  // Run the main "interpreter loop" now.
  MainLoop();

  TheFPM = 0;

  // Print out all of the generated code.
  TheModule-&gt;dump();

  return 0;
}
</pre>
</div>

<a href="LangImpl8.html">Next: Conclusion and other useful LLVM tidbits</a>
</div>

<!-- *********************************************************************** -->
<hr>
<address>
  <a href="http://jigsaw.w3.org/css-validator/check/referer"><img
  src="http://jigsaw.w3.org/css-validator/images/vcss" alt="Valid CSS!"></a>
  <a href="http://validator.w3.org/check/referer"><img
  src="http://www.w3.org/Icons/valid-html401" alt="Valid HTML 4.01!"></a>

  <a href="mailto:sabre@nondot.org">Chris Lattner</a><br>
  <a href="http://llvm.org">The LLVM Compiler Infrastructure</a><br>
  Last modified: $Date: 2007-10-17 11:05:13 -0700 (Wed, 17 Oct 2007) $
</address>
</body>
</html>