Fix PR4948 (and a leak): by not destroying the DwarfException
[llvm/avr.git] / docs / tutorial / LangImpl5.html
blob3ded1392afb10ef6132ac1f9f5d51407239b2cc2
1 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
2 "http://www.w3.org/TR/html4/strict.dtd">
4 <html>
5 <head>
6 <title>Kaleidoscope: Extending the Language: Control Flow</title>
7 <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
8 <meta name="author" content="Chris Lattner">
9 <link rel="stylesheet" href="../llvm.css" type="text/css">
10 </head>
12 <body>
14 <div class="doc_title">Kaleidoscope: Extending the Language: Control Flow</div>
16 <ul>
17 <li><a href="index.html">Up to Tutorial Index</a></li>
18 <li>Chapter 5
19 <ol>
20 <li><a href="#intro">Chapter 5 Introduction</a></li>
21 <li><a href="#ifthen">If/Then/Else</a>
22 <ol>
23 <li><a href="#iflexer">Lexer Extensions</a></li>
24 <li><a href="#ifast">AST Extensions</a></li>
25 <li><a href="#ifparser">Parser Extensions</a></li>
26 <li><a href="#ifir">LLVM IR</a></li>
27 <li><a href="#ifcodegen">Code Generation</a></li>
28 </ol>
29 </li>
30 <li><a href="#for">'for' Loop Expression</a>
31 <ol>
32 <li><a href="#forlexer">Lexer Extensions</a></li>
33 <li><a href="#forast">AST Extensions</a></li>
34 <li><a href="#forparser">Parser Extensions</a></li>
35 <li><a href="#forir">LLVM IR</a></li>
36 <li><a href="#forcodegen">Code Generation</a></li>
37 </ol>
38 </li>
39 <li><a href="#code">Full Code Listing</a></li>
40 </ol>
41 </li>
42 <li><a href="LangImpl6.html">Chapter 6</a>: Extending the Language:
43 User-defined Operators</li>
44 </ul>
46 <div class="doc_author">
47 <p>Written by <a href="mailto:sabre@nondot.org">Chris Lattner</a></p>
48 </div>
50 <!-- *********************************************************************** -->
51 <div class="doc_section"><a name="intro">Chapter 5 Introduction</a></div>
52 <!-- *********************************************************************** -->
54 <div class="doc_text">
56 <p>Welcome to Chapter 5 of the "<a href="index.html">Implementing a language
57 with LLVM</a>" tutorial. Parts 1-4 described the implementation of the simple
58 Kaleidoscope language and included support for generating LLVM IR, followed by
59 optimizations and a JIT compiler. Unfortunately, as presented, Kaleidoscope is
60 mostly useless: it has no control flow other than call and return. This means
61 that you can't have conditional branches in the code, significantly limiting its
62 power. In this episode of "build that compiler", we'll extend Kaleidoscope to
63 have an if/then/else expression plus a simple 'for' loop.</p>
65 </div>
67 <!-- *********************************************************************** -->
68 <div class="doc_section"><a name="ifthen">If/Then/Else</a></div>
69 <!-- *********************************************************************** -->
71 <div class="doc_text">
73 <p>
74 Extending Kaleidoscope to support if/then/else is quite straightforward. It
75 basically requires adding lexer support for this "new" concept to the lexer,
76 parser, AST, and LLVM code emitter. This example is nice, because it shows how
77 easy it is to "grow" a language over time, incrementally extending it as new
78 ideas are discovered.</p>
80 <p>Before we get going on "how" we add this extension, lets talk about "what" we
81 want. The basic idea is that we want to be able to write this sort of thing:
82 </p>
84 <div class="doc_code">
85 <pre>
86 def fib(x)
87 if x &lt; 3 then
89 else
90 fib(x-1)+fib(x-2);
91 </pre>
92 </div>
94 <p>In Kaleidoscope, every construct is an expression: there are no statements.
95 As such, the if/then/else expression needs to return a value like any other.
96 Since we're using a mostly functional form, we'll have it evaluate its
97 conditional, then return the 'then' or 'else' value based on how the condition
98 was resolved. This is very similar to the C "?:" expression.</p>
100 <p>The semantics of the if/then/else expression is that it evaluates the
101 condition to a boolean equality value: 0.0 is considered to be false and
102 everything else is considered to be true.
103 If the condition is true, the first subexpression is evaluated and returned, if
104 the condition is false, the second subexpression is evaluated and returned.
105 Since Kaleidoscope allows side-effects, this behavior is important to nail down.
106 </p>
108 <p>Now that we know what we "want", lets break this down into its constituent
109 pieces.</p>
111 </div>
113 <!-- ======================================================================= -->
114 <div class="doc_subsubsection"><a name="iflexer">Lexer Extensions for
115 If/Then/Else</a></div>
116 <!-- ======================================================================= -->
119 <div class="doc_text">
121 <p>The lexer extensions are straightforward. First we add new enum values
122 for the relevant tokens:</p>
124 <div class="doc_code">
125 <pre>
126 // control
127 tok_if = -6, tok_then = -7, tok_else = -8,
128 </pre>
129 </div>
131 <p>Once we have that, we recognize the new keywords in the lexer. This is pretty simple
132 stuff:</p>
134 <div class="doc_code">
135 <pre>
137 if (IdentifierStr == "def") return tok_def;
138 if (IdentifierStr == "extern") return tok_extern;
139 <b>if (IdentifierStr == "if") return tok_if;
140 if (IdentifierStr == "then") return tok_then;
141 if (IdentifierStr == "else") return tok_else;</b>
142 return tok_identifier;
143 </pre>
144 </div>
146 </div>
148 <!-- ======================================================================= -->
149 <div class="doc_subsubsection"><a name="ifast">AST Extensions for
150 If/Then/Else</a></div>
151 <!-- ======================================================================= -->
153 <div class="doc_text">
155 <p>To represent the new expression we add a new AST node for it:</p>
157 <div class="doc_code">
158 <pre>
159 /// IfExprAST - Expression class for if/then/else.
160 class IfExprAST : public ExprAST {
161 ExprAST *Cond, *Then, *Else;
162 public:
163 IfExprAST(ExprAST *cond, ExprAST *then, ExprAST *_else)
164 : Cond(cond), Then(then), Else(_else) {}
165 virtual Value *Codegen();
167 </pre>
168 </div>
170 <p>The AST node just has pointers to the various subexpressions.</p>
172 </div>
174 <!-- ======================================================================= -->
175 <div class="doc_subsubsection"><a name="ifparser">Parser Extensions for
176 If/Then/Else</a></div>
177 <!-- ======================================================================= -->
179 <div class="doc_text">
181 <p>Now that we have the relevant tokens coming from the lexer and we have the
182 AST node to build, our parsing logic is relatively straightforward. First we
183 define a new parsing function:</p>
185 <div class="doc_code">
186 <pre>
187 /// ifexpr ::= 'if' expression 'then' expression 'else' expression
188 static ExprAST *ParseIfExpr() {
189 getNextToken(); // eat the if.
191 // condition.
192 ExprAST *Cond = ParseExpression();
193 if (!Cond) return 0;
195 if (CurTok != tok_then)
196 return Error("expected then");
197 getNextToken(); // eat the then
199 ExprAST *Then = ParseExpression();
200 if (Then == 0) return 0;
202 if (CurTok != tok_else)
203 return Error("expected else");
205 getNextToken();
207 ExprAST *Else = ParseExpression();
208 if (!Else) return 0;
210 return new IfExprAST(Cond, Then, Else);
212 </pre>
213 </div>
215 <p>Next we hook it up as a primary expression:</p>
217 <div class="doc_code">
218 <pre>
219 static ExprAST *ParsePrimary() {
220 switch (CurTok) {
221 default: return Error("unknown token when expecting an expression");
222 case tok_identifier: return ParseIdentifierExpr();
223 case tok_number: return ParseNumberExpr();
224 case '(': return ParseParenExpr();
225 <b>case tok_if: return ParseIfExpr();</b>
228 </pre>
229 </div>
231 </div>
233 <!-- ======================================================================= -->
234 <div class="doc_subsubsection"><a name="ifir">LLVM IR for If/Then/Else</a></div>
235 <!-- ======================================================================= -->
237 <div class="doc_text">
239 <p>Now that we have it parsing and building the AST, the final piece is adding
240 LLVM code generation support. This is the most interesting part of the
241 if/then/else example, because this is where it starts to introduce new concepts.
242 All of the code above has been thoroughly described in previous chapters.
243 </p>
245 <p>To motivate the code we want to produce, lets take a look at a simple
246 example. Consider:</p>
248 <div class="doc_code">
249 <pre>
250 extern foo();
251 extern bar();
252 def baz(x) if x then foo() else bar();
253 </pre>
254 </div>
256 <p>If you disable optimizations, the code you'll (soon) get from Kaleidoscope
257 looks like this:</p>
259 <div class="doc_code">
260 <pre>
261 declare double @foo()
263 declare double @bar()
265 define double @baz(double %x) {
266 entry:
267 %ifcond = fcmp one double %x, 0.000000e+00
268 br i1 %ifcond, label %then, label %else
270 then: ; preds = %entry
271 %calltmp = call double @foo()
272 br label %ifcont
274 else: ; preds = %entry
275 %calltmp1 = call double @bar()
276 br label %ifcont
278 ifcont: ; preds = %else, %then
279 %iftmp = phi double [ %calltmp, %then ], [ %calltmp1, %else ]
280 ret double %iftmp
282 </pre>
283 </div>
285 <p>To visualize the control flow graph, you can use a nifty feature of the LLVM
286 '<a href="http://llvm.org/cmds/opt.html">opt</a>' tool. If you put this LLVM IR
287 into "t.ll" and run "<tt>llvm-as &lt; t.ll | opt -analyze -view-cfg</tt>", <a
288 href="../ProgrammersManual.html#ViewGraph">a window will pop up</a> and you'll
289 see this graph:</p>
291 <div style="text-align: center"><img src="LangImpl5-cfg.png" alt="Example CFG" width="423"
292 height="315"></div>
294 <p>Another way to get this is to call "<tt>F-&gt;viewCFG()</tt>" or
295 "<tt>F-&gt;viewCFGOnly()</tt>" (where F is a "<tt>Function*</tt>") either by
296 inserting actual calls into the code and recompiling or by calling these in the
297 debugger. LLVM has many nice features for visualizing various graphs.</p>
299 <p>Getting back to the generated code, it is fairly simple: the entry block
300 evaluates the conditional expression ("x" in our case here) and compares the
301 result to 0.0 with the "<tt><a href="../LangRef.html#i_fcmp">fcmp</a> one</tt>"
302 instruction ('one' is "Ordered and Not Equal"). Based on the result of this
303 expression, the code jumps to either the "then" or "else" blocks, which contain
304 the expressions for the true/false cases.</p>
306 <p>Once the then/else blocks are finished executing, they both branch back to the
307 'ifcont' block to execute the code that happens after the if/then/else. In this
308 case the only thing left to do is to return to the caller of the function. The
309 question then becomes: how does the code know which expression to return?</p>
311 <p>The answer to this question involves an important SSA operation: the
312 <a href="http://en.wikipedia.org/wiki/Static_single_assignment_form">Phi
313 operation</a>. If you're not familiar with SSA, <a
314 href="http://en.wikipedia.org/wiki/Static_single_assignment_form">the wikipedia
315 article</a> is a good introduction and there are various other introductions to
316 it available on your favorite search engine. The short version is that
317 "execution" of the Phi operation requires "remembering" which block control came
318 from. The Phi operation takes on the value corresponding to the input control
319 block. In this case, if control comes in from the "then" block, it gets the
320 value of "calltmp". If control comes from the "else" block, it gets the value
321 of "calltmp1".</p>
323 <p>At this point, you are probably starting to think "Oh no! This means my
324 simple and elegant front-end will have to start generating SSA form in order to
325 use LLVM!". Fortunately, this is not the case, and we strongly advise
326 <em>not</em> implementing an SSA construction algorithm in your front-end
327 unless there is an amazingly good reason to do so. In practice, there are two
328 sorts of values that float around in code written for your average imperative
329 programming language that might need Phi nodes:</p>
331 <ol>
332 <li>Code that involves user variables: <tt>x = 1; x = x + 1; </tt></li>
333 <li>Values that are implicit in the structure of your AST, such as the Phi node
334 in this case.</li>
335 </ol>
337 <p>In <a href="LangImpl7.html">Chapter 7</a> of this tutorial ("mutable
338 variables"), we'll talk about #1
339 in depth. For now, just believe me that you don't need SSA construction to
340 handle this case. For #2, you have the choice of using the techniques that we will
341 describe for #1, or you can insert Phi nodes directly, if convenient. In this
342 case, it is really really easy to generate the Phi node, so we choose to do it
343 directly.</p>
345 <p>Okay, enough of the motivation and overview, lets generate code!</p>
347 </div>
349 <!-- ======================================================================= -->
350 <div class="doc_subsubsection"><a name="ifcodegen">Code Generation for
351 If/Then/Else</a></div>
352 <!-- ======================================================================= -->
354 <div class="doc_text">
356 <p>In order to generate code for this, we implement the <tt>Codegen</tt> method
357 for <tt>IfExprAST</tt>:</p>
359 <div class="doc_code">
360 <pre>
361 Value *IfExprAST::Codegen() {
362 Value *CondV = Cond-&gt;Codegen();
363 if (CondV == 0) return 0;
365 // Convert condition to a bool by comparing equal to 0.0.
366 CondV = Builder.CreateFCmpONE(CondV,
367 ConstantFP::get(getGlobalContext(), APFloat(0.0)),
368 "ifcond");
369 </pre>
370 </div>
372 <p>This code is straightforward and similar to what we saw before. We emit the
373 expression for the condition, then compare that value to zero to get a truth
374 value as a 1-bit (bool) value.</p>
376 <div class="doc_code">
377 <pre>
378 Function *TheFunction = Builder.GetInsertBlock()-&gt;getParent();
380 // Create blocks for the then and else cases. Insert the 'then' block at the
381 // end of the function.
382 BasicBlock *ThenBB = BasicBlock::Create(getGlobalContext(), "then", TheFunction);
383 BasicBlock *ElseBB = BasicBlock::Create(getGlobalContext(), "else");
384 BasicBlock *MergeBB = BasicBlock::Create(getGlobalContext(), "ifcont");
386 Builder.CreateCondBr(CondV, ThenBB, ElseBB);
387 </pre>
388 </div>
390 <p>This code creates the basic blocks that are related to the if/then/else
391 statement, and correspond directly to the blocks in the example above. The
392 first line gets the current Function object that is being built. It
393 gets this by asking the builder for the current BasicBlock, and asking that
394 block for its "parent" (the function it is currently embedded into).</p>
396 <p>Once it has that, it creates three blocks. Note that it passes "TheFunction"
397 into the constructor for the "then" block. This causes the constructor to
398 automatically insert the new block into the end of the specified function. The
399 other two blocks are created, but aren't yet inserted into the function.</p>
401 <p>Once the blocks are created, we can emit the conditional branch that chooses
402 between them. Note that creating new blocks does not implicitly affect the
403 IRBuilder, so it is still inserting into the block that the condition
404 went into. Also note that it is creating a branch to the "then" block and the
405 "else" block, even though the "else" block isn't inserted into the function yet.
406 This is all ok: it is the standard way that LLVM supports forward
407 references.</p>
409 <div class="doc_code">
410 <pre>
411 // Emit then value.
412 Builder.SetInsertPoint(ThenBB);
414 Value *ThenV = Then-&gt;Codegen();
415 if (ThenV == 0) return 0;
417 Builder.CreateBr(MergeBB);
418 // Codegen of 'Then' can change the current block, update ThenBB for the PHI.
419 ThenBB = Builder.GetInsertBlock();
420 </pre>
421 </div>
423 <p>After the conditional branch is inserted, we move the builder to start
424 inserting into the "then" block. Strictly speaking, this call moves the
425 insertion point to be at the end of the specified block. However, since the
426 "then" block is empty, it also starts out by inserting at the beginning of the
427 block. :)</p>
429 <p>Once the insertion point is set, we recursively codegen the "then" expression
430 from the AST. To finish off the "then" block, we create an unconditional branch
431 to the merge block. One interesting (and very important) aspect of the LLVM IR
432 is that it <a href="../LangRef.html#functionstructure">requires all basic blocks
433 to be "terminated"</a> with a <a href="../LangRef.html#terminators">control flow
434 instruction</a> such as return or branch. This means that all control flow,
435 <em>including fall throughs</em> must be made explicit in the LLVM IR. If you
436 violate this rule, the verifier will emit an error.</p>
438 <p>The final line here is quite subtle, but is very important. The basic issue
439 is that when we create the Phi node in the merge block, we need to set up the
440 block/value pairs that indicate how the Phi will work. Importantly, the Phi
441 node expects to have an entry for each predecessor of the block in the CFG. Why
442 then, are we getting the current block when we just set it to ThenBB 5 lines
443 above? The problem is that the "Then" expression may actually itself change the
444 block that the Builder is emitting into if, for example, it contains a nested
445 "if/then/else" expression. Because calling Codegen recursively could
446 arbitrarily change the notion of the current block, we are required to get an
447 up-to-date value for code that will set up the Phi node.</p>
449 <div class="doc_code">
450 <pre>
451 // Emit else block.
452 TheFunction-&gt;getBasicBlockList().push_back(ElseBB);
453 Builder.SetInsertPoint(ElseBB);
455 Value *ElseV = Else-&gt;Codegen();
456 if (ElseV == 0) return 0;
458 Builder.CreateBr(MergeBB);
459 // Codegen of 'Else' can change the current block, update ElseBB for the PHI.
460 ElseBB = Builder.GetInsertBlock();
461 </pre>
462 </div>
464 <p>Code generation for the 'else' block is basically identical to codegen for
465 the 'then' block. The only significant difference is the first line, which adds
466 the 'else' block to the function. Recall previously that the 'else' block was
467 created, but not added to the function. Now that the 'then' and 'else' blocks
468 are emitted, we can finish up with the merge code:</p>
470 <div class="doc_code">
471 <pre>
472 // Emit merge block.
473 TheFunction->getBasicBlockList().push_back(MergeBB);
474 Builder.SetInsertPoint(MergeBB);
475 PHINode *PN = Builder.CreatePHI(Type::getDoubleTy(getGlobalContext()), "iftmp");
477 PN->addIncoming(ThenV, ThenBB);
478 PN->addIncoming(ElseV, ElseBB);
479 return PN;
481 </pre>
482 </div>
484 <p>The first two lines here are now familiar: the first adds the "merge" block
485 to the Function object (it was previously floating, like the else block above).
486 The second block changes the insertion point so that newly created code will go
487 into the "merge" block. Once that is done, we need to create the PHI node and
488 set up the block/value pairs for the PHI.</p>
490 <p>Finally, the CodeGen function returns the phi node as the value computed by
491 the if/then/else expression. In our example above, this returned value will
492 feed into the code for the top-level function, which will create the return
493 instruction.</p>
495 <p>Overall, we now have the ability to execute conditional code in
496 Kaleidoscope. With this extension, Kaleidoscope is a fairly complete language
497 that can calculate a wide variety of numeric functions. Next up we'll add
498 another useful expression that is familiar from non-functional languages...</p>
500 </div>
502 <!-- *********************************************************************** -->
503 <div class="doc_section"><a name="for">'for' Loop Expression</a></div>
504 <!-- *********************************************************************** -->
506 <div class="doc_text">
508 <p>Now that we know how to add basic control flow constructs to the language,
509 we have the tools to add more powerful things. Lets add something more
510 aggressive, a 'for' expression:</p>
512 <div class="doc_code">
513 <pre>
514 extern putchard(char)
515 def printstar(n)
516 for i = 1, i &lt; n, 1.0 in
517 putchard(42); # ascii 42 = '*'
519 # print 100 '*' characters
520 printstar(100);
521 </pre>
522 </div>
524 <p>This expression defines a new variable ("i" in this case) which iterates from
525 a starting value, while the condition ("i &lt; n" in this case) is true,
526 incrementing by an optional step value ("1.0" in this case). If the step value
527 is omitted, it defaults to 1.0. While the loop is true, it executes its
528 body expression. Because we don't have anything better to return, we'll just
529 define the loop as always returning 0.0. In the future when we have mutable
530 variables, it will get more useful.</p>
532 <p>As before, lets talk about the changes that we need to Kaleidoscope to
533 support this.</p>
535 </div>
537 <!-- ======================================================================= -->
538 <div class="doc_subsubsection"><a name="forlexer">Lexer Extensions for
539 the 'for' Loop</a></div>
540 <!-- ======================================================================= -->
542 <div class="doc_text">
544 <p>The lexer extensions are the same sort of thing as for if/then/else:</p>
546 <div class="doc_code">
547 <pre>
548 ... in enum Token ...
549 // control
550 tok_if = -6, tok_then = -7, tok_else = -8,
551 <b> tok_for = -9, tok_in = -10</b>
553 ... in gettok ...
554 if (IdentifierStr == "def") return tok_def;
555 if (IdentifierStr == "extern") return tok_extern;
556 if (IdentifierStr == "if") return tok_if;
557 if (IdentifierStr == "then") return tok_then;
558 if (IdentifierStr == "else") return tok_else;
559 <b>if (IdentifierStr == "for") return tok_for;
560 if (IdentifierStr == "in") return tok_in;</b>
561 return tok_identifier;
562 </pre>
563 </div>
565 </div>
567 <!-- ======================================================================= -->
568 <div class="doc_subsubsection"><a name="forast">AST Extensions for
569 the 'for' Loop</a></div>
570 <!-- ======================================================================= -->
572 <div class="doc_text">
574 <p>The AST node is just as simple. It basically boils down to capturing
575 the variable name and the constituent expressions in the node.</p>
577 <div class="doc_code">
578 <pre>
579 /// ForExprAST - Expression class for for/in.
580 class ForExprAST : public ExprAST {
581 std::string VarName;
582 ExprAST *Start, *End, *Step, *Body;
583 public:
584 ForExprAST(const std::string &amp;varname, ExprAST *start, ExprAST *end,
585 ExprAST *step, ExprAST *body)
586 : VarName(varname), Start(start), End(end), Step(step), Body(body) {}
587 virtual Value *Codegen();
589 </pre>
590 </div>
592 </div>
594 <!-- ======================================================================= -->
595 <div class="doc_subsubsection"><a name="forparser">Parser Extensions for
596 the 'for' Loop</a></div>
597 <!-- ======================================================================= -->
599 <div class="doc_text">
601 <p>The parser code is also fairly standard. The only interesting thing here is
602 handling of the optional step value. The parser code handles it by checking to
603 see if the second comma is present. If not, it sets the step value to null in
604 the AST node:</p>
606 <div class="doc_code">
607 <pre>
608 /// forexpr ::= 'for' identifier '=' expr ',' expr (',' expr)? 'in' expression
609 static ExprAST *ParseForExpr() {
610 getNextToken(); // eat the for.
612 if (CurTok != tok_identifier)
613 return Error("expected identifier after for");
615 std::string IdName = IdentifierStr;
616 getNextToken(); // eat identifier.
618 if (CurTok != '=')
619 return Error("expected '=' after for");
620 getNextToken(); // eat '='.
623 ExprAST *Start = ParseExpression();
624 if (Start == 0) return 0;
625 if (CurTok != ',')
626 return Error("expected ',' after for start value");
627 getNextToken();
629 ExprAST *End = ParseExpression();
630 if (End == 0) return 0;
632 // The step value is optional.
633 ExprAST *Step = 0;
634 if (CurTok == ',') {
635 getNextToken();
636 Step = ParseExpression();
637 if (Step == 0) return 0;
640 if (CurTok != tok_in)
641 return Error("expected 'in' after for");
642 getNextToken(); // eat 'in'.
644 ExprAST *Body = ParseExpression();
645 if (Body == 0) return 0;
647 return new ForExprAST(IdName, Start, End, Step, Body);
649 </pre>
650 </div>
652 </div>
654 <!-- ======================================================================= -->
655 <div class="doc_subsubsection"><a name="forir">LLVM IR for
656 the 'for' Loop</a></div>
657 <!-- ======================================================================= -->
659 <div class="doc_text">
661 <p>Now we get to the good part: the LLVM IR we want to generate for this thing.
662 With the simple example above, we get this LLVM IR (note that this dump is
663 generated with optimizations disabled for clarity):
664 </p>
666 <div class="doc_code">
667 <pre>
668 declare double @putchard(double)
670 define double @printstar(double %n) {
671 entry:
672 ; initial value = 1.0 (inlined into phi)
673 br label %loop
675 loop: ; preds = %loop, %entry
676 %i = phi double [ 1.000000e+00, %entry ], [ %nextvar, %loop ]
677 ; body
678 %calltmp = call double @putchard( double 4.200000e+01 )
679 ; increment
680 %nextvar = add double %i, 1.000000e+00
682 ; termination test
683 %cmptmp = fcmp ult double %i, %n
684 %booltmp = uitofp i1 %cmptmp to double
685 %loopcond = fcmp one double %booltmp, 0.000000e+00
686 br i1 %loopcond, label %loop, label %afterloop
688 afterloop: ; preds = %loop
689 ; loop always returns 0.0
690 ret double 0.000000e+00
692 </pre>
693 </div>
695 <p>This loop contains all the same constructs we saw before: a phi node, several
696 expressions, and some basic blocks. Lets see how this fits together.</p>
698 </div>
700 <!-- ======================================================================= -->
701 <div class="doc_subsubsection"><a name="forcodegen">Code Generation for
702 the 'for' Loop</a></div>
703 <!-- ======================================================================= -->
705 <div class="doc_text">
707 <p>The first part of Codegen is very simple: we just output the start expression
708 for the loop value:</p>
710 <div class="doc_code">
711 <pre>
712 Value *ForExprAST::Codegen() {
713 // Emit the start code first, without 'variable' in scope.
714 Value *StartVal = Start-&gt;Codegen();
715 if (StartVal == 0) return 0;
716 </pre>
717 </div>
719 <p>With this out of the way, the next step is to set up the LLVM basic block
720 for the start of the loop body. In the case above, the whole loop body is one
721 block, but remember that the body code itself could consist of multiple blocks
722 (e.g. if it contains an if/then/else or a for/in expression).</p>
724 <div class="doc_code">
725 <pre>
726 // Make the new basic block for the loop header, inserting after current
727 // block.
728 Function *TheFunction = Builder.GetInsertBlock()-&gt;getParent();
729 BasicBlock *PreheaderBB = Builder.GetInsertBlock();
730 BasicBlock *LoopBB = BasicBlock::Create(getGlobalContext(), "loop", TheFunction);
732 // Insert an explicit fall through from the current block to the LoopBB.
733 Builder.CreateBr(LoopBB);
734 </pre>
735 </div>
737 <p>This code is similar to what we saw for if/then/else. Because we will need
738 it to create the Phi node, we remember the block that falls through into the
739 loop. Once we have that, we create the actual block that starts the loop and
740 create an unconditional branch for the fall-through between the two blocks.</p>
742 <div class="doc_code">
743 <pre>
744 // Start insertion in LoopBB.
745 Builder.SetInsertPoint(LoopBB);
747 // Start the PHI node with an entry for Start.
748 PHINode *Variable = Builder.CreatePHI(Type::getDoubleTy(getGlobalContext()), VarName.c_str());
749 Variable-&gt;addIncoming(StartVal, PreheaderBB);
750 </pre>
751 </div>
753 <p>Now that the "preheader" for the loop is set up, we switch to emitting code
754 for the loop body. To begin with, we move the insertion point and create the
755 PHI node for the loop induction variable. Since we already know the incoming
756 value for the starting value, we add it to the Phi node. Note that the Phi will
757 eventually get a second value for the backedge, but we can't set it up yet
758 (because it doesn't exist!).</p>
760 <div class="doc_code">
761 <pre>
762 // Within the loop, the variable is defined equal to the PHI node. If it
763 // shadows an existing variable, we have to restore it, so save it now.
764 Value *OldVal = NamedValues[VarName];
765 NamedValues[VarName] = Variable;
767 // Emit the body of the loop. This, like any other expr, can change the
768 // current BB. Note that we ignore the value computed by the body, but don't
769 // allow an error.
770 if (Body-&gt;Codegen() == 0)
771 return 0;
772 </pre>
773 </div>
775 <p>Now the code starts to get more interesting. Our 'for' loop introduces a new
776 variable to the symbol table. This means that our symbol table can now contain
777 either function arguments or loop variables. To handle this, before we codegen
778 the body of the loop, we add the loop variable as the current value for its
779 name. Note that it is possible that there is a variable of the same name in the
780 outer scope. It would be easy to make this an error (emit an error and return
781 null if there is already an entry for VarName) but we choose to allow shadowing
782 of variables. In order to handle this correctly, we remember the Value that
783 we are potentially shadowing in <tt>OldVal</tt> (which will be null if there is
784 no shadowed variable).</p>
786 <p>Once the loop variable is set into the symbol table, the code recursively
787 codegen's the body. This allows the body to use the loop variable: any
788 references to it will naturally find it in the symbol table.</p>
790 <div class="doc_code">
791 <pre>
792 // Emit the step value.
793 Value *StepVal;
794 if (Step) {
795 StepVal = Step-&gt;Codegen();
796 if (StepVal == 0) return 0;
797 } else {
798 // If not specified, use 1.0.
799 StepVal = ConstantFP::get(getGlobalContext(), APFloat(1.0));
802 Value *NextVar = Builder.CreateAdd(Variable, StepVal, "nextvar");
803 </pre>
804 </div>
806 <p>Now that the body is emitted, we compute the next value of the iteration
807 variable by adding the step value, or 1.0 if it isn't present. '<tt>NextVar</tt>'
808 will be the value of the loop variable on the next iteration of the loop.</p>
810 <div class="doc_code">
811 <pre>
812 // Compute the end condition.
813 Value *EndCond = End-&gt;Codegen();
814 if (EndCond == 0) return EndCond;
816 // Convert condition to a bool by comparing equal to 0.0.
817 EndCond = Builder.CreateFCmpONE(EndCond,
818 ConstantFP::get(getGlobalContext(), APFloat(0.0)),
819 "loopcond");
820 </pre>
821 </div>
823 <p>Finally, we evaluate the exit value of the loop, to determine whether the
824 loop should exit. This mirrors the condition evaluation for the if/then/else
825 statement.</p>
827 <div class="doc_code">
828 <pre>
829 // Create the "after loop" block and insert it.
830 BasicBlock *LoopEndBB = Builder.GetInsertBlock();
831 BasicBlock *AfterBB = BasicBlock::Create(getGlobalContext(), "afterloop", TheFunction);
833 // Insert the conditional branch into the end of LoopEndBB.
834 Builder.CreateCondBr(EndCond, LoopBB, AfterBB);
836 // Any new code will be inserted in AfterBB.
837 Builder.SetInsertPoint(AfterBB);
838 </pre>
839 </div>
841 <p>With the code for the body of the loop complete, we just need to finish up
842 the control flow for it. This code remembers the end block (for the phi node), then creates the block for the loop exit ("afterloop"). Based on the value of the
843 exit condition, it creates a conditional branch that chooses between executing
844 the loop again and exiting the loop. Any future code is emitted in the
845 "afterloop" block, so it sets the insertion position to it.</p>
847 <div class="doc_code">
848 <pre>
849 // Add a new entry to the PHI node for the backedge.
850 Variable-&gt;addIncoming(NextVar, LoopEndBB);
852 // Restore the unshadowed variable.
853 if (OldVal)
854 NamedValues[VarName] = OldVal;
855 else
856 NamedValues.erase(VarName);
858 // for expr always returns 0.0.
859 return Constant::getNullValue(Type::getDoubleTy(getGlobalContext()));
861 </pre>
862 </div>
864 <p>The final code handles various cleanups: now that we have the "NextVar"
865 value, we can add the incoming value to the loop PHI node. After that, we
866 remove the loop variable from the symbol table, so that it isn't in scope after
867 the for loop. Finally, code generation of the for loop always returns 0.0, so
868 that is what we return from <tt>ForExprAST::Codegen</tt>.</p>
870 <p>With this, we conclude the "adding control flow to Kaleidoscope" chapter of
871 the tutorial. In this chapter we added two control flow constructs, and used them to motivate a couple of aspects of the LLVM IR that are important for front-end implementors
872 to know. In the next chapter of our saga, we will get a bit crazier and add
873 <a href="LangImpl6.html">user-defined operators</a> to our poor innocent
874 language.</p>
876 </div>
878 <!-- *********************************************************************** -->
879 <div class="doc_section"><a name="code">Full Code Listing</a></div>
880 <!-- *********************************************************************** -->
882 <div class="doc_text">
885 Here is the complete code listing for our running example, enhanced with the
886 if/then/else and for expressions.. To build this example, use:
887 </p>
889 <div class="doc_code">
890 <pre>
891 # Compile
892 g++ -g toy.cpp `llvm-config --cppflags --ldflags --libs core jit native` -O3 -o toy
893 # Run
894 ./toy
895 </pre>
896 </div>
898 <p>Here is the code:</p>
900 <div class="doc_code">
901 <pre>
902 #include "llvm/DerivedTypes.h"
903 #include "llvm/ExecutionEngine/ExecutionEngine.h"
904 #include "llvm/LLVMContext.h"
905 #include "llvm/Module.h"
906 #include "llvm/ModuleProvider.h"
907 #include "llvm/PassManager.h"
908 #include "llvm/Analysis/Verifier.h"
909 #include "llvm/Target/TargetData.h"
910 #include "llvm/Transforms/Scalar.h"
911 #include "llvm/Support/IRBuilder.h"
912 #include &lt;cstdio&gt;
913 #include &lt;string&gt;
914 #include &lt;map&gt;
915 #include &lt;vector&gt;
916 using namespace llvm;
918 //===----------------------------------------------------------------------===//
919 // Lexer
920 //===----------------------------------------------------------------------===//
922 // The lexer returns tokens [0-255] if it is an unknown character, otherwise one
923 // of these for known things.
924 enum Token {
925 tok_eof = -1,
927 // commands
928 tok_def = -2, tok_extern = -3,
930 // primary
931 tok_identifier = -4, tok_number = -5,
933 // control
934 tok_if = -6, tok_then = -7, tok_else = -8,
935 tok_for = -9, tok_in = -10
938 static std::string IdentifierStr; // Filled in if tok_identifier
939 static double NumVal; // Filled in if tok_number
941 /// gettok - Return the next token from standard input.
942 static int gettok() {
943 static int LastChar = ' ';
945 // Skip any whitespace.
946 while (isspace(LastChar))
947 LastChar = getchar();
949 if (isalpha(LastChar)) { // identifier: [a-zA-Z][a-zA-Z0-9]*
950 IdentifierStr = LastChar;
951 while (isalnum((LastChar = getchar())))
952 IdentifierStr += LastChar;
954 if (IdentifierStr == "def") return tok_def;
955 if (IdentifierStr == "extern") return tok_extern;
956 if (IdentifierStr == "if") return tok_if;
957 if (IdentifierStr == "then") return tok_then;
958 if (IdentifierStr == "else") return tok_else;
959 if (IdentifierStr == "for") return tok_for;
960 if (IdentifierStr == "in") return tok_in;
961 return tok_identifier;
964 if (isdigit(LastChar) || LastChar == '.') { // Number: [0-9.]+
965 std::string NumStr;
966 do {
967 NumStr += LastChar;
968 LastChar = getchar();
969 } while (isdigit(LastChar) || LastChar == '.');
971 NumVal = strtod(NumStr.c_str(), 0);
972 return tok_number;
975 if (LastChar == '#') {
976 // Comment until end of line.
977 do LastChar = getchar();
978 while (LastChar != EOF &amp;&amp; LastChar != '\n' &amp;&amp; LastChar != '\r');
980 if (LastChar != EOF)
981 return gettok();
984 // Check for end of file. Don't eat the EOF.
985 if (LastChar == EOF)
986 return tok_eof;
988 // Otherwise, just return the character as its ascii value.
989 int ThisChar = LastChar;
990 LastChar = getchar();
991 return ThisChar;
994 //===----------------------------------------------------------------------===//
995 // Abstract Syntax Tree (aka Parse Tree)
996 //===----------------------------------------------------------------------===//
998 /// ExprAST - Base class for all expression nodes.
999 class ExprAST {
1000 public:
1001 virtual ~ExprAST() {}
1002 virtual Value *Codegen() = 0;
1005 /// NumberExprAST - Expression class for numeric literals like "1.0".
1006 class NumberExprAST : public ExprAST {
1007 double Val;
1008 public:
1009 NumberExprAST(double val) : Val(val) {}
1010 virtual Value *Codegen();
1013 /// VariableExprAST - Expression class for referencing a variable, like "a".
1014 class VariableExprAST : public ExprAST {
1015 std::string Name;
1016 public:
1017 VariableExprAST(const std::string &amp;name) : Name(name) {}
1018 virtual Value *Codegen();
1021 /// BinaryExprAST - Expression class for a binary operator.
1022 class BinaryExprAST : public ExprAST {
1023 char Op;
1024 ExprAST *LHS, *RHS;
1025 public:
1026 BinaryExprAST(char op, ExprAST *lhs, ExprAST *rhs)
1027 : Op(op), LHS(lhs), RHS(rhs) {}
1028 virtual Value *Codegen();
1031 /// CallExprAST - Expression class for function calls.
1032 class CallExprAST : public ExprAST {
1033 std::string Callee;
1034 std::vector&lt;ExprAST*&gt; Args;
1035 public:
1036 CallExprAST(const std::string &amp;callee, std::vector&lt;ExprAST*&gt; &amp;args)
1037 : Callee(callee), Args(args) {}
1038 virtual Value *Codegen();
1041 /// IfExprAST - Expression class for if/then/else.
1042 class IfExprAST : public ExprAST {
1043 ExprAST *Cond, *Then, *Else;
1044 public:
1045 IfExprAST(ExprAST *cond, ExprAST *then, ExprAST *_else)
1046 : Cond(cond), Then(then), Else(_else) {}
1047 virtual Value *Codegen();
1050 /// ForExprAST - Expression class for for/in.
1051 class ForExprAST : public ExprAST {
1052 std::string VarName;
1053 ExprAST *Start, *End, *Step, *Body;
1054 public:
1055 ForExprAST(const std::string &amp;varname, ExprAST *start, ExprAST *end,
1056 ExprAST *step, ExprAST *body)
1057 : VarName(varname), Start(start), End(end), Step(step), Body(body) {}
1058 virtual Value *Codegen();
1061 /// PrototypeAST - This class represents the "prototype" for a function,
1062 /// which captures its argument names as well as if it is an operator.
1063 class PrototypeAST {
1064 std::string Name;
1065 std::vector&lt;std::string&gt; Args;
1066 public:
1067 PrototypeAST(const std::string &amp;name, const std::vector&lt;std::string&gt; &amp;args)
1068 : Name(name), Args(args) {}
1070 Function *Codegen();
1073 /// FunctionAST - This class represents a function definition itself.
1074 class FunctionAST {
1075 PrototypeAST *Proto;
1076 ExprAST *Body;
1077 public:
1078 FunctionAST(PrototypeAST *proto, ExprAST *body)
1079 : Proto(proto), Body(body) {}
1081 Function *Codegen();
1084 //===----------------------------------------------------------------------===//
1085 // Parser
1086 //===----------------------------------------------------------------------===//
1088 /// CurTok/getNextToken - Provide a simple token buffer. CurTok is the current
1089 /// token the parser it looking at. getNextToken reads another token from the
1090 /// lexer and updates CurTok with its results.
1091 static int CurTok;
1092 static int getNextToken() {
1093 return CurTok = gettok();
1096 /// BinopPrecedence - This holds the precedence for each binary operator that is
1097 /// defined.
1098 static std::map&lt;char, int&gt; BinopPrecedence;
1100 /// GetTokPrecedence - Get the precedence of the pending binary operator token.
1101 static int GetTokPrecedence() {
1102 if (!isascii(CurTok))
1103 return -1;
1105 // Make sure it's a declared binop.
1106 int TokPrec = BinopPrecedence[CurTok];
1107 if (TokPrec &lt;= 0) return -1;
1108 return TokPrec;
1111 /// Error* - These are little helper functions for error handling.
1112 ExprAST *Error(const char *Str) { fprintf(stderr, "Error: %s\n", Str);return 0;}
1113 PrototypeAST *ErrorP(const char *Str) { Error(Str); return 0; }
1114 FunctionAST *ErrorF(const char *Str) { Error(Str); return 0; }
1116 static ExprAST *ParseExpression();
1118 /// identifierexpr
1119 /// ::= identifier
1120 /// ::= identifier '(' expression* ')'
1121 static ExprAST *ParseIdentifierExpr() {
1122 std::string IdName = IdentifierStr;
1124 getNextToken(); // eat identifier.
1126 if (CurTok != '(') // Simple variable ref.
1127 return new VariableExprAST(IdName);
1129 // Call.
1130 getNextToken(); // eat (
1131 std::vector&lt;ExprAST*&gt; Args;
1132 if (CurTok != ')') {
1133 while (1) {
1134 ExprAST *Arg = ParseExpression();
1135 if (!Arg) return 0;
1136 Args.push_back(Arg);
1138 if (CurTok == ')') break;
1140 if (CurTok != ',')
1141 return Error("Expected ')' or ',' in argument list");
1142 getNextToken();
1146 // Eat the ')'.
1147 getNextToken();
1149 return new CallExprAST(IdName, Args);
1152 /// numberexpr ::= number
1153 static ExprAST *ParseNumberExpr() {
1154 ExprAST *Result = new NumberExprAST(NumVal);
1155 getNextToken(); // consume the number
1156 return Result;
1159 /// parenexpr ::= '(' expression ')'
1160 static ExprAST *ParseParenExpr() {
1161 getNextToken(); // eat (.
1162 ExprAST *V = ParseExpression();
1163 if (!V) return 0;
1165 if (CurTok != ')')
1166 return Error("expected ')'");
1167 getNextToken(); // eat ).
1168 return V;
1171 /// ifexpr ::= 'if' expression 'then' expression 'else' expression
1172 static ExprAST *ParseIfExpr() {
1173 getNextToken(); // eat the if.
1175 // condition.
1176 ExprAST *Cond = ParseExpression();
1177 if (!Cond) return 0;
1179 if (CurTok != tok_then)
1180 return Error("expected then");
1181 getNextToken(); // eat the then
1183 ExprAST *Then = ParseExpression();
1184 if (Then == 0) return 0;
1186 if (CurTok != tok_else)
1187 return Error("expected else");
1189 getNextToken();
1191 ExprAST *Else = ParseExpression();
1192 if (!Else) return 0;
1194 return new IfExprAST(Cond, Then, Else);
1197 /// forexpr ::= 'for' identifier '=' expr ',' expr (',' expr)? 'in' expression
1198 static ExprAST *ParseForExpr() {
1199 getNextToken(); // eat the for.
1201 if (CurTok != tok_identifier)
1202 return Error("expected identifier after for");
1204 std::string IdName = IdentifierStr;
1205 getNextToken(); // eat identifier.
1207 if (CurTok != '=')
1208 return Error("expected '=' after for");
1209 getNextToken(); // eat '='.
1212 ExprAST *Start = ParseExpression();
1213 if (Start == 0) return 0;
1214 if (CurTok != ',')
1215 return Error("expected ',' after for start value");
1216 getNextToken();
1218 ExprAST *End = ParseExpression();
1219 if (End == 0) return 0;
1221 // The step value is optional.
1222 ExprAST *Step = 0;
1223 if (CurTok == ',') {
1224 getNextToken();
1225 Step = ParseExpression();
1226 if (Step == 0) return 0;
1229 if (CurTok != tok_in)
1230 return Error("expected 'in' after for");
1231 getNextToken(); // eat 'in'.
1233 ExprAST *Body = ParseExpression();
1234 if (Body == 0) return 0;
1236 return new ForExprAST(IdName, Start, End, Step, Body);
1240 /// primary
1241 /// ::= identifierexpr
1242 /// ::= numberexpr
1243 /// ::= parenexpr
1244 /// ::= ifexpr
1245 /// ::= forexpr
1246 static ExprAST *ParsePrimary() {
1247 switch (CurTok) {
1248 default: return Error("unknown token when expecting an expression");
1249 case tok_identifier: return ParseIdentifierExpr();
1250 case tok_number: return ParseNumberExpr();
1251 case '(': return ParseParenExpr();
1252 case tok_if: return ParseIfExpr();
1253 case tok_for: return ParseForExpr();
1257 /// binoprhs
1258 /// ::= ('+' primary)*
1259 static ExprAST *ParseBinOpRHS(int ExprPrec, ExprAST *LHS) {
1260 // If this is a binop, find its precedence.
1261 while (1) {
1262 int TokPrec = GetTokPrecedence();
1264 // If this is a binop that binds at least as tightly as the current binop,
1265 // consume it, otherwise we are done.
1266 if (TokPrec &lt; ExprPrec)
1267 return LHS;
1269 // Okay, we know this is a binop.
1270 int BinOp = CurTok;
1271 getNextToken(); // eat binop
1273 // Parse the primary expression after the binary operator.
1274 ExprAST *RHS = ParsePrimary();
1275 if (!RHS) return 0;
1277 // If BinOp binds less tightly with RHS than the operator after RHS, let
1278 // the pending operator take RHS as its LHS.
1279 int NextPrec = GetTokPrecedence();
1280 if (TokPrec &lt; NextPrec) {
1281 RHS = ParseBinOpRHS(TokPrec+1, RHS);
1282 if (RHS == 0) return 0;
1285 // Merge LHS/RHS.
1286 LHS = new BinaryExprAST(BinOp, LHS, RHS);
1290 /// expression
1291 /// ::= primary binoprhs
1293 static ExprAST *ParseExpression() {
1294 ExprAST *LHS = ParsePrimary();
1295 if (!LHS) return 0;
1297 return ParseBinOpRHS(0, LHS);
1300 /// prototype
1301 /// ::= id '(' id* ')'
1302 static PrototypeAST *ParsePrototype() {
1303 if (CurTok != tok_identifier)
1304 return ErrorP("Expected function name in prototype");
1306 std::string FnName = IdentifierStr;
1307 getNextToken();
1309 if (CurTok != '(')
1310 return ErrorP("Expected '(' in prototype");
1312 std::vector&lt;std::string&gt; ArgNames;
1313 while (getNextToken() == tok_identifier)
1314 ArgNames.push_back(IdentifierStr);
1315 if (CurTok != ')')
1316 return ErrorP("Expected ')' in prototype");
1318 // success.
1319 getNextToken(); // eat ')'.
1321 return new PrototypeAST(FnName, ArgNames);
1324 /// definition ::= 'def' prototype expression
1325 static FunctionAST *ParseDefinition() {
1326 getNextToken(); // eat def.
1327 PrototypeAST *Proto = ParsePrototype();
1328 if (Proto == 0) return 0;
1330 if (ExprAST *E = ParseExpression())
1331 return new FunctionAST(Proto, E);
1332 return 0;
1335 /// toplevelexpr ::= expression
1336 static FunctionAST *ParseTopLevelExpr() {
1337 if (ExprAST *E = ParseExpression()) {
1338 // Make an anonymous proto.
1339 PrototypeAST *Proto = new PrototypeAST("", std::vector&lt;std::string&gt;());
1340 return new FunctionAST(Proto, E);
1342 return 0;
1345 /// external ::= 'extern' prototype
1346 static PrototypeAST *ParseExtern() {
1347 getNextToken(); // eat extern.
1348 return ParsePrototype();
1351 //===----------------------------------------------------------------------===//
1352 // Code Generation
1353 //===----------------------------------------------------------------------===//
1355 static Module *TheModule;
1356 static IRBuilder&lt;&gt; Builder(getGlobalContext());
1357 static std::map&lt;std::string, Value*&gt; NamedValues;
1358 static FunctionPassManager *TheFPM;
1360 Value *ErrorV(const char *Str) { Error(Str); return 0; }
1362 Value *NumberExprAST::Codegen() {
1363 return ConstantFP::get(getGlobalContext(), APFloat(Val));
1366 Value *VariableExprAST::Codegen() {
1367 // Look this variable up in the function.
1368 Value *V = NamedValues[Name];
1369 return V ? V : ErrorV("Unknown variable name");
1372 Value *BinaryExprAST::Codegen() {
1373 Value *L = LHS-&gt;Codegen();
1374 Value *R = RHS-&gt;Codegen();
1375 if (L == 0 || R == 0) return 0;
1377 switch (Op) {
1378 case '+': return Builder.CreateAdd(L, R, "addtmp");
1379 case '-': return Builder.CreateSub(L, R, "subtmp");
1380 case '*': return Builder.CreateMul(L, R, "multmp");
1381 case '&lt;':
1382 L = Builder.CreateFCmpULT(L, R, "cmptmp");
1383 // Convert bool 0/1 to double 0.0 or 1.0
1384 return Builder.CreateUIToFP(L, Type::getDoubleTy(getGlobalContext()), "booltmp");
1385 default: return ErrorV("invalid binary operator");
1389 Value *CallExprAST::Codegen() {
1390 // Look up the name in the global module table.
1391 Function *CalleeF = TheModule-&gt;getFunction(Callee);
1392 if (CalleeF == 0)
1393 return ErrorV("Unknown function referenced");
1395 // If argument mismatch error.
1396 if (CalleeF-&gt;arg_size() != Args.size())
1397 return ErrorV("Incorrect # arguments passed");
1399 std::vector&lt;Value*&gt; ArgsV;
1400 for (unsigned i = 0, e = Args.size(); i != e; ++i) {
1401 ArgsV.push_back(Args[i]-&gt;Codegen());
1402 if (ArgsV.back() == 0) return 0;
1405 return Builder.CreateCall(CalleeF, ArgsV.begin(), ArgsV.end(), "calltmp");
1408 Value *IfExprAST::Codegen() {
1409 Value *CondV = Cond-&gt;Codegen();
1410 if (CondV == 0) return 0;
1412 // Convert condition to a bool by comparing equal to 0.0.
1413 CondV = Builder.CreateFCmpONE(CondV,
1414 ConstantFP::get(getGlobalContext(), APFloat(0.0)),
1415 "ifcond");
1417 Function *TheFunction = Builder.GetInsertBlock()-&gt;getParent();
1419 // Create blocks for the then and else cases. Insert the 'then' block at the
1420 // end of the function.
1421 BasicBlock *ThenBB = BasicBlock::Create(getGlobalContext(), "then", TheFunction);
1422 BasicBlock *ElseBB = BasicBlock::Create(getGlobalContext(), "else");
1423 BasicBlock *MergeBB = BasicBlock::Create(getGlobalContext(), "ifcont");
1425 Builder.CreateCondBr(CondV, ThenBB, ElseBB);
1427 // Emit then value.
1428 Builder.SetInsertPoint(ThenBB);
1430 Value *ThenV = Then-&gt;Codegen();
1431 if (ThenV == 0) return 0;
1433 Builder.CreateBr(MergeBB);
1434 // Codegen of 'Then' can change the current block, update ThenBB for the PHI.
1435 ThenBB = Builder.GetInsertBlock();
1437 // Emit else block.
1438 TheFunction-&gt;getBasicBlockList().push_back(ElseBB);
1439 Builder.SetInsertPoint(ElseBB);
1441 Value *ElseV = Else-&gt;Codegen();
1442 if (ElseV == 0) return 0;
1444 Builder.CreateBr(MergeBB);
1445 // Codegen of 'Else' can change the current block, update ElseBB for the PHI.
1446 ElseBB = Builder.GetInsertBlock();
1448 // Emit merge block.
1449 TheFunction-&gt;getBasicBlockList().push_back(MergeBB);
1450 Builder.SetInsertPoint(MergeBB);
1451 PHINode *PN = Builder.CreatePHI(Type::getDoubleTy(getGlobalContext()), "iftmp");
1453 PN-&gt;addIncoming(ThenV, ThenBB);
1454 PN-&gt;addIncoming(ElseV, ElseBB);
1455 return PN;
1458 Value *ForExprAST::Codegen() {
1459 // Output this as:
1460 // ...
1461 // start = startexpr
1462 // goto loop
1463 // loop:
1464 // variable = phi [start, loopheader], [nextvariable, loopend]
1465 // ...
1466 // bodyexpr
1467 // ...
1468 // loopend:
1469 // step = stepexpr
1470 // nextvariable = variable + step
1471 // endcond = endexpr
1472 // br endcond, loop, endloop
1473 // outloop:
1475 // Emit the start code first, without 'variable' in scope.
1476 Value *StartVal = Start-&gt;Codegen();
1477 if (StartVal == 0) return 0;
1479 // Make the new basic block for the loop header, inserting after current
1480 // block.
1481 Function *TheFunction = Builder.GetInsertBlock()-&gt;getParent();
1482 BasicBlock *PreheaderBB = Builder.GetInsertBlock();
1483 BasicBlock *LoopBB = BasicBlock::Create(getGlobalContext(), "loop", TheFunction);
1485 // Insert an explicit fall through from the current block to the LoopBB.
1486 Builder.CreateBr(LoopBB);
1488 // Start insertion in LoopBB.
1489 Builder.SetInsertPoint(LoopBB);
1491 // Start the PHI node with an entry for Start.
1492 PHINode *Variable = Builder.CreatePHI(Type::getDoubleTy(getGlobalContext()), VarName.c_str());
1493 Variable-&gt;addIncoming(StartVal, PreheaderBB);
1495 // Within the loop, the variable is defined equal to the PHI node. If it
1496 // shadows an existing variable, we have to restore it, so save it now.
1497 Value *OldVal = NamedValues[VarName];
1498 NamedValues[VarName] = Variable;
1500 // Emit the body of the loop. This, like any other expr, can change the
1501 // current BB. Note that we ignore the value computed by the body, but don't
1502 // allow an error.
1503 if (Body-&gt;Codegen() == 0)
1504 return 0;
1506 // Emit the step value.
1507 Value *StepVal;
1508 if (Step) {
1509 StepVal = Step-&gt;Codegen();
1510 if (StepVal == 0) return 0;
1511 } else {
1512 // If not specified, use 1.0.
1513 StepVal = ConstantFP::get(getGlobalContext(), APFloat(1.0));
1516 Value *NextVar = Builder.CreateAdd(Variable, StepVal, "nextvar");
1518 // Compute the end condition.
1519 Value *EndCond = End-&gt;Codegen();
1520 if (EndCond == 0) return EndCond;
1522 // Convert condition to a bool by comparing equal to 0.0.
1523 EndCond = Builder.CreateFCmpONE(EndCond,
1524 ConstantFP::get(getGlobalContext(), APFloat(0.0)),
1525 "loopcond");
1527 // Create the "after loop" block and insert it.
1528 BasicBlock *LoopEndBB = Builder.GetInsertBlock();
1529 BasicBlock *AfterBB = BasicBlock::Create(getGlobalContext(), "afterloop", TheFunction);
1531 // Insert the conditional branch into the end of LoopEndBB.
1532 Builder.CreateCondBr(EndCond, LoopBB, AfterBB);
1534 // Any new code will be inserted in AfterBB.
1535 Builder.SetInsertPoint(AfterBB);
1537 // Add a new entry to the PHI node for the backedge.
1538 Variable-&gt;addIncoming(NextVar, LoopEndBB);
1540 // Restore the unshadowed variable.
1541 if (OldVal)
1542 NamedValues[VarName] = OldVal;
1543 else
1544 NamedValues.erase(VarName);
1547 // for expr always returns 0.0.
1548 return getGlobalContext().getNullValue(Type::getDoubleTy(getGlobalContext()));
1551 Function *PrototypeAST::Codegen() {
1552 // Make the function type: double(double,double) etc.
1553 std::vector&lt;const Type*&gt; Doubles(Args.size(), Type::getDoubleTy(getGlobalContext()));
1554 FunctionType *FT = FunctionType::get(Type::getDoubleTy(getGlobalContext()), Doubles, false);
1556 Function *F = Function::Create(FT, Function::ExternalLinkage, Name, TheModule);
1558 // If F conflicted, there was already something named 'Name'. If it has a
1559 // body, don't allow redefinition or reextern.
1560 if (F-&gt;getName() != Name) {
1561 // Delete the one we just made and get the existing one.
1562 F-&gt;eraseFromParent();
1563 F = TheModule-&gt;getFunction(Name);
1565 // If F already has a body, reject this.
1566 if (!F-&gt;empty()) {
1567 ErrorF("redefinition of function");
1568 return 0;
1571 // If F took a different number of args, reject.
1572 if (F-&gt;arg_size() != Args.size()) {
1573 ErrorF("redefinition of function with different # args");
1574 return 0;
1578 // Set names for all arguments.
1579 unsigned Idx = 0;
1580 for (Function::arg_iterator AI = F-&gt;arg_begin(); Idx != Args.size();
1581 ++AI, ++Idx) {
1582 AI-&gt;setName(Args[Idx]);
1584 // Add arguments to variable symbol table.
1585 NamedValues[Args[Idx]] = AI;
1588 return F;
1591 Function *FunctionAST::Codegen() {
1592 NamedValues.clear();
1594 Function *TheFunction = Proto-&gt;Codegen();
1595 if (TheFunction == 0)
1596 return 0;
1598 // Create a new basic block to start insertion into.
1599 BasicBlock *BB = BasicBlock::Create(getGlobalContext(), "entry", TheFunction);
1600 Builder.SetInsertPoint(BB);
1602 if (Value *RetVal = Body-&gt;Codegen()) {
1603 // Finish off the function.
1604 Builder.CreateRet(RetVal);
1606 // Validate the generated code, checking for consistency.
1607 verifyFunction(*TheFunction);
1609 // Optimize the function.
1610 TheFPM-&gt;run(*TheFunction);
1612 return TheFunction;
1615 // Error reading body, remove function.
1616 TheFunction-&gt;eraseFromParent();
1617 return 0;
1620 //===----------------------------------------------------------------------===//
1621 // Top-Level parsing and JIT Driver
1622 //===----------------------------------------------------------------------===//
1624 static ExecutionEngine *TheExecutionEngine;
1626 static void HandleDefinition() {
1627 if (FunctionAST *F = ParseDefinition()) {
1628 if (Function *LF = F-&gt;Codegen()) {
1629 fprintf(stderr, "Read function definition:");
1630 LF-&gt;dump();
1632 } else {
1633 // Skip token for error recovery.
1634 getNextToken();
1638 static void HandleExtern() {
1639 if (PrototypeAST *P = ParseExtern()) {
1640 if (Function *F = P-&gt;Codegen()) {
1641 fprintf(stderr, "Read extern: ");
1642 F-&gt;dump();
1644 } else {
1645 // Skip token for error recovery.
1646 getNextToken();
1650 static void HandleTopLevelExpression() {
1651 // Evaluate a top level expression into an anonymous function.
1652 if (FunctionAST *F = ParseTopLevelExpr()) {
1653 if (Function *LF = F-&gt;Codegen()) {
1654 // JIT the function, returning a function pointer.
1655 void *FPtr = TheExecutionEngine-&gt;getPointerToFunction(LF);
1657 // Cast it to the right type (takes no arguments, returns a double) so we
1658 // can call it as a native function.
1659 double (*FP)() = (double (*)())FPtr;
1660 fprintf(stderr, "Evaluated to %f\n", FP());
1662 } else {
1663 // Skip token for error recovery.
1664 getNextToken();
1668 /// top ::= definition | external | expression | ';'
1669 static void MainLoop() {
1670 while (1) {
1671 fprintf(stderr, "ready&gt; ");
1672 switch (CurTok) {
1673 case tok_eof: return;
1674 case ';': getNextToken(); break; // ignore top level semicolons.
1675 case tok_def: HandleDefinition(); break;
1676 case tok_extern: HandleExtern(); break;
1677 default: HandleTopLevelExpression(); break;
1684 //===----------------------------------------------------------------------===//
1685 // "Library" functions that can be "extern'd" from user code.
1686 //===----------------------------------------------------------------------===//
1688 /// putchard - putchar that takes a double and returns 0.
1689 extern "C"
1690 double putchard(double X) {
1691 putchar((char)X);
1692 return 0;
1695 //===----------------------------------------------------------------------===//
1696 // Main driver code.
1697 //===----------------------------------------------------------------------===//
1699 int main() {
1700 // Install standard binary operators.
1701 // 1 is lowest precedence.
1702 BinopPrecedence['&lt;'] = 10;
1703 BinopPrecedence['+'] = 20;
1704 BinopPrecedence['-'] = 20;
1705 BinopPrecedence['*'] = 40; // highest.
1707 // Prime the first token.
1708 fprintf(stderr, "ready&gt; ");
1709 getNextToken();
1711 // Make the module, which holds all the code.
1712 TheModule = new Module("my cool jit", getGlobalContext());
1714 ExistingModuleProvider *OurModuleProvider =
1715 new ExistingModuleProvider(TheModule);
1717 // Create the JIT. This takes ownership of the module and module provider.
1718 TheExecutionEngine = EngineBuilder(OurModuleProvider).create();
1720 FunctionPassManager OurFPM(OurModuleProvider);
1722 // Set up the optimizer pipeline. Start with registering info about how the
1723 // target lays out data structures.
1724 OurFPM.add(new TargetData(*TheExecutionEngine-&gt;getTargetData()));
1725 // Do simple "peephole" optimizations and bit-twiddling optzns.
1726 OurFPM.add(createInstructionCombiningPass());
1727 // Reassociate expressions.
1728 OurFPM.add(createReassociatePass());
1729 // Eliminate Common SubExpressions.
1730 OurFPM.add(createGVNPass());
1731 // Simplify the control flow graph (deleting unreachable blocks, etc).
1732 OurFPM.add(createCFGSimplificationPass());
1734 // Set the global so the code gen can use this.
1735 TheFPM = &amp;OurFPM;
1737 // Run the main "interpreter loop" now.
1738 MainLoop();
1740 TheFPM = 0;
1742 // Print out all of the generated code.
1743 TheModule-&gt;dump();
1745 return 0;
1747 </pre>
1748 </div>
1750 <a href="LangImpl6.html">Next: Extending the language: user-defined operators</a>
1751 </div>
1753 <!-- *********************************************************************** -->
1754 <hr>
1755 <address>
1756 <a href="http://jigsaw.w3.org/css-validator/check/referer"><img
1757 src="http://jigsaw.w3.org/css-validator/images/vcss" alt="Valid CSS!"></a>
1758 <a href="http://validator.w3.org/check/referer"><img
1759 src="http://www.w3.org/Icons/valid-html401" alt="Valid HTML 4.01!"></a>
1761 <a href="mailto:sabre@nondot.org">Chris Lattner</a><br>
1762 <a href="http://llvm.org">The LLVM Compiler Infrastructure</a><br>
1763 Last modified: $Date: 2007-10-17 11:05:13 -0700 (Wed, 17 Oct 2007) $
1764 </address>
1765 </body>
1766 </html>