Merge branch 'master' into msp430
[llvm/msp430.git] / docs / tutorial / OCamlLangImpl6.html
blob780cab819142a70c29619fae6bb500dd9f8af810
1 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
2 "http://www.w3.org/TR/html4/strict.dtd">
4 <html>
5 <head>
6 <title>Kaleidoscope: Extending the Language: User-defined Operators</title>
7 <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
8 <meta name="author" content="Chris Lattner">
9 <meta name="author" content="Erick Tryzelaar">
10 <link rel="stylesheet" href="../llvm.css" type="text/css">
11 </head>
13 <body>
15 <div class="doc_title">Kaleidoscope: Extending the Language: User-defined Operators</div>
17 <ul>
18 <li><a href="index.html">Up to Tutorial Index</a></li>
19 <li>Chapter 6
20 <ol>
21 <li><a href="#intro">Chapter 6 Introduction</a></li>
22 <li><a href="#idea">User-defined Operators: the Idea</a></li>
23 <li><a href="#binary">User-defined Binary Operators</a></li>
24 <li><a href="#unary">User-defined Unary Operators</a></li>
25 <li><a href="#example">Kicking the Tires</a></li>
26 <li><a href="#code">Full Code Listing</a></li>
27 </ol>
28 </li>
29 <li><a href="OCamlLangImpl7.html">Chapter 7</a>: Extending the Language: Mutable
30 Variables / SSA Construction</li>
31 </ul>
33 <div class="doc_author">
34 <p>
35 Written by <a href="mailto:sabre@nondot.org">Chris Lattner</a>
36 and <a href="mailto:idadesub@users.sourceforge.net">Erick Tryzelaar</a>
37 </p>
38 </div>
40 <!-- *********************************************************************** -->
41 <div class="doc_section"><a name="intro">Chapter 6 Introduction</a></div>
42 <!-- *********************************************************************** -->
44 <div class="doc_text">
46 <p>Welcome to Chapter 6 of the "<a href="index.html">Implementing a language
47 with LLVM</a>" tutorial. At this point in our tutorial, we now have a fully
48 functional language that is fairly minimal, but also useful. There
49 is still one big problem with it, however. Our language doesn't have many
50 useful operators (like division, logical negation, or even any comparisons
51 besides less-than).</p>
53 <p>This chapter of the tutorial takes a wild digression into adding user-defined
54 operators to the simple and beautiful Kaleidoscope language. This digression now
55 gives us a simple and ugly language in some ways, but also a powerful one at the
56 same time. One of the great things about creating your own language is that you
57 get to decide what is good or bad. In this tutorial we'll assume that it is
58 okay to use this as a way to show some interesting parsing techniques.</p>
60 <p>At the end of this tutorial, we'll run through an example Kaleidoscope
61 application that <a href="#example">renders the Mandelbrot set</a>. This gives
62 an example of what you can build with Kaleidoscope and its feature set.</p>
64 </div>
66 <!-- *********************************************************************** -->
67 <div class="doc_section"><a name="idea">User-defined Operators: the Idea</a></div>
68 <!-- *********************************************************************** -->
70 <div class="doc_text">
72 <p>
73 The "operator overloading" that we will add to Kaleidoscope is more general than
74 languages like C++. In C++, you are only allowed to redefine existing
75 operators: you can't programatically change the grammar, introduce new
76 operators, change precedence levels, etc. In this chapter, we will add this
77 capability to Kaleidoscope, which will let the user round out the set of
78 operators that are supported.</p>
80 <p>The point of going into user-defined operators in a tutorial like this is to
81 show the power and flexibility of using a hand-written parser. Thus far, the parser
82 we have been implementing uses recursive descent for most parts of the grammar and
83 operator precedence parsing for the expressions. See <a
84 href="OCamlLangImpl2.html">Chapter 2</a> for details. Without using operator
85 precedence parsing, it would be very difficult to allow the programmer to
86 introduce new operators into the grammar: the grammar is dynamically extensible
87 as the JIT runs.</p>
89 <p>The two specific features we'll add are programmable unary operators (right
90 now, Kaleidoscope has no unary operators at all) as well as binary operators.
91 An example of this is:</p>
93 <div class="doc_code">
94 <pre>
95 # Logical unary not.
96 def unary!(v)
97 if v then
99 else
102 # Define &gt; with the same precedence as &lt;.
103 def binary&gt; 10 (LHS RHS)
104 RHS &lt; LHS;
106 # Binary "logical or", (note that it does not "short circuit")
107 def binary| 5 (LHS RHS)
108 if LHS then
110 else if RHS then
112 else
115 # Define = with slightly lower precedence than relationals.
116 def binary= 9 (LHS RHS)
117 !(LHS &lt; RHS | LHS &gt; RHS);
118 </pre>
119 </div>
121 <p>Many languages aspire to being able to implement their standard runtime
122 library in the language itself. In Kaleidoscope, we can implement significant
123 parts of the language in the library!</p>
125 <p>We will break down implementation of these features into two parts:
126 implementing support for user-defined binary operators and adding unary
127 operators.</p>
129 </div>
131 <!-- *********************************************************************** -->
132 <div class="doc_section"><a name="binary">User-defined Binary Operators</a></div>
133 <!-- *********************************************************************** -->
135 <div class="doc_text">
137 <p>Adding support for user-defined binary operators is pretty simple with our
138 current framework. We'll first add support for the unary/binary keywords:</p>
140 <div class="doc_code">
141 <pre>
142 type token =
144 <b>(* operators *)
145 | Binary | Unary</b>
149 and lex_ident buffer = parser
151 | "for" -&gt; [&lt; 'Token.For; stream &gt;]
152 | "in" -&gt; [&lt; 'Token.In; stream &gt;]
153 <b>| "binary" -&gt; [&lt; 'Token.Binary; stream &gt;]
154 | "unary" -&gt; [&lt; 'Token.Unary; stream &gt;]</b>
155 </pre>
156 </div>
158 <p>This just adds lexer support for the unary and binary keywords, like we
159 did in <a href="OCamlLangImpl5.html#iflexer">previous chapters</a>. One nice
160 thing about our current AST, is that we represent binary operators with full
161 generalisation by using their ASCII code as the opcode. For our extended
162 operators, we'll use this same representation, so we don't need any new AST or
163 parser support.</p>
165 <p>On the other hand, we have to be able to represent the definitions of these
166 new operators, in the "def binary| 5" part of the function definition. In our
167 grammar so far, the "name" for the function definition is parsed as the
168 "prototype" production and into the <tt>Ast.Prototype</tt> AST node. To
169 represent our new user-defined operators as prototypes, we have to extend
170 the <tt>Ast.Prototype</tt> AST node like this:</p>
172 <div class="doc_code">
173 <pre>
174 (* proto - This type represents the "prototype" for a function, which captures
175 * its name, and its argument names (thus implicitly the number of arguments the
176 * function takes). *)
177 type proto =
178 | Prototype of string * string array
179 <b>| BinOpPrototype of string * string array * int</b>
180 </pre>
181 </div>
183 <p>Basically, in addition to knowing a name for the prototype, we now keep track
184 of whether it was an operator, and if it was, what precedence level the operator
185 is at. The precedence is only used for binary operators (as you'll see below,
186 it just doesn't apply for unary operators). Now that we have a way to represent
187 the prototype for a user-defined operator, we need to parse it:</p>
189 <div class="doc_code">
190 <pre>
191 (* prototype
192 * ::= id '(' id* ')'
193 <b>* ::= binary LETTER number? (id, id)
194 * ::= unary LETTER number? (id) *)</b>
195 let parse_prototype =
196 let rec parse_args accumulator = parser
197 | [&lt; 'Token.Ident id; e=parse_args (id::accumulator) &gt;] -&gt; e
198 | [&lt; &gt;] -&gt; accumulator
200 let parse_operator = parser
201 | [&lt; 'Token.Unary &gt;] -&gt; "unary", 1
202 | [&lt; 'Token.Binary &gt;] -&gt; "binary", 2
204 let parse_binary_precedence = parser
205 | [&lt; 'Token.Number n &gt;] -&gt; int_of_float n
206 | [&lt; &gt;] -&gt; 30
208 parser
209 | [&lt; 'Token.Ident id;
210 'Token.Kwd '(' ?? "expected '(' in prototype";
211 args=parse_args [];
212 'Token.Kwd ')' ?? "expected ')' in prototype" &gt;] -&gt;
213 (* success. *)
214 Ast.Prototype (id, Array.of_list (List.rev args))
215 <b>| [&lt; (prefix, kind)=parse_operator;
216 'Token.Kwd op ?? "expected an operator";
217 (* Read the precedence if present. *)
218 binary_precedence=parse_binary_precedence;
219 'Token.Kwd '(' ?? "expected '(' in prototype";
220 args=parse_args [];
221 'Token.Kwd ')' ?? "expected ')' in prototype" &gt;] -&gt;
222 let name = prefix ^ (String.make 1 op) in
223 let args = Array.of_list (List.rev args) in
225 (* Verify right number of arguments for operator. *)
226 if Array.length args != kind
227 then raise (Stream.Error "invalid number of operands for operator")
228 else
229 if kind == 1 then
230 Ast.Prototype (name, args)
231 else
232 Ast.BinOpPrototype (name, args, binary_precedence)</b>
233 | [&lt; &gt;] -&gt;
234 raise (Stream.Error "expected function name in prototype")
235 </pre>
236 </div>
238 <p>This is all fairly straightforward parsing code, and we have already seen
239 a lot of similar code in the past. One interesting part about the code above is
240 the couple lines that set up <tt>name</tt> for binary operators. This builds
241 names like "binary@" for a newly defined "@" operator. This then takes
242 advantage of the fact that symbol names in the LLVM symbol table are allowed to
243 have any character in them, including embedded nul characters.</p>
245 <p>The next interesting thing to add, is codegen support for these binary
246 operators. Given our current structure, this is a simple addition of a default
247 case for our existing binary operator node:</p>
249 <div class="doc_code">
250 <pre>
251 let codegen_expr = function
253 | Ast.Binary (op, lhs, rhs) -&gt;
254 let lhs_val = codegen_expr lhs in
255 let rhs_val = codegen_expr rhs in
256 begin
257 match op with
258 | '+' -&gt; build_add lhs_val rhs_val "addtmp" builder
259 | '-' -&gt; build_sub lhs_val rhs_val "subtmp" builder
260 | '*' -&gt; build_mul lhs_val rhs_val "multmp" builder
261 | '&lt;' -&gt;
262 (* Convert bool 0/1 to double 0.0 or 1.0 *)
263 let i = build_fcmp Fcmp.Ult lhs_val rhs_val "cmptmp" builder in
264 build_uitofp i double_type "booltmp" builder
265 <b>| _ -&gt;
266 (* If it wasn't a builtin binary operator, it must be a user defined
267 * one. Emit a call to it. *)
268 let callee = "binary" ^ (String.make 1 op) in
269 let callee =
270 match lookup_function callee the_module with
271 | Some callee -&gt; callee
272 | None -&gt; raise (Error "binary operator not found!")
274 build_call callee [|lhs_val; rhs_val|] "binop" builder</b>
276 </pre>
277 </div>
279 <p>As you can see above, the new code is actually really simple. It just does
280 a lookup for the appropriate operator in the symbol table and generates a
281 function call to it. Since user-defined operators are just built as normal
282 functions (because the "prototype" boils down to a function with the right
283 name) everything falls into place.</p>
285 <p>The final piece of code we are missing, is a bit of top level magic:</p>
287 <div class="doc_code">
288 <pre>
289 let codegen_func the_fpm = function
290 | Ast.Function (proto, body) -&gt;
291 Hashtbl.clear named_values;
292 let the_function = codegen_proto proto in
294 <b>(* If this is an operator, install it. *)
295 begin match proto with
296 | Ast.BinOpPrototype (name, args, prec) -&gt;
297 let op = name.[String.length name - 1] in
298 Hashtbl.add Parser.binop_precedence op prec;
299 | _ -&gt; ()
300 end;</b>
302 (* Create a new basic block to start insertion into. *)
303 let bb = append_block "entry" the_function in
304 position_at_end bb builder;
306 </pre>
307 </div>
309 <p>Basically, before codegening a function, if it is a user-defined operator, we
310 register it in the precedence table. This allows the binary operator parsing
311 logic we already have in place to handle it. Since we are working on a
312 fully-general operator precedence parser, this is all we need to do to "extend
313 the grammar".</p>
315 <p>Now we have useful user-defined binary operators. This builds a lot
316 on the previous framework we built for other operators. Adding unary operators
317 is a bit more challenging, because we don't have any framework for it yet - lets
318 see what it takes.</p>
320 </div>
322 <!-- *********************************************************************** -->
323 <div class="doc_section"><a name="unary">User-defined Unary Operators</a></div>
324 <!-- *********************************************************************** -->
326 <div class="doc_text">
328 <p>Since we don't currently support unary operators in the Kaleidoscope
329 language, we'll need to add everything to support them. Above, we added simple
330 support for the 'unary' keyword to the lexer. In addition to that, we need an
331 AST node:</p>
333 <div class="doc_code">
334 <pre>
335 type expr =
337 (* variant for a unary operator. *)
338 | Unary of char * expr
340 </pre>
341 </div>
343 <p>This AST node is very simple and obvious by now. It directly mirrors the
344 binary operator AST node, except that it only has one child. With this, we
345 need to add the parsing logic. Parsing a unary operator is pretty simple: we'll
346 add a new function to do it:</p>
348 <div class="doc_code">
349 <pre>
350 (* unary
351 * ::= primary
352 * ::= '!' unary *)
353 and parse_unary = parser
354 (* If this is a unary operator, read it. *)
355 | [&lt; 'Token.Kwd op when op != '(' &amp;&amp; op != ')'; operand=parse_expr &gt;] -&gt;
356 Ast.Unary (op, operand)
358 (* If the current token is not an operator, it must be a primary expr. *)
359 | [&lt; stream &gt;] -&gt; parse_primary stream
360 </pre>
361 </div>
363 <p>The grammar we add is pretty straightforward here. If we see a unary
364 operator when parsing a primary operator, we eat the operator as a prefix and
365 parse the remaining piece as another unary operator. This allows us to handle
366 multiple unary operators (e.g. "!!x"). Note that unary operators can't have
367 ambiguous parses like binary operators can, so there is no need for precedence
368 information.</p>
370 <p>The problem with this function, is that we need to call ParseUnary from
371 somewhere. To do this, we change previous callers of ParsePrimary to call
372 <tt>parse_unary</tt> instead:</p>
374 <div class="doc_code">
375 <pre>
376 (* binoprhs
377 * ::= ('+' primary)* *)
378 and parse_bin_rhs expr_prec lhs stream =
380 <b>(* Parse the unary expression after the binary operator. *)
381 let rhs = parse_unary stream in</b>
386 (* expression
387 * ::= primary binoprhs *)
388 and parse_expr = parser
389 | [&lt; lhs=<b>parse_unary</b>; stream &gt;] -&gt; parse_bin_rhs 0 lhs stream
390 </pre>
391 </div>
393 <p>With these two simple changes, we are now able to parse unary operators and build the
394 AST for them. Next up, we need to add parser support for prototypes, to parse
395 the unary operator prototype. We extend the binary operator code above
396 with:</p>
398 <div class="doc_code">
399 <pre>
400 (* prototype
401 * ::= id '(' id* ')'
402 * ::= binary LETTER number? (id, id)
403 <b>* ::= unary LETTER number? (id)</b> *)
404 let parse_prototype =
405 let rec parse_args accumulator = parser
406 | [&lt; 'Token.Ident id; e=parse_args (id::accumulator) &gt;] -&gt; e
407 | [&lt; &gt;] -&gt; accumulator
409 <b>let parse_operator = parser
410 | [&lt; 'Token.Unary &gt;] -&gt; "unary", 1
411 | [&lt; 'Token.Binary &gt;] -&gt; "binary", 2
412 in</b>
413 let parse_binary_precedence = parser
414 | [&lt; 'Token.Number n &gt;] -&gt; int_of_float n
415 | [&lt; &gt;] -&gt; 30
417 parser
418 | [&lt; 'Token.Ident id;
419 'Token.Kwd '(' ?? "expected '(' in prototype";
420 args=parse_args [];
421 'Token.Kwd ')' ?? "expected ')' in prototype" &gt;] -&gt;
422 (* success. *)
423 Ast.Prototype (id, Array.of_list (List.rev args))
424 <b>| [&lt; (prefix, kind)=parse_operator;
425 'Token.Kwd op ?? "expected an operator";
426 (* Read the precedence if present. *)
427 binary_precedence=parse_binary_precedence;
428 'Token.Kwd '(' ?? "expected '(' in prototype";
429 args=parse_args [];
430 'Token.Kwd ')' ?? "expected ')' in prototype" &gt;] -&gt;
431 let name = prefix ^ (String.make 1 op) in
432 let args = Array.of_list (List.rev args) in
434 (* Verify right number of arguments for operator. *)
435 if Array.length args != kind
436 then raise (Stream.Error "invalid number of operands for operator")
437 else
438 if kind == 1 then
439 Ast.Prototype (name, args)
440 else
441 Ast.BinOpPrototype (name, args, binary_precedence)</b>
442 | [&lt; &gt;] -&gt;
443 raise (Stream.Error "expected function name in prototype")
444 </pre>
445 </div>
447 <p>As with binary operators, we name unary operators with a name that includes
448 the operator character. This assists us at code generation time. Speaking of,
449 the final piece we need to add is codegen support for unary operators. It looks
450 like this:</p>
452 <div class="doc_code">
453 <pre>
454 let rec codegen_expr = function
456 | Ast.Unary (op, operand) -&gt;
457 let operand = codegen_expr operand in
458 let callee = "unary" ^ (String.make 1 op) in
459 let callee =
460 match lookup_function callee the_module with
461 | Some callee -&gt; callee
462 | None -&gt; raise (Error "unknown unary operator")
464 build_call callee [|operand|] "unop" builder
465 </pre>
466 </div>
468 <p>This code is similar to, but simpler than, the code for binary operators. It
469 is simpler primarily because it doesn't need to handle any predefined operators.
470 </p>
472 </div>
474 <!-- *********************************************************************** -->
475 <div class="doc_section"><a name="example">Kicking the Tires</a></div>
476 <!-- *********************************************************************** -->
478 <div class="doc_text">
480 <p>It is somewhat hard to believe, but with a few simple extensions we've
481 covered in the last chapters, we have grown a real-ish language. With this, we
482 can do a lot of interesting things, including I/O, math, and a bunch of other
483 things. For example, we can now add a nice sequencing operator (printd is
484 defined to print out the specified value and a newline):</p>
486 <div class="doc_code">
487 <pre>
488 ready&gt; <b>extern printd(x);</b>
489 Read extern: declare double @printd(double)
490 ready&gt; <b>def binary : 1 (x y) 0; # Low-precedence operator that ignores operands.</b>
492 ready&gt; <b>printd(123) : printd(456) : printd(789);</b>
493 123.000000
494 456.000000
495 789.000000
496 Evaluated to 0.000000
497 </pre>
498 </div>
500 <p>We can also define a bunch of other "primitive" operations, such as:</p>
502 <div class="doc_code">
503 <pre>
504 # Logical unary not.
505 def unary!(v)
506 if v then
508 else
511 # Unary negate.
512 def unary-(v)
513 0-v;
515 # Define &gt; with the same precedence as &gt;.
516 def binary&gt; 10 (LHS RHS)
517 RHS &lt; LHS;
519 # Binary logical or, which does not short circuit.
520 def binary| 5 (LHS RHS)
521 if LHS then
523 else if RHS then
525 else
528 # Binary logical and, which does not short circuit.
529 def binary&amp; 6 (LHS RHS)
530 if !LHS then
532 else
533 !!RHS;
535 # Define = with slightly lower precedence than relationals.
536 def binary = 9 (LHS RHS)
537 !(LHS &lt; RHS | LHS &gt; RHS);
539 </pre>
540 </div>
543 <p>Given the previous if/then/else support, we can also define interesting
544 functions for I/O. For example, the following prints out a character whose
545 "density" reflects the value passed in: the lower the value, the denser the
546 character:</p>
548 <div class="doc_code">
549 <pre>
550 ready&gt;
552 extern putchard(char)
553 def printdensity(d)
554 if d &gt; 8 then
555 putchard(32) # ' '
556 else if d &gt; 4 then
557 putchard(46) # '.'
558 else if d &gt; 2 then
559 putchard(43) # '+'
560 else
561 putchard(42); # '*'</b>
563 ready&gt; <b>printdensity(1): printdensity(2): printdensity(3) :
564 printdensity(4): printdensity(5): printdensity(9): putchard(10);</b>
565 *++..
566 Evaluated to 0.000000
567 </pre>
568 </div>
570 <p>Based on these simple primitive operations, we can start to define more
571 interesting things. For example, here's a little function that solves for the
572 number of iterations it takes a function in the complex plane to
573 converge:</p>
575 <div class="doc_code">
576 <pre>
577 # determine whether the specific location diverges.
578 # Solve for z = z^2 + c in the complex plane.
579 def mandleconverger(real imag iters creal cimag)
580 if iters &gt; 255 | (real*real + imag*imag &gt; 4) then
581 iters
582 else
583 mandleconverger(real*real - imag*imag + creal,
584 2*real*imag + cimag,
585 iters+1, creal, cimag);
587 # return the number of iterations required for the iteration to escape
588 def mandleconverge(real imag)
589 mandleconverger(real, imag, 0, real, imag);
590 </pre>
591 </div>
593 <p>This "z = z<sup>2</sup> + c" function is a beautiful little creature that is the basis
594 for computation of the <a
595 href="http://en.wikipedia.org/wiki/Mandelbrot_set">Mandelbrot Set</a>. Our
596 <tt>mandelconverge</tt> function returns the number of iterations that it takes
597 for a complex orbit to escape, saturating to 255. This is not a very useful
598 function by itself, but if you plot its value over a two-dimensional plane,
599 you can see the Mandelbrot set. Given that we are limited to using putchard
600 here, our amazing graphical output is limited, but we can whip together
601 something using the density plotter above:</p>
603 <div class="doc_code">
604 <pre>
605 # compute and plot the mandlebrot set with the specified 2 dimensional range
606 # info.
607 def mandelhelp(xmin xmax xstep ymin ymax ystep)
608 for y = ymin, y &lt; ymax, ystep in (
609 (for x = xmin, x &lt; xmax, xstep in
610 printdensity(mandleconverge(x,y)))
611 : putchard(10)
614 # mandel - This is a convenient helper function for ploting the mandelbrot set
615 # from the specified position with the specified Magnification.
616 def mandel(realstart imagstart realmag imagmag)
617 mandelhelp(realstart, realstart+realmag*78, realmag,
618 imagstart, imagstart+imagmag*40, imagmag);
619 </pre>
620 </div>
622 <p>Given this, we can try plotting out the mandlebrot set! Lets try it out:</p>
624 <div class="doc_code">
625 <pre>
626 ready&gt; <b>mandel(-2.3, -1.3, 0.05, 0.07);</b>
627 *******************************+++++++++++*************************************
628 *************************+++++++++++++++++++++++*******************************
629 **********************+++++++++++++++++++++++++++++****************************
630 *******************+++++++++++++++++++++.. ...++++++++*************************
631 *****************++++++++++++++++++++++.... ...+++++++++***********************
632 ***************+++++++++++++++++++++++..... ...+++++++++*********************
633 **************+++++++++++++++++++++++.... ....+++++++++********************
634 *************++++++++++++++++++++++...... .....++++++++*******************
635 ************+++++++++++++++++++++....... .......+++++++******************
636 ***********+++++++++++++++++++.... ... .+++++++*****************
637 **********+++++++++++++++++....... .+++++++****************
638 *********++++++++++++++........... ...+++++++***************
639 ********++++++++++++............ ...++++++++**************
640 ********++++++++++... .......... .++++++++**************
641 *******+++++++++..... .+++++++++*************
642 *******++++++++...... ..+++++++++*************
643 *******++++++....... ..+++++++++*************
644 *******+++++...... ..+++++++++*************
645 *******.... .... ...+++++++++*************
646 *******.... . ...+++++++++*************
647 *******+++++...... ...+++++++++*************
648 *******++++++....... ..+++++++++*************
649 *******++++++++...... .+++++++++*************
650 *******+++++++++..... ..+++++++++*************
651 ********++++++++++... .......... .++++++++**************
652 ********++++++++++++............ ...++++++++**************
653 *********++++++++++++++.......... ...+++++++***************
654 **********++++++++++++++++........ .+++++++****************
655 **********++++++++++++++++++++.... ... ..+++++++****************
656 ***********++++++++++++++++++++++....... .......++++++++*****************
657 ************+++++++++++++++++++++++...... ......++++++++******************
658 **************+++++++++++++++++++++++.... ....++++++++********************
659 ***************+++++++++++++++++++++++..... ...+++++++++*********************
660 *****************++++++++++++++++++++++.... ...++++++++***********************
661 *******************+++++++++++++++++++++......++++++++*************************
662 *********************++++++++++++++++++++++.++++++++***************************
663 *************************+++++++++++++++++++++++*******************************
664 ******************************+++++++++++++************************************
665 *******************************************************************************
666 *******************************************************************************
667 *******************************************************************************
668 Evaluated to 0.000000
669 ready&gt; <b>mandel(-2, -1, 0.02, 0.04);</b>
670 **************************+++++++++++++++++++++++++++++++++++++++++++++++++++++
671 ***********************++++++++++++++++++++++++++++++++++++++++++++++++++++++++
672 *********************+++++++++++++++++++++++++++++++++++++++++++++++++++++++++.
673 *******************+++++++++++++++++++++++++++++++++++++++++++++++++++++++++...
674 *****************+++++++++++++++++++++++++++++++++++++++++++++++++++++++++.....
675 ***************++++++++++++++++++++++++++++++++++++++++++++++++++++++++........
676 **************++++++++++++++++++++++++++++++++++++++++++++++++++++++...........
677 ************+++++++++++++++++++++++++++++++++++++++++++++++++++++..............
678 ***********++++++++++++++++++++++++++++++++++++++++++++++++++........ .
679 **********++++++++++++++++++++++++++++++++++++++++++++++.............
680 ********+++++++++++++++++++++++++++++++++++++++++++..................
681 *******+++++++++++++++++++++++++++++++++++++++.......................
682 ******+++++++++++++++++++++++++++++++++++...........................
683 *****++++++++++++++++++++++++++++++++............................
684 *****++++++++++++++++++++++++++++...............................
685 ****++++++++++++++++++++++++++...... .........................
686 ***++++++++++++++++++++++++......... ...... ...........
687 ***++++++++++++++++++++++............
688 **+++++++++++++++++++++..............
689 **+++++++++++++++++++................
690 *++++++++++++++++++.................
691 *++++++++++++++++............ ...
692 *++++++++++++++..............
693 *+++....++++................
694 *.......... ...........
696 *.......... ...........
697 *+++....++++................
698 *++++++++++++++..............
699 *++++++++++++++++............ ...
700 *++++++++++++++++++.................
701 **+++++++++++++++++++................
702 **+++++++++++++++++++++..............
703 ***++++++++++++++++++++++............
704 ***++++++++++++++++++++++++......... ...... ...........
705 ****++++++++++++++++++++++++++...... .........................
706 *****++++++++++++++++++++++++++++...............................
707 *****++++++++++++++++++++++++++++++++............................
708 ******+++++++++++++++++++++++++++++++++++...........................
709 *******+++++++++++++++++++++++++++++++++++++++.......................
710 ********+++++++++++++++++++++++++++++++++++++++++++..................
711 Evaluated to 0.000000
712 ready&gt; <b>mandel(-0.9, -1.4, 0.02, 0.03);</b>
713 *******************************************************************************
714 *******************************************************************************
715 *******************************************************************************
716 **********+++++++++++++++++++++************************************************
717 *+++++++++++++++++++++++++++++++++++++++***************************************
718 +++++++++++++++++++++++++++++++++++++++++++++**********************************
719 ++++++++++++++++++++++++++++++++++++++++++++++++++*****************************
720 ++++++++++++++++++++++++++++++++++++++++++++++++++++++*************************
721 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++**********************
722 +++++++++++++++++++++++++++++++++.........++++++++++++++++++*******************
723 +++++++++++++++++++++++++++++++.... ......+++++++++++++++++++****************
724 +++++++++++++++++++++++++++++....... ........+++++++++++++++++++**************
725 ++++++++++++++++++++++++++++........ ........++++++++++++++++++++************
726 +++++++++++++++++++++++++++......... .. ...+++++++++++++++++++++**********
727 ++++++++++++++++++++++++++........... ....++++++++++++++++++++++********
728 ++++++++++++++++++++++++............. .......++++++++++++++++++++++******
729 +++++++++++++++++++++++............. ........+++++++++++++++++++++++****
730 ++++++++++++++++++++++........... ..........++++++++++++++++++++++***
731 ++++++++++++++++++++........... .........++++++++++++++++++++++*
732 ++++++++++++++++++............ ...........++++++++++++++++++++
733 ++++++++++++++++............... .............++++++++++++++++++
734 ++++++++++++++................. ...............++++++++++++++++
735 ++++++++++++.................. .................++++++++++++++
736 +++++++++.................. .................+++++++++++++
737 ++++++........ . ......... ..++++++++++++
738 ++............ ...... ....++++++++++
739 .............. ...++++++++++
740 .............. ....+++++++++
741 .............. .....++++++++
742 ............. ......++++++++
743 ........... .......++++++++
744 ......... ........+++++++
745 ......... ........+++++++
746 ......... ....+++++++
747 ........ ...+++++++
748 ....... ...+++++++
749 ....+++++++
750 .....+++++++
751 ....+++++++
752 ....+++++++
753 ....+++++++
754 Evaluated to 0.000000
755 ready&gt; <b>^D</b>
756 </pre>
757 </div>
759 <p>At this point, you may be starting to realize that Kaleidoscope is a real
760 and powerful language. It may not be self-similar :), but it can be used to
761 plot things that are!</p>
763 <p>With this, we conclude the "adding user-defined operators" chapter of the
764 tutorial. We have successfully augmented our language, adding the ability to
765 extend the language in the library, and we have shown how this can be used to
766 build a simple but interesting end-user application in Kaleidoscope. At this
767 point, Kaleidoscope can build a variety of applications that are functional and
768 can call functions with side-effects, but it can't actually define and mutate a
769 variable itself.</p>
771 <p>Strikingly, variable mutation is an important feature of some
772 languages, and it is not at all obvious how to <a href="OCamlLangImpl7.html">add
773 support for mutable variables</a> without having to add an "SSA construction"
774 phase to your front-end. In the next chapter, we will describe how you can
775 add variable mutation without building SSA in your front-end.</p>
777 </div>
780 <!-- *********************************************************************** -->
781 <div class="doc_section"><a name="code">Full Code Listing</a></div>
782 <!-- *********************************************************************** -->
784 <div class="doc_text">
787 Here is the complete code listing for our running example, enhanced with the
788 if/then/else and for expressions.. To build this example, use:
789 </p>
791 <div class="doc_code">
792 <pre>
793 # Compile
794 ocamlbuild toy.byte
795 # Run
796 ./toy.byte
797 </pre>
798 </div>
800 <p>Here is the code:</p>
802 <dl>
803 <dt>_tags:</dt>
804 <dd class="doc_code">
805 <pre>
806 &lt;{lexer,parser}.ml&gt;: use_camlp4, pp(camlp4of)
807 &lt;*.{byte,native}&gt;: g++, use_llvm, use_llvm_analysis
808 &lt;*.{byte,native}&gt;: use_llvm_executionengine, use_llvm_target
809 &lt;*.{byte,native}&gt;: use_llvm_scalar_opts, use_bindings
810 </pre>
811 </dd>
813 <dt>myocamlbuild.ml:</dt>
814 <dd class="doc_code">
815 <pre>
816 open Ocamlbuild_plugin;;
818 ocaml_lib ~extern:true "llvm";;
819 ocaml_lib ~extern:true "llvm_analysis";;
820 ocaml_lib ~extern:true "llvm_executionengine";;
821 ocaml_lib ~extern:true "llvm_target";;
822 ocaml_lib ~extern:true "llvm_scalar_opts";;
824 flag ["link"; "ocaml"; "g++"] (S[A"-cc"; A"g++"]);;
825 dep ["link"; "ocaml"; "use_bindings"] ["bindings.o"];;
826 </pre>
827 </dd>
829 <dt>token.ml:</dt>
830 <dd class="doc_code">
831 <pre>
832 (*===----------------------------------------------------------------------===
833 * Lexer Tokens
834 *===----------------------------------------------------------------------===*)
836 (* The lexer returns these 'Kwd' if it is an unknown character, otherwise one of
837 * these others for known things. *)
838 type token =
839 (* commands *)
840 | Def | Extern
842 (* primary *)
843 | Ident of string | Number of float
845 (* unknown *)
846 | Kwd of char
848 (* control *)
849 | If | Then | Else
850 | For | In
852 (* operators *)
853 | Binary | Unary
854 </pre>
855 </dd>
857 <dt>lexer.ml:</dt>
858 <dd class="doc_code">
859 <pre>
860 (*===----------------------------------------------------------------------===
861 * Lexer
862 *===----------------------------------------------------------------------===*)
864 let rec lex = parser
865 (* Skip any whitespace. *)
866 | [&lt; ' (' ' | '\n' | '\r' | '\t'); stream &gt;] -&gt; lex stream
868 (* identifier: [a-zA-Z][a-zA-Z0-9] *)
869 | [&lt; ' ('A' .. 'Z' | 'a' .. 'z' as c); stream &gt;] -&gt;
870 let buffer = Buffer.create 1 in
871 Buffer.add_char buffer c;
872 lex_ident buffer stream
874 (* number: [0-9.]+ *)
875 | [&lt; ' ('0' .. '9' as c); stream &gt;] -&gt;
876 let buffer = Buffer.create 1 in
877 Buffer.add_char buffer c;
878 lex_number buffer stream
880 (* Comment until end of line. *)
881 | [&lt; ' ('#'); stream &gt;] -&gt;
882 lex_comment stream
884 (* Otherwise, just return the character as its ascii value. *)
885 | [&lt; 'c; stream &gt;] -&gt;
886 [&lt; 'Token.Kwd c; lex stream &gt;]
888 (* end of stream. *)
889 | [&lt; &gt;] -&gt; [&lt; &gt;]
891 and lex_number buffer = parser
892 | [&lt; ' ('0' .. '9' | '.' as c); stream &gt;] -&gt;
893 Buffer.add_char buffer c;
894 lex_number buffer stream
895 | [&lt; stream=lex &gt;] -&gt;
896 [&lt; 'Token.Number (float_of_string (Buffer.contents buffer)); stream &gt;]
898 and lex_ident buffer = parser
899 | [&lt; ' ('A' .. 'Z' | 'a' .. 'z' | '0' .. '9' as c); stream &gt;] -&gt;
900 Buffer.add_char buffer c;
901 lex_ident buffer stream
902 | [&lt; stream=lex &gt;] -&gt;
903 match Buffer.contents buffer with
904 | "def" -&gt; [&lt; 'Token.Def; stream &gt;]
905 | "extern" -&gt; [&lt; 'Token.Extern; stream &gt;]
906 | "if" -&gt; [&lt; 'Token.If; stream &gt;]
907 | "then" -&gt; [&lt; 'Token.Then; stream &gt;]
908 | "else" -&gt; [&lt; 'Token.Else; stream &gt;]
909 | "for" -&gt; [&lt; 'Token.For; stream &gt;]
910 | "in" -&gt; [&lt; 'Token.In; stream &gt;]
911 | "binary" -&gt; [&lt; 'Token.Binary; stream &gt;]
912 | "unary" -&gt; [&lt; 'Token.Unary; stream &gt;]
913 | id -&gt; [&lt; 'Token.Ident id; stream &gt;]
915 and lex_comment = parser
916 | [&lt; ' ('\n'); stream=lex &gt;] -&gt; stream
917 | [&lt; 'c; e=lex_comment &gt;] -&gt; e
918 | [&lt; &gt;] -&gt; [&lt; &gt;]
919 </pre>
920 </dd>
922 <dt>ast.ml:</dt>
923 <dd class="doc_code">
924 <pre>
925 (*===----------------------------------------------------------------------===
926 * Abstract Syntax Tree (aka Parse Tree)
927 *===----------------------------------------------------------------------===*)
929 (* expr - Base type for all expression nodes. *)
930 type expr =
931 (* variant for numeric literals like "1.0". *)
932 | Number of float
934 (* variant for referencing a variable, like "a". *)
935 | Variable of string
937 (* variant for a unary operator. *)
938 | Unary of char * expr
940 (* variant for a binary operator. *)
941 | Binary of char * expr * expr
943 (* variant for function calls. *)
944 | Call of string * expr array
946 (* variant for if/then/else. *)
947 | If of expr * expr * expr
949 (* variant for for/in. *)
950 | For of string * expr * expr * expr option * expr
952 (* proto - This type represents the "prototype" for a function, which captures
953 * its name, and its argument names (thus implicitly the number of arguments the
954 * function takes). *)
955 type proto =
956 | Prototype of string * string array
957 | BinOpPrototype of string * string array * int
959 (* func - This type represents a function definition itself. *)
960 type func = Function of proto * expr
961 </pre>
962 </dd>
964 <dt>parser.ml:</dt>
965 <dd class="doc_code">
966 <pre>
967 (*===---------------------------------------------------------------------===
968 * Parser
969 *===---------------------------------------------------------------------===*)
971 (* binop_precedence - This holds the precedence for each binary operator that is
972 * defined *)
973 let binop_precedence:(char, int) Hashtbl.t = Hashtbl.create 10
975 (* precedence - Get the precedence of the pending binary operator token. *)
976 let precedence c = try Hashtbl.find binop_precedence c with Not_found -&gt; -1
978 (* primary
979 * ::= identifier
980 * ::= numberexpr
981 * ::= parenexpr
982 * ::= ifexpr
983 * ::= forexpr *)
984 let rec parse_primary = parser
985 (* numberexpr ::= number *)
986 | [&lt; 'Token.Number n &gt;] -&gt; Ast.Number n
988 (* parenexpr ::= '(' expression ')' *)
989 | [&lt; 'Token.Kwd '('; e=parse_expr; 'Token.Kwd ')' ?? "expected ')'" &gt;] -&gt; e
991 (* identifierexpr
992 * ::= identifier
993 * ::= identifier '(' argumentexpr ')' *)
994 | [&lt; 'Token.Ident id; stream &gt;] -&gt;
995 let rec parse_args accumulator = parser
996 | [&lt; e=parse_expr; stream &gt;] -&gt;
997 begin parser
998 | [&lt; 'Token.Kwd ','; e=parse_args (e :: accumulator) &gt;] -&gt; e
999 | [&lt; &gt;] -&gt; e :: accumulator
1000 end stream
1001 | [&lt; &gt;] -&gt; accumulator
1003 let rec parse_ident id = parser
1004 (* Call. *)
1005 | [&lt; 'Token.Kwd '(';
1006 args=parse_args [];
1007 'Token.Kwd ')' ?? "expected ')'"&gt;] -&gt;
1008 Ast.Call (id, Array.of_list (List.rev args))
1010 (* Simple variable ref. *)
1011 | [&lt; &gt;] -&gt; Ast.Variable id
1013 parse_ident id stream
1015 (* ifexpr ::= 'if' expr 'then' expr 'else' expr *)
1016 | [&lt; 'Token.If; c=parse_expr;
1017 'Token.Then ?? "expected 'then'"; t=parse_expr;
1018 'Token.Else ?? "expected 'else'"; e=parse_expr &gt;] -&gt;
1019 Ast.If (c, t, e)
1021 (* forexpr
1022 ::= 'for' identifier '=' expr ',' expr (',' expr)? 'in' expression *)
1023 | [&lt; 'Token.For;
1024 'Token.Ident id ?? "expected identifier after for";
1025 'Token.Kwd '=' ?? "expected '=' after for";
1026 stream &gt;] -&gt;
1027 begin parser
1028 | [&lt;
1029 start=parse_expr;
1030 'Token.Kwd ',' ?? "expected ',' after for";
1031 end_=parse_expr;
1032 stream &gt;] -&gt;
1033 let step =
1034 begin parser
1035 | [&lt; 'Token.Kwd ','; step=parse_expr &gt;] -&gt; Some step
1036 | [&lt; &gt;] -&gt; None
1037 end stream
1039 begin parser
1040 | [&lt; 'Token.In; body=parse_expr &gt;] -&gt;
1041 Ast.For (id, start, end_, step, body)
1042 | [&lt; &gt;] -&gt;
1043 raise (Stream.Error "expected 'in' after for")
1044 end stream
1045 | [&lt; &gt;] -&gt;
1046 raise (Stream.Error "expected '=' after for")
1047 end stream
1049 | [&lt; &gt;] -&gt; raise (Stream.Error "unknown token when expecting an expression.")
1051 (* unary
1052 * ::= primary
1053 * ::= '!' unary *)
1054 and parse_unary = parser
1055 (* If this is a unary operator, read it. *)
1056 | [&lt; 'Token.Kwd op when op != '(' &amp;&amp; op != ')'; operand=parse_expr &gt;] -&gt;
1057 Ast.Unary (op, operand)
1059 (* If the current token is not an operator, it must be a primary expr. *)
1060 | [&lt; stream &gt;] -&gt; parse_primary stream
1062 (* binoprhs
1063 * ::= ('+' primary)* *)
1064 and parse_bin_rhs expr_prec lhs stream =
1065 match Stream.peek stream with
1066 (* If this is a binop, find its precedence. *)
1067 | Some (Token.Kwd c) when Hashtbl.mem binop_precedence c -&gt;
1068 let token_prec = precedence c in
1070 (* If this is a binop that binds at least as tightly as the current binop,
1071 * consume it, otherwise we are done. *)
1072 if token_prec &lt; expr_prec then lhs else begin
1073 (* Eat the binop. *)
1074 Stream.junk stream;
1076 (* Parse the unary expression after the binary operator. *)
1077 let rhs = parse_unary stream in
1079 (* Okay, we know this is a binop. *)
1080 let rhs =
1081 match Stream.peek stream with
1082 | Some (Token.Kwd c2) -&gt;
1083 (* If BinOp binds less tightly with rhs than the operator after
1084 * rhs, let the pending operator take rhs as its lhs. *)
1085 let next_prec = precedence c2 in
1086 if token_prec &lt; next_prec
1087 then parse_bin_rhs (token_prec + 1) rhs stream
1088 else rhs
1089 | _ -&gt; rhs
1092 (* Merge lhs/rhs. *)
1093 let lhs = Ast.Binary (c, lhs, rhs) in
1094 parse_bin_rhs expr_prec lhs stream
1096 | _ -&gt; lhs
1098 (* expression
1099 * ::= primary binoprhs *)
1100 and parse_expr = parser
1101 | [&lt; lhs=parse_unary; stream &gt;] -&gt; parse_bin_rhs 0 lhs stream
1103 (* prototype
1104 * ::= id '(' id* ')'
1105 * ::= binary LETTER number? (id, id)
1106 * ::= unary LETTER number? (id) *)
1107 let parse_prototype =
1108 let rec parse_args accumulator = parser
1109 | [&lt; 'Token.Ident id; e=parse_args (id::accumulator) &gt;] -&gt; e
1110 | [&lt; &gt;] -&gt; accumulator
1112 let parse_operator = parser
1113 | [&lt; 'Token.Unary &gt;] -&gt; "unary", 1
1114 | [&lt; 'Token.Binary &gt;] -&gt; "binary", 2
1116 let parse_binary_precedence = parser
1117 | [&lt; 'Token.Number n &gt;] -&gt; int_of_float n
1118 | [&lt; &gt;] -&gt; 30
1120 parser
1121 | [&lt; 'Token.Ident id;
1122 'Token.Kwd '(' ?? "expected '(' in prototype";
1123 args=parse_args [];
1124 'Token.Kwd ')' ?? "expected ')' in prototype" &gt;] -&gt;
1125 (* success. *)
1126 Ast.Prototype (id, Array.of_list (List.rev args))
1127 | [&lt; (prefix, kind)=parse_operator;
1128 'Token.Kwd op ?? "expected an operator";
1129 (* Read the precedence if present. *)
1130 binary_precedence=parse_binary_precedence;
1131 'Token.Kwd '(' ?? "expected '(' in prototype";
1132 args=parse_args [];
1133 'Token.Kwd ')' ?? "expected ')' in prototype" &gt;] -&gt;
1134 let name = prefix ^ (String.make 1 op) in
1135 let args = Array.of_list (List.rev args) in
1137 (* Verify right number of arguments for operator. *)
1138 if Array.length args != kind
1139 then raise (Stream.Error "invalid number of operands for operator")
1140 else
1141 if kind == 1 then
1142 Ast.Prototype (name, args)
1143 else
1144 Ast.BinOpPrototype (name, args, binary_precedence)
1145 | [&lt; &gt;] -&gt;
1146 raise (Stream.Error "expected function name in prototype")
1148 (* definition ::= 'def' prototype expression *)
1149 let parse_definition = parser
1150 | [&lt; 'Token.Def; p=parse_prototype; e=parse_expr &gt;] -&gt;
1151 Ast.Function (p, e)
1153 (* toplevelexpr ::= expression *)
1154 let parse_toplevel = parser
1155 | [&lt; e=parse_expr &gt;] -&gt;
1156 (* Make an anonymous proto. *)
1157 Ast.Function (Ast.Prototype ("", [||]), e)
1159 (* external ::= 'extern' prototype *)
1160 let parse_extern = parser
1161 | [&lt; 'Token.Extern; e=parse_prototype &gt;] -&gt; e
1162 </pre>
1163 </dd>
1165 <dt>codegen.ml:</dt>
1166 <dd class="doc_code">
1167 <pre>
1168 (*===----------------------------------------------------------------------===
1169 * Code Generation
1170 *===----------------------------------------------------------------------===*)
1172 open Llvm
1174 exception Error of string
1176 let the_module = create_module "my cool jit"
1177 let builder = builder ()
1178 let named_values:(string, llvalue) Hashtbl.t = Hashtbl.create 10
1180 let rec codegen_expr = function
1181 | Ast.Number n -&gt; const_float double_type n
1182 | Ast.Variable name -&gt;
1183 (try Hashtbl.find named_values name with
1184 | Not_found -&gt; raise (Error "unknown variable name"))
1185 | Ast.Unary (op, operand) -&gt;
1186 let operand = codegen_expr operand in
1187 let callee = "unary" ^ (String.make 1 op) in
1188 let callee =
1189 match lookup_function callee the_module with
1190 | Some callee -&gt; callee
1191 | None -&gt; raise (Error "unknown unary operator")
1193 build_call callee [|operand|] "unop" builder
1194 | Ast.Binary (op, lhs, rhs) -&gt;
1195 let lhs_val = codegen_expr lhs in
1196 let rhs_val = codegen_expr rhs in
1197 begin
1198 match op with
1199 | '+' -&gt; build_add lhs_val rhs_val "addtmp" builder
1200 | '-' -&gt; build_sub lhs_val rhs_val "subtmp" builder
1201 | '*' -&gt; build_mul lhs_val rhs_val "multmp" builder
1202 | '&lt;' -&gt;
1203 (* Convert bool 0/1 to double 0.0 or 1.0 *)
1204 let i = build_fcmp Fcmp.Ult lhs_val rhs_val "cmptmp" builder in
1205 build_uitofp i double_type "booltmp" builder
1206 | _ -&gt;
1207 (* If it wasn't a builtin binary operator, it must be a user defined
1208 * one. Emit a call to it. *)
1209 let callee = "binary" ^ (String.make 1 op) in
1210 let callee =
1211 match lookup_function callee the_module with
1212 | Some callee -&gt; callee
1213 | None -&gt; raise (Error "binary operator not found!")
1215 build_call callee [|lhs_val; rhs_val|] "binop" builder
1217 | Ast.Call (callee, args) -&gt;
1218 (* Look up the name in the module table. *)
1219 let callee =
1220 match lookup_function callee the_module with
1221 | Some callee -&gt; callee
1222 | None -&gt; raise (Error "unknown function referenced")
1224 let params = params callee in
1226 (* If argument mismatch error. *)
1227 if Array.length params == Array.length args then () else
1228 raise (Error "incorrect # arguments passed");
1229 let args = Array.map codegen_expr args in
1230 build_call callee args "calltmp" builder
1231 | Ast.If (cond, then_, else_) -&gt;
1232 let cond = codegen_expr cond in
1234 (* Convert condition to a bool by comparing equal to 0.0 *)
1235 let zero = const_float double_type 0.0 in
1236 let cond_val = build_fcmp Fcmp.One cond zero "ifcond" builder in
1238 (* Grab the first block so that we might later add the conditional branch
1239 * to it at the end of the function. *)
1240 let start_bb = insertion_block builder in
1241 let the_function = block_parent start_bb in
1243 let then_bb = append_block "then" the_function in
1245 (* Emit 'then' value. *)
1246 position_at_end then_bb builder;
1247 let then_val = codegen_expr then_ in
1249 (* Codegen of 'then' can change the current block, update then_bb for the
1250 * phi. We create a new name because one is used for the phi node, and the
1251 * other is used for the conditional branch. *)
1252 let new_then_bb = insertion_block builder in
1254 (* Emit 'else' value. *)
1255 let else_bb = append_block "else" the_function in
1256 position_at_end else_bb builder;
1257 let else_val = codegen_expr else_ in
1259 (* Codegen of 'else' can change the current block, update else_bb for the
1260 * phi. *)
1261 let new_else_bb = insertion_block builder in
1263 (* Emit merge block. *)
1264 let merge_bb = append_block "ifcont" the_function in
1265 position_at_end merge_bb builder;
1266 let incoming = [(then_val, new_then_bb); (else_val, new_else_bb)] in
1267 let phi = build_phi incoming "iftmp" builder in
1269 (* Return to the start block to add the conditional branch. *)
1270 position_at_end start_bb builder;
1271 ignore (build_cond_br cond_val then_bb else_bb builder);
1273 (* Set a unconditional branch at the end of the 'then' block and the
1274 * 'else' block to the 'merge' block. *)
1275 position_at_end new_then_bb builder; ignore (build_br merge_bb builder);
1276 position_at_end new_else_bb builder; ignore (build_br merge_bb builder);
1278 (* Finally, set the builder to the end of the merge block. *)
1279 position_at_end merge_bb builder;
1282 | Ast.For (var_name, start, end_, step, body) -&gt;
1283 (* Emit the start code first, without 'variable' in scope. *)
1284 let start_val = codegen_expr start in
1286 (* Make the new basic block for the loop header, inserting after current
1287 * block. *)
1288 let preheader_bb = insertion_block builder in
1289 let the_function = block_parent preheader_bb in
1290 let loop_bb = append_block "loop" the_function in
1292 (* Insert an explicit fall through from the current block to the
1293 * loop_bb. *)
1294 ignore (build_br loop_bb builder);
1296 (* Start insertion in loop_bb. *)
1297 position_at_end loop_bb builder;
1299 (* Start the PHI node with an entry for start. *)
1300 let variable = build_phi [(start_val, preheader_bb)] var_name builder in
1302 (* Within the loop, the variable is defined equal to the PHI node. If it
1303 * shadows an existing variable, we have to restore it, so save it
1304 * now. *)
1305 let old_val =
1306 try Some (Hashtbl.find named_values var_name) with Not_found -&gt; None
1308 Hashtbl.add named_values var_name variable;
1310 (* Emit the body of the loop. This, like any other expr, can change the
1311 * current BB. Note that we ignore the value computed by the body, but
1312 * don't allow an error *)
1313 ignore (codegen_expr body);
1315 (* Emit the step value. *)
1316 let step_val =
1317 match step with
1318 | Some step -&gt; codegen_expr step
1319 (* If not specified, use 1.0. *)
1320 | None -&gt; const_float double_type 1.0
1323 let next_var = build_add variable step_val "nextvar" builder in
1325 (* Compute the end condition. *)
1326 let end_cond = codegen_expr end_ in
1328 (* Convert condition to a bool by comparing equal to 0.0. *)
1329 let zero = const_float double_type 0.0 in
1330 let end_cond = build_fcmp Fcmp.One end_cond zero "loopcond" builder in
1332 (* Create the "after loop" block and insert it. *)
1333 let loop_end_bb = insertion_block builder in
1334 let after_bb = append_block "afterloop" the_function in
1336 (* Insert the conditional branch into the end of loop_end_bb. *)
1337 ignore (build_cond_br end_cond loop_bb after_bb builder);
1339 (* Any new code will be inserted in after_bb. *)
1340 position_at_end after_bb builder;
1342 (* Add a new entry to the PHI node for the backedge. *)
1343 add_incoming (next_var, loop_end_bb) variable;
1345 (* Restore the unshadowed variable. *)
1346 begin match old_val with
1347 | Some old_val -&gt; Hashtbl.add named_values var_name old_val
1348 | None -&gt; ()
1349 end;
1351 (* for expr always returns 0.0. *)
1352 const_null double_type
1354 let codegen_proto = function
1355 | Ast.Prototype (name, args) | Ast.BinOpPrototype (name, args, _) -&gt;
1356 (* Make the function type: double(double,double) etc. *)
1357 let doubles = Array.make (Array.length args) double_type in
1358 let ft = function_type double_type doubles in
1359 let f =
1360 match lookup_function name the_module with
1361 | None -&gt; declare_function name ft the_module
1363 (* If 'f' conflicted, there was already something named 'name'. If it
1364 * has a body, don't allow redefinition or reextern. *)
1365 | Some f -&gt;
1366 (* If 'f' already has a body, reject this. *)
1367 if block_begin f &lt;&gt; At_end f then
1368 raise (Error "redefinition of function");
1370 (* If 'f' took a different number of arguments, reject. *)
1371 if element_type (type_of f) &lt;&gt; ft then
1372 raise (Error "redefinition of function with different # args");
1376 (* Set names for all arguments. *)
1377 Array.iteri (fun i a -&gt;
1378 let n = args.(i) in
1379 set_value_name n a;
1380 Hashtbl.add named_values n a;
1381 ) (params f);
1384 let codegen_func the_fpm = function
1385 | Ast.Function (proto, body) -&gt;
1386 Hashtbl.clear named_values;
1387 let the_function = codegen_proto proto in
1389 (* If this is an operator, install it. *)
1390 begin match proto with
1391 | Ast.BinOpPrototype (name, args, prec) -&gt;
1392 let op = name.[String.length name - 1] in
1393 Hashtbl.add Parser.binop_precedence op prec;
1394 | _ -&gt; ()
1395 end;
1397 (* Create a new basic block to start insertion into. *)
1398 let bb = append_block "entry" the_function in
1399 position_at_end bb builder;
1402 let ret_val = codegen_expr body in
1404 (* Finish off the function. *)
1405 let _ = build_ret ret_val builder in
1407 (* Validate the generated code, checking for consistency. *)
1408 Llvm_analysis.assert_valid_function the_function;
1410 (* Optimize the function. *)
1411 let _ = PassManager.run_function the_function the_fpm in
1413 the_function
1414 with e -&gt;
1415 delete_function the_function;
1416 raise e
1417 </pre>
1418 </dd>
1420 <dt>toplevel.ml:</dt>
1421 <dd class="doc_code">
1422 <pre>
1423 (*===----------------------------------------------------------------------===
1424 * Top-Level parsing and JIT Driver
1425 *===----------------------------------------------------------------------===*)
1427 open Llvm
1428 open Llvm_executionengine
1430 (* top ::= definition | external | expression | ';' *)
1431 let rec main_loop the_fpm the_execution_engine stream =
1432 match Stream.peek stream with
1433 | None -&gt; ()
1435 (* ignore top-level semicolons. *)
1436 | Some (Token.Kwd ';') -&gt;
1437 Stream.junk stream;
1438 main_loop the_fpm the_execution_engine stream
1440 | Some token -&gt;
1441 begin
1442 try match token with
1443 | Token.Def -&gt;
1444 let e = Parser.parse_definition stream in
1445 print_endline "parsed a function definition.";
1446 dump_value (Codegen.codegen_func the_fpm e);
1447 | Token.Extern -&gt;
1448 let e = Parser.parse_extern stream in
1449 print_endline "parsed an extern.";
1450 dump_value (Codegen.codegen_proto e);
1451 | _ -&gt;
1452 (* Evaluate a top-level expression into an anonymous function. *)
1453 let e = Parser.parse_toplevel stream in
1454 print_endline "parsed a top-level expr";
1455 let the_function = Codegen.codegen_func the_fpm e in
1456 dump_value the_function;
1458 (* JIT the function, returning a function pointer. *)
1459 let result = ExecutionEngine.run_function the_function [||]
1460 the_execution_engine in
1462 print_string "Evaluated to ";
1463 print_float (GenericValue.as_float double_type result);
1464 print_newline ();
1465 with Stream.Error s | Codegen.Error s -&gt;
1466 (* Skip token for error recovery. *)
1467 Stream.junk stream;
1468 print_endline s;
1469 end;
1470 print_string "ready&gt; "; flush stdout;
1471 main_loop the_fpm the_execution_engine stream
1472 </pre>
1473 </dd>
1475 <dt>toy.ml:</dt>
1476 <dd class="doc_code">
1477 <pre>
1478 (*===----------------------------------------------------------------------===
1479 * Main driver code.
1480 *===----------------------------------------------------------------------===*)
1482 open Llvm
1483 open Llvm_executionengine
1484 open Llvm_target
1485 open Llvm_scalar_opts
1487 let main () =
1488 (* Install standard binary operators.
1489 * 1 is the lowest precedence. *)
1490 Hashtbl.add Parser.binop_precedence '&lt;' 10;
1491 Hashtbl.add Parser.binop_precedence '+' 20;
1492 Hashtbl.add Parser.binop_precedence '-' 20;
1493 Hashtbl.add Parser.binop_precedence '*' 40; (* highest. *)
1495 (* Prime the first token. *)
1496 print_string "ready&gt; "; flush stdout;
1497 let stream = Lexer.lex (Stream.of_channel stdin) in
1499 (* Create the JIT. *)
1500 let the_module_provider = ModuleProvider.create Codegen.the_module in
1501 let the_execution_engine = ExecutionEngine.create the_module_provider in
1502 let the_fpm = PassManager.create_function the_module_provider in
1504 (* Set up the optimizer pipeline. Start with registering info about how the
1505 * target lays out data structures. *)
1506 TargetData.add (ExecutionEngine.target_data the_execution_engine) the_fpm;
1508 (* Do simple "peephole" optimizations and bit-twiddling optzn. *)
1509 add_instruction_combining the_fpm;
1511 (* reassociate expressions. *)
1512 add_reassociation the_fpm;
1514 (* Eliminate Common SubExpressions. *)
1515 add_gvn the_fpm;
1517 (* Simplify the control flow graph (deleting unreachable blocks, etc). *)
1518 add_cfg_simplification the_fpm;
1520 (* Run the main "interpreter loop" now. *)
1521 Toplevel.main_loop the_fpm the_execution_engine stream;
1523 (* Print out all the generated code. *)
1524 dump_module Codegen.the_module
1527 main ()
1528 </pre>
1529 </dd>
1531 <dt>bindings.c</dt>
1532 <dd class="doc_code">
1533 <pre>
1534 #include &lt;stdio.h&gt;
1536 /* putchard - putchar that takes a double and returns 0. */
1537 extern double putchard(double X) {
1538 putchar((char)X);
1539 return 0;
1542 /* printd - printf that takes a double prints it as "%f\n", returning 0. */
1543 extern double printd(double X) {
1544 printf("%f\n", X);
1545 return 0;
1547 </pre>
1548 </dd>
1549 </dl>
1551 <a href="OCamlLangImpl7.html">Next: Extending the language: mutable variables /
1552 SSA construction</a>
1553 </div>
1555 <!-- *********************************************************************** -->
1556 <hr>
1557 <address>
1558 <a href="http://jigsaw.w3.org/css-validator/check/referer"><img
1559 src="http://jigsaw.w3.org/css-validator/images/vcss" alt="Valid CSS!"></a>
1560 <a href="http://validator.w3.org/check/referer"><img
1561 src="http://www.w3.org/Icons/valid-html401" alt="Valid HTML 4.01!"></a>
1563 <a href="mailto:sabre@nondot.org">Chris Lattner</a><br>
1564 <a href="mailto:idadesub@users.sourceforge.net">Erick Tryzelaar</a><br>
1565 <a href="http://llvm.org">The LLVM Compiler Infrastructure</a><br>
1566 Last modified: $Date: 2007-10-17 11:05:13 -0700 (Wed, 17 Oct 2007) $
1567 </address>
1568 </body>
1569 </html>