1 ==============================================
2 Kaleidoscope: Adding JIT and Optimizer Support
3 ==============================================
11 Welcome to Chapter 4 of the "`Implementing a language with
12 LLVM <index.html>`_" tutorial. Chapters 1-3 described the implementation
13 of a simple language and added support for generating LLVM IR. This
14 chapter describes two new techniques: adding optimizer support to your
15 language, and adding JIT compiler support. These additions will
demonstrate how to get nice, efficient code for the Kaleidoscope
language.
19 Trivial Constant Folding
20 ========================
22 **Note:** the default ``IRBuilder`` now always includes the constant
folding optimizations below.
25 Our demonstration for Chapter 3 is elegant and easy to extend.
26 Unfortunately, it does not produce wonderful code. For example, when
27 compiling simple code, we don't get obvious optimizations:
31 ready> def test(x) 1+2+x;
32 Read function definition:
33 define double @test(double %x) {
35 %addtmp = fadd double 1.000000e+00, 2.000000e+00
36 %addtmp1 = fadd double %addtmp, %x
40 This code is a very, very literal transcription of the AST built by
41 parsing the input. As such, this transcription lacks optimizations like
42 constant folding (we'd like to get "``add x, 3.0``" in the example
43 above) as well as other more important optimizations. Constant folding,
44 in particular, is a very common and very important optimization: so much
45 so that many language implementors implement constant folding support in
46 their AST representation.
48 With LLVM, you don't need this support in the AST. Since all calls to
49 build LLVM IR go through the LLVM builder, it would be nice if the
50 builder itself checked to see if there was a constant folding
51 opportunity when you call it. If so, it could just do the constant fold
52 and return the constant instead of creating an instruction. This is
53 exactly what the ``LLVMFoldingBuilder`` class does.
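For instance, here is a rough sketch of what happens when both operands of
an add are constants (this snippet is not part of the tutorial sources; it
assumes the ``const_float``, ``double_type``, ``builder`` and ``build_add``
definitions from ``codegen.ml`` below, and the name ``fold_example`` is
purely illustrative):

.. code-block:: ocaml

    (* With the folding builder, building "1.0 + 2.0" never emits an fadd
     * instruction: the call below simply returns the constant 3.0. *)
    let fold_example () =
      let lhs = const_float double_type 1.0 in
      let rhs = const_float double_type 2.0 in
      build_add lhs rhs "addtmp" builder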
55 All we did was switch from ``LLVMBuilder`` to ``LLVMFoldingBuilder``.
56 Though we change no other code, we now have all of our instructions
57 implicitly constant folded without us having to do anything about it.
58 For example, the input above now compiles to:
62 ready> def test(x) 1+2+x;
63 Read function definition:
64 define double @test(double %x) {
66 %addtmp = fadd double 3.000000e+00, %x
70 Well, that was easy :). In practice, we recommend always using
71 ``LLVMFoldingBuilder`` when generating code like this. It has no
72 "syntactic overhead" for its use (you don't have to uglify your compiler
73 with constant checks everywhere) and it can dramatically reduce the
amount of LLVM IR that is generated in some cases (particularly for
75 languages with a macro preprocessor or that use a lot of constants).
77 On the other hand, the ``LLVMFoldingBuilder`` is limited by the fact
78 that it does all of its analysis inline with the code as it is built. If
79 you take a slightly more complex example:
83 ready> def test(x) (1+2+x)*(x+(1+2));
84 ready> Read function definition:
85 define double @test(double %x) {
87 %addtmp = fadd double 3.000000e+00, %x
88 %addtmp1 = fadd double %x, 3.000000e+00
89 %multmp = fmul double %addtmp, %addtmp1
93 In this case, the LHS and RHS of the multiplication are the same value.
94 We'd really like to see this generate "``tmp = x+3; result = tmp*tmp;``"
instead of computing "``x+3``" twice.
97 Unfortunately, no amount of local analysis will be able to detect and
98 correct this. This requires two transformations: reassociation of
expressions (to make the adds lexically identical) and Common
100 Subexpression Elimination (CSE) to delete the redundant add instruction.
101 Fortunately, LLVM provides a broad range of optimizations that you can
102 use, in the form of "passes".
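To make this concrete, here is a minimal sketch of a function pass manager
carrying just those two transformations (it assumes ``open Llvm`` and
``open Llvm_scalar_opts`` as in ``main.ml``, reuses ``Codegen.the_module``
from Chapter 3, and the name ``make_cse_fpm`` is purely illustrative); the
full pipeline we actually install is shown in the next section:

.. code-block:: ocaml

    (* A pass manager with only the two passes this example needs. *)
    let make_cse_fpm () =
      let fpm = PassManager.create_function Codegen.the_module in
      (* Reassociation rewrites the two adds into a canonical, identical form. *)
      add_reassociation fpm;
      (* GVN performs CSE, deleting the now-redundant add. *)
      add_gvn fpm;
      ignore (PassManager.initialize fpm);
      fpm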
104 LLVM Optimization Passes
105 ========================
107 LLVM provides many optimization passes, which do many different sorts of
108 things and have different tradeoffs. Unlike other systems, LLVM doesn't
109 hold to the mistaken notion that one set of optimizations is right for
110 all languages and for all situations. LLVM allows a compiler implementor
111 to make complete decisions about what optimizations to use, in which
112 order, and in what situation.
As a concrete example, LLVM supports both "whole module" passes, which
look across as large a body of code as they can (often a whole file,
but if run at link time, this can be a substantial portion of the whole
program), and "per-function" passes which just
118 operate on a single function at a time, without looking at other
119 functions. For more information on passes and how they are run, see the
120 `How to Write a Pass <../WritingAnLLVMPass.html>`_ document and the
121 `List of LLVM Passes <../Passes.html>`_.
123 For Kaleidoscope, we are currently generating functions on the fly, one
124 at a time, as the user types them in. We aren't shooting for the
125 ultimate optimization experience in this setting, but we also want to
126 catch the easy and quick stuff where possible. As such, we will choose
127 to run a few per-function optimizations as the user types the function
128 in. If we wanted to make a "static Kaleidoscope compiler", we would use
129 exactly the code we have now, except that we would defer running the
130 optimizer until the entire file has been parsed.
132 In order to get per-function optimizations going, we need to set up a
133 `Llvm.PassManager <../WritingAnLLVMPass.html#what-passmanager-does>`_ to hold and
134 organize the LLVM optimizations that we want to run. Once we have that,
135 we can add a set of optimizations to run. The code looks like this:
137 .. code-block:: ocaml
139 (* Create the JIT. *)
140 let the_execution_engine = ExecutionEngine.create Codegen.the_module in
141 let the_fpm = PassManager.create_function Codegen.the_module in
143 (* Set up the optimizer pipeline. Start with registering info about how the
144 * target lays out data structures. *)
145 DataLayout.add (ExecutionEngine.target_data the_execution_engine) the_fpm;
147 (* Do simple "peephole" optimizations and bit-twiddling optzn. *)
add_instruction_combination the_fpm;
150 (* reassociate expressions. *)
151 add_reassociation the_fpm;
(* Eliminate Common SubExpressions. *)
add_gvn the_fpm;
156 (* Simplify the control flow graph (deleting unreachable blocks, etc). *)
157 add_cfg_simplification the_fpm;
159 ignore (PassManager.initialize the_fpm);
161 (* Run the main "interpreter loop" now. *)
162 Toplevel.main_loop the_fpm the_execution_engine stream;
The meat of the matter here is the definition of "``the_fpm``". It
requires a pointer to ``the_module`` to construct itself. Once it is
166 set up, we use a series of "add" calls to add a bunch of LLVM passes.
The first pass is basically boilerplate: it adds a pass so that later
optimizations know how the data structures in the program are laid out.
169 The "``the_execution_engine``" variable is related to the JIT, which we
170 will get to in the next section.
172 In this case, we choose to add 4 optimization passes. The passes we
173 chose here are a pretty standard set of "cleanup" optimizations that are
174 useful for a wide variety of code. I won't delve into what they do but,
175 believe me, they are a good starting place :).
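If you want to experiment further, other per-function passes from
``Llvm_scalar_opts`` can be appended in exactly the same way. As a sketch
(not part of this chapter's pipeline), the mem2reg pass, which becomes
essential once the language grows mutable variables in Chapter 7, would be
added like this:

.. code-block:: ocaml

    (* Add further passes before PassManager.initialize is called. *)
    (* Promote allocas to SSA registers (needed from Chapter 7 onwards). *)
    add_memory_to_register_promotion the_fpm;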
Once the ``Llvm.PassManager`` is set up, we need to make use of it. We
178 do this by running it after our newly created function is constructed
179 (in ``Codegen.codegen_func``), but before it is returned to the client:
181 .. code-block:: ocaml
183 let codegen_func the_fpm = function
186 let ret_val = codegen_expr body in
188 (* Finish off the function. *)
189 let _ = build_ret ret_val builder in
191 (* Validate the generated code, checking for consistency. *)
192 Llvm_analysis.assert_valid_function the_function;
194 (* Optimize the function. *)
195 let _ = PassManager.run_function the_function the_fpm in
As you can see, this is pretty straightforward. ``the_fpm``
optimizes and updates the LLVM Function\* in place, improving
(hopefully) its body. With this in place, we can try our test above
again:
206 ready> def test(x) (1+2+x)*(x+(1+2));
207 ready> Read function definition:
208 define double @test(double %x) {
210 %addtmp = fadd double %x, 3.000000e+00
211 %multmp = fmul double %addtmp, %addtmp
215 As expected, we now get our nicely optimized code, saving a floating
216 point add instruction from every execution of this function.
218 LLVM provides a wide variety of optimizations that can be used in
219 certain circumstances. Some `documentation about the various
220 passes <../Passes.html>`_ is available, but it isn't very complete.
221 Another good source of ideas can come from looking at the passes that
222 ``Clang`` runs to get started. The "``opt``" tool allows you to
experiment with passes from the command line, so you can see if they do
anything.

Now that we have reasonable code coming out of our front-end, let's talk
about executing it!
229 Adding a JIT Compiler
230 =====================
232 Code that is available in LLVM IR can have a wide variety of tools
233 applied to it. For example, you can run optimizations on it (as we did
234 above), you can dump it out in textual or binary forms, you can compile
235 the code to an assembly file (.s) for some target, or you can JIT
236 compile it. The nice thing about the LLVM IR representation is that it
237 is the "common currency" between many different parts of the compiler.
239 In this section, we'll add JIT compiler support to our interpreter. The
240 basic idea that we want for Kaleidoscope is to have the user enter
241 function bodies as they do now, but immediately evaluate the top-level
242 expressions they type in. For example, if they type in "1 + 2;", we
243 should evaluate and print out 3. If they define a function, they should
244 be able to call it from the command line.
246 In order to do this, we first declare and initialize the JIT. This is
247 done by adding a global variable and a call in ``main``:
249 .. code-block:: ocaml
254 (* Create the JIT. *)
255 let the_execution_engine = ExecutionEngine.create Codegen.the_module in
258 This creates an abstract "Execution Engine" which can be either a JIT
259 compiler or the LLVM interpreter. LLVM will automatically pick a JIT
260 compiler for you if one is available for your platform, otherwise it
261 will fall back to the interpreter.
263 Once the ``Llvm_executionengine.ExecutionEngine.t`` is created, the JIT
264 is ready to be used. There are a variety of APIs that are useful, but
265 the simplest one is the
266 "``Llvm_executionengine.ExecutionEngine.run_function``" function. This
267 method JIT compiles the specified LLVM Function and returns a function
268 pointer to the generated machine code. In our case, this means that we
can change the code that parses a top-level expression to look like
this:
272 .. code-block:: ocaml
274 (* Evaluate a top-level expression into an anonymous function. *)
275 let e = Parser.parse_toplevel stream in
276 print_endline "parsed a top-level expr";
277 let the_function = Codegen.codegen_func the_fpm e in
278 dump_value the_function;
280 (* JIT the function, returning a function pointer. *)
281 let result = ExecutionEngine.run_function the_function [||]
282 the_execution_engine in
284 print_string "Evaluated to ";
285 print_float (GenericValue.as_float Codegen.double_type result);
288 Recall that we compile top-level expressions into a self-contained LLVM
289 function that takes no arguments and returns the computed double.
290 Because the LLVM JIT compiler matches the native platform ABI, this
291 means that you can just cast the result pointer to a function pointer of
that type and call it directly. This means there is no difference
293 between JIT compiled code and native machine code that is statically
294 linked into your application.
With just these two changes, let's see how Kaleidoscope works now!
301 define double @""() {
303 ret double 9.000000e+00
306 Evaluated to 9.000000
308 Well this looks like it is basically working. The dump of the function
309 shows the "no argument function that always returns double" that we
310 synthesize for each top level expression that is typed in. This
311 demonstrates very basic functionality, but can we do more?
315 ready> def testfunc(x y) x + y*2;
316 Read function definition:
317 define double @testfunc(double %x, double %y) {
319 %multmp = fmul double %y, 2.000000e+00
320 %addtmp = fadd double %multmp, %x
324 ready> testfunc(4, 10);
325 define double @""() {
327 %calltmp = call double @testfunc(double 4.000000e+00, double 1.000000e+01)
331 Evaluated to 24.000000
333 This illustrates that we can now call user code, but there is something
334 a bit subtle going on here. Note that we only invoke the JIT on the
335 anonymous functions that *call testfunc*, but we never invoked it on
336 *testfunc* itself. What actually happened here is that the JIT scanned
337 for all non-JIT'd functions transitively called from the anonymous
function and compiled all of them before returning from
``run_function``.
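Because ``run_function`` works on any compiled function, not just our
anonymous wrappers, we could also invoke ``testfunc`` straight from the
driver. Here is a hedged sketch (not in the tutorial driver; it assumes
``open Llvm`` and ``open Llvm_executionengine`` as in ``toplevel.ml``, that
the bindings provide ``GenericValue.of_float`` for boxing arguments, and the
name ``call_testfunc`` is purely illustrative):

.. code-block:: ocaml

    (* Look up the JIT'd function in the module and call it directly. *)
    let call_testfunc the_execution_engine =
      match lookup_function "testfunc" Codegen.the_module with
      | None -> print_endline "testfunc is not defined yet"
      | Some f ->
          let args = [| GenericValue.of_float Codegen.double_type 4.0;
                        GenericValue.of_float Codegen.double_type 10.0 |] in
          let result = ExecutionEngine.run_function f args the_execution_engine in
          Printf.printf "testfunc(4, 10) = %f\n"
            (GenericValue.as_float Codegen.double_type result)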
341 The JIT provides a number of other more advanced interfaces for things
342 like freeing allocated machine code, rejit'ing functions to update them,
343 etc. However, even with this simple code, we get some surprisingly
344 powerful capabilities - check this out (I removed the dump of the
345 anonymous functions, you should get the idea by now :) :
349 ready> extern sin(x);
351 declare double @sin(double)
353 ready> extern cos(x);
355 declare double @cos(double)
358 Evaluated to 0.841471
360 ready> def foo(x) sin(x)*sin(x) + cos(x)*cos(x);
361 Read function definition:
362 define double @foo(double %x) {
364 %calltmp = call double @sin(double %x)
365 %multmp = fmul double %calltmp, %calltmp
366 %calltmp2 = call double @cos(double %x)
367 %multmp4 = fmul double %calltmp2, %calltmp2
368 %addtmp = fadd double %multmp, %multmp4
373 Evaluated to 1.000000
375 Whoa, how does the JIT know about sin and cos? The answer is
376 surprisingly simple: in this example, the JIT started execution of a
377 function and got to a function call. It realized that the function was
378 not yet JIT compiled and invoked the standard set of routines to resolve
379 the function. In this case, there is no body defined for the function,
380 so the JIT ended up calling "``dlsym("sin")``" on the Kaleidoscope
381 process itself. Since "``sin``" is defined within the JIT's address
space, it simply patches up calls in the module to call the libm version
of ``sin`` directly.
385 The LLVM JIT provides a number of interfaces (look in the
386 ``llvm_executionengine.mli`` file) for controlling how unknown functions
387 get resolved. It allows you to establish explicit mappings between IR
388 objects and addresses (useful for LLVM global variables that you want to
389 map to static tables, for example), allows you to dynamically decide on
390 the fly based on the function name, and even allows you to have the JIT
391 compile functions lazily the first time they're called.
393 One interesting application of this is that we can now extend the
language by writing arbitrary C code to implement operations. For
example:
399 /* putchard - putchar that takes a double and returns 0. */
double putchard(double X) {
  putchar((char)X);
  return 0;
}
406 Now we can produce simple output to the console by using things like:
407 "``extern putchard(x); putchard(120);``", which prints a lowercase 'x'
408 on the console (120 is the ASCII code for 'x'). Similar code could be
used to implement file I/O, console input, and many other capabilities
in Kaleidoscope.
412 This completes the JIT and optimizer chapter of the Kaleidoscope
413 tutorial. At this point, we can compile a non-Turing-complete
414 programming language, optimize and JIT compile it in a user-driven way.
415 Next up we'll look into `extending the language with control flow
416 constructs <OCamlLangImpl5.html>`_, tackling some interesting LLVM IR
417 issues along the way.
422 Here is the complete code listing for our running example, enhanced with
423 the LLVM JIT and optimizer. To build this example, use:
437 <{lexer,parser}.ml>: use_camlp4, pp(camlp4of)
438 <*.{byte,native}>: g++, use_llvm, use_llvm_analysis
439 <*.{byte,native}>: use_llvm_executionengine, use_llvm_target
440 <*.{byte,native}>: use_llvm_scalar_opts, use_bindings
443 .. code-block:: ocaml
445 open Ocamlbuild_plugin;;
447 ocaml_lib ~extern:true "llvm";;
448 ocaml_lib ~extern:true "llvm_analysis";;
449 ocaml_lib ~extern:true "llvm_executionengine";;
450 ocaml_lib ~extern:true "llvm_target";;
451 ocaml_lib ~extern:true "llvm_scalar_opts";;
453 flag ["link"; "ocaml"; "g++"] (S[A"-cc"; A"g++"]);;
454 dep ["link"; "ocaml"; "use_bindings"] ["bindings.o"];;
457 .. code-block:: ocaml
459 (*===----------------------------------------------------------------------===
461 *===----------------------------------------------------------------------===*)
463 (* The lexer returns these 'Kwd' if it is an unknown character, otherwise one of
464 * these others for known things. *)
470 | Ident of string | Number of float
476 .. code-block:: ocaml
478 (*===----------------------------------------------------------------------===
480 *===----------------------------------------------------------------------===*)
483 (* Skip any whitespace. *)
484 | [< ' (' ' | '\n' | '\r' | '\t'); stream >] -> lex stream
486 (* identifier: [a-zA-Z][a-zA-Z0-9] *)
487 | [< ' ('A' .. 'Z' | 'a' .. 'z' as c); stream >] ->
488 let buffer = Buffer.create 1 in
489 Buffer.add_char buffer c;
490 lex_ident buffer stream
492 (* number: [0-9.]+ *)
493 | [< ' ('0' .. '9' as c); stream >] ->
494 let buffer = Buffer.create 1 in
495 Buffer.add_char buffer c;
496 lex_number buffer stream
498 (* Comment until end of line. *)
499 | [< ' ('#'); stream >] ->
502 (* Otherwise, just return the character as its ascii value. *)
503 | [< 'c; stream >] ->
504 [< 'Token.Kwd c; lex stream >]
509 and lex_number buffer = parser
510 | [< ' ('0' .. '9' | '.' as c); stream >] ->
511 Buffer.add_char buffer c;
512 lex_number buffer stream
513 | [< stream=lex >] ->
514 [< 'Token.Number (float_of_string (Buffer.contents buffer)); stream >]
516 and lex_ident buffer = parser
517 | [< ' ('A' .. 'Z' | 'a' .. 'z' | '0' .. '9' as c); stream >] ->
518 Buffer.add_char buffer c;
519 lex_ident buffer stream
520 | [< stream=lex >] ->
521 match Buffer.contents buffer with
522 | "def" -> [< 'Token.Def; stream >]
523 | "extern" -> [< 'Token.Extern; stream >]
524 | id -> [< 'Token.Ident id; stream >]
526 and lex_comment = parser
527 | [< ' ('\n'); stream=lex >] -> stream
528 | [< 'c; e=lex_comment >] -> e
532 .. code-block:: ocaml
534 (*===----------------------------------------------------------------------===
535 * Abstract Syntax Tree (aka Parse Tree)
536 *===----------------------------------------------------------------------===*)
538 (* expr - Base type for all expression nodes. *)
540 (* variant for numeric literals like "1.0". *)
543 (* variant for referencing a variable, like "a". *)
546 (* variant for a binary operator. *)
547 | Binary of char * expr * expr
549 (* variant for function calls. *)
550 | Call of string * expr array
552 (* proto - This type represents the "prototype" for a function, which captures
553 * its name, and its argument names (thus implicitly the number of arguments the
554 * function takes). *)
555 type proto = Prototype of string * string array
557 (* func - This type represents a function definition itself. *)
558 type func = Function of proto * expr
561 .. code-block:: ocaml
563 (*===---------------------------------------------------------------------===
565 *===---------------------------------------------------------------------===*)
567 (* binop_precedence - This holds the precedence for each binary operator that is
569 let binop_precedence:(char, int) Hashtbl.t = Hashtbl.create 10
571 (* precedence - Get the precedence of the pending binary operator token. *)
572 let precedence c = try Hashtbl.find binop_precedence c with Not_found -> -1
578 let rec parse_primary = parser
579 (* numberexpr ::= number *)
580 | [< 'Token.Number n >] -> Ast.Number n
582 (* parenexpr ::= '(' expression ')' *)
583 | [< 'Token.Kwd '('; e=parse_expr; 'Token.Kwd ')' ?? "expected ')'" >] -> e
587 * ::= identifier '(' argumentexpr ')' *)
588 | [< 'Token.Ident id; stream >] ->
589 let rec parse_args accumulator = parser
590 | [< e=parse_expr; stream >] ->
592 | [< 'Token.Kwd ','; e=parse_args (e :: accumulator) >] -> e
593 | [< >] -> e :: accumulator
595 | [< >] -> accumulator
597 let rec parse_ident id = parser
601 'Token.Kwd ')' ?? "expected ')'">] ->
602 Ast.Call (id, Array.of_list (List.rev args))
604 (* Simple variable ref. *)
605 | [< >] -> Ast.Variable id
607 parse_ident id stream
609 | [< >] -> raise (Stream.Error "unknown token when expecting an expression.")
612 * ::= ('+' primary)* *)
613 and parse_bin_rhs expr_prec lhs stream =
614 match Stream.peek stream with
615 (* If this is a binop, find its precedence. *)
616 | Some (Token.Kwd c) when Hashtbl.mem binop_precedence c ->
617 let token_prec = precedence c in
619 (* If this is a binop that binds at least as tightly as the current binop,
620 * consume it, otherwise we are done. *)
621 if token_prec < expr_prec then lhs else begin
625 (* Parse the primary expression after the binary operator. *)
626 let rhs = parse_primary stream in
628 (* Okay, we know this is a binop. *)
630 match Stream.peek stream with
631 | Some (Token.Kwd c2) ->
632 (* If BinOp binds less tightly with rhs than the operator after
633 * rhs, let the pending operator take rhs as its lhs. *)
634 let next_prec = precedence c2 in
635 if token_prec < next_prec
636 then parse_bin_rhs (token_prec + 1) rhs stream
642 let lhs = Ast.Binary (c, lhs, rhs) in
643 parse_bin_rhs expr_prec lhs stream
648 * ::= primary binoprhs *)
649 and parse_expr = parser
650 | [< lhs=parse_primary; stream >] -> parse_bin_rhs 0 lhs stream
653 * ::= id '(' id* ')' *)
654 let parse_prototype =
655 let rec parse_args accumulator = parser
656 | [< 'Token.Ident id; e=parse_args (id::accumulator) >] -> e
657 | [< >] -> accumulator
661 | [< 'Token.Ident id;
662 'Token.Kwd '(' ?? "expected '(' in prototype";
664 'Token.Kwd ')' ?? "expected ')' in prototype" >] ->
666 Ast.Prototype (id, Array.of_list (List.rev args))
669 raise (Stream.Error "expected function name in prototype")
671 (* definition ::= 'def' prototype expression *)
672 let parse_definition = parser
673 | [< 'Token.Def; p=parse_prototype; e=parse_expr >] ->
676 (* toplevelexpr ::= expression *)
677 let parse_toplevel = parser
678 | [< e=parse_expr >] ->
679 (* Make an anonymous proto. *)
680 Ast.Function (Ast.Prototype ("", [||]), e)
682 (* external ::= 'extern' prototype *)
683 let parse_extern = parser
684 | [< 'Token.Extern; e=parse_prototype >] -> e
687 .. code-block:: ocaml
689 (*===----------------------------------------------------------------------===
691 *===----------------------------------------------------------------------===*)
695 exception Error of string
697 let context = global_context ()
698 let the_module = create_module context "my cool jit"
699 let builder = builder context
700 let named_values:(string, llvalue) Hashtbl.t = Hashtbl.create 10
701 let double_type = double_type context
703 let rec codegen_expr = function
704 | Ast.Number n -> const_float double_type n
705 | Ast.Variable name ->
706 (try Hashtbl.find named_values name with
707 | Not_found -> raise (Error "unknown variable name"))
708 | Ast.Binary (op, lhs, rhs) ->
709 let lhs_val = codegen_expr lhs in
710 let rhs_val = codegen_expr rhs in
713 | '+' -> build_add lhs_val rhs_val "addtmp" builder
714 | '-' -> build_sub lhs_val rhs_val "subtmp" builder
715 | '*' -> build_mul lhs_val rhs_val "multmp" builder
717 (* Convert bool 0/1 to double 0.0 or 1.0 *)
718 let i = build_fcmp Fcmp.Ult lhs_val rhs_val "cmptmp" builder in
719 build_uitofp i double_type "booltmp" builder
720 | _ -> raise (Error "invalid binary operator")
722 | Ast.Call (callee, args) ->
723 (* Look up the name in the module table. *)
725 match lookup_function callee the_module with
726 | Some callee -> callee
727 | None -> raise (Error "unknown function referenced")
729 let params = params callee in
731 (* If argument mismatch error. *)
732 if Array.length params == Array.length args then () else
733 raise (Error "incorrect # arguments passed");
734 let args = Array.map codegen_expr args in
735 build_call callee args "calltmp" builder
737 let codegen_proto = function
738 | Ast.Prototype (name, args) ->
739 (* Make the function type: double(double,double) etc. *)
740 let doubles = Array.make (Array.length args) double_type in
741 let ft = function_type double_type doubles in
743 match lookup_function name the_module with
744 | None -> declare_function name ft the_module
746 (* If 'f' conflicted, there was already something named 'name'. If it
747 * has a body, don't allow redefinition or reextern. *)
749 (* If 'f' already has a body, reject this. *)
750 if block_begin f <> At_end f then
751 raise (Error "redefinition of function");
753 (* If 'f' took a different number of arguments, reject. *)
754 if element_type (type_of f) <> ft then
755 raise (Error "redefinition of function with different # args");
759 (* Set names for all arguments. *)
760 Array.iteri (fun i a ->
763 Hashtbl.add named_values n a;
767 let codegen_func the_fpm = function
768 | Ast.Function (proto, body) ->
769 Hashtbl.clear named_values;
770 let the_function = codegen_proto proto in
772 (* Create a new basic block to start insertion into. *)
773 let bb = append_block context "entry" the_function in
774 position_at_end bb builder;
777 let ret_val = codegen_expr body in
779 (* Finish off the function. *)
780 let _ = build_ret ret_val builder in
782 (* Validate the generated code, checking for consistency. *)
783 Llvm_analysis.assert_valid_function the_function;
785 (* Optimize the function. *)
786 let _ = PassManager.run_function the_function the_fpm in
790 delete_function the_function;
794 .. code-block:: ocaml
796 (*===----------------------------------------------------------------------===
797 * Top-Level parsing and JIT Driver
798 *===----------------------------------------------------------------------===*)
801 open Llvm_executionengine
803 (* top ::= definition | external | expression | ';' *)
804 let rec main_loop the_fpm the_execution_engine stream =
805 match Stream.peek stream with
808 (* ignore top-level semicolons. *)
809 | Some (Token.Kwd ';') ->
811 main_loop the_fpm the_execution_engine stream
817 let e = Parser.parse_definition stream in
818 print_endline "parsed a function definition.";
819 dump_value (Codegen.codegen_func the_fpm e);
821 let e = Parser.parse_extern stream in
822 print_endline "parsed an extern.";
823 dump_value (Codegen.codegen_proto e);
825 (* Evaluate a top-level expression into an anonymous function. *)
826 let e = Parser.parse_toplevel stream in
827 print_endline "parsed a top-level expr";
828 let the_function = Codegen.codegen_func the_fpm e in
829 dump_value the_function;
831 (* JIT the function, returning a function pointer. *)
832 let result = ExecutionEngine.run_function the_function [||]
833 the_execution_engine in
835 print_string "Evaluated to ";
836 print_float (GenericValue.as_float Codegen.double_type result);
838 with Stream.Error s | Codegen.Error s ->
839 (* Skip token for error recovery. *)
843 print_string "ready> "; flush stdout;
844 main_loop the_fpm the_execution_engine stream
847 .. code-block:: ocaml
849 (*===----------------------------------------------------------------------===
851 *===----------------------------------------------------------------------===*)
854 open Llvm_executionengine
856 open Llvm_scalar_opts
859 ignore (initialize_native_target ());
861 (* Install standard binary operators.
862 * 1 is the lowest precedence. *)
863 Hashtbl.add Parser.binop_precedence '<' 10;
864 Hashtbl.add Parser.binop_precedence '+' 20;
865 Hashtbl.add Parser.binop_precedence '-' 20;
866 Hashtbl.add Parser.binop_precedence '*' 40; (* highest. *)
868 (* Prime the first token. *)
869 print_string "ready> "; flush stdout;
870 let stream = Lexer.lex (Stream.of_channel stdin) in
872 (* Create the JIT. *)
873 let the_execution_engine = ExecutionEngine.create Codegen.the_module in
874 let the_fpm = PassManager.create_function Codegen.the_module in
876 (* Set up the optimizer pipeline. Start with registering info about how the
877 * target lays out data structures. *)
878 DataLayout.add (ExecutionEngine.target_data the_execution_engine) the_fpm;
880 (* Do simple "peephole" optimizations and bit-twiddling optzn. *)
881 add_instruction_combination the_fpm;
883 (* reassociate expressions. *)
884 add_reassociation the_fpm;
(* Eliminate Common SubExpressions. *)
add_gvn the_fpm;
889 (* Simplify the control flow graph (deleting unreachable blocks, etc). *)
890 add_cfg_simplification the_fpm;
892 ignore (PassManager.initialize the_fpm);
894 (* Run the main "interpreter loop" now. *)
895 Toplevel.main_loop the_fpm the_execution_engine stream;
897 (* Print out all the generated code. *)
898 dump_module Codegen.the_module
908 /* putchard - putchar that takes a double and returns 0. */
extern double putchard(double X) {
  putchar((char)X);
  return 0;
}
914 `Next: Extending the language: control flow <OCamlLangImpl5.html>`_