doc/compiler_intro.txt

   1 = Introduction to the Rubinius compiler
   2
   3 This document presents the sequence of transformations that it takes
   4 to compile a very simple Ruby source file into Rubinius bytecode (or,
   5 to be specific, intcode) for the purposes of understanding the basic
   6 operation of the compiler component of the VM. Even source files with
   7 more complex syntax will follow the same general processing pattern.
   8 The compiler is conceptually very straightforward and elegant.
   9
  10 I use a very liberal pseudocode syntax for most of the explanation
  11 just to keep things simple. For any clarifications, you can check the
  12 source. Generally any data structures shown represent what is being
  13 passed to the next step rather than the state in the explanation just
  14 above it: for example in the first stage, the sexp shown is always
  15 essentially the "argument" to the next method or section.
  16
  17 == Stage 0: Input
  18
  19 The sample input is in test.rb and looks like this:
  20
  21   def foo()
  22     puts "foo"
  23   end
  24
  25 Rubinius extends File with the method #to_sexp, which has the
  26 parser produce a nested set of Arrays with the necessary info
  27 to be able to interpret (or in our case, compile) the program. How
  28 the parser works is outside the scope of this treatise but this is
  29 its output, hereafter referred to as the *sexp* (S-expression):
  30
  31   [:newline, 1, "test.rb",
  32     [:defn, :foo,
  33       [:scope,
  34         [:block, [:args],
  35           [:newline, 2, "test.rb",
  36             [:fcall, :puts, [:array, [:str, "foo"]]]
  37           ]
  38         ],
  39         []
  40       ]
  41     ]
  42   ]
  43
  44 If you compare the sexp to the code snippet above, it is pretty simple.
  45 +:defn+ is clearly the method definition, +:scope+ and +:block+ contain
  46 the code within it and that contained code is the +:fcall+. The
  47 +:newline+ nodes are just that, line information. This all will become
  48 much clearer as we go along.
  49
  50 The next step is to create a new Compiler instance (nothing fancy there)
  51 and get to work translating.
  52
  53 == Stage 1: Sexp to AST
  54
  55 === 1.1 Descending into the sexp
  56
  57 The sexp is first sent to <tt>Compiler#into_script</tt> (compiler.rb) which
  58 will eventually return some type of a Node object for further processing.
  59 Nodes represent nodes in the resulting AST (Abstract Syntax Tree).
  60
  61 General notes:
  62
  63 * compiler.rb implements the compiler infrastructure as well as the basic
  64   Node class and its default operations. Subclasses of Node reside in
  65   nodes.rb and override methods as necessary.
  66 * Node.kind allows node types to register themselves into a mapping of sexp
  67   node name => node class; this is <tt>Compiler::Node::Mapping</tt>. For
  68   example, a method definition appears as +:defn+ in the sexp and that is
  69   mapped to the Define node, although usually the names are the same except
  70   that the node class name is of course capitalised.
  71 * Error handling is present all along the way but I ignore it because
  72   there is nothing fancy there. Any errors here are unrecoverable.
  73
  74 Compiler#into_script wraps the plain sexp inside a +:script+ and sends
  75 it to Compiler#convert_sexp.
  76
  77   [:script,
  78     [:newline, 1, "test.rb",
  79       [:defn, :foo,
  80         [:scope,
  81           [:block, [:args],
  82             [:newline, 2, "test.rb",
  83               [:fcall, :puts, [:array, [:str, "foo"]]]
  84             ]
  85           ],
  86           []
  87         ]
  88       ]
  89     ]
  90   ]
  91
  92 We will run into #convert_sexp pretty much every single step of the way.
  93 It simply looks up the correct node type from the mapping by using the
  94 first element in the sexp that it receives and then calls the #create
  95 method on the class with the sexp and a reference to the compiler.
  96
  97 So <tt>c.convert_sexp sexp</tt> -> <tt>Script.create c, sexp</tt>
  98
  99   [:script,
 100     [:newline, 1, "test.rb",
 101       [:defn, :foo,
 102         [:scope,
 103           [:block, [:args],
 104             [:newline, 2, "test.rb",
 105               [:fcall, :puts, [:array, [:str, "foo"]]]
 106             ]
 107           ],
 108           []
 109         ]
 110       ]
 111     ]
 112   ]
 113
 114 None of the Node subclasses implement #create themselves so we are still
 115 in compiler.rb. .create shifts away the now-unnecessary first element of
 116 the sexp and then creates a new instance of the node subclass.
 117
 118 The following step is the main function of this compilation stage. We ask
 119 the Node subclass instance to #consume the remaining sexp which produces
 120 an output. By default #consume will simply recursively process all
 121 nested Arrays (sexps) and leave any non-Array elements alone but this
 122 behaviour is overridden by some subclasses (as we will see shortly) to
 123 make the most important processing happen.
 124
 125 The output is then sent to the #args method of the node instance or,
 126 if the node type needs argument normalisation (to reconcile all variants
 127 of a particular sexp type), to #normalize instead. Neither of these
 128 have a default implementation in Node so look under the specific class
 129 in nodes.rb.
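
The lookup, .create and default #consume steps described above can be
sketched in a few lines of Ruby. This is a toy model written for this
explanation, not the Rubinius source: ToyCompiler, the dup of the sexp
and the exact method bodies are assumptions, and only the overall shape
(mapping lookup, shift, consume, args) follows the text.

```ruby
# Toy model of the dispatch described above; NOT the Rubinius source.
class ToyCompiler
  class Node
    Mapping = {}                        # sexp node name => node class

    # Register a subclass under its sexp name.
    def self.kind(name)
      Mapping[name] = self
    end

    # Strip the node type, consume the rest, hand the result to #args.
    def self.create(compiler, sexp)
      sexp = sexp.dup                   # assumption: leave the caller's sexp intact
      sexp.shift
      node = new(compiler)
      node.args(*node.consume(sexp))
      node
    end

    def initialize(compiler)
      @compiler = compiler
    end

    def convert(sexp)
      @compiler.convert_sexp(sexp)
    end

    # Default processing: convert nested Arrays, leave the rest alone.
    def consume(sexp)
      sexp.map { |x| x.is_a?(Array) ? convert(x) : x }
    end
  end

  class StringLiteral < Node
    kind :str
    attr_reader :string

    def args(str)
      @string = str
    end
  end

  class ArrayLiteral < Node
    kind :array
    attr_reader :body

    def args(*body)
      @body = body
    end
  end

  # Pick the node class by the first element of the sexp.
  def convert_sexp(sexp)
    Node::Mapping.fetch(sexp.first).create(self, sexp)
  end
end

ast = ToyCompiler.new.convert_sexp([:array, [:str, "foo"]])
```

Here ast is an ArrayLiteral whose body holds a single StringLiteral,
which is the same recursion the rest of this stage walks through by hand.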
 130
 131 In our case, Script.create c, sexp, we see the first slight modification
 132 in creating the Script instance. Script < ClosedScope, and the purpose of
 133 all ClosedScopes is to represent visibility scopes in the code. The main
 134 part here is that a new LocalScope object is created (you can see locals.rb
 135 but we will get back to this later.)
 136
 137 Next, we step into Script#consume (or ClosedScope#consume to be
 138 specific) with our newly stripped sexp.
 139
 140   # Script.create c, sexp
 141   #   Script.new   (creates a new LocalScope)
 142   #   s.consume sexp
 143   #   args *result
 144   [
 145     [:newline, 1, "test.rb",
 146       [:defn, :foo,
 147         [:scope,
 148           [:block, [:args],
 149             [:newline, 2, "test.rb",
 150               [:fcall, :puts, [:array, [:str, "foo"]]]
 151             ]
 152           ],
 153           []
 154         ]
 155       ]
 156     ]
 157   ]
 158
 159 ClosedScope#consume assumes that it is getting a single-element
 160 Array containing the rest of the sexp for this particular scope (in
 161 our case the entire script.)
 162
 163 It has a couple other important duties too, though: firstly, it
 164 uses the compiler's #set method to set the current scope as itself
 165 as well as the default visibility as public. #set and #get are
 166 the compiler's way to change and describe the current compilation
 167 state and various auxiliary data.
 168
 169 Secondly, the scope will #formalize! all contained scopes. We
 170 will return to this later, but formalization is the process of
 171 reserving stack space and assigning indexes to any local variables
 172 or arguments.
 173
 174   # Script#consume        (nodes.rb)
 175   #   super to ClosedScope#consume
 176   #   out = convert sexp.first (which is a Newline)
 177   #   formalize all scopes
 178   [:newline, 1, "test.rb",
 179     [:defn, :foo,
 180       [:scope,
 181         [:block, [:args],
 182           [:newline, 2, "test.rb",
 183             [:fcall, :puts, [:array, [:str, "foo"]]]
 184           ]
 185         ],
 186         []
 187       ]
 188     ]
 189   ]
 190
 191 The Node#convert call you see there just calls Compiler#convert_sexp
 192 which, as you recall, is where this whole thing got started. This will be
 193 a recurring pattern.
 194
 195 Through the normal route, we get to Newline.create which in turn
 196 uses default processing to #consume the rest of the sexp (the first
 197 element is again stripped after the node type lookup.)
 198
 199   # Compiler#convert_sexp -> Newline.create
 200   #   Newline#consume sexp.shift
 201   [1, "test.rb",
 202     [:defn, :foo,
 203       [:scope,
 204         [:block, [:args],
 205           [:newline, 2, "test.rb",
 206             [:fcall, :puts, [:array, [:str, "foo"]]]
 207           ]
 208         ],
 209         []
 210       ]
 211     ]
 212   ]
 213
 214 Newline#consume first registers the current file and line
 215 information (the first two elements) with the compiler to help
 216 pinpoint compilation errors and then falls back on default
 217 processing from Node#consume.
 218
 219 The process here is simple: loop through the sexp and push
 220 elements to output. All nested sexps are recursively #converted
 221 first.
 222
 223 So here the eventual output to be returned to .create is
 224 just a three-element Array: [1, 'test.rb', <Define node>].
 225
 226 As we can see, the +:defn+ node will need processing so we
 227 move on to that.
 228
 229   # convert [:defn
 230   [:defn, :foo,
 231     [:scope,
 232       [:block, [:args],
 233         [:newline, 2, "test.rb",
 234           [:fcall, :puts, [:array, [:str, "foo"]]]
 235         ]
 236       ],
 237       []
 238     ]
 239   ]
 240
 241 (I will start abbreviating some common steps here.)
 242
 243   # n.convert -> c.convert_sexp -> Define.create
 244   #   Define.new.consume sexp.shift
 245   [:foo,
 246     [:scope,
 247       [:block, [:args],
 248         [:newline, 2, "test.rb",
 249           [:fcall, :puts, [:array, [:str, "foo"]]]
 250         ]
 251       ],
 252       []
 253     ]
 254   ]
 255
 256 Define is also a ClosedScope but it will significantly augment
 257 the normal #consume from the latter. First we separate the name
 258 and the body, after which the body gets supered to ClosedScope#consume
 259 wrapped inside a dummy Array. After getting the body back, we will
 260 massage it a bit to suit our needs but this will be done in part 2.
 261 The eventual output here will be [name, <scope>, <args>].
 262
 263   # Define#consume -> ClosedScope#consume [body] -> convert body.first
 264   [:scope,
 265     [:block, [:args],
 266       [:newline, 2, "test.rb",
 267         [:fcall, :puts, [:array, [:str, "foo"]]]
 268       ]
 269     ],
 270     []
 271   ]
 272
 273 Scope nodes do not manage the scopes themselves; they are merely
 274 abstractions within the actual visibility scopes. Scope#consume
 275 assumes it is getting a two-element sexp of which only the first
 276 needs further processing (the second contains local variable names.)
 277
 278   # n.convert -> c.convert_sexp -> Scope.create -> Scope#consume
 279   [
 280     [:block, [:args],
 281       [:newline, 2, "test.rb",
 282         [:fcall, :puts, [:array, [:str, "foo"]]]
 283       ]
 284     ],
 285     []
 286   ]
 287
 288   # Scope#consume
 289   #   sexp[0] = convert sexp[0]
 290   [:block,
 291     [:args],
 292     [:newline, 2, "test.rb",
 293       [:fcall, :puts, [:array, [:str, "foo"]]]
 294     ]
 295   ]
 296
 297 Block, then, typically encapsulates any type of block of code,
 298 not a lambda block. It is used in method definitions, if-expressions
 299 and so on.
 300
 301   # c.convert_sexp -> Block.create
 302   #   Block.new.consume sexp.shift
 303   [
 304     [:args],
 305     [:newline, 2, "test.rb",
 306       [:fcall, :puts, [:array, [:str, "foo"]]]
 307     ]
 308   ]
 309
 310 This is the first time we encounter a sexp with two separate nested
 311 sexps. Both will be processed separately and the output will be just
 312 [<arg node>, <newline node>].
 313
 314   # Block#consume -> Node#consume
 315   #   Loop through sexp, recursively convert nested sexps
 316   #   convert [:args
 317   [:args]
 318   #   convert [:newline
 319   [:newline, 2, "test.rb",
 320     [:fcall, :puts, [:array, [:str, "foo"]]]
 321   ]
 322
 323 Stepping through Arguments first.
 324
 325   # c.convert_sexp -> Arguments.create
 326   #   Arguments.new.consume sexp.shift
 327   []
 328
 329 Since we have no arguments, there is no work to be done. However, to
 330 make things consistent, Arguments#consume will return a structure
 331 representing no args of any kind, [[], [], nil, nil] (the fields are
 332 required arg names, optional arg names, splat arg name and default arg
 333 computations. See the source in nodes.rb for more info.)
 334
 335 The second node to process is the newline, one of which we have seen
 336 already. Nothing new here.
 337
 338   # c.convert_sexp -> Newline.create
 339   #   Newline.new.consume sexp.shift
 340   [2, "test.rb",
 341     [:fcall, :puts, [:array, [:str, "foo"]]]
 342   ]
 343
 344   # Newline#consume
 345   #  The line and file are passed to Compiler#set_position
 346   #  super sexp -> Node#consume sexp
 347   #    Loop through sexp, recursively convert nested sexps
 348   #      convert [:fcall
 349   [:fcall, :puts, [:array, [:str, "foo"]]]
 350
 351 FCall is a bit more interesting. It represents a method call without
 352 an explicit receiver ("functional" style) and it is the first node that
 353 must be #normalized. FCall < Call < MethodCall < Node.
 354
 355   # c.convert_sexp -> FCall.create
 356   #   FCall.new
 357   #     MethodCall does a bit of extra work here by storing
 358   #     the scope type (it was #set before, if you recall.)
 359   #     In our case the scope type is Script. We also start
 360   #     out assuming no block is involved with this call.
 361   #
 362   #   f.consume sexp.shift
 363   [:puts, [:array, [:str, "foo"]]]
 364
 365   # FCall#consume -> Node#consume
 366   #   Loop through sexp, recursively convert nested sexps
 367   #   convert [:array
 368   [[:array, [:str, "foo"]]]
 369
 370   # c.convert_sexp -> ArrayLiteral.create
 371   #   ArrayLiteral.new.consume sexp.shift
 372   [[:str, "foo"]]
 373
 374   # ArrayLiteral#consume -> Node#consume
 375   #   Loop through sexp, recursively convert nested sexps
 376   #   convert [:str
 377   [:str, "foo"]
 378
 379   # c.convert_sexp -> StringLiteral.create
 380   #   StringLiteral.new.consume sexp.shift
 381   ["foo"]
 382
 383   # StringLiteral#consume -> Node#consume
 384   #   Loop through sexp, recursively convert nested sexps
 385   #   (Nothing to do, just "foo" remains.)
 386
 387 At this point, we are at the end of the tree. There is nothing
 388 more to parse and we are ready for the next phase.
 389
 390 === 1.2 Creating the AST
 391
 392 Now we traverse back up the sexp, pushing the generated Nodes
 393 upwards as we go. As you remember, the second part of each
 394 .create was sending the output of #consume to #args or
 395 #normalize. In the last step, we ended up with StringLiteral
 396 and that is where we will pick up.
 397
 398 The output of a default Node#consume is basically just the original
 399 sexp but with any nested sexps also converted (this may not hold
 400 true for subclasses.) In the case of StringLiteral, there were
 401 no nested expressions, just the actual literal. (<- is used to
 402 denote returning.)
 403
 404 StringLiteral#consume returns ["foo"] and execution picks
 405 back up in StringLiteral.create where the returned value is
 406 passed into StringLiteral#args. All #args does here is
 407 store the string within the StringLiteral node. After this,
 408 StringLiteral.create is done, returning the node object.
 409
 410 The execution then unwinds from StringLiteral.create to
 411 Compiler#convert_sexp to <Node class>#convert and ends up
 412 in the #consume of the parent. In the case of StringLiteral,
 413 we end up in ArrayLiteral#consume.
 414
 415 This sequence will largely be the same for all cases so you will
 416 see it abbreviated like this:
 417
 418   # StringLiteral#consume
 419   #   <- StringLiteral.create
 420   #     StringLiteral#args return
 421   #       string = 'foo'
 422   #     <- c.convert_sexp
 423   #       <- n.convert
 424   #         <- ArrayLiteral#consume
 425
 426   StringLiteral
 427   .string = 'foo'
 428
 429 So the StringLiteral goes back to ArrayLiteral#consume
 430 which then sends it to ArrayLiteral#args:
 431
 432   # ArrayLiteral#consume
 433   #   <- ArrayLiteral.create
 434   #     ArrayLiteral#args
 435   #       body =
 436
 437   ArrayLiteral
 438   .body = [StringLiteral
 439            .string = 'foo']
 440
 441   # Return to FCall#consume
 442
 443 FCall#consume yields its process results back to .create
 444 and, because FCall does implement #normalize, the output
 445 is sent there rather than #args. #collapse_args will just
 446 strip away the unnecessary ArrayLiteral and give us an
 447 internal Array containing the ArrayLiteral contents instead.
 448 FCall also checks #detect_special_form to be able to
 449 translate pseudo-calls like +:ivar_as_index+ but we have
 450 none in our little script.
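
As a rough illustration of the #normalize and #collapse_args step, here
is a minimal sketch. The Struct stand-ins and the method bodies are
assumptions made for this example (the real code lives in nodes.rb), and
the attribute is called method_name here only to avoid shadowing
Object#method; the text calls the field .method.

```ruby
# Hypothetical stand-ins for the node classes; only the shape matters.
ArrayLiteral  = Struct.new(:body)
StringLiteral = Struct.new(:string)

class FCall
  attr_reader :method_name, :arguments

  # Receives the output of #consume, here [:puts, anArrayLiteral].
  def normalize(meth, args = nil)
    @method_name = meth
    @arguments = args
    collapse_args
  end

  # Replace the ArrayLiteral wrapper with a plain Array of its contents.
  def collapse_args
    @arguments = @arguments.body if @arguments.is_a?(ArrayLiteral)
  end
end

fc = FCall.new
fc.normalize(:puts, ArrayLiteral.new([StringLiteral.new("foo")]))
```

After the call, fc.arguments is a plain one-element Array holding the
StringLiteral, matching the second structure shown below.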
 451
 452   # FCall#consume
 453   #   out = [:puts, anArrayLiteral]
 454   #   <- FCall.create
 455   #     FCall#normalize
 456   #       method, arguments = :puts, al
 457   #       collapse_args -> Call#collapse_args
 458
 459   FCall
 460   .method = :puts
 461   .arguments = ArrayLiteral
 462                 ...
 463
 464   # collapse_args
 465
 466   FCall
 467   .method = :puts
 468   .arguments = [StringLiteral
 469                 .string = 'foo']
 470
 471   # Return to Newline#consume
 472
 473 Nothing special in Newline, the processed nodes just get
 474 stored. Notably we do not even clear the file, line info
 475 that was set earlier.
 476
 477   # Newline#consume
 478   #   out = [2, 'test.rb', fc]
 479   #   <- Newline.create
 480   #     We do NOT clear the compiler file, line positions
 481   #     Newline#args
 482   #     file, line, child = 'test.rb', 2, fc
 483
 484   Newline
 485   .file = 'test.rb'
 486   .line = 2
 487   .child = FCall
 488            .method = :puts
 489            .arguments = [StringLiteral
 490                          .string = 'foo']
 491
 492 There were two forks in Block#consume so we complete the other
 493 one, Arguments, before going further up the chain. We have
 494 no arguments so it is just basic assignment again. (If we did have
 495 args, Arguments#populate would handle properly associating
 496 the variable names to Local objects.)
 497
 498   # Arguments#consume
 499   #   out = [[], [], nil, nil]
 500   #   <- Arguments.create
 501   #     Arguments#args
 502   #       required, optional, splat, defaults = [], [], nil, nil
 503   #       block_arg = nil   (may be set later)
 504   #     Skip Arguments#populate
 505
 506   Arguments
 507   .required = []
 508   .optional = []
 509   .splat = nil
 510   .defaults = nil
 511   .block_arg = nil
 512
 513 Alright, now we have both trees back and can finish up with
 514 the Block which actually just stores both the args *and*
 515 the body in a single Array.
 516
 517   # Block#consume
 518   #   out = [args, nl1]
 519   #   <- Block.create
 520   #     Block#args
 521
 522   Block
 523   .body = [Arguments
 524            .required = []
 525            .optional = []
 526            .splat = nil
 527            .defaults = nil
 528            .block_arg = nil
 529           ,
 530            Newline
 531            .file = 'test.rb'
 532            .line = 2
 533            .child = FCall
 534                     .method = :puts
 535                     .arguments = [StringLiteral
 536                                   .string = 'foo']
 537           ]
 538
 539 Back up we go to Scope#consume. As you remember, this only
 540 manipulated sexp[0] (which is our Block), so:
 541
 542   # Scope#consume
 543   #   out = [Block ..., []]
 544   #   <- Scope.create
 545
 546   Scope
 547   .locals = []
 548   .block = Block
 554            .body = [Arguments
 555                     .required = []
 556                     .optional = []
 557                     .splat = nil
 558                     .defaults = nil
 559                     .block_arg = nil
 560                    ,
 561                     Newline
 562                     .file = 'test.rb'
 563                     .line = 2
 564                     .child = FCall
 565                              .method = :puts
 566                              .arguments = [StringLiteral
 567                                            .string = 'foo']
 568                    ]
 569
 570 Return gets us to ClosedScope#consume because Define#consume
 571 supered to it for its body part.
 572
 573 ClosedScope#consume is done with its subnodes so it will #formalize!
 574 all scopes. This deals with assigning indexes for local variables, and
 575 possibly reserving stack space when necessary. We have no locals so
 576 we need not worry about it. The Scope from above is returned to
 577 Define#consume.
 578
 579 Define#consume extracts Scope.block.body back out again and then
 580 further splits it into args and the rest. If a block argument is
 581 detected at this point, it gets recorded in the args.
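
The splitting step can be pictured with the following sketch. The
Struct definitions and the split_define helper are hypothetical names
introduced for this example, and it assumes the Arguments node is the
first element of the block body, as it is in this walkthrough.

```ruby
# Hypothetical stand-ins for the real node classes.
Arguments = Struct.new(:required, :optional, :splat, :defaults, :block_arg)
Block     = Struct.new(:body)
Scope     = Struct.new(:block, :locals)

# Split scope.block.body into the args and the rest, then return the
# [name, <scope>, <args>] triple that Define#consume is said to produce.
def split_define(name, scope)
  args, *rest = scope.block.body
  scope.block.body = rest
  [name, scope, args]
end

scope = Scope.new(Block.new([Arguments.new([], [], nil, nil, nil), :newline_node]), [])
name, scope, args = split_define(:foo, scope)
```

The :newline_node symbol merely stands in for the Newline subtree; after
the call it is the only element left in scope.block.body.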
 582
 583   # Define#consume
 584   #   out = [:foo, <modified scope>, <args extracted from scope>]
 585   #   <- Define.create
 586   #     Define#args
 587
 588   # This...
 589
 590   Scope
 591   .locals = []
 592   .block = Block
 593            .body = [Arguments
 594                     .required = []
 595                     .optional = []
 596                     .splat = nil
 597                     .defaults = nil
 598                     .block_arg = nil
 599                    ,
 600                     Newline
 601                     .file = 'test.rb'
 602                     .line = 2
 603                     .child = FCall
 604                              .method = :puts
 605                              .arguments = [StringLiteral
 606                                            .string = 'foo']
 607                    ]
 608
 609   # Becomes this when we extract scope.block.args
 610   # And add the Define node
 611
 612   Define
 613   .name = :foo
 614   .body = Scope
 615           .block = Block
 616                    .body = [Newline
 617                             .file  = 'test.rb'
 618                             .line  = 2
 619                             .child = FCall
 620                                      .method    = :puts
 621                                      .arguments = [StringLiteral
 622                                                    .string = 'foo']]
 623           .locals = []
 624   .args = Arguments         # args are here now
 625           .required  = []
 626           .optional  = []
 627           .splat     = nil
 628           .defaults  = nil
 629           .block_arg = nil
 630
 631
 632 Newline#consume has nothing new to offer, we just store the
 633 positional information.
 634
 635   # Newline#consume
 636   #   out = [1, 'test.rb', meth]
 637   #   <- Newline.create
 638   #     Newline#args
 639
 640   Newline
 641   .file  = 'test.rb'
 642   .line  = 1
 643   .child = Define
 644            .name = :foo
 645            .body = Scope
 646                    .block = Block
 647                             .body = [Newline
 648                                      .file  = 'test.rb'
 649                                      .line  = 2
 650                                      .child = FCall
 651                                               .method    = :puts
 652                                               .arguments = [StringLiteral
 653                                                             .string = 'foo']]
 654                    .locals = []
 655            .args = Arguments
 656                    .required  = []
 657                    .optional  = []
 658                    .splat     = nil
 659                    .defaults  = nil
 660                    .block_arg = nil
 661
 662 Finally, we are back in Script#consume. Being a ClosedScope,
 663 it will also #formalize! its locals (and we still have none.)
 664
 665 Curiously, the "body" of the script is just the single newline
 666 but this is just because the top-level is only really concerned
 667 about the code it directly contains. If our original code had been:
 668
 669   def foo
 670     puts "foo"
 671   end
 672
 673   foo
 674
 675 Then the Script would have ended up with its body a Block
 676 that in turn contained two Newline objects (with line numbers
 677 1 and 5 respectively.)
 678
 679 Anyway, for now Script#args just stores the body.
 680
 681 That is it for the compilation phase! This is the complete
 682 resulting structure for our tiny program:
 683
 684   Script
 685   .body = Newline
 686           .file  = 'test.rb'
 687           .line  = 1
 688           .child = Define
 689                    .name = :foo
 690                    .body = Scope
 691                            .block = Block
 692                                     .body = [Newline
 693                                              .file  = 'test.rb'
 694                                              .line  = 2
 695                                              .child = FCall
 696                                                       .method    = :puts
 697                                                       .arguments = [StringLiteral
 698                                                                     .string = 'foo']]
 699                            .locals = []
 700                    .args = Arguments
 701                            .required  = []
 702                            .optional  = []
 703                            .splat     = nil
 704                            .defaults  = nil
 705                            .block_arg = nil
 706
 707 Now we return to Compiler#into_script.
 708
 709 == Stage 2: Bytecode generation
 710
 711 The benefit of having the nested Node structure over the original
 712 sexp may not be immediately obvious, but it is the ability of the
 713 Node objects to have behaviours, unlike the plain sexp. In other
 714 words instead of having an external entity manage it, the Nodes
 715 will generate their own code using the knowledge that they have
 716 accumulated. To separate AST generation from bytecode generation,
 717 Nodes originally from nodes.rb are reopened in bytecode.rb
 718 to add the #bytecode method (and some other stuff.) The VM
 719 works off a stack so the generated bytecode will be stack-based.
 720
 721 A word about that code: this stage is a bit ambiguous. Conceptually
 722 speaking the first step is to generate assembly code of sorts,
 723 essentially something that is still "symbolic code" and not just
 724 a string of ones and zeroes (that would be "machine code".) The
 725 benefit of the former is, of course, that it is easier for humans
 726 to read than the raw bytes. We could technically have the Nodes
 727 produce a string of instructions (and in fact that is just what
 728 TextGenerator in text.rb does) and then interpret that string
 729 into the *real* bytecode or machine code but this is a bit wasteful.
 730
 731 Rubinius opts for a sort-of middle ground by first having the Node
 732 objects generate a "stream" or sequence of instructions in the form
 733 of simpler Ruby objects which will then later be encoded into the raw
 734 bytecode format in stage 3.
 735
 736 As you have probably noticed, the Rubinius runtime model is based
 737 on methods. In fact, any script or snippet is essentially a method
 738 in its own right and the basic building blocks in the VM are
 739 CompiledMethods (more on which later) and MethodContexts (which
 740 are beyond the scope of this treatise). For this intermediate
 741 representation we naturally opt for the same approach.
 742
 743 MethodDescription will store our important data. The second and
 744 more important player is the Generator class from generator.rb
 745 which, despite its name, does not so much generate the code
 746 as act as a receptacle for it. In other words, the "syntax,"
 747 if you wish, is in Generator and the logic comes from the Nodes.
 748 The Encoder will also make an appearance, although its main
 749 function is not explored until the next stage.
 750
 751 A note about the pseudocode used in the description here: The
 752 generated code will be in a sort-of-assembly syntax and I will
 753 use assembly comments ; for any comments that could be imagined
 754 to be left there by the "writer." Ruby comments # are used
 755 for "meta-information." I will also introduce new pseudocode
 756 elements as they are needed so as to not confuse matters up
 757 front and will always explain when I do so. Hopefully this will
 758 help rather than hinder clarity.
 759
 760 Now.
 761
 762 If you still remember, the last stage ended with the topmost Node,
 763 a Script to be specific, being returned from the compiler. This is
 764 fortunate because only ClosedScopes implement the #to_description
 765 method that allows us to start this phase.
 766
 767   # Eventually, the topmost MethodDescription is returned.
 768   ast = compiler.into_script ...
 769   desc = ast.to_description
 770
 771 This will obviously send us into ClosedScope#to_description which
 772 you will find in bytecode.rb.
 773
 774 This is a bit of a tangled web. We start with three players. The
 775 Script (which we will start calling $script) sets up a new MethodDescription,
 776 which also provides a Generator ( <--- SUBJECT TO CHANGE ) that is
 777 exposed to $script. Then, $script asks the MethodDescription to
 778 run using $script as its argument. This, in turn, causes the
 779 Generator ($gen) to run, again using $script as its starting
 780 point, and as before the descent starts from here. Eventually, after
 781 this, the MethodDescription gathers #argument_info from the Node
 782 and returns. Because $script is sort of a pseudomethod, it will
 783 only have dummy values here. For now, let us proceed though.
 784
 785   desc.run $script -> $gen.run $script
 786
 787 Generator#run is an exceedingly simple matter: it merely requests
 788 the Node given to it to execute its #bytecode method, passing
 789 itself as a parameter in order to continue down the tree. So in
 790 our case, we head into Script#bytecode.
 791
 792 The #bytecode methods are the backbone of this entire operation,
 793 as mentioned earlier. Each Node will know the logic required to
 794 compile itself and this logic is expressed by the #bytecode methods
 795 calling various methods on the Generator object they are passed.
 796 The Generator, then, records those operations into its internal
 797 instruction stream.
 798
 799 Script#bytecode starts by generating a prelude (compiler term for
 800 work that is done to set up a method before code inside it starts
 801 executing) which simply reserves space for local variables within
 802 this scope. The space required has been determined in the earlier
 803 calls to ClosedScope#formalize! which was a part of the cycle
 804 for ClosedScope#consume. Since there were no locals at the toplevel,
 805 the size we need is 0.
 806
 807 To actually achieve this, we call $gen.allocate_stack which in
 808 turn just calls Generator#add with the argument pair +:allocate_stack+, 0.
 809 This is, essentially, the "assembly code."
 810
 811 Generator#add handles incrementing its instruction pointer to
 812 account for the new code in the stream (the ip always points to
 813 the last element in the stream) and then looks up the actual numeric
 814 opcode of our "assembly" instruction +:allocate_stack+. It happens
 815 to currently be 97 but may change at any time, so for clarity
 816 our pseudocode will not refer to it at all; it is just extra
 817 info kept in the stream by the Generator itself.
 818
 819 For our code, the instruction pointer (ip) is not that important
 820 but it is very much so when dealing with branching. After you are
 821 done with this, see (TODO: link to text about labels) for more info.
 822
 823 The prelude done, $script wants to have its body processed before
 824 proceeding with the rest of its code so that the instruction stream
 825 is in the correct order.
 826
 827   # $script.bytecode
 828   #   $gen.allocate_stack 0
 829   #     $gen.add :allocate_stack, 0
 830   #       Increment instruction pointer, look up opcode
 831   #   $script.body.bytecode
 832
 833   # "Assembly" stream
 834   ; No allocation needed
 835
 836   # We are starting back up here
 837   Script
 838   .body = Newline
 839           .file  = 'test.rb'
 840           .line  = 1
 841           .child = Define
 842                    ...
 843
 844 The body itself is a Newline node and the first thing one of
 845 those wants to do is to let the Generator know its line and
 846 file information. Generator#set_line actually does a bit more
 847 work because for all the lines in the code, it keeps track of
 848 the starting and ending instruction pointers (or in other words,
 849 stream positions.) Importantly, this information is not recorded
 850 in the instruction stream and is just kept track of by $gen.
 851 (These helper methods are collectively referred to as "commands"
 852 whereas the ones that actually affect the stream are "operations.")
 853
 854 Since we just started, line 1 of the real code starts at $gen.ip == 2
 855 and ends... well, we do not know yet. The file is also stored but
 856 since Ruby code is always compiled in units of one file, there is
 857 nothing fancy to do there.
 858
 859 The only other thing for a Newline to do is to send the code
 860 it contains to be processed.
 861
 862   # Nothing new in the stream
 863   ; No allocation needed
 864
 865   Script
 866   .body = Newline
 867           .file  = 'test.rb'
 868           .line  = 1
 869           .child = Define
 870                    .name = :foo
 871                    .body = Scope
 872                            ...
 873                    .args = Arguments
 874                            ...
 875
 876 The next step is quite interesting. As you remember, we are technically
 877 inside a MethodDescription object--so what happens when we need to
 878 describe another method? There are some important concepts to cover.
 879
 880 Define#bytecode starts off with $gen.push_literal compile_body().
 881 Define#compile_body actually produces a new MethodDescription --
 882 to which we will get in a moment -- so the "literal" is not one in the
 883 sense of a string literal or an array literal. What actually happens
 884 here is that the object given to #push_literal is stored with other
 885 literals in $gen and its index within that group is used for the
 886 actual "assembly". The actual literal objects will eventually get
 887 stored in the CompiledMethod object (if we are compiling to a file,
 888 then even further along they will be marshalled along with the rest
 889 of the CompiledMethod.)
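
A minimal sketch of that literal bookkeeping, assuming a plain Array as
the literal store. LiteralGenerator is a made-up class, and reusing the
index of an equal, already-stored object is an assumption of this sketch.

```ruby
class LiteralGenerator
  attr_reader :stream, :literals

  def initialize
    @stream = []
    @literals = []
  end

  # Return the index of an already-stored literal, or store it first.
  def find_literal(obj)
    @literals.index(obj) || (@literals << obj).size - 1
  end

  # The "assembly" only ever sees the index, never the object itself.
  def push_literal(obj)
    @stream << [:push_literal, find_literal(obj)]
  end
end

gen = LiteralGenerator.new
gen.push_literal "foo"
gen.push_literal :puts
gen.push_literal "foo"
p gen.literals   # ["foo", :puts]
p gen.stream     # [[:push_literal, 0], [:push_literal, 1], [:push_literal, 0]]
```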
 890
 891 Again, in the interest of maintaining the correct sequence, the code
 892 that makes up the method body will get generated first. Exploring
 893 Define#compile_body shows a very similar process to what got us
 894 started with $script. We set up a new MethodDescription which
 895 means that for the time being we will be working with a different
 896 Generator. A new prelude is produced but since we still have no
 897 local variables, only a stack allocation of 0 is needed.
 898
 899 In the pseudocode, in order to distinguish the Generators, the
 900 original will be $gen and the one for foo is $subgen whenever
 901 needed. For fun, I will also indent foo.
 902
 903 Next, in this order, we generate the argument handling code and then
 904 the method body.
 905
 906   # Toplevel
 907   ; No allocation needed
 908
 909   # foo
 910     ; No allocation needed
 911
 912   Script
 913   .body = Newline
 914           .file  = 'test.rb'
 915           .line  = 1
 916           .child = Define
 917                    .name = :foo
 918                    .body = Scope
 919                            ...
 920                    .args = Arguments
 921                            .required  = []
 922                            .optional  = []
 923                            .splat     = nil
 924                            .defaults  = nil
 925                            .block_arg = nil
 926
 927 The processing of Arguments is a bit more involved than our
 928 previous Nodes (or would be if #foo took any.) Even in our
 929 case, though, the argument count needs to be checked so we
 930 figure out how many are needed (0) and how many can be given
 931 (also 0.) Interestingly, the maximum possible number of arguments
 932 is currently 1024. First, back to Define and then down the
 933 other fork to its body.
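
The counting itself might look something like the sketch below.
ToyArguments is a made-up, cut-down node that only models required and
optional arguments; how splats and block arguments affect the numbers is
deliberately left out, and the stream is just an Array.

```ruby
# Hypothetical, cut-down Arguments node: only required and optional
# arguments are modelled, splats and block arguments are left out.
ToyArguments = Struct.new(:required, :optional) do
  def bytecode(stream)
    needed  = required.size                  # fewest arguments accepted
    allowed = required.size + optional.size  # most arguments accepted
    stream << [:check_argcount, needed, allowed]
  end
end

stream = []
ToyArguments.new([], []).bytecode(stream)          # def foo()
ToyArguments.new([:a], [:b, :c]).bytecode(stream)  # def bar(a, b = 1, c = 2)
p stream   # [[:check_argcount, 0, 0], [:check_argcount, 1, 3]]
```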
 934
 935   # Toplevel
 936   ; No allocation needed
 937
 938   # foo
 939     ; Argument handling
 940     check_argcount 0 0    ; ip == 3
 941
 942   Script
 943   .body = Newline
 944           .file  = 'test.rb'
 945           .line  = 1
 946           .child = Define
 947                    .name = :foo
 948                    .body = Scope
 949                            .block = Block
 950                                     ...
 951                            .locals = []
 952
 953                    .args = Arguments
 954                            ...
 955
 956 Define wants to generate the code for its body so Scope#bytecode
 957 is our next destination but all there is to do is to further descend
 958 in and ask for Block#bytecode from the block contained (nope, we
 959 are STILL not touching the locals.) Block in our case also ends up
 960 just asking its body for its #bytecodes since we only have one
 961 element in it.
 962
 963   # Toplevel
 964   ; No allocation needed
 965
 966   # foo
 967     ; Argument handling
 968     check_argcount 0 0    ; ip == 3
 969
 970   Script
 971   .body = Newline
 972           .file  = 'test.rb'
 973           .line  = 1
 974           .child = Define
 975                    .name = :foo
 976                    .body = Scope
 977                            .block = Block
 978                                     .body = [Newline
 979                                              .file  = 'test.rb'
 980                                              .line  = 2
 981                                              .child = FCall
 982                                                       ...
 983                            .locals = []
 984
 985                    .args = Arguments
 986                            ...
 987
 988 Another Newline! We started one earlier but since that one resides
 989 in $gen, opening a new one here really does not have any effect on
 990 it. Line 2 starts at ip 3 which gives us the interesting property that
 991 ips 0-2 really do not live anywhere. Next, processing goes to the child
 992 of the Newline.
 993
 994   # Toplevel
 995   ; No allocation needed
 996
 997   # foo
 998     ; Argument handling
 999     check_argcount 0 0    ; ip == 3
1000
1001   Script
1002   .body = Newline
1003           .file  = 'test.rb'
1004           .line  = 1
1005           .child = Define
1006                    .name = :foo
1007                    .body = Scope
1008                            .block = Block
1009                                     .body = [Newline
1010                                              .file  = 'test.rb'
1011                                              .line  = 2
1012                                              .child = FCall
1013                                                       .method    = :puts
1014                                                       .arguments = [StringLiteral
1015                                                                     .string = 'foo']]
1016                            .locals = []
1017                    .args = Arguments
1018
1019 The main bytecode generation for all call types happens not in the top
1020 class of that hierarchy, MethodCall, but in Call#bytecode. The same
1021 holds for our FCall. First order of business is to check whether a plugin
1022 can handle this node (TODO: separate doc about the plugin arch) and in
1023 this case, the answer is "no" so we proceed. Next, arguments for the
1024 call through #emit_args: we do have args and the args are held in
1025 an Array so each of those gets generated -- in reverse order -- and
1026 the argument count is stored. This means a short side trip to
1027 StringLiteral#bytecode which just uses $subgen.push_literal on its
1028 string and then calls $subgen.string_dup which will just cause a new
1029 duplicate string to replace the original on the stack when run (here
1030 is a bit of Generator magic: any operation that is not explicitly
1031 defined just gets #added using #method_missing.) #push_literal,
1032 as you recall, either stores or retrieves an already-stored literal
1033 and gives back its index. A peek at what we look like before heading
1034 back to FCall#emit_args.
1035
1036 (Another pseudocode convention: #<literal> means the index of the
1037 literal from #find_literal.)
1038
1039   # Toplevel
1040   ; No allocation needed
1041
1042   # foo
1043     ; Argument handling
1044     check_argcount 0 0      ; ip == 3
1045
1046     ; FCall arguments
1047     push_literal #<"foo">   ; ip == 5
1048     string_dup              ; ip == 6
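
The #method_missing trick mentioned above can be sketched like this.
MagicGenerator and its OPERATIONS list are made up for the example;
restricting the fallback to a known list is a choice of this sketch and
not something the text above claims about the real Generator.

```ruby
class MagicGenerator
  # Hypothetical subset of operations that need no special handling.
  OPERATIONS = %i[string_dup push_self sret].freeze

  attr_reader :stream

  def initialize
    @stream = []
  end

  def add(name, *args)
    @stream << [name, *args]
  end

  # Anything in OPERATIONS without an explicit method is simply #added.
  def method_missing(name, *args)
    return super unless OPERATIONS.include?(name)
    add(name, *args)
  end

  def respond_to_missing?(name, include_private = false)
    OPERATIONS.include?(name) || super
  end
end

gen = MagicGenerator.new
gen.string_dup
gen.push_self
p gen.stream   # [[:string_dup], [:push_self]]
```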
1049
1050 Nothing further to do in #emit_args but to store the arg count
1051 and then we drop down to FCall#bytecode again. There are checks
1052 for block code generation as well as dynamic arguments (these are
1053 +:splats+, +:argscats+ and +:argspushes+) neither of which affects
1054 us. Next code to generate would be for the receiver (the receiver
1055 could for example itself be a result of a method call) but we are
1056 again lucky because an FCall by definition has no receiver so
1057 all calls will be to self. This work is all put together in
1058 the $subgen.send operation, which in our case will start by setting
1059 the call flags to allow for private methods (which MUST be called
1060 without an explicit receiver*.)  Further, the method name we want
1061 is looked up or stored in literals and its index retrieved. We
1062 also encounter a small optimisation here: normally, the next
1063 instruction would be +:send_stack+, but because we have <5 args,
1064 1 to be precise, a special instruction +:meta_send_stack_1+ is
1065 used instead. This is what we have when we head back towards
1066 the top again:
1067
1068   # Toplevel
1069   ; No allocation needed
1070
1071   # foo
1072     ; Argument handling
1073     check_argcount 0 0          ; ip == 3
1074
1075     ; FCall arguments
1076     push_literal #<"foo">       ; ip == 5
1077     string_dup                  ; ip == 6
1078
1079     ; FCall receiver
1080     push_self                   ; ip == 7
1081
1082     ; FCall send
1083     set_call_flags 1            ; ip == 9
1084     meta_send_stack_1 :puts     ; ip == 11
1085
1086   Script
1087   .body = Newline
1088           .file  = 'test.rb'
1089           .line  = 1
1090           .child = Define
1091                    .name = :foo
1092                    .body = Scope
1093                            .block = Block
1094                                     .body = [Newline
1095                                              ...
1096                            .locals = []
1097                    .args = Arguments
1098                            .required  = []
1099                            .optional  = []
1100                            .splat     = nil
1101                            .defaults  = nil
1102                            .block_arg = nil
1103
1104 Nothing further to do in Newline#bytecode, Block#bytecode is done
1105 as is Scope#bytecode so we finally just end up in Define#compile_body
1106 which we need to finish up. Two small things to do: $subgen.sret will
1107 emit code necessary to return from the method and $subgen.close is a
1108 command to finalise any open lines so now we know that line 2 started
1109 at ip 3 and ended at ip 12.
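
To make the command/operation split a little more concrete, here is a
small sketch of the line tracking. LineTrackingGenerator is a made-up
class, and both the way line ranges are stored and the one-slot-per-word
ip arithmetic are assumptions of the sketch.

```ruby
class LineTrackingGenerator
  attr_reader :ip, :stream, :lines, :file

  def initialize
    @ip = 0
    @stream = []
    @lines = []   # finished [first_ip, last_ip, line] entries
    @open = nil   # [first_ip, line] of the line still being generated
  end

  # Operation: changes the stream and moves the ip.
  def add(name, *args)
    @stream << [name, *args]
    @ip += 1 + args.size
  end

  # Command: bookkeeping only, nothing goes into the stream.
  def set_line(line, file)
    close
    @file = file
    @open = [@ip, line]
  end

  # Command: finalise the open line, if any, at the current ip.
  def close
    return unless @open
    first_ip, line = @open
    @lines << [first_ip, @ip, line]
    @open = nil
  end
end

gen = LineTrackingGenerator.new
gen.add :check_argcount, 0, 0
gen.set_line 2, "test.rb"
gen.add :push_literal, 0
gen.add :string_dup
gen.add :push_self
gen.add :set_call_flags, 1
gen.add :meta_send_stack_1, 1
gen.add :sret
gen.close
p gen.lines         # [[3, 12, 2]]
p gen.stream.size   # 7
```

Note that the two commands leave the stream untouched: seven operations
went in and seven entries are all that is there.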
1110
1111   # Toplevel
1112   ; No allocation needed
1113
1114   # foo
1115     ; Argument handling
1116     check_argcount 0 0          ; ip == 3
1117
1118     ; FCall arguments
1119     push_literal #<"foo">       ; ip == 5
1120     string_dup                  ; ip == 6
1121
1122     ; FCall receiver
1123     push_self                   ; ip == 7
1124
1125     ; FCall send
1126     set_call_flags 1            ; ip == 9
1127     meta_send_stack_1 :puts     ; ip == 11
1128
1129     ; Cleanup
1130     sret                        ; ip == 12
1131
1132 Code generation responsibility shifts again to the original MethodDescription
1133 and its $gen.
1134
1135 #compile_body actually returns the new MethodDescription we have
1136 been working with so we can finally store the literal and then have
1137 the newly generated method added to self which in our case is the
1138 top-level entity. Notably, the +:add_method+ operation handles looking
1139 up its argument (the name) in literals before generating the real
1140 instruction so it is not necessary to do that part.
1141
1142   # Toplevel
1143   ; Add method
1144   push_literal #<foo>       ; ip == 2
1145   push_self                 ; ip == 3
1146   add_method :foo           ; ip == 5
1147
1148   # foo
1149     ; Argument handling
1150     check_argcount 0 0          ; ip == 3
1151
1152     ; FCall arguments
1153     push_literal #<"foo">       ; ip == 5
1154     string_dup                  ; ip == 6
1155
1156     ; FCall receiver
1157     push_self                   ; ip == 7
1158
1159     ; FCall send
1160     set_call_flags 1            ; ip == 9
1161     meta_send_stack_1 :puts     ; ip == 11
1162
1163     ; Cleanup
1164     sret                        ; ip == 12
1165
1166   Script
1167   .body = Newline
1168           .file  = 'test.rb'
1169           .line  = 1
1170           .child = Define
1171                    ...
1172
1173 Nothing to do in the next Newline either so eventually we just end
1174 up at the very topmost Script#bytecode which merely #pops whatever
1175 the return value of the method definition is (since there is no use
1176 for it and it cannot be left to litter the stack), #pushes true
1177 as its own "return value" and returns.
1178
1179   # Toplevel
1180   ; Add method
1181   push_literal #<foo>       ; ip == 2
1182   push_self                 ; ip == 3
1183   add_method #<:foo>        ; ip == 5
1184
1185   ; Cleanup
1186   pop                       ; ip == 6
1187   push_true                 ; ip == 7
1188   sret                      ; ip == 8
1189
1190   # foo
1191     ; Argument handling
1192     check_argcount 0 0          ; ip == 3
1193
1194     ; FCall arguments
1195     push_literal #<"foo">       ; ip == 5
1196     string_dup                  ; ip == 6
1197
1198     ; FCall receiver
1199     push_self                   ; ip == 7
1200
1201     ; FCall send
1202     set_call_flags 1            ; ip == 9
1203     meta_send_stack_1 :puts     ; ip == 11
1204
1205     ; Cleanup
1206     sret                        ; ip == 12
1207
1208 Execution takes us back to the first MethodDescription#run which has
1209 one last thing to do: $gen.close. We now know that line 1 ends at
1210 $gen ip == 8 (and started at ip 2.) At this point, we return to caller.
1211 Now we can actually create the CompiledMethod.
1212
1213 If you want to know what the instructions above do, you can look at the
1214 {opcode docs}[http://rubini.us/doc/vm].
1215
1216   desc = ast.to_description
1217   compiled = desc.to_cmethod   # -> $gen.to_cmethod desc
1218
1219 == Stage 3: Encoding and CompiledMethod
1220
1221 One clarification: Rubinius does not actually use BYTEcode in the sense of
1222 one-byte-wide opcodes, so the term is used here in its loose sense of a
1223 virtual machine's machine code. Rubinius' machine code is in fact INTcode.
1224 encoder.rb will be the main venue here (although, again, {the VM
1225 docs}[http://rubini.us/doc/vm] are an excellent resource for anything to do
1226 with Rubinius bytecode.)
1227
1228 The process of creating a CompiledMethod starts from Generator#to_cmethod
1229 to which we get through MethodDescription#to_cmethod. The first thing that
1230 happens is that the Generator goes through its entire stream, "collapsing"
1231 any Labels it encounters. You can review (TODO: link to label text) for
1232 more info but suffice to say Labels are just positions in the stream that
1233 can be used for branching, gotos and jumps. Our code has no Labels so we
1234 need not worry about it. The next thing we need to do is have the Encoder
1235 convert our "assembly code" into an InstructionSequence.
1236
1237 If you look at encoder.rb, it is almost deceptively simple. At the top there
1238 are several data structures to help us keep track of the various opcodes and
1239 any special properties they may have. Towards the bottom, there is only a
1240 little tiny bit of code that is supposed to handle everything for us.
1241
1242 We start off with Encoder#encode_stream, expecting back an InstructionSequence.
1243 #encode_stream first calculates the size required for the iseq and then creates
1244 one. Then it simply #encodes each operation in the stream, keeping track of
1245 the offset into the InstructionSequence.
1246
1247 #encode is not a very complicated method, either: the numeric opcode is
1248 looked up and then its "width" or number of additional arguments is
1249 determined from the lookup tables at the top of encoder.rb. Each
1250 one of these elements (maximum three, opcode + 0, 1 or 2 arguments) gets
1251 encoded into the InstructionSequence using Encoder#int2str.
1252
1253 #int2str just writes the given number into the iseq as a big-endian 4-byte
1254 integer split into 4 byte-sized segments. For example, 17 would end up as
1255 [0, 0, 0, 17] whereas 256 becomes [0, 0, 1, 0] and so on. The fully
1256 encoded InstructionSequence is then returned to Generator#to_cmethod.
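
The encoding steps above can be approximated in a few lines of Ruby.
This is only a sketch: OPCODE_TABLE and its numbers are invented, a
binary String stands in for the InstructionSequence, and the method
bodies are my own guess at the shape and not a copy of encoder.rb.

```ruby
# name => [numeric opcode, number of arguments]; the numbers are made up.
OPCODE_TABLE = {
  push_literal:   [1, 1],
  string_dup:     [2, 0],
  check_argcount: [3, 2]
}.freeze

# Write a number as a big-endian 4-byte integer.
def int2str(number)
  [number].pack("N")
end

# Encode a whole stream of [name, *args] entries into one binary string.
def encode_stream(stream)
  # First work out how much room is needed: 4 bytes per opcode or argument.
  size = stream.sum { |name, *| (1 + OPCODE_TABLE.fetch(name)[1]) * 4 }
  iseq = "\0".b * size

  offset = 0
  stream.each do |name, *args|
    opcode, = OPCODE_TABLE.fetch(name)
    [opcode, *args].each do |word|
      iseq[offset, 4] = int2str(word)
      offset += 4
    end
  end
  iseq
end

p int2str(17).bytes    # [0, 0, 0, 17]
p int2str(256).bytes   # [0, 0, 1, 0]
p encode_stream([[:push_literal, 0], [:string_dup]]).bytes
# [0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 2]
```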
1257
1258 Next we create the CompiledMethod object (which, by the way, is immensely
1259 cool, do var = def my_method ... in sirb and see what all you can do
1260 with var! Also see TODO: Rubinius reflection capabilities) using the
1261 InstructionSequence just created. The CM is not quite ready yet, though,
1262 so we store its name, the file it was defined in (these are either real
1263 names or one of the special names such as +:__eval__+ or +:__unknown__+) as
1264 well as the serial which is essentially a version number.
1265
1266 Before you think I forgot about foo, that comes next as #to_cmethod
1267 goes through #encode_literals, #encode_lines and #encode_exceptions
1268 (as you remember, the MethodDescription for foo was stored as a literal.)
1269
1270 All three of those are very similar and do not really perform any actual
1271 encoding like we saw before. Each just goes through its ascribed list
1272 and stores the objects in a Tuple (a very primitive fixed-size container)
1273 and the only exception to this is #encode_literals and then only for
1274 any MethodDescriptions it finds. For those it calls lit.to_cmethod
1275 which obviously then recursively generates the CompiledMethods. The
1276 generated ones get stored in the literals tuple in place of the old
1277 MethodDescription.
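
The recursion is easier to see in a toy version. ToyDescription and
ToyCompiledMethod below are made-up stand-ins, a frozen Array plays the
part of the Tuple, and the :toplevel name is purely illustrative.

```ruby
ToyCompiledMethod = Struct.new(:name, :literals)

ToyDescription = Struct.new(:name, :literals) do
  def to_cmethod
    ToyCompiledMethod.new(name, encode_literals)
  end

  # Copy the literals into a fixed container, compiling any nested
  # descriptions on the way so they are replaced by finished methods.
  def encode_literals
    literals.map { |lit| lit.is_a?(ToyDescription) ? lit.to_cmethod : lit }.freeze
  end
end

foo    = ToyDescription.new(:foo, ["foo", :puts])
script = ToyDescription.new(:toplevel, [foo, :foo])

cm = script.to_cmethod
p cm.literals[0].class      # ToyCompiledMethod
p cm.literals[0].literals   # ["foo", :puts]
p cm.literals[1]            # :foo
```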
1278
1279 And that is all there is to it. The CompiledMethod is finished and
1280 it can be stashed away somewhere (a MethodTable, for example), it
1281 can be #activated to have the code run and so on. If you are interested
1282 in what happens when one IS activated, read TODO: link to execution.
1283
1284 Well, maybe there is one little thing that might deserve to be
1285 covered still: how can CompiledMethods be stored on disk in .rbc
1286 files?
1287
1288 == Stage 4: Bytecode on disk
1289
1290 The saving and loading of code -- CompiledMethods -- to and from disk
1291 happens through Marshal and is mostly defined in shotgun/lib/cpu_marshal.c.
1292
1293 The entire thing, from a high level, is fairly trivial. cpu_marshal() first
1294 brands the file with the special string "RBIX" followed by a version number
1295 to be able to detect the formatting required (when loading back again.) From
1296 there, cpu_marshal_cmethod2() takes care of the rest. The basic idea for
1297 marshalling is simple although it varies slightly from one object to another:
1298 first a type identifier is printed ("M" for a CompiledMethod) possibly
1299 followed by some type of version identifier or a size (a string, for
1300 example, will write the number of characters) that will allow unmarshalling
1301 the object later. After this comes the object data.
1302
1303 cpu_marshal_cmethod2() loops through the 16 fields of a CompiledMethod
1304 and recursively marshals them. For example, the name (field 5) is a Symbol
1305 which uses the format 'x' + size + characters so for our method from
1306 above, the end-result would be, in hex:
1307
1308   78 00 00 00 03 66 6f 6f
1309    x           3  f  o  o
1310
1311 Unmarshalling back into a CompiledMethod is a very similar process.
1312 All that happens is that a new, blank CompiledMethod is allocated
1313 and then each of its fields is unmarshalled in turn. The
1314 end-result is a finished CompiledMethod.
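
For the Symbol format shown earlier, both directions fit in a handful
of lines. This is a sketch in Ruby and not a translation of the C code:
marshal_symbol and unmarshal_symbol are made-up names, and the size is
assumed to be a big-endian 4-byte integer as the hex dump suggests.

```ruby
require "stringio"

# 'x' + 4-byte big-endian size + the characters of the name.
def marshal_symbol(sym)
  name = sym.to_s.b
  "x".b + [name.bytesize].pack("N") + name
end

# Read the type identifier, then the size, then that many characters.
def unmarshal_symbol(io)
  tag = io.read(1)
  raise ArgumentError, "not a symbol: #{tag.inspect}" unless tag == "x"
  size = io.read(4).unpack1("N")
  io.read(size).to_sym
end

bytes = marshal_symbol(:foo)
puts bytes.unpack("C*").map { |b| format("%02x", b) }.join(" ")
# 78 00 00 00 03 66 6f 6f

p unmarshal_symbol(StringIO.new(bytes))   # :foo
```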
1315
1316 Easy, right? The implementation, of course, is a bit more involved
1317 (mostly because of working with files) but from our perspective it
1318 is child's play. Do remember, though, that this marshal format is
1319 *not* compatible with MRI, not even for the parts that exist in
1320 both implementations such as Strings.
1321