= Introduction to the Rubinius compiler
This document presents the sequence of transformations that it takes
to compile a very simple Ruby source file into Rubinius bytecode (or,
to be specific, intcode) for the purposes of understanding the basic
operation of the compiler component of the VM. Even source files with
more complex syntax will follow the same general processing pattern.
The compiler is conceptually very straightforward and elegant.

I use a very liberal pseudocode syntax for most of the explanation
just to keep things simple. For any clarifications, you can check the
source. Generally, any data structures shown represent what is being
passed to the next step rather than the state in the explanation just
above it. For example, in the first stage the sexp shown is always
essentially the "argument" to the next method or section.
19 The sample input is in test.rb and looks like this:
Rubinius extends File with the method #to_sexp, which has the parser
produce a nested set of Arrays with the information needed to
interpret (or in our case, compile) the program. How the parser works
is outside the scope of this treatise, but this is its output,
hereafter referred to as the *sexp* (S-expression):
31 [:newline, 1, "test.rb",
35 [:newline, 2, "test.rb",
36 [:fcall, :puts, [:array, [:str, "foo"]]]
If you compare the sexp to the code snippet above, it is pretty simple.
+:defn+ is clearly the method definition, +:scope+ and +:block+ contain
the code within it, and that contained code is the +:fcall+. The
+:newline+ nodes are just that: line information. This will all become
much clearer as we go along.
The next step is to create a new Compiler instance (nothing fancy there)
and get to work translating.
== Stage 1: Sexp to AST
=== 1.1 Descending into the sexp
57 The sexp is first sent to <tt>Compiler#into_script</tt> (compiler.rb) which
58 will eventually return some type of a Node object for further processing.
59 Nodes represent nodes in the resulting AST (Abstract Syntax Tree).
* compiler.rb implements the compiler infrastructure as well as the basic
  Node class and its default operations. Subclasses of Node reside in
  nodes.rb and override methods as necessary.
* Node.kind allows node types to register themselves into a mapping of
  sexp node name => node class, stored in <tt>Compiler::Node::Mapping</tt>.
  For example, a method definition appears as +:defn+ in the sexp and is
  mapped to the Define node. Usually, though, the names are the same,
  except that the node class name is of course capitalised.
* Error handling is present all along the way but I ignore it because
  there is nothing fancy there. Any errors here are unrecoverable.
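
As a rough illustration of the Node.kind registration described in the
list above, here is a minimal sketch. The class bodies are stand-ins and
not the real code from compiler.rb or nodes.rb; only the names Node,
Define, Newline, kind and Mapping are taken from the text, and how the
real method stores its data is an assumption.

```ruby
# Sketch only: a toy version of the sexp-name => node-class registry.
class Node
  Mapping = {}

  # With an argument, register the calling class under that sexp name.
  # Without one, report the name the class was registered under.
  def self.kind(name = nil)
    return @kind unless name
    Mapping[name] = self
    @kind = name
  end
end

class Define < Node
  kind :defn       # the sexp name differs from the class name here
end

class Newline < Node
  kind :newline    # the usual case: same name, capitalised
end

p Node::Mapping[:defn]
p Node::Mapping[:newline]
```

The point of the design is that adding a new node type only requires
declaring its kind; the lookup table never has to be edited by hand.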
74 Compiler#into_script wraps the plain sexp inside a +:script+ and sends
75 it to Compiler#convert_sexp.
78 [:newline, 1, "test.rb",
82 [:newline, 2, "test.rb",
83 [:fcall, :puts, [:array, [:str, "foo"]]]
92 We will run into #convert_sexp pretty much every single step of the way.
93 It simply looks up the correct node type from the mapping by using the
94 first element in the sexp that it receives and then calls the #create
95 method on the class with the sexp and a reference to the compiler.
97 So <tt>c.convert_sexp sexp</tt> -> <tt>Script.create c, sexp</tt>
100 [:newline, 1, "test.rb",
104 [:newline, 2, "test.rb",
105 [:fcall, :puts, [:array, [:str, "foo"]]]
114 None of the Node subclasses implement #create themselves so we are still
115 in compiler.rb. .create shifts away the now-unnecessary first element of
116 the sexp and then creates a new instance of the node subclass.
118 The following step is the main function of this compilation stage. We ask
119 the Node subclass instance to #consume the remaining sexp which produces
120 an output. By default #consume will simply recursively process all
121 nested Arrays (sexps) and leave any non-Array elements alone but this
122 behaviour is overridden by some subclasses (as we will see shortly) to
123 make the most important processing happen.
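
The lookup, .create and default #consume steps described so far can be
sketched roughly as below. This is a deliberately tiny stand-in: the
Mini* class names are made up, the default #args is a simplification
(the real Node has none), and unlike the .create described above the
sketch copies the sexp before shifting it.

```ruby
# Sketch only: hypothetical miniature of convert_sexp / create / consume.
class MiniCompiler
  # Pick the node class from the first sexp element and let it build itself.
  def convert_sexp(sexp)
    MiniNode::Mapping.fetch(sexp.first).create(self, sexp)
  end
end

class MiniNode
  Mapping = {}

  def self.kind(name)
    Mapping[name] = self
  end

  def self.create(compiler, sexp)
    rest = sexp.dup
    rest.shift                       # drop the now-unnecessary node name
    node = new(compiler)
    node.args(*node.consume(rest))   # output of #consume goes to #args
    node
  end

  attr_reader :children

  def initialize(compiler)
    @compiler = compiler
  end

  # Default processing: convert nested Arrays, leave everything else alone.
  def consume(sexp)
    sexp.map { |e| e.is_a?(Array) ? @compiler.convert_sexp(e) : e }
  end

  def args(*children)
    @children = children
  end
end

class MiniFCall < MiniNode; kind :fcall; end
class MiniArray < MiniNode; kind :array; end
class MiniStr   < MiniNode; kind :str;   end

mini_tree = MiniCompiler.new.convert_sexp([:fcall, :puts, [:array, [:str, "foo"]]])
p mini_tree.children.first
```

Here mini_tree is a MiniFCall whose children are the method name and a
MiniArray, which in turn holds a MiniStr wrapping "foo".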
125 The output is then sent to the #args method of the node instance or,
126 if the node type needs argument normalisation (to reconcile all variants
127 of a particular sexp type), to #normalize instead. Neither of these
128 have a default implementation in Node so look under the specific class
131 In our case, Script.create c, sexp, we see the first slight modification
132 in creating the Script instance. Script < ClosedScope, and the purpose of
133 all ClosedScopes is to represent visibility scopes in the code. The main
134 part here is that a new LocalScope object is created (you can see locals.rb
135 but we will get back to this later.)
137 Next, we step into Script#consume (or ClosedScope#consume to be
138 specific) with our newly stripped sexp.
140 # Script.create c, sexp
141 # Script.new (creates a new LocalScope)
145 [:newline, 1, "test.rb",
149 [:newline, 2, "test.rb",
150 [:fcall, :puts, [:array, [:str, "foo"]]]
159 ClosedScope#consume assumes that it is getting a single-element
160 Array containing the rest of the sexp for this particular scope (in
161 our case the entire script.)
163 It has a couple other important duties too, though: firstly, it
164 uses the compiler's #set method to set the current scope as itself
165 as well as the default visibility as public. #set and #get are
166 the compiler's way to change and describe the current compilation
167 state and various auxiliary data.
169 Secondly, the scope will #formalize! all contained scopes. We
170 will return to this later, but formalization is the process of
171 reserving stack space and assigning indexes to any local variables
174 # Script#consume (nodes.rb)
175 # super to ClosedScope#consume
176 # out = convert sexp.first (which is a Newline)
177 # formalize all scopes
178 [:newline, 1, "test.rb",
182 [:newline, 2, "test.rb",
183 [:fcall, :puts, [:array, [:str, "foo"]]]
191 The Node#convert call you see there just calls Compiler#convert_sexp
192 which, as you recall, is where this whole thing got started. This will be
195 Through the normal route, we get to Newline.create which in turn
196 uses default processing to #consume the rest of the sexp (the first
197 element is again stripped after the node type lookup.)
199 # Compiler#convert_sexp -> Newline.create
200 # Newline#consume sexp.shift
205 [:newline, 2, "test.rb",
206 [:fcall, :puts, [:array, [:str, "foo"]]]
214 Newline#consume first registers the current file and line
215 information (the first two elements) with the compiler to help
216 pinpoint compilation errors and then falls back on default
217 processing from Node#consume.
219 The process here is simple: loop through the sexp and push
220 elements to output. All nested sexps are recursively #converted
223 So here the eventual output to be returned to .create is
224 just a three-element Array: [1, 'test.rb', <Define node>].
226 As we can see, the +:defn+ node will need processing so we
233 [:newline, 2, "test.rb",
234 [:fcall, :puts, [:array, [:str, "foo"]]]
241 (I will start abbreviating some common steps here.)
243 # n.convert -> c.convert_sexp -> Define.create
244 # Define.new.consume sexp.shift
248 [:newline, 2, "test.rb",
249 [:fcall, :puts, [:array, [:str, "foo"]]]
256 Define is also a ClosedScope but it will significantly augment
257 the normal #consume from the latter. First we separate the name
258 and the body, after which the body gets supered to ClosedScope#consume
259 wrapped inside a dummy Array. After getting the body back, we will
260 massage it a bit to suit our needs but this will be done in part 2.
261 The eventual output here will be [name, <scope>, <args>].
263 # Define#consume -> ClosedScope#consume [body] -> convert body.first
266 [:newline, 2, "test.rb",
267 [:fcall, :puts, [:array, [:str, "foo"]]]
273 Scope nodes do not manage the scopes themselves, they are merely
274 abstractions within the actual visibility scopes. Scope#consume
275 assumes it is getting a two-element sexp of which only the first
276 needs further processing (the second contains local variable names.)
278 # n.convert -> c.convert_sexp -> Scope.create -> Scope#consume
281 [:newline, 2, "test.rb",
282 [:fcall, :puts, [:array, [:str, "foo"]]]
289 # sexp[0] = convert sexp[0]
292 [:newline, 2, "test.rb",
293 [:fcall, :puts, [:array, [:str, "foo"]]]
297 Block, then, typically encapsulates any type of block of code,
298 not a lambda block. It is used in method definitions, if-expressions
301 # c.convert_sexp -> Block.create
302 # Block.new.consume sexp.shift
305 [:newline, 2, "test.rb",
306 [:fcall, :puts, [:array, [:str, "foo"]]]
310 This is the first time we encounter a sexp with two separate nested
311 sexps. Both will be processed separately and the output will be just
312 [<arg node>, <newline node>].
314 # Block#consume -> Node#consume
315 # Loop through sexp, recursively convert nested sexps
319 [:newline, 2, "test.rb",
320 [:fcall, :puts, [:array, [:str, "foo"]]]
323 Stepping through Arguments first.
325 # c.convert_sexp -> Arguments.create
326 # Arguments.new.consume sexp.shift
329 Since we have no arguments, there is no work to be done. However, to
330 make things consistent, Arguments#consume will return a structure
331 representing no args of any kind, [[], [], nil, nil] (the fields are
332 required arg names, optional arg names, splat arg name and default arg
333 computations. See the source in nodes.rb for more info.)
335 The second node to process is the newline, one of which we have seen
336 already. Nothing new here.
338 # c.convert_sexp -> Newline.create
339 # Newline.new.consume sexp.shift
341 [:fcall, :puts, [:array, [:str, "foo"]]]
345 # The line and file are passed to Compiler#set_position
346 # super sexp -> Node#consume sexp
347 # Loop through sexp, recursively convert nested sexps
349 [:fcall, :puts, [:array, [:str, "foo"]]]
351 FCall is a bit more interesting. It represents a method call without
352 an explicit receiver ("functional" style) and it is the first node that
353 must be #normalized. FCall < Call < MethodCall < Node.
355 # c.convert_sexp -> FCall.create
357 # MethodCall does a bit of extra work here by storing
358 # the scope type (it was #set before, if you recall.)
359 # In our case the scope type is Script. We also start
360 # out assuming no block is involved with this call.
362 # f.consume sexp.shift
363 [:puts, [:array, [:str, "foo"]]]
365 # FCall#consume -> Node#consume
366 # Loop through sexp, recursively convert nested sexps
368 [[:array, [:str, "foo"]]]
370 # c.convert_sexp -> ArrayLiteral.create
371 # ArrayLiteral.new.consume sexp.shift
374 # ArrayLiteral#consume -> Node#consume
375 # Loop through sexp, recursively convert nested sexps
379 # c.convert_sexp -> StringLiteral.create
380 # StringLiteral.new.consume sexp.shift
383 # StringLiteral#consume -> Node#consume
384 # Loop through sexp, recursively convert nested sexps
385 # (Nothing to do, just "foo" remains.)
387 At this point, we are at the end of the tree. There is nothing
388 more to parse and we are ready for the next phase.
=== 1.2 Creating the AST
392 Now we traverse back up the sexp, pushing the generated Nodes
393 upwards as we go. As you remember, the second part of each
394 .create was sending the output of #consume to #args or
395 #normalize. In the last step, we ended up with StringLiteral
396 and that is where we will pick up.
398 The output of a default Node#consume is basically just the original
399 sexp but with any nested sexps also converted (this may not hold
400 true for subclasses.) In the case of StringLiteral, there were
401 no nested expressions, just the actual literal. (<- is used to
StringLiteral#consume returns ["foo"] and execution picks
back up in StringLiteral.create where the returned value is
passed into StringLiteral#args. All #args does here is
store the string within the StringLiteral node. After this,
StringLiteral.create is done, returning the node object.
410 The execution then unwinds from StringLiteral.create to
411 Compiler#convert_sexp to <Node class>#convert and ends up
412 in the #consume of the parent. In the case of StringLiteral,
413 we end up in ArrayLiteral#consume.
415 This sequence will largely be the same for all cases so you will
416 see it abbreviated like this:
418 # StringLiteral#consume
419 # <- StringLiteral.create
420 # StringLiteral#args return
424 # <- ArrayLiteral#consume
429 So the StringLiteral goes back to ArrayLiteral#consume
430 which then sends it to ArrayLiteral#args:
432 # ArrayLiteral#consume
  # <- ArrayLiteral.create
438 .body = [StringLiteral
441 # Return to FCall#consume
FCall#consume hands its results back to .create and, because
FCall does implement #normalize, the output is sent there
rather than to #args. #collapse_args will just strip away the
unnecessary ArrayLiteral and give us an internal Array
containing the ArrayLiteral contents instead. FCall also
checks #detect_special_form to be able to translate
pseudo-calls like +:ivar_as_index+ but we have none in our
little script.
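
To make the effect of #collapse_args concrete, here is a small sketch.
The *Sketch classes are hypothetical stand-ins built on Struct, and the
method bodies are my guess at the simplest thing that matches the
description above, not the code in nodes.rb.

```ruby
# Sketch only: hypothetical stand-ins for the real node classes.
ArrayLiteralSketch  = Struct.new(:body)
StringLiteralSketch = Struct.new(:string)

class FCallSketch
  attr_reader :name, :arguments

  # Receives the output of #consume: the method name and the argument node.
  def normalize(name, arguments = nil)
    @name = name
    @arguments = arguments
    collapse_args
    self
  end

  # Replace the wrapping ArrayLiteral with a plain Array of its contents.
  def collapse_args
    @arguments = @arguments.body if @arguments.is_a?(ArrayLiteralSketch)
  end
end

wrapped = ArrayLiteralSketch.new([StringLiteralSketch.new("foo")])
fcall_sketch = FCallSketch.new.normalize(:puts, wrapped)
p fcall_sketch.arguments.class
```

After normalisation the arguments are an ordinary Array holding the
single string node, which is the shape the later listings show.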
453 out = [:puts, anArrayLiteral]
456 method, arguments = :puts, al
457 collapse_args -> Call#collapse_args
461 .arguments = ArrayLiteral
468 .arguments = [StringLiteral
471 # Return to Newline#consume
473 Nothing special in Newline, the processed nodes just get
474 stored. Notably we do not even clear the file, line info
475 that was set earlier.
  # out = [2, 'test.rb', fc]
  # We do NOT clear the compiler file, line positions
  # file, line, child = 'test.rb', 2, fc
489 .arguments = [StringLiteral
There were two forks in Block#consume so we complete the other
one, Arguments, before going further up the chain. We have
no arguments so this is just a basic assignment again. (If we did
have args, Arguments#populate would handle properly associating
the variable names with Local objects.)
499 # out = [[], [], nil, nil]
  # <- Arguments.create
502 # required, optional, splat, defaults = [], [], nil, nil
503 # block_arg = nil (may be set later)
504 # Skip Arguments#populate
513 Alright, now we have both trees back and can finish up with
514 the Block which actually just stores both the args *and*
515 the body in a single Array.
535 .arguments = [StringLiteral
Back up we go to Scope#consume. As you remember, this only
manipulated sexp[0] (which is our Block), so:
543 # out = [Block ..., []]
566 .arguments = [StringLiteral
Return gets us to ClosedScope#consume because Define#consume
supered to it for its body part.
573 ClosedScope#consume is done with its subnodes so it will #formalize!
574 all scopes. This deals with assigning indexes for local variables, and
575 possibly reserving stack space when necessary. We have no locals so
576 we need not worry about it. The Scope from above is returned to
579 Define#consume extracts Scope.block.body back out again and then
580 further splits it into args and the rest. If a block argument is
581 detected at this point, it gets recorded in the args.
584 # out = [:foo, <modified scope>, <args extracted from scope>]
605 .arguments = [StringLiteral
609 # Becomes this when we extract scope.block.args
610 # And add the Define node
621 .arguments = [StringLiteral
624 .args = Arguments # args are here now
632 Newline#consume has nothing new to offer, we just store the
633 positional information.
  # out = [1, 'test.rb', meth]
652 .arguments = [StringLiteral
662 Finally, we are back in Script#consume. Being a ClosedScope,
663 it will also #formalize! its locals (and we still have none.)
665 Curiously, the "body" of the script is just the single newline
666 but this is just because the top-level is only really concerned
667 about the code it directly contains. If our original code had been:
675 Then the Script would have ended up with its body a Block
676 that in turn contained two Newline objects (with line numbers
677 1 and 5 respectively.)
679 Anyway, for now Script#args just stores the body.
681 That is it for the compilation phase! This is the complete
682 resulting structure for our tiny program:
697 .arguments = [StringLiteral
707 Now we return to Compiler#into_script.
== Stage 2: Bytecode generation
The benefit of having the nested Node structure over the original
sexp may not be immediately obvious, but it is the ability of the
Node objects to have behaviours, unlike the plain sexp. In other
words, instead of having an external entity manage it, the Nodes
will generate their own code using the knowledge that they have
accumulated. To separate AST generation from bytecode generation,
the Nodes originally from nodes.rb are reopened in bytecode.rb
to add the #bytecode method (and some other stuff.) The VM
works off a stack so the generated bytecode will be stack-based.
721 A word about that code: this stage is a bit ambiguous. Conceptually
722 speaking the first step is to generate assembly code of sorts,
723 essentially something that is still "symbolic code" and not just
724 a string of ones and zeroes (that would be "machine code".) The
725 benefit of the former is, of course, that it is easier for humans
726 to read than the raw bytes. We could technically have the Nodes
727 produce a string of instructions (and in fact that is just what
728 TextGenerator in text.rb does) and then interpret that string
729 into the *real* bytecode or machine code but this is a bit wasteful.
731 Rubinius opts for a sort-of middle ground by first having the Node
732 objects generate a "stream" or sequence of instructions in the form
733 of simpler Ruby objects which will then later be encoded into the raw
734 bytecode format in stage 3.
As you have probably noticed, the Rubinius runtime model is based
on methods. In fact, any script or snippet is essentially a method
in its own right and the basic building blocks in the VM are
CompiledMethods (more on which later) and MethodContexts (which
are beyond the scope of this treatise.) For this intermediate
representation we naturally opt for the same approach.
MethodDescription will store our important data. The second and
more important player is the Generator class from generator.rb
which, despite its name, does not so much generate the code as
act as a receptacle for it. In other words, the "syntax,"
if you wish, is in Generator and the logic comes from the Nodes.
The Encoder will also make an appearance, although its main
function is not explored until the next stage.
751 A note about the pseudocode used in the description here: The
752 generated code will be in a sort-of-assembly syntax and I will
753 use assembly comments ; for any comments that could be imagined
754 to be left there by the "writer." Ruby comments # are used
755 for "meta-information." I will also introduce new pseudocode
756 elements as they are needed so as to not confuse matters up
757 front and will always explain when I do so. Hopefully this will
758 help rather than hinder clarity.
If you still remember, the last stage ended with the topmost Node,
a Script to be specific, being returned from the compiler. This is
fortunate because only ClosedScopes implement the #to_description
method that allows us to start this phase.
767 # Eventually, the topmost MethodDescription is returned.
768 ast = compiler.into_script ...
769 desc = ast.to_description
771 This will obviously send us into ClosedScope#to_description which
772 you will find in bytecode.rb.
This is a bit of a tangled web. We start with three players: the
Script (which we will start calling $script) sets up a new MethodDescription
which also provides a Generator ( <--- SUBJECT TO CHANGE ) which is
exposed to $script. Then, $script asks the MethodDescription to
run using $script as its argument. This, in turn, causes the
Generator ($gen) to run, again using $script as its starting
point, and as before the descent starts from here. Eventually, after
this, the MethodDescription gathers #argument_info from the Node
and returns. Because $script is sort of a pseudomethod, it will
only have dummy values here. For now, let us proceed though.
785 desc.run $script -> $gen.run $script
787 Generator#run is an exceedingly simple matter: it merely requests
788 the Node given to it to execute its #bytecode method, passing
789 itself as a parameter in order to continue down the tree. So in
790 our case, we head into Script#bytecode.
The #bytecode methods are the backbone of this entire operation,
as mentioned earlier. Each Node knows the logic required to
compile itself and this logic is expressed by the #bytecode methods
calling various methods on the Generator object they are passed.
The Generator, then, records those operations into its internal
799 Script#bytecode starts by generating a prelude (compiler term for
800 work that is done to set up a method before code inside it starts
801 executing) which simply reserves space for local variables within
802 this scope. The space required has been determined in the earlier
803 calls to ClosedScope#formalize! which was a part of the cycle
804 for ClosedScope#consume. Since there were no locals at the toplevel,
805 the size we need is 0.
To actually achieve this, we call $gen.allocate_stack which in
turn just calls Generator#add with the argument pair +:allocate_stack+, 0.
This is, essentially, the "assembly code."
Generator#add handles incrementing its instruction pointer to
account for the new code in the stream (the ip always points to
the last element in the stream) and then looks up the actual numeric
opcode of our "assembly" instruction +:allocate_stack+. It happens
to currently be 97 but may change at any time, so for clarity
our pseudocode will not refer to it at all. It is just extra
info kept in the stream by the Generator itself.
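
A toy version of this bookkeeping might look like the sketch below. It
is not the code in generator.rb: the class name is made up, the stream
layout is an assumption, and so is the rule that the ip advances by one
slot for the opcode plus one per argument. Only the opcode 97 for
+:allocate_stack+ comes from the text above.

```ruby
# Sketch only: a toy stand-in for Generator#add.
class GeneratorSketch
  # 97 is the value quoted in the text; the other numbers are invented.
  OPCODES = { allocate_stack: 97, check_argcount: 1, push_literal: 2 }

  attr_reader :stream, :ip

  def initialize
    @stream = []
    @ip = 0
  end

  # Record one operation together with its numeric opcode and
  # move the instruction pointer past it.
  def add(name, *args)
    opcode = OPCODES.fetch(name)
    @stream << [opcode, name, *args]
    @ip += 1 + args.size
  end

  # Operations are thin wrappers that delegate to #add.
  def allocate_stack(size)
    add :allocate_stack, size
  end
end

gen_sketch = GeneratorSketch.new
gen_sketch.allocate_stack 0
p gen_sketch.stream
p gen_sketch.ip
```

Keeping the symbolic name next to the numeric opcode in the stream is
what lets the pseudocode stay readable while the numbers remain free to
change.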
819 For our code, the instruction pointer (ip) is not that important
820 but it is very much so when dealing with branching. After you are
821 done with this, see (TODO: link to text about labels) for more info.
823 The prelude done, $script wants to have its body processed before
824 proceeding with the rest of its code so that the instruction stream
825 is in the correct order.
828 # $gen.allocate_stack 0
829 # $gen.add :allocate_stack, 0
830 # Increment instruction pointer, look up opcode
831 # $script.body.bytecode
834 ; No allocation needed
836 # We are starting back up here
844 The body itself is a Newline node and the first thing one of
845 those wants to do is to let the Generator know its line and
846 file information. Generator#set_line actually does a bit more
847 work because for all the lines in the code, it keeps track of
848 the starting and ending instruction pointers (or in other words,
849 stream positions.) Importantly, this information is not recorded
850 in the instruction stream and is just kept track of by $gen.
851 (These helper methods are collectively referred to as "commands"
852 whereas the ones that actually affect the stream are "operations.")
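
The difference between a command and an operation can be sketched as
follows. The class is hypothetical and the storage format is an
assumption; the only behaviour taken from the text is that #set_line
and #close track starting and ending ips without adding anything to the
instruction stream. The ip values used are purely illustrative.

```ruby
# Sketch only: line bookkeeping kept outside the instruction stream.
class LineTrackerSketch
  attr_reader :lines, :stream
  attr_accessor :ip

  def initialize
    @ip = 0
    @stream = []   # never touched by the commands below
    @lines = []    # entries of [line, first_ip, last_ip]
    @current = nil
  end

  # Command: start a new line at the current ip, closing any open one.
  def set_line(line, file)
    close
    @file = file
    @current = [line, @ip, nil]
  end

  # Command: finalise the open line with the current ip.
  def close
    return unless @current
    @current[2] = @ip
    @lines << @current
    @current = nil
  end
end

tracker = LineTrackerSketch.new
tracker.ip = 3
tracker.set_line 2, "test.rb"
tracker.ip = 12    # pretend some operations were added in between
tracker.close
p tracker.lines
```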
854 Since we just started, line 1 of the real code starts at $gen.ip == 2
855 and ends.. well, we do not know yet. The file is also stored but
856 since Ruby code is always compiled in units of one file, there is
857 nothing fancy to do there.
859 The only other thing for a Newline to do is to send the code
860 it contains to be processed.
862 # Nothing new in the stream
863 ; No allocation needed
876 The next step is quite interesting. As you remember, we are technically
877 inside a MethodDescription object--so what happens when we need to
878 describe another method? There are some important concepts to cover.
880 Define#bytecode starts off with $gen.push_literal compile_body().
881 Define#compile_body actually produces a new MethodDescription --
882 to which we will get in a moment -- so the "literal" is not one in the
883 sense of a string literal or an array literal. What actually happens
884 here is that the object given to #push_literal is stored with other
885 literals in $gen and its index within that group is used for the
886 actual "assembly". The actual literal objects will eventually get
887 stored in the CompiledMethod object (if we are compiling to a file,
888 then even further along they will be marshalled along with the rest
889 of the CompiledMethod.)
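
The literal bookkeeping can be pictured with the following sketch. The
class is hypothetical, and comparing literals by equality is an
assumption on my part; the text only tells us that objects are stored
with the other literals and that the index is what ends up in the
"assembly."

```ruby
# Sketch only: a toy literal table that hands out stable indexes.
class LiteralTableSketch
  attr_reader :literals

  def initialize
    @literals = []
  end

  # Return the index of an already-stored literal, or store it first.
  def find_literal(object)
    index = @literals.index(object)
    return index if index
    @literals << object
    @literals.size - 1
  end
end

table = LiteralTableSketch.new
foo_index  = table.find_literal("foo")
puts_index = table.find_literal(:puts)
again      = table.find_literal("foo")
p [foo_index, puts_index, again]
```

Because the instruction only carries an index, any Ruby object can be
a "literal" here, including another MethodDescription.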
Again, in the interest of maintaining the correct sequence, the code
that makes up the method body will get generated first. Exploring
Define#compile_body shows a very similar process to what got us
started with $script. We set up a new MethodDescription which
means that for the time being we will be working with a different
Generator. A new prelude is produced but since we still have no
local variables, the stack allocation needed is again 0.
899 In the pseudocode, in order to distinguish the Generators, the
900 original will be $gen and the one for foo is $subgen whenever
901 needed. For fun, I will also indent foo.
903 Next, in this order, we generate the argument handling code and then
907 ; No allocation needed
910 ; No allocation needed
927 The processing of Arguments is a bit more involved than our
928 previous Nodes (or would be if #foo took any.) Even in our
929 case, though, the argument count needs to be checked so we
930 figure out how many are needed (0) and how many can be given
931 (also 0.) Interestingly, the maximum possible number of arguments
932 is currently 1024. First, back to Define and then down the
933 other fork to its body.
936 ; No allocation needed
940 check_argcount 0 0 ; ip == 3
956 Define wants to generate the code for its body so Scope#bytecode
957 is our next destination but all there is to do is to further descend
958 in and ask for Block#bytecode from the block contained (nope, we
959 are STILL not touching the locals.) Block in our case also ends up
960 just asking its body for its #bytecodes since we only have one
964 ; No allocation needed
968 check_argcount 0 0 ; ip == 3
988 Another Newline! We started one earlier but since that one resides
989 in $gen, opening a new one here really does not have any effect on
990 it. Line 2 starts at ip 3 which gives us the interesting property that
991 ips 0-2 really do not live anywhere. Next, processing goes to the child
995 ; No allocation needed
999 check_argcount 0 0 ; ip == 3
1014 .arguments = [StringLiteral
The main bytecode generation for all call types happens not in the top
class of that hierarchy, MethodCall, but in Call#bytecode. The same
goes for our FCall. The first order of business is to check whether a
plugin can handle this node (TODO: separate doc about the plugin arch)
and in this case, the answer is "no" so we proceed. Next come the
arguments for the call, through #emit_args: we do have args and the
args are held in an Array so each of those gets generated -- in
reverse order -- and the argument count is stored. This means a short
side trip to StringLiteral#bytecode which just uses
$subgen.push_literal on its string and then calls $subgen.string_dup
which will just cause a new duplicate string to replace the original
on the stack when run. (Here is a bit of Generator magic: any
operation that is not explicitly defined just gets #added using
#method_missing.) #push_literal, as you recall, either stores a new
literal or finds an already-stored one and gives back its index. Here
is a peek at what we look like before heading back to
FCall#emit_args.
1036 (Another pseudocode convention: #<literal> means the index of the
1037 literal from #find_literal.)
1040 ; No allocation needed
1044 check_argcount 0 0 ; ip == 3
1047 push_literal #<"foo"> ; ip == 5
1048 string_dup ; ip == 6
Nothing further to do in #emit_args but to store the arg count
and then we drop down to FCall#bytecode again. There are checks
for block code generation as well as dynamic arguments (these are
+:splats+, +:argscats+ and +:argspushes+), neither of which affects
us. The next code to generate would be for the receiver (the receiver
could, for example, itself be the result of a method call) but we are
again lucky because an FCall by definition has no receiver so
all calls will be to self. This work is all put together in
the $subgen.send operation, which in our case will start by setting
the call flags to allow for private methods (which MUST be called
without an explicit receiver*.) Further, the method name we want
is looked up or stored in literals and its index retrieved. We
also encounter a small optimisation here: normally, the next
instruction would be +:send_stack+, but because we have fewer than
5 args, 1 to be precise, a special instruction +:meta_send_stack_1+
is used instead. This is what we have when we head back towards
1069 ; No allocation needed
1073 check_argcount 0 0 ; ip == 3
1076 push_literal #<"foo"> ; ip == 5
1077 string_dup ; ip == 6
1083 set_call_flags 1 ; ip == 9
1084 meta_send_stack_1 :puts ; ip == 11
Nothing further to do in Newline#bytecode, Block#bytecode is done
as is Scope#bytecode so we finally just end up in Define#compile_body
which we need to finish up. There are two small things to do:
$subgen.sret will emit the code necessary to return from the method
and $subgen.close is a command to finalise any open lines, so now we
know that line 2 started at ip 3 and ended at ip 12.
1112 ; No allocation needed
1116 check_argcount 0 0 ; ip == 3
1119 push_literal #<"foo"> ; ip == 5
1120 string_dup ; ip == 6
1126 set_call_flags 1 ; ip == 9
1127 meta_send_stack_1 :puts ; ip == 11
1132 Code generation responsibility shifts again to the original MethodDescription
1135 #compile_body actually returns the new MethodDescription we have
1136 been working with so we can finally store the literal and then have
1137 the newly generated method added to self which in our case is the
1138 top-level entity. Notably, the +:add_method+ operation handles looking
1139 up its argument (the name) in literals before generating the real
1140 instruction so it is not necessary to do that part.
1144 push_literal #<foo> ; ip == 2
1146 add_method :foo ; ip == 5
1150 check_argcount 0 0 ; ip == 3
1153 push_literal #<"foo"> ; ip == 5
1154 string_dup ; ip == 6
1160 set_call_flags 1 ; ip == 9
1161 meta_send_stack_1 :puts ; ip == 11
1173 Nothing to do in the next Newline either so eventually we just end
1174 up at the very topmost Script#bytecode which merely #pops whatever
1175 the return value of the method definition is (since there is no use
1176 for it and it cannot be left to litter the stack), #pushes true
1177 as its own "return value" and returns.
1181 push_literal #<foo> ; ip == 2
1183 add_method #<:foo> ; ip == 5
1192 check_argcount 0 0 ; ip == 3
1195 push_literal #<"foo"> ; ip == 5
1196 string_dup ; ip == 6
1202 set_call_flags 1 ; ip == 9
1203 meta_send_stack_1 :puts ; ip == 11
Execution takes us back to the first MethodDescription#run which has
one last thing to do: $gen.close. We now know that line 1 ends at
$gen ip == 8 (and started at ip 2.) At this point, we return to the caller.
Now we can actually create the CompiledMethod.
1213 If you want to know what the instructions above do, you can look at the
1214 {opcode docs}[http://rubini.us/doc/vm].
1216 desc = ast.to_description
1217 compiled = desc.to_cmethod # -> $gen.to_cmethod desc
== Stage 3: Encoding and CompiledMethod
One clarification: Rubinius does not actually use BYTEcode in the sense of
one-byte-wide opcodes, so the term is used here in its banal meaning of
machine code for a virtual machine. Rubinius' machine code is in fact INTcode.
encoder.rb will be the main venue here (although, again,
{the VM docs}[http://rubini.us/doc/vm] are an excellent resource for
anything to do with Rubinius bytecode.)
The process of creating a CompiledMethod starts from Generator#to_cmethod
to which we get through MethodDescription#to_cmethod. The first thing that
happens is that the Generator goes through its entire stream, "collapsing"
any Labels it encounters. You can review (TODO: link to label text) for
more info but suffice to say Labels are just positions in the stream that
can be used for branching, gotos and jumps. Our code has no Labels so we
need not worry about it. The next thing we need to do is have the Encoder
convert our "assembly code" into an InstructionSequence.
1237 If you look at encoder.rb, it is almost deceptively simple. At the top there
1238 are several data structures to help us keep track of the various opcodes and
1239 any special properties they may have. Towards the bottom, there is only a
1240 little tiny bit of code that is supposed to handle everything for us.
1242 We start off with Encoder#encode_stream, expecting back an InstructionSequence.
1243 #encode_stream first calculates the size required for the iseq and then creates
1244 one. Then it simply #encodes each operation in the stream, keeping track of
1245 the offset into the InstructionSequence.
#encode is not a very complicated method, either: the numeric opcode is
looked up and then its "width" or number of additional arguments is
determined from the lookup tables at the top of encoder.rb. Each one of
these elements (maximum three: the opcode plus 0, 1 or 2 arguments) gets
encoded into the InstructionSequence using Encoder#int2str.
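
The lookup-and-width step can be sketched roughly as follows. This is only
an illustration: the opcode numbers, the table names and the helper methods
are hypothetical and are not taken from encoder.rb, and the literal index 0
is a placeholder. The sketch emits plain integers and records the ip reached
after each operation.

```ruby
# Hypothetical lookup tables. The numbers are invented for illustration and
# are NOT the real values from encoder.rb.
OPCODES = { :check_argcount => 3, :push_literal => 1, :string_dup => 2 }
WIDTHS  = { :check_argcount => 2, :push_literal => 1, :string_dup => 0 }

# Turn one operation such as [:push_literal, 0] into its integer elements:
# the numeric opcode followed by 0, 1 or 2 arguments.
def encode_op(op)
  name, *args = op
  code  = OPCODES.fetch(name)
  width = WIDTHS.fetch(name)
  unless args.size == width
    raise ArgumentError, "#{name} takes #{width} argument(s)"
  end
  [code, *args]
end

# Encode a whole stream, remembering the ip reached after each operation.
def encode_ops(stream)
  ints = []
  ips  = []
  stream.each do |op|
    ints.concat(encode_op(op))
    ips << ints.size
  end
  [ints, ips]
end

stream = [[:check_argcount, 0, 0], [:push_literal, 0], [:string_dup]]
ints, ips = encode_ops(stream)
# ints is [3, 0, 0, 1, 0, 2] and ips is [3, 5, 6]
```

The ip values 3, 5 and 6 line up with the first three instructions of the
listing for foo shown earlier, since each element occupies one slot.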
1253 #int2str just writes the given number into the iseq as a big-endian 4-byte
1254 integer split into 4 byte-sized segments. For example, 17 would end up as
1255 [0, 0, 0, 17] whereas 256 becomes [0, 0, 1, 0] and so on. The fully
1256 encoded InstructionSequence is then returned to Generator#to_cmethod.
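
The byte layout described above can be reproduced in plain Ruby. The helper
names below are hypothetical stand-ins and not the actual body of
Encoder#int2str; they only show two equivalent ways of getting the four
big-endian bytes.

```ruby
# Split an integer into 4 big-endian bytes using Array#pack.
# "N" means 32-bit unsigned integer in network (big-endian) byte order.
def int_to_bytes(n)
  [n].pack("N").unpack("C*")
end

# The same thing done by hand with shifts and masks.
def int_to_bytes_manual(n)
  [24, 16, 8, 0].map { |shift| (n >> shift) & 0xff }
end

int_to_bytes(17)          # => [0, 0, 0, 17]
int_to_bytes(256)         # => [0, 0, 1, 0]
int_to_bytes_manual(256)  # => [0, 0, 1, 0]
```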
Next we create the CompiledMethod object using the InstructionSequence
just created. (The CompiledMethod, by the way, is immensely cool: do
var = def my_method ... in sirb and see what all you can do with var!
Also see TODO: Rubinius reflection capabilities.) The CM is not quite
ready yet, though, so we store its name, the file it was defined in
(these are either real names or one of the special names such as
+:__eval__+ or +:__unknown__+) as well as the serial which is essentially
a version number.
Before you think I forgot about foo, that comes next as #to_cmethod
goes through #encode_literals, #encode_lines and #encode_exceptions
(as you remember, the MethodDescription for foo was stored as a literal).
1270 All three of those are very similar and do not really perform any actual
1271 encoding like we saw before. Each just goes through its ascribed list
1272 and stores the objects in a Tuple (a very primitive fixed-size container)
1273 and the only exception to this is #encode_literals and then only for
1274 any MethodDescriptions it finds. For those it calls lit.to_cmethod
1275 which obviously then recursively generates the CompiledMethods. The
generated ones get stored in the literals tuple in place of the old
MethodDescriptions.
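
The recursion in #encode_literals can be pictured with a small model. The
two Struct classes below are hypothetical stand-ins for MethodDescription
and CompiledMethod, and a frozen Array plays the role of the Tuple; none of
this is the real Rubinius code.

```ruby
# Hypothetical stand-in for CompiledMethod: just a name and its literals.
Compiled = Struct.new(:name, :literals)

# Hypothetical stand-in for MethodDescription.
Description = Struct.new(:name, :literals) do
  def to_cmethod
    Compiled.new(name, encode_literals)
  end

  # Copy the literals into a fixed container, compiling any nested
  # description on the way so that it is replaced by its compiled form.
  def encode_literals
    literals.map { |lit|
      lit.is_a?(Description) ? lit.to_cmethod : lit
    }.freeze
  end
end

foo    = Description.new(:foo, ["foo", :puts])
script = Description.new(:__script__, [foo, :foo])

cm = script.to_cmethod
# cm.literals[0] is now a Compiled object for foo; cm.literals[1] is :foo
```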
1279 And that is all there is to it. The CompiledMethod is finished and
1280 it can be stashed away somewhere (a MethodTable, for example), it
1281 can be #activated to have the code run and so on. If you are interested
1282 in what happens when one IS activated, read TODO: link to execution.
Well, maybe there is one little thing that might deserve to be
covered still: how can CompiledMethods be stored on disk in .rbc
files?
== Stage 4: Bytecode on disk
1290 The loading and unloading of code -- CompiledMethods -- to disk
1291 happens through Marshal and is mostly defined in shotgun/lib/cpu_marshal.c.
1293 The entire thing, from a high level, is fairly trivial. cpu_marshal() first
1294 brands the file with the special string "RBIX" followed by a version number
1295 to be able to detect the formatting required (when loading back again.) From
1296 there, cpu_marshal_cmethod2() takes care of the rest. The basic idea for
1297 marshalling is simple although it varies slightly from one object to another:
1298 first a type identifier is printed ("M" for a CompiledMethod) possibly
1299 followed by some type of version identifier or a size (a string, for
1300 example, will write the number of characters) that will allow unmarshalling
1301 the object later. After this comes the object data.
1303 cpu_marshal_cmethod2() loops through the 16 fields of a CompiledMethod
1304 and recursively marshals them. For example, the name (field 5) is a Symbol
1305 which uses the format 'x' + size + characters so for our method from
1306 above, the end-result would be, in hex:
1308 78 00 00 00 03 66 6f 6f
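
Those bytes can be reproduced with a few lines of Ruby. The method name is
hypothetical and the real work is done in C in cpu_marshal.c; the sketch
also assumes the size is a 4-byte big-endian integer, as the hex dump
suggests, and that the name is plain ASCII so characters and bytes coincide.

```ruby
# Marshal a Symbol as 'x' + size + characters (assumed layout, see above).
def marshal_symbol(sym)
  name = sym.to_s.b                      # work on raw bytes
  "x".b + [name.bytesize].pack("N") + name
end

# Render a binary string as space-separated hex pairs.
def hex_dump(str)
  str.unpack("C*").map { |byte| "%02x" % byte }.join(" ")
end

hex_dump(marshal_symbol(:foo))  # => "78 00 00 00 03 66 6f 6f"
```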
1311 Unmarshalling back into a CompiledMethod is a very similar process.
1312 All that happens is that a new, blank CompiledMethod is allocated
1313 and then its fields are unmarshalled using a similar process. The
1314 end-result is a finished CompiledMethod.
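
For a single field, the reverse direction might look like the sketch below.
It reads the Symbol layout described above ('x', a 4-byte big-endian size,
then the characters) from any IO-like object. The method name is
hypothetical; the actual unmarshalling is implemented in C.

```ruby
require "stringio"

# Read one marshalled Symbol from io: type tag, size, then the characters.
def unmarshal_symbol(io)
  tag = io.read(1)
  raise ArgumentError, "expected 'x', got #{tag.inspect}" unless tag == "x"
  size = io.read(4).unpack("N").first
  io.read(size).to_sym
end

data = [0x78, 0x00, 0x00, 0x00, 0x03, 0x66, 0x6f, 0x6f].pack("C*")
unmarshal_symbol(StringIO.new(data))  # => :foo
```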
1316 Easy, right? The implementation, of course, is a bit more involved
1317 (mostly because of working with files) but from our perspective it
is child's play. Do remember, though, that this marshal format is
1319 *not* compatible with MRI, not even for the parts that exist in
1320 both implementations such as Strings.