1 This is Info file suif1.info, produced by Makeinfo version 1.68 from
2 the input file suif1.texi.
4 This file documents the SUIF library.
6 Copyright (C) 1994 Stanford University. All rights reserved.
8 Permission is given to use, copy, and modify this documentation for
9 any non-commercial purpose as long as this copyright notice is not
10 removed. All other uses, including redistribution in whole or in part,
11 are forbidden without prior written permission.
14 File: suif1.info, Node: For Nodes, Next: Block Nodes, Prev: Loop Nodes, Up: Tree Nodes
19 Many of our compiler passes are designed to work with Fortran `DO'
20 loops because they are relatively easy to analyze. A `DO' loop has a
21 single index variable that is incremented or decremented on every
22 iteration and varies from an initial to a final value. SUIF uses
23 `tree_for' nodes to represent well-structured `DO' loops. The exact
24 conditions that a loop must meet to be represented as a `tree_for' in
25 SUIF are described below. The expander's cleanup pass, which is run
26 immediately after the front-end, converts `tree_for' nodes that violate
27 any of these conditions into `tree_loop' nodes (*note Loop Nodes::.).
28 Even though they are primarily intended for Fortran code, `tree_for'
29 nodes may also be used for C loops that meet the same conditions.
31 The index variable of a `tree_for' can be accessed using the `index'
32 method and set using the `set_index' method. The index variable must
33 be a scalar variable that is defined in the scope of the `tree_for'
34 node. It may not be modified anywhere within the loop body. This
35 applies across procedures as well. If the loop body contains a call to
36 another procedure that modifies the index variable, then the loop
37 cannot be represented by a `tree_for' node. Moreover, if you are using
38 Fortran form, the index variable may not be a call-by-reference
39 parameter. *Note Fortran::.
41 The range of values for the index variable is specified by three
42 `operand' fields. *Note Operands::. The lower and upper bounds and
43 the step value can be accessed by the `lb_op', `ub_op', and `step_op'
44 methods. The `set_lb_op', `set_ub_op', and `set_step_op' methods are
45 provided to change them. These operands are expressions that are
46 evaluated once at the beginning of the loop. The index variable is
47 initially assigned the lower bound value and then incremented by the
48 step value on every iteration until it reaches the upper bound value;
49 the code to do this is automatically created when the `tree_for' is
50 expanded to low-SUIF code.
52 Most users will always use `tree_for' nodes in conjunction with
53 expression trees. Flat lists of instructions are typically used only in
54 the library and with back-end passes where the `tree_for' nodes have
55 been dismantled. It is possible to use `tree_for' nodes without
56 expression trees, but the bounds and step values cannot be treated as
57 operands. In fact, even with expression trees those operands are
58 actually stored on tree node lists. If necessary, these lists can be
59 accessed directly using the `lb_list', `ub_list', and `step_list'
60 methods. Each list is required to contain a single expression with a
61 dummy copy instruction at the root. The destination of the dummy copy
62 must be a null operand. Methods are provided in the tree node list
63 class to extract the operands from the tree node lists (*note Tree Node
66 The `tree_for' must also specify the comparison operator used to
67 determine when the index variable has reached the upper bound value.
68 The possible comparison operators are members of the `tree_for_test'
69 enumerated type. The `test' and `set_test' methods are used to access
70 and modify the comparison operator for a `tree_for' node. The
71 `tree_for_test' enumeration includes quite a few comparison operators,
72 but some of them are only used by the front-end. Both signed and
73 unsigned versions are available for most of the comparison operators,
74 as indicated by the letters "`S'" and "`U'" in their names.
79 Less than. Repeat as long as the index is strictly less than the
85 Less than or equal. Repeat as long as the index is less than or
86 equal to the upper bound.
91 Greater than. Repeat as long as the index is strictly greater
92 than the upper bound. Sometimes `DO' loops go backwards, using a
93 negative step value. For those loops, the comparison operator
94 must also be reversed.
99 Greater than or equal. Repeat as long as the index is greater
100 than or equal to the upper bound. Again, this may be used when
101 the step value is negative.
106 These comparisons are only used by the front-end. In FORTRAN, it
107 may not be possible to determine the direction of a loop at
108 compile time. For example, if the step value is not a constant,
109 it could be either positive or negative. These comparison
110 operators indicate that the loop test may be either "greater than
111 or equal" or "less than or equal", depending on the sign of the
112 step value. The expander's cleanup pass converts any `tree_for'
113 nodes with these tests to two `tree_for' nodes and a `tree_if'
114 node to decide between them. Thus, these comparison operators
115 should never be encountered in most SUIF code.
120 Equal and not equal. These comparisons are only used by the
121 front-end. The expander's cleanup pass dismantles `tree_for'
122 nodes that use these comparisons.
124 The body of a `tree_for' loop is stored in a tree node list. The
125 methods to get and set the loop body are `body' and `set_body',
126 respectively. The `body' list contains only the instructions
127 corresponding to the body of the loop in the source program. The
128 instructions to compare the index variable with the upper bound,
129 increment it, and branch back to the beginning of the body are not
130 included as part of the body; they are created when the `tree_for' is
131 expanded to low-SUIF code.
133 Besides the loop body, a `tree_for' node has an additional tree node
134 list called the `landing_pad'. The code in the landing pad is executed
135 if and only if the loop body is executed at least one time, but the
136 `landing_pad' is executed only once, unlike the body which is usually
137 executed many times. The `landing_pad' is executed immediately before
138 the first time through the loop body. The landing pad thus provides a
139 place to move loop-invariant code.
141 Two labels are associated with a `tree_for': `contlab' and `brklab'.
142 A "continue" statement in the loop body requires a jump over the rest
143 of the body to the code that increments the index and continues with
144 the next iteration. This can be implemented with a jump to the
145 `contlab' label, which is implicitly located at the end of the `body'
146 list. Similarly, a "break" statement in the loop is translated to a
147 jump to the `brklab' label which is located immediately after the loop.
148 These two labels must be defined in the scope of the `tree_for' node,
149 but the label instructions that mark their locations are not inserted
150 into the tree node lists until the `tree_for' node is expanded into
153 In summary, the semantics of a `tree_for' node are as follows. The
154 lower bound, upper bound, and step operands are evaluated once at the
155 beginning of the loop (1). The lower bound is compared to the upper
156 bound. If the test fails, the loop does not execute. Otherwise, the
157 lower bound is assigned to the index variable, and any code in the
158 landing pad list is executed. After that, the body is executed
159 repeatedly and the index variable is incremented by the step value on
160 every iteration, until the index variable fails the test with the upper
163 As an example, the following C code could be translated into the SUIF
164 code shown in a simplified form below:
166 for (i = 100; i >= 0; i--) {
170 FOR (Index=i Test=SGTE Cont=L:__L1 Brk=L:__L2)
180 e1: array e2+0, size(32), index(i), bound(e3)
185 ---------- Footnotes ----------
187 (1) The code produced directly by the C front-end assumes that the
188 upper bound and step operands are reevaluated on every iteration. The
189 expander's cleanup pass dismantles any `tree_for' nodes for which it
190 cannot guarantee that these semantics are equivalent.
193 File: suif1.info, Node: Block Nodes, Next: Procedure Nodes, Prev: For Nodes, Up: Tree Nodes
198 A `tree_block' node introduces a new scope. Nested scopes are
199 useful for retaining scope information from source programs and for
200 debugging purposes. They are also very useful for implementing code
201 transformations. For example, name conflicts are easily avoided by
202 introducing new scopes.
204 A `tree_block' node contains a `block_symtab' symbol table and a
205 tree node list for the body. The symbol table is accessed with the
206 `symtab' and `set_symtab' methods. The `body' and `set_body' methods
207 are used to get and set the list of tree nodes inside the block.
209 There is a one-to-one correspondence between a `tree_block' and its
210 symbol table, and the hierarchy of symbol tables within a procedure must
211 parallel the hierarchy of `tree_block' nodes. When inserting a
212 `tree_block' into an AST, the new block's symbol table must be inserted
213 in the appropriate place in the symbol table hierarchy. When a
214 `tree_block' is destroyed, the associated symbol table is detached from
215 its parent table and destroyed. However, the converse is not
216 true--when a `block_symtab' is deleted, the corresponding `tree_block'
219 The entries in a block's symbol table may not be referenced from
220 outside the block. There are no other restrictions on `tree_block'
221 nodes. The bodies may be empty or contain any kinds of tree nodes.
222 Blocks are usually entered from the beginning but that is not required;
223 unlike other AST nodes, branches into the middle of a block are allowed.
225 As an example, the following code creates a new `tree_block' and its
226 associated symbol table. It then adds the new block to the body of an
229 block_symtab *new_symtab = new block_symtab("my_symtab");
230 cur_block->symtab()->add_child(new_symtab);
231 tree_node_list *tl = new tree_node_list;
232 tree_block *new_block = new tree_block(tl, new_symtab);
233 cur_block->body()->append(new_block);
236 File: suif1.info, Node: Procedure Nodes, Prev: Block Nodes, Up: Tree Nodes
241 A special kind of block node is used as the root of each AST. These
242 `tree_proc' nodes are derived from the `tree_block' class. They are
243 essentially the same as other block nodes, but they include a few extra
244 methods and have a slightly different kind of symbol table. SUIF does
245 not support nested procedures, so `tree_proc' nodes may only be used at
248 A procedure node's symbol table must be a `proc_symtab' instead of
249 just an ordinary `block_symtab'. The `proc_syms' method is provided to
250 cast the symbol table pointer to be a `proc_symtab'. A `tree_proc'
251 node also has a pointer to the symbol of the corresponding procedure.
252 This pointer is set automatically when a `tree_proc' is attached to the
253 procedure symbol. For any tree node in an AST, the `proc' method
254 retrieves this procedure symbol from the `tree_proc' at the root of the
257 Because `tree_proc' nodes are always at the roots of the ASTs, their
258 `parent' list pointers (*note Other Node Features::.) are always `NULL'.
261 File: suif1.info, Node: Tree Node Lists, Next: Mapping Subtrees, Prev: Tree Nodes, Up: Trees
266 A `tree_node_list' is a doubly-linked list of tree nodes, derived
267 from the `dlist' class. *Note Doubly-Linked Lists::. Most nodes in an
268 AST have their children recorded in tree node lists. For example, the
269 body of a block node is a tree node list. Besides the standard list
270 features, each tree node list includes a back-pointer to its parent tree
271 node. These pointers are maintained automatically by the SUIF library
272 and can be accessed using the `parent' method. In addition, each node
273 on a tree node list contains pointers back to the list and the list
274 element. *Note Other Node Features::. These pointers are set by the
275 `set_elem' method (*note Generic Lists::.) which is automatically
276 called every time an element is added to the list.
278 In a `tree_for' node, the lower bound, upper bound, and step value
279 are usually treated as operands; however, the operands are actually
280 stored as tree node lists. *Note For Nodes::. Each of these lists is
281 required to contain a single expression with a dummy copy instruction at
282 the root. The destination of the dummy copy must be a null operand.
283 The `tree_node_list' class includes several methods to handle these
284 special lists. The `is_op' method checks if the list consists of a
285 single copy instruction with a null destination. The `op' method
286 retrieves the operand stored in the list. If after converting the list
287 to expression tree form, the `is_op' method returns `TRUE', the `op'
288 method returns the source operand of the dummy copy instruction;
289 otherwise, an error occurs. The `set_op' method is used to set the
290 contents of the tree node list to a dummy copy instruction with the
291 specified operand as its source. The list must either be empty or
292 already contain the dummy copy. Finally, the `op_is_intconst' method
293 checks if the operand in the tree node list is a constant integer, and
294 if so, returns the value.
296 Tree node lists can easily be printed out as text. The `print'
297 method simply iterates through the list and calls the `print' method
298 for each tree node. The optional `depth' parameter specifies the
299 indentation level to be used. A separate method is used to print tree
300 node lists that hold `tree_for' operands. If the `is_op' method for
301 one of these lists returns `TRUE', the `print_expr' method prints the
302 operand directly. Otherwise, it prints the list as usual.
305 File: suif1.info, Node: Mapping Subtrees, Prev: Tree Node Lists, Up: Trees
307 Mapping Functions Over Subtrees
308 ===============================
310 Many SUIF programs contain code that visits all of the nodes in an
311 AST to perform various operations. Rather than having every programmer
312 duplicate the code for this traversal, the SUIF library includes methods
313 to map functions over ASTs. The `tree_node' and `tree_node_list'
314 classes both provide these `map' methods.
316 The function to be mapped must be of type `tree_map_f'. For a tree
317 node, the `map' method applies the function to the tree node and all of
318 its children. The `preorder' parameter is used to select either
319 preorder or postorder traversals; the default is preorder. For
320 preorder traversals, the function is applied to the tree node before it
321 is applied to the children. For postorder, the function is first
322 applied to the tree_node's children and descendants and then last
323 applied to the tree node itself.
325 The `map' method for a tree node list calls `map' for each node in
326 the list. The nodes are visited in order from first to last regardless
327 of the `preorder' parameter. The `tree_node_list' version of the `map'
328 method has an extra optional parameter that allows you to only map the
329 function over the entries in the list without recursively visiting
332 The `tree_map_f' functions have two parameters: a pointer to the
333 tree node and a `void*' value passed on from the `map' method.
334 Additional parameters can be passed by making the `void*' value a
335 pointer to a structure containing the parameters. For example, the
336 following code counts the number of `tree_for' and `tree_loop' nodes in
339 struct count_result {
344 void count_loops (tree_node *t, void *x)
346 count_result *results = (count_result *)x;
348 if (t->kind() == TREE_FOR) results->num_fors++;
349 else if (t->kind() == TREE_LOOP) results->num_loops++;
353 * Return the number of tree_fors and tree_loops in the procedure.
356 void counter (tree_proc *p, int *for_count, int *loop_count)
358 count_result results;
359 results.num_fors = 0;
360 results.num_loops = 0;
362 p->map(count_loops, (void *)&results);
364 *for_count = results.num_fors;
365 *loop_count = results.num_loops;
369 File: suif1.info, Node: Instructions, Next: Symbols, Prev: Trees, Up: Top
371 Instructions and Expression Trees
372 *********************************
374 At the leaves of the abstract syntax trees, the `tree_instr' nodes
375 (*note Instruction Nodes::.) hold SUIF instructions. Each instruction
376 performs a single operation, specified by its opcode. With just a few
377 exceptions, the opcodes are simple and straightforward, similar to the
378 instruction set of a typical RISC processor. The instructions include
379 arithmetic, data transfer, and control flow operations.
381 SUIF instructions may be used in two ways. Most high-level passes
382 use expression trees, where the instructions are grouped together in
383 trees that correspond to the expressions in the source programs.
384 Back-end optimization passes have traditionally worked with flat lists
385 of instructions, and SUIF supports that representation as well.
387 The files `instruction.h' and `instruction.cc' contain the code for
388 the SUIF instruction classes.
392 * Basic Features:: Features shared by all kinds of instructions.
393 * Operands:: Instruction operands.
394 * Expression Trees:: How to use expression trees.
395 * Three Operand Instructions:: The `in_rrr' class (most instructions).
396 * Branch and Jump Instructions:: The `in_bj' class.
397 * Load Constant Instructions:: The `in_ldc' class.
398 * Call Instructions:: The `in_cal' class.
399 * Array Instructions:: The `in_array' class.
400 * Multi-way Branch Instructions:: The `in_mbr' class.
401 * Label Instructions:: The `in_lab' class.
402 * Generic Instructions:: The `in_gen' class (non-standard).
405 File: suif1.info, Node: Basic Features, Next: Operands, Up: Instructions
407 Basic Instruction Features
408 ==========================
410 All instructions share some basic features. This section describes
411 the fields in the base `instruction' class. Other classes are derived
412 from the `instruction' class to include fields that are specific to the
413 different instruction formats.
415 Besides the features below, the instructions have ID numbers that
416 can be used to identify them across passes of the compiler. This is
417 discussed in detail elsewhere. *Note ID Numbers::.
419 The SUIF library also declares an `instruction_list' class for
420 working with lists of pointers to instructions. This class is not used
421 much within the library itself because SUIF does not store instructions
422 directly on lists. Even with the flat list form, the instructions are
423 contained in `tree_instr' nodes, which in turn are stored in lists.
424 However, it may sometimes be convenient to work with lists of
425 instruction pointers even if those instructions are actually stored in
430 * Opcodes and Formats:: Description of opcode and format enumerations.
431 * Destination Operands:: Conventions for the destination operands.
432 * Result Types:: Specifying the types of the result values.
433 * Parent Tree Nodes:: Back-pointers to the `tree_instr' nodes.
434 * Source Operands:: Standard methods for accessing sources.
435 * Printing Methods:: Printing instructions and operands as text.
438 File: suif1.info, Node: Opcodes and Formats, Next: Destination Operands, Up: Basic Features
443 The `opcode' and `set_opcode' methods may be used to access the
444 opcode field in an instruction. These opcodes are members of the
445 `if_ops' enumerated type. The internal names for the opcodes all begin
446 with `io_', but that prefix is omitted in this documentation. The
447 `if_ops_name' function takes an opcode and returns a character string
448 holding the opcode name (without the `io_' prefix). This is used by
449 the library when it prints out instructions, but you may also call it
452 Each opcode is associated with a particular instruction format. The
453 formats are specified by members of the `inst_format' enumeration. The
454 `format' method may be used to retrieve the format for a particular
455 instruction. Alternatively, the `which_format' function takes an
456 opcode and returns its format without reference to any particular
457 instruction. Each format corresponds to a particular derived
461 Three operand instructions use the `in_rrr' class.
464 Branch and jump instructions use the `in_bj' class.
467 The load constant instruction uses the `in_ldc' class.
470 The call instruction uses the `in_cal' class.
473 The array instruction uses the `in_array' class.
476 The multi-way branch instruction uses the `in_mbr' class.
479 The label instruction uses the `in_lab' class.
482 Generic instructions use the `in_gen' class.
484 The `format' method is often used in `switch' statements for dealing
485 with different kinds of instructions. For example, the following code
486 computes the number of source operands in an instruction (instead of
487 using the `num_srcs' method):
490 switch (i->format()) {
500 in_cal *cali = (in_cal *)i;
501 n += 1 + cali->num_args();
505 in_array *arri = (in_array *)i;
506 n += 2 + 2 * arri->dims();
514 in_gen *geni = (in_gen *)i;
515 n += geni->num_srcs();
519 /* other formats (inf_ldc and inf_lab) have no source operands */
524 The opcodes and instruction formats are defined in the `opcodes.h'
525 and `opcodes.cc' files.
528 File: suif1.info, Node: Destination Operands, Next: Result Types, Prev: Opcodes and Formats, Up: Basic Features
533 The base `instruction' class includes a field for the destination
534 operand, even though not all instructions use it. The `dst_op' method
535 returns the value of this operand. If an instruction does not produce
536 a result, the destination operand should always be a null operand.
537 Moreover, the destination may be null even if an instruction does
538 produce a result. In that case, the result value is computed and then
541 If the destination operand is a variable, its type must be compatible
542 with the result type of the instruction.
544 If the result of an instruction is a temporary value, the destination
545 operand is a pointer to the instruction where the value is used. Note
546 that this is different than an instruction pointer in a source operand.
547 The source operands of an instruction point to the children in an
548 expression tree, while the destination operand points to the parent
551 The `set_dst' method may be used to set the destination operand for
552 an instruction. However, there are some important restrictions on using
555 * For derived classes with instructions that never produce result
556 values, it is illegal to call `set_dst'. This allows the library
557 to ensure that the destination will always be a null operand (1).
558 Trying to use `set_dst' for one of these instructions will cause an
561 * You may not set the destination operand to point to another
562 instruction; trying to do so will cause an error. This is to
563 prevent inconsistent pointers. The destination should only point
564 to another instruction if that other instruction has a source
565 operand that points back to it. Thus, the library automatically
566 sets the destination operand when an instruction is used as a
569 * If the destination operand already contains a pointer to another
570 instruction, you cannot use `set_dst' to overwrite it. This also
571 causes a run-time error. As in the previous case, this is to
572 prevent inconsistent pointers. Instead, you must first call the
573 instruction's `remove' method to clear the destination operand.
574 *Note Source Operands::.
576 ---------- Footnotes ----------
578 (1) Note that for other classes it is your responsibility to make
579 sure that the destination is null if the instruction does not produce a
583 File: suif1.info, Node: Result Types, Next: Parent Tree Nodes, Prev: Destination Operands, Up: Basic Features
588 Each instruction also has a field to specify the result type. The
589 `result_type' and `set_result_type' methods may be used to access this
590 field. If an instruction produces a result value (regardless of
591 whether the result is actually used), the type of the result must be
592 indicated. Otherwise, the result type should be the SUIF `void' type.
594 If the instruction produces a value, the result type must be an
595 object type (i.e. not a function type) with non-zero size and that size
596 must be known at compile time. This restriction is to exclude arrays
597 with unknown or symbolic bounds, the only types without known sizes.
600 File: suif1.info, Node: Parent Tree Nodes, Next: Source Operands, Prev: Result Types, Up: Basic Features
605 Each instruction automatically records the `tree_instr' node to
606 which it is attached. The `parent' method retrieves this pointer. In
607 an expression tree, all of the instructions share the same parent
608 `tree_instr' node, so an instruction may not be directly connected to
611 The `owner' method is identical to the `parent' method except for
612 instructions that are used as operands of `tree_for' nodes. Those
613 operands are actually attached to dummy copy instructions on lists
614 attached to the `tree_for' nodes (*note For Nodes::.), but many users
615 will want to treat them as if they are directly attached to the
616 `tree_for' nodes. Thus, for an instruction in a `tree_for' operand,
617 the `owner' method will return a pointer to the `tree_for' node instead
618 of returning the actual parent `tree_instr'. This only works if the
619 `tree_for' operand list contains a single expression (not a flat list);
620 otherwise, the actual `tree_instr' parent is returned.
623 File: suif1.info, Node: Source Operands, Next: Printing Methods, Prev: Parent Tree Nodes, Up: Basic Features
628 Even though the base `instruction' class does not contain any source
629 operand fields, it does provide several methods to access the source
630 operands in derived classes. First, the `num_srcs' method returns the
631 number of source operands in a particular instruction. Once you know
632 the number of sources, you can access them by number. This is
633 frequently useful when you need to visit all of the source operands
634 without concern for how they are used in the instruction. The `src_op'
635 and `set_src_op' methods provide this access. The operands are
636 numbered beginning from zero (like arrays in C). Specifying an operand
637 number that does not exist will cause an error. Do not use these
638 methods to access a particular operand field in an instruction; the
639 operand numbering used here is implementation-defined and subject to
642 The `src_map' method provides an alternate way to visit all of the
643 source operands. It applies a function that you specify to all of the
644 source operands. This can be used to implement recursive descents of
645 expression trees by making the mapped function call `src_map' if the
646 operand is an instruction. This is similar to what the `instr_map'
647 method in the `tree_instr' class does. *Note Instruction Nodes::. The
648 mapped function must match the `src_map_f' type which has three
649 parameters: a pointer to the instruction, a pointer to a copy of the
650 operand, and a `void*' value used to pass any other information needed
651 by the function. If the function returns `TRUE', the copy of the
652 operand will be assigned to the actual operand field within the
653 instruction. If it returns `FALSE', the source operand field is not
656 The SUIF library requires that instruction pointers in operands be
657 consistent. That is, a source operand may only point to an instruction
658 if the destination of that instruction points back to where the result
659 value is used. Because of this, you cannot simply overwrite a source
660 operand that contains an instruction pointer. Doing so would leave the
661 other instruction with an inconsistent destination operand. Instead,
662 you must first use the source operand's `remove' method. This clears
663 the instruction pointers in both the source and destination operands.
664 If the source instruction was part of an expression tree (i.e. it
665 didn't have its own parent `tree_instr' node), `remove' also clears its
666 `parent' pointer by calling the parent's `remove_instr' method (*note
667 Instruction Nodes::.). If the source operand is not an instruction,
668 the `remove' method does nothing so it is always safe to use it. The
669 `instruction' class also has a `remove' method which does the same
672 As an example, the following code clears all of the source operands
675 for (int n = 0; n < i->num_srcs(); n++) {
676 i->src_op(n).remove();
677 i->set_src_op(n, operand());
681 File: suif1.info, Node: Printing Methods, Prev: Source Operands, Up: Basic Features
686 Instructions can be printed out as text using the `print' method.
687 The optional `depth' parameter specifies the indentation level for the
688 output. The `elab' and `en' parameters are also optional and are only
689 used when printing sub-expressions. Most users should never need to
690 worry about these extra parameters. The `elab' parameter is the number
691 for the label attached to the instruction, and the `en' parameter is
692 the next number to be used for labeling subexpressions.
695 File: suif1.info, Node: Operands, Next: Expression Trees, Prev: Basic Features, Up: Instructions
700 Most SUIF instructions have operands that are represented by objects
701 of the `operand' class, which is implemented in the files `operand.h'
702 and `operand.cc'. An operand may have three different kinds of values:
704 * It may be a null operand. The `is_null' method tests for this
705 case. Null operands occur when an operand field in an instruction
708 * An operand may refer directly to the symbol for a variable. The
709 `is_symbol' method tests for this case. The `symbol' and
710 `set_symbol' methods access the symbol field. A variable in a
711 source operand indicates that the contents of the variable are
712 used. In a destination operand, a variable is assigned the result
715 * An operand may also point to another instruction. The `is_instr'
716 method tests for this, and the `instr' and `set_instr' methods
717 access the pointer field. This is used to implement expression
718 trees and, in low-SUIF, to refer to the results of other
719 instructions in a flat list. An instruction pointer in a source
720 operand means that the result of that instruction is used as the
721 operand. Conversely, an instruction pointer in a destination
722 operand indicates that the result value is used by that
725 The `is_expr' method is very similar to `is_instr'. Besides
726 checking if the operand holds an instruction pointer, it also tests if
727 that instruction is a subexpression that is not contained in a separate
728 `tree_instr' node (i.e. it's not in a flat list). This method should
729 not be used for destination operands.
731 Before changing a source operand from an instruction pointer to some
732 other value, the instruction must be removed. The `remove' method
733 checks if the operand is an instruction, and if so, calls the
734 instruction's `remove' method. *Note Source Operands::.
736 The operand class includes several methods to simplify common
737 operations. One frequently needs to know the type of a particular
738 operand. This can be determined from the variable's type or the
739 instruction's result type, but checking for these different kinds of
740 operands and then extracting the type is cumbersome. Instead, you can
741 use the `type' method which performs this operation for you. It
742 returns the SUIF `void' type for null operands. Another common
743 operation is testing an operand to see if it contains an integer
744 constant from a load constant instruction. The `is_const_int' method
745 checks if this is the case, and if so, it also returns the value of the
748 Operands have two different methods for printing themselves out to a
749 file as text. The `print' method is used by the library when printing
750 instructions. Because source and destination operands are handled
751 differently, `print' requires that you specify the instruction that
752 contains the operand so that it can determine how the operand is being
753 used. However, when debugging a program, you may not know which
754 instruction contains an operand. If you do know that it is used as a
755 source operand, you can use the `print_source' method to print it
756 without specifying the instruction.
759 File: suif1.info, Node: Expression Trees, Next: Three Operand Instructions, Prev: Operands, Up: Instructions
764 SUIF supports both expression trees and flat lists of instructions.
765 Procedures are always written out as flat lists. After a procedure body
766 has been read from an input file, it can be converted to expression tree
767 form. The `read_proc' method for the procedure symbol has a flag that
768 allows you to specify this; by default it builds the expression trees.
769 You may also do the conversion manually as described below. When the
770 procedure body is written out, it is automatically converted back to
773 In the flat list form, each SUIF instruction has its own
774 `tree_instr' node. With expression trees, the only change is that all
775 of the instructions within an expression are grouped under the same
776 `tree_instr' node. The instruction pointers in the operand fields do
779 Up to now we have implied that high-SUIF uses expression trees and
780 that low-SUIF uses the flat list representation, but the decision to use
781 expression trees is actually independent from the structure of the ASTs.
782 SUIF allows expressions trees to be used even if all of the high-level
783 control structures have been removed, and conversely the flat list
784 representation works with all kinds of AST nodes (although accessing the
785 `tree_for' operands is a little awkward). In fact, the two
786 representations can even be mixed in the same procedure!
788 Some caution is needed here. These two representations are not
789 equivalent. An expression tree only specifies a partial order for the
790 evaluation of the instructions in the tree, whereas a flat list is a
791 total ordering. Similarly, while flat lists allow you to intermix the
792 instructions from different expressions, with expression trees each
793 expression must be completely evaluated before proceeding to execute the
794 instructions for the next expression. Thus, flat lists give you precise
795 control of the execution order for the instructions. Because of this,
796 if you reorder the instructions in a flat list and then try to convert
797 to expression trees, the library may complain and promote temporaries to
798 variables in order to maintain the equivalent semantics.
800 Methods are provided in the `tree_node_list' class to convert back
801 and forth between expression trees and flat lists. The `flatten'
802 method converts all of the expression trees in the list and its child
803 lists to flat lists of instructions. This is simply a matter of
804 creating new `tree_instr' nodes to hold the subexpressions and
805 inserting them into the lists.
807 The `cvt_to_trees' method converts a tree node list to expression
808 tree form. This does not actually change the instructions, but rather
809 collects the instructions within an expression under a single
810 `tree_instr' node. Only contiguous instructions can be converted into
811 an expression tree. If other instructions from outside the expression
812 are intermixed, building the expression tree may change the behavior of
813 the code. Thus, if this occurs, the library will print a warning
814 message and split up the expression tree so that the original semantics
818 File: suif1.info, Node: Three Operand Instructions, Next: Branch and Jump Instructions, Prev: Expression Trees, Up: Instructions
820 Three Operand Instructions
821 ==========================
823 The vast majority of SUIF instructions are represented by the
824 `in_rrr' class which includes fields for two source operands. Not all
825 of the operands have to be used. For example, `nop' instructions don't
826 use any operands at all.
828 The `src1_op' and `src2_op' methods retrieve the source operands,
829 and the `set_src1' and `set_src2' methods assign new operands. Besides
830 the methods to directly access the two source operands, other methods
831 are provided to refer to these sources according to their uses with
834 * Shift and rotate instructions (`asr', `lsl', `lsr', and `rot') may
835 use the `shift_cnt_op' and `set_shift_cnt_op' methods to access
836 the operand that specifies the numbers of bits to shift or rotate.
837 The other source operand may be accessed with the `src_op' and
840 * Store (`str') and memory copy (`memcpy') instructions may use the
841 `dst_addr_op' and `set_dst_addr_op' methods to access the source
842 operand that holds the destination address for the memory
845 * Load (`lod') and memory copy (`memcpy') instructions may use the
846 `src_addr_op' and `set_src_addr_op' methods to access the operand
847 that specifies the source address for the memory reference. Note
848 that the actual operand field that is used is different for these
851 * Instructions that only use one of the source operands may use the
852 `src_op' and `set_src' methods without specifying whether the
853 first or second operand field is used. In addition, a store
854 (`str') instruction may use these methods to refer to the operand
855 holding the value to be stored. They are also used by shift and
856 rotate instructions as described above.
858 Using these opcode-specific methods with the wrong opcodes will cause
861 For your convenience, the `in_rrr' class provides a method,
862 `is_unary', to determine if an instruction produces a result value
863 using only one of the source operands. It does not consider return
864 (`ret') instructions to be unary because they do not produce result
865 values. Although load (`lod') instructions fit the pattern, they are a
866 special case and are not considered to be unary.
868 The `is_commutative' method checks the opcode of an instruction to
869 determine if it is a commutative operation. If so, the two source
870 operands should be interchangeable.
872 Some of the arithmetic instructions may generate run-time exceptions
873 if the appropriate class of exceptions is enabled (*note Miscellaneous
874 Annotes::.). If the exceptions are not enabled, the rules for ANSI C
875 are used to determine the result.
877 The following table lists all of the three operand SUIF instructions.
878 Each entry describes the operands used with that opcode and includes any
879 restrictions on the operand and result types.
882 Do nothing at all. All of the operands for these instructions
883 should be null, and the result type should be the SUIF `void' type.
886 Load the value at the address contained in the `src1' operand and
887 put it in the `dst' operand. The result type indicates the type
888 of the value being loaded and may be any type, subject to the usual
889 restrictions on a result type (*note Result Types::.). The type
890 of the expression in `src1' must be a pointer to the result type.
891 The `src2' operand is not used.
894 Store the value in the `src2' operand at the address contained in
895 the `src1' operand. Both operands must be specified. There is no
896 special restriction on the type of the `src2' operand, though the
897 restrictions on instruction result types (*note Result Types::.)
898 and variables (*note Variable Symbols::.) guarantee it will have a
899 known, non-zero size. The `src1' operand should contain an
900 expression that is a pointer to the type of the operand being
901 stored. The `dst' operand is not used.
904 Memory to memory copy. Load the value from the address in the
905 `src2' operand and store it at the address in the `src1' operand.
906 The type of the object to be copied is subject to the same
907 conditions as the result type of an instruction (*note Result
908 Types::.), so it must have known, non-zero size. Both of the
909 source operands must be pointers to this object type. The `dst'
913 Copy the `src1' operand to the `dst' operand. The `src2' operand
914 is not used. The result type must be the same as the type of the
915 source operand. The restrictions on instruction result types
916 (*note Result Types::.) guarantee that the object being copied has
917 known, non-zero size.
920 Convert the `src1' operand to the result type and put it in the
921 `dst' operand. The `src2' operand is not used. Nothing can can
922 be converted to or from a `struct', `union', `array', or `void'
923 type. Pointer types can only be converted to and from integer
924 types and other pointer types.
927 Negation. Change the sign of the value in the `src1' operand and
928 put the result in the `dst' operand. The `src2' operand is
929 unused. The result type and the type of the operand must be the
930 same signed integer or floating-point type.
933 Add the values in the `src1' and `src2' operands and put the
934 result in the `dst' operand. Except for pointer additions, the
935 result type and the types of the operands must be the same integer
936 or floating-point types. Pointer addition is a special case. One
937 of the source operands may have a pointer type, as long as the
938 other source operand has signed or unsigned integer type of any
939 size; the result type must also be a pointer type, but need not be
940 the same as the source pointer type.
943 Subtract the value in the `src2' operand from the value in the
944 `src1' operand and put the result in the `dst' operand. Except
945 for pointer subtractions, the result type and the types of the
946 operands must be the same integer or floating-point types. There
947 are two special cases for pointer subtractions. In either case,
948 the `src1' operand must have a pointer type. First, the `src2'
949 operand may have any integer type, in which case the result type
950 may be any pointer type, not necessarily the same as the source
951 pointer type. Second, the `src2' operand's type may be another
952 pointer, in which case the result type must be type_ptr_diff.
956 Multiply or divide the value in the `src1' operand by the value in
957 the `src2' operand and put the result in the `dst' operand. The
958 result type and the types of the operands must be the same integer
959 or floating-point type. Integer multiplication and division are
960 defined according to the rules for ANSI C.
964 Remainder and modulus. These two instructions are very similar.
965 Both divide the value in the `src1' operand by the value in the
966 `src2' operand to find the remainder or modulus. The `rem'
967 instruction is identical to the modulus operator in ANSI C. That
968 is, if either source operand is negative, the sign of the result is
969 undefined and depends on the semantics of integer division. The
970 `mod' instruction is the same except that its result is always
971 guaranteed to be positive. The result type and the types of the
972 destination and source operands must be the same integer type.
975 Bit-wise inversion. Compute the one's complement negation of the
976 value in the `src1' operand and put the result in the `dst'
977 operand. The `src2' operand is not used. The result type and the
978 types of the operand must be the same unsigned integer type.
983 Compute the bit-wise AND, inclusive OR, or exclusive OR of the
984 values in the `src1' and `src2' operands and put the result in the
985 `dst' operand. The result type and the type of the operands must
986 be the same unsigned integer type.
991 Shift the value in the `src1' operand right or left by the amount
992 specified in the `src2' operand. The variable in the `src2'
993 operand must always have an unsigned integer type. The `asr'
994 instruction performs sign extension and requires that the result
995 type and the type `src1' operand be the same signed integer type.
996 The `lsr' instructions does not perform sign extension and requires
997 that the result type and type of the `src1' operand be the same
998 unsigned integer type. Sign extension is not an issue for left
999 shifts, so the `lsl' instruction only requires that the result
1000 type and the type of the `src1' operand be the same integer type.
1004 Division combined with floor and ceiling operations. The
1005 `divfloor' opcode means take the rational quotient of the `src1'
1006 operand by the `src2' operand and apply the floor operation (i.e.
1007 round to the nearest integer less than or equal to the quotient).
1008 The `divceil' opcode means take the rational quotient of the
1009 `src1' operand by the `src2' operand and apply the ceiling
1010 operation (i.e. round to the nearest integer greater than or equal
1011 to the quotient). The result type and the operand types must be
1012 the same integer type.
1016 Minimum and maximum. The result value is the minimum or maximum,
1017 respectively, of the two source operands. The result type and the
1018 operand types must be the same integer or floating-point types.
1021 Absolute value. Compute the absolute value of the `src1' operand.
1022 The `src2' operand is unused. The result type and the source
1023 operand type must be the same integer or floating-point type.
1026 Rotate the value in the `src1' operand left or right by the amount
1027 specified in the `src2' operand. The variable in the `src2'
1028 operand must always have a signed integer type. If the shift
1029 amount is positive, the value is rotated to the left (toward the
1030 most-significant bit); if it is negative, the value is rotated to
1031 the right. The result type and the type of the `src1' operand must
1032 be the same integer type.
1038 Comparison instructions. If the `src1' operand is equal, not
1039 equal, less than, or less than or equal, respectively, to the
1040 `src2' operand, assign the integer value one to the `dst' operand.
1041 Otherwise, set the `dst' operand to zero. The result type must
1042 always be a signed integer type. The source operands must have
1043 integer, enumerated, floating point, or pointer type.
1046 Return from a procedure. Only the `src1' operand is used and it
1047 is optional. If specified, it is the return value and may contain
1048 an operand of any type except array or function types. If the
1049 procedure's function type has void return type, the operand must be
1050 null; otherwise the operand must not be null and must have the same
1051 type as the return type of the procedure.
1054 This pseudo-instruction only marks a position in the program and
1055 is used to hold miscellaneous annotations such as line numbers.
1056 It is functionally equivalent to a `nop' instruction. All of the
1057 operands for these instructions should be null, and the result type
1058 should be the SUIF `void' type.
1061 File: suif1.info, Node: Branch and Jump Instructions, Next: Load Constant Instructions, Prev: Three Operand Instructions, Up: Instructions
1063 Branch and Jump Instructions
1064 ============================
1066 The `in_bj' class is used to hold branch and jump instructions.
1067 This class includes a field to specify the label symbol at the branch
1068 target and an optional source operand that is used for conditional
1069 branches. The `target' and `set_target' methods are used to access the
1070 label symbol for the branch target. This label must be defined in the
1071 scope where the branch occurs, and it must be within the same
1072 procedure. The `src_op' and `set_src_op' methods access the optional
1073 source operand field. The `dst' operand is unused for all branch and
1074 jump instructions and the result type should always be the SUIF `void'
1075 type. The branch and jump opcodes are described below:
1078 Unconditional jump. The source operand is unused. The flow of
1079 control is unconditionally transferred to the code at the target
1083 Branch if true. If the source operand, which must have an signed
1084 integer type, contains a true (non-zero) value, the flow of
1085 control is transferred to the code at the target label.
1086 Otherwise, it continues with the next instruction in sequential
1090 Branch if false. If the source operand contains a false (zero)
1091 value, the flow of control is transferred to the code at the
1092 target label. Otherwise, it continues with the next instruction
1093 in sequential order. The source operand must have an signed