info/suif1.info-2

   1 This is Info file suif1.info, produced by Makeinfo version 1.68 from
   2 the input file suif1.texi.
   3
   4    This file documents the SUIF library.
   5
   6    Copyright (C) 1994 Stanford University.  All rights reserved.
   7
   8    Permission is given to use, copy, and modify this documentation for
   9 any non-commercial purpose as long as this copyright notice is not
  10 removed.  All other uses, including redistribution in whole or in part,
  11 are forbidden without prior written permission.
  12
  13 \x1f
  14 File: suif1.info,  Node: For Nodes,  Next: Block Nodes,  Prev: Loop Nodes,  Up: Tree Nodes
  15
  16 For Nodes
  17 ---------
  18
  19    Many of our compiler passes are designed to work with Fortran `DO'
  20 loops because they are relatively easy to analyze.  A `DO' loop has a
  21 single index variable that is incremented or decremented on every
  22 iteration and varies from an initial to a final value.  SUIF uses
  23 `tree_for' nodes to represent well-structured `DO' loops.  The exact
  24 conditions that a loop must meet to be represented as a `tree_for' in
  25 SUIF are described below.  The expander's cleanup pass, which is run
  26 immediately after the front-end, converts `tree_for' nodes that violate
  27 any of these conditions into `tree_loop' nodes (*note Loop Nodes::.).
  28 Even though they are primarily intended for Fortran code, `tree_for'
  29 nodes may also be used for C loops that meet the same conditions.
  30
  31    The index variable of a `tree_for' can be accessed using the `index'
  32 method and set using the `set_index' method.  The index variable must
  33 be a scalar variable that is defined in the scope of the `tree_for'
  34 node.  It may not be modified anywhere within the loop body.  This
  35 applies across procedures as well.  If the loop body contains a call to
  36 another procedure that modifies the index variable, then the loop
  37 cannot be represented by a `tree_for' node.  Moreover, if you are using
  38 Fortran form, the index variable may not be a call-by-reference
  39 parameter.  *Note Fortran::.
  40
  41    The range of values for the index variable is specified by three
  42 `operand' fields.  *Note Operands::.  The lower and upper bounds and
  43 the step value can be accessed by the `lb_op', `ub_op', and `step_op'
  44 methods.  The `set_lb_op', `set_ub_op', and `set_step_op' methods are
  45 provided to change them.  These operands are expressions that are
  46 evaluated once at the beginning of the loop.  The index variable is
  47 initially assigned the lower bound value and then incremented by the
  48 step value on every iteration until it reaches the upper bound value;
  49 the code to do this is automatically created when the `tree_for' is
  50 expanded to low-SUIF code.
  51
  52    Most users will always use `tree_for' nodes in conjunction with
  53 expression trees.  Flat lists of instructions are typically used only in
  54 the library and with back-end passes where the `tree_for' nodes have
  55 been dismantled.  It is possible to use `tree_for' nodes without
  56 expression trees, but the bounds and step values cannot be treated as
  57 operands.  In fact, even with expression trees those operands are
  58 actually stored on tree node lists.  If necessary, these lists can be
  59 accessed directly using the `lb_list', `ub_list', and `step_list'
  60 methods.  Each list is required to contain a single expression with a
  61 dummy copy instruction at the root.  The destination of the dummy copy
  62 must be a null operand.  Methods are provided in the tree node list
  63 class to extract the operands from the tree node lists (*note Tree Node
  64 Lists::.).
  65
  66    The `tree_for' must also specify the comparison operator used to
  67 determine when the index variable has reached the upper bound value.
  68 The possible comparison operators are members of the `tree_for_test'
  69 enumerated type.  The `test' and `set_test' methods are used to access
  70 and modify the comparison operator for a `tree_for' node.  The
  71 `tree_for_test' enumeration includes quite a few comparison operators,
  72 but some of them are only used by the front-end.  Both signed and
  73 unsigned versions are available for most of the comparison operators,
  74 as indicated by the letters "`S'" and "`U'" in their names.
  75
  76 `FOR_SLT'
  77
  78 `FOR_ULT'
  79      Less than.  Repeat as long as the index is strictly less than the
  80      upper bound.
  81
  82 `FOR_SLTE'
  83
  84 `FOR_ULTE'
  85      Less than or equal.  Repeat as long as the index is less than or
  86      equal to the upper bound.
  87
  88 `FOR_SGT'
  89
  90 `FOR_UGT'
  91      Greater than.  Repeat as long as the index is strictly greater
  92      than the upper bound.  Sometimes `DO' loops go backwards, using a
  93      negative step value.  For those loops, the comparison operator
  94      must also be reversed.
  95
  96 `FOR_SGTE'
  97
  98 `FOR_UGTE'
  99      Greater than or equal.  Repeat as long as the index is greater
 100      than or equal to the upper bound.  Again, this may be used when
 101      the step value is negative.
 102
 103 `FOR_SGELE'
 104
 105 `FOR_UGELE'
 106      These comparisons are only used by the front-end.  In FORTRAN, it
 107      may not be possible to determine the direction of a loop at
 108      compile time.  For example, if the step value is not a constant,
 109      it could be either positive or negative.  These comparison
 110      operators indicate that the loop test may be either "greater than
 111      or equal" or "less than or equal", depending on the sign of the
 112      step value.  The expander's cleanup pass converts any `tree_for'
 113      nodes with these tests to two `tree_for' nodes and a `tree_if'
 114      node to decide between them.  Thus, these comparison operators
 115      should never be encountered in most SUIF code.
 116
 117 `FOR_EQ'
 118
 119 `FOR_NEQ'
 120      Equal and not equal.  These comparisons are only used by the
 121      front-end.  The expander's cleanup pass dismantles `tree_for'
 122      nodes that use these comparisons.
 123
 124    The body of a `tree_for' loop is stored in a tree node list.  The
 125 methods to get and set the loop body are `body' and `set_body',
 126 respectively.  The `body' list contains only the instructions
 127 corresponding to the body of the loop in the source program.  The
 128 instructions to compare the index variable with the upper bound,
 129 increment it, and branch back to the beginning of the body are not
 130 included as part of the body; they are created when the `tree_for' is
 131 expanded to low-SUIF code.
 132
 133    Besides the loop body, a `tree_for' node has an additional tree node
 134 list called the `landing_pad'.  The code in the landing pad is executed
 135 if and only if the loop body is executed at least one time, but the
 136 `landing_pad' is executed only once, unlike the body which is usually
 137 executed many times.  The `landing_pad' is executed immediately before
 138 the first time through the loop body.  The landing pad thus provides a
 139 place to move loop-invariant code.
 140
 141    Two labels are associated with a `tree_for': `contlab' and `brklab'.
 142 A "continue" statement in the loop body requires a jump over the rest
 143 of the body to the code that increments the index and continues with
 144 the next iteration.  This can be implemented with a jump to the
 145 `contlab' label, which is implicitly located at the end of the `body'
 146 list.  Similarly, a "break" statement in the loop is translated to a
 147 jump to the `brklab' label which is located immediately after the loop.
 148 These two labels must be defined in the scope of the `tree_for' node,
 149 but the label instructions that mark their locations are not inserted
 150 into the tree node lists until the `tree_for' node is expanded into
 151 low-SUIF form.
 152
 153    In summary, the semantics of a `tree_for' node are as follows.  The
 154 lower bound, upper bound, and step operands are evaluated once at the
 155 beginning of the loop (1).  The lower bound is compared to the upper
 156 bound.  If the test fails, the loop does not execute.  Otherwise, the
 157 lower bound is assigned to the index variable, and any code in the
 158 landing pad list is executed.  After that, the body is executed
 159 repeatedly and the index variable is incremented by the step value on
 160 every iteration, until the index variable fails the test with the upper
 161 bound value.
 162
 163    As an example, the following C code could be translated into the SUIF
 164 code shown in a simplified form below:
 165
 166      for (i = 100; i >= 0; i--) {
 167          A[i] = i;
 168      }
 169
 170      FOR (Index=i Test=SGTE Cont=L:__L1 Brk=L:__L2)
 171      FOR LB
 172          ldc 100
 173      FOR UB
 174          ldc 0
 175      FOR STEP
 176          ldc -1
 177      FOR LANDING PAD
 178      FOR BODY
 179          str e1 = i
 180            e1: array e2+0, size(32), index(i), bound(e3)
 181              e2: ldc <A,0>
 182                e3: ldc 101
 183      FOR END
 184
 185    ---------- Footnotes ----------
 186
 187    (1) The code produced directly by the C front-end assumes that the
 188 upper bound and step operands are reevaluated on every iteration.  The
 189 expander's cleanup pass dismantles any `tree_for' nodes for which it
 190 cannot guarantee that these semantics are equivalent.
 191
 192 \x1f
 193 File: suif1.info,  Node: Block Nodes,  Next: Procedure Nodes,  Prev: For Nodes,  Up: Tree Nodes
 194
 195 Block Nodes
 196 -----------
 197
 198    A `tree_block' node introduces a new scope.  Nested scopes are
 199 useful for retaining scope information from source programs and for
 200 debugging purposes.  They are also very useful for implementing code
 201 transformations.  For example, name conflicts are easily avoided by
 202 introducing new scopes.
 203
 204    A `tree_block' node contains a `block_symtab' symbol table and a
 205 tree node list for the body.  The symbol table is accessed with the
 206 `symtab' and `set_symtab' methods.  The `body' and `set_body' methods
 207 are used to get and set the list of tree nodes inside the block.
 208
 209    There is a one-to-one correspondence between a `tree_block' and its
 210 symbol table, and the hierarchy of symbol tables within a procedure must
 211 parallel the hierarchy of `tree_block' nodes.  When inserting a
 212 `tree_block' into an AST, the new block's symbol table must be inserted
 213 in the appropriate place in the symbol table hierarchy.  When a
 214 `tree_block' is destroyed, the associated symbol table is detached from
 215 its parent table and destroyed.  However, the converse is not
 216 true--when a `block_symtab' is deleted, the corresponding `tree_block'
 217 node is not deleted.
 218
 219    The entries in a block's symbol table may not be referenced from
 220 outside the block.  There are no other restrictions on `tree_block'
 221 nodes.  The bodies may be empty or contain any kinds of tree nodes.
 222 Blocks are usually entered from the beginning but that is not required;
 223 unlike other AST nodes, branches into the middle of a block are allowed.
 224
 225    As an example, the following code creates a new `tree_block' and its
 226 associated symbol table.  It then adds the new block to the body of an
 227 existing block node:
 228
 229      block_symtab *new_symtab = new block_symtab("my_symtab");
 230      cur_block->symtab()->add_child(new_symtab);
 231      tree_node_list *tl = new tree_node_list;
 232      tree_block *new_block = new tree_block(tl, new_symtab);
 233      cur_block->body()->append(new_block);
 234
 235 \x1f
 236 File: suif1.info,  Node: Procedure Nodes,  Prev: Block Nodes,  Up: Tree Nodes
 237
 238 Procedure Nodes
 239 ---------------
 240
 241    A special kind of block node is used as the root of each AST.  These
 242 `tree_proc' nodes are derived from the `tree_block' class.  They are
 243 essentially the same as other block nodes, but they include a few extra
 244 methods and have a slightly different kind of symbol table.  SUIF does
 245 not support nested procedures, so `tree_proc' nodes may only be used at
 246 the roots of ASTs.
 247
 248    A procedure node's symbol table must be a `proc_symtab' instead of
 249 just an ordinary `block_symtab'.  The `proc_syms' method is provided to
 250 cast the symbol table pointer to be a `proc_symtab'.  A `tree_proc'
 251 node also has a pointer to the symbol of the corresponding procedure.
 252 This pointer is set automatically when a `tree_proc' is attached to the
 253 procedure symbol.  For any tree node in an AST, the `proc' method
 254 retrieves this procedure symbol from the `tree_proc' at the root of the
 255 tree.
 256
 257    Because `tree_proc' nodes are always at the roots of the ASTs, their
 258 `parent' list pointers (*note Other Node Features::.) are always `NULL'.
 259
 260 \x1f
 261 File: suif1.info,  Node: Tree Node Lists,  Next: Mapping Subtrees,  Prev: Tree Nodes,  Up: Trees
 262
 263 Tree Node Lists
 264 ===============
 265
 266    A `tree_node_list' is a doubly-linked list of tree nodes, derived
 267 from the `dlist' class.  *Note Doubly-Linked Lists::.  Most nodes in an
 268 AST have their children recorded in tree node lists.  For example, the
 269 body of a block node is a tree node list.  Besides the standard list
 270 features, each tree node list includes a back-pointer to its parent tree
 271 node.  These pointers are maintained automatically by the SUIF library
 272 and can be accessed using the `parent' method.  In addition, each node
 273 on a tree node list contains pointers back to the list and the list
 274 element.  *Note Other Node Features::.  These pointers are set by the
 275 `set_elem' method (*note Generic Lists::.) which is automatically
 276 called every time an element is added to the list.
 277
 278    In a `tree_for' node, the lower bound, upper bound, and step value
 279 are usually treated as operands; however, the operands are actually
 280 stored as tree node lists.  *Note For Nodes::.  Each of these lists is
 281 required to contain a single expression with a dummy copy instruction at
 282 the root.  The destination of the dummy copy must be a null operand.
 283 The `tree_node_list' class includes several methods to handle these
 284 special lists.  The `is_op' method checks if the list consists of a
 285 single copy instruction with a null destination.  The `op' method
 286 retrieves the operand stored in the list.  If after converting the list
 287 to expression tree form, the `is_op' method returns `TRUE', the `op'
 288 method returns the source operand of the dummy copy instruction;
 289 otherwise, an error occurs.  The `set_op' method is used to set the
 290 contents of the tree node list to a dummy copy instruction with the
 291 specified operand as its source.  The list must either be empty or
 292 already contain the dummy copy.  Finally, the `op_is_intconst' method
 293 checks if the operand in the tree node list is a constant integer, and
 294 if so, returns the value.
 295
 296    Tree node lists can easily be printed out as text.  The `print'
 297 method simply iterates through the list and calls the `print' method
 298 for each tree node.  The optional `depth' parameter specifies the
 299 indentation level to be used.  A separate method is used to print tree
 300 node lists that hold `tree_for' operands.  If the `is_op' method for
 301 one of these lists returns `TRUE', the `print_expr' method prints the
 302 operand directly.  Otherwise, it prints the list as usual.
 303
 304 \x1f
 305 File: suif1.info,  Node: Mapping Subtrees,  Prev: Tree Node Lists,  Up: Trees
 306
 307 Mapping Functions Over Subtrees
 308 ===============================
 309
 310    Many SUIF programs contain code that visits all of the nodes in an
 311 AST to perform various operations.  Rather than having every programmer
 312 duplicate the code for this traversal, the SUIF library includes methods
 313 to map functions over ASTs.  The `tree_node' and `tree_node_list'
 314 classes both provide these `map' methods.
 315
 316    The function to be mapped must be of type `tree_map_f'.  For a tree
 317 node, the `map' method applies the function to the tree node and all of
 318 its children.  The `preorder' parameter is used to select either
 319 preorder or postorder traversals; the default is preorder.  For
 320 preorder traversals, the function is applied to the tree node before it
 321 is applied to the children.  For postorder, the function is first
 322 applied to the tree_node's children and descendants and then last
 323 applied to the tree node itself.
 324
 325    The `map' method for a tree node list calls `map' for each node in
 326 the list.  The nodes are visited in order from first to last regardless
 327 of the `preorder' parameter.  The `tree_node_list' version of the `map'
 328 method has an extra optional parameter that allows you to only map the
 329 function over the entries in the list without recursively visiting
 330 their children.
 331
 332    The `tree_map_f' functions have two parameters: a pointer to the
 333 tree node and a `void*' value passed on from the `map' method.
 334 Additional parameters can be passed by making the `void*' value a
 335 pointer to a structure containing the parameters.  For example, the
 336 following code counts the number of `tree_for' and `tree_loop' nodes in
 337 a procedure:
 338
 339      struct count_result {
 340          int num_fors;
 341          int num_loops;
 342      };
 343
 344      void count_loops (tree_node *t, void *x)
 345      {
 346          count_result *results = (count_result *)x;
 347
 348          if (t->kind() == TREE_FOR) results->num_fors++;
 349          else if (t->kind() == TREE_LOOP) results->num_loops++;
 350      }
 351
 352      /*
 353       *  Return the number of tree_fors and tree_loops in the procedure.
 354       */
 355
 356      void counter (tree_proc *p, int *for_count, int *loop_count)
 357      {
 358          count_result results;
 359          results.num_fors = 0;
 360          results.num_loops = 0;
 361
 362          p->map(count_loops, (void *)&results);
 363
 364          *for_count =  results.num_fors;
 365          *loop_count = results.num_loops;
 366      }
 367
 368 \x1f
 369 File: suif1.info,  Node: Instructions,  Next: Symbols,  Prev: Trees,  Up: Top
 370
 371 Instructions and Expression Trees
 372 *********************************
 373
 374    At the leaves of the abstract syntax trees, the `tree_instr' nodes
 375 (*note Instruction Nodes::.) hold SUIF instructions.  Each instruction
 376 performs a single operation, specified by its opcode.  With just a few
 377 exceptions, the opcodes are simple and straightforward, similar to the
 378 instruction set of a typical RISC processor.  The instructions include
 379 arithmetic, data transfer, and control flow operations.
 380
 381    SUIF instructions may be used in two ways.  Most high-level passes
 382 use expression trees, where the instructions are grouped together in
 383 trees that correspond to the expressions in the source programs.
 384 Back-end optimization passes have traditionally worked with flat lists
 385 of instructions, and SUIF supports that representation as well.
 386
 387    The files `instruction.h' and `instruction.cc' contain the code for
 388 the SUIF instruction classes.
 389
 390 * Menu:
 391
 392 * Basic Features::              Features shared by all kinds of instructions.
 393 * Operands::                    Instruction operands.
 394 * Expression Trees::            How to use expression trees.
 395 * Three Operand Instructions::  The `in_rrr' class (most instructions).
 396 * Branch and Jump Instructions::  The `in_bj' class.
 397 * Load Constant Instructions::  The `in_ldc' class.
 398 * Call Instructions::           The `in_cal' class.
 399 * Array Instructions::          The `in_array' class.
 400 * Multi-way Branch Instructions::  The `in_mbr' class.
 401 * Label Instructions::          The `in_lab' class.
 402 * Generic Instructions::        The `in_gen' class (non-standard).
 403
 404 \x1f
 405 File: suif1.info,  Node: Basic Features,  Next: Operands,  Up: Instructions
 406
 407 Basic Instruction Features
 408 ==========================
 409
 410    All instructions share some basic features.  This section describes
 411 the fields in the base `instruction' class.  Other classes are derived
 412 from the `instruction' class to include fields that are specific to the
 413 different instruction formats.
 414
 415    Besides the features below, the instructions have ID numbers that
 416 can be used to identify them across passes of the compiler.  This is
 417 discussed in detail elsewhere.  *Note ID Numbers::.
 418
 419    The SUIF library also declares an `instruction_list' class for
 420 working with lists of pointers to instructions.  This class is not used
 421 much within the library itself because SUIF does not store instructions
 422 directly on lists.  Even with the flat list form, the instructions are
 423 contained in `tree_instr' nodes, which in turn are stored in lists.
 424 However, it may sometimes be convenient to work with lists of
 425 instruction pointers even if those instructions are actually stored in
 426 `tree_instr' nodes.
 427
 428 * Menu:
 429
 430 * Opcodes and Formats::         Description of opcode and format enumerations.
 431 * Destination Operands::        Conventions for the destination operands.
 432 * Result Types::                Specifying the types of the result values.
 433 * Parent Tree Nodes::           Back-pointers to the `tree_instr' nodes.
 434 * Source Operands::             Standard methods for accessing sources.
 435 * Printing Methods::            Printing instructions and operands as text.
 436
 437 \x1f
 438 File: suif1.info,  Node: Opcodes and Formats,  Next: Destination Operands,  Up: Basic Features
 439
 440 Opcodes and Formats
 441 -------------------
 442
 443    The `opcode' and `set_opcode' methods may be used to access the
 444 opcode field in an instruction.  These opcodes are members of the
 445 `if_ops' enumerated type.  The internal names for the opcodes all begin
 446 with `io_', but that prefix is omitted in this documentation.  The
 447 `if_ops_name' function takes an opcode and returns a character string
 448 holding the opcode name (without the `io_' prefix).  This is used by
 449 the library when it prints out instructions, but you may also call it
 450 directly.
 451
 452    Each opcode is associated with a particular instruction format.  The
 453 formats are specified by members of the `inst_format' enumeration.  The
 454 `format' method may be used to retrieve the format for a particular
 455 instruction.  Alternatively, the `which_format' function takes an
 456 opcode and returns its format without reference to any particular
 457 instruction.  Each format corresponds to a particular derived
 458 instruction class:
 459
 460 `inf_rrr'
 461      Three operand instructions use the `in_rrr' class.
 462
 463 `inf_bj'
 464      Branch and jump instructions use the `in_bj' class.
 465
 466 `inf_ldc'
 467      The load constant instruction uses the `in_ldc' class.
 468
 469 `inf_cal'
 470      The call instruction uses the `in_cal' class.
 471
 472 `inf_array'
 473      The array instruction uses the `in_array' class.
 474
 475 `inf_mbr'
 476      The multi-way branch instruction uses the `in_mbr' class.
 477
 478 `inf_lab'
 479      The label instruction uses the `in_lab' class.
 480
 481 `inf_gen'
 482      Generic instructions use the `in_gen' class.
 483
 484    The `format' method is often used in `switch' statements for dealing
 485 with different kinds of instructions.  For example, the following code
 486 computes the number of source operands in an instruction (instead of
 487 using the `num_srcs' method):
 488
 489      instruction *i;
 490      switch (i->format()) {
 491          case inf_rrr: {
 492              n += 2;
 493              break;
 494          }
 495          case inf_bj: {
 496              n += 1;
 497              break;
 498          }
 499          case inf_cal: {
 500              in_cal *cali = (in_cal *)i;
 501              n += 1 + cali->num_args();
 502              break;
 503          }
 504          case inf_array: {
 505              in_array *arri = (in_array *)i;
 506              n += 2 + 2 * arri->dims();
 507              break;
 508          }
 509          case inf_mbr: {
 510              n += 1;
 511              break;
 512          }
 513          case inf_gen: {
 514              in_gen *geni = (in_gen *)i;
 515              n += geni->num_srcs();
 516              break;
 517          }
 518          default: {
 519              /* other formats (inf_ldc and inf_lab) have no source operands */
 520              break;
 521          }
 522      }
 523
 524    The opcodes and instruction formats are defined in the `opcodes.h'
 525 and `opcodes.cc' files.
 526
 527 \x1f
 528 File: suif1.info,  Node: Destination Operands,  Next: Result Types,  Prev: Opcodes and Formats,  Up: Basic Features
 529
 530 Destination Operands
 531 --------------------
 532
 533    The base `instruction' class includes a field for the destination
 534 operand, even though not all instructions use it.  The `dst_op' method
 535 returns the value of this operand.  If an instruction does not produce
 536 a result, the destination operand should always be a null operand.
 537 Moreover, the destination may be null even if an instruction does
 538 produce a result.  In that case, the result value is computed and then
 539 discarded.
 540
 541    If the destination operand is a variable, its type must be compatible
 542 with the result type of the instruction.
 543
 544    If the result of an instruction is a temporary value, the destination
 545 operand is a pointer to the instruction where the value is used.  Note
 546 that this is different than an instruction pointer in a source operand.
 547 The source operands of an instruction point to the children in an
 548 expression tree, while the destination operand points to the parent
 549 instruction.
 550
 551    The `set_dst' method may be used to set the destination operand for
 552 an instruction.  However, there are some important restrictions on using
 553 this method:
 554
 555    * For derived classes with instructions that never produce result
 556      values, it is illegal to call `set_dst'.  This allows the library
 557      to ensure that the destination will always be a null operand (1).
 558      Trying to use `set_dst' for one of these instructions will cause an
 559      error.
 560
 561    * You may not set the destination operand to point to another
 562      instruction; trying to do so will cause an error.  This is to
 563      prevent inconsistent pointers.  The destination should only point
 564      to another instruction if that other instruction has a source
 565      operand that points back to it.  Thus, the library automatically
 566      sets the destination operand when an instruction is used as a
 567      source operand.
 568
 569    * If the destination operand already contains a pointer to another
 570      instruction, you cannot use `set_dst' to overwrite it.  This also
 571      causes a run-time error.  As in the previous case, this is to
 572      prevent inconsistent pointers.  Instead, you must first call the
 573      instruction's `remove' method to clear the destination operand.
 574      *Note Source Operands::.
 575
 576    ---------- Footnotes ----------
 577
 578    (1) Note that for other classes it is your responsibility to make
 579 sure that the destination is null if the instruction does not produce a
 580 result.
 581
 582 \x1f
 583 File: suif1.info,  Node: Result Types,  Next: Parent Tree Nodes,  Prev: Destination Operands,  Up: Basic Features
 584
 585 Result Types
 586 ------------
 587
 588    Each instruction also has a field to specify the result type.  The
 589 `result_type' and `set_result_type' methods may be used to access this
 590 field.  If an instruction produces a result value (regardless of
 591 whether the result is actually used), the type of the result must be
 592 indicated.  Otherwise, the result type should be the SUIF `void' type.
 593
 594    If the instruction produces a value, the result type must be an
 595 object type (i.e. not a function type) with non-zero size and that size
 596 must be known at compile time.  This restriction is to exclude arrays
 597 with unknown or symbolic bounds, the only types without known sizes.
 598
 599 \x1f
 600 File: suif1.info,  Node: Parent Tree Nodes,  Next: Source Operands,  Prev: Result Types,  Up: Basic Features
 601
 602 Parent Tree Nodes
 603 -----------------
 604
 605    Each instruction automatically records the `tree_instr' node to
 606 which it is attached.  The `parent' method retrieves this pointer.  In
 607 an expression tree, all of the instructions share the same parent
 608 `tree_instr' node, so an instruction may not be directly connected to
 609 its parent node.
 610
 611    The `owner' method is identical to the `parent' method except for
 612 instructions that are used as operands of `tree_for' nodes.  Those
 613 operands are actually attached to dummy copy instructions on lists
 614 attached to the `tree_for' nodes (*note For Nodes::.), but many users
 615 will want to treat them as if they are directly attached to the
 616 `tree_for' nodes.  Thus, for an instruction in a `tree_for' operand,
 617 the `owner' method will return a pointer to the `tree_for' node instead
 618 of returning the actual parent `tree_instr'.  This only works if the
 619 `tree_for' operand list contains a single expression (not a flat list);
 620 otherwise, the actual `tree_instr' parent is returned.
 621
 622 \x1f
 623 File: suif1.info,  Node: Source Operands,  Next: Printing Methods,  Prev: Parent Tree Nodes,  Up: Basic Features
 624
 625 Source Operands
 626 ---------------
 627
 628    Even though the base `instruction' class does not contain any source
 629 operand fields, it does provide several methods to access the source
 630 operands in derived classes.  First, the `num_srcs' method returns the
 631 number of source operands in a particular instruction.  Once you know
 632 the number of sources, you can access them by number.  This is
 633 frequently useful when you need to visit all of the source operands
 634 without concern for how they are used in the instruction.  The `src_op'
 635 and `set_src_op' methods provide this access.  The operands are
 636 numbered beginning from zero (like arrays in C).  Specifying an operand
 637 number that does not exist will cause an error.  Do not use these
 638 methods to access a particular operand field in an instruction; the
 639 operand numbering used here is implementation-defined and subject to
 640 change.
 641
 642    The `src_map' method provides an alternate way to visit all of the
 643 source operands.  It applies a function that you specify to all of the
 644 source operands.  This can be used to implement recursive descents of
 645 expression trees by making the mapped function call `src_map' if the
 646 operand is an instruction.  This is similar to what the `instr_map'
 647 method in the `tree_instr' class does.  *Note Instruction Nodes::.  The
 648 mapped function must match the `src_map_f' type which has three
 649 parameters: a pointer to the instruction, a pointer to a copy of the
 650 operand, and a `void*' value used to pass any other information needed
 651 by the function.  If the function returns `TRUE', the copy of the
 652 operand will be assigned to the actual operand field within the
 653 instruction.  If it returns `FALSE', the source operand field is not
 654 modified.
 655
 656    The SUIF library requires that instruction pointers in operands be
 657 consistent.  That is, a source operand may only point to an instruction
 658 if the destination of that instruction points back to where the result
 659 value is used.  Because of this, you cannot simply overwrite a source
 660 operand that contains an instruction pointer.  Doing so would leave the
 661 other instruction with an inconsistent destination operand.  Instead,
 662 you must first use the source operand's `remove' method.  This clears
 663 the instruction pointers in both the source and destination operands.
 664 If the source instruction was part of an expression tree (i.e. it
 665 didn't have its own parent `tree_instr' node), `remove' also clears its
 666 `parent' pointer by calling the parent's `remove_instr' method (*note
 667 Instruction Nodes::.).  If the source operand is not an instruction,
 668 the `remove' method does nothing so it is always safe to use it.  The
 669 `instruction' class also has a `remove' method which does the same
 670 thing.
 671
 672    As an example, the following code clears all of the source operands
 673 in an instruction:
 674
 675      for (int n = 0; n < i->num_srcs(); n++) {
 676          i->src_op(n).remove();
 677          i->set_src_op(n, operand());
 678      }
 679
 680 \x1f
 681 File: suif1.info,  Node: Printing Methods,  Prev: Source Operands,  Up: Basic Features
 682
 683 Printing Methods
 684 ----------------
 685
 686    Instructions can be printed out as text using the `print' method.
 687 The optional `depth' parameter specifies the indentation level for the
 688 output.  The `elab' and `en' parameters are also optional and are only
 689 used when printing sub-expressions.  Most users should never need to
 690 worry about these extra parameters.  The `elab' parameter is the number
 691 for the label attached to the instruction, and the `en' parameter is
 692 the next number to be used for labeling subexpressions.
 693
 694 \x1f
 695 File: suif1.info,  Node: Operands,  Next: Expression Trees,  Prev: Basic Features,  Up: Instructions
 696
 697 Operands
 698 ========
 699
 700    Most SUIF instructions have operands that are represented by objects
 701 of the `operand' class, which is implemented in the files `operand.h'
 702 and `operand.cc'.  An operand may have three different kinds of values:
 703
 704    * It may be a null operand.  The `is_null' method tests for this
 705      case.  Null operands occur when an operand field in an instruction
 706      is unused.
 707
 708    * An operand may refer directly to the symbol for a variable.  The
 709      `is_symbol' method tests for this case.  The `symbol' and
 710      `set_symbol' methods access the symbol field.  A variable in a
 711      source operand indicates that the contents of the variable are
 712      used.  In a destination operand, a variable is assigned the result
 713      of the instruction.
 714
 715    * An operand may also point to another instruction.  The `is_instr'
 716      method tests for this, and the `instr' and `set_instr' methods
 717      access the pointer field.  This is used to implement expression
 718      trees and, in low-SUIF, to refer to the results of other
 719      instructions in a flat list.  An instruction pointer in a source
 720      operand means that the result of that instruction is used as the
 721      operand.  Conversely, an instruction pointer in a destination
 722      operand indicates that the result value is used by that
 723      instruction.
 724
 725    The `is_expr' method is very similar to `is_instr'.  Besides
 726 checking if the operand holds an instruction pointer, it also tests if
 727 that instruction is a subexpression that is not contained in a separate
 728 `tree_instr' node (i.e. it's not in a flat list).  This method should
 729 not be used for destination operands.
 730
 731    Before changing a source operand from an instruction pointer to some
 732 other value, the instruction must be removed.  The `remove' method
 733 checks if the operand is an instruction, and if so, calls the
 734 instruction's `remove' method.  *Note Source Operands::.
 735
 736    The operand class includes several methods to simplify common
 737 operations.  One frequently needs to know the type of a particular
 738 operand.  This can be determined from the variable's type or the
 739 instruction's result type, but checking for these different kinds of
 740 operands and then extracting the type is cumbersome.  Instead, you can
 741 use the `type' method which performs this operation for you.  It
 742 returns the SUIF `void' type for null operands.  Another common
 743 operation is testing an operand to see if it contains an integer
 744 constant from a load constant instruction.  The `is_const_int' method
 745 checks if this is the case, and if so, it also returns the value of the
 746 constant.
 747
 748    Operands have two different methods for printing themselves out to a
 749 file as text.  The `print' method is used by the library when printing
 750 instructions.  Because source and destination operands are handled
 751 differently, `print' requires that you specify the instruction that
 752 contains the operand so that it can determine how the operand is being
 753 used.  However, when debugging a program, you may not know which
 754 instruction contains an operand.  If you do know that it is used as a
 755 source operand, you can use the `print_source' method to print it
 756 without specifying the instruction.
 757
 758 \x1f
 759 File: suif1.info,  Node: Expression Trees,  Next: Three Operand Instructions,  Prev: Operands,  Up: Instructions
 760
 761 Expression Trees
 762 ================
 763
 764    SUIF supports both expression trees and flat lists of instructions.
 765 Procedures are always written out as flat lists.  After a procedure body
 766 has been read from an input file, it can be converted to expression tree
 767 form.  The `read_proc' method for the procedure symbol has a flag that
 768 allows you to specify this; by default it builds the expression trees.
 769 You may also do the conversion manually as described below.  When the
 770 procedure body is written out, it is automatically converted back to
 771 the flat list form.
 772
 773    In the flat list form, each SUIF instruction has its own
 774 `tree_instr' node.  With expression trees, the only change is that all
 775 of the instructions within an expression are grouped under the same
 776 `tree_instr' node.  The instruction pointers in the operand fields do
 777 not change.
 778
 779    Up to now we have implied that high-SUIF uses expression trees and
 780 that low-SUIF uses the flat list representation, but the decision to use
 781 expression trees is actually independent from the structure of the ASTs.
 782 SUIF allows expressions trees to be used even if all of the high-level
 783 control structures have been removed, and conversely the flat list
 784 representation works with all kinds of AST nodes (although accessing the
 785 `tree_for' operands is a little awkward).  In fact, the two
 786 representations can even be mixed in the same procedure!
 787
 788    Some caution is needed here.  These two representations are not
 789 equivalent.  An expression tree only specifies a partial order for the
 790 evaluation of the instructions in the tree, whereas a flat list is a
 791 total ordering.  Similarly, while flat lists allow you to intermix the
 792 instructions from different expressions, with expression trees each
 793 expression must be completely evaluated before proceeding to execute the
 794 instructions for the next expression.  Thus, flat lists give you precise
 795 control of the execution order for the instructions.  Because of this,
 796 if you reorder the instructions in a flat list and then try to convert
 797 to expression trees, the library may complain and promote temporaries to
 798 variables in order to maintain the equivalent semantics.
 799
 800    Methods are provided in the `tree_node_list' class to convert back
 801 and forth between expression trees and flat lists.  The `flatten'
 802 method converts all of the expression trees in the list and its child
 803 lists to flat lists of instructions.  This is simply a matter of
 804 creating new `tree_instr' nodes to hold the subexpressions and
 805 inserting them into the lists.
 806
 807    The `cvt_to_trees' method converts a tree node list to expression
 808 tree form.  This does not actually change the instructions, but rather
 809 collects the instructions within an expression under a single
 810 `tree_instr' node.  Only contiguous instructions can be converted into
 811 an expression tree.  If other instructions from outside the expression
 812 are intermixed, building the expression tree may change the behavior of
 813 the code.  Thus, if this occurs, the library will print a warning
 814 message and split up the expression tree so that the original semantics
 815 are preserved.
 816
 817 \x1f
 818 File: suif1.info,  Node: Three Operand Instructions,  Next: Branch and Jump Instructions,  Prev: Expression Trees,  Up: Instructions
 819
 820 Three Operand Instructions
 821 ==========================
 822
 823    The vast majority of SUIF instructions are represented by the
 824 `in_rrr' class which includes fields for two source operands.  Not all
 825 of the operands have to be used.  For example, `nop' instructions don't
 826 use any operands at all.
 827
 828    The `src1_op' and `src2_op' methods retrieve the source operands,
 829 and the `set_src1' and `set_src2' methods assign new operands.  Besides
 830 the methods to directly access the two source operands, other methods
 831 are provided to refer to these sources according to their uses with
 832 specific opcodes:
 833
 834    * Shift and rotate instructions (`asr', `lsl', `lsr', and `rot') may
 835      use the `shift_cnt_op' and `set_shift_cnt_op' methods to access
 836      the operand that specifies the numbers of bits to shift or rotate.
 837      The other source operand may be accessed with the `src_op' and
 838      `set_src' methods.
 839
 840    * Store (`str') and memory copy (`memcpy') instructions may use the
 841      `dst_addr_op' and `set_dst_addr_op' methods to access the source
 842      operand that holds the destination address for the memory
 843      reference.
 844
 845    * Load (`lod') and memory copy (`memcpy') instructions may use the
 846      `src_addr_op' and `set_src_addr_op' methods to access the operand
 847      that specifies the source address for the memory reference.  Note
 848      that the actual operand field that is used is different for these
 849      two opcodes.
 850
 851    * Instructions that only use one of the source operands may use the
 852      `src_op' and `set_src' methods without specifying whether the
 853      first or second operand field is used.  In addition, a store
 854      (`str') instruction may use these methods to refer to the operand
 855      holding the value to be stored.  They are also used by shift and
 856      rotate instructions as described above.
 857
 858 Using these opcode-specific methods with the wrong opcodes will cause
 859 errors.
 860
 861    For your convenience, the `in_rrr' class provides a method,
 862 `is_unary', to determine if an instruction produces a result value
 863 using only one of the source operands.  It does not consider return
 864 (`ret') instructions to be unary because they do not produce result
 865 values.  Although load (`lod') instructions fit the pattern, they are a
 866 special case and are not considered to be unary.
 867
 868    The `is_commutative' method checks the opcode of an instruction to
 869 determine if it is a commutative operation.  If so, the two source
 870 operands should be interchangeable.
 871
 872    Some of the arithmetic instructions may generate run-time exceptions
 873 if the appropriate class of exceptions is enabled (*note Miscellaneous
 874 Annotes::.).  If the exceptions are not enabled, the rules for ANSI C
 875 are used to determine the result.
 876
 877    The following table lists all of the three operand SUIF instructions.
 878 Each entry describes the operands used with that opcode and includes any
 879 restrictions on the operand and result types.
 880
 881 `nop'
 882      Do nothing at all.  All of the operands for these instructions
 883      should be null, and the result type should be the SUIF `void' type.
 884
 885 `lod'
 886      Load the value at the address contained in the `src1' operand and
 887      put it in the `dst' operand.  The result type indicates the type
 888      of the value being loaded and may be any type, subject to the usual
 889      restrictions on a result type (*note Result Types::.).  The type
 890      of the expression in `src1' must be a pointer to the result type.
 891      The `src2' operand is not used.
 892
 893 `str'
 894      Store the value in the `src2' operand at the address contained in
 895      the `src1' operand.  Both operands must be specified.  There is no
 896      special restriction on the type of the `src2' operand, though the
 897      restrictions on instruction result types (*note Result Types::.)
 898      and variables (*note Variable Symbols::.) guarantee it will have a
 899      known, non-zero size.  The `src1' operand should contain an
 900      expression that is a pointer to the type of the operand being
 901      stored.  The `dst' operand is not used.
 902
 903 `memcpy'
 904      Memory to memory copy.  Load the value from the address in the
 905      `src2' operand and store it at the address in the `src1' operand.
 906      The type of the object to be copied is subject to the same
 907      conditions as the result type of an instruction (*note Result
 908      Types::.), so it must have known, non-zero size.  Both of the
 909      source operands must be pointers to this object type.  The `dst'
 910      operand is not used.
 911
 912 `cpy'
 913      Copy the `src1' operand to the `dst' operand.  The `src2' operand
 914      is not used.  The result type must be the same as the type of the
 915      source operand.  The restrictions on instruction result types
 916      (*note Result Types::.) guarantee that the object being copied has
 917      known, non-zero size.
 918
 919 `cvt'
 920      Convert the `src1' operand to the result type and put it in the
 921      `dst' operand.  The `src2' operand is not used.  Nothing can can
 922      be converted to or from a `struct', `union', `array', or `void'
 923      type.  Pointer types can only be converted to and from integer
 924      types and other pointer types.
 925
 926 `neg'
 927      Negation.  Change the sign of the value in the `src1' operand and
 928      put the result in the `dst' operand.  The `src2' operand is
 929      unused.  The result type and the type of the operand must be the
 930      same signed integer or floating-point type.
 931
 932 `add'
 933      Add the values in the `src1' and `src2' operands and put the
 934      result in the `dst' operand.  Except for pointer additions, the
 935      result type and the types of the operands must be the same integer
 936      or floating-point types.  Pointer addition is a special case.  One
 937      of the source operands may have a pointer type, as long as the
 938      other source operand has signed or unsigned integer type of any
 939      size; the result type must also be a pointer type, but need not be
 940      the same as the source pointer type.
 941
 942 `sub'
 943      Subtract the value in the `src2' operand from the value in the
 944      `src1' operand and put the result in the `dst' operand.  Except
 945      for pointer subtractions, the result type and the types of the
 946      operands must be the same integer or floating-point types.  There
 947      are two special cases for pointer subtractions.  In either case,
 948      the `src1' operand must have a pointer type.  First, the `src2'
 949      operand may have any integer type, in which case the result type
 950      may be any pointer type, not necessarily the same as the source
 951      pointer type.  Second, the `src2' operand's type may be another
 952      pointer, in which case the result type must be type_ptr_diff.
 953
 954 `mul'
 955 `div'
 956      Multiply or divide the value in the `src1' operand by the value in
 957      the `src2' operand and put the result in the `dst' operand.  The
 958      result type and the types of the operands must be the same integer
 959      or floating-point type.  Integer multiplication and division are
 960      defined according to the rules for ANSI C.
 961
 962 `rem'
 963 `mod'
 964      Remainder and modulus.  These two instructions are very similar.
 965      Both divide the value in the `src1' operand by the value in the
 966      `src2' operand to find the remainder or modulus.  The `rem'
 967      instruction is identical to the modulus operator in ANSI C.  That
 968      is, if either source operand is negative, the sign of the result is
 969      undefined and depends on the semantics of integer division.  The
 970      `mod' instruction is the same except that its result is always
 971      guaranteed to be positive.  The result type and the types of the
 972      destination and source operands must be the same integer type.
 973
 974 `not'
 975      Bit-wise inversion.  Compute the one's complement negation of the
 976      value in the `src1' operand and put the result in the `dst'
 977      operand.  The `src2' operand is not used.  The result type and the
 978      types of the operand must be the same unsigned integer type.
 979
 980 `and'
 981 `ior'
 982 `xor'
 983      Compute the bit-wise AND, inclusive OR, or exclusive OR of the
 984      values in the `src1' and `src2' operands and put the result in the
 985      `dst' operand.  The result type and the type of the operands must
 986      be the same unsigned integer type.
 987
 988 `asr'
 989 `lsr'
 990 `lsl'
 991      Shift the value in the `src1' operand right or left by the amount
 992      specified in the `src2' operand.  The variable in the `src2'
 993      operand must always have an unsigned integer type.  The `asr'
 994      instruction performs sign extension and requires that the result
 995      type and the type `src1' operand be the same signed integer type.
 996      The `lsr' instructions does not perform sign extension and requires
 997      that the result type and type of the `src1' operand be the same
 998      unsigned integer type.  Sign extension is not an issue for left
 999      shifts, so the `lsl' instruction only requires that the result
1000      type and the type of the `src1' operand be the same integer type.
1001
1002 `divfloor'
1003 `divceil'
1004      Division combined with floor and ceiling operations.  The
1005      `divfloor' opcode means take the rational quotient of the `src1'
1006      operand by the `src2' operand and apply the floor operation (i.e.
1007      round to the nearest integer less than or equal to the quotient).
1008      The `divceil' opcode means take the rational quotient of the
1009      `src1' operand by the `src2' operand and apply the ceiling
1010      operation (i.e. round to the nearest integer greater than or equal
1011      to the quotient).  The result type and the operand types must be
1012      the same integer type.
1013
1014 `min'
1015 `max'
1016      Minimum and maximum.  The result value is the minimum or maximum,
1017      respectively, of the two source operands.  The result type and the
1018      operand types must be the same integer or floating-point types.
1019
1020 `abs'
1021      Absolute value.  Compute the absolute value of the `src1' operand.
1022      The `src2' operand is unused.  The result type and the source
1023      operand type must be the same integer or floating-point type.
1024
1025 `rot'
1026      Rotate the value in the `src1' operand left or right by the amount
1027      specified in the `src2' operand.  The variable in the `src2'
1028      operand must always have a signed integer type.  If the shift
1029      amount is positive, the value is rotated to the left (toward the
1030      most-significant bit); if it is negative, the value is rotated to
1031      the right.  The result type and the type of the `src1' operand must
1032      be the same integer type.
1033
1034 `seq'
1035 `sne'
1036 `sl'
1037 `sle'
1038      Comparison instructions.  If the `src1' operand is equal, not
1039      equal, less than, or less than or equal, respectively, to the
1040      `src2' operand, assign the integer value one to the `dst' operand.
1041      Otherwise, set the `dst' operand to zero.  The result type must
1042      always be a signed integer type.  The source operands must have
1043      integer, enumerated, floating point, or pointer type.
1044
1045 `ret'
1046      Return from a procedure.  Only the `src1' operand is used and it
1047      is optional.  If specified, it is the return value and may contain
1048      an operand of any type except array or function types.  If the
1049      procedure's function type has void return type, the operand must be
1050      null; otherwise the operand must not be null and must have the same
1051      type as the return type of the procedure.
1052
1053 `mrk'
1054      This pseudo-instruction only marks a position in the program and
1055      is used to hold miscellaneous annotations such as line numbers.
1056      It is functionally equivalent to a `nop' instruction.  All of the
1057      operands for these instructions should be null, and the result type
1058      should be the SUIF `void' type.
1059
1060 \x1f
1061 File: suif1.info,  Node: Branch and Jump Instructions,  Next: Load Constant Instructions,  Prev: Three Operand Instructions,  Up: Instructions
1062
1063 Branch and Jump Instructions
1064 ============================
1065
1066    The `in_bj' class is used to hold branch and jump instructions.
1067 This class includes a field to specify the label symbol at the branch
1068 target and an optional source operand that is used for conditional
1069 branches.  The `target' and `set_target' methods are used to access the
1070 label symbol for the branch target.  This label must be defined in the
1071 scope where the branch occurs, and it must be within the same
1072 procedure.  The `src_op' and `set_src_op' methods access the optional
1073 source operand field.  The `dst' operand is unused for all branch and
1074 jump instructions and the result type should always be the SUIF `void'
1075 type.  The branch and jump opcodes are described below:
1076
1077 `jmp'
1078      Unconditional jump.  The source operand is unused.  The flow of
1079      control is unconditionally transferred to the code at the target
1080      label.
1081
1082 `btrue'
1083      Branch if true.  If the source operand, which must have an signed
1084      integer type, contains a true (non-zero) value, the flow of
1085      control is transferred to the code at the target label.
1086      Otherwise, it continues with the next instruction in sequential
1087      order.
1088
1089 `bfalse'
1090      Branch if false.  If the source operand contains a false (zero)
1091      value, the flow of control is transferred to the code at the
1092      target label.  Otherwise, it continues with the next instruction
1093      in sequential order.  The source operand must have an signed
1094      integer type.
1095