# Bugpoint Redesign
Author: Diego Treviño (diegotf@google.com)

Date: 2019-06-05

Status: Draft


## Introduction
As use of bugpoint has grown, several areas for improvement have been
identified over the years: it is confusing to use, it is slow, it doesn’t
always produce high-quality test cases, etc. This document proposes a new
approach with a narrower focus: minimization of IR test cases.


## Proposed New Design


### Narrow focus: test-case reduction
The main focus will be a code reduction strategy to obtain much smaller test
cases that still have the same property as the original one. This will be done
via classic delta debugging and by adding some IR-specific reductions (e.g.
replacing globals, removing unused instructions, etc.), similar to what
already exists, but with more in-depth minimization.


Granted, if the community disagrees with this proposal, the legacy code could
remain in the tool, with the caveat that the tool would still be documented
and designed around delta reduction.


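To make the delta debugging idea above concrete, here is a minimal sketch of a
simplified `ddmin`-style loop. It is not the proposed implementation: it
assumes the test case can be modelled as a flat list of removable items (for
example, function names), and the names `ddmin` and `is_interesting` are
hypothetical.

```python
from typing import Callable, List, Sequence


def ddmin(items: Sequence[str],
          is_interesting: Callable[[List[str]], bool]) -> List[str]:
    """Return a smaller list of items that is_interesting still accepts."""
    current = list(items)
    granularity = 2  # roughly how many chunks the input is split into
    while len(current) >= 2:
        chunk_size = max(1, len(current) // granularity)
        chunks = [current[i:i + chunk_size]
                  for i in range(0, len(current), chunk_size)]
        reduced = False
        for index in range(len(chunks)):
            # Try the input with exactly one chunk removed.
            candidate = [item
                         for j, chunk in enumerate(chunks) if j != index
                         for item in chunk]
            if candidate and is_interesting(candidate):
                current = candidate
                granularity = max(granularity - 1, 2)
                reduced = True
                break
        if not reduced:
            if granularity >= len(current):
                break  # every single-item removal was already tried
            granularity = min(len(current), granularity * 2)
    return current
```

For example, with the items `a` through `h` and a predicate that requires both
`c` and `f` to be present, the loop narrows the input down to those two items.

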
### Command-Line Options
We propose reducing bugpoint’s plethora of options to just two: an
interesting-ness test and the arguments for said test, similar to other delta
reduction tools such as CReduce, Delta, and Lithium. The tool should feel less
cluttered, and there should be no uncertainty about how to operate it.


The interesting-ness test that’s going to be run to reduce the code is given
by name:
        `--test=<test_name>`
If a `--test` option is not given, the program exits. This option is similar
to bugpoint’s current `-compile-custom` option, which lets the user run a
custom script.


The interesting-ness test would be defined as a script that returns 0 when the
IR exhibits a user-defined behaviour (e.g. failure to compile on clang) and a
nonzero value otherwise. This leaves the user free to determine what is and
isn’t interesting to the tool, which streamlines the process of reducing a
test case.


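As an illustration of the exit-code convention described above, the following
sketch shows what such a test could look like if written in Python. The
function names and `EXAMPLE_COMMAND` are hypothetical, and the clang
invocation is only an assumed example of a command whose failure the user
cares about. A standalone script would end with `sys.exit(main(sys.argv))`.

```python
import subprocess
import sys
from typing import List

# Assumption: an example compiler invocation; substitute the command under test.
EXAMPLE_COMMAND = ["clang", "-c", "-o", "/dev/null"]


def interestingness(command: List[str], ir_path: str) -> int:
    """Return 0 (interesting) if `command <ir_path>` fails, 1 otherwise."""
    result = subprocess.run(
        command + [ir_path],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    # A nonzero (or signal-terminated) exit means the failure still reproduces.
    return 0 if result.returncode != 0 else 1


def main(argv: List[str]) -> int:
    """Entry point for a script invoked as `script.py <input.ll>`."""
    if len(argv) != 2:
        return 1  # no input file given: treat as not interesting
    return interestingness(EXAMPLE_COMMAND, argv[1])
```

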
If the test accepts any arguments (excluding the input ll/bc file), they are
given via the following flag:
        `--test_args=<test_arguments>`
If unspecified, the test is run as given. It’s worth noting that the input file
would be passed as a parameter to the test, similar to how `-compile-custom`
currently operates.


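The sketch below shows one way the two options could be combined into the
command that runs the test. It rests on assumptions that this document does
not specify: that `--test_args` is split using shell-like quoting rules, and
that the input file is appended as the last argument. The function name
`build_test_command` and the program name `reducer-sketch` are hypothetical.

```python
import argparse
import shlex
from typing import List


def build_test_command(argv: List[str]) -> List[str]:
    """Turn the tool's command line into the argv used to run the test."""
    parser = argparse.ArgumentParser(prog="reducer-sketch")
    # A missing --test makes argparse print an error and exit.
    parser.add_argument("--test", required=True)
    parser.add_argument("--test_args", default="")
    parser.add_argument("input_file")
    options = parser.parse_args(argv)
    # Assumption: test arguments come first, the input file comes last.
    return ([options.test]
            + shlex.split(options.test_args)
            + [options.input_file])
```

Called with `["--test=./crash.sh", "--test_args=-O2 --verbose", "input.ll"]`,
this yields the command `./crash.sh -O2 --verbose input.ll`.

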
### Implementation
The tool would behave similarly to CReduce in that it would have a list of
passes that try to minimize the given test case. This should let us modularize
the tool’s behavior, making it easier to maintain and extend.


The first version of this redesign would try to:


* Discard functions, instructions and metadata that don’t influence the
  interesting-ness test
* Remove unused parameters from functions
* Eliminate unvisited conditional paths
* Rename variables to more regular ones (such as “a”, “b”, “c”, etc.)


Once these passes are implemented, more meaningful reductions (such as type
reduction) would be added to the tool, to reduce the IR even further.


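The pass-list structure described above can be sketched as a loop that keeps
applying each pass until none of them makes progress. This is a toy model
under stated assumptions: the test case is a list of text lines rather than an
IR module, and `drop_each_line`, `strip_comments`, and `run_passes` are
hypothetical names, not passes from this proposal.

```python
from typing import Callable, Iterable, Iterator, List

TestCase = List[str]
# A pass proposes candidate reductions of the current test case.
Pass = Callable[[TestCase], Iterable[TestCase]]


def strip_comments(case: TestCase) -> Iterator[TestCase]:
    """Propose the test case with all `;` comment lines removed."""
    yield [line for line in case if not line.lstrip().startswith(";")]


def drop_each_line(case: TestCase) -> Iterator[TestCase]:
    """Propose every variant of the test case with one line removed."""
    for index in range(len(case)):
        yield case[:index] + case[index + 1:]


def run_passes(case: TestCase, passes: List[Pass],
               is_interesting: Callable[[TestCase], bool]) -> TestCase:
    """Apply passes repeatedly until no pass can shrink the test case."""
    current = list(case)
    changed = True
    while changed:
        changed = False
        for reduction_pass in passes:
            progress = True
            while progress:
                progress = False
                for candidate in reduction_pass(current):
                    # Accept only strictly smaller, still-interesting cases.
                    if len(candidate) < len(current) and is_interesting(candidate):
                        current = candidate
                        changed = progress = True
                        break
    return current
```

Because every accepted candidate is strictly smaller, the loop terminates. New
reductions can be added by appending another function to the pass list, which
is the modularity the design aims for.

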
## Background on historical bugpoint issues


### Root Cause Analysis
Presently, bugpoint takes a long time to find the source of the problem in a
given IR file, mainly because it tries to debug the input by running various
strategies to classify the bug. These strategies in turn run multiple optimizer
and compilation passes over the input, which takes a lot of time. Furthermore,
when the IR triggers a crash, bugpoint tries to reduce it with some sub-optimal
passes (e.g. ones that leave behind a lot of unreachable blocks), and sometimes
even fails to minimize at all.


### "Quirky" Interface
Bugpoint’s current interface overwhelms and confuses the user; the help screen
alone ends up confusing rather than providing guidance. Not only are there
numerous features and options, but some of them also work in unexpected ways,
and most of the time the user ends up using a custom script. Pruning and
simplifying the interface is worth considering in order to make the tool
more useful in the general case and easier to maintain.