cabal-testsuite/README.md

cabal-testsuite is a suite of integration tests for Cabal-based
frameworks.

How to run
----------

1. Build `cabal-tests` (`cabal new-build cabal-tests`).
2. Run the `cabal-tests` executable. It will scan for all tests
   in your current directory and subdirectories and run them.
   To run a specific set of tests, use `cabal-tests PATH ...`.  You can
   control parallelism using the `-j` flag.

There are a few useful flags:

* `--with-cabal PATH` can be used to specify the path of a
  `cabal-install` executable.  IF YOU DO NOT SPECIFY THIS FLAG,
  CABAL INSTALL TESTS WILL NOT RUN.

* `--with-ghc PATH` can be used to specify an alternate version of
  GHC to ask the tests to compile with.

* `--builddir DIR` can be used to manually specify the dist directory
  that was used to build `cabal-tests`; this can be used if
  the autodetection doesn't work correctly (which may be the
  case for old versions of GHC).

How to write
------------

If you learn better by example, just look at the tests that live
in `cabal-testsuite/PackageTests`; if you run `git log -p`, you can
see the full contents of various commits which added a test for
various functionality.  See if you can find an existing test that
is similar to what you want to test.

Otherwise, here is a walkthrough:

1. Create the package(s) that you need for your test in a
   new directory.  (Currently, tests are stored in `PackageTests`
   and `tests`; we might reorganize this soon.)

2. Create one or more `.test.hs` scripts in your directory, using
   the template:
   ```
   import Test.Cabal.Prelude
   main = setupAndCabalTest $ do
       -- your test code here
   ```

   `setupAndCabalTest` indicates that invocations of `setup`
   should work both for a raw `Setup` script, as well as
   `cabal-install` (if your test works only for one or the
   other, use `setupTest` or `cabalTest`).

   Code runs in the `TestM` monad, which manages some administrative
   environment (e.g., the test that is running).
   `Test.Cabal.Prelude` contains a number of useful functions
   for testing implemented in this monad, including the functions `cabal`
   and `setup` which let you invoke those respective programs.  You should
   read through that file to get a sense for what capabilities
   are possible (grep for use-sites of functions to see how they
   are used).  If you don't see something anywhere, that's probably
   because it isn't implemented. Implement it!

3. Run your tests using `cabal-tests` (no need to rebuild when
   you add or modify a test; it is automatically picked up).
   The first time you run a test, assuming everything else is
   in order, it will complain that the actual output doesn't match
   the expected output.  Use the `--accept` flag to accept the
   output if it makes sense!
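
As a slightly fuller version of the template in step 2, here is a minimal
sketch of a complete test script.  It assumes a hypothetical package next to
the script that defines an executable called `hello`, and it assumes that
`setup` takes a command name followed by a list of arguments; check
`Test.Cabal.Prelude` for the exact signatures before relying on it.

```haskell
import Test.Cabal.Prelude

-- Sketch only: "hello" is a hypothetical executable defined by the
-- package that lives in the same directory as this script.
main = setupAndCabalTest $ do
    -- Configure and build the package in the test directory.
    setup "configure" []
    setup "build" []
    -- Run the executable that was just built with Setup.
    runExe "hello" []
```

Saved as, say, `setup.test.hs`, a script like this is picked up the next time
you run `cabal-tests` in that directory.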

We also support a `.multitest.hs` suffix; eventually this will
allow multiple tests to be defined in one file but run in parallel;
at the moment, these just indicate long running tests that should
be run early (to avoid straggling).

Frequently asked questions
--------------------------

For all of these answers, to see examples of the functions in
question, grep the test suite.

**Why isn't some output I added to Cabal showing up in the recorded
test output?** Only "marked" output is recorded; currently,
only `notice`, `warn` and `die` produce marked output.  Use those
combinators for your output.

**How do I safely let my test modify version-controlled source files?**
Use `withSourceCopy`.  Note that you MUST `git add`
all files which are relevant to the test; otherwise they will not be
available when running the test.

**How can I add a dependency on a package from Hackage in a test?**
By default, the test suite is completely independent of the contents
of Hackage, to ensure that it keeps working across all GHC versions.
If possible, define the package locally.  If the package needs
to be from Hackage (e.g., you are testing the global store code
in new-build), use `withRepo "repo"` to initialize a "fake" Hackage with
the packages placed in the `repo` directory.

**How do I run an executable that my test built?** The specific
function you should use depends on how you built the executable:

* If you built it using `Setup build`, use `runExe`.
* If you installed it using `Setup install` or `cabal install`, use
  `runInstalledExe`.
* If you built it with `cabal new-build`, use `runPlanExe`; note
  that you will need to run this inside of a `withPlan` that is
  placed *after* you have invoked `new-build`.  (Grep
  for an example!)
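
For the `new-build` case, the ordering matters, so here is a rough sketch of
what such a test could look like.  The package name `my-pkg` and executable
name `my-exe` are hypothetical, and the argument order of `runPlanExe`
(package name, then executable name, then arguments) is an assumption; grep
the test suite for a real use before copying this.

```haskell
import Test.Cabal.Prelude

-- Sketch only: "my-pkg" and "my-exe" are hypothetical names, and the
-- argument order of runPlanExe is assumed, not confirmed.
main = cabalTest $ do
    -- Build first, so that there is a plan to read.
    cabal "new-build" ["my-exe"]
    -- Only now enter withPlan and run the executable described by the plan.
    withPlan $ runPlanExe "my-pkg" "my-exe" []
```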

**How do I turn off accept tests? My test output wobbles too much.**
Use `recordMode DoNotRecord`.  This should be a last resort; consider
modifying Cabal so that the output is stable.  If you must do this, make
sure you add extra, manual tests to ensure the output looks like what
you expect.

**How can I manually test for a string in output?**  Use the primed
variants of a command (e.g., `cabal'` rather than `cabal`) and use
`assertOutputContains`.  Note that this will search over BOTH stdout
and stderr.
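
The two answers above are often used together.  A minimal sketch, assuming
that `recordMode` wraps an action and passes its result through, and that
`assertOutputContains` takes the expected string followed by the result of a
primed command (the string `"expected text"` is a placeholder):

```haskell
import Test.Cabal.Prelude

-- Sketch only: the target and the expected string are placeholders.
main = cabalTest $ do
    -- Keep this command's output out of the recorded (accept) output.
    r <- recordMode DoNotRecord $ cabal' "new-build" ["all"]
    -- Check by hand for the one string we care about (stdout or stderr).
    assertOutputContains "expected text" r
```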

**How do I skip running a test in some environments?**  Use the
`skipIf` and `skipUnless` combinators.  Useful conditions to pass to
these include `hasSharedLibraries`, `hasProfiledLibraries`,
`hasCabalShared`, `ghcVersionIs`, `isWindows`, `isLinux`, `isOSX`
and `hasCabalForGhc`.
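
A hedged sketch of how these fit together, assuming that the conditions are
`TestM` actions returning a `Bool`, that `ghcVersionIs` takes a predicate on
the version, and that `skipIf` and `skipUnless` take a plain `Bool` (other
versions of the test suite may differ, so check `Test.Cabal.Prelude`):

```haskell
import Test.Cabal.Prelude
import Distribution.Version (mkVersion)

-- Sketch only: the signatures used here are assumptions.
main = setupAndCabalTest $ do
    -- Do not run this test on Windows.
    skipIf =<< isWindows
    -- Only run it when GHC is at least version 8.0.
    skipUnless =<< ghcVersionIs (>= mkVersion [8,0])
    setup "configure" []
    setup "build" []
```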

**I programmatically modified a file in my test suite, but Cabal/GHC
doesn't seem to be picking it up.**  You need to sleep sufficiently
long before editing a file, in order for file system timestamp
resolution to pick it up.  Use `withDelay` and `delay` prior to
making a modification.
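
Roughly, the pattern looks like the sketch below.  It assumes that `withDelay`
wraps the test body and that `delay` is called inside it; `writeSourceFile`
(taking a path and the new contents) and the file name `Main.hs` are
assumptions made for illustration, so substitute whatever helper the Prelude
actually provides for editing files.

```haskell
import Test.Cabal.Prelude

-- Sketch only: writeSourceFile and "Main.hs" are assumptions.
main = cabalTest $ withSourceCopy . withDelay $ do
    cabal "new-build" ["all"]
    -- Wait long enough that the next write gets a newer timestamp.
    delay
    writeSourceFile "Main.hs" "main = putStrLn \"changed\""
    -- The rebuild should now notice the modified file.
    cabal "new-build" ["all"]
```

The test is wrapped in `withSourceCopy` because it modifies a
version-controlled source file.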

**How do I mark a test as broken?**  Use `expectBroken`, which takes
the ticket number as its first argument.  Note that this does NOT
handle accept-test brokenness, so you will have to add a manual
string output test, if that is how your test is "failing."

Hermetic tests
--------------

By default, we run tests directly on the source code that is checked into the
source code repository.  However, some tests require programmatically
modifying source files, or interact with Cabal commands which are
not hermetic (e.g., `cabal freeze`).  In this case, cabal-testsuite
supports opting into a hermetic test, where we first make a copy of all
the relevant source code before starting the test.  You can opt into
this mode using the `withSourceCopy` combinator (search for examples!).
This mode is subject to the following limitations:

* You must be running the test inside a valid Git checkout of the test
  suite (`withSourceCopy` uses Git to determine which files should be copied).

* You must `git add` all files which are relevant to the test, otherwise
  they will not be copied.

* The source copy is still made at a well-known location, so running
  a test is still not reentrant. (See also Known Limitations.)

Design notes
------------

This is the second rewrite of the integration testing framework.  The
primary goal was to use Haskell as the test language (letting us take
advantage of a real programming language, and use utilities provided to
us by the Cabal library itself), while at the same time compensating for
two perceived problems of pure-Haskell test suites:

* Haskell test suites are generally compiled before they run
  (for example, this is the modus operandi of `cabal test`).
  In practice, this results in a long edit-recompile cycle
  when working on tests. This hurts a lot when you would
  like to experimentally edit a test when debugging an issue.

* Haskell's metaprogramming facilities (e.g., Template Haskell)
  can't handle dynamically loading modules from the file system;
  thus, there ends up being a considerable amount of boilerplate
  needed to "wire" up test cases to the central test runner.

Our approach to address these issues is to maintain Haskell test scripts
as self-contained programs which are run by the GHCi interpreter.
This is not altogether trivial, and so there are a few important
technical innovations to make this work:

* Unlike a traditional test program which can be built by the Cabal
  build system, these test scripts must be interpretable at
  runtime (outside of the build system).  Our approach to handle
  this is to link against the same version of Cabal that was
  used to build the top-level test program (by way of a Custom
  setup linked against the Cabal library under test) and then
  use this library to compute the necessary GHC flags to pass
  to these scripts.

* The startup latency of `runghc` can be quite high, which adds up
  when you have many tests.  To solve this, in `Test.Cabal.Server`
  we have an implementation of a GHCi server, which lets us reuse
  a GHCi instance as we are running test scripts.  It took some
  technical ingenuity to implement this, but the result is that
  running scripts is essentially free.

Here is the general outline of how the `cabal-tests` program operates:

1. It first loads the cached `LocalBuildInfo` associated with the
   host build system (which was responsible for building `cabal-tests`
   in the first place).  This information lets us compute the
   flags that we will use to subsequently invoke GHC.

2. We then recursively scan the current working directory, looking
   for files suffixed `.test.hs`; these are the test scripts we
   will run.

3. For every thread specified via the `-j` flag, we spawn a GHCi
   server, and then use these to run the test scripts until all
   test scripts have been run.
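
To make steps 2 and 3 concrete, here is a small standalone sketch of the same
shape: a recursive scan for `.test.hs` files, followed by a fixed number of
workers draining a shared queue.  This is not the code in `cabal-tests`; the
names `isTestScript`, `findTestScripts` and `runPool` are made up for this
example, and each worker simply prints the script path where the real runner
would hand the script to a GHCi server.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (modifyMVar, newEmptyMVar, newMVar, putMVar, takeMVar)
import Control.Monad (forM, replicateM_)
import Data.List (isSuffixOf, sort)
import System.Directory (doesDirectoryExist, listDirectory)
import System.FilePath ((</>))

-- | A file is treated as a test script if its name ends in ".test.hs".
isTestScript :: FilePath -> Bool
isTestScript name = ".test.hs" `isSuffixOf` name

-- | Step 2: recursively collect test scripts below a directory.
findTestScripts :: FilePath -> IO [FilePath]
findTestScripts dir = do
    entries <- listDirectory dir
    found <- forM (sort entries) $ \entry -> do
        let path = dir </> entry
        isDir <- doesDirectoryExist path
        if isDir
            then findTestScripts path
            else pure [path | isTestScript entry]
    pure (concat found)

-- | Step 3: n workers pull scripts from a shared queue until it is empty.
-- Each worker stands in for one GHCi server.
runPool :: Int -> (FilePath -> IO ()) -> [FilePath] -> IO ()
runPool n runScript scripts = do
    let workers = max 1 n
    queue <- newMVar scripts
    done <- newEmptyMVar
    let worker = do
            next <- modifyMVar queue $ \q -> pure $ case q of
                []       -> ([], Nothing)
                (x : xs) -> (xs, Just x)
            case next of
                Nothing -> putMVar done ()
                Just x  -> runScript x >> worker
    replicateM_ workers (forkIO worker)
    -- Wait for every worker to report that the queue is drained.
    replicateM_ workers (takeMVar done)

main :: IO ()
main = do
    scripts <- findTestScripts "."
    runPool 2 (\path -> putStrLn ("would run " ++ path)) scripts
```

The sketch does no error handling: if a script action throws, its worker never
signals completion, which a real runner would have to deal with.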

The new `cabal-tests` runner doesn't use Tasty because I couldn't
figure out how to get out the threading setting, and then spawn
that many GHCi servers to service the running threads.  Improvements
welcome.

Expect tests
------------

An expect test is a test where we read out the output of the test
and compare it directly against a saved copy of the test output.
When test output changes, you can ask the test suite to "accept"
the new output, which automatically overwrites the old expected
test output with the new.

Supporting expect tests with Cabal is challenging, because Cabal
interacts with multiple versions of external components (most
prominently GHC) with different variants of their output, and no
one wants to rerun a test on four different versions of GHC to make
sure we've picked up the correct output in all cases.

Still, we'd like to take advantage of expect tests for Cabal's error
reporting.  So here's our strategy:

1. We have a new verbosity flag `+markoutput` which lets you toggle the emission
   of `-----BEGIN CABAL OUTPUT-----` and `-----END CABAL OUTPUT-----`
   markers.

2. When someone requests an expect test, we ONLY consider output between
   these markers.

The expectation is that Cabal will only enclose output it controls
between these markers.  In practice, this just means we wrap `die`,
`warn` and `notice` with these markers.
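
To illustrate the idea (this is a sketch, not the test suite's actual
implementation), the filtering can be thought of as keeping only the lines
that fall between a begin marker and the next end marker.  The sketch assumes
each marker sits on a line of its own; `extractMarked` is a name made up for
this example, and the sample lines are invented.

```haskell
beginMarker, endMarker :: String
beginMarker = "-----BEGIN CABAL OUTPUT-----"
endMarker = "-----END CABAL OUTPUT-----"

-- | Keep only the lines between a begin marker and the next end marker.
-- The marker lines themselves are dropped.
extractMarked :: String -> [String]
extractMarked = go False . lines
  where
    go _ [] = []
    go inside (l : ls)
        | l == beginMarker = go True ls
        | l == endMarker = go False ls
        | inside = l : go inside ls
        | otherwise = go inside ls

main :: IO ()
main = mapM_ putStrLn (extractMarked sample)
  where
    -- Invented sample: only the middle line is "marked".
    sample = unlines
        [ "unmarked compiler chatter"
        , beginMarker
        , "a marked warning"
        , endMarker
        , "more unmarked chatter"
        ]
```

Running this prints only `a marked warning`; everything outside the markers
is ignored, which is why unmarked output never shows up in recorded test
output.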

An added benefit of this strategy is that we can continue operating
at high verbosity by default (which is very helpful for having useful
diagnostic information immediately, e.g. in CI).

We also need to deal with nondeterminism in test output in some
situations.  Here are the most common ones:

* Dependency solving output on failure is still non-deterministic, due to
  its dependence on the global package database.  We're tracking this
  in https://github.com/haskell/cabal/issues/4332 but for now, we're
  not running expect tests on this output.

* Tests against Custom setup will build against the Cabal that shipped with
  GHC, so you need to be careful NOT to record this output (since we
  don't control that output).

* We have some munging on the output, to remove common sources of
  non-determinism: paths, GHC versions, boot package versions, etc.
  Check `normalizeOutput` to see what we do.  Note that we save
  *normalized* output, so if you modify the normalizer you will
  need to rerun the test suite accepting everything.

* The Setup interface gets a `--enable-deterministic` flag which we
  pass by default.  The intent is to make Cabal more deterministic;
  for example, with this flag we no longer compute a hash when
  computing IPIDs, but just use the tag `-inplace`.  You can manually
  disable this using `--disable-deterministic` (as is the case with
  `UniqueIPID`).
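
To give a feel for the kind of munging meant above, here is a deliberately
simplified sketch.  It is not `normalizeOutput`: the function names and the
placeholder strings `<ROOT>` and `<GHCVER>` are made up for this example, and
it only handles a root directory and a GHC version by plain substring
replacement.

```haskell
import Data.List (isPrefixOf)

-- | Replace every occurrence of a non-empty needle with a replacement.
replaceAll :: String -> String -> String -> String
replaceAll needle replacement = go
  where
    go [] = []
    go s@(c : cs)
        | not (null needle) && needle `isPrefixOf` s =
            replacement ++ go (drop (length needle) s)
        | otherwise = c : go cs

-- | Hide machine-specific details so recorded output is stable.
normalize :: FilePath -> String -> String -> String
normalize rootDir ghcVersion =
    replaceAll rootDir "<ROOT>"
        . replaceAll ("ghc-" ++ ghcVersion) "ghc-<GHCVER>"

main :: IO ()
main =
    putStrLn $
        normalize "/home/user/cabal" "8.0.2"
            "Building in /home/user/cabal/dist using ghc-8.0.2"
```

Because it is the normalized form that gets saved, two machines with different
checkout paths or GHC versions end up comparing against the same expected
text.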

Some other notes:

* It's good style to put `default-language` in all your stanzas, so
  Cabal doesn't complain about it (that warning is marked!).  Ditto
  with `cabal-version` at the top of your Cabal file.

* If you can't get the output of a test to be deterministic, no
  problem: just exclude it from recording and do a manual test
  on the output for the string you're looking for.  Try to be
  deterministic, but sometimes it's not (easily) possible.

Non-goals
---------

Here are some things we do not currently plan on supporting:

* A file format for specifying multiple packages and source files.
  While in principle there is nothing wrong with making it easier
  to write tests, tests stored in this manner are more difficult
  to debug with, as they must first be "decompressed" into a full
  folder hierarchy before they can be interacted with.  (But some
  of our tests need substantial setup; for example, tests that
  have to set up a package repository.  In this case, because
  setup is already necessary, we might consider making things easier here.)

Known limitations
-----------------

* Tests are NOT reentrant: test build products are always built into
  the same location, and if you run the same test at the same time,
  you will clobber each other.  This is convenient for debugging and
  doesn't seem to be a problem in practice.