cabal-testsuite is a suite of integration tests for Cabal-based
frameworks.

How to run
----------

1. Build `cabal-testsuite` (`cabal build cabal-testsuite:cabal-tests`).
2. Run the `cabal-tests` executable. It will scan for all tests
   in your current directory and subdirectories and run them.
   To run a specific set of tests, use `cabal-tests --with-cabal=CABALBIN PATH ...`
   (e.g. `cabal run cabal-testsuite:cabal-tests -- --with-cabal=cabal cabal-testsuite/PackageTests/TestOptions/setup.test.hs`).
   You can control parallelism using the `-j` flag.

There are a few useful flags:

* `--with-cabal PATH` can be used to specify the path of a
  `cabal-install` executable.  IF YOU DO NOT SPECIFY THIS FLAG,
  CABAL INSTALL TESTS WILL NOT RUN.

* `--with-ghc PATH` can be used to specify an alternate version of
  GHC to ask the tests to compile with.

* `--builddir DIR` can be used to manually specify the dist directory
  that was used to build `cabal-tests`; this can be used if
  the autodetection doesn't work correctly (which may be the
  case for old versions of GHC).

### How to run the doctests

You need to install the `doctest` tool. Make sure it's compiled with your current
GHC, and don't forget to reinstall it every time you switch GHC version:

``` shellsession
cabal install doctest --overwrite-policy=always --ignore-project
```

After that you can run doctests for a component of your choice via the following command:

``` shellsession
cabal repl --with-ghc=doctest --build-depends=QuickCheck --build-depends=template-haskell --repl-options="-w" --project-file="cabal.project.validate" Cabal-syntax
```

In this example we have run doctests in `Cabal-syntax`. Note that some
components have broken doctests
([#8734](https://github.com/haskell/cabal/issues/8734));
our CI currently checks that `Cabal-syntax` and `Cabal` doctests pass via
`make doctest-install && make doctest` (you can use this `make`-based workflow too).

How to write
------------

If you learn better by example, just look at the tests that live
in `cabal-testsuite/PackageTests`; if you `git log -p`, you can
see the full contents of various commits which added a test for
various functionality.  See if you can find an existing test that
is similar to what you want to test.

Otherwise, here is a walkthrough:

1. Create the package(s) that you need for your test in a
   new directory.
   (As of 2021-10-06, tests are stored in `PackageTests`,
   with the exception of one test stored in `tests`.)

2. Create one or more `.test.hs` scripts in your directory, using
   the template:
   ```haskell
   import Test.Cabal.Prelude
   main = setupAndCabalTest $ do
       -- your test code here
   ```

   `setupAndCabalTest` indicates that invocations of `setup`
   should work both for a raw `Setup` script and for
   `cabal-install` (if your test works only for one or the
   other, use `setupTest` or `cabalTest`).

   Code runs in the `TestM` monad, which manages some administrative
   environment (e.g., the test that is running).
   `Test.Cabal.Prelude` contains a number of useful functions
   for testing implemented in this monad, including the functions `cabal`
   and `setup` which let you invoke those respective programs.  You should
   read through that file to get a sense for what capabilities
   are available (grep for use-sites of functions to see how they
   are used).  If you don't see something anywhere, that's probably
   because it isn't implemented. Implement it!

   To include parts that are supposed to fail (in the sense that a
   non-zero exit code is returned), there is the `fails` combinator,
   e.g.:
   ```haskell
   main = cabalTest $ do
     fails $ cabal "bad-command" [ "bad", "args" ]
     cabal "good-command" [ "good", "args" ]
     fails $ cabal "another-bad-one" [ ... ]
     ...
   ```

   The dependencies which your test is allowed to use are listed in the
   cabal file under the `test-runtime-deps` executable. At compile time there is
   a custom Setup.hs script which inspects this list and records the versions of
   each package in a generated file. These versions are then used when
   `cabal-tests` invokes `runghc` to run each test.
   We ensure they are built and available by listing `test-runtime-deps` in the
   `build-tool-depends` section of the `cabal-tests` executable.

3. Run your tests using `cabal-tests` (no need to rebuild when
   you add or modify a test; it is automatically picked up).
   The first time you run a test, assuming everything else is
   in order, it will complain that the actual output doesn't match
   the expected output.  Use the `--accept` flag to accept the
   output if it makes sense!

We also support a `.multitest.hs` suffix; eventually this will
allow multiple tests to be defined in one file but run in parallel;
at the moment, these just indicate long-running tests that should
be run early (to avoid straggling).

Frequently asked questions
--------------------------

For all of these answers, to see examples of the functions in
question, grep the test suite.

**Why isn't some output I added to Cabal showing up in the recorded
test output?** Only "marked" output is recorded by the test suite; currently,
only `notice`, `warn` and `die` produce marked output.  Use those
combinators for your output.

**How do I safely let my test modify version-controlled source files?**
Use `withSourceCopy`.  Note that you MUST `git add`
all files which are relevant to the test; otherwise they will not be
available when running the test.

**How can I add a dependency on a package from Hackage in a test?**
By default, the test suite is completely independent of the contents
of Hackage, to ensure that it keeps working across all GHC versions.
If possible, define the package locally.  If the package needs
to be from Hackage (e.g., you are testing the global store code
in new-build), use `withRepo "repo"` to initialize a "fake" Hackage with
the packages placed in the `repo` directory.

**How do I run an executable that my test built?** The specific
function you should use depends on how you built the executable:

* If you built it using `Setup build`, use `runExe`.
* If you installed it using `Setup install` or `cabal install`, use
  `runInstalledExe`.
* If you built it with `cabal build`, use `runPlanExe`; note
  that you will need to run this inside of a `withPlan` that is
  placed *after* you have invoked `build`. (Grep for an example!)

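As a rough sketch of the `cabal build` case: the package and executable
names below are made up, and the argument order of `runPlanExe` (package
name, executable name, arguments) is how we remember it from existing
tests, so check `Test.Cabal.Prelude` before relying on it.

```haskell
import Test.Cabal.Prelude

-- "my-package" and "my-exe" are hypothetical names for this sketch.
main = cabalTest $ do
    -- Build first, so that there is a plan for withPlan to load.
    cabal "v2-build" ["my-exe"]
    -- Load the plan, then look up and run the executable from it.
    withPlan $ runPlanExe "my-package" "my-exe" []
```
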
**How do I turn off accept tests? My test output wobbles too much.**
Use `recordMode DoNotRecord`.  This should be a last resort; consider
modifying Cabal so that the output is stable.  If you must do this, make
sure you add extra, manual tests to ensure the output looks like what
you expect.

**How can I manually test for a string in output?**  Use the primed
variants of a command (e.g., `cabal'` rather than `cabal`) and pass
the result to `assertOutputContains`.  Note that this will search over
BOTH stdout and stderr.

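The two answers above are often combined. Here is a minimal sketch,
assuming the combinators behave as described; the command, its target and
the expected string are placeholders rather than output Cabal is
guaranteed to print.

```haskell
import Test.Cabal.Prelude

main = cabalTest $ recordMode DoNotRecord $ do
    -- The primed variant returns the result instead of discarding it.
    res <- cabal' "v2-build" ["all"]
    -- Hypothetical expected string; matched against stdout and stderr.
    assertOutputContains "some text you expect" res
```
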
**How do I skip running a test in some environments?**  Use the
`skipIf` and `skipUnless` combinators.  Useful conditions to pass
to them include `hasSharedLibraries`, `hasProfiledLibraries`,
`hasCabalShared`, `isGhcVersion`, `isWindows`, `isLinux`, `isOSX`
and `hasCabalForGhc`.

**I programmatically modified a file in my test suite, but Cabal/GHC
doesn't seem to be picking it up.**  You need to sleep long enough
before editing a file for the change to be visible at the file system's
timestamp resolution.  Use `withDelay` and `delay` prior to
making a modification.

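A sketch of the pattern, loosely modelled on existing recompilation
tests. The executable name `my-exe` is hypothetical, and helpers such as
`writeSourceFile` and `runExe'` are used as we remember them from
`Test.Cabal.Prelude`, so treat the details as assumptions.

```haskell
import Test.Cabal.Prelude

-- withSourceCopy: we edit a tracked file, so work on a copy.
-- withDelay: enables the timestamp-aware delay used below.
main = setupAndCabalTest . withSourceCopy . withDelay $ do
    writeSourceFile "Main.hs" "main = putStrLn \"aaa\""
    setup "configure" []
    setup "build" []
    runExe' "my-exe" [] >>= assertOutputContains "aaa"
    -- Wait so that the next write gets a visibly newer timestamp.
    delay
    writeSourceFile "Main.hs" "main = putStrLn \"bbb\""
    setup "build" []
    runExe' "my-exe" [] >>= assertOutputContains "bbb"
```
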
**How do I mark a test as broken?**  Use `expectBroken`, which takes
the ticket number as its first argument.  Note that this does NOT
handle accept-test brokenness, so you will have to add a manual
string output test, if that is how your test is "failing."

Hermetic tests
--------------

By default, we run tests directly on the source code that is checked into the
source code repository.  However, some tests require programmatically
modifying source files, or interact with Cabal commands which are
not hermetic (e.g., `cabal freeze`).  In this case, cabal-testsuite
supports opting into a hermetic test, where we first make a copy of all
the relevant source code before starting the test.  You can opt into
this mode using the `withSourceCopy` combinator (search for examples!).
This mode is subject to the following limitations:

* You must be running the test inside a valid Git checkout of the test
  suite (`withSourceCopy` uses Git to determine which files should be copied).

* You must `git add` all files which are relevant to the test, otherwise
  they will not be copied.

* The source copy is still made at a well-known location, so running
  a test is still not reentrant. (See also Known limitations.)

Design notes
------------

This is the second rewrite of the integration testing framework.  The
primary goal was to use Haskell as the test language (letting us take
advantage of a real programming language, and use utilities provided to
us by the Cabal library itself), while at the same time compensating for
two perceived problems of pure-Haskell test suites:

* Haskell test suites are generally compiled before they run
  (for example, this is the modus operandi of `cabal test`).
  In practice, this results in a long edit-recompile cycle
  when working on tests. This hurts a lot when you would
  like to experimentally edit a test when debugging an issue.

* Haskell's metaprogramming facilities (e.g., Template Haskell)
  can't handle dynamically loading modules from the file system;
  thus, there ends up being a considerable amount of boilerplate
  needed to "wire" up test cases to the central test runner.

Our approach to address these issues is to maintain Haskell test scripts
as self-contained programs which are run by the GHCi interpreter.
This is not altogether trivial, and so there are a few important
technical innovations to make this work:

* Unlike a traditional test program which can be built by the Cabal
  build system, these test scripts must be interpretable at
  runtime (outside of the build system).  Our approach to handle
  this is to link against the same version of Cabal that was
  used to build the top-level test program (by way of a Custom
  setup linked against the Cabal library under test) and then
  use this library to compute the necessary GHC flags to pass
  to these scripts.

* The startup latency of `runghc` can be quite high, which adds up
  when you have many tests.  To solve this, our `Test.Cabal.Server`
  GHCi server implementation can reuse
  a GHCi instance as we are running test scripts.  It took some
  technical ingenuity to implement this, but the result is that
  the startup cost of running a script is essentially zero.

Here is the general outline of how the `cabal-tests` program operates:

1. It first loads the cached `LocalBuildInfo` associated with the
   host build system (which was responsible for building `cabal-tests`
   in the first place).  This information lets us compute the
   flags that we will use to subsequently invoke GHC.

2. We then recursively scan the current working directory, looking
   for files suffixed `.test.hs`; these are the test scripts we
   will run.

3. For every thread specified via the `-j` flag, we spawn a GHCi
   server, and then use these to run the test scripts until all
   test scripts have been run.

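The scanning step (step 2) can be sketched on its own in a few lines.
This is not the code `cabal-tests` uses; `isTestScript` and
`findTestScripts` are names invented for this illustration, and it only
looks for the `.test.hs` suffix mentioned in the outline.

```haskell
import Control.Monad (forM)
import Data.List (isSuffixOf)
import System.Directory (doesDirectoryExist, listDirectory)
import System.FilePath ((</>))

-- | A file counts as a test script if its name ends in ".test.hs".
isTestScript :: FilePath -> Bool
isTestScript = (".test.hs" `isSuffixOf`)

-- | Recursively collect test scripts below the given directory.
-- Symlink cycles are not handled in this sketch.
findTestScripts :: FilePath -> IO [FilePath]
findTestScripts dir = do
    entries <- listDirectory dir
    fmap concat . forM entries $ \entry -> do
        let path = dir </> entry
        isDir <- doesDirectoryExist path
        if isDir
            then findTestScripts path
            else pure [path | isTestScript path]
```

In GHCi, `findTestScripts "." >>= mapM_ putStrLn` lists the scripts
below the current directory.
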
The new `cabal-tests` runner doesn't use Tasty because I couldn't
figure out how to extract the threading setting from it, and then spawn
that many GHCi servers to service the running threads.  Improvements
welcome.

Expect tests
------------

An expect test (aka _golden test_) is a test where we capture the
output of the test and compare it directly against a saved copy of
the test output.
When test output changes, you can ask the test suite to "accept"
the new output, which automatically overwrites the old expected
test output with the new.

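The compare-or-accept cycle can be illustrated with a small standalone
sketch. It is not how `cabal-tests` is implemented; `Outcome`, `decide`
and `checkExpect` are invented names, and the saved output is assumed to
live in a single file.

```haskell
import Control.Monad (when)
import System.Directory (doesFileExist)

-- | Result of comparing actual output with the saved expectation.
data Outcome = Match | Mismatch | Accepted
  deriving (Eq, Show)

-- | Pure decision step. The first argument says whether we are in
-- "accept" mode; the second is the saved output, if any exists.
decide :: Bool -> Maybe String -> String -> Outcome
decide accept expected actual
  | expected == Just actual = Match
  | accept                  = Accepted
  | otherwise               = Mismatch

-- | Compare against the file at the given path, overwriting it when
-- the output differs and accept mode is on.
checkExpect :: Bool -> FilePath -> String -> IO Outcome
checkExpect accept path actual = do
  exists <- doesFileExist path
  expected <-
    if exists
      then do
        saved <- readFile path
        -- Force the lazy read so the file is closed before any write.
        length saved `seq` pure (Just saved)
      else pure Nothing
  let outcome = decide accept expected actual
  when (outcome == Accepted) $ writeFile path actual
  pure outcome
```

Note that a test with no saved output yet yields `Mismatch` unless
accept mode is on, which mirrors the first-run behaviour described in
the walkthrough.
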
Supporting expect tests with Cabal is challenging, because Cabal
interacts with multiple versions of external components (most
prominently GHC) with different variants of their output, and no
one wants to rerun a test on four different versions of GHC to make
sure we've picked up the correct output in all cases.

Still, we'd like to take advantage of expect tests for Cabal's error
reporting.  So here's our strategy:

1. We have a new verbosity flag `+markoutput` which lets you toggle the emission
   of `-----BEGIN CABAL OUTPUT-----` and `-----END CABAL OUTPUT-----`
   stanzas.

2. When someone requests an expect test, we ONLY consider output between
   these markers.

The expectation is that Cabal will only enclose output it controls
between these stanzas.  In practice, this just means we wrap `die`,
`warn` and `notice` with these markers.

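To make the filtering concrete, here is a small standalone sketch. It is
not the test suite's implementation: `extractMarked` is an invented name,
and the sketch assumes each marker sits on a line of its own.

```haskell
-- | Keep only the lines that appear between a begin marker and the
-- following end marker; everything else is dropped.
extractMarked :: String -> String
extractMarked = unlines . go False . lines
  where
    beginMarker = "-----BEGIN CABAL OUTPUT-----"
    endMarker   = "-----END CABAL OUTPUT-----"

    go _ [] = []
    go inside (l : ls)
      | l == beginMarker = go True ls
      | l == endMarker   = go False ls
      | inside           = l : go inside ls
      | otherwise        = go inside ls
```

Given compiler noise, a marked warning, and more noise, only the warning
line survives, which is why unmarked output never reaches the recorded
file.
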
An added benefit of this strategy is that we can continue operating
at high verbosity by default (which is very helpful for having useful
diagnostic information immediately, e.g. in CI).

We also need to deal with nondeterminism in test output in some
situations.  Here are the most common ones:

* Dependency solving output on failure is still nondeterministic, due to
  its dependence on the global package database.  We're tracking this
  in https://github.com/haskell/cabal/issues/4332 but for now, we're
  not running expect tests on this output.

* Tests against Custom setup will build against the Cabal that shipped with
  GHC, so you need to be careful NOT to record this output (since we
  don't control that output).

* We have some munging on the output, to remove common sources of
  nondeterminism: paths, GHC versions, boot package versions, etc.
  Check `normalizeOutput` to see what we do.  Note that we save
  *normalized* output, so if you modify the normalizer you will
  need to rerun the test suite accepting everything.

* The Setup interface has a `--enable-deterministic` flag which we
  pass by default.  The intent is to make Cabal more deterministic;
  for example, with this flag we no longer compute a hash when
  computing IPIDs, but just use the tag `-inplace`.  You can manually
  disable this using `--disable-deterministic` (as is the case with
  `UniqueIPID`).

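To give a feel for what output munging means, here is a toy normalizer.
It is only an illustration and not what `normalizeOutput` does:
`replaceAll` and `normalize` are invented names, the placeholders
`<ROOT>` and `<GHCVER>` are chosen for the example, and it uses plain
substring replacement.

```haskell
import Data.List (isPrefixOf)

-- | Replace every occurrence of a non-empty needle in the input.
replaceAll :: String -> String -> String -> String
replaceAll needle replacement = go
  where
    go [] = []
    go s@(c : cs)
      | not (null needle) && needle `isPrefixOf` s =
          replacement ++ go (drop (length needle) s)
      | otherwise = c : go cs

-- | Hide the checkout root and the GHC version behind placeholders.
normalize :: FilePath -> String -> String -> String
normalize root ghcVersion =
    replaceAll ghcVersion "<GHCVER>" . replaceAll root "<ROOT>"
```

Because the saved files contain the placeholders rather than the raw
text, changing a replacement rule changes every saved file that the rule
touches, hence the need to re-accept after editing the normalizer.
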
Some other notes:

* It's good style to put `default-language` in all your stanzas, so
  Cabal doesn't complain about it (that warning is marked!).  Ditto
  with `cabal-version` at the top of your Cabal file.

* If you can't get the output of a test to be deterministic, no
  problem: just exclude it from recording and do a manual test
  on the output for the string you're looking for.  Try to be
  deterministic, but sometimes it's not (easily) possible.

Non-goals
---------

Here are some things we do not currently plan on supporting:

* A file format for specifying multiple packages and source files.
  While in principle there is nothing wrong with making it easier
  to write tests, tests stored in this manner are more difficult
  to debug, as they must first be "decompressed" into a full
  folder hierarchy before they can be interacted with.  (But some
  of our tests need substantial setup; for example, tests that
  have to set up a package repository.  In this case, because
  setup is already necessary, we might consider making things easier here.)

Known limitations
-----------------

* Tests are NOT reentrant: test build products are always built into
  the same location, and if you run the same test twice at the same time,
  the two runs will clobber each other.  This is convenient for debugging and
  doesn't seem to be a problem in practice.