cabal-testsuite/README.md

   1 cabal-testsuite is a suite of integration tests for Cabal-based
   2 frameworks.
   3
   4 How to run
   5 ----------
   6
   7 1. Build `cabal-testsuite` (`cabal build cabal-testsuite:cabal-tests`)
   8 2. Run the `cabal-tests` executable. It will scan for all tests
   9    in your current directory and subdirectories and run them.
  10    To run a specific set of tests, use `cabal-tests --with-cabal=CABALBIN PATH ...`.
  11    (e.g. `cabal run cabal-testsuite:cabal-tests -- --with-cabal=cabal cabal-testsuite/PackageTests/TestOptions/setup.test.hs`)
  12    You can control parallelism using the `-j` flag.
  13
  14 There are a few useful flags:
  15
  16 * `--with-cabal PATH` can be used to specify the path of a
  17   `cabal-install` executable.  IF YOU DO NOT SPECIFY THIS FLAG,
  18   CABAL INSTALL TESTS WILL NOT RUN.
  19
  20 * `--with-ghc PATH` can be used to specify an alternate version of
  21   GHC to ask the tests to compile with.
  22
  23 * `--builddir DIR` can be used to manually specify the dist directory
  24   that was used to build `cabal-tests`; this can be used if
  25   the autodetection doesn't work correctly (which may be the
  26   case for old versions of GHC.)
  27
  28 * `--keep-tmp-files` can be used to keep the temporary directories that tests
  29   are run in.
  30
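The flags above can be combined in one invocation. The following is a minimal sketch rather than part of the test suite: `run_cabal_tests` is a hypothetical helper, and `CABAL_TESTS`, `CABAL`, `JOBS` and `DRY_RUN` are environment variables invented for this example. It assumes `-j` accepts the thread count as a separate argument. With `DRY_RUN=1` the helper only prints the command, which can be useful for checking the flags before a long run.

```shell
# Hypothetical helper (not part of cabal-testsuite): run cabal-tests on the
# given test scripts using the flags described in this README.
# Assumed environment variables, introduced only for this sketch:
#   CABAL_TESTS  path to the built cabal-tests executable (default: cabal-tests)
#   CABAL        cabal-install executable under test      (default: cabal)
#   JOBS         number of parallel test threads          (default: 2)
#   DRY_RUN      when set to 1, print the command instead of running it
run_cabal_tests() {
  # Prepend the executable and flags to the test paths passed by the caller.
  set -- "${CABAL_TESTS:-cabal-tests}" \
         --with-cabal="${CABAL:-cabal}" \
         -j "${JOBS:-2}" \
         --keep-tmp-files \
         "$@"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Show the command that would run for a single test script.
DRY_RUN=1
run_cabal_tests cabal-testsuite/PackageTests/TestOptions/setup.test.hs
```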
  31 ### How to run the doctests
  32
  33 You need to install the `doctest` tool. Make sure it's compiled with your current
  34 GHC, and don't forget to reinstall it every time you switch GHC version:
  35
  36 ``` shellsession
  37 cabal install doctest --overwrite-policy=always --ignore-project
  38 ```
  39
  40 After that you can run doctests for a component of your choice via the following command:
  41
  42 ``` shellsession
  43 cabal repl --with-ghc=doctest --build-depends=QuickCheck --build-depends=template-haskell --repl-options="-w" --project-file="cabal.project.validate" Cabal-syntax
  44 ```
  45
  46 In this example we have run doctests in `Cabal-syntax`. Note that some
  47 components have broken doctests
  48 ([#8734](https://github.com/haskell/cabal/issues/8734));
  49 our CI currently checks that `Cabal-syntax` and `Cabal` doctests pass via
  50 `make doctest-install && make doctest` (you can use this `make`-based workflow too).
  51
  52 How to write
  53 ------------
  54
  55 If you learn better by example, just look at the tests that live
  56 in `cabal-testsuite/PackageTests`; if you `git log -p`, you can
  57 see the full contents of various commits which added a test for
  58 various functionality.  See if you can find an existing test that
  59 is similar to what you want to test.
  60
  61 Each test runs in a temporary system directory. At the start of a test,
  62 all files in the same folder as the test script are copied into that
  63 temporary directory, and the rest of the script then operates in this
  64 directory.
  65
  66 **NOTE:** only files which are known to git are copied, so you have to
  67 `git add` any files which are part of a test before running the test.
  68 You can use the `--keep-tmp-files` flag to keep the temporary directories in
  69 order to inspect the result of running a test.
  70
  71 Otherwise, here is a walkthrough:
  72
  73 1. Create the package(s) that you need for your test in a
  74    new directory.
  75    (Currently (2021-10-06), tests are stored in `PackageTests`,
  76    with the exception of one test stored in `tests`.)
  77
  78 2. Create one or more `.test.hs` scripts in your directory, using
  79    the template:
  80    ```haskell
  81    import Test.Cabal.Prelude
  82    main = setupAndCabalTest $ do
  83        -- your test code here
  84    ```
  85
  86    `setupAndCabalTest` indicates that invocations of `setup`
  87    should work both for a raw `Setup` script, as well as
  88    `cabal-install` (if your test works only for one or the
  89    other, use `setupTest` or `cabalTest`).
  90
  91    Code runs in the `TestM` monad, which manages some administrative
  92    environment (e.g., which test is running).
  93    `Test.Cabal.Prelude` contains a number of useful functions
  94    for testing implemented in this monad, including the functions `cabal`
  95    and `setup` which let you invoke those respective programs.  You should
  96    read through that file to get a sense for what capabilities
  97    are possible (grep for use-sites of functions to see how they
  98    are used).  If you don't see something anywhere, that's probably
  99    because it isn't implemented. Implement it!
 100
 101    To include parts that are supposed to fail (in the sense that a
 102    non-zero exit code is returned), there is the `fails` combinator,
 103    e.g.:
 104    ```haskell
 105    main = cabalTest $ do
 106      fails $ cabal "bad-command" [ "bad", "args" ]
 107      cabal "good-command" [ "good", "args" ]
 108      fails $ cabal "another-bad-one" [ ... ]
 109      ...
 110    ```
 111
 112    The dependencies which your test is allowed to use are listed in the
 113    cabal file under the `test-runtime-deps` executable. At compile time, a
 114    custom `Setup.hs` script inspects this list and records the version of
 115    each package in a generated file. `cabal-tests` then uses these versions
 116    when it invokes `runghc` to run each test.
 117    We ensure they are built and available by listing `test-runtime-deps` in the
 118    `build-tool-depends` section of the `cabal-tests` executable.
 119
 120
 121 3. Run your tests using `cabal-tests` (no need to rebuild when
 122    you add or modify a test; it is automatically picked up).
 123    The first time you run a test, assuming everything else is
 124    in order, it will complain that the actual output doesn't match
 125    the expected output.  Use the `--accept` flag to accept the
 126    output if it makes sense!
 127
 128 We also support a `.multitest.hs` suffix; eventually this will
 129 allow multiple tests to be defined in one file but run in parallel;
 130 at the moment, these just indicate long-running tests that should
 131 be run early (to avoid straggling).
 132
 133 Frequently asked questions
 134 --------------------------
 135
 136 For all of these answers, to see examples of the functions in
 137 question, grep the test suite.
 138
 139 **Why isn't some output I added to Cabal showing up in the recorded
 140 test output?** Only "marked" output is recorded by the test suite; currently,
 141 only `notice`, `warn` and `die` produce marked output.  Use those
 142 combinators for your output.
 143
 144 **How can I add a dependency on a package from Hackage in a test?**
 145 By default, the test suite is completely independent of the contents
 146 of Hackage, to ensure that it keeps working across all GHC versions.
 147 If possible, define the package locally.  If the package needs
 148 to be from Hackage (e.g., you are testing the global store code
 149 in new-build), use `withRepo "repo"` to initialize a "fake" Hackage with
 150 the packages placed in the `repo` directory.
 151
 152 **How do I run an executable that my test built?** The specific
 153 function you should use depends on how you built the executable:
 154
 155 * If you built it using `Setup build`, use `runExe`.
 156 * If you installed it using `Setup install` or `cabal install`, use
 157   `runInstalledExe`.
 158 * If you built it with `cabal build`, use `runPlanExe`; note
 159   that you will need to run this inside of a `withPlan` that is
 160   placed *after* you have invoked `build`. (Grep for an example!)
 161
 162 **How do I turn off accept tests? My test output wobbles too much.**
 163 Use `recordMode DoNotRecord`.  This should be a last resort; consider
 164 modifying Cabal so that the output is stable.  If you must do this, make
 165 sure you add extra, manual tests to ensure the output looks like what
 166 you expect.
 167
 168 **How can I manually test for a string in output?**  Use the primed
 169 variants of a command (e.g., `cabal'` rather than `cabal`) and use
 170 `assertOutputContains`.  Note that this will search over BOTH stdout
 171 and stderr.
 172
 173 **How do I skip running a test in some environments?**  Use the
 174 `skipIf` and `skipUnless` combinators.  Useful parameters to test
 175 these with include `hasSharedLibraries`, `hasProfiledLibraries`,
 176 `hasCabalShared`, `isGhcVersion`, `isWindows`, `isLinux`, `isOSX`
 177 and `hasCabalForGhc`.
 178
 179 **I programmatically modified a file in my test suite, but Cabal/GHC
 180 doesn't seem to be picking it up.**  You need to sleep sufficiently
 181 long before editing a file, in order for file system timestamp
 182 resolution to pick it up.  Use `withDelay` and `delay` prior to
 183 making a modification.
 184
 185 **How do I mark a test as broken?**  Use `expectBroken`, which takes
 186 the ticket number as its first argument.  Note that this does NOT
 187 handle accept-test brokenness, so you will have to add a manual
 188 string output test, if that is how your test is "failing."
 189
 190 Hermetic tests
 191 --------------
 192
 193 Tests are run in a fresh temporary system directory. This attempts to isolate the
 194 tests from anything specific to your directory structure. In particular:
 195
 196 * You must be running the test inside a valid Git checkout of the test
 197   suite (`withSourceCopy` uses Git to determine which files should be copied.)
 198
 199 * You must `git add` all files which are relevant to the test, otherwise
 200   they will not be copied.
 201
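Both requirements can be checked from the shell before running a new test. This is a sketch under the assumption that it is run from inside the Git checkout; `untracked_test_files` and `check_test_dir` are hypothetical helpers, not part of the test suite. It relies only on `git ls-files --others --exclude-standard`, which lists files that Git does not track yet and that are not ignored. Ignored files are not reported, although Git does not know about them either. One possible use is to call `check_test_dir` on your test directory before invoking `cabal-tests`.

```shell
# Hypothetical helper: list files under a test directory that Git does not
# track yet (untracked and not ignored). Fails outside a Git checkout.
untracked_test_files() {
  git ls-files --others --exclude-standard -- "$1"
}

# Hypothetical pre-flight check: succeed only if every non-ignored file in
# the test directory is known to Git.
check_test_dir() {
  # Outside a Git checkout, git ls-files fails and so does this check.
  missing=$(untracked_test_files "$1") || return 1
  if [ -n "$missing" ]; then
    echo "Not known to Git, so these would not be copied:" >&2
    echo "$missing" >&2
    return 1
  fi
}
```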
 202 Design notes
 203 ------------
 204
 205 This is the second rewrite of the integration testing framework.  The
 206 primary goal was to use Haskell as the test language (letting us take
 207 advantage of a real programming language, and use utilities provided to
 208 us by the Cabal library itself), while at the same time compensating for
 209 two perceived problems of pure-Haskell test suites:
 210
 211 * Haskell test suites are generally compiled before they run
 212   (for example, this is the modus operandi of `cabal test`).
 213   In practice, this results in a long edit-recompile cycle
 214   when working on tests. This hurts a lot when you would
 215   like to experimentally edit a test when debugging an issue.
 216
 217 * Haskell's metaprogramming facilities (e.g., Template Haskell)
 218   can't handle dynamically loading modules from the file system;
 219   thus, there ends up being a considerable amount of boilerplate
 220   needed to "wire" up test cases to the central test runner.
 221
 222 Our approach to address these issues is to maintain Haskell test scripts
 223 as self-contained programs which are run by the GHCi interpreter.
 224 This is not altogether trivial, and so there are a few important
 225 technical innovations to make this work:
 226
 227 * Unlike a traditional test program which can be built by the Cabal
 228   build system, these test scripts must be interpretable at
 229   runtime (outside of the build system.)  Our approach to handle
 230   this is to link against the same version of Cabal that was
 231   used to build the top-level test program (by way of a Custom
 232   setup linked against the Cabal library under test) and then
 233   use this library to compute the necessary GHC flags to pass
 234   to these scripts.
 235
 236 * The startup latency of `runghc` can be quite high, which adds up
 237   when you have many tests.  To solve this, our `Test.Cabal.Server`
 238   GHCi server implementation can reuse
 239   a GHCi instance as we are running test scripts.  It took some
 240   technical ingenuity to implement this, but the result is that
 241   running scripts is essentially free.
 242
 243 Here is the general outline of how the `cabal-tests` program operates:
 244
 245 1. It first loads the cached `LocalBuildInfo` associated with the
 246    host build system (which was responsible for building `cabal-tests`
 247    in the first place.)  This information lets us compute the
 248    flags that we will use to subsequently invoke GHC.
 249
 250 2. We then recursively scan the current working directory, looking
 251    for files suffixed `.test.hs`; these are the test scripts we
 252    will run.
 253
 254 3. For every thread specified via the `-j` flag, we spawn a GHCi
 255    server, and then use these to run the test scripts until all
 256    test scripts have been run.
 257
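Step 2 of this outline can be approximated from the shell, for example to preview which scripts a run from a given directory would pick up. This is only a sketch: `list_test_scripts` is a hypothetical helper, the real scan is implemented in Haskell inside `cabal-tests`, and including `.multitest.hs` files is an assumption based on the suffix mentioned earlier in this README.

```shell
# Hypothetical helper: recursively list test scripts below a directory
# (default: the current directory), in a stable order.
list_test_scripts() {
  find "${1:-.}" -type f \( -name '*.test.hs' -o -name '*.multitest.hs' \) |
    LC_ALL=C sort
}

# Demonstration in a throwaway directory.
demo=$(mktemp -d)
mkdir -p "$demo/Foo"
: > "$demo/Foo/setup.test.hs"
: > "$demo/Foo/slow.multitest.hs"
: > "$demo/Foo/Setup.hs"          # not a test script, so it is not listed
(cd "$demo" && list_test_scripts .)
rm -rf "$demo"
```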
 258 The new `cabal-tests` runner doesn't use Tasty because I couldn't
 259 figure out how to extract its threading setting, and then spawn
 260 that many GHCi servers to service the running threads.  Improvements
 261 welcome.
 262
 263 Expect tests
 264 ------------
 265
 266 An expect test (aka _golden test_)
 267 is a test where we read out the output of the test
 268 and compare it directly against a saved copy of the test output.
 269 When test output changes, you can ask the test suite to "accept"
 270 the new output, which automatically overwrites the old expected
 271 test output with the new.
 272
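The compare-then-accept cycle can be illustrated with a few lines of shell. This is a sketch of the general idea only, not how `cabal-tests` implements it: `check_expected` and the file names are hypothetical, and the real runner compares marked and normalized output rather than raw files.

```shell
# Hypothetical helper: compare ACTUAL against the saved EXPECTED copy.
# usage: check_expected ACTUAL EXPECTED [--accept]
check_expected() {
  actual=$1
  expected=$2
  mode=${3:-}
  if [ -f "$expected" ] && cmp -s "$actual" "$expected"; then
    echo "ok"
  elif [ "$mode" = "--accept" ]; then
    cp "$actual" "$expected"     # overwrite the saved copy with the new output
    echo "accepted"
  else
    echo "mismatch"
    return 1
  fi
}

# Demonstration: the first run has no saved copy, so it reports a mismatch;
# accepting stores the output, and the next run passes.
tmp=$(mktemp -d)
printf 'Warning: example\n' > "$tmp/actual"
check_expected "$tmp/actual" "$tmp/expected" || true
check_expected "$tmp/actual" "$tmp/expected" --accept
check_expected "$tmp/actual" "$tmp/expected"
rm -rf "$tmp"
```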
 273 Supporting expect tests with Cabal is challenging, because Cabal
 274 interacts with multiple versions of external components (most
 275 prominently GHC) with different variants of their output, and no
 276 one wants to rerun a test on four different versions of GHC to make
 277 sure we've picked up the correct output in all cases.
 278
 279 Still, we'd like to take advantage of expect tests for Cabal's error
 280 reporting.  So here's our strategy:
 281
 282 1. We have a new verbosity flag `+markoutput` which lets you toggle the emission
 283    of `-----BEGIN CABAL OUTPUT-----` and  `-----END CABAL OUTPUT-----`
 284    stanzas.
 285
 286 2. When someone requests an expect test, we ONLY consider output between
 287    these markers.
 288
 289 The expectation is that Cabal will only enclose output it controls
 290 between these stanzas.  In practice, this just means we wrap `die`,
 291 `warn` and `notice` with these markers.
 292
 293 An added benefit of this strategy is that we can continue operating
 294 at high verbosity by default (which is very helpful for having useful
 295 diagnostic information immediately, e.g. in CI.)
 296
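To make the filtering concrete, here is a small sketch that keeps only the text between the two markers. It is not the test suite's implementation: `extract_marked_output` is a hypothetical helper, and the sketch assumes each marker appears at the start of its own line.

```shell
# Hypothetical helper: read a build log on stdin and print only the lines
# enclosed by the BEGIN/END markers (the marker lines themselves are dropped).
extract_marked_output() {
  awk '
    /^-----BEGIN CABAL OUTPUT-----/ { inside = 1; next }
    /^-----END CABAL OUTPUT-----/   { inside = 0; next }
    inside { print }
  '
}

# Prints "Warning: example warning"; the surrounding lines are ignored.
extract_marked_output <<'EOF'
Configuring example-0.1...
-----BEGIN CABAL OUTPUT-----
Warning: example warning
-----END CABAL OUTPUT-----
[1 of 1] Compiling Main
EOF
```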
 297 We also need to deal with non-determinism in test output in some
 298 situations.  Here are the most common ones:
 299
 300 * Dependency solving output on failure is still non-deterministic, due to
 301   its dependence on the global package database.  We're tracking this
 302   in https://github.com/haskell/cabal/issues/4332 but for now, we're
 303   not running expect tests on this output.
 304
 305 * Tests against Custom setup will build against the Cabal that shipped with
 306   GHC, so you need to be careful NOT to record this output (since we
 307   don't control that output.)
 308
 309 * We have some munging on the output, to remove common sources of
 310   non-determinism: paths, GHC versions, boot package versions, etc.
 311   Check `normalizeOutput` to see what we do.  Note that we save
 312   *normalized* output, so if you modify the normalizer you will
 313   need to rerun the test suite accepting everything.
 314
 315 * The Setup interface gets a `--enable-deterministic` flag which we
 316   pass by default.  The intent is to make Cabal more deterministic;
 317   for example, with this flag we no longer compute a hash when
 318   computing IPIDs, but just use the tag `-inplace`.  You can manually
 319   disable this using `--disable-deterministic` (as is the case with
 320   `UniqueIPID`.)
 321
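As a rough illustration of the kind of munging involved, the sketch below replaces a temporary directory path and GHC version numbers with fixed placeholders. It is not the actual `normalizeOutput` function, which is written in Haskell and handles more cases; `normalize_output` is a hypothetical helper and the placeholder names `<ROOT>` and `<GHCVER>` are assumptions made for this example.

```shell
# Hypothetical helper: normalize a log read from stdin.
#   $1  absolute path of the temporary test directory; assumed to contain no
#       characters that are special to sed, such as `|` or `[`
normalize_output() {
  sed -e "s|$1|<ROOT>|g" \
      -e 's/ghc-[0-9][0-9]*\(\.[0-9][0-9]*\)*/ghc-<GHCVER>/g'
}

# Prints: Building with ghc-<GHCVER> in <ROOT>/dist
printf '%s\n' 'Building with ghc-9.4.8 in /tmp/cabal-test-1234/dist' |
  normalize_output /tmp/cabal-test-1234
```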
 322 Some other notes:
 323
 324 * It's good style to put `default-language` in all your stanzas, so
 325   Cabal doesn't complain about it (that warning is marked!).  Ditto
 326   with `cabal-version` at the top of your Cabal file.
 327
 328 * If you can't get the output of a test to be deterministic, no
 329   problem: just exclude it from recording and do a manual test
 330   on the output for the string you're looking for.  Try to be
 331   deterministic, but sometimes it's not (easily) possible.
 332
 333 Non-goals
 334 ---------
 335
 336 Here are some things we do not currently plan on supporting:
 337
 338 * A file format for specifying multiple packages and source files.
 339   While in principle there is nothing wrong with making it easier
 340   to write tests, tests stored in this manner are more difficult
 341   to debug with, as they must first be "decompressed" into a full
 342   folder hierarchy before they can be interacted with.  (But some
 343   of our tests need substantial setup; for example, tests that
 344   have to set up a package repository.  In this case, because some
 345   setup is already necessary, we might consider making things easier here.)