cabal-testsuite/README.md

cabal-testsuite is a suite of integration tests for Cabal-based
frameworks.

# How to run

1. Build `cabal-testsuite` (`cabal build cabal-testsuite:cabal-tests`)
2. Run the `cabal-tests` executable. It will scan for all tests
   in your current directory and subdirectories and run them.

There are a few useful flags:

* To run a specific set of tests, pass the path to a `.test.hs` file to run or
  use the `-p`/`--pattern` flag to filter tests.

  See the ["Selecting tests"](#selecting-tests) section below for more details.

* `-j INT` controls the number of threads used for running tests.

* `--with-cabal PATH` can be used to specify the path of a
  `cabal-install` executable.  IF YOU DO NOT SPECIFY THIS FLAG,
  CABAL INSTALL TESTS WILL NOT RUN.

* `--with-ghc PATH` can be used to specify an alternate version of
  GHC to ask the tests to compile with.

* `--builddir DIR` can be used to manually specify the dist directory
  that was used to build `cabal-tests`; this can be used if
  the autodetection doesn't work correctly (which may be the
  case for old versions of GHC.)

* `--keep-tmp-files` can be used to keep the temporary directories that tests
  are run in.

## Selecting tests

To run a specific set of tests, use `cabal-tests --with-cabal=CABALBIN PATH ...`, e.g.:

```
cabal run cabal-testsuite:cabal-tests -- \
   --with-cabal=cabal \
   cabal-testsuite/PackageTests/TestOptions/setup.test.hs
```

Alternatively, use `-p`/`--pattern` to select tests dynamically:

```
cabal run cabal-testsuite:cabal-tests -- \
   --with-cabal=cabal \
   --pattern "/TestOptions/"
```

See [the documentation for Tasty pattern
syntax](https://hackage.haskell.org/package/tasty#patterns) for more
information.

## Which Cabal library version do cabal-install tests use?

By default the `cabal-install` tests will use the `Cabal` library which comes with
the boot compiler when they need to build a custom `Setup.hs`.

This can be very confusing if you are modifying the Cabal library, writing a test
which relies on a custom setup script, and wondering why the test is not
responding at all to your changes.

There are some flags which allow you to instruct `cabal-install` to use a different
`Cabal` library version.

1. `--boot-cabal-lib` uses the Cabal library bundled with the
   test compiler; this is the default.
2. `--intree-cabal-lib=<root_dir>` uses Cabal and Cabal-syntax
   from a specific directory; `--test-tmp` indicates where to put
   the package database they are built into.
3. `--specific-cabal-lib=<VERSION>` uses a specific Cabal
   version from Hackage (e.g. 3.10.2.0) and installs the package database
   into the directory given by `--test-tmp=<DIR>`.

The CI scripts use the `--intree-cabal-lib` option for the most part, but in
the future there should be a variety of jobs which test `cabal-install` built
against newer `Cabal` versions but forced to interact with older `Cabal` versions.

### How to run the doctests

You need to install the `doctest` tool. Make sure it's compiled with your current
GHC, and don't forget to reinstall it every time you switch GHC version:

``` shellsession
cabal install doctest --overwrite-policy=always --ignore-project
```

After that you can run doctests for a component of your choice via the following command:

``` shellsession
cabal repl --with-ghc=doctest --build-depends=QuickCheck --build-depends=template-haskell --repl-options="-w" --project-file="cabal.validate.project" Cabal-syntax
```

In this example we have run doctests in `Cabal-syntax`. Note that some
components have broken doctests
([#8734](https://github.com/haskell/cabal/issues/8734));
our CI currently checks that `Cabal-syntax` and `Cabal` doctests pass via
`make doctest-install && make doctest` (you can use this `make`-based workflow too).

# How to write

If you learn better by example, just look at the tests that live
in `cabal-testsuite/PackageTests`; if you `git log -p`, you can
see the full contents of various commits which added a test for
various functionality.  See if you can find an existing test that
is similar to what you want to test.

Tests are all run in temporary system directories. At the start of a test
all the files which are in the same folder as the test script are copied into
a system temporary directory and then the rest of the script operates in this
directory.

**NOTE:** only files which are known to git are copied, so you have to
`git add` any files which are part of a test before running the test.
You can use the `--keep-tmp-files` flag to keep the temporary directories in
order to inspect the result of running a test.

Otherwise, here is a walkthrough:

1. Create the package(s) that you need for your test in a
   new directory.
   (Currently (2021-10-06), tests are stored in `PackageTests`,
   with the exception of one test stored in `tests`.)

2. Create one or more `.test.hs` scripts in your directory, using
   the template:
   ```haskell
   import Test.Cabal.Prelude
   main = setupAndCabalTest $ do
       -- your test code here
   ```

   `setupAndCabalTest` indicates that invocations of `setup`
   should work both for a raw `Setup` script, as well as
   `cabal-install` (if your test works only for one or the
   other, use `setupTest` or `cabalTest`).

   Code runs in the `TestM` monad, which manages some administrative
   environment (e.g., the test that is running, etc.).
   `Test.Cabal.Prelude` contains a number of useful functions
   for testing implemented in this monad, including the functions `cabal`
   and `setup` which let you invoke those respective programs.  You should
   read through that file to get a sense for what capabilities
   are possible (grep for use-sites of functions to see how they
   are used).  If you don't see something anywhere, that's probably
   because it isn't implemented. Implement it!

   To include parts that are supposed to fail (in the sense that a
   non-zero exit code is returned), there is the `fails` combinator,
   e.g.:
   ```haskell
   main = cabalTest $ do
     fails $ cabal "bad-command" [ "bad", "args" ]
     cabal "good-command" [ "good", "args" ]
     fails $ cabal "another-bad-one" [ ... ]
     ...
   ```

   The dependencies which your test is allowed to use are listed in the
   cabal file under the `test-runtime-deps` executable. At compile time, a
   custom `Setup.hs` script inspects this list and records the version of
   each package in a generated file. `cabal-tests` then uses these versions
   when it invokes `runghc` to run each test.
   We ensure they are built and available by listing `test-runtime-deps` in the
   build-tool-depends section of the cabal-tests executable.

3. Run your tests using `cabal-tests` (no need to rebuild when
   you add or modify a test; it is automatically picked up).
   The first time you run a test, assuming everything else is
   in order, it will complain that the actual output doesn't match
   the expected output.  Use the `--accept` flag to accept the
   output if it makes sense!

We also support a `.multitest.hs` suffix; eventually this will
allow multiple tests to be defined in one file but run in parallel;
at the moment, these just indicate long running tests that should
be run early (to avoid straggling).
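
As a rough illustration of the two suffixes, the sketch below classifies paths and puts `.multitest.hs` scripts at the front of the schedule. It is a standalone model that uses only `base`, not the code `cabal-tests` actually runs; the names `isTestScript` and `scheduleOrder` are invented for this example, and it assumes that both suffixes are found by the same scan.

```haskell
import Data.List (isSuffixOf, partition)

-- | A path counts as a test script if it ends in one of the two suffixes.
-- (Hypothetical helper; the real runner may classify files differently.)
isTestScript :: FilePath -> Bool
isTestScript p = ".test.hs" `isSuffixOf` p || ".multitest.hs" `isSuffixOf` p

-- | Drop everything that is not a test script, then move the
-- long-running `.multitest.hs` scripts to the front so they start early.
-- Relative order within each group is preserved.
scheduleOrder :: [FilePath] -> [FilePath]
scheduleOrder paths = longRunning ++ ordinary
  where
    (longRunning, ordinary) =
      partition (".multitest.hs" `isSuffixOf`) (filter isTestScript paths)
```

Loaded into GHCi, `scheduleOrder ["a/setup.test.hs", "b/long.multitest.hs"]` evaluates to `["b/long.multitest.hs","a/setup.test.hs"]`.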

# Frequently asked questions

For all of these answers, to see examples of the functions in
question, grep the test suite.

**Why isn't some output I added to Cabal showing up in the recorded
test output?** Only "marked" output is picked up by the test suite; currently,
only `notice`, `warn` and `die` produce marked output.  Use those
combinators for your output.

**How can I add a dependency on a package from Hackage in a test?**
By default, the test suite is completely independent of the contents
of Hackage, to ensure that it keeps working across all GHC versions.
If possible, define the package locally.  If the package needs
to be from Hackage (e.g., you are testing the global store code
in new-build), use `withRepo "repo"` to initialize a "fake" Hackage with
the packages placed in the `repo` directory.

**How do I run an executable that my test built?** The specific
function you should use depends on how you built the executable:

* If you built it using `Setup build`, use `runExe`
* If you installed it using `Setup install` or `cabal install`, use
  `runInstalledExe`.
* If you built it with `cabal build`, use `runPlanExe`; note
  that you will need to run this inside of a `withPlan` that is
  placed *after* you have invoked `build`. (Grep for an example!)

**How do I turn off accept tests? My test output wobbles too much.**
Use `recordMode DoNotRecord`.  This should be a last resort; consider
modifying Cabal so that the output is stable.  If you must do this, make
sure you add extra, manual tests to ensure the output looks like what
you expect.

**How can I manually test for a string in output?**  Use the primed
variants of a command (e.g., `cabal'` rather than `cabal`) and use
`assertOutputContains`.  Note that this will search over BOTH stdout
and stderr.

**How do I skip running a test in some environments?**  Use the
`skipIf` and `skipUnless` combinators.  Useful parameters to test
these with include `hasSharedLibraries`, `hasProfiledLibraries`,
`hasCabalShared`, `isGhcVersion`, `isWindows`, `isLinux`, `isOSX`.

There are some pre-defined versions of those combinators like `skipIfWindows`
or `skipIfCI`. If possible try to use those as the error message will be uniform
with other tests, allowing for `grep`ing the output more easily.

Make sure that you only skip tests which cannot be run for fundamental reasons,
like the OS or the capabilities of the GHC version. If a test is failing do not
skip it, mark it as broken instead (see next question).

**How do I mark a test as broken?**  Use `expectBroken`, which takes
the ticket number as its first argument.

**How do I mark a flaky test?** If a test passes only sometimes for unknown
reasons, it is better to mark it as flaky with the `flaky` and `flakyIf`
combinators. They both take a ticket number, so each flaky test has to be tracked
in an issue. Flaky tests are executed, and the outcome is reported by the
test-suite, but even if they fail they won't make the test-suite fail.

**I programmatically modified a file in my test suite, but Cabal/GHC
doesn't seem to be picking it up.**  You need to sleep sufficiently
long before editing a file, in order for file system timestamp
resolution to pick it up.  Use `withDelay` and `delay` prior to
making a modification.

# Hermetic tests

Tests are run in a fresh temporary system directory. This attempts to isolate the
tests from anything specific to do with your directory structure. In particular:

* You must be running the test inside a valid Git checkout of the test
  suite (`withSourceCopy` uses Git to determine which files should be copied.)

* You must `git add` all files which are relevant to the test, otherwise
  they will not be copied.
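
To make the consequence of the second point concrete, here is a small standalone model of the selection step. It assumes the list of tracked paths has already been obtained from Git (for instance as the lines printed by `git ls-files`, relative to the repository root); how `withSourceCopy` really queries Git is not described here, and `filesToCopy` is a name made up for this sketch.

```haskell
import Data.List (isPrefixOf)

-- | Given a test directory and the paths Git reports as tracked, return
-- the files that would be copied, relative to the test directory.
-- Anything Git does not know about never appears in the input list,
-- so it can never end up in the temporary directory.
filesToCopy :: FilePath -> [FilePath] -> [FilePath]
filesToCopy testDir tracked =
  [ drop (length prefix) path   -- strip the directory prefix
  | path <- tracked
  , prefix `isPrefixOf` path    -- keep only files below the test directory
  ]
  where
    prefix = testDir ++ "/"
```

For example, a freshly created `PackageTests/Foo/Extra.hs` that has not been passed to `git add` is missing from the tracked list, so `filesToCopy "PackageTests/Foo"` cannot return it.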

# Design notes

This is the second rewrite of the integration testing framework.  The
primary goal was to use Haskell as the test language (letting us take
advantage of a real programming language, and use utilities provided to
us by the Cabal library itself), while at the same time compensating for
two perceived problems of pure-Haskell test suites:

* Haskell test suites are generally compiled before they run
  (for example, this is the modus operandi of `cabal test`).
  In practice, this results in a long edit-recompile cycle
  when working on tests. This hurts a lot when you would
  like to experimentally edit a test when debugging an issue.

* Haskell's metaprogramming facilities (e.g., Template Haskell)
  can't handle dynamically loading modules from the file system;
  thus, there ends up being a considerable amount of boilerplate
  needed to "wire" up test cases to the central test runner.

Our approach to address these issues is to maintain Haskell test scripts
as self-contained programs which are run by the GHCi interpreter.
This is not altogether trivial, and so there are a few important
technical innovations to make this work:

* Unlike a traditional test program which can be built by the Cabal
  build system, these test scripts must be interpretable at
  runtime (outside of the build system.)  Our approach to handle
  this is to link against the same version of Cabal that was
  used to build the top-level test program (by way of a Custom
  setup linked against the Cabal library under test) and then
  use this library to compute the necessary GHC flags to pass
  to these scripts.

* The startup latency of `runghc` can be quite high, which adds up
  when you have many tests.  To solve this, our `Test.Cabal.Server`
  GHCi server implementation can reuse
  a GHCi instance as we are running test scripts.  It took some
  technical ingenuity to implement this, but the result is that
  running scripts is essentially free.

Here is the general outline of how the `cabal-tests` program operates:

1. It first loads the cached `LocalBuildInfo` associated with the
   host build system (which was responsible for building `cabal-tests`
   in the first place.)  This information lets us compute the
   flags that we will use to subsequently invoke GHC.

2. We then recursively scan the current working directory, looking
   for files suffixed `.test.hs`; these are the test scripts we
   will run.

3. For every thread specified via the `-j` flag, we spawn a GHCi
   server, and then use these to run the test scripts until all
   test scripts have been run.

The new `cabal-tests` runner doesn't use Tasty because I couldn't
figure out how to get out the threading setting, and then spawn
that many GHCi servers to service the running threads.  Improvements
welcome.
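
Step 3 of the outline is essentially a fixed-size worker pool draining a shared queue. The sketch below shows that shape with plain threads and `MVar`s from `base`. It is only a model: `runPool` is a hypothetical name, each worker runs an arbitrary `IO` action where the real runner would talk to a GHCi server, and there is no exception handling (a job that throws would leave the pool waiting forever).

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar
  (modifyMVar, modifyMVar_, newEmptyMVar, newMVar, putMVar, readMVar, takeMVar)
import Control.Monad (replicateM_)

-- | Run every job on a pool of @n@ workers (at least one) and collect
-- the results. Results are gathered in completion order, not input order.
runPool :: Int -> (job -> IO result) -> [job] -> IO [result]
runPool n runJob jobs = do
  queue   <- newMVar jobs     -- jobs that no worker has claimed yet
  results <- newMVar []       -- results of finished jobs
  done    <- newEmptyMVar     -- each worker signals here when it stops
  let workers = max 1 n
      worker = do
        -- Atomically claim the next job, if any is left.
        next <- modifyMVar queue $ \q -> pure $ case q of
          []       -> ([], Nothing)
          (j : js) -> (js, Just j)
        case next of
          Nothing -> putMVar done ()
          Just j  -> do
            r <- runJob j
            modifyMVar_ results (pure . (r :))
            worker
  replicateM_ workers (forkIO worker)
  replicateM_ workers (takeMVar done)   -- wait for all workers to finish
  readMVar results
```

Because idle workers pull the next job themselves, one slow test script does not hold up the others, which is also why starting long-running scripts early helps avoid stragglers.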

# Expect tests

An expect test (aka _golden test_)
is a test where we read out the output of the test
and compare it directly against a saved copy of the test output.
When test output changes, you can ask the test suite to "accept"
the new output, which automatically overwrites the old expected
test output with the new.
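
The compare-or-accept decision can be modelled as a small pure function. This is a sketch under assumptions rather than the test suite's implementation: `Verdict` and `checkExpect` are invented names, the saved copy is represented as a `Maybe String` (with `Nothing` meaning no output has been recorded yet), and the `Bool` stands for whether accepting was requested.

```haskell
-- | Result of comparing the actual output with the saved copy.
data Verdict = Matches | Mismatch | Accepted
  deriving (Eq, Show)

-- | Compare actual output with the saved expectation.
-- Returns the verdict together with the expectation to keep on disk.
checkExpect
  :: Bool           -- ^ was accepting requested?
  -> Maybe String   -- ^ saved expected output, if any
  -> String         -- ^ actual output of this run
  -> (Verdict, Maybe String)
checkExpect accept saved actual
  | saved == Just actual = (Matches, saved)          -- nothing to do
  | accept               = (Accepted, Just actual)   -- overwrite the saved copy
  | otherwise            = (Mismatch, saved)         -- report, keep the old copy
```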

Supporting expect tests with Cabal is challenging, because Cabal
interacts with multiple versions of external components (most
prominently GHC) with different variants of their output, and no
one wants to rerun a test on four different versions of GHC to make
sure we've picked up the correct output in all cases.

Still, we'd like to take advantage of expect tests for Cabal's error
reporting.  So here's our strategy:

1. We have a new verbosity flag `+markoutput` which lets you toggle the emission
   of `-----BEGIN CABAL OUTPUT-----` and `-----END CABAL OUTPUT-----`
   stanzas.

2. When someone requests an expect test, we ONLY consider output between
   these markers.

The expectation is that Cabal will only enclose output it controls
between these stanzas.  In practice, this just means we wrap `die`,
`warn` and `notice` with these markers.
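
A minimal sketch of the filtering step follows, assuming each marker sits on a line of its own. The marker strings are the ones quoted above; `extractMarked` is a hypothetical helper written for this illustration and is not taken from the test suite.

```haskell
beginMarker, endMarker :: String
beginMarker = "-----BEGIN CABAL OUTPUT-----"
endMarker   = "-----END CABAL OUTPUT-----"

-- | Keep only the lines enclosed by marker pairs. The markers themselves
-- and all unmarked output (for example GHC's own messages) are dropped.
extractMarked :: String -> [String]
extractMarked = outside . lines
  where
    -- Outside a marked region: skip lines until a begin marker shows up.
    outside []       = []
    outside (l : ls)
      | l == beginMarker = inside ls
      | otherwise        = outside ls
    -- Inside a marked region: keep lines until the end marker shows up.
    inside []        = []
    inside (l : ls)
      | l == endMarker = outside ls
      | otherwise      = l : inside ls
```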

An added benefit of this strategy is that we can continue operating
at high verbosity by default (which is very helpful for having useful
diagnostic information immediately, e.g. in CI.)

We also need to deal with nondeterminism in test output in some
situations.  Here are the most common ones:

* Dependency solving output on failure is still non-deterministic, due to
  its dependence on the global package database.  We're tracking this
  in https://github.com/haskell/cabal/issues/4332 but for now, we're
  not running expect tests on this output.

* Tests against Custom setup will build against the Cabal that shipped with
  GHC, so you need to be careful NOT to record this output (since we
  don't control that output.)

* We have some munging on the output, to remove common sources of
  non-determinism: paths, GHC versions, boot package versions, etc.
  Check `normalizeOutput` to see what we do.  Note that we save
  *normalized* output, so if you modify the normalizer you will
  need to rerun the test suite accepting everything.

* The Setup interface gets a `--enable-deterministic` flag which we
  pass by default.  The intent is to make Cabal more deterministic;
  for example, with this flag we no longer compute a hash when
  computing IPIDs, but just use the tag `-inplace`.  You can manually
  disable this using `--disable-deterministic` (as is the case with
  `UniqueIPID`.)
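
To give a feel for the kind of munging meant in the third bullet, here is a deliberately simplified normalizer that replaces a machine-specific root directory and GHC version with fixed placeholders. It is not `normalizeOutput`: the `Env` record, the function names, the literal (non-regex) matching, and the placeholder strings `<ROOT>` and `<GHCVER>` are all assumptions made for this sketch.

```haskell
import Data.List (isPrefixOf)

-- | Replace every occurrence of a needle with a replacement.
-- An empty needle leaves the input unchanged.
replaceAll :: String -> String -> String -> String
replaceAll needle replacement = go
  where
    go [] = []
    go s@(c : cs)
      | not (null needle) && needle `isPrefixOf` s =
          replacement ++ go (drop (length needle) s)
      | otherwise = c : go cs

-- | Hypothetical description of the details that vary between machines.
data Env = Env
  { envRoot       :: FilePath  -- ^ temporary directory the test ran in
  , envGhcVersion :: String    -- ^ version of the GHC under test
  }

-- | Rewrite machine-specific details to stable placeholders, so that the
-- saved output is the same wherever the test runs.
normalize :: Env -> String -> String
normalize env =
  replaceAll (envGhcVersion env) "<GHCVER>"
    . replaceAll (envRoot env) "<ROOT>"
```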

Some other notes:

* It's good style to put `default-language` in all your stanzas, so
  Cabal doesn't complain about it (that warning is marked!).  Ditto
  with `cabal-version` at the top of your Cabal file.

* If you can't get the output of a test to be deterministic, no
  problem: just exclude it from recording and do a manual test
  on the output for the string you're looking for.  Try to be
  deterministic, but sometimes it's not (easily) possible.

# Non-goals

Here are some things we do not currently plan on supporting:

* A file format for specifying multiple packages and source files.
  While in principle there is nothing wrong with making it easier
  to write tests, tests stored in this manner are more difficult
  to debug with, as they must first be "decompressed" into a full
  folder hierarchy before they can be interacted with.  (But some
  of our tests need substantial setup; for example, tests that
  have to set up a package repository.  In this case, because some
  setup is already necessary, we might consider making things easier here.)