llvm/docs/TestSuiteGuide.md

test-suite Guide
================

Quickstart
----------

1. The lit test runner is required to run the tests. You can either use one
   from an LLVM build:

   ```bash
   % <path to llvm build>/bin/llvm-lit --version
   lit 20.0.0dev
   ```

   An alternative is installing it as a Python package in a Python virtual
   environment:

   ```bash
   % python3 -m venv .venv
   % . .venv/bin/activate
   % pip install git+https://github.com/llvm/llvm-project.git#subdirectory=llvm/utils/lit
   % lit --version
   lit 20.0.0dev
   ```

   Installing the official Python release of lit in a Python virtual
   environment could also work. This will install the most recent
   release of lit:

   ```bash
   % python3 -m venv .venv
   % . .venv/bin/activate
   % pip install lit
   % lit --version
   lit 18.1.8
   ```

   Please note that recent tests may rely on features not in the latest released lit.
   If in doubt, try one of the previous methods.

2. Check out the `test-suite` module with:

   ```bash
   % git clone https://github.com/llvm/llvm-test-suite.git test-suite
   ```

3. Create a build directory and use CMake to configure the suite. Use the
   `CMAKE_C_COMPILER` option to specify the compiler to test. Use a cache file
   to choose a typical build configuration:

   ```bash
   % mkdir test-suite-build
   % cd test-suite-build
   % cmake -DCMAKE_C_COMPILER=<path to llvm build>/bin/clang \
           -C../test-suite/cmake/caches/O3.cmake \
           ../test-suite
   ```

   **NOTE!** If you are using a clang that you built yourself and you want to
   build and run the MicroBenchmarks/XRay microbenchmarks, you need to add
   `compiler-rt` to the `LLVM_ENABLE_RUNTIMES` CMake option of your LLVM build.

4. Build the benchmarks:

   ```text
   % make
   Scanning dependencies of target timeit-target
   [  0%] Building C object tools/CMakeFiles/timeit-target.dir/timeit.c.o
   [  0%] Linking C executable timeit-target
   ...
   ```

5. Run the tests with lit:

   ```text
   % llvm-lit -v -j 1 -o results.json .
   -- Testing: 474 tests, 1 threads --
   PASS: test-suite :: MultiSource/Applications/ALAC/decode/alacconvert-decode.test (1 of 474)
   ********** TEST 'test-suite :: MultiSource/Applications/ALAC/decode/alacconvert-decode.test' RESULTS **********
   compile_time: 0.2192
   exec_time: 0.0462
   hash: "59620e187c6ac38b36382685ccd2b63b"
   size: 83348
   **********
   PASS: test-suite :: MultiSource/Applications/ALAC/encode/alacconvert-encode.test (2 of 474)
   ...
   ```

   **NOTE!** Even if you only want the compile-time results (code size, LLVM
   statistics, etc.), you need to run the tests with the above `llvm-lit`
   command. In that case, the *results.json* file will contain the
   compile-time metrics.

6. Show and compare result files (optional):

   ```bash
   # Make sure pandas and scipy are installed. Prepend `sudo` if necessary.
   % pip install pandas scipy
   # Show a single result file:
   % test-suite/utils/compare.py results.json
   # Compare two result files:
   % test-suite/utils/compare.py results_a.json results_b.json
   ```

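As a rough illustration of what such a result file holds, the following Python sketch summarizes one metric without `compare.py`. It assumes the layout that lit writes with `-o`: a top-level `tests` list whose entries carry a `name` and an optional `metrics` dictionary. That layout may differ between lit versions, and the helper names and most sample values are hypothetical. For a real run, replace the embedded sample with the parsed contents of `results.json`.

```python
import json
import statistics

# Hypothetical miniature result file. The first entry reuses values from the
# lit output shown above; the other entries are made up for illustration.
SAMPLE = """
{
  "tests": [
    {"name": "test-suite :: MultiSource/Applications/ALAC/decode/alacconvert-decode.test",
     "code": "PASS",
     "metrics": {"compile_time": 0.2192, "exec_time": 0.0462, "size": 83348}},
    {"name": "test-suite :: MultiSource/Applications/ALAC/encode/alacconvert-encode.test",
     "code": "PASS",
     "metrics": {"compile_time": 0.25, "exec_time": 0.05, "size": 90000}},
    {"name": "test-suite :: External/SPEC/CINT2006/403.gcc/403.gcc.test",
     "code": "FAIL"}
  ]
}
"""


def load_metric(results, metric="exec_time"):
    """Return {test name: value} for every test that reports `metric`."""
    values = {}
    for test in results.get("tests", []):
        # A test that did not build or run may have no "metrics" entry.
        metrics = test.get("metrics", {})
        if metric in metrics:
            values[test["name"]] = float(metrics[metric])
    return values


def summarize(values):
    """Return count, mean, min and max of the collected values."""
    numbers = sorted(values.values())
    if not numbers:
        return {"count": 0}
    return {
        "count": len(numbers),
        "mean": statistics.mean(numbers),
        "min": numbers[0],
        "max": numbers[-1],
    }


results = json.loads(SAMPLE)
exec_times = load_metric(results)
print(summarize(exec_times))
print(summarize(load_metric(results, "size")))
```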

Structure
---------

The test-suite contains benchmark and test programs.  The programs come with
reference outputs so that their correctness can be checked.  The suite comes
with tools to collect metrics such as benchmark runtime, compilation time and
code size.

The test-suite is divided into several directories:

-  `SingleSource/`

   Contains test programs that are only a single source file in size.  A
   subdirectory may contain several programs.

-  `MultiSource/`

   Contains subdirectories which contain entire programs with multiple source
   files.  Large benchmarks and whole applications go here.

-  `MicroBenchmarks/`

   Programs using the [google-benchmark](https://github.com/google/benchmark)
   library. The programs define functions that are run multiple times until the
   measurement results are statistically significant.

-  `External/`

   Contains descriptions and test data for code that cannot be directly
   distributed with the test-suite. The most prominent members of this
   directory are the SPEC CPU benchmark suites.
   See [External Suites](#external-suites).

-  `Bitcode/`

   These tests are mostly written in LLVM bitcode.

-  `CTMark/`

   Contains symbolic links to other benchmarks forming a representative sample
   for compilation performance measurements.

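Test names reported by lit mirror this directory layout, so results can be grouped by suite directory. The following is a minimal Python sketch, assuming names of the form `test-suite :: <directory>/.../<name>.test` as in the lit output shown in the Quickstart. The helper name is hypothetical.

```python
from collections import Counter


def top_level_directory(test_name):
    """Return the first path component of a lit test name."""
    # Drop a leading "test-suite :: " prefix if present, then keep only the
    # first path component.
    _, _, path = test_name.rpartition(" :: ")
    return path.split("/", 1)[0]


names = [
    "test-suite :: MultiSource/Applications/ALAC/decode/alacconvert-decode.test",
    "test-suite :: MultiSource/Applications/ALAC/encode/alacconvert-encode.test",
    "test-suite :: External/SPEC/CINT2006/403.gcc/403.gcc.test",
]
counts = Counter(top_level_directory(name) for name in names)
print(counts)
```

Grouping like this can help when checking which parts of the suite a given configuration actually built and ran.
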
### Benchmarks

Every program can work as a correctness test. Some programs are unsuitable for
performance measurements. Setting the `TEST_SUITE_BENCHMARKING_ONLY` CMake
option to `ON` will disable them.

The MultiSource benchmarks consist of the following apps and benchmarks:

| MultiSource          | Language  | Application Area                  | Remark |
|----------------------|-----------|-----------------------------------|--------|
| 7zip                 | C/C++     | Compression/Decompression         |        |
| ASCI_Purple          | C         | SMG2000 benchmark and solver      | Memory-intensive app |
| ASC_Sequoia          | C         | Simulation and solver             |        |
| BitBench             | C         | uudecode/uuencode utility         | Bit Stream benchmark for functional compilers |
| Bullet               | C++       | Bullet 2.75 physics engine        |        |
| DOE-ProxyApps-C++    | C++       | HPC/scientific apps               | Small applications, representative of our larger DOE workloads |
| DOE-ProxyApps-C      | C         | HPC/scientific apps               | Small applications, representative of our larger DOE workloads |
| Fhourstones          | C         | Game/solver                       | Integer benchmark that efficiently solves positions in the game of Connect-4 |
| Fhourstones-3.1      | C         | Game/solver                       | Integer benchmark that efficiently solves positions in the game of Connect-4 |
| FreeBench            | C         | Benchmark suite                   | Raytracer, four in a row, neural network, file compressor, Fast Fourier/Cosine/Sine Transform |
| llubenchmark         | C         | Linked-list micro-benchmark       |        |
| mafft                | C         | Bioinformatics                    | A multiple sequence alignment program |
| MallocBench          | C         | Benchmark suite                   | cfrac, espresso, gawk, gs, make, p2c, perl |
| McCat                | C         | Benchmark suite                   | Quicksort, bubblesort, eigenvalues |
| mediabench           | C         | Benchmark suite                   | adpcm, g721, gsm, jpeg, mpeg2 |
| MiBench              | C         | Embedded benchmark suite          | Automotive, consumer, office, security, telecom apps |
| nbench               | C         |                                   | BYTE Magazine's BYTEmark benchmark program |
| NPB-serial           | C         | Parallel computing                | Serial version of the NPB IS code |
| Olden                | C         | Data Structures                   | SGI version of the Olden benchmark |
| OptimizerEval        | C         | Solver                            | Preston Briggs' optimizer evaluation framework |
| PAQ8p                | C++       | Data compression                  |        |
| Prolangs-C++         | C++       | Benchmark suite                   | city, employ, life, NP, ocean, primes, simul, vcirc |
| Prolangs-C           | C         | Benchmark suite                   | agrep, archie-client, bison, gnugo, unix-smail |
| Ptrdist              | C         | Pointer-Intensive Benchmark Suite |        |
| Rodinia              | C         | Scientific apps                   | backprop, pathfinder, srad |
| SciMark2-C           | C         | Scientific apps                   | FFT, LU, Monte Carlo, sparse matmul |
| sim                  | C         | Dynamic programming               | A Time-Efficient, Linear-Space Local Similarity Algorithm |
| tramp3d-v4           | C++       | Numerical analysis                | Template-intensive numerical program based on FreePOOMA |
| Trimaran             | C         | Encryption                        | 3des, md5, crc |
| TSVC                 | C         | Vectorization benchmark           | Test Suite for Vectorizing Compilers (TSVC) |
| VersaBench           | C         | Benchmark suite                   | 8b10b, beamformer, bmm, dbms, ecbdes |

All MultiSource applications are suitable for performance measurements
and will run when the CMake option `TEST_SUITE_BENCHMARKING_ONLY` is set.

Configuration
-------------

The test-suite has configuration options to customize building and running the
benchmarks. CMake can print a list of them:

```bash
% cd test-suite-build
# Print basic options:
% cmake -LH
# Print all options:
% cmake -LAH
```

### Common Configuration Options

- `CMAKE_C_FLAGS`

  Specify extra flags to be passed to C compiler invocations.  The flags are
  also passed to the C++ compiler and linker invocations.  See
  [https://cmake.org/cmake/help/latest/variable/CMAKE_LANG_FLAGS.html](https://cmake.org/cmake/help/latest/variable/CMAKE_LANG_FLAGS.html).

- `CMAKE_C_COMPILER`

  Select the C compiler executable to be used. Note that the C++ compiler is
  inferred automatically, i.e. when specifying `path/to/clang`, CMake will
  automatically use `path/to/clang++` as the C++ compiler.  See
  [https://cmake.org/cmake/help/latest/variable/CMAKE_LANG_COMPILER.html](https://cmake.org/cmake/help/latest/variable/CMAKE_LANG_COMPILER.html).

- `CMAKE_Fortran_COMPILER`

  Select the Fortran compiler executable to be used. Not set by default and not
  required unless running the Fortran Test Suite.

- `CMAKE_BUILD_TYPE`

  Select a build type such as `Release` or `Debug`, which selects a set of
  predefined compiler flags. These flags are applied in addition to the
  `CMAKE_C_FLAGS` option and may be changed by modifying
  `CMAKE_C_FLAGS_RELEASE` etc.  See
  [https://cmake.org/cmake/help/latest/variable/CMAKE_BUILD_TYPE.html](https://cmake.org/cmake/help/latest/variable/CMAKE_BUILD_TYPE.html).

- `TEST_SUITE_FORTRAN`

  Activate the Fortran tests. This is a work in progress. More information can be
  found in the [Flang documentation](https://flang.llvm.org/docs/FortranLLVMTestSuite.html).

- `TEST_SUITE_RUN_UNDER`

  Prefix test invocations with the given tool. This is typically used to run
  cross-compiled tests within a simulator tool.

- `TEST_SUITE_BENCHMARKING_ONLY`

  Disable tests that are unsuitable for performance measurements. The disabled
  tests either run for a very short time or are dominated by I/O performance,
  making them unsuitable as compiler performance tests.

- `TEST_SUITE_SUBDIRS`

  Semicolon-separated list of directories to include. This can be used to only
  build parts of the test-suite or to include external suites.  This option
  does not work reliably with deeper subdirectories as it skips intermediate
  `CMakeLists.txt` files which may be required.

- `TEST_SUITE_COLLECT_STATS`

  Collect internal LLVM statistics. Appends `-save-stats=obj` when invoking the
  compiler and makes the lit runner collect and merge the statistic files.

- `TEST_SUITE_RUN_BENCHMARKS`

  If this is set to `OFF` then lit will not actually run the tests but just
  collect build statistics like compile time and code size.

- `TEST_SUITE_USE_PERF`

  Use the `perf` tool for time measurement instead of the `timeit` tool that
  comes with the test-suite.  The `perf` tool is usually available on Linux
  systems.

- `TEST_SUITE_SPEC2000_ROOT`, `TEST_SUITE_SPEC2006_ROOT`, `TEST_SUITE_SPEC2017_ROOT`, ...

  Specify installation directories of external benchmark suites. You can find
  more information about expected versions or usage in the README files in the
  `External` directory (such as `External/SPEC/README`).

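The description of `TEST_SUITE_COLLECT_STATS` mentions merging statistic files. One plausible way to picture that step is summing counters by name, sketched below in Python. This assumes each statistics file is a flat JSON object mapping counter names to numbers. Both that layout and the counter names are assumptions for illustration, and this is not the lit runner's actual code.

```python
import json
from collections import Counter


def merge_stats(stat_documents):
    """Sum the counters of several statistics documents by name."""
    merged = Counter()
    for text in stat_documents:
        for name, value in json.loads(text).items():
            merged[name] += value
    return dict(merged)


# Two hypothetical per-object statistic files (made-up names and values).
file_a = '{"example-pass.NumInstructionsRemoved": 3, "example-pass.NumLoops": 1}'
file_b = '{"example-pass.NumInstructionsRemoved": 4}'
merged = merge_stats([file_a, file_b])
print(merged)
```

Summing is only one possible merge policy; a counter that records a maximum or a ratio would need different handling.
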
### Common CMake Flags

- `-GNinja`

  Generate build files for the ninja build tool.

- `-Ctest-suite/cmake/caches/<cachefile.cmake>`

  Use a CMake cache.  The test-suite comes with several CMake caches which
  predefine common or tricky build configurations.

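When configurations are scripted, the options and flags above can be assembled programmatically. The Python sketch below only builds and prints a command line; it does not run CMake. The helper name, the compiler path, and the chosen values are hypothetical. List values are joined with semicolons, as `TEST_SUITE_SUBDIRS` expects, and `shlex.join` quotes that argument so a shell does not treat the semicolon as a command separator.

```python
import shlex


def cmake_command(source_dir, options, generator=None, cache=None):
    """Build a cmake argument list from a dict of -D options."""
    command = ["cmake"]
    if generator:
        command.append(f"-G{generator}")
    if cache:
        command.append(f"-C{cache}")
    for name, value in options.items():
        if isinstance(value, bool):
            value = "ON" if value else "OFF"
        elif isinstance(value, (list, tuple)):
            value = ";".join(value)  # CMake lists are semicolon-separated
        command.append(f"-D{name}={value}")
    command.append(source_dir)
    return command


command = cmake_command(
    "../test-suite",
    {
        "CMAKE_C_COMPILER": "/path/to/llvm-build/bin/clang",  # placeholder path
        "TEST_SUITE_BENCHMARKING_ONLY": True,
        "TEST_SUITE_SUBDIRS": ["SingleSource", "MultiSource"],
    },
    generator="Ninja",
    cache="../test-suite/cmake/caches/O3.cmake",
)
print(shlex.join(command))
```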

Displaying and Analyzing Results
--------------------------------

The `compare.py` script displays and compares result files.  A result file is
produced when invoking lit with the `-o filename.json` flag.

Example usage:

- Basic Usage:

  ```text
  % test-suite/utils/compare.py baseline.json
  Warning: 'test-suite :: External/SPEC/CINT2006/403.gcc/403.gcc.test' has No metrics!
  Tests: 508
  Metric: exec_time

  Program                                         baseline

  INT2006/456.hmmer/456.hmmer                   1222.90
  INT2006/464.h264ref/464.h264ref               928.70
  ...
               baseline
  count  506.000000
  mean   20.563098
  std    111.423325
  min    0.003400
  25%    0.011200
  50%    0.339450
  75%    4.067200
  max    1222.896800
  ```

- Show compile_time or text segment size metrics:

  ```bash
  % test-suite/utils/compare.py -m compile_time baseline.json
  % test-suite/utils/compare.py -m size.__text baseline.json
  ```

- Compare two result files and filter short-running tests:

  ```text
  % test-suite/utils/compare.py --filter-short baseline.json experiment.json
  ...
  Program                                         baseline  experiment  diff

  SingleSour.../Benchmarks/Linpack/linpack-pc     5.16      4.30        -16.5%
  MultiSourc...erolling-dbl/LoopRerolling-dbl     7.01      7.86         12.2%
  SingleSour...UnitTests/Vectorizer/gcc-loops     3.89      3.54        -9.0%
  ...
  ```

- Merge multiple baseline and experiment result files by taking the minimum
  runtime of each test:

  ```bash
  % test-suite/utils/compare.py base0.json base1.json base2.json vs exp0.json exp1.json exp2.json
  ```

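The merging and comparison steps in the examples above can be pictured with a short Python sketch. It is not `compare.py`'s implementation: the function names, the sample timings, and the 1.0-second cutoff for short-running tests are assumptions chosen for illustration.

```python
def merge_min(runs):
    """Merge several {test: exec_time} runs, keeping the minimum per test."""
    merged = {}
    for run in runs:
        for test, value in run.items():
            merged[test] = min(value, merged.get(test, value))
    return merged


def compare(baseline, experiment, min_time=1.0):
    """Return {test: relative change} for tests present in both results.

    Tests whose baseline time is below `min_time` seconds are skipped,
    since very short runs tend to be dominated by measurement noise.
    """
    diffs = {}
    for test, base in baseline.items():
        if test not in experiment or base < min_time:
            continue
        diffs[test] = (experiment[test] - base) / base
    return diffs


# Made-up timings in seconds for two hypothetical tests, two runs each.
base_runs = [{"a.test": 5.0, "b.test": 0.02}, {"a.test": 4.0, "b.test": 0.03}]
exp_runs = [{"a.test": 3.0, "b.test": 0.02}, {"a.test": 3.5, "b.test": 0.01}]

baseline = merge_min(base_runs)
experiment = merge_min(exp_runs)
for test, diff in compare(baseline, experiment).items():
    print(f"{test}: {baseline[test]:.2f} -> {experiment[test]:.2f} ({diff:+.1%})")
```

Taking the minimum over repeated runs is a common way to reduce the influence of system noise, because interference can only make a run slower, not faster.
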
### Continuous Tracking with LNT

LNT is a set of client and server tools for continuously monitoring
performance. You can find more information at
[https://llvm.org/docs/lnt](https://llvm.org/docs/lnt). The official LNT instance
of the LLVM project is hosted at [http://lnt.llvm.org](http://lnt.llvm.org).


External Suites
---------------

External suites such as SPEC can be enabled by either:

- placing (or linking) them into the `test-suite/test-suite-externals/xxx` directory (example: `test-suite/test-suite-externals/speccpu2000`), or
- using a configuration option such as `-D TEST_SUITE_SPEC2000_ROOT=path/to/speccpu2000`.

You can find further information in the respective README files such as
`test-suite/External/SPEC/README`.

For the SPEC benchmarks you can switch between the `test`, `train` and
`ref` input datasets via the `TEST_SUITE_RUN_TYPE` configuration option.
The `train` dataset is used by default.


Custom Suites
-------------

You can build custom suites using the test-suite infrastructure. A custom suite
has a `CMakeLists.txt` file at the top directory. The `CMakeLists.txt` will be
picked up automatically if placed into a subdirectory of the test-suite or when
setting the `TEST_SUITE_SUBDIRS` variable:

```bash
% cmake -DTEST_SUITE_SUBDIRS=path/to/my/benchmark-suite ../test-suite
```


Profile Guided Optimization
---------------------------

Profile guided optimization requires compiling and running twice. First, the
benchmark is compiled with profile generation instrumentation enabled
and set up to run on training data. The lit runner then merges the profile
files using `llvm-profdata` so they can be used by the second compilation run.

Example:

```bash
# Profile generation run using LLVM IR PGO:
% cmake -DTEST_SUITE_PROFILE_GENERATE=ON \
        -DTEST_SUITE_USE_IR_PGO=ON \
        -DTEST_SUITE_RUN_TYPE=train \
        ../test-suite
% make
% llvm-lit .
# Use the profile data for compilation and actual benchmark run:
% cmake -DTEST_SUITE_PROFILE_GENERATE=OFF \
        -DTEST_SUITE_PROFILE_USE=ON \
        -DTEST_SUITE_RUN_TYPE=ref \
        .
% make
% llvm-lit -o result.json .
```

To use Clang frontend's PGO instead of LLVM IR PGO, set `-DTEST_SUITE_USE_IR_PGO=OFF`.

The `TEST_SUITE_RUN_TYPE` setting only affects the SPEC benchmark suites.


Cross Compilation and External Devices
--------------------------------------

### Compilation

CMake allows cross compiling to a different target via toolchain files. More
information can be found here:

- [https://llvm.org/docs/lnt/tests.html#cross-compiling](https://llvm.org/docs/lnt/tests.html#cross-compiling)

- [https://cmake.org/cmake/help/latest/manual/cmake-toolchains.7.html](https://cmake.org/cmake/help/latest/manual/cmake-toolchains.7.html)

Cross compilation from macOS to iOS is possible with the
`test-suite/cmake/caches/target-*-iphoneos-internal.cmake` CMake cache
files; this requires an internal iOS SDK.

### Running

There are two ways to run the tests in a cross compilation setting:

- Via SSH connection to an external device: The `TEST_SUITE_REMOTE_HOST` option
  should be set to the SSH hostname.  The executables and data files need to be
  transferred to the device after compilation.  This is typically done via the
  `rsync` make target.  After this, the lit runner can be used on the host
  machine. It will prefix the benchmark and verification command lines with an
  `ssh` command.

  Example:

  ```bash
  % cmake -G Ninja -D CMAKE_C_COMPILER=path/to/clang \
          -C ../test-suite/cmake/caches/target-arm64-iphoneos-internal.cmake \
          -D CMAKE_BUILD_TYPE=Release \
          -D TEST_SUITE_REMOTE_HOST=mydevice \
          ../test-suite
  % ninja
  % ninja rsync
  % llvm-lit -j1 -o result.json .
  ```

- You can specify a simulator for the target machine with the
  `TEST_SUITE_RUN_UNDER` setting. The lit runner will prefix all benchmark
  invocations with it.

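Both modes amount to prefixing the command line of each benchmark. The following Python sketch illustrates the idea only. It is not the lit runner's code, and the function name, the benchmark command, and the simulator command line are hypothetical. For the remote case the command is joined into one quoted string, because `ssh` hands a single command string to the remote shell.

```python
import shlex


def wrap_command(command, remote_host=None, run_under=None):
    """Prefix a benchmark command for simulated or remote execution."""
    if run_under:
        # Mirrors the idea of TEST_SUITE_RUN_UNDER: run under a given tool.
        command = shlex.split(run_under) + command
    if remote_host:
        # Mirrors the idea of TEST_SUITE_REMOTE_HOST: run through ssh.
        command = ["ssh", remote_host, shlex.join(command)]
    return command


benchmark = ["./benchmark", "input file.dat"]  # hypothetical benchmark command
print(wrap_command(benchmark, run_under="qemu-aarch64 -L /sysroot"))
print(wrap_command(benchmark, remote_host="mydevice"))
```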

Running the test-suite via LNT
------------------------------

The LNT tool can run the test-suite. Use this when submitting test results to
an LNT instance.  See
[https://llvm.org/docs/lnt/tests.html#llvm-cmake-test-suite](https://llvm.org/docs/lnt/tests.html#llvm-cmake-test-suite)
for details.

Running the test-suite via Makefiles (deprecated)
-------------------------------------------------

**Note**: The test-suite comes with a set of Makefiles that are considered
deprecated.  They do not support newer testing modes like `Bitcode` or
`MicroBenchmarks` and are harder to use.

Old documentation is available in the
[test-suite Makefile Guide](TestSuiteMakefileGuide).