Enabled compiling CUDA device code with clang
""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
clang can be used as a device compiler by setting GMX_CLANG_CUDA=ON. A
CUDA toolkit (>=7.0) is also needed. Note that the resulting runtime
performance is usually worse than that of binaries compiled by the
official NVIDIA CUDA compiler (nvcc).

Increased the oldest CMake, compiler and CUDA versions required
""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
We now require at least gcc-4.8.1, clang-3.3 or icc-17.0.1, so we can rely
on full C++11 support. We now also require at least CUDA-6.5 and CMake-3.4.3.

Added check that available CUDA hardware and compiled code are compatible
""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
Added an early check to detect when the :ref:`gmx mdrun` binary neither
embeds code compatible with the GPU device it tries to use nor contains
PTX that could be just-in-time compiled for that device.

Additionally, if the user manually sets GMX_CUDA_TARGET_COMPUTE=20 and
no later SM or COMPUTE target but runs on >2.0 hardware, we would be
executing just-in-time-compiled Fermi kernels with incorrect host-side code
assumptions (e.g. the amount of shared memory allocated or the texture type).
This change also prevents such cases.

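The general CUDA compatibility rules behind the first part of this check
can be sketched as follows. This is a minimal, host-only model, not the
code used by :ref:`gmx mdrun`: the names ``ComputeCapability``,
``EmbeddedTargets`` and ``binaryCanRunOn`` are hypothetical and introduced
only for illustration. It assumes the usual CUDA rules that device code
built for an SM target runs on devices of the same major version with an
equal or higher minor version, and that PTX built for a COMPUTE target can
be just-in-time compiled for any device of equal or higher compute
capability. The stricter handling of Fermi-only PTX described above is
deliberately not modelled.

```cpp
#include <vector>

// Compute capability of a device or of a compilation target, e.g. {3, 5} for 3.5.
struct ComputeCapability
{
    int majorVersion;
    int minorVersion;
};

// Hypothetical summary of what a binary embeds (not a GROMACS type).
struct EmbeddedTargets
{
    std::vector<ComputeCapability> sm;      // device code for real architectures
    std::vector<ComputeCapability> compute; // PTX for virtual architectures
};

// Device code for SM X.y runs on devices X.z with z >= y (same major version).
bool deviceCodeRunsOn(const ComputeCapability& target, const ComputeCapability& device)
{
    return target.majorVersion == device.majorVersion
           && target.minorVersion <= device.minorVersion;
}

// PTX for COMPUTE X.y can be JIT-compiled for any device of capability >= X.y.
bool ptxCanBeJitCompiledFor(const ComputeCapability& target, const ComputeCapability& device)
{
    return target.majorVersion < device.majorVersion
           || (target.majorVersion == device.majorVersion
               && target.minorVersion <= device.minorVersion);
}

// True if the binary embeds either usable device code or usable PTX for the device.
bool binaryCanRunOn(const EmbeddedTargets& binary, const ComputeCapability& device)
{
    for (const ComputeCapability& target : binary.sm)
    {
        if (deviceCodeRunsOn(target, device))
        {
            return true;
        }
    }
    for (const ComputeCapability& target : binary.compute)
    {
        if (ptxCanBeJitCompiledFor(target, device))
        {
            return true;
        }
    }
    return false;
}
```

In this model, a binary that embeds only SM 3.0 device code and no PTX is
accepted on a 3.5 device but rejected on a 5.2 device, which is the kind of
mismatch the early check is meant to report before any kernel is launched.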
Disabled ARM Neon native rsqrt iteration used in short-ranged interactions
""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""

Avoided FTZ triggering SIMD test failures
""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
For very small arguments on platforms without FMA support, the Intel
compiler's default usage of flush-to-zero for denormal values can lead
to slight deviations. Since this is a range we really don't care
about, and non-FMA platforms are a thing of the past anyway, just
avoid testing a very small range around that threshold for non-FMA

Fixed OpenCL compiles on Mac OS
""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
Confirmed to work on Mac OS 10.13.2 running on a MacBook Pro with

Tested that nvcc/host compiler combination works
""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
We now compile a trivial CUDA program during a run of CMake to catch
both unsupported nvcc/host compiler version combinations and other

Added AVX_512 and KNC symbols to FFTW SIMD test
""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
Otherwise the CMake code might complain loudly about FFTW not being
accelerated on KNC or KNL hosts.

Implemented changes for CMake policy 0068
""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
CMake-3.9 changed the behavior of RPATH vs. install_name options on
OS X. This change avoids relying on functionality that will be
removed in future CMake versions.