Next-generation SIMD module, reference implementation
commit 0b87ff462e54091898737039cb7890666b82cc1b
author Erik Lindahl <erik@kth.se>
Sun, 1 Nov 2015 21:01:15 +0000 (22:01 +0100)
committer Erik Lindahl <erik@kth.se>
Wed, 16 Dec 2015 09:05:31 +0000 (10:05 +0100)
tree e2cf532cfcd6dc40659b593c3ed23b77ba16533b
parent 690bfd41c88acf5d47e172948d5d0f0b69f1ec59

This moves the SIMD implementation to C++, puts all
architecture-specific code in the SIMD module, and makes
the nonbonded kernels fully generic. A number of other
features have also been added, in the hope that the
feature set can be largely frozen after this expansion.
- SIMD variables are now always unique types, and all
  names now follow C++ conventions and live in the gmx
  namespace.
- All SIMD operations are now functions rather than defines.
- Function names have been modified to correspond as much
  as possible to the C++ standard library for normal
  floating-point types.
- The alignment routines have been removed and replaced
  with the AlignedAllocator and the GMX_ALIGNED() attribute
  for stack variables. The latter is now defined on all
  architectures, using either the MSVC- or GNU-specific
  attribute or the C++11 standard alignas/alignof, and a new
  basedefinitions test catches at least some alignment errors.
- About 10 higher-level routines have been added to
  perform the operations necessary for full-SIMD-width
  nonbonded kernels.
- There are new defines to indicate that wide SIMD
  implementations support half-SIMD-width operations, and
  a handful of utility functions that are required to
  use this in the nonbonded kernels.
- Masked operations for multiply, reciprocals and
  inverse square roots have been added. In the future
  this will make it possible to improve the kernel
  efficiency, in particular on platforms with native
  support for masked operations.
- We no longer use defines to check for availability
  of the basic integer SIMD type. Implementations that
  do not support it natively should implement it through
  the floating-point types, so it is always present from
  the user's perspective.
- A transpose operation has been added for SIMD4.
- Gather/scatter operations have been added for
  triplets. This will make it easier to use SIMD
  even for routines with non-SIMD-friendly data layout.
- Memory alignment asserts are used when compiling with
  asserts enabled.
- The nbnxn kernels no longer use any architecture-
  specific files, but rely entirely on the simd module.
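
A few of the ideas in the list above can be illustrated with a minimal,
self-contained sketch. This is not the GROMACS code: the names
RefSimdFloat, RefSimdBool, load, maskzRsqrt, gatherLoadTriplets and
isAligned, and the width of 4, are hypothetical and chosen only for
illustration. It emulates SIMD lanes with plain arrays, in the spirit of
a reference implementation.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>

namespace sketch
{

constexpr std::size_t c_width = 4; // hypothetical SIMD width, in floats

// Distinct struct types, so a float vector cannot be passed as a mask.
struct RefSimdFloat
{
    std::array<float, c_width> v;
};
struct RefSimdBool
{
    std::array<bool, c_width> v;
};

// True if the address p is a multiple of 'alignment' bytes.
inline bool isAligned(const void* p, std::size_t alignment)
{
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// Aligned load as an inline function rather than a define. The assert
// is only active when asserts are compiled in (NDEBUG not defined).
inline RefSimdFloat load(const float* m)
{
    assert(isAligned(m, c_width * sizeof(float)));
    RefSimdFloat r;
    for (std::size_t i = 0; i < c_width; i++)
    {
        r.v[i] = m[i];
    }
    return r;
}

// Masked inverse square root: lanes with a false mask yield 0 and the
// division is never evaluated for them.
inline RefSimdFloat maskzRsqrt(const RefSimdFloat& x, const RefSimdBool& mask)
{
    RefSimdFloat r;
    for (std::size_t i = 0; i < c_width; i++)
    {
        r.v[i] = mask.v[i] ? 1.0f / std::sqrt(x.v[i]) : 0.0f;
    }
    return r;
}

// Gather x/y/z triplets stored as xyzxyz... into three SIMD variables,
// one triplet per lane, selected by index[i].
inline void gatherLoadTriplets(const float*                      base,
                               const std::array<int, c_width>&   index,
                               RefSimdFloat* x, RefSimdFloat* y, RefSimdFloat* z)
{
    for (std::size_t i = 0; i < c_width; i++)
    {
        x->v[i] = base[3 * index[i]];
        y->v[i] = base[3 * index[i] + 1];
        z->v[i] = base[3 * index[i] + 2];
    }
}

} // namespace sketch
```

In this sketch a stack buffer passed to load() would be declared with the
C++11 specifier directly, for example `alignas(16) float mem[4];`.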

This also fixes a float-to-double conversion bug
for Xeon Phi that was detected by a new unit test
for the conversions. Those routines have never been
used in GROMACS, so the bug was harmless.
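
A conversion check of this kind could look roughly like the sketch
below. It is an assumption-laden illustration, not the actual unit test:
widenToTwoDoubles and lanesMatch are hypothetical names, plain arrays
stand in for SIMD registers, and the float width is assumed to be twice
the double width, so one float variable widens into two double
variables. Every float is exactly representable as a double, so the
check can require exact equality in every lane.

```cpp
#include <array>
#include <cstddef>

namespace sketch
{

constexpr std::size_t c_doubleWidth = 4;                 // hypothetical
constexpr std::size_t c_floatWidth  = 2 * c_doubleWidth; // hypothetical

using FloatLanes  = std::array<float, c_floatWidth>;
using DoubleLanes = std::array<double, c_doubleWidth>;

// Widen one float variable into two double variables: the low half of
// the float lanes goes to *lo, the high half to *hi.
inline void widenToTwoDoubles(const FloatLanes& f, DoubleLanes* lo, DoubleLanes* hi)
{
    for (std::size_t i = 0; i < c_doubleWidth; i++)
    {
        (*lo)[i] = static_cast<double>(f[i]);
        (*hi)[i] = static_cast<double>(f[i + c_doubleWidth]);
    }
}

// Returns true if converting each double lane back to float reproduces
// the original float lane exactly, with the halves in the right order.
inline bool lanesMatch(const FloatLanes& f, const DoubleLanes& lo, const DoubleLanes& hi)
{
    for (std::size_t i = 0; i < c_doubleWidth; i++)
    {
        if (static_cast<float>(lo[i]) != f[i]
            || static_cast<float>(hi[i]) != f[i + c_doubleWidth])
        {
            return false;
        }
    }
    return true;
}

} // namespace sketch
```

Using distinct values in every lane matters here, since a conversion
that swapped or duplicated halves would otherwise go unnoticed.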

Change-Id: Ic882df80b21e8a70a9585c2dc4dd1e87fae1a9d8
118 files changed:
docs/doxygen/lib/simd.md
docs/doxygen/suppressions.txt
src/gromacs/ewald/pme-gather.cpp
src/gromacs/ewald/pme-internal.h
src/gromacs/ewald/pme-simd4.h
src/gromacs/ewald/pme-solve.cpp
src/gromacs/ewald/pme-spline-work.cpp
src/gromacs/ewald/pme-spline-work.h
src/gromacs/ewald/pme-spread.cpp
src/gromacs/gmxlib/nonbonded/nonbonded.cpp
src/gromacs/listed-forces/bonded.cpp
src/gromacs/listed-forces/listed-forces.cpp
src/gromacs/mdlib/calc_verletbuf.cpp
src/gromacs/mdlib/clincs.cpp
src/gromacs/mdlib/forcerec.cpp
src/gromacs/mdlib/nbnxn_atomdata.cpp
src/gromacs/mdlib/nbnxn_grid.cpp
src/gromacs/mdlib/nbnxn_internal.h
src/gromacs/mdlib/nbnxn_kernels/nbnxn_kernel_simd_utils.h [deleted file]
src/gromacs/mdlib/nbnxn_kernels/nbnxn_kernel_simd_utils_ibm_qpx.h [deleted file]
src/gromacs/mdlib/nbnxn_kernels/nbnxn_kernel_simd_utils_ref.h [deleted file]
src/gromacs/mdlib/nbnxn_kernels/nbnxn_kernel_simd_utils_x86_128d.h [deleted file]
src/gromacs/mdlib/nbnxn_kernels/nbnxn_kernel_simd_utils_x86_128s.h [deleted file]
src/gromacs/mdlib/nbnxn_kernels/nbnxn_kernel_simd_utils_x86_256d.h [deleted file]
src/gromacs/mdlib/nbnxn_kernels/nbnxn_kernel_simd_utils_x86_256s.h [deleted file]
src/gromacs/mdlib/nbnxn_kernels/nbnxn_kernel_simd_utils_x86_mic.h [deleted file]
src/gromacs/mdlib/nbnxn_kernels/nbnxn_kernel_x86_simd128.h [deleted file]
src/gromacs/mdlib/nbnxn_kernels/simd_2xnn/nbnxn_kernel_simd_2xnn_common.h
src/gromacs/mdlib/nbnxn_kernels/simd_2xnn/nbnxn_kernel_simd_2xnn_inner.h
src/gromacs/mdlib/nbnxn_kernels/simd_2xnn/nbnxn_kernel_simd_2xnn_outer.h
src/gromacs/mdlib/nbnxn_kernels/simd_4xn/nbnxn_kernel_simd_4xn_common.h
src/gromacs/mdlib/nbnxn_kernels/simd_4xn/nbnxn_kernel_simd_4xn_inner.h
src/gromacs/mdlib/nbnxn_kernels/simd_4xn/nbnxn_kernel_simd_4xn_outer.h
src/gromacs/mdlib/nbnxn_pairlist.h
src/gromacs/mdlib/nbnxn_search.cpp
src/gromacs/mdlib/nbnxn_search_simd_2xnn.h
src/gromacs/mdlib/nbnxn_search_simd_4xn.h
src/gromacs/mdlib/nbnxn_simd.h
src/gromacs/mdlib/nbnxn_util.h
src/gromacs/pbcutil/pbc-simd.cpp
src/gromacs/pbcutil/pbc-simd.h
src/gromacs/simd/impl_arm_neon/impl_arm_neon_simd4_float.h
src/gromacs/simd/impl_arm_neon/impl_arm_neon_simd_float.h
src/gromacs/simd/impl_arm_neon_asimd/impl_arm_neon_asimd_simd_double.h
src/gromacs/simd/impl_arm_neon_asimd/impl_arm_neon_asimd_simd_float.h
src/gromacs/simd/impl_ibm_qpx/impl_ibm_qpx_simd4_double.h
src/gromacs/simd/impl_ibm_qpx/impl_ibm_qpx_simd4_float.h
src/gromacs/simd/impl_ibm_qpx/impl_ibm_qpx_simd_double.h
src/gromacs/simd/impl_ibm_qpx/impl_ibm_qpx_simd_float.h
src/gromacs/simd/impl_ibm_vmx/impl_ibm_vmx_simd4_float.h
src/gromacs/simd/impl_ibm_vmx/impl_ibm_vmx_simd_float.h
src/gromacs/simd/impl_ibm_vsx/impl_ibm_vsx_simd4_float.h
src/gromacs/simd/impl_ibm_vsx/impl_ibm_vsx_simd_double.h
src/gromacs/simd/impl_ibm_vsx/impl_ibm_vsx_simd_float.h
src/gromacs/simd/impl_intel_mic/impl_intel_mic_simd4_double.h
src/gromacs/simd/impl_intel_mic/impl_intel_mic_simd4_float.h
src/gromacs/simd/impl_intel_mic/impl_intel_mic_simd_double.h
src/gromacs/simd/impl_intel_mic/impl_intel_mic_simd_float.h
src/gromacs/simd/impl_reference/impl_reference.h
src/gromacs/simd/impl_reference/impl_reference_common.h [deleted file]
src/gromacs/simd/impl_reference/impl_reference_definitions.h [new file with mode: 0644]
src/gromacs/simd/impl_reference/impl_reference_general.h [copied from src/gromacs/simd/impl_reference/impl_reference.h with 58% similarity]
src/gromacs/simd/impl_reference/impl_reference_simd4_double.h
src/gromacs/simd/impl_reference/impl_reference_simd4_float.h
src/gromacs/simd/impl_reference/impl_reference_simd_double.h
src/gromacs/simd/impl_reference/impl_reference_simd_float.h
src/gromacs/simd/impl_reference/impl_reference_util_double.h [new file with mode: 0644]
src/gromacs/simd/impl_reference/impl_reference_util_float.h [new file with mode: 0644]
src/gromacs/simd/impl_sparc64_hpc_ace/impl_sparc64_hpc_ace_simd_double.h
src/gromacs/simd/impl_sparc64_hpc_ace/impl_sparc64_hpc_ace_simd_float.h
src/gromacs/simd/impl_x86_avx2_256/impl_x86_avx2_256_simd4_double.h
src/gromacs/simd/impl_x86_avx2_256/impl_x86_avx2_256_simd4_float.h
src/gromacs/simd/impl_x86_avx2_256/impl_x86_avx2_256_simd_double.h
src/gromacs/simd/impl_x86_avx2_256/impl_x86_avx2_256_simd_float.h
src/gromacs/simd/impl_x86_avx_128_fma/impl_x86_avx_128_fma_simd4_double.h
src/gromacs/simd/impl_x86_avx_128_fma/impl_x86_avx_128_fma_simd_double.h
src/gromacs/simd/impl_x86_avx_128_fma/impl_x86_avx_128_fma_simd_float.h
src/gromacs/simd/impl_x86_avx_256/impl_x86_avx_256_simd4_double.h
src/gromacs/simd/impl_x86_avx_256/impl_x86_avx_256_simd4_float.h
src/gromacs/simd/impl_x86_avx_256/impl_x86_avx_256_simd_double.h
src/gromacs/simd/impl_x86_avx_256/impl_x86_avx_256_simd_float.h
src/gromacs/simd/impl_x86_avx_512er/impl_x86_avx_512er_simd4_double.h
src/gromacs/simd/impl_x86_avx_512er/impl_x86_avx_512er_simd4_float.h
src/gromacs/simd/impl_x86_avx_512er/impl_x86_avx_512er_simd_double.h
src/gromacs/simd/impl_x86_avx_512er/impl_x86_avx_512er_simd_float.h
src/gromacs/simd/impl_x86_avx_512f/impl_x86_avx_512f_simd4_double.h
src/gromacs/simd/impl_x86_avx_512f/impl_x86_avx_512f_simd4_float.h
src/gromacs/simd/impl_x86_avx_512f/impl_x86_avx_512f_simd_double.h
src/gromacs/simd/impl_x86_avx_512f/impl_x86_avx_512f_simd_float.h
src/gromacs/simd/impl_x86_sse2/impl_x86_sse2_simd4_float.h
src/gromacs/simd/impl_x86_sse2/impl_x86_sse2_simd_double.h
src/gromacs/simd/impl_x86_sse2/impl_x86_sse2_simd_float.h
src/gromacs/simd/impl_x86_sse4_1/impl_x86_sse4_1_simd4_float.h
src/gromacs/simd/impl_x86_sse4_1/impl_x86_sse4_1_simd_double.h
src/gromacs/simd/impl_x86_sse4_1/impl_x86_sse4_1_simd_float.h
src/gromacs/simd/simd.h
src/gromacs/simd/simd_math.h
src/gromacs/simd/tests/CMakeLists.txt
src/gromacs/simd/tests/base.cpp
src/gromacs/simd/tests/base.h
src/gromacs/simd/tests/bootstrap_loadstore.cpp
src/gromacs/simd/tests/simd.cpp
src/gromacs/simd/tests/simd.h
src/gromacs/simd/tests/simd4.cpp
src/gromacs/simd/tests/simd4.h
src/gromacs/simd/tests/simd4_floatingpoint.cpp
src/gromacs/simd/tests/simd4_math.cpp
src/gromacs/simd/tests/simd4_vector_operations.cpp
src/gromacs/simd/tests/simd_floatingpoint.cpp
src/gromacs/simd/tests/simd_floatingpoint_util.cpp [new file with mode: 0644]
src/gromacs/simd/tests/simd_integer.cpp
src/gromacs/simd/tests/simd_math.cpp
src/gromacs/simd/tests/simd_vector_operations.cpp
src/gromacs/simd/vector_operations.h
src/gromacs/utility/basedefinitions.h
src/gromacs/utility/tests/CMakeLists.txt
src/gromacs/utility/tests/basedefinitions.cpp [moved from src/gromacs/mdlib/nbnxn_kernels/nbnxn_kernel_x86_simd256.h with 59% similarity]
tests/CppCheck.cmake