Next-generation SIMD, for SSE2, SSE4.1 and 128-bit AVX
commit 74157ac75ab156c46a8a0bde6e3bff5f6235ec5e
author Erik Lindahl <erik@kth.se>
Thu, 23 Apr 2015 10:07:55 +0000 (23 12:07 +0200)
committer Gerrit Code Review <gerrit@gerrit.gromacs.org>
Sun, 20 Dec 2015 20:19:01 +0000 (20 21:19 +0100)
tree b3655d67a224c384a497e82599dcfa2bcb693188
parent 42dc4398d39871313107a8d369a548c4a4edbeef

This adds the same functionality that was previously implemented for
the reference SIMD implementation. It covers all the 128-bit x86
flavors, since SSE4.1 and AVX-128 only override a few SSE2
instructions/functions. Performance on x86 with SSE2 appears to be
identical to the state before the new SIMD code. For the most
performance-sensitive functions I expect we will later test a few
alternative implementations once we can benchmark the routines inside
the actual kernels that use them.
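
As a rough illustration of what "only override a few SSE2
instructions/functions" can look like, here is a minimal sketch of a
rounding helper: the SSE2 version converts to int32 and back, while an
SSE4.1 build replaces it with the dedicated rounding instruction. The
name roundFourSketch and the preprocessor layout are assumptions made
for this example only; they are not taken from the GROMACS sources,
which organize the overrides in the per-flavor headers listed below.

```cpp
#include <cmath>
#if defined(__SSE4_1__)
#include <smmintrin.h> // SSE4.1 intrinsics (also pulls in the SSE2 ones)
#elif defined(__SSE2__)
#include <emmintrin.h> // SSE2 intrinsics
#endif

// Hypothetical helper for illustration, not part of the GROMACS API.
// Rounds four floats to the nearest integer value (ties to even under
// the default rounding mode).
inline void roundFourSketch(const float* in, float* out)
{
#if defined(__SSE4_1__)
    // SSE4.1 override: a single dedicated rounding instruction.
    __m128 x = _mm_loadu_ps(in);
    _mm_storeu_ps(out, _mm_round_ps(x, _MM_FROUND_TO_NEAREST_INT | _MM_FROUND_NO_EXC));
#elif defined(__SSE2__)
    // SSE2 base version: convert to int32 and back.
    // Only valid while the values fit in a 32-bit integer.
    __m128 x = _mm_loadu_ps(in);
    _mm_storeu_ps(out, _mm_cvtepi32_ps(_mm_cvtps_epi32(x)));
#else
    // Portable fallback so the sketch still builds on non-x86 targets.
    for (int i = 0; i < 4; ++i)
    {
        out[i] = std::nearbyint(in[i]);
    }
#endif
}
```

Everything else in such a flavor would be shared with the SSE2 code, so
only the handful of functions with a better instruction need a second
definition.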

Change-Id: I59d5741df345b38745f9a6d1ea3a4d27b0a66034
35 files changed:
docs/doxygen/Doxyfile-common.cmakein
docs/doxygen/suppressions.txt
src/gromacs/simd/impl_x86_avx_128_fma/impl_x86_avx_128_fma.h
src/gromacs/simd/impl_x86_avx_128_fma/impl_x86_avx_128_fma_common.h [deleted file]
src/gromacs/simd/impl_x86_avx_128_fma/impl_x86_avx_128_fma_definitions.h [new file with mode: 0644]
src/gromacs/simd/impl_x86_avx_128_fma/impl_x86_avx_128_fma_general.h [copied from src/gromacs/simd/impl_x86_sse2/impl_x86_sse2.h with 86% similarity]
src/gromacs/simd/impl_x86_avx_128_fma/impl_x86_avx_128_fma_simd4_double.h
src/gromacs/simd/impl_x86_avx_128_fma/impl_x86_avx_128_fma_simd4_float.h [copied from src/gromacs/simd/impl_x86_sse4_1/impl_x86_sse4_1_simd_float.h with 57% similarity]
src/gromacs/simd/impl_x86_avx_128_fma/impl_x86_avx_128_fma_simd_double.h
src/gromacs/simd/impl_x86_avx_128_fma/impl_x86_avx_128_fma_simd_float.h
src/gromacs/simd/impl_x86_avx_128_fma/impl_x86_avx_128_fma_util_double.h [copied from src/gromacs/simd/impl_x86_avx_128_fma/impl_x86_avx_128_fma_simd_double.h with 50% similarity]
src/gromacs/simd/impl_x86_avx_128_fma/impl_x86_avx_128_fma_util_float.h [new file with mode: 0644]
src/gromacs/simd/impl_x86_sse2/impl_x86_sse2.h
src/gromacs/simd/impl_x86_sse2/impl_x86_sse2_common.h [deleted file]
src/gromacs/simd/impl_x86_sse2/impl_x86_sse2_definitions.h [new file with mode: 0644]
src/gromacs/simd/impl_x86_sse2/impl_x86_sse2_general.h [copied from src/gromacs/simd/impl_x86_sse4_1/impl_x86_sse4_1.h with 84% similarity]
src/gromacs/simd/impl_x86_sse2/impl_x86_sse2_simd4_float.h
src/gromacs/simd/impl_x86_sse2/impl_x86_sse2_simd_double.h
src/gromacs/simd/impl_x86_sse2/impl_x86_sse2_simd_float.h
src/gromacs/simd/impl_x86_sse2/impl_x86_sse2_util_double.h [new file with mode: 0644]
src/gromacs/simd/impl_x86_sse2/impl_x86_sse2_util_float.h [new file with mode: 0644]
src/gromacs/simd/impl_x86_sse4_1/impl_x86_sse4_1.h
src/gromacs/simd/impl_x86_sse4_1/impl_x86_sse4_1_definitions.h [new file with mode: 0644]
src/gromacs/simd/impl_x86_sse4_1/impl_x86_sse4_1_general.h [copied from src/gromacs/simd/impl_x86_sse2/impl_x86_sse2.h with 86% similarity]
src/gromacs/simd/impl_x86_sse4_1/impl_x86_sse4_1_simd4_float.h
src/gromacs/simd/impl_x86_sse4_1/impl_x86_sse4_1_simd_double.h
src/gromacs/simd/impl_x86_sse4_1/impl_x86_sse4_1_simd_float.h
src/gromacs/simd/impl_x86_sse4_1/impl_x86_sse4_1_util_double.h [copied from src/gromacs/simd/impl_x86_sse2/impl_x86_sse2.h with 86% similarity]
src/gromacs/simd/impl_x86_sse4_1/impl_x86_sse4_1_util_float.h [moved from src/gromacs/simd/impl_x86_sse4_1/impl_x86_sse4_1_common.h with 83% similarity]
src/gromacs/simd/simd.h
src/gromacs/simd/tests/bootstrap_loadstore.cpp
src/gromacs/simd/tests/simd4_floatingpoint.cpp
src/gromacs/simd/tests/simd_floatingpoint.cpp
src/gromacs/simd/tests/simd_floatingpoint_util.cpp
src/gromacs/simd/tests/simd_integer.cpp