Fix compilation issues with ARM SIMD
ARM_NEON has never supported double-precision SIMD, so it is now
disabled in GROMACS double-precision builds.
The maskzR* functions used the wrong argument order in the debug-mode
pre-masking (and in some cases contained syntax typos).
In the shift operators, the clang-based compilers (including the
armclang v6 compiler series) seem to check that the required immediate
integer argument is a compile-time constant before inlining the call to
the operator function, so they reject a shift count that arrives as a
function parameter. With gcc, the inlining seems to happen first, which
lets it recognize that the callers always use an immediate. In theory,
the new code might run a trifle slower, but we don't use the shift
operators at the moment, and the cost might be negligible if other
effects dominate performance.
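For illustration only, a minimal sketch of one way to shift by a
run-time count without an immediate. It assumes the count is broadcast
into a register and passed to the variable-count intrinsic vshlq_s32.
The names Int32x4, load, store and shiftLeft are hypothetical and are
not the GROMACS ones; the scalar branch exists only so the sketch also
compiles on non-ARM hosts.

```cpp
#include <cstdint>
#if defined(__ARM_NEON)
#    include <arm_neon.h>
#endif

// Hypothetical 4-lane integer wrapper (not the GROMACS type).
struct Int32x4
{
#if defined(__ARM_NEON)
    int32x4_t v;
#else
    std::int32_t v[4];
#endif
};

inline Int32x4 load(const std::int32_t* p)
{
#if defined(__ARM_NEON)
    return { vld1q_s32(p) };
#else
    return { { p[0], p[1], p[2], p[3] } };
#endif
}

inline void store(std::int32_t* p, Int32x4 a)
{
#if defined(__ARM_NEON)
    vst1q_s32(p, a.v);
#else
    for (int i = 0; i < 4; i++)
    {
        p[i] = a.v[i];
    }
#endif
}

// Shift every lane left by n, assuming 0 <= n < 32.
inline Int32x4 shiftLeft(Int32x4 a, int n)
{
#if defined(__ARM_NEON)
    // vshlq_n_s32(a.v, n) needs n to be a compile-time constant, which
    // a function parameter is not until the call has been inlined.
    // vshlq_s32 takes its per-lane counts from a register instead.
    return { vshlq_s32(a.v, vdupq_n_s32(n)) };
#else
    Int32x4 r{};
    for (int i = 0; i < 4; i++)
    {
        // Shift as unsigned to avoid signed-overflow issues.
        r.v[i] = static_cast<std::int32_t>(static_cast<std::uint32_t>(a.v[i]) << n);
    }
    return r;
#endif
}
```

The extra broadcast of the count is the kind of overhead that could make
such code a trifle slower than the immediate form.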
Change-Id: I61dd4d906f7d5b77bc4e851cfaaaff059e5a67fe