Fix ThreadMPI GPU assumptions
The OpenCL implementation introduces the constraint of one GPU per
node, but thread-MPI still assumed that any compatible GPU was
available for use and thus should be assigned a rank.
Consolidated the configure-time constants behind API functions so
that the various pieces of setup code share the same behaviour.
Added a warning message that the OpenCL implementation has to waste a
GPU, stopped showing another warning message about wasting GPUs when
the OpenCL implementation forces this, and improved another message
to clarify why gmx mdrun -ntmpi 2 won't work with OpenCL.
Also fixed a few references to thread-MPI threads that are better
called thread-MPI ranks.
Change-Id: I4664c49786ebd26a53cbf5e1c26df79649ba4f5f