mpp/mpp.F90

   1 !***********************************************************************
   2 !*                   GNU Lesser General Public License
   3 !*
   4 !* This file is part of the GFDL Flexible Modeling System (FMS).
   5 !*
   6 !* FMS is free software: you can redistribute it and/or modify it under
   7 !* the terms of the GNU Lesser General Public License as published by
   8 !* the Free Software Foundation, either version 3 of the License, or (at
   9 !* your option) any later version.
  10 !*
  11 !* FMS is distributed in the hope that it will be useful, but WITHOUT
  12 !* ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
  13 !* FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
  14 !* for more details.
  15 !*
  16 !* You should have received a copy of the GNU Lesser General Public
  17 !* License along with FMS.  If not, see <http://www.gnu.org/licenses/>.
  18 !***********************************************************************
  19 !-----------------------------------------------------------------------
  20 !                 Communication for message-passing codes
  21 !
  22 ! AUTHOR: V. Balaji (V.Balaji@noaa.gov)
  23 !         SGI/GFDL Princeton University
  24 !
  25 !-----------------------------------------------------------------------
  26
  27 !> @defgroup mpp_mod mpp_mod
  28 !> @ingroup mpp
  29 !> @brief This module defines interfaces for common operations using message-passing libraries.
  30 !! Any type-less arguments in the documentation are MPP_TYPE_ which is defined by the pre-processor
  31 !! to create multiple subroutines out of one implementation for use in an interface. See the note
  32 !! below for more information
  33 !!
  34 !> @author V. Balaji <"V.Balaji@noaa.gov">
  35 !!
  36 !!   A set of simple calls to provide a uniform interface
  37 !!   to different message-passing libraries. It currently can be
  38 !!   implemented either in the SGI/Cray native SHMEM library or in the MPI
  39 !!   standard. Other libraries (e.g MPI-2, Co-Array Fortran) can be
  40 !!   incorporated as the need arises.
  41 !!
  42 !!   The data transfer between a processor and its own memory is based
  43 !!   on <TT>load</TT> and <TT>store</TT> operations upon
  44 !!   memory. Shared-memory systems (including distributed shared memory
  45 !!   systems) have a single address space and any processor can acquire any
  46 !!   data within the memory by <TT>load</TT> and
  47 !!   <TT>store</TT>. The situation is different for distributed
  48 !!   parallel systems. Specialized MPP systems such as the T3E can simulate
  49 !!   shared-memory by direct data acquisition from remote memory. But if
  50 !!   the parallel code is distributed across a cluster, or across the Net,
  51 !!   messages must be sent and received using the protocols for
  52 !!   long-distance communication, such as TCP/IP. This requires a
  53 !!   ``handshaking'' between nodes of the distributed system. One can think
  54 !!   of the two different methods as involving <TT>put</TT>s or
  55 !!   <TT>get</TT>s (e.g the SHMEM library), or in the case of
  56 !!   negotiated communication (e.g MPI), <TT>send</TT>s and
  57 !!   <TT>recv</TT>s.
  58 !!
  59 !!   The difference between SHMEM and MPI is that SHMEM uses one-sided
  60 !!   communication, which can have very low-latency high-bandwidth
  61 !!   implementations on tightly coupled systems. MPI is a standard
  62 !!   developed for distributed computing across loosely-coupled systems,
  63 !!   and therefore incurs a software penalty for negotiating the
  64 !!   communication. It is however an open industry standard whereas SHMEM
  65 !!   is a proprietary interface. Besides, the <TT>put</TT>s or
  66 !!   <TT>get</TT>s on which it is based cannot currently be implemented in
  67 !!   a cluster environment (there are recent announcements from Compaq that
  68 !!   occasion hope).
  69 !!
  70 !!   The message-passing requirements of climate and weather codes can be
  71 !!   reduced to a fairly simple minimal set, which is easily implemented in
  72 !!   any message-passing API. <TT>mpp_mod</TT> provides this API.
  73 !!
  74 !!    Features of <TT>mpp_mod</TT> include:
  75 !!    <ol>
  76 !!     <li> Simple, minimal API, with free access to underlying API for </li>
  77 !!       more complicated stuff.<BR/>
  78 !!     <li> Design toward typical use in climate/weather CFD codes. </li>
  79 !!     <li> Performance to be not significantly lower than any native API. </li>
  80 !!    </ol>
  81 !!
  82 !!   This module is used to develop higher-level calls for
  83 !!   domain decomposition (@ref mpp_domains) and parallel I/O (@ref fms2_io)
  84 !! <br/>
  85 !!   Parallel computing is initially daunting, but it soon becomes
  86 !!   second nature, much the way many of us can now write vector code
  87 !!   without much effort. The key insight required while reading and
  88 !!   writing parallel code is in arriving at a mental grasp of several
  89 !!   independent parallel execution streams through the same code (the SPMD
  90 !!   model). Each variable you examine may have different values for each
  91 !!   stream, the processor ID being an obvious example. Subroutines and
  92 !!   function calls are particularly subtle, since it is not always obvious
  93 !!   from looking at a call what synchronization between execution streams
  94 !!   it implies. An example of erroneous code would be a global barrier
  95 !!   call (see @ref mpp_sync below) placed
  96 !!   within a code block that not all PEs will execute, e.g:
  97 !!
  98 !!   <PRE>
  99 !!   if( pe.EQ.0 )call mpp_sync()
 100 !!   </PRE>
 101 !!
 102 !!   Here only PE 0 reaches the barrier, where it will wait
 103 !!   indefinitely. While this is a particularly egregious example to
 104 !!   illustrate the coding flaw, more subtle versions of the same are
 105 !!   among the most common errors in parallel code.
 106 !!  <br/>
 107 !!   It is therefore important to be conscious of the context of a
 108 !!   subroutine or function call, and the implied synchronization. There
 109 !!   are certain calls here (e.g <TT>mpp_declare_pelist, mpp_init,
 110 !!   mpp_set_stack_size</TT>) which must be called by all
 111 !!   PEs. There are others which must be called by a subset of PEs (here
 112 !!   called a <TT>pelist</TT>) which must be called by all the PEs in the
 113 !!   <TT>pelist</TT> (e.g <TT>mpp_max, mpp_sum, mpp_sync</TT>). Still
 114 !!   others imply no synchronization at all. I will make every effort to
 115 !!   highlight the context of each call in the MPP modules, so that the
 116 !!   implicit synchronization is spelt out.
 117 !! <br/>
 118 !!   For performance it is necessary to keep synchronization as limited
 119 !!   as the algorithm being implemented will allow. For instance, a single
 120 !!   message between two PEs should only imply synchronization across the
 121 !!   PEs in question. A <I>global</I> synchronization (or <I>barrier</I>)
 122 !!   is likely to be slow, and is best avoided. But codes first
 123 !!   parallelized on a Cray T3E tend to have many global syncs, as very
 124 !!   fast barriers were implemented there in hardware.
 125 !! <br/>
 126 !!   Another reason to use pelists is to run a single program in MPMD
 127 !!   mode, where different PE subsets work on different portions of the
 128 !!   code. A typical example is to assign an ocean model and atmosphere
 129 !!   model to different PE subsets, and couple them concurrently instead of
 130 !!   running them serially. The MPP module provides the notion of a
 131 !!   <I>current pelist</I>, which is set when a group of PEs branch off
 132 !!   into a subset. Subsequent calls that omit the <TT>pelist</TT> optional
 133 !!   argument (seen below in many of the individual calls) assume that the
 134 !!   implied synchronization is across the current pelist. The calls
 135 !!   <TT>mpp_root_pe</TT> and <TT>mpp_npes</TT> also return the values
 136 !!   appropriate to the current pelist. The <TT>mpp_set_current_pelist</TT>
 137 !!   call is provided to set the current pelist.
 138 !! </DESCRIPTION>
 139 !! <br/>
 140 !!
 141 !!  @note F90 is a strictly-typed language, and the syntax pass of the
 142 !!  compiler requires matching of type, kind and rank (TKR). Most calls
 143 !!  listed here use a generic type, shown here as <TT>MPP_TYPE_</TT>. This
 144 !!  is resolved in the pre-processor stage to any of a variety of
 145 !!  types. In general the MPP operations work on 4-byte and 8-byte
 146 !!  variants of <TT>integer, real, complex, logical</TT> variables, of
 147 !!  rank 0 to 5, leading to 48 specific module procedures under the same
 148 !!  generic interface. Any of the variables below shown as
 149 !!  <TT>MPP_TYPE_</TT> is treated in this way.
 150
 151 module mpp_mod
 152
 153 ! Define rank(X) for PGI compiler
 154 #if defined( __PGI) || defined (__FLANG)
 155 #define rank(X) size(shape(X))
 156 #endif
 157
 158
 159 #if defined(use_libMPI)
 160   use mpi
 161 #endif
 162
 163   use iso_fortran_env,   only : INPUT_UNIT, OUTPUT_UNIT, ERROR_UNIT
 164   use mpp_parameter_mod, only : MPP_VERBOSE, MPP_DEBUG, ALL_PES, ANY_PE, NULL_PE
 165   use mpp_parameter_mod, only : NOTE, WARNING, FATAL, MPP_CLOCK_DETAILED,MPP_CLOCK_SYNC
 166   use mpp_parameter_mod, only : CLOCK_COMPONENT, CLOCK_SUBCOMPONENT, CLOCK_MODULE_DRIVER
 167   use mpp_parameter_mod, only : CLOCK_MODULE, CLOCK_ROUTINE, CLOCK_LOOP, CLOCK_INFRA
 168   use mpp_parameter_mod, only : MAX_EVENTS, MAX_BINS, MAX_EVENT_TYPES, MAX_CLOCKS
 169   use mpp_parameter_mod, only : MAXPES, EVENT_WAIT, EVENT_ALLREDUCE, EVENT_BROADCAST
 170   use mpp_parameter_mod, only : EVENT_ALLTOALL
 171   use mpp_parameter_mod, only : EVENT_TYPE_CREATE, EVENT_TYPE_FREE
 172   use mpp_parameter_mod, only : EVENT_RECV, EVENT_SEND, MPP_READY, MPP_WAIT
 173   use mpp_parameter_mod, only : mpp_parameter_version=>version
 174   use mpp_parameter_mod, only : DEFAULT_TAG
 175   use mpp_parameter_mod, only : COMM_TAG_1,  COMM_TAG_2,  COMM_TAG_3,  COMM_TAG_4
 176   use mpp_parameter_mod, only : COMM_TAG_5,  COMM_TAG_6,  COMM_TAG_7,  COMM_TAG_8
 177   use mpp_parameter_mod, only : COMM_TAG_9,  COMM_TAG_10, COMM_TAG_11, COMM_TAG_12
 178   use mpp_parameter_mod, only : COMM_TAG_13, COMM_TAG_14, COMM_TAG_15, COMM_TAG_16
 179   use mpp_parameter_mod, only : COMM_TAG_17, COMM_TAG_18, COMM_TAG_19, COMM_TAG_20
 180   use mpp_parameter_mod, only : MPP_FILL_INT,MPP_FILL_DOUBLE
 181   use mpp_data_mod,      only : stat, mpp_stack, ptr_stack, status, ptr_status, sync, ptr_sync
 182   use mpp_data_mod,      only : mpp_from_pe, ptr_from, remote_data_loc, ptr_remote
 183   use mpp_data_mod,      only : mpp_data_version=>version
 184   use platform_mod
 185
 186 implicit none
 187 private
 188
 189   !--- public parameters  -----------------------------------------------
 190   public :: MPP_VERBOSE, MPP_DEBUG, ALL_PES, ANY_PE, NULL_PE, NOTE, WARNING, FATAL
 191   public :: MPP_CLOCK_SYNC, MPP_CLOCK_DETAILED, CLOCK_COMPONENT, CLOCK_SUBCOMPONENT
 192   public :: CLOCK_MODULE_DRIVER, CLOCK_MODULE, CLOCK_ROUTINE, CLOCK_LOOP, CLOCK_INFRA
 193   public :: MAXPES, EVENT_RECV, EVENT_SEND
 194   public :: COMM_TAG_1,  COMM_TAG_2,  COMM_TAG_3,  COMM_TAG_4
 195   public :: COMM_TAG_5,  COMM_TAG_6,  COMM_TAG_7,  COMM_TAG_8
 196   public :: COMM_TAG_9,  COMM_TAG_10, COMM_TAG_11, COMM_TAG_12
 197   public :: COMM_TAG_13, COMM_TAG_14, COMM_TAG_15, COMM_TAG_16
 198   public :: COMM_TAG_17, COMM_TAG_18, COMM_TAG_19, COMM_TAG_20
 199   public :: MPP_FILL_INT,MPP_FILL_DOUBLE,MPP_INFO_NULL,MPP_COMM_NULL
 200   public :: mpp_init_test_full_init, mpp_init_test_init_true_only, mpp_init_test_peset_allocated
 201   public :: mpp_init_test_clocks_init, mpp_init_test_datatype_list_init, mpp_init_test_logfile_init
 202   public :: mpp_init_test_read_namelist, mpp_init_test_etc_unit, mpp_init_test_requests_allocated
 203
 204   !--- public interface from mpp_util.h ------------------------------
 205   public :: stdin, stdout, stderr, stdlog, warnlog, lowercase, uppercase, mpp_error, mpp_error_state
 206   public :: mpp_set_warn_level, mpp_sync, mpp_sync_self, mpp_pe
 207   public :: mpp_npes, mpp_root_pe, mpp_set_root_pe, mpp_declare_pelist
 208   public :: mpp_get_current_pelist, mpp_set_current_pelist, mpp_get_current_pelist_name
 209   public :: mpp_clock_id, mpp_clock_set_grain, mpp_record_timing_data, get_unit
 210   public :: read_ascii_file, read_input_nml, mpp_clock_begin, mpp_clock_end
 211   public :: get_ascii_file_num_lines, get_ascii_file_num_lines_and_length
 212   public :: mpp_record_time_start, mpp_record_time_end
 213
 214   !--- public interface from mpp_comm.h ------------------------------
 215   public :: mpp_chksum, mpp_max, mpp_min, mpp_sum, mpp_transmit, mpp_send, mpp_recv
 216   public :: mpp_sum_ad
 217   public :: mpp_broadcast, mpp_init, mpp_exit
 218   public :: mpp_gather, mpp_scatter, mpp_alltoall
 219   public :: mpp_type, mpp_byte, mpp_type_create, mpp_type_free
 220
 221   !*********************************************************************
 222   !
 223   !    public data type
 224   !
 225   !*********************************************************************
 226   !> Communication information for message passing libraries
 227   !!
 228   !> peset hold communicators as SHMEM-compatible triads (start, log2(stride), num)
 229   !> @ingroup mpp_mod
 230   type :: communicator
 231      private
 232      character(len=32) :: name
 233      integer, pointer  :: list(:) =>NULL()
 234      integer           :: count
 235      integer           :: start, log2stride !< dummy variables when libMPI is defined.
 236      integer           :: id, group         !< MPI communicator and group id for this PE set.
 237   end type communicator
 238
 239   !> Communication event profile
 240   !> @ingroup mpp_mod
 241   type :: event
 242      private
 243      character(len=16)                         :: name
 244      integer(i8_kind), dimension(MAX_EVENTS)   :: ticks, bytes
 245      integer                                   :: calls
 246   end type event
 247
 248   !> a clock contains an array of event profiles for a region
 249   !> @ingroup mpp_mod
 250   type :: clock
 251      private
 252      character(len=32)    :: name
 253      integer(i8_kind)     :: hits
 254      integer(i8_kind)     :: tick
 255      integer(i8_kind)     :: total_ticks
 256      integer              :: peset_num
 257      logical              :: sync_on_begin, detailed
 258      integer              :: grain
 259      type(event), pointer :: events(:) =>NULL() !> if needed, allocate to MAX_EVENT_TYPES
 260      logical              :: is_on              !> initialize to false. set true when calling mpp_clock_begin
 261                                                 !! set false when calling mpp_clock_end
 262   end type clock
 263
 264   !> Summary of information from a clock run
 265   !> @ingroup mpp_mod
 266   type :: Clock_Data_Summary
 267      private
 268      character(len=16)  :: name
 269      real(r8_kind)      :: msg_size_sums(MAX_BINS)
 270      real(r8_kind)      :: msg_time_sums(MAX_BINS)
 271      real(r8_kind)      :: total_data
 272      real(r8_kind)      :: total_time
 273      integer(i8_kind)   :: msg_size_cnts(MAX_BINS)
 274      integer(i8_kind)   :: total_cnts
 275   end type Clock_Data_Summary
 276
 277   !> holds name and clock data for use in @ref mpp_util.h
 278   !> @ingroup mpp_mod
 279   type :: Summary_Struct
 280      private
 281      character(len=16)         :: name
 282      type (Clock_Data_Summary) :: event(MAX_EVENT_TYPES)
 283   end type Summary_Struct
 284
 285   !> Data types for generalized data transfer (e.g. MPI_Type)
 286   !> @ingroup mpp_mod
 287   type :: mpp_type
 288      private
 289      integer :: counter !> Number of instances of this type
 290      integer :: ndims
 291      integer, allocatable :: sizes(:)
 292      integer, allocatable :: subsizes(:)
 293      integer, allocatable :: starts(:)
 294      integer :: etype   !> Elementary data type (e.g. MPI_BYTE)
 295      integer :: id      !> Identifier within message passing library (e.g. MPI)
 296
 297      type(mpp_type), pointer :: prev => null()
 298      type(mpp_type), pointer :: next => null()
 299   end type mpp_type
 300
 301   !> Persisent elements for linked list interaction
 302   !> @ingroup mpp_mod
 303   type :: mpp_type_list
 304       private
 305       type(mpp_type), pointer :: head => null()
 306       type(mpp_type), pointer :: tail => null()
 307       integer :: length
 308   end type mpp_type_list
 309
 310 !***********************************************************************
 311 !
 312 !     public interface from mpp_util.h
 313 !
 314 !***********************************************************************
 315   !> @brief Error handler.
 316   !!
 317   !>    It is strongly recommended that all error exits pass through
 318   !!    <TT>mpp_error</TT> to assure the program fails cleanly. An individual
 319   !!    PE encountering a <TT>STOP</TT> statement, for instance, can cause the
 320   !!    program to hang. The use of the <TT>STOP</TT> statement is strongly
 321   !!    discouraged.
 322   !!
 323   !!    Calling mpp_error with no arguments produces an immediate error
 324   !!    exit, i.e:
 325   !!    <PRE>
 326   !!                    call mpp_error
 327   !!                    call mpp_error()
 328   !!    </PRE>
 329   !!    are equivalent.
 330   !!
 331   !!    The argument order
 332   !!    <PRE>
 333   !!                    call mpp_error( routine, errormsg, errortype )
 334   !!    </PRE>
 335   !!    is also provided to support legacy code. In this version of the
 336   !!    call, none of the arguments may be omitted.
 337   !!
 338   !!    The behaviour of <TT>mpp_error</TT> for a <TT>WARNING</TT> can be
 339   !!    controlled with an additional call <TT>mpp_set_warn_level</TT>.
 340   !!    <PRE>
 341   !!                    call mpp_set_warn_level(ERROR)
 342   !!    </PRE>
 343   !!    causes <TT>mpp_error</TT> to treat <TT>WARNING</TT>
 344   !!    exactly like <TT>FATAL</TT>.
 345   !!    <PRE>
 346   !!                    call mpp_set_warn_level(WARNING)
 347   !!    </PRE>
 348   !!    resets to the default behaviour described above.
 349   !!
 350   !!    <TT>mpp_error</TT> also has an internal error state which
 351   !!    maintains knowledge of whether a warning has been issued. This can be
 352   !!    used at startup in a subroutine that checks if the model has been
 353   !!    properly configured. You can generate a series of warnings using
 354   !!    <TT>mpp_error</TT>, and then check at the end if any warnings has been
 355   !!    issued using the function <TT>mpp_error_state()</TT>. If the value of
 356   !!    this is <TT>WARNING</TT>, at least one warning has been issued, and
 357   !!    the user can take appropriate action:
 358   !!
 359   !!    <PRE>
 360   !!                    if( ... )call mpp_error( WARNING, '...' )
 361   !!                    if( ... )call mpp_error( WARNING, '...' )
 362   !!                    if( ... )call mpp_error( WARNING, '...' )
 363   !!                    ...
 364   !!                    if( mpp_error_state().EQ.WARNING )call mpp_error( FATAL, '...' )
 365   !!    </PRE>
 366   !!  </DESCRIPTION>
 367   !! <br> Example usage:
 368   !! @code{.F90}
 369   !! call mpp_error( errortype, routine, errormsg )
 370   !! @endcode
 371   !! @param errortype
 372   !!    One of <TT>NOTE</TT>, <TT>WARNING</TT> or <TT>FATAL</TT>
 373   !!    (these definitions are acquired by use association).
 374   !!    <TT>NOTE</TT> writes <TT>errormsg</TT> to <TT>STDOUT</TT>.
 375   !!    <TT>WARNING</TT> writes <TT>errormsg</TT> to <TT>STDERR</TT>.
 376   !!    <TT>FATAL</TT> writes <TT>errormsg</TT> to <TT>STDERR</TT>,
 377   !!    and induces a clean error exit with a call stack traceback.
 378   !! @param routine Calling routine name
 379   !! @param errmsg Message to output
 380   !!  </IN>
 381   !> @ingroup mpp_mod
 382   interface mpp_error
 383      module procedure mpp_error_basic
 384      module procedure mpp_error_mesg
 385      module procedure mpp_error_noargs
 386      module procedure mpp_error_is
 387      module procedure mpp_error_rs
 388      module procedure mpp_error_ia
 389      module procedure mpp_error_ra
 390      module procedure mpp_error_ia_ia
 391      module procedure mpp_error_ia_ra
 392      module procedure mpp_error_ra_ia
 393      module procedure mpp_error_ra_ra
 394      module procedure mpp_error_ia_is
 395      module procedure mpp_error_ia_rs
 396      module procedure mpp_error_ra_is
 397      module procedure mpp_error_ra_rs
 398      module procedure mpp_error_is_ia
 399      module procedure mpp_error_is_ra
 400      module procedure mpp_error_rs_ia
 401      module procedure mpp_error_rs_ra
 402      module procedure mpp_error_is_is
 403      module procedure mpp_error_is_rs
 404      module procedure mpp_error_rs_is
 405      module procedure mpp_error_rs_rs
 406   end interface
 407   !> Takes a given integer or real array and returns it as a string
 408   !> @param[in] array An array of integers or reals
 409   !> @returns string equivalent of given array
 410   !> @ingroup mpp_mod
 411   interface array_to_char
 412      module procedure iarray_to_char
 413      module procedure rarray_to_char
 414   end interface
 415
 416 !***********************************************************************
 417 !
 418 !    public interface from mpp_comm.h
 419 !
 420 !***********************************************************************
 421
 422 !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
 423   !                                                                             !
 424   !       ROUTINES TO INITIALIZE/FINALIZE MPP MODULE: mpp_init, mpp_exit        !
 425   !                                                                             !
 426 !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
 427
 428 !> @fn mpp_mod::mpp_init::mpp_init( flags, localcomm, test_level)
 429 !> @ingroup mpp_mod
 430 !> @brief Initialize @ref mpp_mod
 431 !!
 432 !> Called to initialize the <TT>mpp_mod</TT> package. It is recommended
 433 !! that this call be the first executed line in your program. It sets the
 434 !! number of PEs assigned to this run (acquired from the command line, or
 435 !! through the environment variable <TT>NPES</TT>), and associates an ID
 436 !! number to each PE. These can be accessed by calling @ref mpp_npes and
 437 !! @ref mpp_pe.
 438 !! <br> Example usage:
 439 !!
 440 !!            call mpp_init( flags )
 441 !!
 442 !! @param flags
 443 !!   <TT>flags</TT> can be set to <TT>MPP_VERBOSE</TT> to
 444 !!   have <TT>mpp_mod</TT> keep you informed of what it's up to.
 445 !! @param test_level
 446 !!   Debugging flag to set amount of initialization tasks performed
 447
 448 !> @fn mpp_mod::mpp_exit()
 449 !> @brief Exit <TT>@ref mpp_mod</TT>.
 450 !!
 451 !> Called at the end of the run, or to re-initialize <TT>mpp_mod</TT>,
 452 !! should you require that for some odd reason.
 453 !!
 454 !! This call implies synchronization across all PEs.
 455 !!
 456 !! <br>Example usage:
 457 !!
 458 !!            call mpp_exit()
 459 !> @ingroup mpp_mod
 460
 461   !#####################################################################
 462
 463   !> @fn subroutine mpp_set_stack_size(n)
 464   !> @brief Allocate module internal workspace.
 465   !> @param Integer to set stack size to(in words)
 466   !> <TT>mpp_mod</TT> maintains a private internal array called
 467   !! <TT>mpp_stack</TT> for private workspace. This call sets the length,
 468   !! in words, of this array.
 469   !!
 470   !! The <TT>mpp_init</TT> call sets this
 471   !! workspace length to a default of 32768, and this call may be used if a
 472   !! longer workspace is needed.
 473   !!
 474   !! This call implies synchronization across all PEs.
 475   !!
 476   !! This workspace is symmetrically allocated, as required for
 477   !! efficient communication on SGI and Cray MPP systems. Since symmetric
 478   !! allocation must be performed by <I>all</I> PEs in a job, this call
 479   !! must also be called by all PEs, using the same value of
 480   !! <TT>n</TT>. Calling <TT>mpp_set_stack_size</TT> from a subset of PEs,
 481   !! or with unequal argument <TT>n</TT>, may cause the program to hang.
 482   !!
 483   !! If any MPP call using <TT>mpp_stack</TT> overflows the declared
 484   !! stack array, the program will abort with a message specifying the
 485   !! stack length that is required. Many users wonder why, if the required
 486   !! stack length can be computed, it cannot also be specified at that
 487   !! point. This cannot be automated because there is no way for the
 488   !! program to know if all PEs are present at that call, and with equal
 489   !! values of <TT>n</TT>. The program must be rerun by the user with the
 490   !! correct argument to <TT>mpp_set_stack_size</TT>, called at an
 491   !! appropriate point in the code where all PEs are known to be present.
 492   !!        @verbose call mpp_set_stack_size(n)
 493   !!
 494   !> @ingroup mpp_mod
 495   public :: mpp_set_stack_size
 496   ! from mpp_util.h
 497
 498 !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
 499 !                                                                             !
 500 !              DATA TRANSFER TYPES: mpp_type_create                           !
 501 !                                                                             !
 502 !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
 503
 504   !> @brief Create a mpp_type variable
 505   !> @param[in] field A field of any numerical or logical type
 506   !> @param[in] array_of_subsizes Integer array of subsizes
 507   !> @param[in] array_of_starts Integer array of starts
 508   !> @param[out] dtype_out Output variable for created @ref mpp_type
 509   !> @ingroup mpp_mod
 510   interface mpp_type_create
 511       module procedure mpp_type_create_int4
 512       module procedure mpp_type_create_int8
 513       module procedure mpp_type_create_real4
 514       module procedure mpp_type_create_real8
 515       module procedure mpp_type_create_cmplx4
 516       module procedure mpp_type_create_cmplx8
 517       module procedure mpp_type_create_logical4
 518       module procedure mpp_type_create_logical8
 519   end interface mpp_type_create
 520
 521 !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
 522   !                                                                             !
 523   !            GLOBAL REDUCTION ROUTINES: mpp_max, mpp_sum, mpp_min             !
 524   !                                                                             !
 525 !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
 526
 527   !> @brief Reduction operations.
 528   !>    Find the max of scalar a from the PEs in pelist
 529   !!    result is also automatically broadcast to all PEs
 530   !!    @code{.F90}
 531   !!            call  mpp_max( a, pelist )
 532   !!    @endcode
 533   !> @param a <TT>real</TT> or <TT>integer</TT>, of 4-byte of 8-byte kind.
 534   !> @param pelist If <TT>pelist</TT> is omitted, the context is assumed to be the
 535   !!    current pelist. This call implies synchronization across the PEs in
 536   !!    <TT>pelist</TT>, or the current pelist if <TT>pelist</TT> is absent.
 537   !> @ingroup mpp_mod
 538   interface mpp_max
 539      module procedure mpp_max_real8_0d
 540      module procedure mpp_max_real8_1d
 541      module procedure mpp_max_int8_0d
 542      module procedure mpp_max_int8_1d
 543      module procedure mpp_max_real4_0d
 544      module procedure mpp_max_real4_1d
 545      module procedure mpp_max_int4_0d
 546      module procedure mpp_max_int4_1d
 547   end interface
 548
 549   !> @brief Reduction operations.
 550   !>    Find the min of scalar a from the PEs in pelist
 551   !!    result is also automatically broadcast to all PEs
 552   !!    @code{.F90}
 553   !!            call  mpp_min( a, pelist )
 554   !!    @endcode
 555   !> @param a <TT>real</TT> or <TT>integer</TT>, of 4-byte of 8-byte kind.
 556   !> @param pelist If <TT>pelist</TT> is omitted, the context is assumed to be the
 557   !!    current pelist. This call implies synchronization across the PEs in
 558   !!    <TT>pelist</TT>, or the current pelist if <TT>pelist</TT> is absent.
 559   !> @ingroup mpp_mod
 560   interface mpp_min
 561      module procedure mpp_min_real8_0d
 562      module procedure mpp_min_real8_1d
 563      module procedure mpp_min_int8_0d
 564      module procedure mpp_min_int8_1d
 565      module procedure mpp_min_real4_0d
 566      module procedure mpp_min_real4_1d
 567      module procedure mpp_min_int4_0d
 568      module procedure mpp_min_int4_1d
 569   end interface
 570
 571
 572   !> @brief Reduction operation.
 573   !!
 574   !> <TT>MPP_TYPE_</TT> corresponds to any 4-byte and 8-byte variant of
 575   !! <TT>integer, real, complex</TT> variables, of rank 0 or 1. A
 576   !! contiguous block from a multi-dimensional array may be passed by its
 577   !! starting address and its length, as in <TT>f77</TT>.
 578   !!
 579   !! Library reduction operators are not required or guaranteed to be
 580   !! bit-reproducible. In any case, changing the processor count changes
 581   !! the data layout, and thus very likely the order of operations. For
 582   !! bit-reproducible sums of distributed arrays, consider using the
 583   !! <TT>mpp_global_sum</TT> routine provided by the
 584   !! @ref mpp_domains module.
 585   !!
 586   !! The <TT>bit_reproducible</TT> flag provided in earlier versions of
 587   !! this routine has been removed.
 588   !!
 589   !!
 590   !! If <TT>pelist</TT> is omitted, the context is assumed to be the
 591   !! current pelist. This call implies synchronization across the PEs in
 592   !! <TT>pelist</TT>, or the current pelist if <TT>pelist</TT> is absent.
 593   !! Example usage:
 594   !!            call mpp_sum( a, length, pelist )
 595   !!
 596   !> @ingroup mpp_mod
 597   interface mpp_sum
 598      module procedure mpp_sum_int8
 599      module procedure mpp_sum_int8_scalar
 600      module procedure mpp_sum_int8_2d
 601      module procedure mpp_sum_int8_3d
 602      module procedure mpp_sum_int8_4d
 603      module procedure mpp_sum_int8_5d
 604      module procedure mpp_sum_real8
 605      module procedure mpp_sum_real8_scalar
 606      module procedure mpp_sum_real8_2d
 607      module procedure mpp_sum_real8_3d
 608      module procedure mpp_sum_real8_4d
 609      module procedure mpp_sum_real8_5d
 610 #ifdef OVERLOAD_C8
 611      module procedure mpp_sum_cmplx8
 612      module procedure mpp_sum_cmplx8_scalar
 613      module procedure mpp_sum_cmplx8_2d
 614      module procedure mpp_sum_cmplx8_3d
 615      module procedure mpp_sum_cmplx8_4d
 616      module procedure mpp_sum_cmplx8_5d
 617 #endif
 618      module procedure mpp_sum_int4
 619      module procedure mpp_sum_int4_scalar
 620      module procedure mpp_sum_int4_2d
 621      module procedure mpp_sum_int4_3d
 622      module procedure mpp_sum_int4_4d
 623      module procedure mpp_sum_int4_5d
 624      module procedure mpp_sum_real4
 625      module procedure mpp_sum_real4_scalar
 626      module procedure mpp_sum_real4_2d
 627      module procedure mpp_sum_real4_3d
 628      module procedure mpp_sum_real4_4d
 629      module procedure mpp_sum_real4_5d
 630 #ifdef OVERLOAD_C4
 631      module procedure mpp_sum_cmplx4
 632      module procedure mpp_sum_cmplx4_scalar
 633      module procedure mpp_sum_cmplx4_2d
 634      module procedure mpp_sum_cmplx4_3d
 635      module procedure mpp_sum_cmplx4_4d
 636      module procedure mpp_sum_cmplx4_5d
 637 #endif
 638   end interface
 639
 640   !> Calculates sum of a given numerical array across pe's for adjoint domains
 641   !> @ingroup mpp_mod
 642   interface mpp_sum_ad
 643      module procedure mpp_sum_int8_ad
 644      module procedure mpp_sum_int8_scalar_ad
 645      module procedure mpp_sum_int8_2d_ad
 646      module procedure mpp_sum_int8_3d_ad
 647      module procedure mpp_sum_int8_4d_ad
 648      module procedure mpp_sum_int8_5d_ad
 649      module procedure mpp_sum_real8_ad
 650      module procedure mpp_sum_real8_scalar_ad
 651      module procedure mpp_sum_real8_2d_ad
 652      module procedure mpp_sum_real8_3d_ad
 653      module procedure mpp_sum_real8_4d_ad
 654      module procedure mpp_sum_real8_5d_ad
 655 #ifdef OVERLOAD_C8
 656      module procedure mpp_sum_cmplx8_ad
 657      module procedure mpp_sum_cmplx8_scalar_ad
 658      module procedure mpp_sum_cmplx8_2d_ad
 659      module procedure mpp_sum_cmplx8_3d_ad
 660      module procedure mpp_sum_cmplx8_4d_ad
 661      module procedure mpp_sum_cmplx8_5d_ad
 662 #endif
 663      module procedure mpp_sum_int4_ad
 664      module procedure mpp_sum_int4_scalar_ad
 665      module procedure mpp_sum_int4_2d_ad
 666      module procedure mpp_sum_int4_3d_ad
 667      module procedure mpp_sum_int4_4d_ad
 668      module procedure mpp_sum_int4_5d_ad
 669      module procedure mpp_sum_real4_ad
 670      module procedure mpp_sum_real4_scalar_ad
 671      module procedure mpp_sum_real4_2d_ad
 672      module procedure mpp_sum_real4_3d_ad
 673      module procedure mpp_sum_real4_4d_ad
 674      module procedure mpp_sum_real4_5d_ad
 675 #ifdef OVERLOAD_C4
 676      module procedure mpp_sum_cmplx4_ad
 677      module procedure mpp_sum_cmplx4_scalar_ad
 678      module procedure mpp_sum_cmplx4_2d_ad
 679      module procedure mpp_sum_cmplx4_3d_ad
 680      module procedure mpp_sum_cmplx4_4d_ad
 681      module procedure mpp_sum_cmplx4_5d_ad
 682 #endif
 683   end interface
 684
 685   !> @brief Gather data sent from pelist onto the root pe
 686   !! Wrapper for MPI_gather, can be used with and without indices
 687   !> @ingroup mpp_mod
 688   !!
 689   !> @param sbuf MPP_TYPE_ data buffer to send
 690   !> @param rbuf MPP_TYPE_ data buffer to receive
 691   !> @param pelist integer(:) optional pelist to gather from, defaults to current
 692   !>
 693   !> <BR> Example usage:
 694   !!
 695   !!                    call mpp_gather(send_buffer,recv_buffer, pelist)
 696   !!                    call mpp_gather(is, ie, js, je, pelist, array_seg, data, is_root_pe)
 697   !!
 698   interface mpp_gather
 699      module procedure mpp_gather_logical_1d
 700      module procedure mpp_gather_int4_1d
 701      module procedure mpp_gather_int8_1d
 702      module procedure mpp_gather_real4_1d
 703      module procedure mpp_gather_real8_1d
 704      module procedure mpp_gather_logical_1dv
 705      module procedure mpp_gather_int4_1dv
 706      module procedure mpp_gather_int8_1dv
 707      module procedure mpp_gather_real4_1dv
 708      module procedure mpp_gather_real8_1dv
 709      module procedure mpp_gather_pelist_logical_2d
 710      module procedure mpp_gather_pelist_logical_3d
 711      module procedure mpp_gather_pelist_int4_2d
 712      module procedure mpp_gather_pelist_int4_3d
 713      module procedure mpp_gather_pelist_int8_2d
 714      module procedure mpp_gather_pelist_int8_3d
 715      module procedure mpp_gather_pelist_real4_2d
 716      module procedure mpp_gather_pelist_real4_3d
 717      module procedure mpp_gather_pelist_real8_2d
 718      module procedure mpp_gather_pelist_real8_3d
 719   end interface
 720
 721   !> @brief Scatter (ie - is) * (je - js) contiguous elements of array data from the designated root pe
 722   !! into contigous members of array segment in each pe that is included in the pelist argument.
 723   !> @ingroup mpp_mod
 724   !!
 725   !> @param is, ie integer start and end index of the first dimension of the segment array
 726   !> @param je, js integer start and end index of the second dimension of the segment array
 727   !> @param pelist integer(:) the PE list of target pes, needs to be monotonically increasing
 728   !> @param array_seg MPP_TYPE_ 2D array that the data is to be copied into
 729   !> @param data MPP_TYPE_ the source array
 730   !> @param is_root_pe logical true if calling from root pe
 731   !> @param ishift integer offsets specifying the first elelement in the data array
 732   !> @param nk integer size of third dimension for 3D calls
 733   !!
 734   !> <BR> Example usage:
 735   !!
 736   !!                    call mpp_scatter(is, ie, js, je, pelist, segment, data, .true.)
 737   !!
 738   interface mpp_scatter
 739      module procedure mpp_scatter_pelist_int4_2d
 740      module procedure mpp_scatter_pelist_int4_3d
 741      module procedure mpp_scatter_pelist_int8_2d
 742      module procedure mpp_scatter_pelist_int8_3d
 743      module procedure mpp_scatter_pelist_real4_2d
 744      module procedure mpp_scatter_pelist_real4_3d
 745      module procedure mpp_scatter_pelist_real8_2d
 746      module procedure mpp_scatter_pelist_real8_3d
 747   end interface
 748
 749   !#####################################################################
 750   !> @brief Scatter a vector across all PEs
 751   !!
 752   !> Transpose the vector and PE index
 753   !! Wrapper for the MPI_alltoall function, includes more generic _V and _W
 754   !! versions if given displacements/data types
 755   !!
 756   !! Generic MPP_TYPE_ implentations:
 757   !! <li> @ref mpp_alltoall_ </li>
 758   !! <li> @ref mpp_alltoallv_ </li>
 759   !! <li> @ref mpp_alltoallw_ </li>
 760   !!
 761   !> @ingroup mpp_mod
 762   interface mpp_alltoall
 763      module procedure mpp_alltoall_int4
 764      module procedure mpp_alltoall_int8
 765      module procedure mpp_alltoall_real4
 766      module procedure mpp_alltoall_real8
 767 #ifdef OVERLOAD_C4
 768      module procedure mpp_alltoall_cmplx4
 769 #endif
 770 #ifdef OVERLOAD_C8
 771      module procedure mpp_alltoall_cmplx8
 772 #endif
 773      module procedure mpp_alltoall_logical4
 774      module procedure mpp_alltoall_logical8
 775      module procedure mpp_alltoall_int4_v
 776      module procedure mpp_alltoall_int8_v
 777      module procedure mpp_alltoall_real4_v
 778      module procedure mpp_alltoall_real8_v
 779 #ifdef OVERLOAD_C4
 780      module procedure mpp_alltoall_cmplx4_v
 781 #endif
 782 #ifdef OVERLOAD_C8
 783      module procedure mpp_alltoall_cmplx8_v
 784 #endif
 785      module procedure mpp_alltoall_logical4_v
 786      module procedure mpp_alltoall_logical8_v
 787      module procedure mpp_alltoall_int4_w
 788      module procedure mpp_alltoall_int8_w
 789      module procedure mpp_alltoall_real4_w
 790      module procedure mpp_alltoall_real8_w
 791 #ifdef OVERLOAD_C4
 792      module procedure mpp_alltoall_cmplx4_w
 793 #endif
 794 #ifdef OVERLOAD_C8
 795      module procedure mpp_alltoall_cmplx8_w
 796 #endif
 797      module procedure mpp_alltoall_logical4_w
 798      module procedure mpp_alltoall_logical8_w
 799   end interface
 800
 801
 802   !#####################################################################
 803   !> @brief Basic message-passing call.
 804   !!
 805   !>    <TT>MPP_TYPE_</TT> corresponds to any 4-byte and 8-byte variant of
 806   !!    <TT>integer, real, complex, logical</TT> variables, of rank 0 or 1. A
 807   !!    contiguous block from a multi-dimensional array may be passed by its
 808   !!    starting address and its length, as in <TT>f77</TT>.
 809   !!
 810   !!    <TT>mpp_transmit</TT> is currently implemented as asynchronous
 811   !!    outward transmission and synchronous inward transmission. This follows
 812   !!    the behaviour of <TT>shmem_put</TT> and <TT>shmem_get</TT>. In MPI, it
 813   !!    is implemented as <TT>mpi_isend</TT> and <TT>mpi_recv</TT>. For most
 814   !!    applications, transmissions occur in pairs, and are here accomplished
 815   !!    in a single call.
 816   !!
 817   !!    The special PE designations <TT>NULL_PE</TT>,
 818   !!    <TT>ANY_PE</TT> and <TT>ALL_PES</TT> are provided by use
 819   !!    association.
 820   !!
 821   !!    <TT>NULL_PE</TT>: is used to disable one of the pair of
 822   !!    transmissions.<BR/>
 823   !!    <TT>ANY_PE</TT>: is used for unspecific remote
 824   !!    destination. (Please note that <TT>put_pe=ANY_PE</TT> has no meaning
 825   !!    in the MPI context, though it is available in the SHMEM invocation. If
 826   !!    portability is a concern, it is best avoided).<BR/>
 827   !!    <TT>ALL_PES</TT>: is used for broadcast operations.
 828   !!
 829   !!    It is recommended that
 830   !!    @ref mpp_broadcast be used for
 831   !!    broadcasts.
 832   !!
 833   !!    The following example illustrates the use of
 834   !!    <TT>NULL_PE</TT> and <TT>ALL_PES</TT>:
 835   !!
 836   !!    <PRE>
 837   !!    real, dimension(n) :: a
 838   !!    if( pe.EQ.0 )then
 839   !!        do p = 1,npes-1
 840   !!           call mpp_transmit( a, n, p, a, n, NULL_PE )
 841   !!        end do
 842   !!    else
 843   !!        call mpp_transmit( a, n, NULL_PE, a, n, 0 )
 844   !!    end if
 845   !!
 846   !!    call mpp_transmit( a, n, ALL_PES, a, n, 0 )
 847   !!    </PRE>
 848   !!
 849   !!    The do loop and the broadcast operation above are equivalent.
 850   !!
 851   !!    Two overloaded calls <TT>mpp_send</TT> and
 852   !!     <TT>mpp_recv</TT> have also been
 853   !!    provided. <TT>mpp_send</TT> calls <TT>mpp_transmit</TT>
 854   !!    with <TT>get_pe=NULL_PE</TT>. <TT>mpp_recv</TT> calls
 855   !!    <TT>mpp_transmit</TT> with <TT>put_pe=NULL_PE</TT>. Thus
 856   !!    the do loop above could be written more succinctly:
 857   !!
 858   !!    <PRE>
 859   !!    if( pe.EQ.0 )then
 860   !!        do p = 1,npes-1
 861   !!           call mpp_send( a, n, p )
 862   !!        end do
 863   !!    else
 864   !!        call mpp_recv( a, n, 0 )
 865   !!    end if
 866   !!    </PRE>
 867   !! <br>Example call:
 868   !! @code{.F90}
 869   !!    call mpp_transmit( put_data, put_len, put_pe, get_data, get_len, get_pe )
 870   !! @endcode
 871   !> @ingroup mpp_mod
 872   interface mpp_transmit
 873      module procedure mpp_transmit_real8
 874      module procedure mpp_transmit_real8_scalar
 875      module procedure mpp_transmit_real8_2d
 876      module procedure mpp_transmit_real8_3d
 877      module procedure mpp_transmit_real8_4d
 878      module procedure mpp_transmit_real8_5d
 879 #ifdef OVERLOAD_C8
 880      module procedure mpp_transmit_cmplx8
 881      module procedure mpp_transmit_cmplx8_scalar
 882      module procedure mpp_transmit_cmplx8_2d
 883      module procedure mpp_transmit_cmplx8_3d
 884      module procedure mpp_transmit_cmplx8_4d
 885      module procedure mpp_transmit_cmplx8_5d
 886 #endif
 887      module procedure mpp_transmit_int8
 888      module procedure mpp_transmit_int8_scalar
 889      module procedure mpp_transmit_int8_2d
 890      module procedure mpp_transmit_int8_3d
 891      module procedure mpp_transmit_int8_4d
 892      module procedure mpp_transmit_int8_5d
 893      module procedure mpp_transmit_logical8
 894      module procedure mpp_transmit_logical8_scalar
 895      module procedure mpp_transmit_logical8_2d
 896      module procedure mpp_transmit_logical8_3d
 897      module procedure mpp_transmit_logical8_4d
 898      module procedure mpp_transmit_logical8_5d
 899
 900      module procedure mpp_transmit_real4
 901      module procedure mpp_transmit_real4_scalar
 902      module procedure mpp_transmit_real4_2d
 903      module procedure mpp_transmit_real4_3d
 904      module procedure mpp_transmit_real4_4d
 905      module procedure mpp_transmit_real4_5d
 906
 907 #ifdef OVERLOAD_C4
 908      module procedure mpp_transmit_cmplx4
 909      module procedure mpp_transmit_cmplx4_scalar
 910      module procedure mpp_transmit_cmplx4_2d
 911      module procedure mpp_transmit_cmplx4_3d
 912      module procedure mpp_transmit_cmplx4_4d
 913      module procedure mpp_transmit_cmplx4_5d
 914 #endif
 915      module procedure mpp_transmit_int4
 916      module procedure mpp_transmit_int4_scalar
 917      module procedure mpp_transmit_int4_2d
 918      module procedure mpp_transmit_int4_3d
 919      module procedure mpp_transmit_int4_4d
 920      module procedure mpp_transmit_int4_5d
 921      module procedure mpp_transmit_logical4
 922      module procedure mpp_transmit_logical4_scalar
 923      module procedure mpp_transmit_logical4_2d
 924      module procedure mpp_transmit_logical4_3d
 925      module procedure mpp_transmit_logical4_4d
 926      module procedure mpp_transmit_logical4_5d
 927   end interface
 928   !> @brief Recieve data from another PE
 929   !!
 930   !> @param[out] get_data scalar or array to get written with received data
 931   !> @param get_len size of array to recv from get_data
 932   !> @param from_pe PE number to receive from
 933   !> @param block true for blocking, false for non-blocking. Defaults to true
 934   !> @param tag communication tag
 935   !> @param[out] request MPI request handle
 936   !> @ingroup mpp_mod
 937   interface mpp_recv
 938      module procedure mpp_recv_real8
 939      module procedure mpp_recv_real8_scalar
 940      module procedure mpp_recv_real8_2d
 941      module procedure mpp_recv_real8_3d
 942      module procedure mpp_recv_real8_4d
 943      module procedure mpp_recv_real8_5d
 944 #ifdef OVERLOAD_C8
 945      module procedure mpp_recv_cmplx8
 946      module procedure mpp_recv_cmplx8_scalar
 947      module procedure mpp_recv_cmplx8_2d
 948      module procedure mpp_recv_cmplx8_3d
 949      module procedure mpp_recv_cmplx8_4d
 950      module procedure mpp_recv_cmplx8_5d
 951 #endif
 952      module procedure mpp_recv_int8
 953      module procedure mpp_recv_int8_scalar
 954      module procedure mpp_recv_int8_2d
 955      module procedure mpp_recv_int8_3d
 956      module procedure mpp_recv_int8_4d
 957      module procedure mpp_recv_int8_5d
 958      module procedure mpp_recv_logical8
 959      module procedure mpp_recv_logical8_scalar
 960      module procedure mpp_recv_logical8_2d
 961      module procedure mpp_recv_logical8_3d
 962      module procedure mpp_recv_logical8_4d
 963      module procedure mpp_recv_logical8_5d
 964
 965      module procedure mpp_recv_real4
 966      module procedure mpp_recv_real4_scalar
 967      module procedure mpp_recv_real4_2d
 968      module procedure mpp_recv_real4_3d
 969      module procedure mpp_recv_real4_4d
 970      module procedure mpp_recv_real4_5d
 971
 972 #ifdef OVERLOAD_C4
 973      module procedure mpp_recv_cmplx4
 974      module procedure mpp_recv_cmplx4_scalar
 975      module procedure mpp_recv_cmplx4_2d
 976      module procedure mpp_recv_cmplx4_3d
 977      module procedure mpp_recv_cmplx4_4d
 978      module procedure mpp_recv_cmplx4_5d
 979 #endif
 980      module procedure mpp_recv_int4
 981      module procedure mpp_recv_int4_scalar
 982      module procedure mpp_recv_int4_2d
 983      module procedure mpp_recv_int4_3d
 984      module procedure mpp_recv_int4_4d
 985      module procedure mpp_recv_int4_5d
 986      module procedure mpp_recv_logical4
 987      module procedure mpp_recv_logical4_scalar
 988      module procedure mpp_recv_logical4_2d
 989      module procedure mpp_recv_logical4_3d
 990      module procedure mpp_recv_logical4_4d
 991      module procedure mpp_recv_logical4_5d
 992   end interface
 993   !> Send data to a receiving PE.
 994   !!
 995   !> @param put_data scalar or array to get sent to a receiving PE
 996   !> @param put_len size of data to send from put_data
 997   !> @param to_pe PE number to send to
 998   !> @param block true for blocking, false for non-blocking. Defaults to true
 999   !> @param tag communication tag
1000   !> @param[out] request MPI request handle
1001   !! <br> Example usage:
1002   !! @code{.F90} call mpp_send(data, ie, pe) @endcode
1003   !> @ingroup mpp_mod
1004   interface mpp_send
1005      module procedure mpp_send_real8
1006      module procedure mpp_send_real8_scalar
1007      module procedure mpp_send_real8_2d
1008      module procedure mpp_send_real8_3d
1009      module procedure mpp_send_real8_4d
1010      module procedure mpp_send_real8_5d
1011 #ifdef OVERLOAD_C8
1012      module procedure mpp_send_cmplx8
1013      module procedure mpp_send_cmplx8_scalar
1014      module procedure mpp_send_cmplx8_2d
1015      module procedure mpp_send_cmplx8_3d
1016      module procedure mpp_send_cmplx8_4d
1017      module procedure mpp_send_cmplx8_5d
1018 #endif
1019      module procedure mpp_send_int8
1020      module procedure mpp_send_int8_scalar
1021      module procedure mpp_send_int8_2d
1022      module procedure mpp_send_int8_3d
1023      module procedure mpp_send_int8_4d
1024      module procedure mpp_send_int8_5d
1025      module procedure mpp_send_logical8
1026      module procedure mpp_send_logical8_scalar
1027      module procedure mpp_send_logical8_2d
1028      module procedure mpp_send_logical8_3d
1029      module procedure mpp_send_logical8_4d
1030      module procedure mpp_send_logical8_5d
1031
1032      module procedure mpp_send_real4
1033      module procedure mpp_send_real4_scalar
1034      module procedure mpp_send_real4_2d
1035      module procedure mpp_send_real4_3d
1036      module procedure mpp_send_real4_4d
1037      module procedure mpp_send_real4_5d
1038
1039 #ifdef OVERLOAD_C4
1040      module procedure mpp_send_cmplx4
1041      module procedure mpp_send_cmplx4_scalar
1042      module procedure mpp_send_cmplx4_2d
1043      module procedure mpp_send_cmplx4_3d
1044      module procedure mpp_send_cmplx4_4d
1045      module procedure mpp_send_cmplx4_5d
1046 #endif
1047      module procedure mpp_send_int4
1048      module procedure mpp_send_int4_scalar
1049      module procedure mpp_send_int4_2d
1050      module procedure mpp_send_int4_3d
1051      module procedure mpp_send_int4_4d
1052      module procedure mpp_send_int4_5d
1053      module procedure mpp_send_logical4
1054      module procedure mpp_send_logical4_scalar
1055      module procedure mpp_send_logical4_2d
1056      module procedure mpp_send_logical4_3d
1057      module procedure mpp_send_logical4_4d
1058      module procedure mpp_send_logical4_5d
1059   end interface
1060
1061
1062   !> @brief Perform parallel broadcasts
1063   !!
1064   !> The <TT>mpp_broadcast</TT> call has been added because the original
1065   !! syntax (using <TT>ALL_PES</TT> in <TT>mpp_transmit</TT>) did not
1066   !! support a broadcast across a pelist.
1067   !!
1068   !! <TT>MPP_TYPE_</TT> corresponds to any 4-byte and 8-byte variant of
1069   !! <TT>integer, real, complex, logical</TT> variables, of rank 0 or 1. A
1070   !! contiguous block from a multi-dimensional array may be passed by its
1071   !! starting address and its length, as in <TT>f77</TT>.
1072   !!
1073   !! Global broadcasts through the <TT>ALL_PES</TT> argument to
1074   !! @ref mpp_transmit are still provided for
1075   !! backward-compatibility.
1076   !!
1077   !! If <TT>pelist</TT> is omitted, the context is assumed to be the
1078   !! current pelist. <TT>from_pe</TT> must belong to the current
1079   !! pelist. This call implies synchronization across the PEs in
1080   !! <TT>pelist</TT>, or the current pelist if <TT>pelist</TT> is absent.
1081   !!
1082   !! <br>Example usage:
1083   !!
1084   !!            call mpp_broadcast( data, length, from_pe, pelist )
1085   !!
1086   !> @param[inout] data Data to broadcast
1087   !> @param length Length of data to broadcast
1088   !> @param from_pe PE to send the data from
1089   !> @param pelist List of PE's to broadcast across, if not provided uses current list
1090   !> @ingroup mpp_mod
1091   interface mpp_broadcast
1092      module procedure mpp_broadcast_char
1093      module procedure mpp_broadcast_real8
1094      module procedure mpp_broadcast_real8_scalar
1095      module procedure mpp_broadcast_real8_2d
1096      module procedure mpp_broadcast_real8_3d
1097      module procedure mpp_broadcast_real8_4d
1098      module procedure mpp_broadcast_real8_5d
1099 #ifdef OVERLOAD_C8
1100      module procedure mpp_broadcast_cmplx8
1101      module procedure mpp_broadcast_cmplx8_scalar
1102      module procedure mpp_broadcast_cmplx8_2d
1103      module procedure mpp_broadcast_cmplx8_3d
1104      module procedure mpp_broadcast_cmplx8_4d
1105      module procedure mpp_broadcast_cmplx8_5d
1106 #endif
1107      module procedure mpp_broadcast_int8
1108      module procedure mpp_broadcast_int8_scalar
1109      module procedure mpp_broadcast_int8_2d
1110      module procedure mpp_broadcast_int8_3d
1111      module procedure mpp_broadcast_int8_4d
1112      module procedure mpp_broadcast_int8_5d
1113      module procedure mpp_broadcast_logical8
1114      module procedure mpp_broadcast_logical8_scalar
1115      module procedure mpp_broadcast_logical8_2d
1116      module procedure mpp_broadcast_logical8_3d
1117      module procedure mpp_broadcast_logical8_4d
1118      module procedure mpp_broadcast_logical8_5d
1119
1120      module procedure mpp_broadcast_real4
1121      module procedure mpp_broadcast_real4_scalar
1122      module procedure mpp_broadcast_real4_2d
1123      module procedure mpp_broadcast_real4_3d
1124      module procedure mpp_broadcast_real4_4d
1125      module procedure mpp_broadcast_real4_5d
1126
1127 #ifdef OVERLOAD_C4
1128      module procedure mpp_broadcast_cmplx4
1129      module procedure mpp_broadcast_cmplx4_scalar
1130      module procedure mpp_broadcast_cmplx4_2d
1131      module procedure mpp_broadcast_cmplx4_3d
1132      module procedure mpp_broadcast_cmplx4_4d
1133      module procedure mpp_broadcast_cmplx4_5d
1134 #endif
1135      module procedure mpp_broadcast_int4
1136      module procedure mpp_broadcast_int4_scalar
1137      module procedure mpp_broadcast_int4_2d
1138      module procedure mpp_broadcast_int4_3d
1139      module procedure mpp_broadcast_int4_4d
1140      module procedure mpp_broadcast_int4_5d
1141      module procedure mpp_broadcast_logical4
1142      module procedure mpp_broadcast_logical4_scalar
1143      module procedure mpp_broadcast_logical4_2d
1144      module procedure mpp_broadcast_logical4_3d
1145      module procedure mpp_broadcast_logical4_4d
1146      module procedure mpp_broadcast_logical4_5d
1147   end interface
1148
1149   !#####################################################################
1150
1151   !> @brief Calculate parallel checksums
1152   !!
1153   !> \e mpp_chksum is a parallel checksum routine that returns an
1154   !! identical answer for the same array irrespective of how it has been
1155   !! partitioned across processors. \e int_kind is the KIND
1156   !! parameter corresponding to long integers (see discussion on
1157   !! OS-dependent preprocessor directives) defined in
1158   !! the file platform.F90. \e MPP_TYPE_ corresponds to any
1159   !! 4-byte and 8-byte variant of \e integer, \e real, \e complex, \e logical
1160   !! variables, of rank 0 to 5.
1161   !!
1162   !! Integer checksums on FP data use the F90 <TT>TRANSFER()</TT>
1163   !! intrinsic.
1164   !!
1165   !! This provides identical results on a single-processor job, and to perform
1166   !! serial checksums on a single processor of a parallel job, you only
1167   !! need to use the optional <TT>pelist</TT> argument.
1168   !! <PRE>
1169   !! use mpp_mod
1170   !! integer :: pe, chksum
1171   !! real :: a(:)
1172   !! pe = mpp_pe()
1173   !! chksum = mpp_chksum( a, (/pe/) )
1174   !! </PRE>
1175   !!
1176   !! The additional functionality of <TT>mpp_chksum</TT> over
1177   !! serial checksums is to compute the checksum across the PEs in
1178   !! <TT>pelist</TT>. The answer is guaranteed to be the same for
1179   !! the same distributed array irrespective of how it has been
1180   !! partitioned.
1181   !!
1182   !! If <TT>pelist</TT> is omitted, the context is assumed to be the
1183   !! current pelist. This call implies synchronization across the PEs in
1184   !! <TT>pelist</TT>, or the current pelist if <TT>pelist</TT> is absent.
1185   !! <br> Example usage:
1186   !!
1187   !!            mpp_chksum( var, pelist )
1188   !!
1189   !! @param var Data to calculate checksum of
1190   !! @param pelist Optional list of PE's to include in checksum calculation if not using
1191   !! current pelist
1192   !! @return Parallel checksum of var across given or implicit pelist
1193   !!
1194   !! Generic MPP_TYPE_ implentations:
1195   !! <li> @ref mpp_chksum_</li>
1196   !! <li> @ref mpp_chksum_int_</li>
1197   !! <li> @ref mpp_chksum_int_rmask_</li>
1198   !!
1199   !> @ingroup mpp_mod
1200   interface mpp_chksum
1201      module procedure mpp_chksum_i8_1d
1202      module procedure mpp_chksum_i8_2d
1203      module procedure mpp_chksum_i8_3d
1204      module procedure mpp_chksum_i8_4d
1205      module procedure mpp_chksum_i8_5d
1206      module procedure mpp_chksum_i8_1d_rmask
1207      module procedure mpp_chksum_i8_2d_rmask
1208      module procedure mpp_chksum_i8_3d_rmask
1209      module procedure mpp_chksum_i8_4d_rmask
1210      module procedure mpp_chksum_i8_5d_rmask
1211
1212      module procedure mpp_chksum_i4_1d
1213      module procedure mpp_chksum_i4_2d
1214      module procedure mpp_chksum_i4_3d
1215      module procedure mpp_chksum_i4_4d
1216      module procedure mpp_chksum_i4_5d
1217      module procedure mpp_chksum_i4_1d_rmask
1218      module procedure mpp_chksum_i4_2d_rmask
1219      module procedure mpp_chksum_i4_3d_rmask
1220      module procedure mpp_chksum_i4_4d_rmask
1221      module procedure mpp_chksum_i4_5d_rmask
1222
1223      module procedure mpp_chksum_r8_0d
1224      module procedure mpp_chksum_r8_1d
1225      module procedure mpp_chksum_r8_2d
1226      module procedure mpp_chksum_r8_3d
1227      module procedure mpp_chksum_r8_4d
1228      module procedure mpp_chksum_r8_5d
1229
1230      module procedure mpp_chksum_r4_0d
1231      module procedure mpp_chksum_r4_1d
1232      module procedure mpp_chksum_r4_2d
1233      module procedure mpp_chksum_r4_3d
1234      module procedure mpp_chksum_r4_4d
1235      module procedure mpp_chksum_r4_5d
1236 #ifdef OVERLOAD_C8
1237      module procedure mpp_chksum_c8_0d
1238      module procedure mpp_chksum_c8_1d
1239      module procedure mpp_chksum_c8_2d
1240      module procedure mpp_chksum_c8_3d
1241      module procedure mpp_chksum_c8_4d
1242      module procedure mpp_chksum_c8_5d
1243 #endif
1244 #ifdef OVERLOAD_C4
1245      module procedure mpp_chksum_c4_0d
1246      module procedure mpp_chksum_c4_1d
1247      module procedure mpp_chksum_c4_2d
1248      module procedure mpp_chksum_c4_3d
1249      module procedure mpp_chksum_c4_4d
1250      module procedure mpp_chksum_c4_5d
1251 #endif
1252   end interface
1253
1254 !> @addtogroup mpp_mod
1255 !> @{
1256 !***********************************************************************
1257 !
1258 !            module variables
1259 !
1260 !***********************************************************************
1261   integer, parameter   :: PESET_MAX = 10000
1262   integer              :: current_peset_max = 32
1263   type(communicator), allocatable :: peset(:) !< Will be allocated starting from 0, 0 is a dummy used
1264                                               !! to hold single-PE "self" communicator
1265   logical              :: module_is_initialized = .false.
1266   logical              :: debug = .false.
1267   integer              :: npes=1, root_pe=0, pe=0
1268   integer(i8_kind)     :: tick, ticks_per_sec, max_ticks, start_tick, end_tick, tick0=0
1269   integer              :: mpp_comm_private
1270   logical              :: first_call_system_clock_mpi=.TRUE.
1271   real(r8_kind)        :: mpi_count0=0  !< use to prevent integer overflow
1272   real(r8_kind)        :: mpi_tick_rate=0.d0  !< clock rate for mpi_wtick()
1273   logical              :: mpp_record_timing_data=.TRUE.
1274   type(clock),save     :: clocks(MAX_CLOCKS)
1275   integer              :: log_unit, etc_unit
1276   integer              :: warn_unit !< unit number of the warning log
1277   character(len=32), parameter    :: configfile='logfile'
1278   character(len=32), parameter    :: warnfile='warnfile' !< base name for warninglog (appends ".<PE>.out")
1279   integer              :: peset_num=0, current_peset_num=0
1280   integer              :: world_peset_num                  !<the world communicator
1281   integer              :: error
1282   integer              :: clock_num=0, num_clock_ids=0,current_clock=0, previous_clock(MAX_CLOCKS)=0
1283   real                 :: tick_rate
1284
1285   type(mpp_type_list)    :: datatypes
1286   type(mpp_type), target :: mpp_byte
1287
1288   integer              :: cur_send_request = 0
1289   integer              :: cur_recv_request = 0
1290   integer, allocatable :: request_send(:)
1291   integer, allocatable :: request_recv(:)
1292   integer, allocatable :: size_recv(:)
1293   integer, allocatable :: type_recv(:)
1294 ! if you want to save the non-root PE information uncomment out the following line
1295 ! and comment out the assigment of etcfile to '/dev/null'
1296 #ifdef NO_DEV_NULL
1297   character(len=32)    :: etcfile='._mpp.nonrootpe.msgs'
1298 #else
1299   character(len=32)    :: etcfile='/dev/null'
1300 #endif
1301
1302 !> Use the intrinsics in iso_fortran_env
1303   integer :: in_unit=INPUT_UNIT, out_unit=OUTPUT_UNIT, err_unit=ERROR_UNIT
1304   integer :: stdout_unit
1305
1306   !--- variables used in mpp_util.h
1307   type(Summary_Struct) :: clock_summary(MAX_CLOCKS)
1308   logical              :: warnings_are_fatal = .FALSE.
1309   integer              :: error_state=0
1310   integer              :: clock_grain=CLOCK_LOOP-1
1311
1312   !--- variables used in mpp_comm.h
1313   integer            :: clock0    !<measures total runtime from mpp_init to mpp_exit
1314   integer            :: mpp_stack_size=0, mpp_stack_hwm=0
1315   logical            :: verbose=.FALSE.
1316
1317   integer :: get_len_nocomm = 0 !< needed for mpp_transmit_nocomm.h
1318
1319   !--- variables used in mpp_comm_mpi.inc
1320   integer, parameter :: mpp_init_test_full_init = -1
1321   integer, parameter :: mpp_init_test_init_true_only = 0
1322   integer, parameter :: mpp_init_test_peset_allocated = 1
1323   integer, parameter :: mpp_init_test_clocks_init = 2
1324   integer, parameter :: mpp_init_test_datatype_list_init = 3
1325   integer, parameter :: mpp_init_test_logfile_init = 4
1326   integer, parameter :: mpp_init_test_read_namelist = 5
1327   integer, parameter :: mpp_init_test_etc_unit = 6
1328   integer, parameter :: mpp_init_test_requests_allocated = 7
1329
1330 !> MPP_INFO_NULL acts as an analagous mpp-macro for MPI_INFO_NULL to share with fms2_io NetCDF4
1331 !! mpi-io.  The default value for the no-mpi case comes from Intel MPI and MPICH.  OpenMPI sets
1332 !! a default value of '0'
1333 #if defined(use_libMPI)
1334   integer, parameter ::  MPP_INFO_NULL = MPI_INFO_NULL
1335 #else
1336   integer, parameter ::  MPP_INFO_NULL = 469762048
1337 #endif
1338
1339 !> MPP_COMM_NULL acts as an analagous mpp-macro for MPI_COMM_NULL to share with fms2_io NetCDF4
1340 !! mpi-io.  The default value for the no-mpi case comes from Intel MPI and MPICH.  OpenMPI sets
1341 !! a default value of '2'
1342 #if defined(use_libMPI)
1343   integer, parameter ::  MPP_COMM_NULL = MPI_COMM_NULL
1344 #else
1345   integer, parameter ::  MPP_COMM_NULL = 67108864
1346 #endif
1347
1348 !***********************************************************************
1349 !  variables needed for subroutine read_input_nml (include/mpp_util.inc)
1350 !
1351 ! public variable needed for reading input nml file from an internal file
1352   character(len=:), dimension(:), allocatable, target, public :: input_nml_file
1353   logical :: read_ascii_file_on = .FALSE.
1354 !***********************************************************************
1355
1356 ! Include variable "version" to be written to log file.
1357 #include<file_version.h>
1358   public version
1359
1360   integer, parameter :: MAX_REQUEST_MIN  = 10000
1361   integer            :: request_multiply = 20
1362
1363   logical :: etc_unit_is_stderr = .false.
1364   integer :: max_request = 0
1365   logical :: sync_all_clocks = .false.
1366   namelist /mpp_nml/ etc_unit_is_stderr, request_multiply, mpp_record_timing_data, sync_all_clocks
1367
1368   contains
1369 #include <system_clock.fh>
1370 #include <mpp_util.inc>
1371 #include <mpp_comm.inc>
1372
1373   end module mpp_mod
1374 !> @}
1375 ! close documentation grouping