Using parallel IO through the netCDF-4 interface (io_form = 13)

If you don't need variable-level compression, stop and use pnetcdf
(parallel-netcdf-1.9.0) instead, which will have better IO performance.
(You should also be using a parallel file system to see any benefit.)

To use parallel netcdf-4, set the environment variable NETCDFPAR to the
directory containing the lib and include directories, e.g.,

  setenv NETCDFPAR /usr/local/netcdf474par

(This also causes configure to set NETCDF = NETCDFPAR to prevent
conflicting libraries, and forces NETCDF4=1 and USENETCDFPAR=1.)

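For sh/bash users, the same setup can be sketched as below. This is only a
minimal sketch, not part of the WRF build system: the helper name
check_netcdfpar is hypothetical, and the path is the example location used
above. It checks nothing more than the presence of the lib and include
subdirectories.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the given directory contains
# both lib/ and include/ subdirectories.
check_netcdfpar() {
    dir="$1"
    if [ -d "$dir/lib" ] && [ -d "$dir/include" ]; then
        return 0
    fi
    echo "NETCDFPAR=$dir is missing lib/ or include/" >&2
    return 1
}

# sh/bash equivalent of the csh "setenv" line (example path from this README).
NETCDFPAR=/usr/local/netcdf474par
export NETCDFPAR

# Report the result without aborting the shell.
if check_netcdfpar "$NETCDFPAR"; then
    echo "NETCDFPAR looks usable: $NETCDFPAR"
fi
```

Run this before configure so that a wrong path is caught early.
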
The code assumes you want compression turned on, and parallel writes of
compressed data require netcdf-c version 4.7.4 or later. (Otherwise, just
use pnetcdf, since it is faster.) This in turn requires HDF5 1.10.3 or
later. Netcdf-c can be built with or without pnetcdf enabled, but pnetcdf
is not used here through the netcdf-4 interface. (There is a separate IO
option for PNETCDF that can be used.)

Usage: set io_form to 13 and turn off colons in the output filenames
(as for pnetcdf).

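As an illustration, the relevant namelist.input entries might look like the
fragment below. This is a sketch that assumes the usual WRF &time_control
variable names (io_form_history, io_form_restart, nocolons); which streams
you actually switch to 13 depends on your run.

```
&time_control
 io_form_history = 13,
 io_form_restart = 13,
 nocolons        = .true.,
/
```
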
Tests for development used the following:

  parallel-netcdf 1.9.0  (--enable-relax-coord-bound --disable-cxx)
  HDF5 1.10.7            (--enable-fortran --enable-parallel)
  netcdf-c 4.7.4         (--enable-netcdf-4 --enable-pnetcdf --disable-dap)
  netcdf-fortran 4.5.3   (--enable-parallel-tests)

Other options as needed: FC=mpif90 F90=mpif90 CC=mpicc F77=mpif90

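The build order implied by the list above can be sketched as a dry run that
only prints configure commands. Everything beyond the listed flags is an
assumption: PREFIX is a hypothetical single install location, the
CPPFLAGS/LDFLAGS settings are the usual autoconf way to point at it, and
individual packages may expect different compiler variables (check each
package's own install notes).

```shell
#!/bin/sh
# Dry run: print the configure commands in dependency order instead of
# running them. PREFIX is a hypothetical install location.
PREFIX="${PREFIX:-/usr/local/netcdf474par}"
MPI_COMPILERS="CC=mpicc FC=mpif90 F90=mpif90 F77=mpif90"

print_configure() {
    # $1 = package label, remaining arguments = configure flags
    pkg="$1"
    shift
    echo "# $pkg"
    echo "./configure --prefix=$PREFIX $MPI_COMPILERS $*"
}

# pnetcdf and HDF5 must be installed before netcdf-c; netcdf-fortran is last.
print_configure "parallel-netcdf 1.9.0" --enable-relax-coord-bound --disable-cxx
print_configure "HDF5 1.10.7" --enable-fortran --enable-parallel
print_configure "netcdf-c 4.7.4" --enable-netcdf-4 --enable-pnetcdf --disable-dap \
    "CPPFLAGS=-I$PREFIX/include" "LDFLAGS=-L$PREFIX/lib"
print_configure "netcdf-fortran 4.5.3" --enable-parallel-tests \
    "CPPFLAGS=-I$PREFIX/include" "LDFLAGS=-L$PREFIX/lib"
```

Printing rather than executing keeps the sketch safe to run anywhere; copy
the lines into each source directory once they look right for your system.
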
The IO form for parallel netcdf-4 output is 13 (io_netcdfpar=13 in the
Registry).

Performance seems to vary with how 'regular' the domain decomposition is
(i.e., patch size). Some experimentation with manually setting the
decomposition may be needed for optimal write times. Also pay attention to
file system striping (Lustre): the number of stripes should not exceed the
number of nodes used by the job.
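
The striping guideline can be sketched as a small helper that caps the
stripe count at the node count. The function name stripe_count, the example
numbers, and the OUTDIR directory are all hypothetical; the script only
prints the lfs setstripe command so it is harmless on systems without
Lustre.

```shell
#!/bin/sh
# Hypothetical helper: cap a requested Lustre stripe count at the number
# of nodes in the job, following the guideline above.
stripe_count() {
    requested="$1"
    nodes="$2"
    if [ "$requested" -gt "$nodes" ]; then
        echo "$nodes"
    else
        echo "$requested"
    fi
}

# Example values (assumptions): 16 stripes requested, job runs on 4 nodes.
count=$(stripe_count 16 4)

# Print the command rather than running it; "lfs setstripe -c" sets the
# stripe count for new files created in a directory. OUTDIR is hypothetical.
OUTDIR="${OUTDIR:-./wrfout_dir}"
echo "lfs setstripe -c $count $OUTDIR"
```

Striping is applied to files created after the setting is made, so set it
on the output directory before the run starts.
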