Code installation
suo-yang edited this page on Apr 29, 2019 · 10 revisions
Before installing NGA, ensure that all libraries are properly installed. Remember to source the module file that matches your MPI implementation.
If using Open MPI:
source NGA_modules
If using Intel MPI:
source NGA_modules_impi
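The choice between the two module files can also be scripted. The following is a minimal sketch, not part of NGA: the helper name nga_module_file, the flavor labels openmpi and impi, and the NGA_MPI variable are all hypothetical, and the module files are assumed to sit in the current directory.

```shell
# Hypothetical helper: map an MPI flavor to the module file named on this page.
nga_module_file() {
    case "$1" in
        openmpi) echo "NGA_modules" ;;
        impi)    echo "NGA_modules_impi" ;;
        *)       echo "unknown MPI flavor: $1" >&2; return 1 ;;
    esac
}

# NGA_MPI is an assumed, user-set variable; Open MPI is the default here.
flavor="${NGA_MPI:-openmpi}"
modfile="$(nga_module_file "$flavor")"

# Source the file only if it exists in the current directory.
if [ -f "./$modfile" ]; then
    . "./$modfile"
else
    echo "module file '$modfile' not found in $(pwd)" >&2
fi
```

Sourcing (rather than executing) the file matters because the loaded modules must persist in the current shell.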
Downloading NGA
To download a clean copy of NGA from the server, execute the command
git clone https://github.umn.edu/CRFEL/NGA.git
To download a clean copy of NGA from the server into a directory other than “NGA”, execute the command
git clone https://github.umn.edu/CRFEL/NGA.git my_NGA
If you already have NGA and simply need to update your version with the most recent changes from the server, execute the command
git pull
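The clone-or-update decision above can be sketched as a small helper. This is an illustration under assumptions: the function name nga_fetch_cmd is hypothetical, and it prints the git command instead of running it so the command can be checked first.

```shell
# Repository URL as given on this page.
NGA_URL="https://github.umn.edu/CRFEL/NGA.git"

# Hypothetical helper: print the git command appropriate for a target directory.
nga_fetch_cmd() {
    dir="${1:-NGA}"
    if [ -d "$dir/.git" ]; then
        # An existing clone only needs the most recent changes.
        echo "git -C $dir pull"
    else
        # No clone yet: download a fresh copy into $dir.
        echo "git clone $NGA_URL $dir"
    fi
}

nga_fetch_cmd my_NGA
```

Note that git pull merges the server changes into your working copy, so local modifications to files such as Makefile.in may need to be committed or stashed first.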
Compiling NGA
Update Makefile.in
In the NGA src directory, open the file Makefile.in. You will need to change the paths to the libraries: set the five entries BLAS_DIR, LAPACK_DIR, HYPRE_DIR, FFTW_DIR, and SUNDIALS_DIR to the directories in which you installed these libraries. Write out the full path to each directory, e.g. on MSI with Open MPI, /home/suo-yang/username/opt/fftw and not ~/opt/fftw; otherwise the compiler may fail.
Important notice: please write out the full path to fftw-openmp and hypre-openmp.
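As an illustration of the full-path requirement, the sketch below rewrites the five entries in a throwaway sample file. Several things here are assumptions rather than facts about NGA: that the entries have the form "NAME = value", the helper name set_lib_dir, and the install directory names under $HOME/opt (only fftw-openmp and hypre-openmp are mentioned on this page).

```shell
# Work on a throwaway sample so the real Makefile.in is untouched.
# Assumption: entries look like "NAME = value"; check your actual file.
workdir="$(mktemp -d)"
cat > "$workdir/Makefile.in" <<'EOF'
BLAS_DIR = ~/opt/blas
LAPACK_DIR = ~/opt/lapack
HYPRE_DIR = ~/opt/hypre-openmp
FFTW_DIR = ~/opt/fftw-openmp
SUNDIALS_DIR = ~/opt/sundials
EOF

# Hypothetical helper: set NAME to an absolute path, refusing anything else.
set_lib_dir() {
    file="$1"; name="$2"; path="$3"
    case "$path" in
        /*) ;;
        *)  echo "refusing non-absolute path for $name: $path" >&2; return 1 ;;
    esac
    # "|" is the sed delimiter because paths contain "/".
    sed "s|^${name}[[:space:]]*=.*|${name} = ${path}|" "$file" > "$file.tmp" &&
        mv "$file.tmp" "$file"
}

# "$HOME" expands to a full path, unlike a literal "~" written in the file.
prefix="$HOME/opt"   # illustrative install prefix
set_lib_dir "$workdir/Makefile.in" BLAS_DIR     "$prefix/blas"
set_lib_dir "$workdir/Makefile.in" LAPACK_DIR   "$prefix/lapack"
set_lib_dir "$workdir/Makefile.in" HYPRE_DIR    "$prefix/hypre-openmp"
set_lib_dir "$workdir/Makefile.in" FFTW_DIR     "$prefix/fftw-openmp"
set_lib_dir "$workdir/Makefile.in" SUNDIALS_DIR "$prefix/sundials"

cat "$workdir/Makefile.in"
```

The tilde is expanded by the shell, not by make or the compiler, which is why a literal ~ inside Makefile.in is not reliably treated as your home directory.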
Additional steps may be necessary depending on where you are installing NGA and what compiler version you are using:
MSI (mesabi)
Remark: Whether to use mpif90 or mpifort depends on which one is available; they probably perform similarly when both are available.
For Intel MPI, you need to make the following changes:
CC = mpiicc #mpicc
CXX = mpiicpc #mpicxx
F90 = mpiifort #mpif90
LD = mpiifort #mpif90
The Skylake processors on the new mesabi addition (mangi) support 512-bit vector extensions, which could make NGA run faster. To compile using these extensions, set the following two lines in Makefile.in as:
LDFLAGS = -qopenmp -xCORE-AVX512
OPTFLAGS = -qopenmp -O3 -xCORE-AVX512 -ip -qoverride-limits
However, the current mesabi does not support 512-bit vector extensions, so please remove -xCORE-AVX512 from both LDFLAGS and OPTFLAGS, and use -xhost instead:
LDFLAGS = -qopenmp -xhost
OPTFLAGS = -qopenmp -O3 -xhost -ip -qoverride-limits
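If it is unclear which of the two flag sets applies to a given node, the CPU flags reported by Linux can help decide. The sketch below is an assumption-laden illustration: the helper name pick_vec_flag is hypothetical, and it relies on the Linux kernel listing avx512f in /proc/cpuinfo on processors with AVX-512 support.

```shell
# Hypothetical helper: choose the Intel vectorization flag from a cpuinfo file.
# Defaults to /proc/cpuinfo (Linux); a path argument allows trying a sample file.
pick_vec_flag() {
    cpuinfo="${1:-/proc/cpuinfo}"
    # "avx512f" is the AVX-512 Foundation flag as reported by the Linux kernel.
    if grep -qw avx512f "$cpuinfo" 2>/dev/null; then
        echo "-xCORE-AVX512"
    else
        echo "-xhost"
    fi
}

# Print the two Makefile.in lines for the node this runs on.
flag="$(pick_vec_flag)"
echo "LDFLAGS = -qopenmp $flag"
echo "OPTFLAGS = -qopenmp -O3 $flag -ip -qoverride-limits"
```

Run the check on the node type where NGA will execute, since login and compute nodes may have different processors.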
Also change these fields:
AR = xiar rcv
DBGFLAGS= -qopenmp -O0 -g -CA -CB -CS -traceback -debug all -ftrapuv -check noarg_temp_created -WB -warn none
Remarks: (1) -CB checks for out-of-bounds array accesses. (2) -ftrapuv stops the code if it encounters a division by zero, which is very useful.
You must also comment out use mpi in library/parallel.f90 and uncomment include 'mpif.h'.
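The swap from use mpi to include 'mpif.h' can be done by hand in an editor or, as sketched here, with sed. The sample file below is an assumption about how the two lines appear in library/parallel.f90 (spelling, case, and indentation may differ in the real file), so inspect the file before adapting the patterns.

```shell
# Work on a throwaway sample; the layout of the real library/parallel.f90
# is an assumption here and may differ.
workdir="$(mktemp -d)"
src="$workdir/parallel.f90"
cat > "$src" <<'EOF'
module parallel
  use MPI
  implicit none
  !include 'mpif.h'
end module parallel
EOF

# 1st expression: prefix a line consisting only of "use mpi" (any case) with "!".
# 2nd expression: strip the leading "!" from the include line.
# The "." on each side of mpif\.h matches the quote character, which keeps
# the whole sed script inside single quotes.
sed -e 's/^\([[:space:]]*\)\(use [Mm][Pp][Ii][[:space:]]*\)$/\1!\2/' \
    -e 's/^\([[:space:]]*\)![[:space:]]*\(include .mpif\.h.\)/\1\2/' \
    "$src" > "$src.tmp" && mv "$src.tmp" "$src"

cat "$src"
```

Writing to a temporary file and moving it into place avoids depending on the in-place option of sed, whose syntax differs between GNU and BSD versions.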
New Tiger (tigercpu)
The changes in this section have already been applied to Makefile.in in the latest version of NGA, but are left for posterity until after fully transitioning to tigercpu.
Remark: Whether to use mpif90 or mpifort depends on which one is available; they probably perform similarly when both are available.
The Skylake processors on the new Tiger support 512-bit vector extensions, which could make NGA run faster. To compile using these extensions, set the following two lines in Makefile.in as:
LDFLAGS = -qopenmp -xCORE-AVX512
OPTFLAGS = -qopenmp -O3 -xCORE-AVX512
Remarks: (1) With Intel 16, -O3 and -qoverride-limits could result in a segmentation fault for some cases (e.g., DNS Box), but -O0 to -O2 still work. (2) -ipo cannot find its required files on New Tiger (tigercpu) and could result in warnings, but -ip may work.
Also change these fields:
AR = xiar rcv
DBGFLAGS= -qopenmp -O0 -g -CA -CB -CS -traceback -debug all -ftrapuv -check noarg_temp_created -WB -warn none
Remarks: (1) -CB checks for out-of-bounds array accesses. (2) -ftrapuv stops the code if it encounters a division by zero, which is very useful. However, for the DNS Box case, -ftrapuv reports such an error when calling HYPRE_StructGMRESSetup and stops the code. Since this is an internal subroutine of the hypre library, we can only see the input, so there is no easy way to debug it and determine the cause. This bug remains unresolved.
You must also comment out use mpi in library/parallel.f90 and uncomment include 'mpif.h'.
Old Tiger (tiger)
AS OF 08/2016: If you have installed the above libraries using the latest Intel compilers (intel/16.0/64/16.0.2.181 on Tiger as of 08/16), you will need to modify the following fields in Makefile.in:
F90 = mpifort
LD = mpifort
AR = xiar rcv
LDFLAGS = -qopenmp
DBGFLAGS = -qopenmp -O0 -g -CA -CB -CS -CV -traceback -debug all -ftrapuv -check noarg_temp_created -WB -warn none
OPTFLAGS = -qopenmp -O3 -xhost -ipo -qoverride-limits -nowarn
If you are using the openmpi/intel-16.0 module on Tiger, you must comment out use mpi in library/parallel.f90 and uncomment include 'mpif.h'. Otherwise you will receive error messages such as:
parallel.f90(116): error #6285: There is no matching specific subroutine for this generic subroutine call. [MPI_ALLREDUCE]
You will also need to add the lines module load intel/16.0 and module load openmpi/intel-16.0 to your slurm script.
NERSC Clusters
If you are installing on Cori or Edison at NERSC, switch to the ‘cori’ branch, which already has the settings for these machines. In addition, load the module for fftw; otherwise the compiler will fail.
Compiling
If on a cluster, use the module list command to show which modules you currently have loaded. If these differ from the modules you loaded when installing the libraries for NGA, use the module purge command and then module load the correct modules, as described in the page on [libraries](library installation).
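The comparison between loaded and required modules can also be automated. The sketch below assumes the cluster uses Environment Modules or Lmod, both of which maintain the colon-separated LOADEDMODULES variable. The helper name missing_modules is hypothetical, and the module names are the ones quoted in the Old Tiger section, used only as an example.

```shell
# Hypothetical helper: print each required module that is not currently loaded.
# LOADEDMODULES is the colon-separated list maintained by Environment Modules
# and Lmod; it is empty or unset when nothing is loaded.
missing_modules() {
    for m in "$@"; do
        case ":${LOADEDMODULES:-}:" in
            *":$m:"*) ;;        # already loaded
            *) echo "$m" ;;     # still needs "module load"
        esac
    done
}

# Example module names from the Old Tiger section; substitute your own.
missing_modules intel/16.0 openmpi/intel-16.0
```

The match is on exact names, so a module that was loaded by a short name but is recorded with its full version string would be reported as missing; compare against the names that module list prints.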
Compilation of NGA then occurs in one single step. To compile the code for production runs (read: faster code, longer compilation), execute the command
make opt
Alternatively, to compile the code for debugging (read: slower code, shorter compilation), for which additional diagnostics will be printed to the screen for debugging purposes, execute the command
make debug
Unless you have a specific need, the code should always be compiled in opt mode.
If compilation of the code was successful, the NGA bin directory should contain several executables, notably, arts and init_flow.
Remark: init_flow and arts must be built with the same compiler flags; otherwise arts will read the initial flow misaligned and incorrectly (e.g., all zeros). If this happens, you may need to run make distclean and recompile.
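A quick post-build check for the two executables named above can be sketched as follows. The helper name check_nga_bins and the relative path NGA/bin are assumptions for illustration; point it at wherever your NGA bin directory actually lives.

```shell
# Hypothetical helper: verify that the executables named on this page exist
# and are executable in the given bin directory.
check_nga_bins() {
    bindir="$1"
    status=0
    for exe in arts init_flow; do
        if [ -x "$bindir/$exe" ]; then
            echo "ok: $exe"
        else
            echo "missing: $exe"
            status=1
        fi
    done
    return "$status"
}

# Example call; "NGA/bin" is an assumed location relative to the current directory.
check_nga_bins NGA/bin || echo "build incomplete: consider make distclean, then make opt"
```

This only confirms that both files exist; it cannot tell whether they were built with the same compiler flags, so a clean rebuild remains the safe option after changing Makefile.in.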
Check the pages on Code execution and Running on Tiger to see what to do next.