Code installation

suo-yang edited this page on Apr 29, 2019 · 10 revisions

Before installing NGA, ensure that all libraries are properly installed. Remember to source the module file that matches your MPI implementation:

If using Open MPI, then

  source NGA_modules

If using Intel MPI, then

  source NGA_modules_impi

Downloading NGA

To download a clean copy of NGA from the server, execute the command

git clone https://github.umn.edu/CRFEL/NGA.git

To download a clean copy of NGA from the server into a directory other than “NGA”, execute the command

git clone https://github.umn.edu/CRFEL/NGA.git my_NGA

If you already have NGA and simply need to update your version with the most recent changes from the server, execute the command

git pull
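
The clone and update commands above can be folded into one helper. The following is a minimal sketch and not part of NGA: the function name get_nga and its default directory are illustrative assumptions, and only the repository URL comes from this page.

```shell
#!/usr/bin/env bash
# Sketch: clone NGA on first use, update it on later runs.
# NGA_URL is the URL given on this page; get_nga is a hypothetical helper.
NGA_URL="https://github.umn.edu/CRFEL/NGA.git"

get_nga() {
    local dest="${1:-NGA}"        # target directory, defaults to "NGA"
    local url="${2:-$NGA_URL}"    # repository to clone from
    if [ -d "$dest/.git" ]; then
        # An existing clone was found: pull the most recent changes.
        git -C "$dest" pull
    else
        # No clone yet: make a fresh copy in $dest.
        git clone "$url" "$dest"
    fi
}
```

For example, get_nga my_NGA clones into my_NGA the first time and runs git pull there on later calls.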

Compiling NGA

Update Makefile.in

In the NGA src directory, open the file Makefile.in. What you will need to change are the paths to the libraries. Change the five entries BLAS_DIR, LAPACK_DIR, HYPRE_DIR, FFTW_DIR, and SUNDIALS_DIR to the directories in which you installed these libraries. Write out the full path to each directory, e.g. if you are on MSI and use Open MPI, /home/suo-yang/username/opt/fftw and not ~/opt/fftw; otherwise the compiler may fail.

Important notice: please write out the full paths to fftw-openmp and hypre-openmp.
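
A quick way to catch a ~ or relative path before compiling is to scan Makefile.in for the library entries. This is only a sketch: it assumes each entry is written as NAME = /path on its own line, which may not match your copy of Makefile.in, and the function name check_lib_dirs is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: flag library paths in Makefile.in that are not absolute.
# Assumes entries look like "FFTW_DIR = /some/path" (an assumption about
# the file layout); check_lib_dirs is a hypothetical helper.
check_lib_dirs() {
    local makefile="${1:-Makefile.in}"
    local rc=0 name value
    for name in BLAS_DIR LAPACK_DIR HYPRE_DIR FFTW_DIR SUNDIALS_DIR; do
        # Text after "=" on the first line that defines this entry.
        value=$(sed -n "s/^[[:space:]]*${name}[[:space:]]*=[[:space:]]*//p" "$makefile" | head -n 1)
        # Keep only the first whitespace-delimited token (drops trailing text).
        value="${value%%[[:space:]]*}"
        case "$value" in
            /*) echo "ok:  $name = $value" ;;
            "") echo "bad: $name is missing or empty"; rc=1 ;;
            *)  echo "bad: $name = $value (not an absolute path)"; rc=1 ;;
        esac
    done
    return "$rc"
}
```

Running check_lib_dirs src/Makefile.in prints one line per entry and returns a non-zero status if any path does not start with /.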

Additional steps may be necessary depending on where you are installing NGA and what compiler version you are using:

MSI (mesabi)

Remark: Whether to use mpif90 or mpifort depends on which one is available; they probably perform similarly when both are available.
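
To see which wrapper is present in the current environment, a small lookup can be used. This is a sketch with a hypothetical function name, first_available; it only checks that a command exists on PATH and says nothing about performance.

```shell
#!/usr/bin/env bash
# Sketch: print the first command from the argument list that exists on PATH.
# first_available is a hypothetical helper, not part of NGA.
first_available() {
    local candidate
    for candidate in "$@"; do
        if command -v "$candidate" >/dev/null 2>&1; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1   # none of the candidates was found
}
```

For example, first_available mpifort mpif90 prints whichever wrapper is found first, and the result can then be used for F90 and LD in Makefile.in.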

For Intel MPI, you need to make the following changes:

    CC  = mpiicc #mpicc
    CXX = mpiicpc #mpicxx
    F90 = mpiifort #mpif90
    LD  = mpiifort #mpif90

The Skylake processors on the new mesabi addition (mangi) support 512-bit vector extensions, which could make NGA run faster. To compile using these extensions, set the following two lines in Makefile.in as:

    LDFLAGS  = -qopenmp -xCORE-AVX512
    OPTFLAGS = -qopenmp -O3 -xCORE-AVX512 -ip -qoverride-limits

However, the current mesabi does not support 512-bit vector extensions, so remove -xCORE-AVX512 from both LDFLAGS and OPTFLAGS and use -xhost instead:

    LDFLAGS  = -qopenmp -xhost
    OPTFLAGS = -qopenmp -O3 -xhost -ip -qoverride-limits
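
Which of the two settings applies depends on the CPU. On Linux, the kernel lists AVX-512 Foundation support as the avx512f flag in /proc/cpuinfo, so a small check can suggest the flag. This is a sketch with a hypothetical function name, vector_flag; run it on the node type that will execute NGA, since login and compute nodes may differ.

```shell
#!/usr/bin/env bash
# Sketch: suggest a vectorization flag from the CPU feature list (Linux only).
# vector_flag is a hypothetical helper; the optional argument exists so the
# check can be pointed at a saved copy of /proc/cpuinfo.
vector_flag() {
    local cpuinfo="${1:-/proc/cpuinfo}"
    if grep -qw avx512f "$cpuinfo" 2>/dev/null; then
        printf '%s\n' "-xCORE-AVX512"   # CPU reports AVX-512 Foundation
    else
        printf '%s\n' "-xhost"          # fall back to the host's best ISA
    fi
}
```

The printed flag is the value to place in both LDFLAGS and OPTFLAGS.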

Also change these fields:

    AR  = xiar rcv
    DBGFLAGS= -qopenmp -O0 -g -CA -CB -CS -traceback -debug all -ftrapuv -check noarg_temp_created -WB -warn none

Remark: (1) -CB checks for out-of-bounds array accesses. (2) -ftrapuv stops the code if it encounters a division by zero, which is very useful.

You also must comment out use mpi in library/parallel.f90 and uncomment include 'mpif.h'.
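
The edit to parallel.f90 can also be scripted. The sketch below assumes the file contains lines reading exactly use mpi and a commented-out !include 'mpif.h' (with ! as the comment marker); if your copy is written differently, edit it by hand instead. The function name use_mpif_h is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: switch a Fortran source file from "use mpi" to include 'mpif.h'.
# Assumes lines of the form "use mpi" and "!include 'mpif.h'"; this layout
# is an assumption about library/parallel.f90, so review the result.
use_mpif_h() {
    local file="${1:-library/parallel.f90}"
    # -i.bak edits in place and keeps the original as <file>.bak.
    # In the second pattern, "." stands in for each single quote.
    sed -i.bak \
        -e 's/^\([[:space:]]*\)use mpi[[:space:]]*$/\1!use mpi/' \
        -e 's/^\([[:space:]]*\)![[:space:]]*\(include .mpif\.h.\)/\1\2/' \
        "$file"
}
```

After running use_mpif_h, compare the file against the .bak copy with diff to confirm that only the intended lines changed.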

New Tiger (tigercpu)

The changes in this section have already been applied to Makefile.in in the latest version of NGA, but are left for posterity until after fully transitioning to tigercpu.

Remark: Whether to use mpif90 or mpifort depends on which one is available; they probably perform similarly when both are available.

The Skylake processors on the new Tiger support 512-bit vector extensions, which could make NGA run faster. To compile using these extensions, set the following two lines in Makefile.in as:

    LDFLAGS  = -qopenmp -xCORE-AVX512
    OPTFLAGS = -qopenmp -O3 -xCORE-AVX512

Remarks: (1) With Intel 16, -O3 and -qoverride-limits can cause a segmentation fault in some cases (e.g., DNS Box), but -O0 to -O2 still work. (2) -ipo cannot find its required files on New Tiger (tigercpu) and can produce warnings, but -ip may work.

Also change these fields:

    AR  = xiar rcv
    DBGFLAGS= -qopenmp -O0 -g -CA -CB -CS -traceback -debug all -ftrapuv -check noarg_temp_created -WB -warn none

Remark: (1) -CB checks for out-of-bounds array accesses. (2) -ftrapuv stops the code if it encounters a division by zero, which is very useful. However, for the DNS Box case, -ftrapuv reports such an error when calling HYPRE_StructGMRESSetup and stops the code. Since this is an internal subroutine of the hypre library, we can only see its input, so there is no easy way to debug it and find the cause. This bug remains open, awaiting a future fix.

You also must comment out use mpi in library/parallel.f90 and uncomment include 'mpif.h'.

Old Tiger (tiger)

AS OF 08/2016: If you have installed the above libraries using the latest Intel compilers (intel/16.0/64/16.0.2.181 on Tiger as of 08/16), you will need to modify the following fields in Makefile.in:

    F90 = mpifort
    LD  = mpifort
    AR  = xiar rcv
    LDFLAGS  = -qopenmp
    DBGFLAGS = -qopenmp -O0 -g -CA -CB -CS -CV -traceback -debug all -ftrapuv -check noarg_temp_created -WB -warn none
    OPTFLAGS = -qopenmp -O3 -xhost -ipo -qoverride-limits -nowarn

If you are using the openmpi/intel-16.0 module on Tiger you must comment out use mpi in library/parallel.f90 and uncomment include 'mpif.h'. Otherwise you will receive error messages such as:

parallel.f90(116): error #6285: There is no matching specific subroutine for this generic subroutine call.   [MPI_ALLREDUCE]

You will also need to add the lines module load intel/16.0 and module load openmpi/intel-16.0 to your Slurm script.

NERSC Clusters

If you are installing on Cori or Edison at NERSC, switch to the ‘cori’ branch, which already has the settings for these machines. In addition, load the fftw module; otherwise the compiler will fail.

Compiling

If on a cluster, use the module list command to show which modules you currently have loaded. If these differ from the modules you loaded when installing the libraries for NGA, use the module purge command and then module load the correct modules, as described on the library installation page.

Compilation of NGA then occurs in one single step. To compile the code for production runs (read: faster code, longer compilation), execute the command

make opt

Alternatively, to compile the code for debugging (read: slower code, shorter compilation), which prints additional diagnostics to the screen, execute the command

make debug

Unless you have a specific need, the code should always be compiled in opt mode.

If compilation of the code was successful, the NGA bin directory should contain several executables, notably, arts and init_flow.
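
The check for the two executables can be automated. The following is a sketch with a hypothetical function name, check_nga_build; it only verifies that arts and init_flow exist and are executable, not that they run correctly.

```shell
#!/usr/bin/env bash
# Sketch: confirm that the expected executables exist in the NGA bin directory.
# check_nga_build is a hypothetical helper; arts and init_flow are the
# executables named on this page.
check_nga_build() {
    local bindir="${1:-bin}"
    local exe missing=0
    for exe in arts init_flow; do
        if [ -x "$bindir/$exe" ]; then
            echo "found:   $bindir/$exe"
        else
            echo "missing: $bindir/$exe"
            missing=1
        fi
    done
    return "$missing"   # non-zero if any executable is absent
}
```

For example, check_nga_build /path/to/NGA/bin returns a non-zero status if either executable is absent.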

Remark: init_flow and arts must be built with the same compiler flags; otherwise arts will read the initial flow misaligned and incorrectly (e.g., all zeros). If this happens, you may need to run make distclean and recompile.
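
One way to make sure both executables come from the same flags is to rebuild everything in a single pass. This is a sketch that assumes make is run from the directory holding the NGA makefile; the function name rebuild_nga is hypothetical, while the distclean, opt, and debug targets are the ones named on this page.

```shell
#!/usr/bin/env bash
# Sketch: wipe previous build products, then rebuild in one mode so that
# arts and init_flow share the same compiler flags.
# rebuild_nga is a hypothetical helper, not part of NGA.
rebuild_nga() {
    local mode="${1:-opt}"
    case "$mode" in
        opt|debug) ;;
        *) echo "usage: rebuild_nga [opt|debug]" >&2; return 2 ;;
    esac
    # Only build if the clean step succeeded.
    make distclean && make "$mode"
}
```

Calling rebuild_nga with no argument rebuilds in opt mode; rebuild_nga debug rebuilds with the debugging flags.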

Check the pages on Code execution and Running on Tiger to see what to do next.