Summary
This guide is being actively updated as we prepare to publicly release WRF-GC. Stay tuned for updates.
This is a tutorial on how to set up the WRF-GC model on the Amazon Web Services cloud.
There have already been successful efforts and guides on setting up WRF and GEOS-Chem on the AWS cloud, and GEOS-Chem input data is readily available on AWS S3. It is therefore straightforward to set up the WRF-GC coupled model to run ultra-high-resolution simulations on the cloud.
In this tutorial we will learn how to use AWS ParallelCluster to create your own cluster on the AWS cloud and run WRF-GC on it. WRF-GC supports MPI-based parallelization so it can take advantage of multiple compute nodes.
Steps
In this guide I will document steps for:
- Setting up AWS ParallelCluster to create your own HPC cluster
- Configuring the software environment for building WRF-GC
- Running a test WRF-GC simulation across nodes
I will not go into much detail on setting up AWS infrastructure. I’d like to point you to an excellent tutorial written by Jiawei Zhuang.
Setting up AWS ParallelCluster
Work in progress - refer to Jiawei’s AWS HPC Guide for setting up.
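Until that section is written, here is a minimal sketch of the workflow using the ParallelCluster command-line tool (this assumes ParallelCluster 2.x; the cluster name and key path below are placeholders, and the config hints in the comments are suggestions, not a complete configuration):
pip install --user aws-parallelcluster    # install the pcluster CLI
pcluster configure                        # interactive wizard; writes ~/.parallelcluster/config
# In the generated config you will likely want: scheduler = slurm,
# compute_instance_type = c5n.18xlarge, and a shared EBS volume with shared_dir = /shared.
pcluster create wrfgc-cluster             # launch the cluster
pcluster ssh wrfgc-cluster -i ~/.ssh/your-key.pem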
Creating the software environment for WRF-GC: Compilers
First, get the spack package manager for HPC:
cd /shared # install to shared disk
git clone https://github.com/spack/spack.git
echo 'export PATH=/shared/spack/bin:$PATH' >> ~/.bashrc # to discover spack executable
source ~/.bashrc
Load intelmpi in your modules:
module load intelmpi
source /opt/intel/compilers_and_libraries/linux/mpi/intel64/bin/mpivars.sh -ofi_internal=0
You might want to put this in your ~/.bashrc.
You now have two options for your compiler:
- Use the Intel C/Fortran compilers, if you have a license. They generally afford better performance. If you choose to go this route, skip to the Intel section.
- Use the free and open-source GNU C/Fortran compilers. In this case skip to the GNU section.
Intel
You will need a valid Intel compiler license. (You may be eligible for one as a student.)
Setting up Intel compilers with spack requires some additional configuration. You can follow Spack’s official guide or the quick and dirty version below:
- Edit the compiler specification file using spack config --scope=user/linux edit compilers. Your file should look like this; the stub paths will be filled in later.
compilers:
- compiler:
    target: x86_64
    operating_system: centos7
    modules: []
    spec: intel@19.0.4
    paths:
      cc: stub
      cxx: stub
      f77: stub
      fc: stub
- Now install the compilers using spack install intel@19.0.4 %intel@19.0.4.
- Find the actual paths of the compiler executables using find $(spack location -i intel) -name icc -type f -ls.
- Put these paths back into the compiler specification file (spack config --scope=user/linux edit compilers). It should look something like this, but do not copy the paths below:
compilers:
- compiler:
    target: x86_64
    operating_system: centos7
    modules: []
    spec: intel@19.0.4
    paths:
      cc: /shared/spack/opt/spack/.../linux/bin/intel64/icc
      cxx: /shared/spack/opt/spack/.../linux/bin/intel64/icpc
      f77: /shared/spack/opt/spack/.../linux/bin/intel64/ifort
      fc: /shared/spack/opt/spack/.../linux/bin/intel64/ifort
Note that the compilers for cc, cxx, f77 and fc are icc, icpc, ifort and ifort, respectively.
- Load the compilers using source $(spack location -i intel)/bin/compilervars.sh -arch intel64.
- Tell intelmpi and everything else to use the Intel compilers. Add this to your ~/.bashrc:
source $(spack location -i intel)/bin/compilervars.sh -arch intel64
export I_MPI_CC=icc
export I_MPI_CXX=icpc
export I_MPI_FC=ifort
export I_MPI_F77=ifort
export I_MPI_F90=ifort
export CC=icc
export FC=ifort
export CXX=icpc
GNU Fortran
You should be ready to go, although with an older compiler. Add this to your ~/.bashrc:
export I_MPI_CC=gcc
export I_MPI_CXX=g++
export I_MPI_FC=gfortran
export I_MPI_F77=gfortran
export I_MPI_F90=gfortran
export CC=gcc
export FC=gfortran
export CXX=g++
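If you want a newer GNU compiler than the system default, one option is to build it with spack and register it. This is only a sketch, assuming gcc@9.3.0 builds cleanly on your instance:
spack install gcc@9.3.0                            # build a newer GCC (this takes a while)
spack compiler add $(spack location -i gcc@9.3.0)  # register the new compiler with spack
spack compilers                                    # verify it is listed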
Software: Required Libraries
The dependencies for WRF and WRF-GC are as follows:
- MPI (we are using the intelmpi that comes built-in with AWS here)
- hdf5
- netCDF-C, netCDF-Fortran
- JasPer JPEG library 1.900.1
Tell spack that intelmpi is already available by creating the file ~/.spack/packages.yaml:
packages:
  intel-mpi:
    paths:
      intel-mpi@2019.4.243: /opt/intel/compilers_and_libraries_2019.4.243/linux/mpi/intel64/
    buildable: False
(You may want to check that this is indeed the path for intel-mpi on your system. Usually, which mpirun will tell you the rough path.)
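For example (the path below is just the default from the packages.yaml above; yours may differ):
which mpirun
# e.g. /opt/intel/compilers_and_libraries_2019.4.243/linux/mpi/intel64/bin/mpirun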
Install these dependencies using spack - no need to compile your own, so easy!
spack -v install netcdf-fortran %intel ^hdf5+fortran+hl ^intel-mpi
spack -v install jasper@1.900.1 %intel
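As an optional sanity check, you can confirm that the libraries were installed with the compiler and MPI you expect:
spack find -l netcdf-fortran    # lists the installed netcdf-fortran spec(s) and hashes
spack find -l jasper            # same for JasPer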
Tell WRF how to locate these dependencies. I recommend adding these to your ~/.bashrc:
export PATH=$(spack location -i netcdf-c)/bin:$PATH
export PATH=$(spack location -i netcdf-fortran)/bin:$PATH
export HDF5=$(spack location -i hdf5)
export NETCDF=$(spack location -i netcdf-fortran)
export JASPERLIB=$(spack location -i jasper@1.900.1)/lib
export JASPERINC=$(spack location -i jasper@1.900.1)/include
export LD_LIBRARY_PATH=$HDF5/lib:$NETCDF/lib:$LD_LIBRARY_PATH
WRF expects netcdf-c to be installed in the same place as netcdf-fortran, so you need to do some moving around. Just link the netCDF-C files into the netCDF-Fortran folder (ugly ugly…):
# Only need to run this once
NETCDF_C=$(spack location -i netcdf-c)
ln -sf $NETCDF_C/include/* $NETCDF/include/
ln -sf $NETCDF_C/lib/* $NETCDF/lib/
ln -sf $NETCDF_C/bin/* $NETCDF/bin/
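To check that WRF will now see a complete netCDF installation under $NETCDF, a quick sanity check (these utilities ship with netCDF-C and netCDF-Fortran, respectively):
$NETCDF/bin/nc-config --version   # should print the netCDF-C version
$NETCDF/bin/nf-config --version   # should print the netCDF-Fortran version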
Here are some extra ~/.bashrc entries, courtesy of Jiawei, plus a few I added for convenience when running the model:
# this prevents segmentation fault when running the model
ulimit -s unlimited
# WRF-specific settings
export WRF_EM_CORE=1
export WRFIO_NCD_NO_LARGE_FILE_SUPPORT=0
export WRF_CHEM=1 # compile WRF-GC
# Some quick aliases to work with
alias vn="vi namelist.input"
alias vrc="vi ~/.bashrc"
alias tt="tail -f rsl.out.0000"
alias te="tail -n 50 rsl.* | less"
alias mco="rm rsl.*; rm wrfout_*"
Downloading WRF, GEOS-Chem and WRF-GC
First, a quick recap of the WRF-GC directory hierarchy. WRF-GC is driven by the WRF model in the same way that WRF-Chem is driven by WRF.
The WRF model sits at the top-most level of the directory tree, usually named WRFV3:
[centos@ip-172-31-93-89 WRFV3]$ ls
arch configure dyn_exp hydro phys README.hydro README.SSIB run var
chem configure.wrf dyn_nmm inc README README.io_config README_test_cases share
clean configure.wrf.backup external main README.DA README.NMM README.windturbine test
compile dyn_em frame Makefile README.hybrid_vert_coord README.rsl_output Registry tools
The chem directory contains all code pertaining to chemistry. Inside chem, you can find the WRF-GC files and a copy of GEOS-Chem in the chem/gc subdirectory, usually laid out like this:
chem/
  gc/
  config/
  chem_driver.F
  wrfgc_convert_state_mod.F
  chemics_init.F
  ...
Let’s get started.
- Downloading WRF: Obtain a copy of the compatible WRF model version (currently 3.9.1.1 for WRF-GC 1.0) from the WRF GitHub repository and extract it:
mkdir WRFV3
wget https://github.com/wrf-model/WRF/archive/V3.9.1.1.tar.gz
tar -xvzf V3.9.1.1.tar.gz --directory WRFV3 --strip-components=1
You may want to name your WRF folder WRFV3 or something equally easy to type.
- Removing existing WRF-Chem code. WRF has recently begun shipping WRF-Chem code alongside its main source. We do not need that code for WRF-GC, and you should remove it. Go inside the WRF directory and remove the chem subdirectory:
cd WRFV3
rm -rf chem
- Downloading WRF-GC: Obtain a copy of WRF-GC from the WRF-GC Release GitHub repository by cloning the git repository into a chem folder (replacing the one you just deleted) inside the WRF directory:
git clone https://github.com/jimmielin/wrf-gc-release.git chem
Note: WRF-GC was still in a private beta when this guide was first written; during that period you had to contact Prof. Tzung-May Fu via the WRF-GC website for a copy. WRF-GC is now publicly available on GitHub: jimmielin/wrf-gc-release. Please also visit the WRF-GC website for the latest updates.
The latest version of WRF-GC, v2.0.1, includes GEOS-Chem 12.8.3. Refer to the latest documentation PDF and the WRF-GC 2.0 paper by Feng et al., 2021 for more information!
Only if using older versions of WRF-GC:
Downloading GEOS-Chem: Obtain a copy of GEOS-Chem from the GEOS-Chem GitHub repository. WRF-GC 1.0 only supports GEOS-Chem version 12.2.1, so you will need that specific version.
cd chem
wget https://github.com/geoschem/geos-chem/archive/12.2.1.tar.gz
tar -xvzf 12.2.1.tar.gz --directory gc --strip-components=1
You should already have a chem/gc folder before extracting the GEOS-Chem source; that is fine. The chem/gc/GCHP folder ships as part of WRF-GC and contains the code that interfaces with the GEOS-Chem modules, so you will need to keep it.
Downloading HEMCO emissions and GEOS-Chem input data
You will need to download HEMCO emissions and a set of basic GEOS-Chem input data for use with the WRF-GC model. In your /shared directory, create an ExtData folder to store all GEOS-Chem inputs, following GEOS-Chem conventions.
cd /shared
mkdir ExtData
mkdir ExtData/HEMCO
This section is under construction - I am coordinating with the GEOS-Chem developers to work on a download script from S3. Stay tuned!
I would like to refer you to the excellent GEOS-Chem on the Cloud guide if you have any questions.
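In the meantime, the GEOS-Chem input data archive is mirrored in the public s3://gcgrid bucket, so a hedged sketch with the AWS CLI looks like this (the subdirectory below is only an illustration; browse the bucket and download what your simulation actually needs):
aws s3 ls s3://gcgrid/HEMCO/                   # browse the available HEMCO emission inventories
aws s3 cp --recursive s3://gcgrid/HEMCO/MODIS_XLAI/ /shared/ExtData/HEMCO/MODIS_XLAI/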
Downloading WRF input data
Many different meteorological data sets can be used as boundary and initial conditions to drive the WRF model. Refer to this list of free data sets for driving WRF.
You can find more information on input data from the official WRF-GC guide.
Building WRF(-GC)
Configuring
Building WRF-GC is just like building WRF-Chem. If you have used the ~/.bashrc above, you will be ready to go - cd to your WRFV3 directory and configure WRF:
./configure -hyb
Note that the -hyb option must be enabled for WRF 3.9.1.1 to include the hybrid sigma-eta vertical grid, which is required by GEOS-Chem.
If you chose the Intel compilers, select the icc/ifort (dmpar) option; if you chose GNU, select the gcc/gfortran (dmpar) option.
Once configured successfully, proceed to install the WRF-GC registry file:
cd chem
make install_registry
cd ..
This step is mandatory - otherwise you may get Registry errors during the WRF-GC compile complaining that species cannot be found.
Building WRF
Issue the compile command:
./compile em_real
You may want to run this in a screen session, because the compile process takes quite a while. Once finished, WRF should tell you "Executables successfully built". If you run into errors, you may want to look at my other "common compile problems" blog post 🙂
Building WPS
WRF requires meteorological fields (boundary and initial conditions) and a configured simulation domain before it can run. You will therefore also need to compile WPS (the WRF Preprocessing System). The procedure is exactly the same as on a regular Linux cluster, aside from a few important compile issues noted below.
Download WPS and extract it at the same level as your WRFV3 folder:
wget https://github.com/wrf-model/WPS/archive/v3.9.1.tar.gz
mkdir WPS
tar -xvzf v3.9.1.tar.gz --directory WPS --strip-components=1
rm v3.9.1.tar.gz
Configure WPS using the ./configure command, choosing (serial) and the appropriate compiler option. Then compile using ./compile.
If you are experiencing issues compiling WPS and cannot find ungrib.exe: edit WPS/configure.wps and look for COMPRESSION_LIBS and COMPRESSION_INCLUDE. Point these to the output of:
spack location -i jasper@1.900.1
with lib/ and include/ appended to the path, respectively. The JasPer library is required for compiling ungrib.exe, so if the linker cannot find -ljasper, this is the issue.
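A quick way to print the two paths to paste into configure.wps (the echo labels are only illustrative; match them to the variable names in your configure.wps):
JASPER_ROOT=$(spack location -i jasper@1.900.1)
echo "library path: $JASPER_ROOT/lib"
echo "include path: $JASPER_ROOT/include"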
Preparing WRF input data and configuring WRF-GC (GEOS-Chem)
For preparing the input data and for more information on WPS, please refer to the WRF-GC user's guide.
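For orientation, the typical WPS sequence looks like the sketch below (this assumes GFS-style GRIB met files; adjust the Vtable and paths to your data set, and see the user's guide for the namelist.wps settings):
./geogrid.exe                                    # define the simulation domain and static fields
ln -sf ungrib/Variable_Tables/Vtable.GFS Vtable  # pick the Vtable matching your met data
./link_grib.csh /path/to/met/gribfiles/*         # link the raw GRIB files
./ungrib.exe                                     # decode the met fields
./metgrid.exe                                    # interpolate them onto your domain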
Up to now we have done everything on the login node. This is not good practice if you share the cluster or need lots of resources. The WPS tasks - generating the geographical grid, ungribbing the met fields and gridding them onto the simulation domain - should be run on a compute node. So how do we run things on the compute nodes?
We will instruct compute nodes to run tasks using the srun command for SLURM. SLURM is a job scheduler, which in short manages the resources on your cluster. It is very well integrated with AWS ParallelCluster and will automatically add compute nodes when necessary, so you essentially have a pay-per-use "private supercomputing cluster". Neat!
You will likely only need one node to run the geogrid.exe, ungrib.exe and metgrid.exe executables. You can do this in two ways:
- Creating a one-node interactive shell session. If you have worked on a supercomputing cluster before, you might have heard of an "interactive" session. This launches a shell on a compute node where you can run compute-intensive tasks. To do this you can use:
srun -N 1 --pty /bin/bash
Then you can run the WPS tasks simply using ./geogrid.exe, ./ungrib.exe and ./metgrid.exe. Remember to exit the shell once you are finished, so your compute node can shut down.
- Running the task using srun. The following command will run the task on a designated number of compute nodes (-N 1, -N 2, etc.); a batch-script alternative is sketched after this list:
srun -N 1 --ntasks-per-node 1 ./geogrid.exe
Note that since we compiled WPS as (serial), we are forcing a single task here - more cores will not make WPS faster. If you are running a huge domain, you may want to compile WPS with (dmpar) for MPI parallelism, in which case you can use multiple nodes and cores.
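If you prefer batch submission over an interactive srun, a minimal SLURM job script does the same thing. This is only a sketch; adjust the WPS path to your layout:
#!/bin/bash
#SBATCH --job-name=wps
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
cd /shared/WPS            # path to your WPS build (adjust as needed)
./geogrid.exe
./ungrib.exe
./metgrid.exe
Save this as, say, run_wps.sh, submit it with sbatch run_wps.sh, and check the queue with squeue.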
Running WRF-GC
To run WRF-GC, you will first need to run the real.exe pre-processor in WRFV3/run to generate the WRF input files from the met fields produced by WPS. This is the first task you will likely want to run across multiple nodes. If I am using c5n.18xlarge nodes, each with 36 physical cores and 72 "virtual" hyper-threading cores, and want to use, say, 12 nodes, then I can run real.exe like so:
srun -N 12 --ntasks-per-node 36 ./real.exe
Since WRF is compute-intensive, you should always launch one task per physical core; the hyper-threaded "virtual" cores will not do you any good here. There is further information from AWS and on the GEOS-Chem wiki about scalability with "hyperthreading" cores.
To run WRF:
srun -N 12 --ntasks-per-node 36 ./wrf.exe
You can track the model's progress by looking at the output from each core. The master rank usually logs the most information, so you can tail -f rsl.out.0000 to follow it.
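For longer production runs you may prefer a batch job over an interactive srun, so the simulation survives a dropped SSH session. This is a sketch using the same 12-node, 36-tasks-per-node layout as above; adjust the path and sizes to your setup:
#!/bin/bash
#SBATCH --job-name=wrfgc
#SBATCH --nodes=12
#SBATCH --ntasks-per-node=36
#SBATCH --exclusive
cd /shared/WRFV3/run      # path to your WRF run directory (adjust as needed)
srun ./wrf.exe
Submit it with sbatch; the rsl.out.* and rsl.error.* files appear in the run directory as usual.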
What if I have errors?
Please stay tuned as this guide is being updated and a separate guide for “compile woes with WRF(-GC), GCHP” follows.
Happy modeling!
Appendix: My ~/.bashrc
# User specific aliases and functions
export PATH=/shared/spack/bin:$PATH
module load intelmpi
export I_MPI_CC=icc
export I_MPI_CXX=icpc
export I_MPI_FC=ifort
export I_MPI_F77=ifort
export I_MPI_F90=ifort
export CC=icc
export FC=ifort
export CXX=icpc
source $(spack location -i intel)/bin/compilervars.sh -arch intel64
export PATH=$(spack location -i netcdf-c)/bin:$PATH
export PATH=$(spack location -i netcdf-fortran)/bin:$PATH
# Environment variables required by WRF
export HDF5=$(spack location -i hdf5)
export NETCDF=$(spack location -i netcdf-fortran)
# run-time linking
export LD_LIBRARY_PATH=$HDF5/lib:$NETCDF/lib:$LD_LIBRARY_PATH
# this prevents segmentation fault when running the model
ulimit -s unlimited
# WRF-specific settings
export WRF_EM_CORE=1
export WRFIO_NCD_NO_LARGE_FILE_SUPPORT=0
export WRF_CHEM=1
export ESMF_COMM=intelmpi
export ESMF_COMPILER=intel
export I_MPI_PMI_LIBRARY=/opt/slurm/lib/libpmi.so # enable slurm
export I_MPI_FABRICS=shm:ofi # use libfabric (default)
export FI_PROVIDER=efa # enable EFA (default)
source /opt/intel/compilers_and_libraries/linux/mpi/intel64/bin/mpivars.sh -ofi_internal=0 # do not use the Intel-provided libfabric
# Some quick aliases to work with
alias vn="vi namelist.input"
alias vrc="vi ~/.bashrc"
alias tt="tail -f rsl.out.0000"
alias te="tail -n 50 rsl.* | less"
alias mco="rm rsl.*; rm wrfout_*"
alias itr="srun -N 1 --ntasks-per-node 36 --pty /bin/bash"
# NCL
export NCARG_ROOT="/shared/ncl"
export PATH="$NCARG_ROOT/bin:$PATH"