This guide is being actively updated as we prepare to publicly release WRF-GC. Stay tuned for updates.
There have already been successful efforts and guides on setting up WRF and GEOS-Chem on the AWS cloud, and GEOS-Chem input data is readily available on AWS S3. This makes it straightforward to set up the WRF-GC coupled model to run ultra-high resolution simulations on the cloud.
In this tutorial we will learn how to use AWS ParallelCluster to create your own cluster on the AWS cloud and run WRF-GC on it. WRF-GC supports MPI-based parallelization so it can take advantage of multiple compute nodes.
In this guide I will document steps for:
- Setting up AWS ParallelCluster to create your own HPC cluster
- Configuring the software environment for building WRF-GC
- Running a test WRF-GC simulation across nodes
I will not go into much detail on setting up AWS infrastructure. I’d like to point you to an excellent tutorial written by Jiawei Zhuang.
Setting up AWS ParallelCluster
Work in progress - refer to Jiawei’s AWS HPC Guide for setting up.
Creating the software environment for WRF-GC: Compilers
First, get the spack package manager for HPC:
cd /shared  # install to shared disk
git clone https://github.com/spack/spack.git
echo 'export PATH=/shared/spack/bin:$PATH' >> ~/.bashrc  # to discover spack executable
source ~/.bashrc
Then load intelmpi in your modules:
module load intelmpi
source /opt/intel/compilers_and_libraries/linux/mpi/intel64/bin/mpivars.sh -ofi_internal=0
You might want to put this in your ~/.bashrc.
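Before going further, it can help to confirm that the tools set up so far are really on your PATH. The helper below is only a minimal sketch: `require_cmd` is a hypothetical name chosen for illustration and is not part of spack or AWS ParallelCluster.

```shell
# require_cmd: report whether each named command can be found on PATH.
# Returns 0 if all of them are found, 1 otherwise.
require_cmd() {
    missing=0
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "missing: $cmd" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example (run on the cluster after sourcing ~/.bashrc):
#   require_cmd git spack mpirun || echo "fix your environment first"
```

If a command is reported as missing, re-check the corresponding line in your ~/.bashrc before moving on to the compilers.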
You now have two options for your compiler:
- Use the Intel C/Fortran compilers, if you have a license. They generally afford better performance. If you choose to go this route, skip to the Intel section.
- Use the free and open-source GNU C/Fortran compilers. In this case skip to the GNU section.
Intel compilers
You will need a valid Intel compiler license (you may be eligible as a student).
Setting up Intel compilers with spack requires some additional configuration. You can follow Spack’s official guide or the quick and dirty version below:
- Edit the compiler specification file using spack config --scope=user/linux edit compilers. Your file should look like this, where <version> is the Intel compiler version you are going to install; the stub paths will be filled in later:
compilers:
- compiler:
    target: x86_64
    operating_system: centos7
    modules: []
    spec: intel@<version>
    paths:
      cc: stub
      cxx: stub
      f77: stub
      fc: stub
- Now install the compilers using spack install intel@<version> %gcc@<system gcc version>.
- Find the actual paths of the compiler executables using find $(spack location -i intel) -name icc -type f -ls.
- Put these paths in the compiler specification file (spack config --scope=user/linux edit compilers). It should look something like this, but do not copy the paths below:
compilers:
- compiler:
    target: x86_64
    operating_system: centos7
    modules: []
    spec: intel@<version>
    paths:
      cc: /shared/spack/opt/spack/.../linux/bin/intel64/icc
      cxx: /shared/spack/opt/spack/.../linux/bin/intel64/icpc
      f77: /shared/spack/opt/spack/.../linux/bin/intel64/ifort
      fc: /shared/spack/opt/spack/.../linux/bin/intel64/ifort
Note that the compilers for cc, cxx, f77, fc are icc, icpc, ifort and ifort, respectively.
- Load the compilers using source $(spack location -i intel)/bin/compilervars.sh -arch intel64.
Finally, tell intelmpi and everyone else to use the Intel compilers. Add this to your ~/.bashrc:
source $(spack location -i intel)/bin/compilervars.sh -arch intel64
export I_MPI_CC=icc
export I_MPI_CXX=icpc
export I_MPI_FC=ifort
export I_MPI_F77=ifort
export I_MPI_F90=ifort
export CC=icc
export FC=ifort
export CXX=icpc
GNU compilers
You should be ready to go, although with an older compiler. Add this to your ~/.bashrc:
export I_MPI_CC=gcc
export I_MPI_CXX=g++
export I_MPI_FC=gfortran
export I_MPI_F77=gfortran
export I_MPI_F90=gfortran
export CC=gcc
export FC=gfortran
export CXX=g++
Software: Required Libraries
The list of dependencies for WRF and WRF-GC is as follows:
- MPI (we are using the intelmpi that comes built in with AWS here)
- netCDF-C, netCDF-Fortran
- JasPer JPEG library 1.900.1
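First, tell spack that intelmpi is already available, by creating the file packages.yaml in your spack configuration directory (for example with spack config --scope=user/linux edit packages):
packages:
  intel-mpi:
    paths:
      intel-mpi@2019.4.243: /opt/intel/compilers_and_libraries_2019.4.243/linux/mpi/intel64/
    buildable: False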
(You may want to check if that is indeed the path for intel-mpi on your system. Usually, which mpirun will tell you the rough path.)
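As a rough illustration of that check, the sketch below turns the location of mpirun into an installation prefix by stripping the trailing bin/mpirun. `mpi_prefix_from_mpirun` is a hypothetical helper name, and it assumes mpirun lives in `<prefix>/bin`, which may not hold on every system.

```shell
# mpi_prefix_from_mpirun: given the full path of an mpirun executable,
# print the directory two levels up (i.e. strip "/bin/mpirun").
mpi_prefix_from_mpirun() {
    dirname "$(dirname "$1")"
}

# On the cluster you would call it as:
#   mpi_prefix_from_mpirun "$(readlink -f "$(which mpirun)")"
# The path below is only the illustrative one from this guide.
mpi_prefix_from_mpirun /opt/intel/compilers_and_libraries_2019.4.243/linux/mpi/intel64/bin/mpirun
```

Compare the printed prefix with the path you put in the spack packages file and adjust the file if they differ.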
Install these dependencies using spack - no need to compile your own, so easy!
spack -v install netcdf-fortran %intel ^hdf5+fortran+hl ^intel-mpi
spack -v install jasper@1.900.1 %intel
Tell WRF how to locate these dependencies. I recommend adding these to your ~/.bashrc:
export PATH=$(spack location -i netcdf-c)/bin:$PATH
export PATH=$(spack location -i netcdf-fortran)/bin:$PATH
export HDF5=$(spack location -i hdf5)
export NETCDF=$(spack location -i netcdf-fortran)
export JASPERLIB=$(spack location -i jasper@1.900.1)/lib
export JASPERINC=$(spack location -i jasper@1.900.1)/include
export LD_LIBRARY_PATH=$HDF5/lib:$NETCDF/lib:$LD_LIBRARY_PATH
WRF expects netcdf-c to be installed in the same place as netcdf-fortran, so you need to do some moving around. Just link netCDF-C into the netCDF-Fortran folder (ugly ugly…)
# Only need to run this once
NETCDF_C=$(spack location -i netcdf-c)
ln -sf $NETCDF_C/include/* $NETCDF/include/
ln -sf $NETCDF_C/lib/* $NETCDF/lib/
ln -sf $NETCDF_C/bin/* $NETCDF/bin/
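To see whether the linking worked, a small check along the following lines may be useful. This is a sketch, not part of WRF: `check_netcdf_layout` is a hypothetical helper, and the list of files it looks for (the C header, the Fortran include file and the two config scripts) is my assumption of what a combined netCDF prefix should contain.

```shell
# check_netcdf_layout: verify that one prefix holds both the netCDF-C
# and the netCDF-Fortran pieces. Symlinks count as long as they resolve.
check_netcdf_layout() {
    prefix=$1
    status=0
    for f in include/netcdf.h include/netcdf.inc bin/nc-config bin/nf-config; do
        if [ ! -e "$prefix/$f" ]; then
            echo "not found: $prefix/$f" >&2
            status=1
        fi
    done
    return "$status"
}

# Example: check_netcdf_layout "$NETCDF" && echo "netCDF layout looks fine"
```

Any file reported as not found points to a link that was not created, for example because NETCDF was not yet set when the ln commands ran.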
Here are some more ~/.bashrc entries, courtesy of Jiawei, plus some I added for convenience when you run the model:
# this prevents segmentation fault when running the model
ulimit -s unlimited
# WRF-specific settings
export WRF_EM_CORE=1
export WRFIO_NCD_NO_LARGE_FILE_SUPPORT=0
export WRF_CHEM=1  # compile WRF-GC
# Some quick aliases to work with
alias vn="vi namelist.input"
alias vrc="vi ~/.bashrc"
alias tt="tail -f rsl.out.0000"
alias te="tail -n 50 rsl.* | less"
alias mco="rm rsl.*; rm wrfout_*"
Downloading WRF, GEOS-Chem and WRF-GC
A quick recap of the WRF-GC directory hierarchy. WRF-GC is driven by the WRF model in exactly the same way as WRF-Chem is driven by WRF.
The WRF model rests at the top-most directory, usually named WRFV3:
[centos@ip-172-31-93-89 WRFV3]$ ls
arch     configure             dyn_exp   hydro     phys                      README.hydro       README.SSIB         run    var
chem     configure.wrf         dyn_nmm   inc       README                    README.io_config   README_test_cases   share
clean    configure.wrf.backup  external  main      README.DA                 README.NMM         README.windturbine  test
compile  dyn_em                frame     Makefile  README.hybrid_vert_coord  README.rsl_output  Registry            tools
The chem directory contains all code pertaining to chemistry. Inside chem, you can find the WRF-GC files and a copy of GEOS-Chem in the chem/gc subdirectory, usually like:
chem/
  gc/
  config/
  chem_driver.F
  wrfgc_convert_state_mod.F
  chemics_init.F
  ...
Let’s get started.
- Downloading WRF: Obtain a copy of the compatible WRF model version from the WRF GitHub repository, currently 3.9.1.1 for WRF-GC 1.0, and extract it:
mkdir WRFV3
wget https://github.com/wrf-model/WRF/archive/V3.9.1.1.tar.gz
# --strip-components=1 unpacks the sources directly into WRFV3/ rather than a nested folder
tar -xvzf V3.9.1.1.tar.gz --directory WRFV3 --strip-components=1
You may want to rename your WRF folder to `WRFV3` or something easy to type.
- Removing existing WRF-Chem code: WRF has recently begun shipping WRF-Chem code alongside its main source. We do not need that code for WRF-GC operation and you should remove it. Go inside the WRF directory and remove the chem directory:
cd WRFV3
rm -rf chem
- Downloading WRF-GC: Obtain a copy of WRF-GC from the WRF-GC Release GitHub repository, cloning the git repository into a chem folder (replacing the one you’ve just deleted) in WRF’s folder:
git clone https://github.com/jimmielin/wrf-gc-release.git chem
Note: WRF-GC has not been publicly released yet. During the private beta period please refer to the WRF-GC website to contact Prof. Tzung-May Fu for obtaining a copy.
- Downloading GEOS-Chem: Obtain a copy of GEOS-Chem from the GEOS-Chem GitHub repository. WRF-GC 1.0 currently only supports GEOS-Chem version 12.2.1, so you will need to download that specific version.
cd chem
wget https://github.com/geoschem/geos-chem/archive/12.2.1.tar.gz
# --strip-components=1 unpacks the sources directly into gc/, next to the existing GCHP folder
tar -xvzf 12.2.1.tar.gz --directory gc --strip-components=1
You should already have a chem/gc folder before downloading GEOS-Chem. That is fine. The code that interacts with the GEOS-Chem modules already exists in the chem/gc/GCHP folder as part of WRF-GC. You will need to keep it.
Downloading HEMCO emissions and GEOS-Chem input data
You will need to download HEMCO emissions and a set of basic GEOS-Chem input data for use with the WRF-GC model. In your /shared directory, create an ExtData folder to store all GEOS-Chem inputs, according to GEOS-Chem conventions.
cd /shared
mkdir ExtData
mkdir ExtData/HEMCO
This section is under construction - I am coordinating with the GEOS-Chem developers to work on a download script from S3. Stay tuned!
I would like to refer you to the excellent GEOS-Chem on the Cloud guide if you have any questions.
Downloading WRF input data
Lots of different meteorological boundary and initial conditions can be used to drive the WRF model. Refer to this list of free data sets for driving WRF.
You can find more information on input data from the official WRF-GC guide.
Building WRF-GC is just like building WRF-Chem. If you have used the above ~/.bashrc you will be ready to go - cd to your WRFV3 directory and run ./configure -hyb. Note that the -hyb option must be enabled for WRF 3.9.1.1 to include the sigma-eta hybrid grid, which is required by GEOS-Chem.
If you chose to use the Intel compiler set, choose the (dmpar) option listed for the Intel compilers; if you chose GNU, then choose the (dmpar) option listed for the GNU compilers.
Once configured successfully, proceed to install the WRF-GC registry file:
cd chem
make install_registry
cd ..
This step is mandatory - you may otherwise get Registry: errors during WRF-GC compile complaining about species not found.
Issue the compile command (for a real-data case this is ./compile em_real).
You may want to run this in a screen session, because the compile process takes quite a long time. Once finished, WRF should tell you “Executables successfully built”. If you find errors you may want to look at my other “common compile problems” blog post 🙂
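Because the build runs for a long time, it is convenient to save its output to a log file and test for the success message afterwards. The following is a minimal sketch; `build_succeeded` and the log file name compile.log are hypothetical choices for illustration, and the commented usage assumes the real-data target em_real.

```shell
# build_succeeded: return 0 if the given compile log contains the
# message WRF prints at the end of a good build.
build_succeeded() {
    grep -q "Executables successfully built" "$1"
}

# Example usage inside a screen session, from the WRFV3 directory:
#   ./compile em_real 2>&1 | tee compile.log
#   build_succeeded compile.log && ls -l main/*.exe
```

If the function returns non-zero, searching the log for the first line containing "Error" is usually a good starting point.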
WRF requires meteorological fields (boundary and initial conditions) and configuration of the simulation domain before proceeding. You will also need to compile WPS (WRF PreProcessing System). The procedure is exactly the same as on a regular Linux cluster, noting a few important compile issues.
Download WPS and extract it at the same level as your WRFV3 directory:
mkdir WPS  # tar needs the target directory to exist
wget https://github.com/wrf-model/WPS/archive/v3.9.1.tar.gz
tar -xvzf v3.9.1.tar.gz --directory WPS --strip-components=1
rm v3.9.1.tar.gz
Configure WPS using the ./configure command, choosing the (serial) variant of the option that matches your compiler. Compile using ./compile.
If you are experiencing issues compiling WPS and cannot find ungrib.exe after the build, edit WPS/configure.wps and look for COMPRESSION_LIBS and COMPRESSION_INCLUDE. Change these to the output of:
spack location -i jasper@1.900.1
adding lib/ and include/ to the path, respectively. The JasPer library is required for compiling ungrib.exe, so if the build complains that it cannot find -ljasper then this is the issue.
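A quick way to confirm that the JasPer prefix really contains what the WPS build needs is sketched below. `has_jasper` is a hypothetical helper; it assumes the header is installed as include/jasper/jasper.h and the library under lib/, which is the usual layout but may differ (for example lib64/) on some installations.

```shell
# has_jasper: check that a prefix contains the JasPer header and at
# least one libjasper library file.
has_jasper() {
    prefix=$1
    [ -e "$prefix/include/jasper/jasper.h" ] || return 1
    ls "$prefix"/lib/libjasper.* >/dev/null 2>&1
}

# Example on the cluster (the prefix comes from spack):
#   has_jasper "$(spack location -i jasper)" && echo "JasPer looks usable"
```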
Preparing WRF input data and configuring WRF-GC (GEOS-Chem)
To prepare input data and more information on WPS, please refer to the WRF-GC user’s guide.
Up to now we have done all our work on the login node. This is not good practice if you have a shared cluster or need lots of resources. WPS tasks (generating the geographical grid, ungribbing the met fields and gridding them to the simulation domain) should be run on a compute node. So how do we run things on the compute nodes?
We will instruct compute nodes to run tasks using the srun command for SLURM. SLURM is a scheduler, which in short manages the resources on your cluster. It is very well integrated with AWS ParallelCluster and will automatically add compute nodes when necessary, so essentially you have a pay-per-use “private supercomputing cluster”. Neat!
You will likely only need one node to run the WPS programs (geogrid.exe, ungrib.exe and metgrid.exe). You can do this in two ways:
- Creating a one-node interactive shell session. If you’ve worked on a supercomputer cluster before, you might have heard of an “interactive” session. This launches a shell on a compute node and you can do compute-intensive tasks on it. To do this you can use:
srun -N 1 --pty /bin/bash
Then you can run WPS tasks simply using ./metgrid.exe. Remember to exit from the shell after you are finished, so your compute node can shut down.
- Running the task using srun. The following command will run the task on a designated number (-N 2, etc.) of computational nodes:
srun -N 1 --ntasks-per-node 1 ./geogrid.exe
Note that since we compiled WPS as (serial) we are forcing one core here - more cores will not make WPS faster. If you are running a huge domain you may want to compile WPS with the (dmpar) option for MPI parallelization; then you can use multiple nodes and cores.
To run WRF-GC, you will likely need to run the real.exe pre-processor in WRFV3/run first to generate the WRF input files from the met fields generated in WPS. This is the first task you will likely need to run on multiple nodes. If I am using c5n.18xlarge nodes with 36 physical cores and 72 “virtual” hyper-threading cores, and want to use, say, 12 nodes, then I can run real.exe like so:
srun -N 12 --ntasks-per-node 36 ./real.exe
Since WRF is a resource-intensive task, you should always use the number of physical cores. Virtual cores will not do you any good in this case. There is further information from AWS and on the GEOS-Chem wiki about scalability with “hyperthreading” cores.
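The arithmetic behind the -N and --ntasks-per-node values can be sketched as follows. The two function names are hypothetical, and the division by two assumes two hardware threads per physical core, as on the c5n.18xlarge example above.

```shell
# physical_cores: physical cores from the vCPU count, assuming
# 2 hardware threads per core.
physical_cores() {
    echo $(( $1 / 2 ))
}

# total_ranks: total MPI ranks = nodes x ranks per node.
total_ranks() {
    echo $(( $1 * $2 ))
}

cores=$(physical_cores 72)
echo "ranks per node: $cores"
echo "total MPI ranks on 12 nodes: $(total_ranks 12 "$cores")"
```

For 12 nodes with 72 virtual cores each this gives 36 ranks per node and 432 ranks in total, matching the srun command above.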
To run WRF:
srun -N 12 --ntasks-per-node 36 ./wrf.exe
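For long simulations you may prefer to submit a batch job instead of keeping a terminal open. The sketch below writes a minimal SLURM batch script with the same node and task counts; `write_wrf_job`, the job name wrfgc and the file names are hypothetical choices, and the script assumes it is submitted from the directory that contains wrf.exe.

```shell
# write_wrf_job: write a minimal SLURM batch script for wrf.exe.
# Arguments: output file, number of nodes, MPI ranks per node.
write_wrf_job() {
    cat > "$1" <<EOF
#!/bin/bash
#SBATCH --job-name=wrfgc
#SBATCH --nodes=$2
#SBATCH --ntasks-per-node=$3
#SBATCH --output=wrfgc.%j.log

ulimit -s unlimited
srun ./wrf.exe
EOF
}

write_wrf_job run_wrf.sh 12 36
# Submit from WRFV3/run with: sbatch run_wrf.sh
```

Inside the batch job, srun picks up the node and task counts from the allocation, so they do not need to be repeated on the srun line.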
You can track your model progress by looking at the output from each MPI process. Usually the master process (rank 0) has more information, so you can tail -f rsl.out.0000 to follow it.
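When a run ends normally, both real.exe and wrf.exe write a line containing "SUCCESS COMPLETE" near the end of rsl.out.0000, so a small check like the one below can tell a finished run from a crashed one. `run_succeeded` is a hypothetical helper, and looking only at the last five lines is an arbitrary choice.

```shell
# run_succeeded: return 0 if the tail of the given rsl file contains
# the "SUCCESS COMPLETE" message written at the end of a normal run.
run_succeeded() {
    tail -n 5 "$1" | grep -q "SUCCESS COMPLETE"
}

# Example, from the run directory:
#   run_succeeded rsl.out.0000 && echo "run finished" || tail -n 50 rsl.error.0000
```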
What if I have errors?
Please stay tuned as this guide is being updated and a separate guide for “compile woes with WRF(-GC), GCHP” follows.
For reference, here is my complete ~/.bashrc (Intel compilers):
# User specific aliases and functions
export PATH=/shared/spack/bin:$PATH
module load intelmpi
export I_MPI_CC=icc
export I_MPI_CXX=icpc
export I_MPI_FC=ifort
export I_MPI_F77=ifort
export I_MPI_F90=ifort
export CC=icc
export FC=ifort
export CXX=icpc
source $(spack location -i intel)/bin/compilervars.sh -arch intel64
export PATH=$(spack location -i netcdf-c)/bin:$PATH
export PATH=$(spack location -i netcdf-fortran)/bin:$PATH
# Environment variables required by WRF
export HDF5=$(spack location -i hdf5)
export NETCDF=$(spack location -i netcdf-fortran)
# run-time linking
export LD_LIBRARY_PATH=$HDF5/lib:$NETCDF/lib:$LD_LIBRARY_PATH
# this prevents segmentation fault when running the model
ulimit -s unlimited
# WRF-specific settings
export WRF_EM_CORE=1
export WRFIO_NCD_NO_LARGE_FILE_SUPPORT=0
export WRF_CHEM=1
export ESMF_COMM=intelmpi
export ESMF_COMPILER=intel
export I_MPI_PMI_LIBRARY=/opt/slurm/lib/libpmi.so  # enable slurm
export I_MPI_FABRICS=shm:ofi  # use libfabric (default)
export FI_PROVIDER=efa  # enable EFA (default)
source /opt/intel/compilers_and_libraries/linux/mpi/intel64/bin/mpivars.sh -ofi_internal=0  # do not use intel-provided libfabric
# Some quick aliases to work with
alias vn="vi namelist.input"
alias vrc="vi ~/.bashrc"
alias tt="tail -f rsl.out.0000"
alias te="tail -n 50 rsl.* | less"
alias mco="rm rsl.*; rm wrfout_*"
alias itr="srun -N 1 --ntasks-per-node 36 --pty /bin/bash"
# NCL
export NCARG_ROOT="/shared/ncl"
export PATH="$NCARG_ROOT/bin:$PATH"