Tools#
Wellies defines a Tool as anything that needs to be made available on the
execution environment of one or more of your workflow's tasks. They can be
dependent on each other and contain up to three script snippets that will be
used in different contexts:
- setup: Defines how a tool can be installed on the suite's workspace.
- load: For any occasion when a preparation action needs to be done to make a tool discoverable for use.
- unload: The opposite operation of load. Makes the tool unavailable again.
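To make the three roles concrete, here is a minimal Python sketch of a tool that holds its three snippets. The class ToolSketch and the helper render are hypothetical names used only for illustration; they are not part of the wellies API. The banner lines simply mimic the way generated snippets are displayed in the examples on this page.

```python
from dataclasses import dataclass, field


@dataclass
class ToolSketch:
    """Hypothetical stand-in for a tool; not the wellies implementation."""

    name: str
    setup: list[str] = field(default_factory=list)   # how to install the tool
    load: list[str] = field(default_factory=list)    # how to make it discoverable
    unload: list[str] = field(default_factory=list)  # how to undo `load`


def render(tool: ToolSketch) -> str:
    """Return the three snippets as text, one banner per snippet."""
    parts = []
    for section in ("setup", "load", "unload"):
        parts.append(f"--------{section} script-----------")
        parts.extend(getattr(tool, section))
    return "\n".join(parts)


# A module-like tool: nothing to set up, only load and unload actions.
toolbox = ToolSketch(
    name="ecmwf-toolbox",
    load=["module load ecmwf-toolbox/2023.10.1.0"],
    unload=["module unload ecmwf-toolbox"],
)
print(render(toolbox))
```

A tool with an empty snippet, such as the empty setup here, still shows the banner with nothing under it, which is how the empty sections in the examples on this page should be read.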
Tool types#
- Environment variable: general environment variables with name and value that can be set at configuration level.
- Module: They are just loaded/unloaded from the system, i.e., they do not have a setup. At configuration level the options are name and version. Can be a general Module or a Private one, which requires the extra option modulefiles.
- Package: They are installables that can be retrieved from local or remote locations. The options are similar to the ones for the different static data types, and the post_script option can be used to reference custom installation script snippets or commands. They don't add load or unload scripts, so usually they are associated with an environment where they can be made discoverable. A custom build_dir can be provided, and that will point the retrieval part of the setup script to that location.
- Python virtual environment: This is a shortcut to define python virtual environments that will be built using venv. Wellies supports two types:
  - type: system_venv: Creates an environment that is based on the system-wide python installation, meaning the creation option will contain the --system-site-packages argument. Extra external packages can be installed locally to the environment using the extra_packages option.
  - type: venv: Creates a file-based environment without the --system-site-packages argument.
- Conda environment: Wellies supports three types of conda environments using different building strategies: from a specification file when env_file is present, with a list of packages provided when extra_packages is present, or no build at all with a reference to an existing environment when environment is present. Extra options for the conda commands can be set using the conda_cmd option. The env_file behaves just as any other data object, so any of the static data types can be used to define how the specification file needs to be retrieved.
- Folder environment: This is a namespace environment type where packages can be installed with appropriate dependencies. The created namespace will be added to the system PATH for discoverability.
- Custom environment: Custom environment with custom load, unload and setup commands or scripts.
All Tool types also accept a depends option, where dependencies are listed by
tool name. Environment types accept a packages option, through which previously
defined package Tools can be installed in that same environment.
Warning
Please be aware that configuration files are parsed sequentially, so the dependency tree must be defined accordingly.
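The ordering constraint can be illustrated with a small Python check. This is only a sketch under stated assumptions: the function check_definition_order, the flat name-to-options mapping and the error message are hypothetical, and how wellies itself reacts to a badly ordered configuration is not described here.

```python
def check_definition_order(tools: dict[str, dict]) -> None:
    """Raise if a tool depends on a name that has not been defined yet.

    `tools` maps tool names to their options in configuration order
    (Python dicts preserve insertion order).
    """
    seen: set[str] = set()
    for name, options in tools.items():
        for dep in options.get("depends", []):
            if dep not in seen:
                raise ValueError(
                    f"{name!r} depends on {dep!r}, which is not defined before it"
                )
        seen.add(name)


# Valid: `python` is declared before the tool that depends on it.
check_definition_order({"python": {}, "pyflow": {"depends": ["python"]}})

# Invalid: the dependency appears after its dependant.
try:
    check_definition_order({"pyflow": {"depends": ["python"]}, "python": {}})
except ValueError as err:
    print(err)
```

In practice this means listing a tool's dependencies above it in the configuration file.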
Examples#
Environment variables#
tools:
  env_variables:
    PYTHONPATH:
      value: "$LIB_DIR/python:$LIB_DIR/bin"
    HDF5:
      variable: HDF5_USE_FILE_LOCKING
      value: "FALSE"
The snippets generated by wellies for the PYTHONPATH variable will be
--------setup script-----------
--------load script-----------
export PYTHONPATH=$LIB_DIR/python:$LIB_DIR/bin:${PYTHONPATH:-}
--------unload script-----------
echo 'removing $LIB_DIR/python:$LIB_DIR/bin from $PYTHONPATH'
export PYTHONPATH=${PYTHONPATH/$LIB_DIR/python:$LIB_DIR/bin:/}
echo '$PATH after removing' && echo $PYTHONPATH
If variable is provided, the top-level key becomes an alias that you can refer
to in the python generator code of your suite, while the actual variable name
on the system is taken from variable. The snippets will reflect that.
--------setup script-----------
--------load script-----------
export HDF5_USE_FILE_LOCKING=FALSE
--------unload script-----------
unset $HDF5_USE_FILE_LOCKING
Note
Any environment variable whose name ends with "PATH" is treated specially: the
configured value is combined with the variable's existing value instead of
overwriting it, as in the PYTHONPATH example above.
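The rule in the note can be sketched in a few lines of Python. The function env_load_line is a hypothetical helper written for this page, not a wellies function; it only reproduces the two load lines shown in the examples above.

```python
def env_load_line(name: str, value: str) -> str:
    """Build the export line for an environment variable (illustrative only)."""
    if name.endswith("PATH"):
        # Keep whatever the variable already holds instead of overwriting it.
        return f"export {name}={value}:${{{name}:-}}"
    return f"export {name}={value}"


print(env_load_line("PYTHONPATH", "$LIB_DIR/python:$LIB_DIR/bin"))
print(env_load_line("HDF5_USE_FILE_LOCKING", "FALSE"))
```

The `${NAME:-}` form expands to an empty string when the variable is unset, which keeps the line safe in scripts that run with `set -u`.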
Modules#
tools:
  modules:
    ecmwf-toolbox:
      version: "2023.10.1.0"
    mymodule:
      modulefiles: /path/to/dev/module
The wellies' generated snippets for the system module will be
--------setup script-----------
--------load script-----------
set +ux
module unload ecmwf-toolbox || true
module load ecmwf-toolbox/2023.10.1.0
set -ux
--------unload script-----------
module unload ecmwf-toolbox
And for private modules
--------setup script-----------
--------load script-----------
module use /path/to/dev/module
set +ux
module unload mymodule || true
module load mymodule/default
set -ux
--------unload script-----------
module unload mymodule
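The difference between the two kinds of module can be summarised with a short Python sketch. The function module_snippets is hypothetical and only reproduces the snippets shown above, including the default version used when none is configured; the real generator in wellies may be organised differently.

```python
from typing import Optional


def module_snippets(
    name: str, version: str = "default", modulefiles: Optional[str] = None
) -> dict[str, list[str]]:
    """Return setup/load/unload lines for a module tool (illustrative only)."""
    load = []
    if modulefiles is not None:
        # Private module: register the directory holding the modulefiles first.
        load.append(f"module use {modulefiles}")
    load += [
        "set +ux",
        f"module unload {name} || true",
        f"module load {name}/{version}",
        "set -ux",
    ]
    # Modules are only loaded and unloaded, so the setup snippet stays empty.
    return {"setup": [], "load": load, "unload": [f"module unload {name}"]}


private = module_snippets("mymodule", modulefiles="/path/to/dev/module")
print("\n".join(private["load"]))
```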
You can also differentiate between the configuration key and the actual module
name by providing a value for name. This makes cross-referencing in dependency
trees easier when several versions of the same module are in use.
tools:
  modules:
    python:
      name: python3
      version: "3.10.10-01"
    python_old:
      name: python3
      version: "old"
    pyflow:
      version: "3.2.0"
      depends: ["python"]
    pcraster:
      version: "4.3.0-01"
      depends: ["python_old"]
Once the tools are combined into a wellies.ToolStore object, the dependencies
are resolved accordingly. Using the configuration above, the pyflow tool will
contain the following snippets:
--------setup script-----------
--------load script-----------
set +ux
module unload python3 || true
module load python3/3.10.10-01
set -ux
set +ux
module unload pyflow || true
module load pyflow/3.2.0
set -ux
--------unload script-----------
module unload python3
module unload pyflow
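The way the dependency's snippets end up in front of the tool's own can be sketched as follows. This is not how wellies.ToolStore is implemented; the function resolve and the plain dictionaries are assumptions made for illustration, and the sketch does not deduplicate dependencies shared by several tools.

```python
def resolve(key: str, tools: dict[str, dict], section: str) -> list[str]:
    """Concatenate `section` of every dependency first, then the tool's own."""
    lines: list[str] = []
    for dep in tools[key].get("depends", []):
        lines.extend(resolve(dep, tools, section))
    lines.extend(tools[key][section])
    return lines


# Snippets keyed by configuration key; `python` maps to the python3 module.
tools = {
    "python": {
        "load": ["set +ux", "module unload python3 || true",
                 "module load python3/3.10.10-01", "set -ux"],
        "unload": ["module unload python3"],
    },
    "pyflow": {
        "load": ["set +ux", "module unload pyflow || true",
                 "module load pyflow/3.2.0", "set -ux"],
        "unload": ["module unload pyflow"],
        "depends": ["python"],
    },
}

print("\n".join(resolve("pyflow", tools, "load")))
print("\n".join(resolve("pyflow", tools, "unload")))
```

Note that dependencies are referenced by configuration key (python), while the generated commands use the module name (python3).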
Packages#
tools:
  packages:
    earthkit:
      type: "git"
      source: "git@github.com:ecmwf/earthkit-data.git"
      branch: "develop"
      build_dir: "/tmp/git/files"
      post_script: [
        "pip uninstall earthkit",
        "pip install .",
        "pip show src | grep Version > version.txt",
      ]
    local_files:
      type: "rsync"
      source: "hpc-login:/path/to/pkg/src"
      post_script: "/path/to/installer.sh"
Considering there is a LIB_DIR environment variable pointing to where packages should be installed, the wellies' generated snippets for earthkit will be
--------setup script-----------
# Main script for retrieving data
mkdir -p /tmp/git/files
dest_dir=/tmp/git/files/earthkit
rm -rf $dest_dir
giturl=git@github.com:ecmwf/earthkit-data.git
gitbranch=develop
git clone $giturl --branch $gitbranch --single-branch --depth 1 $dest_dir
cd $dest_dir
# Post-script
pip uninstall earthkit
pip install .
pip show src | grep Version > version.txt
ecflow_client --label=version $(if [[ -f version.txt ]]; then cat version.txt; else echo NA; fi)
--------load script-----------
--------unload script-----------
The package signature is based on the StaticData
object. For supported types for different retrieval strategies, please check
this page. Further customization in the setup process is
provided by the post_script option that can accept either a literal script or
a reference to an existing file.
So, using the example above once again, local_files will have the following
generated snippets:
--------setup script-----------
# Main script for retrieving data
mkdir -p $LIB_DIR/build/${ENV_NAME:-}
dest_dir=$LIB_DIR/build/${ENV_NAME:-}/local_files
rsync -avzpL hpc-login:/path/to/pkg/src $dest_dir/
cd $dest_dir
# Post-script
echo "Hello from install file"
cd $LIB_DIR/local_files && make install
ecflow_client --label=version $(if [[ -f version.txt ]]; then cat version.txt; else echo NA; fi)
--------load script-----------
--------unload script-----------
Folder environment#
Folder environments are like a namespace to aggregate different tools together. Firstly, checking the environment itself:
Considering there is a LIB_DIR environment variable pointing to the root
directory, the snippets generated by wellies will be
--------setup script-----------
rm -rf $LIB_DIR/bin
mkdir -p $LIB_DIR/bin
--------load script-----------
export PATH=$LIB_DIR/bin:${PATH:-}
--------unload script-----------
echo 'removing $LIB_DIR/bin from $PATH'
export PATH=${PATH/$LIB_DIR/bin:/}
echo '$PATH after removing' && echo $PATH
Python virtual environment#
System environment#
The following config can be used to define a local python virtual environment that extends a system-wide installation:
tools:
  environments:
    myvenv:
      type: system_venv
      extra_packages: ["pcraster>=3.4"]
      venv_options: "--upgrade"
Considering there is a LIB_DIR environment variable pointing to the
installation root directory, the snippets generated by wellies will be
--------setup script-----------
rm -rf $LIB_DIR/myvenv
python3 -m venv $LIB_DIR/myvenv --system-site-packages --upgrade
source $LIB_DIR/myvenv/bin/activate
export LD_LIBRARY_PATH=$LIB_DIR/myvenv/lib:${LD_LIBRARY_PATH:=}
pip install 'pcraster>=3.4'
--------load script-----------
source $LIB_DIR/myvenv/bin/activate
export LD_LIBRARY_PATH=$LIB_DIR/myvenv/lib:${LD_LIBRARY_PATH:=}
--------unload script-----------
deactivate
Note
The reserved name packages always refers to package tools specified within
your configuration. For external packages use extra_packages.
Warning
If provided, extra_packages must always be a list, even if it has a single element.
Build with custom packages#
The following config can be used to define a local python virtual environment that does not use the system-wide site-packages.
tools:
  environments:
    datasets_env:
      type: venv
      packages: [anemoi_datasets]
      depends: [python]
Considering there is a LIB_DIR environment variable pointing to the
installation root directory, the snippets generated by wellies will be
--------setup script-----------
rm -rf $LIB_DIR/datasets_env
python3 -m venv $LIB_DIR/datasets_env
--------load script-----------
source $LIB_DIR/datasets_env/bin/activate
export LD_LIBRARY_PATH=$LIB_DIR/datasets_env/lib:${LD_LIBRARY_PATH:=}
--------unload script-----------
deactivate
Note
The reserved name packages always refers to package tools specified within
your configuration. For external packages use extra_packages.
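The only difference between the two virtual environment types at creation time is the --system-site-packages argument. A minimal Python sketch of that choice follows; venv_create_command is a hypothetical helper that reproduces the creation lines of the two examples above and is not taken from the wellies code base.

```python
def venv_create_command(path: str, env_type: str, venv_options: str = "") -> str:
    """Build the `python3 -m venv` line for a virtual environment (sketch)."""
    if env_type not in ("venv", "system_venv"):
        raise ValueError(f"unsupported virtual environment type: {env_type!r}")
    parts = ["python3", "-m", "venv", path]
    if env_type == "system_venv":
        # Expose the packages of the system-wide python installation.
        parts.append("--system-site-packages")
    if venv_options:
        parts.append(venv_options)
    return " ".join(parts)


print(venv_create_command("$LIB_DIR/myvenv", "system_venv", "--upgrade"))
print(venv_create_command("$LIB_DIR/datasets_env", "venv"))
```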
Conda environments#
Conda environments can be defined in three ways:
- Existing system environment
- Built from a specification list
- Built from a specification file
The loading and unloading script snippets for all of them will be the same. They differ only in the way they are set up, and this will be the focus here.
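Which of the three strategies applies depends on the options present in the configuration entry. The following Python sketch shows one possible way to express that choice. The function conda_strategy is hypothetical, and both the precedence used when several options are present and the error raised when none is present are assumptions, not documented wellies behaviour.

```python
def conda_strategy(options: dict) -> str:
    """Pick a build strategy from the options of a conda environment entry."""
    if "env_file" in options:
        return "build from specification file"
    if "extra_packages" in options:
        return "build from package list"
    if "environment" in options:
        return "use existing environment"
    # Assumption: an entry with none of the three options is rejected.
    raise ValueError("expected one of env_file, extra_packages or environment")


print(conda_strategy({"type": "conda", "environment": "base"}))
print(conda_strategy({"type": "conda", "extra_packages": ["gdal"]}))
print(conda_strategy({"type": "conda", "env_file": {"type": "copy"}}))
```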
System environment#
A system conda environment can be configured as
Considering there is a LIB_DIR environment variable pointing to the
installation root directory, the snippets generated by wellies will be
--------setup script-----------
--------load script-----------
set +ux
conda activate base
set -ux
--------unload script-----------
conda deactivate
Build with custom packages#
To specify a conda environment that needs to be built within your workflow from a list of package specifications, a configuration file can look like:
tools:
  environments:
    myconda:
      type: conda
      packages: ["earthkit"]
      extra_packages: ["python==3.10", "pcraster>=4.3.0", "gdal"]
      conda_cmd: "mamba -c conda-forge"
Here the base conda command used during setup is also changed, so that the mamba environment solver is used and the conda-forge channel takes priority. The conda_cmd option lets you pass any valid extra option to your conda commands.
Note
The reserved name packages always refers to package tools specified within
your configuration. For external packages use extra_packages.
Warning
If provided, extra_packages must always be a list, even if it has a single element.
Considering there is a LIB_DIR environment variable pointing to the
installation root directory, the snippets generated by wellies will be
--------setup script-----------
rm -rf $LIB_DIR/myconda
mamba -c conda-forge create -p $LIB_DIR/myconda python==3.10 'pcraster>=4.3.0' gdal
--------load script-----------
set +ux
conda activate $LIB_DIR/myconda
set -ux
--------unload script-----------
conda deactivate
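In the setup snippet above, the specification containing a comparison operator is quoted so that the shell does not interpret > as a redirection. The Python sketch below shows one way to obtain that quoting with the standard library. The helper conda_create_command is hypothetical, and whether wellies relies on shlex internally is an assumption.

```python
import shlex


def conda_create_command(
    prefix: str, extra_packages: list[str], conda_cmd: str = "conda"
) -> str:
    """Build a `create` line for a conda environment located at `prefix`."""
    # Quote only the package specifications; the prefix is left untouched so
    # that $LIB_DIR is still expanded by the shell at run time.
    specs = " ".join(shlex.quote(spec) for spec in extra_packages)
    return f"{conda_cmd} create -p {prefix} {specs}"


print(
    conda_create_command(
        "$LIB_DIR/myconda",
        ["python==3.10", "pcraster>=4.3.0", "gdal"],
        conda_cmd="mamba -c conda-forge",
    )
)
```

shlex.quote leaves strings made only of safe characters unchanged, which is why python==3.10 and gdal appear without quotes.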
From file#
Another common way to specify a conda environment is through yml files. A valid configuration to obtain such a file from the local filesystem is
tools:
  environments:
    myconda:
      type: conda
      env_file:
        type: copy
        source: /path/to/project
        files: env.yml
The env_file specification can be any valid wellies.StaticData entry. For
more details on the options available, please check the
data types page.
Considering there is a LIB_DIR environment variable pointing to the
installation root directory, the snippets generated by wellies will be
--------setup script-----------
# Main script for retrieving data
mkdir -p $LIB_DIR/build
dest_dir=$LIB_DIR/build/myconda
rm -rf $dest_dir
mkdir -p $dest_dir
scp /path/to/project/env.yml $dest_dir/
cd $dest_dir
rm -rf $LIB_DIR/myconda
conda env create --file $LIB_DIR/build/myconda/env.yml -p $LIB_DIR/myconda
--------load script-----------
set +ux
conda activate $LIB_DIR/myconda
set -ux
--------unload script-----------
conda deactivate
Custom Environments#
A custom environment is a flexible alternative for generating any other type of environment. It supports
the definition of load, unload and setup scripts or commands from the configuration files.
tools:
  environments:
    myenv:
      type: custom
      load: "/path/to/env/load.sh"
      unload: "unload_myenv"
The wellies' generated snippets for the custom environment will be