UCSF Wynton HPC Cluster

Examples not working?

Currently, the conda-stage tool has only been tested with the Bash shell, and it is unlikely it will work with other shells. Most users on Wynton use Bash, but a few have explicitly asked to use another. Type echo $SHELL if you’re not sure what shell you use.
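As a minimal sketch of that check, the snippet below tests whether the login shell recorded in $SHELL is Bash. The helper name uses_bash is hypothetical and is not part of conda-stage.

```shell
# Hypothetical helper (not part of conda-stage): succeed if the given
# shell path, as stored in $SHELL, points to Bash.
uses_bash() {
  case "${1##*/}" in      # strip the directory part, e.g. /bin/bash -> bash
    bash) return 0 ;;
    *)    return 1 ;;
  esac
}

if uses_bash "${SHELL:-}"; then
  echo "Login shell is Bash"
else
  echo "Login shell is '${SHELL:-unknown}'; conda-stage may not work"
fi
```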

Stage Conda environment on local disk (highly recommended) #

Please, stage your Conda environment to local disk! Your software and job scripts will run much faster, and it will significantly decrease the load on our global filesystem (BeeGFS). It is a win-win for everyone!

Working with a Conda environment that lives on local disk greatly improves the performance. This is because the local disk (/scratch) on the current machine is much faster than any network-based file system, including BeeGFS (/wynton) used on Wynton. This is particularly beneficial when running many instances of a software tool, e.g. in job scripts.

Staging a Conda environment to local disk is straightforward using the conda-stage tool. All we have to do is configure the environment once, and from then on we can work with conda activate ... and conda deactivate as normal.

Below is a walk-through that illustrates the process. It assumes we have already created a Conda environment named myjupyter with some software installed.

Configure Conda environment for automatic staging (once) #

To configure Conda environment myjupyter for automatic staging, call conda stage --auto-stage=enable myjupyter as in:

[alice@dev2 ~]$ module load CBI miniforge3  ## or your own conda installation
[alice@dev2 ~]$ module load CBI conda-stage
[alice@dev2 ~]$ conda stage --auto-stage=enable myjupyter
INFO: Configuring automatic staging and unstaging of original Conda environment  ...
INFO: [ONCE] Packaging Conda environment, because it hasn't been done before ...
Collecting packages...
Packing environment at '/wynton/home/boblab/alice/.conda/envs/myjupyter' to
'/wynton/home/boblab/alice/.conda/envs/.tmp.myjupyter.tar.gz'
[########################################] | 100% Completed |  4min  5.6s
INFO: Total 'conda-pack' time: 274 seconds
INFO: Created conda-pack tarball: /wynton/home/boblab/alice/.conda/envs/myjupyter.tar.gz
      (140099022 bytes; 2025-03-02 19:33:08.108806755 -0800)
INFO: Enabled auto-staging
INFO: Enabled auto-unstaging
[alice@dev2 ~]$ 

This configuration step can take a few minutes (about four and a half minutes in the example above), but it needs to be done only once per environment.

That’s basically it! From now on, you can do what you have always done with Conda environments, as illustrated next.

Activating and deactivating Conda environment (as usual) #

Each time you activate the environment, it is automatically staged to local disk:

[alice@dev2 ~]$ conda activate myjupyter
INFO: Staging current Conda environment (/wynton/home/boblab/alice/.conda/envs/myjupyter) to local disk ...
INFO: Extracting /wynton/home/boblab/alice/.conda/envs/myjupyter.tar.gz 
      (86965746 bytes; 2022-04-15 16:53:50.000000000 -0700) to /scratch/alice/conda-stage-grWA/myjupyter
INFO: Total extract time: 4 seconds
INFO: Disable any /scratch/alice/conda-stage-grWA/myjupyter/etc/conda/activate.d/*.conda-stage-auto.sh scripts
INFO: Activating staged environment
INFO: Unpacking (relocating)
INFO: Total 'conda-unpack' time: 0 seconds
INFO: Making staged environment read-only (use --writable to disable)
INFO: Activating staged Conda environment: /scratch/alice/conda-stage-grWA/myjupyter
(/scratch/alice/conda-stage-grWA/myjupyter) [alice@dev2 ~]$ 

To convince ourselves that, at this point, everything runs off the local disk, try this:

(/scratch/alice/conda-stage-grWA/myjupyter) [alice@dev2 ~]$ command -v python
/scratch/alice/conda-stage-grWA/myjupyter/bin/python
(/scratch/alice/conda-stage-grWA/myjupyter) [alice@dev2 ~]$ command -v jupyter
/scratch/alice/conda-stage-grWA/myjupyter/bin/jupyter

Success! This means that these software tools run much faster, because they no longer rely on the much slower BeeGFS filesystem. Another advantage is that your Conda software stack adds much less load to BeeGFS, which otherwise can be quite significant when using Conda. This is a win-win for everyone. See ‘Benchmark staged Conda environment’ below for some benchmark results.
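The same check can be scripted. The sketch below assumes, as in the output above, that staged environments live under /scratch. The helper name runs_from_prefix is hypothetical and is not part of conda-stage.

```shell
# Hypothetical helper: succeed if command $1 resolves to an executable
# under directory prefix $2 (default: /scratch, where environments are
# staged in the examples above).
runs_from_prefix() {
  local cmd="$1" prefix="${2:-/scratch}" path
  path="$(command -v "$cmd")" || return 1   # command not found
  case "$path" in
    "$prefix"/*) return 0 ;;
    *)           return 1 ;;
  esac
}
```

For example, runs_from_prefix python && echo "python runs from local disk" prints the message only while a staged environment is active.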

When the environment is deactivated, it is automatically unstaged and all of the temporary, staged files are removed. No surprises here either:

(/scratch/alice/conda-stage-grWA/myjupyter) [alice@dev2 ~]$ conda deactivate
INFO: Unstaging and reverting to original Conda environment  ...
INFO: Preparing removal of staged files: /scratch/alice/conda-stage-grWA/myjupyter
INFO: Deactivating and removing staged Conda environment: /scratch/alice/conda-stage-grWA/myjupyter
INFO: Total unstage time: 0 seconds
[alice@dev2 ~]$ command -v jupyter
[alice@dev2 ~]$ command -v python
/usr/bin/python

Using Conda staging in job scripts #

To use a staged Conda environment in a job script, first configure it for automatic staging interactively from a development node, as above. Then activate the environment as usual, e.g.

#! /usr/bin/env bash
#$ -S /bin/bash   # Run in bash
#$ -cwd           # Current working directory
#$ -j y           # Join STDERR and STDOUT
#$ -R yes         # SGE host reservation, highly recommended

module load CBI miniforge3

conda activate myjupyter
trap 'conda deactivate' EXIT

…

In this example, we have also added a shell “trap” that deactivates the environment when the script exits. This ensures that the staged environment is unstaged and that all of its temporary files are removed.
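To see why the trap is worth adding, the following self-contained sketch (no Conda involved) shows that an EXIT trap still runs when a script terminates early with an error. The function name demo_exit_trap and the messages are illustrative only.

```shell
# Illustrative only: run a fake "job" in a subshell that fails early,
# and show that the EXIT trap still fires.
demo_exit_trap() {
  (
    # Fires whenever this subshell exits, for whatever reason
    trap 'echo "cleanup: conda deactivate would run here"' EXIT
    echo "job: working"
    exit 3                       # simulate the job failing early
    echo "job: never reached"
  )
}

# The subshell's exit status (3) is preserved after the trap has run
demo_exit_trap || echo "exit status: $?"
```

In a real job script, the trap body would be conda deactivate, as in the example above.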

If you get the error /usr/share/lmod/lmod/init/sh: line 14: 'conda-stage': not a valid identifier, make sure your job script declares which shell to use (#$ -S /bin/bash).

Update an automatically-staged Conda environment #

If we update or install Conda packages in a staged environment, the changes are all lost when it is unstaged. Because of this, staged environments are read-only by default (the conda-stage option --writable overrides this). For an installation to persist, we need to install to the original Conda environment before it is staged. The easiest approach is to first disable auto-staging:

[alice@dev2 ~]$ module load CBI conda-stage
[alice@dev2 ~]$ conda stage --auto-stage=disable myjupyter
INFO: Configuring automatic staging and unstaging of original Conda environment  ...
INFO: Removed 'conda-pack' tarball /home/alice/.conda/envs/myenv.tar.gz
      (140098670 bytes; 2025-03-02 18:09:55.975267674 -0800)
INFO: Disabled auto-staging
INFO: Disabled auto-unstaging
[alice@dev2 ~]$ 

Then update it as usual, as in:

[alice@dev2 ~]$ conda activate myjupyter
(myjupyter) [alice@dev2 ~]$ conda update --all
…
(myjupyter) [alice@dev2 ~]$ conda deactivate

Finally, re-enable auto-staging as above, i.e.

[alice@dev2 ~]$ conda stage --auto-stage=enable myjupyter
...

Appendix #

Benchmark staged Conda environment #

To illustrate the benefit of staging a Conda environment to local disk, we will benchmark how long it takes for jupyter --version to complete without staging and with staging.

Without staging to local disk, the call takes a whopping 32 seconds to return:

[alice@dev2 ~]$ CONDA_STAGE=false conda activate myjupyter
(myjupyter) [alice@dev2 ~]$ command -v jupyter
/wynton/home/boblab/alice/.conda/envs/myjupyter/bin/jupyter
(myjupyter) [alice@dev2 ~]$ command time --portability jupyter --version > /dev/null
real 32.06
user 1.42
sys 0.76

This test was conducted while the cluster was experiencing heavy load on the BeeGFS file system. The fact that real is much greater than user + sys suggests that our process spends most of its time just waiting. By staging to local disk, we avoid being affected by this load. When running from the local disk, the same call takes less than a second:

[alice@dev2 ~]$ conda activate myjupyter
(/scratch/alice/conda-stage_wFWY/myjupyter) [alice@dev2 ~]$ command -v jupyter
/scratch/alice/conda-stage_wFWY/myjupyter/bin/jupyter
(/scratch/alice/conda-stage_wFWY/myjupyter) [alice@dev2 ~]$ command time --portability jupyter --version > /dev/null
real 0.75
user 0.67
sys 0.07
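As a rough illustration of the "waiting" argument, the sketch below computes real - (user + sys) for the two timings reported above. The helper name wait_time is hypothetical, and the difference is only an approximation of time spent waiting (for example, on file-system I/O).

```shell
# Hypothetical helper: approximate the seconds spent waiting,
# i.e. real - (user + sys), from the three values reported by 'time -p'.
wait_time() {
  awk -v real="$1" -v user="$2" -v sys="$3" \
    'BEGIN { printf "%.2f\n", real - (user + sys) }'
}

wait_time 32.06 1.42 0.76   # without staging (BeeGFS)
wait_time 0.75 0.67 0.07    # with staging (local disk)
```

With the numbers above, almost 30 of the 32 seconds without staging are spent waiting, compared with about a hundredth of a second when staged.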

Proof that a staged Conda environment lives on local disk #

If we run jupyter --version through strace to log all files accessed,

[alice@dev2 ~]$ conda activate myjupyter
(/scratch/alice/conda-stage_wFWY/myjupyter) [alice@dev2 ~]$ strace -e trace=stat -o jupyter.strace jupyter --version

Selected Jupyter core packages...
IPython          : 8.2.0
ipykernel        : 6.9.1
ipywidgets       : not installed
jupyter_client   : 7.1.2
jupyter_core     : 4.9.2
jupyter_server   : not installed
jupyterlab       : not installed
nbclient         : 0.5.11
nbconvert        : 6.4.4
nbformat         : 5.1.3
notebook         : 6.4.10
qtconsole        : not installed
traitlets        : 5.1.1

and inspect the jupyter.strace log file, we find that most file-access calls go to the local disk:

$ head -6 jupyter.strace 
stat("/scratch/alice/conda-stage_wFWY/myjupyter/bin/../lib/tls/x86_64", 0x7ffc9a9ea980) = -1 ENOENT (No such file or directory)
stat("/scratch/alice/conda-stage_wFWY/myjupyter/bin/../lib/tls", 0x7ffc9a9ea980) = -1 ENOENT (No such file or directory)
stat("/scratch/alice/conda-stage_wFWY/myjupyter/bin/../lib/x86_64", 0x7ffc9a9ea980) = -1 ENOENT (No such file or directory)
stat("/scratch/alice/conda-stage_wFWY/myjupyter/bin/../lib", {st_mode=S_IFDIR|0755, st_size=8192, ...}) = 0
stat("/etc/sysconfig/64bit_strstr_via_64bit_strstr_sse2_unaligned", 0x7ffc9a9eaf10) = -1 ENOENT (No such file or directory)
stat("/scratch/alice/conda-stage_wFWY/myjupyter/bin/python", {st_mode=S_IFREG|0755, st_size=15880080, ...}) = 0

Exactly how many of them? In this simple example, where we only query the version of Jupyter Notebook and its dependencies, there are 4,027 queries to the file system:

$ grep -c stat jupyter.strace 
4027

Of these, 4,021 go to the local disk (/scratch):

$ grep -c 'stat("/scratch' jupyter.strace 
4021

and only one toward the BeeGFS file system (/wynton):

$ grep 'stat("/wynton' jupyter.strace 
stat("/wynton/home/boblab/alice/.local/lib/python3.9/site-packages", 0x7ffc9a9ea820) = -1 ENOENT (No such file or directory)

In other words, by staging the Conda environment to local disk, we saved ourselves, and the system, 4,021 queries to the BeeGFS file system. And that is for just the very simple jupyter --version call.
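The per-file-system counts can also be produced in one pass. The sketch below groups the stat() calls in an strace log by the top-level directory of the path. It assumes the log format shown above, where each line starts with stat("/...". The function name count_stat_by_root is hypothetical.

```shell
# Hypothetical helper: count stat() calls in an strace log ($1),
# grouped by the top-level directory of the path that was queried.
count_stat_by_root() {
  awk -F'"' '
    /^stat\("\// {
      split($2, parts, "/")   # "/scratch/alice/..." -> parts[2] is "scratch"
      count[parts[2]]++
    }
    END { for (root in count) print root, count[root] }
  ' "$1" | sort
}
```

Calling count_stat_by_root jupyter.strace prints one line per top-level directory, such as scratch, wynton, and etc, followed by its count.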