Hosting Jupyter Notebooks on Slurm
Slurm is a popular choice for job scheduling and resource management on multi-node GPU clusters, which are widely used for training large deep learning models.
A Slurm GPU cluster consists of one or more head (or login) nodes and multiple compute nodes. Generally, the head nodes are lighter on compute resources (e.g., no GPUs, low RAM). System admins commonly prevent users from logging into compute nodes directly from the head node, since doing so would bypass Slurm resource allocation (see PAM).
In this post, we’ll set up a Jupyter Notebook server on an allocated compute node without having to directly SSH to it from the head node. To access the server locally, we’ll set up port forwarding from the compute node to the local machine via an SSH tunnel through the head node.
Log into Slurm Head Node
First, make sure that you can log into the Slurm head node <slurm_head_node>. Check by running SSH with the following command in the Terminal on your local machine:
ssh -p <ssh_port> <slurm_username>@<slurm_head_node> -i <path_to_the_rsa_private_key_file>
Request Slurm Compute
Once logged into the head node, compute resources can be requested using the Slurm command srun, which spins up an interactive shell on a compute node with the resources requested:
srun --partition=<slurm_partition> --gres=gpu:1 --mem=50G --pty bash -l
Here, we requested one GPU and 50 GB of memory (RAM) on a Slurm compute node. Once the resources are allocated, the shell in the Terminal session moves to the compute node <slurm_compute_node>.
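The tunnel set up later in this post needs the compute node's hostname, so it can help to note it down right after the allocation. The following is a minimal sketch, assuming a POSIX shell. The name describe_allocation is a hypothetical helper, SLURM_JOB_ID is the variable Slurm sets inside a job, and CUDA_VISIBLE_DEVICES is only set on clusters configured to expose allocated GPUs that way:

```shell
# Hypothetical helper: summarize where this shell is running.
describe_allocation() {
    # uname -n prints the node name (what `hostname` reports on most Linux systems).
    echo "node=$(uname -n)"
    # SLURM_JOB_ID is exported by Slurm inside a job; "none" means we are not in one.
    echo "job=${SLURM_JOB_ID:-none}"
    # Assumption: the cluster sets CUDA_VISIBLE_DEVICES for allocated GPUs.
    echo "gpus=${CUDA_VISIBLE_DEVICES:-unknown}"
}

describe_allocation
```

The node= value is what this post calls <slurm_compute_node>. If job= prints none, the shell is most likely still on the head node.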
Run Jupyter Server
To run a Jupyter Notebook server in the background without tying up the interactive shell, use Tmux or GNU Screen. Here we use tmux, creating a session named jupyter_tmux on the allocated compute node:
tmux new -s "jupyter_tmux"
Now, launch a Jupyter Notebook server within the tmux session by running:
jupyter notebook --no-browser --port <jupyter_port>
Detach from the tmux session by pressing Ctrl+b, followed by d.
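The create, launch, and detach steps above can also be collapsed into a single command, because tmux can start a session that is already detached. The following is a sketch under that assumption. The name start_jupyter_tmux is a hypothetical helper, and the port is whatever value you choose for <jupyter_port>:

```shell
# Hypothetical helper: start Jupyter inside a detached tmux session in one step.
# Usage: start_jupyter_tmux <jupyter_port> [session_name]
start_jupyter_tmux() {
    jupyter_port="$1"
    session_name="${2:-jupyter_tmux}"
    # -d creates the session without attaching, so there is nothing to detach from.
    # The quoted string is the command tmux runs inside the new session.
    tmux new-session -d -s "$session_name" \
        "jupyter notebook --no-browser --port $jupyter_port"
}
```

Calling start_jupyter_tmux 8888 on the compute node would start the server on port 8888 (an example value). Running tmux attach -t jupyter_tmux afterwards shows the Jupyter log, which by default includes the URL with the login token.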
Forward Ports By SSH Tunneling
With the Jupyter server running on the Slurm compute node, the next step is to forward the port <jupyter_port> from the compute node to the local machine via an SSH tunnel through the head node.
Open a new Terminal session on the local computer and run:
# on local Terminal
ssh -A -N -f -o "ProxyCommand ssh -W %h:%p -p <ssh_port> <slurm_username>@<slurm_head_node> -i <path_to_the_rsa_private_key_file>" -L localhost:<jupyter_port>:localhost:<jupyter_port> <slurm_username>@<slurm_compute_node> -i <path_to_the_rsa_private_key_file>
Voilà! The Jupyter Notebook server should now be accessible from the local machine at http://localhost:<jupyter_port>/.
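For readers who want to see what each part of the tunnel command does, here is the same command wrapped in a small function with one option per line. This is only a sketch: open_jupyter_tunnel is a hypothetical helper name, the arguments stand in for the placeholders used throughout this post, and the key path is assumed to contain no spaces:

```shell
# Hypothetical helper, to be run on the local machine.
# Usage: open_jupyter_tunnel <head_node> <compute_node> <user> <ssh_port> <jupyter_port> <key_file>
open_jupyter_tunnel() {
    head_node="$1"; compute_node="$2"; user="$3"
    ssh_port="$4"; jupyter_port="$5"; key_file="$6"

    # -A  forward the local SSH agent, as in the command above
    # -N  do not run a remote command, only forward ports
    # -f  move ssh to the background once the connection is set up
    # -o ProxyCommand  reach the compute node through the head node;
    #     -W %h:%p relays the connection to the target host and port
    # -L  map the local <jupyter_port> to localhost:<jupyter_port> on the compute node
    ssh -A -N -f \
        -o "ProxyCommand ssh -W %h:%p -p $ssh_port -i $key_file $user@$head_node" \
        -i "$key_file" \
        -L "localhost:$jupyter_port:localhost:$jupyter_port" \
        "$user@$compute_node"
}
```

Because -f sends ssh to the background, the tunnel stays open until that ssh process is stopped or the connection drops.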
Citation
To cite this work, please use:
Bhaskara, Vin (Nov 2023). Hosting Jupyter Notebooks on Slurm. https://vinbhaskara.github.io/posts/2023/11/slurm-jupyter/
or,
@article{bhaskara2023slurm,
  title = "Hosting Jupyter Notebooks on Slurm",
  author = "Bhaskara, Vin",
  journal = "vinbhaskara.github.io",
  year = "2023",
  month = "Nov",
  url = "https://vinbhaskara.github.io/posts/2023/11/slurm-jupyter/"
}