The SLURM documentation provides the basic information that you can use Docker within SLURM, as long as you use rootless Docker. However, some crucial pieces are missing.
The issue you will immediately run into is that the SLURM resource allocation is not propagated to Docker at all. For example, if you start your job with

```shell
srun --gpus 1 docker ...
```

all GPUs will nevertheless be available to Docker.
The reason is that Docker uses a manager daemon that the docker CLI communicates with, and that daemon does not know anything about SLURM or any resources it allocated for the job.
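Independently of the daemon question, you can at least keep a container from seeing unallocated GPUs by forwarding CUDA_VISIBLE_DEVICES, which SLURM sets inside the job step, to docker's device selection. This is a sketch, not part of the original recipe: the wrapper name and image are placeholders, and it assumes SLURM's device indices match the host's (and that the NVIDIA container toolkit is installed).

```shell
# Hypothetical wrapper: run a container restricted to the GPUs SLURM
# allocated for this job step. SLURM exports CUDA_VISIBLE_DEVICES
# (e.g. "0" for --gpus 1), which we forward to docker's --gpus flag.
# Note: this limits the container, not the daemon itself.
run_in_allocation() {
    docker run --rm --gpus "device=${CUDA_VISIBLE_DEVICES}" "$@"
}

# Usage inside a job step:
#   run_in_allocation my-cuda-image nvidia-smi
```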
The solution is to start a daemon per job (instead of per user), since one user might want to run different jobs with different allocations on the same machine. The Docker documentation gives you an idea of how to do that. You will need to set at least the following parameters to make the daemon fully job-specific:
```shell
# dockerd-rootless.sh requires XDG_RUNTIME_DIR
XDG_RUNTIME_DIR=/somewhere/including/$SLURM_JOB_ID
# export, so docker client sees it later on
export DOCKER_HOST=unix://$XDG_RUNTIME_DIR/docker.sock
dockerd-rootless.sh --host=$DOCKER_HOST --data-root=... --exec-root=...
```
DOCKER_HOST makes the docker CLI use the correct daemon.
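Put together, a batch script could wrap this setup in two functions, one deriving job-unique paths and one starting the daemon and waiting for its socket. This is only a sketch: the /tmp paths are illustrative stand-ins for your site's scratch layout, and the socket-wait loop is deliberately simple.

```shell
# Sketch of a per-job rootless Docker setup inside a SLURM batch script.
# All paths are placeholders; adjust them to your cluster.

job_docker_env() {
    # Derive job-unique paths from SLURM_JOB_ID so that concurrent jobs
    # of the same user on one node do not share a daemon.
    export XDG_RUNTIME_DIR="/tmp/docker-runtime-${SLURM_JOB_ID}"
    export DOCKER_HOST="unix://${XDG_RUNTIME_DIR}/docker.sock"
    mkdir -p "$XDG_RUNTIME_DIR"
}

start_job_daemon() {
    # Start the per-job daemon in the background, then block until
    # its socket exists so that subsequent docker calls succeed.
    dockerd-rootless.sh --host="$DOCKER_HOST" \
        --data-root="/tmp/docker-data-${SLURM_JOB_ID}" \
        --exec-root="/tmp/docker-exec-${SLURM_JOB_ID}" &
    DOCKERD_PID=$!
    while [ ! -S "${XDG_RUNTIME_DIR}/docker.sock" ]; do sleep 1; done
}

# Usage inside the job:
#   job_docker_env
#   start_job_daemon
#   docker run ...
#   kill "$DOCKERD_PID"
```

Remember to kill the daemon at the end of the job so it does not outlive the allocation.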
The drawback of this method is that each job needs to pull its images again, due to the separate data-root paths. Switching to podman might solve that.
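The podman idea, sketched very roughly: rootless podman is daemonless, so there is no per-job daemon or data-root to manage, and images live in a per-user store (by default under ~/.local/share/containers) that is reused across jobs. The wrapper and image name below are placeholders, and whether the job's GPU allocation carries through depends on your site's cgroup configuration.

```shell
# Hypothetical podman equivalent: no daemon to start, the container
# runs as a direct child of the job step and the per-user image store
# is shared across jobs, so images are not re-pulled per job.
podman_job() {
    srun --gpus 1 podman run --rm "$@"
}

# Usage:
#   podman_job my-cuda-image nvidia-smi
```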