Docker containers don’t see your system’s GPU automatically. This causes reduced performance in GPU-dependent workloads such as machine learning frameworks. Here’s how to expose your host’s NVIDIA GPU to your containers.
Making GPUs Work In Docker
Docker containers share your host’s kernel but bring along their own operating system and software packages. This means they lack the NVIDIA drivers used to interface with your GPU. Docker doesn’t even add GPUs to containers by default, so a plain docker run won’t see your hardware at all.
At a high level, getting your GPU to work is a two-step procedure: install the drivers within your image, then instruct Docker to add GPU devices to your containers at runtime.
This guide focuses on modern versions of CUDA and Docker. The latest release of NVIDIA Container Toolkit is designed for combinations of CUDA 10 and Docker Engine 19.03 and later. Older builds of CUDA, Docker, and the NVIDIA drivers may require additional steps.
Adding the NVIDIA Drivers
Make sure you’ve got the NVIDIA drivers working properly on your host before you continue with your Docker configuration. You should be able to successfully run nvidia-smi and see your GPU’s name, driver version, and CUDA version.
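If you want to read those details programmatically, they can be scraped from the header that nvidia-smi prints. This is an illustrative sketch, not part of NVIDIA’s tooling; the sample string mimics the header format of recent driver releases:

```python
import re

# Sample header line in the style printed by recent nvidia-smi releases.
SAMPLE = "| NVIDIA-SMI 470.57.02    Driver Version: 470.57.02    CUDA Version: 11.4     |"

def parse_smi_header(line):
    """Extract the driver and CUDA versions from an nvidia-smi header line."""
    match = re.search(r"Driver Version: ([\d.]+)\s+CUDA Version: ([\d.]+)", line)
    if not match:
        raise ValueError("unrecognised nvidia-smi output")
    return {"driver": match.group(1), "cuda": match.group(2)}

print(parse_smi_header(SAMPLE))  # {'driver': '470.57.02', 'cuda': '11.4'}
```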
To use your GPU with Docker, begin by adding the NVIDIA Container Toolkit to your host. This integrates into Docker Engine to automatically configure your containers for GPU support.
Add the toolkit’s package repository to your system using the example command:
distribution=$(. /etc/os-release;echo $ID$VERSION_ID) \
   && curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add - \
   && curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
Next, install the nvidia-docker2 package on your host:

apt-get update
apt-get install -y nvidia-docker2
Restart the Docker daemon to complete the installation:
sudo systemctl restart docker
The Container Toolkit should now be operational. You’re ready to start a test container.
Starting a Container With GPU Access
As Docker doesn’t provide your system’s GPUs by default, you need to create containers with the --gpus flag for your hardware to show up. You can either specify individual devices to enable or use the all keyword to expose every GPU.

The nvidia/cuda images are preconfigured with the CUDA binaries and GPU tools. Start a container and run the nvidia-smi command to check that your GPU is accessible. The output should match what you saw when using nvidia-smi on your host. The CUDA version could differ depending on the toolkit versions on your host and in your selected container image.
docker run -it --gpus all nvidia/cuda:11.4.0-base-ubuntu20.04 nvidia-smi
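Besides all, the --gpus flag also accepts a device count or a device= list of GPU indices or UUIDs. As a sketch only (this helper is hypothetical, not part of Docker), here is one way to assemble the argument list for each form:

```python
def gpu_args(selection):
    """Build the --gpus portion of a `docker run` argument list.

    selection may be "all", an integer count, or a list of device
    indices/UUIDs (rendered as a device= specifier).
    """
    if selection == "all":
        return ["--gpus", "all"]
    if isinstance(selection, int):
        return ["--gpus", str(selection)]
    # A multi-device list needs literal quotes around the value, e.g.
    # --gpus '"device=0,1"' in a shell, because the flag's value is
    # parsed as CSV and the device field itself contains commas.
    return ["--gpus", '"device=' + ",".join(str(d) for d in selection) + '"']

print(gpu_args("all"))   # ['--gpus', 'all']
print(gpu_args([0, 1]))  # ['--gpus', '"device=0,1"']
```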
Selecting a Base Image
Using one of the nvidia/cuda tags is the quickest and easiest way to get your GPU workload running in Docker. Many different variants are available; they provide a matrix of operating system, CUDA version, and NVIDIA software options. The images are built for multiple architectures.
Each tag has this format:

11.4.0 – CUDA version.
base – Image flavor.
ubuntu20.04 – Operating system version.
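Since every tag follows the same three-part pattern, you can split one mechanically. A throwaway helper to illustrate (it assumes the plain three-part form; variants such as the cudnn tags insert extra components):

```python
def parse_cuda_tag(tag):
    """Split a plain nvidia/cuda tag into (CUDA version, flavor, OS version)."""
    cuda_version, flavor, os_version = tag.split("-", 2)
    return cuda_version, flavor, os_version

print(parse_cuda_tag("11.4.0-base-ubuntu20.04"))
# ('11.4.0', 'base', 'ubuntu20.04')
```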
Three different image flavors are available. The base image is a minimal option with the essential CUDA runtime binaries. runtime is a more fully-featured option that includes the CUDA math libraries and NCCL for cross-GPU communication. The third variant is devel, which gives you everything from runtime as well as headers and development tools for creating custom CUDA images.
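Because the flavors are supersets of each other, picking one is just a question of the smallest image that covers what you need. A hedged sketch of that decision (the component names here are descriptive labels drawn from the description above, not real package names):

```python
# Descriptive labels for what each flavor ships, per the text above.
FLAVOR_CONTENTS = {
    "base":    {"cuda-runtime"},
    "runtime": {"cuda-runtime", "math-libraries", "nccl"},
    "devel":   {"cuda-runtime", "math-libraries", "nccl", "headers", "dev-tools"},
}

def smallest_flavor(needed):
    """Return the smallest image flavor covering the needed components."""
    for flavor in ("base", "runtime", "devel"):
        if needed <= FLAVOR_CONTENTS[flavor]:
            return flavor
    raise ValueError("no flavor provides: " + ", ".join(sorted(needed)))

print(smallest_flavor({"cuda-runtime"}))     # base
print(smallest_flavor({"nccl"}))             # runtime
print(smallest_flavor({"headers", "nccl"}))  # devel
```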
If one of the images will work for you, aim to use it as your base in your Dockerfile. You can then use regular Dockerfile instructions to install your programming languages, copy in your source code, and configure your application. It removes the complexity of manual GPU setup steps.
FROM nvidia/cuda:11.4.0-base-ubuntu20.04
RUN apt update
RUN apt-get install -y python3 python3-pip
RUN pip install tensorflow-gpu
COPY tensor-code.py .
ENTRYPOINT ["python3", "tensor-code.py"]
Building and running this image with the --gpus flag would start your TensorFlow workload with GPU acceleration.
Manually Configuring an Image
You can manually add CUDA support to your image if you need to choose a different base. The best way to achieve this is to reference the official NVIDIA Dockerfiles.
Copy the instructions used to add the CUDA package repository, install the library, and link it into your path. We’re not reproducing all the steps in this guide as they vary by CUDA version and operating system.
Pay attention to the environment variables at the end of the Dockerfile – these define how containers using your image integrate with the NVIDIA Container Runtime:
ENV NVIDIA_VISIBLE_DEVICES all
ENV NVIDIA_DRIVER_CAPABILITIES compute,utility
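NVIDIA_DRIVER_CAPABILITIES takes a comma-separated list of driver feature groups. As a sketch, a small validator (hypothetical, not part of the toolkit; the capability list reflects the documented values at the time of writing and newer releases may add more):

```python
# Capability names accepted by NVIDIA_DRIVER_CAPABILITIES (a hedged list;
# newer toolkit releases may recognise additional values).
KNOWN = {"compute", "compat32", "graphics", "utility", "video", "display", "all"}

def parse_driver_capabilities(value):
    """Split a comma-separated capability string and reject unknown entries."""
    caps = [c.strip() for c in value.split(",") if c.strip()]
    unknown = [c for c in caps if c not in KNOWN]
    if unknown:
        raise ValueError("unknown capabilities: " + ", ".join(unknown))
    return caps

print(parse_driver_capabilities("compute,utility"))  # ['compute', 'utility']
```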
Your image should detect your GPU once CUDA’s installed and the environment variables have been set. This gives you more control over the contents of your image but leaves you liable to adjust the instructions as new CUDA versions release.
How Does It Work?
The NVIDIA Container Toolkit is a collection of packages which wrap container runtimes like Docker with an interface to the NVIDIA driver on the host. The libnvidia-container library is responsible for providing an API and CLI that automatically provides your system’s GPUs to containers via the runtime wrapper.

The nvidia-container-toolkit component implements a container runtime prestart hook. This means it’s notified when a new container is about to start. It looks at the GPUs you want to attach and invokes libnvidia-container to handle container creation.
The hook is enabled by nvidia-container-runtime. This wraps your “real” container runtime, such as containerd or runc, to ensure the NVIDIA prestart hook is run. Your existing runtime continues the container start process after the hook has executed. When the container toolkit is installed, you’ll see the NVIDIA runtime selected in your Docker daemon config file.
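To give a feel for what a prestart hook receives, OCI-compatible runtimes pass the container’s state document to hooks as JSON. This toy handler is not NVIDIA’s actual implementation, just a sketch of the mechanism:

```python
import json

def handle_prestart(state_json):
    """Read an OCI state document and return the container id and pid.

    A real GPU hook would use the pid to enter the container's
    namespaces and make the driver libraries and devices available.
    """
    state = json.loads(state_json)
    return state["id"], state["pid"]

# Sample state in the shape an OCI runtime would deliver on stdin.
sample_state = json.dumps({"ociVersion": "1.0.2", "id": "abc123",
                           "pid": 4242, "bundle": "/run/containers/abc123"})
print(handle_prestart(sample_state))  # ('abc123', 4242)
```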
Using an NVIDIA GPU inside a Docker container requires you to add the NVIDIA Container Toolkit to the host. This integrates the NVIDIA drivers with your container runtime.
Running docker run with the --gpus flag makes your hardware visible to the container. This must be set on each container you launch, after the Container Toolkit has been installed.
NVIDIA provides preconfigured CUDA Docker images that you can use as a quick starter for your application. If you need something more specific, refer to the official Dockerfiles to assemble your own that’s still compatible with the Container Toolkit.