Setting Up Docker to Run CUDA on Ubuntu

Ubuntu 20.04 or later is recommended.

Install the CUDA Toolkit#

Ubuntu 20.04+#

Note: the commands below may change as CUDA is updated; see this link for the current commands.

Terminal window
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/cuda-keyring_1.1-1_all.deb
sudo dpkg -i cuda-keyring_1.1-1_all.deb
sudo apt-get update
sudo apt-get -y install cuda-toolkit-12-5
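
The keyring URL above is specific to Ubuntu 20.04. On a newer release, the repository directory in the URL presumably has to match the installed version. A minimal sketch of deriving it, assuming NVIDIA keeps the `ubuntu` + version-without-dot naming seen in this post, the keyring version stays at 1.1-1, and the machine is x86_64 (`cuda_keyring_url` is a hypothetical helper name):

```shell
#!/bin/sh
# Build the cuda-keyring download URL for a given Ubuntu release.
# Assumption: the repo directory is "ubuntu" + VERSION_ID without the dot
# (20.04 -> ubuntu2004), and the keyring package version is 1.1-1.
cuda_keyring_url() {
    version_id="$1"                                   # e.g. "22.04"
    repo="ubuntu$(printf '%s' "$version_id" | tr -d '.')"
    printf 'https://developer.download.nvidia.com/compute/cuda/repos/%s/x86_64/cuda-keyring_1.1-1_all.deb\n' "$repo"
}

# Example: read VERSION_ID from the running system, if available.
# The subshell keeps the os-release variables out of the current shell.
if [ -r /etc/os-release ]; then
    ( . /etc/os-release && cuda_keyring_url "$VERSION_ID" )
fi
```

Check the printed URL against NVIDIA's download page before passing it to wget, since the naming pattern is only inferred from the commands in this post.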

Ubuntu 18.04#

Terminal window
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-keyring_1.0-1_all.deb
sudo dpkg -i cuda-keyring_1.0-1_all.deb
sudo apt-get update
sudo apt-get -y install cuda

Reboot#

After installing the CUDA Toolkit, reboot the system with sudo reboot.
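
After the reboot, it can be worth checking that the toolkit and driver tools are reachable. The following is a rough sketch, assuming the packages placed CUDA under the default /usr/local/cuda prefix; `report_tool` is a hypothetical helper, not part of any NVIDIA tooling:

```shell
#!/bin/sh
# Report whether a command is on PATH, printing its name and status.
report_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: missing"
        return 1
    fi
}

# Assumption: the apt packages install CUDA under /usr/local/cuda,
# whose bin directory is not added to PATH automatically.
if [ -d /usr/local/cuda/bin ]; then
    PATH="/usr/local/cuda/bin:$PATH"
fi

report_tool nvcc || true        # CUDA compiler, from the toolkit
report_tool nvidia-smi || true  # ships with the NVIDIA driver, not the toolkit
```

If nvidia-smi is reported missing, the NVIDIA driver is probably not installed; as far as I know, the cuda-toolkit-12-5 package does not pull in the driver.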

Install Docker#

Configure the Docker repository#

Terminal window
sudo apt-get update
sudo apt-get install ca-certificates curl
sudo install -m 0755 -d /etc/apt/keyrings
sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc
sudo chmod a+r /etc/apt/keyrings/docker.asc
echo \
"deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu \
$(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt-get update

Install the Docker packages#

Terminal window
sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
sudo usermod -aG docker "$USER" # add the current user to the docker group (takes effect at next login)
sudo service docker start

To confirm that Docker installed successfully, run docker --version and check the output. You can also run docker run hello-world, a simple built-in image, to test that the installation works.
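
If docker run fails with a permission error right after the usermod step, the current shell may not have picked up the new group yet. A small sketch for checking that (`has_group` is a hypothetical helper name):

```shell
#!/bin/sh
# Return success if the group name ($2) appears in the
# space-separated group list ($1).
has_group() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

# id -nG prints the group names of the current process.
if has_group "$(id -nG)" docker; then
    echo "current shell is in the docker group"
else
    echo "not in the docker group yet: log out and back in, or run 'newgrp docker'"
fi
```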

Configure CUDA for Docker#

Install the NVIDIA Container Toolkit#

First, configure the repository:

Terminal window
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
&& curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list

Then update the package index with sudo apt-get update, and run sudo apt-get install -y nvidia-container-toolkit to install the NVIDIA Container Toolkit.

Configure Docker#

Use the nvidia-ctk command to configure the container runtime:

Terminal window
sudo nvidia-ctk runtime configure --runtime=docker

Restart the Docker daemon:

Terminal window
sudo systemctl restart docker
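
Before running the GPU tests, one way to confirm that the configure step took effect is to look for the runtime entry in the Docker daemon config. This is a sketch under the assumption that nvidia-ctk wrote to the default /etc/docker/daemon.json; `has_nvidia_runtime` is a hypothetical helper:

```shell
#!/bin/sh
# Return success if the given Docker daemon config mentions the
# nvidia-container-runtime binary.
has_nvidia_runtime() {
    [ -r "$1" ] && grep -q 'nvidia-container-runtime' "$1"
}

# Assumption: the daemon config lives at the default path.
if has_nvidia_runtime /etc/docker/daemon.json; then
    echo "nvidia runtime is registered"
else
    echo "nvidia runtime not found in /etc/docker/daemon.json"
fi
```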

Test#

Terminal window
docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi
docker run --gpus all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark

If the test fails, open /etc/nvidia-container-runtime/config.toml with sudo vim, set no-cgroups = false, and then restart Docker.

Terminal window
sudo systemctl restart docker
docker run --gpus all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark
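
The manual edit above can also be scripted. The following is only a sketch: it assumes the file contains a single no-cgroups line, possibly commented out with a leading #, and `set_no_cgroups_false` is a hypothetical helper name. Back up the file before trying it.

```shell
#!/bin/sh
# Force "no-cgroups = false" in the given config file, whether the
# existing line is commented out or set to another value.
set_no_cgroups_false() {
    file="$1"
    tmp="$(mktemp)" || return 1
    # Rewrite into a temp file, then copy back so the original file's
    # owner and permissions are kept.
    sed -E 's/^[[:space:]]*#?[[:space:]]*no-cgroups[[:space:]]*=.*$/no-cgroups = false/' "$file" > "$tmp" \
        && cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Usage (as root, since the file lives under /etc):
#   set_no_cgroups_false /etc/nvidia-container-runtime/config.toml
```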

Configure a proxy for Docker#

Reference links

  1. Configure a proxy for the daemon
  2. Configure a proxy for the client
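
For the daemon side, a common approach is a systemd drop-in that sets the proxy environment variables for the Docker service. The sketch below only prints such a drop-in; the proxy address is a made-up placeholder, the NO_PROXY default is an assumption, and `render_docker_proxy_dropin` is a hypothetical helper:

```shell
#!/bin/sh
# Print a systemd drop-in that passes proxy settings to the Docker daemon.
# $1 is the proxy URL, $2 an optional comma-separated NO_PROXY list.
render_docker_proxy_dropin() {
    proxy="$1"
    no_proxy_list="${2:-localhost,127.0.0.1}"
    printf '[Service]\n'
    printf 'Environment="HTTP_PROXY=%s"\n' "$proxy"
    printf 'Environment="HTTPS_PROXY=%s"\n' "$proxy"
    printf 'Environment="NO_PROXY=%s"\n' "$no_proxy_list"
}

# Hypothetical proxy address; replace it with your own.
render_docker_proxy_dropin "http://127.0.0.1:7890"
```

The output would typically be saved as /etc/systemd/system/docker.service.d/http-proxy.conf, followed by sudo systemctl daemon-reload and sudo systemctl restart docker. The client side is configured separately, usually through the proxies key in ~/.docker/config.json.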

Use the host's proxy with docker build#

Terminal window
docker build --network=host --build-arg http_proxy=$http_proxy --build-arg https_proxy=$https_proxy ...

https://blog.xiaobaizhang.top/posts/ubuntu-docker-cuda/
Author: 张小白
Published: 2024-06-04
License: CC BY-NC-SA 4.0