Setting Up Docker with CUDA Support on Ubuntu
Ubuntu 20.04 or later is recommended.
Install the CUDA Toolkit
Ubuntu 20.04+
Note: the commands below may change as CUDA is updated; see the linked page for the current commands.
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/cuda-keyring_1.1-1_all.deb
sudo dpkg -i cuda-keyring_1.1-1_all.deb
sudo apt-get update
sudo apt-get -y install cuda-toolkit-12-5
Ubuntu 18.04
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-keyring_1.0-1_all.deb
sudo dpkg -i cuda-keyring_1.0-1_all.deb
sudo apt-get update
sudo apt-get -y install cuda
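The two command sets above differ mainly in the repository directory (ubuntu2004 versus ubuntu1804), which follows the Ubuntu release number. As a minimal sketch, that directory name can be derived from VERSION_ID in /etc/os-release. The helper names cuda_repo_slug and cuda_repo_url are hypothetical, x86_64 is assumed, and NVIDIA may not publish a repository for every release, so check that the resulting URL exists before relying on it.

```shell
#!/bin/sh
# cuda_repo_slug (hypothetical helper): turn a VERSION_ID such as "20.04"
# into the directory name used in the CUDA repository URL ("ubuntu2004").
cuda_repo_slug() {
    printf 'ubuntu%s\n' "$(printf '%s' "$1" | tr -d '.')"
}

# cuda_repo_url (hypothetical helper): base URL of the repository for a
# given VERSION_ID. Assumes an x86_64 machine.
cuda_repo_url() {
    printf 'https://developer.download.nvidia.com/compute/cuda/repos/%s/x86_64/\n' \
        "$(cuda_repo_slug "$1")"
}

# Usage on the machine being set up:
#   . /etc/os-release && cuda_repo_url "$VERSION_ID"
```

The keyring package version (1.0-1 or 1.1-1 above) is not derived here; it still has to be read from the repository listing.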
Reboot
After installing the CUDA Toolkit, reboot the system:
sudo reboot
Install Docker
Configure the Docker repository
sudo apt-get update
sudo apt-get install ca-certificates curl
sudo install -m 0755 -d /etc/apt/keyrings
sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc
sudo chmod a+r /etc/apt/keyrings/docker.asc
echo \
  "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu \
  $(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
  sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt-get update
Install Docker
sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
sudo usermod -aG docker $USER # add the current user to the docker group (takes effect at the next login)
sudo service docker start
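To confirm that Docker installed successfully, run docker --version and check the output. You can also run docker run hello-world, which starts a simple built-in Docker image, to test that the installation works.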
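If docker run hello-world fails with a permission error when run without sudo, a common cause is that the docker group membership has not yet taken effect in the current session. The following is a minimal sketch of a check for that; the helper names in_group_list and docker_group_status are hypothetical.

```shell
#!/bin/sh
# in_group_list (hypothetical helper): succeed if group $2 appears as a
# whole word in the space-separated group list $1.
in_group_list() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# docker_group_status (hypothetical helper): report whether the current
# session already carries the docker group. `id -nG` prints the names of
# the groups the current process belongs to.
docker_group_status() {
    if in_group_list "$(id -nG)" docker; then
        echo "docker group is active in this session"
    else
        echo "docker group is not active yet: log out and back in, or run 'newgrp docker'"
    fi
}
```

Call docker_group_status after installing Docker; if it reports that the group is not active, start a new login session before retrying docker run hello-world.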
Configure CUDA for Docker
Install the NVIDIA Container Toolkit
First, configure the repository:
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
  && curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
    sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
    sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
Then update the package lists:
sudo apt-get update
and install the NVIDIA Container Toolkit:
sudo apt-get install -y nvidia-container-toolkit
Configure Docker
Use the nvidia-ctk command to configure the container runtime:
sudo nvidia-ctk runtime configure --runtime=docker
Restart the Docker daemon:
sudo systemctl restart docker
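The nvidia-ctk runtime configure step works by registering an nvidia runtime in Docker's daemon configuration, normally /etc/docker/daemon.json. As a rough sketch, the check below confirms that the file mentions the nvidia-container-runtime binary; the helper name has_nvidia_runtime is hypothetical, and the default path is an assumption that may not hold for rootless or otherwise customized Docker setups.

```shell
#!/bin/sh
# has_nvidia_runtime (hypothetical helper): succeed if the given Docker
# daemon config file mentions the nvidia-container-runtime binary.
# Defaults to /etc/docker/daemon.json when no file is given.
has_nvidia_runtime() {
    grep -q 'nvidia-container-runtime' "${1:-/etc/docker/daemon.json}"
}

# Usage:
#   has_nvidia_runtime && echo "nvidia runtime is registered"
# After the restart, the Runtimes line of `docker info` should also
# list nvidia.
```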
Test
docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi
docker run --gpus all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark
If the test fails, open the runtime configuration with sudo vim /etc/nvidia-container-runtime/config.toml, set no-cgroups = false, and then restart Docker and run the test again:
sudo systemctl restart docker
docker run --gpus all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark
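The manual edit of config.toml can also be scripted. The sketch below rewrites the no-cgroups line, whether or not it is commented out, and keeps a .bak copy of the original file. The helper name set_no_cgroups_false is hypothetical, and the sketch assumes the file already contains a no-cgroups line; it does not add one if the line is missing.

```shell
#!/bin/sh
# set_no_cgroups_false (hypothetical helper): in the file given as $1,
# replace any "no-cgroups = ..." line (commented out or not) with
# "no-cgroups = false". The original file is saved as "$1.bak".
set_no_cgroups_false() {
    sed -i.bak -E \
        's/^[[:space:]]*#?[[:space:]]*no-cgroups[[:space:]]*=.*/no-cgroups = false/' \
        "$1"
}

# Usage (needs root for the real file, e.g. from a root shell):
#   set_no_cgroups_false /etc/nvidia-container-runtime/config.toml
```

Trying the function on a copy of the file first makes it easy to compare the result against the .bak copy before touching the real configuration.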
Configure a proxy for Docker
To make docker build use the host machine's proxy:
docker build --network=host --build-arg http_proxy=$http_proxy --build-arg https_proxy=$https_proxy ...
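When one of the proxy variables is unset, the command above passes an empty build argument. As a small sketch, the flags can instead be generated only for the variables that are actually set; the helper name proxy_build_args is hypothetical.

```shell
#!/bin/sh
# proxy_build_args (hypothetical helper): print one "--build-arg NAME=VALUE"
# line for each of http_proxy and https_proxy that is set and non-empty.
proxy_build_args() {
    for name in http_proxy https_proxy; do
        # Read the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -n "$value" ]; then
            printf '%s %s=%s\n' --build-arg "$name" "$value"
        fi
    done
}

# Usage (the substitution is left unquoted on purpose so that the shell
# splits the output into separate arguments):
#   docker build --network=host $(proxy_build_args) ...
```

Because the usage relies on word splitting, it assumes the proxy values contain no spaces or glob characters, which holds for ordinary proxy URLs.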
https://blog.xiaobaizhang.top/posts/ubuntu-docker-cuda/