WSL中ML的GPU加速入门

安装CUDA
安装NCCL
tf2验证
pytorch2验证

本文主要记录下在wsl中使用GPU加速机器学习的过程，参照微软的官方文档进行的。

官方文档的发布日期：2023/03/21

在Windows下安装NVIDIA GPU的最新驱动这里就不做介绍，根据提示进行安装即可，我这里主要记录在wsl中安装NVIDIA工具包。

安装CUDA

根据https://developer.nvidia.com/cuda-downloads提示，选择适合选项。我这里以OracleLinux_9_1为例：

$ dnf config-manager --add-repo https://developer.download.nvidia.com/compute/cuda/repos/rhel9/x86_64/cuda-rhel9.repo
$ dnf clean all
$ dnf -y module install nvidia-driver:latest-dkms --skip-broken
$ dnf -y install cuda --skip-broken

安装NCCL

操作如下：

$ dnf config-manager --add-repo http://developer.download.nvidia.com/compute/cuda/repos/rhel9/x86_64/cuda-rhel9.repo
$ yum install -y libnccl libnccl-devel libnccl-static

tf2验证

这里使用python虚拟环境：

$ mkvirtualenv directml
(directml) $ pip install tensorflow-cpu tensorflow-directml-plugin ipython
(directml) $ ipython
Python 3.9.16 (main, May 29 2023, 00:00:00)
Type 'copyright', 'credits' or 'license' for more information
IPython 8.14.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: import tensorflow as tf
2023-06-27 19:57:28.264049: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-06-27 19:57:30.061421: I tensorflow/c/logging.cc:34] Successfully opened dynamic library libdirectml.d6f03b303ac3c4f2eeb8ca631688c9757b361310.so
2023-06-27 19:57:30.061555: I tensorflow/c/logging.cc:34] Successfully opened dynamic library libdxcore.so
2023-06-27 19:57:30.069764: I tensorflow/c/logging.cc:34] Successfully opened dynamic library libd3d12.so
2023-06-27 19:57:31.271695: I tensorflow/c/logging.cc:34] DirectML device enumeration: found 1 compatible adapters.

In [2]: tf.test.is_gpu_available()
WARNING:tensorflow:From <ipython-input-2-17bb7203622b>:1: is_gpu_available (from tensorflow.python.framework.test_util) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.config.list_physical_devices('GPU')` instead.
2023-06-27 19:57:48.462729: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-06-27 19:57:48.463267: I tensorflow/c/logging.cc:34] DirectML: creating device on adapter 0 (NVIDIA GeForce GTX 960M)
2023-06-27 19:57:49.474531: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:306] Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support.
2023-06-27 19:57:49.474639: W tensorflow/core/common_runtime/pluggable_device/pluggable_device_bfc_allocator.cc:28] Overriding allow_growth setting because force_memory_growth was requested by the device.
2023-06-27 19:57:49.474714: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:272] Created TensorFlow device (/device:GPU:0 with 10275 MB memory) -> physical PluggableDevice (device: 0, name: DML, pci bus id: <undefined>)
Out[2]: True

pytorch2验证

$ pip install opencv-python wget torchvision torch-directml
$ ipython  
Python 3.9.16 (main, May 29 2023, 00:00:00)
Type 'copyright', 'credits' or 'license' for more information
IPython 8.14.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: import torch

In [2]: torch.__version__
Out[2]: '2.0.1+cu117'

In [3]: torch.cuda.is_available()
Out[3]: True

In [4]: torch.cuda.device_count()
Out[4]: 1

In [5]: torch.cuda.device(0)
Out[5]: <torch.cuda.device at 0x7f4fd74a3490>

In [6]: torch.cuda.get_device_name()
Out[6]: 'NVIDIA GeForce GTX 960M'

声明：本作品采用署名-非商业性使用-相同方式共享 4.0 国际 (CC BY-NC-SA 4.0)进行许可，使用时请注明出处。
Author: mengbin
blog: mengbin
Github: mengbin92
cnblogs: 恋水无意