Running Python Code on the GPU

Source: cnblogs  Author: 南瓜慢说  Date: 2023/2/6

Introduction

I spent some time recently setting up Ubuntu, mainly because I wanted to finally put the NVIDIA card in my old laptop to work: run code on the GPU and enjoy what all those cores can do.

Fortunately, even this aging machine has a CUDA-capable card:

    $ sudo lshw -C display
      *-display
           description: 3D controller
           product: GK208M [GeForce GT 740M]
           vendor: NVIDIA Corporation
           physical id: 0
           bus info: pci@0000:01:00.0
           version: a1
           width: 64 bits
           clock: 33MHz
           capabilities: pm msi pciexpress bus_master cap_list rom
           configuration: driver=nouveau latency=0
           resources: irq:35 memory:f0000000-f0ffffff memory:c0000000-cfffffff memory:d0000000-d1ffffff ioport:6000(size=128)

Installing the Tools

First, install the CUDA development toolkit:

    $ sudo apt install nvidia-cuda-toolkit

Check the version information:

    $ nvcc --version
    nvcc: NVIDIA (R) Cuda compiler driver
    Copyright (c) 2005-2021 NVIDIA Corporation
    Built on Thu_Nov_18_09:45:30_PST_2021
    Cuda compilation tools, release 11.5, V11.5.119
    Build cuda_11.5.r11.5/compiler.30672275_0

Install the required Python packages through Conda:

    $ conda install numba && conda install cudatoolkit

Installing with pip works just as well.
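As a quick sanity check (a minimal sketch, assuming the environment above; not part of the original post), Numba can report whether it sees a usable CUDA device:

    from numba import cuda

    # True if a CUDA driver and at least one supported GPU are found
    print(cuda.is_available())

    # Prints a summary of the detected devices (name, compute capability, ...)
    cuda.detect()

On this particular machine, the check only succeeds after the driver problem described in the next section is fixed.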

Testing and Driver Installation

A quick test immediately threw an error:

    $ /home/larry/anaconda3/bin/python /home/larry/code/pkslow-samples/python/src/main/python/cuda/test1.py
    Traceback (most recent call last):
      File "/home/larry/anaconda3/lib/python3.9/site-packages/numba/cuda/cudadrv/driver.py", line 246, in ensure_initialized
        self.cuInit(0)
      File "/home/larry/anaconda3/lib/python3.9/site-packages/numba/cuda/cudadrv/driver.py", line 319, in safe_cuda_api_call
        self._check_ctypes_error(fname, retcode)
      File "/home/larry/anaconda3/lib/python3.9/site-packages/numba/cuda/cudadrv/driver.py", line 387, in _check_ctypes_error
        raise CudaAPIError(retcode, msg)
    numba.cuda.cudadrv.driver.CudaAPIError: [100] Call to cuInit results in CUDA_ERROR_NO_DEVICE

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "/home/larry/code/pkslow-samples/python/src/main/python/cuda/test1.py", line 15, in <module>
        gpu_print[1, 2]()
      File "/home/larry/anaconda3/lib/python3.9/site-packages/numba/cuda/compiler.py", line 862, in __getitem__
        return self.configure(*args)
      File "/home/larry/anaconda3/lib/python3.9/site-packages/numba/cuda/compiler.py", line 857, in configure
        return _KernelConfiguration(self, griddim, blockdim, stream, sharedmem)
      File "/home/larry/anaconda3/lib/python3.9/site-packages/numba/cuda/compiler.py", line 718, in __init__
        ctx = get_context()
      File "/home/larry/anaconda3/lib/python3.9/site-packages/numba/cuda/cudadrv/devices.py", line 220, in get_context
        return _runtime.get_or_create_context(devnum)
      File "/home/larry/anaconda3/lib/python3.9/site-packages/numba/cuda/cudadrv/devices.py", line 138, in get_or_create_context
        return self._get_or_create_context_uncached(devnum)
      File "/home/larry/anaconda3/lib/python3.9/site-packages/numba/cuda/cudadrv/devices.py", line 153, in _get_or_create_context_uncached
        with driver.get_active_context() as ac:
      File "/home/larry/anaconda3/lib/python3.9/site-packages/numba/cuda/cudadrv/driver.py", line 487, in __enter__
        driver.cuCtxGetCurrent(byref(hctx))
      File "/home/larry/anaconda3/lib/python3.9/site-packages/numba/cuda/cudadrv/driver.py", line 284, in __getattr__
        self.ensure_initialized()
      File "/home/larry/anaconda3/lib/python3.9/site-packages/numba/cuda/cudadrv/driver.py", line 250, in ensure_initialized
        raise CudaSupportError(f"Error at driver init: {description}")
    numba.cuda.cudadrv.error.CudaSupportError: Error at driver init: Call to cuInit results in CUDA_ERROR_NO_DEVICE (100)

A quick search showed this is a driver problem. I first tried installing the NVIDIA driver with Ubuntu's built-in driver tool, but it still failed:

    $ nvidia-smi
    NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.

In the end, installing the driver from the command line solved the problem:

    $ sudo apt install nvidia-driver-470
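If you are not sure which driver package to pick, Ubuntu ships a helper that lists the detected GPU and the recommended driver (a quick sketch, not from the original post; the output depends on the machine):

    # List detected GPUs and the recommended proprietary driver package
    $ ubuntu-drivers devices

    # Or let Ubuntu install the recommended driver automatically
    $ sudo ubuntu-drivers autoinstall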

After that, nvidia-smi reports the card correctly:

    $ nvidia-smi
    Wed Dec 7 22:13:49 2022
    +-----------------------------------------------------------------------------+
    | NVIDIA-SMI 470.161.03   Driver Version: 470.161.03   CUDA Version: 11.4     |
    |-------------------------------+----------------------+----------------------+
    | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
    | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
    |                               |                      |               MIG M. |
    |===============================+======================+======================|
    |   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0 N/A |                  N/A |
    | N/A   51C    P8    N/A /  N/A |      4MiB /  2004MiB |     N/A      Default |
    |                               |                      |                  N/A |
    +-------------------------------+----------------------+----------------------+

    +-----------------------------------------------------------------------------+
    | Processes:                                                                   |
    |  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
    |        ID   ID                                                   Usage      |
    |=============================================================================|
    |  No running processes found                                                 |
    +-----------------------------------------------------------------------------+

And the test code now runs.

Testing Python Code

Printing Thread IDs

Prepare the following code:

    from numba import cuda
    import os


    def cpu_print():
        print('cpu print')


    @cuda.jit
    def gpu_print():
        dataIndex = cuda.threadIdx.x + cuda.blockIdx.x * cuda.blockDim.x
        print('gpu print ', cuda.threadIdx.x, cuda.blockIdx.x, cuda.blockDim.x, dataIndex)


    if __name__ == '__main__':
        gpu_print[4, 4]()
        cuda.synchronize()
        cpu_print()

This program has two functions, one executed on the CPU and one on the GPU, both of which just print a line. The key is the @cuda.jit decorator, which makes the function run on the GPU; gpu_print[4, 4]() launches it with 4 blocks of 4 threads each. The output:

    $ /home/larry/anaconda3/bin/python /home/larry/code/pkslow-samples/python/src/main/python/cuda/print_test.py
    gpu print 0 3 4 12
    gpu print 1 3 4 13
    gpu print 2 3 4 14
    gpu print 3 3 4 15
    gpu print 0 2 4 8
    gpu print 1 2 4 9
    gpu print 2 2 4 10
    gpu print 3 2 4 11
    gpu print 0 1 4 4
    gpu print 1 1 4 5
    gpu print 2 1 4 6
    gpu print 3 1 4 7
    gpu print 0 0 4 0
    gpu print 1 0 4 1
    gpu print 2 0 4 2
    gpu print 3 0 4 3
    cpu print

The GPU printed 16 times in total (4 blocks × 4 threads), each line coming from a different thread. The order can change from run to run, because kernel launches are asynchronous and there is no guarantee which thread executes first. That is also why cuda.synchronize() is called: it makes sure the GPU has finished before the CPU code continues. A more common pattern than printing is to have each thread write to its own slot of an output array, as in the sketch below.
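Here is a minimal sketch of that pattern (not from the original post; the kernel name fill_with_index and the array size are made up for illustration). Each thread computes its global index and stores it in the array:

    from numba import cuda
    import numpy as np


    @cuda.jit
    def fill_with_index(out):
        # cuda.grid(1) is shorthand for threadIdx.x + blockIdx.x * blockDim.x
        i = cuda.grid(1)
        if i < out.size:      # guard in case the grid is larger than the array
            out[i] = i


    if __name__ == '__main__':
        n = 16
        out = np.zeros(n, dtype=np.int32)
        # 4 blocks of 4 threads, matching the print example above
        fill_with_index[4, 4](out)
        cuda.synchronize()
        print(out)            # prints 0..15 in order

Because Numba copies the host array back after the kernel finishes, the result is deterministic even though the threads run in an arbitrary order.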

Timing Comparison

The following benchmark gives a feel for the power of GPU parallelism:

    from numba import jit, cuda
    import numpy as np
    # to measure exec time
    from timeit import default_timer as timer


    # normal function to run on cpu
    def func(a):
        for i in range(10000000):
            a[i] += 1


    # function optimized to run on gpu
    @jit(target_backend='cuda')
    def func2(a):
        for i in range(10000000):
            a[i] += 1


    if __name__ == "__main__":
        n = 10000000
        a = np.ones(n, dtype=np.float64)

        start = timer()
        func(a)
        print("without GPU:", timer() - start)

        start = timer()
        func2(a)
        print("with GPU:", timer() - start)

The results:

    $ /home/larry/anaconda3/bin/python /home/larry/code/pkslow-samples/python/src/main/python/cuda/time_test.py
    without GPU: 3.7136273959999926
    with GPU: 0.4040513340000871

The CPU version takes about 3.7 seconds, while the GPU version takes only about 0.4 seconds, a considerable speed-up. Of course, this does not mean the GPU is always faster than the CPU; it depends on the type of task.
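For readers who want the GPU work to be explicit, the same increment can also be written as a numba.cuda kernel with manual data transfers. This is only a sketch, not part of the original post; the kernel name add_one_kernel and the block size of 256 are arbitrary choices, and the timing will differ from the numbers above:

    from numba import cuda
    import numpy as np
    from timeit import default_timer as timer


    @cuda.jit
    def add_one_kernel(a):
        i = cuda.grid(1)
        if i < a.size:
            a[i] += 1


    if __name__ == "__main__":
        n = 10000000
        a = np.ones(n, dtype=np.float64)

        d_a = cuda.to_device(a)            # copy the array to GPU memory once
        threads_per_block = 256
        blocks = (n + threads_per_block - 1) // threads_per_block

        start = timer()
        add_one_kernel[blocks, threads_per_block](d_a)
        cuda.synchronize()                 # wait for the kernel to finish
        print("explicit CUDA kernel:", timer() - start)

        a = d_a.copy_to_host()             # copy the result back when needed

Keeping the data on the device with cuda.to_device() avoids re-transferring the array on every call, which is usually where naive GPU code loses its advantage.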

Code

The full code is on GitHub: https://github.com/LarryDpk/pkslow-samples

Original post: https://www.cnblogs.com/larrydpk/p/17093627.html
