In this tutorial, you will learn how to use OpenCV's Deep Neural Network (DNN) module with NVIDIA GPUs, CUDA, and cuDNN for 211-1549% faster inference.
In August 2017, I published my first tutorial on using OpenCV's "Deep Neural Network" (DNN) module for image classification.
PyImageSearch readers loved the convenience and ease of use of OpenCV's dnn module so much that I went on to publish additional tutorials on the dnn module, including:
- Object detection with deep learning and OpenCV
- Real-time object detection with deep learning and OpenCV
- YOLO object detection with OpenCV
- Mask R-CNN with OpenCV
Each of these guides used OpenCV's dnn module to (1) load a pre-trained network from disk, (2) make predictions on an input image, and then (3) display the results, allowing you to build your own custom computer vision/deep learning pipeline for your specific project.
However, the main problem with OpenCV's dnn module was a lack of NVIDIA GPU/CUDA support: with these models you could not easily use a GPU to improve the frames per second (FPS) processing rate of your pipeline.
This wasn't a big issue for the Single Shot Detector (SSD) tutorials, which can easily run at 25-30+ FPS on a CPU, but it was a huge problem for YOLO and Mask R-CNN, which struggle to get above 1-3 FPS on a CPU.
That all changed at Google Summer of Code (GSoC) 2019.
Led by dlib's Davis King, and implemented by Yashas Samaga, OpenCV 4.2 now supports NVIDIA GPUs for inference using OpenCV's dnn module, improving inference speed by up to 1549%!
In today's tutorial, I'll show you how to compile and install OpenCV to use your NVIDIA GPU for deep neural network inference.
Then, in next week's tutorial, I'll provide Single Shot Detector, YOLO, and Mask R-CNN code that uses your GPU with OpenCV. We'll then benchmark the results and compare them to CPU-only inference so you know which models benefit most from a GPU.
To learn how to compile and install OpenCV's "dnn" module with NVIDIA GPU, CUDA, and cuDNN support, just keep reading!
How to use OpenCV's "dnn" module with NVIDIA GPUs, CUDA, and cuDNN
In the remainder of this tutorial, I'll show you how to build OpenCV from source code so that you can leverage NVIDIA's GPU-accelerated inference for pre-trained deep neural networks.
Assumptions when building OpenCV for NVIDIA GPU support
To compile and install the OpenCV "Deep Neural Network" module with NVIDIA GPU support, I will make the following assumptions:
- You have an NVIDIA GPU. This should be an obvious assumption. If you do not have an NVIDIA GPU, you cannot compile OpenCV's "dnn" module with NVIDIA GPU support.
- You are using Ubuntu 18.04 (or another Debian-based distribution). When it comes to deep learning, I highly recommend Unix-based machines over Windows systems (in fact, I don't support Windows on the PyImageSearch blog). If you intend to use a GPU for deep learning, go with Ubuntu over macOS or Windows: it is far easier to configure.
- You know how to use a command line. We'll be making use of the command line in this tutorial. If you are unfamiliar with the command line, I recommend reading this intro to the command line first and then spending a few hours (or even days) practicing. Again, this tutorial is not for those brand new to the command line.
- You are capable of reading terminal output and diagnosing issues. Compiling OpenCV from source can be challenging if you've never done it before: there are a number of things that can trip you up, including missing packages, incorrect library paths, etc. Even with my detailed guides, you will likely make a mistake along the way. Don't be discouraged! Take the time to understand the commands you run, what they do, and most importantly, read the output of the commands! Don't blindly copy and paste; you'll only run into errors.
With all that said, let's start configuring the OpenCV "dnn" module for NVIDIA GPU inference.
Step #1: Install NVIDIA CUDA drivers, CUDA Toolkit, and cuDNN
This tutorial assumes that you already have:
- An NVIDIA GPU
- CUDA drivers for this specific GPU are installed
- CUDA Toolkit and cuDNN configured and installed
If you have an NVIDIA GPU in your system but have not yet installed the CUDA drivers, CUDA Toolkit, and cuDNN, you will need to configure your machine first. I will not be covering CUDA configuration and installation in this guide.
To learn how to install the NVIDIA CUDA drivers, CUDA Toolkit, and cuDNN, I recommend you read my Ubuntu 18.04 and TensorFlow/Keras GPU install guide. Once you have the proper NVIDIA drivers and toolkits installed, you can come back to this tutorial.
Step #2: Install OpenCV and "dnn" GPU dependencies
The first step in configuring the OpenCV "dnn" module for NVIDIA GPU inference is to install the appropriate dependencies:
$ sudo apt-get update
$ sudo apt-get upgrade
$ sudo apt-get install build-essential cmake unzip pkg-config
$ sudo apt-get install libjpeg-dev libpng-dev libtiff-dev
$ sudo apt-get install libavcodec-dev libavformat-dev libswscale-dev
$ sudo apt-get install libv4l-dev libxvidcore-dev libx264-dev
$ sudo apt-get install libgtk-3-dev
$ sudo apt-get install libatlas-base-dev gfortran
$ sudo apt-get install python3-dev
Most of these packages should already be installed if you followed my Ubuntu 18.04 Deep Learning configuration guide, but I would recommend running the above commands just to be safe.
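Before moving on, it can help to confirm that the command-line tools from Steps #1 and #2 are actually visible on your PATH. The following is a minimal sketch rather than part of the original setup: the function name `find_missing_tools` and the tool list are my own illustrative choices, and `nvcc` in particular may be installed without being on your PATH, depending on how the CUDA Toolkit was configured.

```python
import shutil

# Command-line tools this tutorial relies on (an illustrative, non-exhaustive list).
REQUIRED_TOOLS = ["cmake", "unzip", "pkg-config", "wget", "nvidia-smi", "nvcc"]


def find_missing_tools(tools=REQUIRED_TOOLS):
    """Return the subset of `tools` that cannot be found on the current PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Missing from PATH: " + ", ".join(missing))
    else:
        print("All required tools were found.")
```

An empty result only tells you the executables exist; it does not verify library packages such as libjpeg-dev, so still read the apt-get output.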
Step #3: Download the OpenCV source code
There is no "pip-installable" version of OpenCV that comes with NVIDIA GPU support; instead, we need to compile OpenCV from scratch with the proper NVIDIA GPU configurations set.
The first step to do this is to download the OpenCV v4.2 source code:
$ cd ~
$ wget -O opencv.zip https://github.com/opencv/opencv/archive/4.2.0.zip
$ wget -O opencv_contrib.zip https://github.com/opencv/opencv_contrib/archive/4.2.0.zip
$ unzip opencv.zip
$ unzip opencv_contrib.zip
$ mv opencv-4.2.0 opencv
$ mv opencv_contrib-4.2.0 opencv_contrib
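Both archives above are pinned to the same release, and the opencv and opencv_contrib versions should match when you build them together. As a small sketch of that idea (the helper name `opencv_archive_urls` is hypothetical, not something OpenCV provides), the two download URLs can be derived from a single version string so they cannot drift apart:

```python
def opencv_archive_urls(version="4.2.0"):
    """Build the GitHub archive URLs for opencv and opencv_contrib at `version`."""
    base = "https://github.com/opencv/{repo}/archive/{version}.zip"
    return {
        "opencv.zip": base.format(repo="opencv", version=version),
        "opencv_contrib.zip": base.format(repo="opencv_contrib", version=version),
    }


if __name__ == "__main__":
    for filename, url in opencv_archive_urls().items():
        # Mirrors the wget commands used in this step.
        print("wget -O %s %s" % (filename, url))
```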
Now we can proceed to configure our build.
Step #4: Configure Python virtual environment
If you followed my Ubuntu 18.04, TensorFlow, and Keras Deep Learning configuration guide, then you should already have virtualenv and virtualenvwrapper installed:
- If your machine is already configured, skip to the mkvirtualenv commands in this section.
- Otherwise, follow each of these steps to configure your machine.
Python virtual environments are a best practice when it comes to Python development. They allow you to test different versions of Python libraries in separate, isolated development and production environments. I use them daily and so should you.
If you haven't yet installed pip, Python's package manager, you can do so using the following commands:
$ wget https://bootstrap.pypa.io/get-pip.py
$ sudo python3 get-pip.py
Once pip is installed, you can install both virtualenv and virtualenvwrapper:
$ sudo pip install virtualenv virtualenvwrapper
$ sudo rm -rf ~/get-pip.py ~/.cache/pip
You then need to open up your ~/.bashrc file and update it to automatically load virtualenv/virtualenvwrapper whenever you open a terminal.
I prefer to use the nano text editor, but you can use whichever editor you are most comfortable with:
$ nano ~/.bashrc
Once you have the ~/.bashrc file open, scroll to the bottom of the file and insert the following:
# virtualenv and virtualenvwrapper
export WORKON_HOME=$HOME/.virtualenvs
export VIRTUALENVWRAPPER_PYTHON=/usr/bin/python3
source /usr/local/bin/virtualenvwrapper.sh
From there, save and exit the editor (Ctrl + X, Y, Enter).
You can then reload your ~/.bashrc file in your terminal session:
$ source ~/.bashrc
You only need to run the above command once. Since you updated your ~/.bashrc file, the virtualenv/virtualenvwrapper environment variables will be set automatically whenever you open a new terminal window.
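If you would rather script the ~/.bashrc update than edit the file by hand, the sketch below appends only the lines that are not already present, so running it twice does not duplicate them. The helper name `ensure_lines` is hypothetical, and the paths in `VIRTUALENV_LINES` simply repeat the ones used in this step; adjust them if virtualenvwrapper.sh lives elsewhere on your machine.

```python
from pathlib import Path

# The same lines this step asks you to paste into ~/.bashrc.
VIRTUALENV_LINES = [
    "# virtualenv and virtualenvwrapper",
    "export WORKON_HOME=$HOME/.virtualenvs",
    "export VIRTUALENVWRAPPER_PYTHON=/usr/bin/python3",
    "source /usr/local/bin/virtualenvwrapper.sh",
]


def ensure_lines(path, lines=VIRTUALENV_LINES):
    """Append any of `lines` missing from the file at `path`; return the lines added."""
    path = Path(path)
    existing = path.read_text().splitlines() if path.exists() else []
    missing = [line for line in lines if line not in existing]
    if missing:
        with path.open("a") as handle:
            # Start on a fresh line in case the file does not end with a newline.
            handle.write("\n" + "\n".join(missing) + "\n")
    return missing


# Example usage (commented out so nothing is modified by accident):
#   ensure_lines(Path.home() / ".bashrc")
```

You would still need to run source ~/.bashrc (or open a new terminal) for the changes to take effect.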
The final step is to create your Python virtual environment:
$ mkvirtualenv opencv_cuda -p python3
The mkvirtualenv command creates a new Python virtual environment named opencv_cuda using Python 3.
You should then install NumPy into the opencv_cuda environment:
$ pip install numpy
If you ever close your terminal or deactivate your Python virtual environment, you can access it again via the workon command:
$ workon opencv_cuda
If you are new to Python virtual environments, I suggest you take a second and read up on how they work. They are a best practice in the Python world.
If you choose not to use them, that's perfectly fine, but keep in mind that your choice doesn't absolve you from learning proper Python best practices. Take the time now to invest in your knowledge.
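A quick way to confirm that the interpreter you are running belongs to a virtual environment is to compare its prefixes. This is a minimal sketch (the function name `in_virtual_environment` is my own), assuming a standard CPython interpreter:

```python
import sys


def in_virtual_environment():
    """Return True if the running interpreter belongs to a virtual environment."""
    # Older virtualenv releases set sys.real_prefix; venv and newer virtualenv
    # releases make sys.prefix differ from sys.base_prefix instead.
    if hasattr(sys, "real_prefix"):
        return True
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


if __name__ == "__main__":
    print("virtual environment active:", in_virtual_environment())
    print("interpreter prefix:", sys.prefix)
```

After running workon opencv_cuda, the printed prefix would be expected to point inside ~/.virtualenvs/opencv_cuda.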
Step #5: Determine your CUDA architecture version
When compiling OpenCV's "dnn" module with NVIDIA GPU support, we'll need to determine our NVIDIA GPU architecture version:
- This version number is a requirement when we set the CUDA_ARCH_BIN variable in our cmake command in the next section.
- The NVIDIA GPU architecture version depends on which GPU you are using, so make sure you know your GPU model ahead of time.
- Failing to correctly set your CUDA_ARCH_BIN variable can result in OpenCV still compiling but failing to use your GPU for inference (making it troublesome to diagnose and debug).
One of the easiest ways to determine your NVIDIA GPU model, and from it the architecture version, is to use the nvidia-smi command:
$ nvidia-smi
Mon Jan 27 14:11:32 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 410.104      Driver Version: 410.104      CUDA Version: 10.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla V100-SXM2...  Off  | 00000000:00:04.0 Off |                    0 |
| N/A   35C    P0    38W / 300W |      0MiB / 16130MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
If you inspect the output, you'll see that I am using an NVIDIA Tesla V100 GPU. Make sure you run the nvidia-smi command yourself to verify your GPU model before continuing.
Now that I have my NVIDIA GPU model, I can determine the architecture version.
On this page, you can find your NVIDIA GPU architecture version for your specific GPU:
https://developer.nvidia.com/cuda-gpus
Scroll down to see the list of CUDA-enabled Tesla, Quadro, NVS, GeForce/Titan and Jetson products:
Since I'm using a V100, I'll click on the "CUDA-Enabled Tesla Products" section:
If I scroll down, I see my V100 GPU:
As you can see, my NVIDIA GPU architecture version is 7.0. You should perform the same process for your own GPU model.
Once you have identified your NVIDIA GPU architecture version, write it down and proceed to the next section.
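NVIDIA's page lists the architecture version in dotted form (7.0 for the V100), while cmake's configuration summary reports the same value without the dot (70). The sketch below, using a hypothetical helper named `arch_bin_to_cmake_arch`, converts between the two so you can compare what you wrote down against what cmake later prints. It only reformats the value; it does not look up your GPU for you.

```python
def arch_bin_to_cmake_arch(arch_bin):
    """Convert a dotted architecture version such as "7.0" to the compact form "70"."""
    parts = arch_bin.strip().split(".")
    if len(parts) != 2 or not all(part.isdigit() for part in parts):
        raise ValueError("expected a value like '7.0', got %r" % arch_bin)
    major, minor = parts
    return major + minor


if __name__ == "__main__":
    # 7.0 is the value found above for the Tesla V100.
    print(arch_bin_to_cmake_arch("7.0"))
```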
Step #6: Configure OpenCV with NVIDIA GPU support
At this point, we are ready to configure our build using the cmake command.
The cmake command scans for dependencies, configures the build, and generates the files necessary for make to actually compile OpenCV.
To set up the build, first make sure you are in the Python virtual environment you will be using to build OpenCV with NVIDIA GPU support:
$ workon opencv_cuda
Then change to the directory where you downloaded the OpenCV source code and create a build directory:
$ cd ~/opencv
$ mkdir build
$ cd build
You can then run the following cmake command, making sure you set the CUDA_ARCH_BIN variable based on the NVIDIA GPU architecture version you found in the previous section:
$ cmake -D CMAKE_BUILD_TYPE=RELEASE \
	-D CMAKE_INSTALL_PREFIX=/usr/local \
	-D INSTALL_PYTHON_EXAMPLES=ON \
	-D INSTALL_C_EXAMPLES=OFF \
	-D OPENCV_ENABLE_NONFREE=ON \
	-D WITH_CUDA=ON \
	-D WITH_CUDNN=ON \
	-D OPENCV_DNN_CUDA=ON \
	-D ENABLE_FAST_MATH=1 \
	-D CUDA_FAST_MATH=1 \
	-D CUDA_ARCH_BIN=7.0 \
	-D WITH_CUBLAS=1 \
	-D OPENCV_EXTRA_MODULES_PATH=~/opencv_contrib/modules \
	-D HAVE_opencv_python3=ON \
	-D PYTHON_EXECUTABLE=~/.virtualenvs/opencv_cuda/bin/python \
	-D BUILD_EXAMPLES=ON ..
Here you can see that we are compiling OpenCV with both CUDA and cuDNN support enabled (WITH_CUDA and WITH_CUDNN, respectively).
We also instruct OpenCV to build the "dnn" module with CUDA support (OPENCV_DNN_CUDA).
We also set ENABLE_FAST_MATH, CUDA_FAST_MATH, and WITH_CUBLAS for optimization purposes.
The most important, and most error-prone, configuration is your CUDA_ARCH_BIN. Make sure you set it correctly!
The CUDA_ARCH_BIN variable must map to the NVIDIA GPU architecture version you found in the previous section.
If you set this value incorrectly, OpenCV may still compile, but you'll receive the following error message when you try to perform inference using the dnn module:
File "ssd_object_detection.py", line 74, in <module>
    detections = net.forward()
cv2.error: OpenCV(4.2.0) /home/a_rosebrock/opencv/modules/dnn/src/cuda/execution.hpp:52: error: (-217:Gpu API call) invalid device function in function 'make_policy'
If you encounter this error, you know your CUDA_ARCH_BIN was not set properly.
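Because a single mistyped flag can quietly change the build, some readers may prefer to assemble the cmake invocation from one place where CUDA_ARCH_BIN is an explicit parameter. The following is only a sketch of that idea: the function name `build_cmake_args` is hypothetical, the option names and values simply repeat the ones used in this step, and the paths passed in at the bottom assume the directory layout from earlier in this tutorial.

```python
import os
import shlex


def build_cmake_args(arch_bin, venv_python, contrib_modules):
    """Return the cmake argument list from this step for a given GPU architecture."""
    options = [
        ("CMAKE_BUILD_TYPE", "RELEASE"),
        ("CMAKE_INSTALL_PREFIX", "/usr/local"),
        ("INSTALL_PYTHON_EXAMPLES", "ON"),
        ("INSTALL_C_EXAMPLES", "OFF"),
        ("OPENCV_ENABLE_NONFREE", "ON"),
        ("WITH_CUDA", "ON"),
        ("WITH_CUDNN", "ON"),
        ("OPENCV_DNN_CUDA", "ON"),
        ("ENABLE_FAST_MATH", "1"),
        ("CUDA_FAST_MATH", "1"),
        ("CUDA_ARCH_BIN", arch_bin),  # must match your GPU (7.0 for a V100)
        ("WITH_CUBLAS", "1"),
        ("OPENCV_EXTRA_MODULES_PATH", contrib_modules),
        ("HAVE_opencv_python3", "ON"),
        ("PYTHON_EXECUTABLE", venv_python),
        ("BUILD_EXAMPLES", "ON"),
    ]
    args = ["cmake"]
    for name, value in options:
        args += ["-D", "%s=%s" % (name, value)]
    args.append("..")  # the OpenCV source tree is the parent of the build directory
    return args


if __name__ == "__main__":
    args = build_cmake_args(
        arch_bin="7.0",
        venv_python=os.path.expanduser("~/.virtualenvs/opencv_cuda/bin/python"),
        contrib_modules=os.path.expanduser("~/opencv_contrib/modules"),
    )
    # Print the command for review instead of running it.
    print(" ".join(shlex.quote(arg) for arg in args))
```

Printing the command first, rather than executing it, keeps the habit this tutorial recommends: read what you are about to run.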
You can verify that your cmake command executed properly by looking at the output:
...
--   NVIDIA CUDA:                   YES (ver 10.0, CUFFT CUBLAS FAST_MATH)
--     NVIDIA GPU arch:             70
--     NVIDIA PTX archs:
--
--   cuDNN:                         YES (ver 7.6.0)
...
Here you can see that OpenCV and cmake have correctly identified my CUDA-enabled GPU, NVIDIA GPU architecture version, and cuDNN version.
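If you save cmake's output to a file, the three summary entries discussed above can also be checked programmatically. This is a rough sketch under the assumption that the labels look like the ones shown in this section ("NVIDIA CUDA", "NVIDIA GPU arch", "cuDNN"); the function name `parse_cuda_summary` and the sample text are illustrative only.

```python
import re


def parse_cuda_summary(cmake_output):
    """Extract the CUDA, GPU arch, and cuDNN entries from cmake's summary text."""
    patterns = {
        "cuda": r"NVIDIA CUDA:[ \t]*(\S+)",
        "gpu_arch": r"NVIDIA GPU arch:[ \t]*(\S*)",
        "cudnn": r"cuDNN:[ \t]*(\S+)",
    }
    summary = {}
    for key, pattern in patterns.items():
        match = re.search(pattern, cmake_output, flags=re.IGNORECASE)
        # None means the entry was not found at all in the text.
        summary[key] = match.group(1) if match else None
    return summary


SAMPLE = """\
--   NVIDIA CUDA:                   YES (ver 10.0, CUFFT CUBLAS FAST_MATH)
--     NVIDIA GPU arch:             70
--     NVIDIA PTX archs:
--
--   cuDNN:                         YES (ver 7.6.0)
"""

if __name__ == "__main__":
    print(parse_cuda_summary(SAMPLE))
```

A missing entry, or a GPU arch that differs from the value you wrote down in Step #5, is a signal to stop and re-check your cmake flags before compiling.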
I also like to look at the OpenCV modules section, in particular the To be built portion:
--   OpenCV modules:
--     To be built:                 aruco bgsegm bioinspired calib3d ccalib core cudaarithm cudabgsegm cudacodec cudafeatures2d cudafilters cudaimgproc cudalegacy cudaobjdetect cudaoptflow cudastereo cudawarping cudev datasets dnn dnn_objdetect dnn_superres dpm face features2d flann fuzzy gapi hdf hfs highgui img_hash imgcodecs imgproc line_descriptor ml objdetect optflow phase_unwrapping photo plot python3 quality reg rgbd saliency shape stereo stitching structured_light superres surface_matching text tracking ts video videoio videostab xfeatures2d ximgproc xobjdetect xphoto
--     Disabled:                    world
--     Disabled by dependency:      -
--     Unavailable:                 cnn_3dobj cvv freetype java js matlab ovis python2 sfm viz
--     Applications:                tests perf_tests examples apps
--     Documentation:               NO
--     Non-free algorithms:         YES
Here you can see there are a number of cuda* modules, indicating that cmake is instructing OpenCV to build our CUDA-enabled modules (including OpenCV's "dnn" module).
You can also look at the Python 3 section to verify that both your Interpreter and numpy point to your Python virtual environment:
--   Python 3:
--     Interpreter:                 /home/a_rosebrock/.virtualenvs/opencv_cuda/bin/python3 (ver 3.5.3)
--     Libraries:                   /usr/lib/x86_64-linux-gnu/libpython3.5m.so (ver 3.5.3)
--     numpy:                       /home/a_rosebrock/.virtualenvs/opencv_cuda/lib/python3.5/site-packages/numpy/core/include (ver 1.18.1)
--     install path:                lib/python3.5/site-packages/cv2/python-3.5
Make sure you take note of the install path as well!
You will need this path when we finish installing OpenCV.
Step #7: Compile OpenCV with "dnn" GPU support
Provided cmake exited without an error, you can then compile OpenCV with NVIDIA GPU support using the following command:
$ make -j8
You can replace the 8 with the number of cores available on your processor.
Since my processor has eight cores, I supply an 8. If your processor only has four cores, replace the 8 with a 4.
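If you are unsure how many cores your machine has, Python can report it. The sketch below (with a hypothetical helper named `make_command`) builds the make invocation from that number; note that os.cpu_count() reports logical CPUs, which may be higher than the number of physical cores on machines with hyper-threading.

```python
import os


def make_command(jobs=None):
    """Return a `make` invocation using `jobs` parallel jobs (default: CPU count)."""
    if jobs is None:
        # os.cpu_count() can return None when the count cannot be determined.
        jobs = os.cpu_count() or 1
    return ["make", "-j%d" % jobs]


if __name__ == "__main__":
    print(" ".join(make_command()))
```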
As you can see, my build completed with no errors:
A common error you may see is the following:
$ make
make: *** No targets specified and no makefile found.  Stop.
If that happens, you should go back to Step #6 and check your cmake output: the cmake command likely exited with an error. If cmake exits with an error, the build files for make cannot be generated, so the make command reports that there are no build files to compile from. If that happens, go back to your cmake output and look for errors.
Step #8: Install OpenCV with "dnn" GPU support
Provided your make command from Step #7 completed successfully, you can now install OpenCV via the following:
$ sudo make install
$ sudo ldconfig
The final step is to sym-link the OpenCV library into your Python virtual environment.
To do so, you need to know where the OpenCV bindings were installed; you can determine that path via the install path configuration in Step #6.
In my case, the install path was lib/python3.5/site-packages/cv2/python-3.5.
That means that my OpenCV bindings should be in /usr/local/lib/python3.5/site-packages/cv2/python-3.5.
I can confirm the location by using the ls command:
$ ls -l /usr/local/lib/python3.5/site-packages/cv2/python-3.5
total 7168
-rw-r--r-- 1 root staff 7339240 Jan 17 18:59 cv2.cpython-35m-x86_64-linux-gnu.so
Here you can see that my OpenCV bindings are named cv2.cpython-35m-x86_64-linux-gnu.so. Yours should have a similar name based on your Python version and CPU architecture.
Now that I know the location of my OpenCV bindings, I need to sym-link them into my Python virtual environment using the ln command:
$ cd ~/.virtualenvs/opencv_cuda/lib/python3.5/site-packages/
$ ln -s /usr/local/lib/python3.5/site-packages/cv2/python-3.5/cv2.cpython-35m-x86_64-linux-gnu.so cv2.so
Take a second to first verify your file paths: the ln command will "silently fail" if the path to OpenCV's bindings is incorrect.
Again, do not blindly copy and paste the command above! Double- and triple-check your file paths!
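Since ln -s will happily create a link that points at a file that does not exist, one option is to check the target before linking. The sketch below does that in Python; the function name `link_bindings` is hypothetical, and the example paths in the comments are the ones from my machine, so substitute your own.

```python
import os
from pathlib import Path


def link_bindings(bindings_dir, site_packages, link_name="cv2.so"):
    """Symlink the compiled cv2 bindings found in `bindings_dir` into `site_packages`."""
    candidates = sorted(Path(bindings_dir).glob("cv2*.so"))
    if len(candidates) != 1:
        # Refuse to guess: a wrong path here is exactly the mistake to catch.
        raise FileNotFoundError(
            "expected exactly one cv2*.so in %s, found %d"
            % (bindings_dir, len(candidates))
        )
    link = Path(site_packages) / link_name
    if link.is_symlink() or link.exists():
        raise FileExistsError("%s already exists" % link)
    os.symlink(str(candidates[0].resolve()), str(link))
    return link


# Example usage with the paths from this step (adjust for your Python version):
#   link_bindings(
#       "/usr/local/lib/python3.5/site-packages/cv2/python-3.5",
#       os.path.expanduser("~/.virtualenvs/opencv_cuda/lib/python3.5/site-packages"),
#   )
```

Raising an error instead of creating a dangling link makes a wrong path obvious immediately, rather than at import time.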
Step #9: Verify that OpenCV uses your GPU with the "dnn" module
The last step is to check the following:
- OpenCV can be imported in your terminal
- OpenCV can access your NVIDIA GPU for inference via the dnn module
Let's start by verifying that we can import the cv2 library:
$ workon opencv_cuda
$ python
Python 3.5.3 (default, Sep 27 2018, 17:25:39)
[GCC 6.3.0 20170516] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import cv2
>>> cv2.__version__
'4.2.0'
>>>
Note that I am using the workon command to first access my Python virtual environment; you should do the same if you are using virtual environments.
From there I import the cv2 library and display the version.
Sure enough, the OpenCV version reported is v4.2, which is indeed the version we compiled.
Next, let's verify that OpenCV's "dnn" module can access our GPU. The key to ensuring OpenCV's "dnn" module uses the GPU is to add the following two lines immediately after a model is loaded and before inference is performed:
net.setPreferableBackend(cv2.dnn.DNN_BACKEND_CUDA)
net.setPreferableTarget(cv2.dnn.DNN_TARGET_CUDA)
The two lines above tell OpenCV that our NVIDIA GPU should be used for inference.
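To make the placement of those two lines concrete, here is a small sketch that wraps them in a helper with a CPU fallback. The helper name `configure_backend` and its `use_gpu` parameter are my own (loosely mirroring the --use-gpu flag used below), and this is not necessarily how the downloadable script is organized. The `dnn` argument is expected to be the cv2.dnn module; passing it in keeps the helper importable on machines without OpenCV.

```python
def configure_backend(net, dnn, use_gpu=True):
    """Point `net` at the CUDA backend and target, or at OpenCV's CPU path."""
    if use_gpu:
        # The two calls from the tutorial: must run after the model is loaded
        # and before net.forward() is called.
        net.setPreferableBackend(dnn.DNN_BACKEND_CUDA)
        net.setPreferableTarget(dnn.DNN_TARGET_CUDA)
    else:
        net.setPreferableBackend(dnn.DNN_BACKEND_OPENCV)
        net.setPreferableTarget(dnn.DNN_TARGET_CPU)
    return net


# Example usage (requires a CUDA-enabled OpenCV build and the model files):
#   import cv2
#   net = cv2.dnn.readNetFromCaffe("MobileNetSSD_deploy.prototxt",
#                                  "MobileNetSSD_deploy.caffemodel")
#   configure_backend(net, cv2.dnn, use_gpu=True)
#   # ... build a blob, call net.setInput(blob), then net.forward()
```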
To see an example of an OpenCV + GPU model in action, start by using the "Downloads" section of this tutorial to download our example source code and pre-trained SSD object detector.
From there, open a terminal and run the following command:
$ python ssd_object_detection.py --prototxt MobileNetSSD_deploy.prototxt \
	--model MobileNetSSD_deploy.caffemodel \
	--input guitar.mp4 --output output.avi \
	--display 0 --use-gpu 1
[INFO] setting preferable backend and target to CUDA...
[INFO] accessing video stream...
[INFO] elapsed time: 3.75
[INFO] approx. FPS: 65.90
The --use-gpu 1 flag instructs OpenCV to use our NVIDIA GPU for inference via OpenCV's "dnn" module.
As you can see, I am obtaining ~65.90 FPS using my NVIDIA Tesla V100 GPU.
I can then compare my output to using just the CPU (i.e., no GPU):
$ python ssd_object_detection.py --prototxt MobileNetSSD_deploy.prototxt \
	--model MobileNetSSD_deploy.caffemodel --input guitar.mp4 \
	--output output.avi --display 0
[INFO] accessing video stream...
[INFO] elapsed time: 11.69
[INFO] approx. FPS: 21.13
Here I am only obtaining ~21.13 FPS, implying that by using the GPU, I'm obtaining a 3x performance boost!
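The comparison above is simple arithmetic on the two FPS figures. As a worked sketch (the function name `speedup` is my own), the ratio and the "percent faster" figure can be computed like this:

```python
def speedup(gpu_fps, cpu_fps):
    """Return (ratio, percent_faster) for a GPU run relative to a CPU run."""
    ratio = gpu_fps / cpu_fps
    return ratio, (ratio - 1.0) * 100.0


if __name__ == "__main__":
    # FPS values reported by the two runs above.
    ratio, percent = speedup(gpu_fps=65.90, cpu_fps=21.13)
    print("%.2fx speedup (%.0f%% faster)" % (ratio, percent))
```

The ratio works out to a little over 3.1x, which appears to line up with the low end of the 211-1549% range quoted at the start of this tutorial.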
I'll give you a detailed walkthrough of the code in next week's blog post.
Help! I'm encountering a "make_policy" error
It is super, super important to check, double-check, and triple-check the CUDA_ARCH_BIN variable.
If you set it incorrectly, you may encounter the following error when running the ssd_object_detection.py script from the previous section:
File "real_time_object_detection.py", line 74, in <module>
    detections = net.forward()
cv2.error: OpenCV(4.2.0) /home/a_rosebrock/opencv/modules/dnn/src/cuda/execution.hpp:52: error: (-217:Gpu API call) invalid device function in function 'make_policy'
This error indicates that your CUDA_ARCH_BIN value was set incorrectly when running cmake.
You'll need to go back to Step #5 (where you identify your NVIDIA CUDA architecture version) and then re-run both cmake and make.
I would also suggest you delete your build directory and recreate it before running cmake and make:
$ cd ~/opencv
$ rm -rf build
$ mkdir build
$ cd build
From there you can re-run both cmake and make. Doing so in a fresh build directory ensures you have a clean build and that any previous (incorrect) configurations are gone.
Summary
In this tutorial, you learned how to build and install the OpenCV Deep Neural Network (DNN) module with support for NVIDIA GPU, CUDA, and cuDNN, providing 211-1549% faster inferences and predictions.
When using OpenCV's "dnn" module with a GPU, you'll need to compile from source: you cannot "pip install" OpenCV with GPU support.
In next week's tutorial, I'll compare popular deep learning models for CPU and GPU inference speed, including:
- Single Shot Detectors (SSD)
- You only look once (YOLO)
- Mask R-CNN
Armed with this information, you'll know which models will benefit the most from a GPU and ensure you can make an informed decision on whether or not a GPU is right for your specific project.
To download the source code to this post (and be notified when future tutorials are published here on PyImageSearch), just enter your email address in the form below!