How to use OpenCV's "dnn" module with NVIDIA GPUs, CUDA, and cuDNN - PyImageSearch (2023)


In this tutorial, you will learn how to use OpenCV's Deep Neural Network (DNN) module with NVIDIA GPUs, CUDA, and cuDNN for 211-1549% faster inference.

In August 2017, I published my first tutorial on how to use OpenCV's "Deep Neural Network" (DNN) module for image classification.

PyImageSearch readers loved the convenience and ease of use of OpenCV's dnn module so much that I went on to publish more tutorials on the dnn module, including:

  • Object detection with deep learning and OpenCV
  • Real-time object detection with deep learning and OpenCV
  • YOLO object detection with OpenCV
  • Mask R-CNN with OpenCV

Each of these guides used OpenCV's dnn module to (1) load a pre-trained network from disk, (2) make predictions on an input image, and then (3) display the results, allowing you to build your own custom computer vision/deep learning pipeline for your specific project.

However, the biggest problem with OpenCV's dnn module was a lack of NVIDIA GPU/CUDA support: with these models you could not easily use a GPU to improve the frames per second (FPS) processing rate of your pipeline.

This wasn't much of a problem for the Single Shot Detector (SSD) tutorials, which can easily run at 25-30+ FPS on a CPU, but it was a huge problem for YOLO and Mask R-CNN, which struggle to get above 1-3 FPS on a CPU.

That all changed at Google Summer of Code (GSoC) 2019.

Led by dlib's Davis King and implemented by Yashas Samaga, OpenCV 4.2 now supports NVIDIA GPUs for inference using OpenCV's dnn module, improving inference speed by up to 1549%!

In today's tutorial, I'll show you how to compile and install OpenCV to use your NVIDIA GPU for deep neural network inference.

Then, in next week's tutorial, I'll provide Single Shot Detector, YOLO, and Mask R-CNN code that uses your GPU with OpenCV. We'll then benchmark the results and compare them against CPU-only inference so you know which models benefit most from using a GPU.

To learn how to compile and install OpenCV's "dnn" module with NVIDIA GPU, CUDA, and cuDNN support, just keep reading!


How to use the OpenCV 'dnn' module with NVIDIA GPUs, CUDA and cuDNN

In the remainder of this tutorial, I'll show you how to build OpenCV from source code so that you can leverage NVIDIA's GPU-accelerated inference for pre-trained deep neural networks.

Assumptions when building OpenCV for NVIDIA GPU support

To compile and install the OpenCV "Deep Neural Network" module with NVIDIA GPU support, I will make the following assumptions:

  1. You have an NVIDIA GPU. This should be an obvious assumption. If you do not have an NVIDIA GPU, you cannot compile OpenCV's "dnn" module with NVIDIA GPU support.
  2. You are using Ubuntu 18.04 (or another Debian-based distribution). When it comes to deep learning, I highly recommend Unix-based machines over Windows systems (in fact, I don't support Windows on the PyImageSearch blog). If you intend to use a GPU for deep learning, choose Ubuntu over macOS or Windows; it is far easier to set up.
  3. You know how to use a command line. We will be using the command line throughout this tutorial. If you are not familiar with the command line, I recommend reading an introduction to the command line first and then spending a few hours (or even days) practicing. Again, this tutorial is not for those who are new to the command line.
  4. You are capable of reading terminal output and diagnosing problems. Compiling OpenCV from source can be challenging if you've never done it before; a number of things can go wrong, including missing packages, incorrect library paths, etc. Even with my detailed guides, you're likely to make a mistake along the way. Don't be discouraged! Take the time to understand the commands you run, what they do, and most importantly, read the output of the commands! Don't copy and paste blindly; you will only run into errors.

With all that said, let's start configuring the OpenCV "dnn" module for NVIDIA GPU inference.

Step #1: Install the NVIDIA CUDA drivers, CUDA Toolkit, and cuDNN

This tutorial assumes that you already have:

  • The CUDA drivers for your specific GPU installed
  • CUDA Toolkit and cuDNN configured and installed

If you have an NVIDIA GPU in your system but have not yet installed the CUDA drivers, CUDA Toolkit, and cuDNN, you will need to configure your machine first: I will not cover CUDA setup and installation in this guide.


To learn how to install the NVIDIA CUDA drivers, CUDA Toolkit, and cuDNN, I recommend you read my Ubuntu 18.04 and TensorFlow/Keras GPU install guide; once you have the proper NVIDIA drivers and toolkits installed, you can come back to this tutorial.
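Before moving on, a quick sanity check can confirm that the driver and toolkit are visible from your shell. The sketch below is only an illustration: the helper name find_cuda_tools is hypothetical, and it assumes nvidia-smi (shipped with the driver) and nvcc (shipped with the CUDA Toolkit) are on your PATH, which is not guaranteed even on a working install.

```python
import shutil


def find_cuda_tools(tools=("nvidia-smi", "nvcc")):
    """Map each tool name to its full path, or None if it is not on PATH.

    Hypothetical helper: a missing nvcc may only mean that the toolkit's
    bin directory has not been added to PATH.
    """
    return {tool: shutil.which(tool) for tool in tools}


if __name__ == "__main__":
    for tool, path in find_cuda_tools().items():
        status = path if path else "not found on PATH"
        print(f"{tool:12s} {status}")
```

If either tool is reported as missing, it is worth revisiting the driver and toolkit installation before spending time on the OpenCV build.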

Step #2: Install OpenCV and "dnn" GPU dependencies

The first step in configuring the OpenCV "dnn" module for NVIDIA GPU inference is to install the appropriate dependencies:

$ sudo apt-get update
$ sudo apt-get upgrade
$ sudo apt-get install build-essential cmake unzip pkg-config
$ sudo apt-get install libjpeg-dev libpng-dev libtiff-dev
$ sudo apt-get install libavcodec-dev libavformat-dev libswscale-dev
$ sudo apt-get install libv4l-dev libxvidcore-dev libx264-dev
$ sudo apt-get install libgtk-3-dev
$ sudo apt-get install libatlas-base-dev gfortran
$ sudo apt-get install python3-dev

Most of these packages should already be installed if you followed my Ubuntu 18.04 Deep Learning configuration guide, but I recommend running the commands above just to be safe.

Step #3: Download the OpenCV source code

There is no "pip-installable" version of OpenCV that comes with NVIDIA GPU support; instead, we need to compile OpenCV from scratch with the correct NVIDIA GPU settings.

The first step to do this is to download the OpenCV v4.2 source code:

$ cd ~
$ wget -O
$ wget -O
$ unzip
$ unzip
$ mv opencv-4.2.0 opencv
$ mv opencv_contrib-4.2.0 opencv_contrib

Now we can proceed to configure our build.
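Since the later steps rely on the renamed directories, it can help to confirm the layout before configuring anything. This is a minimal sketch with a hypothetical helper name; it assumes the sources live in ~/opencv and ~/opencv_contrib as produced by the mv commands above, and that each archive contains a modules subdirectory.

```python
from pathlib import Path


def missing_source_dirs(home=None):
    """Return the expected source directories that do not exist yet.

    Assumes the layout created by the mv commands in this step:
    <home>/opencv and <home>/opencv_contrib, each with a modules/ folder.
    """
    home = Path(home) if home is not None else Path.home()
    expected = [
        home / "opencv" / "modules",
        home / "opencv_contrib" / "modules",
    ]
    return [str(path) for path in expected if not path.is_dir()]


if __name__ == "__main__":
    missing = missing_source_dirs()
    if missing:
        print("Missing directories:")
        for path in missing:
            print("  " + path)
    else:
        print("OpenCV source tree looks complete.")
```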

Step #4: Set up the Python virtual environment

If you followed my Ubuntu 18.04, TensorFlow, and Keras Deep Learning configuration guide, then you should already have virtualenv and virtualenvwrapper installed:

  • If your machine is already configured, skip to the mkvirtualenv commands in this section.
  • If not, follow each of these steps to configure your device.

Python virtual environments are a best practice when it comes to Python development. They allow you to test different versions of Python libraries in separate, isolated development and production environments. I use them daily, and so should you.

If you haven't yet installed pip, Python's package manager, you can do so with the following commands:

$ wget
$ sudo python3

Once pip is installed, you can install both virtualenv and virtualenvwrapper:

$ sudo pip install virtualenv virtualenvwrapper
$ sudo rm -rf ~/.cache/pip

You then need to open your ~/.bashrc file and update it to automatically load virtualenv/virtualenvwrapper every time you open a terminal.

I prefer to use the nano text editor, but you can use whichever editor you're most comfortable with:

$ nano ~/.bashrc

Once you have the ~/.bashrc file open, scroll to the end of the file and paste the following:

# virtualenv and virtualenvwrapper
export WORKON_HOME=$HOME/.virtualenvs
export VIRTUALENVWRAPPER_PYTHON=/usr/bin/python3
source /usr/local/bin/

From there, save and exit the editor (ctrl + x, y, enter).

You can then reload your ~/.bashrc file in your terminal session:

$ source ~/.bashrc

You only need to run the above command once; since you updated your ~/.bashrc file, the virtualenv/virtualenvwrapper environment variables will be set automatically every time you open a new terminal window.

The final step is to create your Python virtual environment:

$ mkvirtualenv opencv_cuda -p python3

The mkvirtualenv command creates a new Python virtual environment named opencv_cuda using Python 3.

Next, you need to install NumPy into the opencv_cuda environment:

$ pip install numpy

If you ever close your terminal or deactivate your Python virtual environment, you can access it again via the workon command:

$ workon opencv_cuda

If you are new to Python virtual environments, I suggest you take a second and read up on how they work; they are a best practice in the Python world.

If you choose not to use them, that's fine, but keep in mind that your choice doesn't exempt you from learning Python best practices. Take the time now to invest in your knowledge.
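If you are ever unsure whether a terminal session is actually inside the opencv_cuda environment, the interpreter itself can tell you. The following is a minimal sketch (the function name is hypothetical) that covers both older virtualenv releases, which set sys.real_prefix, and venv or newer virtualenv releases, which change sys.prefix relative to sys.base_prefix.

```python
import sys


def in_virtualenv():
    """Return True when the running interpreter belongs to a virtual environment."""
    # Older virtualenv releases record the system interpreter in sys.real_prefix;
    # venv and newer virtualenv releases expose it as sys.base_prefix instead.
    base = getattr(sys, "real_prefix", None) or getattr(sys, "base_prefix", sys.prefix)
    return base != sys.prefix


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
    print("interpreter prefix:", sys.prefix)
```

Running this after workon opencv_cuda should report an interpreter prefix inside ~/.virtualenvs, assuming the WORKON_HOME value shown above.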

Step #5: Determine your CUDA architecture version

When compiling OpenCV's "dnn" module with NVIDIA GPU support, we need to determine our NVIDIA GPU architecture version:

  • This version number is a requirement when we set the CUDA_ARCH_BIN variable in our cmake command in the next section.
  • The NVIDIA GPU architecture version depends on which GPU you are using, so make sure you know your GPU model ahead of time.
  • If you don't set the CUDA_ARCH_BIN variable correctly, OpenCV may still compile but be unable to use your GPU for inference (which is troublesome to diagnose and debug).

One of the easiest ways to determine the architecture version of your NVIDIA GPU is to simply use the nvidia-smi command:

$ nvidia-smi
Mon Jan 27 14:11:32 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 410.104      Driver Version: 410.104      CUDA Version: 10.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla V100-SXM2...  Off  | 00000000:00:04.0 Off |                    0 |
| N/A   35C    P0    38W / 300W |      0MiB / 16130MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

If you inspect the output, you can see that I'm using an NVIDIA Tesla V100 GPU. Make sure you run the nvidia-smi command yourself and verify your GPU model before proceeding.

Now that I have my NVIDIA GPU model, I can determine the architecture version.

On this page, you can find your NVIDIA GPU architecture version for your specific GPU:


Scroll down to see the list of CUDA-enabled Tesla, Quadro, NVS, GeForce/Titan and Jetson products:

Since I'm using a V100, I'll click the "CUDA-Enabled Tesla Products" section:

If I scroll down, I see my V100 GPU:

As you can see, my NVIDIA GPU architecture version is 7.0; you need to go through the same process for your own GPU model.

Once you have identified the architecture version of your NVIDIA GPU, write it down and continue to the next section.
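The lookup described in this step can be sketched in a few lines of Python. Treat this as an illustration rather than a replacement for NVIDIA's list: the table below is deliberately tiny (only the V100 value of 7.0 comes from this tutorial; the other entries are compute capabilities I am adding as examples), the function names are hypothetical, and the nvidia-smi query flags are assumed to be supported by your driver version.

```python
import shutil
import subprocess

# Partial, illustrative table; always confirm against NVIDIA's published list.
COMPUTE_CAPABILITY = {
    "Tesla V100": "7.0",  # the GPU used in this tutorial
    "Tesla T4": "7.5",
    "Tesla P100": "6.0",
    "Tesla K80": "3.7",
    "GeForce GTX 1080": "6.1",
    "GeForce RTX 2080 Ti": "7.5",
}


def cuda_arch_bin(gpu_name):
    """Return the CUDA_ARCH_BIN value for the first table entry found in gpu_name."""
    for model, capability in COMPUTE_CAPABILITY.items():
        if model.lower() in gpu_name.lower():
            return capability
    return None


def cmake_arch_label(capability):
    """Convert '7.0' into '70', the compact form the cmake summary prints."""
    return capability.replace(".", "")


def installed_gpu_names():
    """Ask nvidia-smi for GPU names; returns [] when nvidia-smi is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"],
        capture_output=True, text=True, check=False,
    )
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


if __name__ == "__main__":
    # Fall back to a sample name so the script also runs on a machine without a GPU.
    for name in installed_gpu_names() or ["Tesla V100-SXM2"]:
        print(name, "->", cuda_arch_bin(name) or "not in table, look it up")
```

The substring match is intentional: nvidia-smi reports names such as "Tesla V100-SXM2...", which still contain the model string used as the table key.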

Step #6: Configure OpenCV with NVIDIA GPU support

At this point, we are ready to configure our build with the cmake command.

The cmake command checks for dependencies, configures the build, and generates the files necessary for make to actually compile OpenCV.

To set up the build, first make sure you are in the Python virtual environment you will be using to build OpenCV with NVIDIA GPU support:

$ workon opencv_cuda

Then change to the directory where you downloaded the OpenCV source code and create a build directory:

$ cd ~/opencv
$ mkdir build
$ cd build

You can then run the following cmake command, making sure you set the CUDA_ARCH_BIN variable based on the NVIDIA GPU architecture version you found in the previous section:
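As a hedged sketch of what such a configuration looks like, the Python snippet below assembles a cmake invocation from the options this step discusses (WITH_CUDA, WITH_CUDNN, OPENCV_DNN_CUDA, ENABLE_FAST_MATH, CUDA_FAST_MATH, WITH_CUBLAS, and CUDA_ARCH_BIN). The remaining options (CMAKE_BUILD_TYPE, CMAKE_INSTALL_PREFIX, OPENCV_EXTRA_MODULES_PATH) and the helper name are my assumptions, chosen to match the ~/opencv_contrib download location and the /usr/local install location used elsewhere in this tutorial; a real build may need further options.

```python
import os
import shlex


def cmake_command(cuda_arch_bin, contrib_modules="~/opencv_contrib/modules"):
    """Build the argument list for configuring OpenCV with CUDA-enabled dnn."""
    options = {
        # Assumed general build settings (not spelled out in the text).
        "CMAKE_BUILD_TYPE": "RELEASE",
        "CMAKE_INSTALL_PREFIX": "/usr/local",
        "OPENCV_EXTRA_MODULES_PATH": os.path.expanduser(contrib_modules),
        # Options discussed in this step.
        "WITH_CUDA": "ON",
        "WITH_CUDNN": "ON",
        "OPENCV_DNN_CUDA": "ON",
        "ENABLE_FAST_MATH": "1",
        "CUDA_FAST_MATH": "1",
        "WITH_CUBLAS": "1",
        "CUDA_ARCH_BIN": cuda_arch_bin,  # must match your GPU, see Step #5
    }
    args = ["cmake"]
    for name, value in options.items():
        args += ["-D", f"{name}={value}"]
    args.append("..")  # the source tree is the parent of the build directory
    return args


if __name__ == "__main__":
    # Print a copy-pasteable command; 7.0 is the V100 value from Step #5.
    print(" ".join(shlex.quote(arg) for arg in cmake_command("7.0")))
```

Printing the command rather than running it keeps the sketch side-effect free; you can review the output and then run it yourself from inside the build directory.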


Here you can see that we are building OpenCV with both CUDA and cuDNN support enabled (WITH_CUDA and WITH_CUDNN, respectively).

We also instruct OpenCV to build the "dnn" module with CUDA support (OPENCV_DNN_CUDA).

We also turn on ENABLE_FAST_MATH, CUDA_FAST_MATH, and WITH_CUBLAS for optimization purposes.

The most important and error-prone setting is CUDA_ARCH_BIN. Make sure you set it correctly!

The CUDA_ARCH_BIN variable must map to the NVIDIA GPU architecture version you found in the previous section.

If you set this value incorrectly, OpenCV may still compile, but you will get the following error when you try to perform inference with the dnn module:

File line 74, in
    detections = net.forward()
cv2.error: OpenCV(4.2.0) /home/a_rosebrock/opencv/modules/dnn/src/cuda/execution.hpp:52: error: (-217:Gpu API call) invalid device function in function 'make_policy'

If you get this error, you know your CUDA_ARCH_BIN was not set correctly.

You can verify that your cmake command executed successfully by looking at its output:

...
--   NVIDIA CUDA:                   YES (ver 10.0, CUFFT CUBLAS FAST_MATH)
--     NVIDIA GPU arch:             70
--     NVIDIA PTX archs:
--
--   cuDNN:                         YES (ver 7.6.0)
...

Here you can see that OpenCV and cmake have correctly identified my CUDA-enabled GPU, NVIDIA GPU architecture version, and cuDNN version.
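If you save the cmake output to a file, the same check can be automated. The sketch below is an assumption-laden illustration: the function name is hypothetical, and the regular expressions assume the summary labels look like the OpenCV 4.2 output shown in this step ("NVIDIA CUDA:", "NVIDIA GPU arch:", "cuDNN:"), which may differ in other OpenCV versions.

```python
import re

SAMPLE_OUTPUT = """\
--   NVIDIA CUDA:                   YES (ver 10.0, CUFFT CUBLAS FAST_MATH)
--     NVIDIA GPU arch:             70
--     NVIDIA PTX archs:
--
--   cuDNN:                         YES (ver 7.6.0)
"""


def parse_cuda_summary(cmake_output):
    """Extract the CUDA, GPU arch, and cuDNN fields from cmake's summary text."""
    patterns = {
        "cuda": r"NVIDIA CUDA:\s+(YES|NO)",
        "arch": r"NVIDIA GPU arch:\s+([\d ]+)",
        "cudnn": r"cuDNN:\s+(YES|NO)",
    }
    summary = {}
    for key, pattern in patterns.items():
        match = re.search(pattern, cmake_output)
        summary[key] = match.group(1).strip() if match else None
    return summary


if __name__ == "__main__":
    summary = parse_cuda_summary(SAMPLE_OUTPUT)
    print(summary)
    if summary["cuda"] != "YES" or summary["cudnn"] != "YES":
        print("CUDA or cuDNN was not detected; re-check your cmake options.")
```

A value of None for any field means the label was not found at all, which usually indicates that cmake stopped before printing its summary.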

I also like to look at the OpenCV modules section, in particular the To be built portion:

--   OpenCV modules:
--     To be built:                 aruco bgsegm bioinspired calib3d ccalib core cudaarithm cudabgsegm cudacodec cudafeatures2d cudafilters cudaimgproc cudalegacy cudaobjdetect cudaoptflow cudastereo cudawarping cudev datasets dnn dnn_objdetect dnn_superres dpm face features2d flann fuzzy gapi hdf hfs highgui img_hash imgcodecs imgproc line_descriptor ml objdetect optflow phase_unwrapping photo plot python3 quality reg rgbd saliency shape stereo stitching structured_light superres surface_matching text tracking ts video videoio videostab xfeatures2d ximgproc xobjdetect xphoto
--     Disabled:                    world
--     Disabled by dependency:      -
--     Unavailable:                 cnn_3dobj cvv freetype java js matlab ovis python2 sfm viz
--     Applications:                tests perf_tests examples apps
--     Documentation:               NO
--     Non-free algorithms:         YES

Here you can see that there are several cuda* modules, indicating that cmake is instructing OpenCV to build our CUDA-enabled modules (including OpenCV's "dnn" module).

You can also look at the Python 3 section to verify that both your Interpreter and numpy point to your Python virtual environment:

--   Python 3:
--     Interpreter:                 /home/a_rosebrock/.virtualenvs/opencv_cuda/bin/python3 (ver 3.5.3)
--     Libraries:                   /usr/lib/x86_64-linux-gnu/ (ver 3.5.3)
--     numpy:                       /home/a_rosebrock/.virtualenvs/opencv_cuda/lib/python3.5/site-packages/numpy/core/include (ver 1.18.1)
--     install path:                lib/python3.5/site-packages/cv2/python-3.5

Make sure you take note of the install path as well!

You will need this path when we finish installing OpenCV.

Step #7: Compile OpenCV with "dnn" GPU support

Provided cmake exited without errors, you can compile OpenCV with NVIDIA GPU support using the following command:

$ make -j8

You can replace the 8 with the number of cores available on your processor.

Since my processor has eight cores, I supply an 8. If your processor only has four cores, replace the 8 with a 4.
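If you are not sure how many cores your machine has, Python can report it. This is a small sketch with a hypothetical helper name; note that os.cpu_count() returns logical cores, which may be higher than the number of physical cores.

```python
import os


def make_command(cores=None):
    """Return a make invocation sized to the given (or detected) core count."""
    # os.cpu_count() can return None on unusual platforms, so fall back to 1.
    jobs = cores or os.cpu_count() or 1
    return f"make -j{jobs}"


if __name__ == "__main__":
    print(make_command())
```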

As you can see, my build completed with no errors:


A common error you may see is the following:

$ make
make: *** No targets specified and no makefile found.  Stop.

If that happens, go back to Step #6 and check your cmake output; the cmake command likely exited with an error. If cmake exits with an error, the build files for make cannot be generated, so the make command reports that there is nothing to build. In that case, review your cmake output and look for errors.

Step #8: Install OpenCV with "dnn" GPU support

Provided your make command from Step #7 completed successfully, you can now install OpenCV as follows:

$ sudo make install
$ sudo ldconfig

The last step is to sym-link the OpenCV library into your Python virtual environment.

To do this, you need to know where the OpenCV bindings were installed; you can determine that path via the install path setting in Step #6.

In my case, the install path was lib/python3.5/site-packages/cv2/python-3.5.

This means that my OpenCV bindings should be inside /usr/local/lib/python3.5/site-packages/cv2/python-3.5.

I can confirm the location with the ls command:

$ ls -l /usr/local/lib/python3.5/site-packages/cv2/python-3.5
total 7168
-rw-r--r-- 1 root staff 7339240 Jan 17 18:59 cv2.cpython-35m-x86_64

Here you can see that my OpenCV bindings are named cv2.cpython-35m-x86_64-linux-gnu.so; yours should have a similar name based on your Python version and CPU architecture.

Now that I know the location of my OpenCV bindings, I need to sym-link them into my Python virtual environment using the ln command:

$ cd ~/.virtualenvs/opencv_cuda/lib/python3.5/site-packages/
$ ln -s /usr/local/lib/python3.5/site-packages/cv2/python-3.5/cv2.cpython-35m-

Take a second to first verify your file paths: the ln command will "silently fail" if the path to the OpenCV bindings is wrong.

Again, do not blindly copy and paste the command above! Double- and triple-check your file paths!
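One way to guard against a dangling link is to check that the source file exists before linking. The helper below is a hypothetical sketch, not part of the original workflow: it refuses to create the link when the bindings file is missing, and the default link name cv2.so is my assumption for what the link inside site-packages is called.

```python
import os


def link_bindings(source, dest_dir, link_name="cv2.so"):
    """Create dest_dir/link_name pointing at source, refusing dangling links."""
    if not os.path.isfile(source):
        # ln -s would happily create a broken link here; we stop instead.
        raise FileNotFoundError(f"OpenCV bindings not found: {source}")
    link_path = os.path.join(dest_dir, link_name)
    os.symlink(source, link_path)
    return link_path


# Example call (fill in the full bindings file name reported by ls on your machine):
# link_bindings(
#     "/usr/local/lib/python3.5/site-packages/cv2/python-3.5/<your bindings file>",
#     os.path.expanduser("~/.virtualenvs/opencv_cuda/lib/python3.5/site-packages"),
# )
```

After linking, os.path.realpath on the returned path should resolve to the file under /usr/local; if it does not, remove the link and try again.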

Step #9: Verify that OpenCV uses your GPU with the "dnn" module

The last step is to check the following:

  1. OpenCV can be imported into your terminal
  2. OpenCV can access your NVIDIA GPU for inference via the dnn module

Let's start by checking that we can import the cv2 library:

$ workon opencv_cuda
$ python
Python 3.5.3 (default, Sep 27 2018, 17:25:39)
[GCC 6.3.0 20170516] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import cv2
>>> cv2.__version__
'4.2.0'
>>>

Note that I use the workon command to access my Python virtual environment first; you should do the same if you are using virtual environments.

From there, I import the cv2 library and display the version.

Sure enough, the OpenCV version reported is v4.2, which is indeed the OpenCV version we compiled.

Next, let's verify that OpenCV's "dnn" module can access our GPU. The key to ensuring that OpenCV's "dnn" module uses the GPU is to add the following two lines immediately after a model is loaded and before inference is performed:
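A minimal sketch of those two calls, wrapped in a small helper so they can be applied to any loaded network. The setPreferableBackend and setPreferableTarget methods and the DNN_BACKEND_CUDA and DNN_TARGET_CUDA constants belong to OpenCV's dnn module; the helper name and its optional dnn parameter are my own additions, included only so the function can be exercised on a machine without a CUDA-enabled build.

```python
def enable_cuda_inference(net, dnn=None):
    """Ask OpenCV to run `net` on the NVIDIA GPU through the CUDA backend.

    `net` is a network returned by one of the cv2.dnn.readNet* loaders.
    `dnn` defaults to the real cv2.dnn module (hypothetical test hook).
    """
    if dnn is None:
        import cv2  # needs an OpenCV build compiled with CUDA support
        dnn = cv2.dnn
    # The two lines in question: select the CUDA backend, then the CUDA target.
    net.setPreferableBackend(dnn.DNN_BACKEND_CUDA)
    net.setPreferableTarget(dnn.DNN_TARGET_CUDA)
    return net


# Typical use, with the model files from the example command in this step:
# net = cv2.dnn.readNetFromCaffe("MobileNetSSD_deploy.prototxt",
#                                "MobileNetSSD_deploy.caffemodel")
# enable_cuda_inference(net)
# detections = net.forward()  # after net.setInput(blob)
```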


The two lines above tell OpenCV that our NVIDIA GPU should be used for inference.

To see an example of an OpenCV + GPU model in action, start by using the "Downloads" section of this tutorial to download our example source code and pre-trained SSD object detector.

From there, open a terminal and run the following command:

$ python --prototxt MobileNetSSD_deploy.prototxt \
	--model MobileNetSSD_deploy.caffemodel \
	--input guitar.mp4 --output output.avi \
	--display 0 --use-gpu 1
[INFO] setting preferable backend and target to CUDA...
[INFO] accessing video stream...
[INFO] elapsed time: 3.75
[INFO] approx. FPS: 65.90

The --use-gpu 1 flag tells OpenCV to use our NVIDIA GPU for inference via OpenCV's dnn module.

As you can see, I get ~65.90 FPS with my NVIDIA Tesla V100 GPU.

I can then compare that performance with using only the CPU (i.e., no GPU):

$ python --prototxt MobileNetSSD_deploy.prototxt \
	--model MobileNetSSD_deploy.caffemodel --input guitar.mp4 \
	--output output.avi --display 0
[INFO] accessing video stream...
[INFO] elapsed time: 11.69
[INFO] approx. FPS: 21.13

Here I only obtain ~21.13 FPS, which means that by using the GPU I get a 3x performance boost!
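The arithmetic behind that figure is simple enough to sketch. The function names are hypothetical, and the numbers are the FPS values reported in the two runs above.

```python
def speedup(gpu_fps, cpu_fps):
    """Ratio of GPU throughput to CPU throughput."""
    return gpu_fps / cpu_fps


def percent_faster(gpu_fps, cpu_fps):
    """Throughput gain expressed as a percentage over the CPU baseline."""
    return 100.0 * (speedup(gpu_fps, cpu_fps) - 1.0)


if __name__ == "__main__":
    gpu_fps, cpu_fps = 65.90, 21.13  # values from the two runs above
    print(f"{speedup(gpu_fps, cpu_fps):.2f}x the CPU frame rate")
    print(f"{percent_faster(gpu_fps, cpu_fps):.0f}% faster than CPU-only inference")
```

For this SSD run the ratio works out to roughly 3.1x, or about 212% faster, which sits at the low end of the 211-1549% range quoted for this tutorial.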

I'll give you a detailed walkthrough of the code in next week's blog post.

Help! I encounter a "make_policy" error

It is super, super important to check, double-check, and triple-check the CUDA_ARCH_BIN variable.

If you set it incorrectly, you may encounter the following error when running the ssd_object_detection.py script from the previous section:


File "", line 74, in
    detections = net.forward()
cv2.error: OpenCV(4.2.0) /home/a_rosebrock/opencv/modules/dnn/src/cuda/execution.hpp:52: error: (-217:Gpu API call) invalid device function in function 'make_policy'

This error indicates that your CUDA_ARCH_BIN value was set incorrectly when running cmake.

You'll need to go back to Step #5 (where you identify your NVIDIA CUDA architecture version) and then re-run both cmake and make.

I also suggest you delete your build directory and recreate it before re-running cmake and make:

$ cd ~/opencv
$ rm -rf build
$ mkdir build
$ cd build

From there you can re-run both cmake and make; working from a fresh build directory ensures you have a clean build and that all old (incorrect) settings are gone.
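If you want your own scripts to surface this problem more clearly, you could inspect the exception text when inference fails. This is only a sketch: the helper name and the hint wording are hypothetical, and it relies on the error message containing "make_policy" or "invalid device function" as in the trace above.

```python
def diagnose_dnn_error(message):
    """Return a hint when an OpenCV error message matches the make_policy failure."""
    text = message.lower()
    if "make_policy" in text or "invalid device function" in text:
        return ("CUDA_ARCH_BIN probably does not match this GPU: recheck Step #5, "
                "delete the build directory, then re-run cmake and make.")
    return None


# Possible use around inference (cv2.error is the exception type OpenCV raises):
# try:
#     detections = net.forward()
# except cv2.error as exc:
#     hint = diagnose_dnn_error(str(exc))
#     if hint:
#         print("[HINT]", hint)
#     raise
```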



In this tutorial, you learned how to build and install OpenCV's Deep Neural Network (DNN) module with NVIDIA GPU, CUDA, and cuDNN support, allowing you to obtain 211-1549% faster inference and prediction.

To use OpenCV's "dnn" module with GPU support, you need to compile from source: you cannot "pip install" OpenCV with GPU support.

In next week's tutorial, I'll compare popular deep learning models for CPU and GPU inference speed, including:

  • Single Shot Detectors (SSD)
  • You Only Look Once (YOLO)
  • Mask R-CNN

Armed with this information, you'll know which models will benefit the most from a GPU and ensure you can make an informed decision on whether or not a GPU is right for your specific project.

To download the source code for this post (and be notified when future tutorials are published here on PyImageSearch), just enter your email address in the form below!





Article information

Author: Gregorio Kreiger

Last Updated: 30/09/2023
