In this tutorial, you will learn how to perform multi-template matching with OpenCV.
Last week you discovered how to utilize OpenCV and the cv2.matchTemplate function for basic template matching. The problem with this approach is that it could only detect one instance of the template in the input image, so you could not perform multi-object detection!
We could only detect one object because we were using the cv2.minMaxLoc function to find the single location with the largest normalized correlation score.
To perform multi-object template matching, what we instead need to do is:
- Apply the cv2.matchTemplate function as we normally would
- Find all (x, y)-coordinates where the template matching result matrix is greater than a preset threshold score
- Extract all of these regions
- Apply non-maxima suppression to them
After applying the above four steps, we’ll be able to detect multiple templates in the input image.
To learn how to perform multi-template matching with OpenCV, just keep reading.
Multi-template matching with OpenCV
In the first part of this tutorial, we’ll discuss the problem with basic template matching and how we can extend it to multi-template matching using some basic computer vision and image processing techniques.
We’ll then configure our development environment and review our project directory structure.
From there, we’ll implement multi-template matching using OpenCV.
We’ll wrap up the tutorial with a discussion of our results.
The problem with basic template matching
As we saw in last week’s tutorial, applying basic template matching results in only one instance of a particular template being matched, as seen in Figure 1.
Our input image contains the eight of diamonds, and our template contains the diamond symbol, so we would expect to detect all the diamonds in the input image.
However, when using basic template matching, multi-object detection simply isn’t possible.
The solution is to filter the result matrix from the cv2.matchTemplate function and then apply non-maxima suppression.
How can we match multiple templates with OpenCV?
To detect multiple objects/templates using OpenCV and cv2.matchTemplate, we need to filter the result matrix generated by cv2.matchTemplate, like so:
result = cv2.matchTemplate(image, template, cv2.TM_CCOEFF_NORMED)
(yCoords, xCoords) = np.where(result >= args["threshold"])
Calling cv2.matchTemplate results in a result matrix with the following spatial dimensions:
- Width: image.shape[1] - template.shape[1] + 1
- Height: image.shape[0] - template.shape[0] + 1
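To make the arithmetic concrete, here is a minimal sketch that computes the result size from the two shapes. The helper name match_result_shape and the example sizes are my own, chosen only for illustration:

```python
def match_result_shape(image_shape, template_shape):
    """Return (height, width) of the result matrix for the given shapes."""
    (iH, iW) = image_shape[:2]
    (tH, tW) = template_shape[:2]

    # the template has to fit inside the image at least once
    if tH > iH or tW > iW:
        raise ValueError("template must not be larger than the image")

    # one result entry per valid top-left position of the template
    return (iH - tH + 1, iW - tW + 1)


# hypothetical sizes: a 400x300 (H x W) image and a 50x40 template
print(match_result_shape((400, 300), (50, 40)))  # (351, 261)
```

Each entry in the result matrix corresponds to one top-left position where the template fits entirely inside the image, which is why the result is smaller than the input image.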
We then apply the np.where function to find all (x, y)-coordinates with a normalized correlation coefficient greater than or equal to our preset threshold. This thresholding step is what allows us to perform multi-template matching!
The final step, which we’ll cover later in this tutorial, is to apply non-maxima suppression to filter overlapping bounding boxes generated by the np.where filtering.
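If the order of the values returned by np.where looks backwards, the small sketch below may help. It runs the same thresholding on a tiny, made-up result matrix (the values are hypothetical, not real correlation scores) and shows that NumPy returns row indices first, i.e., y before x:

```python
import numpy as np

# a tiny hypothetical 3x4 "result" matrix (rows are y, columns are x)
result = np.array([
    [0.10, 0.85, 0.20, 0.05],
    [0.30, 0.95, 0.40, 0.10],
    [0.05, 0.20, 0.10, 0.90],
])
threshold = 0.8

# np.where returns (row indices, column indices), i.e., (y, x) order
(yCoords, xCoords) = np.where(result >= threshold)

# pair them up as (x, y) to match OpenCV's point convention
matches = [(int(x), int(y)) for (x, y) in zip(xCoords, yCoords)]
print(matches)  # [(1, 0), (1, 1), (3, 2)]
```

Note that the two neighboring hits at (1, 0) and (1, 1) would most likely belong to the same object in a real image, which is exactly the duplication that non-maxima suppression cleans up.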
After applying these steps, our output image will look like the following:
Notice that we’ve detected (almost) all the diamonds in the input image.
Configuring your development environment
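To follow this guide, you need to have the OpenCV library installed on your system. You will also need the imutils package, which provides the non_max_suppression function imported by our script.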
Luckily, OpenCV is pip-installable:
$ pip install opencv-contrib-python
If you need help configuring your development environment for OpenCV, I highly recommend that you read my pip install OpenCV guide — it will have you up and running in a matter of minutes.
Project structure
Let’s take a second to inspect our project directory structure. Be sure to access the “Downloads” section of this tutorial to retrieve the source code and example images.
After unzipping the code archive, you’ll find the following directory:
$ tree . --dirsfirst
.
├── images
│   ├── 8_diamonds.png
│   └── diamonds_template.png
└── multi_template_matching.py

1 directory, 5 files
We have only a single Python script to review today, multi_template_matching.py, which will perform multi-template matching using the input images in our images directory.
Implementing multi-template matching with OpenCV
Last week we learned how to perform template matching. The problem with that approach is that it failed when multiple occurrences of the template existed in the input image — template matching would only report one matched template (i.e., the template with the largest correlation score).
The Python script we’re about to review, multi_template_matching.py, will extend our basic template matching approach and allow us to match multiple templates.
Let’s get started:
# import the necessary packages
from imutils.object_detection import non_max_suppression
import numpy as np
import argparse
import cv2
Lines 2-5 import our required Python packages. Most importantly, we need the non_max_suppression function from the imutils package, which performs non-maxima suppression (NMS).
Applying multi-template matching will result in multiple detections for each object in our input image. We can fix this behavior by applying NMS to suppress weak, overlapping bounding boxes.
From there, we parse our command line arguments:
# construct the argument parser and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-i", "--image", type=str, required=True,
    help="path to input image where we'll apply template matching")
ap.add_argument("-t", "--template", type=str, required=True,
    help="path to template image")
ap.add_argument("-b", "--threshold", type=float, default=0.8,
    help="threshold for multi-template matching")
args = vars(ap.parse_args())
We have three arguments to parse, two of which are required, and the third is optional:
- --image: The path to the input image where we will apply multi-template matching.
- --template: Path to the template image (i.e., the example of the objects we want to detect).
- --threshold: Threshold applied to the template matching result matrix; locations that score below it are discarded. Values in the range [0.8, 0.95] typically work the best.
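As a quick sanity check of how these arguments behave, the sketch below rebuilds the same parser (help strings omitted for brevity) and feeds it an explicit argument list instead of reading the real command line. The file paths are simply the ones from our project structure:

```python
import argparse

# same three arguments as the script, minus the help strings
ap = argparse.ArgumentParser()
ap.add_argument("-i", "--image", type=str, required=True)
ap.add_argument("-t", "--template", type=str, required=True)
ap.add_argument("-b", "--threshold", type=float, default=0.8)

# pass an explicit list so we do not depend on sys.argv
base = ["--image", "images/8_diamonds.png",
        "--template", "images/diamonds_template.png"]
args = vars(ap.parse_args(base))
print(args["threshold"])  # 0.8 (the default)

# overriding the threshold with the short -b switch
strict = vars(ap.parse_args(base + ["-b", "0.9"]))
print(strict["threshold"])  # 0.9
```

Because --threshold has a default, you only need to supply it when the default of 0.8 produces too many or too few matches for your images.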
Next, let’s load our image and template from disk:
# load the input image and template image from disk, then grab the
# template image spatial dimensions
print("[INFO] loading images...")
image = cv2.imread(args["image"])
template = cv2.imread(args["template"])
(tH, tW) = template.shape[:2]

# display the image and template to our screen
cv2.imshow("Image", image)
cv2.imshow("Template", template)
Lines 20 and 21 load our image and template from disk. We grab the template’s spatial dimensions on Line 22, so we can use them to derive the bounding box coordinates of matched objects easily.
Lines 25 and 26 display our image and template to our screen.
The next step is to perform template matching, just like we did last week:
# convert both the image and template to grayscale
imageGray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
templateGray = cv2.cvtColor(template, cv2.COLOR_BGR2GRAY)

# perform template matching
print("[INFO] performing template matching...")
result = cv2.matchTemplate(imageGray, templateGray,
    cv2.TM_CCOEFF_NORMED)
Lines 29 and 30 convert our input images to grayscale while Lines 34 and 35 perform template matching.
If we are looking to detect just one instance of our template, we could simply call cv2.minMaxLoc to find the (x, y)-coordinates with the largest normalized correlation coefficient.
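For intuition, the single-best-match lookup can be imitated with plain NumPy. This is only a sketch of the idea on a made-up score matrix, not a replacement for cv2.minMaxLoc, and the variable names maxVal and maxLoc are my own:

```python
import numpy as np

# hypothetical 2x3 score matrix (rows are y, columns are x)
result = np.array([
    [0.10, 0.85, 0.20],
    [0.30, 0.95, 0.40],
])

# flat index of the largest score, converted back to (row, col)
(maxY, maxX) = np.unravel_index(np.argmax(result), result.shape)
maxVal = float(result[maxY, maxX])

# report the location in (x, y) order, as OpenCV does for points
maxLoc = (int(maxX), int(maxY))
print(maxVal, maxLoc)  # 0.95 (1, 1)
```

The second-best score of 0.85 is silently ignored here, which is precisely why this approach can only ever report one match.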
However, since we want to detect multiple objects, we need to filter our result matrix and find all (x, y)-coordinates that have a score greater than or equal to our --threshold:
# find all locations in the result map where the matched value is
# greater than the threshold, then clone our original image so we
# can draw on it
(yCoords, xCoords) = np.where(result >= args["threshold"])
clone = image.copy()
print("[INFO] {} matched locations *before* NMS".format(len(yCoords)))

# loop over our starting (x, y)-coordinates
for (x, y) in zip(xCoords, yCoords):
    # draw the bounding box on the image
    cv2.rectangle(clone, (x, y), (x + tW, y + tH),
        (255, 0, 0), 3)

# show our output image *before* applying non-maxima suppression
cv2.imshow("Before NMS", clone)
cv2.waitKey(0)
Line 40 uses np.where to find all (x, y)-coordinates where the correlation score is greater than or equal to our --threshold command line argument.
Line 42 then displays the total number of matched locations before applying NMS.
From there, we loop over all the matched (x, y)-coordinates and draw their bounding boxes on our screen (Lines 45-48).
If we ended our implementation here, we would have a problem: a call to np.where will return all (x, y)-coordinates that are above our threshold.
It could very well be the case that multiple locations refer to the same object. If that happens, we’ll essentially report the same object multiple times, which we want to avoid at all costs.
The solution is to apply non-maxima suppression:
# initialize our list of rectangles
rects = []

# loop over the starting (x, y)-coordinates again
for (x, y) in zip(xCoords, yCoords):
    # update our list of rectangles
    rects.append((x, y, x + tW, y + tH))

# apply non-maxima suppression to the rectangles
pick = non_max_suppression(np.array(rects))
print("[INFO] {} matched locations *after* NMS".format(len(pick)))

# loop over the final bounding boxes
for (startX, startY, endX, endY) in pick:
    # draw the bounding box on the image
    cv2.rectangle(image, (startX, startY), (endX, endY),
        (255, 0, 0), 3)

# show the output image
cv2.imshow("After NMS", image)
cv2.waitKey(0)
Line 55 starts by initializing our list of bounding box rects. We then loop over all our (x, y)-coordinates, compute their respective bounding boxes, and then update the rects list.
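As an aside, the same list of rectangles can be built without an explicit Python loop. The sketch below is an optional alternative, not what the script does, and it uses made-up coordinates and a made-up template size:

```python
import numpy as np

# hypothetical matched top-left corners and template size
xCoords = np.array([1, 1, 3])
yCoords = np.array([0, 1, 2])
(tH, tW) = (20, 10)

# each row is (startX, startY, endX, endY)
rects = np.column_stack([xCoords, yCoords, xCoords + tW, yCoords + tH])
print(rects.tolist())
# [[1, 0, 11, 20], [1, 1, 11, 21], [3, 2, 13, 22]]
```

Either way, the key point is that every box has the same width and height as the template; only the top-left corner changes from match to match.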
Applying non-maxima suppression on Line 63 suppresses overlapping bounding boxes, essentially collapsing multiple overlapping detections into a single detection. Note that we pass only the box coordinates to non_max_suppression here, not the correlation scores.
Finally, Lines 67-70 loop over our final bounding boxes and draw them on our output image.
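If you are curious what non-maxima suppression does under the hood, here is a simplified greedy sketch. To be clear, this is not the imutils implementation our script calls: the function name simple_nms is hypothetical, it measures overlap with intersection-over-union, and it assumes you pass a score for every box:

```python
import numpy as np


def simple_nms(boxes, scores, overlap_thresh=0.3):
    """Greedy NMS sketch: keep the best box, drop boxes overlapping it."""
    boxes = np.asarray(boxes, dtype=float)
    if len(boxes) == 0:
        return []

    # indices of the boxes, sorted from highest to lowest score
    order = np.argsort(scores)[::-1]
    (x1, y1, x2, y2) = (boxes[:, 0], boxes[:, 1], boxes[:, 2], boxes[:, 3])
    areas = (x2 - x1) * (y2 - y1)
    keep = []

    while len(order) > 0:
        # the highest-scoring remaining box is always kept
        i = order[0]
        keep.append(int(i))
        rest = order[1:]

        # intersection of box i with every remaining box
        w = np.maximum(0.0, np.minimum(x2[i], x2[rest]) - np.maximum(x1[i], x1[rest]))
        h = np.maximum(0.0, np.minimum(y2[i], y2[rest]) - np.maximum(y1[i], y1[rest]))
        inter = w * h
        iou = inter / (areas[i] + areas[rest] - inter)

        # discard the boxes that overlap box i too much
        order = rest[iou <= overlap_thresh]

    return keep


# two clusters of hypothetical 20x20 detections with made-up scores
boxes = [(10, 10, 30, 30), (11, 10, 31, 30), (12, 11, 32, 31),
         (100, 50, 120, 70), (101, 51, 121, 71)]
scores = [0.90, 0.95, 0.85, 0.88, 0.92]
print(simple_nms(boxes, scores))  # [1, 4]
```

Five raw detections collapse into two, one per cluster, and in each cluster the box with the highest score survives.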
Multi-template matching results
We are now ready to apply multi-template matching with OpenCV!
Be sure to access the “Downloads” section of this tutorial to retrieve the source code and example images.
From there, open a terminal and execute the following command:
$ python multi_template_matching.py --image images/8_diamonds.png \
    --template images/diamonds_template.png
[INFO] loading images...
[INFO] performing template matching...
[INFO] 601 matched locations *before* NMS
[INFO] 8 matched locations *after* NMS
Figure 4 displays our diamonds_template.png (left) and 8_diamonds.png image (right). Our goal is to detect all the diamond symbols in the right image.
After applying the cv2.matchTemplate function, we filter the resulting matrix, finding the (x, y)-coordinates with normalized correlation coefficients greater than our --threshold argument.
This process results in a total of 601 matched locations, which we visualize below:
Looking at Figure 5, along with our terminal output, you’re likely surprised to see that we have 601 matched regions. How is that even possible?! There are only 8 diamonds on the eight of diamonds card (ten if you count the additional diamonds by the 8 digits themselves), and that certainly doesn’t add up to 601 total matches!
This phenomenon is something I discuss in my non-maxima suppression tutorial. Object detection algorithms are similar to a “heatmap.” The closer a sliding window gets to an object in an image, the “hotter and hotter” the heatmap gets.
Then, when we filter this heatmap using the np.where call, we end up with all the locations above a given threshold. Keep in mind that the np.where function has no idea how many objects are in an image; it’s just telling you where there are likely objects.
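The heatmap effect is easy to reproduce on synthetic data. The sketch below assumes a smooth, Gaussian-shaped score peak around a single object (a stand-in I made up for illustration, not real cv2.matchTemplate output) and counts how many locations clear the threshold:

```python
import numpy as np

# hypothetical 41x41 score map with one smooth peak at the center
(ys, xs) = np.mgrid[0:41, 0:41]
dist2 = (xs - 20) ** 2 + (ys - 20) ** 2
result = np.exp(-dist2 / (2 * 5.0 ** 2))  # peak value of 1.0 at (20, 20)

# the same thresholding step used in the script
(yCoords, xCoords) = np.where(result >= 0.8)
print(len(yCoords))  # 37
```

A single object yields 37 above-threshold locations in this toy example, so hundreds of raw matches for a handful of real diamonds is not surprising at all.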
The solution here is simple and one that nearly all object detection algorithms (including advanced deep learning-based ones) use — non-maxima suppression (NMS).
Using NMS, we examine the correlation coefficient scores and suppress those that are both (1) overlapping and (2) have lower scores than their surrounding neighbors.
Applying NMS yields the 8 matched locations of the diamond symbols:
What about the small diamonds next to the “8” digits in the corners of the card? Why weren’t those detected?
That goes back to one of the primary limitations of template matching:
Template matching will fail when the objects you want to detect start differing in scale, rotation, and viewing angle.
Since the size of the diamonds is smaller than our template, the standard template matching procedure will fail to detect them.
When that happens, you could rely on multi-scale template matching. Alternatively, you may want to consider training an object detector that can naturally handle these types of variations, such as most deep learning-based object detectors (e.g., Faster R-CNN, SSDs, YOLO, etc.).
Credits and References
I would like to thank TheAILearner for their excellent article on template matching — I cannot take credit for the idea of using playing cards to demonstrate template matching. That was their idea, and it was an excellent one at that. Credits to them for coming up with that example, which I shamelessly used here, thank you.
Additionally, the eight of diamonds image was obtained from the Reddit post by u/fireball_73.
Summary
In this tutorial, you learned how to perform multi-template matching using OpenCV.
Unlike basic template matching, which can only detect a single instance of a template in an input image, multi-template matching allows us to detect multiple instances of the template.
Applying multi-object template matching is a four-step process:
- Apply the cv2.matchTemplate function as we normally would
- Find all (x, y)-coordinates where the template matching result matrix is greater than a preset threshold score
- Extract all of these regions
- Apply non-maxima suppression to them
While this method can handle multi-object template matching, it is still prone to the other limitations of template matching — if the scale, rotation, or viewing angle of the objects change, template matching can fail.
You may be able to leverage multi-scale template matching (which is different from multi-template matching). Still, if you get to that point, you may want to look into more advanced object detection methods such as HOG + Linear SVM, Faster R-CNN, SSDs, and YOLO.
Regardless, template matching is fast, efficient, and easy to implement. Hence, it’s worth trying as a “first step” when performing object detection (just be aware of the limitations beforehand).
To download the source code to this post (and be notified when future tutorials are published here on PyImageSearch), simply enter your email address in the form below!
Download the Source Code and FREE 17-page Resource Guide
Enter your email address below to get a .zip of the code and a FREE 17-page Resource Guide on Computer Vision, OpenCV, and Deep Learning. Inside you'll find my hand-picked tutorials, books, courses, and libraries to help you master CV and DL!