So you’ve just built your first awesome computer vision app.
Maybe it can detect faces in images. Or maybe your app can recognize prescription pills in photos. Or maybe your computer vision app can identify the covers of top selling books, all while displaying the latest reader reviews and the cheapest websites online to purchase them.
So the big question is…how do you wrap your computer vision app in an easy to use web API?
With more and more services heading to the cloud, your users, customers, and fellow developers are likely expecting an easy to consume web API (and probably in JSON format).
Creating a computer vision web API is actually not as hard as you think — I’ll go as far as to say it’s unbelievably easy to wrap your application in a consumable API using a Python web framework such as Django or Flask.
Personally, I’m a big fan of Django. I’ve done a ton of work with Django in the past and loved every minute of it. And while it’s a bit of overkill for a small example project like this (especially when compared to a micro-framework such as Flask), I still think it’s an excellent choice. And of course, feel free to port this implementation into whichever framework best fits your own personal needs and preferences.
Anyway, in the rest of this tutorial I’ll be demonstrating how to create your own face detection API in only 5 minutes!
And as a bonus at the end of this article, I’ll give you a sneak peek of what’s on deck for next week: the unveiling of the (free) PyImageSearch web API.
Set your timers — Ready. Set. Go!
OpenCV and Python versions:
In order to run this example, you’ll need Python 2.7 and OpenCV 2.4.X.
Creating a face detection API with Python and OpenCV (in just 5 minutes)
After getting a ton of awesome reader feedback on the step-by-step tutorial on installing OpenCV on your Raspberry Pi 2/B+, I decided to take the same approach to this tutorial — I’ve created 8 simple, bite size steps to get your own face detection API up and running.
The goal here is that if you run the commands presented at each of the steps below and copy and paste the code snippets into the appropriate Django project files, your face detection API will be up and running on your local system within 5 minutes.
However, I am going to start by assuming that you have OpenCV set up and installed. If not, then you’re going to need to install it prior to proceeding (and that’s going to break the 5 minute goal of this post).
Disclaimer: Before finishing my PhD, I used to do a lot of web application development. But over the past 3 years my focus has been entirely on the computer vision, scientific, and research side of things. If my Django code is not perfect, I’ll be the first to apologize. However, also realize that the intention of this tutorial is not to build a “bulletproof” API using all the latest Django bells and whistles. Instead, it’s meant to be a simple and concise demonstration of how you can take a computer vision application (specifically, a face detector) and turn it into a web API with little effort.
Step 1: Setup your environment
The first step is to get our development environment setup and running. We’ll need only three required packages:
$ pip install numpy django requests
We need NumPy since OpenCV represents images as multi-dimensional NumPy arrays. And technically NumPy should already be installed if you have OpenCV installed as well.
The django package contains the Django web framework.
And we’ll also use the requests package to make interfacing with our web API much easier.
Step 2: Create our Django project
Now that the prerequisites are installed, let’s set up our Django project:
$ django-admin startproject cv_api
$ cd cv_api
These commands create a new Django project, aptly named cv_api.
The cv_api directory now contains all the necessary code and configurations to run our Django project. This code has been auto-generated and includes basic database configurations, project based options, and application settings. It also includes the ability to run a built in web server for testing (which we’ll get to later in this tutorial).
Here’s the directory structure of our new project:
|--- cv_api
|    |--- cv_api
|        |--- __init__.py
|        |--- settings.py
|        |--- urls.py
|        |--- wsgi.py
|    |--- manage.py
Before we proceed, let’s briefly chat about the structure of a Django project.
A Django project consists of multiple apps. One of the core paradigms of the Django framework is that each app should be reusable in any project (theoretically, anyway). Therefore, we do not (normally) place any app-specific code inside the cv_api directory. Instead, we explicitly create separate “apps” inside the cv_api project.
With this in mind, let’s create a new app named face_detector, which will house our code for building a face detection API:
$ python manage.py startapp face_detector
Notice how we now have a face_detector directory inside our cv_api directory. Again, Django has auto-generated some boilerplate code and configurations for our face_detector app, the contents of which are shown below:
|--- cv_api
|    |--- cv_api
|        |--- __init__.py
|        |--- settings.py
|        |--- urls.py
|        |--- wsgi.py
|    |--- face_detector
|        |--- __init__.py
|        |--- admin.py
|        |--- migrations
|        |--- models.py
|        |--- tests.py
|        |--- views.py
|    |--- manage.py
Now that our Django project is all set up, we can get to coding.
Step 3: My personal computer vision API template
This step is where the actual “wrapping” of our computer vision project comes into place and where we are going to insert our face detection code.
The Django framework is a type of a Model-View-Template (MVT) framework, similar to a Model-View-Controller, where a “View” can be thought of as a type of web page. Inside the View you place all the necessary code to interact with Models, such as pulling data from a database, and processing it. The View is also responsible for populating the Template before it is sent to the user.
In our case, all we need is the View portion of the framework. We are not going to be interacting with the database, so the Model is not relevant to us. And we are going to ship the results of our API back to the end-user as a JSON object, so we won’t need the Template to render any HTML for us.
Again, our API is simply going to accept an image from a URL/stream, process it, and return a JSON response.
Step 3a: My personal boilerplate template when building a Python + OpenCV API
Before we dive into the code to perform the actual face detection, I want to share with you my personal boilerplate template when building a Python + OpenCV API. You can use this code as a starting point when building your own computer vision API.
# import the necessary packages
from django.views.decorators.csrf import csrf_exempt
from django.http import JsonResponse
import numpy as np
import urllib
import json
import cv2

@csrf_exempt
def detect(request):
    # initialize the data dictionary to be returned by the request
    data = {"success": False}

    # check to see if this is a post request
    if request.method == "POST":
        # check to see if an image was uploaded
        if request.FILES.get("image", None) is not None:
            # grab the uploaded image
            image = _grab_image(stream=request.FILES["image"])

        # otherwise, assume that a URL was passed in
        else:
            # grab the URL from the request
            url = request.POST.get("url", None)

            # if the URL is None, then return an error
            if url is None:
                data["error"] = "No URL provided."
                return JsonResponse(data)

            # load the image and convert
            image = _grab_image(url=url)

        ### START WRAPPING OF COMPUTER VISION APP
        # Insert code here to process the image and update
        # the `data` dictionary with your results
        ### END WRAPPING OF COMPUTER VISION APP

        # update the data dictionary
        data["success"] = True

    # return a JSON response
    return JsonResponse(data)

def _grab_image(path=None, stream=None, url=None):
    # if the path is not None, then load the image from disk
    if path is not None:
        image = cv2.imread(path)

    # otherwise, the image does not reside on disk
    else:
        # if the URL is not None, then download the image
        if url is not None:
            resp = urllib.urlopen(url)
            data = resp.read()

        # if the stream is not None, then the image has been uploaded
        elif stream is not None:
            data = stream.read()

        # convert the image to a NumPy array and then read it into
        # OpenCV format
        image = np.asarray(bytearray(data), dtype="uint8")
        image = cv2.imdecode(image, cv2.IMREAD_COLOR)

    # return the image
    return image
This boilerplate API code defines two functions: detect, which is our actual view, and _grab_image, which is a nice little convenience function to read an image from disk, URL, or stream into OpenCV format. From a code organization and reusability perspective, you probably want to put the _grab_image function in a “utilities” module that is globally accessible throughout the Django project. But as a matter of completeness, I have included the _grab_image function inside the views.py file. I’ll leave it as a personal decision as to where you want to store this function.
Most of our time should be spent examining the detect function. In reality, you could call this function whatever you want, but you probably want to make the name relevant to the goal the function is accomplishing. In the context of face detection, naming the main API endpoint detect in the face_detector Django app seems fitting.
The detect function accepts a single parameter, a request, which is a Django object containing properties germane to the web request.
Inside the actual view, I like to define a data dictionary. This dictionary represents all data that will be JSON-ified and shipped back to the user. At a bare minimum this dictionary should include a success/failure flag.
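The response envelope described above can be sketched outside of Django using only the standard library. This is a minimal illustration, not the view itself: the helper name make_response is hypothetical, and inside the real view JsonResponse(data) takes care of the serialization.

```python
import json


def make_response(results=None, error=None):
    """Build the dictionary that is serialized and returned to the caller.

    `make_response` is a hypothetical helper, not part of the original view.
    """
    # start pessimistic: the request has not succeeded yet
    data = {"success": False}

    if error is not None:
        # report what went wrong and leave the flag as False
        data["error"] = error
        return data

    # merge in whatever the computer vision code produced
    data.update(results or {})
    data["success"] = True
    return data


# serialize the same way a JSON response body would be
print(json.dumps(make_response({"num_faces": 1}), sort_keys=True))
print(json.dumps(make_response(error="No URL provided."), sort_keys=True))
```

Keeping the success flag in every response means a client can check one key before touching anything else in the payload.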
From there, we need to process the actual request and determine how the image was sent to our API.
If our image was uploaded via multi-part form data, we can simply process the data stream directly and read it into OpenCV format (Lines 17-19).
Otherwise, we’ll assume that instead of the raw image being uploaded, a URL pointing to an image was passed into our API. In that case, we’ll read the image from the URL and into OpenCV format (Lines 22-32).
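The branching between an uploaded file and a URL can be sketched in isolation. In the snippet below, plain dictionaries stand in for Django’s request.FILES and request.POST (an assumption made so the example runs without Django), and select_source is a hypothetical name, not something from the template.

```python
def select_source(files, post):
    """Decide where the image comes from, mirroring the view's branching.

    `files` and `post` are plain dicts standing in for request.FILES and
    request.POST (an assumption for this sketch). Returns a (kind, value)
    tuple where kind is "stream", "url", or "error".
    """
    # an uploaded file takes priority over a URL
    if files.get("image") is not None:
        return ("stream", files["image"])

    # otherwise fall back to the URL form field
    url = post.get("url")
    if url is None:
        return ("error", "No URL provided.")

    return ("url", url)


print(select_source({}, {"url": "http://example.com/face.jpg"}))
print(select_source({}, {}))
```

Separating the decision from the image loading makes this part easy to unit test without a running server.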
Lines 34-37 are where you would actually “wrap” your computer vision app. Here you would insert any code related to processing, manipulating, or classifying your image. You’ll also want to update your data dictionary with any relevant information related to the results of processing your image.
Finally, after all the image processing is done, we send a JSON response of the data back to the user on Line 43.
Step 4: Inserting the face detector into my template API
Now that we have examined the boilerplate code for a Python + OpenCV web API, let’s take it and insert the face detector. Open up the cv_api/face_detector/views.py file and insert the following code:
# import the necessary packages
from django.views.decorators.csrf import csrf_exempt
from django.http import JsonResponse
import numpy as np
import urllib
import json
import cv2
import os

# define the path to the face detector
FACE_DETECTOR_PATH = "{base_path}/cascades/haarcascade_frontalface_default.xml".format(
    base_path=os.path.abspath(os.path.dirname(__file__)))

@csrf_exempt
def detect(request):
    # initialize the data dictionary to be returned by the request
    data = {"success": False}

    # check to see if this is a post request
    if request.method == "POST":
        # check to see if an image was uploaded
        if request.FILES.get("image", None) is not None:
            # grab the uploaded image
            image = _grab_image(stream=request.FILES["image"])

        # otherwise, assume that a URL was passed in
        else:
            # grab the URL from the request
            url = request.POST.get("url", None)

            # if the URL is None, then return an error
            if url is None:
                data["error"] = "No URL provided."
                return JsonResponse(data)

            # load the image and convert
            image = _grab_image(url=url)

        # convert the image to grayscale, load the face cascade detector,
        # and detect faces in the image
        image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
        detector = cv2.CascadeClassifier(FACE_DETECTOR_PATH)
        rects = detector.detectMultiScale(image, scaleFactor=1.1, minNeighbors=5,
            minSize=(30, 30), flags=cv2.cv.CV_HAAR_SCALE_IMAGE)

        # construct a list of bounding boxes from the detection
        rects = [(int(x), int(y), int(x + w), int(y + h)) for (x, y, w, h) in rects]

        # update the data dictionary with the faces detected
        data.update({"num_faces": len(rects), "faces": rects, "success": True})

    # return a JSON response
    return JsonResponse(data)

def _grab_image(path=None, stream=None, url=None):
    # if the path is not None, then load the image from disk
    if path is not None:
        image = cv2.imread(path)

    # otherwise, the image does not reside on disk
    else:
        # if the URL is not None, then download the image
        if url is not None:
            resp = urllib.urlopen(url)
            data = resp.read()

        # if the stream is not None, then the image has been uploaded
        elif stream is not None:
            data = stream.read()

        # convert the image to a NumPy array and then read it into
        # OpenCV format
        image = np.asarray(bytearray(data), dtype="uint8")
        image = cv2.imdecode(image, cv2.IMREAD_COLOR)

    # return the image
    return image
As you can see, we haven’t inserted much code beyond the standard boilerplate OpenCV API template.
The first thing you’ll notice is that I’m defining the FACE_DETECTOR_PATH (Lines 11 and 12), which is simply the path to where the pre-trained OpenCV face detector lives. In this case, I’ve included the pre-trained face detector inside the face_detector/cascades directory of the application.
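As a small variation on the path construction, here is a sketch that builds the same kind of path with os.path.join and checks that the file exists before handing it to OpenCV. Both helper names (cascade_path and require_file) are hypothetical, and the early check is my own suggestion rather than something the view above does; failing early on a missing file can be easier to debug than an error that surfaces later inside the detection call.

```python
import os


def cascade_path(app_dir, filename="haarcascade_frontalface_default.xml"):
    """Build the path to a cascade file stored in <app_dir>/cascades."""
    # os.path.join picks the correct separator for the current platform
    return os.path.join(os.path.abspath(app_dir), "cascades", filename)


def require_file(path):
    """Raise early if the file is missing (hypothetical helper)."""
    if not os.path.isfile(path):
        raise IOError("cascade file not found: {}".format(path))
    return path


# inside views.py, app_dir would be os.path.dirname(__file__)
print(cascade_path("."))
```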
The real face detection magic takes place on Lines 41-44.
Now that we have our image in OpenCV format (whether it was uploaded via multi-part form encoded data or via URL), we start by converting our input image to grayscale. We discard any color information since color adds little to face detection accuracy.
From there we load our face detector on Line 42, supplying the path to our pre-trained face detector. Now that our face detector is loaded, we can apply the detectMultiScale method and detect the actual faces.
I’m not going to perform an exhaustive review of the parameters to detectMultiScale since I cover them in-depth inside my book, Practical Python and OpenCV + Case Studies, but the important takeaway here is that these parameters influence the speed, efficiency, and the false-positive detection rate of faces in images.
The detectMultiScale function returns a list of bounding boxes, or simply the (x, y)-coordinates, and width and height, of the faces in the image. Given this list of bounding boxes, we package them into our data dictionary and ship them back to the user on Lines 47-53.
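The conversion from (x, y, width, height) boxes to corner coordinates is worth seeing on its own. The sketch below uses the same list comprehension as the view, wrapped in a hypothetical function named to_corner_boxes, and feeds it a hand-written box instead of real detector output.

```python
def to_corner_boxes(rects):
    """Convert (x, y, w, h) boxes into (startX, startY, endX, endY) tuples."""
    # int() turns NumPy integer types into plain Python ints, which the
    # standard JSON encoder may otherwise refuse to serialize
    return [(int(x), int(y), int(x + w), int(y + h)) for (x, y, w, h) in rects]


# a hand-written detection: top-left corner at (410, 100), 181 pixels square
boxes = to_corner_boxes([(410, 100, 181, 181)])
print(boxes)  # [(410, 100, 591, 281)]
```

Returning corner coordinates means a client can pass each box straight to a rectangle-drawing call without doing any arithmetic.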
Not too bad, right?
As you can see, the majority of the code is still related to the computer vision API boilerplate. The actual detection of the faces took only a few lines of code.
Step 5: Update the URLs to include an endpoint to our API
But before we can access our face detection API, we first need to update the project URLs to include our face detection endpoint.
Simply open up the cv_api/cv_api/urls.py file, and update it to include a URL endpoint to our face detection view:
from django.conf.urls import patterns, include, url
from django.contrib import admin

urlpatterns = patterns('',
    # Examples:

    url(r'^face_detection/detect/$', 'face_detector.views.detect'),

    # url(r'^$', 'cv_api.views.home', name='home'),
    # url(r'^blog/', include('blog.urls')),

    url(r'^admin/', include(admin.site.urls)),
)
Step 6: Run the Django test server
Alright, now we’re ready to test out our face detection API!
Simply use your terminal to navigate back to the cv_api project root and fire up the test server:
$ python manage.py runserver
Our web server is now available at http://localhost:8000
And if you open up your web browser and point it to http://localhost:8000/face_detection/detect/ you should see the JSON response from our face detection API.
Since the browser sends a GET request and we have not uploaded an image, the view skips all image processing and returns a JSON response of {"success": false}.
Step 7: Test out the face detection API via cURL
Before we do anything too crazy, let’s test out our face detection using cURL. We’ll start by passing the URL of this image (https://pyimagesearch.com/wp-content/uploads/2015/05/obama.jpg) of Barack Obama into our face detection API:
Let’s construct the command to interact with our face detection API via cURL:
$ curl -X POST 'http://localhost:8000/face_detection/detect/' -d 'url=https://pyimagesearch.com/wp-content/uploads/2015/05/obama.jpg' ; echo ""
{"num_faces": 1, "success": true, "faces": [[410, 100, 591, 281]]}
And sure enough, based on the output we were able to detect Obama’s face (although we can’t yet visualize the bounding box, we’ll do that in the following section).
Let’s try another image, this time uploading via file instead of URL:
Again, we’ll need to construct our cURL command, assuming that the name of the above file is adrian.jpg:
$ curl -X POST -F image=@adrian.jpg 'http://localhost:8000/face_detection/detect/' ; echo ""
{"num_faces": 1, "success": true, "faces": [[180, 114, 222, 156]]}
And based on the JSON response we were indeed able to detect the face in the image.
Step 8: Write some Python code to interact with the face detection API
Using cURL to test out our face detection API was simple enough — but let’s write some actual Python code that can upload and interact with images sent to our API. This way we can actually ingest the JSON response and draw the bounding boxes surrounding the faces in the images.
Open up a new file, name it test_api.py, and include the following code:
# import the necessary packages
import requests
import cv2

# define the URL to our face detection API
url = "http://localhost:8000/face_detection/detect/"

# use our face detection API to find faces in images via image URL
image = cv2.imread("obama.jpg")
payload = {"url": "https://pyimagesearch.com/wp-content/uploads/2015/05/obama.jpg"}
r = requests.post(url, data=payload).json()
print "obama.jpg: {}".format(r)

# loop over the faces and draw them on the image
for (startX, startY, endX, endY) in r["faces"]:
    cv2.rectangle(image, (startX, startY), (endX, endY), (0, 255, 0), 2)

# show the output image
cv2.imshow("obama.jpg", image)
cv2.waitKey(0)

# load our image and now use the face detection API to find faces in
# images by uploading an image directly
image = cv2.imread("adrian.jpg")
payload = {"image": open("adrian.jpg", "rb")}
r = requests.post(url, files=payload).json()
print "adrian.jpg: {}".format(r)

# loop over the faces and draw them on the image
for (startX, startY, endX, endY) in r["faces"]:
    cv2.rectangle(image, (startX, startY), (endX, endY), (0, 255, 0), 2)

# show the output image
cv2.imshow("adrian.jpg", image)
cv2.waitKey(0)
We’ll start by importing the requests package to handle sending and receiving data from our API. We’ll also import cv2 for our OpenCV bindings.
From there, Lines 6-20 handle uploading an image via URL to our face detection API.
All we need to do is define a payload dictionary that contains a url key, with the corresponding value being our image URL of Barack Obama above. We then ship this payload dictionary to the face detection endpoint (Lines 6-11), where our API responds with the number of faces detected in the image, along with the bounding boxes of the faces. Finally, we take the bounding boxes and draw them on the actual image (Lines 15-20).
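If you want to inspect the response before drawing anything, the parsing step can be sketched without OpenCV or a running server. The function name parse_faces is hypothetical, and the sample body is copied from the cURL output earlier in this post.

```python
import json


def parse_faces(body):
    """Return the face boxes from a JSON response body, or raise on failure."""
    payload = json.loads(body)

    # the API always includes a success flag; treat False as an error
    if not payload.get("success"):
        raise ValueError(payload.get("error", "request was not successful"))

    # each box is [startX, startY, endX, endY]
    return [tuple(box) for box in payload.get("faces", [])]


# sample body copied from the cURL output earlier in this post
body = '{"num_faces": 1, "success": true, "faces": [[410, 100, 591, 281]]}'
for (startX, startY, endX, endY) in parse_faces(body):
    print("face is {}x{} pixels".format(endX - startX, endY - startY))
```

Checking the success flag first keeps a failed request from turning into a confusing KeyError on the faces key.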
We’ll also upload an image from disk to our face detection API on Lines 24-35. Uploading an image from disk is just as simple as uploading via URL: we just need to specify the files parameter rather than the data parameter when making a call to requests.post.
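The difference between the two calls comes down to which keyword argument carries the payload. Here is a minimal sketch of that choice, using a hypothetical helper named build_post_kwargs; it only builds the arguments and does not import requests or touch the network.

```python
def build_post_kwargs(image_url=None, image_path=None):
    """Return keyword arguments suitable for requests.post.

    Hypothetical helper: a URL travels in `data` (ordinary form fields),
    while a local file travels in `files` (multi-part upload).
    """
    if image_path is not None:
        # the caller is responsible for closing the file handle
        return {"files": {"image": open(image_path, "rb")}}

    if image_url is not None:
        return {"data": {"url": image_url}}

    raise ValueError("provide image_url or image_path")


print(build_post_kwargs(image_url="http://example.com/face.jpg"))
```

With this in place, a call could look like requests.post(url, **build_post_kwargs(image_path="adrian.jpg")), assuming adrian.jpg exists in the working directory.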
To see our script in action, just execute the following command:
$ python test_api.py
First, we’ll see the image of the bounding box drawn around Obama’s face:
Followed by the successful detection and bounding box around my face:
Clearly our face detection API is working! And we were able to utilize it via both image file upload and image URL.
Faces aren’t being detected in my images. What gives?
If you downloaded the code to this post and gave it a try with your own images, you might have run into circumstances where faces were not detected in your images — even though the faces were clearly visible.
So what gives?
While Haar cascades are quite fast and can obtain decent accuracy, they have two prominent shortcomings.
The first is parameter tuning: you’ll need to tweak the parameters of detectMultiScale to get the detection just right for many images. It can be a real pain, especially if you are looking to process many images in bulk and can’t visually inspect the output of each face detection.
The second shortcoming of Haar cascades is that they can be highly prone to false positives, meaning that faces are detected when there really aren’t any faces there! Again, this problem can be fixed by tuning the parameters of detectMultiScale on a case-by-case basis.
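One way to make the tuning less painful is to sweep a small grid of parameter values and compare how many detections each combination yields. The sketch below is an assumption-laden illustration: sweep_parameters and fake_detect are hypothetical names, and the detector is passed in as a plain callable so the example runs without OpenCV. With OpenCV installed, that callable could wrap detector.detectMultiScale.

```python
import itertools


def sweep_parameters(detect, image, scale_factors, min_neighbors_values):
    """Count detections for every combination of the two parameters.

    `detect` is any callable taking (image, scaleFactor, minNeighbors) and
    returning a sequence of boxes.
    """
    counts = {}
    for (sf, mn) in itertools.product(scale_factors, min_neighbors_values):
        counts[(sf, mn)] = len(detect(image, sf, mn))
    return counts


def fake_detect(image, scaleFactor, minNeighbors):
    # stand-in detector for this sketch: a stricter minNeighbors value
    # returns fewer boxes, and the image argument is ignored
    return [(0, 0, 10, 10)] * max(0, 6 - minNeighbors)


results = sweep_parameters(fake_detect, None, [1.05, 1.1], [3, 5])
for key in sorted(results):
    print("{} -> {} detections".format(key, results[key]))
```

A table like this will not tell you which detections are correct, but it can quickly show which settings produce suspiciously many or suspiciously few boxes.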
In reality, Haar cascades and the Viola-Jones detector, while effective, have run their course in computer vision history. For highly accurate object detectors we now rely on HOG + Linear SVMs and deep learning based methods, especially Convolutional Neural Networks.
That all said, it’s hard to beat the pure speed of Haar cascades, even if their accuracy and false-positive rate are a bit sub-par, at least compared to today’s state-of-the-art techniques.
Bonus: A live example of the face detection API
Want to give the face detection API a try? No problem.
I already have a face detection API instance spun up and running. You can find the face detection API endpoint here:
http://api.pyimagesearch.com/face_detection/detect/
And here’s another cURL example of detecting faces in an image to get you started. Only this time we are using the live API endpoint:
$ curl -X POST 'http://api.pyimagesearch.com/face_detection/detect/' -d 'url=https://pyimagesearch.com/wp-content/uploads/2015/05/obama.jpg' ; echo ""
{"num_faces": 1, "success": true, "faces": [[410, 100, 591, 281]]}
So given the URL http://api.pyimagesearch.com, I bet you can guess what next week’s announcement is…but I’ll leave the rest until next Monday.
Summary
In this blog post we learned how to wrap our computer vision applications into an easy to use and consume, JSON web API. We utilized the Django web framework to build our API, but we could use any other Python web framework such as Flask — it really depends on your personal preference and how simple or advanced you want your API to be.
I also shared my personal boilerplate API code that you can use to wrap your own computer vision applications and make them web-ready.
Finally, I gave a sneak preview of next week’s big announcement — the (free) PyImageSearch API.
So, what’s next?
If you enjoyed this blog post and want to learn more about computer vision and OpenCV, I would definitely recommend taking a look at my book, Practical Python and OpenCV + Case Studies.
Inside my book you’ll continue to learn all about face detection (including an explanation of the detectMultiScale parameters I mentioned earlier in this post) in both images and video, tracking objects in video, recognizing handwritten digits, and even how to identify book covers in a snap of your smartphone.
If these topics sound interesting to you, definitely take a look and consider grabbing a free sample chapter.