In today’s blog post we are going to create a deep learning REST API that wraps a Keras model in an efficient, scalable manner.
Our Keras + deep learning REST API will be capable of batch processing images, scaling to multiple machines (including multiple web servers and Redis instances), and round-robin scheduling when placed behind a load balancer.
To accomplish this we will be using:
- Keras
- Redis (an in-memory data structure store)
- Flask (a micro web framework for Python)
- Message queuing and message broker programming paradigms
This blog post is a bit more advanced than other tutorials on PyImageSearch and is intended for readers:
- Who are familiar with the Keras deep learning library
- Who have an understanding of web frameworks and web services (and ideally coded a simple website/web service before)
- Who understand basic data structures, such as hash tables/dictionaries, lists, along with their associated asymptotic complexities
For a simpler Keras + deep learning REST API, please refer to this guest post I did on the official Keras.io blog.
To learn how to create your own scalable Keras + deep learning REST API, just keep reading!
A scalable Keras + deep learning REST API
2020-06-16 Update: This blog post is now TensorFlow 2+ compatible!
Today’s tutorial is broken into multiple parts.
We’ll start with a brief discussion of the Redis data store and how it can be used to facilitate message queuing and message brokering.
From there, we’ll configure our Python development environment by installing the required Python packages to build our Keras deep learning REST API.
Once we have our development environment configured we can implement our actual Keras deep learning REST API using the Flask web framework. After implementing, we’ll start the Redis and Flask servers, followed by submitting inference requests to our deep learning API endpoint using both cURL and Python.
Finally, we’ll end with a short discussion on the considerations you should keep in mind when building your own deep learning REST API.
A short introduction to Redis as a REST API message broker/message queue
Redis is an in-memory data store. It is different than a simple key/value store (such as memcached) as it can store actual data structures.
Today we’re going to utilize Redis as a message broker/message queue. This involves:
- Running Redis on our machine
- Queuing up data (images) to our Redis store to be processed by our REST API
- Polling Redis for new batches of input images
- Classifying the images and returning the results to the client
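The steps above can be sketched without a running Redis server. The snippet below is only an illustration: `FakeRedis` is a hypothetical in-memory stand-in (not part of any library) that mimics the list commands `rpush`, `lrange`, and `ltrim` plus `set`/`get`, and the "classifier" simply upper-cases a string.

```python
import json
import uuid


class FakeRedis:
    """Hypothetical in-memory stand-in for the few Redis commands used here."""

    def __init__(self):
        self.lists = {}
        self.keys = {}

    def _slice(self, name, start, end):
        # Redis list ranges use an inclusive end index; -1 means the tail
        items = self.lists.get(name, [])
        if end < 0:
            end = len(items) + end
        return items[start:end + 1]

    def rpush(self, name, value):
        self.lists.setdefault(name, []).append(value)

    def lrange(self, name, start, end):
        return self._slice(name, start, end)

    def ltrim(self, name, start, end):
        self.lists[name] = self._slice(name, start, end)

    def set(self, key, value):
        self.keys[key] = value

    def get(self, key):
        return self.keys.get(key)


db = FakeRedis()
QUEUE, BATCH_SIZE = "image_queue", 2

# client side: queue up three "images"
ids = []
for payload in ["img-a", "img-b", "img-c"]:
    k = str(uuid.uuid4())
    ids.append(k)
    db.rpush(QUEUE, json.dumps({"id": k, "image": payload}))

# server side: poll one batch, "classify" it, store results, trim the queue
batch = [json.loads(q) for q in db.lrange(QUEUE, 0, BATCH_SIZE - 1)]
for item in batch:
    db.set(item["id"], json.dumps({"label": item["image"].upper()}))
db.ltrim(QUEUE, len(batch), -1)

remaining = len(db.lrange(QUEUE, 0, -1))
print(len(batch), remaining)
```

The third item stays in the queue for the next poll, which is the behavior the real server relies on.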
To read more about Redis, I encourage you to review this short introduction.
Configuring and installing Redis for our Keras REST API
Redis is very easy to install. Below you’ll find the commands to download, extract, and install Redis on your system:
$ wget http://download.redis.io/redis-stable.tar.gz
$ tar xvzf redis-stable.tar.gz
$ cd redis-stable
$ make
$ sudo make install
To start the Redis server, use the following command:
$ redis-server
Leave this terminal open to keep the Redis data store running.
In another terminal, you can validate Redis is up and running:
$ redis-cli ping
PONG
Provided that you get a PONG back from Redis, you’re ready to go.
Configuring your Python development environment to build a Keras REST API
I recommend that you work on this project inside of a Python virtual environment so that it does not impact system level Python and projects.
To do this, you’ll need to install pip, virtualenv, and virtualenvwrapper (provided you haven’t already). Instructions on configuring these tools in your environment are described in:
Please note that PyImageSearch does not recommend or support Windows for CV/DL projects.
You’ll also need the following packages installed into your virtual environment:
$ workon dl4cv
$ pip install flask
$ pip install gevent
$ pip install requests
$ pip install redis
That’s it!
Implementing a scalable Keras REST API
Let’s get started building our server script. For convenience I’ve implemented the server in a single file, however it can be modularized as you see fit.
For best results and to avoid copy/paste errors, I encourage you to use the “Downloads” section of this blog post to grab the associated scripts and images.
Let’s open up run_keras_server.py and walk through it together:
# import the necessary packages
from tensorflow.keras.applications import ResNet50
from tensorflow.keras.preprocessing.image import img_to_array
from tensorflow.keras.applications.resnet50 import preprocess_input
from tensorflow.keras.applications.resnet50 import decode_predictions
from threading import Thread
from PIL import Image
import numpy as np
import base64
import flask
import redis
import uuid
import time
import json
import sys
import io
There are quite a few imports listed above, notably ResNet50, flask, and redis.
For the sake of simplicity, we’ll be using ResNet pre-trained on the ImageNet dataset. I’ll point out where you can swap out ResNet for your own models.
The flask module contains the Flask library (used to build our web API). The redis module will enable us to interface with the Redis data store.
From there, let’s initialize constants which will be used throughout run_keras_server.py:
# initialize constants used to control image spatial dimensions and
# data type
IMAGE_WIDTH = 224
IMAGE_HEIGHT = 224
IMAGE_CHANS = 3
IMAGE_DTYPE = "float32"

# initialize constants used for server queuing
IMAGE_QUEUE = "image_queue"
BATCH_SIZE = 32
SERVER_SLEEP = 0.25
CLIENT_SLEEP = 0.25
We’ll be passing float32 images to the server with dimensions of 224 x 224 and containing 3 channels.
Our server can handle a BATCH_SIZE = 32. If you have GPU(s) on your production system, you’ll want to tune your BATCH_SIZE for optimal performance.
I’ve found that setting both SERVER_SLEEP and CLIENT_SLEEP to 0.25 seconds (the amount of time the server and client will pause before polling Redis again, respectively) will work well on most systems. Definitely adjust these constants if you’re building a production system.
Let’s kick off our Flask app and Redis server:
# initialize our Flask application, Redis server, and Keras model
app = flask.Flask(__name__)
db = redis.StrictRedis(host="localhost", port=6379, db=0)
model = None
Here you can see how easy it is to start Flask.
I’ll assume that your Redis server is already running before you run this server script. Our Python script connects to the Redis store on our localhost on port 6379 (the default host and port values for Redis).
Don’t forget to initialize a global Keras model to None here as well.
From there let’s handle serialization of images:
def base64_encode_image(a):
    # base64 encode the input NumPy array
    return base64.b64encode(a).decode("utf-8")

def base64_decode_image(a, dtype, shape):
    # if this is Python 3, we need the extra step of encoding the
    # serialized NumPy string as a byte object
    if sys.version_info.major == 3:
        a = bytes(a, encoding="utf-8")

    # convert the string to a NumPy array using the supplied data
    # type and target shape
    a = np.frombuffer(base64.decodebytes(a), dtype=dtype)
    a = a.reshape(shape)

    # return the decoded image
    return a
Redis will act as our temporary data store on the server. Images will come in to the server via a variety of methods such as cURL, a Python script, or even a mobile app.
Furthermore, images could come in only every once in a while (a few every hour or day) or at a very high rate (multiple per second). We need to put the images somewhere as they queue up prior to being processed. Our Redis store will act as the temporary storage.
In order to store our images in Redis, they need to be serialized. Since images are just NumPy arrays, we can utilize base64 encoding to serialize the images. Using base64 encoding also has the added benefit of allowing us to use JSON to store additional attributes with the image.
Our base64_encode_image function handles the serialization and is defined on Lines 36-38.
Similarly, we need to deserialize our images prior to passing them through our model. This is handled by the base64_decode_image function on Lines 40-52.
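As a minimal sketch of that round trip, the snippet below encodes a tiny array, wraps it in JSON, and decodes it again. The 2 x 2 image and the "example-id" key are placeholders chosen for illustration, not values from the server script.

```python
import base64
import json

import numpy as np

# a tiny stand-in for a preprocessed image of shape (1, H, W, C)
shape = (1, 2, 2, 3)
image = np.arange(12, dtype="float32").reshape(shape)

# serialize: raw bytes -> base64 text, which is safe to embed in JSON
encoded = base64.b64encode(image).decode("utf-8")
message = json.dumps({"id": "example-id", "image": encoded})

# deserialize: the bytes carry no dtype or shape, so both must be supplied
payload = json.loads(message)
raw = base64.b64decode(payload["image"])
decoded = np.frombuffer(raw, dtype="float32").reshape(shape)

print(np.array_equal(image, decoded))
```

Because the encoded bytes carry neither data type nor shape, the decoder has to be told both, which is why the server keeps them as constants.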
Let’s pre-process our image:
def prepare_image(image, target):
    # if the image mode is not RGB, convert it
    if image.mode != "RGB":
        image = image.convert("RGB")

    # resize the input image and preprocess it
    image = image.resize(target)
    image = img_to_array(image)
    image = np.expand_dims(image, axis=0)
    image = preprocess_input(image)

    # return the processed image
    return image
On Line 54, I’ve defined a prepare_image function which pre-processes our input image for classification using the ResNet50 implementation in Keras. When utilizing your own models I would suggest modifying this function to perform any required pre-processing, scaling, or normalization.
From there we’ll define our classification method:
def classify_process():
    # load the pre-trained Keras model (here we are using a model
    # pre-trained on ImageNet and provided by Keras, but you can
    # substitute in your own networks just as easily)
    print("* Loading model...")
    model = ResNet50(weights="imagenet")
    print("* Model loaded")
The classify_process function will be kicked off in its own thread as we’ll see in __main__ below. This function will poll for image batches from the Redis server, classify the images, and return the results to the client.
Line 73 loads the model. I’ve sandwiched this action with terminal print messages; depending on the size of your Keras model, loading could be instantaneous or it could take a few seconds.
Loading the model happens only once, when this thread is launched. It would be terribly slow if we had to load the model each time we wanted to process an image, and furthermore it could lead to a server crash due to memory exhaustion.
After loading the model, this thread will continually poll for new images and then classify them:
    # continually poll for new images to classify
    while True:
        # attempt to grab a batch of images from the database, then
        # initialize the image IDs and batch of images themselves
        queue = db.lrange(IMAGE_QUEUE, 0, BATCH_SIZE - 1)
        imageIDs = []
        batch = None

        # loop over the queue
        for q in queue:
            # deserialize the object and obtain the input image
            q = json.loads(q.decode("utf-8"))
            image = base64_decode_image(q["image"], IMAGE_DTYPE,
                (1, IMAGE_HEIGHT, IMAGE_WIDTH, IMAGE_CHANS))

            # check to see if the batch list is None
            if batch is None:
                batch = image

            # otherwise, stack the data
            else:
                batch = np.vstack([batch, image])

            # update the list of image IDs
            imageIDs.append(q["id"])
Here we’re first using the Redis database’s lrange function to get, at most, BATCH_SIZE images from our queue (Line 80).
From there we initialize our imageIDs and batch (Lines 81 and 82) and begin looping over the queue beginning on Line 85.
In the loop, we first decode the object and deserialize it into a NumPy array, image (Lines 87-89).
Next, on Lines 91-97, we’ll add the image to the batch (or if the batch is currently None we just set the batch to the current image).
We also append the id of the image to imageIDs (Line 100).
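To make the stacking step concrete, here is a small sketch with three hypothetical 4 x 4 images standing in for the decoded 224 x 224 arrays. Each has a leading batch dimension of 1, so stacking along the first axis yields a batch of 3.

```python
import numpy as np

# three hypothetical decoded images, each shaped (1, H, W, C) like the
# arrays returned by base64_decode_image (tiny 4x4 images for brevity)
images = [np.full((1, 4, 4, 3), i, dtype="float32") for i in range(3)]

batch = None
for image in images:
    # the first image starts the batch; later images are stacked on axis 0
    batch = image if batch is None else np.vstack([batch, image])

print(batch.shape)
```

Collecting the arrays in a list and calling `np.vstack(images)` once produces the same result and avoids re-copying the growing batch on every iteration, which may matter for larger batch sizes.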
Let’s finish out the loop and function:
        # check to see if we need to process the batch
        if len(imageIDs) > 0:
            # classify the batch
            print("* Batch size: {}".format(batch.shape))
            preds = model.predict(batch)
            results = decode_predictions(preds)

            # loop over the image IDs and their corresponding set of
            # results from our model
            for (imageID, resultSet) in zip(imageIDs, results):
                # initialize the list of output predictions
                output = []

                # loop over the results and add them to the list of
                # output predictions
                for (imagenetID, label, prob) in resultSet:
                    r = {"label": label, "probability": float(prob)}
                    output.append(r)

                # store the output predictions in the database, using
                # the image ID as the key so we can fetch the results
                db.set(imageID, json.dumps(output))

            # remove the set of images from our queue
            db.ltrim(IMAGE_QUEUE, len(imageIDs), -1)

        # sleep for a small amount
        time.sleep(SERVER_SLEEP)
In this code block, we check if there are any images in our batch (Line 103).
If we have a batch of images, we make predictions on the entire batch by passing it through the model (Line 106).
From there, we loop over the imageIDs and corresponding prediction results (Lines 111-123). These lines append labels and probabilities to an output list and then store the output in the Redis database using the imageID as the key (Lines 117-123).
We remove the set of images that we just classified from our queue using ltrim on Line 126.
And finally, we sleep for the set SERVER_SLEEP time and await the next batch of images to classify.
Let’s handle the /predict endpoint of our REST API next:
@app.route("/predict", methods=["POST"])
def predict():
    # initialize the data dictionary that will be returned from the
    # view
    data = {"success": False}

    # ensure an image was properly uploaded to our endpoint
    if flask.request.method == "POST":
        if flask.request.files.get("image"):
            # read the image in PIL format and prepare it for
            # classification
            image = flask.request.files["image"].read()
            image = Image.open(io.BytesIO(image))
            image = prepare_image(image, (IMAGE_WIDTH, IMAGE_HEIGHT))

            # ensure our NumPy array is C-contiguous as well,
            # otherwise we won't be able to serialize it
            image = image.copy(order="C")

            # generate an ID for the classification then add the
            # classification ID + image to the queue
            k = str(uuid.uuid4())
            d = {"id": k, "image": base64_encode_image(image)}
            db.rpush(IMAGE_QUEUE, json.dumps(d))
As you’ll see later, when we POST to the REST API, we’ll be using the /predict endpoint. Our server could, of course, have multiple endpoints.
We use the @app.route decorator above our function in the format shown on Line 131 to define our endpoint so that Flask knows what function to call. We could easily have another endpoint which uses AlexNet instead of ResNet and we’d define the endpoint with associated function in a similar way. You get the idea, but for our purposes today, we just have one endpoint called /predict.
Our predict method defined on Line 132 will handle the POST requests to the server. The goal of this function is to build the JSON data that we’ll send back to the client.
If the POST data contains an image (Lines 138 and 139) we convert the image to PIL/Pillow format and preprocess it (Lines 142-144).
While developing this script, I spent considerable time debugging my serialization and deserialization functions, only to figure out that I needed Line 148 to convert the array to C-contiguous ordering (which is something you can read more about here). Honestly, it was a pretty big pain in the ass to figure out, but I hope it helps you get up and running quickly.
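The snippet below is a small illustration of what C-contiguous ordering means. It assumes, for the sake of the example, that the non-contiguous array comes from a reversed channel axis (as an RGB to BGR swap would produce); the post does not say which preprocessing step is actually responsible.

```python
import base64

import numpy as np

# reversing the channel axis returns a view with a negative stride,
# which is no longer C-contiguous
image = np.zeros((1, 4, 4, 3), dtype="float32")
view = image[..., ::-1]
print(view.flags["C_CONTIGUOUS"])

# copying with order="C" lays the data out contiguously again
fixed = view.copy(order="C")
print(fixed.flags["C_CONTIGUOUS"])

# the contiguous copy can be handed straight to base64
encoded = base64.b64encode(fixed).decode("utf-8")
```

Checking `array.flags["C_CONTIGUOUS"]` before serializing is a quick way to diagnose this class of error in your own preprocessing code.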
If you were wondering about the id mentioned back on Line 100, it is actually generated here using uuid, a universally unique identifier, on Line 152. We use a UUID to prevent hash/key conflicts.
Next, we append the id as well as the base64 encoding of the image to the d dictionary. It’s very simple to push this JSON data to the Redis db using rpush (Line 154).
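For reference, here is what such a key and message look like in isolation. The image field below is a placeholder string rather than real base64 data.

```python
import json
import uuid

# generate two random (version 4) UUIDs; a collision is extremely unlikely
k1 = str(uuid.uuid4())
k2 = str(uuid.uuid4())

# build the JSON message that would be pushed onto the queue;
# "BASE64_TEXT_PLACEHOLDER" stands in for the encoded image
d = {"id": k1, "image": "BASE64_TEXT_PLACEHOLDER"}
message = json.dumps(d)

print(len(k1), k1 != k2)
```

The same string is later used as the Redis key under which the predictions are stored, so the client knows exactly which key to poll.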
Let’s poll the server to return the predictions:
            # keep looping until our model server returns the output
            # predictions
            while True:
                # attempt to grab the output predictions
                output = db.get(k)

                # check to see if our model has classified the input
                # image
                if output is not None:
                    # add the output predictions to our data
                    # dictionary so we can return it to the client
                    output = output.decode("utf-8")
                    data["predictions"] = json.loads(output)

                    # delete the result from the database and break
                    # from the polling loop
                    db.delete(k)
                    break

                # sleep for a small amount to give the model a chance
                # to classify the input image
                time.sleep(CLIENT_SLEEP)

            # indicate that the request was a success
            data["success"] = True

    # return the data dictionary as a JSON response
    return flask.jsonify(data)
We’ll loop continuously until the model server returns the output predictions. We start an infinite loop and attempt to get the predictions (Lines 158-160).
From there, if the output contains predictions, we deserialize the results and add them to data which will be returned to the client.
We also delete the result from the db (since we have pulled the results from the database and no longer need to store them there) and break out of the loop (Lines 164-173).
Otherwise, we don’t have any predictions and we need to sleep and continue to poll (Line 177).
If we reach Line 180, we’ve successfully acquired our predictions. In this case we add a success value of True to the client data.
Note: For this example script, I didn’t bother adding timeout logic in the above loop which would ideally add a success value of False to the data. I’ll leave that up to you to handle and implement.
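One possible way to add that timeout is sketched below. `wait_for_result` is a hypothetical helper, not part of the server script, and the default timeout value is arbitrary. It polls a callable until a result appears or a deadline passes.

```python
import time


def wait_for_result(fetch, timeout=10.0, interval=0.25):
    """Poll fetch() until it returns a non-None value or timeout seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        output = fetch()
        if output is not None:
            return output
        if time.monotonic() >= deadline:
            # give up; the caller can leave "success" set to False
            return None
        time.sleep(interval)


# stand-in for db.get(k): the result only appears on the third poll
calls = {"n": 0}


def fake_get():
    calls["n"] += 1
    return b"[]" if calls["n"] >= 3 else None


found = wait_for_result(fake_get, timeout=1.0, interval=0.01)
missing = wait_for_result(lambda: None, timeout=0.05, interval=0.01)
print(found, missing)
```

Inside predict, the call would look something like `wait_for_result(lambda: db.get(k), interval=CLIENT_SLEEP)`, with success set to True only when the return value is not None. Keep in mind that a request that times out may still be classified later, so its result key would linger in Redis unless it is cleaned up separately.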
Lastly we call flask.jsonify on data and return it to the client (Line 183). This completes our predict function.
To demo our Keras REST API, we need a __main__ function to actually start the server:
# if this is the main thread of execution first load the model and
# then start the server
if __name__ == "__main__":
    # load the function used to classify input images in a *separate*
    # thread than the one used for main classification
    print("* Starting model service...")
    t = Thread(target=classify_process, args=())
    t.daemon = True
    t.start()

    # start the web server
    print("* Starting web service...")
    app.run()
Lines 187-197 define the __main__ function which will kick off our classify_process thread (Lines 191-193) and run the Flask app (Line 197).
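The daemon-thread pattern used here can be tried on its own with the sketch below. The `worker` function is a hypothetical stand-in for classify_process, and the short sleep in the main thread stands in for app.run().

```python
import threading
import time

processed = []


def worker():
    # stand-in for classify_process: loop forever doing a little work
    while True:
        processed.append(time.monotonic())
        time.sleep(0.01)


t = threading.Thread(target=worker, args=())
t.daemon = True  # a daemon thread does not keep the interpreter alive
t.start()

# stand-in for app.run(): the main thread carries on while the worker runs
time.sleep(0.1)
print(t.is_alive(), len(processed) > 0)
```

Marking the thread as a daemon means the infinite loop will not block shutdown: when the main thread (the web server) exits, the worker is stopped along with it.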
Starting the scalable Keras REST API
To test our Keras deep learning REST API, be sure to download the source code + example images using the “Downloads” section of this blog post.
From there, let’s start the Redis server if it isn’t already running:
$ redis-server
Then, in a separate terminal, let’s start our REST API Flask server:
$ python run_keras_server.py
Using TensorFlow backend.
* Loading Keras model and Flask starting server...please wait until server has fully started
...
* Running on http://127.0.0.1:5000
Additionally, I would suggest waiting until your model is loaded completely into memory before submitting requests to the server.
Now we can move on to testing the server with both cURL and Python.
Using cURL to access our Keras REST API
The cURL tool is available pre-installed on most (Unix-based) operating systems. We can POST an image file to our deep learning REST API at the /predict endpoint by using the following command:
$ curl -X POST -F image=@jemma.png 'http://localhost:5000/predict'
You’ll receive the predictions back in JSON format right in your terminal:
{
  "predictions": [
    {
      "label": "beagle",
      "probability": 0.9461546540260315
    },
    {
      "label": "bluetick",
      "probability": 0.031958919018507004
    },
    {
      "label": "redbone",
      "probability": 0.006617196369916201
    },
    {
      "label": "Walker_hound",
      "probability": 0.0033879687543958426
    },
    {
      "label": "Greater_Swiss_Mountain_dog",
      "probability": 0.0025766862090677023
    }
  ],
  "success": true
}
Let’s try passing another image, this time a space shuttle:
$ curl -X POST -F image=@space_shuttle.png 'http://localhost:5000/predict'
{
  "predictions": [
    {
      "label": "space_shuttle",
      "probability": 0.9918227791786194
    },
    {
      "label": "missile",
      "probability": 0.006030891090631485
    },
    {
      "label": "projectile",
      "probability": 0.0021368064917623997
    },
    {
      "label": "warplane",
      "probability": 1.980597062356537e-06
    },
    {
      "label": "submarine",
      "probability": 1.8291866581421345e-06
    }
  ],
  "success": true
}
The results of which can be seen below:
Once again our Keras REST API has correctly classified the input image.
Using Python to submit requests to the Keras REST API
As you can see, verification using cURL was quite easy. Now let’s build a Python script that will POST an image and parse the returning JSON programmatically.
Let’s review simple_request.py:
# import the necessary packages
import requests

# initialize the Keras REST API endpoint URL along with the input
# image path
KERAS_REST_API_URL = "http://localhost:5000/predict"
IMAGE_PATH = "jemma.png"
We use Python requests in this script to handle POSTing data to the server.
Our server is running on the localhost and can be accessed on port 5000 with the endpoint /predict as is specified by the KERAS_REST_API_URL variable (Line 6). If the server is running remotely or on a different machine, be sure to specify the appropriate domain/ip, port, and endpoint.
We also define an IMAGE_PATH (Line 7). In this case, jemma.png is in the same directory as our script. If you want to test with other images, be sure to specify the full path to your input image.
Let’s load the image and send it off to the server:
# load the input image and construct the payload for the request
image = open(IMAGE_PATH, "rb").read()
payload = {"image": image}

# submit the request
r = requests.post(KERAS_REST_API_URL, files=payload).json()

# ensure the request was successful
if r["success"]:
    # loop over the predictions and display them
    for (i, result) in enumerate(r["predictions"]):
        print("{}. {}: {:.4f}".format(i + 1, result["label"],
            result["probability"]))

# otherwise, the request failed
else:
    print("Request failed")
We read the image on Line 10 in binary mode and put it into a payload dictionary.
The payload is POST’ed to the server with requests.post on Line 14.
If we get a success message, we can loop over the predictions and print them to the terminal. I made this script simple, but you could also draw the highest prediction text on the image using OpenCV if you want to get fancy.
Running the simple request script
Putting the script to work is easy. Open up a terminal and execute the following command (provided both our Flask server and Redis server are running, of course).
$ python simple_request.py
1. beagle: 0.9462
2. bluetick: 0.0320
3. redbone: 0.0066
4. Walker_hound: 0.0034
5. Greater_Swiss_Mountain_dog: 0.0026
For the space_shuttle.png, simply modify the IMAGE_PATH variable:
IMAGE_PATH = "space_shuttle.png"
And from there, run the script again:
$ python simple_request.py
1. space_shuttle: 0.9918
2. missile: 0.0060
3. projectile: 0.0021
4. warplane: 0.0000
5. submarine: 0.0000
Considerations when scaling your deep learning REST API
If you anticipate heavy load for extended periods of time on your deep learning REST API you may want to consider a load balancing algorithm such as round-robin scheduling to help evenly distribute requests across multiple GPU machines and Redis servers.
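As a rough sketch of round-robin assignment, the snippet below rotates incoming requests across several queues. The queue names follow an image_queue_N pattern with one queue per GPU worker; that layout is an assumption made for this example.

```python
from itertools import cycle

# hypothetical queue names, one per GPU worker
QUEUES = ["image_queue_0", "image_queue_1", "image_queue_2"]
next_queue = cycle(QUEUES)

# assign seven incoming requests to queues in rotation
assignments = [next(next_queue) for _ in range(7)]
print(assignments)
```

In a real deployment this rotation is usually handled by the load balancer rather than application code, since several web server processes would otherwise each keep their own independent counter.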
Keep in mind that Redis is an in-memory data store, so we can only store as many images in the queue as we have available memory for.
A single 224 x 224 x 3 image with a float32 data type will consume 602,112 bytes of memory.
Assuming a server with a modest 16GB of RAM, this implies that we can hold approximately 26,500 images in our queue, but at that point we likely would want to add more GPU servers to burn through the queue faster.
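The estimate can be reproduced with a few lines of arithmetic. The sketch assumes 16GB means 16 x 10^9 bytes and ignores the JSON wrapper and Redis bookkeeping overhead. It also shows the effect of base64 encoding, which expands every 3 bytes into 4 characters.

```python
import math

# bytes for one 224 x 224 x 3 float32 image (4 bytes per value)
raw_bytes = 224 * 224 * 3 * 4

# base64 turns every 3 bytes into 4 characters
b64_bytes = 4 * math.ceil(raw_bytes / 3)

ram = 16 * 10**9  # assuming "16GB" means 16 x 10^9 bytes

print(raw_bytes, ram // raw_bytes)
print(b64_bytes, ram // b64_bytes)
```

This gives 602,112 bytes and roughly 26,500 images for the raw arrays. Since the queue stores the base64 text (802,816 characters per image), the practical ceiling is closer to 20,000 images, before accounting for anything else running on the machine.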
However, there is a subtle problem…
Depending on how you deploy your deep learning REST API, there is a subtle problem with keeping the classify_process function in the same file as the rest of our web API code.
Most web servers, including Apache and nginx, allow for multiple client threads.
If you keep classify_process in the same file as your predict view, then you may load multiple models if your server software deems it necessary to create a new thread to serve the incoming client requests. For every new thread, a new view will be created, and therefore a new model will be loaded.
The solution is to move classify_process to an entirely separate process and then start it along with your Flask web server and Redis server.
In next week’s blog post I’ll build on today’s solution, show how to resolve this problem, and demonstrate:
- How to configure the Apache web server to serve our deep learning REST API
- How to run classify_process as an entirely separate Python script, avoiding “multiple model syndrome”
- Stress test results, confirming and verifying that our deep learning REST API can scale under heavy load
Summary
In today’s blog post we learned how to build a scalable Keras + deep learning REST API.
To accomplish this, we:
- Built a simple Flask app to load our Keras model into memory and accept incoming requests.
- Utilized Redis to act as an in-memory message queue/message broker.
- Utilized threading to batch process input images, write them back to the message queue and then return the results to the client.
This method can scale to multiple machines, including multiple web servers and Redis instances.
I hope you enjoyed today’s blog post!
Siva
Hi Adrian,
Thank you for the wonderful post! I was wondering if the architecture could have been simplified by replacing the Flask / Redis stack with a single Twisted server. What are your thoughts?
Adrian Rosebrock
Hey Siva — I’ve only used Twisted once so my knowledge on the library is pretty limited so I’m probably not the best person to address that question.
In any case, are you referring specifically to the polling of images when they are in the queue? If so, yes, the event-driven nature of Twisted would help with that. However, there is a problem when you consider the image queue:
1. CNNs are most efficient when processing images in batches. If you use Twisted for single events (such as a new image entering the queue) it won’t help as much since we would rather wait a tiny bit more of time to more efficiently (and quickly) classify a larger batch.
2. Redis is an incredibly fast in memory data store which allows us to batch queue our images. This batching goes back to point 1. It actually helps for speed.
Siva
Hi Adrian – yes, I was referring to the polling of the images. And, now the Flask / Redis architecture makes sense if it’s better to batch the image requests. Thanks!
TaeWoo Kim
Hey Adrian. Is there any benefit to batch prediction if I am using CPU only? I ran test on image classfication on video where im trying to classify each frame, on three scenarios
– one 1 image at a time
– batch of ALL images at once
– few batches of 32 frames/images
– many batches of 4 frames/images
Using CPU only, there was no real benefit of using batched predictions.. (batched were all 90+ seconds on a test video, where as 1 image/frame at a time took 85 seconds)
Adrian Rosebrock
You’ll see more benefit of batched prediction on your GPU rather than CPU.
Flo
Hey Adrian,
like always a wonderful post!
I have not worked with redis before, but from what I could glance from the documentation it looks to me that your keras – classify_process() will not scale well.
It first retrieves a new batch of images, processes them and then removes them from the queue. Assuming all workers access the same redis instance (the same image queue) that would mean a second worker could load the same batch of images while the first one is processing them. Not only would those images get analyzed twice, the slower worker would remove a batch of pictures without them having been seen by any model.
The StrictRedis docstrings mention two functions that could help:
– lock() – which supposedly “mimics the behavior of threading.Lock”. The solution would be to lock before reading and to release after deleting from the queue
– lpop() – “Remove[s] and return[s] the first item of the list”, so you would need a loop (and multiple round trips to redis) to get a batch
Let me know what you think
Adrian Rosebrock
Hey Flo — I discuss this in next week’s blog post as well, but the point of this method is to have one image queue per GPU. If you have multiple GPUs you’ll want to create a separate queue, for example image_queue_0, image_queue_1, image_queue_N for each of your N GPUs. This will prevent any issues with multiple GPUs processing the same batch.
Additionally, Redis is single threaded so if you use a different image queue name for each GPU you will not run into any batches being processed multiple times.
Again, make sure you read next week’s blog post so this point becomes more clear.
Fred
Hi, Adrian
If there are multiple queue, and each redis queue for each worker GPU, how to choose the queue where client put task in ?
Thanks!
Adrian Rosebrock
Round robin queuing would be the easiest and most efficient.
Akash
Hi Adrian! Thanks for this deeply informative post . Could we do the same for text recognition from images?
Adrian Rosebrock
Provided you have trained your model to perform text recognition you can swap in your model (instead of ResNet) and use it as an API in the exact same manner we have done in this blog post.
Casey
Wow this is very informative. I have your ImageNet Bundle and have yet to start it but if the quality is even close to this (which knowing your previous content it will be) you should have charged more. Excellent post!
Adrian Rosebrock
Thanks Casey 🙂 The ImageNet Bundle of Deep Learning for Computer Vision with Python is even more in-depth and high quality than this blog post. Enjoy it and please do reach out if you have any questions on it.
Charlie
Hi Adrian,
Why are you using threading instead of multiprocessing?
Thanks
Adrian Rosebrock
There isn’t a need for multiprocessing here. Threading is typically used for I/O bound tasks while multiprocessing is used for CPU heavy tasks.
Damon Wang
Hi Adrian,
Could you teach me how to serialize and deserialize the videos(e.g. MP4 videos)? Cause I want to use your blog to classify videos with other DL models.
Thanks
Adrian Rosebrock
There are a few ways to approach this. Is your goal to feed the video, one frame at a time, through the DL model?
Damon Wang
Thank you so much for your reply.
My goal is to feed the whole video to the DL model.
The steps of my project (based on Flask) include:
1. Feed the videos into Redis.
2. Get the videos from Redis, extract the video frames, feed the frames into the DL models, and get the prediction for the video.
But I wonder whether I should serialize and deserialize the videos before feeding them into Redis.
Thanks
Adrian Rosebrock
Video files are significantly larger than images. I wouldn’t recommend putting the video itself into Redis as Redis is an in-memory data store. You could technically feed each frame, one-by-one, from the client to the server but that’s likely wasteful.
Instead, you should consider modifying this code so:
1. The video file is uploaded to the server and saved to disk
2. The video file is processed by the server (again, from disk, not via Redis)
3. The resulting video file or results are returned to the client
Again, I really do not recommend trying to store video files in Redis as you could quickly run out of memory.
TaeWoo Kim
In the case of video (i.e. classifying each frame), would redis even be needed at all?
TaeWoo Kim
In other words, for running image classification on videos, your original post on the Keras blog (https://blog.keras.io/building-a-simple-keras-deep-learning-rest-api.html) would suffice, no?
Adrian Rosebrock
I wouldn’t use this method if you want to process entire video files. Video files are significantly larger and your system would quickly run out of RAM. I would use a hybrid approach where a video is uploaded, saved to disk, and a new job is kicked off that runs in the background to process all frames of the video.
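The hybrid approach (save the upload to disk, then process it in a background job) could look roughly like the sketch below. It uses only the Python standard library; `submit_video`, `process_video`, and the in-memory `results` dictionary are hypothetical names for illustration, and `process_video` is a placeholder where frame extraction and inference would go.

```python
import os
import queue
import tempfile
import threading

jobs = queue.Queue()  # paths of saved videos waiting to be processed
results = {}          # maps a video path to its (placeholder) result


def process_video(path):
    # Placeholder: a real worker would read frames from `path` and run
    # them through the model. Here we only report the file size.
    return {"num_bytes": os.path.getsize(path)}


def worker():
    # Runs forever in the background, handling one video at a time.
    while True:
        path = jobs.get()
        try:
            results[path] = process_video(path)
        finally:
            jobs.task_done()


def submit_video(data, upload_dir):
    # Save the uploaded bytes to disk, then queue only the file path,
    # so the video itself never sits in memory waiting to be processed.
    fd, path = tempfile.mkstemp(suffix=".mp4", dir=upload_dir)
    with os.fdopen(fd, "wb") as f:
        f.write(data)
    jobs.put(path)
    return path


threading.Thread(target=worker, daemon=True).start()
```

A Flask view would call `submit_video` with the uploaded bytes and return the path (or a job ID) right away; the client can ask for the result later.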
Prakruti
Hi Adrian,
Is it possible to deploy a Flask API as a service without being bound to WSGI and Apache?
Can’t one just execute the Python file with the API and use it from localhost:5000? A user without sudo rights would need something like this, right? Such a user does not have access to the Apache config or the rights to start the Apache server.
Also, what if I just want to call this API from a Java wrapper?
Adrian Rosebrock
1. Be careful if you use the Flask server for this. It’s not threaded, as I discuss in both this post and this one. Even though your model will be loaded properly, the Flask testing server won’t use more than one thread, so it defeats the purpose.
2. If you would like to call the API from Java you should look into the HTTP request libraries available for Java (I’m not familiar with them) but it’s 100% possible, just do your research and you’ll be fine 🙂
Regis Amichia
Hi Adrian,
First of all thanks a lot for this post.
I have an issue following your methods. I trained my model offline, saved it in an .h5 file, and I would like to know how to upload it to Redis and then load it in my code.
Thanks a lot for your answer
Adrian Rosebrock
I think you’re confusing what Redis does. Redis does not hold your model, the server does. Redis only holds the images in the queue. You can modify the classify_process function to load your own model using Keras’ load_model function. Be sure to refer to the Keras docs if you have never used this function before. I would also recommend reading through Deep Learning for Computer Vision with Python so you can study deep learning in more detail as well.
Slim Frikha
Hi Adrian,
First, thanks for this great article with thorough explanations!
I noticed in this example that you actually programmed the scheduler logic with the two while True loops, and what it basically does is the following:
– every X seconds, the server wakes up to check whether there are images to predict in the Redis queue
– every Y seconds, the server wakes up to see whether results are ready in the Redis queue
I was wondering if you maybe tried or tested the same stack but with Celery as a task scheduler to avoid doing so.
Thanks!
Adrian Rosebrock
Celery would be a feasible solution as well. I wanted to keep this solution as a template for others to build off though. You can add any other bells and whistles you see fit.
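For readers following this thread, the second polling loop Slim describes (waiting for a result to appear) can be sketched as below. This is an illustration under assumptions, not the post's exact code: `wait_for_result` and its parameters are hypothetical, and `db` is assumed to be a redis-py client (only its `get` and `delete` methods are used) whose stored values are JSON.

```python
import json
import time


def wait_for_result(db, key, poll_interval=0.25, timeout=10.0):
    """Poll `db` until a value appears under `key`, or give up after `timeout`."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        output = db.get(key)  # None until the worker stores a result
        if output is not None:
            db.delete(key)    # the result is consumed exactly once
            return json.loads(output)
        time.sleep(poll_interval)  # sleep so the loop does not spin the CPU
    return None  # timed out without a result
```

Unlike an unbounded while True loop, the timeout here keeps a request from hanging forever if the worker process has died.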
Sivar
Hi Adrian,
Thank you for the excellent post! I am trying to build a REST API to identify the user details as soon as the user image is sent to the REST server. There are about 70k users and I have only one image for each user. I would highly appreciate your input on the queries below.
1. Can I do image classification to identify the user as soon as the user image is sent to the REST server?
2. How do I split the train and test data sets, since I have only one image for each of the 70k users?
3. Do you have sample code to build a custom image model for my problem?
4. Each user’s image is already available at a remote URL. Do I need to download those images to build the custom user image model?
Thanks.
Sivar
Adrian Rosebrock
I think it would be best to start with a bit more detail on the project itself. Is your goal to perform face recognition for each user?
Sonny
Hi Adrian, thanks for this great post!
Could you please point out the link to the article to solve the subtle problem?
Adrian Rosebrock
Sorry, not sure what you mean by “point out the link to the article”. What link are you referring to?
Nick
Great article!
Can you please clarify the advantage of pushing the images to Redis?
Is it merely that batch predicting is more memory efficient?
Does it increase the speed of processing multiple POST requests?
In theory, is there something about this method that makes the handling of requests quicker, than if images WEREN’T pushed to a Redis server?
Thank you!
Nick
Adrian Rosebrock
There are two advantages here. The first is that batch processing is more efficient. The second is that you may wish to have dedicated queueing servers and dedicated inference servers.
Thinh
Hi Adrian, can you provide me the link to the post for the subtle solution that is discussed in this post with Redis server ?
Adrian Rosebrock
You mean this post?
Dubey
Hello Adrian,
Thanks for the great post.
As far as I know, Apache creates a new thread for each incoming request, which means each thread would execute its own predict method. What do you mean by “each time a new view is created”? What would a new thread mean: a new call to the classify_process() method or to the predict() method?
If it’s classify_process(), the model would be initialised each time, not just a few times. If it’s the predict() method, that is exactly what we want.
Looking forward to a positive response.
chen qu
Hi Adrian,
Another issue I observed is how you retrieve the list from Redis inside the classify_process() function.
From the code, the process is as follows:
1. retrieve the batch from the db queue
2. process(inference) the batch
3. insert result back in db by keys
4. remove the batch from the db queue
But consider the scenario where you have 2 independent processes, each executing its own classify_process(). Since the inference in step 2 takes some time, the 2nd process will retrieve the same batch to work on, which is not intended.
I suppose that the correct pipeline could be as follows:
1. retrieve the batch from the db queue
2. remove the batch from the db queue
3. process(inference) the batch
4. insert result back in db by keys
In this way, the single-threaded architecture of Redis will guarantee the correct manipulation of the db queue.
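A minimal sketch of the reordered pipeline proposed above, assuming `db` is a redis-py client. The `pop_batch` helper is hypothetical. One caveat: issuing "retrieve" and "remove" as two separate commands still leaves a window in which another process can read the same items, so the sketch wraps both in a single transaction (MULTI/EXEC), which Redis executes without interleaving commands from other clients.

```python
def pop_batch(db, queue_name, batch_size):
    """Atomically read and remove up to `batch_size` items from a Redis list.

    `db` is assumed to be a redis-py client (e.g. redis.Redis(...)).
    """
    # transaction=True queues the commands and sends them as MULTI/EXEC.
    pipe = db.pipeline(transaction=True)
    pipe.lrange(queue_name, 0, batch_size - 1)  # step 1: retrieve the batch
    pipe.ltrim(queue_name, batch_size, -1)      # step 2: remove it from the queue
    batch, _ = pipe.execute()                   # both steps run as one atomic unit
    return batch
```

Each worker would then call `pop_batch` at the top of its loop, run inference on the returned items, and write the results back by key, so no two workers receive the same batch.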
Kartikeya Bhardwaj
Hi Adrian
I have successfully deployed my CNN as a flask api like the above. My project requires me to send a Folder of Images rather than a single image for predictions.
Is there a way to modify “simple_request.py” file so that it’s able to send a Folder of many images as a Request ?
Thanks a lot!
JPS
Hello Adrian!
First of all, thank you so much for this post, it was very well explained and easy to understand. My question is about the image to predict. Is it possible for me to take an image from another directory? Because the script worked for the images that are in the same folder, but if I try to put a file from another directory (and I put the full path) it doesn’t work. Can you help me on this?
Adrian Rosebrock
Thanks, I’m glad you enjoyed the tutorial. As for your question, yes, it’s absolutely possible to supply paths to images in a different directory. Double-check and triple-check your image paths as they are likely incorrect.
Masoud
Hi Adrian
Imagine a company has built an API for video summarization. For each request to the API, after getting the arguments, the code loads the ML model (a TensorFlow model, for example), then processes the video and gives back the download link for the video.
Unfortunately, the model is around 500 MB and it sounds really exhausting for the server to load the model on each call.
I was thinking we could load the model on the very first call, handle all the other tasks through a queue, and free up the memory if we do not get any other request for some amount of time.
But I’m a data scientist, not a web developer, and have no clue how to do it.
So Adrian can you help me?
Your website is so educative by the way.
keep shining Adrian.
Adrian Rosebrock
Loading the model on every call would be wasteful. See this tutorial for more information.
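To make the "load once, reuse, release when idle" idea from the question concrete, here is a small sketch. The `LazyModel` class and its `loader` argument are hypothetical; in practice `loader` would be a function that loads the model, for example by calling Keras’ load_model on your own model file.

```python
import threading
import time


class LazyModel:
    """Load a model on first use, reuse it afterwards, drop it when idle."""

    def __init__(self, loader, idle_seconds=300.0):
        self._loader = loader            # hypothetical: callable that returns the model
        self._idle_seconds = idle_seconds
        self._model = None
        self._last_used = 0.0
        self._lock = threading.Lock()    # guards against two threads loading at once

    def get(self):
        with self._lock:
            if self._model is None:      # only the first call pays the loading cost
                self._model = self._loader()
            self._last_used = time.monotonic()
            return self._model

    def unload_if_idle(self):
        """Drop the model if it has gone unused for `idle_seconds`."""
        with self._lock:
            idle = time.monotonic() - self._last_used
            if self._model is not None and idle >= self._idle_seconds:
                self._model = None       # the next get() will reload it
                return True
            return False
```

Request handlers call `get()` on every request, and a timer or background thread calls `unload_if_idle()` periodically. Dropping the reference only frees memory once nothing else holds the model, and whether the deep learning framework actually returns GPU memory is a separate question.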