In this tutorial, you will learn how to use the dlib library to efficiently track multiple objects in real-time video.
So far in this series on object tracking we have learned how to:
- Track single objects with OpenCV
- Track multiple objects utilizing OpenCV
- Perform single object tracking with dlib
- Track and count people entering a business/store
We can of course track multiple objects with dlib; however, to obtain the best performance possible, we need to utilize multiprocessing and distribute the object trackers across multiple cores of our processor.
Correctly utilizing multiprocessing allows us to improve our dlib multi-object tracking frames per second (FPS) throughput rate by over 45%!
To learn how to track multiple objects using dlib, just keep reading!
Multi-object tracking with dlib
In the first part of this guide, I’ll demonstrate how you can implement a simple, naïve dlib multi-object tracking script. This program will track multiple objects in video; however, we’ll notice that the script runs a bit slow.
To increase our FPS throughput rate I’ll show you a faster, more efficient dlib multi-object tracker implementation.
Finally, I’ll discuss some improvements and suggestions you can make to enhance our multi-object tracking implementations as well.
Project structure
To get started, make sure you use the “Downloads” section of this tutorial to download the source code and example video.
From there, you can use the tree command to view our project structure:
$ tree .
├── mobilenet_ssd
│   ├── MobileNetSSD_deploy.caffemodel
│   └── MobileNetSSD_deploy.prototxt
├── multi_object_tracking_slow.py
├── multi_object_tracking_fast.py
├── race.mp4
├── race_output_slow.avi
└── race_output_fast.avi
1 directory, 7 files
The mobilenet_ssd/ directory contains our MobileNet + SSD Caffe model files which allow us to detect people (along with other objects).
We’ll review two Python scripts today:
- multi_object_tracking_slow.py: The simple “naïve” method of dlib multiple object tracking.
- multi_object_tracking_fast.py: The advanced, fast method which takes advantage of multiprocessing.
The remaining three files are videos. We have the original race.mp4 video and two processed output videos.
The “naïve” dlib multiple object tracking implementation
The first dlib multi-object tracking implementation we are going to cover today is “naïve” in the sense that it will:
- Utilize a simple list of tracker objects.
- Update each of the trackers sequentially, using only a single core of our processor.
For some object tracking tasks this implementation will be more than sufficient; however, to optimize our FPS throughput rate, we should distribute the object trackers across multiple processes.
We’ll start with our simple implementation in this section and then move on to the faster method in the next section.
To get started, open up the multi_object_tracking_slow.py script and insert the following code:
# import the necessary packages
from imutils.video import FPS
import numpy as np
import argparse
import imutils
import dlib
import cv2
We begin by importing necessary packages and modules on Lines 2-7. Most importantly we’ll be using dlib and OpenCV. We’ll also use some features from my imutils package of convenience functions such as the frames per second counter.
To install dlib, follow this guide. I have a number of OpenCV installation tutorials available as well (even for the latest OpenCV 4!). You might even try the fastest way to install OpenCV on your system via pip.
To install imutils, simply use pip in your terminal:
$ pip install --upgrade imutils
Now that we (a) have the software installed, and (b) have placed the relevant import statements in our script, let’s parse our command line arguments:
# construct the argument parser and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-p", "--prototxt", required=True,
	help="path to Caffe 'deploy' prototxt file")
ap.add_argument("-m", "--model", required=True,
	help="path to Caffe pre-trained model")
ap.add_argument("-v", "--video", required=True,
	help="path to input video file")
ap.add_argument("-o", "--output", type=str,
	help="path to optional output video file")
ap.add_argument("-c", "--confidence", type=float, default=0.2,
	help="minimum probability to filter weak detections")
args = vars(ap.parse_args())
If you aren’t familiar with the terminal and command line arguments, please give this post a read.
Our script processes the following command line arguments at runtime:
- --prototxt: The path to the Caffe “deploy” prototxt file.
- --model: The path to the model file which accompanies the prototxt.
- --video: The path to the input video file. We’ll perform multi-object tracking with dlib on this video.
- --output: An optional path to an output video file. If no path is specified then no video will be output to disk. I recommend outputting to an .avi or .mp4 file.
- --confidence: An optional override for the object detection confidence threshold of 0.2. This value represents the minimum probability to filter weak detections from the object detector.
Let’s define our list of CLASSES that this model supports as well as load our model from disk:
# initialize the list of class labels MobileNet SSD was trained to
# detect
CLASSES = ["background", "aeroplane", "bicycle", "bird", "boat",
	"bottle", "bus", "car", "cat", "chair", "cow", "diningtable",
	"dog", "horse", "motorbike", "person", "pottedplant", "sheep",
	"sofa", "train", "tvmonitor"]

# load our serialized model from disk
print("[INFO] loading model...")
net = cv2.dnn.readNetFromCaffe(args["prototxt"], args["model"])
The MobileNet SSD pre-trained Caffe model supports 20 classes and 1 background class. The CLASSES are defined on Lines 25-28 in list form.
Note: Do not modify this list or the ordering of class objects if you’re using the Caffe model provided in the “Downloads”. Similarly, if you happen to be loading a different model, you’ll need to define the classes that the model supports here (order does matter). If you’re curious how our object detector works, be sure to refer to this post.
We’re only concerned about the "person" class for today’s foot race example, but you could easily modify Line 95 (covered later in this post) below to track alternate class(es).
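If you do want to track additional classes, one minimal way to do it is to test the detected label against a set of allowed classes instead of the single "person" string. The helper below is a sketch under my own naming (should_track and TRACK_CLASSES are not part of the downloadable script):
# a hypothetical helper that generalizes the single-class check to a set
# of classes you care about
TRACK_CLASSES = {"person", "dog", "car"}

def should_track(label):
	# return True only for labels we want to hand off to a dlib tracker
	return label in TRACK_CLASSES

# inside the detection loop (covered below) you would then write:
#     if not should_track(CLASSES[idx]):
#         continue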
On Line 32, we load our pre-trained object detector model. We will use our pre-trained SSD to detect the presence of objects in a video. From there we will create a dlib object tracker to track each of the detected objects.
We have a few more initializations to perform:
# initialize the video stream and output video writer
print("[INFO] starting video stream...")
vs = cv2.VideoCapture(args["video"])
writer = None

# initialize the list of object trackers and corresponding class
# labels
trackers = []
labels = []

# start the frames per second throughput estimator
fps = FPS().start()
On Line 36, we initialize our video stream — we’ll be reading frames from our input video one at a time.
Subsequently, on Line 37 our video writer is initialized to None. We’ll work more with the video writer in the upcoming while loop.
Now let’s initialize our trackers and labels lists on Lines 41 and 42.
And finally, we start our frames per second counter on Line 45.
We’re all set to begin processing our video:
# loop over frames from the video file stream
while True:
	# grab the next frame from the video file
	(grabbed, frame) = vs.read()

	# check to see if we have reached the end of the video file
	if frame is None:
		break

	# resize the frame for faster processing and then convert the
	# frame from BGR to RGB ordering (dlib needs RGB ordering)
	frame = imutils.resize(frame, width=600)
	rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)

	# if we are supposed to be writing a video to disk, initialize
	# the writer
	if args["output"] is not None and writer is None:
		fourcc = cv2.VideoWriter_fourcc(*"MJPG")
		writer = cv2.VideoWriter(args["output"], fourcc, 30,
			(frame.shape[1], frame.shape[0]), True)
On Line 48 we begin looping over frames, where Line 50 actually grabs the frame itself.
A quick check to see if we’ve reached the end of the video file and need to stop looping is made on Lines 53 and 54.
Preprocessing takes place on Lines 58 and 59. First, the frame is resized to 600 pixels wide, maintaining aspect ratio. Then, the frame is converted to the rgb color channel ordering for dlib compatibility (OpenCV’s default is BGR and dlib’s default is RGB).
From there we instantiate the video writer (if necessary) on Lines 63-66. To learn more about writing video to disk with OpenCV, check out my previous blog post.
Let’s begin the object detection phase:
	# if there are no object trackers we first need to detect objects
	# and then create a tracker for each object
	if len(trackers) == 0:
		# grab the frame dimensions and convert the frame to a blob
		(h, w) = frame.shape[:2]
		blob = cv2.dnn.blobFromImage(frame, 0.007843, (w, h), 127.5)

		# pass the blob through the network and obtain the detections
		# and predictions
		net.setInput(blob)
		detections = net.forward()
In order to perform object tracking we must first perform object detection, either:
- Manually, by stopping the video stream and hand-selecting the bounding box(es) of each object.
- Programmatically, using an object detector trained to detect the presence of an object (which is what we are doing here).
If there are no object trackers (Line 70), then we know we have yet to perform object detection.
We create and pass a blob through the SSD network to detect objects on Lines 72-78. To learn about the cv2.blobFromImage function, be sure to refer to my writeup in this article.
Next, we proceed to loop over the detections to find objects belonging to the "person" class since our input video is a human foot race:
		# loop over the detections
		for i in np.arange(0, detections.shape[2]):
			# extract the confidence (i.e., probability) associated
			# with the prediction
			confidence = detections[0, 0, i, 2]

			# filter out weak detections by requiring a minimum
			# confidence
			if confidence > args["confidence"]:
				# extract the index of the class label from the
				# detections list
				idx = int(detections[0, 0, i, 1])
				label = CLASSES[idx]

				# if the class label is not a person, ignore it
				if CLASSES[idx] != "person":
					continue
We begin looping over detections on Line 81 where we:
- Filter out weak detections (Line 88).
- Ensure each detection is a "person" (Lines 91-96). You can, of course, remove this line of code or customize it to your own filtering needs.
Now that we’ve located each "person" in the frame, let’s instantiate our trackers and draw our initial bounding box(es) + class label(s):
				# compute the (x, y)-coordinates of the bounding box
				# for the object
				box = detections[0, 0, i, 3:7] * np.array([w, h, w, h])
				(startX, startY, endX, endY) = box.astype("int")

				# construct a dlib rectangle object from the bounding
				# box coordinates and start the correlation tracker
				t = dlib.correlation_tracker()
				rect = dlib.rectangle(startX, startY, endX, endY)
				t.start_track(rgb, rect)

				# update our set of trackers and corresponding class
				# labels
				labels.append(label)
				trackers.append(t)

				# grab the corresponding class label for the detection
				# and draw the bounding box
				cv2.rectangle(frame, (startX, startY), (endX, endY),
					(0, 255, 0), 2)
				cv2.putText(frame, label, (startX, startY - 15),
					cv2.FONT_HERSHEY_SIMPLEX, 0.45, (0, 255, 0), 2)
To begin tracking objects we:
- Compute the bounding box of each detected object (Lines 100 and 101).
- Instantiate and pass the bounding box coordinates to the tracker (Lines 105-107). The bounding box is especially important here. We need to create a dlib.rectangle for the bounding box and pass it to the start_track method. From there, dlib can start to track the object.
- Finally, we populate the trackers list with the individual tracker (Line 112).
As a result, in the next code block, we’ll handle the case where trackers have already been established and we just need to update positions.
There are two additional tasks we perform in the initial detection step:
- Append the class label to the labels list (Line 111). In the event that you’re tracking multiple types of objects (such as "dog" + "person"), you may wish to know what the type of each object is.
- Draw each bounding box rectangle around and class label above the object (Lines 116-119).
If the length of our trackers list is greater than zero, we know we are in the object tracking phase:
	# otherwise, we've already performed detection so let's track
	# multiple objects
	else:
		# loop over each of the trackers
		for (t, l) in zip(trackers, labels):
			# update the tracker and grab the position of the tracked
			# object
			t.update(rgb)
			pos = t.get_position()

			# unpack the position object
			startX = int(pos.left())
			startY = int(pos.top())
			endX = int(pos.right())
			endY = int(pos.bottom())

			# draw the bounding box from the correlation object tracker
			cv2.rectangle(frame, (startX, startY), (endX, endY),
				(0, 255, 0), 2)
			cv2.putText(frame, l, (startX, startY - 15),
				cv2.FONT_HERSHEY_SIMPLEX, 0.45, (0, 255, 0), 2)
In the object tracking phase, we loop over all trackers and corresponding labels on Line 125.
Then we proceed to update each object position (Lines 128-129). In order to update the position, we simply pass the rgb image.
After extracting bounding box coordinates, we can draw a bounding box rectangle and label for each tracked object (Lines 138-141).
The remaining steps in the frame processing loop involve writing to the output video (if necessary) and displaying the results:
	# check to see if we should write the frame to disk
	if writer is not None:
		writer.write(frame)

	# show the output frame
	cv2.imshow("Frame", frame)
	key = cv2.waitKey(1) & 0xFF

	# if the `q` key was pressed, break from the loop
	if key == ord("q"):
		break

	# update the FPS counter
	fps.update()
Here we:
- Write the frame to video if necessary (Lines 144 and 145).
- Show the output frame and capture keypresses (Lines 148 and 149). If the "q" key is pressed (“quit”), we break out of the loop.
- Finally, we update our frames per second information for benchmarking purposes (Line 156).
The remaining steps are to print FPS throughput information in the terminal and release pointers:
# stop the timer and display FPS information
fps.stop()
print("[INFO] elapsed time: {:.2f}".format(fps.elapsed()))
print("[INFO] approx. FPS: {:.2f}".format(fps.fps()))

# check to see if we need to release the video writer pointer
if writer is not None:
	writer.release()

# do a bit of cleanup
cv2.destroyAllWindows()
vs.release()
To close out, our fps stats are collected and printed (Lines 159-161), the video writer is released (Lines 164 and 165), and we close all windows + release the video stream.
Let’s assess accuracy and performance.
To follow along and run this script, make sure you use the “Downloads” section of this blog post to download the source code + example video.
From there, open up a terminal and execute the following command:
$ python multi_object_tracking_slow.py --prototxt mobilenet_ssd/MobileNetSSD_deploy.prototxt \
	--model mobilenet_ssd/MobileNetSSD_deploy.caffemodel \
	--video race.mp4 --output race_output_slow.avi
[INFO] loading model...
[INFO] starting video stream...
[INFO] elapsed time: 24.51
[INFO] approx. FPS: 13.87
It appears that our multi-object tracker is working!
But as you can see, we are only obtaining ~13 FPS.
For some applications, this FPS throughput rate may be sufficient — however, if you need faster FPS, I would suggest taking a look at our more efficient dlib multi-object tracker below.
Secondly, understand that tracking accuracy isn’t perfect. Refer to the third suggestion in the “Improvements and Suggestions” section below as well as read my first post on dlib object tracking for more information.
The fast, efficient dlib multi-object tracking implementation
If you run the dlib multi-object tracking script from the previous section and open up your system’s activity monitor at the same time, you’ll notice that only one core of your processor is being utilized.
In order to speed up our object tracking pipeline we can leverage Python’s multiprocessing module, which is similar to the threading module but is used to spawn processes rather than threads.
Utilizing processes enables our operating system to perform better process scheduling, mapping the process to a particular processor core on our machine (most modern operating systems are able to efficiently schedule processes that are using a lot of CPU in a parallel manner).
If you are new to Python’s multiprocessing module I would suggest you read this excellent introduction from Sebastian Raschka.
Otherwise, go ahead and open up multi_object_tracking_fast.py and insert the following code:
# import the necessary packages
from imutils.video import FPS
import multiprocessing
import numpy as np
import argparse
import imutils
import dlib
import cv2
Our packages are imported on Lines 2-8. We’re importing the multiprocessing library on Line 3.
We’ll be using the Python Process class to spawn a new process — each new process is independent from the original process.
To spawn this process we need to provide a function that Python can call, which Python will then take and create a brand new process + execute it:
def start_tracker(box, label, rgb, inputQueue, outputQueue):
	# construct a dlib rectangle object from the bounding box
	# coordinates and then start the correlation tracker
	t = dlib.correlation_tracker()
	rect = dlib.rectangle(box[0], box[1], box[2], box[3])
	t.start_track(rgb, rect)
The first three parameters to start_tracker include:
- box: Bounding box coordinates of the object we are going to track, presumably returned by some sort of object detector, whether manual or programmatic.
- label: Human-readable label of the object.
- rgb: An RGB-ordered image that we’ll be using to start the initial dlib object tracker.
Keep in mind how Python multiprocessing works — Python will call this function and then create a brand new interpreter to execute the code within. Therefore, each start_tracker spawned process will be independent from its parent. To communicate with the Python driver script we need to leverage either Pipes or Queues. Both types of objects are thread/process safe, accomplished using locks and semaphores.
In essence, we are creating a simple producer/consumer relationship:
- Our parent process will produce new frames and add them to the queue of a particular object tracker.
- The child process will then consume the frame, apply object tracking, and then return the updated bounding box coordinates.
I decided to use Queue objects for this post; however, keep in mind that you could use a Pipe if you wish — be sure to refer to the Python multiprocessing documentation for more details on these objects.
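If the producer/consumer pattern is new to you, here is a minimal, self-contained sketch (a toy example, entirely separate from the tracking scripts) of a parent process feeding work to a daemon child through one Queue and reading results back through another:
# toy producer/consumer sketch with multiprocessing queues
import multiprocessing

def worker(inputQueue, outputQueue):
	# consume items until the parent sends a None sentinel
	while True:
		item = inputQueue.get()
		if item is None:
			break
		# "process" the item and return the result to the parent
		outputQueue.put(item * item)

if __name__ == "__main__":
	iq = multiprocessing.Queue()
	oq = multiprocessing.Queue()
	p = multiprocessing.Process(target=worker, args=(iq, oq))
	p.daemon = True
	p.start()

	# produce work and consume the results
	for i in range(5):
		iq.put(i)
		print(oq.get())

	# signal the worker to exit
	iq.put(None)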
Now let’s begin an infinite loop which will run in the process:
	# loop indefinitely -- this function will be called as a daemon
	# process so we don't need to worry about joining it
	while True:
		# attempt to grab the next frame from the input queue
		rgb = inputQueue.get()

		# if there was an entry in our queue, process it
		if rgb is not None:
			# update the tracker and grab the position of the tracked
			# object
			t.update(rgb)
			pos = t.get_position()

			# unpack the position object
			startX = int(pos.left())
			startY = int(pos.top())
			endX = int(pos.right())
			endY = int(pos.bottom())

			# add the label + bounding box coordinates to the output
			# queue
			outputQueue.put((label, (startX, startY, endX, endY)))
We loop indefinitely here — this function will be called as a daemon process, so we don’t need to worry about joining it.
First, we’ll attempt to grab a new frame from the inputQueue on Line 21.
If the frame is not empty, we’ll grab the frame and then update the object tracker, allowing us to obtain the updated bounding box coordinates (Lines 24-34).
Finally, we write the label and bounding box to the outputQueue so the parent process can utilize them in the main loop of our script (Line 38).
Back to the parent process, we’ll parse our command line arguments:
# construct the argument parser and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-p", "--prototxt", required=True,
	help="path to Caffe 'deploy' prototxt file")
ap.add_argument("-m", "--model", required=True,
	help="path to Caffe pre-trained model")
ap.add_argument("-v", "--video", required=True,
	help="path to input video file")
ap.add_argument("-o", "--output", type=str,
	help="path to optional output video file")
ap.add_argument("-c", "--confidence", type=float, default=0.2,
	help="minimum probability to filter weak detections")
args = vars(ap.parse_args())
The command line arguments for this script are exactly the same as our slower, non-multiprocessing script. If you need a refresher on the arguments, just click here. And furthermore, read my post about argparse and command line arguments if you aren’t familiar with them.
Let’s initialize our input and output queues:
# initialize our lists of queues -- both input queue and output queue
# for *every* object that we will be tracking
inputQueues = []
outputQueues = []
These queues will hold the objects we are tracking. Each process spawned will need two Queue objects:
- One to read input frames from
- And a second to write results to
This next block is identical to our previous script:
# initialize the list of class labels MobileNet SSD was trained to
# detect
CLASSES = ["background", "aeroplane", "bicycle", "bird", "boat",
	"bottle", "bus", "car", "cat", "chair", "cow", "diningtable",
	"dog", "horse", "motorbike", "person", "pottedplant", "sheep",
	"sofa", "train", "tvmonitor"]

# load our serialized model from disk
print("[INFO] loading model...")
net = cv2.dnn.readNetFromCaffe(args["prototxt"], args["model"])

# initialize the video stream and output video writer
print("[INFO] starting video stream...")
vs = cv2.VideoCapture(args["video"])
writer = None

# start the frames per second throughput estimator
fps = FPS().start()
We define our model’s CLASSES and load the model itself (Lines 61-68). Remember, these CLASSES are static — our MobileNet SSD supports these classes and only these classes. If you want to detect + track other objects you’ll need to find another pretrained model or train one. Furthermore, the order of this list matters! Do not change the ordering of the list unless you enjoy being confused! I would also recommend reading this tutorial if you want to further understand how object detectors work.
We initialize our video stream object and set our video writer object to None (Lines 72 and 73).
Our frames per second calculator is instantiated and started on Line 76.
Now let’s begin looping over frames from the video stream:
# loop over frames from the video file stream
while True:
	# grab the next frame from the video file
	(grabbed, frame) = vs.read()

	# check to see if we have reached the end of the video file
	if frame is None:
		break

	# resize the frame for faster processing and then convert the
	# frame from BGR to RGB ordering (dlib needs RGB ordering)
	frame = imutils.resize(frame, width=600)
	rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)

	# if we are supposed to be writing a video to disk, initialize
	# the writer
	if args["output"] is not None and writer is None:
		fourcc = cv2.VideoWriter_fourcc(*"MJPG")
		writer = cv2.VideoWriter(args["output"], fourcc, 30,
			(frame.shape[1], frame.shape[0]), True)
The above code block is, yet again, identical to the one in the previous script. Be sure to refer above as needed.
Now let’s handle the case where we have no inputQueues:
	# if our list of queues is empty then we know we have yet to
	# create our first object tracker
	if len(inputQueues) == 0:
		# grab the frame dimensions and convert the frame to a blob
		(h, w) = frame.shape[:2]
		blob = cv2.dnn.blobFromImage(frame, 0.007843, (w, h), 127.5)

		# pass the blob through the network and obtain the detections
		# and predictions
		net.setInput(blob)
		detections = net.forward()

		# loop over the detections
		for i in np.arange(0, detections.shape[2]):
			# extract the confidence (i.e., probability) associated
			# with the prediction
			confidence = detections[0, 0, i, 2]

			# filter out weak detections by requiring a minimum
			# confidence
			if confidence > args["confidence"]:
				# extract the index of the class label from the
				# detections list
				idx = int(detections[0, 0, i, 1])
				label = CLASSES[idx]

				# if the class label is not a person, ignore it
				if CLASSES[idx] != "person":
					continue
If there are no inputQueues (Line 101) then we know we need to apply object detection prior to object tracking.
We apply object detection on Lines 103-109 and then proceed to loop over the results on Line 112. We grab our confidence values and filter out weak detections on Lines 115-119.
If our confidence meets the threshold established by our command line arguments, we consider the detection, but we further filter it by its class label. In this case, we’re only looking for "person" objects (Lines 122-127).
Assuming we have found a "person", we’ll create queues and spawn tracking processes:
				# compute the (x, y)-coordinates of the bounding box
				# for the object
				box = detections[0, 0, i, 3:7] * np.array([w, h, w, h])
				(startX, startY, endX, endY) = box.astype("int")
				bb = (startX, startY, endX, endY)

				# create two brand new input and output queues,
				# respectively
				iq = multiprocessing.Queue()
				oq = multiprocessing.Queue()
				inputQueues.append(iq)
				outputQueues.append(oq)

				# spawn a daemon process for a new object tracker
				p = multiprocessing.Process(
					target=start_tracker,
					args=(bb, label, rgb, iq, oq))
				p.daemon = True
				p.start()

				# grab the corresponding class label for the detection
				# and draw the bounding box
				cv2.rectangle(frame, (startX, startY), (endX, endY),
					(0, 255, 0), 2)
				cv2.putText(frame, label, (startX, startY - 15),
					cv2.FONT_HERSHEY_SIMPLEX, 0.45, (0, 255, 0), 2)
We first compute the bounding box coordinates on Lines 131-133.
From there we create two new queues, iq and oq (Lines 137 and 138), appending them to inputQueues and outputQueues respectively (Lines 139 and 140).
From there we spawn a new start_tracker process, passing the bounding box, label, rgb image, and the iq + oq (Lines 143-147). Don’t forget to read more about multiprocessing here.
We also draw the detected object’s bounding box rectangle and class label (Lines 151-154).
Otherwise, we’ve already performed object detection so we need to apply each of the dlib object trackers to the frame:
	# otherwise, we've already performed detection so let's track
	# multiple objects
	else:
		# loop over each of our input ques and add the input RGB
		# frame to it, enabling us to update each of the respective
		# object trackers running in separate processes
		for iq in inputQueues:
			iq.put(rgb)

		# loop over each of the output queues
		for oq in outputQueues:
			# grab the updated bounding box coordinates for the
			# object -- the .get method is a blocking operation so
			# this will pause our execution until the respective
			# process finishes the tracking update
			(label, (startX, startY, endX, endY)) = oq.get()

			# draw the bounding box from the correlation object
			# tracker
			cv2.rectangle(frame, (startX, startY), (endX, endY),
				(0, 255, 0), 2)
			cv2.putText(frame, label, (startX, startY - 15),
				cv2.FONT_HERSHEY_SIMPLEX, 0.45, (0, 255, 0), 2)
Looping over each of the inputQueues, we add the rgb image to them (Lines 162 and 163).
Then we loop over each of the outputQueues (Line 166), obtaining the bounding box coordinates from each independent object tracker (Line 171). Finally, we draw the bounding box + associated class label on Lines 175-178.
Let’s finish out the loop and script:
	# check to see if we should write the frame to disk
	if writer is not None:
		writer.write(frame)

	# show the output frame
	cv2.imshow("Frame", frame)
	key = cv2.waitKey(1) & 0xFF

	# if the `q` key was pressed, break from the loop
	if key == ord("q"):
		break

	# update the FPS counter
	fps.update()

# stop the timer and display FPS information
fps.stop()
print("[INFO] elapsed time: {:.2f}".format(fps.elapsed()))
print("[INFO] approx. FPS: {:.2f}".format(fps.fps()))

# check to see if we need to release the video writer pointer
if writer is not None:
	writer.release()

# do a bit of cleanup
cv2.destroyAllWindows()
vs.release()
We write the frame to the output video if necessary as well as show the frame to the screen (Lines 181-185).
If the "q"
key is pressed, we “quit”, breaking out of the loop (Lines 186-190).
If we do continue processing frames, our fps calculator is updated on Line 193, and then we start the process at the beginning of the while loop again.
Otherwise, we’re done processing frames, and we display the FPS throughput info + release pointers and close windows.
To execute this script, make sure you use the “Downloads” section of the post to download the source code + example video.
From there, open up a terminal and execute the following command:
$ python multi_object_tracking_fast.py --prototxt mobilenet_ssd/MobileNetSSD_deploy.prototxt \
	--model mobilenet_ssd/MobileNetSSD_deploy.caffemodel \
	--video race.mp4 --output race_output_fast.avi
[INFO] loading model...
[INFO] starting video stream...
[INFO] elapsed time: 14.01
[INFO] approx. FPS: 24.26
As you can see, our faster, more efficient multi-object tracker is running at ~24 FPS, an improvement of over 45% over our previous implementation!
Furthermore, if you open up your activity monitor while this script is running, you will see that more of your system’s overall CPU is being utilized.
This speedup is obtained by allowing each of the dlib object trackers to run in a separate process which in turn enables your operating system to perform more efficient scheduling of the CPU resources.
Improvements and suggestions
The dlib multi-object tracking Python scripts I’ve shared with you today will work just fine for processing shorter video streams; however, if you intend on utilizing this implementation for long-running production environments (on the order of many hours to days of video) there are two primary improvements I would suggest you make:
The first improvement would be to utilize processing pools rather than spawning a brand new process for each object to be tracked.
The implementation covered here today constructs a brand new Queue
and Process
for each object that we need to track.
For today’s purposes that’s fine, but consider if you wanted to track 50 objects in a video — this implies that you would spawn 50 processes, one for each object. At that point, the overhead of your system managing all those processes will destroy any increase in FPS throughput. Instead, you would want to utilize processing pools.
If your system has N processor cores, then you would want to create a pool with N – 1 processes, leaving one core to your operating system to perform system operations. Each of these processes should perform multiple object tracking, maintaining a list of object trackers, similar to the first multi-object tracking we covered today.
This improvement will allow you to utilize all cores of your processor without the overhead of having to spawn many independent processes.
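To make the idea concrete, here is a hedged sketch of one way you might structure it. Rather than a literal multiprocessing.Pool, I use a fixed set of long-lived worker processes, each of which builds and owns its own chunk of dlib trackers (the track_chunk and chunk helpers below are my own assumptions, not code from the “Downloads”):
# sketch: a fixed number of worker processes, each tracking a *chunk* of
# objects (assumed helpers -- not part of the downloadable scripts)
import multiprocessing
import dlib

def track_chunk(boxes, labels, rgb, inputQueue, outputQueue):
	# each worker builds its own trackers for its chunk of bounding boxes
	trackers = []
	for box in boxes:
		t = dlib.correlation_tracker()
		t.start_track(rgb, dlib.rectangle(int(box[0]), int(box[1]),
			int(box[2]), int(box[3])))
		trackers.append(t)

	# consume frames and produce updated boxes for the whole chunk
	while True:
		rgb = inputQueue.get()
		if rgb is None:
			break
		results = []
		for (t, label) in zip(trackers, labels):
			t.update(rgb)
			pos = t.get_position()
			results.append((label, (int(pos.left()), int(pos.top()),
				int(pos.right()), int(pos.bottom()))))
		outputQueue.put(results)

def chunk(items, n):
	# split `items` into at most n roughly equal chunks
	if len(items) == 0:
		return []
	k = -(-len(items) // n)  # ceiling division
	return [items[i:i + k] for i in range(0, len(items), k)]
In the driver script you would then spawn multiprocessing.cpu_count() - 1 of these workers, hand each one a chunk of the detected boxes and labels, and feed every worker the same rgb frame on each iteration, exactly as the per-object version does.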
The second improvement I would make is to clean up the processes and queues.
In the event that dlib reports an object as “lost” or “disappeared” we are not returning from the start_tracker function, implying that the process will live for the life of the parent script and only be killed when the parent exits.
Again, that’s fine for our purposes here today, but if you intend on utilizing this code in production environments, you should:
- Update the start_tracker function to return once dlib reports the object as lost.
- Delete the inputQueue and outputQueue for the corresponding process as well.
Failing to perform this cleanup will lead to needless computational consumption and memory overhead for long-running jobs.
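Here is a hedged sketch of what that cleanup could look like. It relies on the fact that dlib’s update method returns a confidence score; the threshold value and the None sentinel convention below are my assumptions and would need tuning for your own videos:
# sketch: a modified start_tracker that exits once the tracker looks lost
import dlib

CONFIDENCE_THRESHOLD = 7.0  # assumed value -- tune for your videos

def start_tracker(box, label, rgb, inputQueue, outputQueue):
	# construct a dlib rectangle from the bounding box and start tracking
	t = dlib.correlation_tracker()
	t.start_track(rgb, dlib.rectangle(int(box[0]), int(box[1]),
		int(box[2]), int(box[3])))

	while True:
		rgb = inputQueue.get()
		if rgb is None:
			break

		# update() returns a confidence score -- a low value suggests the
		# object has been lost, so report it to the parent and return
		score = t.update(rgb)
		if score < CONFIDENCE_THRESHOLD:
			outputQueue.put(None)
			return

		pos = t.get_position()
		outputQueue.put((label, (int(pos.left()), int(pos.top()),
			int(pos.right()), int(pos.bottom()))))
The parent would then watch for the None sentinel, stop putting frames on that tracker’s input queue, and delete both queues from inputQueues and outputQueues.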
The third improvement is to boost tracking accuracy by running the object detector every N frames (rather than just once at the start).
I actually demonstrated this in my previous post on people counting with OpenCV. It requires more logic and thought, but yields a much more accurate tracker.
I elected to forego the implementation for this script so that I could teach you the multiprocessing method concisely.
Ideally, you would use this third improvement in addition to multiprocessing.
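As a rough sketch of the control flow, using the slow, single-process script as the base (the value of N, and the choice to simply rebuild the tracker list from scratch, are my assumptions):
# sketch: re-run the SSD every N frames and rebuild the trackers, otherwise
# just update the existing ones (assumes vs, net, CLASSES, and args are
# initialized exactly as in multi_object_tracking_slow.py)
N = 30  # assumed value -- redetect every 30 frames
totalFrames = 0
trackers = []
labels = []

while True:
	(grabbed, frame) = vs.read()
	if frame is None:
		break
	frame = imutils.resize(frame, width=600)
	rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)

	if totalFrames % N == 0:
		# detection phase: discard the old trackers and create fresh ones
		# from new SSD detections (same code as the detection block above)
		trackers = []
		labels = []
		# ... run the SSD, loop over the detections, and start a
		# correlation tracker for each "person" ...
	else:
		# tracking phase: update each existing tracker (same code as the
		# tracking block above)
		for (t, l) in zip(trackers, labels):
			t.update(rgb)
			# ... unpack the position and draw the box ...

	totalFrames += 1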
What's next? We recommend PyImageSearch University.
86+ total classes • 115+ hours of on-demand code walkthrough videos • Last updated: March 2025
★★★★★ 4.84 (128 Ratings) • 16,000+ Students Enrolled
I strongly believe that if you had the right teacher you could master computer vision and deep learning.
Do you think learning computer vision and deep learning has to be time-consuming, overwhelming, and complicated? Or has to involve complex mathematics and equations? Or requires a degree in computer science?
That’s not the case.
All you need to master computer vision and deep learning is for someone to explain things to you in simple, intuitive terms. And that’s exactly what I do. My mission is to change education and how complex Artificial Intelligence topics are taught.
If you're serious about learning computer vision, your next stop should be PyImageSearch University, the most comprehensive computer vision, deep learning, and OpenCV course online today. Here you’ll learn how to successfully and confidently apply computer vision to your work, research, and projects. Join me in computer vision mastery.
Inside PyImageSearch University you'll find:
- ✓ 86+ courses on essential computer vision, deep learning, and OpenCV topics
- ✓ 86 Certificates of Completion
- ✓ 115+ hours of on-demand video
- ✓ Brand new courses released regularly, ensuring you can keep up with state-of-the-art techniques
- ✓ Pre-configured Jupyter Notebooks in Google Colab
- ✓ Run all code examples in your web browser — works on Windows, macOS, and Linux (no dev environment configuration required!)
- ✓ Access to centralized code repos for all 540+ tutorials on PyImageSearch
- ✓ Easy one-click downloads for code, datasets, pre-trained models, etc.
- ✓ Access on mobile, laptop, desktop, etc.
Summary
In this tutorial, we learned how to utilize the dlib library to perform multi-object tracking.
We also learned how to leverage multiprocessing to:
- Distribute the actual object tracker instantiations to multiple cores of our processor,
- Thereby leading to an increase in FPS throughput rate by over 45%.
I would encourage you to utilize the multiprocessing implementation of our dlib multi-object tracker for your own applications as it’s faster and more efficient; however, you should refer to the “Improvements and suggestions” section of this tutorial where I discuss how you can further enhance the multi-object tracking implementation.
If you enjoyed this series on object tracking, be sure to enter your email in the form below to download today’s source code + videos as well as to be notified of future tutorials here on PyImageSearch.
Download the Source Code and FREE 17-page Resource Guide
Enter your email address below to get a .zip of the code and a FREE 17-page Resource Guide on Computer Vision, OpenCV, and Deep Learning. Inside you'll find my hand-picked tutorials, books, courses, and libraries to help you master CV and DL!
Please tell me, how did you build the model?
Are you referring to the object detector? If so, be sure to refer to this blog post.
Hi Adrian,
I searched all your tutorials and found no topic about multi-task learning. This is very hot in ADAS due to its various benefits.
Typical MTL models include HyperFace, TCDCN, etc., which can perform multiple tasks with just one single model.
Can you share your knowledge in this domain?
Thanks a lot.
it worked 🙂
Awesome, thanks Walid 🙂
Hi Adrian. You’ve made another awesome tutorial. I’ve been using a Raspberry Pi 3 that I’ve set up in a second-story window looking down on the street, with a combination of your OpenCV motion detection script with contours to find and “box” moving objects, your centroid tracking algorithm to associate each object with its tracklet, and some multi-processing pools with queues. For object detection, I send some frames to my server’s REST API which is running a TensorFlow model. This lets the Pi just focus on detecting motion and creating and associating tracklets with the right moving object while the server just tells it what type of object was using that track.
This lets the Pi detect and track multiple objects in real time (just ~3 seconds behind real life). However, using motion contours and centroid tracking isn’t ideal because you get ID switching when objects get too close. MOSSE will probably give less ID switching than centroid tracking but I just couldn’t get it to run in real time on the Pi even with multi-processing pools; though I might not have optimised them correctly.
I’m tempted to try this dlib processing pool implementation on the Pi, but I recall you commenting once that dlib will be too slow on the Pi. Will this multiprocessing implementation you outlined here solve that?
Hey David, thanks for the comment. And congrats on a great project, it sounds like it’s really coming along!
As far as your question goes, I never actually said that dlib on the Pi would be too slow, I said that some functions implemented inside of dlib would be slow. Exactly when I said that depends on which post you read, so I unfortunately cannot comment further. I haven’t run this code on the Pi, so it may be worth a shot. I wouldn’t expect it to run tremendously faster, but it’s absolutely worth a try!
Hi Adrian,
My apologies; I should have been less ambiguous. It’s not that dlib would be too slow but that some OpenCV tracker functions would be faster and better suited to the Pi. I hope that’s a correct interpretation.
If anyone would like to use my code, they can find it here -> https://github.com/grasslandnetwork. It’s an open-source object detection and tracking network built using some of the things I learned reading Adrian’s code. By giving the software access to any fixed-perspective, 2-D camera feed, it provides a 3-D, searchable, simulated re-creation of events (just people and cars for now) in the part of the world it’s viewing using OpenStreetMap 3D. So you can view events in a “god view” mode like SimCity® or Civilization® but with the ability to rewind time and view their entire history.
You can have real-time object detection on hardware as small as a Raspberry Pi at very high FPS because it’s paired with a Serverless Lambda function (https://github.com/grasslandnetwork/node_lite_object_detection) running a Tensorflow model. So you get “infinite” horizontal scaling of object detection inference for maximum FPS without having to buy expensive hardware.
More information and demos can be found on the website here -> https://www.grassland.network
Thanks for the clarification and for sharing, David 🙂
Is it possible to use YOLOv3 with OpenCV as an object detector and then track the detections?
Yes, absolutely. I will be discussing how to use OpenCV and YOLO together in a tutorial that will be published later this week. Stay tuned!
Hello Adrian, thanks a lot for yet another wonderful blog post. All of your posts and course materials have been immensely helpful.
I just had one question: How do I know when the tracker has lost the object? tracker.update returns a score, but I am not really sure what that number means and what kind of thresholds to use to filter out false positives.
Looking at the source, the code defines it as const double psr = (G(p.y(),p.x()).real()-rs.mean())/rs.stddev(); (dlib.net/dlib/image_processing/correlation_tracker.h.html)
Thanks
Make sure you see my reply to Prateek.
How do I remove a tracker instance from the list of trackers?
For example, you can see Usain Bolt’s tracker goes out of the frame at the end.
So when should a tracker be removed from the tracker list?
One way is to check the confidence of the tracker, but that varies a lot for different object types and different types of videos (resolution, etc.). Another way is to drop a tracker when the object reaches the boundary of the frame. But is there any other efficient way to do it?
You would want to check the confidence of the tracker as you said and then drop it when the confidence is too low. Secondly, I would suggest using heuristics, again as you suggested, when the tracker reaches the boundary of the frame.
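As a rough sketch of both heuristics applied inside the slow script’s tracking loop (the confidence threshold here is just an assumption you would need to tune per video):
# sketch: keep only the trackers that are still confident and still inside
# the frame (MIN_CONFIDENCE is an assumed value)
MIN_CONFIDENCE = 7.0
(h, w) = frame.shape[:2]
keptTrackers = []
keptLabels = []

for (t, l) in zip(trackers, labels):
	score = t.update(rgb)
	pos = t.get_position()
	(startX, startY) = (int(pos.left()), int(pos.top()))
	(endX, endY) = (int(pos.right()), int(pos.bottom()))

	# drop the tracker if confidence is low or the box has left the frame
	if score < MIN_CONFIDENCE or endX < 0 or startX > w or endY < 0 or startY > h:
		continue

	keptTrackers.append(t)
	keptLabels.append(l)

trackers = keptTrackers
labels = keptLabels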
Hi Adrian,
Well, I have a suggestion: you can track an object for, let’s say, 30 frames and then from the 30th frame track it backwards until you reach the first frame. If you land on or near the starting position then it’s a good track; otherwise it’s a bad track.
Cheers
Tehseen Akhtar
Hi Adrian,
Yet another awesome tutorial. Thanks for sharing this one; it is always helpful to use your tips and hacks while solving real-life examples.
I had one question: how can we train the SSD model of dlib with custom labels, to predict objects the model was never trained on?
The SSD is actually from OpenCV, not dlib. I would suggest you read my gentle guide to deep learning object detection so you can learn the fundamentals of object detection. From there, refer to Deep Learning for Computer Vision with Python where I discuss and provide code on how to train your own custom deep learning object detectors.
Hi. I wonder, is it possible to refresh the tracking process as soon as the bounding boxes seem to drift away from the object?
Thank you
Absolutely. See my blog post on building an OpenCV people counter where I do exactly that.
Hi Adrian, I’m also using a Raspberry Pi 3, but with mobile robots. If possible, I would like some suggestions. I am developing a mobile robot that aims to go to a certain place and identify a single object quickly.
So I need models and techniques to identify a specific object in the frame quickly and without many flaws.
I have already tried:
YOLO: It is accurate, but I realized that you need a powerful machine and I do not have one.
Haar cascades: Quick, but they do not have much precision and can produce false positives.
—————————
Any suggestions for other alternatives that are quick at identifying a specific object?
If the solution needs good processing power, I could do as David did: send the images to a local REST API over WiFi and have it return the instructions to the Raspberry Pi.
Sorry, I double-posted unintentionally.
Depending on your level of accuracy required you may not be able to use the Pi, it might not have enough computational horsepower. You could try using a Movidius NCS to increase your FPS throughput rate.
Thanks, Adrian.
FYI, I can confirm it works after fixing the following:
1. Use an int cast like this:
rect = dlib.rectangle(box[0], box[1], box[2], box[3]) ->
rect = dlib.rectangle(int(box[0]), int(box[1]), int(box[2]), int(box[3]))
2. Insert "if __name__ == '__main__':" just before the main code.
If not, a freeze_support() error occurred.
Hey Tommy — which operating system were you using? Just want to make sure it’s documented for other readers.
The problem appears on Windows 10 64-bit. I wrapped everything in a main() function and then called it like below to fix this. Hope it helps:
if __name__ == "__main__":
	main()
Thanks Zubair!
Where exactly did you insert the if statement? Like where is the main() function located?
Tommy is right, you should insert the "if __name__ == '__main__':" guard.
Correct, for Windows users you will need to update the code.
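For anyone following along on Windows, the fix the commenters describe boils down to this structure (a sketch, with the actual driver code elided):
# Windows spawns (rather than forks) child processes and re-imports this
# module, so the driver code must live behind a __main__ guard
def main():
	# argument parsing, model loading, and the frame-processing loop from
	# multi_object_tracking_fast.py go here
	...

if __name__ == "__main__":
	main()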
Hi Adrian,
My OS is Windows 10 Pro.
Python 3.6.4
imutils is the latest version.
OpenCV is 3.4.3.
If you need more info, please let me know.
Thanks.
Best Regards,
Tommy Lim
Thanks Tommy. Based on what I’ve seen in other comments it appears to be a Windows-specific error. Thank you for providing the solution.
Hey Adrian,
What should I do if I want to use YOLOv3 for object detection? Which part should I change?
I’ll be doing a blog post on how to use YOLO with OpenCV later this month, stay tuned!
Hi Adrian,
Awesome tutorial, thank you so much for sharing the code. Really enjoy reading your blog. I have some questions as below:
1/ I have a problem executing the final code (error in lines 2-7). I have created a virtual environment for dlib (called dlib_test) as per your instructions. For OpenCV, I also have another virtual environment named py3cv4 (following your previous tutorial). From my understanding, they are two separate environments and I can only access, say, dlib in the dlib_test environment, and likewise for OpenCV. However, I need to use both of them to make the final code work. How can I do that? Specifically, I want to know which environment I should use to run the final code. I am quite new to this and trying to learn from the final output backward, so my question might be naive I guess 🙂
2/ My second question: is there a way to connect my home camera for input data?
I look forward to hearing from you
Wishing you all the best
Quy
1. You can either (1) install OpenCV into your “dlib_test” virtual environment or (2) simply sym-link your “cv2.so” bindings for your “py3cv4” Python virtual environment. Either will work.
2. Yes, you can follow this tutorial to learn how to access your webcam.
Hi Adrian,
Thank you Adrian, I did it. Sym-linking “cv2” seems much more convenient. Now I am following your tutorial to access my home camera to see how things work.
Best,
Quy
Hey Adrian,
How can I use this for a dynamic number of objects? I am working on vehicle tracking at a signal. Right now this code tracks the cars which are seen in the first frame, but when a new car enters the frame, how can I keep track of that too?
See my blog post on building a people counter with OpenCV. You can swap out the “person” class for the “car” class.
In your opinion, is the dlib correlation tracker better, or is the OpenCV KCF tracker better?
There is no one true “best” object tracker. It’s entirely dependent on your application and what you are trying to build. Each tracker has its own pros and cons. See this blog post for more information.
Hi Adrian,
Yes, there is no one true “best” object tracker. But I think we can compare two methods. Have you ever compared the dlib correlation tracker with the built-in KCF tracker?
I want to develop a system to track walkers/runners going past so as to analyze their technique. A camera on a servo motor must trigger when a person enters from the left, then stay on the person in the center of the frame for about 10 m. It will then reverse and wait for the next person. I am very much in the initial stage but hope to use a Pi; in fact, I have a spare Pi-top which will work nicely. I was hoping to use an action camera on a tripod but can’t seem to figure out if I can connect to it over WiFi or not. There are Android apps but nothing for Linux.
Hey Dave — my primary concern is that the Pi won’t have enough computational horsepower for the project. The object detector used in this tutorial only runs at ~1 FPS on the Pi. You could use a Movidius NCS and get up to 4-6 FPS but again, that’s not ideal. You would also want to use a more computationally efficient, but potentially less accurate, tracker such as MOSSE.
Perfect tutorial Adrian!! Many thanks! I adapted it to my webcam and it worked! But when I leave the scene, the bounding box gets stuck in the frame. How do I discard the tracker when the tracked object is gone?
Regards
You’ll want to check the returned confidence of the tracker. If it gets too low, discard the tracker.
Hello Adrian, this is Divyanshu. I am trying to implement the slow version of object tracking first but am getting this error, please help.
File “”, line 44, in
rect = dlib.rectangle(startX, startY, endX, endY)
Hey, I fixed the issue. I simply converted all 4 points to integers and it worked for me.
Congrats on resolving the issue!
Hello Adrian, I implemented the slow tracker just now and it is working fine, but when I tested it with the webcam, it was detecting me accurately for some time, and then as I reached the corner of the frame it froze there and no further tracking took place.
That is indeed strange behavior. Try to use either “print” statements or “pdb” to help determine what line is causing the freeze as again, that is not normal behavior.
Hey Adrian, thanks for the great tutorial. I tried to run this code on my Windows system; however, it gives me the following error:
RuntimeError:
An attempt has been made to start a new process before the
current process has finished its bootstrapping phase.
This probably means that you are not using fork to start your
child processes and you have forgotten to use the proper idiom
in the main module:
if __name__ == '__main__':
freeze_support()
…
The “freeze_support()” line can be omitted if the program
is not going to be frozen to produce an executable.
I searched a lot to find out that it’s an error that occurs on Windows and we have to wrap the code with the above if statement to avoid executing it multiple times. The thing is, I don’t know where to insert this if statement in the multi_object_tracking_fast.py file. Can you please help me out with this? Thanks in advance.
Windows being a pain there. This is how I fixed it. https://gist.github.com/andytwoods/ed5792173e6456604d9905e51c8d82ca
Thank you Adrian for awesome tutorials!
Have you developed/implemented any multi-object, multi-camera detection and tracking algorithm, such as this one: https://github.com/ergysr/DeepCC? It would be great if you could cover the topic.
Thank you and keep up the great work.
Thanks for the suggestion, Niko! I will certainly consider it but cannot guarantee if/when I will cover the topic.
Hi Adrian
I set out on the improvements path like you suggested, and oh my, it’s a path less traveled it seems.
I used pool.map, which had problems with multiple arguments, so I used starmap and finally figured out how to use it correctly and pass multiple args to it.
Then I stumbled upon an issue with sharing Queues across multiple processes, which was solved by using multiprocessing.Manager().
Despite having all of this, I’m still stuck: it seems my main thread crashes, bringing everything down. This happens after only 6-8 processes have been created using Pool, and I do see some trackers before everything freezes. Not sure what’s going on.
Can you please suggest something?
Hm, I’m honestly not sure what may be going on there. Are you using a Windows system? If so, I’ve never tested this code on a Windows machine, so I’m wondering if it’s a Windows-specific process issue.
Yes, I’m on Windows. I highly doubt it’s a Windows-specific issue, but the fastest way to find out is if you can please run my code on your machine. If it’s not too much to ask, can I please send it over to you?
Honestly I won’t have the time this week (it’s by far the busiest week of the year). Furthermore trying to debug another programmer/practitioner’s code is a super time consuming task. I am more than happy to provide the tutorials + source code for free here on the PyImageSearch blog. I hope others can use them, learn from them, and build their own applications, but the support I can give for custom implementations outside what I offer is a bit limited — there are only so many hours in the day and I’m trying to help everyone that I can 🙂 I hope you understand. I would also suggest spinning up your own Linux instance. You do have the VMs from DL4CV that you could utilize as well.
I understand that hours are limited, so I didn’t mean that you should debug the code, not at all. My program freezes within 2 seconds of launch on Windows. I can’t go through the trouble of setting up a Linux instance only to have the same problem. Anyway, I’ll figure this out.
You already have a pre-configured Linux instance with the DL4CV though. I can understand simply “wanting the script to work”; however, as engineers we must be willing to do the debugging. It helps us obtain a better understanding of the problem.
Yes, I’ve been debugging it for a few days now with no luck so far. I’ll figure it out.
Hi, could you just share your code with the pooling improvements? I’m struggling with that.
Many thanks!
Good day Adrian. Are you planning to create a GUI tutorial? For example, a lesson on facial recognition with the dlib library using PyQt5? Thank you.
I cover face recognition with dlib in this tutorial. I don’t have any plans to cover GUIs, I’m not much of a GUI developer and GUIs are pretty far removed from the topic of computer vision.
Hi Adrian,
Just wanted to know if there is some other way of increasing the FPS for tracking multiple objects with dlib’s correlation tracker. Actually, I am constrained by the fact that I can only use a single core, so multiprocessing is not working out for me.
Is there any way to increase the speed, for example by tweaking the code of the correlation tracker inside the dlib library?
Regards,
Adithya
If you’re tracking multiple objects the best way to increase FPS throughput is to distribute across cores. If you have only one core then you can’t do that. I’m not sure of any internal tweaks to dlib to make it faster, you would need to ask Davis King, the creator of dlib.
Hi Adrian, is it possible to run this code (with multiprocessing) on a Jetson tx2?
The code itself will run on a Jetson TX2 but I don’t believe it will natively distribute the process across all cores.
Hello Adrian,
How can I implement this tracking on YOLOv3 model?
Use this tutorial to understand the YOLO object detector. Then swap in the SSD for YOLO (but make sure you read the YOLO post first).
Sorry, Adrian, please give me advice: how can I train on my own dataset, and which software can I use for labeling?
I really like VGG’s Image Annotator, VIA.
Hi, I adapted your idea to a TensorFlow SSD and it works fine.
But is there any way to put labels on the bounding boxes like ‘person 1’, ‘person 2’, …? Because everyone is detected as ‘person’ and I want to know which bounding box is tracked per frame.
Thanks,
You can use this tutorial to assign IDs to each of the tracked objects.
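For a quick-and-dirty alternative, you could also append a running ID to the label at the moment each tracker is created in the slow script (a sketch that slots into the detection loop; robust IDs across re-detections really do call for centroid tracking):
# sketch: number each new tracker as it is created so the drawn text reads
# "person 1", "person 2", and so on
objectID = len(trackers) + 1
label = "{} {}".format(CLASSES[idx], objectID)
labels.append(label)
trackers.append(t)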
Thanks for the wonderful tutorial Adrian!
Just one quick question on how to reduce duplicate trackers. I have a video of a person moving a box from one side of the image to the center over the whole 1-minute clip. I’ve customized the program to detect on every 10th frame. By the end of the video I see multiple bounding boxes for the same person. Could you suggest a way to remove or reduce these duplicates?
Thanks in advance!
Hey Jang — it’s hard for me to say what the issue is without seeing the exact video. I’m also not sure what additions you have made to the code. It’s unfortunately too hard for me to say what the problem is.
Thanks for the wonderful post Adrian!
I wanted to know how we can track objects across multiple cameras, i.e., what approaches we can use to achieve the above scenario.
Sorry, I don’t have any tutorials on that topic at the moment. I may consider it for a future tutorial though.
I saw a way to speed up the correlation tracker.
It says to use tracker.update_noscale() on the majority of frames and tracker.update() on the other frames. This actually improves the speed, on the condition that the scale of the object is changing gradually.
I installed the dlib library with the command (pip install dlib), but it gives an error that the method doesn’t exist when I call tracker.update_noscale(). Can you please let me know how to solve it?
Thanks
Thanks Naufil. I think that tracker.update_noscale could be a C++-only function with no Python bindings.
Adrian
Thank you for this great tutorial. I especially like your list of suggested improvements. I tried using the multiprocessing pooling function, but the program freezes up. I’m creating a list of trackers created from the start_tracker method, then pooling the update method, which I was hoping would spread the work across the other cores.
positions = [pool.apply(tracking, args=(tracker, rgb)) for tracker in trackers]
Any suggestions on how to set up and implement the tracker pool would be greatly appreciated.
Hey Steve — without knowing your exact code it’s hard to know what the issue is. I may come back and do a separate “pool” post but I’m not sure if/when that may be.
Really nice tutorial! Thanks for sharing. I would like to ask: is there any way that I can track people in a multi-camera setup? For example, if there is a person labeled ‘X’ detected in camera 1, is there a way to detect the same person in the other cameras?
I don’t have any tutorials for that particular use case. I may cover it in the future but I cannot guarantee if/when that may be.
Adrian
Thank you for this great tutorial. It really helped.
Can you suggest how I can change the tracker?
I would like to use the CSRT tracker instead of the dlib tracker.