Table of Contents
- Training and Making Predictions with Siamese Networks and Triplet Loss
- Configuring Your Development Environment
- Having Problems Configuring Your Development Environment?
- Project Structure
- Training Our Siamese Network Model with Triplet Loss
- Making Predictions with Our Siamese Network Based Face Recognition Model
- Summary
Training and Making Predictions with Siamese Networks and Triplet Loss
In this tutorial, we will learn to train our Siamese network based face recognition application using Keras and TensorFlow. Furthermore, we will discuss how we can use our model to make predictions in real-time.
In the previous tutorial of this series, we tried to understand the formulation of triplet loss. We discussed how it could be used to learn an embedding space where “similar faces” (i.e., from the same person) reside close to each other and “dissimilar faces” (i.e., from different people) reside farther apart. Additionally, we discussed a typical Siamese network pipeline and how it can be used to build our face recognition model.
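As a quick refresher, that loss can be written as follows (shown here with squared Euclidean distances between embeddings; see the previous tutorial for the exact implementation):

$$\mathcal{L}(a, p, n) = \max\left(\lVert f(a) - f(p)\rVert_2^2 - \lVert f(a) - f(n)\rVert_2^2 + \alpha,\ 0\right)$$

Here, f is the embedding module, (a, p, n) is an (anchor, positive, negative) triplet, and α is the margin that dictates how much farther the negative must be than the positive before the loss drops to zero.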
Furthermore, we implemented the triplet loss and developed our Siamese network based face recognition pipeline in Keras and TensorFlow.
In this tutorial, we will take this further and learn how to train our face recognition model using Keras and TensorFlow. Once our model is trained, we will use it to predict new unseen faces in real-time.
This lesson is the 4th in a 5-part series on Siamese networks and their application in face recognition:
- Face Recognition with Siamese Networks, Keras, and TensorFlow
- Building a Dataset for Triplet Loss with Keras and TensorFlow
- Triplet Loss with Keras and TensorFlow
- Training and Making Predictions with Siamese Networks and Triplet Loss (this tutorial)
- Evaluating Siamese Network Accuracy (F1-Score, Precision, and Recall) with Keras and TensorFlow
To learn how to train and make predictions with Siamese networks and triplet loss, just keep reading.
Training and Making Predictions with Siamese Networks and Triplet Loss
In the second part of this series, we developed the modules required to build the data pipeline for our face recognition application. Furthermore, in the previous tutorial, we developed modules to build our Siamese Model and triplet loss function. In this tutorial, we will put everything together and build our end-to-end face recognition application using the modules that we built previously. Additionally, we will learn to train our end-to-end face recognition model and discuss how we can make predictions using it in real-time.
For this tutorial, we will use Keras and TensorFlow, as we have done in the previous parts of this series. Keras and TensorFlow provide various functionalities that allow us to elegantly put all the modules together and develop our end-to-end pipeline. This provides a simple and intuitive way in which the different parts of our application can communicate and work in tandem to create an efficient and effective face recognition application. Furthermore, Keras provides a simple API that helps us implement, compile, train, and save our model with minimal code.
Figure 1 depicts the overview of our face recognition pipeline and shows how the modules we built in the previous parts of this series work together to develop our final end-to-end application.
Let us revisit our modules and understand the structure and flow of our application.
First, we develop the data pipeline, as shown in Figure 1. Next, the data generator (created using the TripletGenerator class) is used to create our training and validation datasets with the help of the tf.data.Dataset functionality provided by TensorFlow. These datasets are then used to create our data loaders, which apply pre-processing transformations using the MapFunction class and generate batches of data samples. Finally, the data pipeline returns two data loaders (i.e., trainDs and valDs) for training and validation, respectively.
Next, we develop our Siamese network pipeline. We create our embedding module with the help of the get_embedding_module() function, which we defined in earlier tutorials. Then, we use the embedding module to embed the anchor, positive, and negative images and build our Siamese network using the get_siamese_network() function. Finally, we pass our Siamese network to the SiameseModel class, which implements the triplet loss along with the training and test step code.
In the end, we compile and train our model using Keras and finally save our trained model so we can use it in the inference phase for making real-time predictions.
Now that we have discussed the overview of our pipeline, let us dive into the code to train and make predictions with our Siamese network based face recognition application.
Configuring Your Development Environment
To follow this guide, you need to have the TensorFlow and OpenCV libraries installed on your system.
Luckily, both TensorFlow and OpenCV are pip-installable:
```
$ pip install tensorflow
$ pip install opencv-contrib-python
```
If you need help configuring your development environment for OpenCV, we highly recommend that you read our pip install OpenCV guide — it will have you up and running in a matter of minutes.
Having Problems Configuring Your Development Environment?
All that said, are you:
- Short on time?
- Learning on your employer’s administratively locked system?
- Wanting to skip the hassle of fighting with the command line, package managers, and virtual environments?
- Ready to run the code right now on your Windows, macOS, or Linux system?
Then join PyImageSearch University today!
Gain access to Jupyter Notebooks for this tutorial and other PyImageSearch guides that are pre-configured to run on Google Colab’s ecosystem right in your web browser! No installation required.
And best of all, these Jupyter Notebooks will run on Windows, macOS, and Linux!
Project Structure
We first need to review our project directory structure.
Start by accessing the “Downloads” section of this tutorial to retrieve the source code and example images.
```
├── crop_faces.py
├── face_crop_model
│   ├── deploy.prototxt.txt
│   └── res10_300x300_ssd_iter_140000.caffemodel
├── inference.py
├── pyimagesearch
│   ├── config.py
│   ├── dataset.py
│   └── model.py
└── train.py
```
In the previous tutorial, we presented a step-by-step walkthrough of model.py from the pyimagesearch folder, which implements the triplet loss function and builds our Siamese network model.

In this tutorial, we will discuss in detail the train.py file, which implements the code to train our face recognition pipeline, and the inference.py file, which will help us make predictions using our Siamese network based face recognition application.
Training Our Siamese Network Model with Triplet Loss
Now that we have discussed the overview of our face recognition pipeline and the function performed by the modules we have built, let us put everything together and train our Siamese network based face recognition pipeline using Keras and TensorFlow.
We open our train.py file and get started.
```python
# USAGE
# python train.py

# import the necessary packages
from pyimagesearch.dataset import TripletGenerator
from pyimagesearch.model import get_embedding_module
from pyimagesearch.model import get_siamese_network
from pyimagesearch.model import SiameseModel
from pyimagesearch.dataset import MapFunction
from pyimagesearch import config
from tensorflow import keras
import tensorflow as tf
import os

# create the data input pipeline for train and val dataset
print("[INFO] building the train and validation generators...")
trainTripletGenerator = TripletGenerator(
    datasetPath=config.TRAIN_DATASET)
valTripletGenerator = TripletGenerator(
    datasetPath=config.TRAIN_DATASET)
print("[INFO] building the train and validation `tf.data` dataset...")
trainTfDataset = tf.data.Dataset.from_generator(
    generator=trainTripletGenerator.get_next_element,
    output_signature=(
        tf.TensorSpec(shape=(), dtype=tf.string),
        tf.TensorSpec(shape=(), dtype=tf.string),
        tf.TensorSpec(shape=(), dtype=tf.string),
    )
)
valTfDataset = tf.data.Dataset.from_generator(
    generator=valTripletGenerator.get_next_element,
    output_signature=(
        tf.TensorSpec(shape=(), dtype=tf.string),
        tf.TensorSpec(shape=(), dtype=tf.string),
        tf.TensorSpec(shape=(), dtype=tf.string),
    )
)
```
First, we import the modules we built earlier to train our face recognition model on Lines 5-13. In previous tutorials, we noted that the pyimagesearch folder contains the code for the dataset module (dataset.py), the model definition (model.py), and the configuration file (config.py), which we discussed in detail. We will now use these modules to train our face recognition application.
We start by importing the TripletGenerator class from the dataset module (Line 5), followed by the get_embedding_module and get_siamese_network functions and the SiameseModel class from the model definition (Lines 6-8). We also import the MapFunction class and the config file on Lines 9 and 10, respectively. Finally, we import the Keras, TensorFlow, and os packages on Lines 11-13.
Next, we develop our data pipeline to allow us to sample batches for training and validation. We use the TripletGenerator class (which we developed earlier) to define the training data generator (i.e., trainTripletGenerator) and the validation data generator (i.e., valTripletGenerator) on Lines 17-19. The TripletGenerator class takes as input the path to the respective dataset (i.e., config.TRAIN_DATASET), as shown on Line 20.
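Recall from the dataset tutorial that TripletGenerator builds each triplet by sampling two images of the same person (anchor and positive) and one image of a different person (negative), which implies a dataset directory organized by identity. A hypothetical layout (the folder and file names below are placeholders, not the actual dataset):

```
train_dataset/
├── person_a/
│   ├── face_00.jpg
│   └── face_01.jpg
└── person_b/
    ├── face_00.jpg
    └── face_01.jpg
```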
Now that we have defined our data generators, we use the tf.data.Dataset.from_generator functionality to define our training and validation datasets on Lines 22-37. First, we define our training dataset (i.e., trainTfDataset) on Lines 22-29. Note that tf.data.Dataset.from_generator takes as input a callable generator function (i.e., trainTripletGenerator.get_next_element), whose outputs must be compatible with the output format defined by the output_signature argument.
Similarly, we create the validation dataset (i.e., valTfDataset) using tf.data.Dataset.from_generator on Lines 30-37.
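Before moving on, it may help to see the generator/signature contract in isolation. Below is a minimal, self-contained sketch (the dummy generator and its placeholder file names are ours, not part of the project code): each element the generator yields must match the structure, shapes, and dtypes declared in output_signature.

```python
import tensorflow as tf

# a stand-in generator that yields triplets of image paths (placeholders);
# each yielded tuple must match the output_signature below
def dummyTripletGenerator():
    while True:
        yield ("anchor.jpg", "positive.jpg", "negative.jpg")

ds = tf.data.Dataset.from_generator(
    generator=dummyTripletGenerator,
    output_signature=(
        tf.TensorSpec(shape=(), dtype=tf.string),
        tf.TensorSpec(shape=(), dtype=tf.string),
        tf.TensorSpec(shape=(), dtype=tf.string),
    )
)
print(next(iter(ds)))  # three scalar string tensors
```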
```python
# preprocess the images
mapFunction = MapFunction(imageSize=config.IMAGE_SIZE)
print("[INFO] building the train and validation `tf.data` pipeline...")
trainDs = (trainTfDataset
    .map(mapFunction)
    .shuffle(config.BUFFER_SIZE)
    .batch(config.BATCH_SIZE)
    .prefetch(config.AUTO)
)
valDs = (valTfDataset
    .map(mapFunction)
    .batch(config.BATCH_SIZE)
    .prefetch(config.AUTO)
)

# build the embedding module and the siamese network
print("[INFO] build the siamese model...")
embeddingModule = get_embedding_module(imageSize=config.IMAGE_SIZE)
siameseNetwork = get_siamese_network(
    imageSize=config.IMAGE_SIZE,
    embeddingModel=embeddingModule,
)
siameseModel = SiameseModel(
    siameseNetwork=siameseNetwork,
    margin=0.5,
    lossTracker=keras.metrics.Mean(name="loss"),
)

# compile the siamese model
siameseModel.compile(
    optimizer=keras.optimizers.Adam(config.LEARNING_RATE)
)

# train and validate the siamese model
print("[INFO] training the siamese model...")
siameseModel.fit(
    trainDs,
    steps_per_epoch=config.STEPS_PER_EPOCH,
    validation_data=valDs,
    validation_steps=config.VALIDATION_STEPS,
    epochs=config.EPOCHS,
)

# check if the output directory exists, if it doesn't, then
# create it
if not os.path.exists(config.OUTPUT_PATH):
    os.makedirs(config.OUTPUT_PATH)

# save the siamese network to disk
modelPath = config.MODEL_PATH
print(f"[INFO] saving the siamese network to {modelPath}...")
keras.models.save_model(
    model=siameseModel.siameseNetwork,
    filepath=modelPath,
    include_optimizer=False,
)
```
On Line 40, we define the pre-processing operations that we want to apply to our data samples using the MapFunction class, which takes the config.IMAGE_SIZE parameter as an argument.
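The real implementation lives in dataset.py, but conceptually MapFunction turns each image path yielded by the generator into a pre-processed image tensor. A minimal sketch of that kind of transform, assuming standard decode-resize-scale steps (the exact steps and image size in dataset.py may differ):

```python
import tensorflow as tf

# a MapFunction-style transform: image path in, image tensor out
def loadAndPreprocess(path, imageSize=(224, 224)):  # (224, 224) is a placeholder
    image = tf.io.read_file(path)                    # read the raw bytes
    image = tf.image.decode_jpeg(image, channels=3)  # decode to an HWC uint8 tensor
    image = tf.image.resize(image, imageSize)        # resize to a fixed spatial size
    return image / 255.0                             # scale pixel values to [0, 1]
```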
Finally, on Lines 42-52, we use the training and validation datasets (i.e., trainTfDataset and valTfDataset) to define our training and validation data loaders (i.e., trainDs and valDs). Note that TensorFlow allows us to apply different functionalities to our generated data samples:
- The map functionality (which takes as an argument our mapFunction object) applies pre-processing transformations to our data samples.
- The shuffle functionality (which takes as an argument config.BUFFER_SIZE) randomly samples elements from a buffer of config.BUFFER_SIZE elements.
- The batch functionality (which takes as an argument config.BATCH_SIZE) samples batches of data, with the number of elements per batch defined by the config.BATCH_SIZE argument.
- The prefetch functionality directs TensorFlow to prepare later elements while current elements are being processed (see the toy sketch below).
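These operations compose cleanly because each one returns a new tf.data.Dataset. Here is a quick, self-contained toy illustration of the same map → shuffle → batch → prefetch chain (the buffer and batch sizes are arbitrary):

```python
import tensorflow as tf

# toy pipeline on integers, mirroring the chain used for trainDs
ds = (tf.data.Dataset.range(100)
    .map(lambda x: x * 2)          # stand-in for pre-processing
    .shuffle(buffer_size=16)       # draw randomly from a 16-element buffer
    .batch(8)                      # group 8 elements per batch
    .prefetch(tf.data.AUTOTUNE)    # prepare later batches while training runs
)
print(next(iter(ds)).numpy())      # e.g., [ 8  2  0 12  6 18 22 10]
```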
Now that we have created the data pipeline, we can define our Siamese model. We first build our embedding module using the get_embedding_module function, which takes as input the imageSize (Line 56).
Next, we use the get_siamese_network function, which takes as arguments the imageSize and embeddingModule, to build and return our siameseNetwork (Lines 57-60).
Now that we have our siameseNetwork, we use the SiameseModel class to build our Siamese network based face recognition model (Lines 61-65). The SiameseModel class takes as arguments the siameseNetwork, the margin distance that we discussed earlier, and the keras.metrics.Mean(name="loss") metric used to track the training loss.
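To make the role of the margin concrete, here is the core computation SiameseModel performs with these distances (as implemented in the previous tutorial), evaluated on a pair of hypothetical distance values:

```python
import tensorflow as tf

# hypothetical squared distances for a single triplet
apDistance, anDistance = 0.30, 0.55
margin = 0.5

# triplet loss: the negative must be at least `margin` farther away
# than the positive before the loss drops to zero
loss = tf.maximum(apDistance - anDistance + margin, 0.0)
print(float(loss))  # ~0.25, so this triplet still contributes to training
```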
We can now compile our model using siameseModel.compile with the Adam optimizer and a learning rate equal to config.LEARNING_RATE, as shown on Lines 68-70.
Finally, we use the fit functionality to train our Siamese network based face recognition model (Lines 74-80). It takes as input the training data loader (i.e., trainDs), the steps_per_epoch, the validation data loader (i.e., valDs), the number of validation steps (i.e., config.VALIDATION_STEPS), and the total number of epochs (i.e., config.EPOCHS).
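One detail worth noting: because our datasets are built from endless generators, Keras cannot infer an epoch length on its own, which is why steps_per_epoch and validation_steps must be set explicitly. A sketch of how such values might be derived (the sample counts and batch size below are hypothetical; the real values live in config.py):

```python
# hypothetical numbers for illustration only
numTrainSamples = 8000   # triplets we want to see per epoch
numValSamples = 2000
BATCH_SIZE = 256

STEPS_PER_EPOCH = numTrainSamples // BATCH_SIZE   # 31
VALIDATION_STEPS = numValSamples // BATCH_SIZE    # 7
```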
Next, we prepare to save our model. We check if the output directory exists, and if it does not, we create it (Lines 84 and 85). On Line 88, we define the modelPath, and on Lines 90-94, we use the keras.models.save_model function to save our trained model.
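Since we pass include_optimizer=False, only the architecture and weights are serialized. Reloading the network later is then a one-liner, mirroring what inference.py does in the next section:

```python
from tensorflow import keras
from pyimagesearch import config

# restore the saved siamese network; no optimizer state was stored,
# so the network comes back ready for inference
siameseNetwork = keras.models.load_model(filepath=config.MODEL_PATH)
```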
Making Predictions with Our Siamese Network Based Face Recognition Model
Now that we have discussed the code required to train our model, let us implement the code to make predictions in real-time with our trained Siamese network based face recognition application.
We start by opening the inference.py file.
```python
# USAGE
# python inference.py

# import the necessary packages
from pyimagesearch.dataset import TripletGenerator
from pyimagesearch.dataset import MapFunction
from pyimagesearch.model import SiameseModel
from matplotlib import pyplot as plt
from pyimagesearch import config
from tensorflow import keras
import tensorflow as tf
import os

# create the data input pipeline for test dataset
print("[INFO] building the test generator...")
testTripletGenerator = TripletGenerator(
    datasetPath=config.TEST_DATASET)
print("[INFO] building the test `tf.data` dataset...")
testTfDataset = tf.data.Dataset.from_generator(
    generator=testTripletGenerator.get_next_element,
    output_signature=(
        tf.TensorSpec(shape=(), dtype=tf.string),
        tf.TensorSpec(shape=(), dtype=tf.string),
        tf.TensorSpec(shape=(), dtype=tf.string),
    )
)
mapFunction = MapFunction(imageSize=config.IMAGE_SIZE)
testDs = (testTfDataset
    .map(mapFunction)
    .batch(4)
    .prefetch(config.AUTO)
)
```
As always, we first import the necessary modules like TripletGenerator, MapFunction, and SiameseModel on Lines 5-7. Additionally, we import the necessary packages like pyplot (from matplotlib), config, keras, tensorflow, and os, as shown on Lines 8-12.
We start by using the TripletGenerator class to define the test data generator (i.e., testTripletGenerator), which takes as input the path to the test dataset (i.e., config.TEST_DATASET), as shown on Lines 16 and 17.
Then, we use the tf.data.Dataset.from_generator function to define our test dataset (i.e., testTfDataset). Similar to what we discussed for the training pipeline, this function takes as inputs the test data generator function testTripletGenerator.get_next_element and the output_signature, as shown on Lines 19-26.
Similar to what we did in the training phase, we now use the MapFunction class to define the pre-processing transformations for the test set (Line 27) and define our test data loader testDs (Lines 28-32). Here, we use the map functionality to apply the pre-processing transformations and use a batch size of 4, as shown.
```python
# load the siamese network from disk and build the siamese model
modelPath = config.MODEL_PATH
print(f"[INFO] loading the siamese network from {modelPath}...")
siameseNetwork = keras.models.load_model(filepath=modelPath)
siameseModel = SiameseModel(
    siameseNetwork=siameseNetwork,
    margin=0.5,
    lossTracker=keras.metrics.Mean(name="loss"),
)

# load the test data
(anchor, positive, negative) = next(iter(testDs))
(apDistance, anDistance) = siameseModel((anchor, positive, negative))
plt.figure(figsize=(10, 10))
rows = 4
for row in range(rows):
    plt.subplot(rows, 3, row * 3 + 1)
    plt.imshow(anchor[row])
    plt.axis("off")
    plt.title("Anchor image")
    plt.subplot(rows, 3, row * 3 + 2)
    plt.imshow(positive[row])
    plt.axis("off")
    plt.title(f"Positive distance: {apDistance[row]:0.2f}")
    plt.subplot(rows, 3, row * 3 + 3)
    plt.imshow(negative[row])
    plt.axis("off")
    plt.title(f"Negative distance: {anDistance[row]:0.2f}")

# check if the output directory exists, if it doesn't, then
# create it
if not os.path.exists(config.OUTPUT_PATH):
    os.makedirs(config.OUTPUT_PATH)

# save the inference image to disk
outputImagePath = config.OUTPUT_IMAGE_PATH
print(f"[INFO] saving the inference image to {outputImagePath}...")
plt.savefig(fname=outputImagePath)
```
Next, we define the modelPath where our trained Siamese model is stored (Line 35) and use the keras.models.load_model function to load our model (i.e., siameseNetwork) on Line 37. Now that we have our pre-trained siameseNetwork, we use it to build our siameseModel using the SiameseModel class (Lines 38-42).
Now that we have defined our test data pipeline, we can sample from the test set and see our face recognition model in action.
We use the iter() method to convert our testDs to an iterator and then use the next() method to sample a batch of test data. We store the output as a tuple (i.e., (anchor, positive, negative)) on Line 45.
We then pass these samples as input to the siameseModel and get the distance between the anchor and positive samples and the distance between the anchor and negative samples (i.e., (apDistance, anDistance)) on Line 46.
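These distances are what make the model useful for face verification. This tutorial only visualizes them, but as a hedged sketch, here is how they could be turned into a same/different decision (the threshold of 0.5 is an arbitrary example value; the next tutorial in this series covers choosing it properly with precision and recall):

```python
import tensorflow as tf

# hypothetical anchor-positive distances for a batch of four test triplets
apDistance = tf.constant([0.21, 0.68, 0.05, 0.47])

# declare a match when the embedding distance falls below the threshold
threshold = 0.5
matches = apDistance < threshold
print(matches.numpy())  # [ True False  True  True]
```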
We can now create a plot using matplotlib to visualize our samples. We start by defining our figure with figsize=(10, 10) using the plt.figure() function, as shown on Line 47. We also define the number of rows in our plot, which is equal to the batch size of our test loader (Line 48).
Next, we iterate over the rows, and for each row, we create a subplot for the anchor image, the positive image, and the negative image, as shown on Lines 49-61.
Finally, we prepare to save our inference images. We check if the output directory where we want to store our inference output exists, and if it doesn’t, then we create it (Lines 65 and 66).
We then define the outputImagePath (Line 69) and use the plt.savefig function to save our inference image (Line 71).
Summary
In this tutorial, we discussed how to train our Siamese network based face recognition model using Keras and TensorFlow. Specifically, we tried to understand how the modules we built in the previous parts of this series come together to form our face recognition application.
Furthermore, we discussed and implemented the code to predict new unseen face images in real-time using our trained face recognition model.
In the upcoming tutorials of this series, we will evaluate the performance of our face recognition model using different metrics.
Citation Information
Chandhok, S. “Training and Making Predictions with Siamese Networks and Triplet Loss,” PyImageSearch, P. Chugh, A. R. Gosthipaty, S. Huot, K. Kidriavsteva, R. Raha, and A. Thanki, eds., 2023, https://pyimg.co/avjyi
```bibtex
@incollection{Chandhok_2023_training_and_making,
  author = {Shivam Chandhok},
  title = {Training and Making Predictions with Siamese Networks and Triplet Loss},
  booktitle = {PyImageSearch},
  editor = {Puneet Chugh and Aritra Roy Gosthipaty and Susan Huot and Kseniia Kidriavsteva and Ritwik Raha and Abhishek Thanki},
  year = {2023},
  url = {https://pyimg.co/avjyi},
}
```