AugMix Breakdown (Part 1): Introduction to AugMix with KerasCV
This lesson is the 1st of a 3-part series on AugMix Breakdown:
- AugMix with KerasCV Breakdown (Part 1): Introduction to AugMix (this tutorial)
- AugMix with KerasCV Breakdown (Part 2): Steps Involved in AugMix
- AugMix with KerasCV Breakdown (Part 3): Implementation of AugMix
To learn the importance of AugMix and its formulation for a robust Augmentation pipeline, just keep reading.
Welcome to an exploration of AugMix, a sophisticated technique that revolutionizes how we approach data augmentation in Machine Learning. This tutorial will guide you through the intricacies of AugMix, a data augmentation method that enhances the robustness of your models and improves their predictive certainty.
We’ll delve into how AugMix applies a series of changes to your training data, creating a diverse array of image variations. This isn’t just a random process, but a carefully orchestrated operation that maintains the core essence of the original image.
Additionally, we’ll navigate the practical application of data augmentation techniques using KerasCV, a popular library in the Machine Learning community.
So, prepare to embark on a journey through the AugMix paper. Let’s dive in!
Configuring Your Development Environment
To follow this guide, you need to have the KerasCV library installed on your system.
Luckily, KerasCV is pip-installable:
$ pip install keras-cv
Need Help Configuring Your Development Environment?
All that said, are you:
- Short on time?
- Learning on your employer’s administratively locked system?
- Wanting to skip the hassle of fighting with the command line, package managers, and virtual environments?
- Ready to run the code immediately on your Windows, macOS, or Linux system?
Then join PyImageSearch University today!
Gain access to Jupyter Notebooks for this tutorial and other PyImageSearch guides pre-configured to run on Google Colab’s ecosystem right in your web browser! No installation required.
And best of all, these Jupyter Notebooks will run on Windows, macOS, and Linux!
Introduction: Augmentations
Picture this: You’re teaching a toddler to recognize a dog. You show them pictures of different dogs — big ones, small ones, black ones, white ones, and so forth, you get it. Now, they can identify a dog pretty well. But what if you showed them a picture of a dog in a Halloween costume (Figure 1)?
Or a dog upside down, or maybe a dog dressed in boots? They might get confused because these images of dogs look different than the ones they learned from.
In machine learning, particularly in tasks related to computer vision, this situation is quite common. We teach models to recognize objects in images (e.g., dogs) by showing them many different images. But in real life, objects can appear in all sorts of conditions that the model may not have seen during its training (e.g., different lighting, angles, or sizes). This is illustrated in Figure 2.
This is where data augmentation steps in! It’s a technique where we create new training data from our existing data but with modifications. We might flip an image, rotate it, zoom in or out, or even change the colors a bit. It’s like showing our model pictures of dogs in all possible costumes, angles, and lighting conditions.
By using augmentations, we’re helping our model understand that objects in images can appear differently and still be the same object. This makes our model more robust, flexible, and better at its job of identifying objects, no matter how they appear in new images. It’s like giving our model a superpower to see beyond the usual and understand the unusual!
Before we move ahead with the rest of the tutorial, let us see how we can define augmentations in Keras and KerasCV. The following code snippet demonstrates how to chain augmentation layers into a sequential pipeline:

import keras
import keras_cv

augmentations = keras.Sequential(
    [
        keras_cv.layers._name_of_the_augmentation_(),
        keras_cv.layers._name_of_the_augmentation_(),
        keras_cv.layers._name_of_the_augmentation_(),
    ]
)
But Wait, What Is KerasCV?
KerasCV is a library of modular, computer vision-oriented Keras components. It is built on top of Keras and is specifically designed for computer vision tasks (e.g., classification, segmentation, image data augmentation, etc.). This means it is the new repository on the block to keep an eye on if you are serious about computer vision.
Through this tutorial, we demonstrate how to get your hands dirty with keras_cv, starting with one of the most important data preprocessing techniques: Augmentations!
The Dataset
For this tutorial, we will use the very popular Oxford-IIIT Pet dataset. You can download it from Roboflow.
Roboflow has free tools for each stage of the computer vision pipeline that will streamline your workflows and supercharge your productivity.
Sign up or log in to your Roboflow account to access state-of-the-art dataset libraries and revolutionize your computer vision pipeline.
You can start by choosing your own datasets or using the PyImageSearch assorted library of useful datasets.
Bring data in any of 40+ formats to Roboflow, train using any state-of-the-art model architectures, deploy across multiple platforms (API, NVIDIA, browser, iOS, etc.), and connect to applications or 3rd party tools.
The following code snippet shows how to start with AugMix using the Oxford IIIT dataset. The output is shown in Figure 3.
import tensorflow as tf
import tensorflow_datasets as tfds
import keras_cv

# Load the dataset
dataset = tfds.load(name="oxford_iiit_pet", split="train")

# Resize the images to a common shape and batch them
images = dataset.map(
    lambda sample: tf.image.resize(sample["image"], (224, 224))
).batch(9)
batch_inputs = {"images": next(iter(images))}

# Build the AugMix layer and pass the images through the layer
augmix = keras_cv.layers.AugMix(value_range=(0, 255))
augmix_inputs = augmix(batch_inputs)

# Visualize the augmented images
keras_cv.visualization.plot_image_gallery(
    images=augmix_inputs["images"],
    value_range=(0, 255),
)
The Problem
AugMix was introduced by Hendrycks et al. (2019, AugMix: A Simple Data Processing Method to Improve Robustness and Uncertainty). Before we discuss why AugMix is important, let us first understand the problem it solves.
Machine learning models are like students. They learn from a book (training data), and they’re expected to use that knowledge to solve problems (make predictions) in the real world (deployment). It’s very important that the book is representative of the real world. Otherwise, they’ll struggle to solve real-world problems correctly.
Mismatch Between Training and Test Data
Now imagine if a student studied from a book about mammals but was then tested on birds. The student would likely fail because the test (test data) doesn’t match what they learned (training data). The same thing can happen with machine learning models. But even though this problem is common, it’s not studied enough. As a result, machine learning models often struggle when they encounter data that’s different from what they were trained on.
Some of the factors that contribute to this problem are the following.
Data Corruption
Just a little bit of change or “corruption” to the data (e.g., a mammal wearing a hat in the test data when the book only had mammals without hats) can confuse the machine learning model (classifier). There aren’t a lot of techniques available yet to make these models more resistant to such changes.
Uncertainty Quantification
Some machine learning models (e.g., probabilistic and Bayesian neural networks) also try to measure how sure they are about their predictions (uncertainty). But these models can also struggle when the data shifts because they weren’t trained on similar examples.
Corruption-Specific Training
Training a model focusing on specific corruptions (e.g., mammals wearing specific types of hats) encourages the model to only remember these specific corruptions and not learn a general ability to handle any kind of hat.
Data Augmentation
Some people propose aggressively changing the training data (aggressive data augmentation) to prepare the model for all kinds of changes it might encounter. However, this approach can require a lot of computational resources.
Trade-Off Between Accuracy, Robustness, and Uncertainty
It was found that many techniques that improve the model’s test score (clean accuracy) make it less able to handle data corruption (robustness). Similarly, techniques that make a model more robust often make it worse at estimating how sure it is about its predictions (uncertainty). So, there’s often a trade-off between accuracy, robustness, and uncertainty.
The Solution: AugMix
AugMix is a new technique that manages to hit a sweet spot. It improves the model’s ability to handle data corruption (robustness) and its ability to estimate how sure it is about its predictions (uncertainty), all while maintaining or even improving its test score (accuracy). And it does this on standard benchmark datasets, commonly used to test machine learning models.
How AugMix Works
- AugMix uses randomness (stochasticity) and a wide variety of changes (diverse augmentations) to the training data.
- It also uses a special type of loss function called Jensen-Shannon Divergence consistency loss, which is a mathematical way to measure how different two probability distributions are. Furthermore, it mixes multiple changed versions of an image to achieve great performance.
- It’s like showing the model different versions of the same image (e.g., a dog in different lighting conditions, from different angles, etc.) and teaching it that they all still belong to the same category (a dog). Different angles of a dog are shown in Figure 4.
AugMix Results on ImageNet
ImageNet is a popular benchmark dataset for image classifiers. On ImageNet, AugMix reduces perturbation instability from 57.2% to 37.4%.
Perturbation instability measures how much small changes to the input (e.g., slightly changing the color of an image) affect the model’s predictions.
A high perturbation instability means the model’s predictions change a lot, even for small changes to the input, which is not desirable. By reducing this from 57.2% to 37.4%, AugMix makes the model more stable and reliable.
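As a rough sketch of what instability measures, consider the flip rate used on perturbation sequences (as in the ImageNet-P benchmark): the fraction of consecutive, slightly perturbed frames on which the model’s top-1 prediction changes. The helper below is a simplified stand-in, not the paper’s exact metric:

```python
import numpy as np

def flip_rate(predictions):
    # Fraction of consecutive frames in a perturbation sequence
    # where the model's top-1 predicted class changes
    preds = np.asarray(predictions)
    flips = np.sum(preds[1:] != preds[:-1])
    return flips / (len(preds) - 1)

# Top-1 classes predicted on five gradually perturbed versions of one image
print(flip_rate([3, 3, 3, 3, 3]))  # 0.0 -- perfectly stable predictions
print(flip_rate([3, 5, 3, 7, 3]))  # 1.0 -- every perturbation flips the label
```

A robust model keeps this number low: its label should not bounce around as the input is nudged.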
Why Is AugMix Important?
Before concluding this introductory tutorial on AugMix, let us also figure out why AugMix matters so much. Let’s look at Figure 5, where we see that AugMix can preserve much of the original image semantics compared to other data augmentation methodologies.
Let’s revisit the pointers we learned through this tutorial to formulate why AugMix matters for a data augmentation pipeline.
So, What Is AugMix Again?
This is a technique used to change the training data (data augmentation) to make the model stronger (improve robustness) and better at estimating how sure it is about its predictions (uncertainty estimates). The great thing about AugMix is that it can be easily added to existing training procedures (training pipelines).
Components of AugMix
AugMix is characterized by two main components. The first is using simple changes (augmentation operations) to the images in the training data. The second is a special mathematical measure called consistency loss, which we’ll explain later.
Stochastic Sampling
AugMix doesn’t just apply one change to an image, but stacks many different changes to the same image. The changes it uses are decided randomly (sampled stochastically). This produces a wide variety of changed versions of the same image.
Consistent Embedding
When the model looks at an image, it converts it into a numerical representation called an embedding. AugMix enforces that the model produces a similar embedding for all the changed versions of the same image. This makes sense because, regardless of the changes, it’s still the same image.
Jensen-Shannon Divergence Consistency Loss
This is the special mathematical measure (consistency loss) used by AugMix. It measures how different two probability distributions are. In this case, it’s used to measure how different the model’s predictions are for the different, changed versions of the same image. The goal is to make this divergence as small as possible, which means the model is consistent in its predictions for the changed versions of the same image.
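To make this concrete, here is a small NumPy sketch of the Jensen-Shannon consistency term from the paper: it compares the model’s predicted class distributions for the clean image and two augmented views against their average. This is a toy three-class example with made-up probabilities, not the full training loss:

```python
import numpy as np

def kl_divergence(p, q):
    # Kullback-Leibler divergence between two discrete distributions
    return np.sum(p * np.log(p / q))

def js_consistency_loss(p_clean, p_aug1, p_aug2):
    # Mixture distribution M = (p_clean + p_aug1 + p_aug2) / 3
    m = (p_clean + p_aug1 + p_aug2) / 3.0
    # Average KL divergence of each prediction from the mixture
    return (
        kl_divergence(p_clean, m)
        + kl_divergence(p_aug1, m)
        + kl_divergence(p_aug2, m)
    ) / 3.0

# Softmax outputs for a clean image and two AugMix views (toy values)
p_clean = np.array([0.80, 0.10, 0.10])
p_aug1 = np.array([0.70, 0.20, 0.10])
p_aug2 = np.array([0.75, 0.15, 0.10])

loss = js_consistency_loss(p_clean, p_aug1, p_aug2)
# Identical predictions give a loss of zero; the more the three
# distributions disagree, the larger the loss
```

Minimizing this term during training pushes the network to give the same answer for every augmented view of an image.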
Mixing Augmentations
This refers to applying several changes (augmentations) to the same image at once. This can produce a wide range of different versions of the same image. This diversity is crucial for making the model stronger (inducing robustness). A common issue with deep learning models is that they tend to memorize (overfit) the specific changes they see while training (fixed augmentations) instead of learning to handle changes in general.
Previous Methods
Earlier techniques tried to increase this diversity by applying a series of changes one after the other (composing augmentation primitives in a chain). But this can quickly distort the image too much (causing the image to degrade and drift off the data manifold), making it harder for the model to learn from it. You can see an illustration of this in Figure 6.
Mitigating Image Degradation
AugMix has a clever solution to this issue. It maintains the diversity of changes without distorting the image too much by mixing the results of several changes (augmentation chains). It creates a blend (convex combination) of several differently changed versions of the same image. This keeps the diversity of changes high, but ensures the changed images still resemble the original image, making it easier for the model to learn from them.
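A minimal NumPy sketch of this mixing scheme, following the pseudocode in the AugMix paper (Dirichlet-sampled convex weights over several augmentation chains, plus a Beta-sampled blend with the original image). The toy "operations" here are stand-ins for real augmentation primitives, and the function name and defaults are illustrative:

```python
import numpy as np

rng = np.random.default_rng(seed=42)

def augmix_image(image, operations, width=3, depth=2, alpha=1.0):
    # Convex mixing weights for the augmentation chains (Dirichlet sample)
    weights = rng.dirichlet([alpha] * width)
    mixed = np.zeros_like(image, dtype=np.float32)
    for i in range(width):
        chain = image.astype(np.float32)
        # Each chain applies `depth` randomly chosen operations in sequence
        for _ in range(depth):
            op = operations[rng.integers(len(operations))]
            chain = op(chain)
        mixed += weights[i] * chain
    # Blend the mixed result with the original image (Beta sample),
    # keeping the output close to the data manifold
    m = rng.beta(alpha, alpha)
    return m * image + (1.0 - m) * mixed

# Toy stand-ins for augmentation primitives (flip, translate, invert)
operations = [
    lambda x: np.flip(x, axis=1),
    lambda x: np.roll(x, shift=4, axis=0),
    lambda x: 255.0 - x,
]

image = rng.uniform(0, 255, size=(32, 32, 3)).astype("float32")
augmented = augmix_image(image, operations)
```

Because the output is a convex combination of the original and its augmented chains, it stays in the same value range and keeps the original shape, yet every call produces a different blend.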
Summary
In this tutorial, we have delved into the significant role of data augmentation in the Machine Learning pipeline, explored how to generate augmentations using KerasCV, and introduced AugMix as a powerful data augmentation methodology.
Data augmentation is a crucial step in the Machine Learning pipeline, enhancing the diversity of our training data and thereby improving the robustness of our models. By generating variations of our training data, we can better equip our models to handle real-world data variability and prevent overfitting.
We also explored how to generate augmentations using KerasCV, a popular library in the Machine Learning community. This hands-on experience provided us with practical insights into implementing data augmentation techniques.
References
- AugMix paper: https://arxiv.org/abs/1912.02781
- Official code released at: https://github.com/google-research/augmix
@misc{wood2022kerascv,
  title = {KerasCV},
  author = {Wood, Luke and Tan, Zhenyu and Stenbit, Ian and Bischof, Jonathan and Zhu, Scott and Chollet, Fran\c{c}ois and others},
  year = {2022},
  howpublished = {\url{https://github.com/keras-team/keras-cv}},
}
Citation Information
A. R. Gosthipaty and R. Raha. “AugMix with KerasCV Breakdown (Part 1): Introduction to AugMix,” PyImageSearch, P. Chugh, S. Huot, and K. Kidriavsteva, eds., 2023, https://pyimg.co/n2asx
@incollection{ARG-RR_2023_AugMix1,
  author = {Aritra Roy Gosthipaty and Ritwik Raha},
  title = {{AugMix} with {KerasCV} Breakdown (Part 1): Introduction to {AugMix}},
  booktitle = {PyImageSearch},
  editor = {Puneet Chugh and Susan Huot and Kseniia Kidriavsteva},
  year = {2023},
  url = {https://pyimg.co/n2asx},
}