AugMix Breakdown (Part 1): Introduction to AugMix with KerasCV
This lesson is the 1st of a 3-part series on AugMix Breakdown:
- AugMix with KerasCV Breakdown (Part 1): Introduction to AugMix (this tutorial)
- AugMix with KerasCV Breakdown (Part 2): Steps Involved in AugMix
- AugMix with KerasCV Breakdown (Part 3): Implementation of AugMix
To learn the importance of AugMix and its formulation for a robust Augmentation pipeline, just keep reading.
Welcome to an exploration of AugMix, a sophisticated technique that revolutionizes how we approach data augmentation in Machine Learning. This tutorial will guide you through the intricacies of AugMix, a data augmentation method that enhances the robustness of your models and improves their predictive certainty.
We’ll delve into how AugMix applies a series of changes to your training data, creating a diverse array of image variations. This isn’t just a random process, but a carefully orchestrated operation that maintains the core essence of the original image.
Additionally, we’ll navigate the practical application of data augmentation techniques using KerasCV, a popular library in the Machine Learning community.
So, prepare to embark on a journey through the AugMix paper. Let’s dive in!
Configuring Your Development Environment
To follow this guide, you need to have the KerasCV library installed on your system.
Luckily, KerasCV is pip-installable:
$ pip install keras-cv
Need Help Configuring Your Development Environment?
All that said, are you:
- Short on time?
- Learning on your employer’s administratively locked system?
- Wanting to skip the hassle of fighting with the command line, package managers, and virtual environments?
- Ready to run the code immediately on your Windows, macOS, or Linux system?
Then join PyImageSearch University today!
Gain access to Jupyter Notebooks for this tutorial and other PyImageSearch guides pre-configured to run on Google Colab’s ecosystem right in your web browser! No installation required.
And best of all, these Jupyter Notebooks will run on Windows, macOS, and Linux!
Introduction: Augmentations
Picture this: You’re teaching a toddler to recognize a dog. You show them pictures of different dogs — big ones, small ones, black ones, white ones, and so forth, you get it. Now, they can identify a dog pretty well. But what if you showed them a picture of a dog in a Halloween costume (Figure 1)?
Or a dog upside down, or maybe a dog dressed in boots? They might get confused because these images of dogs look different than the ones they learned from.
In machine learning, particularly in tasks related to computer vision, this situation is quite common. We teach models to recognize objects in images (e.g., dogs) by showing them many different images. But in real life, objects can appear in all sorts of conditions that the model may not have seen during its training (e.g., different lighting, angles, or sizes). This is illustrated in Figure 2.
This is where data augmentation steps in! It’s a technique where we create new training data from our existing data but with modifications. We might flip an image, rotate it, zoom in or out, or even change the colors a bit. It’s like showing our model pictures of dogs in all possible costumes, angles, and lighting conditions.
By using augmentations, we’re helping our model understand that objects in images can appear differently and still be the same object. This makes our model more robust, flexible, and better at its job of identifying objects, no matter how they appear in new images. It’s like giving our model a superpower to see beyond the usual and understand the unusual!
Before we move ahead with the rest of the tutorial, let us see how we can define augmentations in Keras and KerasCV. The following code snippet demonstrates how to chain augmentation layers into a sequential pipeline:

import keras
import keras_cv

augmentations = keras.Sequential(
    [
        keras_cv.layers._name_of_the_augmentation_(),
        keras_cv.layers._name_of_the_augmentation_(),
        keras_cv.layers._name_of_the_augmentation_(),
    ]
)
But Wait, What Is KerasCV?
KerasCV is a library of modular, computer vision-oriented Keras components. It is built on top of Keras and is specifically designed for computer vision tasks (e.g., classification, segmentation, image data augmentation, etc.). This means it is the new repository on the block to keep an eye on if you are serious about computer vision.
Through this tutorial, we demonstrate how to get your hands dirty with keras_cv, starting with one of the most important data preprocessing techniques: Augmentations!
The Dataset
For this tutorial, we will use the very popular Oxford-IIIT Pet dataset. You can download it from Roboflow.
Roboflow has free tools for each stage of the computer vision pipeline that will streamline your workflows and supercharge your productivity.
Sign up or log in to your Roboflow account to access state-of-the-art dataset libraries and revolutionize your computer vision pipeline.
You can start by choosing your own datasets or using the PyImageSearch assorted library of useful datasets.
Bring data in any of 40+ formats to Roboflow, train using any state-of-the-art model architectures, deploy across multiple platforms (API, NVIDIA, browser, iOS, etc.), and connect to applications or 3rd party tools.
The following code snippet shows how to start with AugMix using the Oxford IIIT dataset. The output is shown in Figure 3.
import tensorflow as tf
import tensorflow_datasets as tfds
import keras_cv

# Load the dataset
dataset = tfds.load(name="oxford_iiit_pet", split="train")

# Resize the images to a common shape and batch them
images = dataset.map(
    lambda sample: tf.image.resize(sample["image"], (224, 224))
).batch(9)
batch_inputs = {"images": next(iter(images))}

# Build the AugMix layer and pass the images through the layer
augmix = keras_cv.layers.AugMix(value_range=(0, 255))
augmix_inputs = augmix(batch_inputs)

# Visualize the augmented images
keras_cv.visualization.plot_image_gallery(
    images=augmix_inputs["images"],
    value_range=(0, 255),
)
The Problem
AugMix was introduced by Hendrycks et al. (2019, AugMix: A Simple Data Processing Method to Improve Robustness and Uncertainty). Before we discuss why AugMix is important, let us first understand the problem it solves.
Machine learning models are like students. They learn from a book (training data), and they’re expected to use that knowledge to solve problems (make predictions) in the real world (deployment). It’s very important that the book is representative of the real world. Otherwise, they’ll struggle to solve real-world problems correctly.
Mismatch Between Training and Test Data
Now imagine if a student studied from a book about mammals but was then tested on birds. The student would likely fail because the test (test data) doesn’t match what they learned (training data). The same thing can happen with machine learning models. But even though this problem is common, it’s not studied enough. As a result, machine learning models often struggle when they encounter data that’s different from what they were trained on.
Some of the factors that contribute to this problem are the following.
Data Corruption
Just a little bit of change or “corruption” to the data (e.g., a mammal wearing a hat in the test data when the book only had mammals without hats) can confuse the machine learning model (classifier). There aren’t a lot of techniques available yet to make these models more resistant to such changes.
Uncertainty Quantification
Some machine learning models (e.g., probabilistic and Bayesian neural networks) also try to measure how sure they are about their predictions (uncertainty). But these models can also struggle when the data shifts because they weren’t trained on similar examples.
Corruption-Specific Training
Training a model focusing on specific corruptions (e.g., mammals wearing specific types of hats) encourages the model to only remember these specific corruptions and not learn a general ability to handle any kind of hat.
Data Augmentation
Some people propose aggressively changing the training data (aggressive data augmentation) to prepare the model for all kinds of changes it might encounter. However, this approach can require a lot of computational resources.
Trade-Off Between Accuracy, Robustness, and Uncertainty
It was found that many techniques that improve the model’s test score (clean accuracy) make it less able to handle data corruption (robustness). Similarly, techniques that make a model more robust often make it worse at estimating how sure it is about its predictions (uncertainty). So, there’s often a trade-off between accuracy, robustness, and uncertainty.
The Solution: AugMix
AugMix is a new technique that manages to hit a sweet spot. It improves the model’s ability to handle data corruption (robustness) and its ability to estimate how sure it is about its predictions (uncertainty), all while maintaining or even improving its test score (accuracy). And it does this on standard benchmark datasets, commonly used to test machine learning models.
How AugMix Works
- AugMix uses randomness (stochasticity) and a wide variety of changes (diverse augmentations) to the training data.
- It also uses a special type of loss function called Jensen-Shannon Divergence consistency loss, which is a mathematical way to measure how different two probability distributions are. Furthermore, it mixes multiple changed versions of an image to achieve great performance.
- It’s like showing the model different versions of the same image (e.g., a dog in different lighting conditions, from different angles, etc.) and teaching it that they all still belong to the same category (a dog). Different angles of a dog are shown in Figure 4.
AugMix Results on ImageNet
ImageNet is a popular benchmark dataset for image classifiers. On ImageNet, AugMix reduces perturbation instability from 57.2% to 37.4%.
Perturbation instability measures how much small changes to the input (e.g., slightly changing the color of an image) affect the model’s predictions.
A high perturbation instability means the model’s predictions change a lot, even for small changes to the input, which is not desirable. By reducing this from 57.2% to 37.4%, AugMix makes the model more stable and reliable.
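As a rough sketch of what instability measures, consider the flip rate used on perturbation sequences (as in the ImageNet-P benchmark): the fraction of consecutive, slightly perturbed frames on which the model’s top-1 prediction changes. The helper below is a simplified stand-in, not the paper’s exact metric:

```python
import numpy as np

def flip_rate(predictions):
    # Fraction of consecutive frames in a perturbation sequence
    # where the model's top-1 predicted class changes
    preds = np.asarray(predictions)
    flips = np.sum(preds[1:] != preds[:-1])
    return flips / (len(preds) - 1)

# Top-1 classes predicted on five gradually perturbed versions of one image
print(flip_rate([3, 3, 3, 3, 3]))  # 0.0 -- perfectly stable predictions
print(flip_rate([3, 5, 3, 7, 3]))  # 1.0 -- every perturbation flips the label
```

A robust model keeps this number low: its label should not bounce around as the input is nudged.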
Why Is AugMix Important?
Before concluding this introductory tutorial on AugMix, let us also figure out why AugMix matters so much. Let’s look at Figure 5, where we see that AugMix can preserve much of the original image semantics compared to other data augmentation methodologies.
Let’s revisit the pointers we learned through this tutorial to formulate why AugMix matters for a data augmentation pipeline.
So, What Is AugMix Again?
This is a technique used to change the training data (data augmentation) to make the model stronger (improve robustness) and better at estimating how sure it is about its predictions (uncertainty estimates). The great thing about AugMix is that it can be easily added to existing training procedures (training pipelines).
Components of AugMix
AugMix is characterized by two main components. The first is using simple changes (augmentation operations) to the images in the training data. The second is a special mathematical measure called consistency loss, which we’ll explain later.
Stochastic Sampling
AugMix doesn’t just apply one change to an image, but stacks many different changes to the same image. The changes it uses are decided randomly (sampled stochastically). This produces a wide variety of changed versions of the same image.
Consistent Embedding
When the model looks at an image, it converts it into a numerical representation called an embedding. AugMix enforces that the model produces a similar embedding for all the changed versions of the same image. This makes sense because, regardless of the changes, it’s still the same image.
Jensen-Shannon Divergence Consistency Loss
This is the special mathematical measure (consistency loss) used by AugMix. It measures how different two probability distributions are. In this case, it’s used to measure how different the model’s predictions are for the different, changed versions of the same image. The goal is to make this divergence as small as possible, which means the model is consistent in its predictions for the changed versions of the same image.
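To make this concrete, here is a small NumPy sketch of the Jensen-Shannon consistency term from the paper: it compares the model’s predicted class distributions for the clean image and two augmented views against their average. This is a toy three-class example with made-up probabilities, not the full training loss:

```python
import numpy as np

def kl_divergence(p, q):
    # Kullback-Leibler divergence between two discrete distributions
    return np.sum(p * np.log(p / q))

def js_consistency_loss(p_clean, p_aug1, p_aug2):
    # Mixture distribution M = (p_clean + p_aug1 + p_aug2) / 3
    m = (p_clean + p_aug1 + p_aug2) / 3.0
    # Average KL divergence of each prediction from the mixture
    return (
        kl_divergence(p_clean, m)
        + kl_divergence(p_aug1, m)
        + kl_divergence(p_aug2, m)
    ) / 3.0

# Softmax outputs for a clean image and two AugMix views (toy values)
p_clean = np.array([0.80, 0.10, 0.10])
p_aug1 = np.array([0.70, 0.20, 0.10])
p_aug2 = np.array([0.75, 0.15, 0.10])

loss = js_consistency_loss(p_clean, p_aug1, p_aug2)
# Identical predictions give a loss of zero; the more the three
# distributions disagree, the larger the loss
```

Minimizing this term during training pushes the network to give the same answer for every augmented view of an image.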
Mixing Augmentations
This refers to applying several changes (augmentations) to the same image at once. This can produce a wide range of different versions of the same image. This diversity is crucial for making the model stronger (inducing robustness). A common issue with deep learning models is that they tend to memorize (overfit) the specific changes they see while training (fixed augmentations) instead of learning to handle changes in general.
Previous Methods
Earlier techniques tried to increase this diversity by applying a series of changes one after the other (composing augmentation primitives in a chain). But this can quickly distort the image too much (causing the image to degrade and drift off the data manifold), making it harder for the model to learn from it. You can see an illustration of this in Figure 6.
Mitigating Image Degradation
AugMix has a clever solution to this issue. It maintains the diversity of changes without distorting the image too much by mixing the results of several changes (augmentation chains). It creates a blend (convex combination) of several differently changed versions of the same image. This keeps the diversity of changes high, but ensures the changed images still resemble the original image, making it easier for the model to learn from them.
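A minimal NumPy sketch of this mixing scheme, following the pseudocode in the AugMix paper (Dirichlet-sampled convex weights over several augmentation chains, plus a Beta-sampled blend with the original image). The toy "operations" here are stand-ins for real augmentation primitives, and the function name and defaults are illustrative:

```python
import numpy as np

rng = np.random.default_rng(seed=42)

def augmix_image(image, operations, width=3, depth=2, alpha=1.0):
    # Convex mixing weights for the augmentation chains (Dirichlet sample)
    weights = rng.dirichlet([alpha] * width)
    mixed = np.zeros_like(image, dtype=np.float32)
    for i in range(width):
        chain = image.astype(np.float32)
        # Each chain applies `depth` randomly chosen operations in sequence
        for _ in range(depth):
            op = operations[rng.integers(len(operations))]
            chain = op(chain)
        mixed += weights[i] * chain
    # Blend the mixed result with the original image (Beta sample),
    # keeping the output close to the data manifold
    m = rng.beta(alpha, alpha)
    return m * image + (1.0 - m) * mixed

# Toy stand-ins for augmentation primitives (flip, translate, invert)
operations = [
    lambda x: np.flip(x, axis=1),
    lambda x: np.roll(x, shift=4, axis=0),
    lambda x: 255.0 - x,
]

image = rng.uniform(0, 255, size=(32, 32, 3)).astype("float32")
augmented = augmix_image(image, operations)
```

Because the output is a convex combination of the original and its augmented chains, it stays in the same value range and keeps the original shape, yet every call produces a different blend.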
Summary
In this tutorial, we have delved into the significant role of data augmentation in the Machine Learning pipeline, explored how to generate augmentations using KerasCV, and introduced AugMix as a powerful data augmentation methodology.
Data augmentation is a crucial step in the Machine Learning pipeline, enhancing the diversity of our training data and thereby improving the robustness of our models. By generating variations of our training data, we can better equip our models to handle real-world data variability and prevent overfitting.
We also explored how to generate augmentations using KerasCV, a popular library in the Machine Learning community. This hands-on experience provided us with practical insights into implementing data augmentation techniques.
References
- AugMix paper: https://arxiv.org/abs/1912.02781
- Official code released at: https://github.com/google-research/augmix
@misc{wood2022kerascv,
  title = {KerasCV},
  author = {Wood, Luke and Tan, Zhenyu and Stenbit, Ian and Bischof, Jonathan and Zhu, Scott and Chollet, Fran\c{c}ois and others},
  year = {2022},
  howpublished = {\url{https://github.com/keras-team/keras-cv}},
}
Citation Information
A. R. Gosthipaty and R. Raha. “AugMix with KerasCV Breakdown (Part 1): Introduction to AugMix,” PyImageSearch, P. Chugh, S. Huot, and K. Kidriavsteva, eds., 2023, https://pyimg.co/n2asx
@incollection{ARG-RR_2023_AugMix1,
  author = {Aritra Roy Gosthipaty and Ritwik Raha},
  title = {{AugMix} with {KerasCV} Breakdown (Part 1): Introduction to {AugMix}},
  booktitle = {PyImageSearch},
  editor = {Puneet Chugh and Susan Huot and Kseniia Kidriavsteva},
  year = {2023},
  url = {https://pyimg.co/n2asx},
}