Table of Contents
- Introduction to Recurrent Neural Networks with Keras and TensorFlow
- Introduction
- Configuring Your Development Environment
- Having Problems Configuring Your Development Environment?
- Project Structure
- What Is Sequential Data?
- A Caveat: Masking and Padding
- Modeling Sequential Data with MLPs
- The Recurrence Formula
- Recurrent Neural Network (an overview)
- Training and Visualizations
- Loading and Inference
- Summary
Introduction to Recurrent Neural Networks with Keras and TensorFlow
It’s a standard Monday morning for you. You are sitting at your workstation, waiting for another computer vision problem statement. By now, you have become an absolute maestro at computer vision (CV) problems.
But to your horror, your company hands you a sequential text classification problem instead of a CV one. They say they are expanding, and so, in front of you lies an absolutely alien domain of which you know nothing.
Thankfully, we would never want that to happen to you. So, finally, due to popular demand, we bring you our first tutorial on dealing with sequential text data: Recurrent Neural Networks (RNNs).
The world of deep learning has progressed immensely, with Transformer models ruling both NLP and CV domains. But to understand Transformers, it is important to grasp the intuition behind RNNs, your gateway to working with sequential data.
Oh, we are really excited about this one! This tutorial marks our first venture into deep learning with sequential data. We have been a vision-first firm for a long time now, and it is about time we learn and process the information provided by the language world.
In this tutorial, we talk about sequential data and how to model it. We build a Recurrent Neural Network and train it on a well-defined application of the real world.
This lesson is the first in a 3-part series on NLP 102:
- Introduction to Recurrent Neural Networks with Keras and TensorFlow (today’s tutorial)
- Long Short-Term Memory Networks
- Neural Machine Translation
To learn how to build a Recurrent Neural Network with TensorFlow and Keras, just keep reading.
Introduction
Imagine you have been employed by a movie critique firm. Movies receive a lot of reviews from all over the globe. Your mission, should you choose to accept it, is to predict each review’s sentiment to catch the audience’s drift.
The task is simple: given a movie review, classify it as either a positive or a negative review. As it happens, this task is known as Sentiment Classification in the deep learning world.
Don’t confuse this with a computer vision problem. We are not reading facial expressions by employing our old friend OpenCV. Here, we deal with text data, specifically volumes of text data. To mimic the task, we chose imdb_reviews, a dataset of 25,000 highly polar movie reviews.
Configuring Your Development Environment
To follow this guide, you need to have the TensorFlow, TensorFlow Datasets, and matplotlib libraries installed on your system.
Luckily, all three are pip-installable:
```
$ pip install tensorflow
$ pip install tensorflow_datasets
$ pip install matplotlib
```
Having Problems Configuring Your Development Environment?
All that said, are you:
- Short on time?
- Learning on your employer’s administratively locked system?
- Wanting to skip the hassle of fighting with the command line, package managers, and virtual environments?
- Ready to run the code right now on your Windows, macOS, or Linux system?
Then join PyImageSearch University today!
Gain access to Jupyter Notebooks for this tutorial and other PyImageSearch guides that are pre-configured to run on Google Colab’s ecosystem right in your web browser! No installation required.
And best of all, these Jupyter Notebooks will run on Windows, macOS, and Linux!
Project Structure
We first need to review our project directory structure.
Start by accessing the “Downloads” section of this tutorial to retrieve the source code and example images.
From there, take a look at the directory structure:
```
$ tree --dirsfirst
.
|____ output
| |____ lstm_plot.png
| |____ rnn_plot.png
|____ pyimagesearch
| |____ plot.py
| |____ save_load.py
| |____ config.py
| |____ standardization.py
| |____ __init__.py
| |____ model.py
| |____ dataset.py
|____ train.py
|____ inference.py
|____ terminal_output.txt
```
In the pyimagesearch directory, we have:

- plot.py: Script to help us visualize outputs.
- save_load.py: Script to load and save model weights.
- config.py: Script containing the entire configuration pipeline.
- standardization.py: Script containing utilities to help us prepare the data.
- __init__.py: Script which turns the directory into a Python package.
- model.py: Script housing the model.
- dataset.py: Script to help us load the data into our project.
In the core directory, we have two scripts:

- train.py: Script to train the RNN model.
- inference.py: Script to draw inference from our trained model.
Note: The code download for this blog post contains code snippets for Long Short-Term Memory (LSTM) as well. These will be covered in the following blog post on LSTM.
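The scripts that follow pull their constants (dataset path, batch size, vocabulary size, learning rate, etc.) from config.py. We do not reproduce that file here, but for orientation, a minimal sketch of what such a configuration might look like is shown below. Every value is an assumption for illustration only; the actual numbers ship with the code download.

```python
# config.py -- illustrative sketch only; the real values ship with the downloads
import os

# path where tensorflow_datasets will store the IMDB reviews dataset
DATASET_PATH = "dataset"

# data pipeline constants (assumed values)
BATCH_SIZE = 1024
BUFFER_SIZE = 1024

# text vectorization constants (assumed values)
VOCAB_SIZE = 10000
MAX_SEQUENCE_LENGTH = 100

# training constants (assumed values)
LR = 1e-3
EPOCHS = 10

# output paths for plots, saved models, and the vectorization layer
OUTPUT_PATH = "output"
RNN_PLOT = os.path.join(OUTPUT_PATH, "rnn_plot.png")
LSTM_PLOT = os.path.join(OUTPUT_PATH, "lstm_plot.png")
RNN_MODEL_PATH = os.path.join(OUTPUT_PATH, "rnn_model")
LSTM_MODEL_PATH = os.path.join(OUTPUT_PATH, "lstm_model")
TEXT_VEC_PATH = os.path.join(OUTPUT_PATH, "text_vectorizer")
```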
What Is Sequential Data?
Before we go into movie reviews and understand their sentiment, we first need to understand the data.
We are all Computer Vision engineers here and know how images are an array of numbers. But how do we interpret a corpus of text?
If we think about it, any text can easily be represented as a sequence of characters, as shown in Figure 2. Notice how text is not just a collection but a sequence of characters. This signifies that the order in which the characters appear is just as important as the characters themselves.
Any data where the order or sequence is as essential as the data itself is called Sequential Data. Some examples of Sequential Data are sentences, stock market data, audio data, etc.
Let us now try to understand how movie reviews relate to Sequential Data. We first open the dataset.py file, which helps load the dataset onto disk.
```python
# import the necessary packages
import tensorflow_datasets as tfds
```
We begin with the necessary imports on Line 2. The dataset we will use (imdb_reviews) is already in the tensorflow_datasets package.
Next, we define the get_imdb_dataset function to load the dataset onto the disk.
```python
def get_imdb_dataset(folderName, batchSize, bufferSize, autotune, test=False):
    # check whether the test flag is true
    if test:
        # load the test dataset, batch it, and prefetch it
        testDs = tfds.load(
            name="imdb_reviews",
            data_dir=folderName,
            as_supervised=True,
            shuffle_files=True,
            split="test"
        )
        testDs = testDs.batch(batchSize).prefetch(autotune)

        # return the test dataset
        return testDs

    # otherwise we will be loading the training and validation dataset
    else:
        # load the training and validation dataset
        (trainDs, valDs) = tfds.load(
            name="imdb_reviews",
            data_dir=folderName,
            as_supervised=True,
            shuffle_files=True,
            split=["train[:90%]", "train[90%:]"]
        )

        # shuffle, batch, and prefetch the train and the validation
        # dataset
        trainDs = (trainDs
            .shuffle(bufferSize)
            .batch(batchSize)
            .prefetch(autotune)
        )
        valDs = (valDs
            .shuffle(bufferSize)
            .batch(batchSize)
            .prefetch(autotune)
        )

        # return the train and the validation dataset
        return (trainDs, valDs)
```
This function takes in the following inputs:
- folderName: the path on the local system to which the dataset will be downloaded
- batchSize: the size in which we want to batch our data
- bufferSize: the size of the buffer from which elements are randomly selected
- autotune: a constant provided by the tf.data API for space optimization while prefetching
- test: a Boolean flag used to determine if the dataset to be loaded is for testing or training purposes
Lines 7-16 execute when the test flag is set to True. This code snippet downloads (or uses the cached) test split of the dataset, then batches and prefetches it.
Lines 21-45 execute when the test flag is set to False. This means that the dataset will be used for training. The code snippet downloads (or uses the cached) train and validation splits of the dataset, then shuffles, batches, and prefetches them.

The only difference between the two clauses (training and testing) is that we shuffle the training dataset while leaving the testing dataset unshuffled.
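As a quick sanity check, here is how one might call the function and peek at a single raw batch. The argument values below are placeholders for illustration; in the actual pipeline they come from config.py.

```python
# illustrative usage of get_imdb_dataset (placeholder argument values)
import tensorflow as tf

(trainDs, valDs) = get_imdb_dataset(folderName="dataset", batchSize=64,
    bufferSize=1000, autotune=tf.data.AUTOTUNE, test=False)

# grab one batch of raw (text, label) pairs and inspect it
for (texts, labels) in trainDs.take(1):
    print(texts.shape, labels.shape)         # (64,) (64,)
    print(texts[0].numpy()[:80], labels[0].numpy())
```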
But having the data at hand and loading, batching, and prefetching it is not enough, primarily because the data still looks like the raw review shown in Figure 3.
To make this data usable:
- We need to remove the unnecessary characters (standardization)
- Tokenize the dataset
- Vectorize the tokens
We will follow each of the steps gradually. First, let us see how to standardize the dataset for our purposes.
But what is standardization? It removes unnecessary punctuation and HTML tags from the text corpus. It is a very important pre-processing step in any text data pipeline.
We create a custom standardization function in the standardization.py file.
```python
# import the necessary packages
import tensorflow as tf
import string
import re

def custom_standardization(inputData):
    # transform everything to lowercase
    lowercase = tf.strings.lower(inputData)

    # strip off the html break point and punctuations and return it
    strippedHtml = tf.strings.regex_replace(lowercase, "<br />", " ")
    strippedPunctuation = tf.strings.regex_replace(strippedHtml,
        f"[{re.escape(string.punctuation)}]", "")
    return strippedPunctuation
```
On Lines 2-4, we import the necessary packages.

Next, we define the custom_standardization function to standardize our dataset. The text is first converted to lowercase on Line 8. The HTML break tags and punctuation are removed on Lines 11-13, and finally, the standardized text is returned on Line 14.
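To get a feel for what the function does, here is a quick, made-up round trip (the review string is invented for illustration):

```python
# illustrative check of custom_standardization on a made-up review
import tensorflow as tf

sample = tf.constant(["This movie was GREAT!<br />Would watch again..."])
print(custom_standardization(sample).numpy())
# expected output along the lines of:
# [b'this movie was great would watch again']
```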
After standardizing our text dataset, the next step is to tokenize and vectorize it. Tokenization refers to the process of splitting the text into units called tokens. A token can be a character, a word, or a sentence, chosen according to the task at hand.
For our sentiment classifier, we will consider a token to be a word.
Can we feed tokens to the deep learning model after tokenization? Not just yet.
We still need to convert these tokens into numbers. The process of representing tokens as numbers is called text vectorization. With vectorization, each token (word) in our text corpus will be represented by a number.
The following are the basic steps of text vectorization:
- We will create a dictionary of all the unique words from the text corpus and assign a unique number to every word. This dictionary is called the vocabulary.
- We will then substitute the tokens (words) in the dataset with their respective unique numbers from the vocabulary dictionary.
The entire process is demonstrated in Figure 4.
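To make the two steps concrete, here is a tiny, self-contained sketch (toy sentences, not the IMDB data) of building a vocabulary and substituting each word token with its index:

```python
# toy word-level tokenization and vectorization (illustrative only)
corpus = ["practice makes perfect", "perfection is a dream worth chasing"]

# step 1: build the vocabulary -- a unique integer for every unique word
# (index 0 is reserved for padding, so real tokens start at 1)
vocab = {}
for sentence in corpus:
    for token in sentence.split():
        if token not in vocab:
            vocab[token] = len(vocab) + 1

# step 2: substitute every token with its number from the vocabulary
vectorized = [[vocab[token] for token in sentence.split()]
    for sentence in corpus]
print(vocab)
print(vectorized)  # e.g., [[1, 2, 3], [4, 5, 6, 7, 8, 9]]
```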
A Caveat: Masking and Padding
Padding and masking are very useful techniques that will optimize our training. But what exactly do they do, and why do we need them?
Imagine you have a corpus of text. The sentences within this corpus are of different lengths, but all the sequences in a training batch must share a common shape. This means that every sentence has to be padded to a common length.
Practice makes perfect
Perfection is a dream worth chasing
Becomes …
Practice makes perfect [pad] [pad] [pad] [pad]
Perfection is a dream worth chasing [pad]
Now, these [pad] tokens are of no use to the model except for standardizing the length. So we mask these tokens while training. The process of masking and padding is demonstrated in Figure 5.
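In Keras, we rarely pad and mask by hand. Setting mask_zero=True on the Embedding layer (exactly what our model will do later) treats index 0 as the [pad] token and propagates a mask so downstream layers skip those positions. A minimal sketch, using token ids that stand in for the two example sentences above:

```python
# minimal padding + masking sketch (toy token ids; 0 is the padding index)
import tensorflow as tf
from tensorflow.keras import layers

# two sequences of different lengths, padded with 0s to a common length of 7
padded = tf.constant([
    [1, 2, 3, 0, 0, 0, 0],   # "practice makes perfect" + 4 pads
    [4, 5, 6, 7, 8, 9, 0],   # "perfection is a dream worth chasing" + 1 pad
])

# mask_zero=True tells downstream layers which positions are real tokens
embedding = layers.Embedding(input_dim=10, output_dim=4, mask_zero=True)
embedded = embedding(padded)
print(embedded._keras_mask)
# [[ True  True  True False False False False]
#  [ True  True  True  True  True  True False]]
```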
Modeling Sequential Data with MLPs
But before we dive into modeling sequential data with Recurrent Neural Networks, let us first understand the shortcomings of the models we would have generally used.
What if we tried to model sequential data with Multilayer Perceptrons (MLPs)?
Let us try exactly that. We create a sequence of our data and pass it through the MLP. The MLP starts learning the data, but something is missing. The process is demonstrated in Figure 6.
Can you tell what is missing in Figure 6?
The MLP learns individual data points. Each time a data point is passed through the MLP, the color of the entire network changes. This signifies that the MLP models that particular data point. So far, so good.
But there is something else here: $x_1$ is succeeded by $x_2$ and then by $x_3$, and so on. This makes the data sequential. While the MLP can learn the individual data points perfectly, it will not be able to learn the order of the data.
And therein lies the problem.
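A tiny sketch makes the shortcoming concrete. A hypothetical MLP that scores a sentence from an order-free summary of its tokens (here, the average of their embeddings) returns exactly the same output when the words are shuffled, so the order is invisible to it:

```python
# sketch: an MLP over averaged (order-free) token embeddings ignores word order
import numpy as np

rng = np.random.default_rng(0)
embeddings = rng.normal(size=(10, 8))   # toy embedding table: 10 tokens, 8 dims
W = rng.normal(size=(8, 1))             # toy single dense layer

def mlp_score(tokenIds):
    # average the token embeddings, then apply the dense layer
    features = embeddings[tokenIds].mean(axis=0)
    return float(features @ W)

sentence = [1, 2, 3, 4]    # some sentence as token ids
shuffled = [4, 3, 2, 1]    # the same tokens in a different order
print(mlp_score(sentence), mlp_score(shuffled))  # identical scores
```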
The Recurrence Formula
To solve this problem, we need a function that takes care of the current and previous states.
$$y_t = f(y_{t-1}, x_t)$$

Here, $x_t$ is the current input, $y_t$ is the current output, and $y_{t-1}$ is the past output. The function that models this relation is represented by $f$.
Let’s unfold this equation and understand its importance.
- At $t = 1$: $y_1 = f(y_0, x_1)$
- At $t = 2$: $y_2 = f(y_1, x_2) = f(f(y_0, x_1), x_2)$
It is quite evident from the above examples that the equation can model the previous states along with the current state. This is an extremely important equation we will use throughout our tutorial.
In the case of Recurrent Neural Networks, the function $f$ is a simple $\tanh$ that introduces nonlinearity. If you need a quick brush-up on the tanh function, feel free to play around with an interactive grapher such as Desmos (linked in the references).
The recurrence formula for a Recurrent Neural Network can be represented as:

$$h_t = \tanh(W_{xh}x_t + W_{hh}h_{t-1})$$

where $W_{xh}$ and $W_{hh}$ are the learnable weights. $h_t$ is called the hidden state of the network. Notice how the current hidden state $h_t$ is a function of the current input $x_t$ and the previous hidden state $h_{t-1}$. This is demonstrated in Figure 7.
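To make the recurrence tangible, here is a tiny NumPy sketch that unrolls such a cell over a toy sequence. The dimensions and random weights are purely illustrative:

```python
# a SimpleRNN-style recurrence unrolled by hand (toy dimensions, random weights)
import numpy as np

rng = np.random.default_rng(42)
(inputDim, hiddenDim, timeSteps) = (4, 3, 5)

Wxh = rng.normal(size=(inputDim, hiddenDim))   # input-to-hidden weights
Whh = rng.normal(size=(hiddenDim, hiddenDim))  # hidden-to-hidden weights

x = rng.normal(size=(timeSteps, inputDim))     # a toy input sequence
h = np.zeros(hiddenDim)                        # initial hidden state h_0

for t in range(timeSteps):
    # h_t = tanh(x_t Wxh + h_{t-1} Whh)
    h = np.tanh(x[t] @ Wxh + h @ Whh)
    print(f"h_{t + 1} = {np.round(h, 3)}")
```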
Recurrent Neural Network (an overview)
Now that we know the formula behind Recurrent Neural Networks, let us see how to build one. The diagram of a Recurrent Neural Network is shown in Figure 8.
An RNN cell consists of three parts:
- The input, $x_t$
- The output, $y_t$
- And the hidden state, $h_t$
The hidden state is fed back into the next time step, along with that step's input. Fortunately, this is implemented for us by the SimpleRNN layer inside the tf.keras.layers API.
The relation between the input, output, and the hidden state is demonstrated in Figure 9, which shows an unfolded RNN.
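Before assembling the full model, it helps to see the SimpleRNN layer in isolation. A short sketch (toy shapes) shows how return_sequences controls whether we get the hidden state at every time step or only the final one:

```python
# SimpleRNN output shapes (toy batch of 2 sequences, 7 time steps, 16 features)
import tensorflow as tf
from tensorflow.keras import layers

x = tf.random.normal((2, 7, 16))

perStep = layers.SimpleRNN(64, return_sequences=True)(x)
lastOnly = layers.SimpleRNN(64)(x)

print(perStep.shape)   # (2, 7, 64) -> one 64-d hidden state per time step
print(lastOnly.shape)  # (2, 64)    -> only the final hidden state
```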
And now, we begin to build our RNN model. Let’s go through the model.py file.
```python
# import the necessary packages
from tensorflow.keras import layers
from tensorflow import keras
```
We begin with the necessary imports on Lines 2-3.
```python
def get_rnn_model(vocabSize):
    # input for variable-length sequences of integers
    inputs = keras.Input(shape=(None,), dtype="int32")

    # embed the tokens in a 128-dimensional vector with masking
    # applied and apply dropout
    x = layers.Embedding(vocabSize, 128, mask_zero=True)(inputs)
    x = layers.Dropout(0.2)(x)

    # add 3 simple RNNs
    x = layers.SimpleRNN(64, return_sequences=True)(x)
    x = layers.SimpleRNN(64, return_sequences=True)(x)
    x = layers.SimpleRNN(64)(x)

    # add a classifier head
    x = layers.Dense(units=64, activation="relu")(x)
    x = layers.Dense(units=32, activation="relu")(x)
    x = layers.Dropout(0.2)(x)
    outputs = layers.Dense(1, activation="sigmoid")(x)

    # build the RNN model
    model = keras.Model(inputs, outputs, name="RNN")

    # return the RNN model
    return model
```
On Line 5, we define the get_rnn_model function we use to create the model. On Line 7, we create the input layer for a variable-length sequence of integers. The padding and masking mechanism is taken care of by our Embedding layer, which we initialize on Line 11. The Embedding layer also embeds each token of the variable-length sequence into a 128-dimensional vector. For those unclear, consider this 128-dimensional vector as the way the computer views and assigns meaning to the text.
On Line 12, we apply dropout to the inputs. We add 3 simple RNN cells on Lines 15-17 and a classifier head with a sigmoid activation on Lines 20-23.
Finally, the model is built using the keras.Model API on Line 26 and returned on Line 29.
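If you want to sanity-check the architecture, a quick sketch (assuming a vocabulary size of 10,000 for illustration; the real value lives in config.py) is to build the model and print its summary:

```python
# build the model with an assumed vocabulary size and inspect it
model = get_rnn_model(vocabSize=10000)
model.summary()
# expect an Embedding layer (10000 x 128), three SimpleRNN layers, and a
# Dense classifier head that ends in a single sigmoid unit
```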
Training and Visualizations
With the model created and ready, we can finally begin our training procedure. But before that, we must first look at some helper functions that will assist with visualizations and saving.
We begin with a function inside plot.py that will help us plot the loss and accuracy.
```python
# import the necessary packages
import matplotlib.pyplot as plt

def plot_loss_accuracy(history, filepath):
    # plot the training and validation loss
    plt.style.use("ggplot")
    (fig, axs) = plt.subplots(2, 1)

    axs[0].plot(history["loss"], label="train_loss")
    axs[0].plot(history["val_loss"], label="val_loss")
    axs[0].set_xlabel("Epoch #")
    axs[0].set_ylabel("Loss")
    axs[0].legend()

    axs[1].plot(history["accuracy"], label="train_accuracy")
    axs[1].plot(history["val_accuracy"], label="val_accuracy")
    axs[1].set_xlabel("Epoch #")
    axs[1].set_ylabel("Accuracy")
    axs[1].legend()

    fig.savefig(filepath)
```
On Line 2, we begin by importing matplotlib. On Line 4, we define the plot_loss_accuracy function that takes the model history and filepath as input.

Next, on Lines 6-17, we plot the loss and accuracy for training and validation and save the figure to the specified filepath (Line 18).
Next, inside another helper script called save_load.py, we define a function to save the adapted vectorization layer for later use.
```python
from tensorflow.keras.layers import TextVectorization
import tensorflow as tf
import pickle

def save_vectorizer(vectorizer, name):
    # pickle the weights of the vectorization layer
    pickle.dump({"weights": vectorizer.get_weights()},
        open(f"{name}.pkl", "wb"))
```
We begin with the necessary imports on Lines 1-3. Next, we define a function called save_vectorizer on Lines 5-8 that pickles and saves the weights of the vectorization layer.
And finally, with all the necessary functions defined, we can start with train.py, where we actually train our RNN model.
```python
# USAGE
# python train.py

# set the seed for reproducibility
import tensorflow as tf
tf.keras.utils.set_random_seed(42)

# import the necessary packages
from pyimagesearch.standardization import custom_standardization
from pyimagesearch.plot import plot_loss_accuracy
from pyimagesearch.save_load import save_vectorizer
from pyimagesearch.dataset import get_imdb_dataset
from pyimagesearch.model import get_rnn_model
from pyimagesearch.model import get_lstm_model
from pyimagesearch import config
from tensorflow.keras import layers
from tensorflow import keras
import os
```
We begin with all the necessary imports on Lines 5-18.
```python
# get the IMDB dataset
print("[INFO] getting the IMDB dataset...")
(trainDs, valDs) = get_imdb_dataset(folderName=config.DATASET_PATH,
    batchSize=config.BATCH_SIZE, bufferSize=config.BUFFER_SIZE,
    autotune=tf.data.AUTOTUNE, test=False)

# initialize the text vectorization layer
vectorizeLayer = layers.TextVectorization(
    max_tokens=config.VOCAB_SIZE,
    output_mode="int",
    output_sequence_length=config.MAX_SEQUENCE_LENGTH,
    standardize=custom_standardization,
)

# grab the text from the training dataset and adapt the text
# vectorization layer on it
trainText = trainDs.map(lambda text, label: text)
vectorizeLayer.adapt(trainText)

# vectorize the training and the validation dataset
trainDs = trainDs.map(lambda text, label: (vectorizeLayer(text), label))
valDs = valDs.map(lambda text, label: (vectorizeLayer(text), label))

# get the RNN model and compile it
print("[INFO] building the RNN model...")
modelRNN = get_rnn_model(vocabSize=config.VOCAB_SIZE)
modelRNN.compile(metrics=["accuracy"],
    optimizer=keras.optimizers.Adam(learning_rate=config.LR),
    loss=keras.losses.BinaryCrossentropy(from_logits=False),
)

# train the RNN model
print("[INFO] training the RNN model...")
historyRNN = modelRNN.fit(trainDs,
    epochs=config.EPOCHS,
    validation_data=valDs,
)
```
Next, on Lines 22-24, we get the IMDb dataset for movie reviews.
On Lines 27-32, we initialize the text vectorization layer with:
- max_tokens: The maximum number of tokens inside the vocabulary.
- output_mode: The data type for the output.
- output_sequence_length: The maximum sequence length that we will need for padding and masking.
- standardize: The custom standardization function that we defined previously.
On Lines 36 and 37, we adapt the vectorization layer on the training dataset.
When a text vectorization layer is initialized, it does not hold any information about the vocabulary of the training corpus. To build a vocabulary, we need to pass the entire training dataset through the vectorization layer. This is known as adapting to the text.
Finally, on Lines 40 and 41, we vectorize the text of the training and validation dataset using the adapted vectorization layer.
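As a quick, hypothetical peek at what the adapted layer produces (the exact indices depend on the learned vocabulary, so the numbers below are only indicative):

```python
# hypothetical peek at the adapted vectorization layer (indices will vary)
sample = tf.constant(["This movie was great!"])
print(vectorizeLayer(sample))
# a (1, MAX_SEQUENCE_LENGTH) integer tensor: word indices followed by 0-padding,
# e.g., [[ 11  18  14  87   0   0 ...   0 ]]
```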
On Lines 44-49, we initialize the predefined RNN model and compile it with Adam optimizer and Binary Cross-Entropy loss.
On Lines 52-55, we fit the model on the training data and save its history to the historyRNN variable.
```python
# check whether the output folder exists, if not build the output folder
if not os.path.exists(config.OUTPUT_PATH):
    os.makedirs(config.OUTPUT_PATH)

# save the loss and accuracy plots of RNN and LSTM models
plot_loss_accuracy(history=historyRNN.history, filepath=config.RNN_PLOT)
plot_loss_accuracy(history=historyLSTM.history, filepath=config.LSTM_PLOT)
```
On Lines 72 and 73, we check whether the output path exists and create it otherwise. The plot_loss_accuracy function is called on Line 76 to visualize the loss and accuracy of the RNN model.
```python
# save the trained RNN and LSTM models to disk
print(f"[INFO] saving the RNN model to {config.RNN_MODEL_PATH}...")
keras.models.save_model(model=modelRNN, filepath=config.RNN_MODEL_PATH,
    include_optimizer=False)
print(f"[INFO] saving the LSTM model to {config.LSTM_MODEL_PATH}...")
keras.models.save_model(model=modelLSTM, filepath=config.LSTM_MODEL_PATH,
    include_optimizer=False)
```
On Lines 80-82, we then save the trained RNN model on our disk.
```python
# save the text vectorization layer to disk
save_vectorizer(vectorizer=vectorizeLayer, name=config.TEXT_VEC_PATH)
```
Finally, on Line 88, we save the text vectorization layer we used to vectorize the training and validation data.
From the training output below, we can verify the training and validation loss and accuracy. We reach a 67.88% validation accuracy in just 10 epochs!
```
$ python train.py
[INFO] getting the IMDB dataset...
[INFO] building the RNN model...
[INFO] training the RNN model...
Epoch 1/10
22/22 [==============================] - 11s 329ms/step - loss: 0.7018 - accuracy: 0.4971 - val_loss: 0.6981 - val_accuracy: 0.4820
Epoch 2/10
22/22 [==============================] - 7s 320ms/step - loss: 0.6935 - accuracy: 0.5152 - val_loss: 0.6967 - val_accuracy: 0.4916
Epoch 3/10
22/22 [==============================] - 7s 293ms/step - loss: 0.6883 - accuracy: 0.5405 - val_loss: 0.6959 - val_accuracy: 0.5000
Epoch 4/10
22/22 [==============================] - 7s 307ms/step - loss: 0.6850 - accuracy: 0.5509 - val_loss: 0.6952 - val_accuracy: 0.5064
Epoch 5/10
22/22 [==============================] - 7s 303ms/step - loss: 0.6802 - accuracy: 0.5673 - val_loss: 0.6950 - val_accuracy: 0.5100
Epoch 6/10
22/22 [==============================] - 7s 302ms/step - loss: 0.6729 - accuracy: 0.5915 - val_loss: 0.6953 - val_accuracy: 0.5136
Epoch 7/10
22/22 [==============================] - 7s 294ms/step - loss: 0.6650 - accuracy: 0.6094 - val_loss: 0.6943 - val_accuracy: 0.5232
Epoch 8/10
22/22 [==============================] - 7s 303ms/step - loss: 0.6493 - accuracy: 0.6402 - val_loss: 0.6812 - val_accuracy: 0.5668
Epoch 9/10
22/22 [==============================] - 7s 294ms/step - loss: 0.6141 - accuracy: 0.6774 - val_loss: 0.6379 - val_accuracy: 0.6380
Epoch 10/10
22/22 [==============================] - 7s 296ms/step - loss: 0.5501 - accuracy: 0.7335 - val_loss: 0.5945 - val_accuracy: 0.6788
```
Loading and Inference
Now that the model has been trained and saved to disk, we need to perform inference to understand the model performance. Before starting with inference, let us first open save_load.py again and look at the load_vectorizer function.
```python
def load_vectorizer(name, maxTokens, outputLength, standardize=None):
    # load the pickled data
    fromDisk = pickle.load(open(f"{name}.pkl", "rb"))

    # build a new vectorization layer
    newVectorizer = TextVectorization(max_tokens=maxTokens,
        output_mode="int",
        output_sequence_length=outputLength,
        standardize=standardize)

    # call the adapt method with some dummy data for the vectorization
    # layer to initialize properly
    newVectorizer.adapt(tf.data.Dataset.from_tensor_slices(["xyz"]))
    newVectorizer.set_weights(fromDisk["weights"])

    # return the vectorization layer
    return newVectorizer
```
We define the load_vectorizer function on Line 10, which takes the following inputs:

- name: the file path of the saved TextVectorization layer weights
- maxTokens: the maximum number of tokens in the vocabulary
- outputLength: the length of the output sequence
- standardize: we do not need any standardization function here
In the training pipeline, we built a TextVectorization layer to tokenize and vectorize our training data. In the inference pipeline, we need the very same TextVectorization layer as in the training pipeline.

To do this, we save the weights of the adapted TextVectorization layer and load them on top of a newly initialized layer. On Line 12, we load the weights of the saved TextVectorization layer. On Line 25, we return the loaded vectorizer.
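A small round-trip sketch shows why this works. Assuming we are still at the end of train.py, with the adapted vectorizeLayer in scope and both helpers imported, the reloaded layer reproduces the same token indices as the original (the file name below is illustrative):

```python
# round-trip sketch: save the adapted layer, reload it, and compare outputs
save_vectorizer(vectorizer=vectorizeLayer, name="text_vectorizer")
reloaded = load_vectorizer(name="text_vectorizer",
    maxTokens=config.VOCAB_SIZE,
    outputLength=config.MAX_SEQUENCE_LENGTH)

# a clean, lowercase sample so standardization does not matter here
sample = tf.constant(["this movie was great"])
print(tf.reduce_all(vectorizeLayer(sample) == reloaded(sample)).numpy())  # True
```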
```python
# USAGE
# python inference.py

# import the necessary packages
from pyimagesearch.standardization import custom_standardization
from pyimagesearch.save_load import load_vectorizer
from pyimagesearch.dataset import get_imdb_dataset
from pyimagesearch import config
from tensorflow import keras
import tensorflow as tf

# load the pre-trained RNN and LSTM model
print("[INFO] loading the pre-trained RNN model...")
modelRnn = keras.models.load_model(filepath=config.RNN_MODEL_PATH)
modelRnn.compile(optimizer="adam",
    metrics=["accuracy"],
    loss=keras.losses.BinaryCrossentropy(from_logits=False),
)
print("[INFO] loading the pre-trained LSTM model...")
modelLstm = keras.models.load_model(filepath=config.LSTM_MODEL_PATH)
modelLstm.compile(optimizer="adam",
    metrics=["accuracy"],
    loss=keras.losses.BinaryCrossentropy(from_logits=False),
)
```
On Lines 5-10, we import the necessary packages. On Line 14, the saved RNN model is loaded back from disk. The loaded model is then compiled with the suitable metrics, loss, and optimizer on Lines 15-17.
```python
# get the IMDB dataset
print("[INFO] getting the IMDB test dataset...")
testDs = get_imdb_dataset(folderName=config.DATASET_PATH,
    batchSize=config.BATCH_SIZE, bufferSize=config.BUFFER_SIZE,
    autotune=tf.data.AUTOTUNE, test=True)

# load the pre-trained text vectorization layer
vectorizeLayer = load_vectorizer(name=config.TEXT_VEC_PATH,
    maxTokens=config.VOCAB_SIZE,
    outputLength=config.MAX_SEQUENCE_LENGTH,
    standardize=custom_standardization)

# vectorize the test dataset
testDs = testDs.map(lambda text, label: (vectorizeLayer(text), label))

# evaluate the trained RNN and LSTM model on the test dataset
for model in [modelRnn, modelLstm]:
    print(f"[INFO] test evaluation for {model.name}:")
    (testLoss, testAccuracy) = model.evaluate(testDs)
    print(f"\t[INFO] test loss: {testLoss:0.2f}")
    print(f"\t[INFO] test accuracy: {testAccuracy * 100:0.2f}%")
```
On Lines 26-28, we get the testing dataset and pre-process it. Line 37 will map the dataset to get the vectorized tokens and labels.
On Lines 40-44, we evaluate the testing accuracy and the testing loss of the RNN model. Our model achieves 68.42% accuracy at inference!
```
$ python inference.py
[INFO] loading the pre-trained RNN model...
[INFO] loading the pre-trained LSTM model...
[INFO] getting the IMDB test dataset...
[INFO] test evaluation for RNN:
25/25 [==============================] - 4s 96ms/step - loss: 0.6035 - accuracy: 0.6842
	[INFO] test loss: 0.60
	[INFO] test accuracy: 68.42%
```
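Beyond the aggregate metrics, the same pieces can score an individual review. A hedged sketch, continuing from the loaded model and vectorization layer above (the review text is made up):

```python
# score a single, made-up review with the loaded RNN (illustrative only)
review = tf.constant(["An absolute delight, I would happily watch it again."])
vectorized = vectorizeLayer(review)
probability = modelRnn.predict(vectorized)[0][0]
print("positive" if probability >= 0.5 else "negative", f"({probability:0.2f})")
```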
Summary
In this tutorial, we covered the basics of sequential data and how to model it. We at PyImageSearch always want our readers to have a strong foundation, and this was the first big step toward some exciting upcoming blog posts on Attention and Transformers (coming soon, really soon).
You are now familiar with a modest set of text processing utilities and the recurrence formula, and you can build a Recurrent Neural Network for modeling sequential data of any sort.
In the next tutorial of this series, we will study the shortcomings of RNNs and understand how to bypass them using Long Short-Term Memory networks.
References
- https://www.tensorflow.org/tutorials/keras/text_classification
- https://wandb.ai/authors/rnn-viz/reports/Under-the-Hood-of-RNNs–VmlldzoyNTQ4MjY
- https://keras.io/api/layers/recurrent_layers/simple_rnn/
- Mathematical Animations generated using: Manim CE
- Interactive graph generated using: Desmos
Citation Information
A. R. Gosthipaty, D. Chakraborty, and R. Raha. “Introduction to Recurrent Neural Networks with Keras and TensorFlow,” PyImageSearch, P. Chugh, S. Huot, K. Kidriavsteva, and A. Thanki, eds., 2022, https://pyimg.co/a3dwm
```
@incollection{GCR_2022_RNN,
	author = {Aritra Roy Gosthipaty and Devjyoti Chakraborty and Ritwik Raha},
	title = {Introduction to Recurrent Neural Networks with Keras and TensorFlow},
	booktitle = {PyImageSearch},
	editor = {Puneet Chugh and Susan Huot and Kseniia Kidriavsteva and Abhishek Thanki},
	year = {2022},
	note = {https://pyimg.co/a3dwm},
}
```