One concept we have not discussed yet is architecture visualization, the process of constructing a graph of the nodes and associated connections in a network and saving the graph to disk as an image (e.g., PNG or JPG). Nodes in the graph represent layers, while connections between nodes represent the flow of data through the network.
These graphs typically include the following components for each layer:
- The input volume size.
- The output volume size.
- Optionally, the name of the layer.
We typically use network architecture visualization when (1) debugging our own custom network architectures and (2) preparing a publication, where a visualization of the architecture is easier to understand than the actual source code or a table conveying the same information. In the remainder of this tutorial, you will learn how to construct network architecture visualization graphs using Keras, and then serialize the graph to disk as an actual image.
To learn how to visualize network architectures using Keras and TensorFlow, just keep reading.
The Importance of Architecture Visualization
Visualizing the architecture of a model is a critical debugging tool, especially if you are:
- Implementing an architecture from a publication, but are unfamiliar with it.
- Implementing your own custom network architecture.
In short, network visualization validates our assumption that our code is correctly building the model we intend to construct. By examining the output graph image, you can see if there is a flaw in your logic. The most common flaws include:
- Incorrectly ordering layers in the network.
- Assuming an (incorrect) output volume size after a CONV or POOL layer.
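As a quick sanity check on that second flaw, you can compute the expected output volume size by hand. The snippet below is not part of the tutorial's code; it is a minimal pure-Python helper implementing the standard formula (W - F + 2P) / S + 1 for the spatial output size of a CONV or POOL layer:

```python
def conv_output_size(size, kernel, stride=1, pad=0):
    """Spatial output size of a CONV/POOL layer: (W - F + 2P) / S + 1."""
    return (size - kernel + 2 * pad) // stride + 1

# a 28x28 input through a 5x5 CONV with zero padding P=2 keeps 28x28
print(conv_output_size(28, 5, stride=1, pad=2))  # -> 28
# a 2x2 POOL with stride 2 halves it to 14x14
print(conv_output_size(28, 2, stride=2, pad=0))  # -> 14
```

If the sizes your visualization graph reports disagree with this formula, you have found a flaw in your assumptions.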
Whenever implementing a network architecture, I suggest you visualize the network architecture after every block of CONV and POOL layers, which will enable you to validate your assumptions (and, more importantly, catch “bugs” in the network early on).
Bugs in Convolutional Neural Networks are not like logic bugs in other applications, which result from edge cases. Instead, a CNN may very well train and obtain reasonable results even with an incorrect layer ordering; but if you don’t realize this bug has happened, you might report your results thinking you did one thing when in reality you did another.
In the remainder of this tutorial, I’ll help you visualize your own network architectures to avoid these types of problematic situations.
Configuring your development environment
To follow this guide, you need to have TensorFlow (which includes the Keras API) installed on your system.
Luckily, TensorFlow is pip-installable:
$ pip install tensorflow
If you need help configuring your development environment, I highly recommend that you read my environment configuration guides — they will have you up and running in a matter of minutes.
Having problems configuring your development environment?
All that said, are you:
- Short on time?
- Learning on your employer’s administratively locked system?
- Wanting to skip the hassle of fighting with the command line, package managers, and virtual environments?
- Ready to run the code right now on your Windows, macOS, or Linux system?
Then join PyImageSearch University today!
Gain access to Jupyter Notebooks for this tutorial and other PyImageSearch guides that are pre-configured to run on Google Colab’s ecosystem right in your web browser! No installation required.
And best of all, these Jupyter Notebooks will run on Windows, macOS, and Linux!
Installing graphviz and pydot
To construct a graph of our network and save it to disk using Keras, we need to install the graphviz prerequisite.
On Ubuntu, this is as simple as:
$ sudo apt-get install graphviz
While on macOS, we can install graphviz via Homebrew:
$ brew install graphviz
Once the graphviz library is installed, we need to install two Python packages:
$ pip install graphviz
$ pip install pydot
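Before moving on, it can be worth verifying that the Graphviz binary itself (not just the Python packages) is actually reachable, since Keras needs the dot executable to render the graph. This quick check is my own addition, not part of the tutorial's code:

```python
import shutil

def graphviz_ready():
    """Return True if the Graphviz `dot` executable is on the PATH.
    plot_model needs it in addition to the pydot Python package."""
    return shutil.which("dot") is not None

print(graphviz_ready())
```

If this prints False, revisit the apt-get or brew installation step above before running the visualization script.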
Visualizing Keras Networks
Visualizing network architectures with Keras is incredibly simple. To see how easy it is, open a new file, name it visualize_architecture.py, and insert the following code:
# import the necessary packages
from pyimagesearch.nn.conv import LeNet
from tensorflow.keras.utils import plot_model

# initialize LeNet and then write the network architecture
# visualization graph to disk
model = LeNet.build(28, 28, 1, 10)
plot_model(model, to_file="lenet.png", show_shapes=True)
Line 2 imports our implementation of LeNet (covered in an earlier tutorial) — this is the network architecture that we’ll be visualizing. Line 3 imports the plot_model function from Keras. As this function name suggests, plot_model is responsible for constructing a graph based on the layers inside the input model and then writing the graph to disk as an image.
On Line 7, we instantiate the LeNet architecture as if we were going to apply it to MNIST for digit classification. The parameters include the width of the input volume (28 pixels), the height (28 pixels), the depth (1 channel), and the total number of class labels (10).
Finally, Line 8 plots our model and saves it to disk under the name lenet.png.
To execute our script, just open a terminal and issue the following command:
$ python visualize_architecture.py
Once the command successfully exits, check your current working directory:
$ ls
lenet.png		visualize_architecture.py
As you’ll see, there is a file named lenet.png — this file is our actual network visualization graph. Open it up and examine it (Figures 2 and 3).
Here, we can see a visualization of the data flow through our network. Each layer is represented as a node in the graph, and these nodes are connected to other layers, ultimately terminating after the softmax classifier is applied. Notice how each layer in the network includes an input and output attribute — these values are the sizes of the respective volume’s spatial dimensions as it enters and exits the layer.
Walking through the LeNet architecture, we see the first layer is our InputLayer, which accepts a 28×28×1 input image. The spatial dimensions for the input and output of the layer are the same, as this is simply a “placeholder” for the input data.
You might be wondering what the None represents in the data shape (None, 28, 28, 1). The None is actually our batch size. When visualizing the network architecture, Keras does not know our intended batch size, so it leaves the value as None. When training, this value would change to 32, 64, 128, etc., or whatever batch size we deemed appropriate.
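To make the role of that None concrete, here is a small illustrative helper (my own, hypothetical example, not Keras API) showing how a placeholder batch dimension gets swapped for a real batch size at training time:

```python
def resolve_batch_dim(shape, batch_size):
    """Swap a None batch dimension for a concrete batch size.
    `shape` mimics the tuples Keras reports, e.g. (None, 28, 28, 1)."""
    return (batch_size,) + tuple(shape[1:])

print(resolve_batch_dim((None, 28, 28, 1), 64))  # -> (64, 28, 28, 1)
```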
Next, our data flows to the first CONV layer, where we learn 20 kernels on the 28×28×1 input. The output of this first CONV layer is 28×28×20. We have retained our original spatial dimensions due to zero padding, but by learning 20 filters we have changed the volume size.
An activation layer follows the CONV layer, which by definition cannot change the input volume size. However, a POOL operation can reduce the volume size — here our input volume is reduced from 28×28×20 down to 14×14×20.
The second CONV layer accepts the 14×14×20 volume as input, but then learns 50 filters, changing the output volume size to 14×14×50 (again, zero padding is leveraged to ensure the convolution itself does not reduce the width and height of the input). An activation is applied prior to another POOL operation, which again halves the width and height, from 14×14×50 down to 7×7×50.
At this point, we are ready to apply our FC layers. To accomplish this, our 7×7×50 input is flattened into a list of 2,450 values (since 7×7×50 = 2,450). Now that we have flattened the output of the convolutional part of our network, we can apply an FC layer that accepts the 2,450 input values and learns 500 nodes. An activation follows, followed by another FC layer, this time reducing 500 down to 10 (the total number of class labels for the MNIST dataset).
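The walkthrough above can be reproduced numerically without running Keras at all. The sketch below traces the LeNet volume sizes in pure Python, assuming 5×5 CONV kernels with "same" zero padding and 2×2 POOLs with stride 2 (the kernel sizes are my assumptions; the resulting shapes match the ones reported in the text):

```python
def conv_output_size(size, kernel, stride=1, pad=0):
    """Spatial output size of a CONV/POOL layer: (W - F + 2P) / S + 1."""
    return (size - kernel + 2 * pad) // stride + 1

h = w, d = 28, 1                                      # InputLayer: 28x28x1
h = w = 28; d = 1
h = conv_output_size(h, 5, 1, 2); w = h; d = 20       # CONV1 ("same" pad): 28x28x20
h = conv_output_size(h, 2, 2); w = h                  # POOL1 (2x2, stride 2): 14x14x20
h = conv_output_size(h, 5, 1, 2); w = h; d = 50       # CONV2 ("same" pad): 14x14x50
h = conv_output_size(h, 2, 2); w = h                  # POOL2 (2x2, stride 2): 7x7x50
flat = h * w * d                                      # Flatten: 7*7*50 = 2450
print((h, w, d), flat)                                # -> (7, 7, 50) 2450
```

Comparing a hand trace like this against the input/output attributes in the generated lenet.png is exactly the kind of assumption-checking the visualization is for.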
Finally, a softmax classifier is applied to each of the 10 input nodes, giving us our final class probabilities.
What's next? We recommend PyImageSearch University.
86 total classes • 115+ hours of on-demand code walkthrough videos • Last updated: October 2024
★★★★★ 4.84 (128 Ratings) • 16,000+ Students Enrolled
I strongly believe that if you had the right teacher you could master computer vision and deep learning.
Do you think learning computer vision and deep learning has to be time-consuming, overwhelming, and complicated? Or has to involve complex mathematics and equations? Or requires a degree in computer science?
That’s not the case.
All you need to master computer vision and deep learning is for someone to explain things to you in simple, intuitive terms. And that’s exactly what I do. My mission is to change education and how complex Artificial Intelligence topics are taught.
If you're serious about learning computer vision, your next stop should be PyImageSearch University, the most comprehensive computer vision, deep learning, and OpenCV course online today. Here you’ll learn how to successfully and confidently apply computer vision to your work, research, and projects. Join me in computer vision mastery.
Inside PyImageSearch University you'll find:
- ✓ 86 courses on essential computer vision, deep learning, and OpenCV topics
- ✓ 86 Certificates of Completion
- ✓ 115+ hours of on-demand video
- ✓ Brand new courses released regularly, ensuring you can keep up with state-of-the-art techniques
- ✓ Pre-configured Jupyter Notebooks in Google Colab
- ✓ Run all code examples in your web browser — works on Windows, macOS, and Linux (no dev environment configuration required!)
- ✓ Access to centralized code repos for all 540+ tutorials on PyImageSearch
- ✓ Easy one-click downloads for code, datasets, pre-trained models, etc.
- ✓ Access on mobile, laptop, desktop, etc.
Summary
Just as we can express the LeNet architecture in code, we can also visualize the model itself as an image. As you get started on your deep learning journey, I highly encourage you to use this code to visualize any networks you are working with, especially if you are unfamiliar with them. Ensuring you understand the flow of data through the network and how the volume sizes change based on CONV, POOL, and FC layers will give you a dramatically more intimate understanding of the architecture than relying on code alone.
When implementing my own network architectures, I validate that I’m on the right track by visualizing the architecture every 2-3 layer blocks as I’m actually coding the network — this practice helps me find bugs or flaws in my logic early on.
To download the source code to this post (and be notified when future tutorials are published here on PyImageSearch), simply enter your email address in the form below!
Download the Source Code and FREE 17-page Resource Guide
Enter your email address below to get a .zip of the code and a FREE 17-page Resource Guide on Computer Vision, OpenCV, and Deep Learning. Inside you'll find my hand-picked tutorials, books, courses, and libraries to help you master CV and DL!
Comment section
Hey, Adrian Rosebrock here, author and creator of PyImageSearch. While I love hearing from readers, a couple years ago I made the tough decision to no longer offer 1:1 help over blog post comments.
At the time I was receiving 200+ emails per day and another 100+ blog post comments. I simply did not have the time to moderate and respond to them all, and the sheer volume of requests was taking a toll on me.
Instead, my goal is to do the most good for the computer vision, deep learning, and OpenCV community at large by focusing my time on authoring high-quality blog posts, tutorials, and books/courses.
If you need help learning computer vision and deep learning, I suggest you refer to my full catalog of books and courses — they have helped tens of thousands of developers, students, and researchers just like yourself learn Computer Vision, Deep Learning, and OpenCV.
Click here to browse my full catalog.