One concept we have not discussed yet is architecture visualization: the process of constructing a graph of the nodes and associated connections in a network and saving that graph to disk as an image (e.g., PNG or JPG). Nodes in the graph represent layers, while connections between nodes represent the flow of data through the network.
These graphs typically include the following components for each layer:
- The input volume size.
- The output volume size.
- And optionally the name of the layer.
We typically use network architecture visualization when (1) debugging our own custom network architectures and (2) preparing a publication, where a visualization of the architecture is easier to understand than the actual source code or a table conveying the same information. In the remainder of this tutorial, you will learn how to construct network architecture visualization graphs using Keras and then serialize the graph to disk as an image.
To learn how to visualize network architectures using Keras and TensorFlow, just keep reading.
The Importance of Architecture Visualization
Visualizing the architecture of a model is a critical debugging tool, especially if you are:
- Implementing an architecture from a publication that you are unfamiliar with.
- Implementing your own custom network architecture.
In short, network visualization validates our assumption that our code correctly builds the model we intend to construct. By examining the output graph image, you can see if there is a flaw in your logic. The most common flaws include:
- Incorrectly ordering layers in the network.
- Assuming an (incorrect) output volume size after a CONV or POOL layer.
Whenever implementing a network architecture, I suggest you visualize the network architecture after every block of CONV and POOL layers, which will enable you to validate your assumptions (and more importantly, catch “bugs” in the network early on).
Bugs in Convolutional Neural Networks are not like other logic bugs in applications resulting from edge cases. Instead, a CNN very well may train and obtain reasonable results even with an incorrect layer ordering, but if you don’t realize that this bug has happened, you might report your results thinking you did one thing, but in reality did another.
In the remainder of this tutorial, I’ll help you visualize your own network architectures to avoid these types of problematic situations.
Configuring your development environment
To follow this guide, you need to have the OpenCV library installed on your system.
Luckily, OpenCV is pip-installable:
$ pip install opencv-contrib-python
If you need help configuring your development environment for OpenCV, I highly recommend that you read my pip install OpenCV guide — it will have you up and running in a matter of minutes.
Installing graphviz and pydot
To construct a graph of our network and save it to disk using Keras, we first need to install the graphviz library.
On Ubuntu, this is as simple as:
$ sudo apt-get install graphviz
While on macOS, we can install graphviz via Homebrew:
$ brew install graphviz
Once the graphviz library is installed, we need to install two Python packages:
$ pip install graphviz
$ pip install pydot
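Before moving on, it can help to confirm that the pieces installed above are actually visible to Python. The following is a minimal sketch using only the standard library; the helper name check_visualization_deps is hypothetical, and the check only tests whether the dot executable and the two packages can be found, not whether they work together.

```python
import importlib.util
import shutil


def check_visualization_deps():
    """Report which pieces needed for plotting appear to be available."""
    return {
        # the Graphviz system package provides the `dot` executable
        "dot executable": shutil.which("dot") is not None,
        # the two Python packages installed via pip
        "graphviz (Python)": importlib.util.find_spec("graphviz") is not None,
        "pydot (Python)": importlib.util.find_spec("pydot") is not None,
    }


if __name__ == "__main__":
    # print a short found/missing report for each dependency
    for name, found in check_visualization_deps().items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```

If any entry is reported as missing, rerun the corresponding install command for your platform before trying to plot a model.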
Visualizing Keras Networks
Visualizing network architectures with Keras is incredibly simple. To see how easy it is, open a new file, name it visualize_architecture.py, and insert the following code:
# import the necessary packages
from pyimagesearch.nn.conv import LeNet
from tensorflow.keras.utils import plot_model

# initialize LeNet and then write the network architecture
# visualization graph to disk
model = LeNet.build(28, 28, 1, 10)
plot_model(model, to_file="lenet.png", show_shapes=True)
Line 2 imports our implementation of LeNet (from an earlier tutorial), which is the network architecture that we’ll be visualizing. Line 3 imports the plot_model function from Keras. As the function name suggests, plot_model is responsible for constructing a graph based on the layers inside the input model and then writing the graph to disk as an image.
On Line 7, we instantiate the LeNet architecture as if we were going to apply it to MNIST for digit classification. The parameters include the width of the input volume (28 pixels), the height (28 pixels), the depth (1 channel), and the total number of class labels (10).
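Finally, Line 8 plots our model and saves it to disk under the name lenet.png. Passing show_shapes=True includes the input and output shape of each layer in the graph.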
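The LeNet class imported on Line 2 lives in the pyimagesearch package from an earlier tutorial and is not reproduced in this post. As a stand-in, here is a minimal sketch of a LeNet-style model built directly with Keras. The function name build_lenet is hypothetical, and the 5×5 kernels, ReLU activations, 2×2 max pooling with a stride of 2, and channels-last ordering are assumptions on my part; only the layer order, the zero padding, and the filter and node counts (20, 50, 500) come from this post.

```python
from tensorflow.keras.layers import (Activation, Conv2D, Dense, Flatten,
                                     Input, MaxPooling2D)
from tensorflow.keras.models import Sequential


def build_lenet(width, height, depth, classes):
    """Hypothetical stand-in for LeNet.build (channels-last ordering)."""
    return Sequential([
        Input(shape=(height, width, depth)),
        # first CONV => RELU => POOL block (5x5 kernels are an assumption)
        Conv2D(20, (5, 5), padding="same"),
        Activation("relu"),
        MaxPooling2D(pool_size=(2, 2), strides=(2, 2)),
        # second CONV => RELU => POOL block
        Conv2D(50, (5, 5), padding="same"),
        Activation("relu"),
        MaxPooling2D(pool_size=(2, 2), strides=(2, 2)),
        # flatten, then FC => RELU => FC => softmax
        Flatten(),
        Dense(500),
        Activation("relu"),
        Dense(classes),
        Activation("softmax"),
    ])


if __name__ == "__main__":
    # print a text summary; this does not require graphviz or pydot
    build_lenet(28, 28, 1, 10).summary()
```

With graphviz and pydot installed, the model returned by build_lenet(28, 28, 1, 10) could be passed to plot_model in place of the LeNet.build call.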
To execute our script, just open a terminal and issue the following command:
$ python visualize_architecture.py
Once the command successfully exits, check your current working directory:
$ ls
lenet.png visualize_architecture.py
As you’ll see, there is a file named lenet.png. This file is our actual network visualization graph. Open it up and examine it (Figures 2 and 3).
Here, we can see a visualization of the data flow through our network. Each layer is represented as a node in the graph and is connected to the next layer, with the graph terminating after the softmax classifier is applied. Notice how each layer in the network includes an input and output attribute. These values are the dimensions of the respective volume when it enters the layer and after it exits the layer.
Walking through the LeNet architecture, we see the first layer is our InputLayer, which accepts a 28×28×1 input image. The spatial dimensions for the input and output of the layer are the same, as this is simply a “placeholder” for the input data.
You might be wondering what the None represents in the data shape (None, 28, 28, 1). The None is actually our batch size. When visualizing the network architecture, Keras does not know our intended batch size, so it leaves the value as None. During training, this value would change to 32, 64, 128, or whatever batch size we deemed appropriate.
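To make the role of None concrete, here is a tiny illustration in plain Python (no Keras involved). The helper concrete_shape is hypothetical and simply mimics what happens conceptually when a real batch is fed to the network: the unknown leading dimension is replaced by the batch size, while the remaining dimensions stay fixed.

```python
def concrete_shape(symbolic_shape, batch_size):
    """Fill in the unknown (None) batch dimension with a real batch size."""
    if symbolic_shape[0] is not None:
        raise ValueError("expected the leading (batch) dimension to be None")
    # only the batch dimension changes; height, width, depth are untouched
    return (batch_size,) + tuple(symbolic_shape[1:])


input_shape = (None, 28, 28, 1)
for batch_size in (32, 64, 128):
    print(batch_size, "->", concrete_shape(input_shape, batch_size))
```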
Next, our data flows to the first CONV layer, where we learn 20 kernels on the 28×28×1 input. The output of this first CONV layer is 28×28×20. We have retained our original spatial dimensions due to zero padding, but by learning 20 filters we have changed the depth of the volume.
An activation layer follows the CONV layer, which by definition cannot change the input volume size. However, a POOL operation can reduce the volume size. Here, our input volume is reduced from 28×28×20 down to 14×14×20.
The second CONV layer accepts the 14×14×20 volume as input, but then learns 50 filters, changing the output volume size to 14×14×50 (again, zero padding is leveraged to ensure the convolution itself does not reduce the width and height of the input). An activation is applied prior to another POOL operation, which again halves the width and height from 14×14×50 down to 7×7×50.
At this point, we are ready to apply our FC layers. To accomplish this, our 7×7×50 input is flattened into a list of 2,450 values (since 7×7×50 = 2,450). Now that we have flattened the output of the convolutional part of our network, we can apply an FC layer that accepts the 2,450 input values and learns 500 nodes. An activation follows, followed by another FC layer, this time reducing 500 down to 10 (the total number of class labels for the MNIST dataset).
Finally, a softmax classifier is applied to each of the 10 input nodes, giving us our final class probabilities.
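The volume sizes in this walkthrough can be double-checked by hand with the standard output-size formula, (W − F + 2P) / S + 1. The sketch below reproduces the numbers in plain Python. The helper names are hypothetical, and the 5×5 kernels (with a padding of 2) and 2×2 pooling with a stride of 2 are assumptions; the post itself only states that zero padding preserves the spatial dimensions and that each POOL halves them.

```python
def conv_output_size(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution: (W - F + 2P) / S + 1."""
    return (size - kernel + 2 * padding) // stride + 1


def pool_output_size(size, pool=2, stride=2):
    """Spatial output size of a pooling operation."""
    return (size - pool) // stride + 1


def lenet_volume_sizes(width=28, height=28, depth=1):
    """Trace the volume size through the convolutional part of LeNet."""
    volumes = [("input", (height, width, depth))]
    for filters in (20, 50):
        # assumed 5x5 kernel with padding of 2 keeps width and height fixed
        height = conv_output_size(height, kernel=5, padding=2)
        width = conv_output_size(width, kernel=5, padding=2)
        volumes.append((f"CONV ({filters} filters)", (height, width, filters)))
        # assumed 2x2 pooling with a stride of 2 halves width and height
        height = pool_output_size(height)
        width = pool_output_size(width)
        volumes.append(("POOL", (height, width, filters)))
    # flattening multiplies the three dimensions together
    volumes.append(("FLATTEN", (height * width * filters,)))
    return volumes


for name, shape in lenet_volume_sizes():
    print(f"{name}: {shape}")
```

Comparing a trace like this against the shapes in the plotted graph is a quick way to confirm that the model matches what you intended to build.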
Just as we can express the LeNet architecture in code, we can also visualize the model itself as an image. As you get started on your deep learning journey, I highly encourage you to use this code to visualize any networks you are working with, especially if you are unfamiliar with them. Ensuring you understand the flow of data through the network and how the volume sizes change based on CONV, POOL, and FC layers will give you a far deeper understanding of the architecture than relying on code alone.
When implementing my own network architectures, I validate that I’m on the right track by visualizing the architecture every 2-3 layer blocks as I’m actually coding the network — this action helps me find bugs or flaws in my logic early on.