Let me ask you three questions:
- What if you could run state-of-the-art neural networks on a USB stick?
- What if this USB stick could deliver over 10x the inference performance of your CPU?
- And what if this entire device costs under $100?
Sound interesting?
Enter Intel’s Movidius Neural Compute Stick (NCS).
Raspberry Pi users will especially welcome the device, as it can dramatically improve image classification and object detection speeds. You may find that the Movidius is “just what you needed” to speed up network inference time in a small form factor at a good price.
Inside today’s post we’ll discuss:
- What the Movidius Neural Compute Stick is capable of
- If you should buy one
- How to quickly and easily get up and running with the Movidius
- Benchmarks comparing network inference times on a MacBook Pro and Raspberry Pi
Next week I’ll provide additional benchmarks and object detection scripts using the Movidius as well.
To get started with the Intel Movidius Neural Compute Stick and to learn how you can deploy a CNN model to your Raspberry Pi + NCS, just keep reading.
Deprecation Notice: This article uses the Movidius APIv1 and APIv2, which are now superseded by Intel’s OpenVINO software for the Movidius NCS. Learn more about OpenVINO in this PyImageSearch article.
Getting started with the Intel Movidius Neural Compute Stick
Today’s blog post is broken into five parts.
First, I’ll answer:
What is the Intel Movidius Neural Compute Stick and should I buy one?
From there I’ll explain the workflow of getting up and running with the Movidius Neural Compute Stick. The entire process is relatively simple, but it needs to be spelled out so that we understand how to work with the NCS.
We’ll then set up our Raspberry Pi with the NCS in API-only mode. We’ll also do a quick sanity check to ensure we can communicate with the NCS.
Next up, I’ll walk through my custom Raspberry Pi + Movidius NCS image classification benchmark script. We’ll be using SqueezeNet, GoogLeNet, and AlexNet.
We’ll wrap up the blog post by comparing benchmark results.
What is the Intel Movidius Neural Compute Stick?
Intel’s Neural Compute Stick is a USB-thumb-drive-sized deep learning machine.
You can think of the NCS as a USB-powered GPU, although that is quite the overstatement — it is not a GPU, and it can only be used for prediction/inference, not training.
I would actually classify the NCS as a coprocessor. It’s got one purpose: running (forward-pass) neural network calculations. In our case, we’ll be using the NCS for image classification.
The NCS should not be used for training a neural network model; rather, it is designed for deployable models. Since the device is meant to be used on single board computers such as the Raspberry Pi, the power draw is kept minimal, making it inappropriate for actually training a network.
So now you’re wondering: Should I buy the Movidius NCS?
At only $77, the NCS packs a punch. You can buy the device on Amazon or at any of the retailers listed on Intel’s site.
Under the hood of the NCS is a Myriad 2 processor (28 nm) capable of 80-150 GFLOPS performance. This processor is also known as a Vision Processing Unit (or vision accelerator) and it consumes only 1W of power (for reference, the Raspberry Pi 3 B consumes 1.2W with HDMI off, LEDs off, and WiFi on).
Whether buying the NCS is worth it to you depends on the answers to several questions:
- Do you have an immediate use case or do you have $77 to burn on another toy?
- Are you willing to deal with the growing pains of joining a young community? While certainly effective, we don’t know if these “vision processing units” are here to stay.
- Are you willing to devote a machine (or VM) to the SDK?
- Pi Users: Are you willing to devote a separate Pi, or at least a separate microSD card, to the NCS? Are you aware that, given its form factor, the device will block the 3 neighboring USB ports unless you connect it via an extension cable?
Question 1 is up to you.
The reason I’m asking question 2 is that Intel is notorious for poor documentation and even for discontinuing products, as they did with the Intel Galileo.
I’m not suggesting that either will occur with the NCS. The NCS is in the deep learning domain which is currently heading full steam ahead, so the future of this product does look bright. It also doesn’t hurt that there aren’t too many competing products.
Questions 2 and 3 (and their answers) are related. In short, you can’t isolate the development environment with virtual environments, and the installer actually removes previous installations of OpenCV from your system. For this reason you should keep the installer scripts far away from your current projects and working environments. I learned the hard way. Trust me.
Hopefully I haven’t scared you off — that is not my intention. Most people will be purchasing the Movidius NCS to pair with a Raspberry Pi or other single board computer.
Question 4 is for Pi users. When it comes to the Pi, if you’ve been following any other tutorials on PyImageSearch.com, you’re aware that I recommend Python virtual environments to isolate your Python projects and associated dependencies. Python virtual environments are a best practice in the Python community.
One of my biggest gripes with the Neural Compute Stick is that Intel’s install scripts will actually make your virtual environments nearly unusable. The installer downloads packages from the Debian/Ubuntu APT repositories and changes your PYTHONPATH system variable.
It gets really messy really quickly, so to be on the safe side you should use a fresh microSD (purchase a 32GB 98MB/s microSD on Amazon) with Raspbian Stretch. You might even buy another Pi to marry to the NCS if you’re working on a deployable project.
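If you’re worried the installer has already touched a machine you care about, one quick check (assuming a standard bash shell) is to inspect the variable it modifies:

$ echo $PYTHONPATH

If that prints NCS-related entries, the install scripts have been there.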
When I received my NCS I was excited to plug it into my Pi…but unfortunately I was off to a rough start.
Check out the image below.
I found out that with the NCS plugged in, it blocks all 3 other USB ports on my Pi. I can’t even plug my wireless keyboard/mouse dongle into another port!
Now, I understand that the NCS is meant to be used with devices other than the Raspberry Pi, but given that the Raspberry Pi is one of the most popular single board computers, I was a bit surprised that Intel didn’t account for this. Perhaps it’s because the device consumes a lot of USB power and they want you to think twice about plugging additional peripherals into your Pi.
This is very frustrating. The solution is to buy a 6in USB 3.0 extension cable such as this one:
With those considerations in mind, the Movidius NCS is actually a great device at a good value. So let’s dive into the workflow.
Movidius NCS Workflow
Working with the NCS is quite easy once you understand the workflow.
The bottom line is that you need a graph file to deploy to the NCS. This graph file can live in the same directory as your Python script if you’d like — it will get sent to the NCS using the NCS API. I’m including a few graph files with the “Downloads” associated with this blog post.
In general, the workflow of using the NCS is:
- Use a pre-trained TensorFlow/Caffe model or train a network with TensorFlow/Caffe on Ubuntu or Debian.
- Use the NCS SDK toolchain to generate a graph file.
- Deploy the graph file and NCS to your single board computer running a Debian flavor of Linux. I used a Raspberry Pi 3 B running Raspbian (Debian based).
- With Python, use the NCS API to send the graph file to the NCS and request predictions on images. Process the prediction results and take an (arbitrary) action based on the results.
Today, we’ll set up the Raspberry Pi with the NCS API-only mode toolchain. This setup does not include the tools to generate graph files, nor does it install Caffe, TensorFlow, etc.
Then, we’ll create our own custom image classification benchmarking script. You’ll notice that this script is based heavily on a previous post, Deep learning on the Raspberry Pi with OpenCV.
First, let’s prepare our Raspberry Pi.
Setting up your Raspberry Pi and the NCS in API-only mode
Reading some sparse documentation, I learned the hard way that the Raspberry Pi can’t handle the full SDK (what was I thinking?).
I later started from square one and found better documentation that instructed me to set up my Pi in API-only mode (now this makes sense). I was quickly up and running in this fashion and I’ll show you how to do the same thing.
For your Pi, I recommend that you install the SDK in API-only mode on a fresh installation of Raspbian Stretch.
To install the Raspbian Stretch OS on your Pi, grab the Stretch image here and then flash the card using these instructions.
From there, boot up your Pi and connect to WiFi. You can complete all of the following actions over an SSH connection, or using a monitor + keyboard/mouse (with the 6in extension cable listed above, since the NCS blocks the neighboring USB ports) if you prefer.
Let’s update the system:
$ sudo apt-get update && sudo apt-get upgrade
Then let’s install a bunch of packages:
$ sudo apt-get install -y libusb-1.0-0-dev libprotobuf-dev
$ sudo apt-get install -y libleveldb-dev libsnappy-dev
$ sudo apt-get install -y libopencv-dev
$ sudo apt-get install -y libhdf5-serial-dev protobuf-compiler
$ sudo apt-get install -y libatlas-base-dev git automake
$ sudo apt-get install -y byacc lsb-release cmake
$ sudo apt-get install -y libgflags-dev libgoogle-glog-dev
$ sudo apt-get install -y liblmdb-dev swig3.0 graphviz
$ sudo apt-get install -y libxslt-dev libxml2-dev
$ sudo apt-get install -y gfortran
$ sudo apt-get install -y python3-dev python-pip python3-pip
$ sudo apt-get install -y python3-setuptools python3-markdown
$ sudo apt-get install -y python3-pillow python3-yaml python3-pygraphviz
$ sudo apt-get install -y python3-h5py python3-nose python3-lxml
$ sudo apt-get install -y python3-matplotlib python3-numpy
$ sudo apt-get install -y python3-protobuf python3-dateutil
$ sudo apt-get install -y python3-skimage python3-scipy
$ sudo apt-get install -y python3-six python3-networkx
Notice that we’ve installed libopencv-dev from the Debian repositories. This is the first time I’ve ever recommended it, and hopefully the last. Installing OpenCV via apt-get (1) installs an older version of OpenCV, (2) does not install the full version of OpenCV, and (3) does not take advantage of various system optimizations. Again, I do not recommend this method of installing OpenCV.
Additionally, you can see we’re installing a whole bunch of packages that I generally prefer to manage inside Python virtual environments with pip. Be sure you are using a fresh memory card so you don’t mess up other projects you’ve been working on with your Pi.
Since we’re using OpenCV and Python, we’ll need the python-opencv bindings. The installation instructions on the Movidius blog don’t include this tool. You may install the python-opencv bindings by entering the following:
$ sudo apt-get install -y python-opencv
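To confirm exactly which version the repositories gave you (expect something in the 2.4.x range on Raspbian Stretch, as noted again later in this post), a quick one-liner will print it:

$ python -c "import cv2; print(cv2.__version__)"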
Let’s also install imutils and the picamera API:
$ pip install imutils
$ pip install "picamera[array]"
For the above pip install commands, I installed into the global Python site-packages (i.e., not into a virtual environment).
From there, let’s make a workspace directory and clone the NCSDK:
$ cd ~
$ mkdir workspace
$ cd workspace
$ git clone https://github.com/movidius/ncsdk
And while we’re at it, let’s clone down the NC App Zoo as we’ll want it for later.
$ git clone https://github.com/movidius/ncappzoo
And from there, navigate into the following directory:
$ cd ~/workspace/ncsdk/api/src
In that directory, we’ll use the Makefile to install the SDK in API-only mode:
$ make
$ sudo make install
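Before running a fuller test, you can verify that the API’s Python bindings landed where Python 3 can find them. This simply exercises the same mvnc import our benchmark script uses later:

$ python3 -c "from mvnc import mvncapi; print('mvnc import OK')"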
Test the Raspberry Pi installation on the NCS
Let’s test the installation by using code from the NC App Zoo. Be sure that the NCS is plugged into your Pi at this point.
$ cd ~/workspace/ncappzoo/apps/hello_ncs_py
$ make run

making run
python3 hello_ncs.py;
Hello NCS! Device opened normally.
Goodbye NCS! Device closed normally.
NCS device working.
You should see exactly the output shown above.
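For reference, the hello_ncs.py test boils down to just a handful of NCS API calls. Here is a minimal sketch built from the same calls our benchmark script uses later in this post (enumerate devices, open the first one, close it):

# minimal NCS sanity check, using the same API calls as hello_ncs.py
from mvnc import mvncapi as mvnc

# grab a list of all NCS devices plugged in over USB
devices = mvnc.EnumerateDevices()

# bail if no stick is plugged in
if len(devices) == 0:
	print("No NCS devices found -- is the stick plugged in?")
	quit()

# open the first device, then close it again
device = mvnc.Device(devices[0])
device.OpenDevice()
print("Device opened normally.")
device.CloseDevice()
print("Device closed normally.")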
Using a pre-compiled graph file
Since we only have the API on our Pi, we’ll have to rely on a pre-generated graph file in order to perform our classification project today. I’ve included the relevant graph files in the “Downloads” section of this blog post.
Next week, I’ll be back to show you how you can generate your own graph files with the full-blown SDK installed on a capable Ubuntu desktop/laptop.
Classification with the Movidius NCS
If you open up the run.py file that we just created with the Makefile, you’ll notice that most inputs are hardcoded and that the file is ugly in general.
Instead, we’re going to create our own file for classification and benchmarking.
In a previous post, Deep learning on the Raspberry Pi with OpenCV, I described how to use OpenCV’s DNN module to perform object classification.
Today, we’re going to modify that exact same script to make it compatible with the Movidius NCS.
If you compare both scripts you’ll see that they are nearly identical. For this reason, I’ll simply be pointing out the differences, so I encourage you to refer to the previous post for full explanations.
Each script is included in the “Downloads” section of this blog post, so be sure to grab the zip and follow along.
Let’s review the differences in the modified file named pi_ncs_deep_learning.py:
# import the necessary packages
from mvnc import mvncapi as mvnc
import numpy as np
import argparse
import time
import cv2
Here we are importing our packages — the only difference is on Line 2 where we import the mvncapi as mvnc. This import is for the NCS API.
From there, we need to parse our command line arguments:
# construct the argument parse and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-i", "--image", required=True,
	help="path to input image")
ap.add_argument("-g", "--graph", required=True,
	help="path to graph file")
ap.add_argument("-d", "--dim", type=int, required=True,
	help="dimension of input to network")
ap.add_argument("-l", "--labels", required=True,
	help="path to ImageNet labels (i.e., syn-sets)")
args = vars(ap.parse_args())
In this block I’ve removed two arguments (--prototxt and --model) while adding two arguments (--graph and --dim).
The --graph argument is the path to our graph file — it takes the place of the prototxt and model.
Graph files can be generated via the NCS SDK, which we’ll cover in next week’s blog post. I’ve included this week’s graph files in the “Downloads” for convenience. In the case of Caffe, the graph is generated from the prototxt and model files with the SDK.
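As a quick preview of that step, graph generation is a single command on a machine with the full SDK installed (not the Pi). A rough sketch for the Caffe case looks like this, where the shave count (-s) and file names are illustrative rather than copy-paste ready:

$ mvNCCompile deploy.prototxt -w bvlc_googlenet.caffemodel \
	-s 12 -o graphs/googlenetgraph

Here -w points to the Caffe weights, -s sets the number of SHAVE cores to compile for, and -o names the output graph file.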
The --dim argument simply specifies the pixel dimensions of the (square) image we’ll be sending through the neural network. In the previous post, the image dimensions were hardcoded.
Next, we’ll load the class labels and input image from disk:
# load the class labels from disk
rows = open(args["labels"]).read().strip().split("\n")
classes = [r[r.find(" ") + 1:].split(",")[0] for r in rows]

# load the input image from disk, make a copy,
# resize it, and convert to float32
image_orig = cv2.imread(args["image"])
image = image_orig.copy()
image = cv2.resize(image, (args["dim"], args["dim"]))
image = image.astype(np.float32)
Here we’re loading the class labels from synset_words.txt with the same method as previously.
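To make that parsing concrete, each line of synset_words.txt pairs a WordNet ID with a comma-separated list of synonyms, and we keep only the first human-readable name. For example, using the first line of the standard ILSVRC file:

# one row of synset_words.txt: a WordNet ID, then comma-separated synonyms
row = "n01440764 tench, Tinca tinca"

# drop everything up to the first space, then keep the first synonym
label = row[row.find(" ") + 1:].split(",")[0]
print(label)  # "tench"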
Then, we utilize OpenCV to load the image.
One slight change is that we’re making a copy of the original image on Line 26. We need two copies — one for preprocessing/normalization/classification and one for displaying to our screen later on.
Line 27 resizes our image, and you’ll notice that we’re using args["dim"] — our command line argument value.
Common choices for the width and height of images input to Convolutional Neural Networks include 32 × 32, 64 × 64, 224 × 224, 227 × 227, 256 × 256, and 299 × 299. Your exact image dimensions will depend on which CNN you are using.
Line 28 converts the image array data to float32 format, which is a requirement for the NCS and the graph files we’re working with.
Next, we perform mean subtraction, but we’ll do it in a slightly different way this go around:
# load the mean file and normalize
ilsvrc_mean = np.load("ilsvrc_2012_mean.npy").mean(1).mean(1)
image[:,:,0] = (image[:,:,0] - ilsvrc_mean[0])
image[:,:,1] = (image[:,:,1] - ilsvrc_mean[1])
image[:,:,2] = (image[:,:,2] - ilsvrc_mean[2])
We load the ilsvrc_2012_mean.npy file on Line 31. This comes from the ImageNet Large Scale Visual Recognition Challenge and can be used for SqueezeNet, GoogLeNet, AlexNet, and typically all other networks trained on ImageNet that utilize mean subtraction (we hardcode the path for this reason).
The image mean subtraction is computed on Lines 32-34 (using the same method shown in the Movidius example scripts on GitHub).
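Since ilsvrc_mean collapses to a length-3 array (one value per channel) after the two .mean(1) reductions, the three explicit lines above are equivalent to a single NumPy broadcast, shown here purely to clarify what the per-channel subtraction is doing:

# equivalent to the three per-channel lines above, via NumPy broadcasting
image -= ilsvrc_mean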
From there, we need to establish communication with the NCS and load the graph into the NCS:
# grab a list of all NCS devices plugged in to USB
print("[INFO] finding NCS devices...")
devices = mvnc.EnumerateDevices()

# if no devices found, exit the script
if len(devices) == 0:
	print("[INFO] No devices found. Please plug in a NCS")
	quit()

# use the first device since this is a simple test script
print("[INFO] found {} devices. device0 will be used. "
	"opening device0...".format(len(devices)))
device = mvnc.Device(devices[0])
device.OpenDevice()

# open the CNN graph file
print("[INFO] loading the graph file into RPi memory...")
with open(args["graph"], mode="rb") as f:
	graph_in_memory = f.read()

# load the graph into the NCS
print("[INFO] allocating the graph on the NCS...")
graph = device.AllocateGraph(graph_in_memory)
As you can tell, the above code block is completely different because last time we didn’t use the NCS at all.
Let’s walk through it — it’s actually very straightforward.
In order to prepare to use a neural network on the NCS we need to perform the following actions:
- List all connected NCS devices (Line 38).
- Break out of the script altogether if there’s a problem finding one NCS (Lines 41-43).
- Select and open device0 (Lines 48 and 49).
- Load the graph file into Raspberry Pi memory so that we can transfer it to the NCS with the API (Lines 53 and 54).
- Load/allocate the graph on the NCS (Line 58).
The Movidius developers certainly got this right — their API is very easy to use!
In case you missed it above, it is worth noting here that we are loading a pre-trained graph. The training step has already been performed on a more powerful machine and the graph was generated by the NCS SDK. Training your own network is outside the scope of this blog post, but covered in detail in both PyImageSearch Gurus and Deep Learning for Computer Vision with Python.
You’ll recognize the following block if you read the previous post, but you’ll notice three changes:
# set the image as input to the network and perform a forward-pass to
# obtain our output classification
start = time.time()
graph.LoadTensor(image.astype(np.float16), "user object")
(preds, userobj) = graph.GetResult()
end = time.time()
print("[INFO] classification took {:.5} seconds".format(end - start))

# clean up the graph and device
graph.DeallocateGraph()
device.CloseDevice()

# sort the indexes of the probabilities in descending order (higher
# probability first) and grab the top-5 predictions
preds = preds.reshape((1, len(classes)))
idxs = np.argsort(preds[0])[::-1][:5]
Here we will classify the image with the NCS and the API.
Using our graph object, we call graph.LoadTensor to make a prediction and graph.GetResult to grab the resulting predictions. This is a two-step action, whereas before we simply called net.forward on a single line.
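For comparison, here is a rough, abbreviated sketch of what that step looked like with OpenCV’s DNN module in the previous post (see that post for the exact arguments):

# abbreviated OpenCV DNN flow from the previous post
# net = cv2.dnn.readNetFromCaffe(args["prototxt"], args["model"])
blob = cv2.dnn.blobFromImage(image, 1, (224, 224), (104, 117, 123))
net.setInput(blob)
preds = net.forward()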
We time these actions to compute our benchmark while displaying the result to the terminal just like previously.
We perform our housekeeping duties next by clearing the graph memory and closing the connection to the NCS on Lines 69 and 70.
From there we’ve got one remaining block to display our image to the screen (with a very minor change):
# loop over the top-5 predictions and display them
for (i, idx) in enumerate(idxs):
	# draw the top prediction on the input image
	if i == 0:
		text = "Label: {}, {:.2f}%".format(classes[idx],
			preds[0][idx] * 100)
		cv2.putText(image_orig, text, (5, 25),
			cv2.FONT_HERSHEY_SIMPLEX, 0.7, (0, 0, 255), 2)

	# display the predicted label + associated probability to the
	# console
	print("[INFO] {}. label: {}, probability: {:.5}".format(i + 1,
		classes[idx], preds[0][idx]))

# display the output image
cv2.imshow("Image", image_orig)
cv2.waitKey(0)
In this block, we draw the highest prediction and probability on the top of the image. We also print the top-5 predictions + probabilities in the terminal.
The very minor change in this block is that we’re drawing the text on image_orig rather than image.
Finally, we display the output image_orig on the screen. If you are using SSH to connect to your Raspberry Pi, this will only work if you supply the -X flag for X11 forwarding when SSH’ing into your Pi.
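In other words, connect with X11 forwarding enabled (substitute your Pi’s address):

$ ssh -X pi@<your-pi-ip-address>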
To see the results of applying deep learning image classification on the Raspberry Pi using the Intel Movidius Neural Compute Stick and Python, proceed to the next section.
Raspberry Pi and deep learning results
For this benchmark, we’re going to compare using the Pi CPU to using the Pi paired with the NCS coprocessor.
Just for fun, I also threw in the results from using my MacBook Pro with and without the NCS (which requires an Ubuntu 16.04 VM that we’ll be building and configuring next week).
We’ll be using three models:
- SqueezeNet
- GoogLeNet
- AlexNet
Just to keep things simple, we’ll be running the classification on the same image each time — a barber chair:
Since the terminal output results are quite long, I’m going to leave them out of the following blocks. Instead, I’ll be sharing a table of the results for easy comparison.
Here are the CPU commands (you can actually run this on your Pi or on your desktop/laptop despite pi in the filename):
# SqueezeNet with OpenCV DNN module using the CPU
$ python pi_deep_learning.py --prototxt models/squeezenet_v1.0.prototxt \
	--model models/squeezenet_v1.0.caffemodel --dim 227 \
	--labels synset_words.txt --image images/barbershop.png
[INFO] loading model...
[INFO] classification took 0.42588 seconds
[INFO] 1. label: barbershop, probability: 0.8526
[INFO] 2. label: barber chair, probability: 0.10092
[INFO] 3. label: desktop computer, probability: 0.01255
[INFO] 4. label: monitor, probability: 0.0060597
[INFO] 5. label: desk, probability: 0.004565

# GoogLeNet with OpenCV DNN module using the CPU
$ python pi_deep_learning.py --prototxt models/bvlc_googlenet.prototxt \
	--model models/bvlc_googlenet.caffemodel --dim 224 \
	--labels synset_words.txt --image images/barbershop.png
...

# AlexNet with OpenCV DNN module using the CPU
$ python pi_deep_learning.py --prototxt models/bvlc_alexnet.prototxt \
	--model models/bvlc_alexnet.caffemodel --dim 227 \
	--labels synset_words.txt --image images/barbershop.png
...
Note: In order to use the OpenCV DNN module, you must have OpenCV 3.3 at a minimum. You can install an optimized OpenCV 3.3 on your Raspberry Pi using these instructions.
And here are the NCS commands using the newly modified script that we just walked through above (again, you can run this on your Pi or on your desktop/laptop despite pi in the filename):
# SqueezeNet on NCS
$ python pi_ncs_deep_learning.py --graph graphs/squeezenetgraph \
	--dim 227 --labels synset_words.txt --image images/barbershop.png
[INFO] finding NCS devices...
[INFO] found 1 devices. device0 will be used. opening device0...
[INFO] loading the graph file into RPi memory...
[INFO] allocating the graph on the NCS...
[INFO] classification took 0.085902 seconds
[INFO] 1. label: barbershop, probability: 0.94482
[INFO] 2. label: restaurant, probability: 0.013901
[INFO] 3. label: shoe shop, probability: 0.010338
[INFO] 4. label: tobacco shop, probability: 0.005619
[INFO] 5. label: library, probability: 0.0035152

# GoogLeNet on NCS
$ python pi_ncs_deep_learning.py --graph graphs/googlenetgraph \
	--dim 224 --labels synset_words.txt --image images/barbershop.png
...

# AlexNet on NCS
$ python pi_ncs_deep_learning.py --graph graphs/alexnetgraph \
	--dim 227 --labels synset_words.txt --image images/barbershop.png
...
Note: In order to use the NCS, you must have a Raspberry Pi loaded with a fresh install of Raspbian (Stretch preferably) and the NCS API-only mode toolchain installed as per the instructions in this blog post. Alternatively you may use an Ubuntu machine or VM.
Please pay attention to both Notes above. You’ll need two separate microSD cards to complete these experiments: the NCS API-only mode toolchain uses OpenCV 2.4 and therefore does not have the new DNN module, and since you cannot use virtual environments with the NCS, you need completely isolated systems. Do yourself a favor and get a few spare microSD cards — I like the 32GB 98MB/s cards. Dual booting your Pi might be an option, but I’ve never tried it and don’t want to deal with the hassle of partitioned microSD cards.
Now for the results summarized in a table:
The NCS is clearly faster on the Pi than the Pi’s CPU for classification, achieving a 6.45x speedup (545%) on GoogLeNet. The NCS is sure to bring noticeable speed to the table on larger networks such as the three compared here.
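As a sanity check on that figure, the speedup arithmetic comes straight from the classification times the two scripts print. For example, using the SqueezeNet timings shown above:

# speedup computed from the SqueezeNet timings printed earlier
t_cpu = 0.42588   # seconds, OpenCV DNN on the Pi CPU
t_ncs = 0.085902  # seconds, on the Movidius NCS
print("{:.2f}x speedup".format(t_cpu / t_ncs))  # ~4.96x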
Note: The results gathered on the Raspberry Pi used my optimized OpenCV install instructions. If you are not using the optimized OpenCV install, you would see speedups in the range of 10-11x!
When comparing execution on my MacBook Pro (the CPU versus the NCS attached to the Ubuntu SDK VM), NCS performance is worse — this is expected for a number of reasons.
For starters, my MBP has a much more powerful CPU. It turns out that it’s faster to run the full inference on that CPU than to pay the added overhead of moving the image to the NCS and then pulling the results back.
Second, there is USB overhead when conducting USB passthrough to the VM. USB 3 isn’t supported via the VirtualBox USB passthrough either.
It is worth noting that the Raspberry Pi 3 B only has USB 2.0. If you really want speed in a single board computer setup, select a machine that supports USB 3.0. The difference in data transfer speeds alone will be apparent if you are benchmarking.
Next week’s results will be even more compelling when we compare real-time video FPS benchmarks, so be sure to check back on Monday.
Where to from here?
I’ll be back soon with another blog post to share with you how to generate your own custom graph files for the Movidius NCS.
I’ll also be describing how to perform object detection in real-time video using the Movidius NCS — we’ll benchmark and compare the FPS speedup, and I think you’ll be quite impressed.
In the meantime, be sure to check out the Movidius blog and TopCoder Competition.
Movidius blog on GitHub
Intel and Movidius are keeping their blog up to date on GitHub. Be sure to bookmark their page and/or subscribe to RSS:
You might also want to sign into GitHub and click the “watch” button on the Movidius repos:
TopCoder Competition
Are you interested in pushing the limits of the Intel Movidius Neural Compute Stick?
Intel is sponsoring a competition on TopCoder.
There are $20,000 in prizes up for grabs (first place wins $8,000)!
Registration and submission close on February 26, 2018.
Keep track of the leaderboard and standings!
Summary
Today we explored Intel’s new Movidius Neural Compute Stick. My goal here today was to expose you to this new deep learning device (which we’ll be using in future blog posts as well). I also demonstrated how to use the NCS workflow and API.
In general, the NCS workflow involves:
- Training a network with TensorFlow or Caffe using a machine running Ubuntu/Debian (or using a pre-trained network).
- Using the NCS SDK to generate a graph file.
- Deploying the graph file and NCS to your single board computer running a Debian flavor of Linux. We used a Raspberry Pi 3 B running Raspbian (Debian based).
- Performing inference, classification, object detection, etc.
Today, we skipped Steps 1 and 2. Instead I am providing graph files which you can begin using on your Pi immediately.
Then, we wrote our own classification benchmarking Python script and analyzed the results, which demonstrated a significant speedup (up to roughly 10x against a non-optimized OpenCV install) on the Raspberry Pi.
I’m quite impressed with the NCS capabilities so far — it pairs quite well with the Raspberry Pi and I think it is a great value if (1) you have a use case for it or (2) you just want to hack and tinker.
I hope you enjoyed today’s introductory post on Intel’s new Movidius Neural Compute Stick!
Alexander Sack
Adrian, how many custom graphs have you actually tried with the NCS?
I’ve been an early adopter of this stick and even run it under docker (you have to edit the install process a bit to get it to go but it will work that way quite happily).
I’ve found that Intel’s support for TensorFlow is woefully lacking, and the procedure you have to go through to edit your graph to get it ready to compile is tedious and error-prone. My biggest complaint hands down is that you really need to know Caffe to use it effectively (and I’m not a Caffe kinda guy, I prefer hot chocolate) and I have still not been able to do Keras->TF->Caffe->NCS or even PyTorch->Caffe->NCS (though I am still experimenting with the latter).
Adrian Rosebrock
Great question, thanks for asking Alexander.
To be honest, the number of custom graphs I’ve played around with is pretty small (6-7 tops). About half of them required some sort of change to the Caffe prototxt model definition file. I have not tried any TensorFlow models yet.
I hope Intel decides to support Keras inside their SDK in the future. Trying to train via Keras, export into TensorFlow, and then edit the model graph for the NCS sounds like a rat’s nest of issues. Even though you’re not a fan of Caffe, it’s likely the easiest way to go from trained model to NCS.
Patrick Poirier
Adrian,
Concerning your comment “Trying to train via Keras, export into TensorFlow, and then edit the model graph for the NCS sounds like a rat’s nest of issues”
Have you looked at this:
https://github.com/ardamavi/Intel-Movidius-NCS-Keras
It seems pretty straightforward
Best Regards
Adrian Rosebrock
Patrick! Great find. I wish I had come across that before. Have you used this method yet? Thank you for sharing.
Philipp
I love the NCS and really liked how easy it was to set up, even for me.
I am still desperately waiting for the ability to run TensorFlow MobileNet SSD models for object detection, as this is still not possible…
I have a custom trained model that is just waiting to get on that stick!!
🙂
Adrian Rosebrock
Hey Philipp — I’ll be covering Caffe SSD + MobileNets on the NCS next week. If I have time I’ll try to get it to work with a TensorFlow SSD + MobileNet. Is there a particular SSD + MobileNet you are trying to work with?
Philipp
Oh that would be wonderful!
Like many others who just got into object detection, I am working with the SSD MobileNet V1 COCO model (11_06_2017) from the TensorFlow Object Detection Model Zoo.
As far as I know, TensorFlow SSD models are not supported by the NCSDK.
Seamus
Thanks Adrian for another great blog. I just bought one for reasons 1 through 4. 🙂
Adrian Rosebrock
Enjoy it Seamus, let me know how it works for you 🙂
wally
How does this compare to the Google AIY Vision Kit co-processor board? There were initial issues with the AIY Vision Kit too, such that Microcenter gave me my money back, essentially issuing a recall.
Obvious differences are that the AIY kit board takes input directly from a Pi camera module and does a pass-through to the Pi Zero camera port — this was the issue, as the supplied cable didn’t fit the connector on the AIY board 🙁
I can afford a $77 “toy” so I will be ordering one and look forward to part 2.
Adrian Rosebrock
I have a Google AIY Vision Kit but I haven’t yet played around with it. I’m planning on doing a tutorial on the Google AIY later this month/early next and then I’ll be able to better discuss the differences between the two.
Tim U
Any results from your AIY Kit testing?
Adrian Rosebrock
I honestly still haven’t had a chance to do anything with it.
Sergei
Arr, this morning I’ve met this amazing stick, and this evening you published a manual. That’s destiny!
Adrian Rosebrock
I’m glad the post was timely, Sergei! 🙂
Ella
Thank you so much for the amazing post Adrian!!! Would you consider doing another tutorial on chaining multiple NCS’s in the future for video processing?
Adrian Rosebrock
I’ll be doing another blog post next week that covers video processing. It doesn’t cover using multiple NCS, but it does help you get started on the video processing route. Be on the look out for it!
Philippe Rivest
Hi!
Great guide 😀
Is it complicated to create my own models? For instance a model that classifies musical instruments.
Thank you
Adrian Rosebrock
Keep in mind that the Movidius is currently only supporting Caffe and TensorFlow models. Depending on how complex your model is and any type of special layers you use, it could be non-trivial to convert the model using the Movidius SDK.
It sounds like you’re interested in studying the basics of deep learning, which by definition includes training your own models. I have a number of tutorials that cover this on the PyImageSearch blog. You should also take a look at Deep Learning for Computer Vision with Python, where I discuss deep learning in detail.
I hope that helps!
simon
Hi, I have the same USB stick and wonder if it is able to run a custom DNN like SSD or YOLO?
Adrian Rosebrock
Yep! It’s absolutely possible to run SSD and YOLO on the Movidius. I’ll be demonstrating how to use the Movidius for object detection in next week’s post. Stay tuned.
Prathamesh
May I get a link to this post, if you did write it?
Adrian Rosebrock
I did write it. You can find it here. You can also use the search bar at the top-right of this page to search tutorials on PyImageSearch as well.
Jason Hoffman
Adrian, I had forgotten what a fantastic writer and teacher you are. Every couple of months I check back in, and it makes me wish I had a project to put your wisdom to use on. Keep up the great work, doc!
Adrian Rosebrock
Thank you for the kind words Jason 🙂
Foggy
Can this device be used to speed up your home surveillance RPi security cam?
Adrian Rosebrock
No, the home surveillance Raspberry Pi security cam is not using any deep learning. The Movidius NCS is meant to be used for running networks at a faster speed.
JBeale
Looking at one of the Movidius github examples, https://github.com/movidius/ncappzoo/blob/master/caffe/SSD_MobileNet/run.py
it looks to me like they have some bugs in their code they are trying to work around (and even a typo in the comment explaining it — they mean “non-finite” I think, not “non infinite”):
# boxes with non infinite (inf, nan, etc) numbers must be ignored
print('box at index: ' + str(box_index) + ' has nonfinite data, ignoring it')
Adrian Rosebrock
Getting the SSD + MobileNet detector to run was a bit of a process. I’ll be discussing it in next week’s post.
kaisar khatak
Adding compute to the Pi via USB? Very cool, even despite the USB 2.0 hardware constraint. Nvidia TX1/TX2 systems are still preferred though…
Have you tried running the video face matcher from the App Zoo? It looks like the code depends on OpenCV 3.3 built from source. Apparently the NCS supports newer versions of OpenCV, though I did see "-D BUILD_opencv_cnn_3dobj=OFF \ -D BUILD_opencv_dnn_modern=OFF" in the install-opencv-from_source.sh script.
https://github.com/movidius/ncappzoo/tree/master/apps/video_face_matcher
Cheers.
Adrian Rosebrock
I have not tried to run the face matcher. When running the install scripts for the Movidius it forcibly uninstalled previous versions of OpenCV on my system and then installed OpenCV 2.4. Hacking the make script to compile OpenCV 3 instead might be possible but it’s not something I’ve tried.
The TX1 and TX2 are great devices but they also have a heftier price tag. It’s hard to say which one is “preferable” as they both have their use cases. I would likely recommend on a case-by-case basis rather than saying “always use this one”.
Peter van Lith
Hi Adrian.
While playing around with the Movidius I am running into a problem with the first SqueezeNet example. First of all, in the previous block you are using python3, but in this one it says python. Isn’t that calling python2 instead of python3?
When I use python3 it starts executing the Makefile but fails because it cannot find mvNCProfile. It seems as if the API-only install is missing something, or is there perhaps a problem with the PYTHONPATH?
John Beale
One erratum: I followed these instructions from a fresh Raspbian install, but I found there was one item missing. I also had to do: sudo apt-get install python-opencv
after that, I was able to run the SqueezeNet ‘run.py’ example and see electric guitar 99.12%
Adrian Rosebrock
Thanks for sharing, John!
Raghvendra
Hi,
“Test the Raspberry Pi installation on the NCS” was a SUCCESS. But after that, when I try “Generating your Movidius NCS neural network”, it gives me the following error. What did I do wrong?
making prototxt
Prototxt file already exists
making profile
mvNCProfile deploy.prototxt -s 12
make: mvNCProfile: Command not found
Makefile:73: recipe for target 'profile' failed
make: *** [profile] Error 127
Colin
I ran into the same problem. Looks like we hit it at the same time.
Adrian Rosebrock
Hi Raghvendra — I think you are using the Makefile from Movidius. Is that correct? If so, the first message is that the prototxt has already been formatted and created. I’m not sure why the second message says mvNCProfile failed. Which model are you building, and can you include a link to the GitHub page if that’s where it came from?
Colin
I’m using the ncappzoo/caffe/SqueezeNet example.
Adrian Rosebrock
Hi Colin and Raghvendra — you’ll need the full-blown SDK to use the SqueezeNet Makefile. Please refer to this post where the SDK is used to generate graph files.
Sandor Seres
Hi,
I have tried to move my own trained network (written in Keras with a TensorFlow backend) to the stick (12 layers, plus some Dropout).
I already found https://github.com/ardamavi/Intel-Movidius-NCS-Keras, and it mostly seems to work.
But it still has issues with Dropout layers…
tensorflow.python.framework.errors_impl.InvalidArgumentError: You must feed a value for placeholder tensor 'dropout_5/keras_learning_phase' with dtype bool
[[Node: dropout_5/keras_learning_phase = Placeholder[dtype=DT_BOOL, shape=, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
I am wondering what to do:
– remove the Dropouts from the final model (but how?)
– retrain the whole model again without Dropout (a 10+ hour run on AWS p2.xlarge)
Anyone had similar problem?
S&|
David Hoffman
Sandor — I find it hard to believe (and quite frustrating) that mvNCCompile doesn’t support dropout regularization (or just doesn’t play nice with it). It appears that other users on the Movidius Forums (such as Yang) have experienced your exact problem. The solution mentioned there is to pass a constant of 1.0 from the dropout nodes. Dropout with the Movidius NCS might be a future blog post idea for Adrian. Report back if you are able to overcome this hurdle.
Ctvr
Oh, it’s amazing! Based on your article, it can help improve speed when running a DNN on the Raspberry Pi. But if I want to work with computer vision operations such as Canny edge detection, erosion, and dilation (with large kernels), can it improve speed on the Raspberry Pi?
Adrian Rosebrock
The NCS will not help with basic image processing/computer vision operations. To speed up standard OpenCV functions on the Pi you should compile and install an optimized version of OpenCV.
Raghvendra
Hi Adrian,
Thank you for the reply. I actually replicated all the steps you listed. I tried SqueezeNet and GoogLeNet. I did not take anything from GitHub; I just used the directories available in the Caffe examples folder.
Adrian Rosebrock
Thanks for sharing, Raghvendra!
Graham Wheeler
Is it useful for non-vision tasks? E.g. LSTM nets for text summarization?
Adrian Rosebrock
I have not tried it for non-vision tasks, but potentially it could be, provided your model is Caffe/TensorFlow, can be converted to an NCS-compatible model, and you can load your data into it. Again, I would suggest checking the Intel documentation on this one as I’m not entirely sure. Nearly every example I’ve seen with the NCS is vision related in some capacity.
Hein
Hi Adrian,
I am having this error while installing pip install ‘picamera[array]’. Could you please assist?
Adrian Rosebrock
Hi Hein — I’m not sure but you might need double quotes. Please refer to the docs.
Matt Trask
I had this exact same problem. If you copy and paste the line of text from this blog post, it has a “left double quote” symbol before the word picamera. Retyping the line makes it work.
claude
Hello Adrian,
Thanks a lot for your post.
I followed your post completely to install MOVIDIUS NCS.
But I have some questions:
1) Why not install the required bunch of packages in a virtual environment, as we usually do in your tutorials?
2) What version of OpenCV and Python is installed with this bunch of packages? After installing these packages on my RPi 3, I have Python 2.7.13 and not Python 3 (why?), and I have OpenCV version 2.4.9.1.
3) Is it possible to update the OpenCV and Python versions with this bunch of packages?
4) If not, why ?
5) Is this the only procedure recommended by Intel to access MOVIDIUS NCS?
Thanks in advance for your reply.
Kind regards.
Claude
David Hoffman
Hi Claude,
I don’t have the answer to “why” in general. It’s a software compatibility issue that you’ll have to take up with Intel.
1) You can try virtual environments, but for compatibility you need to use their install scripts. I was not able to get virtual environments to work.
2) OpenCV 2.4 and Python 2, but I bet if you make the effort, Python 3 will work.
3) I would not recommend updating OpenCV and Python.
4) I don’t have a good reason here. What functionality do you need from OpenCV 3 or Python 3 that you can’t do with the previous versions?
5) As far as I know this is what Intel recommends as of this date. Be sure to follow their blog to know if installation methods change. If that happens, I’m guessing Adrian will update the blog post.
Joakim Byström
Hello Adrian,
you’re doing great projects! But when I try to follow the instructions for the Movidius “install some packages” step, a few of them are not available. Does anyone have a solution?
The errors look like this:
sudo apt-get install -y libusb-1.0-0-dev libprotobuf-dev
E: Package ‘libusb-1.0-0-dev’ has no installation candidate
E: Unable to locate package libprotobuf-dev
Best wishes, Joakim Byström
Adrian Rosebrock
Hi Joakim,
I checked the Movidius instructions and they still match for those two packages.
Here is what I recommend for libusb-1.0-0-dev: install from the tarball available on http://libusb.info/.
For libprotobuf-dev, I’d suggest starting here on the Ubuntu package site.
Let me know if you get it up and running or if you run into further issues.
Mathieu
I had the same problem and I was not able to install the two packages, even with the link 🙁
Adrian Rosebrock
Hey Mathieu — make sure you read David Ramirez’s comment below.
David Ramirez
Hey Joakim, I had the same error and I fixed it with this tutorial: https://www.blackmoreops.com/2014/12/13/fixing-error-package-packagename-not-available-referred-another-package-may-mean-package-missing-obsoleted-available-another-source-e-pa/ Hope it is not too late.
Joakim Byström
Update: After coming back from my business trip, I ran apt-get update && apt-get upgrade again and then tried the same command as before:
$ sudo apt-get install -y libusb-1.0-0-dev libprotobuf-dev
It now worked without errors…
Best wishes, Joakim Byström
Adrian Rosebrock
Congrats on resolving the issue, Joakim!
Aluizio Rocha
Hi Adrian,
Thanks a lot for this great post about the Movidius Neural Compute Stick. I have been testing it on a Raspberry Pi Zero W and the results are amazing compared to using its CPU:
SqueezeNet – CPU: 7.3183 seconds, NCS: 0.11804 seconds, Speedup: 6,099.84%
GoogLeNet – CPU: 15.28 seconds, NCS: 0.15962 seconds, Speedup: 9,472.73%
AlexNet – CPU: 20.292 seconds, NCS: 0.12166 seconds, Speedup: 16,759.27%
For computer vision projects using a tiny Raspberry Pi Zero, the Movidius NCS is well worth it.
Best regards.
Adrian Rosebrock
Nice! Thanks so much for sharing these results, Aluizio. I appreciate as does the rest of the PyImageSearch community 🙂
Robin
If you could post the code without the $ it would be much easier to copy-paste-use.
Cheers
Adrian Rosebrock
Thanks for the suggestion Robin, however most Unix bash prompts end in a $. I place the $ before all shell commands so that the reader can distinguish between shell commands and stdout.
Aneeq
Thanks a lot for this amazing blog post! Just curious if the tutorial on generating our own graph files is up?
Adrian Rosebrock
No, I’ve been busy writing other content. I’d really like to write a tutorial on creating your own graph files but I’m not sure if/when that may be.
Jean-Marie
Hi, following this tutorial the NCS is perfectly working for inference, but it doesn’t display the image locally or via SSH:
The downloaded package has been installed in a subfolder of workspace. Something to do with root, perhaps?
(terminal output removed due to formatting destroying HTML)
(Image:548): Gtk-WARNING **: cannot open display:
Any idea ?
Adrian Rosebrock
Make sure you enable X11 forwarding when SSH’ing into the Pi. This will enable the window to open:
$ ssh -X pi@your_ip_address
Jean-Marie
Another question: following your link (https://www.raspberrypi.org/downloads/raspbian/), I installed RASPBIAN STRETCH LITE, so it doesn’t seem to have an X server config, which could explain my ‘Gtk-WARNING **: cannot open display’.
Must we install RASPBIAN STRETCH WITH DESKTOP for this NCS tutorial?
Adrian Rosebrock
As far as I know, X forwarding will still work even if you have a headless Pi. Just try the -X argument when connecting. Don’t forget that you’ll also need an X application on your own machine such as Xming or Quartz.
Michael
Adrian,
I’m running into an issue with locating 3 packages. I’m getting an error when installing
sudo apt-get install -y python3-pygraphviz python3-protobuf python3-network
and will get the following error
E: Unable to locate package python3-pygraphviz
E: Unable to locate package python3-protobuf
E: Unable to locate package python3-network
I’ve done an apt-get update and upgrade as suggested by others who have seen similar issues, but I still can’t seem to install these packages. Always the same errors. Any suggestions?
Best regards,
Michael
Adrian Rosebrock
Hi Michael — which Raspbian OS are you running? I believe when I wrote this blog post, I used Stretch and my hardware was a Pi 3 B (not the 3 B+ as it wasn’t released yet). Also, if you haven’t yet, be sure to refer to the Intel documentation page that I followed when standing up my Pi + Movidius.
Richard
Adrian,
FYI, Movidius has removed the /api/src folders from the ncsdk folder to release them under a different licence; those are the folders that should have been created with git clone https://github.com/movidius/ncsdk.
Richard
Tuan Anh
Hello Adrian,
Can I use the NCS for speeding up image processing? I know I have to transfer the image from the CPU to the NCS to process it, but I can’t find any examples on the Internet.
Adrian Rosebrock
No, the NCS is only for running pre-trained neural networks. You cannot “push” computation to it like you would with a traditional GPU.
Andrea
Very cool implementation! How long does it take to predict a single frame?
Adrian Rosebrock
See Figure 5 where I provide benchmark speeds.
Tyler LeCouffe
Hi
Apologies if I missed this answer already…my question is, was this tutorial adapted from Intel’s course? https://software.intel.com/en-us/ai-academy/students/kits/ai-on-the-edge-vision-movidius
I haven’t taken it yet but it seems pretty interesting too. Just wondering if you had any comments/feedback.
Thanks, Tyler
Adrian Rosebrock
I have not taken Intel’s course. I may in the future, but for the time being I cannot comment on it.
Max
Adrian, as you may already know, there’s the NCS2, which has not only been announced by Intel but has already appeared on the market. Have you got any plans to try it out?
Adrian Rosebrock
Yes, I will be using it in my upcoming Computer Vision + Raspberry Pi book.
Grant
Will these instructions work with the NCS2 and if not, do you plan to come out with new instructions for it?
Adrian Rosebrock
I’ll be covering the NCS2 inside my upcoming Computer Vision and Raspberry Pi book. That book will include new instructions.
Glen
I would like to see the testing with the newest Compute Stick 2, but also expanded testing using 4 of the sticks simultaneously via a hub, in comparison with the MacBook Pro.
Adrian, what would you theorize the results would be?
Navaneeth
Hi, I’ve installed the SDK on my Ubuntu machine, but how can I know whether my NCS is connected or not? Are there any commands?
Prathamesh
2 questions.
1. Will deeper models, such as Faster R-CNN with an Inception backbone, work on the NCS? I see that people are mostly talking about MobileNets and the like.
2. You mentioned that installing the OpenVINO SDK/toolkit messes up existing OpenCV and/or virtual environments. So, is there some way we can get around these issues and install the SDK for generating graphs from custom models?
Thanks!
Adrian Rosebrock
I’m answering both of those questions inside Raspberry Pi for Computer Vision.