Today’s blog post wouldn’t be possible without PyImageSearch Gurus member, Hans Boone. Hans is working on a computer vision project to automatically detect Machine-readable Zones (MRZs) in passport images — much like the region detected in the image above.
The MRZ region in passports or travel cards fall into two classes: Type 1 and Type 3. Type 1 MRZs are three lines, with each line containing 30 characters. The Type 3 MRZ only has two lines, but each line contains 44 characters. In either case, the MRZ encodes identifying information of a given citizen, including the type of passport, passport ID, issuing country, name, nationality, expiration date, etc.
Inside the PyImageSearch Gurus course, Hans showed me his progress on the project and I immediately became interested. I’ve always wanted to apply computer vision algorithms to passport images (mainly just for fun), but lacked the dataset to do so. Given the personal identifying information a passport contains, I obviously couldn’t write a blog post on the subject and share the images I used to develop the algorithm.
Luckily, Hans agreed to share some of the sample/specimen passport images he has access to — and I jumped at the chance to play with these images.
Now, before we get to far, it’s important to note that these passports are not “real” in the sense that they can be linked to an actual human being. But they are genuine passports that were generated using fake names, addresses, etc. for developers to work with.
You might think that in order to detect the MRZ region of a passport that you need a bit of machine learning, perhaps using the Linear SVM + HOG framework to construct an “MRZ detector” — but that would be overkill.
Instead, we can perform MRZ detection using only basic image processing techniques such as thresholding, morphological operations, and contour properties. In the remainder of this blog post, I’ll detail my own take on how to apply these methods to detect the MRZ region of a passport.
Detecting machine-readable zones in passport images
Let’s go ahead and get this project started. Open up a new file, name it
detect_mrz.py , and insert the following code:
# import the necessary packages from imutils import paths import numpy as np import argparse import imutils import cv2 # construct the argument parse and parse the arguments ap = argparse.ArgumentParser() ap.add_argument("-i", "--images", required=True, help="path to images directory") args = vars(ap.parse_args()) # initialize a rectangular and square structuring kernel rectKernel = cv2.getStructuringElement(cv2.MORPH_RECT, (13, 5)) sqKernel = cv2.getStructuringElement(cv2.MORPH_RECT, (21, 21))
Lines 2-6 import our necessary packages. I’ll assume you already have OpenCV installed. You’ll also need imutils, my collection of convenience functions to make basic image processing operations with OpenCV easier. You can install
$ pip install --upgrade imutils
From there, Lines 9-11 handle parsing our command line argument. We only need a single switch here,
--images , which is the path to the directory containing the passport images we are going to process.
Finally, Lines 14 and 15 initialize two kernels which we’ll later use when applying morphological operations, specifically the closing operation. For the time being, simply note that the first kernel is rectangular with a width approximately 3x larger than the height. The second kernel is square. These kernels will allow us to close gaps between MRZ characters and openings between MRZ lines.
Now that our command line arguments are parsed, we can start looping over each of the images in our dataset and process them:
# loop over the input image paths for imagePath in paths.list_images(args["images"]): # load the image, resize it, and convert it to grayscale image = cv2.imread(imagePath) image = imutils.resize(image, height=600) gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY) # smooth the image using a 3x3 Gaussian, then apply the blackhat # morphological operator to find dark regions on a light background gray = cv2.GaussianBlur(gray, (3, 3), 0) blackhat = cv2.morphologyEx(gray, cv2.MORPH_BLACKHAT, rectKernel)
Lines 20 and 21 loads our original image from disk and resizes it to have a maximum height of 600 pixels. You can see an example of an original image below:
Gaussian blurring is applied on Line 26 to reduce high frequency noise. We then apply a blackhat morphological operation to the blurred, grayscale image on Line 27.
A blackhat operator is used to reveal dark regions (i.e., MRZ text) against light backgrounds (i.e., the background of the passport itself). Since the passport text is always black on a light background (at least in terms of this dataset), a blackhat operation is appropriate. Below you can see the output of applying the blackhat operator:
The next step in MRZ detection is to compute the gradient magnitude representation of the blackhat image using the Scharr operator:
# compute the Scharr gradient of the blackhat image and scale the # result into the range [0, 255] gradX = cv2.Sobel(blackhat, ddepth=cv2.CV_32F, dx=1, dy=0, ksize=-1) gradX = np.absolute(gradX) (minVal, maxVal) = (np.min(gradX), np.max(gradX)) gradX = (255 * ((gradX - minVal) / (maxVal - minVal))).astype("uint8")
Here we compute the Scharr gradient along the x-axis of the blackhat image, revealing regions of the image that are not only dark against a light background, but also contain vertical changes in the gradient, such as the MRZ text region. We then take this gradient image and scale it back into the range [0, 255] using min/max scaling:
While it isn’t entirely obvious why we apply this step, I will say that it’s extremely helpful in reducing false-positive MRZ detections. Without it, we can accidentally mark embellished or designed regions of the passport as the MRZ. I will leave this as an exercise to you to verify that computing the gradient of the blackhat image can improve MRZ detection accuracy.
The next step is to try to detect the actual lines of the MRZ:
# apply a closing operation using the rectangular kernel to close # gaps in between letters -- then apply Otsu's thresholding method gradX = cv2.morphologyEx(gradX, cv2.MORPH_CLOSE, rectKernel) thresh = cv2.threshold(gradX, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)
First, we apply a closing operation using our rectangular kernel. This closing operation is meant to close gaps in between MRZ characters. We then apply thresholding using Otsu’s method to automatically threshold the image:
As we can see from the figure above, each of the MRZ lines is present in our threshold map.
The next step is to close the gaps between the actual lines, giving us one large rectangular region that corresponds to the MRZ:
# perform another closing operation, this time using the square # kernel to close gaps between lines of the MRZ, then perform a # series of erosions to break apart connected components thresh = cv2.morphologyEx(thresh, cv2.MORPH_CLOSE, sqKernel) thresh = cv2.erode(thresh, None, iterations=4)
Here we perform another closing operation, this time using our square kernel. This kernel is used to close gaps between the individual lines of the MRZ, giving us one large region that corresponds to the MRZ. A series of erosions are then performed to break apart connected components that may have been joined during the closing operation. These erosions are also helpful in removing small blobs that are irrelevant to the MRZ.
For some passport scans, the border of the passport may have become attached to the MRZ region during the closing operations. To remedy this, we set 5% of the left and right borders of the image to zero (i.e., black):
# during thresholding, it's possible that border pixels were # included in the thresholding, so let's set 5% of the left and # right borders to zero p = int(image.shape * 0.05) thresh[:, 0:p] = 0 thresh[:, image.shape - p:] = 0
You can see the output of our border removal below.
Compared to Figure 5 above, you can now see that the border has been removed.
The last step is to find the contours in our thresholded image and use contour properties to identify the MRZ:
# find contours in the thresholded image and sort them by their # size cnts = cv2.findContours(thresh.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE) cnts = imutils.grab_contours(cnts) cnts = sorted(cnts, key=cv2.contourArea, reverse=True) # loop over the contours for c in cnts: # compute the bounding box of the contour and use the contour to # compute the aspect ratio and coverage ratio of the bounding box # width to the width of the image (x, y, w, h) = cv2.boundingRect(c) ar = w / float(h) crWidth = w / float(gray.shape) # check to see if the aspect ratio and coverage width are within # acceptable criteria if ar > 5 and crWidth > 0.75: # pad the bounding box since we applied erosions and now need # to re-grow it pX = int((x + w) * 0.03) pY = int((y + h) * 0.03) (x, y) = (x - pX, y - pY) (w, h) = (w + (pX * 2), h + (pY * 2)) # extract the ROI from the image and draw a bounding box # surrounding the MRZ roi = image[y:y + h, x:x + w].copy() cv2.rectangle(image, (x, y), (x + w, y + h), (0, 255, 0), 2) break # show the output images cv2.imshow("Image", image) cv2.imshow("ROI", roi) cv2.waitKey(0)
On Line 56-58 we compute the contours (i.e., outlines) of our thresholded image. We then take these contours and sort them based on their size in descending order on Line 59 (implying that the largest contours are first in the list).
On Line 62 we start looping over our sorted list of contours. For each of these contours, we’ll compute the bounding box (Line 66) and use it to compute two properties: the aspect ratio and the coverage ratio. The aspect ratio is simply the width of the bounding box divided by the height. The coverage ratio is the width of the bounding box divided by the width of the actual image.
Using these two properties we can make a check on Line 72 to see if we are examining the MRZ region. The MRZ is rectangular, with a width that is much larger than the height. The MRZ should also span at least 75% of the input image.
Provided these two cases hold, Lines 75-84 use the (x, y)-coordinates of the bounding box to extract the MRZ and draw the bounding box on our input image.
Finally, Lines 87-89 display our results.
To see our MRZ detector in action, just execute the following command:
$ python detect_mrz.py --images examples
Below you can see of an example of a successful MRZ detection, with the MRZ outlined in green:
Here is another example of detecting the Machine-readable Zone in a passport image using Python and OpenCV:
It doesn’t matter if the MRZ region is at the top or the bottom of the image. By applying morphological operations, extracting contours, and computing contour properties, we are able to extract the MRZ without a problem.
The same is true for the following image:
Let’s give another image a try:
Up until now we have only seen Type 1 MRZs that contain three lines. However, our method works just as well with Type 3 MRZs that contain only two lines:
Here’s another example of detecting a Type 3 MRZ:
What's next? I recommend PyImageSearch University.
74 total classes • 84 hours of on-demand code walkthrough videos • Last updated: March 2023
★★★★★ 4.84 (128 Ratings) • 15,800+ Students Enrolled
I strongly believe that if you had the right teacher you could master computer vision and deep learning.
Do you think learning computer vision and deep learning has to be time-consuming, overwhelming, and complicated? Or has to involve complex mathematics and equations? Or requires a degree in computer science?
That’s not the case.
All you need to master computer vision and deep learning is for someone to explain things to you in simple, intuitive terms. And that’s exactly what I do. My mission is to change education and how complex Artificial Intelligence topics are taught.
If you're serious about learning computer vision, your next stop should be PyImageSearch University, the most comprehensive computer vision, deep learning, and OpenCV course online today. Here you’ll learn how to successfully and confidently apply computer vision to your work, research, and projects. Join me in computer vision mastery.
Inside PyImageSearch University you'll find:
- ✓ 74 courses on essential computer vision, deep learning, and OpenCV topics
- ✓ 74 Certificates of Completion
- ✓ 84 hours of on-demand video
- ✓ Brand new courses released regularly, ensuring you can keep up with state-of-the-art techniques
- ✓ Pre-configured Jupyter Notebooks in Google Colab
- ✓ Run all code examples in your web browser — works on Windows, macOS, and Linux (no dev environment configuration required!)
- ✓ Access to centralized code repos for all 500+ tutorials on PyImageSearch
- ✓ Easy one-click downloads for code, datasets, pre-trained models, etc.
- ✓ Access on mobile, laptop, desktop, etc.
In this blog post we learned how to detect Machine-readable Zones (MRZs) in passport scans using only basic image processing techniques, namely:
- Morphological operations (specifically, closings and erosions).
- Contour properties.
These operations, while simple, allowed us to detect the MRZ regions in images without having to rely on more advanced feature extraction and machine learning methods such as Linear SVM + HOG for object detection.
Remember, when faced with a challenging computer vision problem — always consider the problem and your assumptions! As this blog post demonstrates, you might be surprised what basic image processing functions used in tandem can accomplish.
Once again, a big thanks to PyImageSearch Gurus member, Hans Boone, who supplied us with these example passport images! Thanks Hans!
Download the Source Code and FREE 17-page Resource Guide
Enter your email address below to get a .zip of the code and a FREE 17-page Resource Guide on Computer Vision, OpenCV, and Deep Learning. Inside you'll find my hand-picked tutorials, books, courses, and libraries to help you master CV and DL!
66 responses to: Detecting machine-readable zones in passport images
Nicely done sir,. thank you for sharing.
Many thanks for your excellent blog, Adrian. Regarding this post, will it work on OpenCV3 and Python 3, or is the source code for OpenCV2.x and Python 2.7?
This code will work for both OpenCV 2.4 and OpenCV 3, along with both Python 2.7 and Python 3.
Awesome Adrian! Thanks for sharing!
Is there is away to unencrypt the MRZ via python ?
like getting name , surname , passport NO … etc
I personally do not know of any, but at the same time I don’t do a lot of work with passports.
There is a Python package called PassportEye that will recognize the MRZ, then OCR and parse the fields. The code does reference this blog post, but some of the image samples here do not work with his code.
Thank you for sharing Don!
Just a simple question : who is your ID specimen provider ?
Thank you by advance
Please see the first paragraph of the blog post: Hans, a member of the PyImageSearch Gurus course.
Thank for sharing.
Please convert it in c++ code.
hello i am very new, but i cant run this code. I tried to type: python detect_mrz.py –images examples in the command line but it says invalid syntax. I saved all the images to the folder named example and put it under my project path. plz help
Make sure you use the “Downloads” section of this post to download the source code + example images instead of trying to copy and paste the code and download the images yourself. This will ensure your project structure is the same as mine.
please tell me what is the purpose of this line:
image = imutils.resize(image, height=600)
That line of code resizes the image to have a height of 600 pixels, maintaining the aspect ratio.
I got lost in the ” gradX = (255 * ((gradX – minVal) / (maxVal – minVal))).astype(“uint8”) “.
If you cv2.imshow gradX after the np.absolute, you’ll see that in the MRZ zone, the letters (and almost everything) are just white spots without too much shape.
How was that operation able to create the result on figure 3?
PS: your OpenCV + case studies book is very nice and this blog is amazing!
After computing the gradient representation of the image, we need to scale it back to the range [0, 255] by applying “min-max scaling”. We then convert the data type from a float to an unsigned 8-bit integer which is what images are typically represented as when displaying them to our screen. Other OpenCV functions also assume an 8-bit unsigned integer data type.
Hello Adrian! Great Work!
Can you tell me what means ‘None’ in:
thresh = cv2.erode(thresh, None, iterations=4) line.
It must be kernel or some but what kernel will be passed if its argument is ‘None’ ?
Nonehere indicates that a default 3×3 structuring kernel should be used.
Adrian, can you tell me why image needs to be resized to 600 height?
When processing an input image, we rarely work with images that are larger than 600-800 pixels along their maximum dimension. The extra detail in high resolution images may look appeasing to the human eye, but the only “confuse” computer vision and image processing algorithms. The less detail there is, the easier it is for these algorithms to focus on the “structural” components of the image.
hello, I’m working on similar project and need your help to sort out somethings.
can you get in touch with me?
i would really be glad.
Hey Adrian am wondering if you can make a tutorial about this Extracting personal information on a passport such as owner’s photo, first / last name, birth date, document ID number, issue / expiration date, place of issue
I will consider it. In the mean time, I would suggest you read this post on OCR.
What are the arguments “-i” and “–image” that I need to execute ap.add_argument(“-i”, “–images”, required=True, help=”path to images directory”)?
Please read up on command line arguments.
Hello, good article.
I have questions, why the rectKernet size is (13, 5) and sqKernel(21,21)?. how you calculate that?
Does it depends on input image size?
It depends on your image size. All images hear are resized to have a height of 600 pixels. I fiddled with the kernel sizes experimentally after the image was resized to 600 pixels. This is common practice when building basic image processing workflows.
I have seen this in other people’s implementations, also in the openCV tutorials :
thresh = cv2.threshold(gradX, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)
Is there a specific reason for ORing the threshold parameters :
cv2.THRESH_BINARY | cv2.THRESH_OTSU ?
Yes, we take the bitwise OR on the two flags to indicate that we are doing both binary thresholding using Otsu’s method.
Can you explain me please these lines?
# during thresholding, it’s possible that border pixels were
# included in the thresholding, so let’s set 5% of the left and
# right borders to zero
p = int(image.shape * 0.05)
thresh[:, 0:p] = 0
thresh[:, image.shape – p:] = 0
To start we compute 5% of the width. Then set all pixels that are within 5% of the width on either side to zero. We do this because we know the MRZ cannot be along the width borders of the image.
Please provide for Android MRZ scanner
I don’t have any plans on building an Android MRZ scanner but if you implement this method in Java + OpenCV I am confident that you can do it 🙂
Thanks Jeremiah, I’m glad you liked it.
File “../../Downloads/mrz/detect_mrz.py”, line 90, in
NameError: name ‘roi’ is not defined
Make sure you are using the “Downloads” section of this blog post to download the source code and example images. This error was likely introduced through a copy and paste issue.
Even I am having the same issue, but I don’t think the reason is copy-paste or indentation. I am working with different image and it seems that this particular code is not able to find any ROIs in the image. I think I will have to fiddle with values of rectKernel, sqKernel and/or coverage in contour for-loop. Do you think these values would help to find ROIs? I have a person holding a white rectangular paper and I am trying to find a number on that paper.
Yes, you will want to fiddle with the size of the kernels. You can use the “cv2.imshow” function to help you debug the mask as you play with the values.
Many thanks for your excellent this post, Adrian.
This script works very well.
But there are some problems with Iraqi and Yemeni passports.
In the Iraqi passport, there is a barcode box above mrz. Which makes a mistake when calculating
Do you have a solution?
The Iraqi passport sample is at the following link:
What specifically are the “mistakes” the script is making? You’ll likely need to modify the heuristics of the script to crop the location the MRZ is supposed to be.
I want to just select mrz.
However, the barcode box (top box mrz) and mrz are chosen together!!!
What is the “top box MRZ”? Perhaps you could share an image of your output?
If I am not mistaken, this method works for MRZ is essentially due to two features. (1) the MRZ is black while the background is white, and (2) MRZ text is packed in a relatively large rectangular shape.
So how about credit card number? Those numbers are “3D” and the background of the card varies a lot depending on the issuing bank. I have tried out your credit card sample code, but those works for template card only.
Are there any suggestions on what kinds of features we should look into, so that we can readout the credit card number (or at least find the ROI). Thanks.
Hey Benedict — great question, thanks for asking. This coming Monday I’ll be publishing a blog post on “text detection” in natural scene images. Keep an eye out for it as it will address your exact question.
How can i apply the same technique to find machine readable zones in a document image.
One solution maybe to compare the height and width of the contour to detect paragraphs and lines but also detects a few unwanted contours.
Great article. How can we apply perspective transform to the ROI rectangle?
Follow the steps in this post.
Hi Adrian, thanks for the link , the four_point_transform method takes points as parameter but here we have ROI rectangle. Do i have to process the ROI with all the steps in the suggested post?
That depends on what the contents of your ROI is. What specifically is in your ROI?
My ROI is a MAT obtained using frame.submat(rectangle).
But what does the ROI actually contain? What are the visual contents of the ROI?
The visual content is the MRZ part. as described in your post. I’ve adapted your code to java.
I think you may want to go back and refactor how you’re extracting the ROI. If you can find the original ROI, assuming you’re doing so programmatically, just compute the rotated bounding box of the ROI (as we did in the original tutorial I linked you to), and then perform the perspective transform. There really isn’t a reason to apply a perspective transform after you’ve already extracted the ROI, it just makes the pipeline more complicated.
Hi, I’m new in Android Development and specially in OpenCV, this is the first time I’m using OpenCV, can you please share me the code and guide me how can I get the MRZ image from camera using Java ?
I’m wondering if you plan to make a tutorial to OCR the MRZ. Preprocessing the MRZ to get good results. I’ve tried with Tesseract but the result varies since I use live camera and i guess i need to thin the character too.
Hey Leks — I don’t have any plans to cover OCR’ing the MRZ but I will consider it for a future tutorial (I don’t know if/when I will cover it though).
The code is running fine but it is only getting the first line, not both.Need help please
Thank your for this post. When I get the result, the bounding box does not fully enclose the MRZ code. How can I slightly expand the height and width of the box so that it fully encloses the MRZ?
You can define a heuristic to pad the bounding boxes. Take the width and height of the bounding box, multiply it by your N-percent value. Subtract the value from the top-left corner. Add them to the bottom-right.
Thank you I was able to get it working!
Nice job, Danny!
thanks a lot for this excellent piece. Works like a charm. And it’s a perfect introduction for opencv for newbies like me.
Just one note:
i was trying with other images (photos of my id). In some cases -actually when the photo is better: well focused, no shadows, no blur, etc.-, the step “close gaps between lines” didn’t work 100%. It does not close the three lines of the MRZ, and it’s just closing the first two. So, afterwards, the found countour just includes the first two lines of the MRZ.
I fixed it (at least successfuly till now) changing from 21 to 55 here:
##sqKernel = cv2.getStructuringElement(cv2.MORPH_RECT, (21, 21))
sqKernel = cv2.getStructuringElement(cv2.MORPH_RECT, (55, 55))
It was a totally random value: I don’t even know what this parameter exactly is. But I think it’s worth noting it!
In fact, I’m trying to implement MRZ from browser/webcam images, where quality is not very good and, worse: people -me the first- are not steady hand precisely xD
For these images, where the content, apart from the document, includes fingers in the corners, half a face of the guy, etc. the roi detection from this blog post does not work… but as said, is a perfect introduction to opencv, so I’ll try to study about it and getting it done.
Actually, 55,55 did not work in some lighting photo; but this does:
sqKernel = cv2.getStructuringElement(cv2.MORPH_RECT, (33, 33))
Gregorius Ivan Sebastian
Very awesome tutorial, Adrian. Everything is explained in-depth and very informative. There is one line that I do not understand is the operations at line 75-78, why are the points computed with such equations?