
Recognizing a simple answer sheet (Bubble sheet multiple choice scanner and test grader using OMR, Python and OpenCV — re-edited by jsxyhelu)

This post is adapted from www.pyimagesearch.com, with modifications and additions.

Figure 14: Recognizing bubble sheet exams using computer vision.

Over the past few months I’ve gotten quite the number of requests landing in my inbox to build a bubble sheet/Scantron-like【in short, an answer sheet】 test reader using computer vision and image processing techniques.

And while I’ve been having a lot of fun doing this series on machine learning and deep learning, I’d be lying if I said this little mini-project wasn’t a short, welcome break. One of my favorite parts of running the PyImageSearch blog is demonstrating how to build actual solutions to problems using computer vision.【Clearly, PyImageSearch is shifting toward deep learning; in fact, that is what I plan to do next as well】

In fact, what makes this project so special is that we are going to combine the techniques from many previous blog posts, including building a document scanner, contour sorting, and perspective transforms. Using the knowledge gained from these previous posts, we’ll be able to make quick work of this bubble sheet scanner and test grader.

You see, last Friday afternoon I quickly Photoshopped an example bubble test paper, printed out a few copies, and then set to work on coding up the actual implementation.

Overall, I am quite pleased with this implementation and I think you’ll absolutely be able to use this bubble sheet grader/OMR system as a starting point for your own projects.【completing a full project like this still has real, distinctive value】

Bubble sheet scanner and test grader using OMR, Python, and OpenCV

In the remainder of this blog post, I’ll discuss what exactly Optical Mark Recognition (OMR) is. I’ll then demonstrate how to implement a bubble sheet test scanner and grader using strictly computer vision and image processing techniques, along with the OpenCV library.

Once we have our OMR system implemented, I’ll provide sample results of our test grader on a few example exams, including ones that were filled out with nefarious intent.

Finally, I’ll discuss some of the shortcomings of this current bubble sheet scanner system and how we can improve it in future iterations.

What is Optical Mark Recognition (OMR)?【definition of the term】

Optical Mark Recognition, or OMR for short, is the process of automatically analyzing human-marked documents and interpreting their results.【a nice definition】

Arguably, the most famous, easily recognizable form of OMR is the bubble sheet multiple choice test, not unlike the ones you took in elementary school, middle school, or even high school.

If you’re unfamiliar with “bubble sheet tests” or the trademark/corporate name of “Scantron tests”, they are simply multiple-choice tests that you take as a student. Each question on the exam is multiple choice, and you use a #2 pencil to mark the “bubble” that corresponds to the correct answer.【I’m thinking this work, together with GOGPY, could become excellent new video content】

The most notable bubble sheet test you experienced (at least in the United States) was the SAT, taken during high school prior to filling out college admission applications.

I believe that the SATs use the software provided by Scantron to perform OMR and grade student exams, but I could easily be wrong there. I only make note of this because Scantron is used in over 98% of all US school districts.

In short, what I’m trying to say is that there is a massive market for Optical Mark Recognition and the ability to grade and interpret human-marked forms and exams.

【If this could be combined with customized answer sheets, the project would be very complete. With accumulation over time it could become a flagship project. This will be important work for October.】

Implementing a bubble sheet scanner and grader using OMR, Python, and OpenCV

Now that we understand the basics of OMR, let’s build a computer vision system using Python and OpenCV that can read and grade bubble sheet tests.

Of course, I’ll be providing lots of visual example images along the way so you can understand exactly what techniques I’m applying and why I’m using them.

Below I have included an example filled-in bubble sheet exam that I have put together for this project:

Figure 1: The example, filled in bubble sheet we are going to use when developing our test scanner software.

We’ll be using this as our example image as we work through the steps of building our test grader. Later in this lesson, you’ll also find additional sample exams.

I have also included a blank exam template as a .PSD (Photoshop) file so you can modify it as you see fit. You can use the “Downloads” section at the bottom of this post to download the code, example images, and template file.【providing a doc of the template seems fine to me】

The 7 steps to build a bubble sheet scanner and grader

The goal of this blog post is to build a bubble sheet scanner and test grader using Python and OpenCV.

To accomplish this, our implementation will need to satisfy the following 7 steps:【breaking it into 7 steps is a very professional presentation】

  • Step #1: Detect the exam in an image.
  • Step #2: Apply a perspective transform to extract the top-down, birds-eye-view of the exam.
  • Step #3: Extract the set of bubbles (i.e., the possible answer choices) from the perspective transformed exam.
  • Step #4: Sort the questions/bubbles into rows.
  • Step #5: Determine the marked (i.e., “bubbled in”) answer for each row.
  • Step #6: Lookup the correct answer in our answer key to determine if the user was correct in their choice.
  • Step #7: Repeat for all questions in the exam.

The next section of this tutorial will cover the actual implementation of our algorithm.

The bubble sheet scanner implementation with Python and OpenCV

To get started, open up a new file, name it test_grader.py, and let’s get to work:

```python
# import the necessary packages
from imutils.perspective import four_point_transform
from imutils import contours
import numpy as np
import argparse
import imutils
import cv2

# construct the argument parser and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-i", "--image", required=True,
	help="path to the input image")
args = vars(ap.parse_args())

# define the answer key which maps the question number
# to the correct answer
ANSWER_KEY = {0: 1, 1: 4, 2: 0, 3: 3, 4: 1}
```

On Lines 2-7 we import our required Python packages.

You should already have OpenCV and NumPy installed on your system, but you might not have the most recent version of imutils, my set of convenience functions to make performing basic image processing operations easier. To install imutils (or upgrade to the latest version), just execute the following command:

```shell
$ pip install --upgrade imutils
```

Lines 10-12 parse our command line arguments. We only need a single switch here, --image, which is the path to the input bubble sheet test image that we are going to grade for correctness.

Line 17 then defines our ANSWER_KEY.

As the name of the variable suggests, the ANSWER_KEY provides integer mappings of the question numbers to the index of the correct bubble.

In this case, a key of 0 indicates the first question, while a value of 1 signifies “B” as the correct answer (since “B” is the index 1 in the string “ABCDE”). As a second example, consider a key of 1 that maps to a value of 4 — this would indicate that the answer to the second question is “E”.
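The index-to-letter mapping is easy to sanity-check with a few lines of Python (the CHOICES string is introduced here purely for illustration; it is not part of the original script):

```python
# map each 0-based answer index to its letter, assuming five
# choices per question labeled "A" through "E"
ANSWER_KEY = {0: 1, 1: 4, 2: 0, 3: 3, 4: 1}
CHOICES = "ABCDE"

for question in sorted(ANSWER_KEY):
    print("Question #{}: {}".format(question + 1, CHOICES[ANSWER_KEY[question]]))
```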

As a matter of convenience, I have written the entire answer key in plain English here:

  • Question #1: B
  • Question #2: E
  • Question #3: A
  • Question #4: D
  • Question #5: B

Next, let’s preprocess our input image:

```python
# load the image, convert it to grayscale, blur it
# slightly, then find edges
image = cv2.imread(args["image"])
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blurred = cv2.GaussianBlur(gray, (5, 5), 0)
edged = cv2.Canny(blurred, 75, 200)
```

On Line 21 we load our image from disk, followed by converting it to grayscale (Line 22), and blurring it to reduce high frequency noise (Line 23).

We then apply the Canny edge detector on Line 24 to find the edges/outlines of the exam.

Below I have included a screenshot of our exam after applying edge detection:

Figure 2: Applying edge detection to our exam neatly reveals the outlines of the paper.

Notice how the edges of the document are clearly defined, with all four vertices of the exam being present in the image.

Obtaining this silhouette of the document is extremely important in our next step as we will use it as a marker to apply a perspective transform to the exam, obtaining a top-down, birds-eye-view【a very professional point】 of the document:

【Note the use of approxPolyDP here; it is a highlight】

```python
# find contours in the edge map, then initialize
# the contour that corresponds to the document
cnts = cv2.findContours(edged.copy(), cv2.RETR_EXTERNAL,
	cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if imutils.is_cv2() else cnts[1]
docCnt = None

# ensure that at least one contour was found
if len(cnts) > 0:
	# sort the contours according to their size in
	# descending order
	cnts = sorted(cnts, key=cv2.contourArea, reverse=True)

	# loop over the sorted contours
	for c in cnts:
		# approximate the contour
		peri = cv2.arcLength(c, True)
		approx = cv2.approxPolyDP(c, 0.02 * peri, True)

		# if our approximated contour has four points,
		# then we can assume we have found the paper
		if len(approx) == 4:
			docCnt = approx
			break
```

Now that we have the outline of our exam, we apply the cv2.findContours  function to find the lines that correspond to the exam itself.

We do this by sorting our contours by their area (from largest to smallest) on Line 37 (after making sure at least one contour was found on Line 34, of course【boundary handling, which is very important】). This implies that larger contours will be placed at the front of the list, while smaller contours will appear farther back in the list.

We make the assumption that our exam will be the main focal point of the image, and thus be larger than other objects in the image. This assumption allows us to “filter” our contours, simply by investigating their area and knowing that the contour that corresponds to the exam should be near the front of the list.
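The area-based sort can be illustrated without an actual image. The snippet below uses the shoelace formula as a stand-in for cv2.contourArea on a few hypothetical rectangular contours; the largest one, playing the role of the exam, ends up first:

```python
import numpy as np

def poly_area(c):
    # shoelace formula for the area of a simple polygon,
    # standing in for cv2.contourArea in this illustration
    x, y = c[:, 0], c[:, 1]
    return 0.5 * abs(np.dot(x, np.roll(y, 1)) - np.dot(y, np.roll(x, 1)))

# three made-up contours: only the second is exam-sized
cnts = [
    np.array([[0, 0], [10, 0], [10, 10], [0, 10]]),      # area 100
    np.array([[0, 0], [200, 0], [200, 300], [0, 300]]),  # area 60000
    np.array([[0, 0], [50, 0], [50, 40], [0, 40]]),      # area 2000
]

# sort largest-first, exactly as the grader does with cv2.contourArea
cnts = sorted(cnts, key=poly_area, reverse=True)
```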

However, contour area and size is not enough — we should also check the number of vertices on the contour.

To do this, we loop over each of our (sorted) contours on Line 40. For each of them, we approximate the contour, which in essence means we simplify the number of points in the contour, making it a “more basic” geometric shape. You can read more about contour approximation in this post on building a mobile document scanner.

On Line 47 we make a check to see if our approximated contour has four points, and if it does, we assume that we have found the exam.

Below I have included an example image that demonstrates the docCnt  variable being drawn on the original image:

Figure 3: An example of drawing the contour associated with the exam on our original image, indicating that we have successfully found the exam.

Sure enough, this area corresponds to the outline of the exam.

Now that we have used contours to find the outline of the exam, we can apply a perspective transform to obtain a top-down, birds-eye-view of the document:

【see his thinking here】

```python
# apply a four point perspective transform to both the
# original image and grayscale image to obtain a top-down
# birds eye view of the paper
paper = four_point_transform(image, docCnt.reshape(4, 2))
warped = four_point_transform(gray, docCnt.reshape(4, 2))
```

For reference, the four_point_transform helper from imutils is reproduced below:

```python
def four_point_transform(image, pts):
	# obtain a consistent order of the points and unpack them
	# individually
	rect = order_points(pts)
	(tl, tr, br, bl) = rect

	# compute the width of the new image, which will be the
	# maximum distance between bottom-right and bottom-left
	# x-coordinates or the top-right and top-left x-coordinates
	widthA = np.sqrt(((br[0] - bl[0]) ** 2) + ((br[1] - bl[1]) ** 2))
	widthB = np.sqrt(((tr[0] - tl[0]) ** 2) + ((tr[1] - tl[1]) ** 2))
	maxWidth = max(int(widthA), int(widthB))

	# compute the height of the new image, which will be the
	# maximum distance between the top-right and bottom-right
	# y-coordinates or the top-left and bottom-left y-coordinates
	heightA = np.sqrt(((tr[0] - br[0]) ** 2) + ((tr[1] - br[1]) ** 2))
	heightB = np.sqrt(((tl[0] - bl[0]) ** 2) + ((tl[1] - bl[1]) ** 2))
	maxHeight = max(int(heightA), int(heightB))

	# now that we have the dimensions of the new image, construct
	# the set of destination points to obtain a "birds eye view",
	# (i.e. top-down view) of the image, again specifying points
	# in the top-left, top-right, bottom-right, and bottom-left
	# order
	dst = np.array([
		[0, 0],
		[maxWidth - 1, 0],
		[maxWidth - 1, maxHeight - 1],
		[0, maxHeight - 1]], dtype="float32")

	# compute the perspective transform matrix and then apply it
	M = cv2.getPerspectiveTransform(rect, dst)
	warped = cv2.warpPerspective(image, M, (maxWidth, maxHeight))

	# return the warped image
	return warped
```
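To make the width/height computation concrete, here is a small worked example on a hypothetical, already-ordered rectangle (the corner coordinates are chosen arbitrarily):

```python
import numpy as np

# hypothetical ordered corners: top-left, top-right, bottom-right, bottom-left
rect = np.array([[0, 0], [400, 10], [410, 300], [5, 310]], dtype="float32")
(tl, tr, br, bl) = rect

# width: the larger of the bottom and top edge lengths
widthA = np.sqrt(((br[0] - bl[0]) ** 2) + ((br[1] - bl[1]) ** 2))  # ~405.1
widthB = np.sqrt(((tr[0] - tl[0]) ** 2) + ((tr[1] - tl[1]) ** 2))  # ~400.1
maxWidth = max(int(widthA), int(widthB))

# height: the larger of the right and left edge lengths
heightA = np.sqrt(((tr[0] - br[0]) ** 2) + ((tr[1] - br[1]) ** 2))  # ~290.2
heightB = np.sqrt(((tl[0] - bl[0]) ** 2) + ((tl[1] - bl[1]) ** 2))  # ~310.0
maxHeight = max(int(heightA), int(heightB))

print(maxWidth, maxHeight)
```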

```python
def order_points(pts):
	# initialize a list of coordinates that will be ordered
	# such that the first entry in the list is the top-left,
	# the second entry is the top-right, the third is the
	# bottom-right, and the fourth is the bottom-left
	rect = np.zeros((4, 2), dtype="float32")

	# the top-left point will have the smallest sum, whereas
	# the bottom-right point will have the largest sum
	s = pts.sum(axis=1)
	rect[0] = pts[np.argmin(s)]
	rect[2] = pts[np.argmax(s)]

	# now, compute the difference between the points, the
	# top-right point will have the smallest difference,
	# whereas the bottom-left will have the largest difference
	diff = np.diff(pts, axis=1)
	rect[1] = pts[np.argmin(diff)]
	rect[3] = pts[np.argmax(diff)]

	# return the ordered coordinates
	return rect
```
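The sum/difference trick can be verified on a handful of scrambled, hypothetical points:

```python
import numpy as np

# four made-up corners, deliberately out of order, as (x, y) pairs
pts = np.array([[300, 20], [10, 10], [320, 310], [15, 290]], dtype="float32")

rect = np.zeros((4, 2), dtype="float32")

# top-left has the smallest x + y; bottom-right has the largest
s = pts.sum(axis=1)
rect[0] = pts[np.argmin(s)]
rect[2] = pts[np.argmax(s)]

# top-right has the smallest y - x; bottom-left has the largest
d = np.diff(pts, axis=1)
rect[1] = pts[np.argmin(d)]
rect[3] = pts[np.argmax(d)]
```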