Hands-On Project: Answer Sheet Recognition and Grading (Continued)
After the contour detection in the previous part, the next step is to apply a perspective transform to the image in order to deskew and correct it.
# apply a four point perspective transform to both the original image and the grayscale image
paper = four_point_transform(image, docCnt.reshape(4, 2))
warped = four_point_transform(gray, docCnt.reshape(4, 2))
We use the four_point_transform function, which orders the contour's (x, y) coordinates in a specific, repeatable way and then applies a perspective transform to the region enclosed by the contour. For now, it is enough to know what the transform produces.
As you can see, the image has been deskewed by the transform and is ready for the next processing step.
We extracted the answer sheet from the original image and applied a perspective transform to obtain a top-down, 90-degree bird's-eye view of it.
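four_point_transform is provided by imutils.perspective. If you are curious how it works, the sketch below is a simplified reimplementation for illustration only (the _sketch function names are mine, not the library's): order the four corners as top-left, top-right, bottom-right, bottom-left, derive the output size from the edge lengths, then call cv2.getPerspectiveTransform and cv2.warpPerspective.

import numpy as np
import cv2

def order_points_sketch(pts):
    # order a (4, 2) point array as top-left, top-right, bottom-right, bottom-left
    rect = np.zeros((4, 2), dtype="float32")
    s = pts.sum(axis=1)
    rect[0] = pts[np.argmin(s)]          # top-left: smallest x + y
    rect[2] = pts[np.argmax(s)]          # bottom-right: largest x + y
    d = np.diff(pts, axis=1)             # y - x for each point
    rect[1] = pts[np.argmin(d)]          # top-right: smallest y - x
    rect[3] = pts[np.argmax(d)]          # bottom-left: largest y - x
    return rect

def four_point_transform_sketch(image, pts):
    # illustrative version of the idea behind imutils' four_point_transform
    (tl, tr, br, bl) = order_points_sketch(pts)
    # output width/height: the longer of each pair of opposite edges
    maxW = int(max(np.linalg.norm(br - bl), np.linalg.norm(tr - tl)))
    maxH = int(max(np.linalg.norm(tr - br), np.linalg.norm(tl - bl)))
    dst = np.array([[0, 0], [maxW - 1, 0],
                    [maxW - 1, maxH - 1], [0, maxH - 1]], dtype="float32")
    src = np.array([tl, tr, br, bl], dtype="float32")
    M = cv2.getPerspectiveTransform(src, dst)
    return cv2.warpPerspective(image, M, (maxW, maxH))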
Next we need to grade the individual questions.
This step starts with binarization, that is, separating the foreground of the image from its background by thresholding.
# apply Otsu's thresholding method to binarize the warped grayscale image
thresh = cv2.threshold(warped, 0, 255,
    cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)[1]
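cv2.threshold returns a (threshold value, thresholded image) tuple; because cv2.THRESH_OTSU is set, the threshold argument of 0 is ignored and the value is computed automatically from the image histogram. If you want to inspect the value Otsu picked, keep the first element instead of discarding it with [1] (the name T below is just for illustration):

# with THRESH_OTSU, the returned threshold value is the one Otsu computed
T, thresh = cv2.threshold(warped, 0, 255,
    cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)
print("[INFO] Otsu threshold: {:.1f}".format(T))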
Our image is now strictly binary: the background is black and the foreground is white.
This binarization lets us apply contour extraction once more, this time to find the bubble for each answer choice of every question.
# find contours in the thresholded image, then initialize the list of
# contours that correspond to questions
cnts = cv2.findContours(thresh.copy(), cv2.RETR_EXTERNAL,
    cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if imutils.is_cv2() else cnts[1]
questionCnts = []

# loop over the contours
for c in cnts:
    # compute the bounding box of the contour, then use it to derive the aspect ratio
    (x, y, w, h) = cv2.boundingRect(c)
    ar = w / float(h)

    # to be labeled a bubble, a contour's bounding box must not be too small
    # (at least 20 pixels on a side here) and its aspect ratio must be close to 1
    if w >= 20 and h >= 20 and ar >= 0.9 and ar <= 1.1:
        questionCnts.append(c)
Starting from the contours of the binary image, we compute each contour's bounding box and use it to decide whether the contour is a bubble; if it is, we append it to the question list questionCnts.
Drawing the contours in our question list on the image gives the figure below:
Only the question bubbles are outlined; nothing else is.
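That visualization can be reproduced with a few lines like the following (the copy and the window name are just for illustration):

# draw every detected bubble contour on a copy of the deskewed sheet
vis = paper.copy()
cv2.drawContours(vis, questionCnts, -1, (0, 0, 255), 2)
cv2.imshow("Question Bubbles", vis)
cv2.waitKey(0)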
Now we can grade the sheet:
# sort the question contours top-to-bottom, then initialize
# the total number of correct answers
questionCnts = contours.sort_contours(questionCnts,
    method="top-to-bottom")[0]
correct = 0

# each question has 5 possible answers, so loop over the bubbles in batches of 5
for (q, i) in enumerate(np.arange(0, len(questionCnts), 5)):
    # sort the contours for the current question from left to right,
    # then initialize the index of the bubbled-in answer
    cnts = contours.sort_contours(questionCnts[i:i + 5])[0]
    bubbled = None
First, we sort questionCnts from top to bottom so that rows of bubbles nearer the top of the sheet appear earlier in the list. Then, within each row, we sort the bubbles from left to right so that the leftmost bubble comes first. In other words, the contours are first ordered by their y-coordinate; the five bubbles in one row have nearly the same y-coordinate, so they end up grouped together, with the rows ordered from top to bottom, and each group of five is then ordered by its x-coordinate.
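contours.sort_contours comes from imutils. Conceptually it just sorts the contours by one coordinate of their bounding boxes; the sketch below is a simplified stand-in for illustration, not the library's exact code:

import cv2

def sort_contours_sketch(cnts, method="left-to-right"):
    # sort contours by the x (or y) coordinate of their bounding boxes
    boxes = [cv2.boundingRect(c) for c in cnts]      # each box is (x, y, w, h)
    reverse = method in ("right-to-left", "bottom-to-top")
    key_index = 1 if method in ("top-to-bottom", "bottom-to-top") else 0
    ordered = sorted(zip(cnts, boxes),
                     key=lambda pair: pair[1][key_index], reverse=reverse)
    cnts, boxes = zip(*ordered)
    return cnts, boxes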
Second, we need to determine which bubble has been filled in. We can do this by counting the non-zero pixels inside each bubble region of the binary image.
# loop over the left-to-right sorted bubble contours of the current row
for (j, c) in enumerate(cnts):
    # construct a mask that reveals only the current bubble
    mask = np.zeros(thresh.shape, dtype="uint8")
    cv2.drawContours(mask, [c], -1, 255, -1)

    # apply the mask to the thresholded image, then count
    # the number of non-zero pixels in the bubble area
    mask = cv2.bitwise_and(thresh, thresh, mask=mask)
    total = cv2.countNonZero(mask)

    # if this count is the largest so far, record it together with the option index
    if bubbled is None or total > bubbled[0]:
        bubbled = (total, j)
Next, we look up the question in the answer key and decide whether the marked answer is right or wrong.
If the bubbled-in answer is correct, it is circled in green; otherwise, the correct answer is circled in red:
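This is the portion of the script that handles it (the same lines appear again in the complete listing below):

# initialize the contour color and the index of the *correct* answer
color = (0, 0, 255)
k = ANSWER_KEY[q]

# check to see if the bubbled answer is correct
if k == bubbled[1]:
    color = (0, 255, 0)
    correct += 1

# draw the outline of the correct answer on the test
cv2.drawContours(paper, [cnts[k]], -1, color, 3)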
# USAGE
# python test_grader.py --image test_01.png

# import the necessary packages
from imutils.perspective import four_point_transform
from imutils import contours
import numpy as np
import argparse
import imutils
import cv2

# construct the argument parse and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-i", "--image", required=True,
    help="path to the input image")
args = vars(ap.parse_args())

# define the answer key which maps the question number
# to the correct answer
ANSWER_KEY = {0: 1, 1: 4, 2: 0, 3: 3, 4: 1}

# load the image, convert it to grayscale, blur it
# slightly, then find edges
image = cv2.imread(args["image"])
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blurred = cv2.GaussianBlur(gray, (5, 5), 0)
edged = cv2.Canny(blurred, 75, 200)

# find contours in the edge map, then initialize
# the contour that corresponds to the document
cnts = cv2.findContours(edged.copy(), cv2.RETR_EXTERNAL,
    cv2.CHAIN_APPROX_SIMPLE)[0]
# cnts = cnts[0] if imutils.is_cv2() else cnts[1]
docCnt = None

# ensure that at least one contour was found
if len(cnts) > 0:
    # sort the contours according to their size in
    # descending order
    cnts = sorted(cnts, key=cv2.contourArea, reverse=True)

    # loop over the sorted contours
    for c in cnts:
        # approximate the contour
        peri = cv2.arcLength(c, True)
        approx = cv2.approxPolyDP(c, 0.02 * peri, True)

        # if our approximated contour has four points,
        # then we can assume we have found the paper
        if len(approx) == 4:
            docCnt = approx
            break

# apply a four point perspective transform to both the
# original image and grayscale image to obtain a top-down
# birds eye view of the paper
paper = four_point_transform(image, docCnt.reshape(4, 2))
warped = four_point_transform(gray, docCnt.reshape(4, 2))

# apply Otsu's thresholding method to binarize the warped
# piece of paper
thresh = cv2.threshold(warped, 0, 255,
    cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)[1]

# find contours in the thresholded image, then initialize
# the list of contours that correspond to questions
cnts = cv2.findContours(thresh.copy(), cv2.RETR_EXTERNAL,
    cv2.CHAIN_APPROX_SIMPLE)[0]
# cnts = cnts[0] if imutils.is_cv2() else cnts[1]
questionCnts = []

# loop over the contours
for c in cnts:
    # compute the bounding box of the contour, then use the
    # bounding box to derive the aspect ratio
    (x, y, w, h) = cv2.boundingRect(c)
    ar = w / float(h)

    # in order to label the contour as a question, region
    # should be sufficiently wide, sufficiently tall, and
    # have an aspect ratio approximately equal to 1
    if w >= 20 and h >= 20 and ar >= 0.9 and ar <= 1.1:
        questionCnts.append(c)

# sort the question contours top-to-bottom, then initialize
# the total number of correct answers
questionCnts = contours.sort_contours(questionCnts,
    method="top-to-bottom")[0]
correct = 0

# each question has 5 possible answers, so loop over the
# questions in batches of 5
for (q, i) in enumerate(np.arange(0, len(questionCnts), 5)):
    # sort the contours for the current question from
    # left to right, then initialize the index of the
    # bubbled answer
    cnts = contours.sort_contours(questionCnts[i:i + 5])[0]
    bubbled = None

    # loop over the sorted contours
    for (j, c) in enumerate(cnts):
        # construct a mask that reveals only the current
        # "bubble" for the question
        mask = np.zeros(thresh.shape, dtype="uint8")
        cv2.drawContours(mask, [c], -1, 255, -1)

        # apply the mask to the thresholded image, then
        # count the number of non-zero pixels in the
        # bubble area
        mask = cv2.bitwise_and(thresh, thresh, mask=mask)
        total = cv2.countNonZero(mask)

        # if the current total has a larger number of total
        # non-zero pixels, then we are examining the currently
        # bubbled-in answer
        if bubbled is None or total > bubbled[0]:
            bubbled = (total, j)

    # initialize the contour color and the index of the
    # *correct* answer
    color = (0, 0, 255)
    k = ANSWER_KEY[q]

    # check to see if the bubbled answer is correct
    if k == bubbled[1]:
        color = (0, 255, 0)
        correct += 1

    # draw the outline of the correct answer on the test
    cv2.drawContours(paper, [cnts[k]], -1, color, 3)

# grab the test taker's score
score = (correct / 5.0) * 100
print("[INFO] score: {:.2f}%".format(score))
cv2.putText(paper, "{:.2f}%".format(score), (10, 30),
    cv2.FONT_HERSHEY_SIMPLEX, 0.9, (0, 0, 255), 2)
cv2.imshow("Original", image)
cv2.imshow("Exam", paper)
cv2.waitKey(0)
That is all of the code. Now let's look at the demo results: