
EasyPR Development Details (8): Text Localization


Reposted from https://www.cnblogs.com/subconscious/p/5637735.html

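Today we introduce a new method for license plate localization, the text localization method (MSER), covering its main design ideas and its implementation. We then go over several changes that arrive in EasyPR v1.5-beta.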

I. Text localization

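  In earlier versions of EasyPR, the most criticized weakness was poor localization, especially on everyday scenes such as photos taken with a phone. EasyPR's earliest data came from traffic checkpoint cameras, so it was tuned for checkpoint data and had no good strategy for everyday images. A later version (v1.3) added color localization, which improved matters, but high-resolution images were still handled poorly. On top of that, color localization degrades sharply on low-light and low-contrast images, and color is itself an unstable feature. The overall robustness of EasyPR's plate localization was therefore still insufficient.

  To address this, EasyPR v1.5 adds a new localization method, text localization, which greatly improves these cases. The following figures illustrate its effect.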


Figure 1: plate image at night (left). Figure 2: very low-contrast image (right).


Figure 3: close-range image (left). Figure 4: high-resolution image (right).


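  Figure 1 is a plate image taken at night, Figure 2 is an image with very low contrast, Figure 3 was shot at very close range, and Figure 4 is a high-resolution image (3200 pixels wide).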

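  Text localization extracts characters with a low-level filter and then groups them. The technique was originally used to locate text in natural scenes; here we use it to locate plates. Unlike text in scanned documents, text in natural scenes has low contrast, varied backgrounds, and frequent lighting interference, so a very robust extraction method is needed. The method most widely used in industry today is MSER (maximally stable extremal regions). EasyPR uses an improved variant of MSER that is optimized specifically for text. After character candidates are extracted, a classifier, for example an ANN model, is usually needed to remove most of the false candidates. To obtain the final plate, the characters then have to be grouped. Real scenes are complicated and plain clustering often works poorly, so EasyPR groups them with a more robust seed growing method.

  Here I briefly describe the implementation. For the details of the method, see the code, which is heavily commented (and fairly long). For the ideas behind the method, see the two papers listed in the references.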


//! use verify size to first generate char candidates
void mserCharMatch(const Mat &src, std::vector<Mat> &match, std::vector<CPlate>& out_plateVec_blue, std::vector<CPlate>& out_plateVec_yellow,
bool usePlateMser, std::vector<RotatedRect>& out_plateRRect_blue, std::vector<RotatedRect>& out_plateRRect_yellow, int img_index,
bool showDebug) {
Mat image = src;

std::vector<std::vector<std::vector<Point>>> all_contours;
std::vector<std::vector<Rect>> all_boxes;
all_contours.resize(2);
all_contours.at(0).reserve(1024);
all_contours.at(1).reserve(1024);
all_boxes.resize(2);
all_boxes.at(0).reserve(1024);
all_boxes.at(1).reserve(1024);

match.resize(2);

std::vector<Color> flags;
flags.push_back(BLUE);
flags.push_back(YELLOW);

const int imageArea = image.rows * image.cols;
const int delta = 1;
//const int delta = CParams::instance()->getParam2i();;
const int minArea = 30;
const double maxAreaRatio = 0.05;

Ptr<MSER2> mser;
mser = MSER2::create(delta, minArea, int(maxAreaRatio * imageArea));
mser->detectRegions(image, all_contours.at(0), all_boxes.at(0), all_contours.at(1), all_boxes.at(1));

// MSER detection
// color_index = 0 : MSER-, detects light characters, as found on blue plates.
// color_index = 1 : MSER+, detects dark characters, as found on yellow plates.

#pragma omp parallel for
for (int color_index = 0; color_index < 2; color_index++) {
Color the_color = flags.at(color_index);

std::vector<CCharacter> charVec;
charVec.reserve(128);

match.at(color_index) = Mat::zeros(image.rows, image.cols, image.type());

Mat result = image.clone();
cvtColor(result, result, COLOR_GRAY2BGR);

size_t size = all_contours.at(color_index).size();

int char_index = 0;
int char_size = 20;

// Chinese plate has max 7 characters.
const int char_max_count = 7;

// verify char size and output to rects;
for (size_t index = 0; index < size; index++) {
Rect rect = all_boxes.at(color_index)[index];
std::vector<Point>& contour = all_contours.at(color_index)[index];

// sometimes a plate could be a mser rect, so we could
// also use mser algorithm to find plate
if (usePlateMser) {
RotatedRect rrect = minAreaRect(Mat(contour));
if (verifyRotatedPlateSizes(rrect)) {
//rotatedRectangle(result, rrect, Scalar(255, 0, 0), 2);
if (the_color == BLUE) out_plateRRect_blue.push_back(rrect);
if (the_color == YELLOW) out_plateRRect_yellow.push_back(rrect);
}
}

// find character
if (verifyCharSizes(rect)) {
Mat mserMat = adaptive_image_from_points(contour, rect, Size(char_size, char_size));
Mat charInput = preprocessChar(mserMat, char_size);
Rect charRect = rect;

Point center(charRect.tl().x + charRect.width / 2, charRect.tl().y + charRect.height / 2);
Mat tmpMat;
double ostu_level = cv::threshold(image(charRect), tmpMat, 0, 255, CV_THRESH_BINARY | CV_THRESH_OTSU);

//cv::circle(result, center, 3, Scalar(0, 0, 255), 2);

// use judegMDOratio2 function to
// remove the small lines in character like "zh-cuan"
if (judegMDOratio2(image, rect, contour, result)) {
CCharacter charCandidate;
charCandidate.setCharacterPos(charRect);
charCandidate.setCharacterMat(charInput);
charCandidate.setOstuLevel(ostu_level);
charCandidate.setCenterPoint(center);
charCandidate.setIsChinese(false);
charVec.push_back(charCandidate);
}
}
}

// important: use matrix multiplication to accelerate the
// classification of many samples. With the character
// scores, we can use non-maximum suppression (NMS) to
// discard candidates that are unlikely to be true
// characters, and use the score to select the strong seeds,
// whose score is larger than 0.9
CharsIdentify::instance()->classify(charVec);

// use NMS to remove candidates that are unlikely to be true characters.
double overlapThresh = 0.6;
//double overlapThresh = CParams::instance()->getParam1f();
NMStoCharacter(charVec, overlapThresh);
charVec.shrink_to_fit();

std::vector<CCharacter> strongSeedVec;
strongSeedVec.reserve(64);
std::vector<CCharacter> weakSeedVec;
weakSeedVec.reserve(64);
std::vector<CCharacter> littleSeedVec;
littleSeedVec.reserve(64);

//size_t charCan_size = charVec.size();
for (auto charCandidate : charVec) {
//CCharacter& charCandidate = charVec[char_index];
Rect rect = charCandidate.getCharacterPos();
double score = charCandidate.getCharacterScore();
if (charCandidate.getIsStrong()) {
strongSeedVec.push_back(charCandidate);
}
else if (charCandidate.getIsWeak()) {
weakSeedVec.push_back(charCandidate);
//cv::rectangle(result, rect, Scalar(255, 0, 255));
}
else if (charCandidate.getIsLittle()) {
littleSeedVec.push_back(charCandidate);
//cv::rectangle(result, rect, Scalar(255, 0, 255));
}
}

std::vector<CCharacter> searchCandidate = charVec;

// apply NMS to the strong seeds, keeping only the strongest of overlapping seeds
overlapThresh = 0.3;
NMStoCharacter(strongSeedVec, overlapThresh);

// merge chars to group
std::vector<std::vector<CCharacter>> charGroupVec;
charGroupVec.reserve(64);
mergeCharToGroup(strongSeedVec, charGroupVec);

// generate the line of the group.
// The assumption is that the MSER rects given a high
// score by the character classifier are, beyond doubt,
// characters of one plate, so we can use these characters
// to fit a line that is the middle line of the plate.
std::vector<CPlate> plateVec;
plateVec.reserve(16);
for (auto charGroup : charGroupVec) {
Rect plateResult = charGroup[0].getCharacterPos();
std::vector<Point> points;
points.reserve(32);

Vec4f line;
int maxarea = 0;
Rect maxrect;
double ostu_level_sum = 0;

int leftx = image.cols;
Point leftPoint(leftx, 0);
int rightx = 0;
Point rightPoint(rightx, 0);

std::vector<CCharacter> mserCharVec;
mserCharVec.reserve(32);

// remove outlier CharGroup
std::vector<CCharacter> roCharGroup;
roCharGroup.reserve(32);

removeRightOutliers(charGroup, roCharGroup, 0.2, 0.5, result);
//roCharGroup = charGroup;

for (auto character : roCharGroup) {
Rect charRect = character.getCharacterPos();
cv::rectangle(result, charRect, Scalar(0, 255, 0), 1);
plateResult |= charRect;

Point center(charRect.tl().x + charRect.width / 2, charRect.tl().y + charRect.height / 2);
points.push_back(center);
mserCharVec.push_back(character);
//cv::circle(result, center, 3, Scalar(0, 255, 0), 2);

ostu_level_sum += character.getOstuLevel();

if (charRect.area() > maxarea) {
maxrect = charRect;
maxarea = charRect.area();
}
if (center.x < leftPoint.x) {
leftPoint = center;
}
if (center.x > rightPoint.x) {
rightPoint = center;
}
}

double ostu_level_avg = ostu_level_sum / (double)roCharGroup.size();
if (1 && showDebug) {
std::cout << "ostu_level_avg:" << ostu_level_avg << std::endl;
}
float ratio_maxrect = (float)maxrect.width / (float)maxrect.height;

if (points.size() >= 2 && ratio_maxrect >= 0.3) {
fitLine(Mat(points), line, CV_DIST_L2, 0, 0.01, 0.01);

float k = line[1] / line[0];
//float angle = atan(k) * 180 / (float)CV_PI;
//std::cout << "k:" << k << std::endl;
//std::cout << "angle:" << angle << std::endl;
//std::cout << "cos:" << 0.3 * cos(k) << std::endl;
//std::cout << "ratio_maxrect:" << ratio_maxrect << std::endl;

std::sort(mserCharVec.begin(), mserCharVec.end(),
[](const CCharacter& r1, const CCharacter& r2) {
return r1.getCharacterPos().tl().x < r2.getCharacterPos().tl().x;
});

CCharacter midChar = mserCharVec.at(int(mserCharVec.size() / 2.f));
Rect midRect = midChar.getCharacterPos();
Point midCenter(midRect.tl().x + midRect.width / 2, midRect.tl().y + midRect.height / 2);

int mindist = 7 * maxrect.width;
std::vector<Vec2i> distVecVec;
distVecVec.reserve(32);

Vec2i mindistVec;
Vec2i avgdistVec;

// compute dist, the distance between two neighboring
// characters in the plate. With dist we can decide
// how to compute the max search range and choose the
// best location of the sliding window in the next steps.
for (size_t mser_i = 0; mser_i + 1 < mserCharVec.size(); mser_i++) {
Rect charRect = mserCharVec.at(mser_i).getCharacterPos();
Point center(charRect.tl().x + charRect.width / 2, charRect.tl().y + charRect.height / 2);

Rect charRectCompare = mserCharVec.at(mser_i + 1).getCharacterPos();
Point centerCompare(charRectCompare.tl().x + charRectCompare.width / 2,
charRectCompare.tl().y + charRectCompare.height / 2);

int dist = charRectCompare.x - charRect.x;
Vec2i distVec(charRectCompare.x - charRect.x, charRectCompare.y - charRect.y);
distVecVec.push_back(distVec);

//if (dist < mindist) {
// mindist = dist;
// mindistVec = distVec;
//}
}

std::sort(distVecVec.begin(), distVecVec.end(),
[](const Vec2i& r1, const Vec2i& r2) {
return r1[0] < r2[0];
});

avgdistVec = distVecVec.at(int((distVecVec.size() - 1) / 2.f));

//float step = 10.f * (float)maxrect.width;
//float step = (float)mindistVec[0];
float step = (float)avgdistVec[0];

//cv::line(result, Point2f(line[2] - step, line[3] - k*step), Point2f(line[2] + step, k*step + line[3]), Scalar(255, 255, 255));
cv::line(result, Point2f(midCenter.x - step, midCenter.y - k*step), Point2f(midCenter.x + step, k*step + midCenter.y), Scalar(255, 255, 255));
//cv::circle(result, leftPoint, 3, Scalar(0, 0, 255), 2);

CPlate plate;
plate.setPlateLeftPoint(leftPoint);
plate.setPlateRightPoint(rightPoint);

plate.setPlateLine(line);
plate.setPlatDistVec(avgdistVec);
plate.setOstuLevel(ostu_level_avg);

plate.setPlateMergeCharRect(plateResult);
plate.setPlateMaxCharRect(maxrect);
plate.setMserCharacter(mserCharVec);
plateVec.push_back(plate);
}
}

// the strong seeds give the first shape of the plate;
// next we need to find the characters that are weak seeds.
// Because the strong seeds define the middle line of the plate,
// we only need to consider weak seeds that lie
// close to that middle line
for (auto plate : plateVec) {
Vec4f line = plate.getPlateLine();
Point leftPoint = plate.getPlateLeftPoint();
Point rightPoint = plate.getPlateRightPoint();

Rect plateResult = plate.getPlateMergeCharRect();
Rect maxrect = plate.getPlateMaxCharRect();
Vec2i dist = plate.getPlateDistVec();
double ostu_level = plate.getOstuLevel();

std::vector<CCharacter> mserCharacter = plate.getCopyOfMserCharacters();
mserCharacter.reserve(16);

float k = line[1] / line[0];
float x_1 = line[2];
float y_1 = line[3];

std::vector<CCharacter> searchWeakSeedVec;
searchWeakSeedVec.reserve(16);

std::vector<CCharacter> searchRightWeakSeed;
searchRightWeakSeed.reserve(8);
std::vector<CCharacter> searchLeftWeakSeed;
searchLeftWeakSeed.reserve(8);

std::vector<CCharacter> slideRightWindow;
slideRightWindow.reserve(8);
std::vector<CCharacter> slideLeftWindow;
slideLeftWindow.reserve(8);

// draw weak seed and little seed from line;
// search for mser rect
if (1 && showDebug) {
std::cout << "search for mser rect:" << std::endl;
}

if (0 && showDebug) {
std::stringstream ss(std::stringstream::in | std::stringstream::out);
ss << "resources/image/tmp/" << img_index << "_1_" << "searcgMserRect.jpg";
imwrite(ss.str(), result);
}
if (1 && showDebug) {
std::cout << "mserCharacter:" << mserCharacter.size() << std::endl;
}

// if the number of strong seeds already reaches the max count, we do not need
// the next steps. Otherwise we first search for weak seeds on
// the same line as the strong seeds. The acceptance condition combines the distance
// between strong seed and weak seed with the similarity of their rects, to improve
// the robustness of the seed growing algorithm.
if (mserCharacter.size() < char_max_count) {
double thresh1 = 0.15;
double thresh2 = 2.0;
searchWeakSeed(searchCandidate, searchRightWeakSeed, thresh1, thresh2, line, rightPoint,
maxrect, plateResult, result, CharSearchDirection::RIGHT);
if (1 && showDebug) {
std::cout << "searchRightWeakSeed:" << searchRightWeakSeed.size() << std::endl;
}
for (auto seed : searchRightWeakSeed) {
cv::rectangle(result, seed.getCharacterPos(), Scalar(255, 0, 0), 1);
mserCharacter.push_back(seed);
}

searchWeakSeed(searchCandidate, searchLeftWeakSeed, thresh1, thresh2, line, leftPoint,
maxrect, plateResult, result, CharSearchDirection::LEFT);
if (1 && showDebug) {
std::cout << "searchLeftWeakSeed:" << searchLeftWeakSeed.size() << std::endl;
}
for (auto seed : searchLeftWeakSeed) {
cv::rectangle(result, seed.getCharacterPos(), Scalar(255, 0, 0), 1);
mserCharacter.push_back(seed);
}
}

// sometimes a weak seed lies between the strong seeds,
// and sometimes two strong seeds are actually two parts of one character.
// So far we only considered weak seeds to the left and right of the strong seeds.
// Now we examine all strong and weak seeds, both to find seeds in the middle
// and to combine two seeds that are parts of one character into one seed.
// Only after this step can the seed count decide whether to use the sliding window.
float min_thresh = 0.3f;
float max_thresh = 2.5f;
reFoundAndCombineRect(mserCharacter, min_thresh, max_thresh, dist, maxrect, result);

// if the character count is less than the max count,
// the MSER rects on the line are not enough.
// Some characters still cannot be captured by the MSER algorithm,
// for example under blur or low light, and some Chinese characters like zh-cuan.
// To handle this, we use a simple sliding window method to find them.
if (mserCharacter.size() < char_max_count) {
if (1 && showDebug) {
std::cout << "search chinese:" << std::endl;
std::cout << "judege the left is chinese:" << std::endl;
}

// if the leftmost character is Chinese, it must be
// the first character of a Chinese plate,
// and we do not need to slide a window to the left. So
// the first thing is to judge whether the leftmost character
// is Chinese or not.
bool leftIsChinese = false;
if (1) {
std::sort(mserCharacter.begin(), mserCharacter.end(),
[](const CCharacter& r1, const CCharacter& r2) {
return r1.getCharacterPos().tl().x < r2.getCharacterPos().tl().x;
});

CCharacter leftChar = mserCharacter[0];

//Rect theRect = adaptive_charrect_from_rect(leftChar.getCharacterPos(), image.cols, image.rows);
Rect theRect = leftChar.getCharacterPos();
//cv::rectangle(result, theRect, Scalar(255, 0, 0), 1);

Mat region = image(theRect);
Mat binary_region;

ostu_level = cv::threshold(region, binary_region, 0, 255, CV_THRESH_BINARY | CV_THRESH_OTSU);
if (1 && showDebug) {
std::cout << "left : ostu_level:" << ostu_level << std::endl;
}
//plate.setOstuLevel(ostu_level);

Mat charInput = preprocessChar(binary_region, char_size);
if (0 /*&& showDebug*/) {
imshow("charInput", charInput);
waitKey(0);
destroyWindow("charInput");
}

std::string label = "";
float maxVal = -2.f;
leftIsChinese = CharsIdentify::instance()->isCharacter(charInput, label, maxVal, true);
//auto character = CharsIdentify::instance()->identifyChinese(charInput, maxVal, leftIsChinese);
//label = character.second;
if (0 /* && showDebug*/) {
std::cout << "isChinese:" << leftIsChinese << std::endl;
std::cout << "chinese:" << label;
std::cout << "__score:" << maxVal << std::endl;
}
}

// if the leftmost character is not Chinese,
// we need to slide a window to find the missed MSER rect.
// search with the sliding window
float ratioWindow = 0.4f;
//float ratioWindow = CParams::instance()->getParam3f();
float threshIsCharacter = 0.8f;
//float threshIsCharacter = CParams::instance()->getParam3f();
if (!leftIsChinese) {
slideWindowSearch(image, slideLeftWindow, line, leftPoint, dist, ostu_level, ratioWindow, threshIsCharacter,
maxrect, plateResult, CharSearchDirection::LEFT, true, result);
if (1 && showDebug) {
std::cout << "slideLeftWindow:" << slideLeftWindow.size() << std::endl;
}
for (auto window : slideLeftWindow) {
cv::rectangle(result, window.getCharacterPos(), Scalar(0, 0, 255), 1);
mserCharacter.push_back(window);
}
}
}

// if we still have less than max count characters,
// we need to slide a window to right to search for the missed mser rect.
if (mserCharacter.size() < char_max_count) {
// change ostu_level
float ratioWindow = 0.4f;
//float ratioWindow = CParams::instance()->getParam3f();
float threshIsCharacter = 0.8f;
//float threshIsCharacter = CParams::instance()->getParam3f();
slideWindowSearch(image, slideRightWindow, line, rightPoint, dist, plate.getOstuLevel(), ratioWindow, threshIsCharacter,
maxrect, plateResult, CharSearchDirection::RIGHT, false, result);
if (1 && showDebug) {
std::cout << "slideRightWindow:" << slideRightWindow.size() << std::endl;
}
for (auto window : slideRightWindow) {
cv::rectangle(result, window.getCharacterPos(), Scalar(0, 0, 255), 1);
mserCharacter.push_back(window);
}
}

// compute the plate angle
float angle = atan(k) * 180 / (float)CV_PI;
if (1 && showDebug) {
std::cout << "k:" << k << std::endl;
std::cout << "angle:" << angle << std::endl;
}

// the plateResult rect needs to be enlarged to contain the whole plate,
// not only the character area.
float widthEnlargeRatio = 1.15f;
float heightEnlargeRatio = 1.25f;
RotatedRect platePos(Point2f((float)plateResult.x + plateResult.width / 2.f, (float)plateResult.y + plateResult.height / 2.f),
Size2f(plateResult.width * widthEnlargeRatio, maxrect.height * heightEnlargeRatio), angle);

// check that the size is plausible for a plate.
if (verifyRotatedPlateSizes(platePos)) {
rotatedRectangle(result, platePos, Scalar(0, 0, 255), 1);

plate.setPlatePos(platePos);
plate.setPlateColor(the_color);
plate.setPlateLocateType(CMSER);

if (the_color == BLUE) out_plateVec_blue.push_back(plate);
if (the_color == YELLOW) out_plateVec_yellow.push_back(plate);
}

// use deskew to rotate the image, so we need the binary image.
if (1) {
for (auto mserChar : mserCharacter) {
Rect rect = mserChar.getCharacterPos();
match.at(color_index)(rect) = 255;
}
cv::line(match.at(color_index), rightPoint, leftPoint, Scalar(255));
}
}

if (0 /*&& showDebug*/) {
imshow("result", result);
waitKey(0);
destroyWindow("result");
}

if (0) {
imshow("match", match.at(color_index));
waitKey(0);
destroyWindow("match");
}

if (0) {
std::stringstream ss(std::stringstream::in | std::stringstream::out);
ss << "resources/image/tmp/plateDetect/plate_" << img_index << "_" << the_color << ".jpg";
imwrite(ss.str(), result);
}
}


}

  

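  First, MSER extracts regions, and a size check filters out those that clearly do not match the size of plate characters. Next, a character classifier is applied, and candidates with probability above 0.9 become strong seeds (green boxes in the figure below). Strong seeds that lie close together are merged, and a line is drawn through their centers (the white line in the figure). In general this line is the central axis of the plate and shares its slope. We then search near this line for weak seeds, whose probability is below 0.9 (blue boxes). Because of how plates are laid out, the blue boxes should not be far from the green boxes and should not differ much in size. Blue boxes are searched to the left and right of the green boxes. Sometimes a box is missing between several green boxes; it can be inferred from the spacing between boxes, and this gives the orange boxes.

  Once the search is complete, the total number of green, blue, and orange boxes is the number of characters found so far in the plate region. This number is sometimes below 7 (the character count of a Chinese plate), because some regions cannot be extracted even by MSER (for example very unstable regions or regions with strong lighting changes), and many Chinese characters cannot be extracted by MSER either (most Chinese characters are not connected, while MSER regions are essentially connected). So a sliding window (red boxes) is added to look for these missing characters, including the Chinese character; a window whose classifier probability exceeds a threshold is added to the final result. Finally, one box drawn around all character positions gives the plate region.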
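  The strong-seed and weak-seed steps can be sketched in a few lines. This is a minimal illustration, not EasyPR's code: the Candidate and Line types, the function names, and the two ratio parameters are hypothetical, and an ordinary least-squares fit stands in for the cv::fitLine call in the listing. Only the 0.9 threshold comes from the text.

```cpp
#include <cmath>
#include <vector>

// Hypothetical stand-in for a scored character candidate (not EasyPR's CCharacter).
struct Candidate {
    double cx, cy;   // center of the bounding box
    double height;   // height of the bounding box
    double score;    // classifier probability in [0, 1]
};

struct Line { double k, b; };  // y = k * x + b

// Strong seeds: candidates whose score exceeds the threshold (0.9 in the text).
std::vector<Candidate> strongSeeds(const std::vector<Candidate>& all, double thresh) {
    std::vector<Candidate> out;
    for (const auto& c : all)
        if (c.score > thresh) out.push_back(c);
    return out;
}

// Ordinary least-squares line through the seed centers.
// Assumes at least two seeds with distinct x coordinates.
Line fitCenterLine(const std::vector<Candidate>& seeds) {
    double sx = 0, sy = 0, sxx = 0, sxy = 0;
    for (const auto& s : seeds) {
        sx += s.cx;
        sy += s.cy;
        sxx += s.cx * s.cx;
        sxy += s.cx * s.cy;
    }
    const double n = static_cast<double>(seeds.size());
    const double k = (n * sxy - sx * sy) / (n * sxx - sx * sx);
    return {k, (sy - k * sx) / n};
}

// A weak seed is accepted when its center lies close to the line and its
// height is similar to a reference height (both limits are assumed ratios).
bool fitsPlateLine(const Candidate& c, const Line& line, double refHeight,
                   double maxOffsetRatio, double maxHeightRatio) {
    const double dist = std::fabs(line.k * c.cx - c.cy + line.b) /
                        std::sqrt(line.k * line.k + 1.0);
    const double ratio = c.height > refHeight ? c.height / refHeight
                                              : refHeight / c.height;
    return dist <= maxOffsetRatio * refHeight && ratio <= maxHeightRatio;
}
```

  Fitting the line only from high-confidence seeds keeps false positives from tilting the axis; lower-confidence candidates are then judged against an axis that is already trusted.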

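  To debug with the intermediate images, follow the call chain plateMserLocate->mserSearch->mserCharMatch to find the function in core_func.cpp. At the end of the function, change the condition that guards the image output to 1. Then create the directory tmp under resources/image and the directory plateDetect inside it (matching the paths in the code). On the next run the debug images appear in the new directory. (EasyPR has much similar output code elsewhere; creating the folders the code expects is enough to see the results.)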


Figure 5: intermediate result of text localization (debug image)
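  The orange boxes come from gaps between neighboring seeds. A minimal sketch of that inference follows, assuming a hypothetical Box type and function name. The median step mirrors how the listing picks avgdistVec, but the slot rounding and linear interpolation are illustrative choices, not what reFoundAndCombineRect does internally.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical axis-aligned box: top-left corner plus size.
struct Box { int x, y, w, h; };

// Propose boxes for character slots that appear to be missing between
// neighboring seeds, judged by the median horizontal step between seeds.
std::vector<Box> inferMissingBoxes(std::vector<Box> boxes) {
    std::vector<Box> missing;
    if (boxes.size() < 3) return missing;  // too few seeds to estimate a step
    std::sort(boxes.begin(), boxes.end(),
              [](const Box& a, const Box& b) { return a.x < b.x; });

    // Median of the steps between consecutive seeds.
    std::vector<int> steps;
    for (std::size_t i = 0; i + 1 < boxes.size(); ++i)
        steps.push_back(boxes[i + 1].x - boxes[i].x);
    std::sort(steps.begin(), steps.end());
    const int step = steps[(steps.size() - 1) / 2];
    if (step <= 0) return missing;

    for (std::size_t i = 0; i + 1 < boxes.size(); ++i) {
        const Box& a = boxes[i];
        const Box& b = boxes[i + 1];
        // A gap of about two steps hides one slot, three steps hide two, and so on.
        const int slots =
            static_cast<int>(std::lround(static_cast<double>(b.x - a.x) / step)) - 1;
        for (int s = 1; s <= slots; ++s) {
            const double t = static_cast<double>(s) / (slots + 1);
            missing.push_back({a.x + static_cast<int>(std::lround(t * (b.x - a.x))),
                               a.y + static_cast<int>(std::lround(t * (b.y - a.y))),
                               a.w, a.h});
        }
    }
    return missing;
}
```

  Using the median rather than the mean keeps one oversized gap from inflating the step estimate, which is exactly the case this function has to detect.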

II. More reasonable and accurate evaluation metrics

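  The old EasyPR evaluation had many unreasonable aspects. For example, finding a single candidate region in an image counted as a successful localization. Or, when several plates were localized in an image, only the one with the smallest difference rate was used as the result. The problem is that the candidate region may not be a plate at all. And for an image that contains several plates, reporting only one of them is clearly unreasonable.

  So the new metric has to consider the positional difference between the localized region and the true plate region; localization counts as successful only when the two are close. Also, an image with several plates has several corresponding localized regions, and each region has to be compared against the plates before the results are combined into a localization score. This requires ground truth that marks the position of each plate. In the new version we annotated 251 images containing the positions of 250 plates in total. To measure the relative positional difference between localized regions and plate regions, we adopted the ICDAR 2003 evaluation protocol, which yields the final recall, precision (reported as Precise), and F-score of localization.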
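  As I understand the ICDAR 2003 protocol, the match between two rectangles is the intersection area divided by the area of the smallest box containing both. Recall averages the best match of each ground-truth box, precision averages the best match of each detection, and the F-score is their harmonic mean. The sketch below follows that reading; the Region type and function names are hypothetical, and this is not EasyPR's evaluation code.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical axis-aligned region: top-left corner plus size.
struct Region { double x, y, w, h; };

// Match score in [0, 1]: intersection area over the area of the
// smallest axis-aligned box that contains both regions.
double match(const Region& a, const Region& b) {
    const double ix = std::max(0.0, std::min(a.x + a.w, b.x + b.w) - std::max(a.x, b.x));
    const double iy = std::max(0.0, std::min(a.y + a.h, b.y + b.h) - std::max(a.y, b.y));
    const double bw = std::max(a.x + a.w, b.x + b.w) - std::min(a.x, b.x);
    const double bh = std::max(a.y + a.h, b.y + b.h) - std::min(a.y, b.y);
    const double bound = bw * bh;
    return bound > 0.0 ? (ix * iy) / bound : 0.0;
}

// Best match of one region against a whole set (0 when the set is empty).
double bestMatch(const Region& r, const std::vector<Region>& set) {
    double best = 0.0;
    for (const auto& s : set) best = std::max(best, match(r, s));
    return best;
}

struct Score { double recall, precision, fscore; };

Score evaluate(const std::vector<Region>& truth, const std::vector<Region>& detected) {
    double recall = 0.0, precision = 0.0;
    for (const auto& t : truth) recall += bestMatch(t, detected);
    for (const auto& d : detected) precision += bestMatch(d, truth);
    if (!truth.empty()) recall /= static_cast<double>(truth.size());
    if (!detected.empty()) precision /= static_cast<double>(detected.size());
    const double sum = recall + precision;
    return {recall, precision, sum > 0.0 ? 2.0 * recall * precision / sum : 0.0};
}
```

  Because every detection contributes to precision, a false region in an image with a correctly found plate now lowers the score instead of being ignored.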

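  Plate localization evaluation changed substantially; the character recognition module changed only slightly. First, the "average character difference" metric, which carried little meaning, was removed. It is replaced by three rates: zero-character difference, one-character difference, and Chinese character correctness. Zero-character difference (0-error) means the recognition result matches the plate exactly, the same as the "completely correct rate" in the old protocol. One-character difference (1-error) means the result differs by at most one character, so it includes the 0-error cases. Note that a Chinese character generally counts as two characters. Chinese character correctness (Chinese-precise) is the fraction of Chinese characters recognized correctly. For all three metrics, larger is better, with 100% the maximum.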
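  One way to compute the 0-error and 1-error rates is with an edit distance between the recognized string and the true plate string. The sketch below uses a byte-level Levenshtein distance, which is an assumption: under that counting a multi-byte Chinese character contributes more than one unit, in line with the note above, but the exact count depends on the encoding. The function and type names are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Byte-level Levenshtein distance between two strings.
int editDistance(const std::string& a, const std::string& b) {
    std::vector<int> prev(b.size() + 1), cur(b.size() + 1);
    for (std::size_t j = 0; j <= b.size(); ++j) prev[j] = static_cast<int>(j);
    for (std::size_t i = 1; i <= a.size(); ++i) {
        cur[0] = static_cast<int>(i);
        for (std::size_t j = 1; j <= b.size(); ++j) {
            const int subst = prev[j - 1] + (a[i - 1] == b[j - 1] ? 0 : 1);
            cur[j] = std::min({subst, prev[j] + 1, cur[j - 1] + 1});
        }
        std::swap(prev, cur);
    }
    return prev[b.size()];
}

struct ErrorRates { double zeroError, oneError; };

// Fraction of plates recognized exactly (0-error) and with at most one
// differing character (1-error, which includes the 0-error cases).
ErrorRates errorRates(const std::vector<std::string>& truth,
                      const std::vector<std::string>& recognized) {
    const std::size_t n = std::min(truth.size(), recognized.size());
    if (n == 0) return {0.0, 0.0};
    int zero = 0, one = 0;
    for (std::size_t i = 0; i < n; ++i) {
        const int d = editDistance(truth[i], recognized[i]);
        if (d == 0) ++zero;
        if (d <= 1) ++one;
    }
    return {static_cast<double>(zero) / n, static_cast<double>(one) / n};
}
```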

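  To see these metrics in practice, we ran a comparison on the 50 complex images newly added to the general test set. The results below show how text localization performs on this data compared with the earlier SOBEL and COLOR localization methods.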

  SOBEL+COLOR:
  Total images: 50, Plates count: 52, Detection rate: 51.9231%
  Recall: 46.1696%, Precise: 26.3273%, Fscore: 33.533%.
  0-error: 12.5%, 1-error: 12.5%, Chinese-precise: 37.5%

  CMSER:
  Total images: 50, Plates count: 52, Detection rate: 78.8462%
  Recall: 70.6192%, Precise: 70.1825%, Fscore: 70.4002%.
  0-error: 59.4595%, 1-error: 70.2703%, Chinese-precise: 70.2703%

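  The detection rate rises by nearly 27 percentage points (51.9% to 78.8%), the localization Fscore roughly doubles (33.5% to 70.4%), and the Chinese recognition rate nearly doubles (37.5% to 70.3%).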

III. Non-maximum suppression

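  Another major change in the new version is the heavy use of non-maximum suppression (NMS). It has several benefits. 1. When several localized regions overlap, their confidence (the value produced by the SVM plate classification model) is used to keep the most probable one and remove the others. Thus when different localization methods, such as Sobel and Color, find the same region, only one survives. In the new version, a single plate region in the final output no longer carries several boxes. 2. Combined with a sliding window, NMS can pin down character positions precisely, for example to find the most probable character position in the plate localization module, or to locate the Chinese character more accurately in the character recognition module.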
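  A common greedy form of non-maximum suppression is sketched below: sort by score, then keep a box only if it does not overlap an already kept box by more than a threshold. The Detection type and function names are hypothetical, and overlap is measured here as intersection over union, which may differ from the measure used inside EasyPR's NMStoCharacter.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical scored detection: box plus classifier confidence.
struct Detection { int x, y, w, h; double score; };

// Intersection over union of two boxes.
double iou(const Detection& a, const Detection& b) {
    const int ix = std::max(0, std::min(a.x + a.w, b.x + b.w) - std::max(a.x, b.x));
    const int iy = std::max(0, std::min(a.y + a.h, b.y + b.h) - std::max(a.y, b.y));
    const double inter = static_cast<double>(ix) * iy;
    const double uni = static_cast<double>(a.w) * a.h +
                       static_cast<double>(b.w) * b.h - inter;
    return uni > 0.0 ? inter / uni : 0.0;
}

// Greedy NMS: visit detections from highest to lowest score and drop any
// detection that overlaps an already kept one by more than overlapThresh.
std::vector<Detection> nms(std::vector<Detection> dets, double overlapThresh) {
    std::sort(dets.begin(), dets.end(),
              [](const Detection& a, const Detection& b) { return a.score > b.score; });
    std::vector<Detection> kept;
    for (const auto& d : dets) {
        bool suppressed = false;
        for (const auto& k : kept) {
            if (iou(d, k) > overlapThresh) {
                suppressed = true;
                break;
            }
        }
        if (!suppressed) kept.push_back(d);
    }
    return kept;
}
```

  The result is already sorted by descending score, which is convenient for any later step that wants only the most confident regions.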

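  The use of NMS decouples EasyPR's localization methods from the recognition module that follows. Previously, each added localization method could affect the final output. Now, no matter how many localization methods report a plate, NMS keeps the most probable one, and the later stages are not affected at all.

  In addition, setMaxPlates() now actually takes effect. Previously it could be set but did nothing. Now, after setting it to n, when more than n plate regions are detected in an image (counted after non-maximum suppression), EasyPR outputs only the n most probable plate regions.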
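  The effect of a maximum plate count can be illustrated as a top-n selection by confidence. This is only a sketch of the behavior described above; PlateCandidate and keepTopN are hypothetical names, not how setMaxPlates() is implemented.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical plate candidate that survived non-maximum suppression.
struct PlateCandidate { int id; double score; };

// Keep at most maxPlates candidates, preferring the highest scores.
std::vector<PlateCandidate> keepTopN(std::vector<PlateCandidate> plates,
                                     std::size_t maxPlates) {
    std::sort(plates.begin(), plates.end(),
              [](const PlateCandidate& a, const PlateCandidate& b) {
                  return a.score > b.score;
              });
    if (plates.size() > maxPlates)
        plates.erase(plates.begin() + static_cast<std::ptrdiff_t>(maxPlates),
                     plates.end());
    return plates;
}
```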


IV. Stronger character segmentation and recognition

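  Both character segmentation and recognition gained new algorithms in the new version. For example, spatial-ostu (a spatial variant of Otsu thresholding) replaces plain Otsu, which improves binarization during segmentation on unevenly lit images.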
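  The idea behind a spatial Otsu threshold can be sketched as follows: split the image into a grid and binarize each cell with its own Otsu threshold, so a dark half and a bright half of the plate do not share one global threshold. The Gray type, the function names, and the grid parameters are hypothetical; EasyPR's spatial-ostu may differ in detail.

```cpp
#include <array>
#include <cstdint>
#include <vector>

// Hypothetical minimal 8-bit grayscale image, stored row-major.
struct Gray {
    int rows, cols;
    std::vector<std::uint8_t> px;
};

// Classic Otsu threshold for the pixels in rows [r0, r1) and columns [c0, c1):
// pick the level that maximizes the between-class variance.
int otsuThreshold(const Gray& img, int r0, int r1, int c0, int c1) {
    std::array<long long, 256> hist{};
    for (int r = r0; r < r1; ++r)
        for (int c = c0; c < c1; ++c)
            ++hist[img.px[r * img.cols + c]];

    const long long total = static_cast<long long>(r1 - r0) * (c1 - c0);
    double sumAll = 0;
    for (int v = 0; v < 256; ++v) sumAll += static_cast<double>(v) * hist[v];

    long long below = 0;
    double sumBelow = 0, bestVar = -1.0;
    int best = 0;
    for (int t = 0; t < 256; ++t) {
        below += hist[t];
        sumBelow += static_cast<double>(t) * hist[t];
        const long long above = total - below;
        if (below == 0) continue;
        if (above == 0) break;
        const double diff = sumBelow / below - (sumAll - sumBelow) / above;
        const double var = static_cast<double>(below) * above * diff * diff;
        if (var > bestVar) {
            bestVar = var;
            best = t;
        }
    }
    return best;
}

// Binarize each cell of a gridX x gridY partition with its own threshold.
Gray spatialOtsu(const Gray& img, int gridX, int gridY) {
    Gray out{img.rows, img.cols, std::vector<std::uint8_t>(img.px.size(), 0)};
    for (int gy = 0; gy < gridY; ++gy) {
        for (int gx = 0; gx < gridX; ++gx) {
            const int r0 = img.rows * gy / gridY, r1 = img.rows * (gy + 1) / gridY;
            const int c0 = img.cols * gx / gridX, c1 = img.cols * (gx + 1) / gridX;
            if (r0 == r1 || c0 == c1) continue;
            const int t = otsuThreshold(img, r0, r1, c0, c1);
            for (int r = r0; r < r1; ++r)
                for (int c = c0; c < c1; ++c)
                    out.px[r * img.cols + c] = img.px[r * img.cols + c] > t ? 255 : 0;
        }
    }
    return out;
}
```

  Calling spatialOtsu(img, 1, 1) reduces to plain global Otsu, which makes it easy to compare the two on the same image.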


Figure 6: plate image (left), plain Otsu threshold result (middle), spatial Otsu threshold result (right)

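  Recognition also gains an adaptive threshold method for Chinese characters. It binarizes the character 川 better than Otsu does. Both methods are applied and the result with the highest recognition probability is kept, which significantly raises the recognition accuracy for Chinese characters. When recognizing the Chinese character, a small sliding window is also added to compensate for imprecise localization when the Chinese character is looked up directly via the province character.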
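  The "try both binarizations and keep the better one" idea can be sketched like this. The mean-based adaptive threshold below is one common variant and is an assumption, as are the Gray type, the function names, and the score callback, which stands in for the character classifier.

```cpp
#include <algorithm>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical minimal 8-bit grayscale image, stored row-major.
struct Gray {
    int rows, cols;
    std::vector<std::uint8_t> px;
};

// Mean adaptive threshold: a pixel becomes white when it is brighter than
// the mean of its (2 * radius + 1)^2 neighborhood minus offset.
Gray adaptiveMeanThreshold(const Gray& img, int radius, double offset) {
    Gray out{img.rows, img.cols, std::vector<std::uint8_t>(img.px.size(), 0)};
    for (int r = 0; r < img.rows; ++r) {
        for (int c = 0; c < img.cols; ++c) {
            long sum = 0;
            int count = 0;
            for (int rr = std::max(0, r - radius); rr <= std::min(img.rows - 1, r + radius); ++rr) {
                for (int cc = std::max(0, c - radius); cc <= std::min(img.cols - 1, c + radius); ++cc) {
                    sum += img.px[rr * img.cols + cc];
                    ++count;
                }
            }
            const double mean = static_cast<double>(sum) / count;
            out.px[r * img.cols + c] = img.px[r * img.cols + c] > mean - offset ? 255 : 0;
        }
    }
    return out;
}

// Keep whichever binarization the scoring function (for example a character
// classifier's probability) rates higher; ties go to the first argument.
const Gray& pickBetter(const Gray& a, const Gray& b,
                       const std::function<double(const Gray&)>& score) {
    return score(a) >= score(b) ? a : b;
}
```

  Letting the classifier arbitrate means neither thresholding method has to be right for every character; it only has to be right more often than the other on the cases it wins.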



V. New features and SVM model, new ANN model for Chinese recognition

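  To make plate classification more robust, the new version changes the features of the SVM model: a model using LBP features also judges well on low-contrast and poorly lit plate images. To improve Chinese recognition accuracy, a separate ANN model, ann_chinese, is now trained for the 31 classes of Chinese characters. When classifying Chinese characters, this model improves on the previous general model by nearly 10 percentage points.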
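  The basic 3x3 local binary pattern shows why LBP tolerates lighting changes: each pixel is encoded only by whether its neighbors are at least as bright, so adding a constant brightness leaves the code unchanged. The sketch below computes that basic code and a normalized histogram; the names are hypothetical and EasyPR's LBP feature may use a different variant.

```cpp
#include <array>
#include <cstdint>
#include <vector>

// Hypothetical minimal 8-bit grayscale image, stored row-major.
struct Gray {
    int rows, cols;
    std::vector<std::uint8_t> px;
};

// Basic 8-neighbor LBP code of an interior pixel: bit i is set when the
// i-th neighbor (clockwise from the top-left) is >= the center pixel.
std::uint8_t lbpCode(const Gray& img, int r, int c) {
    static const int dr[8] = {-1, -1, -1, 0, 1, 1, 1, 0};
    static const int dc[8] = {-1, 0, 1, 1, 1, 0, -1, -1};
    const std::uint8_t center = img.px[r * img.cols + c];
    unsigned code = 0;
    for (int i = 0; i < 8; ++i) {
        const std::uint8_t nb = img.px[(r + dr[i]) * img.cols + (c + dc[i])];
        if (nb >= center) code |= 1u << i;
    }
    return static_cast<std::uint8_t>(code);
}

// Normalized 256-bin histogram of LBP codes over all interior pixels.
std::array<double, 256> lbpHistogram(const Gray& img) {
    std::array<double, 256> hist{};
    int count = 0;
    for (int r = 1; r + 1 < img.rows; ++r) {
        for (int c = 1; c + 1 < img.cols; ++c) {
            hist[lbpCode(img, r, c)] += 1.0;
            ++count;
        }
    }
    if (count > 0)
        for (auto& h : hist) h /= count;
    return hist;
}
```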


VI. Miscellaneous

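  EasyPR released version 1.5-alpha a few days ago. Compared with alpha, the beta released today adds a Grid Search feature, further tunes some parameters of the text localization method, and removes some Chinese comments to improve compatibility on Windows. On the speed side, this version uses multithreading (OpenMP) for the first time to improve overall efficiency, which makes the final speed about 2x faster.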
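  A grid search for two thresholds can be sketched as an exhaustive sweep that keeps the best-scoring pair. The parameter names below borrow overlapThresh and threshIsCharacter from the listing purely as examples; GridPoint, gridSearch, and the evaluate callback (which would run the test set and return a score such as Fscore) are hypothetical and not EasyPR's Grid Search implementation.

```cpp
#include <functional>
#include <limits>
#include <vector>

// Hypothetical result of a sweep: the best parameter pair and its score.
struct GridPoint { double overlapThresh, threshIsCharacter, score; };

// Try every combination of the two parameter lists and keep the pair with
// the highest evaluation score (the first one wins on ties).
GridPoint gridSearch(const std::vector<double>& overlapValues,
                     const std::vector<double>& charValues,
                     const std::function<double(double, double)>& evaluate) {
    GridPoint best{0.0, 0.0, std::numeric_limits<double>::lowest()};
    for (double o : overlapValues) {
        for (double c : charValues) {
            const double s = evaluate(o, c);
            if (s > best.score) best = GridPoint{o, c, s};
        }
    }
    return best;
}
```

  The cost grows with the product of the list sizes, so in practice the grids are kept coarse and refined around the best point.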

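  Now for some shortcomings of the new version. Text localization is indeed very robust, but unfortunately it still takes nearly twice as long as color localization (roughly on par with Sobel localization). Later improvements will look at optimizing it. In addition, there are more algorithms to choose from for improving character segmentation, and a future version may attempt a larger rework there.

  A note on EasyPR: it is an open-source Chinese license plate recognition system, with code hosted on github and gitosc. The earlier blog posts in this series contain the development documentation and introductions for EasyPR to date.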

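Copyright notice:

  All text, images, and code in this article are jointly owned by the author and cnblogs (博客園). Reposting is welcome, but the author and source must be credited. Any unauthorized plagiarism or crawler scraping is an infringement, and the author and cnblogs reserve all rights.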

References:

  1. Character-MSER: Scene Text Detection with Robust Character Candidate Extraction Method, ICDAR 2015

  2. Seed-growing: A robust hierarchical detection method for scene text based on convolutional neural networks, ICME 2015
