The Ultimate Guide: Building a Mask R-CNN Model for Detecting Car Damage (with Python Walkthrough)

阿新 • Published: 2019-01-08

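Introduction

Applications of computer vision continue to amaze. From detecting objects in video to counting the people in a crowd, there seems to be no challenge that computer vision cannot overcome.

The goal of this article is to build a custom Mask R-CNN model that can detect damaged areas on a car (see the example image above). A basic use case: if users can upload photos and the damage can be assessed from them, an insurance company can process claims faster. Lenders underwriting car loans, especially for used cars, could use such a model as well.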

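What is Mask R-CNN?

Mask R-CNN is an instance segmentation model that lets us identify the pixel-level location of each object of a target class. "Instance segmentation" means segmenting the individual objects in a scene, whether or not they are of the same type, i.e. identifying each individual vehicle, person, and so on. Take a look at the following GIF of a Mask R-CNN model trained on the COCO dataset. As you can see, it identifies the pixel locations of cars, people, fruit, etc.

Mask R-CNN differs from classic object detection models such as Faster R-CNN: besides identifying the class and its bounding box, it also colors the pixels inside the bounding box that belong to that class. Which tasks need this extra detail? A few examples I can think of:

A self-driving car needs to know the exact pixel location of the road, and potentially of other cars as well, in order to avoid collisions
A robot may need the pixel location of the object it wants to pick up (Amazon's drones come to mind here)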
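
To make the difference between a bounding box and an instance mask concrete, here is a minimal sketch in plain Python. The toy 5x6 grid and the helper name `bounding_box` are illustrative assumptions, not part of any Mask R-CNN library. The point is only that a mask records which pixels belong to the object, while the box is just the tightest rectangle around them.

```python
def bounding_box(mask):
    """Return (y1, x1, y2, x2) of the tightest box around the 1-pixels.

    y2 and x2 are exclusive, so the box covers rows y1..y2-1.
    """
    rows = [y for y, row in enumerate(mask) if any(row)]
    cols = [x for x in range(len(mask[0])) if any(row[x] for row in mask)]
    return rows[0], cols[0], rows[-1] + 1, cols[-1] + 1


# Hypothetical toy mask: 1 marks a pixel that belongs to the object.
toy_mask = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]

y1, x1, y2, x2 = bounding_box(toy_mask)
box_area = (y2 - y1) * (x2 - x1)
mask_area = sum(sum(row) for row in toy_mask)
print("box:", (y1, x1, y2, x2), "box area:", box_area, "mask area:", mask_area)
```

The box covers more pixels than the mask does, which is exactly the extra detail a segmentation model provides over a plain detector.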

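How Mask R-CNN works

Before we build a Mask R-CNN model, let's first understand how it works. A good way to think of Mask R-CNN is as a combination of Faster R-CNN, which performs object detection (class + bounding box), and an FCN (Fully Convolutional Network), which performs pixel-level labeling. See the figure below:

Mask R-CNN is a combination of Faster R-CNN and FCN

Mask R-CNN is conceptually simple: Faster R-CNN provides two outputs for each candidate object, a class label and a bounding-box offset. To these, Mask R-CNN adds a third branch that outputs the object mask, a binary mask indicating which pixels inside the bounding box belong to the object. The mask output differs from the class and box outputs in that it requires extracting a much finer spatial layout of the object. For this, Mask R-CNN uses the Fully Convolutional Network (FCN) described below.

FCN is a popular algorithm for semantic segmentation. The model uses a series of convolution and max-pooling layers to first downsample the image to 1/32 of its original size. It then makes class predictions at this level of granularity. Finally, it uses upsampling and deconvolution layers to resize the result back to the original image dimensions.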
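
The 1/32 figure follows from stacking stride-2 downsampling stages: five halvings give 2^5 = 32. The sketch below is only this arithmetic. The number of stages (5) and the 1024-pixel input are assumptions chosen for illustration, and real networks may round odd sizes differently.

```python
def downsampled_size(size, num_stages=5, stride=2):
    """Spatial size after `num_stages` stride-`stride` downsampling stages.

    Uses integer (floor) division, as a pooling layer without padding would.
    """
    for _ in range(num_stages):
        size //= stride
    return size


height, width = 1024, 1024            # assumed input size
coarse = (downsampled_size(height), downsampled_size(width))
print(coarse)                         # coarse grid on which classes are predicted
print(height // coarse[0])            # overall reduction factor
```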

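In short, the Mask R-CNN architecture combines two networks, Faster R-CNN and FCN. The model's loss function is the total loss from classification, bounding-box generation, and mask generation.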
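
Written as an equation, and following the notation of the Mask R-CNN paper, the total loss for each sampled region of interest is the sum of three terms:

```latex
L = L_{cls} + L_{box} + L_{mask}
```

Here L_cls is the classification loss, L_box is the bounding-box regression loss, and L_mask is the mask loss, which the paper defines as the average per-pixel binary cross-entropy computed only on the mask of the ground-truth class.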

Mask R-CNN also includes a few additional improvements that make it more accurate than FCN. Details can be found in the paper.

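How to build a Mask R-CNN model for car damage detection

To build a custom Mask R-CNN, we will use the Matterport GitHub repository. The latest TensorFlow Object Detection library also offers an option to build Mask R-CNN, but it is easy to run into errors when using it: the TensorFlow version, the Object Detection version, and the mask format are all possible causes. I recommend the Matterport repository here.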

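Collecting data

For this exercise, I collected 66 images of damaged cars from Google (50 for training and 16 for validation). Take a look at some examples below.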

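Annotating data

The Mask R-CNN model requires the user to annotate the images and mark the damaged regions. The annotation tool I used is VGG Image Annotator (VIA), v1.0.6. It can be used via this link.

Once all annotations are created, you can download them and save them in JSON format. You can view the images and annotations in the customImages folder of my repository.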
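
For orientation, the sketch below shows the rough shape of a VIA 1.x export and how polygon outlines can be pulled out of it. The file names, sizes, and coordinates are made up for illustration. The key layout (`filename`, `regions`, `shape_attributes`, `all_points_x`, `all_points_y`) matches what the loading code in the training section reads. Other VIA versions may lay out `regions` differently (for example as a list), so check your own export.

```python
import json

# Hypothetical VIA 1.x export with one annotated and one unannotated image.
via_json = """
{
  "car1.jpg12345": {
    "filename": "car1.jpg",
    "size": 12345,
    "regions": {
      "0": {
        "shape_attributes": {
          "name": "polygon",
          "all_points_x": [10, 60, 60, 10],
          "all_points_y": [20, 20, 50, 50]
        },
        "region_attributes": {}
      }
    }
  },
  "car2.jpg67890": {"filename": "car2.jpg", "size": 67890, "regions": {}}
}
"""

annotations = list(json.loads(via_json).values())
# Keep only images that have at least one annotated region.
annotated = [a for a in annotations if a["regions"]]

for a in annotated:
    for region in a["regions"].values():
        shape = region["shape_attributes"]
        points = list(zip(shape["all_points_x"], shape["all_points_y"]))
        print(a["filename"], shape["name"], points)
```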

Training the model

Now we can start training the model. First, clone the Matterport Mask R-CNN repository.

Next, we load our images and annotations.

import json
import os

import skimage.io

from mrcnn import utils


class CustomDataset(utils.Dataset):

    def load_custom(self, dataset_dir, subset):
        """Load a subset of the custom damage dataset.
        dataset_dir: Root directory of the dataset.
        subset: Subset to load: train or val
        """
        # Add classes. We have only one class to add.
        self.add_class("damage", 1, "damage")

        # Train or validation dataset?
        assert subset in ["train", "val"]
        dataset_dir = os.path.join(dataset_dir, subset)

        # We mostly care about the x and y coordinates of each region
        with open(os.path.join(dataset_dir, "via_region_data.json")) as f:
            annotations1 = json.load(f)
        annotations = list(annotations1.values())  # don't need the dict keys

        # The VIA tool saves images in the JSON even if they don't have any
        # annotations. Skip unannotated images.
        annotations = [a for a in annotations if a['regions']]

        # Add images
        for a in annotations:
            # Get the x, y coordinates of the points of the polygons that make
            # up the outline of each object instance. These are stored in the
            # shape_attributes of each region in the VIA JSON.
            polygons = [r['shape_attributes'] for r in a['regions'].values()]

            # load_mask() needs the image size to convert polygons to masks.
            image_path = os.path.join(dataset_dir, a['filename'])
            image = skimage.io.imread(image_path)
            height, width = image.shape[:2]
            self.add_image(
                "damage",  # for a single class just add the name here
                image_id=a['filename'],  # use file name as a unique image id
                path=image_path,
                width=width, height=height,
                polygons=polygons)
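
The `polygons` stored by `add_image` are later turned into per-instance binary masks by a `load_mask()` method. The Matterport balloon sample does this with `skimage.draw.polygon`. The sketch below swaps in a dependency-free even-odd (ray casting) test at pixel centers, purely to show the idea. The function names and the toy polygon are assumptions for illustration, and the result may differ from skimage's rasterization along the polygon boundary.

```python
def point_in_polygon(px, py, xs, ys):
    """Even-odd rule: count polygon edges crossed by a ray going right."""
    inside = False
    n = len(xs)
    for i in range(n):
        x1, y1 = xs[i], ys[i]
        x2, y2 = xs[(i + 1) % n], ys[(i + 1) % n]
        if (y1 > py) != (y2 > py):
            # x coordinate where the edge crosses the horizontal line y = py
            x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if px < x_cross:
                inside = not inside
    return inside


def polygon_to_mask(height, width, xs, ys):
    """Return a height x width grid of 0/1, testing each pixel center."""
    return [
        [1 if point_in_polygon(x + 0.5, y + 0.5, xs, ys) else 0
         for x in range(width)]
        for y in range(height)
    ]


# Hypothetical region in the VIA polygon layout (all_points_x / all_points_y).
region = {"all_points_x": [1, 5, 5, 1], "all_points_y": [1, 1, 4, 4]}
mask = polygon_to_mask(6, 7, region["all_points_x"], region["all_points_y"])
for row in mask:
    print("".join(str(v) for v in row))
```

In a real `load_mask()` there would be one such mask per polygon, stacked along a third axis, together with an array of class IDs (all 1 here, since there is a single "damage" class).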

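I took the balloon.py file shared by Matterport and modified it into custom code that loads the images and annotations and adds them to a CustomDataset class. See custom.py in my repository for the full code. The code can be adapted to other detection tasks (note: it handles only one class).

To train the model, we use a COCO-trained model as the checkpoint and perform transfer learning. You can download this model from the Matterport repository.

Run the following commands to train the model: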

# Train a new model starting from pre-trained COCO weights
python3 custom.py train --dataset=/path/to/datasetfolder --weights=coco

# Resume training a model that you had trained earlier
python3 custom.py train --dataset=/path/to/datasetfolder --weights=last
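
Both commands pass the same two options, and only the value of the weights option changes. As a rough sketch of how a script like custom.py could read them, here is a minimal `argparse` setup. It is an assumption about the script's structure, not a copy of it: the actual custom.py may define more options or resolve `coco` and `last` to weight files differently.

```python
import argparse


def build_parser():
    """Parser for a hypothetical training script with the flags used above."""
    parser = argparse.ArgumentParser(
        description="Train Mask R-CNN on a custom dataset.")
    parser.add_argument("command", choices=["train"], help="what to do")
    parser.add_argument("--dataset", required=True,
                        help="root directory of the dataset")
    parser.add_argument("--weights", required=True,
                        help="'coco', 'last', or a path to a weights file")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(
    ["train", "--dataset=/path/to/datasetfolder", "--weights=coco"]
)
print(args.command, args.dataset, args.weights)
```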

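I used a GPU and trained the model for 10 epochs in 20-30 minutes.

Validating your model

You can inspect the model weights with the code in the notebook inspect_custom_weights.ipynb. Point the notebook at your last checkpoint. The notebook helps with a sanity check: are the weights and biases properly distributed? See the sample output below:

Running the model on images and making predictions

Use the notebook inspect_custom_model to run the model on images from the validation set and view its predictions. See the sample results below: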
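
In the Matterport implementation, `model.detect()` returns one dictionary per image with the keys `rois`, `class_ids`, `scores`, and `masks`, where `masks` has shape [height, width, instances]. The sketch below uses a hand-made stand-in for such a result (nested lists instead of NumPy arrays, with made-up values) to show one way of turning predictions into a number, here the fraction of the image covered by confident damage detections. The 0.9 threshold and the helper name `damaged_fraction` are assumptions.

```python
def damaged_fraction(result, min_score=0.9):
    """Fraction of pixels covered by any instance scoring at least min_score."""
    keep = [i for i, s in enumerate(result["scores"]) if s >= min_score]
    masks = result["masks"]                      # indexed as masks[y][x][instance]
    height, width = len(masks), len(masks[0])
    covered = sum(
        1
        for y in range(height)
        for x in range(width)
        if any(masks[y][x][i] for i in keep)
    )
    return covered / (height * width)


# Hand-made stand-in for one detection result on a 2x4 image with 2 instances.
fake_result = {
    "rois": [[0, 0, 1, 2], [1, 2, 2, 4]],        # (y1, x1, y2, x2) per instance
    "class_ids": [1, 1],
    "scores": [0.97, 0.55],
    "masks": [
        [[1, 0], [1, 0], [0, 0], [0, 0]],
        [[0, 0], [0, 0], [0, 1], [0, 1]],
    ],
}

print(damaged_fraction(fake_result))             # only the 0.97 instance counts
print(damaged_fraction(fake_result, min_score=0.5))
```

A number like this is only as reliable as the masks behind it, so it is worth checking it against the visualized predictions in the notebook before using it for anything such as claim triage.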

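With that, we have built a Mask R-CNN model that detects damage on cars.

End notes

Mask R-CNN is the next step in the evolution of object detection models, aimed at more precise detection. Matterport has open-sourced its repository, which lets us build custom models for many more meaningful tasks. This article is just one small example of what a Mask R-CNN model can do.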
