Deep Fun | 12 Let's Get Our Hands Moving

Python TensorFlow · Published 2018-09-20 09:44:04


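Building a real-time hand detector with TensorFlow

The approach is similar to using Inception-v3 with transfer learning for a custom image classification task

Building on the previous lesson, we add annotated hand data and fine-tune a pretrained model through transfer learning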

Data

The hand detection data comes from

vision.soic.indiana.edu/projects/eg…

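The images were captured with Google Glass. egohands_data.zip is an archive containing 48 folders, one for each of 48 different scenes (indoors, outdoors, playing chess, and so on), with 4,800 annotated images in total. Each annotation consists of all the contour points of the hands

We do not need to unpack the archive by hand, though; code takes care of extracting and organizing the data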

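egohands_dataset_clean.py performs the following steps in order

If egohands_data.zip is not in the current directory, download it by calling download_egohands_dataset()
Otherwise extract egohands_data.zip to obtain the egohands folder and run rename_files() on the images inside it
rename_files() renames every image by adding the name of its parent folder, which avoids duplicate image names, and then calls generate_csv_files()
generate_csv_files() reads the images of each scene and calls get_bbox_visualize(), which uses the annotation file polygons.mat to draw and display the hand contours and bounding boxes while converting the annotations and storing them as csv files; once everything is processed it calls split_data_test_eval_train()
split_data_test_eval_train() splits the data into a training set and a test set, creating train and test folders inside the images folder to hold the corresponding images and csv annotations
Once these steps are done, the egohands folder produced by the extraction can be deleted by hand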
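
The key conversion in the steps above is turning a hand contour into a rectangle. The following is a minimal sketch of that idea only, not the actual code of get_bbox_visualize(); polygon_to_bbox is a hypothetical helper and the sample contour points are made up.

```python
import math


def polygon_to_bbox(points):
    """Return (xmin, ymin, xmax, ymax) enclosing a list of (x, y) contour points."""
    if not points:
        raise ValueError("polygon has no points")
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    # Round outward so the box fully contains the contour.
    return (int(math.floor(min(xs))), int(math.floor(min(ys))),
            int(math.ceil(max(xs))), int(math.ceil(max(ys))))


# Made-up contour of a single hand, in pixel coordinates.
hand_contour = [(120.4, 80.2), (180.9, 75.0), (200.1, 140.7), (130.0, 150.3)]
print(polygon_to_bbox(hand_contour))  # (120, 75, 201, 151)
```

An image with several hands would simply produce one such box, and therefore one csv row, per contour.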

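In other words, we go from egohands_data.zip to the images folder, which took about 6 minutes on my laptop

Next, run generate_tfrecord.py to turn the training and test sets into TFRecord files

Since only hands need to be detected here, there is a single object class, hand. To adapt this to another object detection task, modify the following code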

def class_text_to_int(row_label):
    # Map the class name from the csv file to its integer id
    if row_label == 'hand':
        return 1
    else:
        return None
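
For a task with more than one class, the if/else chain can be replaced by a lookup table. This is a sketch under assumptions: the classes cup and phone are hypothetical and only illustrate the pattern. The ids start at 1 because id 0 is reserved for the background in Object Detection API label maps, and they need to match the ids in the .pbtxt label map.

```python
# Hypothetical class list for a custom task; only 'hand' comes from this lesson.
CLASS_IDS = {'hand': 1, 'cup': 2, 'phone': 3}


def class_text_to_int(row_label):
    # Unknown labels map to None, mirroring the single-class version.
    return CLASS_IDS.get(row_label)


print(class_text_to_int('cup'))
```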

Run the following two commands to generate the TFRecord files for the training set and the test set

python generate_tfrecord.py --csv_input=images/train/train_labels.csv --output_path=retrain/train.record

python generate_tfrecord.py --csv_input=images/test/test_labels.csv --output_path=retrain/test.record
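
Scripts of this kind usually group the csv rows by image and store box corners scaled to the [0, 1] range before writing each record. The sketch below shows only that preprocessing with the standard library and is not taken from generate_tfrecord.py. The column names (filename, width, height, class, xmin, ymin, xmax, ymax) are an assumption about the csv layout, and the sample rows are made up.

```python
import csv
import io
from collections import defaultdict

# Made-up sample rows; the column names are an assumed layout.
SAMPLE_CSV = """filename,width,height,class,xmin,ymin,xmax,ymax
a.jpg,1280,720,hand,320,180,640,360
a.jpg,1280,720,hand,0,0,128,72
b.jpg,1280,720,hand,640,360,1280,720
"""


def group_boxes(csv_text):
    """Group rows by image and scale box corners to the [0, 1] range."""
    grouped = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        width, height = float(row['width']), float(row['height'])
        grouped[row['filename']].append({
            'label': row['class'],
            'xmin': int(row['xmin']) / width,
            'ymin': int(row['ymin']) / height,
            'xmax': int(row['xmax']) / width,
            'ymax': int(row['ymax']) / height,
        })
    return dict(grouped)


boxes = group_boxes(SAMPLE_CSV)
print(len(boxes['a.jpg']), boxes['a.jpg'][0])
```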

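Model

We still use ssd_mobilenet_v1_coco from the previous lesson, but since only hands need to be detected, the model has to be adapted to the custom annotated data through transfer learning

The retrain folder contains the following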

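train.record and test.record: the annotated data for the custom object detection task
ssd_mobilenet_v1_coco_11_06_2017: the pretrained ssd_mobilenet_v1_coco model
ssd_mobilenet_v1_coco.config: the configuration file for training the model with transfer learning
hand_label_map.pbtxt: the mapping between class names and class ids
retrain.py: the training code for transfer learning
object_detection: some helper files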

A template for the configuration file ssd_mobilenet_v1_coco.config is available here

github.com/tensorflow/…

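Edit the configuration file as needed, mainly the entries that contain PATH_TO_BE_CONFIGURED

num_classes: the number of object classes, 1 in this case
fine_tune_checkpoint: the checkpoint file of the pretrained model
train_input_reader: the training data input_path and the label map path label_map_path
eval_input_reader: the test data input_path and the label map path label_map_path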
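
The edits listed above can also be scripted. The following sketch works on an abbreviated stand-in for the template, not on the real file, and the replacement paths are assumptions that treat the retrain folder as the working directory; fill_config is a hypothetical helper.

```python
import re

# Abbreviated stand-in for the real template; only the edited fields are shown.
TEMPLATE = '''model { ssd { num_classes: 90 } }
train_config { fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED/model.ckpt" }
train_input_reader {
  tf_record_input_reader { input_path: "PATH_TO_BE_CONFIGURED/mscoco_train.record" }
  label_map_path: "PATH_TO_BE_CONFIGURED/mscoco_label_map.pbtxt"
}
eval_input_reader {
  tf_record_input_reader { input_path: "PATH_TO_BE_CONFIGURED/mscoco_val.record" }
  label_map_path: "PATH_TO_BE_CONFIGURED/mscoco_label_map.pbtxt"
}
'''

# Assumed paths, relative to the retrain folder.
REPLACEMENTS = {
    'PATH_TO_BE_CONFIGURED/model.ckpt': 'ssd_mobilenet_v1_coco_11_06_2017/model.ckpt',
    'PATH_TO_BE_CONFIGURED/mscoco_train.record': 'train.record',
    'PATH_TO_BE_CONFIGURED/mscoco_val.record': 'test.record',
    'PATH_TO_BE_CONFIGURED/mscoco_label_map.pbtxt': 'hand_label_map.pbtxt',
}


def fill_config(template, num_classes, replacements):
    """Set num_classes and substitute every placeholder path."""
    text = re.sub(r'num_classes:\s*\d+', 'num_classes: %d' % num_classes, template)
    for placeholder, value in replacements.items():
        text = text.replace(placeholder, value)
    return text


config = fill_config(TEMPLATE, 1, REPLACEMENTS)
print(config)
```

Checking that the string PATH_TO_BE_CONFIGURED no longer appears in the result is a cheap way to catch a forgotten entry before starting a long training run.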

The label map file hand_label_map.pbtxt looks like this, with a single class

item {
  id: 1
  name: 'hand'
}

Start the transfer training with the following command, where train_dir is the model output path and pipeline_config_path is the path of the configuration file

python retrain.py --logtostderr --train_dir=output/ --pipeline_config_path=ssd_mobilenet_v1_coco.config

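When the transfer training finishes, the output folder contains the generated model files such as .data, .index and .meta

Use TensorBoard to inspect the training process, for example the model's total loss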

tensorboard --logdir='output'

Finally, use export_inference_graph.py to package the model into a .pb file

--pipeline_config_path: path of the configuration file
--trained_checkpoint_prefix: path of the model checkpoint
--output_directory: output path for the .pb file

python export_inference_graph.py --input_type image_tensor --pipeline_config_path retrain/ssd_mobilenet_v1_coco.config --trained_checkpoint_prefix retrain/output/model.ckpt-153192 --output_directory hand_detection_inference_graph
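
The step number in model.ckpt-153192 depends on where training stopped, so it will differ from run to run. A small sketch for picking the newest checkpoint prefix from a list of file names follows; latest_checkpoint_prefix is a hypothetical helper and the file list is made up. In practice the names could come from os.listdir('retrain/output').

```python
import re


def latest_checkpoint_prefix(filenames):
    """Return the 'model.ckpt-N' prefix with the largest step N, or None."""
    steps = []
    for name in filenames:
        match = re.match(r'model\.ckpt-(\d+)\.index$', name)
        if match:
            steps.append(int(match.group(1)))
    return 'model.ckpt-%d' % max(steps) if steps else None


# Made-up directory listing.
files = [
    'checkpoint',
    'model.ckpt-150000.index',
    'model.ckpt-153192.index',
    'model.ckpt-153192.meta',
    'model.ckpt-153192.data-00000-of-00001',
]
print(latest_checkpoint_prefix(files))  # model.ckpt-153192
```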

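Running it creates the folder hand_detection_inference_graph, which contains a frozen_inference_graph.pb file

Application

The trained hand detection model can now be used to build a real-time hand detector

Only the following three lines need to change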

PATH_TO_CKPT = 'hand_detection_inference_graph/frozen_inference_graph.pb'
PATH_TO_LABELS = 'retrain/hand_label_map.pbtxt'
NUM_CLASSES = 1

The complete code is as follows

# -*- coding: utf-8 -*-
# Uses the TensorFlow 1.x API (tf.GraphDef, tf.gfile, tf.Session)

import numpy as np
import tensorflow as tf

from utils import label_map_util
from utils import visualization_utils as vis_util

import cv2
cap = cv2.VideoCapture(0)  # default webcam

PATH_TO_CKPT = 'hand_detection_inference_graph/frozen_inference_graph.pb'
PATH_TO_LABELS = 'retrain/hand_label_map.pbtxt'
NUM_CLASSES = 1

# Load the frozen inference graph
detection_graph = tf.Graph()
with detection_graph.as_default():
    od_graph_def = tf.GraphDef()
    with tf.gfile.GFile(PATH_TO_CKPT, 'rb') as fid:
        od_graph_def.ParseFromString(fid.read())
        tf.import_graph_def(od_graph_def, name='')

# Build the mapping from class id to class name
label_map = label_map_util.load_labelmap(PATH_TO_LABELS)
categories = label_map_util.convert_label_map_to_categories(label_map, max_num_classes=NUM_CLASSES, use_display_name=True)
category_index = label_map_util.create_category_index(categories)

with detection_graph.as_default():
    with tf.Session(graph=detection_graph) as sess:
        # Input and output tensors of the detection graph
        image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')
        detection_boxes = detection_graph.get_tensor_by_name('detection_boxes:0')
        detection_scores = detection_graph.get_tensor_by_name('detection_scores:0')
        detection_classes = detection_graph.get_tensor_by_name('detection_classes:0')
        num_detections = detection_graph.get_tensor_by_name('num_detections:0')
        while True:
            ret, image_np = cap.read()
            if not ret:  # stop when no frame could be read
                break
            # OpenCV delivers BGR frames, the model expects RGB
            image_np = cv2.cvtColor(image_np, cv2.COLOR_BGR2RGB)
            # Add a batch dimension: [1, height, width, 3]
            image_np_expanded = np.expand_dims(image_np, axis=0)
            (boxes, scores, classes, num) = sess.run(
                [detection_boxes, detection_scores, detection_classes, num_detections],
                feed_dict={image_tensor: image_np_expanded})

            # Draw the detected boxes and labels onto the frame
            vis_util.visualize_boxes_and_labels_on_image_array(image_np, np.squeeze(boxes), np.squeeze(classes).astype(np.int32), np.squeeze(scores), category_index, use_normalized_coordinates=True, line_thickness=8)

            cv2.imshow('hand detection', cv2.cvtColor(image_np, cv2.COLOR_RGB2BGR))
            # Press q to quit
            if cv2.waitKey(25) & 0xFF == ord('q'):
                break

# Release the camera and close the window on exit
cap.release()
cv2.destroyAllWindows()
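
To do something with the detections beyond drawing them, such as cropping the hand region, the normalized boxes have to be converted to pixel coordinates. The sketch below assumes the Object Detection API box layout of (ymin, xmin, ymax, xmax) in the [0, 1] range; to_pixel_boxes is a hypothetical helper, and the 0.5 threshold and the sample values are illustrative only.

```python
def to_pixel_boxes(boxes, scores, image_width, image_height, min_score=0.5):
    """Keep detections scoring at least min_score and convert them to pixels.

    Each input box is (ymin, xmin, ymax, xmax) in normalized coordinates.
    Returns a list of (left, top, right, bottom, score) tuples.
    """
    results = []
    for (ymin, xmin, ymax, xmax), score in zip(boxes, scores):
        if score < min_score:
            continue
        results.append((int(xmin * image_width), int(ymin * image_height),
                        int(xmax * image_width), int(ymax * image_height),
                        score))
    return results


# Made-up detections: one confident, one below the threshold.
sample_boxes = [(0.25, 0.5, 0.75, 1.0), (0.0, 0.0, 0.5, 0.5)]
sample_scores = [0.9, 0.2]
print(to_pixel_boxes(sample_boxes, sample_scores, image_width=640, image_height=480))
# [(320, 120, 640, 360, 0.9)]
```

Inside the capture loop the inputs would be np.squeeze(boxes) and np.squeeze(scores), with image_np.shape[1] as the width and image_np.shape[0] as the height.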

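After running the code, the hand detection results appear on the webcam feed

Custom detection tasks

To build your own detection task, prepare some images and annotate them by hand; a few hundred are roughly enough

Use labelImg to annotate the images; see the link below for installation instructions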

github.com/tzutalin/la…

Go into the labelImg folder and run the following command; the two arguments are the image directory and the path of the class file

python labelImg.py ../imgs/ ../predefined_classes.txt

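In the annotation interface, press w to start drawing a rectangle and press Ctrl+S to save the annotation to the xml folder

Then run xml_to_csv.py to convert the .xml files into a .csv file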
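
labelImg saves annotations in Pascal VOC XML by default, so the conversion amounts to reading one box per object element. The following sketch shows this with the standard library; it is not the code of xml_to_csv.py, xml_to_rows is a hypothetical helper, and both the sample annotation and the output column order are assumptions.

```python
import xml.etree.ElementTree as ET

# Made-up annotation in Pascal VOC layout.
SAMPLE_XML = """<annotation>
  <filename>img_001.jpg</filename>
  <size><width>640</width><height>480</height><depth>3</depth></size>
  <object>
    <name>hand</name>
    <bndbox><xmin>100</xmin><ymin>120</ymin><xmax>220</xmax><ymax>260</ymax></bndbox>
  </object>
</annotation>"""


def xml_to_rows(xml_text):
    """Return one (filename, width, height, class, xmin, ymin, xmax, ymax) row per object."""
    root = ET.fromstring(xml_text)
    filename = root.findtext('filename')
    width = int(root.findtext('size/width'))
    height = int(root.findtext('size/height'))
    rows = []
    for obj in root.findall('object'):
        box = obj.find('bndbox')
        rows.append((filename, width, height, obj.findtext('name'),
                     int(box.findtext('xmin')), int(box.findtext('ymin')),
                     int(box.findtext('xmax')), int(box.findtext('ymax'))))
    return rows


print(xml_to_rows(SAMPLE_XML))
```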

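In short, to prepare the TFRecord data, follow these steps

Create the train and test folders and distribute the images between them
Annotate the training and test images by hand
Convert the .xml files of the training set and of the test set into one .csv each
Generate the corresponding TFRecord from the original images and the .csv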
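
The first step, distributing images between train and test, can be done with a seeded random shuffle so that the split is reproducible. This is a sketch under assumptions: split_train_test is a hypothetical helper, and the 20% test ratio and the file names are illustrative only.

```python
import random


def split_train_test(filenames, test_ratio=0.1, seed=0):
    """Shuffle the names reproducibly and return (train_names, test_names)."""
    names = sorted(filenames)
    random.Random(seed).shuffle(names)
    # Keep at least one image in the test set.
    n_test = max(1, int(len(names) * test_ratio))
    return names[n_test:], names[:n_test]


# Made-up file names.
all_images = ['img_%03d.jpg' % i for i in range(10)]
train_names, test_names = split_train_test(all_images, test_ratio=0.2)
print(len(train_names), len(test_names))  # 8 2
```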
