tensorflow入門教程(二十五)Object Detection API目標檢測(下)

阿新 • • 發佈：2019-01-29

1、概述

上一講，我們使用了別人根據COCO資料集訓練好的模型來做目標檢測，這一講，我們就來訓練自己的模型。

2、下載資料集

為了方便學習，我們先使用別人整理好的資料集來訓練---VOC 2012資料集。VOC 2012一共有17125張圖片，每張圖片都有標註，標註的內容包括人、動物、交通工具、傢俱等20個類別。首先下載資料集，資料集很大，有1.9G，慢慢下吧～連結如下，

3、修改環境變數

為了不用每次都將檔案拷貝到my_object_detection資料夾下，我們可以將my_object_detection目錄設定進Python的環境變數PYTHONPATH中，執行以下命令

export PYTHONPATH=$PYTHONPATH:/home/wilf/tensorflow-master/demo/my_object_detection:/home/wilf/tensorflow-master/demo/my_object_detection/slim

為了不用每次開機都執行這個命令，可以將其寫入到~/.bashrc檔案中。

4、VOC2012資料集結構簡介

轉換之前，先來看一下VOC2012資料集的結構，先將我們下載的檔案VOCtrainval_11-May-2012.tar解壓到my_images資料夾下，得到的目錄結構為

my_images/VOCdevkit/VOC2012/

VOC2012資料夾下包含5個子資料夾，如下圖所示，

JPEGImages資料夾中儲存了所有的圖片，每一張圖片對應的物體框的標註存在Annotations資料夾中，如下圖所示，

看看它是怎麼標註的（註釋是我加上去的），

對應的圖片如下，

分割圖片如下（<segmented>1</segmented>），

5、將VOC2012資料集轉成tfrecord格式

接下來將VOC2012資料集轉為tfrecord格式，在object_detection資料夾下執行以下命令，

訓練資料：

python dataset_tools/create_pascal_tf_record.py --data_dir=my_images/VOCdevkit/ --year=VOC2012 --output_path=my_images/VOCdevkit/pascal_train.record --set=train

測試資料：

python dataset_tools/create_pascal_tf_record.py --data_dir=my_images/VOCdevkit/ --year=VOC2012 --output_path=my_images/VOCdevkit/pascal_val.record --set=val

執行完以後，在my_images/VOCdevkit/資料夾下生成兩個檔案，pascal_train.record 和pascal_val.record。

6、下載模型

接著，下載模型，還是跟上一講一樣的連結，

下載faster_rcnn_inception_resnet_v2_atrous_coco模型，

data/pascal_label_map.pbtxt檔案則對於與VOC2012的label，總共有20個分類。

下載完後，將其解壓到my_images資料夾下，得到資料夾如下，

7、配置檔案

接下來呢，新建配置檔案，samples/configs/資料夾下有一些示例檔案，我們就模仿它們配置，參考faster_rcnn_inception_resnet_v2_atrous_coco.config檔案，執行命令，

cp samples/configs/faster_rcnn_inception_resnet_v2_atrous_coco.config samples/configs/faster_rcnn_inception_resnet_v2_atrous_voc2012.config

將num_classes: 90改為num_classes: 20
將num_examples: 8000改為num_examples: 5823，這個5823怎麼來？上面執行的將VOC2012資料集轉為tfrecord格式中，將create_pascal_tf_record.py中的examples_list的長度打印出來就得到這個5823，這個examples_list就是在驗證階段需要執行的圖片數量，命令為

python dataset_tools/create_pascal_tf_record.py --data_dir=my_images/VOCdevkit/ --year=VOC2012 --output_path=my_images/VOCdevkit/pascal_val.record --set=val

5處PATH_TO_BE_CONFIGURED的地方修改成對應的我們新建的目錄

然後，在my_images資料夾下新建一個資料夾train_dir，用來儲存訓練模型。

上面配置檔案完整內容如下，

# Faster R-CNN with Inception Resnet v2, Atrous version;
# Configured for VOC2012 Dataset.
# Users should configure the fine_tune_checkpoint field in the train config as
# well as the label_map_path and input_path fields in the train_input_reader and
# eval_input_reader. Search for "PATH_TO_BE_CONFIGURED" to find the fields that
# should be configured.

model {
  faster_rcnn {
    num_classes: 20
    image_resizer {
      keep_aspect_ratio_resizer {
        min_dimension: 600
        max_dimension: 1024
      }
    }
    feature_extractor {
      type: 'faster_rcnn_inception_resnet_v2'
      first_stage_features_stride: 8
    }
    first_stage_anchor_generator {
      grid_anchor_generator {
        scales: [0.25, 0.5, 1.0, 2.0]
        aspect_ratios: [0.5, 1.0, 2.0]
        height_stride: 8
        width_stride: 8
      }
    }
    first_stage_atrous_rate: 2
    first_stage_box_predictor_conv_hyperparams {
      op: CONV
      regularizer {
        l2_regularizer {
          weight: 0.0
        }
      }
      initializer {
        truncated_normal_initializer {
          stddev: 0.01
        }
      }
    }
    first_stage_nms_score_threshold: 0.0
    first_stage_nms_iou_threshold: 0.7
    first_stage_max_proposals: 300
    first_stage_localization_loss_weight: 2.0
    first_stage_objectness_loss_weight: 1.0
    initial_crop_size: 17
    maxpool_kernel_size: 1
    maxpool_stride: 1
    second_stage_box_predictor {
      mask_rcnn_box_predictor {
        use_dropout: false
        dropout_keep_probability: 1.0
        fc_hyperparams {
          op: FC
          regularizer {
            l2_regularizer {
              weight: 0.0
            }
          }
          initializer {
            variance_scaling_initializer {
              factor: 1.0
              uniform: true
              mode: FAN_AVG
            }
          }
        }
      }
    }
    second_stage_post_processing {
      batch_non_max_suppression {
        score_threshold: 0.0
        iou_threshold: 0.6
        max_detections_per_class: 100
        max_total_detections: 100
      }
      score_converter: SOFTMAX
    }
    second_stage_localization_loss_weight: 2.0
    second_stage_classification_loss_weight: 1.0
  }
}

train_config: {
  batch_size: 1
  optimizer {
    momentum_optimizer: {
      learning_rate: {
        manual_step_learning_rate {
          initial_learning_rate: 0.0003
          schedule {
            step: 900000
            learning_rate: .00003
          }
          schedule {
            step: 1200000
            learning_rate: .000003
          }
        }
      }
      momentum_optimizer_value: 0.9
    }
    use_moving_average: false
  }
  gradient_clipping_by_norm: 10.0
  fine_tune_checkpoint: "my_images/faster_rcnn_inception_resnet_v2_atrous_coco_2018_01_28/model.ckpt"
  from_detection_checkpoint: true
  # Note: The below line limits the training process to 200K steps, which we
  # empirically found to be sufficient enough to train the pets dataset. This
  # effectively bypasses the learning rate schedule (the learning rate will
  # never decay). Remove the below line to train indefinitely.
  num_steps: 200000
  data_augmentation_options {
    random_horizontal_flip {
    }
  }
}

train_input_reader: {
  tf_record_input_reader {
    input_path: "my_images/VOCdevkit/pascal_train.record"
  }
  label_map_path: "data/pascal_label_map.pbtxt"
}

eval_config: {
  num_examples: 5823
  # Note: The below line limits the evaluation process to 10 evaluations.
  # Remove the below line to evaluate indefinitely.
  max_evals: 10
}

eval_input_reader: {
  tf_record_input_reader {
    input_path: "my_images/VOCdevkit/pascal_val.record"
  }
  label_map_path: "data/pascal_label_map.pbtxt"
  shuffle: false
  num_readers: 1
}

8、開始訓練

執行如下命令，

python train.py --train_dir=my_images/train_dir/ --pipeline_config_path=samples/configs/faster_rcnn_inception_resnet_v2_atrous_voc2012.config

報錯了，

TypeError: __new__() got an unexpected keyword argument 'serialized_options'

似曾相識啊，在《tensorflow入門筆記(二十三)Object Detection API目標檢測(上)》那講也遇到了這個錯誤，將這個引數去掉試試。

唉，又報錯了，又是OOM記憶體溢位～屌絲的春天什麼時候才到呢？？

那就用CPU咯，在train.py中加入以下程式碼，

#原諒我窮屌絲，電腦顯示卡配置太低導致記憶體溢位，只能用cpu計算了
os.environ["CUDA_VISIBLE_DEVICES"]="-1"

再執行看看，

天吶～又出錯了，而且沒有什麼提示，這不是在為難我嗎？？！！

我猜可能是記憶體溢位，我們在程式執行的時候不定時的看看記憶體的佔用情況，

一開始，可用記憶體有7.5G這樣，

崩潰前，大概就剩下一百多M了～～!!看來沒法玩了～這兩天去淘寶塊記憶體條先了。

兩天過去，買了個16G的記憶體條，加上原來的8G，這下應該夠用了吧？還買了個460G固態硬碟，還在路上。安裝好記憶體條以後，走起！

哎喲我去，腿腳麻利了，一口氣能上五樓！先出去逛個街，回來再看看效果～

我勒個去，三個多小時過去，才476步！CPU這效率，看來的上的大點記憶體的顯示卡了。

9、匯出模型

訓練完以後，如何對單張圖片進行目標檢測呢？

Object Detection API提供了一個export_inference_graph.py指令碼用於匯出訓練好的模型，我們先將訓練好的checkpoint匯出成“,pb”檔案，再用上一講的程式碼，對圖片進行目標檢測。匯出模型命令如下，

python export_inference_graph.py --input_type image_tensor --pipeline_config_path samples/configs/faster_rcnn_inception_resnet_v2_atrous_voc2012.config --trained_checkpoint_prefix my_images/train_dir/model.ckpt-494 --output_directory my_images/export_dir/

執行成功後，export_dir資料夾下生成以下檔案，

10、使用自己訓練的模型對圖片進行目標檢測

這一步，只要修改上一講的程式碼就可以了。比較簡單，直接給程式碼好了。在object_detection目錄下新建檔案demo2.py，執行python demo2.py，程式碼如下，

#encoding:utf-8
import tensorflow as tf
import numpy as np

import os
from matplotlib import pyplot as plt
from PIL import Image
from object_detection.utils import label_map_util
from object_detection.utils import visualization_utils as vis_utils

#下載下來的模型的目錄
MODEL_DIR = 'my_images/export_dir/'
#下載下來的模型的檔案
MODEL_CHECK_FILE = os.path.join(MODEL_DIR, 'frozen_inference_graph.pb')
#資料集對於的label
MODEL_LABEL_MAP = os.path.join('data', 'pascal_label_map.pbtxt')
#資料集分類數量，可以開啟pascal_label_map.pbtxt檔案看看
MODEL_NUM_CLASSES = 20

#這裡是獲取例項圖片檔名，將其放到陣列中
PATH_TO_TEST_IMAGES_DIR = 'test_images'
TEST_IMAGES_PATHS = [os.path.join(PATH_TO_TEST_IMAGES_DIR, '06.jpg')]

#輸出影象大小，單位是in
IMAGE_SIZE = (12, 8)

tf.reset_default_graph()

#將模型讀取到預設的圖中
with tf.gfile.GFile(MODEL_CHECK_FILE, 'rb') as fd:
    _graph = tf.GraphDef()
    _graph.ParseFromString(fd.read())
    tf.import_graph_def(_graph, name='')

#載入pascal資料標籤
label_map = label_map_util.load_labelmap(MODEL_LABEL_MAP)
categories = label_map_util.convert_label_map_to_categories(label_map, max_num_classes=MODEL_NUM_CLASSES)
category_index = label_map_util.create_category_index(categories)

#將圖片轉化成numpy陣列形式
def load_image_into_numpy_array(image):
    (im_width, im_height) = image.size
    return np.array(image.getdata()).reshape((im_height, im_width, 3)).astype(np.uint8)

#在圖中開始計算
detection_graph = tf.get_default_graph()
with tf.Session(graph=detection_graph) as sess:
    for image_path in TEST_IMAGES_PATHS:
        print(image_path)
        #讀取圖片
        image = Image.open(image_path)
        #將圖片資料轉成陣列
        image_np = load_image_into_numpy_array(image)
        #增加一個維度
        image_np_expanded = np.expand_dims(image_np, axis=0)
        #下面都是獲取模型中的變數，直接使用就好了
        image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')
        #存放所有檢測框
        boxes = detection_graph.get_tensor_by_name('detection_boxes:0')
        #每個檢測結果的可信度
        scores = detection_graph.get_tensor_by_name('detection_scores:0')
        #每個框對應的類別
        classes = detection_graph.get_tensor_by_name('detection_classes:0')
        #檢測框的個數
        num_detections = detection_graph.get_tensor_by_name('num_detections:0')
        #開始計算
        (boxes, scores, classes, num_detections) = sess.run([boxes, scores, classes, num_detections],
                                                            feed_dict={image_tensor : image_np_expanded})
        #列印識別結果
        print(num_detections)
        print(boxes)
        print(classes)
        print(scores)

        #得到視覺化結果
        vis_utils.visualize_boxes_and_labels_on_image_array(
            image_np,
            np.squeeze(boxes),
            np.squeeze(classes).astype(np.int32),
            np.squeeze(scores),
            category_index,
            use_normalized_coordinates=True,
            line_thickness=8
        )
        #顯示
        plt.figure(figsize=IMAGE_SIZE)
        plt.imshow(image_np)
        plt.show()

11、執行結果

就這麼簡單。

等下個月底新一代的顯示卡出來了，再看看能不能淘個便宜點的顯示卡～～!

-------韋訪 180725

tensorflow入門教程(二十五)Object Detection API目標檢測(下)

1、概述

2、下載資料集

3、修改環境變數

4、VOC2012資料集結構簡介

5、將VOC2012資料集轉成tfrecord格式

6、下載模型

7、配置檔案

8、開始訓練

9、匯出模型

10、使用自己訓練的模型對圖片進行目標檢測

11、執行結果

tensorflow入門教程(二十五)Object Detection API目標檢測(下)

Tensorflow入門教程(二十九)人臉表情識別(上)人臉表情資料集-fer2013

tensorflow入門教程(二十八)人臉識別(下)

WPF入門教程系列十五——WPF中的數據綁定(一)

Spring Boot入門教程(四十五): 事務@Transactional

Spring Boot 初級入門教程（十五） —— 整合 MyBatis

Linux小小白入門教程（十五）：使用者和使用者組

Android入門教程二十九之ProgressBar(進度條)

ionic入門教程第十五課-ionic效能優化之圖片延時載入

C語言入門（二十五）檔案操作

安卓入門教程（十五）- Fragment，Service，WAMP下載

【tensorflow入門教程二】資料集製作：使用TFRecords製作資料集並使用inceptionv3進行訓練

Spring Boot2 系列教程(二十五)Spring Boot 整合 Jpa 多資料來源

脫離Tensoeflow Object Detection API使用檢測程式

WPF入門教程系列十——布局之Border與ViewBox（五）

谷歌開源的TensorFlow Object Detection API視頻物體識別系統實現教程

ElasticSearch最佳入門實踐（二十五）mget批量查詢api

JavaFX UI控制元件教程（二十五）之Color Picker

Linux真小白入門教程第十二集——使用者檔案及使用者組

手把手教你ExtJS從入門到放棄——篇二十五(示例22:Ext.dom.Element方法addKeyMap,addKeyListener,on,focus,oad,serializeForm）

tensorflow入門教程(二十五)Object Detection API目標檢測(下)

1、概述

2、下載資料集

3、修改環境變數

4、VOC2012資料集結構簡介

5、將VOC2012資料集轉成tfrecord格式

6、下載模型

7、配置檔案

8、開始訓練

9、匯出模型

10、使用自己訓練的模型對圖片進行目標檢測

11、執行結果

相關推薦