Object Recognition with Google's object_detection API (Summary Notes)

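Over the past couple of days I wanted to build a real-time object recognition program. I learned that Google's object_detection API can read from the camera and recognize the objects in the frame in real time, so I gathered the relevant material and worked through it.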

What you need:

  • Install Google's object_detection API
  • Install python3.5 (my MacBook has 3.6)
  • Install tensorflow
  • Install the opencv package

1. Install Python, TensorFlow, and the other dependencies

pip install tensorflow
pip install pillow
pip install lxml
pip install jupyter
pip install matplotlib
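
As a quick sanity check after the installs above, a short script can report which packages the interpreter can actually find. This is a minimal sketch rather than part of the original setup: the helper name `find_missing` is made up for illustration, and the list uses the usual import names (`PIL` for pillow, `cv2` for the opencv package). jupyter is left out because the detection script does not import it.

```python
import importlib.util

# Import names of the packages the detection script relies on.
# pillow is imported as PIL and opencv as cv2; find_missing is a
# hypothetical helper name used only in this sketch.
REQUIRED = ["tensorflow", "PIL", "lxml", "matplotlib", "cv2"]

def find_missing(modules):
    """Return the module names the current interpreter cannot locate."""
    return [name for name in modules if importlib.util.find_spec(name) is None]

missing = find_missing(REQUIRED)
print("missing packages:", missing if missing else "none")
```

Running it with the same interpreter that will run the detection script matters, since the terminal and an IDE may point at different environments.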

2. Install Protoc: go to the Protoc download page and download the precompiled zip package for your platform.

After unpacking, the bin directory contains a protoc binary. Copy it to the appropriate directory:

cp bin/protoc /usr/local/bin/protoc    

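Note: copy it to /usr/local/bin (writable), not /usr/bin (read-only); otherwise you get "Operation not permitted". I lost a lot of time on this step.

3. Download the object detection API source code from GitHub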

git clone https://github.com/tensorflow/models.git

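4. Compile Protobuf: enter the tensorflow/models directory and run the command below. (In checkouts where object_detection sits under models/research, as in the label path used later in this post, run it from models/research.)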

protoc object_detection/protos/*.proto --python_out=.

Note: you must run this command from inside the models directory.
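
In shells that do not expand the `*.proto` wildcard, the compile step above can be driven from Python instead. The following is a sketch under assumptions: the helper names `build_protoc_command` and `compile_protos` are invented here, and `models_dir` stands for whichever directory contains `object_detection` in your checkout.

```python
import glob
import os
import subprocess

def build_protoc_command(proto_files, out_dir="."):
    """Build the argument list equivalent to the shell command above."""
    return ["protoc"] + sorted(proto_files) + ["--python_out=" + out_dir]

def compile_protos(models_dir):
    """Run protoc on every .proto file, using models_dir as the working directory.

    models_dir is an assumption: the directory that contains object_detection.
    """
    pattern = os.path.join(models_dir, "object_detection", "protos", "*.proto")
    # protoc is given paths relative to the directory it runs in.
    rel_files = [os.path.relpath(p, models_dir) for p in glob.glob(pattern)]
    if not rel_files:
        raise FileNotFoundError("no .proto files found under " + models_dir)
    subprocess.run(build_protoc_command(rel_files), cwd=models_dir, check=True)
```

Calling `compile_protos(".")` from inside the models directory would then have the same effect as the shell command, provided `protoc` is on the PATH.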

5. From the same directory, add it and its slim subdirectory to the PYTHONPATH environment variable

export PYTHONPATH=$PYTHONPATH:`pwd`:`pwd`/slim
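
The export above only lasts for the current shell session. One alternative, sketched here with a hypothetical helper name `add_api_paths`, is to extend `sys.path` at the top of the script before importing anything from `object_detection`. The `models_dir` argument is an assumption: it should be the directory you ran the export from.

```python
import os
import sys

def add_api_paths(models_dir):
    """Add models_dir and models_dir/slim to sys.path for this process only.

    This mirrors the export above; add_api_paths is a name made up
    for this sketch, not part of the object detection API.
    """
    models_dir = os.path.abspath(models_dir)
    for path in (models_dir, os.path.join(models_dir, "slim")):
        if path not in sys.path:
            sys.path.append(path)
    return sys.path
```

Unlike the environment variable, this change is not inherited by other programs, so it has to be repeated in every script that uses the API.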

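6. Test whether the object detection API was installed correctly. If the test below runs to completion without errors, the installation succeeded.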

python object_detection/builders/model_builder_test.py

Next, the main program is as follows:

# coding: utf-8   
  
import numpy as np  
import os  
import six.moves.urllib as urllib  
import sys  
import tarfile  
import tensorflow as tf  
import zipfile  
  
from collections import defaultdict  
from io import StringIO  
from matplotlib import pyplot as plt  
from PIL import Image  
  
import cv2                  #add 20170825  
cap = cv2.VideoCapture(0)   #add 20170825  
  
# This is needed since the notebook is stored in the object_detection folder.    
sys.path.append("..")  

# ## Object detection imports  
# Here are the imports from the object detection module.   
  
from object_detection.utils import label_map_util
  
from object_detection.utils import visualization_utils as vis_util
  

# # Model preparation   
  
# What model to download.  
MODEL_NAME = 'ssd_mobilenet_v1_coco_11_06_2017'  
#MODEL_NAME = 'faster_rcnn_resnet101_coco_11_06_2017'
#MODEL_NAME = 'ssd_inception_v2_coco_11_06_2017'
MODEL_FILE = MODEL_NAME + '.tar.gz'  
DOWNLOAD_BASE = 'http://download.tensorflow.org/models/object_detection/'  
  
# Path to frozen detection graph. This is the actual model that is used for the object detection.  
PATH_TO_CKPT = MODEL_NAME + '/frozen_inference_graph.pb'  
  
# List of the strings that are used to add the correct label for each box.
PATH_TO_LABELS = os.path.join('models-master/research/object_detection/data', 'mscoco_label_map.pbtxt')
  
NUM_CLASSES = 90  
  
  
# ## Download Model  
  
# Download the archive (urlretrieve replaces the deprecated URLopener).
urllib.request.urlretrieve(DOWNLOAD_BASE + MODEL_FILE, MODEL_FILE)

# Extract only the frozen graph; the with-block closes the archive afterwards.
with tarfile.open(MODEL_FILE) as tar_file:
  for file in tar_file.getmembers():
    file_name = os.path.basename(file.name)
    if 'frozen_inference_graph.pb' in file_name:
      tar_file.extract(file, os.getcwd())
  
  
# ## Load a (frozen) Tensorflow model into memory.  
  
detection_graph = tf.Graph()  
with detection_graph.as_default():  
  od_graph_def = tf.GraphDef()  
  with tf.gfile.GFile(PATH_TO_CKPT, 'rb') as fid:  
    serialized_graph = fid.read()  
    od_graph_def.ParseFromString(serialized_graph)  
    tf.import_graph_def(od_graph_def, name='')  
  
  
# ## Loading label map  
  
label_map = label_map_util.load_labelmap(PATH_TO_LABELS)  
categories = label_map_util.convert_label_map_to_categories(label_map, max_num_classes=NUM_CLASSES, use_display_name=True)  
category_index = label_map_util.create_category_index(categories)  
  
  
# ## Helper code  
  
def load_image_into_numpy_array(image):  
  (im_width, im_height) = image.size  
  return np.array(image.getdata()).reshape(  
      (im_height, im_width, 3)).astype(np.uint8)  
  
  
# # Detection  
  
# For the sake of simplicity we will use only 2 images:  
# image1.jpg  
# image2.jpg  
# If you want to test the code with your images, just add path to the images to the TEST_IMAGE_PATHS.  
PATH_TO_TEST_IMAGES_DIR = 'test_images'  
TEST_IMAGE_PATHS = [ os.path.join(PATH_TO_TEST_IMAGES_DIR, 'image{}.jpg'.format(i)) for i in range(1, 3) ]  
  
# Size, in inches, of the output images.  
IMAGE_SIZE = (12, 8)  
  
  
with detection_graph.as_default():  
  with tf.Session(graph=detection_graph) as sess:  
    while True:    #for image_path in TEST_IMAGE_PATHS:    #changed 20170825
      ret, image_np = cap.read()
      if not ret:  # no frame was read (camera unavailable or stream ended)
        break

      # Expand dimensions since the model expects images to have shape: [1, None, None, 3]  
      image_np_expanded = np.expand_dims(image_np, axis=0)  
      image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')  
      # Each box represents a part of the image where a particular object was detected.  
      boxes = detection_graph.get_tensor_by_name('detection_boxes:0')  
      # Each score represents the level of confidence for the corresponding object.
      # Score is shown on the result image, together with the class label.  
      scores = detection_graph.get_tensor_by_name('detection_scores:0')  
      classes = detection_graph.get_tensor_by_name('detection_classes:0')  
      num_detections = detection_graph.get_tensor_by_name('num_detections:0')  
      # Actual detection.  
      (boxes, scores, classes, num_detections) = sess.run(  
          [boxes, scores, classes, num_detections],  
          feed_dict={image_tensor: image_np_expanded})  
      # Visualization of the results of a detection.  
      vis_util.visualize_boxes_and_labels_on_image_array(  
          image_np,  
          np.squeeze(boxes),  
          np.squeeze(classes).astype(np.int32),  
          np.squeeze(scores),  
          category_index,  
          use_normalized_coordinates=True,  
          line_thickness=8)  
      cv2.imshow('object detection', cv2.resize(image_np,(800,600)))  
      if cv2.waitKey(25) & 0xFF == ord('q'):
        break

# Release the camera and close the preview window once the loop exits.
cap.release()
cv2.destroyAllWindows()
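
The arrays returned by `sess.run` can also be used directly instead of only being drawn on the frame. The sketch below assumes the API's usual output layout, in which each box is `[ymin, xmin, ymax, xmax]` normalized to the range 0 to 1. The function name `filter_detections`, the 0.5 threshold, and the demo values are illustrative assumptions, not part of the original program.

```python
import numpy as np

def filter_detections(boxes, scores, classes, image_shape, min_score=0.5):
    """Keep detections scoring at least min_score and return pixel boxes.

    boxes: rows of [ymin, xmin, ymax, xmax], normalized to [0, 1]
    image_shape: (height, width, ...) of the frame
    """
    boxes = np.asarray(boxes)
    scores = np.asarray(scores)
    classes = np.asarray(classes)
    height, width = image_shape[:2]
    keep = scores >= min_score
    # Scale y values by the height and x values by the width.
    scale = np.array([height, width, height, width])
    pixel_boxes = np.round(boxes[keep] * scale).astype(int)
    return pixel_boxes, scores[keep], classes[keep].astype(int)

# Made-up values standing in for np.squeeze(boxes), np.squeeze(scores)
# and np.squeeze(classes) from the detection loop.
demo_boxes = np.array([[0.1, 0.2, 0.5, 0.6], [0.0, 0.0, 1.0, 1.0]])
demo_scores = np.array([0.9, 0.3])
demo_classes = np.array([1.0, 18.0])
print(filter_detections(demo_boxes, demo_scores, demo_classes, (600, 800, 3)))
```

Inside the loop, the call would take `image_np.shape` as `image_shape`, and the returned class ids can be looked up in `category_index` to get the label names.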

Along the way I ran into three significant problems:

1. PATH_TO_LABELS = os.path.join('models-master/research/object_detection/data', 'mscoco_label_map.pbtxt')

This path must point to the data directory inside the downloaded models package, which holds the label map for the 90 classes.

2. TypeError: __new__() got an unexpected keyword argument 'serialized_options'

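The cause is that the protobuf version used in the terminal differs from the one used in PyCharm. Make the two versions match and the error goes away.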

The command to check the version is: protoc --version
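
To compare the two versions from inside the same interpreter that PyCharm uses, something like the following can help. It is a sketch under assumptions: `parse_version` and `installed_versions` are names invented here, `protoc` is assumed to be on the PATH, and the direct comparison is only meaningful for release lines where the compiler and the Python package share a version number, such as the 3.x releases.

```python
import re
import subprocess

def parse_version(text):
    """Extract a (major, minor, patch) tuple from text such as 'libprotoc 3.6.1'."""
    match = re.search(r"(\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        raise ValueError("no version number in %r" % text)
    return tuple(int(part or 0) for part in match.groups())

def installed_versions():
    """Return (protoc version, Python protobuf version) for this environment.

    Assumes protoc is on the PATH and the protobuf package is installed.
    """
    import google.protobuf  # the Python runtime package
    out = subprocess.run(["protoc", "--version"], stdout=subprocess.PIPE,
                         universal_newlines=True, check=True).stdout
    return parse_version(out), parse_version(google.protobuf.__version__)
```

If the two tuples returned by `installed_versions()` differ, upgrading the Python package with pip or regenerating the `_pb2` files with a matching protoc are the usual ways to bring them back in line.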

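3. "from utils import label_map_util" raises ImportError: cannot import name 'label_map_util'

This happens because object_detection is not on the Python path, so its modules cannot be imported.

Fix: before running the program, go to the directory that contains object_detection (the models directory from step 4) and add it to the environment variable: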

export PYTHONPATH=$PYTHONPATH:`pwd`:`pwd`/slim

Then change these lines in the source program:

from utils import label_map_util

from utils import visualization_utils as vis_util

to:

from object_detection.utils import label_map_util

from object_detection.utils import visualization_utils as vis_util

After that, the imports work.