
TensorFlow Object Detection API in Practice

1. Install the object_detection API

```
# From tensorflow/models/research/
# Verify the installation by running the API's unit test:
python object_detection/builders/model_builder_test.py
```
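If the test fails with import errors, the usual cause is that the protos were not compiled or `models/research` is missing from `PYTHONPATH`. As an extra sanity check (my own sketch, not part of the official steps), the following imports should succeed once the API is installed:

```
# Sanity check: these imports only work when the Object Detection API is
# on the PYTHONPATH and its protobuf definitions have been compiled.
import tensorflow as tf
from object_detection.protos import pipeline_pb2
from object_detection.utils import label_map_util

print('TensorFlow version:', tf.__version__)  # this walkthrough targets TF 1.x
```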

2. Download and convert the data

```
# From tensorflow/models/research/
wget http://www.robots.ox.ac.uk/~vgg/data/pets/data/images.tar.gz
wget http://www.robots.ox.ac.uk/~vgg/data/pets/data/annotations.tar.gz
tar -xvf annotations.tar.gz
tar -xvf images.tar.gz
python object_detection/dataset_tools/create_pet_tf_record.py \
    --label_map_path=object_detection/data/pet_label_map.pbtxt \
    --data_dir=`pwd` \
    --output_dir=`pwd`
```
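To confirm the conversion produced valid shards, you can peek at one record. This optional sketch assumes TF 1.x and the feature keys that create_pet_tf_record.py writes:

```
# Inspect the first example in one of the ten training shards.
import tensorflow as tf

record_path = 'pet_faces_train.record-00000-of-00010'
for serialized in tf.python_io.tf_record_iterator(record_path):
    example = tf.train.Example.FromString(serialized)
    feature = example.features.feature
    print('filename:', feature['image/filename'].bytes_list.value[0])
    print('classes: ', feature['image/object/class/text'].bytes_list.value)
    break  # one record is enough for a sanity check
```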

3. Configure the pipeline file

```
# Faster R-CNN with Resnet-101 (v1) configured for the Oxford-IIIT Pet Dataset.
# Users should configure the fine_tune_checkpoint field in the train config as
# well as the label_map_path and input_path fields in the train_input_reader and
# eval_input_reader. Search for "PATH_TO_BE_CONFIGURED" to find the fields that
# should be configured.

model {
  faster_rcnn {
    num_classes: 37
    image_resizer {
      keep_aspect_ratio_resizer {
        min_dimension: 600
        max_dimension: 1024
      }
    }
    feature_extractor {
      type: 'faster_rcnn_resnet101'
      first_stage_features_stride: 16
    }
    first_stage_anchor_generator {
      grid_anchor_generator {
        scales: [0.25, 0.5, 1.0, 2.0]
        aspect_ratios: [0.5, 1.0, 2.0]
        height_stride: 16
        width_stride: 16
      }
    }
    first_stage_box_predictor_conv_hyperparams {
      op: CONV
      regularizer {
        l2_regularizer {
          weight: 0.0
        }
      }
      initializer {
        truncated_normal_initializer {
          stddev: 0.01
        }
      }
    }
    first_stage_nms_score_threshold: 0.0
    first_stage_nms_iou_threshold: 0.7
    first_stage_max_proposals: 300
    first_stage_localization_loss_weight: 2.0
    first_stage_objectness_loss_weight: 1.0
    initial_crop_size: 14
    maxpool_kernel_size: 2
    maxpool_stride: 2
    second_stage_box_predictor {
      mask_rcnn_box_predictor {
        use_dropout: false
        dropout_keep_probability: 1.0
        fc_hyperparams {
          op: FC
          regularizer {
            l2_regularizer {
              weight: 0.0
            }
          }
          initializer {
            variance_scaling_initializer {
              factor: 1.0
              uniform: true
              mode: FAN_AVG
            }
          }
        }
      }
    }
    second_stage_post_processing {
      batch_non_max_suppression {
        score_threshold: 0.0
        iou_threshold: 0.6
        max_detections_per_class: 100
        max_total_detections: 300
      }
      score_converter: SOFTMAX
    }
    second_stage_localization_loss_weight: 2.0
    second_stage_classification_loss_weight: 1.0
  }
}

train_config: {
  batch_size: 1
  optimizer {
    momentum_optimizer: {
      learning_rate: {
        manual_step_learning_rate {
          initial_learning_rate: 0.0003
          schedule {
            step: 900000
            learning_rate: .00003
          }
          schedule {
            step: 1200000
            learning_rate: .000003
          }
        }
      }
      momentum_optimizer_value: 0.9
    }
    use_moving_average: false
  }
  gradient_clipping_by_norm: 10.0
  fine_tune_checkpoint: "/Users/terry/models/research/0terry_pet/model/faster_rcnn_resnet101_coco_11_06_2017/model.ckpt"
  from_detection_checkpoint: true
  load_all_detection_checkpoint_vars: true
  # Note: The below line limits the training process to 200K steps, which we
  # empirically found to be sufficient enough to train the pets dataset. This
  # effectively bypasses the learning rate schedule (the learning rate will
  # never decay). Remove the below line to train indefinitely.
  num_steps: 200000
  data_augmentation_options {
    random_horizontal_flip {
    }
  }
}

train_input_reader: {
  tf_record_input_reader {
    input_path: "/Users/terry/models/research/0terry_pet/data/train/pet_faces_train.record-?????-of-00010"
  }
  label_map_path: "/Users/terry/models/research/0terry_pet/data/pet_label_map.pbtxt"
}

eval_config: {
  metrics_set: "coco_detection_metrics"
  num_examples: 1101
}

eval_input_reader: {
  tf_record_input_reader {
    input_path: "/Users/terry/models/research/0terry_pet/data/eval/pet_faces_val.record-?????-of-00010"
  }
  label_map_path: "/Users/terry/models/research/0terry_pet/data/pet_label_map.pbtxt"
  shuffle: false
  num_readers: 1
}
```
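Before training, it is worth checking that the edited config still parses. A minimal sketch using the API's config_util helper, with the paths from this walkthrough:

```
# Parse the pipeline config and echo the fields customized above.
from object_detection.utils import config_util

configs = config_util.get_configs_from_pipeline_file(
    '0terry_pet/model/faster_rcnn_resnet101_pets.config')

print('num_classes:   ', configs['model'].faster_rcnn.num_classes)  # expect 37
print('fine-tune ckpt:', configs['train_config'].fine_tune_checkpoint)
print('train records: ', configs['train_input_config'].tf_record_input_reader.input_path)
```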

4. Move the files into a new folder 0terry_pet

```
+ data
    - pet_label_map.pbtxt
    - train
        pet_faces_train.record-00009-of-00010  [10 TFRecord files]
    - eval
        pet_faces_val.record-00009-of-00010  [10 TFRecord files]
    - images
    - annotations
+ model
    - faster_rcnn_resnet101_pets.config
    - faster_rcnn_resnet101_coco_11_06_2017
    + train
    + eval
```
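Moving the shards by hand works fine; for repeatability, a hypothetical helper along these lines builds the same layout (run from models/research, assuming the TFRecords were written there):

```
# Create the 0terry_pet layout and move the generated files into it.
import glob
import os
import shutil

base = '0terry_pet'
for sub in ('data/train', 'data/eval', 'model/train', 'model/eval'):
    os.makedirs(os.path.join(base, sub), exist_ok=True)

shutil.copy('object_detection/data/pet_label_map.pbtxt',
            os.path.join(base, 'data', 'pet_label_map.pbtxt'))
for shard in glob.glob('pet_faces_train.record-*'):
    shutil.move(shard, os.path.join(base, 'data/train', os.path.basename(shard)))
for shard in glob.glob('pet_faces_val.record-*'):
    shutil.move(shard, os.path.join(base, 'data/eval', os.path.basename(shard)))
```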

5. Train the model

```
# From the /Users/terry/models/research directory
PIPELINE_CONFIG_PATH=0terry_pet/model/faster_rcnn_resnet101_pets.config
MODEL_DIR=0terry_pet/model/faster_rcnn_resnet101_coco_11_06_2017
NUM_TRAIN_STEPS=500
SAMPLE_1_OF_N_EVAL_EXAMPLES=1
python object_detection/model_main.py \
    --pipeline_config_path=${PIPELINE_CONFIG_PATH} \
    --model_dir=${MODEL_DIR} \
    --num_train_steps=${NUM_TRAIN_STEPS} \
    --sample_1_of_n_eval_examples=$SAMPLE_1_OF_N_EVAL_EXAMPLES \
    --alsologtostderr
```

6. Start up TensorBoard

```
# From the /Users/terry/models/research directory
MODEL_DIR=0terry_pet/model/faster_rcnn_resnet101_coco_11_06_2017
tensorboard --logdir=${MODEL_DIR}
```

Stop training once you observe that the loss is no longer decreasing noticeably.
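If you prefer to check the loss programmatically instead of eyeballing TensorBoard, the event files under MODEL_DIR can be read directly. A sketch assuming TF 1.x; the exact summary tag names can vary between pipeline versions:

```
# Print every scalar summary whose tag mentions "loss".
import glob
import tensorflow as tf

event_files = sorted(glob.glob(
    '0terry_pet/model/faster_rcnn_resnet101_coco_11_06_2017/events.out.tfevents.*'))
for event in tf.train.summary_iterator(event_files[-1]):
    for value in event.summary.value:
        if 'loss' in value.tag.lower() and value.HasField('simple_value'):
            print('step {:>7}: {} = {:.4f}'.format(
                event.step, value.tag, value.simple_value))
```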

7. Create the .pb file

Copy the checkpoint files with the largest step number from the train folder into a checkpoint_best folder, strip the "-<step number>" suffix after "ckpt" from the filenames, and update the paths inside the checkpoint file to match:

```
checkpoint
model.ckpt.data-00000-of-00001
model.ckpt.index
model.ckpt.meta
```

Then run the following from the project root directory:

```
# From the /Users/terry/models/research directory
python object_detection/export_inference_graph.py \
    --pipeline_config_path 0terry_pet/model/faster_rcnn_resnet101_pets.config \
    --trained_checkpoint_prefix 0terry_pet/model/checkpoint_best/model.ckpt \
    --output_directory 0terry_pet/model/pb
```
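Instead of reading the step number off the filenames, tf.train.latest_checkpoint can find the newest checkpoint for you. The checkpoint_best folder and the renaming convention below are this walkthrough's own, not something the exporter requires:

```
# Copy the newest checkpoint into checkpoint_best under the plain
# "model.ckpt" prefix used by the export command above.
import glob
import os
import shutil
import tensorflow as tf

train_dir = '0terry_pet/model/faster_rcnn_resnet101_coco_11_06_2017'
best_dir = '0terry_pet/model/checkpoint_best'
os.makedirs(best_dir, exist_ok=True)

prefix = tf.train.latest_checkpoint(train_dir)  # e.g. ".../model.ckpt-200000"
for path in glob.glob(prefix + '.*'):
    suffix = path[len(prefix):]                 # ".index", ".meta", ".data-..."
    shutil.copy(path, os.path.join(best_dir, 'model.ckpt' + suffix))
```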

8. Run inference on new images

```
import numpy as np
import os
import six.moves.urllib as urllib
import sys
import tarfile
import tensorflow as tf
import zipfile

from distutils.version import StrictVersion
from collections import defaultdict
from io import StringIO
from matplotlib import pyplot as plt
from PIL import Image

# This is needed since the notebook is stored in the object_detection folder.
sys.path.append("..")
from object_detection.utils import ops as utils_ops

if StrictVersion(tf.__version__) < StrictVersion('1.9.0'):
  raise ImportError('Please upgrade your TensorFlow installation to v1.9.* or later!')

# This is needed to display the images.
%matplotlib inline

from object_detection.utils import label_map_util
from object_detection.utils import visualization_utils as vis_util

# Path to frozen detection graph. This is the actual model that is used for the object detection.
PATH_TO_FROZEN_GRAPH = '/Users/terry/models/research/0terry_pet/model/pb/frozen_inference_graph.pb'

# List of the strings that is used to add the correct label for each box.
PATH_TO_LABELS = os.path.join('/Users/terry/models/research/0terry_pet/data', 'pet_label_map.pbtxt')

detection_graph = tf.Graph()
with detection_graph.as_default():
  od_graph_def = tf.GraphDef()
  with tf.gfile.GFile(PATH_TO_FROZEN_GRAPH, 'rb') as fid:
    serialized_graph = fid.read()
    od_graph_def.ParseFromString(serialized_graph)
    tf.import_graph_def(od_graph_def, name='')

# Loading label map
category_index = label_map_util.create_category_index_from_labelmap(PATH_TO_LABELS, use_display_name=True)

def load_image_into_numpy_array(image):
  (im_width, im_height) = image.size
  return np.array(image.getdata()).reshape(
      (im_height, im_width, 3)).astype(np.uint8)

# If you want to test the code with your own images, just add their paths to TEST_IMAGE_PATHS.
PATH_TO_TEST_IMAGES_DIR = '/Users/terry/models/research/0terry_pet/data/test_images'
TEST_IMAGE_PATHS = [ os.path.join(PATH_TO_TEST_IMAGES_DIR, 'image{}.jpg'.format(i)) for i in range(1, 3) ]

# Size, in inches, of the output images.
IMAGE_SIZE = (12, 8)

def run_inference_for_single_image(image, graph):
  with graph.as_default():
    with tf.Session() as sess:
      # Get handles to input and output tensors
      ops = tf.get_default_graph().get_operations()
      all_tensor_names = {output.name for op in ops for output in op.outputs}
      tensor_dict = {}
      for key in [
          'num_detections', 'detection_boxes', 'detection_scores',
          'detection_classes', 'detection_masks'
      ]:
        tensor_name = key + ':0'
        if tensor_name in all_tensor_names:
          tensor_dict[key] = tf.get_default_graph().get_tensor_by_name(tensor_name)
      if 'detection_masks' in tensor_dict:
        # The following processing is only for a single image.
        detection_boxes = tf.squeeze(tensor_dict['detection_boxes'], [0])
        detection_masks = tf.squeeze(tensor_dict['detection_masks'], [0])
        # Reframing is required to translate the masks from box coordinates to
        # image coordinates and fit the image size.
        real_num_detection = tf.cast(tensor_dict['num_detections'][0], tf.int32)
        detection_boxes = tf.slice(detection_boxes, [0, 0], [real_num_detection, -1])
        detection_masks = tf.slice(detection_masks, [0, 0, 0], [real_num_detection, -1, -1])
        detection_masks_reframed = utils_ops.reframe_box_masks_to_image_masks(
            detection_masks, detection_boxes, image.shape[0], image.shape[1])
        detection_masks_reframed = tf.cast(
            tf.greater(detection_masks_reframed, 0.5), tf.uint8)
        # Follow the convention by adding back the batch dimension.
        tensor_dict['detection_masks'] = tf.expand_dims(
            detection_masks_reframed, 0)
      image_tensor = tf.get_default_graph().get_tensor_by_name('image_tensor:0')

      # Run inference.
      output_dict = sess.run(tensor_dict,
                             feed_dict={image_tensor: np.expand_dims(image, 0)})

      # All outputs are float32 numpy arrays, so convert types as appropriate.
      output_dict['num_detections'] = int(output_dict['num_detections'][0])
      output_dict['detection_classes'] = output_dict[
          'detection_classes'][0].astype(np.uint8)
      output_dict['detection_boxes'] = output_dict['detection_boxes'][0]
      output_dict['detection_scores'] = output_dict['detection_scores'][0]
      if 'detection_masks' in output_dict:
        output_dict['detection_masks'] = output_dict['detection_masks'][0]
  return output_dict

for image_path in TEST_IMAGE_PATHS:
  image = Image.open(image_path)
  # The array-based representation of the image will be used later to prepare
  # the result image with boxes and labels on it.
  image_np = load_image_into_numpy_array(image)
  # Expand dimensions since the model expects images to have shape: [1, None, None, 3]
  image_np_expanded = np.expand_dims(image_np, axis=0)
  # Actual detection.
  output_dict = run_inference_for_single_image(image_np, detection_graph)
  # Visualization of the results of a detection.
  vis_util.visualize_boxes_and_labels_on_image_array(
      image_np,
      output_dict['detection_boxes'],
      output_dict['detection_classes'],
      output_dict['detection_scores'],
      category_index,
      instance_masks=output_dict.get('detection_masks'),
      use_normalized_coordinates=True,
      line_thickness=8)
  plt.figure(figsize=IMAGE_SIZE)
  plt.imshow(image_np)
```
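The visualization draws boxes inline; the raw predictions are already in output_dict if you want them as text. A small follow-on sketch (the 0.5 threshold is arbitrary):

```
# Print class names and confidences for detections above a threshold.
MIN_SCORE = 0.5
for cls, score in zip(output_dict['detection_classes'],
                      output_dict['detection_scores']):
    if score >= MIN_SCORE:
        print('{}: {:.1%}'.format(category_index[int(cls)]['name'], score))
```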