
TensorFlow object_detection API Notes

TF object_detection API

This API is the official TensorFlow project template for object detection. I had tried it before without getting it to run; this time I went through it more thoroughly and am now basically familiar with the training, testing, and evaluation workflow. I experimented with training on VOC2007, the Pet dataset, and so on. Below are some notes from that process.

General workflow for training on a dataset with the API

Adjust the corresponding paths and configuration files below as appropriate.

1. Create the tfrecord

python dataset_tools/create_pet_tf_record.py \
    --data_dir=/media/han/E/mWork/datasets/Oxford-IIIT_Pet_Dataset \
    --output_dir=trainLogs_pets/tfrecord
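
The script also reads a label map (the --label_map_path flag, which defaults to data/pet_label_map.pbtxt) to turn class names into integer ids. For reference, a label map is a plain-text protobuf with one item per class; an abridged sketch:

item {
  id: 1
  name: 'Abyssinian'
}
item {
  id: 2
  name: 'american_bulldog'
}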

2. Train

If the script fails to run (typically with module import errors), first execute:

# From tensorflow/models/research/
export PYTHONPATH=$PYTHONPATH:`pwd`:`pwd`/slim
python train.py \
        --logtostderr \
        --train_dir=trainLogs_pets/output \
        --pipeline_config_path=trainLogs_pets/ssd_mobilenet_v1_pets.config
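
Training progress (total loss, learning rate, etc.) can be watched in TensorBoard by pointing it at the training directory used above:

tensorboard --logdir=trainLogs_pets/output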

3. Merge the trained checkpoint files into a *.pb file

python export_inference_graph.py --input_type image_tensor \
    --pipeline_config_path trainLogs_pets/ssd_mobilenet_v1_pets.config \
    --trained_checkpoint_prefix trainLogs_pets/output/model.ckpt-100000 \
    --output_directory trainLogs_pets/output
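
The export step writes frozen_inference_graph.pb into --output_directory. Below is a minimal TF 1.x sketch of running detection with it; test.jpg is a placeholder image path, and the tensor names are the standard ones the exporter assigns for input_type image_tensor:

import numpy as np
import tensorflow as tf
from PIL import Image

# Load the frozen graph exported above.
graph = tf.Graph()
with graph.as_default():
    graph_def = tf.GraphDef()
    with tf.gfile.GFile('trainLogs_pets/output/frozen_inference_graph.pb', 'rb') as f:
        graph_def.ParseFromString(f.read())
    tf.import_graph_def(graph_def, name='')

with tf.Session(graph=graph) as sess:
    image = np.expand_dims(np.array(Image.open('test.jpg')), axis=0)  # 1 x H x W x 3, uint8
    boxes, scores, classes, num = sess.run(
        ['detection_boxes:0', 'detection_scores:0',
         'detection_classes:0', 'num_detections:0'],
        feed_dict={'image_tensor:0': image})
    print(boxes.shape, scores[0][:5])  # detections are sorted by score, best first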

4. Evaluate

python eval.py \
        --logtostderr \
        --checkpoint_dir=trainLogs_pets/output \
        --eval_dir=trainLogs_pets/eval \
        --pipeline_config_path=trainLogs_pets/ssd_mobilenet_v1_pets.config

create_pascal_tf_record.py

ignore_difficult_instances # ignoring difficult instances simply means they are excluded from training
  • The script encodes the image files themselves into the .record file, which is why the output is fairly large (see the inspection sketch after this list):
  with tf.gfile.GFile(full_path, 'rb') as fid:
    encoded_jpg = fid.read()   
  example = tf.train.Example(features=tf.train.Features(feature={
    'image/height': dataset_util.int64_feature(height),
    'image/width': dataset_util.int64_feature(width),
    'image/filename': dataset_util.bytes_feature(
    data['filename'].encode('utf8')),
    'image/source_id': dataset_util.bytes_feature(
    data['filename'].encode('utf8')),
    'image/key/sha256': dataset_util.bytes_feature(key.encode('utf8')),
    'image/encoded': dataset_util.bytes_feature(encoded_jpg), # raw encoded image bytes
    'image/format': dataset_util.bytes_feature('jpeg'.encode('utf8')), # JPEG format, i.e. the image is stored at its compressed size
    'image/object/bbox/xmin': dataset_util.float_list_feature(xmin),
    'image/object/bbox/xmax': dataset_util.float_list_feature(xmax),
    'image/object/bbox/ymin': dataset_util.float_list_feature(ymin),
    'image/object/bbox/ymax': dataset_util.float_list_feature(ymax),
    'image/object/class/text': dataset_util.bytes_list_feature(classes_text),
    'image/object/class/label': dataset_util.int64_list_feature(classes),
    'image/object/difficult': dataset_util.int64_list_feature(difficult_obj),
    'image/object/truncated': dataset_util.int64_list_feature(truncated),
    'image/object/view': dataset_util.bytes_list_feature(poses),
    }))
  • Each per-class training list (e.g. aeroplane_train.txt) contains the filenames of all training samples, so even though the file read below is the aeroplane one, all training samples still get converted to .record.
    VOC train contains only 2501 images, so this can also be changed to trainval to train on more data:
    examples_path = os.path.join(data_dir, year, 'ImageSets', 'Main',
                                 'aeroplane_' + FLAGS.set + '.txt')
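
To sanity-check what actually landed in a .record file, here is a minimal TF 1.x sketch that decodes the first example of one shard (the shard name below is hypothetical) and prints a few of the features listed above:

import tensorflow as tf

record_path = 'trainLogs_pets/tfrecord/pet_faces_train.record-00000-of-00010'  # hypothetical shard name
for record in tf.python_io.tf_record_iterator(record_path):
    example = tf.train.Example.FromString(record)
    feature = example.features.feature
    print('filename:', feature['image/filename'].bytes_list.value[0])
    print('size: %dx%d' % (feature['image/width'].int64_list.value[0],
                           feature['image/height'].int64_list.value[0]))
    print('encoded jpeg bytes:', len(feature['image/encoded'].bytes_list.value[0]))
    break  # inspect only the first example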
from functools import partial # functools.partial creates a partial function: it wraps a callable with some arguments pre-filled

def add(x, y):
    return x + y

add_y = partial(add, 3)  # add_y is a new function taking only y, with x=3 already bound
ret = add_y(4)  # ret = 7

An example in train.py

# Use functools.partial from Python's functools module to fix the arguments of model_builder.build() and produce a new function, model_fn
  model_fn = functools.partial(
      model_builder.build,
      model_config=model_config,
      is_training=True)
  • tf.logging.set_verbosity(tf.logging.INFO) sets the logging verbosity.
    Without this line, no log messages appear on the console.
    TensorFlow uses five log levels; in ascending order of severity they are DEBUG, INFO, WARN, ERROR, and FATAL. When you configure logging at one of these levels, TensorFlow outputs every message at that level and at all more severe levels. For example, with the level set to ERROR you get ERROR and FATAL messages; with DEBUG you get messages from all five levels. By default TensorFlow logs at WARN, but when tracking model training it is worth raising the level to INFO to get extra feedback about the operations in progress. Reference: https://blog.csdn.net/caokaifa/article/details/80385501?utm_source=copy
    To emit a message, call tf.logging.info('Image size: %dx%d' % (width, height)), as in the sketch below.
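
Putting the two calls together, a minimal sketch:

import tensorflow as tf

tf.logging.set_verbosity(tf.logging.INFO)  # show INFO and every more severe level
width, height = 300, 300
tf.logging.info('Image size: %dx%d' % (width, height))  # now visible on the console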

  • Distributed setup

  # The next five lines are only used for distributed training; on a single PC they have no effect
  env = json.loads(os.environ.get('TF_CONFIG', '{}'))
  cluster_data = env.get('cluster', None) # check the Python environment for a cluster definition
  cluster = tf.train.ClusterSpec(cluster_data) if cluster_data else None
  task_data = env.get('task', None) or {'type': 'master', 'index': 0} # task type 'master' means the task runs on the master
  task_info = type('TaskSpec', (object,), task_data)
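
For reference, TF_CONFIG is a JSON string describing the cluster and this process's role in it. A hypothetical setup with hypothetical host names (the exact job names expected by this legacy trainer are an assumption here) would be exported like:

# hypothetical hosts; on a single PC, TF_CONFIG is simply left unset
export TF_CONFIG='{
  "cluster": {
    "master": ["master0.example.com:2222"],
    "ps": ["ps0.example.com:2222"],
    "worker": ["worker0.example.com:2222", "worker1.example.com:2222"]
  },
  "task": {"type": "worker", "index": 0}
}'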
  • trainer.train()
  trainer.train(
      create_input_dict_fn, # function that builds the tensor input dict; functools.partial wrapped around get_next()
      model_fn, # function that builds the detection model and computes the loss; functools.partial wrapped around model_builder.build()
      train_config, # training configuration, a train_pb2.TrainConfig protobuf
      master,
      task,
      FLAGS.num_clones,
      worker_replicas,
      FLAGS.clone_on_cpu,
      ps_tasks,
      worker_job_name,
      is_chief,
      FLAGS.train_dir,
      graph_hook_fn=graph_rewriter_fn)
   
    '''
      Args:
    create_tensor_dict_fn: a function to create a tensor input dictionary.
    create_model_fn: a function that creates a DetectionModel and generates losses.
    train_config: a train_pb2.TrainConfig protobuf.
    master: BNS name of the TensorFlow master to use.
    task: The task id of this training instance.
    num_clones: The number of clones to run per machine.
    worker_replicas: The number of work replicas to train with.
    clone_on_cpu: True if clones should be forced to run on CPU.
    ps_tasks: Number of parameter server tasks.
    worker_job_name: Name of the worker job.
    is_chief: Whether this replica is the chief replica.
    train_dir: Directory to write checkpoints and training summaries to.
    graph_hook_fn: Optional function that is called after the inference graph is
      built (before optimization). This is helpful to perform additional changes
      to the training graph such as adding FakeQuant ops. The function should
      modify the default graph.
    '''
  • create_model_fn
# create_model_fn is a functools.partial-wrapped function; appending '()' calls it, and since no further arguments are needed the parentheses are empty
detection_model = create_model_fn()

pipeline config

# SSD with Mobilenet v1, configured for Oxford-IIIT Pets Dataset.
# Users should configure the fine_tune_checkpoint field in the train config as
# well as the label_map_path and input_path fields in the train_input_reader and
# eval_input_reader. Search for "PATH_TO_BE_CONFIGURED" to find the fields that
# should be configured.

model { # model configuration
  ssd { #ssd-begin
    num_classes: 37 # number of classes; be sure to change this for your own dataset
    box_coder {
      faster_rcnn_box_coder {
        y_scale: 10.0
        x_scale: 10.0
        height_scale: 5.0
        width_scale: 5.0
      }
    }
    matcher { # matching criterion: when a predicted box counts as matched; the default threshold is 0.5
      argmax_matcher {
        matched_threshold: 0.5
        unmatched_threshold: 0.5
        ignore_thresholds: false
        negatives_lower_than_unmatched: true # anchors below unmatched_threshold become negatives
        force_match_for_each_row: true
      }
    }
    similarity_calculator {
      iou_similarity { # measure similarity by IoU
      }
    }
    anchor_generator {
      ssd_anchor_generator {
        num_layers: 6 # number of feature-map layers to generate anchors from (6 for SSD300)
        min_scale: 0.2
        max_scale: 0.95
        aspect_ratios: 1.0
        aspect_ratios: 2.0
        aspect_ratios: 0.5
        aspect_ratios: 3.0
        aspect_ratios: 0.3333
      }
    }
    image_resizer {
      fixed_shape_resizer {
        height: 300
        width: 300
      }
    }
    box_predictor {
      convolutional_box_predictor {
        min_depth: 0
        max_depth: 0
        num_layers_before_predictor: 0
        use_dropout: false
        dropout_keep_probability: 0.8
        kernel_size: 1
        box_code_size: 4
        apply_sigmoid_to_scores: false
        conv_hyperparams {
          activation: RELU_6,
          regularizer {
            l2_regularizer {
              weight: 0.00004
            }
          }
          initializer { # weight initialization
            truncated_normal_initializer { # initialize from a truncated normal distribution
              stddev: 0.03
              mean: 0.0
            }
          }
          batch_norm {
            train: true,
            scale: true,
            center: true,
            decay: 0.9997,
            epsilon: 0.001,
          }
        }
      }
    }
    feature_extractor {
      type: 'ssd_mobilenet_v1'
      min_depth: 16
      depth_multiplier: 1.0
      conv_hyperparams {
        activation: RELU_6,
        regularizer {
          l2_regularizer {
            weight: 0.00004
          }
        }
        initializer {
          truncated_normal_initializer {
            stddev: 0.03
            mean: 0.0
          }
        }
        batch_norm {
          train: true,
          scale: true,
          center: true,
          decay: 0.9997,
          epsilon: 0.001,
        }
      }
    }
    loss {
      classification_loss {
        weighted_sigmoid {
        }
      }
      localization_loss {
        weighted_smooth_l1 {
        }
      }
      hard_example_miner { # hard example mining
        num_hard_examples: 3000
        iou_threshold: 0.99
        loss_type: CLASSIFICATION
        max_negatives_per_positive: 3
        min_negatives_per_image: 0
      }
      classification_weight: 1.0
      localization_weight: 1.0
    }
    normalize_loss_by_num_matches: true
    post_processing {
      batch_non_max_suppression {
        score_threshold: 1e-8
        iou_threshold: 0.6
        max_detections_per_class: 100
        max_total_detections: 100
      }
      score_converter: SIGMOID
    }
  } # ssd-end
}

train_config: {
  batch_size: 24
  optimizer {
    rms_prop_optimizer: {
      learning_rate: {
        exponential_decay_learning_rate { # exponentially decaying learning rate
          initial_learning_rate: 0.004
          decay_steps: 800720
          decay_factor: 0.95
        }
      }
      momentum_optimizer_value: 0.9
      decay: 0.9
      epsilon: 1.0
    }
  }
  # Checkpoint to fine-tune from; if empty, training starts from scratch. E.g.: /media/han/E/mWork/mCode/models/research/object_detection/ssd_mobilenet_v1_coco_2017_11_17/model.ckpt
  fine_tune_checkpoint: ""
  from_detection_checkpoint: true # the fine-tune checkpoint comes from a detection model (rather than a classification one)
  load_all_detection_checkpoint_vars: true
  # Note: The below line limits the training process to 100K steps, which we
  # empirically found to be sufficient enough to train the pets dataset. This
  # effectively bypasses the learning rate schedule (the learning rate will
  # never decay). Remove the below line to train indefinitely.
  num_steps: 100000  # maximum number of training steps
  data_augmentation_options { # data augmentation options
    random_horizontal_flip {
    }
  }
  data_augmentation_options {
    ssd_random_crop {
    }
  }
}
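
For reference, exponential_decay_learning_rate corresponds to TensorFlow's tf.train.exponential_decay schedule: lr = initial_learning_rate * decay_factor ** (global_step / decay_steps). (The OD API version may apply the decay in a staircase fashion, in which case the rate stays exactly 0.004 until step 800720; either way, with the values above there is effectively no decay within 100k steps, as the note says.) A quick sketch:

def decayed_lr(step, initial=0.004, decay_steps=800720, factor=0.95):
    # with decay_steps far beyond num_steps, the rate barely moves
    return initial * factor ** (step / decay_steps)

print(decayed_lr(100000))  # ~0.00397, i.e. almost no decay after 100k steps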

train_input_reader: {
  tf_record_input_reader {
    # if the record file is sharded, use the ? wildcard to match the shards
    input_path: "trainLogs_pets/tfrecord/pet_faces_train.record-?????-of-?????"
  }
  label_map_path: "/media/han/E/mWork/mCode/models/research/object_detection/data/pet_label_map.pbtxt"
}

eval_config: {
  metrics_set: "coco_detection_metrics"
  num_examples: 1101
}

eval_input_reader: {
  tf_record_input_reader {
    input_path: "trainLogs_pets/tfrecord/pet_faces_val.record-?????-of-?????"
  }
  label_map_path: "/media/han/E/mWork/mCode/models/research/object_detection/data/pet_label_map.pbtxt"
  shuffle: false
  num_readers: 1
}
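
The whole file is a text-format TrainEvalPipelineConfig protobuf, so it can also be inspected or edited programmatically. A minimal sketch using the paths from this article:

from google.protobuf import text_format
from object_detection.protos import pipeline_pb2

# Parse the text-format config into a TrainEvalPipelineConfig message.
config = pipeline_pb2.TrainEvalPipelineConfig()
with open('trainLogs_pets/ssd_mobilenet_v1_pets.config', 'r') as f:
    text_format.Merge(f.read(), config)

print(config.model.ssd.num_classes)    # 37
print(config.train_config.batch_size)  # 24
print(config.eval_config.metrics_set)  # ['coco_detection_metrics']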

example-pets-eval

After training for 10000 steps

(base) /media/han/E/mWork/mCode/models/research/object_detection$ python eval.py --logtostderr --checkpoint_dir=trainLogs_pets/output --eval_dir=trainLogs_pets/eval --pipeline_config_path=trainLogs_pets/ssd_mobilenet_v1_pets.config

Eval results:

 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.499
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.729
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.590
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.270
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.541
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.645
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.669
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.669
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.478
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.707

There are several families of evaluation metrics: VOC, COCO, OID challenge, and so on. The choice is made via metrics_set inside eval_config in the pipeline config; the default is pascal_voc_detection_metrics.

EVAL_METRICS_CLASS_DICT = {
    'pascal_voc_detection_metrics':
        object_detection_evaluation.PascalDetectionEvaluator,
    'weighted_pascal_voc_detection_metrics':
        object_detection_evaluation.WeightedPascalDetectionEvaluator,
    'pascal_voc_instance_segmentation_metrics':
        object_detection_evaluation.PascalInstanceSegmentationEvaluator,
    'weighted_pascal_voc_instance_segmentation_metrics':
        object_detection_evaluation.WeightedPascalInstanceSegmentationEvaluator,
    'open_images_V2_detection_metrics':
        object_detection_evaluation.OpenImagesDetectionEvaluator,
    'coco_detection_metrics':
        coco_evaluation.CocoDetectionEvaluator,
    'coco_mask_metrics':
        coco_evaluation.CocoMaskEvaluator,
    'oid_challenge_object_detection_metrics':
        object_detection_evaluation.OpenImagesDetectionChallengeEvaluator,
}

EVAL_DEFAULT_METRIC = 'pascal_voc_detection_metrics'

Error log

ValueError: Tried to convert 't' to a tensor and failed. Error: Argument must be a dense tensor: range(0, 3) - got shape [3], but wanted []
Fix:
https://github.com/tensorflow/models/issues/3705#issuecomment-375563179