
Tuning Faster R-CNN / SSD model parameters in the TensorFlow Object Detection API

For setting up the TensorFlow Object Detection API, you can refer to the earlier article: https://becominghuman.ai/tensorflow-object-detection-api-tutorial-training-and-evaluating-custom-object-detector-ed2594afcf73

In this article, I will discuss how to change the configuration of a pre-trained model. The goal is that you can configure TensorFlow/models for your own application, and the API will no longer be a black box!

Overview of this article:

  • Understand protocol buffers and proto files.
  • Use the knowledge of proto files to understand a model's configuration file.
  • Follow 3 steps to update a model's parameters.
  • Additional examples:
  1. Changing the weight initializer
  2. Changing the weight optimizer
  3. Evaluating a pre-trained model

Protocol Buffers

To modify a model, we need to understand its inner mechanisms. The TensorFlow Object Detection API uses Protocol Buffers, a language-neutral, platform-neutral, extensible mechanism for serializing structured data. It is like XML, but smaller, faster, and simpler. The API uses the proto2 version of the protocol buffer language. I will try to explain the parts of the language needed to update a pre-configured model. For more details on the protocol buffer language, refer to this documentation and the Python tutorial.

Working with protocol buffers can be broken down into the following three steps:

  • Define message formats in a .proto file. This file acts as a blueprint for all messages: it shows what parameters a message accepts, what the data type of each parameter should be, whether a parameter is required or optional, what its tag number is, what its default value is, and so on. The API's proto files can be found here. For illustration, I use the grid_anchor_generator.proto file.
  • syntax = "proto2";
    
    package object_detection.protos;
    
    // Configuration proto for GridAnchorGenerator. See
    // anchor_generators/grid_anchor_generator.py for details.
    message GridAnchorGenerator {
       // Anchor height in pixels.
      optional int32 height = 1 [default = 256];
    
      // Anchor width in pixels.
      optional int32 width = 2 [default = 256];
    
      // Anchor stride in height dimension in pixels.
      optional int32 height_stride = 3 [default = 16];
    
      // Anchor stride in width dimension in pixels.
      optional int32 width_stride = 4 [default = 16];
    
      // Anchor height offset in pixels.
      optional int32 height_offset = 5 [default = 0];
    
      // Anchor width offset in pixels.
      optional int32 width_offset = 6 [default = 0];
    
      // At any given location, len(scales) * len(aspect_ratios) anchors are
      // generated with all possible combinations of scales and aspect ratios.
    
      // List of scales for the anchors.
      repeated float scales = 7;
    
      // List of aspect ratios for the anchors.
      repeated float aspect_ratios = 8;
    }

    From lines 30-33 it is clear that the parameters scales and aspect_ratios are mandatory for the GridAnchorGenerator message, while the remaining parameters are optional and take their default values if they are not supplied.

    • After defining the message format, we need to compile the protocol buffers. The compiler generates classes from the .proto files. During installation of the API, we ran the following command, which compiles the protocol buffers:
    • # From tensorflow/models/research/
      protoc object_detection/protos/*.proto --python_out=.
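      As a quick illustration (a minimal sketch, assuming the API is installed and the protos have been compiled as above), the generated module grid_anchor_generator_pb2 exposes a GridAnchorGenerator class whose fields mirror the .proto definition:

        from object_detection.protos import grid_anchor_generator_pb2

        # Build a GridAnchorGenerator message from the generated class.
        anchor_cfg = grid_anchor_generator_pb2.GridAnchorGenerator()
        anchor_cfg.scales.extend([0.25, 0.5, 1.0, 2.0])   # repeated float field
        anchor_cfg.aspect_ratios.extend([0.5, 1.0, 2.0])  # repeated float field

        print(anchor_cfg.height)         # 256 -- default value declared in the proto
        print(anchor_cfg.height_stride)  # 16  -- default value declared in the proto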
      • After defining and compiling the protocol buffers, we need to use the Python protocol buffer API to write and read messages. In our case, the configuration file acts as that read/write interface: it lets us write and read messages without worrying about the internal mechanics of the TensorFlow API. In other words, we can update the parameters of a pre-trained model simply by changing the configuration file appropriately.
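        For example (a minimal sketch, assuming the config file discussed below is saved locally as faster_rcnn_resnet50_pets.config), the whole configuration can be read and written through the generated pipeline_pb2 classes together with google.protobuf.text_format:

        from google.protobuf import text_format
        from object_detection.protos import pipeline_pb2

        # Parse the text-format config file into a TrainEvalPipelineConfig message.
        pipeline_config = pipeline_pb2.TrainEvalPipelineConfig()
        with open('faster_rcnn_resnet50_pets.config', 'r') as f:
            text_format.Merge(f.read(), pipeline_config)

        # Fields can now be read or updated programmatically ...
        pipeline_config.model.faster_rcnn.num_classes = 37

        # ... and written back out as a text-format config file.
        with open('updated_pipeline.config', 'w') as f:
            f.write(text_format.MessageToString(pipeline_config))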
      Understanding the configuration file

        Clearly, the configuration file is what lets us change the model's parameters as needed. The next question is: how do we actually change a parameter? This section and the next answer that question, and this is where the knowledge of proto files comes in handy. For demonstration purposes, I am using the faster_rcnn_resnet50_pets.config file.

      • # Faster R-CNN with Resnet-50 (v1), configured for Oxford-IIIT Pets Dataset.
        # Users should configure the fine_tune_checkpoint field in the train config as
        # well as the label_map_path and input_path fields in the train_input_reader and
        # eval_input_reader. Search for "PATH_TO_BE_CONFIGURED" to find the fields that
        # should be configured.
        
        model {
          faster_rcnn {
            num_classes: 37
            image_resizer {
              keep_aspect_ratio_resizer {
                min_dimension: 600
                max_dimension: 1024
              }
            }
            feature_extractor {
              type: 'faster_rcnn_resnet50'
              first_stage_features_stride: 16
            }
            first_stage_anchor_generator {
              grid_anchor_generator {
                scales: [0.25, 0.5, 1.0, 2.0]
                aspect_ratios: [0.5, 1.0, 2.0]
                height_stride: 16
                width_stride: 16
              }
            }
            first_stage_box_predictor_conv_hyperparams {
              op: CONV
              regularizer {
                l2_regularizer {
                  weight: 0.0
                }
              }
              initializer {
                truncated_normal_initializer {
                  stddev: 0.01
                }
              }
            }
            first_stage_nms_score_threshold: 0.0
            first_stage_nms_iou_threshold: 0.7
            first_stage_max_proposals: 300
            first_stage_localization_loss_weight: 2.0
            first_stage_objectness_loss_weight: 1.0
            initial_crop_size: 14
            maxpool_kernel_size: 2
            maxpool_stride: 2
            second_stage_box_predictor {
              mask_rcnn_box_predictor {
                use_dropout: false
                dropout_keep_probability: 1.0
                fc_hyperparams {
                  op: FC
                  regularizer {
                    l2_regularizer {
                      weight: 0.0
                    }
                  }
                  initializer {
                    variance_scaling_initializer {
                      factor: 1.0
                      uniform: true
                      mode: FAN_AVG
                    }
                  }
                }
              }
            }
            second_stage_post_processing {
              batch_non_max_suppression {
                score_threshold: 0.0
                iou_threshold: 0.6
                max_detections_per_class: 100
                max_total_detections: 300
              }
              score_converter: SOFTMAX
            }
            second_stage_localization_loss_weight: 2.0
            second_stage_classification_loss_weight: 1.0
          }
        }
        
        train_config: {
          batch_size: 1
          optimizer {
            momentum_optimizer: {
              learning_rate: {
                manual_step_learning_rate {
                  initial_learning_rate: 0.0003
                  schedule {
                    step: 900000
                    learning_rate: .00003
                  }
                  schedule {
                    step: 1200000
                    learning_rate: .000003
                  }
                }
              }
              momentum_optimizer_value: 0.9
            }
            use_moving_average: false
          }
          gradient_clipping_by_norm: 10.0
          fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED/model.ckpt"
          from_detection_checkpoint: true
          # Note: The below line limits the training process to 200K steps, which we
          # empirically found to be sufficient enough to train the pets dataset. This
          # effectively bypasses the learning rate schedule (the learning rate will
          # never decay). Remove the below line to train indefinitely.
          num_steps: 200000
          data_augmentation_options {
            random_horizontal_flip {
            }
          }
          max_number_of_boxes: 50
        }
        
        train_input_reader: {
          tf_record_input_reader {
            input_path: "PATH_TO_BE_CONFIGURED/pet_train.record"
          }
          label_map_path: "PATH_TO_BE_CONFIGURED/pet_label_map.pbtxt"
        }
        
        eval_config: {
          num_examples: 2000
          # Note: The below line limits the evaluation process to 10 evaluations.
          # Remove the below line to evaluate indefinitely.
          max_evals: 10
        }
        
        eval_input_reader: {
          tf_record_input_reader {
            input_path: "PATH_TO_BE_CONFIGURED/pet_val.record"
          }
          label_map_path: "PATH_TO_BE_CONFIGURED/pet_label_map.pbtxt"
          shuffle: false
          num_readers: 1
        }

        Lines 7 to 10 indicate that num_classes is one of the parameters of the faster_rcnn message, which in turn is a parameter of the model message. Similarly, optimizer is a child message of the parent train_config message, and batch_size is another parameter of the train_config message. We can verify this by checking the corresponding protos files.

      • syntax = "proto2";
        
        package object_detection.protos;
        
        import "object_detection/protos/anchor_generator.proto";
        import "object_detection/protos/box_predictor.proto";
        import "object_detection/protos/hyperparams.proto";
        import "object_detection/protos/image_resizer.proto";
        import "object_detection/protos/losses.proto";
        import "object_detection/protos/post_processing.proto";
        
        // Configuration for Faster R-CNN models.
        // See meta_architectures/faster_rcnn_meta_arch.py and models/model_builder.py
        //
        // Naming conventions:
        // Faster R-CNN models have two stages: a first stage region proposal network
        // (or RPN) and a second stage box classifier.  We thus use the prefixes
        // `first_stage_` and `second_stage_` to indicate the stage to which each
        // parameter pertains when relevant.
        message FasterRcnn {
        
          // Whether to construct only the Region Proposal Network (RPN).
          optional int32 number_of_stages = 1 [default=2];
        
          // Number of classes to predict.
          optional int32 num_classes = 3;
          
          // Image resizer for preprocessing the input image.
          optional ImageResizer image_resizer = 4;

        From lines 20 and 26 it is clear that num_classes is an optional parameter of the faster_rcnn message. I hope the discussion so far has helped in understanding how the configuration file is organized. Now it is time to actually update one of the model's parameters.

      Step 1: Decide which parameter to update

        Suppose we need to update the image_resizer parameter mentioned on line 10 of the faster_rcnn_resnet50_pets.config file.

        Step 2: Search the repository for the given parameter

        The goal is to find the proto file for the parameter. For this, we need to search the repository.

         We need to search for the following:

      • parameter_name path:research/object_detection/protos
        #in our case parameter_name="image_resizer" thus,
        image_resizer path:research/object_detection/protos

        Here, path:research/object_detection/protos restricts the search scope. More information on how to search on GitHub can be found here. The output of the search image_resizer path:research/object_detection/protos looks like the following:

        [Screenshot: GitHub search results for image_resizer, pointing to image_resizer.proto]

        From the output it is clear that in order to update the image_resizer parameter, we need to analyze the image_resizer.proto file.

        Step 3: Analyze the proto file

         

        syntax = "proto2";
        
        package object_detection.protos;
        
        // Configuration proto for image resizing operations.
        // See builders/image_resizer_builder.py for details.
        message ImageResizer {
          oneof image_resizer_oneof {
            KeepAspectRatioResizer keep_aspect_ratio_resizer = 1;
            FixedShapeResizer fixed_shape_resizer = 2;
          }
        }
        
        // Enumeration type for image resizing methods provided in TensorFlow.
        enum ResizeType {
          BILINEAR = 0; // Corresponds to tf.image.ResizeMethod.BILINEAR
          NEAREST_NEIGHBOR = 1; // Corresponds to tf.image.ResizeMethod.NEAREST_NEIGHBOR
          BICUBIC = 2; // Corresponds to tf.image.ResizeMethod.BICUBIC
          AREA = 3; // Corresponds to tf.image.ResizeMethod.AREA
        }
        
        // Configuration proto for image resizer that keeps aspect ratio.
        message KeepAspectRatioResizer {
          // Desired size of the smaller image dimension in pixels.
          optional int32 min_dimension = 1 [default = 600];
        
          // Desired size of the larger image dimension in pixels.
          optional int32 max_dimension = 2 [default = 1024];
        
          // Desired method when resizing image.
          optional ResizeType resize_method = 3 [default = BILINEAR];
        
          // Whether to pad the image with zeros so the output spatial size is
          // [max_dimension, max_dimension]. Note that the zeros are padded to the
          // bottom and the right of the resized image.
          optional bool pad_to_max_dimension = 4 [default = false];
        
          // Whether to also resize the image channels from 3 to 1 (RGB to grayscale).
          optional bool convert_to_grayscale = 5 [default = false];
        
          // Per-channel pad value. This is only used when pad_to_max_dimension is True.
          // If unspecified, a default pad value of 0 is applied to all channels.
          repeated float per_channel_pad_value = 6;
        }
        
        // Configuration proto for image resizer that resizes to a fixed shape.
        message FixedShapeResizer {
          // Desired height of image in pixels.
          optional int32 height = 1 [default = 300];
        
          // Desired width of image in pixels.
          optional int32 width = 2 [default = 300];
        
          // Desired method when resizing image.
          optional ResizeType resize_method = 3 [default = BILINEAR];
        
          // Whether to also resize the image channels from 3 to 1 (RGB to grayscale).
          optional bool convert_to_grayscale = 4 [default = false];
        }

        From lines 8-10 we can see that images can be resized using either keep_aspect_ratio_resizer or fixed_shape_resizer. Analyzing lines 23-44, we can observe that the keep_aspect_ratio_resizer message has the parameters min_dimension, max_dimension, resize_method, pad_to_max_dimension, convert_to_grayscale, and per_channel_pad_value. Similarly, fixed_shape_resizer has the parameters height, width, resize_method, and convert_to_grayscale. The data types of all the parameters are given in the proto file. Thus, to change the image_resizer type, we can change the following lines in the configuration file.

      • #before
        image_resizer {
          keep_aspect_ratio_resizer {
            min_dimension: 600
            max_dimension: 1024
          }
        }
        #after
        image_resizer {
          fixed_shape_resizer {
            height: 600
            width: 500
            resize_method: AREA
          }
        }

        The code above resizes images to a fixed 500 × 600 (width × height) using the AREA resize method. The various resize methods available in TensorFlow can be found here.
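        As a rough illustration (a minimal sketch using TensorFlow 2.x image ops, not code from the API itself), this is approximately what such a fixed-shape resize corresponds to:

        import tensorflow as tf

        # A dummy RGB image; in the API the input actually comes from the TFRecord pipeline.
        image = tf.zeros([800, 1200, 3], dtype=tf.float32)

        # fixed_shape_resizer with height: 600, width: 500 and resize_method: AREA
        # corresponds roughly to:
        resized = tf.image.resize(image, size=[600, 500],
                                  method=tf.image.ResizeMethod.AREA)
        print(resized.shape)  # (600, 500, 3)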

      Additional examples

        We can update/add any parameter using the steps discussed in the previous section. Here I demonstrate a few frequently used examples, but the steps discussed above should help in updating/adding any parameter of the model.

        Changing the weight initializer

        • Decide to change the initializer parameter on line 35 of the faster_rcnn_resnet50_pets.config file.
        • Search the repository for initializer path:research/object_detection/protos. From the search results, it is clear that we need to analyze the hyperparams.proto file.
          • Lines 68-74 of the hyperparams.proto file describe the initializer configuration.
          • message Initializer {
              oneof initializer_oneof {
                TruncatedNormalInitializer truncated_normal_initializer = 1;
                VarianceScalingInitializer variance_scaling_initializer = 2;
                RandomNormalInitializer random_normal_initializer = 3;
              }
            }

            We can use random_normal_initializer in place of truncated_normal_initializer; for that, we need to analyze lines 99-102 of the hyperparams.proto file.

          • message RandomNormalInitializer {
            optional float mean = 1 [default = 0.0];
            optional float stddev = 2 [default = 1.0];
            }
          • Clearly, random_normal_initializer has two parameters: mean and stddev. We can change the following lines in the configuration file to use random_normal_initializer:
          • #before
            initializer {
              truncated_normal_initializer {
                stddev: 0.01
              }
            }
            #after
            initializer {
              random_normal_initializer {
                mean: 1
                stddev: 0.5
              }
            }
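            As an aside (a minimal sketch using the tf.keras initializer classes, which is not how the API builds its layers internally, but the parameters are the same), the two choices above correspond roughly to:

            import tensorflow as tf

            # For intuition only: the Object Detection API constructs initializers from the
            # config through its own builders rather than directly like this.
            truncated = tf.keras.initializers.TruncatedNormal(mean=0.0, stddev=0.01)
            random_normal = tf.keras.initializers.RandomNormal(mean=1.0, stddev=0.5)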

            Changing the weight optimizer

            • Decide to change the momentum_optimizer parameter on line 87 of the faster_rcnn_resnet50_pets.config file; its parent message is optimizer.
            • Search the repository for optimizer path:research/object_detection/protos. From the search results, it is clear that we need to analyze the optimizer.proto file.
              • Lines 9-14 of the optimizer.proto file describe the optimizer configuration.

               

              message Optimizer {
                oneof optimizer {
                  RMSPropOptimizer rms_prop_optimizer = 1;
                  MomentumOptimizer momentum_optimizer = 2;
                  AdamOptimizer adam_optimizer = 3;
                }

              Clearly, in place of momentum_optimizer we can use adam_optimizer, which has proven to be a good optimizer. To do so, we need to make the following changes in the faster_rcnn_resnet50_pets.config file.

           

          #before
          optimizer {
            momentum_optimizer: {
              learning_rate: {
                manual_step_learning_rate {
                  initial_learning_rate: 0.0003
                  schedule {
                    step: 900000
                    learning_rate: .00003
                  }
                  schedule {
                    step: 1200000
                    learning_rate: .000003
                  }
                }
              }
              momentum_optimizer_value: 0.9
            }
          }
          #after
          optimizer {
            adam_optimizer: {
              learning_rate: {
                manual_step_learning_rate {
                  initial_learning_rate: 0.0003
                  schedule {
                    step: 900000
                    learning_rate: .00003
                  }
                  schedule {
                    step: 1200000
                    learning_rate: .000003
                  }
                }
              }
            }
          }
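          To double-check which optimizer a given config ends up using (a minimal sketch, reusing the text_format parsing shown earlier and assuming the edited file is saved as faster_rcnn_resnet50_pets.config), the oneof can be inspected programmatically:

          from google.protobuf import text_format
          from object_detection.protos import pipeline_pb2

          pipeline_config = pipeline_pb2.TrainEvalPipelineConfig()
          with open('faster_rcnn_resnet50_pets.config', 'r') as f:
              text_format.Merge(f.read(), pipeline_config)

          # The oneof in optimizer.proto is named `optimizer`; this prints whichever branch is set.
          print(pipeline_config.train_config.optimizer.WhichOneof('optimizer'))  # e.g. 'adam_optimizer'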

          Evaluating the pre-trained model

          Eval waits for 300 seconds to check whether the trained model has been updated! If your GPU is good enough, you can train and evaluate simultaneously! Usually, though, resources get exhausted. To overcome this, we can train the model first, save it in a directory, and evaluate it afterwards. To run the evaluation later, we need to make the following changes in the configuration file:

        • #before
          eval_config: {
            num_examples: 2000
            # Note: The below line limits the evaluation process to 10 evaluations.
            # Remove the below line to evaluate indefinitely.
            max_evals: 10
          }
          #after
          eval_config: {
            num_examples: 10
            num_visualizations: 10
            eval_interval_secs: 0
          }

          num_visualizations should be equal to the number of examples you want to evaluate! The more visualizations there are, the more time evaluation takes. If your GPU is powerful enough to train and evaluate simultaneously, you can keep eval_interval_secs: 300; this parameter decides how often evaluation is run. I arrived at these settings by following the 3 steps discussed above.
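          The same change can also be made programmatically instead of hand-editing (a minimal sketch, reusing the parsing approach from earlier; the file names are placeholders):

          from google.protobuf import text_format
          from object_detection.protos import pipeline_pb2

          pipeline_config = pipeline_pb2.TrainEvalPipelineConfig()
          with open('faster_rcnn_resnet50_pets.config', 'r') as f:
              text_format.Merge(f.read(), pipeline_config)

          # Evaluate 10 examples, visualize all of them, and do not wait between evaluation runs.
          pipeline_config.eval_config.num_examples = 10
          pipeline_config.eval_config.num_visualizations = 10
          pipeline_config.eval_config.eval_interval_secs = 0

          with open('eval_later.config', 'w') as f:
              f.write(text_format.MessageToString(pipeline_config))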

          In a nutshell, knowledge of protocol buffers helped us understand that model parameters are passed as messages and that any parameter can be updated by referring to its .proto file. We also discussed 3 simple steps to locate the correct .proto file for the parameter we want to update.

          Please mention in the comments any parameter of the configuration file that you would like to update/add.
