
A brief look at the differences between train_val.prototxt and deploy.prototxt in Caffe

When I first started learning Caffe, train_val.prototxt and deploy.prototxt looked very similar to me, and I wanted to see whether train_val.prototxt could be reconstructed from deploy.prototxt, so I compared the two. My understanding is limited, so some of the explanations may fall short; corrections and criticism are welcome.

This post uses CaffeNet as the example:

1. train_val.prototxt 

First, train_val.prototxt is the network definition file used during training.

2. deploy.prototxt

deploy.prototxt is the file used at test time, when the trained network is deployed for inference.

The differences:

In short, deploy.prototxt is obtained from train_val.prototxt by deleting things: given what each file is for, everything in train_val.prototxt that exists only for training is removed in deploy.prototxt.

At the top of train_val.prototxt, the data layer specifies the training inputs and preprocessing. In transform_param this includes mirror: true (enable random mirroring), crop_size: *** (the input crop size), and mean_file: "" (the mean-image file); in data_param it includes source: "" (the path to the preprocessed training set), batch_size: *** (the number of images fed in per training batch), and backend: LMDB (the data format).

Next, training also includes a test phase. Training and test modes are selected with include { phase: TRAIN } or include { phase: TEST }, and the TEST data layer is configured in the same way as the TRAIN one; its batch_size can be smaller, since testing does not need as many images per batch. An excerpt of the resulting data layer is shown below.
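For reference, here is the data layer as it appears in the CaffeNet train_val.prototxt listed at the end of this post (TRAIN phase shown; the TEST copy differs only in mirror: false, a source pointing to the validation LMDB, and a smaller batch_size):

```
layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  include { phase: TRAIN }    # this copy of the layer is used only during training
  transform_param {
    mirror: true              # random horizontal mirroring (data augmentation)
    crop_size: 227            # random 227x227 crop of each input image
    mean_file: "data/ilsvrc12/imagenet_mean.binaryproto"   # mean image to subtract
  }
  data_param {
    source: "examples/imagenet/ilsvrc12_train_lmdb"        # preprocessed training set
    batch_size: 256           # images per training batch
    backend: LMDB             # data format
  }
}
```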

In deploy.prototxt, all of the above is reduced to a single input layer: only name, type, top, and input_param need to be set.
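In the CaffeNet deploy.prototxt, the two data layers above collapse into this single input layer:

```
layer {
  name: "data"
  type: "Input"
  top: "data"
  # a batch of 10 images, 3 channels, 227x227 pixels; no labels and no data source
  input_param { shape: { dim: 10 dim: 3 dim: 227 dim: 227 } }
}
```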

Next comes the first convolution layer. In train_val.prototxt it has extra param blocks (the learning-rate multipliers used during backpropagation): two param blocks are needed, one for the weights and one for the bias, and by convention the bias learning rate is set to twice the weight learning rate. Then convolution_param is configured; in train_val it additionally contains the weight_filler and bias_filler initializers, which deploy.prototxt omits.
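Comparing conv1 in the two CaffeNet files makes this concrete; train_val carries the learning-rate multipliers and the fillers, while deploy keeps only the layer shape:

```
# train_val.prototxt
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param { lr_mult: 1  decay_mult: 1 }   # weight learning-rate multiplier
  param { lr_mult: 2  decay_mult: 0 }   # bias learning rate: 2x the weight rate
  convolution_param {
    num_output: 96
    kernel_size: 11
    stride: 4
    weight_filler { type: "gaussian"  std: 0.01 }   # weight initialization
    bias_filler { type: "constant"  value: 0 }      # bias initialization
  }
}

# deploy.prototxt
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  convolution_param {
    num_output: 96
    kernel_size: 11
    stride: 4
  }
}
```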

Next comes the activation function (ReLU). Since it has nothing to initialize, it is identical in both files.

Then the pooling layer. Pooling only reduces the spatial resolution, so it is also identical in the two files; just kernel_size, stride, and pool need to be set, and there are no parameters to initialize.
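The pooling layer, literally the same in both CaffeNet files:

```
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX        # max pooling
    kernel_size: 3
    stride: 2
  }
}
```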

Next is the LRN layer, short for Local Response Normalization, which normalizes activations over local regions of the input. Some later papers report that including or omitting this layer makes little difference to the results. Either way, its definition is the same in both files.
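Likewise, the LRN layer as it appears in both files:

```
layer {
  name: "norm1"
  type: "LRN"
  bottom: "pool1"
  top: "norm1"
  lrn_param {
    local_size: 5    # size of the local region to normalize over
    alpha: 0.0001
    beta: 0.75
  }
}
```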

再接下來就是"conv2"、"relu2"、"pool2"、"LRN2"這樣的迴圈,具體跟之前說的一樣,train_val主要多的就是引數的初始化和學習率的設定。

After the fifth convolution layer comes "fc6", a fully connected (InnerProduct) layer. Here too, train_val.prototxt has the two extra param blocks for the learning rates plus the weight_filler and bias_filler initializers; what the two files share is inner_product_param, which sets the number of output elements (num_output).
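Comparing fc6 in the two files:

```
# train_val.prototxt
layer {
  name: "fc6"
  type: "InnerProduct"
  bottom: "pool5"
  top: "fc6"
  param { lr_mult: 1  decay_mult: 1 }
  param { lr_mult: 2  decay_mult: 0 }
  inner_product_param {
    num_output: 4096                                  # number of output elements
    weight_filler { type: "gaussian"  std: 0.005 }
    bias_filler { type: "constant"  value: 1 }
  }
}

# deploy.prototxt
layer {
  name: "fc6"
  type: "InnerProduct"
  bottom: "pool5"
  top: "fc6"
  inner_product_param { num_output: 4096 }
}
```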

Next comes the ReLU activation again.

Then the Dropout layer, whose purpose is to reduce overfitting. Its dropout_ratio is usually set to 0.5. It is defined the same way in both files; Caffe's Dropout layer only drops activations during training and acts as a pass-through at test time.
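The dropout layer as it appears, unchanged, in both files:

```
layer {
  name: "drop6"
  type: "Dropout"
  bottom: "fc6"
  top: "fc6"
  dropout_param { dropout_ratio: 0.5 }   # drop 50% of activations during training
}
```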

再接下來就是"fc7",這一層跟"fc6"相同。然後就是"relu7"、"drop7"都是相同的。然後就是"fc8"也與之前相同。

Next is the Accuracy layer, which measures how often the network output matches the target label. It is not actually a loss layer and has no backward pass, even though the Caffe documentation lists it under the loss layers. Since it needs ground-truth labels, this layer does not appear in deploy.prototxt at all.
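The accuracy layer in train_val.prototxt, restricted to the TEST phase and absent from deploy.prototxt:

```
layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "fc8"
  bottom: "label"            # needs the ground-truth labels, which deploy does not have
  top: "accuracy"
  include { phase: TEST }
}
```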

The last layer of train_val.prototxt is the "SoftmaxWithLoss" layer, defined simply with name, type, bottom, and top. This layer is also absent from deploy.prototxt.

In its place, deploy.prototxt defines a layer of type "Softmax".
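The final layers of the two files, side by side:

```
# train_val.prototxt: computes the loss against the labels (forward + backward)
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "fc8"
  bottom: "label"
  top: "loss"
}

# deploy.prototxt: only computes class probabilities (forward)
layer {
  name: "prob"
  type: "Softmax"
  bottom: "fc8"
  top: "prob"
}
```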

Looking through these two CaffeNet files, the pattern is that anything tied to training is removed from deploy.prototxt, in particular everything that exists only for the backward (training) pass.

One remaining difference is worth explaining: why does train_val use a SoftmaxWithLoss layer while deploy uses a Softmax layer, when neither layer has any learned parameters?

Both are applications of softmax regression. The Softmax layer just computes the class probabilities, so it only needs a forward pass, while SoftmaxWithLoss also computes the loss against the labels and therefore implements a backward pass as well. That is where the difference comes from; for the details, see the C++ implementations of the two layers.

For completeness, the full files are listed below: first train_val.prototxt, then deploy.prototxt.

name: "CaffeNet" layer { name: "data" type: "Data" top: "data" top: "label" include { phase: TRAIN } transform_param { mirror: true crop_size: 227 mean_file: "data/ilsvrc12/imagenet_mean.binaryproto" } # mean pixel / channel-wise mean instead of mean image # transform_param { # crop_size: 227 # mean_value: 104 # mean_value: 117 # mean_value: 123 # mirror: true # } data_param { source: "examples/imagenet/ilsvrc12_train_lmdb" batch_size: 256 backend: LMDB } } layer { name: "data" type: "Data" top: "data" top: "label" include { phase: TEST } transform_param { mirror: false crop_size: 227 mean_file: "data/ilsvrc12/imagenet_mean.binaryproto" } # mean pixel / channel-wise mean instead of mean image # transform_param { # crop_size: 227 # mean_value: 104 # mean_value: 117 # mean_value: 123 # mirror: false # } data_param { source: "examples/imagenet/ilsvrc12_val_lmdb" batch_size: 50 backend: LMDB } } layer { name: "conv1" type: "Convolution" bottom: "data" top: "conv1" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 96 kernel_size: 11 stride: 4 weight_filler { type: "gaussian" std: 0.01 } bias_filler { type: "constant" value: 0 } } } layer { name: "relu1" type: "ReLU" bottom: "conv1" top: "conv1" } layer { name: "pool1" type: "Pooling" bottom: "conv1" top: "pool1" pooling_param { pool: MAX kernel_size: 3 stride: 2 } } layer { name: "norm1" type: "LRN" bottom: "pool1" top: "norm1" lrn_param { local_size: 5 alpha: 0.0001 beta: 0.75 } } layer { name: "conv2" type: "Convolution" bottom: "norm1" top: "conv2" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 256 pad: 2 kernel_size: 5 group: 2 weight_filler { type: "gaussian" std: 0.01 } bias_filler { type: "constant" value: 1 } } } layer { name: "relu2" type: "ReLU" bottom: "conv2" top: "conv2" } layer { name: "pool2" type: "Pooling" bottom: "conv2" top: "pool2" pooling_param { pool: MAX kernel_size: 3 stride: 2 } } layer { name: "norm2" type: "LRN" bottom: "pool2" top: "norm2" lrn_param { local_size: 5 alpha: 0.0001 beta: 0.75 } } layer { name: "conv3" type: "Convolution" bottom: "norm2" top: "conv3" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 384 pad: 1 kernel_size: 3 weight_filler { type: "gaussian" std: 0.01 } bias_filler { type: "constant" value: 0 } } } layer { name: "relu3" type: "ReLU" bottom: "conv3" top: "conv3" } layer { name: "conv4" type: "Convolution" bottom: "conv3" top: "conv4" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 384 pad: 1 kernel_size: 3 group: 2 weight_filler { type: "gaussian" std: 0.01 } bias_filler { type: "constant" value: 1 } } } layer { name: "relu4" type: "ReLU" bottom: "conv4" top: "conv4" } layer { name: "conv5" type: "Convolution" bottom: "conv4" top: "conv5" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } convolution_param { num_output: 256 pad: 1 kernel_size: 3 group: 2 weight_filler { type: "gaussian" std: 0.01 } bias_filler { type: "constant" value: 1 } } } layer { name: "relu5" type: "ReLU" bottom: "conv5" top: "conv5" } layer { name: "pool5" type: "Pooling" bottom: "conv5" top: "pool5" pooling_param { pool: MAX kernel_size: 3 stride: 2 } } layer { name: "fc6" type: "InnerProduct" bottom: "pool5" top: "fc6" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } inner_product_param { num_output: 4096 weight_filler { type: "gaussian" std: 0.005 } bias_filler { 
type: "constant" value: 1 } } } layer { name: "relu6" type: "ReLU" bottom: "fc6" top: "fc6" } layer { name: "drop6" type: "Dropout" bottom: "fc6" top: "fc6" dropout_param { dropout_ratio: 0.5 } } layer { name: "fc7" type: "InnerProduct" bottom: "fc6" top: "fc7" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } inner_product_param { num_output: 4096 weight_filler { type: "gaussian" std: 0.005 } bias_filler { type: "constant" value: 1 } } } layer { name: "relu7" type: "ReLU" bottom: "fc7" top: "fc7" } layer { name: "drop7" type: "Dropout" bottom: "fc7" top: "fc7" dropout_param { dropout_ratio: 0.5 } } layer { name: "fc8" type: "InnerProduct" bottom: "fc7" top: "fc8" param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0 } inner_product_param { num_output: 1000 weight_filler { type: "gaussian" std: 0.01 } bias_filler { type: "constant" value: 0 } } } layer { name: "accuracy" type: "Accuracy" bottom: "fc8" bottom: "label" top: "accuracy" include { phase: TEST } } layer { name: "loss" type: "SoftmaxWithLoss" bottom: "fc8" bottom: "label" top: "loss" } --------------------- 作者:不破樓蘭終不還 來源:CSDN 原文:https://blog.csdn.net/fx409494616/article/details/53008971?utm_source=copy 版權宣告:本文為博主原創文章,轉載請附上博文連結! name: "CaffeNet" layer {   name: "data"   type: "Input"   top: "data"   input_param { shape: { dim: 10 dim: 3 dim: 227 dim: 227 } } } layer {   name: "conv1"   type: "Convolution"   bottom: "data"   top: "conv1"   convolution_param {     num_output: 96     kernel_size: 11     stride: 4   } } layer {   name: "relu1"   type: "ReLU"   bottom: "conv1"   top: "conv1" } layer {   name: "pool1"   type: "Pooling"   bottom: "conv1"   top: "pool1"   pooling_param {     pool: MAX     kernel_size: 3     stride: 2   } } layer {   name: "norm1"   type: "LRN"   bottom: "pool1"   top: "norm1"   lrn_param {     local_size: 5     alpha: 0.0001     beta: 0.75   } } layer {   name: "conv2"   type: "Convolution"   bottom: "norm1"   top: "conv2"   convolution_param {     num_output: 256     pad: 2     kernel_size: 5     group: 2   } } layer {   name: "relu2"   type: "ReLU"   bottom: "conv2"   top: "conv2" } layer {   name: "pool2"   type: "Pooling"   bottom: "conv2"   top: "pool2"   pooling_param {     pool: MAX     kernel_size: 3     stride: 2   } } layer {   name: "norm2"   type: "LRN"   bottom: "pool2"   top: "norm2"   lrn_param {     local_size: 5     alpha: 0.0001     beta: 0.75   } } layer {   name: "conv3"   type: "Convolution"   bottom: "norm2"   top: "conv3"   convolution_param {     num_output: 384     pad: 1     kernel_size: 3   } } layer {   name: "relu3"   type: "ReLU"   bottom: "conv3"   top: "conv3" } layer {   name: "conv4"   type: "Convolution"   bottom: "conv3"   top: "conv4"   convolution_param {     num_output: 384     pad: 1     kernel_size: 3     group: 2   } } layer {   name: "relu4"   type: "ReLU"   bottom: "conv4"   top: "conv4" } layer {   name: "conv5"   type: "Convolution"   bottom: "conv4"   top: "conv5"   convolution_param {     num_output: 256     pad: 1     kernel_size: 3     group: 2   } } layer {   name: "relu5"   type: "ReLU"   bottom: "conv5"   top: "conv5" } layer {   name: "pool5"   type: "Pooling"   bottom: "conv5"   top: "pool5"   pooling_param {     pool: MAX     kernel_size: 3     stride: 2   } } layer {   name: "fc6"   type: "InnerProduct"   bottom: "pool5"   top: "fc6"   inner_product_param {     num_output: 4096   } } layer {   name: "relu6"   type: "ReLU"   bottom: "fc6"   top: "fc6" } layer {   name: "drop6"   type: "Dropout"   bottom: "fc6"   top: 
"fc6"   dropout_param {     dropout_ratio: 0.5   } } layer {   name: "fc7"   type: "InnerProduct"   bottom: "fc6"   top: "fc7"   inner_product_param {     num_output: 4096   } } layer {   name: "relu7"   type: "ReLU"   bottom: "fc7"   top: "fc7" } layer {   name: "drop7"   type: "Dropout"   bottom: "fc7"   top: "fc7"   dropout_param {     dropout_ratio: 0.5   } } layer {   name: "fc8"   type: "InnerProduct"   bottom: "fc7"   top: "fc8"   inner_product_param {     num_output: 1000   } } layer {   name: "prob"   type: "Softmax"   bottom: "fc8"   top: "prob" }