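Caffe in Practice series: the proto file format and what it means: how to define a network and how to set its parameters (using AlexNet as an example) 2016.3.30

阿新 • Published: 2019-01-21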

(0) Preface:

Beginners often do not know how to configure a network, or are faced with these parameters without knowing what they mean. Below, drawing on my experience reading the source code, I walk through AlexNet in detail. I hope this helps beginners define networks, and also shows where to look up a layer's parameters and how they are set. Taking AlexNet as the example, here is the configuration first:

name: "AlexNet"
layer {  # data layer
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TRAIN  # include: this layer is only part of the net during training
  }
  transform_param {  # preprocessing: mirroring, crop size 227, subtract the mean file
    mirror: true
    crop_size: 227
    mean_file: "data/ilsvrc12/imagenet_mean.binaryproto"
  }
  data_param {  # where the data comes from
    source: "examples/imagenet/ilsvrc12_train_lmdb"
    batch_size: 256
    backend: LMDB
  }
}
layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  include {  # this layer is only used during testing
    phase: TEST
  }
  transform_param {  # no mirroring at test time
    mirror: false
    crop_size: 227
    mean_file: "data/ilsvrc12/imagenet_mean.binaryproto"
  }
  data_param {
    source: "examples/imagenet/ilsvrc12_val_lmdb"
    batch_size: 50
    backend: LMDB
  }
}
layer {  # convolution layer
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  # General learning-related settings: multipliers for the learning rate and
  # the weight decay. There are two param blocks because the layer has two
  # parameter blobs: the first block applies to the weights, the second to
  # the biases.
  param { lr_mult: 1 decay_mult: 1 }
  param { lr_mult: 2 decay_mult: 0 }
  convolution_param {  # convolution settings: kernels and biases
    num_output: 96
    kernel_size: 11
    stride: 4
    # conv1 does not set group, so the default of 1 applies;
    # conv2, conv4 and conv5 below set group: 2.
    weight_filler { type: "gaussian" std: 0.01 }
    bias_filler { type: "constant" value: 0 }
  }
}
layer {  # ReLU layer
  name: "relu1"
  type: "ReLU"
  bottom: "conv1"
  top: "conv1"
}
layer {  # normalization layer
  name: "norm1"
  type: "LRN"
  bottom: "conv1"
  top: "norm1"
  lrn_param {
    local_size: 5
    alpha: 0.0001
    beta: 0.75
  }
}
layer {  # pooling layer
  name: "pool1"
  type: "Pooling"
  bottom: "norm1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "pool1"
  top: "conv2"
  param { lr_mult: 1 decay_mult: 1 }
  param { lr_mult: 2 decay_mult: 0 }
  convolution_param {
    num_output: 256
    pad: 2
    kernel_size: 5
    group: 2  # two convolution groups
    weight_filler { type: "gaussian" std: 0.01 }
    bias_filler { type: "constant" value: 0.1 }
  }
}
layer {
  name: "relu2"
  type: "ReLU"
  bottom: "conv2"
  top: "conv2"
}
layer {
  name: "norm2"
  type: "LRN"
  bottom: "conv2"
  top: "norm2"
  lrn_param {
    local_size: 5
    alpha: 0.0001
    beta: 0.75
  }
}
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "norm2"
  top: "pool2"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "conv3"
  type: "Convolution"
  bottom: "pool2"
  top: "conv3"
  param { lr_mult: 1 decay_mult: 1 }
  param { lr_mult: 2 decay_mult: 0 }
  convolution_param {
    num_output: 384
    pad: 1
    kernel_size: 3
    weight_filler { type: "gaussian" std: 0.01 }
    bias_filler { type: "constant" value: 0 }
  }
}
layer {
  name: "relu3"
  type: "ReLU"
  bottom: "conv3"
  top: "conv3"
}
layer {
  name: "conv4"
  type: "Convolution"
  bottom: "conv3"
  top: "conv4"
  param { lr_mult: 1 decay_mult: 1 }
  param { lr_mult: 2 decay_mult: 0 }
  convolution_param {
    num_output: 384
    pad: 1
    kernel_size: 3
    group: 2  # two convolution groups
    weight_filler { type: "gaussian" std: 0.01 }
    bias_filler { type: "constant" value: 0.1 }
  }
}
layer {
  name: "relu4"
  type: "ReLU"
  bottom: "conv4"
  top: "conv4"
}
layer {
  name: "conv5"
  type: "Convolution"
  bottom: "conv4"
  top: "conv5"
  param { lr_mult: 1 decay_mult: 1 }
  param { lr_mult: 2 decay_mult: 0 }
  convolution_param {
    num_output: 256
    pad: 1
    kernel_size: 3
    group: 2  # two convolution groups
    weight_filler { type: "gaussian" std: 0.01 }
    bias_filler { type: "constant" value: 0.1 }
  }
}
layer {
  name: "relu5"
  type: "ReLU"
  bottom: "conv5"
  top: "conv5"
}
layer {
  name: "pool5"
  type: "Pooling"
  bottom: "conv5"
  top: "pool5"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "fc6"
  type: "InnerProduct"
  bottom: "pool5"
  top: "fc6"
  param { lr_mult: 1 decay_mult: 1 }
  param { lr_mult: 2 decay_mult: 0 }
  inner_product_param {
    num_output: 4096
    weight_filler { type: "gaussian" std: 0.005 }
    bias_filler { type: "constant" value: 0.1 }
  }
}
layer {
  name: "relu6"
  type: "ReLU"
  bottom: "fc6"
  top: "fc6"
}
layer {
  name: "drop6"
  type: "Dropout"
  bottom: "fc6"
  top: "fc6"
  dropout_param {
    dropout_ratio: 0.5
  }
}
layer {
  name: "fc7"
  type: "InnerProduct"
  bottom: "fc6"
  top: "fc7"
  param { lr_mult: 1 decay_mult: 1 }
  param { lr_mult: 2 decay_mult: 0 }
  inner_product_param {
    num_output: 4096
    weight_filler { type: "gaussian" std: 0.005 }
    bias_filler { type: "constant" value: 0.1 }
  }
}
layer {
  name: "relu7"
  type: "ReLU"
  bottom: "fc7"
  top: "fc7"
}
layer {
  name: "drop7"
  type: "Dropout"
  bottom: "fc7"
  top: "fc7"
  dropout_param {
    dropout_ratio: 0.5
  }
}
layer {
  name: "fc8"
  type: "InnerProduct"
  bottom: "fc7"
  top: "fc8"
  param { lr_mult: 1 decay_mult: 1 }
  param { lr_mult: 2 decay_mult: 0 }
  inner_product_param {
    num_output: 1000
    weight_filler { type: "gaussian" std: 0.01 }
    bias_filler { type: "constant" value: 0 }
  }
}
layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "fc8"
  bottom: "label"
  top: "accuracy"
  include {  # this layer is only included in the test phase
    phase: TEST
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "fc8"
  bottom: "label"
  top: "loss"
}
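To see how kernel_size, stride and pad in the configuration above determine the shape of each blob, the following is a minimal sketch that traces the spatial size through the network. The helper names (conv_out, pool_out, alexnet_sizes) are made up for this illustration; the sketch assumes square inputs, that convolution rounds the output size down, and that Caffe's pooling rounds it up.

```python
import math


def conv_out(size, kernel, stride=1, pad=0):
    """Spatial output size of a convolution (rounded down)."""
    return (size + 2 * pad - kernel) // stride + 1


def pool_out(size, kernel, stride=1, pad=0):
    """Spatial output size of a pooling layer (rounded up)."""
    return math.ceil((size + 2 * pad - kernel) / stride) + 1


def alexnet_sizes(input_size=227):
    """Trace the spatial size through the conv/pool layers listed above."""
    sizes = {}
    s = sizes["conv1"] = conv_out(input_size, kernel=11, stride=4)
    s = sizes["pool1"] = pool_out(s, kernel=3, stride=2)
    s = sizes["conv2"] = conv_out(s, kernel=5, pad=2)
    s = sizes["pool2"] = pool_out(s, kernel=3, stride=2)
    s = sizes["conv3"] = conv_out(s, kernel=3, pad=1)
    s = sizes["conv4"] = conv_out(s, kernel=3, pad=1)
    s = sizes["conv5"] = conv_out(s, kernel=3, pad=1)
    s = sizes["pool5"] = pool_out(s, kernel=3, stride=2)
    return sizes


if __name__ == "__main__":
    for name, size in alexnet_sizes().items():
        print(f"{name}: {size} x {size}")
```

Under these assumptions pool5 comes out as 6 x 6 with 256 channels, so fc6 sees 256 * 6 * 6 = 9216 inputs per image.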

(1) Definition of transform_param, the preprocessing parameters of the data input layer:

// Message that stores parameters used to apply transformation
// to the data layer's data
message TransformationParameter {
  // For data pre-processing, we can do simple scaling and subtracting the
  // data mean, if provided. Note that the mean subtraction is always carried
  // out before scaling.
  // Scales the pixel values: pixelvalue = scale * pixelvalue
  optional float scale = 1 [default = 1];
  // Specify if we want to randomly mirror data.
  // Whether to mirror the image
  optional bool mirror = 2 [default = false];
  // Specify if we would like to randomly crop an image.
  // Size of the random crop
  optional uint32 crop_size = 3 [default = 0];
  // mean_file and mean_value cannot be specified at the same time
  // Path to the mean file
  optional string mean_file = 4;
  // if specified can be repeated once (would subtract it from all the channels)
  // or can be repeated the same number of times as channels
  // (would subtract them from the corresponding channel)
  // Mean values can be given instead of a mean file
  repeated float mean_value = 5;
  // Force the decoded image to have 3 color channels.
  // Treat the data as three-channel (color)
  optional bool force_color = 6 [default = false];
  // Force the decoded image to have 1 color channels.
  // Treat the data as single-channel (grayscale)
  optional bool force_gray = 7 [default = false];
}
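The order of operations described in these comments (subtract the mean first, then scale, with optional cropping and mirroring) can be sketched as follows. This is only an illustration on a single-channel image stored as nested lists, not Caffe's implementation; the function name transform and its train flag are assumptions made for the example. It assumes the crop offset is random in the training phase and centered otherwise, and that mirroring is a coin flip whenever mirror is enabled.

```python
import random


def transform(image, mean, scale=1.0, mirror=False, crop_size=0,
              train=True, rng=random):
    """Apply crop, mean subtraction, scaling and mirroring to one channel.

    image and mean are 2D lists of the same shape.
    """
    height, width = len(image), len(image[0])
    if crop_size:
        if train:
            # Training: random crop position.
            h_off = rng.randrange(height - crop_size + 1)
            w_off = rng.randrange(width - crop_size + 1)
        else:
            # Testing: center crop.
            h_off = (height - crop_size) // 2
            w_off = (width - crop_size) // 2
        out_h = out_w = crop_size
    else:
        h_off = w_off = 0
        out_h, out_w = height, width

    flip = mirror and rng.random() < 0.5
    out = []
    for h in range(out_h):
        # Mean subtraction happens before scaling.
        row = [(image[h + h_off][w + w_off] - mean[h + h_off][w + w_off]) * scale
               for w in range(out_w)]
        out.append(row[::-1] if flip else row)
    return out


if __name__ == "__main__":
    img = [[4 * r + c for c in range(4)] for r in range(4)]
    ones = [[1] * 4 for _ in range(4)]
    print(transform(img, ones, scale=0.5, crop_size=2, train=False))
```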

(2) Definition of data_param, the data-source parameters of the data input layer

message DataParameter {
  enum DB {
    // Database type: LEVELDB or LMDB
    LEVELDB = 0;
    LMDB = 1;
  }
  // Specify the data source.
  // Path to the database
  optional string source = 1;
  // Specify the batch size.
  // Batch size
  optional uint32 batch_size = 4;
  // The rand_skip variable is for the data layer to skip a few data points
  // to avoid all asynchronous sgd clients to start at the same point. The skip
  // point would be set as rand_skip * rand(0,1). Note that rand_skip should not
  // be larger than the number of keys in the database.
  // DEPRECATED. Each solver accesses a different subset of the database.
  // Randomly skip some of the first rand_skip records: the code draws a random
  // number in [0, rand_skip-1] and skips that many records
  optional uint32 rand_skip = 7 [default = 0];
  // Which kind of database is used as the backend
  optional DB backend = 8 [default = LEVELDB];
  // DEPRECATED. See TransformationParameter. For data pre-processing, we can do
  // simple scaling and subtracting the data mean, if provided. Note that the
  // mean subtraction is always carried out before scaling.
  // These fields are obsolete and should be set in TransformationParameter,
  // whose definition is given in section (1)
  optional float scale = 2 [default = 1];
  optional string mean_file = 3;
  // DEPRECATED. See TransformationParameter. Specify if we would like to randomly
  // crop an image. (This field is obsolete.)
  optional uint32 crop_size = 5 [default = 0];
  // DEPRECATED. See TransformationParameter. Specify if we want to randomly mirror
  // data. (This field is obsolete.)
  optional bool mirror = 6 [default = false];
  // Force the encoded image to have 3 color channels
  // Treat the stored images as color images
  optional bool force_encoded_color = 9 [default = false];
  // Prefetch queue (Number of batches to prefetch to host memory, increase if
  // data access bandwidth varies).
  // Number of batches in the prefetch queue
  optional uint32 prefetch = 10 [default = 4];
}
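As a small illustration of the rand_skip comment, the sketch below draws the number of records to skip before reading starts. The function name initial_skip is hypothetical, and the range [0, rand_skip-1] follows the description in the comment above rather than a check against any particular Caffe version.

```python
import random


def initial_skip(rand_skip, rng=random):
    """Number of records to skip before the first batch is read."""
    if rand_skip == 0:
        # Default: start at the first record.
        return 0
    # Uniform integer in [0, rand_skip - 1].
    return rng.randrange(rand_skip)


if __name__ == "__main__":
    print(initial_skip(0))
    print(initial_skip(100, rng=random.Random(42)))
```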

(3) Learning-related parameters of the convolution layer

First, the param field of the convolution layer. It is actually defined in LayerParameter, so every layer has it; it is a general-purpose field. LayerParameter holds the learning-related settings as well as the loss weight.

// LayerParameter next available layer-specific ID: 139 (last added: tile_param)
message LayerParameter {
  optional string name = 1; // the layer name
  optional string type = 2; // the layer type
  repeated string bottom = 3; // the name of each bottom blob
  repeated string top = 4; // the name of each top blob
  // The train / test phase for computation.
  optional Phase phase = 10;
  // The amount of weight to assign each top blob in the objective.
  // Each layer assigns a default value, usually of either 0 or 1,
  // to each top blob.
  repeated float loss_weight = 5;
  // Specifies training parameters (multipliers on global learning constants,
  // and the name and other settings used for weight sharing).
  repeated ParamSpec param = 6; // this is the field in question
  // (the excerpt ends here; the remaining fields are omitted)
}

The detailed definition of ParamSpec follows. It mainly consists of a name, a dimension-check mode, a learning-rate multiplier (default 1), and a weight-decay multiplier (default 1, meaning the global weight decay is applied unchanged; 0 switches decay off for that parameter).

message ParamSpec {
  // The names of the parameter blobs -- useful for sharing parameters among
  // layers, but never required otherwise. To share a parameter between two
  // layers, give it a (non-empty) name.
  optional string name = 1;
  // Whether to require shared weights to have the same shape, or just the same
  // count -- defaults to STRICT if unspecified.
  optional DimCheckMode share_mode = 2;
  enum DimCheckMode {
    // STRICT (default) requires that num, channels, height, width each match.
    STRICT = 0;
    // PERMISSIVE requires only the count (num*channels*height*width) to match.
    PERMISSIVE = 1;
  }
  // The multiplier on the global learning rate for this parameter.
  optional float lr_mult = 3 [default = 1.0];
  // The multiplier on the global weight decay for this parameter.
  optional float decay_mult = 4 [default = 1.0];
}
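Because lr_mult and decay_mult are multipliers on the solver's global values, the rate actually applied to a parameter blob is a simple product. A minimal sketch, in which the function name effective_rates and the solver values base_lr = 0.01 and weight_decay = 0.0005 are assumptions chosen only for illustration:

```python
def effective_rates(base_lr, weight_decay, lr_mult=1.0, decay_mult=1.0):
    """Per-parameter learning rate and weight decay after the multipliers."""
    return base_lr * lr_mult, weight_decay * decay_mult


if __name__ == "__main__":
    base_lr, weight_decay = 0.01, 0.0005  # assumed solver settings
    # First param block of a layer (weights): lr_mult 1, decay_mult 1.
    print("weights:", effective_rates(base_lr, weight_decay, 1, 1))
    # Second param block (biases): lr_mult 2, decay_mult 0.
    print("biases: ", effective_rates(base_lr, weight_decay, 2, 0))
```

With the values used in the AlexNet configuration, the biases therefore learn at twice the rate of the weights and are not decayed.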

(4) Convolution-related parameters of the convolution layer

Next come the parameters that describe the convolution itself, namely the convolution_param defined in a convolution layer. Its definition is:

message ConvolutionParameter {
  optional uint32 num_output = 1; // The number of outputs for the layer
  optional bool bias_term = 2 [default = true]; // whether to have bias terms
  // Pad, kernel size, and stride are all given as a single value for equal
  // dimensions in all spatial dimensions, or once per spatial dimension.
  // Padding size
  repeated uint32 pad = 3; // The padding size; defaults to 0
  // Kernel size
  repeated uint32 kernel_size = 4; // The kernel size
  // Stride
  repeated uint32 stride = 6; // The stride; defaults to 1
  // For 2D convolution only, the *_h and *_w versions may also be used to
  // specify both spatial dimensions.
  // For 2D convolution, pad, kernel and stride may differ between height and width
  optional uint32 pad_h = 9 [default = 0]; // The padding height (2D only)
  optional uint32 pad_w = 10 [default = 0]; // The padding width (2D only)
  optional uint32 kernel_h = 11; // The kernel height (2D only)
  optional uint32 kernel_w = 12; // The kernel width (2D only)
  optional uint32 stride_h = 13; // The stride height (2D only)
  optional uint32 stride_w = 14; // The stride width (2D only)
  // Number of groups the input and output channels are split into
  optional uint32 group = 5 [default = 1]; // The group size for group conv
  // These initialize the weights and the biases
  optional FillerParameter weight_filler = 7; // The filler for the weight
  optional FillerParameter bias_filler = 8; // The filler for the bias
  enum Engine {
    DEFAULT = 0;
    CAFFE = 1;
    CUDNN = 2;
  }
  // Which implementation to use: Caffe's own or cuDNN
  optional Engine engine = 15 [default = DEFAULT];
  // The axis to interpret as "channels" when performing convolution.
  // Preceding dimensions are treated as independent inputs;
  // succeeding dimensions are treated as "spatial".
  // With (N, C, H, W) inputs, and axis == 1 (the default), we perform
  // N independent 2D convolutions, sliding C-channel (or (C/g)-channels, for
  // groups g>1) filters across the spatial axes (H, W) of the input.
  // With (N, C, D, H, W) inputs, and axis == 1, we perform
  // N independent 3D convolutions, sliding (C/g)-channels
  // filters across the spatial axes (D, H, W) of the input.
  // Index of the channel axis. With the default of 1, (N, C, H, W) data
  // gives N independent 2D convolutions, and (N, C, D, H, W) data gives
  // N independent 3D convolutions.
  optional int32 axis = 16 [default = 1];
  // Whether to force use of the general ND convolution, even if a specific
  // implementation for blobs of the appropriate number of spatial dimensions
  // is available. (Currently, there is only a 2D-specific convolution
  // implementation; for input blobs with num_axes != 2, this option is
  // ignored and the ND implementation will be used.)
  // Force the generic N-dimensional convolution;
  // it is used anyway whenever num_axes != 2
  optional bool force_nd_im2col = 17 [default = false];
}
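The effect of group is easiest to see in the number of weights. As the "(C/g)-channels" remark in the axis comment indicates, each filter only sees C/g input channels, so the weight count shrinks by a factor of g. A minimal sketch with a hypothetical helper conv_param_count, assuming a 2D convolution:

```python
def conv_param_count(in_channels, num_output, kernel_h, kernel_w,
                     group=1, bias_term=True):
    """Return (number of weights, number of biases) of a 2D convolution."""
    if in_channels % group or num_output % group:
        raise ValueError("channel counts must be divisible by group")
    # Each of the num_output filters spans only in_channels / group channels.
    weights = num_output * (in_channels // group) * kernel_h * kernel_w
    biases = num_output if bias_term else 0
    return weights, biases


if __name__ == "__main__":
    # conv1: 3 input channels, 96 filters of 11 x 11, group left at 1.
    print("conv1:", conv_param_count(3, 96, 11, 11))
    # conv2: 96 input channels, 256 filters of 5 x 5, group: 2.
    print("conv2:", conv_param_count(96, 256, 5, 5, group=2))
    # The same layer without grouping has twice as many weights.
    print("conv2 without groups:", conv_param_count(96, 256, 5, 5))
```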

(5) Initialization parameters of the convolution layer

The parameters used to initialize a convolution layer are:

message FillerParameter {
  // The filler type.
  // Initialization type
  optional string type = 1 [default = 'constant'];
  // Needed for constant initialization
  optional float value = 2 [default = 0]; // the value in constant filler
  // Uniform initialization needs min and max
  optional float min = 3 [default = 0]; // the min value in uniform filler
  optional float max = 4 [default = 1]; // the max value in uniform filler
  // Gaussian initialization needs mean and std
  optional float mean = 5 [default = 0]; // the mean value in Gaussian filler
  optional float std = 6 [default = 1]; // the std value in Gaussian filler
  // The expected number of non-zero output weights for a given input in
  // Gaussian filler -- the default -1 means don't perform sparsification.
  // Whether sparsity is wanted
  optional int32 sparse = 7 [default = -1];
  // Normalize the filler variance by fan_in, fan_out, or their average.
  // Applies to 'xavier' and 'msra' fillers.
  // For the xavier and msra weight initializations, choose whether to
  // normalize by fan-in, fan-out, or the average of the two
  enum VarianceNorm {
    FAN_IN = 0;
    FAN_OUT = 1;
    AVERAGE = 2;
  }
  optional VarianceNorm variance_norm = 8 [default = FAN_IN];
}
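To make the filler types concrete, here is a minimal sketch that draws initial values the way the fields above suggest. It is an illustration, not Caffe's code: the function fill and its argument names low, high and fan_in are made up (low and high stand in for min and max), and the 'xavier' branch assumes the commonly described rule of sampling uniformly from [-sqrt(3/fan_in), sqrt(3/fan_in)] with the default FAN_IN normalization.

```python
import math
import random


def fill(count, filler_type="constant", value=0.0, low=0.0, high=1.0,
         mean=0.0, std=1.0, fan_in=None, rng=random):
    """Return a list of `count` initial values for the given filler type."""
    if filler_type == "constant":
        return [value] * count
    if filler_type == "uniform":
        return [rng.uniform(low, high) for _ in range(count)]
    if filler_type == "gaussian":
        return [rng.gauss(mean, std) for _ in range(count)]
    if filler_type == "xavier":
        if not fan_in:
            raise ValueError("xavier needs fan_in")
        bound = math.sqrt(3.0 / fan_in)
        return [rng.uniform(-bound, bound) for _ in range(count)]
    raise ValueError(f"unsupported filler type: {filler_type}")


if __name__ == "__main__":
    rng = random.Random(0)
    # The fillers conv2 uses in the AlexNet configuration.
    print(fill(3, "gaussian", std=0.01, rng=rng))
    print(fill(3, "constant", value=0.1))
```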

(6) Definition of lrn_param, the parameters of the local response normalization layer

(In practice this layer has turned out to be of little use and is rarely used, so I will not explain it.)

// Message that stores parameters used by LRNLayer
message LRNParameter {
  optional uint32 local_size = 1 [default = 5];
  optional float alpha = 2 [default = 1.];
  optional float beta = 3 [default = 0.75];
  enum NormRegion {
    ACROSS_CHANNELS = 0;
    WITHIN_CHANNEL = 1;
  }
  optional NormRegion norm_region = 4 [default = ACROSS_CHANNELS];
  optional float k = 5 [default = 1.];
}
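For readers who still want to know what local_size, alpha, beta and k control, the following sketch normalizes the values at one spatial position across channels. It assumes the ACROSS_CHANNELS formula out_i = a_i / (k + alpha / local_size * sum of a_j^2 over the window)^beta, with the window centered on channel i and clipped at the edges; the function name lrn_across_channels is made up for this example.

```python
def lrn_across_channels(values, local_size=5, alpha=1.0, beta=0.75, k=1.0):
    """Normalize one position's channel values by their neighbours' energy."""
    half = local_size // 2
    out = []
    for i, v in enumerate(values):
        lo = max(0, i - half)                 # clip the window at the edges
        hi = min(len(values), i + half + 1)
        sq_sum = sum(x * x for x in values[lo:hi])
        scale = k + (alpha / local_size) * sq_sum
        out.append(v / scale ** beta)
    return out


if __name__ == "__main__":
    # Small made-up example: three channels, window of three.
    print(lrn_across_channels([1.0, 2.0, 3.0], local_size=3, alpha=3.0, beta=1.0))
```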

(7) Fully connected layer

Caffe also calls this the inner product layer. It likewise has learning-related parameters and initialization parameters: param and inner_product_param respectively. The definition of inner_product_param is given below. It contains weight_filler and bias_filler, both of type FillerParameter, and also defines axis, which defaults to 1.

message InnerProductParameter {
  optional uint32 num_output = 1; // The number of outputs for the layer
  optional bool bias_term = 2 [default = true]; // whether to have bias terms
  optional FillerParameter weight_filler = 3; // The filler for the weight
  optional FillerParameter bias_filler = 4; // The filler for the bias
  // The first axis to be lumped into a single inner product computation;
  // all preceding axes are retained in the output.
  // May be negative to index from the end (e.g., -1 for the last axis).
  optional int32 axis = 5 [default = 1];
}
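The axis comment can be read as a rule for shapes: every axis from axis onward is flattened into one vector, and the axes before it are kept. A minimal sketch of that rule, with a hypothetical helper inner_product_shapes and the assumption that the weight matrix is stored as (num_output, flattened input size):

```python
from math import prod  # Python 3.8+


def inner_product_shapes(bottom_shape, num_output, axis=1):
    """Return (top shape, weight shape) of an inner product layer."""
    axis = axis % len(bottom_shape)       # allow negative axis, e.g. -1
    k = prod(bottom_shape[axis:])         # axes lumped into one inner product
    top_shape = tuple(bottom_shape[:axis]) + (num_output,)
    weight_shape = (num_output, k)
    return top_shape, weight_shape


if __name__ == "__main__":
    # fc6 on a batch of 256 pool5 outputs of shape 256 x 6 x 6.
    top, weight = inner_product_shapes((256, 256, 6, 6), num_output=4096)
    print(top, weight, weight[0] * weight[1])
```

For fc6 this gives a 4096 x 9216 weight matrix, i.e. 37,748,736 weights, which is why the fully connected layers dominate the parameter count of AlexNet.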

(8) Parameters of the pooling layer

The definition of pooling_param is:

message PoolingParameter {
  enum PoolMethod {
    // The available pooling methods
    MAX = 0;
    AVE = 1;
    STOCHASTIC = 2;
  }
  optional PoolMethod pool = 1 [default = MAX]; // The pooling method
  // Pad, kernel size, and stride are all given as a single value for equal
  // dimensions in height and width or as Y, X pairs.
  // Using pad gives a square padding; using pad_h and pad_w gives a
  // rectangular one. The same holds for kernel_size and stride.
  optional uint32 pad = 4 [default = 0]; // The padding size (equal in Y, X)
  optional uint32 pad_h = 9 [default = 0]; // The padding height
  optional uint32 pad_w = 10 [default = 0]; // The padding width
  optional uint32 kernel_size = 2; // The kernel size (square)
  optional uint32 kernel_h = 5; // The kernel height
  optional uint32 kernel_w = 6; // The kernel width
  optional uint32 stride = 3 [default = 1]; // The stride (equal in Y, X)
  optional uint32 stride_h = 7; // The stride height
  optional uint32 stride_w = 8; // The stride width
  enum Engine {
    DEFAULT = 0;
    CAFFE = 1;
    CUDNN = 2;
  }
  optional Engine engine = 11 [default = DEFAULT];
  // If global_pooling then it will pool over the size of the bottom by doing
  // kernel_h = bottom->height and kernel_w = bottom->width
  optional bool global_pooling = 12 [default = false];
}
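As an illustration of kernel_size, stride and global_pooling, here is a minimal MAX pooling sketch on a single channel stored as nested lists. It is not Caffe's implementation: the function max_pool_2d is made up, padding is left out, and it assumes the output size is rounded up with the last window clipped to the image, which in this simplified form requires stride <= kernel_size.

```python
import math


def max_pool_2d(image, kernel_size=1, stride=1, global_pooling=False):
    """MAX pooling without padding on a 2D list."""
    height, width = len(image), len(image[0])
    if global_pooling:
        # Pool over the whole input: kernel_h = height, kernel_w = width.
        kernel_h, kernel_w = height, width
    else:
        kernel_h = kernel_w = kernel_size
    if stride > min(kernel_h, kernel_w):
        raise ValueError("this sketch assumes stride <= kernel size")
    out_h = math.ceil((height - kernel_h) / stride) + 1
    out_w = math.ceil((width - kernel_w) / stride) + 1
    out = []
    for ph in range(out_h):
        row = []
        for pw in range(out_w):
            h0, w0 = ph * stride, pw * stride
            h1 = min(h0 + kernel_h, height)   # clip the window to the image
            w1 = min(w0 + kernel_w, width)
            row.append(max(image[h][w]
                           for h in range(h0, h1) for w in range(w0, w1)))
        out.append(row)
    return out


if __name__ == "__main__":
    img = [[4 * r + c for c in range(4)] for r in range(4)]
    print(max_pool_2d(img, kernel_size=2, stride=2))
    print(max_pool_2d(img, global_pooling=True))
```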

(9) Parameters of the dropout layer

The definition of dropout_param is:

message DropoutParameter {
  optional float dropout_ratio = 1 [default = 0.5]; // dropout ratio
}

There is only one parameter: the probability with which each value is dropped.
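What dropout_ratio does to the activations can be sketched in a few lines. The sketch assumes the usual "inverted dropout" convention, where surviving values are scaled by 1 / (1 - dropout_ratio) during training and the layer passes its input through unchanged at test time; the function name dropout and its train flag are made up for the example.

```python
import random


def dropout(values, dropout_ratio=0.5, train=True, rng=random):
    """Drop each value with probability dropout_ratio during training."""
    if not train:
        # Test phase: the input passes through unchanged.
        return list(values)
    scale = 1.0 / (1.0 - dropout_ratio)
    return [0.0 if rng.random() < dropout_ratio else v * scale
            for v in values]


if __name__ == "__main__":
    rng = random.Random(0)
    print(dropout([1.0, 2.0, 3.0, 4.0], dropout_ratio=0.5, rng=rng))
    print(dropout([1.0, 2.0, 3.0, 4.0], train=False))
```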

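(10) Summary

The meaning of each parameter can be looked up in caffe.proto. When you run into a parameter you do not understand, or are thinking about using a particular parameter, check there whether the layer you are using actually defines it. Also note that the ReLU layers in this network are declared without any parameters.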
