【10】Caffe Learning Series: Parsing the Command Line

阿新 • Published: 2018-11-09

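Caffe can be run through three interfaces: the C++ interface (the command line), the Python interface, and the MATLAB interface. This article covers the command line first; the other two interfaces will be covered in later posts. In most cases we call Caffe through the Python interface, but the C++ command-line interface is still worth understanding. One advantage of the command line is that it supports running on multiple GPUs.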

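Caffe's C++ main program (caffe.cpp) lives in the tools folder under the root directory, along with other utility sources such as convert_imageset.cpp and compute_image_mean.cpp. After compilation, each of these becomes an executable in the ./build/tools/ folder, so every caffe command must be prefixed with ./build/tools/.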

For example:

# ./build/tools/caffe train --solver=examples/mnist/lenet_solver.prototxt

The command-line format of the caffe program is:

caffe <command> <args>

<command> is one of the following four:

train
test
device_query
time

Their functions are:

train: train or fine-tune a model

test: test a model

device_query: show GPU information

time: show the program's execution time

The <args> options include:

-solver
-gpu
-snapshot
-weights
-iterations
-model
-sighup_effect
-sigint_effect

Note the leading - character. Their functions are:

-solver: required for train. A protocol buffer text file that holds the solver configuration. For example:

# ./build/tools/caffe train -solver examples/mnist/lenet_solver.prototxt

-gpu: optional. Selects which GPU to run on, by GPU id; '-gpu all' runs on all GPUs. Ids start at 0, so to run on the second GPU:

# ./build/tools/caffe train -solver examples/mnist/lenet_solver.prototxt -gpu 1

-snapshot: optional. Resumes training from a snapshot. Snapshots are configured in the solver file and saved as solverstate files. For example:

# ./build/tools/caffe train -solver examples/mnist/lenet_solver.prototxt -snapshot examples/mnist/lenet_iter_5000.solverstate
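When several snapshots have accumulated, it can be handy to pick the most recent one automatically. The following is only a sketch: latest_solverstate is a hypothetical helper name, and it assumes snapshot files follow the <prefix>_iter_<N>.solverstate pattern seen in the example above.

```sh
# Hypothetical helper: print the .solverstate file with the highest
# iteration number for a given snapshot prefix (e.g. examples/mnist/lenet).
# Returns non-zero if no matching file exists.
latest_solverstate() {
    ls_prefix=$1
    ls_best_iter=-1
    ls_best_file=
    for ls_f in "${ls_prefix}"_iter_*.solverstate; do
        [ -e "$ls_f" ] || continue              # the glob matched nothing
        ls_iter=${ls_f##*_iter_}                # drop everything up to "_iter_"
        ls_iter=${ls_iter%.solverstate}         # drop the extension
        case $ls_iter in
            ''|*[!0-9]*) continue ;;            # skip non-numeric suffixes
        esac
        if [ "$ls_iter" -gt "$ls_best_iter" ]; then
            ls_best_iter=$ls_iter
            ls_best_file=$ls_f
        fi
    done
    if [ -n "$ls_best_file" ]; then
        printf '%s\n' "$ls_best_file"
    else
        return 1
    fi
}

# Usage sketch (not executed here):
#   state=$(latest_solverstate examples/mnist/lenet) &&
#   ./build/tools/caffe train -solver examples/mnist/lenet_solver.prototxt -snapshot "$state"
```

Comparing the iteration numbers numerically matters here, because a plain alphabetical sort would place lenet_iter_5000 after lenet_iter_10000.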

-weights: optional. Fine-tunes a model starting from pretrained weights. It requires a caffemodel file and cannot be used together with -snapshot. For example:

# ./build/tools/caffe train -solver examples/finetuning_on_flickr_style/solver.prototxt -weights models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel
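Since -snapshot and -weights are mutually exclusive, a small wrapper can catch the mistake before caffe is even started. This is a minimal sketch, not part of Caffe: caffe_train_cmd is a hypothetical name, and the function only prints the command line instead of running it.

```sh
# Hypothetical wrapper: print a `caffe train` command line, refusing the
# combination of a snapshot (resume) and pretrained weights (fine-tune).
# Arguments: <solver> [snapshot] [weights]; pass "" to skip one.
caffe_train_cmd() {
    ct_solver=${1:-}
    ct_snapshot=${2:-}
    ct_weights=${3:-}
    if [ -z "$ct_solver" ]; then
        echo "error: a solver file is required" >&2
        return 1
    fi
    if [ -n "$ct_snapshot" ] && [ -n "$ct_weights" ]; then
        echo "error: pass -snapshot or -weights, not both" >&2
        return 1
    fi
    ct_cmd="./build/tools/caffe train -solver $ct_solver"
    if [ -n "$ct_snapshot" ]; then
        ct_cmd="$ct_cmd -snapshot $ct_snapshot"
    fi
    if [ -n "$ct_weights" ]; then
        ct_cmd="$ct_cmd -weights $ct_weights"
    fi
    printf '%s\n' "$ct_cmd"
}

# Usage sketch:
#   caffe_train_cmd examples/mnist/lenet_solver.prototxt "" some_model.caffemodel
```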

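-iterations: optional. The number of iterations to run, 50 by default. It applies to the test and time commands; for train, the number of iterations comes from max_iter in the solver file.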

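-model: optional. The model definition, given as a protocol buffer text file. For train, the model can instead be specified in the solver configuration file.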

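-sighup_effect: optional. Sets the action taken when the program receives a hangup signal (SIGHUP). It can be snapshot, stop or none; the default is snapshot.

-sigint_effect: optional. Sets the action taken when the program receives a keyboard interrupt (Ctrl+C). It can be snapshot, stop or none; the default is stop.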
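With the default -sighup_effect of snapshot, sending SIGHUP to a running training process should make it save a snapshot without stopping. A minimal sketch of how that signal can be sent from another terminal; request_snapshot is a hypothetical helper name, and the pgrep lookup in the usage comment is an assumption about how the process id is found.

```sh
# Hypothetical helper: send SIGHUP to the process with the given pid.
# For a caffe train process started with the default -sighup_effect,
# this is expected to trigger a snapshot.
request_snapshot() {
    rs_pid=${1:-}
    if [ -z "$rs_pid" ]; then
        echo "usage: request_snapshot <pid>" >&2
        return 1
    fi
    kill -HUP "$rs_pid"
}

# Usage sketch (not executed here; assumes pgrep is available):
#   request_snapshot "$(pgrep -n -f 'caffe train')"
```

Pressing Ctrl+C in the training terminal sends SIGINT instead, which stops training unless -sigint_effect says otherwise.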

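The examples above all used train. Now let's look at the other three <command> values:

test is used in the testing phase to output the final results; in the model configuration file we can set whether to output accuracy or loss. To evaluate an already trained model on the validation set, we can write: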

# ./build/tools/caffe test -model examples/mnist/lenet_train_test.prototxt -weights examples/mnist/lenet_iter_10000.caffemodel -gpu 0 -iterations 100

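This example is fairly long: besides test, it uses the four options -model, -weights, -gpu and -iterations. It loads the trained weights (-weights) into the test model (-model) and runs 100 test iterations (-iterations) on the GPU with id 0 (-gpu).

time prints the program's execution time to the screen. For example: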

# ./build/tools/caffe time -model examples/mnist/lenet_train_test.prototxt -iterations 10

This example prints the time the lenet model takes for 10 iterations, including the forward and backward time of each iteration and the average forward and backward time of each layer.

# ./build/tools/caffe time -model examples/mnist/lenet_train_test.prototxt -gpu 0

This example prints the time the lenet model takes for 50 iterations (the default) on the GPU.

# ./build/tools/caffe time -model examples/mnist/lenet_train_test.prototxt -weights examples/mnist/lenet_iter_10000.caffemodel -gpu 0 -iterations 10

This times 10 iterations of the lenet model with the given weights on the first GPU.

device_query is used to diagnose GPU information.

# ./build/tools/caffe device_query -gpu 0

Finally, let's look at two examples that involve multiple GPUs:

# ./build/tools/caffe train -solver examples/mnist/lenet_solver.prototxt -gpu 0,1

# ./build/tools/caffe train -solver examples/mnist/lenet_solver.prototxt -gpu all

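If multi-GPU training fails because Caffe was built without the NCCL library, see the article "caffe缺少NCCL庫導致不能多GPU訓練問題(改makefile版)" (a Makefile-based fix).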

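These two examples run the computation in parallel on two or more GPUs, which is much faster. If you have only one GPU, or none, do not add the -gpu option; adding it only makes things slower.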
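The advice above can be turned into a small helper that only emits -gpu when more than one GPU is present. This is a sketch under assumptions: gpu_args and count_gpus are hypothetical names, and count_gpus assumes that nvidia-smi is installed and that `nvidia-smi -L` prints one line starting with "GPU " per device.

```sh
# Hypothetical helper following the article's advice: print "-gpu all"
# only when at least two GPUs are available, otherwise print nothing.
gpu_args() {
    ga_count=${1:-0}
    if [ "$ga_count" -ge 2 ]; then
        printf '%s\n' "-gpu all"
    fi
}

# Assumption: `nvidia-smi -L` lists one "GPU <id>: ..." line per device.
# Prints 0 when nvidia-smi is missing or reports no devices.
count_gpus() {
    if command -v nvidia-smi >/dev/null 2>&1; then
        nvidia-smi -L 2>/dev/null | grep -c '^GPU ' || true
    else
        echo 0
    fi
}

# Usage sketch (not executed here; $(...) is left unquoted on purpose so
# that "-gpu all" splits into two arguments, or disappears when empty):
#   ./build/tools/caffe train -solver examples/mnist/lenet_solver.prototxt $(gpu_args "$(count_gpus)")
```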

Finally, Linux has its own time command, which can be combined with caffe. The final command for running the mnist example (on one GPU) is therefore:

$ time ./build/tools/caffe train -solver examples/mnist/lenet_solver.prototxt
