Training Caffe on Your Own Data and Calling the Model for Classification

I recently had an assignment that needed Caffe for some image classification. Having gotten used to TensorFlow, I was pretty lost at first. For the image set I used one from South China University of Technology.

As for installation, being lazy, I had an experienced classmate set it up for me. So much for insisting on doing everything myself.

Since I had previously installed tensorflow-gpu with CUDA 9.0, and Caffe currently seems to support at most CUDA 8.0, the build fails under 9.0. To avoid the hassle I just installed the CPU version.

Then I wanted to warm up with a simple classification task. The first tutorial blog I found turned out to be much like all the others, with roughly the same sequence of steps, but I still hit problems of my own, mainly around paths.

So pay close attention to where each script's paths are used in the workflow, and how they get joined with other paths.

First, probably due to Caffe version differences: most tutorials online put the executables under "/build/tools/", while mine live under "caffe\scripts\build\tools\Release". With that noted, let's walk through the workflow.

The files produced over the course of my training:

1. The train.txt, val.txt, and label.txt files. All my images went into data together; I initially generated the txt files with a Python script. The images are split into train and val folders, and those folder paths have to be accounted for in some later files; I'll explain why below.

ftw93.jpg 0
ftw94.jpg 0
ftw95.jpg 0
ftw96.jpg 0
ftw97.jpg 0
ftw98.jpg 0
ftw99.jpg 0
...
mtw1.jpg 1
mtw10.jpg 1
mtw100.jpg 1
mtw101.jpg 1
mtw102.jpg 1
mtw103.jpg 1
mtw104.jpg 1
mtw105.jpg 1
mtw106.jpg 1
mtw107.jpg 1

label.txt lists all the classes:

0 Western female
1 Asian female
2 Western male
3 Asian male

All my files are configured this way; the reason I avoid absolute paths here is exactly this path joining, which I'll get to in a moment.

(Updated 2018-12-06) The script below automatically generates the train and test set txt files for Caffe (training and test images sit in two separate folders):

#!/usr/bin/env bash
# bash (not plain sh) is required for the array below; Git Bash is fine
DATA=D:/caffe/examples/my_image
FILETYPE=jpg   # image extension of the samples to process
echo "Create train.txt..."
rm -rf $DATA/train.txt
array=("ftw" "fty" "mtw" "mty")    # one filename prefix per class
for i in 0 1 2 3
do
echo ${array[i]}
find $DATA/data/train -name "${array[i]}*.$FILETYPE" | cut -d '/' -f7 | sed "s/$/ $i/" >> $DATA/train.txt   # append "filename label" lines
done
echo "Create val.txt..."
rm -rf $DATA/val.txt
for i in 0 1 2 3   # cut -f7 keeps the 7th path component (the bare filename); adjust to your directory depth, e.g. -f6-7 to keep the folder name as well
do
find $DATA/data/test -name "${array[i]}*.$FILETYPE" | cut -d '/' -f7 | sed "s/$/ $i/" >> $DATA/val.txt
done
echo "All done"
read -r -n 1 -s   # wait for a keypress; "pause" is a cmd builtin, not a shell command
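One way to run it from Git Bash (assuming it's saved as create_filelist.sh, a name I've made up, in the my_image folder; Git Bash maps D:/ to /d/):

cd /d/caffe/examples/my_image
bash create_filelist.sh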

2. Generate the lmdb files. I also used the create_imagenet.sh file, from caffe\examples\imagenet. I started out with relative paths inside it and parts of it kept failing, so I switched everything to absolute paths. This is where TRAIN_DATA_ROOT and VAL_DATA_ROOT come in: they point at the training and test data directories, and they get joined with the paths inside train.txt and val.txt to form the complete image paths. The reason I don't put full paths in train.txt is that I launch the .sh files from Git Bash; if TRAIN_DATA_ROOT is set to /, it resolves to the directory of Git's own executable, so training kept failing. I'll skip most of the comments; they're easy to follow, and the post linked earlier has them too.
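A minimal sketch of how convert_imageset joins those roots with the txt entries (paths taken from my setup below):

# TRAIN_DATA_ROOT:              D:/caffe/examples/my_image/data/train/
# line in train.txt:            ftw93.jpg 0
# image convert_imageset opens: D:/caffe/examples/my_image/data/train/ftw93.jpg
# If train.txt held train/ftw93.jpg instead, TRAIN_DATA_ROOT would have to
# stop at .../data/ for the joined path to resolve.

The full create_imagenet.sh: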

#!/usr/bin/env sh
# Create the imagenet lmdb inputs
# N.B. set the path to the imagenet train + val data dirs
set -e

EXAMPLE=D:/caffe/examples/my_image
DATA=D:/caffe/examples/my_image/data/
TOOLS=D:/caffe/scripts/build/tools/Release

TRAIN_DATA_ROOT=D:/caffe/examples/my_image/data/train/
VAL_DATA_ROOT=D:/caffe/examples/my_image/data/test/

# Set RESIZE=true to resize the images to 256x256. Leave as false if images have
# already been resized using another tool.
RESIZE=true
if $RESIZE; then
  RESIZE_HEIGHT=32   # the stock script sets 256; I used 32 to get results faster
  RESIZE_WIDTH=32
else
  RESIZE_HEIGHT=0
  RESIZE_WIDTH=0
fi

if [ ! -d "$TRAIN_DATA_ROOT" ]; then
  echo "Error: TRAIN_DATA_ROOT is not a path to a directory: $TRAIN_DATA_ROOT"
  echo "Set the TRAIN_DATA_ROOT variable in create_imagenet.sh to the path" \
       "where the ImageNet training data is stored."
  exit 1
fi

if [ ! -d "$VAL_DATA_ROOT" ]; then
  echo "Error: VAL_DATA_ROOT is not a path to a directory: $VAL_DATA_ROOT"
  echo "Set the VAL_DATA_ROOT variable in create_imagenet.sh to the path" \
       "where the ImageNet validation data is stored."
  exit 1
fi

echo "Creating train lmdb..."

GLOG_logtostderr=1 $TOOLS/convert_imageset \
    --resize_height=$RESIZE_HEIGHT \
    --resize_width=$RESIZE_WIDTH \
    --shuffle \
    $TRAIN_DATA_ROOT \
    $DATA/train.txt \
    $EXAMPLE/ilsvrc12_train_lmdb   # output path for the generated lmdb


echo "Creating val lmdb..."

GLOG_logtostderr=1 $TOOLS/convert_imageset \
    --resize_height=$RESIZE_HEIGHT \
    --resize_width=$RESIZE_WIDTH \
    --shuffle \
    $VAL_DATA_ROOT \
    $DATA/val.txt \
    $EXAMPLE/ilsvrc12_val_lmdb    # output path for the generated lmdb

echo "Done."

3. Generate the mean file, again with absolute paths throughout; another blog post describes what it's used for. The link above only computes the mean file for train, and every tutorial I saw did the same, then fed it to both phases of the network, which struck me as odd. So I additionally computed one for test. The resulting file came out exactly the same size as the train mean file, and swapping it in changed nothing in the network. Puzzling.

(Note: a day later it dawned on me. In practice you only ever have training data and data to be predicted, so naturally only a train mean file exists. As for why the sizes match: it's a mean file, a per-pixel average, so its size depends only on the image dimensions, not on how many images went into it.)

EXAMPLE=D:/caffe/examples/my_image
DATA=D:/caffe/examples/my_image/data/
TOOLS=D:/caffe/scripts/build/tools/Release

$TOOLS/compute_image_mean $EXAMPLE/ilsvrc12_train_lmdb $EXAMPLE/imagenet_train_mean.binaryproto

$TOOLS/compute_image_mean $EXAMPLE/ilsvrc12_val_lmdb $EXAMPLE/imagenet_val_mean.binaryproto

echo "Done."

4. cifar10_quick_solver.prototxt and cifar10_quick_train_test.prototxt, both copied over from caffe\examples\cifar10.

cifar10_quick_train_test.prototxt; compare it against the original to see what changed (my comments mark the differences):

name: "CIFAR10_quick"
layer {
  name: "cifar"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  transform_param {
    mean_file: "D:/caffe/examples/my_image/imagenet_train_mean.binaryproto"  #均值檔案路徑
  }
  data_param {
    source: "D:/caffe/examples/my_image/ilsvrc12_train_lmdb"   # lmdb檔案路徑
    batch_size: 20   # 圖片數量比較少的話就不要設定太大了
    backend: LMDB   # 有兩種,生成的是lmdb,就選LMDB
  }
}
layer {
  name: "cifar"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TEST
  }
  transform_param {
    mean_file: "D:/caffe/examples/my_image/imagenet_train_mean.binaryproto"  # 注意也是train的均值檔案
  }
  data_param {
    source: "D:/caffe/examples/my_image/ilsvrc12_val_lmdb"  # lmdb資料夾
    batch_size: 20  # 測試時候batch_size
    backend: LMDB  # 同上
  }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 32
    pad: 2
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "gaussian"
      std: 0.0001
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "pool1"
  top: "pool1"
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "pool1"
  top: "conv2"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 32
    pad: 2
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "relu2"
  type: "ReLU"
  bottom: "conv2"
  top: "conv2"
}
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "conv2"
  top: "pool2"
  pooling_param {
    pool: AVE
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "conv3"
  type: "Convolution"
  bottom: "pool2"
  top: "conv3"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 64
    pad: 2
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "relu3"
  type: "ReLU"
  bottom: "conv3"
  top: "conv3"
}
layer {
  name: "pool3"
  type: "Pooling"
  bottom: "conv3"
  top: "pool3"
  pooling_param {
    pool: AVE
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "ip1"
  type: "InnerProduct"
  bottom: "pool3"
  top: "ip1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 64
    weight_filler {
      type: "gaussian"
      std: 0.1
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "ip2"
  type: "InnerProduct"
  bottom: "ip1"
  top: "ip2"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 4   # number of output classes
    weight_filler {
      type: "gaussian"
      std: 0.1
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "ip2"
  bottom: "label"
  top: "accuracy"
  include {
    phase: TEST
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "ip2"
  bottom: "label"
  top: "loss"
}

5. The cifar10_quick_solver.prototxt file:

# The train/test net protocol buffer definition
net: "D:/caffe/examples/my_image/cifar10_quick_train_test.prototxt"   # path to the network
# test_iter specifies how many forward passes each test pass carries out;
# test_iter x TEST batch_size should cover the whole validation set.
test_iter: 20
# Carry out testing every 10 training iterations.
test_interval: 10
# The base learning rate, momentum and the weight decay of the network.
base_lr: 0.001
momentum: 0.9
weight_decay: 0.004
# The learning rate policy
lr_policy: "fixed"
# Display every 100 iterations
display: 100
# The maximum number of iterations
max_iter: 4000
# snapshot intermediate results
snapshot: 1000
snapshot_prefix: "D:/caffe/examples/my_image/cifar10_quick"
# solver mode: CPU or GPU
solver_mode: CPU
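A worked check of the numbers above (the validation-set size is my inference from the two 20s):

# test_iter (20) x TEST batch_size (20) = 400 validation images per test pass
# test_interval: 10 -> one full test pass every 10 training iterations
# snapshot: 1000 with max_iter: 4000 -> snapshots at iterations 1000, 2000,
#   3000 and 4000, e.g. cifar10_quick_iter_4000.caffemodel.h5 (used in step 7)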

6. Start training: the train.sh file

D:/caffe/scripts/build/tools/Release/caffe train --solver=D:/caffe/examples/my_image/cifar10_quick_solver.prototxt

Then just run train.sh.
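If a run gets interrupted, the same caffe binary can resume from a saved snapshot via the --snapshot flag; a sketch using my paths (the .solverstate.h5 extension assumes the HDF5 snapshot format, which matches the .caffemodel.h5 my build produced):

D:/caffe/scripts/build/tools/Release/caffe train \
    --solver=D:/caffe/examples/my_image/cifar10_quick_solver.prototxt \
    --snapshot=D:/caffe/examples/my_image/cifar10_quick_iter_1000.solverstate.h5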

7. (Updated 2018-12-06) Classifying your own images. Note that the deploy.prototxt involved has to match the network trained above.

The deploy.prototxt I built (derived from the network above) and test.sh are as follows:

name: "CIFAR10_quick"
layer {
  name: "cifar"
  type: "Input"
  top: "data"
  input_param { shape: { dim: 10 dim: 3 dim: 32 dim: 32 } }  # the two 32s must match the input size you trained with (dim: 10 is just the batch size); otherwise you get a shape-mismatch error at inference
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  convolution_param {
    num_output: 32
    pad: 2
    kernel_size: 5
    stride: 1
  }
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "pool1"
  top: "pool1"
}

layer {
  name: "conv2"
  type: "Convolution"
  bottom: "pool1"
  top: "conv2"
  convolution_param {
    num_output: 32
    pad: 2
    kernel_size: 5
    stride: 1
  }
}
layer {
  name: "relu2"
  type: "ReLU"
  bottom: "conv2"
  top: "conv2"
}
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "conv2"
  top: "pool2"
  pooling_param {
    pool: AVE
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "conv3"
  type: "Convolution"
  bottom: "pool2"
  top: "conv3"
  convolution_param {
    num_output: 64
    pad: 2
    kernel_size: 5
    stride: 1
  }
}
layer {
  name: "relu3"
  type: "ReLU"
  bottom: "conv3"
  top: "conv3"
}
layer {
  name: "pool3"
  type: "Pooling"
  bottom: "conv3"
  top: "pool3"
  pooling_param {
    pool: AVE
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "ip1"
  type: "InnerProduct"
  bottom: "pool3"
  top: "ip1"
  inner_product_param {
    num_output: 64
  }
}
layer {
  name: "ip2"
  type: "InnerProduct"
  bottom: "ip1"
  top: "ip2"
  inner_product_param {
    num_output: 4  # number of labels configured
  }
}

layer {
  name: "prob"
  type: "Softmax"
  bottom: "ip2"
  top: "prob"
}
And test.sh:

#!/usr/bin/env sh
# Five arguments: deploy net, trained weights, mean file, label file, image to classify.
for i in 1 2 3 4
do
D:/caffe/scripts/build/examples/cpp_classification/Release/classification.exe \
    D:/caffe/examples/my_image/deploy.prototxt \
    D:/caffe/examples/my_image/cifar10_quick_iter_4000.caffemodel.h5 \
    D:/caffe/examples/my_image/imagenet_train_mean.binaryproto \
    D:/caffe/examples/my_image/label.txt \
    C:/Users/xxx/Desktop/$i.jpg
done
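For reference, the stock cpp_classification example prints the top predictions as score - "label" pairs, one line per class (capped at the top 5); with my four labels the output for one image should look roughly like this (scores purely illustrative):

---------- Prediction for C:/Users/xxx/Desktop/1.jpg ----------
0.6215 - "1 Asian female"
0.2103 - "0 Western female"
0.1124 - "3 Asian male"
0.0558 - "2 Western male"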

8. Since I started this post with only two classes and wrote the early steps very tersely, revising things bit by bit afterwards, some transitions may read roughly. I later retrained from scratch and kept the complete pipeline: the scripts, the network definitions, and the trained model, available for download.

9. Building my own project in VS2015 that calls the model for classification. I'm still exploring this step.