The Complete Caffe Model Training Process (Part 1): Scripts and Data Preparation

1. Create the project folder

The folder structure is as follows:

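|——project
    ├── create_imagenet.sh      # script that generates the LMDB files
    |——train_lmdb               # output LMDB files for the training set
        ├── data.mdb
        └── lock.mdb
    |——val_lmdb                 # output LMDB files for the validation set
        ├── data.mdb
        └── lock.mdb
    ├── models                  # output models
        ├── solver_iter_2576.caffemodel
        └── solver_iter_2576.solverstate
    ├── other                   # other backup files
    ├── solver.prototxt         # solver configuration file
    ├── train                   # training images
        ├── positivite          # images of class 1
        └── negative_eg         # images of class 2
    ├── train_caffenet.sh       # run this script to start training
    ├── train.txt               # list of training image paths and labels
    ├── train_val.prototxt      # Caffe network definition
    ├── val                     # validation images
    └── val.txt                 # list of validation image paths and labels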
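
The input side of this layout can be created in one go. The following is a minimal sketch, not part of the original post: `make_project_skeleton` is a hypothetical helper name, and the sketch assumes the folder names shown in the tree. It leaves out `train_lmdb` and `val_lmdb` on purpose, because `convert_imageset` creates its output directory itself and stops if that directory already exists.

```shell
#!/usr/bin/env sh
# Create the folders that hold inputs and outputs of the workflow.
# $1 = project root (hypothetical example: "$HOME/project")
make_project_skeleton() {
    root=$1
    # The LMDB directories are intentionally not created here:
    # convert_imageset makes them when it writes the databases.
    mkdir -p \
        "$root/models" \
        "$root/other" \
        "$root/train/positivite" \
        "$root/train/negative_eg" \
        "$root/val"
}
```

Calling `make_project_skeleton "$HOME/project"` and then copying the images into `train/positivite` and `train/negative_eg` gives the starting point for the next step.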

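2. Build the LMDB data source

First generate two text files, train.txt and val.txt. Each line holds an image path (relative to the data root) followed by its integer class label.

They look like this: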

train.txt

positivite/IMG_000001.jpg 1
positivite/IMG_000002.jpg 1
positivite/IMG_000003.jpg 1
positivite/IMG_000008.jpg 1
positivite/IMG_000010.jpg 1
positivite/IMG_000014.jpg 1
positivite/IMG_000016.jpg 1
positivite/IMG_000017.jpg 1
positivite/IMG_000018.jpg 1
positivite/IMG_000020.jpg 1
positivite/IMG_000022.jpg 1
positivite/IMG_000023.jpg 1
positivite/IMG_000026.jpg 1
positivite/IMG_000028.jpg 1
positivite/IMG_000029.jpg 1
positivite/IMG_000031.jpg 1
positivite/IMG_000032.jpg 1
positivite/IMG_000037.jpg 1
positivite/IMG_000039.jpg 1
positivite/IMG_000040.jpg 1
positivite/IMG_000042.jpg 1
positivite/IMG_000044.jpg 1
.....................

val.txt
positivite/IMG_000162.jpg 1
positivite/IMG_000164.jpg 1
positivite/IMG_000165.jpg 1
positivite/IMG_000167.jpg 1
positivite/IMG_000168.jpg 1
positivite/IMG_000170.jpg 1
positivite/IMG_000171.jpg 1
positivite/IMG_000174.jpg 1
positivite/IMG_000177.jpg 1
positivite/IMG_000179.jpg 1
positivite/IMG_000180.jpg 1
positivite/IMG_000184.jpg 1
positivite/IMG_000186.jpg 1
positivite/IMG_000188.jpg 1
positivite/IMG_000189.jpg 1
positivite/IMG_000194.jpg 1
positivite/IMG_000196.jpg 1
positivite/IMG_000199.jpg 1
positivite/IMG_000201.jpg 1
positivite/IMG_000202.jpg 1
positivite/IMG_000203.jpg 1
negative_eg/IMG_000180_3.jpg 0
negative_eg/IMG_000184_0.jpg 0
negative_eg/IMG_000184_1.jpg 0
negative_eg/IMG_000184_2.jpg 0
negative_eg/IMG_000184_3.jpg 0
negative_eg/IMG_000186_0.jpg 0
negative_eg/IMG_000186_1.jpg 0
........................
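
Writing these lists by hand gets tedious quickly. The sketch below shows one way to generate a list file from the two class folders. The function names `list_class` and `make_list` are made up for illustration, and the sketch assumes every image ends in `.jpg`. It does not decide which images go into train.txt and which into val.txt; the post does not say how that split was made.

```shell
#!/usr/bin/env sh
# Print "class_folder/file.jpg label" for every .jpg in one class folder.
# $1 = data root, $2 = class sub-folder, $3 = integer label
list_class() {
    root=$1; class=$2; label=$3
    for f in "$root/$class"/*.jpg; do
        [ -e "$f" ] || continue   # empty folder: the glob stays unexpanded, skip it
        printf '%s/%s %s\n' "$class" "${f##*/}" "$label"
    done
}

# Write a list file covering both class folders used in this post.
# $1 = data root, $2 = output list file
make_list() {
    {
        list_class "$1" positivite 1
        list_class "$1" negative_eg 0
    } > "$2"
}
```

For example, `make_list /home/ubuntu/hudie_detection_case/train train.txt` would list everything under `train/`. The paths are written relative to the data root because `convert_imageset` prepends the root folder to each entry.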

3. Edit create_imagenet.sh

The lines to change are the ones annotated with the comments added to the stock script:

#!/usr/bin/env sh
# Create the imagenet lmdb inputs
# N.B. set the path to the imagenet train + val data dirs
set -e
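# Project path (the LMDB output is written here)
EXAMPLE=/home/ubuntu/hudie_detection_case
# Data root (where train.txt and val.txt live)
DATA=/home/ubuntu/hudie_detection_case
# Absolute path to caffe/build/tools
TOOLS=/home/ubuntu/caffe/caffe/build/tools
# Root folders of the training and validation images.
# Both point at train/ here; set VAL_DATA_ROOT to .../val/ if the paths
# in val.txt are relative to the val folder instead.
TRAIN_DATA_ROOT=/home/ubuntu/hudie_detection_case/train/
VAL_DATA_ROOT=/home/ubuntu/hudie_detection_case/train/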

# Set RESIZE=true to resize the images to 256x256. Leave as false if images have
# already been resized using another tool.
# Resize every image to one fixed size if needed (227x227 here, matching crop_size in train_val.prototxt)
RESIZE=true
if $RESIZE; then
  RESIZE_HEIGHT=227
  RESIZE_WIDTH=227
else
  RESIZE_HEIGHT=0
  RESIZE_WIDTH=0
fi

if [ ! -d "$TRAIN_DATA_ROOT" ]; then
  echo "Error: TRAIN_DATA_ROOT is not a path to a directory: $TRAIN_DATA_ROOT"
  echo "Set the TRAIN_DATA_ROOT variable in create_imagenet.sh to the path" \
       "where the ImageNet training data is stored."
  exit 1
fi

if [ ! -d "$VAL_DATA_ROOT" ]; then
  echo "Error: VAL_DATA_ROOT is not a path to a directory: $VAL_DATA_ROOT"
  echo "Set the VAL_DATA_ROOT variable in create_imagenet.sh to the path" \
       "where the ImageNet validation data is stored."
  exit 1
fi

echo "Creating train lmdb..."

GLOG_logtostderr=1 $TOOLS/convert_imageset \
    --resize_height=$RESIZE_HEIGHT \
    --resize_width=$RESIZE_WIDTH \
    --shuffle \
    $TRAIN_DATA_ROOT \
    $DATA/train.txt \
    $EXAMPLE/ilsvrc12_train_lmdb

echo "Creating val lmdb..."

GLOG_logtostderr=1 $TOOLS/convert_imageset \
    --resize_height=$RESIZE_HEIGHT \
    --resize_width=$RESIZE_WIDTH \
    --shuffle \
    $VAL_DATA_ROOT \
    $DATA/val.txt \
    $EXAMPLE/ilsvrc12_val_lmdb

echo "Done."

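Once all of this is in place, run sudo sh ./create_imagenet.sh. If it finishes without errors, congratulations! If it fails, Caffe's dependencies may not be installed correctly; otherwise, recheck the paths and repeat the steps above. Note that convert_imageset will not overwrite an existing LMDB directory, so delete the output of a previous run before rerunning.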
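
A common reason for a failed rerun is that the output directories from an earlier attempt are still there. The helper below is a small sketch for clearing them. `remove_stale_lmdb` is a hypothetical name, and as a safety measure it only deletes a directory that actually contains a `data.mdb` file.

```shell
#!/usr/bin/env sh
# Delete an old LMDB output directory so convert_imageset can recreate it.
# Does nothing unless the directory exists and contains data.mdb.
# $1 = LMDB directory
remove_stale_lmdb() {
    dir=$1
    if [ -d "$dir" ] && [ -f "$dir/data.mdb" ]; then
        rm -rf -- "$dir"
        echo "removed stale LMDB: $dir"
    fi
}
```

With the variables from the script above, this could be called as `remove_stale_lmdb "$EXAMPLE/ilsvrc12_train_lmdb"` and `remove_stale_lmdb "$EXAMPLE/ilsvrc12_val_lmdb"` right before the two `convert_imageset` commands.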

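4. Define the network in train_val.prototxt

The AlexNet model is used here. The main things to change are the input paths (the source fields of the two data layers) and the number of classes, which is the num_output of the last fully connected layer, fc8, that feeds the softmax. The output path for the trained model is set in the solver file in the next step.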

name: "AlexNet"
layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  transform_param {
    mirror: true
    crop_size: 227
    #mean_file: "data/ilsvrc12/imagenet_mean.binaryproto"
  }
  data_param {
    source: "/home/ubuntu/hudie_detection_case/ilsvrc12_train_lmdb"  # change: training LMDB path
    batch_size: 2
    backend: LMDB
  }
}
layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TEST
  }
  transform_param {
    mirror: false
    crop_size: 227
    #mean_file: "data/ilsvrc12/imagenet_mean.binaryproto"
  }
  data_param {
    source: "/home/ubuntu/hudie_detection_case/ilsvrc12_val_lmdb"  # change: validation LMDB path
    batch_size: 2
    backend: LMDB
  }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 96
    kernel_size: 11
    stride: 4
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "conv1"
  top: "conv1"
}
layer {
  name: "norm1"
  type: "LRN"
  bottom: "conv1"
  top: "norm1"
  lrn_param {
    local_size: 5
    alpha: 0.0001
    beta: 0.75
  }
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "norm1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "pool1"
  top: "conv2"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 256
    pad: 2
    kernel_size: 5
    group: 2
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0.1
    }
  }
}
layer {
  name: "relu2"
  type: "ReLU"
  bottom: "conv2"
  top: "conv2"
}
layer {
  name: "norm2"
  type: "LRN"
  bottom: "conv2"
  top: "norm2"
  lrn_param {
    local_size: 5
    alpha: 0.0001
    beta: 0.75
  }
}
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "norm2"
  top: "pool2"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "conv3"
  type: "Convolution"
  bottom: "pool2"
  top: "conv3"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 384
    pad: 1
    kernel_size: 3
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "relu3"
  type: "ReLU"
  bottom: "conv3"
  top: "conv3"
}
layer {
  name: "conv4"
  type: "Convolution"
  bottom: "conv3"
  top: "conv4"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 384
    pad: 1
    kernel_size: 3
    group: 2
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0.1
    }
  }
}
layer {
  name: "relu4"
  type: "ReLU"
  bottom: "conv4"
  top: "conv4"
}
layer {
  name: "conv5"
  type: "Convolution"
  bottom: "conv4"
  top: "conv5"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 256
    pad: 1
    kernel_size: 3
    group: 2
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0.1
    }
  }
}
layer {
  name: "relu5"
  type: "ReLU"
  bottom: "conv5"
  top: "conv5"
}
layer {
  name: "pool5"
  type: "Pooling"
  bottom: "conv5"
  top: "pool5"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "fc6"
  type: "InnerProduct"
  bottom: "pool5"
  top: "fc6"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  inner_product_param {
    num_output: 4096
    weight_filler {
      type: "gaussian"
      std: 0.005
    }
    bias_filler {
      type: "constant"
      value: 0.1
    }
  }
}
layer {
  name: "relu6"
  type: "ReLU"
  bottom: "fc6"
  top: "fc6"
}
layer {
  name: "drop6"
  type: "Dropout"
  bottom: "fc6"
  top: "fc6"
  dropout_param {
    dropout_ratio: 0.5
  }
}
layer {
  name: "fc7"
  type: "InnerProduct"
  bottom: "fc6"
  top: "fc7"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  inner_product_param {
    num_output: 4096
    weight_filler {
      type: "gaussian"
      std: 0.005
    }
    bias_filler {
      type: "constant"
      value: 0.1
    }
  }
}
layer {
  name: "relu7"
  type: "ReLU"
  bottom: "fc7"
  top: "fc7"
}
layer {
  name: "drop7"
  type: "Dropout"
  bottom: "fc7"
  top: "fc7"
  dropout_param {
    dropout_ratio: 0.5
  }
}
layer {
  name: "fc8"
  type: "InnerProduct"
  bottom: "fc7"
  top: "fc8"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  inner_product_param {
    num_output: 2  # change: number of classes
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "fc8"
  bottom: "label"
  top: "accuracy"
  include {
    phase: TEST
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "fc8"
  bottom: "label"
  top: "loss"
}
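
Since `num_output` of `fc8` has to match the number of classes, it can help to count the distinct labels in the list file before training. This is a minimal sketch: `count_classes` is a hypothetical helper, and it assumes the label is the last whitespace-separated field on each line, as in the train.txt shown earlier.

```shell
#!/usr/bin/env sh
# Count the distinct labels in a list file ("path label" per line).
# $1 = list file such as train.txt
count_classes() {
    awk 'NF >= 2 { print $NF }' "$1" | sort -u | wc -l | tr -d ' '
}
```

For the two-class data in this post, `count_classes train.txt` should print 2. SoftmaxWithLoss also expects labels in the range 0 to num_output - 1, which the labels 0 and 1 used here satisfy.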

5. Then edit solver.prototxt

The main parameters are commented below:

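net: "/home/ubuntu/hudie_detection_case/train_val.prototxt"  # change: path to the network definition
test_iter: 1000
test_interval: 1000
# Base learning rate
base_lr: 0.001
lr_policy: "step"
gamma: 0.1
stepsize: 100000
# Print training information every 20 iterations
display: 20
max_iter: 450000
momentum: 0.9
weight_decay: 0.0005
# Save a snapshot every 10000 iterations. snapshot_prefix is a filename prefix:
# "models" yields models_iter_N.caffemodel; use "models/solver" to get the
# models/solver_iter_N.caffemodel files shown in the folder tree.
snapshot: 10000
snapshot_prefix: "models"
solver_mode: CPU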
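
One test pass runs test_iter batches, so with batch_size: 2 in the TEST data layer and test_iter: 1000, each pass evaluates 2000 images. If val.txt is shorter than that, images are evaluated more than once. The sketch below computes the smallest test_iter that covers the list exactly once. `suggest_test_iter` is a hypothetical helper, not a Caffe tool.

```shell
#!/usr/bin/env sh
# Smallest test_iter with test_iter * batch_size >= number of listed images.
# $1 = list file (one image per line), $2 = batch_size of the TEST data layer
suggest_test_iter() {
    n=$(awk 'NF { c++ } END { print c + 0 }' "$1")   # count non-empty lines
    echo $(( (n + $2 - 1) / $2 ))                     # integer ceiling of n / batch_size
}
```

For example, `suggest_test_iter val.txt 2` prints the value to put in test_iter for the batch size used in this post.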

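With that done, the tense yet slow work of training begins. It may take a dozen minutes or a dozen days, depending on the size of your dataset and the complexity of your model.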

6. Run train_caffenet.sh

Its contents are as follows:

#!/usr/bin/env sh
set -e
# Path to the caffe binary; any extra arguments to this script are passed through
/home/ubuntu/caffe/caffe/build/tools/caffe train \
    --solver=/home/ubuntu/hudie_detection_case/solver.prototxt $@

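Training then starts and logs its progress to the terminal (the original post showed partial screenshots here). From this point on, just wait.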
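
Because training can run for days, it is worth knowing how to resume from a snapshot. The `caffe train` command accepts a `--snapshot` option that takes a `.solverstate` file. The sketch below picks the most recent one from a snapshot folder. `latest_solverstate` is a hypothetical helper, and it assumes the default `<prefix>_iter_<N>.solverstate` naming seen in the folder tree.

```shell
#!/usr/bin/env sh
# Print the .solverstate file with the highest iteration number.
# $1 = snapshot directory; prints nothing if no snapshot is found.
latest_solverstate() {
    best=""; best_iter=-1
    for f in "$1"/*_iter_*.solverstate; do
        [ -e "$f" ] || continue       # no match: the glob stays unexpanded, skip it
        iter=${f##*_iter_}            # drop everything up to and including "_iter_"
        iter=${iter%.solverstate}     # drop the extension, leaving the number
        if [ "$iter" -gt "$best_iter" ]; then
            best=$f; best_iter=$iter
        fi
    done
    [ -z "$best" ] || echo "$best"
}
```

Since train_caffenet.sh passes its arguments through to `caffe train`, resuming could look like `sh ./train_caffenet.sh --snapshot="$(latest_solverstate models)"`, run from the project folder.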