
Training Caffe on your own data with Windows 10 and conda2


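This post owes a great deal to the blog series at https://blog.csdn.net/gybheroin/article/details/72581318, for which I am grateful.

Installing Caffe was covered in an earlier post, and that setup works on my own machine: https://www.cnblogs.com/MY0213/p/9225310.html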

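1. Data source

The dataset used here is a face dataset, which can be downloaded from Baidu Cloud:

Link: https://pan.baidu.com/s/156DiOuB46wKrM0cEaAgfMw Password: 1ap0

Unzipping train.zip gives the image data; the label files are val.txt and train.txt.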
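Caffe's convert_imageset tool reads list files with one image per line: a path relative to the image root, followed by an integer class label. Before converting, it can help to sanity-check train.txt and val.txt. The following is a minimal Python sketch, not part of the original files; the function names and the sample paths are hypothetical, and the 0/1 labels are an assumption about this dataset.

```python
def parse_label_lines(lines):
    """Parse '<relative/path> <label>' lines into (path, label) tuples."""
    entries = []
    for line_no, raw in enumerate(lines, start=1):
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        # Split on the last whitespace run so paths containing spaces survive.
        parts = line.rsplit(None, 1)
        if len(parts) != 2:
            raise ValueError("line %d: expected '<path> <label>'" % line_no)
        path, label = parts
        entries.append((path, int(label)))
    return entries


def label_counts(entries):
    """Count how many images carry each label."""
    counts = {}
    for _, label in entries:
        counts[label] = counts.get(label, 0) + 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample lines; in practice read them from train.txt.
    sample = ["faces/0001.jpg 1", "background/0001.jpg 0", "", "faces/0002.jpg 1"]
    print(label_counts(parse_label_lines(sample)))
```

A heavily unbalanced count between the two labels is worth knowing about before spending days on training.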

2. Converting the images to an LMDB database

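REM Send glog output to the console instead of log files
SET GLOG_logtostderr=1

REM Target size for every image
SET RESIZE_HEIGHT=227
SET RESIZE_WIDTH=227

REM Arguments: image root folder, label list file, output database
"convert_imageset" --resize_height=%RESIZE_HEIGHT% --resize_width=%RESIZE_WIDTH% --shuffle "train/" "train.txt" "mtraindb"
"convert_imageset" --resize_height=%RESIZE_HEIGHT% --resize_width=%RESIZE_WIDTH% --shuffle "val/" "val.txt" "mvaldb"

pause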

See face_lmdb.bat. This step resizes all the images to the same size.
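For readers who prefer to drive the conversion from Python rather than a batch file, the same two commands can be assembled as argument lists. This is only a sketch: the helper name `convert_imageset_args` is hypothetical, it uses only the flags that appear in face_lmdb.bat, and it assumes the convert_imageset executable is on PATH.

```python
import subprocess


def convert_imageset_args(image_root, list_file, db_path, size=227, shuffle=True):
    """Build the convert_imageset command line used in face_lmdb.bat."""
    args = [
        "convert_imageset",
        "--resize_height=%d" % size,
        "--resize_width=%d" % size,
    ]
    if shuffle:
        args.append("--shuffle")
    args.extend([image_root, list_file, db_path])
    return args


if __name__ == "__main__":
    jobs = [("train/", "train.txt", "mtraindb"), ("val/", "val.txt", "mvaldb")]
    for image_root, list_file, db_path in jobs:
        cmd = convert_imageset_args(image_root, list_file, db_path)
        print(" ".join(cmd))
        # Uncomment to actually run the tool (assumes it is on PATH):
        # subprocess.check_call(cmd)
```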

3. Computing the image mean

SET GLOG_logtostderr=1

"compute_image_mean" "mtraindb" "train_mean.binaryproto"

pause

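See mean_face.bat.

Subtracting the mean from each image before training may help training.

A fixed image mean can also be used here (commonly used values are easy to find with a web search), or this step can be skipped altogether; according to Tang Yudi it makes little difference.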
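To make the "fixed mean" idea concrete: compute_image_mean stores a full mean image in the binaryproto file, whereas a fixed mean is just one value per colour channel (Caffe accepts these as `mean_value` entries in `transform_param`). The sketch below shows the per-channel variant on toy data in plain Python. The function names and pixel values are hypothetical, and images are represented as flat lists of (B, G, R) tuples purely for illustration.

```python
def channel_means(images):
    """Average each colour channel over every pixel of every image."""
    totals = [0.0, 0.0, 0.0]
    n_pixels = 0
    for image in images:
        for pixel in image:
            for c in range(3):
                totals[c] += pixel[c]
            n_pixels += 1
    return [t / n_pixels for t in totals]


def subtract_mean(image, means):
    """Return a copy of the image with the channel means subtracted."""
    return [tuple(pixel[c] - means[c] for c in range(3)) for pixel in image]


if __name__ == "__main__":
    # Two tiny made-up "images": two pixels and one pixel.
    images = [[(10, 20, 30), (30, 40, 50)], [(20, 30, 40)]]
    means = channel_means(images)
    print(means)
    print(subtract_mean(images[0], means))
```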

4. Training

SET GLOG_logtostderr=1
caffe train --solver=solver.prototxt 
pause

See train.bat.

net: "train.prototxt"
test_iter: 100
test_interval: 1000
# lr for fine-tuning should be lower than when starting from scratch
base_lr: 0.001
lr_policy: "step"
gamma: 0.1
# stepsize should also be lower, as we're closer to being done
stepsize: 1000
display: 50
max_iter: 10000
momentum: 0.9
weight_decay: 0.0005
snapshot: 1000
snapshot_prefix: "model"
# uncomment the following to default to CPU mode solving
# solver_mode: CPU

See solver.prototxt.

For what the settings in solver.prototxt mean, see

https://blog.csdn.net/qq_27923041/article/details/55211808
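As a quick illustration of what these solver settings imply: with `lr_policy: "step"`, Caffe uses a learning rate of base_lr * gamma ^ floor(iter / stepsize). The Python sketch below evaluates that formula with the values from solver.prototxt, and also works out how many validation images one test pass covers (test_iter times the TEST batch size of 64). The function names are hypothetical.

```python
def step_lr(iteration, base_lr=0.001, gamma=0.1, stepsize=1000):
    """Learning rate under Caffe's "step" policy."""
    return base_lr * gamma ** (iteration // stepsize)


def images_per_test_pass(test_iter=100, batch_size=64):
    """Number of validation images consumed by one test pass."""
    return test_iter * batch_size


if __name__ == "__main__":
    for it in (0, 999, 1000, 2500, 9999):
        print("iter %5d  lr %.3e" % (it, step_lr(it)))
    print("images per test pass:", images_per_test_pass())
```

With stepsize 1000 and max_iter 10000 the rate is divided by 10 nine times, so the later iterations run with an extremely small learning rate. A larger stepsize may be worth trying if the loss stalls early.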

#############################  DATA Layer  #############################
name: "face_train_val"
layer {
  top: "data"
  top: "label"
  name: "data"
  type: "Data"
  data_param {
    source: "mtraindb"
    backend: LMDB
    batch_size: 64
  }
  transform_param {
     mean_file: "train_mean.binaryproto"
     mirror: true
  }
  include: { phase: TRAIN }
}

layer {
  top: "data"
  top: "label"
  name: "data"
  type: "Data"
  data_param {
    source: "mvaldb"
    backend: LMDB
    batch_size: 64
  }
  transform_param {
    mean_file: "train_mean.binaryproto"
    mirror: true
  }
  include: { 
    phase: TEST 
  }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 96
    kernel_size: 11
    stride: 4
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "conv1"
  top: "conv1"
}
layer {
  name: "norm1"
  type: "LRN"
  bottom: "conv1"
  top: "norm1"
  lrn_param {
    local_size: 5
    alpha: 0.0001
    beta: 0.75
  }
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "norm1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "pool1"
  top: "conv2"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 256
    pad: 2
    kernel_size: 5
    group: 2
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0.1
    }
  }
}
layer {
  name: "relu2"
  type: "ReLU"
  bottom: "conv2"
  top: "conv2"
}
layer {
  name: "norm2"
  type: "LRN"
  bottom: "conv2"
  top: "norm2"
  lrn_param {
    local_size: 5
    alpha: 0.0001
    beta: 0.75
  }
}
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "norm2"
  top: "pool2"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "conv3"
  type: "Convolution"
  bottom: "pool2"
  top: "conv3"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 384
    pad: 1
    kernel_size: 3
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "relu3"
  type: "ReLU"
  bottom: "conv3"
  top: "conv3"
}
layer {
  name: "conv4"
  type: "Convolution"
  bottom: "conv3"
  top: "conv4"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 384
    pad: 1
    kernel_size: 3
    group: 2
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0.1
    }
  }
}
layer {
  name: "relu4"
  type: "ReLU"
  bottom: "conv4"
  top: "conv4"
}
layer {
  name: "conv5"
  type: "Convolution"
  bottom: "conv4"
  top: "conv5"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 256
    pad: 1
    kernel_size: 3
    group: 2
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0.1
    }
  }
}
layer {
  name: "relu5"
  type: "ReLU"
  bottom: "conv5"
  top: "conv5"
}
layer {
  name: "pool5"
  type: "Pooling"
  bottom: "conv5"
  top: "pool5"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "fc6"
  type: "InnerProduct"
  bottom: "pool5"
  top: "fc6"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  inner_product_param {
    num_output: 4096
    weight_filler {
      type: "gaussian"
      std: 0.005
    }
    bias_filler {
      type: "constant"
      value: 0.1
    }
  }
}
layer {
  name: "relu6"
  type: "ReLU"
  bottom: "fc6"
  top: "fc6"
}
layer {
  name: "drop6"
  type: "Dropout"
  bottom: "fc6"
  top: "fc6"
  dropout_param {
    dropout_ratio: 0.5
  }
}
layer {
  name: "fc7"
  type: "InnerProduct"
  bottom: "fc6"
  top: "fc7"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  inner_product_param {
    num_output: 4096
    weight_filler {
      type: "gaussian"
      std: 0.005
    }
    bias_filler {
      type: "constant"
      value: 0.1
    }
  }
}
layer {
  name: "relu7"
  type: "ReLU"
  bottom: "fc7"
  top: "fc7"
}
layer {
  name: "drop7"
  type: "Dropout"
  bottom: "fc7"
  top: "fc7"
  dropout_param {
    dropout_ratio: 0.5
  }
}
layer {
  name: "fc8-expr"
  type: "InnerProduct"
  bottom: "fc7"
  top: "fc8-expr"
  param {
    lr_mult: 10
    decay_mult: 1
  }
  param {
    lr_mult: 20
    decay_mult: 0
  }
  inner_product_param {
    num_output: 2
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "fc8-expr"
  bottom: "label"
  top: "accuracy"
  include {
    phase: TEST
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "fc8-expr"
  bottom: "label"
  top: "loss"
}

See train.prototxt. It is simply AlexNet with the final output size changed from 1000 to 2.
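To see why the images are resized to 227x227, it helps to trace the spatial size of the feature maps through the layers of train.prototxt. The Python sketch below uses the standard output-size formulas (Caffe rounds down for convolution and up for pooling) with the kernel, stride and pad values listed above. The function names are hypothetical.

```python
import math


def conv_out(size, kernel, stride=1, pad=0):
    """Output width/height of a convolution layer (rounds down)."""
    return (size + 2 * pad - kernel) // stride + 1


def pool_out(size, kernel, stride):
    """Output width/height of a Caffe pooling layer without padding (rounds up)."""
    return int(math.ceil((size - kernel) / float(stride))) + 1


def alexnet_feature_size(size):
    """Spatial size of pool5 for a square input of the given size."""
    size = conv_out(size, 11, stride=4)  # conv1
    size = pool_out(size, 3, 2)          # pool1
    size = conv_out(size, 5, pad=2)      # conv2
    size = pool_out(size, 3, 2)          # pool2
    size = conv_out(size, 3, pad=1)      # conv3
    size = conv_out(size, 3, pad=1)      # conv4
    size = conv_out(size, 3, pad=1)      # conv5
    size = pool_out(size, 3, 2)          # pool5
    return size


if __name__ == "__main__":
    side = alexnet_feature_size(227)
    print("pool5 is %dx%d, so fc6 sees %d inputs" % (side, side, 256 * side * side))
```

For a 227x227 input, pool5 comes out as 6x6 with 256 channels, which gives fc6 its 9216 inputs.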

Training takes about 5 days (I used a CPU). Alternatively, the existing model alexnet_iter_50000_full_conv.caffemodel can be used directly.

5. Testing

run_face_detect_batch.py can be used to test the face detection results.

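6. Summary

This network is very slow at test time because it uses the sliding window approach. Later posts will cover faster alternatives, Faster R-CNN and FPN.

The sliding window step uses the method from "Casting a Classifier into a Fully Convolutional Network", which can also be applied to other networks.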
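Once the classifier has been cast into a fully convolutional network, a larger input image yields a grid of face scores instead of a single prediction, and each grid cell corresponds roughly to one sliding-window position. The Python sketch below maps such a score grid back to image boxes. It is not the content of run_face_detect_batch.py. It assumes an effective stride of 32 pixels (the product of the strides 4, 2, 2 and 2 in train.prototxt) and a 227-pixel window. The function name, the threshold and the sample scores are hypothetical, and the mapping is only approximate because of padding in the convolution layers.

```python
def heatmap_to_boxes(heatmap, threshold=0.9, stride=32, window=227):
    """Turn a 2-D grid of face scores into (x1, y1, x2, y2, score) boxes."""
    boxes = []
    for row, scores in enumerate(heatmap):
        for col, score in enumerate(scores):
            if score >= threshold:
                # Top-left corner of the window this cell looks at.
                x, y = col * stride, row * stride
                boxes.append((x, y, x + window, y + window, score))
    return boxes


if __name__ == "__main__":
    # Made-up 2x2 score grid; real scores would come from the network output.
    scores = [[0.10, 0.95],
              [0.20, 0.30]]
    print(heatmap_to_boxes(scores))
```

Under these assumptions a 451x451 input would produce an 8x8 grid, with the last window starting at pixel 224 and ending at 451. In practice the image is usually rescaled several times to catch faces of different sizes, and overlapping boxes are then merged.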

For the evolution of the R-CNN family, see https://www.cnblogs.com/MY0213/p/9460562.html

Corrections and criticism are welcome.
