Designing and Training a Facial Landmark Heatmap Prediction Model Based on the OpenPose Framework
阿新 • Published: 2018-12-21
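Facial landmark prediction is an important step in face recognition: faces are aligned using the predicted landmarks before recognition, which improves recognition accuracy. Some liveness detection and face state analysis methods also rely on landmark prediction.
The classic face detection model MTCNN includes facial landmark prediction, but its landmark accuracy is fairly poor, and it degrades further on faces that are at a large angle, blurred, occluded, or small. I therefore borrowed the keypoint prediction model from OpenPose and designed the facial landmark heatmap prediction model described below. In my own validation it predicts facial landmarks well.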
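Training network design
The classic OpenPose framework predicts heatmaps over 6 stages, each with intermediate supervision. When I followed the original framework, however, I found that the loss of the later stages was basically the same as that of the earlier ones, while every extra stage slows the model down. I therefore designed a 3-stage version and a 4-stage version; this post mainly covers the 4-stage version.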
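As a rough illustration of intermediate supervision, the sketch below sums one loss per stage against the same ground-truth heatmaps. It assumes a Caffe-style Euclidean (L2) loss, as used in OpenPose training; the post does not state which loss layer was used, and the function names `euclidean_loss` and `total_loss` are hypothetical.

```python
import numpy as np


def euclidean_loss(pred, target):
    """Caffe-style EuclideanLoss: sum of squared differences / (2 * batch size).

    Assumption: the post does not name the loss layer; L2 is what OpenPose uses.
    """
    n = pred.shape[0]
    return float(np.sum((pred - target) ** 2) / (2.0 * n))


def total_loss(stage_preds, target, weights=None):
    """Intermediate supervision: every stage is compared with the same labels
    and the weighted per-stage losses are summed into one training loss."""
    if weights is None:
        weights = [1.0] * len(stage_preds)  # weight 1 per stage
    per_stage = [euclidean_loss(p, target) for p in stage_preds]
    total = sum(w * l for w, l in zip(weights, per_stage))
    return total, per_stage


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    labels = rng.random((2, 5, 28, 28)).astype(np.float32)
    # Four fake stage outputs standing in for the 4-stage network.
    preds = [labels + 0.01 * rng.standard_normal(labels.shape) for _ in range(4)]
    total, per_stage = total_loss(preds, labels)
    print(total, per_stage)
```

Because each stage contributes its own term, dropping stages removes both their compute cost and their loss term, which is the trade-off weighed above.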
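The network can be visualized with Netscope as shown below (if the image is hard to read, you can verify it yourself with the proto content provided below):
The proto file of the network is as follows: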
name: "landmarks-net"
input: "data"
input_shape { dim: 1 dim: 3 dim: 112 dim: 112 }
layer { name: "conv1_1" type: "Convolution" bottom: "data" top: "conv1_1" param { lr_mult: 1.0 decay_mult: 1 } param { lr_mult: 2.0 decay_mult: 0 } convolution_param { num_output: 64 pad: 1 kernel_size: 3 weight_filler { type: "gaussian" std: 0.01 } bias_filler { type: "constant" } } }
layer { name: "relu1_1" type: "ReLU" bottom: "conv1_1" top: "conv1_1" }
layer { name: "conv1_2" type: "Convolution" bottom: "conv1_1" top: "conv1_2" param { lr_mult: 1.0 decay_mult: 1 } param { lr_mult: 2.0 decay_mult: 0 } convolution_param { num_output: 64 pad: 1 kernel_size: 3 weight_filler { type: "gaussian" std: 0.01 } bias_filler { type: "constant" } } }
layer { name: "relu1_2" type: "ReLU" bottom: "conv1_2" top: "conv1_2" }
layer { name: "pool1_stage1" type: "Pooling" bottom: "conv1_2" top: "pool1_stage1" pooling_param { pool: MAX kernel_size: 2 stride: 2 } }
layer { name: "conv2_1" type: "Convolution" bottom: "pool1_stage1" top: "conv2_1" param { lr_mult: 1.0 decay_mult: 1 } param { lr_mult: 2.0 decay_mult: 0 } convolution_param { num_output: 128 pad: 1 kernel_size: 3 weight_filler { type: "gaussian" std: 0.01 } bias_filler { type: "constant" } } }
layer { name: "relu2_1" type: "ReLU" bottom: "conv2_1" top: "conv2_1" }
layer { name: "conv2_2" type: "Convolution" bottom: "conv2_1" top: "conv2_2" param { lr_mult: 1.0 decay_mult: 1 } param { lr_mult: 2.0 decay_mult: 0 } convolution_param { num_output: 128 pad: 1 kernel_size: 3 weight_filler { type: "gaussian" std: 0.01 } bias_filler { type: "constant" } } }
layer { name: "relu2_2" type: "ReLU" bottom: "conv2_2" top: "conv2_2" }
layer { name: "pool2_stage1" type: "Pooling" bottom: "conv2_2" top: "pool2_stage1" pooling_param { pool: MAX kernel_size: 2 stride: 2 } }
layer { name: "conv3_1" type: "Convolution" bottom: "pool2_stage1" top: "conv3_1" param { lr_mult: 1.0 decay_mult: 1 } param { lr_mult: 2.0 decay_mult: 0 } convolution_param { num_output: 256 pad: 1 kernel_size: 3 weight_filler { type: "gaussian" std: 0.01 } bias_filler { type: "constant" } } }
layer { name: "relu3_1" type: "ReLU" bottom: "conv3_1" top: "conv3_1" }
layer { name: "conv3_2" type: "Convolution" bottom: "conv3_1" top: "conv3_2" param { lr_mult: 1.0 decay_mult: 1 } param { lr_mult: 2.0 decay_mult: 0 } convolution_param { num_output: 256 pad: 1 kernel_size: 3 weight_filler { type: "gaussian" std: 0.01 } bias_filler { type: "constant" } } }
layer { name: "relu3_2" type: "ReLU" bottom: "conv3_2" top: "conv3_2" }
layer { name: "conv3_3" type: "Convolution" bottom: "conv3_2" top: "conv3_3" param { lr_mult: 1.0 decay_mult: 1 } param { lr_mult: 2.0 decay_mult: 0 } convolution_param { num_output: 256 pad: 1 kernel_size: 3 weight_filler { type: "gaussian" std: 0.01 } bias_filler { type: "constant" } } }
layer { name: "relu3_3" type: "ReLU" bottom: "conv3_3" top: "conv3_3" }
layer { name: "conv3_4" type: "Convolution" bottom: "conv3_3" top: "conv3_4" param { lr_mult: 1.0 decay_mult: 1 } param { lr_mult: 2.0 decay_mult: 0 } convolution_param { num_output: 256 pad: 1 kernel_size: 3 weight_filler { type: "gaussian" std: 0.01 } bias_filler { type: "constant" } } }
layer { name: "relu3_4" type: "ReLU" bottom: "conv3_4" top: "conv3_4" }
layer { name: "conv4_4_CPM" type: "Convolution" bottom: "conv3_4" top: "conv4_4_CPM" param { lr_mult: 1.0 decay_mult: 1 } param { lr_mult: 2.0 decay_mult: 0 } convolution_param { num_output: 128 pad: 1 kernel_size: 3 weight_filler { type: "gaussian" std: 0.01 } bias_filler { type: "constant" } } }
layer { name: "relu4_4_CPM" type: "ReLU" bottom: "conv4_4_CPM" top: "conv4_4_CPM" }
layer { name: "conv5_1_CPM_new" type: "Convolution" bottom: "conv4_4_CPM" top: "conv5_1_CPM_new" param { lr_mult: 1.0 decay_mult: 1 } param { lr_mult: 2.0 decay_mult: 0 } convolution_param { num_output: 128 pad: 1 kernel_size: 3 weight_filler { type: "gaussian" std: 0.01 } bias_filler { type: "constant" } } }
layer { name: "relu5_1_CPM_new" type: "ReLU" bottom: "conv5_1_CPM_new" top: "conv5_1_CPM_new" }
layer { name: "conv5_2_CPM_new" type: "Convolution" bottom: "conv5_1_CPM_new" top: "conv5_2_CPM_new" param { lr_mult: 1.0 decay_mult: 1 } param { lr_mult: 2.0 decay_mult: 0 } convolution_param { num_output: 128 pad: 1 kernel_size: 3 weight_filler { type: "gaussian" std: 0.01 } bias_filler { type: "constant" } } }
layer { name: "relu5_2_CPM_new" type: "ReLU" bottom: "conv5_2_CPM_new" top: "conv5_2_CPM_new" }
layer { name: "conv5_3_CPM_new" type: "Convolution" bottom: "conv5_2_CPM_new" top: "conv5_3_CPM_new" param { lr_mult: 1.0 decay_mult: 1 } param { lr_mult: 2.0 decay_mult: 0 } convolution_param { num_output: 128 pad: 1 kernel_size: 3 weight_filler { type: "gaussian" std: 0.01 } bias_filler { type: "constant" } } }
layer { name: "relu5_3_CPM_new" type: "ReLU" bottom: "conv5_3_CPM_new" top: "conv5_3_CPM_new" }
layer { name: "conv5_4_CPM_new" type: "Convolution" bottom: "conv5_3_CPM_new" top: "conv5_4_CPM_new" param { lr_mult: 1.0 decay_mult: 1 } param { lr_mult: 2.0 decay_mult: 0 } convolution_param { num_output: 512 pad: 0 kernel_size: 1 weight_filler { type: "gaussian" std: 0.01 } bias_filler { type: "constant" } } }
layer { name: "relu5_4_CPM_new" type: "ReLU" bottom: "conv5_4_CPM_new" top: "conv5_4_CPM_new" }
layer { name: "conv5_5_CPM_new" type: "Convolution" bottom: "conv5_4_CPM_new" top: "conv5_5_CPM_new" param { lr_mult: 1.0 decay_mult: 1 } param { lr_mult: 2.0 decay_mult: 0 } convolution_param { num_output: 5 pad: 0 kernel_size: 1 weight_filler { type: "gaussian" std: 0.01 } bias_filler { type: "constant" } } }
layer { name: "concat_stage2" type: "Concat" bottom: "conv5_5_CPM_new" bottom: "conv4_4_CPM" top: "concat_stage2" concat_param { axis: 1 } }
layer { name: "Mconv1_stage2_new" type: "Convolution" bottom: "concat_stage2" top: "Mconv1_stage2_new" param { lr_mult: 4.0 decay_mult: 1 } param { lr_mult: 8.0 decay_mult: 0 } convolution_param { num_output: 128 pad: 3 kernel_size: 7 weight_filler { type: "gaussian" std: 0.01 } bias_filler { type: "constant" } } }
layer { name: "Mrelu1_stage2_new" type: "ReLU" bottom: "Mconv1_stage2_new" top: "Mconv1_stage2_new" }
layer { name: "Mconv2_stage2_new" type: "Convolution" bottom: "Mconv1_stage2_new" top: "Mconv2_stage2_new" param { lr_mult: 4.0 decay_mult: 1 } param { lr_mult: 8.0 decay_mult: 0 } convolution_param { num_output: 128 pad: 3 kernel_size: 7 weight_filler { type: "gaussian" std: 0.01 } bias_filler { type: "constant" } } }
layer { name: "Mrelu2_stage2_new" type: "ReLU" bottom: "Mconv2_stage2_new" top: "Mconv2_stage2_new" }
layer { name: "Mconv3_stage2_new" type: "Convolution" bottom: "Mconv2_stage2_new" top: "Mconv3_stage2_new" param { lr_mult: 4.0 decay_mult: 1 } param { lr_mult: 8.0 decay_mult: 0 } convolution_param { num_output: 128 pad: 3 kernel_size: 7 weight_filler { type: "gaussian" std: 0.01 } bias_filler { type: "constant" } } }
layer { name: "Mrelu3_stage2_new" type: "ReLU" bottom: "Mconv3_stage2_new" top: "Mconv3_stage2_new" }
layer { name: "Mconv4_stage2_new" type: "Convolution" bottom: "Mconv3_stage2_new" top: "Mconv4_stage2_new" param { lr_mult: 4.0 decay_mult: 1 } param { lr_mult: 8.0 decay_mult: 0 } convolution_param { num_output: 128 pad: 3 kernel_size: 7 weight_filler { type: "gaussian" std: 0.01 } bias_filler { type: "constant" } } }
layer { name: "Mrelu4_stage2_new" type: "ReLU" bottom: "Mconv4_stage2_new" top: "Mconv4_stage2_new" }
layer { name: "Mconv5_stage2_new" type: "Convolution" bottom: "Mconv4_stage2_new" top: "Mconv5_stage2_new" param { lr_mult: 4.0 decay_mult: 1 } param { lr_mult: 8.0 decay_mult: 0 } convolution_param { num_output: 128 pad: 3 kernel_size: 7 weight_filler { type: "gaussian" std: 0.01 } bias_filler { type: "constant" } } }
layer { name: "Mrelu5_stage2_new" type: "ReLU" bottom: "Mconv5_stage2_new" top: "Mconv5_stage2_new" }
layer { name: "Mconv6_stage2_new" type: "Convolution" bottom: "Mconv5_stage2_new" top: "Mconv6_stage2_new" param { lr_mult: 4.0 decay_mult: 1 } param { lr_mult: 8.0 decay_mult: 0 } convolution_param { num_output: 128 pad: 0 kernel_size: 1 weight_filler { type: "gaussian" std: 0.01 } bias_filler { type: "constant" } } }
layer { name: "Mrelu6_stage2_new" type: "ReLU" bottom: "Mconv6_stage2_new" top: "Mconv6_stage2_new" }
layer { name: "Mconv7_stage2_new" type: "Convolution" bottom: "Mconv6_stage2_new" top: "Mconv7_stage2_new" param { lr_mult: 4.0 decay_mult: 1 } param { lr_mult: 8.0 decay_mult: 0 } convolution_param { num_output: 5 pad: 0 kernel_size: 1 weight_filler { type: "gaussian" std: 0.01 } bias_filler { type: "constant" } } }
layer { name: "concat_stage3" type: "Concat" bottom: "Mconv7_stage2_new" bottom: "conv4_4_CPM" top: "concat_stage3" concat_param { axis: 1 } }
layer { name: "Mconv1_stage3_new" type: "Convolution" bottom: "concat_stage3" top: "Mconv1_stage3_new" param { lr_mult: 4.0 decay_mult: 1 } param { lr_mult: 8.0 decay_mult: 0 } convolution_param { num_output: 128 pad: 3 kernel_size: 7 weight_filler { type: "gaussian" std: 0.01 } bias_filler { type: "constant" } } }
layer { name: "Mrelu1_stage3_new" type: "ReLU" bottom: "Mconv1_stage3_new" top: "Mconv1_stage3_new" }
layer { name: "Mconv2_stage3_new" type: "Convolution" bottom: "Mconv1_stage3_new" top: "Mconv2_stage3_new" param { lr_mult: 4.0 decay_mult: 1 } param { lr_mult: 8.0 decay_mult: 0 } convolution_param { num_output: 128 pad: 3 kernel_size: 7 weight_filler { type: "gaussian" std: 0.01 } bias_filler { type: "constant" } } }
layer { name: "Mrelu2_stage3_new" type: "ReLU" bottom: "Mconv2_stage3_new" top: "Mconv2_stage3_new" }
layer { name: "Mconv3_stage3_new" type: "Convolution" bottom: "Mconv2_stage3_new" top: "Mconv3_stage3_new" param { lr_mult: 4.0 decay_mult: 1 } param { lr_mult: 8.0 decay_mult: 0 } convolution_param { num_output: 128 pad: 3 kernel_size: 7 weight_filler { type: "gaussian" std: 0.01 } bias_filler { type: "constant" } } }
layer { name: "Mrelu3_stage3_new" type: "ReLU" bottom: "Mconv3_stage3_new" top: "Mconv3_stage3_new" }
layer { name: "Mconv4_stage3_new" type: "Convolution" bottom: "Mconv3_stage3_new" top: "Mconv4_stage3_new" param { lr_mult: 4.0 decay_mult: 1 } param { lr_mult: 8.0 decay_mult: 0 } convolution_param { num_output: 128 pad: 3 kernel_size: 7 weight_filler { type: "gaussian" std: 0.01 } bias_filler { type: "constant" } } }
layer { name: "Mrelu4_stage3_new" type: "ReLU" bottom: "Mconv4_stage3_new" top: "Mconv4_stage3_new" }
layer { name: "Mconv5_stage3_new" type: "Convolution" bottom: "Mconv4_stage3_new" top: "Mconv5_stage3_new" param { lr_mult: 4.0 decay_mult: 1 } param { lr_mult: 8.0 decay_mult: 0 } convolution_param { num_output: 128 pad: 3 kernel_size: 7 weight_filler { type: "gaussian" std: 0.01 } bias_filler { type: "constant" } } }
layer { name: "Mrelu5_stage3_new" type: "ReLU" bottom: "Mconv5_stage3_new" top: "Mconv5_stage3_new" }
layer { name: "Mconv6_stage3_new" type: "Convolution" bottom: "Mconv5_stage3_new" top: "Mconv6_stage3_new" param { lr_mult: 4.0 decay_mult: 1 } param { lr_mult: 8.0 decay_mult: 0 } convolution_param { num_output: 128 pad: 0 kernel_size: 1 weight_filler { type: "gaussian" std: 0.01 } bias_filler { type: "constant" } } }
layer { name: "Mrelu6_stage3_new" type: "ReLU" bottom: "Mconv6_stage3_new" top: "Mconv6_stage3_new" }
layer { name: "Mconv7_stage3_new" type: "Convolution" bottom: "Mconv6_stage3_new" top: "Mconv7_stage3_new" param { lr_mult: 4.0 decay_mult: 1 } param { lr_mult: 8.0 decay_mult: 0 } convolution_param { num_output: 5 pad: 0 kernel_size: 1 weight_filler { type: "gaussian" std: 0.01 } bias_filler { type: "constant" } } }
layer { name: "concat_stage4" type: "Concat" bottom: "Mconv7_stage3_new" bottom: "conv4_4_CPM" top: "concat_stage4" concat_param { axis: 1 } }
layer { name: "Mconv1_stage4_new" type: "Convolution" bottom: "concat_stage4" top: "Mconv1_stage4_new" param { lr_mult: 4.0 decay_mult: 1 } param { lr_mult: 8.0 decay_mult: 0 } convolution_param { num_output: 128 pad: 3 kernel_size: 7 weight_filler { type: "gaussian" std: 0.01 } bias_filler { type: "constant" } } }
layer { name: "Mrelu1_stage4_new" type: "ReLU" bottom: "Mconv1_stage4_new" top: "Mconv1_stage4_new" }
layer { name: "Mconv2_stage4_new" type: "Convolution" bottom: "Mconv1_stage4_new" top: "Mconv2_stage4_new" param { lr_mult: 4.0 decay_mult: 1 } param { lr_mult: 8.0 decay_mult: 0 } convolution_param { num_output: 128 pad: 3 kernel_size: 7 weight_filler { type: "gaussian" std: 0.01 } bias_filler { type: "constant" } } }
layer { name: "Mrelu2_stage4_new" type: "ReLU" bottom: "Mconv2_stage4_new" top: "Mconv2_stage4_new" }
layer { name: "Mconv3_stage4_new" type: "Convolution" bottom: "Mconv2_stage4_new" top: "Mconv3_stage4_new" param { lr_mult: 4.0 decay_mult: 1 } param { lr_mult: 8.0 decay_mult: 0 } convolution_param { num_output: 128 pad: 3 kernel_size: 7 weight_filler { type: "gaussian" std: 0.01 } bias_filler { type: "constant" } } }
layer { name: "Mrelu3_stage4_new" type: "ReLU" bottom: "Mconv3_stage4_new" top: "Mconv3_stage4_new" }
layer { name: "Mconv4_stage4_new" type: "Convolution" bottom: "Mconv3_stage4_new" top: "Mconv4_stage4_new" param { lr_mult: 4.0 decay_mult: 1 } param { lr_mult: 8.0 decay_mult: 0 } convolution_param { num_output: 128 pad: 3 kernel_size: 7 weight_filler { type: "gaussian" std: 0.01 } bias_filler { type: "constant" } } }
layer { name: "Mrelu4_stage4_new" type: "ReLU" bottom: "Mconv4_stage4_new" top: "Mconv4_stage4_new" }
layer { name: "Mconv5_stage4_new" type: "Convolution" bottom: "Mconv4_stage4_new" top: "Mconv5_stage4_new" param { lr_mult: 4.0 decay_mult: 1 } param { lr_mult: 8.0 decay_mult: 0 } convolution_param { num_output: 128 pad: 3 kernel_size: 7 weight_filler { type: "gaussian" std: 0.01 } bias_filler { type: "constant" } } }
layer { name: "Mrelu5_stage4_new" type: "ReLU" bottom: "Mconv5_stage4_new" top: "Mconv5_stage4_new" }
layer { name: "Mconv6_stage4_new" type: "Convolution" bottom: "Mconv5_stage4_new" top: "Mconv6_stage4_new" param { lr_mult: 4.0 decay_mult: 1 } param { lr_mult: 8.0 decay_mult: 0 } convolution_param { num_output: 128 pad: 0 kernel_size: 1 weight_filler { type: "gaussian" std: 0.01 } bias_filler { type: "constant" } } }
layer { name: "Mrelu6_stage4_new" type: "ReLU" bottom: "Mconv6_stage4_new" top: "Mconv6_stage4_new" }
layer { name: "Mconv7_stage4_new" type: "Convolution" bottom: "Mconv6_stage4_new" top: "Mconv7_stage4_new" param { lr_mult: 4.0 decay_mult: 1 } param { lr_mult: 8.0 decay_mult: 0 } convolution_param { num_output: 5 pad: 0 kernel_size: 1 weight_filler { type: "gaussian" std: 0.01 } bias_filler { type: "constant" } } }
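To sanity-check the proto without running Caffe, the following sketch traces the spatial size of the feature map through the layers listed above. It uses the usual Caffe output-size formulas (floor for convolution, ceil for pooling, ignoring the extra clipping Caffe applies when pooling with padding); the helper names are hypothetical. With a 112×112 input and two 2×2 max-pooling layers, the 5-channel heatmaps come out at 28×28, so one heatmap pixel covers 4 input pixels.

```python
import math


def conv_out(size, kernel, pad=0, stride=1):
    """Output size of a Caffe Convolution layer (floor rounding)."""
    return (size + 2 * pad - kernel) // stride + 1


def pool_out(size, kernel, pad=0, stride=1):
    """Output size of a Caffe Pooling layer (ceil rounding, pad clipping ignored)."""
    return int(math.ceil((size + 2 * pad - kernel) / float(stride))) + 1


def trace_heatmap_size(input_size=112):
    """Follow the layer list of landmarks-net and return the heatmap side length."""
    s = input_size
    for _ in range(2):                  # conv1_1, conv1_2 (3x3, pad 1)
        s = conv_out(s, 3, pad=1)
    s = pool_out(s, 2, stride=2)        # pool1_stage1
    for _ in range(2):                  # conv2_1, conv2_2
        s = conv_out(s, 3, pad=1)
    s = pool_out(s, 2, stride=2)        # pool2_stage1
    for _ in range(5):                  # conv3_1..conv3_4, conv4_4_CPM
        s = conv_out(s, 3, pad=1)
    for _ in range(3):                  # conv5_1..conv5_3 (stage 1)
        s = conv_out(s, 3, pad=1)
    for _ in range(2):                  # conv5_4, conv5_5 (1x1)
        s = conv_out(s, 1)
    for _stage in (2, 3, 4):            # refinement stages 2 to 4
        for _ in range(5):              # Mconv1..Mconv5 (7x7, pad 3)
            s = conv_out(s, 7, pad=3)
        for _ in range(2):              # Mconv6, Mconv7 (1x1)
            s = conv_out(s, 1)
    return s


if __name__ == "__main__":
    size = trace_heatmap_size(112)
    print("heatmap size:", size)                    # 28
    print("stride:", 112 // size)                   # 4
    print("concat channels per stage:", 5 + 128)    # heatmaps + conv4_4_CPM features
```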
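Training data
I mainly used the CelebA dataset, sampling frontal faces and large-angle faces from it at a 1:1 ratio as training data.
Label files: from the landmark annotations in CelebA, I generated a single-peak heatmap label for each landmark. Visualized, the labels look like this: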
If you need the label files, feel free to send me a private message.
The final training results are as follows:
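The single-peak labels described above can be sketched as one 2D Gaussian per landmark, rendered at heatmap resolution. This is only an illustration: the stride of 4 follows from the two pooling layers in the proto, while the Gaussian shape, `sigma=1.5`, and the function name `make_heatmaps` are assumptions that the post does not specify.

```python
import numpy as np


def make_heatmaps(landmarks, input_size=112, stride=4, sigma=1.5):
    """Render one single-peak Gaussian heatmap per landmark.

    landmarks: iterable of (x, y) in input-image pixels (5 points for CelebA).
    stride:    input pixels per heatmap pixel (112 / 28 = 4 for the proto above).
    sigma:     Gaussian width in heatmap pixels (assumed value).
    """
    size = input_size // stride
    ys, xs = np.mgrid[0:size, 0:size].astype(np.float32)
    maps = np.zeros((len(landmarks), size, size), dtype=np.float32)
    for i, (x, y) in enumerate(landmarks):
        cx, cy = x / stride, y / stride          # landmark in heatmap coordinates
        dist2 = (xs - cx) ** 2 + (ys - cy) ** 2
        maps[i] = np.exp(-dist2 / (2.0 * sigma ** 2))
    return maps


if __name__ == "__main__":
    # Hypothetical landmark positions in a 112x112 face crop:
    # left eye, right eye, nose, left mouth corner, right mouth corner.
    points = [(38, 46), (74, 46), (56, 66), (42, 86), (70, 86)]
    labels = make_heatmaps(points)
    print(labels.shape, labels.max())
```

Each channel peaks at its own landmark and decays smoothly around it, which gives the network a softer regression target than a single hot pixel.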
I0906 05:13:55.329042 5065 solver.cpp:243] Iteration 41960, loss = 0.5419
I0906 05:13:55.329210 5065 solver.cpp:259] Train net output #0: land_loss_stage = 0.281634 (* 1 = 0.281634 loss)
I0906 05:13:55.329221 5065 solver.cpp:259] Train net output #1: land_loss_stage2 = 0.0835331 (* 1 = 0.0835331 loss)
I0906 05:13:55.329238 5065 solver.cpp:259] Train net output #2: land_loss_stage3 = 0.0827685 (* 1 = 0.0827685 loss)
I0906 05:13:55.329244 5065 solver.cpp:259] Train net output #3: land_loss_stage4 = 0.0939632 (* 1 = 0.0939632 loss)
I0906 05:13:55.329252 5065 sgd_solver.cpp:138] Iteration 41960, lr = 8.1e-07
I0906 05:14:18.058833 5065 solver.cpp:243] Iteration 41980, loss = 0.487881
I0906 05:14:18.058876 5065 solver.cpp:259] Train net output #0: land_loss_stage = 0.261268 (* 1 = 0.261268 loss)
I0906 05:14:18.058885 5065 solver.cpp:259] Train net output #1: land_loss_stage2 = 0.0769336 (* 1 = 0.0769336 loss)
I0906 05:14:18.058892 5065 solver.cpp:259] Train net output #2: land_loss_stage3 = 0.0663182 (* 1 = 0.0663182 loss)
I0906 05:14:18.058897 5065 solver.cpp:259] Train net output #3: land_loss_stage4 = 0.0833601 (* 1 = 0.0833601 loss)
I0906 05:14:18.058903 5065 sgd_solver.cpp:138] Iteration 41980, lr = 8.1e-07
I0906 05:14:39.636696 5065 solver.cpp:596] Snapshotting to binary proto file /home/work/glenn/gitmodel/caffe-face/face_example/landmarks_data/train_map/snapshot_iter_42000.caffemodel
I0906 05:14:40.695569 5065 sgd_solver.cpp:307] Snapshotting solver state to binary proto file /home/work/glenn/gitmodel/caffe-face/face_example/landmarks_data/train_map/snapshot_iter_42000.solverstate
I0906 05:14:41.087797 5065 solver.cpp:332] Iteration 42000, loss = 0.826902
I0906 05:14:41.087839 5065 solver.cpp:337] Optimization Done.
I0906 05:14:41.087843 5065 caffe.cpp:254] Optimization Done.
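To compare the stages in a log like this one, the per-stage losses can be pulled out with a couple of regular expressions. The sketch below is a hypothetical helper (`parse_caffe_log` is not part of Caffe) applied to the two iterations shown above. It also confirms that the reported loss is the sum of the four stage losses, each with weight 1.

```python
import re

# The two logged iterations from the training output above, trimmed to the fields used.
SAMPLE_LOG = """\
I0906 05:13:55.329042 5065 solver.cpp:243] Iteration 41960, loss = 0.5419
I0906 05:13:55.329210 5065 solver.cpp:259] Train net output #0: land_loss_stage = 0.281634 (* 1 = 0.281634 loss)
I0906 05:13:55.329221 5065 solver.cpp:259] Train net output #1: land_loss_stage2 = 0.0835331 (* 1 = 0.0835331 loss)
I0906 05:13:55.329238 5065 solver.cpp:259] Train net output #2: land_loss_stage3 = 0.0827685 (* 1 = 0.0827685 loss)
I0906 05:13:55.329244 5065 solver.cpp:259] Train net output #3: land_loss_stage4 = 0.0939632 (* 1 = 0.0939632 loss)
I0906 05:13:55.329252 5065 sgd_solver.cpp:138] Iteration 41960, lr = 8.1e-07
I0906 05:14:18.058833 5065 solver.cpp:243] Iteration 41980, loss = 0.487881
I0906 05:14:18.058876 5065 solver.cpp:259] Train net output #0: land_loss_stage = 0.261268 (* 1 = 0.261268 loss)
I0906 05:14:18.058885 5065 solver.cpp:259] Train net output #1: land_loss_stage2 = 0.0769336 (* 1 = 0.0769336 loss)
I0906 05:14:18.058892 5065 solver.cpp:259] Train net output #2: land_loss_stage3 = 0.0663182 (* 1 = 0.0663182 loss)
I0906 05:14:18.058897 5065 solver.cpp:259] Train net output #3: land_loss_stage4 = 0.0833601 (* 1 = 0.0833601 loss)
I0906 05:14:18.058903 5065 sgd_solver.cpp:138] Iteration 41980, lr = 8.1e-07
"""

ITER_RE = re.compile(r"Iteration (\d+), loss = ([0-9.eE+-]+)")
OUTPUT_RE = re.compile(r"Train net output #\d+: (\w+) = ([0-9.eE+-]+)")


def parse_caffe_log(text):
    """Return one record per logged iteration with its total and per-stage losses."""
    records = []
    for line in text.splitlines():
        match = ITER_RE.search(line)
        if match:
            records.append({
                "iteration": int(match.group(1)),
                "loss": float(match.group(2)),
                "stages": {},
            })
            continue
        match = OUTPUT_RE.search(line)
        if match and records:
            records[-1]["stages"][match.group(1)] = float(match.group(2))
    return records


if __name__ == "__main__":
    for rec in parse_caffe_log(SAMPLE_LOG):
        stage_sum = sum(rec["stages"].values())
        print(rec["iteration"], rec["loss"], round(stage_sum, 6), rec["stages"])
```

In both iterations the first stage has a loss of roughly 0.26 to 0.28, while stages 2 to 4 all sit between about 0.07 and 0.09, which matches the observation that later stages bring little further improvement.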
Model testing
After about 40,000 iterations, the model's predictions are as follows:
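To turn predicted heatmaps into landmark coordinates, a common approach is to take the peak of each channel and scale it back to input resolution. The sketch below assumes that decoding scheme (the post does not describe its post-processing) and a stride of 4, matching the 112×112 input and the 28×28 output of the proto; `decode_landmarks` is a hypothetical name, and the input would be the last stage's output (`Mconv7_stage4_new`) for one image.

```python
import numpy as np


def decode_landmarks(heatmaps, stride=4):
    """Convert (num_landmarks, H, W) heatmaps to (x, y) input-image coordinates.

    Returns the coordinates and the peak value of each channel, which can
    serve as a rough confidence score.
    """
    num, height, width = heatmaps.shape
    flat = heatmaps.reshape(num, -1)
    peak_index = flat.argmax(axis=1)                    # flat index of each peak
    ys, xs = np.unravel_index(peak_index, (height, width))
    scores = flat[np.arange(num), peak_index]
    coords = np.stack([xs, ys], axis=1).astype(np.float32) * stride
    return coords, scores


if __name__ == "__main__":
    # Fake network output: five 28x28 maps with one hot pixel each.
    fake = np.zeros((5, 28, 28), dtype=np.float32)
    for i in range(5):
        fake[i, 10 + i, 5 + i] = 1.0
    coords, scores = decode_landmarks(fake)
    print(coords)
    print(scores)
```

Taking the integer argmax limits precision to one heatmap pixel (4 input pixels here); a weighted average around the peak is a common refinement if finer localization is needed.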