
Understanding the YOLO Configuration File

http://www.infocool.net/kb/WWW/201703/317548.html


  
  [net]
  batch=64                # the parameters are updated once every batch samples
  subdivisions=8          # if memory is insufficient, the batch is split into subdivisions
                          # sub-batches of size batch/subdivisions; in the darknet code,
                          # batch/subdivisions is itself named batch
  height=416              # input image height
  width=416               # input image width
  channels=3              # input image channels
  momentum=0.9            # momentum
  decay=0.0005            # weight-decay regularization, guards against overfitting
  angle=0                 # generate more training samples by rotating images
  saturation=1.5          # generate more training samples by adjusting saturation
  exposure=1.5            # generate more training samples by adjusting exposure
  hue=.1                  # generate more training samples by adjusting hue
  learning_rate=0.0001    # initial learning rate
  max_batches=45000       # training stops once max_batches is reached
  policy=steps            # learning-rate policy; options: CONSTANT, STEP, EXP, POLY,
                          # STEPS, SIG, RANDOM
  steps=100,25000,35000   # adjust the learning rate at these batch numbers
  scales=10,.1,.1         # learning-rate scaling factors, multiplied in cumulatively

  [convolutional]
  batch_normalize=1       # whether to apply batch normalization
  filters=32              # number of output feature maps
  size=3                  # convolution kernel size
  stride=1                # convolution stride
  pad=1                   # if pad is 0, the padding is given by the padding parameter;
                          # if pad is 1, the padding is size/2
  activation=leaky        # activation function; options: logistic, loggy, relu, elu,
                          # relie, plse, hardtan, lhtan, linear, ramp, leaky, tanh, stair

  [maxpool]
  size=2                  # pooling window size
  stride=2                # pooling stride

  [convolutional]
  batch_normalize=1
  filters=64
  size=3
  stride=1
  pad=1
  activation=leaky

  [maxpool]
  size=2
  stride=2

  ......
  ......
  #######

  [convolutional]
  batch_normalize=1
  size=3
  stride=1
  pad=1
  filters=1024
  activation=leaky

  [convolutional]
  batch_normalize=1
  size=3
  stride=1
  pad=1
  filters=1024
  activation=leaky

  [route]                 # the route layer brings finer-grained features in from earlier
                          # in the network
  layers=-9

  [reorg]                 # the reorg layer makes these features match the feature map size
                          # at the later layer. The end feature map is 13x13; the feature
                          # map from earlier is 26x26x512. The reorg layer maps the
                          # 26x26x512 feature map onto a 13x13x2048 feature map so that it
                          # can be concatenated with the feature maps at 13x13 resolution.
  stride=2

  [route]
  layers=-1,-3

  [convolutional]
  batch_normalize=1
  size=3
  stride=1
  pad=1
  filters=1024
  activation=leaky

  [convolutional]
  size=1
  stride=1
  pad=1
  filters=125             # the filter count of the last convolutional layer before
                          # [region] is fixed: filters = num * (classes + 5), where the 5
                          # covers the five coordinates tx, ty, tw, th, to from the paper
  activation=linear

  [region]
  anchors=1.08,1.19, 3.42,4.41, 6.63,11.38, 9.42,5.11, 16.62,10.52
                          # anchor boxes; they can be picked by hand or learned from the
                          # training set via k-means
  bias_match=1
  classes=20              # number of object categories the network must recognize
  coords=4                # the four coordinates tx, ty, tw, th of each box
  num=5                   # number of boxes predicted per grid cell
  softmax=1               # use softmax as the activation
  jitter=.2               # suppress overfitting by adding jitter noise
  rescore=1               # tentatively understood as a switch: when non-zero, l.delta
                          # (the gap between prediction and ground truth) is adjusted
                          # by rescoring
  object_scale=5          # tentatively, the loss weight when a predicted box contains
                          # an object
  noobject_scale=1        # tentatively, the loss weight when a predicted box contains
                          # no object
  class_scale=1           # tentatively, the weight of the classification loss
  coord_scale=1           # tentatively, the loss weight of coordinate errors
  absolute=1
  thresh=.6
  random=0                # whether the final predicted box is determined randomly

The corresponding darknet code

Let's find the code that parses the cfg file, taking the detector demo as the entry point.

Starting from the main function in darknet.c:


  
  } else if (0 == strcmp(argv[1], "detector")){
      run_detector(argc, argv);

The run_detector function in detector.c:


  
  char *prefix = find_char_arg(argc, argv, "-prefix", 0);
  float thresh = find_float_arg(argc, argv, "-thresh", .24);
  float hier_thresh = find_float_arg(argc, argv, "-hier", .5);
  int cam_index = find_int_arg(argc, argv, "-c", 0);
  int frame_skip = find_int_arg(argc, argv, "-s", 0);
  if(argc < 4){
      fprintf(stderr, "usage: %s %s [train/test/valid] [cfg] [weights (optional)]\n", argv[0], argv[1]);
      return;
  }
  char *gpu_list = find_char_arg(argc, argv, "-gpus", 0);
  char *outfile = find_char_arg(argc, argv, "-out", 0);
  ......
  ......
  else if(0 == strcmp(argv[2], "demo")) {
      list *options = read_data_cfg(datacfg);
      int classes = option_find_int(options, "classes", 20);
      char *name_list = option_find_str(options, "names", "data/names.list");
      char **names = get_labels(name_list);
      demo(cfg, weights, thresh, cam_index, filename, names, classes, frame_skip, prefix, hier_thresh);
  }

The read_data_cfg function parses the data configuration file and stores the result in the options list.

classes

int classes = option_find_int(options, "classes", 20);

  

classes is the number of object categories YOLO can recognize.

batch, learning_rate, momentum, decay and subdivisions

The demo function in demo.c:

net = parse_network_cfg(cfgfile);

  

The parse_network_cfg function in parser.c:


  
  list *sections = read_cfg(filename);
  node *n = sections->front;
  if(!n) error("Config file has no sections");
  network net = make_network(sections->size - 1);
  net.gpu_index = gpu_index;
  size_params params;
  section *s = (section *)n->val;
  list *options = s->options;
  if(!is_network(s)) error("First section must be [net] or [network]");
  parse_net_options(options, &net);

The parse_net_options function:


  
  net->batch = option_find_int(options, "batch", 1);
  net->learning_rate = option_find_float(options, "learning_rate", .001);
  net->momentum = option_find_float(options, "momentum", .9);
  net->decay = option_find_float(options, "decay", .0001);
  int subdivs = option_find_int(options, "subdivisions", 1);
  net->time_steps = option_find_int_quiet(options, "time_steps", 1);
  net->batch /= subdivs;
  net->batch *= net->time_steps;
  net->subdivisions = subdivs;

learning_rate is the initial learning rate; the learning rate actually used during training depends on the learning-rate policy as well as this initial value.

momentum is the momentum term; adding momentum during training helps escape local minima and saddle points.

decay is the weight-decay regularization term, used to prevent overfitting.

The value of batch equals batch/subdivisions from the cfg file multiplied by time_steps. time_steps is not set in the default YOLO cfg, so it takes the default value 1; batch can therefore be treated as simply batch/subdivisions from the cfg file.

As mentioned earlier, batch means the parameters are updated once every batch samples.

subdivisions exists to lower the GPU memory requirement: darknet splits each batch into subdivisions sub-batches of size batch/subdivisions, and it is this sub-batch that the code calls batch.

Let's look at the batch-related code on the training path.

The train_detector function in detector.c:


  
  #ifdef GPU
  if(ngpus == 1){
      loss = train_network(net, train);
  } else {
      loss = train_networks(nets, ngpus, train, 4);
  }
  #else
  loss = train_network(net, train);
  #endif

The train_network function in network.c:


  
  int batch = net.batch;
  int n = d.X.rows / batch;
  float *X = calloc(batch*d.X.cols, sizeof(float));
  float *y = calloc(batch*d.y.cols, sizeof(float));
  int i;
  float sum = 0;
  for(i = 0; i < n; ++i){
      get_next_batch(d, batch, i*batch, X, y);
      float err = train_network_datum(net, X, y);
      sum += err;
  }

The train_network_datum function:


  
  *net.seen += net.batch;
  ......
  ......
  forward_network(net, state);
  backward_network(net, state);
  float error = get_network_cost(net);
  if(((*net.seen)/net.batch)%net.subdivisions == 0) update_network(net);

As we can see, the network parameters are updated only when ((*net.seen)/net.batch) % net.subdivisions == 0. *net.seen counts the training samples seen so far, so (*net.seen)/net.batch is the number of sub-batches already trained, and the condition fires exactly once per true batch.

policy, steps and scales

The parse_net_options function in parser.c:


  
  char *policy_s = option_find_str(options, "policy", "constant");
  net->policy = get_policy(policy_s);
  net->burn_in = option_find_int_quiet(options, "burn_in", 0);
  if(net->policy == STEP){
      net->step = option_find_int(options, "step", 1);
      net->scale = option_find_float(options, "scale", 1);
  } else if (net->policy == STEPS){
      char *l = option_find(options, "steps");
      char *p = option_find(options, "scales");
      if(!l || !p) error("STEPS policy must have steps and scales in cfg file");
      int len = strlen(l);
      int n = 1;
      int i;
      for(i = 0; i < len; ++i){
          if (l[i] == ',') ++n;
      }
      int *steps = calloc(n, sizeof(int));
      float *scales = calloc(n, sizeof(float));
      for(i = 0; i < n; ++i){
          int step = atoi(l);
          float scale = atof(p);
          l = strchr(l, ',')+1;
          p = strchr(p, ',')+1;
          steps[i] = step;
          scales[i] = scale;
      }
      net->scales = scales;
      net->steps = steps;
      net->num_steps = n;
  } else if (net->policy == EXP){
      net->gamma = option_find_float(options, "gamma", 1);
  } else if (net->policy == SIG){
      net->gamma = option_find_float(options, "gamma", 1);