Training ICNet on the VOC dataset

Voc4ICnet

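Purpose of preparing this dataset: train ICNet on the VOC dataset and compare it with BlitzNet, which performs segmentation and detection at the same time.

I. Preparing the dataset and labels:

The Pascal VOC dataset can be used for object detection and segmentation, and provides both semantic segmentation labels and instance segmentation labels. The dataset used in this article is the original pascal-voc2012 merged with the additional voc_aug dataset with segmentation labels provided by B. Hariharan et al.

The ICNet source code expects the label data as grayscale images, so all labels in the datasets above have to be converted to grayscale images before being passed in.

Download and extract voc_aug to obtain the folder benchmark_RELEASE, and rename it to VOC_aug. The labels in this dataset are .mat files, which need to be converted to grayscale .png images.

The author of deeplab_v2 provides conversion code for this. Download the deeplab_v2 project; the conversion script is deeplab_v2/voc2012/mat2png.py, and its arguments are the directory containing the .mat files and the directory where the converted png files will be stored.

The file layout used for the dataset in this article is: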

vocdataset
    VOC2012_orig
        ImageSets
            Segmentation
                trainval.txt
        JPEGImages
        SegmentationClass
        SegmentationClass_1D
    VOC_aug
        dataset
            cls
            cls_png

The commands run in this article are:

cd ~/TF-Project/deeplab_v2/voc2012
./mat2png.py /media/yue/DATA/vocdataset/VOC_aug/dataset/cls /media/yue/DATA/vocdataset/VOC_aug/dataset/cls_png
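
For reference, the kind of conversion that mat2png.py performs can be sketched roughly as below. This is a minimal sketch and not the deeplab_v2 script itself. It assumes Python 3 with NumPy, SciPy and Pillow installed, and that each .mat file stores its class map under GTcls.Segmentation (the layout of the benchmark_RELEASE files). The function names are made up for illustration.

```python
import glob
import os


def png_path_for(mat_path, out_dir):
    """Map /some/dir/2008_000002.mat to <out_dir>/2008_000002.png."""
    stem = os.path.splitext(os.path.basename(mat_path))[0]
    return os.path.join(out_dir, stem + ".png")


def convert_mat_to_png(mat_path, out_dir, key="GTcls"):
    # Imported lazily so the path helper above works without these packages.
    import numpy as np
    import scipy.io
    from PIL import Image

    mat = scipy.io.loadmat(mat_path, squeeze_me=True, struct_as_record=False)
    # Assumption: the class-index map lives in <key>.Segmentation.
    seg = np.asarray(mat[key].Segmentation, dtype=np.uint8)
    # A 2-D uint8 array is saved as a single-channel (grayscale) PNG.
    Image.fromarray(seg).save(png_path_for(mat_path, out_dir))


def convert_dir(mat_dir, out_dir):
    """Convert every .mat file in mat_dir, writing PNGs into out_dir."""
    os.makedirs(out_dir, exist_ok=True)
    for mat_path in sorted(glob.glob(os.path.join(mat_dir, "*.mat"))):
        convert_mat_to_png(mat_path, out_dir)
```

With the paths used in this article, the call would be convert_dir("/media/yue/DATA/vocdataset/VOC_aug/dataset/cls", "/media/yue/DATA/vocdataset/VOC_aug/dataset/cls_png").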

The semantic labels in the original VOC2012 dataset are three-channel colour images, so they need to be reduced to single-channel images:

 ./convert_labels.py /media/yue/DATA/vocdataset/VOC2012_orig/SegmentationClass/   /media/yue/DATA/vocdataset/VOC2012_orig/ImageSets/Segmentation/trainval.txt /media/yue/DATA/vocdataset/VOC2012_orig/SegmentationClass_1D/
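
The idea behind this step is a lookup from label colour to class index. The sketch below illustrates it on plain Python lists. It is not the convert_labels.py script, and the palette is only a small subset of the standard VOC colours chosen for the example (a real conversion would use the full palette and an array library).

```python
# Subset of the standard VOC palette, for illustration only:
# class index -> RGB colour. 255 is the "void" boundary label.
PALETTE = {
    0: (0, 0, 0),          # background
    1: (128, 0, 0),        # aeroplane
    2: (0, 128, 0),        # bicycle
    255: (224, 224, 192),  # void / boundary
}


def rgb_to_index(rgb_rows, palette=PALETTE):
    """Convert rows of (r, g, b) pixels to rows of class indices."""
    lookup = {colour: idx for idx, colour in palette.items()}
    out = []
    for row in rgb_rows:
        try:
            out.append([lookup[tuple(px)] for px in row])
        except KeyError as err:
            raise ValueError("colour %s is not in the palette" % (err.args[0],))
    return out


image = [
    [(0, 0, 0), (128, 0, 0)],
    [(224, 224, 192), (0, 128, 0)],
]
labels = rgb_to_index(image)
print(labels)  # [[0, 1], [255, 2]]
```

Raising on an unknown colour, rather than silently mapping it to 0, makes a damaged or wrongly encoded label image visible early.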

This gives the original images and label images for both datasets:

vocdataset
    VOC2012_orig
        JPEGImages # original images
        SegmentationClass_1D # single-channel label images
    VOC_aug
        dataset
            img # original images
            cls_png # single-channel label images

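Next, merge the original images and the single-channel label images of the two datasets: copy the images and labels from the original pascal voc2012 dataset into the augmented pascal voc2012 dataset, overwriting any duplicates.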

cd /media/yue/DATA/vocdataset
cp VOC2012_orig/SegmentationClass_1D/* VOC_aug/dataset/cls_png/
cp VOC2012_orig/JPEGImages/* VOC_aug/dataset/img/

  1. images: the img folder now contains 17125 jpg images;
  2. labels: the cls_png folder contains 11355 png images.
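
A quick sanity check after the copy is to count the files and look for labels that have no matching image. The helper below is a small sketch using only the standard library; the function names are made up for illustration.

```python
import os


def stems(folder, ext):
    """File names in `folder` that end in `ext`, with the extension removed."""
    return {os.path.splitext(f)[0] for f in os.listdir(folder) if f.endswith(ext)}


def check_merge(img_dir, label_dir):
    """Count images and labels and list labels that lack an image."""
    images = stems(img_dir, ".jpg")
    labels = stems(label_dir, ".png")
    return {
        "num_images": len(images),
        "num_labels": len(labels),
        "labels_without_image": sorted(labels - images),
    }
```

For the layout in this article it would be called as check_merge("VOC_aug/dataset/img", "VOC_aug/dataset/cls_png") from the vocdataset directory.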

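Then create the training, validation and test sets and generate the corresponding .txt list files. Because ICNet constrains the input size, all images also need to be resized to one fixed size (the training settings in this article use INPUT_SIZE = '256, 256'). After that, train following the training steps.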
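
The article does not show how the .txt files were generated. A minimal sketch is given below, assuming each line holds an image path and a label path separated by a space, both relative to DATA_DIR. That format, the 10% validation fraction and the function names are assumptions for illustration, and only a train/validation split is shown.

```python
import os
import random


def build_pairs(data_dir, img_sub, label_sub):
    """Pair every label PNG with the JPEG of the same name (paths relative to data_dir)."""
    pairs = []
    for name in sorted(os.listdir(os.path.join(data_dir, label_sub))):
        stem, ext = os.path.splitext(name)
        if ext != ".png":
            continue
        img_rel = "/" + img_sub + "/" + stem + ".jpg"
        if os.path.exists(data_dir + img_rel):  # skip labels without an image
            pairs.append((img_rel, "/" + label_sub + "/" + name))
    return pairs


def write_split(pairs, train_path, val_path, val_fraction=0.1, seed=0):
    """Shuffle deterministically and write 'image label' lines to two list files."""
    pairs = list(pairs)
    random.Random(seed).shuffle(pairs)
    n_val = int(len(pairs) * val_fraction)
    for path, subset in ((val_path, pairs[:n_val]), (train_path, pairs[n_val:])):
        with open(path, "w") as f:
            for img, label in subset:
                f.write(img + " " + label + "\n")
    return len(pairs) - n_val, n_val
```

A fixed seed keeps the split reproducible, so the validation list stays the same between runs.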

1. Modify the following paths and parameters. This article uses the values below.

First copy train.py to train_voc.py

# If you want to apply to other datasets, change following four lines
DATA_DIR ='/media/yue/DATA/vocdataset'
DATA_LIST_PATH = '/media/yue/DATA/vocdataset/icnettrain.txt'
IGNORE_LABEL = 0 # The class number of background
INPUT_SIZE = '256, 256'

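2. Change how the pretrained model is loaded. No pretrained model is loaded here, so simply comment out the statement that loads it. Lines 190 and 191 of the code are changed as follows: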

else:
    print('traing without pre-trained model...')
    #net.load(args.restore_from, sess)

The training command is:

python train_voc.py --update-mean-var --train-beta-gamma

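If a pretrained model does need to be loaded, make the following changes:

1. In icnet_cityscapes_bnnomerge.prototxt, change num_output of conv6_cls to the number of output classes you need, 21 here (training also runs without this change; whether that introduces errors still needs to be verified).

2. In network.py, modify the load function (line 63) as follows. There are two changes: (1) on the line before session.run(var.assign(data)), add the check if 'conv6_cls' not in var.name: (2) change ignore_missing to True.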

def load(self, data_path, session, ignore_missing=True):
        data_dict = np.load(data_path, encoding='latin1').item()
        for op_name in data_dict:
            with tf.variable_scope(op_name, reuse=True):
                for param_name, data in data_dict[op_name].items():
                    try:
                        if 'bn' in op_name:
                            param_name = BN_param_map[param_name]

                        var = tf.get_variable(param_name)
                        if 'conv6_cls' not in var.name:
                            session.run(var.assign(data))
                    except ValueError:
                        if not ignore_missing:
                            raise

3. Uncomment net.load(args.restore_from, sess). The training command is the same as above.
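
The reason for skipping conv6_cls is that the pretrained classifier layer was built for a different number of classes, so its weights do not fit the 21-class layer. The selection logic can be sketched on plain dictionaries as below. This is not the network.py code, and the layer name conv5_4 and all shapes are made up for illustration.

```python
def select_restorable(checkpoint, model_shapes, skip_layers=("conv6_cls",)):
    """Pick the checkpoint entries that can be copied into the model.

    checkpoint:   {layer: {param: (shape, values)}}  loaded weights
    model_shapes: {layer: {param: shape}}            variables in the new graph
    """
    restore, skipped = {}, []
    for layer, params in checkpoint.items():
        for param, (shape, values) in params.items():
            name = layer + "/" + param
            if any(s in layer for s in skip_layers):
                skipped.append(name)  # classifier head: keep its fresh initialisation
            elif model_shapes.get(layer, {}).get(param) != shape:
                skipped.append(name)  # missing in the model, or shape mismatch
            else:
                restore[name] = values
    return restore, sorted(skipped)


# Hypothetical example: a 19-class checkpoint and a 21-class model.
checkpoint = {
    "conv5_4": {"weights": ((1, 1, 128, 128), "w5")},
    "conv6_cls": {"weights": ((1, 1, 128, 19), "w6"), "biases": ((19,), "b6")},
}
model_shapes = {
    "conv5_4": {"weights": (1, 1, 128, 128)},
    "conv6_cls": {"weights": (1, 1, 128, 21), "biases": (21,)},
}
restore, skipped = select_restorable(checkpoint, model_shapes)
print(sorted(restore))  # ['conv5_4/weights']
print(skipped)          # ['conv6_cls/biases', 'conv6_cls/weights']
```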

1. In evaluate.py, following the existing cityscapes_param and ADE20k_param, write a voc_param and add the relevant fields. This article uses:

voc_param = {'name': 'voc',
             'input_size': [256, 256],
             'num_classes': 21,
             'ignore_label': 0,
             'num_steps': 2000,
             'data_dir': '/media/yue/DATA/vocdataset',
             'data_list': '/media/yue/DATA/vocdataset/icnetval.txt'}

2. Change the others entry in model_paths as follows:

'others': './snapshots/'

3. Add 'voc' to the choices of the --dataset argument:

parser.add_argument("--dataset", type=str, default='',
                        choices=['ade20k', 'cityscapes','voc'],
                        required=True)

4. Add the following to the preprocess function:

    elif param['name'] == 'voc':
        img = tf.expand_dims(img, axis=0)
        img = tf.image.resize_bilinear(img, shape, align_corners=True)

5. In main, add a branch for the new argument value:

    elif args.dataset == 'voc':
        param = voc_param

6. In main, add a voc branch where mIoU is computed:

    elif args.dataset == 'voc':
        mIoU, update_op = tf.contrib.metrics.streaming_mean_iou(pred, gt, num_classes=param['num_classes'])
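
For intuition, mean IoU can be computed from a confusion matrix as in the plain-Python sketch below. This is not the TensorFlow implementation, only the same idea on flat lists of class indices. The range check and the ignore_label parameter are additions for illustration, reflecting that every label counted must lie in [0, num_classes).

```python
def mean_iou(pred, gt, num_classes, ignore_label=None):
    """Mean IoU over the classes that occur in pred or gt (flat lists of class indices)."""
    conf = [[0] * num_classes for _ in range(num_classes)]  # rows: gt, cols: pred
    for p, g in zip(pred, gt):
        if g == ignore_label:
            continue  # pixels with the ignore label are not counted
        if not (0 <= g < num_classes and 0 <= p < num_classes):
            raise ValueError("label %d / prediction %d outside [0, %d)" % (g, p, num_classes))
        conf[g][p] += 1
    ious = []
    for c in range(num_classes):
        tp = conf[c][c]
        fn = sum(conf[c]) - tp
        fp = sum(row[c] for row in conf) - tp
        if tp + fp + fn > 0:  # skip classes that never occur
            ious.append(tp / float(tp + fp + fn))
    return sum(ious) / len(ious) if ious else 0.0


# Per-class IoU here is 1.0, 0.5 and 0.5, so the mean is 2/3.
score = mean_iou(pred=[0, 1, 1, 2], gt=[0, 1, 2, 2], num_classes=3)
```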

Run the command:

 python evaluate.py --dataset=voc --filter-scale=1 --model=others 

Error:

InvalidArgumentError (see above for traceback): assertion failed: [`labels` out of bound] [Condition x < y did not hold element-wise:x (mean_iou/confusion_matrix/control_dependency:0) = ] [1 1 1...] [y (mean_iou/ToInt64_2:0) = ] [8]
	 [[Node: mean_iou/confusion_matrix/assert_less/Assert/AssertGuard/Assert = Assert[T=[DT_STRING, DT_STRING, DT_INT64, DT_STRING, DT_INT64], summarize=3, _device="/job:localhost/replica:0/task:0/device:CPU:0"](mean_iou/confusion_matrix/assert_less/Assert/AssertGuard/Assert/Switch/_781, mean_iou/confusion_matrix/assert_less/Assert/AssertGuard/Assert/data_0, mean_iou/confusion_matrix/assert_less/Assert/AssertGuard/Assert/data_1, mean_iou/confusion_matrix/assert_less/Assert/AssertGuard/Assert/Switch_1/_783, mean_iou/confusion_matrix/assert_less/Assert/AssertGuard/Assert/data_3, mean_iou/confusion_matrix/assert_less/Assert/AssertGuard/Assert/Switch_2/_785)]]
	 [[Node: Gather_1/_817 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=1, tensor_name="edge_1308_Gather_1", tensor_type=DT_INT64, _device="/job:localhost/replica:0/task:0/device:GPU:0"]()]]

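Checking the semantic label images showed that a few pixels in the grayscale images had values larger than the number of classes, which suggested an error in the grayscale conversion. After regenerating the grayscale images they turned out to be correct; the error was introduced during the resize step, and correcting that step resolved the problem.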
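
The article does not say how the resize step was corrected. One common cause of invalid label values is resizing label images with an interpolating filter, which blends neighbouring class indices into numbers that are not valid labels; nearest-neighbour sampling avoids this. The dependency-free sketch below illustrates the difference, assuming that was the issue here.

```python
def resize_nearest(label, out_h, out_w):
    """Resize a 2-D list of class indices by picking the nearest source pixel."""
    in_h, in_w = len(label), len(label[0])
    return [
        [label[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]


def resize_linear_row(row, out_w):
    """Linear interpolation of one row (needs out_w > 1); shows the failure mode."""
    in_w = len(row)
    out = []
    for x in range(out_w):
        pos = x * (in_w - 1) / float(out_w - 1)
        lo = int(pos)
        hi = min(lo + 1, in_w - 1)
        out.append(row[lo] + (row[hi] - row[lo]) * (pos - lo))
    return out


label = [[0, 15], [15, 255]]
big = resize_nearest(label, 4, 4)
values = sorted({v for row in big for v in row})
print(values)  # [0, 15, 255]: only values that were already present

blended = resize_linear_row([0, 255], 5)
print(blended)  # [0.0, 63.75, 127.5, 191.25, 255.0]: new, invalid label values
```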

The model trained for 60000 steps reaches an mIoU of only 0.00797 even though the loss has dropped to 0.5; the cause has not been found yet.

1. In inference.py, change the 'others' entry in model_paths to 'others': './snapshots/'

2. Change

def main():
    args = get_arguments()

    if args.dataset == 'cityscapes':
        num_classes = cityscapes_class
    else:
        num_classes = ADE20k_class

Add a branch for voc, so that it becomes

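    if args.dataset == 'cityscapes':
        num_classes = cityscapes_class
    elif args.dataset == 'ade20k':  # must match the spelling in the --dataset choices
        num_classes = ADE20k_class
    elif args.dataset == 'voc':
        num_classes = voc_class  # needs to be defined alongside cityscapes_class (21 for VOC)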

3. Change

parser.add_argument("--dataset", type=str, default='',
                        choices=['ade20k', 'cityscapes'],
                        required=True)

to add the voc choice:

parser.add_argument("--dataset", type=str, default='',
                        choices=['ade20k', 'cityscapes','voc'],
                        required=True)
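
An alternative to keeping the choices list and the if/elif chain in step with each other is to drive both from one dictionary, so a dataset name only has to be spelled once. The sketch below is not the inference.py code; NUM_CLASSES is a name made up for illustration, and the class counts are the ones that appear in this article.

```python
import argparse

# Class counts per dataset; 21 for VOC matches num_classes used in this article.
NUM_CLASSES = {"ade20k": 150, "cityscapes": 19, "voc": 21}


def get_arguments(argv=None):
    parser = argparse.ArgumentParser(description="dataset selection sketch")
    # The accepted values come straight from the dictionary keys.
    parser.add_argument("--dataset", type=str, required=True,
                        choices=sorted(NUM_CLASSES))
    return parser.parse_args(argv)


args = get_arguments(["--dataset", "voc"])
num_classes = NUM_CLASSES[args.dataset]
print(num_classes)  # 21
```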

Run the command:

python inference.py --img-path=./input/2007_000027.jpg --model=others --dataset=voc --filter-scale=1

Error:

Traceback (most recent call last):
  File "inference.py", line 198, in <module>
    main()
  File "inference.py", line 160, in main
    pred = decode_labels(raw_output_up, shape, num_classes)
  File "/home/yue/TF-Project/ICNet-tensorflow-master/tools.py", line 39, in decode_labels
    pred = tf.matmul(onehot_output, color_mat)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/math_ops.py", line 1891, in matmul
    a, b, transpose_a=transpose_a, transpose_b=transpose_b, name=name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gen_math_ops.py", line 2437, in _mat_mul
    name=name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 2958, in create_op
    set_shapes_for_outputs(ret)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 2209, in set_shapes_for_outputs
    shapes = shape_func(op)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 2159, in call_with_requiring
    return call_cpp_shape_fn(op, require_shape_fn=True)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/common_shapes.py", line 627, in call_cpp_shape_fn
    require_shape_fn)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/common_shapes.py", line 691, in _call_cpp_shape_fn_impl
    raise ValueError(err.message)
ValueError: Dimensions must be equal, but are 21 and 19 for 'MatMul' (op: 'MatMul') with input shapes: [65536,21], [19,3].


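This shows that the failure happens when decode_labels is called. Looking into the file that defines this function, the branch on num_classes has to be changed. First create a label_colours table for voc, which specifies the colour used to mark each class. The values can be found in deeplab_v2/voc2012/utils.py (the format needs to be adapted; see the format of the cityscapes label_colours):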

label_colours_voc = [[  0,   0,   0],
                     [128,   0,   0],
                     [  0, 128,   0],
                     [128, 128,   0],
                     [  0,   0, 128],
                     [128,   0, 128],
                     [  0, 128, 128],
                     [128, 128, 128],
                     [ 64,   0,   0],
                     [192,   0,   0],
                     [ 64, 128,   0],
                     [192, 128,   0],
                     [ 64,   0, 128],
                     [192,   0, 128],
                     [ 64, 128, 128],
                     [192, 128, 128],
                     [  0,  64,   0],
                     [128,  64,   0],
                     [  0, 192,   0],
                     [128, 192,   0],
                     [  0,  64, 128]]
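
The 21 colours above follow the standard Pascal VOC colour map, which spreads the bits of the class index across the red, green and blue channels. They can therefore be generated instead of typed by hand, as in this sketch; the function name is made up for illustration.

```python
def voc_colormap(n=256):
    """Generate the first n entries of the Pascal VOC colour map as [r, g, b] lists."""
    cmap = []
    for i in range(n):
        r = g = b = 0
        c = i
        for j in range(8):
            # Bits 0, 1 and 2 of c go to r, g and b, starting at the most significant bit.
            r |= ((c >> 0) & 1) << (7 - j)
            g |= ((c >> 1) & 1) << (7 - j)
            b |= ((c >> 2) & 1) << (7 - j)
            c >>= 3
        cmap.append([r, g, b])
    return cmap


cmap = voc_colormap(21)
print(len(cmap))  # 21
print(cmap[20])   # [0, 64, 128]
```

Generating the table also makes it easy to check that its length equals num_classes, which is the condition the matmul in decode_labels depends on.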

Modify decode_labels:

def decode_labels(mask, img_shape, num_classes):
    if num_classes == 150:
        color_table = read_labelcolours(matfn)
    elif num_classes==21:
        color_table = label_colours_voc
    else:
        color_table = label_colours