Training FCN on your own annotated data with TensorFlow

By 阿新 • Published: 2019-01-01

I. Reproduce FCN first

Environment: Ubuntu 18.04 + TensorFlow (my setup)

1. Download the code:

Paper: https://arxiv.org/pdf/1605.06211v1.pdf

Talk video for the paper: http://techtalks.tv/talks/fully-convolutional-networks-for-semantic-segmentation/61606/

GitHub repository: https://github.com/shekkizh/FCN.tensorflow

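The implementation consists of four Python files: FCN.py, BatchDatasetReader.py, TensorFlowUtils.py, and read_MITSceneParsingData.py. Put all four in the same working directory.

2. Then download the VGG network weights. After downloading, the file should be at ./Model_zoo/imagenet-vgg-verydeep-19.mat.

3. Then download the dataset used for training and extract it to ./Data_zoo/MIT_SceneParsing/ADEChallengeData2016.

4. To train, set the global variable mode in FCN.py to "train" and run the file. To test, set it to "visualize" and run the file again.

This part is straightforward.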
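Before running FCN.py it can help to confirm that the files from steps 1 to 3 are where the code expects them. The following is a minimal sketch, not part of the repository; the helper name missing_paths is hypothetical, and the paths are simply the ones listed above.

```python
import os

# Paths taken from the steps above, relative to the directory that holds FCN.py.
EXPECTED_PATHS = [
    "FCN.py",
    "BatchDatasetReader.py",
    "TensorFlowUtils.py",
    "read_MITSceneParsingData.py",
    os.path.join("Model_zoo", "imagenet-vgg-verydeep-19.mat"),
    os.path.join("Data_zoo", "MIT_SceneParsing", "ADEChallengeData2016"),
]


def missing_paths(root="."):
    """Return the expected paths that do not exist under `root` (hypothetical helper)."""
    return [p for p in EXPECTED_PATHS if not os.path.exists(os.path.join(root, p))]


if __name__ == "__main__":
    for path in missing_paths():
        print("missing:", path)
```

If nothing is printed, the layout matches the steps above.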

The following walks through parts of the main file, FCN.py.

Loading the VGG parameters:

def vgg_net(weights, image):
    layers = (
        'conv1_1', 'relu1_1', 'conv1_2', 'relu1_2', 'pool1',

        'conv2_1', 'relu2_1', 'conv2_2', 'relu2_2', 'pool2',

        'conv3_1', 'relu3_1', 'conv3_2', 'relu3_2', 'conv3_3',
        'relu3_3', 'conv3_4', 'relu3_4', 'pool3',

        'conv4_1', 'relu4_1', 'conv4_2', 'relu4_2', 'conv4_3',
        'relu4_3', 'conv4_4', 'relu4_4', 'pool4',

        'conv5_1', 'relu5_1', 'conv5_2', 'relu5_2', 'conv5_3',
        'relu5_3', 'conv5_4', 'relu5_4'
    )

    net = {}
    current = image
    for i, name in enumerate(layers):
        kind = name[:4]
        if kind == 'conv':
            kernels, bias = weights[i][0][0][0][0]
            # matconvnet: weights are [width, height, in_channels, out_channels]
            # tensorflow: weights are [height, width, in_channels, out_channels]
            kernels = utils.get_variable(np.transpose(kernels, (1, 0, 2, 3)), name=name + "_w")
            bias = utils.get_variable(bias.reshape(-1), name=name + "_b")
            current = utils.conv2d_basic(current, kernels, bias)
        elif kind == 'relu':
            current = tf.nn.relu(current, name=name)
            if FLAGS.debug:
                utils.add_activation_summary(current)
        elif kind == 'pool':
            current = utils.avg_pool_2x2(current)
        net[name] = current

    return net

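This function mainly loads the pretrained VGG model coefficients. The file is in MAT format, so it can be read in Python with scipy.io.

The reading code is in TensorFlowUtils.py. The data contains a lot of information; what we need are the kernels and bias of each layer.

For layer i, the kernels are obtained as data['layers'][0][i][0][0][0][0][0], with shape [width, height, in_channels, out_channels], and the bias is obtained as data['layers'][0][i][0][0][0][0][1], with shape [1, out_channels].

All VGG-19 convolutions use 3x3 filters, so width and height are both 3.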
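The layout conversion done inside vgg_net can be tried on synthetic arrays. This is only a sketch with made-up data: the non-square 3x5 kernel is an assumption chosen so that the axis swap shows up in the shapes (real VGG-19 kernels are 3x3, where the swap does not change the shape).

```python
import numpy as np

# Synthetic stand-in for one conv layer's parameters; a non-square 3x5 kernel
# is used only so that the axis swap is visible in the shapes.
kernels_mat = np.zeros((3, 5, 64, 128), dtype=np.float32)  # [width, height, in, out]
bias_mat = np.zeros((1, 128), dtype=np.float32)            # [1, out]

# Swap the first two axes to get TensorFlow's [height, width, in, out] layout.
kernels_tf = np.transpose(kernels_mat, (1, 0, 2, 3))
# Flatten the bias to a 1-D vector of length out_channels.
bias_tf = bias_mat.reshape(-1)

print(kernels_tf.shape)  # (5, 3, 64, 128)
print(bias_tf.shape)     # (128,)
```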

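Note that the layer index i counts every operation: conv, relu, pool, and fc. So i=0 is a convolution, i=1 a relu, i=2 a convolution, i=3 a relu, i=4 a pool, i=5 a convolution, ..., i=37 a fully connected layer, and so on. The original VGG-19 uses 2x2 max-pooling, although vgg_net above calls utils.avg_pool_2x2 for its pool layers.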
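The index-to-layer correspondence can be reproduced by generating the layer names in order. In this sketch the conv/relu/pool names follow the tuple in vgg_net; the names of the fully connected tail (fc6 through prob) are an assumption about the .mat file and are not taken from the code above.

```python
# Layer order of VGG-19: each block is conv/relu pairs followed by a pool.
# The names of the fully connected tail (fc6 ... prob) are an assumption here.
def vgg19_layer_names():
    names = []
    convs_per_block = [2, 2, 4, 4, 4]
    for block, n_convs in enumerate(convs_per_block, start=1):
        for j in range(1, n_convs + 1):
            names.append("conv%d_%d" % (block, j))
            names.append("relu%d_%d" % (block, j))
        names.append("pool%d" % block)
    names += ["fc6", "relu6", "fc7", "relu7", "fc8", "prob"]
    return names


names = vgg19_layer_names()
for i in (0, 1, 2, 3, 4, 5, 37):
    print(i, names[i])
```

Because the tuple in vgg_net stops at relu5_4, its enumerate index i lines up with the index into the weights array for every layer it uses.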

Building the FCN network:

def inference(image, keep_prob):
    """
    Semantic segmentation network definition
    :param image: input image. Should have values in range 0-255
    :param keep_prob:
    :return:
    """

    # Load the model data
    print("setting up vgg initialized conv layers ...")
    model_data = utils.get_model_data(FLAGS.model_dir, MODEL_URL)

    mean = model_data['normalization'][0][0][0]
    mean_pixel = np.mean(mean, axis=(0, 1))

    weights = np.squeeze(model_data['layers'])  # drop the dimensions of size 1

    # Image preprocessing
    processed_image = utils.process_image(image, mean_pixel)

    with tf.variable_scope("inference"):
        image_net = vgg_net(weights, processed_image)
        conv_final_layer = image_net["conv5_3"]

        pool5 = utils.max_pool_2x2(conv_final_layer)

        W6 = utils.weight_variable([7, 7, 512, 4096], name="W6")
        b6 = utils.bias_variable([4096], name="b6")
        conv6 = utils.conv2d_basic(pool5, W6, b6)
        relu6 = tf.nn.relu(conv6, name="relu6")
        if FLAGS.debug:
            utils.add_activation_summary(relu6)
        relu_dropout6 = tf.nn.dropout(relu6, keep_prob=keep_prob)

        W7 = utils.weight_variable([1, 1, 4096, 4096], name="W7")
        b7 = utils.bias_variable([4096], name="b7")
        conv7 = utils.conv2d_basic(relu_dropout6, W7, b7)
        relu7 = tf.nn.relu(conv7, name="relu7")
        if FLAGS.debug:
            utils.add_activation_summary(relu7)
        relu_dropout7 = tf.nn.dropout(relu7, keep_prob=keep_prob)

        W8 = utils.weight_variable([1, 1, 4096, NUM_OF_CLASSESS], name="W8")
        b8 = utils.bias_variable([NUM_OF_CLASSESS], name="b8")
        conv8 = utils.conv2d_basic(relu_dropout7, W8, b8)
        # annotation_pred1 = tf.argmax(conv8, dimension=3, name="prediction1")

        # now to upscale to actual image size
        deconv_shape1 = image_net["pool4"].get_shape()
        W_t1 = utils.weight_variable([4, 4, deconv_shape1[3].value, NUM_OF_CLASSESS], name="W_t1")
        b_t1 = utils.bias_variable([deconv_shape1[3].value], name="b_t1")
        conv_t1 = utils.conv2d_transpose_strided(conv8, W_t1, b_t1, output_shape=tf.shape(image_net["pool4"]))
        fuse_1 = tf.add(conv_t1, image_net["pool4"], name="fuse_1")

        deconv_shape2 = image_net["pool3"].get_shape()
        W_t2 = utils.weight_variable([4, 4, deconv_shape2[3].value, deconv_shape1[3].value], name="W_t2")
        b_t2 = utils.bias_variable([deconv_shape2[3].value], name="b_t2")
        conv_t2 = utils.conv2d_transpose_strided(fuse_1, W_t2, b_t2, output_shape=tf.shape(image_net["pool3"]))
        fuse_2 = tf.add(conv_t2, image_net["pool3"], name="fuse_2")

        shape = tf.shape(image)
        deconv_shape3 = tf.stack([shape[0], shape[1], shape[2], NUM_OF_CLASSESS])
        W_t3 = utils.weight_variable([16, 16, NUM_OF_CLASSESS, deconv_shape2[3].value], name="W_t3")
        b_t3 = utils.bias_variable([NUM_OF_CLASSESS], name="b_t3")
        conv_t3 = utils.conv2d_transpose_strided(fuse_2, W_t3, b_t3, output_shape=deconv_shape3, stride=8)

        annotation_pred = tf.argmax(conv_t3, dimension=3, name="prediction")

    return tf.expand_dims(annotation_pred, dim=3), conv_t3
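To see how the three transposed convolutions bring the prediction back to the input resolution, the spatial sizes can be traced by hand. This is a rough sketch that assumes every pooling layer halves the size with SAME padding (rounding up) and that conv6 to conv8 keep the pool5 size; the function name fcn8s_sizes is hypothetical.

```python
import math


def fcn8s_sizes(height, width):
    """Spatial sizes through the network, assuming every pool halves with SAME padding."""
    sizes = {"image": (height, width)}
    h, w = height, width
    for name in ("pool1", "pool2", "pool3", "pool4", "pool5"):
        h, w = math.ceil(h / 2), math.ceil(w / 2)
        sizes[name] = (h, w)
    # conv6..conv8 keep the pool5 size (stride 1, SAME padding assumed).
    sizes["conv8"] = sizes["pool5"]
    # The three transposed convolutions are given explicit output shapes.
    sizes["conv_t1"] = sizes["pool4"]   # 2x upsampling, fused with pool4
    sizes["conv_t2"] = sizes["pool3"]   # 2x upsampling, fused with pool3
    sizes["conv_t3"] = sizes["image"]   # 8x upsampling back to the input size
    return sizes


for name, size in fcn8s_sizes(224, 224).items():
    print(name, size)
```

For a 224x224 input this gives 28x28 at pool3, 14x14 at pool4 and 7x7 at pool5, which is why the last step uses stride=8.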

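VGG-19 requires one preprocessing step on the input image: subtract the RGB mean computed over the training set from every pixel.

The VGG-19 RGB mean can be obtained with np.mean(data['normalization'][0][0][0], axis=(0, 1)); its value is [123.68 116.779 103.939]. I won't go into more detail here.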
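The mean computation and the subtraction can be illustrated with synthetic data. This sketch assumes the preprocessing is a plain per-channel subtraction; mean_image is a made-up stand-in for data['normalization'][0][0][0], filled with the RGB mean quoted above.

```python
import numpy as np

# Stand-in for data['normalization'][0][0][0]: an H x W x 3 mean image whose
# every pixel equals the RGB mean quoted above (synthetic, for illustration).
rgb_mean = np.array([123.68, 116.779, 103.939])
mean_image = np.tile(rgb_mean, (224, 224, 1))

# Averaging over the height and width axes leaves one value per channel.
mean_pixel = np.mean(mean_image, axis=(0, 1))

# Subtract the per-channel mean from a batch of images via broadcasting.
images = np.full((1, 224, 224, 3), 128.0)
processed = images - mean_pixel

print(mean_pixel.shape)   # (3,)
print(processed[0, 0, 0])
```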

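TensorFlowUtils.py mainly defines utility functions such as variable initialization, convolution and transposed convolution, pooling, batch normalization, and image preprocessing. read_MITSceneParsingData.py reads the dataset, and BatchDatasetReader.py builds batches from it.

Run FCN.py and training begins.

II. Create your own training data

1. Install the annotation tool

My environment: Ubuntu 18 + Python 3.6.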

# Python3
sudo apt-get install python3-pyqt5  # PyQt5
sudo pip3 install labelme

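2. Annotation:

Run labelme, open an image, then draw and name the annotations. Clicking Save generates a JSON file for that image.

To generate the PNG label image for a single file: labelme_json_to_dataset <filename>.json

Some of the generated label images look completely black, but reading the pixel values shows that the regions are labeled by class.

In practice we want to batch-process all the JSON files in a folder.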

import argparse
import json
import os
import os.path as osp
import warnings

import numpy as np
import PIL.Image
import yaml

from labelme import utils


def main():
    parser = argparse.ArgumentParser()
    parser.add_argument('json_file')
    parser.add_argument('-o', '--out', default=None)
    args = parser.parse_args()

    json_file = args.json_file

    # Use a name that does not shadow the built-in `list`.
    file_names = os.listdir(json_file)
    for file_name in file_names:
        path = os.path.join(json_file, file_name)
        if os.path.isfile(path):
            # Close the JSON file as soon as it has been parsed.
            with open(path) as f:
                data = json.load(f)
            img = utils.img_b64_to_arr(data['imageData'])
            lbl, lbl_names = utils.labelme_shapes_to_label(img.shape, data['shapes'])

            captions = ['%d: %s' % (l, name) for l, name in enumerate(lbl_names)]
            lbl_viz = utils.draw_label(lbl, img, captions)
            out_dir = osp.basename(file_name).replace('.', '_')
            out_dir = osp.join(osp.dirname(file_name), out_dir)
            if not osp.exists(out_dir):
                os.mkdir(out_dir)
            if not osp.exists(out_dir):
                os.mkdir(out_dir)

            PIL.Image.fromarray(img).save(osp.join(out_dir, 'img.png'))
            PIL.Image.fromarray(lbl).save(osp.join(out_dir, 'label.png'))
            PIL.Image.fromarray(lbl_viz).save(osp.join(out_dir, 'label_viz.png'))

            with open(osp.join(out_dir, 'label_names.txt'), 'w') as f:
                for lbl_name in lbl_names:
                    f.write(lbl_name + '\n')

            warnings.warn('info.yaml is being replaced by label_names.txt')
            info = dict(label_names=lbl_names)
            with open(osp.join(out_dir, 'info.yaml'), 'w') as f:
                yaml.safe_dump(info, f, default_flow_style=False)

            print('Saved to: %s' % out_dir)


if __name__ == '__main__':
    main()

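Usage: python3 <this script's filename> <folder containing your JSON files>

This code saves the labels in the JSON files as PNG images, but there is a problem. In a multi-class segmentation task, a given image may not contain every class. As a result, across all the label images generated for the folder, the same class may map to different gray values in different label.png files, so the labels are inconsistent and training goes wrong. For example, suppose there are 10 classes in total. If one image contains only 3 of them, its labels are [0, 1, 2]; if another image contains 2 of them, its labels are [0, 1], and the 1 in the two images need not refer to the same class.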
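One way to make the labels consistent is to map each image's local label ids onto a single dataset-wide class list. The following is a minimal sketch of that idea on plain nested lists, not the modified script itself; remap_label is a hypothetical helper, and using '_background_' as the background name is an assumption.

```python
# Global class list shared by the whole dataset; index = final label value.
# The class names follow the four classes used later in this post, and
# '_background_' as the background name is an assumption.
CLASS_NAMES = ["_background_", "curtain", "pool", "closestool", "window"]


def remap_label(label, local_names, class_names=CLASS_NAMES):
    """Map per-image label ids to dataset-wide ids (hypothetical helper).

    label: 2-D list of per-image ids; local_names[k] is the name for id k.
    """
    lookup = [class_names.index(name) for name in local_names]
    return [[lookup[value] for value in row] for row in label]


# Image A contains pool and window only, image B contains closestool only.
label_a = [[0, 1], [2, 2]]
label_b = [[0, 1], [1, 0]]
print(remap_label(label_a, ["_background_", "pool", "window"]))  # [[0, 2], [4, 4]]
print(remap_label(label_b, ["_background_", "closestool"]))      # [[0, 3], [3, 0]]
```

The local id 1 means pool in image A and closestool in image B; after remapping they become 2 and 3, so each value refers to one class across the dataset.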

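To handle this, I modified the code. Usage:

python3 batch_json_to_dataset.py json labels

Here json is the folder of JSON files and labels is the folder that receives the converted PNG images. The label images generated in the labels folder all look black. To visualize them, run python3 batch_color_map.py labels out 5, where the final 5 is the number of classes. I have four classes (blue: curtain, red: pool, green: closestool, yellow: window), and the background also counts as a class, hence 5.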
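The label images look black because their pixel values are small class ids (0 to 4 here) on a 0 to 255 scale. Mapping each id to a distinct colour makes them visible. The sketch below shows the idea on a nested list; it is not batch_color_map.py, and the assignment of class indices to colours is an assumption (only the colour per class name comes from the text above).

```python
# Hypothetical palette: the class-index-to-colour assignment is an assumption;
# only the colour names per class come from the text above.
PALETTE = [
    (0, 0, 0),      # 0: background (black)
    (0, 0, 255),    # 1: curtain (blue)
    (255, 0, 0),    # 2: pool (red)
    (0, 255, 0),    # 3: closestool (green)
    (255, 255, 0),  # 4: window (yellow)
]


def colorize(label, palette=PALETTE):
    """Turn a 2-D list of class ids into a 2-D list of RGB tuples."""
    return [[palette[value] for value in row] for row in label]


label = [[0, 1], [2, 4]]
print(colorize(label))
```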

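(Original image of the last visualization; the annotation is quite rough.)

The code depends on several packages, which I packed into an archive; download link:

Now that the dataset is ready, modify the code and start training.

I have 4 classes, 5 including the background, so NUM_OF_CLASSESS = 5  # number of classes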
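Since the network outputs NUM_OF_CLASSESS channels, every pixel value in the label images should lie in the range 0 to NUM_OF_CLASSESS - 1. A quick sanity check along these lines can catch mislabeled images before training; out_of_range_values is a hypothetical helper that works on a label given as a nested list.

```python
NUM_OF_CLASSESS = 5  # four classes plus background, as set above


def out_of_range_values(label, num_classes=NUM_OF_CLASSESS):
    """Return the sorted label values that fall outside [0, num_classes)."""
    values = {value for row in label for value in row}
    return sorted(v for v in values if not 0 <= v < num_classes)


print(out_of_range_values([[0, 1], [4, 2]]))    # []
print(out_of_range_values([[0, 5], [255, 2]]))  # [5, 255]
```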

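My own dataset is small, fewer than 100 images, and I trained for 1000 iterations. This post is only a getting-started guide.

To test, follow the author's code and change train to visualize.

I will add more later if needed. Corrections are welcome.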
