[Deep Learning] ResNet Explained, with a Code Implementation

阿新 • Published: 2018-11-16

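Introduction

ResNet is a network architecture proposed in 2015 by Kaiming He and colleagues. It won first place in the ILSVRC-2015 classification task, and also took first place in ImageNet detection, ImageNet localization, COCO detection and COCO segmentation, which caused quite a stir at the time.

ResNet, short for residual network, brings the idea of residual learning into conventional convolutional neural networks. It addresses vanishing gradients and the drop in training-set accuracy seen in deep networks, so networks can be made much deeper while keeping accuracy high and computational cost under control.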

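Motivation

As a network gets deeper, the vanishing gradient problem gets worse, making the network hard or even impossible to train to convergence. There are now many remedies for vanishing gradients, including normalized weight initialization, input data normalization and intermediate normalization layers (Batch Normalization). But added depth brings a second problem: training-set accuracy goes down as the network gets deeper, as shown in the figure below.

[Figure: training-set accuracy degrading as a plain network gets deeper]

Many readers' first reaction will be "isn't that just overfitting?" It is not. Overfitting usually means a model does well on the training set and poorly on the test set, whereas here the accuracy on the training set itself drops. To tackle this problem, Kaiming He and his co-authors proposed the idea of residual learning.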

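What is residual learning?

[Figure: a residual block]

The idea of residual learning is captured in the figure above. It can be thought of as a block, defined as: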

$y=F(x,\{W_i\})+x$

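A residual learning block contains two branches, or two kinds of mapping:

1. Identity mapping: the curved line on the right of the figure above. As the name suggests, it maps the input to itself, i.e. $x$ is passed through unchanged;

2. Residual mapping: the other branch, i.e. the $F(x)$ part. It is called the residual mapping because it equals $y-x$.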
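
To make the two branches concrete, here is a minimal pure-Python sketch of $y=F(x)+x$ on a small vector. The names `residual_block` and `toy_f` are made up for illustration, and `toy_f` is an arbitrary stand-in for the stacked convolutions; it is not the residual branch used in the paper.

```python
def residual_block(x, residual_fn):
    """Return y = F(x) + x for a vector x (a list of floats).

    residual_fn plays the role of F(x, {W_i}); it must return a list
    with the same length as x so the element-wise addition is defined.
    """
    fx = residual_fn(x)
    if len(fx) != len(x):
        raise ValueError("F(x) and x must have the same shape")
    return [f + xi for f, xi in zip(fx, x)]


def toy_f(x):
    """Hypothetical residual branch: scales every element by 0.1."""
    return [0.1 * xi for xi in x]


x = [1.0, -2.0, 3.0]
y = residual_block(x, toy_f)

# The residual branch only has to model the difference y - x.
recovered_residual = [yi - xi for yi, xi in zip(y, x)]
```

Subtracting `x` from `y` gives back the output of `toy_f` (up to floating-point rounding), which is exactly the sense in which $F(x)$ is the residual $y-x$.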

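Why does residual learning fix the "accuracy drops as the network gets deeper" problem?

Suppose a network of some depth is already optimal. Training can then easily drive the residual mapping of any additional block to 0, leaving only the identity mapping, so in theory the network stays at the optimum no matter how much depth is added. All the extra layers simply pass information along the identity mapping. You can think of the layers stacked after the optimal network as switched off (they extract no new features) and contributing nothing. As a result, the network's performance no longer degrades as depth increases.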
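
The argument can be checked on a toy example. The sketch below assumes a residual branch of the form relu(W x), an illustrative choice only, and sets every weight to zero. The stack then behaves as an identity function whether it has 2 blocks or 50. The helper names `linear_relu_branch` and `stack` are hypothetical.

```python
def linear_relu_branch(weights):
    """Build a toy residual branch F(x) = relu(W x) from a square matrix W."""
    def f(x):
        out = []
        for row in weights:
            s = sum(w * xi for w, xi in zip(row, x))
            out.append(max(0.0, s))
        return out
    return f


def stack(x, branches):
    """Apply residual blocks one after another: x <- F(x) + x."""
    for f in branches:
        x = [fi + xi for fi, xi in zip(f(x), x)]
    return x


dim = 3
zero_w = [[0.0] * dim for _ in range(dim)]  # residual mapping driven to 0
x = [0.5, -1.0, 2.0]

shallow = stack(x, [linear_relu_branch(zero_w)] * 2)
deep = stack(x, [linear_relu_branch(zero_w)] * 50)
```

Both `shallow` and `deep` equal the input `x`, so adding blocks whose residual is zero cannot make the output worse than that of the shallower stack.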

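Network architecture

The paper uses the term "shortcut connection". It simply refers to the path that carries the identity mapping; I mention it here so it does not cause confusion later. For ResNets of different depths, the authors propose two kinds of residual block:

[Figure: basic residual block (left) and bottleneck block (right)]

Some notes on the figure above:

1. The left block is the basic residual block. Its residual mapping is two 3x3 convolutions with 64 channels each; input and output both have 64 channels, so they can be added directly. This block is used mainly in the shallower networks, such as ResNet-34;

2. The right block is designed for deeper networks and is called the "bottleneck" block. Its main purpose is dimensionality reduction. A 1x1 convolution first reduces the 256 channels to 64, a 3x3 convolution then operates on those 64 channels, and a final 1x1 convolution with 256 channels restores the original dimension.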
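
A quick count shows why the reduction matters. The sketch below counts convolution weights only (biases and batch normalization parameters are ignored, a simplifying assumption) and compares the bottleneck with a hypothetical block that applies two 3x3 convolutions directly at 256 channels. `conv_weights` is an illustrative helper, not a library function.

```python
def conv_weights(kernel, in_ch, out_ch):
    """Number of weights in a kernel x kernel convolution (biases ignored)."""
    return kernel * kernel * in_ch * out_ch


# Basic block from the left of the figure: two 3x3 convolutions at 64 channels.
basic_64 = 2 * conv_weights(3, 64, 64)

# Hypothetical alternative: two 3x3 convolutions applied directly at 256 channels.
plain_256 = 2 * conv_weights(3, 256, 256)

# Bottleneck: 1x1 reduce (256 -> 64), 3x3 at 64 channels, 1x1 restore (64 -> 256).
bottleneck = (conv_weights(1, 256, 64)
              + conv_weights(3, 64, 64)
              + conv_weights(1, 64, 256))

ratio = plain_256 / bottleneck
```

Under these assumptions the bottleneck needs 69,632 weights, close to the 73,728 of the 64-channel basic block, while the 256-channel variant without the 1x1 reductions would need 1,179,648, roughly 17 times more.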

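From the above we know that the residual mapping and the identity mapping are added element-wise, channel by channel. So what happens when their channel dimensions differ?

The authors' answer is to apply a 1x1 convolution on the identity mapping branch, written as: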

$y=F(x,\{W_i\})+W_sx$

where $W_s$ denotes the 1x1 convolution.
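
A 1x1 convolution applies the same channel-mixing matrix at every spatial position, so a single pixel is enough to illustrate $y=F(x,\{W_i\})+W_sx$. In the sketch below the 2-channel input, the 4-channel residual output and the entries of `w_s` are all made-up values, and `matvec` and `projected_residual` are hypothetical helper names.

```python
def matvec(w, x):
    """Multiply matrix w (list of rows) by vector x."""
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in w]


def projected_residual(x, fx, w_s):
    """Return y = F(x) + W_s x, where W_s maps x to the shape of F(x)."""
    shortcut = matvec(w_s, x)
    if len(shortcut) != len(fx):
        raise ValueError("W_s x must match the shape of F(x)")
    return [f + s for f, s in zip(fx, shortcut)]


x = [1.0, 2.0]              # 2 input channels at one pixel
fx = [0.5, 0.5, 0.5, 0.5]   # made-up 4-channel output of the residual branch
w_s = [[1.0, 0.0],          # 4 x 2 projection: the 1x1 convolution weights
       [0.0, 1.0],
       [1.0, 1.0],
       [0.0, 0.0]]

y = projected_residual(x, fx, w_s)
```

Without `w_s` the addition would fail, because a 2-channel `x` cannot be added to a 4-channel `fx`. With it, `y` has 4 channels.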

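The figure below compares the architectures of VGG-19, Plain-34 (no residual structure) and ResNet-34:

[Figure: VGG-19, Plain-34 and ResNet-34 side by side]

Some notes on the figure above:

1. Compared with VGG-19, ResNet has no large hidden fully connected layers. It uses global average pooling followed by a single fully connected classification layer, which removes a large number of parameters. In VGG-19, most of the parameters sit in the fully connected layers;

2. In ResNet-34, a solid shortcut line means the identity mapping and the residual mapping have the same number of channels. A dotted line means they differ, and a 1x1 convolution is needed to adjust the channel dimension so the two can be added.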
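
Point 1 can be put in numbers. The sketch below counts only the weights of the classifier heads (biases ignored). It assumes the standard ImageNet setups: VGG-19 flattens a 7x7x512 feature map into two 4096-unit layers and a 1000-way output, while ResNet-34 feeds a 512-dimensional pooled vector into a 1000-way output. `dense_weights` is an illustrative helper.

```python
def dense_weights(n_in, n_out):
    """Number of weights in a fully connected layer (biases ignored)."""
    return n_in * n_out


# VGG-19 head: flatten(7 x 7 x 512) -> 4096 -> 4096 -> 1000
vgg19_head = (dense_weights(7 * 7 * 512, 4096)
              + dense_weights(4096, 4096)
              + dense_weights(4096, 1000))

# ResNet-34 head: global average pooling (no weights) -> 512 -> 1000
resnet34_head = dense_weights(512, 1000)
```

On these assumptions the VGG-19 head holds 123,633,664 weights against 512,000 for ResNet-34, which is why replacing the flattened fully connected layers with global average pooling saves so many parameters.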

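The paper defines five ResNet variants (18, 34, 50, 101 and 152 layers). Their layer configurations are summarized in the table below:

[Table: layer configurations of the five ResNet variants]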
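
As a sanity check on the naming, the sketch below derives each variant's layer count from its block type and the number of blocks in conv2_x through conv5_x. The repetition counts are those of the paper's architecture table. The dictionary layout and the helper `weighted_layers` are illustrative choices.

```python
# (block type, number of blocks in conv2_x, conv3_x, conv4_x, conv5_x)
RESNET_CONFIGS = {
    "ResNet-18": ("basic", [2, 2, 2, 2]),
    "ResNet-34": ("basic", [3, 4, 6, 3]),
    "ResNet-50": ("bottleneck", [3, 4, 6, 3]),
    "ResNet-101": ("bottleneck", [3, 4, 23, 3]),
    "ResNet-152": ("bottleneck", [3, 8, 36, 3]),
}

# A basic block has two convolutions, a bottleneck block has three.
CONVS_PER_BLOCK = {"basic": 2, "bottleneck": 3}


def weighted_layers(block_type, repetitions):
    """Count weighted layers: block convolutions + first 7x7 conv + final FC."""
    return CONVS_PER_BLOCK[block_type] * sum(repetitions) + 2


layer_counts = {name: weighted_layers(block, reps)
                for name, (block, reps) in RESNET_CONFIGS.items()}
```

Each computed count matches the number in the variant's name, for example 2 * (2 + 2 + 2 + 2) + 2 = 18 for ResNet-18. Projection shortcuts are not counted.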

Code implementation

This section implements ResNet-18 in Keras.

from keras.layers import Input
from keras.layers import Conv2D, MaxPool2D, Dense, BatchNormalization, Activation, add, GlobalAvgPool2D
from keras.models import Model
from keras import regularizers
from keras.utils import plot_model
from keras import backend as K


def conv2d_bn(x, nb_filter, kernel_size, strides=(1, 1), padding='same'):
    """
    conv2d -> batch normalization -> relu activation
    """
    x = Conv2D(nb_filter, kernel_size=kernel_size,
                          strides=strides,
                          padding=padding,
                          kernel_regularizer=regularizers.l2(0.0001))(x)
    x = BatchNormalization()(x)
    x = Activation('relu')(x)
    return x


def shortcut(input, residual):
    """
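    Shortcut connection, i.e. the identity mapping branch: adds the input
    (projected by a 1x1 convolution when shapes differ) to the residual branch.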
    """

    input_shape = K.int_shape(input)
    residual_shape = K.int_shape(residual)
    stride_height = int(round(input_shape[1] / residual_shape[1]))
    stride_width = int(round(input_shape[2] / residual_shape[2]))
    equal_channels = input_shape[3] == residual_shape[3]

    identity = input
    # If the spatial size or channel count differs, project the input with a 1x1 convolution
    if stride_width > 1 or stride_height > 1 or not equal_channels:
        identity = Conv2D(filters=residual_shape[3],
                           kernel_size=(1, 1),
                           strides=(stride_height, stride_width),  # Keras expects (height, width)
                           padding="valid",
                           kernel_regularizer=regularizers.l2(0.0001))(input)

    return add([identity, residual])


def basic_block(nb_filter, strides=(1, 1)):
    """
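    Basic ResNet building block, used in ResNet-18 and ResNet-34.
    """
    def f(input):
        # First 3x3 conv (may downsample via strides), followed by BN and ReLU
        conv1 = conv2d_bn(input, nb_filter, kernel_size=(3, 3), strides=strides)
        # Second 3x3 conv and BN; as in the paper, ReLU is applied after the addition
        residual = Conv2D(nb_filter, kernel_size=(3, 3), padding='same',
                          kernel_regularizer=regularizers.l2(0.0001))(conv1)
        residual = BatchNormalization()(residual)

        return Activation('relu')(shortcut(input, residual))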

    return f


def residual_block(nb_filter, repetitions, is_first_layer=False):
    """
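    Build one stage of residual blocks, corresponding to conv2_x -> conv5_x
    in the paper's architecture table. The first block of a stage downsamples
    with stride 2 unless is_first_layer is True.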
    """
    def f(input):
        for i in range(repetitions):
            strides = (1, 1)
            if i == 0 and not is_first_layer:
                strides = (2, 2)
            input = basic_block(nb_filter, strides)(input)
        return input

    return f


def resnet_18(input_shape=(224,224,3), nclass=1000):
    """
    build resnet-18 model using keras with TensorFlow backend.
    :param input_shape: input shape of network, default as (224,224,3)
    :param nclass: number of classes (size of the output layer), default 1000
    :return: resnet-18 model
    """
    input_ = Input(shape=input_shape)

    conv1 = conv2d_bn(input_, 64, kernel_size=(7, 7), strides=(2, 2))
    pool1 = MaxPool2D(pool_size=(3, 3), strides=(2, 2), padding='same')(conv1)

    # Only conv2_x keeps the resolution (it follows the max pooling layer);
    # conv3_x to conv5_x halve it in their first block.
    conv2 = residual_block(64, 2, is_first_layer=True)(pool1)
    conv3 = residual_block(128, 2)(conv2)
    conv4 = residual_block(256, 2)(conv3)
    conv5 = residual_block(512, 2)(conv4)

    pool2 = GlobalAvgPool2D()(conv5)
    output_ = Dense(nclass, activation='softmax')(pool2)

    model = Model(inputs=input_, outputs=output_)
    model.summary()

    return model

if __name__ == '__main__':
    model = resnet_18()
    plot_model(model, 'ResNet-18.png')  # save a diagram of the model (requires pydot and Graphviz)
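
Before training, it helps to know what feature-map sizes to expect in `model.summary()`. The sketch below traces the spatial size of a 224x224 input through the network without needing Keras. It assumes the paper's layout, where conv1 and the max pooling layer each halve the resolution, conv2_x keeps it, and conv3_x through conv5_x each halve it in their first block. It also assumes 'same' padding, for which the output size is ceil(size / stride). `same_out` and `trace_resnet18` are hypothetical helpers.

```python
import math


def same_out(size, stride):
    """Output length of a 'same'-padded conv or pool: ceil(size / stride)."""
    return math.ceil(size / stride)


def trace_resnet18(size=224):
    """Return the spatial size after each stage of the assumed ResNet-18 layout."""
    sizes = {}
    size = same_out(size, 2)   # 7x7 conv, stride 2
    sizes["conv1"] = size
    size = same_out(size, 2)   # 3x3 max pooling, stride 2
    sizes["pool1"] = size
    # Stride of the first block in each stage; later blocks use stride 1.
    for name, stride in [("conv2_x", 1), ("conv3_x", 2),
                         ("conv4_x", 2), ("conv5_x", 2)]:
        size = same_out(size, stride)
        sizes[name] = size
    return sizes


sizes = trace_resnet18()
```

For a 224x224 input this gives 112, 56, 56, 28, 14 and 7, so global average pooling should see a 7x7x512 feature map. If the summary shows a larger final map, a stage is probably not downsampling.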
