TensorFlow in Practice: Chapter 6 (CNN-4: Classic Convolutional Neural Networks (ResNet))

ResNet

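Introduction to ResNet

ResNet (Residual Neural Network) was proposed by Kaiming He and three co-authors at Microsoft Research. Using residual units, they successfully trained a 152-layer network that won the ILSVRC 2015 competition with a top-5 error rate of 3.57%. ResNet has a relatively small number of parameters, and the added residual units greatly speed up the training of very deep networks while substantially improving accuracy. This section focuses on Kaiming He et al.'s paper "Deep Residual Learning for Image Recognition" and on how to implement ResNet in TensorFlow.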

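Related work

Before ResNet, Professor Schmidhuber, who works in Switzerland, and his co-authors proposed the Highway Network, whose principle is very similar to that of ResNet. Schmidhuber is even better known for another invention he co-authored: the LSTM network.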

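  • What problem does the Highway Network solve?
    It is generally believed that depth is very important to a neural network's performance, but training becomes harder as depth grows. The goal of the Highway Network is to solve the problem that very deep neural networks are difficult to train.

  • How does the Highway Network work?
    A Highway Network effectively modifies the activation of every layer. Previously, a layer applied a nonlinear transformation to its input, y = H(x, W). A Highway Network instead lets a proportion of the original input x pass through: y = H(x, W1) * T(x, W2) + x * C(x, W3), where T is the transform gate and C is the carry gate. With this design, a proportion of the previous layer's information travels directly to the next layer without going through matrix multiplication and a nonlinear activation, like a highway for information, hence the name Highway Network. A Highway Network uses gating units to learn how to control the flow of information through the network, that is, how much of the original information to keep. This learnable gating mechanism is borrowed from the gating in Schmidhuber's earlier LSTM work. It was the introduction of the Highway Network that made networks with hundreds of layers, or even around a thousand, trainable.

  • How are ResNet and the Highway Network related?
    ResNet and the Highway Network are very similar: both address the problems that arise as networks get deeper, and the Highway Network's approach informed ResNet's solution. (How ResNet frames the problem is explained in detail below.)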
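
The gating formula above can be sketched for a single scalar unit. This is only a minimal illustration, not code from the Highway Network authors: it assumes the simplification C = 1 - T used in the Highway Network paper, a sigmoid transform gate and a tanh transform H. The weights and the function name `highway_unit` are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def highway_unit(x, w_h, b_h, w_t, b_t):
    """One scalar highway unit: y = H(x) * T(x) + x * (1 - T(x))."""
    h = math.tanh(w_h * x + b_h)   # candidate transform H
    t = sigmoid(w_t * x + b_t)     # transform gate T, a value in (0, 1)
    return h * t + x * (1.0 - t)   # carry gate C = 1 - T (assumed)


# A strongly negative gate bias closes the transform gate, so the input is
# carried through almost unchanged.
carried = highway_unit(0.5, w_h=2.0, b_h=0.0, w_t=0.0, b_t=-20.0)
# A strongly positive gate bias opens it, so the output is close to H(x).
transformed = highway_unit(0.5, w_h=2.0, b_h=0.0, w_t=0.0, b_t=20.0)
print(carried, transformed)
```

The two calls show the two ends of the "highway": with the gate closed the unit behaves like an identity connection, and with the gate open it behaves like an ordinary nonlinear layer.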

Paper analysis

Motivating the problem

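Let us start with a question: can a network's performance be improved simply by stacking more layers?

This question is hard to answer, because gradients may vanish or explode during training. With normalized initialization and intermediate normalization layers, combined with backpropagation-based SGD, networks with tens of layers can now largely be trained to convergence.

But as the network gets deeper, another problem is exposed:
as depth increases, the network's error rate goes up rather than down.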

See the figure below:

[Figure: training and test error as network depth increases]

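We call this problem degradation. The first thing we can establish is that degradation is not caused by overfitting. In the figure, as layers are added the error on the test set rises, but the error on the training set rises as well. If the cause were overfitting, the training error would not go up. So degradation is not the result of overfitting.

So where does the problem lie?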

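To answer this question, we need to analyze why degradation occurs:
The degradation problem shows that not all systems are equally easy to optimize. Consider a shallow network that has reached a certain level of performance, and build a deeper network by stacking new layers on top of it. The deeper network should perform no worse than the shallow one: if the shallow network is already optimal, the added layers only need to learn an identity mapping. In practice, however, the training error rises as well, so the difficulty lies in the network itself: the stacked nonlinear layers are hard to optimize toward that identity mapping.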

The solution

To address this problem, the paper proposes a new network structure: the deep residual learning framework. The following explains how this structure improves on earlier networks.

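Instead of letting the stacked layers fit the desired underlying mapping directly, we let them fit a different mapping, called the residual mapping. Denote the desired mapping by H(x) and let the stacked nonlinear layers fit F(x) = H(x) - x. The desired mapping can then be written as H(x) = F(x) + x. The structure F(x) + x is the paper's residual block: compared with a plain stack of layers that outputs only F(x), the sole difference is the added x.
Consider the extreme case in which the mapping to learn is an identity mapping, H(x) = x. The hypothesis is that driving the residual F(x) to 0 is easier than making a stack of nonlinear layers fit the identity mapping directly.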
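
The extreme case can be checked numerically. The sketch below is only illustrative: `plain_block` and `residual_block` are hypothetical helpers that apply a two-layer fully connected F with ReLU to a plain Python list, standing in for the convolutional layers of the paper. With every weight set to zero, the residual block is exactly the identity, while the plain stack outputs zeros.

```python
def relu(v):
    return [max(0.0, a) for a in v]


def matvec(w, v):
    return [sum(wij * vj for wij, vj in zip(row, v)) for row in w]


def plain_block(x, w1, w2):
    """F(x) = W2 * relu(W1 * x): the stacked layers on their own."""
    return matvec(w2, relu(matvec(w1, x)))


def residual_block(x, w1, w2):
    """H(x) = F(x) + x: the same layers plus an identity shortcut."""
    return [f + xi for f, xi in zip(plain_block(x, w1, w2), x)]


x = [1.0, -2.0, 3.0]
zeros = [[0.0] * 3 for _ in range(3)]

print(plain_block(x, zeros, zeros))     # [0.0, 0.0, 0.0]
print(residual_block(x, zeros, zeros))  # [1.0, -2.0, 3.0]
```

A plain stack would have to learn specific non-zero weights to reproduce x, whereas the residual block gets the identity for free when its weights shrink toward zero.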

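The rest of the paper is devoted to showing experimentally that this new mapping works.

To show that the new mapping is better, a new network architecture first has to be designed and implemented.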

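Implementing the residual mapping

Each residual unit can be drawn as in the figure below:

[Figure: a residual learning building block with a shortcut connection]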

The building block we now optimize has a residual structure, defined as:

    y = F(x,W) + x
    # x and y are the input and output vectors

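This structure is implemented by adding a shortcut connection to the original feedforward network; the shortcut skips one or more layers. The new block adds no extra parameters, the network can still be trained with SGD, and it is easy to implement. This lets us compare the plain network and the residual network on equal terms.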
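
The claim that an identity shortcut adds no parameters can be made concrete with a small count. This is a sketch under stated assumptions: F is taken to be two 3x3 convolutions with equal input and output channels, biases and batch-norm parameters are ignored, and the helper names are hypothetical.

```python
def conv_params(kernel, c_in, c_out):
    """Weights in one kernel x kernel convolution (biases ignored)."""
    return kernel * kernel * c_in * c_out


def block_params(channels, use_identity_shortcut):
    """Two stacked 3x3 convolutions, optionally wrapped as y = F(x, W) + x."""
    f_params = 2 * conv_params(3, channels, channels)
    shortcut_params = 0  # an identity shortcut is only an element-wise add
    return f_params + (shortcut_params if use_identity_shortcut else 0)


plain = block_params(64, use_identity_shortcut=False)
residual = block_params(64, use_identity_shortcut=True)
print(plain, residual)  # 73728 73728
```

Because the two counts are equal, any difference in accuracy between the plain and the residual network comes from the shortcut itself, not from extra capacity.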

Experiments

The paper compares three networks:

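  • Left: the VGG-19 model (19.6 billion FLOPs);

  • Middle: a plain network built mostly from small 3*3 convolution kernels (following the VGG design philosophy) and obeying two design rules:

    • layers with the same output feature map size have the same number of filters
    • if the output feature map size is halved, the number of filters is doubled, preserving the time complexity per layer

    The network has 34 layers and ends with a global average pooling layer and a 1000-way softmax layer.
    The plain network needs 3.6 billion FLOPs.

  • Right: the residual model. It is built on the plain network, with many shortcut lines added alongside it. These lines are called shortcuts or skip connections and represent identity mappings.
    Two cases arise when applying a shortcut:

    • the input and output dimensions match, so the shortcut connects directly
    • the output dimension increases, in which case there are two options
      1. keep the identity shortcut and pad the missing dimensions with zeros
      2. change the dimension with a 1*1 convolution (a projection), an idea also used in Inception Net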
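
The plain network's second design rule (halve the feature map, double the filters) can be verified with a little arithmetic. The sketch below counts the multiply-adds of one convolution as H * W * k * k * C_in * C_out. `conv_madds` is a hypothetical helper, and the 56x56x64 and 28x28x128 shapes are example values chosen for illustration.

```python
def conv_madds(height, width, kernel, c_in, c_out):
    """Multiply-adds of one convolution over a height x width output map."""
    return height * width * kernel * kernel * c_in * c_out


# Same 3x3 kernel; spatial size halved, channel count doubled.
before = conv_madds(56, 56, 3, 64, 64)
after = conv_madds(28, 28, 3, 128, 128)
print(before, after)  # 115605504 115605504
```

Halving both height and width divides the cost by four, and doubling both the input and output channels multiplies it by four, so the per-layer cost stays the same.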

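[Figure: VGG-19 (left), the 34-layer plain network (middle) and the 34-layer residual network (right)]

ResNets of different depths were also configured, as shown:

[Table: ResNet configurations at different depths]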
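
For the bottleneck variants, the nominal depth follows from the number of residual units per block. The sketch below is an illustration under stated assumptions: the unit counts are read off the `Block` definitions in the code later in this article, each bottleneck unit is counted as three convolutions, and the 1x1 projection shortcuts are not counted. `UNITS` and `count_layers` are hypothetical names.

```python
# Residual units per block, taken from the resnet_v2_* functions in the listing.
UNITS = {
    'resnet_v2_50': [3, 4, 6, 3],
    'resnet_v2_101': [3, 4, 23, 3],
    'resnet_v2_152': [3, 8, 36, 3],
    'resnet_v2_200': [3, 24, 36, 3],
}


def count_layers(units_per_block):
    convs_per_unit = 3     # 1x1, 3x3, 1x1 in each bottleneck unit
    stem_and_logits = 2    # the 7x7 'conv1' and the final 'logits' layer
    return convs_per_unit * sum(units_per_block) + stem_and_logits


for name in sorted(UNITS):
    print(name, count_layers(UNITS[name]))
```

Each name's suffix matches the computed count, for example 3 * (3 + 8 + 36 + 3) + 2 = 152.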

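The two-layer and three-layer residual learning modules are designed as follows:
[Figure: two-layer and three-layer (bottleneck) residual learning modules]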
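
A rough parameter count shows why the three-layer bottleneck module is attractive for deep networks. This is a sketch under stated assumptions: both modules take and return 256 channels, the bottleneck width is 64 (the same numbers as the `(256, 64, 1)` tuples in the code later in this article), and biases and batch-norm parameters are ignored. The helper names are hypothetical.

```python
def conv_params(kernel, c_in, c_out):
    """Weights in one kernel x kernel convolution (biases ignored)."""
    return kernel * kernel * c_in * c_out


def basic_block_params(channels):
    """Two-layer module: 3x3 conv followed by 3x3 conv."""
    return 2 * conv_params(3, channels, channels)


def bottleneck_params(channels, bottleneck):
    """Three-layer module: 1x1 reduce, 3x3, 1x1 restore."""
    return (conv_params(1, channels, bottleneck)
            + conv_params(3, bottleneck, bottleneck)
            + conv_params(1, bottleneck, channels))


print(basic_block_params(256))     # 1179648
print(bottleneck_params(256, 64))  # 69632
```

The 1x1 convolutions shrink and then restore the channel count, so the expensive 3x3 convolution runs on only 64 channels.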

With the networks designed, the next step is to run the experiments and analyze the results.

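Results

The figure below compares ResNets with plain networks. Note that the ResNets have no extra parameters compared with their plain counterparts:

[Figure: training and validation error of plain networks (left) and ResNets (right)]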

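The left panel shows the plain networks plain-18 and plain-34 as training proceeds: bold curves are validation error and thin curves are training error. Both the training error and the validation error of plain-34 are higher than those of plain-18, which is the degradation problem described at the start.

The right panel shows the residual nets (using zero-padding shortcuts). As the number of iterations grows, both the training and the validation error keep falling, which shows that the residual structure does work better.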

Evaluating residual nets on the ImageNet dataset:

[Table: results of residual nets on ImageNet]

ResNet-A uses zero-padding shortcuts when dimensions increase. ResNet-B uses identity shortcuts where dimensions match and projection shortcuts otherwise. In ResNet-C all shortcuts are projections.
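
The two ways of handling a dimension increase can be illustrated on the channel vector of a single pixel. This is a simplified sketch: it ignores the spatial stride, the 4x2 weight matrix is made up, and `zero_pad_shortcut` and `projection_shortcut` are hypothetical names.

```python
def zero_pad_shortcut(x, out_channels):
    """Zero-padding option: keep x and append zeros (no parameters)."""
    return x + [0.0] * (out_channels - len(x))


def projection_shortcut(x, w):
    """Projection option: a 1x1 convolution, i.e. a matrix-vector product per pixel."""
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in w]


x = [1.0, 2.0]                                          # one pixel, 2 channels
w = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [1.0, -1.0]]  # hypothetical 4x2 weights

print(zero_pad_shortcut(x, 4))    # [1.0, 2.0, 0.0, 0.0]
print(projection_shortcut(x, w))  # [1.0, 2.0, 1.5, -1.0]
```

Zero-padding keeps the shortcut parameter-free, while the projection learns `out_channels * in_channels` extra weights per shortcut.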

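The results show that the performance of residual nets keeps improving as layers are added.

That concludes the paper analysis. Next we look at an open-source implementation of ResNet in TensorFlow.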

ResNet implementation in TensorFlow

The code is as follows:

    # coding:utf8
    # %%
    # Copyright 2016 The TensorFlow Authors. All Rights Reserved.
    #
    # Licensed under the Apache License, Version 2.0 (the "License");
    # you may not use this file except in compliance with the License.
    # You may obtain a copy of the License at
    #
    #     http://www.apache.org/licenses/LICENSE-2.0
    #
    # Unless required by applicable law or agreed to in writing, software
    # distributed under the License is distributed on an "AS IS" BASIS,
    # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    # See the License for the specific language governing permissions and
    # limitations under the License.
    # ==============================================================================
    """

    Typical use:

       from tensorflow.contrib.slim.nets import resnet_v2

    ResNet-101 for image classification into 1000 classes:

       # inputs has shape [batch, 224, 224, 3]
       with slim.arg_scope(resnet_v2.resnet_arg_scope(is_training)):
          net, end_points = resnet_v2.resnet_v2_101(inputs, 1000)

    ResNet-101 for semantic segmentation into 21 classes:

       # inputs has shape [batch, 513, 513, 3]
       with slim.arg_scope(resnet_v2.resnet_arg_scope(is_training)):
          net, end_points = resnet_v2.resnet_v2_101(inputs,
                                                    21,
                                                    global_pool=False,
                                                    output_stride=16)
    """
    import collections
    import tensorflow as tf

    slim = tf.contrib.slim


    class Block(collections.namedtuple('Block', ['scope', 'unit_fn', 'args'])):
        '''
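        A named tuple, built with collections.namedtuple, that describes a basic ResNet Block.
            It only holds data and defines no methods.
        It takes three fields: [scope, unit_fn, args]

         Take Block('block1', bottleneck, [(256, 64, 1)] * 2 + [(256, 64, 2)]) as an example:
         scope = 'block1': the name of this Block is block1

         unit_fn = bottleneck: the ResNet residual learning unit

         args = [(256, 64, 1)] * 2 + [(256, 64, 2)]
         args is a list in which each element corresponds to one bottleneck residual unit.
         The first two elements are (256, 64, 1) and the last one is (256, 64, 2).
         Each element is a 3-tuple (depth, depth_bottleneck, stride).
         For example, (256, 64, 2) describes a bottleneck residual unit (three convolution layers) in which
         the third layer has depth = 256 output channels, the first two layers have
         depth_bottleneck = 64 output channels, and the middle layer has stride 2.

         The structure of that unit is [(1x1/s1,64),(3x3/s2,64),(1x1/s1,256)]

        The whole of block1 contains three bottleneck residual units; only the last one has stride 2:
        [(1x1/s1,64),(3x3/s1,64),(1x1/s1,256)]
        [(1x1/s1,64),(3x3/s1,64),(1x1/s1,256)]
        [(1x1/s1,64),(3x3/s2,64),(1x1/s1,256)]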
        '''


        """
        A named tuple describing a ResNet block.
        Its parts are:
          scope: The scope of the `Block`.
          unit_fn: The ResNet unit function which takes as input a `Tensor` and
            returns another `Tensor` with the output of the ResNet unit.
          args: A list of length equal to the number of units in the `Block`. The list
            contains one (depth, depth_bottleneck, stride) tuple for each unit in the
            block to serve as argument to unit_fn.
        """


    def subsample(inputs, factor, scope=None):
        '''
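            Downsampling helper. If factor == 1, inputs is returned unchanged; otherwise
            slim.max_pool2d with a 1x1 kernel and stride=factor subsamples the input.
        :param inputs:
        :param factor: the subsampling factor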
        :param scope:
        :return:
        '''
        """Subsamples the input along the spatial dimensions.

        Args:
          inputs: A `Tensor` of size [batch, height_in, width_in, channels].
          factor: The subsampling factor.
          scope: Optional variable_scope.

        Returns:
          output: A `Tensor` of size [batch, height_out, width_out, channels] with the
            input, either intact (if factor == 1) or subsampled (if factor > 1).
        """
        if factor == 1:
            return inputs
        else:
            return slim.max_pool2d(inputs, [1, 1], stride=factor, scope=scope)




    def conv2d_same(inputs, num_outputs, kernel_size, stride, scope=None):
        '''
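            If stride == 1, call slim.conv2d directly with padding='SAME'.
            If stride > 1, zero-pad the input explicitly (so its size grows), then call conv2d with padding='VALID'
             (equivalently, convolve with 'SAME' and stride 1, then downsample with subsample above).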
        :param inputs:  [batch, height_in, width_in, channels].
        :param num_outputs:  An integer, the number of output filters.
        :param kernel_size: An int with the kernel_size of the filters.
        :param stride: An integer, the output stride.
        :param scope:
        :return:
        '''

        """Strided 2-D convolution with 'SAME' padding.

        When stride > 1, then we do explicit zero-padding, followed by conv2d with
        'VALID' padding.

        Note that

           net = conv2d_same(inputs, num_outputs, 3, stride=stride)

        is equivalent to

           net = slim.conv2d(inputs, num_outputs, 3, stride=1, padding='SAME')
           net = subsample(net, factor=stride)

        whereas

           net = slim.conv2d(inputs, num_outputs, 3, stride=stride, padding='SAME')

        is different when the input's height or width is even, which is why we add the
        current function. For more details, see ResnetUtilsTest.testConv2DSameEven().

        Args:
          inputs: A 4-D tensor of size [batch, height_in, width_in, channels].
          num_outputs: An integer, the number of output filters.
          kernel_size: An int with the kernel_size of the filters.
          stride: An integer, the output stride.
          scope: Scope.

        Returns:
          output: A 4-D tensor of size [batch, height_out, width_out, channels] with
            the convolution output.
        """
        if stride == 1:
            return slim.conv2d(inputs, num_outputs, kernel_size, stride=1,
                               padding='SAME', scope=scope)
        else:
            # kernel_size_effective = kernel_size + (kernel_size - 1) * (rate - 1)
            pad_total = kernel_size - 1
            pad_beg = pad_total // 2
            pad_end = pad_total - pad_beg
            inputs = tf.pad(inputs,
                            [[0, 0], [pad_beg, pad_end], [pad_beg, pad_end], [0, 0]])
            return slim.conv2d(inputs, num_outputs, kernel_size, stride=stride,
                               padding='VALID', scope=scope)




    @slim.add_arg_scope
    def stack_blocks_dense(net, blocks,
                           outputs_collections=None):
        '''
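        Stacks the Blocks.
        :param net:  the input, [batch, height, width, channels]
        :param blocks:  a list of the Block objects defined above
        :param outputs_collections: the collection used to gather the end_points
        :return:
            Two nested loops stack the network Block by Block and residual unit by residual unit.
            A variable_scope per Block and per unit names each residual unit block/unit_%d.
            The inner loop takes the args of each residual unit in the Block and unpacks them,
            then calls unit_fn, the residual unit constructor, to create and connect the units in order.
            Finally, slim.utils.collect_named_outputs adds the output net to the collection.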

        '''
        """Stacks ResNet `Blocks` and controls output feature density.

        First, this function creates scopes for the ResNet in the form of
        'block_name/unit_1', 'block_name/unit_2', etc.


        Args:
          net: A `Tensor` of size [batch, height, width, channels].
          blocks: A list of length equal to the number of ResNet `Blocks`. Each
            element is a ResNet `Block` object describing the units in the `Block`.
          outputs_collections: Collection to add the ResNet block outputs.

        Returns:
          net: Output tensor

        """
        for block in blocks:
            with tf.variable_scope(block.scope, 'block', [net]) as sc:
                for i, unit in enumerate(block.args):
                    with tf.variable_scope('unit_%d' % (i + 1), values=[net]):
                        unit_depth, unit_depth_bottleneck, unit_stride = unit
                        net = block.unit_fn(net,
                                            depth=unit_depth,
                                            depth_bottleneck=unit_depth_bottleneck,
                                            stride=unit_stride)
                net = slim.utils.collect_named_outputs(outputs_collections, sc.name, net)

        return net


    def resnet_arg_scope(is_training=True,
                         weight_decay=0.0001,
                         batch_norm_decay=0.997,
                         batch_norm_epsilon=1e-5,
                         batch_norm_scale=True):
        '''
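            Creates the common arg_scope for ResNet, which sets default argument values for certain functions.
            The BN parameters are set up first, then slim.arg_scope sets several defaults for slim.conv2d.
        :param is_training:
        :param weight_decay:  weight decay rate
        :param batch_norm_decay: BN moving-average decay, 0.997 by default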
        :param batch_norm_epsilon:
        :param batch_norm_scale:
        :return:
        '''
        """Defines the default ResNet arg scope.

        TODO(gpapan): The batch-normalization related default values above are
          appropriate for use in conjunction with the reference ResNet models
          released at https://github.com/KaimingHe/deep-residual-networks. When
          training ResNets from scratch, they might need to be tuned.

        Args:
          is_training: Whether or not we are training the parameters in the batch
            normalization layers of the model.
          weight_decay: The weight decay to use for regularizing the model.
          batch_norm_decay: The moving average decay when estimating layer activation
            statistics in batch normalization.
          batch_norm_epsilon: Small constant to prevent division by zero when
            normalizing activations by their variance in batch normalization.
          batch_norm_scale: If True, uses an explicit `gamma` multiplier to scale the
            activations in the batch normalization layer.

        Returns:
          An `arg_scope` to use for the resnet models.
        """
        batch_norm_params = {
            'is_training': is_training,
            'decay': batch_norm_decay,
            'epsilon': batch_norm_epsilon,
            'scale': batch_norm_scale,
            'updates_collections': tf.GraphKeys.UPDATE_OPS,
        }

        '''
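            Use slim.arg_scope to set the defaults for slim.conv2d:
            L2 regularization on the weights,
            the weight initializer, the activation function and batch normalization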
        '''
        with slim.arg_scope(
                [slim.conv2d],
                weights_regularizer=slim.l2_regularizer(weight_decay),
                weights_initializer=slim.variance_scaling_initializer(),
                activation_fn=tf.nn.relu,
                normalizer_fn=slim.batch_norm,
                normalizer_params=batch_norm_params):
            with slim.arg_scope([slim.batch_norm], **batch_norm_params):
                # The following implies padding='SAME' for pool1, which makes feature
                # alignment easier for dense prediction tasks. This is also used in
                # https://github.com/facebook/fb.resnet.torch. However the accompanying
                # code of 'Deep Residual Learning for Image Recognition' uses
                # padding='VALID' for pool1. You can switch to that choice by setting
                # slim.arg_scope([slim.max_pool2d], padding='VALID').
                with slim.arg_scope([slim.max_pool2d], padding='SAME') as arg_sc:
                    return arg_sc


    @slim.add_arg_scope
    def bottleneck(inputs, depth, depth_bottleneck, stride,
                   outputs_collections=None, scope=None):
        '''
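            The bottleneck residual unit. This is a variant of the Full Preactivation Residual Unit
        described in the ResNet V2 paper. It differs from the V1 residual unit in two main ways:
            1. Batch Normalization is applied before every layer
            2. the input is pre-activated: the activation comes before the convolution instead of after it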
        :param inputs:
        :param depth:
        :param depth_bottleneck:
        :param stride:
        :param outputs_collections:
        :param scope:
        :return:
        '''
        """Bottleneck residual unit variant with BN before convolutions.

        This is the full preactivation residual unit variant proposed in [2]. See
        Fig. 1(b) of [2] for its definition. Note that we use here the bottleneck
        variant which has an extra bottleneck layer.

        When putting together two consecutive ResNet blocks that use this unit, one
        should use stride = 2 in the last unit of the first block.

        Args:
          inputs: A tensor of size [batch, height, width, channels].
          depth: The depth of the ResNet unit output.
          depth_bottleneck: The depth of the bottleneck layers.
          stride: The ResNet unit's stride. Determines the amount of downsampling of
            the units output compared to its input.
          outputs_collections: Collection to add the ResNet unit output.
          scope: Optional variable_scope.

        Returns:
          The ResNet unit's output.
        """
        with tf.variable_scope(scope, 'bottleneck_v2', [inputs]) as sc:
            # Get the last dimension of the input, i.e. the number of input channels
            depth_in = slim.utils.last_dimension(inputs.get_shape(), min_rank=4)
            # Apply BN first, then ReLU, as the preactivation
            preact = slim.batch_norm(inputs, activation_fn=tf.nn.relu, scope='preact')
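            # Define the shortcut. If the unit's input depth depth_in equals its output depth, use subsample
            # to downsample inputs spatially by stride (so the spatial size matches the residual, whose middle convolution uses that stride).
            # If the input and output depths differ, use a 1*1 convolution with that stride to change the channel count to match the output.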
            if depth == depth_in:
                shortcut = subsample(inputs, stride, 'shortcut')
            else:
                shortcut = slim.conv2d(preact, depth, [1, 1], stride=stride,
                                       normalizer_fn=None, activation_fn=None,
                                       scope='shortcut')
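            # Then define the residual, which has 3 layers: first a 1*1 convolution with stride 1 and depth_bottleneck output channels,
            # then a 3*3 convolution, and finally another 1*1 convolution.
            # Note that the last layer has no batch normalization and no activation function.
            # Finally, add residual and shortcut to get the output, and add it to the collection.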

            residual = slim.conv2d(preact, depth_bottleneck, [1, 1], stride=1,
                                   scope='conv1')
            residual = conv2d_same(residual, depth_bottleneck, 3, stride,
                                   scope='conv2')
            residual = slim.conv2d(residual, depth, [1, 1], stride=1,
                                   normalizer_fn=None, activation_fn=None,
                                   scope='conv3')

            output = shortcut + residual

            return slim.utils.collect_named_outputs(outputs_collections,
                                                    sc.name,
                                                    output)


    def resnet_v2(inputs,
                  blocks,
                  num_classes=None,
                  global_pool=True,
                  include_root_block=True,
                  reuse=None,
                  scope=None):
        '''

        :param inputs:
        :param blocks:
        :param num_classes:
        :param global_pool:
        :param include_root_block:
        :param reuse:
        :param scope:
        :return:
        '''
        """Generator for v2 (preactivation) ResNet models.

        This function generates a family of ResNet v2 models. See the resnet_v2_*()
        methods for specific model instantiations, obtained by selecting different
        block instantiations that produce ResNets of various depths.


        Args:
          inputs: A tensor of size [batch, height_in, width_in, channels].
          blocks: A list of length equal to the number of ResNet blocks. Each element
            is a resnet_utils.Block object describing the units in the block.
          num_classes: Number of predicted classes for classification tasks. If None
            we return the features before the logit layer.
          include_root_block: If True, include the initial convolution followed by
            max-pooling, if False excludes it. If excluded, `inputs` should be the
            results of an activation-less convolution.
          reuse: whether or not the network and its variables should be reused. To be
            able to reuse 'scope' must be given.
          scope: Optional variable_scope.


        Returns:
          net: A rank-4 tensor of size [batch, height_out, width_out, channels_out].
            If global_pool is False, then height_out and width_out are reduced by a
            factor of output_stride compared to the respective height_in and width_in,
            else both height_out and width_out equal one. If num_classes is None, then
            net is the output of the last ResNet block, potentially after global
            average pooling. If num_classes is not None, net contains the pre-softmax
            activations.
          end_points: A dictionary from components of the network to the corresponding
            activation.

        Raises:
          ValueError: If the target output_stride is not valid.
        """
        with tf.variable_scope(scope, 'resnet_v2', [inputs], reuse=reuse) as sc:
            end_points_collection = sc.original_name_scope + '_end_points'
            with slim.arg_scope([slim.conv2d, bottleneck,
                                 stack_blocks_dense],
                                outputs_collections=end_points_collection):
                net = inputs
                if include_root_block:
                    # We do not include batch normalization or activation functions in conv1
                    # because the first ResNet unit will perform these. Cf. Appendix of [2].
                    with slim.arg_scope([slim.conv2d],
                                        activation_fn=None, normalizer_fn=None):
                        net = conv2d_same(net, 64, 7, stride=2, scope='conv1')
                    net = slim.max_pool2d(net, [3, 3], stride=2, scope='pool1')
                net = stack_blocks_dense(net, blocks)
                # This is needed because the pre-activation variant does not have batch
                # normalization or activation functions in the residual unit output. See
                # Appendix of [2].
                net = slim.batch_norm(net, activation_fn=tf.nn.relu, scope='postnorm')
                if global_pool:
                    # Global average pooling.
                    net = tf.reduce_mean(net, [1, 2], name='pool5', keep_dims=True)
                if num_classes is not None:
                    net = slim.conv2d(net, num_classes, [1, 1], activation_fn=None,
                                      normalizer_fn=None, scope='logits')
                # Convert end_points_collection into a dictionary of end_points.
                end_points = slim.utils.convert_collection_to_dict(end_points_collection)
                if num_classes is not None:
                    end_points['predictions'] = slim.softmax(net, scope='predictions')
                return net, end_points


    def resnet_v2_50(inputs,
                     num_classes=None,
                     global_pool=True,
                     reuse=None,
                     scope='resnet_v2_50'):
        """ResNet-50 model of [1]. See resnet_v2() for arg and return description."""
        blocks = [
            Block('block1', bottleneck, [(256, 64, 1)] * 2 + [(256, 64, 2)]),
            Block('block2', bottleneck, [(512, 128, 1)] * 3 + [(512, 128, 2)]),
            Block('block3', bottleneck, [(1024, 256, 1)] * 5 + [(1024, 256, 2)]),
            Block('block4', bottleneck, [(2048, 512, 1)] * 3)]
        return resnet_v2(inputs, blocks, num_classes, global_pool,
                         include_root_block=True, reuse=reuse, scope=scope)


    def resnet_v2_101(inputs,
                      num_classes=None,
                      global_pool=True,
                      reuse=None,
                      scope='resnet_v2_101'):
        """ResNet-101 model of [1]. See resnet_v2() for arg and return description."""
        blocks = [
            Block('block1', bottleneck, [(256, 64, 1)] * 2 + [(256, 64, 2)]),
            Block('block2', bottleneck, [(512, 128, 1)] * 3 + [(512, 128, 2)]),
            Block('block3', bottleneck, [(1024, 256, 1)] * 22 + [(1024, 256, 2)]),
            Block('block4', bottleneck, [(2048, 512, 1)] * 3)]
        return resnet_v2(inputs, blocks, num_classes, global_pool,
                         include_root_block=True, reuse=reuse, scope=scope)


    def resnet_v2_152(inputs,
                      num_classes=None,
                      global_pool=True,
                      reuse=None,
                      scope='resnet_v2_152'):
        """ResNet-152 model of [1]. See resnet_v2() for arg and return description."""
        blocks = [
            Block('block1', bottleneck, [(256, 64, 1)] * 2 + [(256, 64, 2)]),
            Block('block2', bottleneck, [(512, 128, 1)] * 7 + [(512, 128, 2)]),
            Block('block3', bottleneck, [(1024, 256, 1)] * 35 + [(1024, 256, 2)]),
            Block('block4', bottleneck, [(2048, 512, 1)] * 3)]
        return resnet_v2(inputs, blocks, num_classes, global_pool,
                         include_root_block=True, reuse=reuse, scope=scope)


    def resnet_v2_200(inputs,
                      num_classes=None,
                      global_pool=True,
                      reuse=None,
                      scope='resnet_v2_200'):
        """ResNet-200 model of [2]. See resnet_v2() for arg and return description."""
        blocks = [
            Block('block1', bottleneck, [(256, 64, 1)] * 2 + [(256, 64, 2)]),
            Block('block2', bottleneck, [(512, 128, 1)] * 23 + [(512, 128, 2)]),
            Block('block3', bottleneck, [(1024, 256, 1)] * 35 + [(1024, 256, 2)]),
            Block('block4', bottleneck, [(2048, 512, 1)] * 3)]
        return resnet_v2(inputs, blocks, num_classes, global_pool,
                         include_root_block=True, reuse=reuse, scope=scope)


    from datetime import datetime
    import math
    import time


    def time_tensorflow_run(session, target, info_string):
        num_steps_burn_in = 10
        total_duration = 0.0
        total_duration_squared = 0.0

        for i in range(num_batches + num_steps_burn_in):
            start_time = time.time()
            _ = session.run(target)
            duration = time.time() - start_time
            if i >= num_steps_burn_in:
                if not i % 10:
                    print ('%s: step %d, duration = %.3f' %
                           (datetime.now(), i - num_steps_burn_in, duration))
                total_duration += duration
                total_duration_squared += duration * duration
        mn = total_duration / num_batches
        vr = total_duration_squared / num_batches - mn * mn
        sd = math.sqrt(vr)
        print ('%s: %s across %d steps, %.3f +/- %.3f sec / batch' %
               (datetime.now(), info_string, num_batches, mn, sd))


    batch_size = 32
    height, width = 224, 224
    inputs = tf.random_uniform((batch_size, height, width, 3))
    with slim.arg_scope(resnet_arg_scope(is_training=False)):
        net, end_points = resnet_v2_152(inputs, 1000)

    init = tf.global_variables_initializer()
    sess = tf.Session()
    sess.run(init)
    num_batches = 100
    time_tensorflow_run(sess, net, "Forward")

Output:

2017-08-05 21:21:23.997012: step 0, duration = 0.232 
2017-08-05 21:21:26.310152: step 10, duration = 0.230 
2017-08-05 21:21:28.625971: step 20, duration = 0.232 
2017-08-05 21:21:30.948839: step 30, duration = 0.231 
2017-08-05 21:21:33.273177: step 40, duration = 0.232 
2017-08-05 21:21:35.608182: step 50, duration = 0.233 
2017-08-05 21:21:37.941335: step 60, duration = 0.232 
2017-08-05 21:21:40.276842: step 70, duration = 0.231 
2017-08-05 21:21:42.609510: step 80, duration = 0.233 
2017-08-05 21:21:44.934983: step 90, duration = 0.231 
2017-08-05 21:21:47.031013: Forward across 100 steps, 0.233 +/- 0.002 sec / batch
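
The summary line is computed from two running sums, as in `time_tensorflow_run`. The sketch below reproduces that calculation on its own. The sample durations are made up and are not the measurements above, `mean_and_std` is a hypothetical helper, and the clamp is an addition that the original function does not have.

```python
import math


def mean_and_std(durations):
    """Mean and population standard deviation from running sums."""
    n = len(durations)
    total = sum(durations)
    total_squared = sum(d * d for d in durations)
    mean = total / n
    variance = total_squared / n - mean * mean
    # Clamp: rounding can make the variance slightly negative when all
    # durations are nearly equal, and math.sqrt would then raise ValueError.
    return mean, math.sqrt(max(variance, 0.0))


mean, std = mean_and_std([0.2, 0.4])
print('%.3f +/- %.3f sec / batch' % (mean, std))
```

Computing the variance as the mean of squares minus the squared mean needs only one pass over the data, at the cost of some numerical precision when the spread is tiny.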

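Although the network is extremely deep, the forward pass is still reasonably fast. ResNet is a practical convolutional neural network architecture.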
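
One detail of the listing that is easy to miss is why `conv2d_same` pads explicitly instead of passing padding='SAME' with a stride. The sketch below compares the two padding rules with plain arithmetic. It assumes TensorFlow's documented 'SAME' rule (total padding = max((out - 1) * stride + kernel - in, 0), with the smaller half placed first), and the helper names are hypothetical.

```python
import math


def same_padding(n, kernel_size, stride):
    """(pad_before, pad_after) chosen by the 'SAME' rule for input size n."""
    out = math.ceil(n / stride)
    total = max((out - 1) * stride + kernel_size - n, 0)
    return total // 2, total - total // 2


def explicit_padding(kernel_size):
    """(pad_beg, pad_end) as computed in conv2d_same; independent of input size."""
    pad_total = kernel_size - 1
    pad_beg = pad_total // 2
    return pad_beg, pad_total - pad_beg


def conv2d_same_out_size(n, kernel_size, stride):
    """Output size after explicit padding followed by a 'VALID' convolution."""
    padded = n + kernel_size - 1
    return (padded - kernel_size) // stride + 1


print(same_padding(224, 3, 2), explicit_padding(3))  # (0, 1) (1, 1)
print(same_padding(225, 3, 2), explicit_padding(3))  # (1, 1) (1, 1)
print(same_padding(224, 7, 2), explicit_padding(7))  # (2, 3) (3, 3)
print(conv2d_same_out_size(224, 3, 2))               # 112
```

For even input sizes the two rules pad differently, so the sampled positions shift by one pixel even though the output size is the same. This matches the docstring's remark that the results differ when the height or width is even.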