
TensorFlow in Practice: Analyzing Google's Inception v3 Network

This article starts directly from the code implementation of Inception v3 and analyzes the design ideas in it that are worth borrowing.

First, you need to know about a slim facility, arg_scope, which can fill in default argument values automatically and saves a lot of repetitive work:

def inception_arg_scope(weight_decay=0.00004,
                        use_batch_norm=True,
                        batch_norm_decay=0.9997,
                        batch_norm_epsilon=0.001):
  """Defines the default arg scope for inception models.
  Args:
    weight_decay: the L2 regularization strength.
    use_batch_norm: whether to use batch normalization.
    batch_norm_decay: decay for the batch norm moving averages.
    batch_norm_epsilon: a small float added to the variance to avoid division
      by zero.

  Returns:
    An `arg_scope` to use for the inception models.
  """
  batch_norm_params = {
      'decay': batch_norm_decay,
      'epsilon': batch_norm_epsilon,
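      # Collect the moving-average update ops into tf.GraphKeys.UPDATE_OPS so
      # the training loop can run them alongside the train op.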
      'updates_collections': tf.GraphKeys.UPDATE_OPS,
  }
  if use_batch_norm:
    normalizer_fn = slim.batch_norm
    normalizer_params = batch_norm_params
  else:
    normalizer_fn = None
    normalizer_params = {}
  # The following arg_scope automatically sets the weights_regularizer argument
  # of both slim.conv2d and slim.fully_connected to an L2 regularizer.
  with slim.arg_scope([slim.conv2d, slim.fully_connected],
                      weights_regularizer=slim.l2_regularizer(weight_decay)):
    # Then nest a second slim.arg_scope that fills in conv2d's weight
    # initializer, activation function, normalizer function and normalizer
    # parameters, and return the resulting scope.
    with slim.arg_scope(
        [slim.conv2d],
        weights_initializer=slim.variance_scaling_initializer(),
        activation_fn=tf.nn.relu,
        normalizer_fn=normalizer_fn,
        normalizer_params=normalizer_params) as sc:
      return sc
The purpose of this function is to define all the common conv2d parameters in advance, which makes defining the convolutional layers later very convenient.
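To see what this buys us, here is a minimal, hypothetical usage sketch (not from the original post; `images` stands in for an input tensor). Inside the returned scope, slim.conv2d automatically picks up the regularizer, initializer, activation and batch-norm settings, so only the layer-specific arguments remain:

with slim.arg_scope(inception_arg_scope()):
  # Equivalent to passing weights_regularizer, weights_initializer,
  # activation_fn, normalizer_fn and normalizer_params explicitly.
  net = slim.conv2d(images, 32, [3, 3], stride=2, scope='Conv2d_1a_3x3')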
Next, define a helper that produces a truncated-normal initializer:

trunc_normal = lambda stddev: tf.truncated_normal_initializer(0.0, stddev)

Next, define inception_v3_base, which builds the convolutional part of Inception v3.

def inception_v3_base(inputs,
                      final_endpoint='Mixed_7c',
                      min_depth=16,
                      depth_multiplier=1.0,
                      scope=None):
  """Inception model from http://arxiv.org/abs/1512.00567.

  The old layer names map to the new endpoint names as follows:
  Old name          | New name
  =======================================
  conv0             | Conv2d_1a_3x3
  conv1             | Conv2d_2a_3x3
  conv2             | Conv2d_2b_3x3
  pool1             | MaxPool_3a_3x3
  conv3             | Conv2d_3b_1x1
  conv4             | Conv2d_4a_3x3
  pool2             | MaxPool_5a_3x3
  mixed_35x35x256a  | Mixed_5b
  mixed_35x35x288a  | Mixed_5c
  mixed_35x35x288b  | Mixed_5d
  mixed_17x17x768a  | Mixed_6a
  mixed_17x17x768b  | Mixed_6b
  mixed_17x17x768c  | Mixed_6c
  mixed_17x17x768d  | Mixed_6d
  mixed_17x17x768e  | Mixed_6e
  mixed_8x8x1280a   | Mixed_7a
  mixed_8x8x2048a   | Mixed_7b
  mixed_8x8x2048b   | Mixed_7c

  Args:
    inputs: a tensor of size [batch_size, height, width, channels].
    final_endpoint: specifies the endpoint to construct the network up to. It
      can be one of ['Conv2d_1a_3x3', 'Conv2d_2a_3x3', 'Conv2d_2b_3x3',
      'MaxPool_3a_3x3', 'Conv2d_3b_1x1', 'Conv2d_4a_3x3', 'MaxPool_5a_3x3',
      'Mixed_5b', 'Mixed_5c', 'Mixed_5d', 'Mixed_6a', 'Mixed_6b', 'Mixed_6c',
      'Mixed_6d', 'Mixed_6e', 'Mixed_7a', 'Mixed_7b', 'Mixed_7c'].
    min_depth: Minimum depth value (number of channels) for all convolution ops.
      Enforced when depth_multiplier < 1, and not an active constraint when
      depth_multiplier >= 1.
    depth_multiplier: Float multiplier for the depth (number of channels)
      for all convolution ops. The value must be greater than zero. Typical
      usage will be to set this value in (0, 1) to reduce the number of
      parameters or computation cost of the model.
    scope: Optional variable_scope.

  Returns:
    tensor_out: output tensor corresponding to the final_endpoint.
    end_points: a set of activations for external use, for example summaries or
                losses.

  Raises:
    ValueError: if final_endpoint is not set to one of the predefined values,
                or depth_multiplier <= 0
  """
  # end_points will collect relevant activations for external use, for example
  # summaries or losses.
  end_points = {}

  if depth_multiplier <= 0:
    raise ValueError('depth_multiplier is not greater than zero.')
  # Lambda that scales a layer's depth by the multiplier, clamped at min_depth.
  depth = lambda d: max(int(d * depth_multiplier), min_depth)
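  # For example, with depth_multiplier=0.5 and min_depth=16:
  #   depth(32)  = max(int(32 * 0.5), 16) = 16   (clamped by min_depth)
  #   depth(192) = max(int(192 * 0.5), 16) = 96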
  
  with tf.variable_scope(scope, 'InceptionV3', [inputs]):
    with slim.arg_scope([slim.conv2d, slim.max_pool2d, slim.avg_pool2d],
                        stride=1, padding='VALID'):
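      # Each layer below stores its output in end_points and returns early if
      # its name matches final_endpoint, so the network can be cut off at any
      # intermediate layer.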
      # 299 x 299 x 3
      end_point = 'Conv2d_1a_3x3'
      net = slim.conv2d(inputs, depth(32), [3, 3], stride=2, scope=end_point)
      end_points[end_point] = net
      if end_point == final_endpoint: return net, end_points
      # 149 x 149 x 32
      end_point = 'Conv2d_2a_3x3'
      net = slim.conv2d(net, depth(32), [3, 3], scope=end_point)
      end_points[end_point] = net
      if end_point == final_endpoint: return net, end_points
      # 147 x 147 x 32
      end_point = 'Conv2d_2b_3x3'
      net = slim.conv2d(net, depth(64), [3, 3], padding='SAME', scope=end_point)
      end_points[end_point] = net
      if end_point == final_endpoint: return net, end_points
      # 147 x 147 x 64
      end_point = 'MaxPool_3a_3x3'
      net = slim.max_pool2d(net, [3, 3], stride=2, scope=end_point)
      end_points[end_point] = net
      if end_point == final_endpoint: return net, end_points
      # 73 x 73 x 64
      end_point = 'Conv2d_3b_1x1'
      net = slim.conv2d(net, depth(80), [1, 1], scope=end_point)
      end_points[end_point] = net
      if end_point == final_endpoint: return net, end_points
      # 73 x 73 x 80.
      end_point = 'Conv2d_4a_3x3'
      net = slim.conv2d(net, depth(192), [3, 3], scope=end_point)
      end_points[end_point] = net
      if end_point == final_endpoint: return net, end_points
      # 71 x 71 x 192.
      end_point = 'MaxPool_5a_3x3'
      net = slim.max_pool2d(net, [3, 3], stride=2, scope=end_point)
      end_points[end_point] = net
      if end_point == final_endpoint: return net, end_points
      # 35 x 35 x 192.
    # Next come three consecutive groups of Inception modules; each group
    # contains several Inception modules of its own. This part is the essence
    # of Inception v3.
    # Inception blocks: the first module of the first group is named Mixed_5b.
    # First use slim to set defaults for all Inception modules: every
    # convolution, max pool and average pool gets stride 1 and padding 'SAME'.
    with slim.arg_scope([slim.conv2d, slim.max_pool2d, slim.avg_pool2d],
                        stride=1, padding='SAME'):
      # mixed: 35 x 35 x 256.
      end_point = 'Mixed_5b'
      with tf.variable_scope(end_point):
        # The first module has 4 branches.
        # Branch 0 is a 64-channel 1x1 convolution.
        with tf.variable_scope('Branch_0'):
          branch_0 = slim.conv2d(net, depth(64), [1, 1], scope='Conv2d_0a_1x1')
        # Branch 1 is a 48-channel 1x1 convolution followed by a 64-channel
        # 5x5 convolution.
        with tf.variable_scope('Branch_1'):
          branch_1 = slim.conv2d(net, depth(48), [1, 1], scope='Conv2d_0a_1x1')
          branch_1 = slim.conv2d(branch_1, depth(64), [5, 5],
                                 scope='Conv2d_0b_5x5')
        # Branch 2 is a 64-channel 1x1 convolution followed by two 96-channel
        # 3x3 convolutions.
        with tf.variable_scope('Branch_2'):
          branch_2 = slim.conv2d(net, depth(64), [1, 1], scope='Conv2d_0a_1x1')
          branch_2 = slim.conv2d(branch_2, depth(96), [3, 3],
                                 scope='Conv2d_0b_3x3')
          branch_2 = slim.conv2d(branch_2, depth(96), [3, 3],
                                 scope='Conv2d_0c_3x3')
        # Branch 3 is a 3x3 average pool followed by a 32-channel 1x1
        # convolution.
        with tf.variable_scope('Branch_3'):
          branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3')
          branch_3 = slim.conv2d(branch_3, depth(32), [1, 1],
                                 scope='Conv2d_0b_1x1')
        # Concatenate the 4 branch outputs along the channel (depth) dimension.
        net = tf.concat(axis=3, values=[branch_0, branch_1, branch_2, branch_3])
      end_points[end_point] = net
      if end_point == final_endpoint: return net, end_points
      # With stride 1 and padding 'SAME', the spatial size is unchanged; only
      # the depth grows: 64 + 64 + 96 + 32 = 256.
      # The second Inception module is built the same way; its output depth
      # is 288.
      # mixed_1: 35 x 35 x 288.
      end_point = 'Mixed_5c'
      with tf.variable_scope(end_point):
        with tf.variable_scope('Branch_0'):
          branch_0 = slim.conv2d(net, depth(64), [1, 1], scope='Conv2d_0a_1x1')
        with tf.variable_scope('Branch_1'):
          branch_1 = slim.conv2d(net, depth(48), [1, 1], scope='Conv2d_0b_1x1')
          branch_1 = slim.conv2d(branch_1, depth(64), [5, 5],
                                 scope='Conv_1_0c_5x5')
        with tf.variable_scope('Branch_2'):
          branch_2 = slim.conv2d(net, depth(64), [1, 1],
                                 scope='Conv2d_0a_1x1')
          branch_2 = slim.conv2d(branch_2, depth(96), [3, 3],
                                 scope='Conv2d_0b_3x3')
          branch_2 = slim.conv2d(branch_2, depth(96), [3, 3],
                                 scope='Conv2d_0c_3x3')
        with tf.variable_scope('Branch_3'):
          branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3')
          branch_3 = slim.conv2d(branch_3, depth(64), [1, 1],
                                 scope='Conv2d_0b_1x1')
        net = tf.concat(axis=3, values=[branch_0, branch_1, branch_2, branch_3])
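        # Output depth: 64 + 64 + 96 + 64 = 288.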
      end_points[end_point] = net
      if end_point == final_endpoint: return net, end_points

      # mixed_2: 35 x 35 x 288.
      end_point = 'Mixed_5d'
      with tf.variable_scope(end_point):
        with tf.variable_scope('Branch_0'):
          branch_0 = slim.conv2d(net, depth(64), [1, 1], scope='Conv2d_0a_1x1')
        with tf.variable_scope('Branch_1'):
          branch_1 = slim.conv2d(net, depth(48), [1, 1], scope='Conv2d_0a_1x1')
          branch_1 = slim.conv2d(branch_1, depth(64), [5, 5],
                                 scope='Conv2d_0b_5x5')
        with tf.variable_scope('Branch_2'):
          branch_2 = slim.conv2d(net, depth(64), [1, 1], scope='Conv2d_0a_1x1')
          branch_2 = slim.conv2d(branch_2, depth(96), [3, 3],
                                 scope='Conv2d_0b_3x3')
          branch_2 = slim.conv2d(branch_2, depth(96), [3, 3],
                                 scope='Conv2d_0c_3x3')
        with tf.variable_scope('Branch_3'):
          branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3')
          branch_3 = slim.conv2d(branch_3, depth(64), [1, 1],
                                 scope='Conv2d_0b_1x1')
        net = tf.concat(axis=3, values=[branch_0, branch_1, branch_2, branch_3])
      end_points[end_point] = net
      if end_point == final_endpoint: return net, end_points
      # The 3 modules of the first group end here.
      # The second group contains 5 modules; modules 2 through 5 are similar.
      # mixed_3: 17 x 17 x 768.
      end_point = 'Mixed_6a'
      with tf.variable_scope(end_point):
        # This module has 3 branches.
        # Branch 0 uses stride 2, compressing the spatial size to 17x17.
        with tf.variable_scope('Branch_0'):
          branch_0 = slim.conv2d(net, depth(384), [3, 3], stride=2,
                                 padding='VALID', scope='Conv2d_1a_1x1')
        # Branch 1 has three layers: a 64-channel 1x1 convolution, a 96-channel
        # 3x3 convolution, and another 96-channel 3x3 convolution with stride 2,
        # which compresses the spatial size.
        with tf.variable_scope('Branch_1'):
          branch_1 = slim.conv2d(net, depth(64), [1, 1], scope='Conv2d_0a_1x1')
          branch_1 = slim.conv2d(branch_1, depth(96), [3, 3],
                                 scope='Conv2d_0b_3x3')
          branch_1 = slim.conv2d(branch_1, depth(96), [3, 3], stride=2,
                                 padding='VALID', scope='Conv2d_1a_1x1')
        # Branch 2 is a 3x3 max pool with stride 2, again compressing the size.
        with tf.variable_scope('Branch_2'):
          branch_2 = slim.max_pool2d(net, [3, 3], stride=2, padding='VALID',
                                     scope='MaxPool_1a_3x3')
        net = tf.concat(axis=3, values=[branch_0, branch_1, branch_2])
      end_points[end_point] = net
      if end_point == final_endpoint: return net, end_points
      # So the final size is 17 x 17 x (384 + 96 + 288) = 17 x 17 x 768.
      # All subsequent modules in this group keep the size 17 x 17 x 768.
      # mixed4: 17 x 17 x 768.
      # The second module of the second group has 4 branches.
      end_point = 'Mixed_6b'
      with tf.variable_scope(end_point):
        with tf.variable_scope('Branch_0'):
          branch_0 = slim.conv2d(net, depth(192), [1, 1], scope='Conv2d_0a_1x1')
        # Branch 1 has 3 convolutional layers: the second is a 128-channel 1x7
        # convolution and the third a 192-channel 7x1 convolution.
        # This uses the idea of factorization into small convolutions: a 1x7
        # convolution chained with a 7x1 convolution covers the same receptive
        # field as a 7x7 convolution, but with only 2/7 of the parameters, and
        # the extra activation in between strengthens the nonlinear feature
        # transform and reduces overfitting.
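        # A quick check of the parameter savings (C channels in and out):
        #   7x7 conv:      7 * 7 * C^2 = 49 C^2 weights
        #   1x7 then 7x1:  (1*7 + 7*1) * C^2 = 14 C^2 weights, i.e. 14/49 = 2/7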
        with tf.variable_scope('Branch_1'):
          branch_1 = slim.conv2d(net, depth(128), [1, 1], scope='Conv2d_0a_1x1')
          branch_1 = slim.conv2d(branch_1, depth(128), [1, 7],
                                 scope='Conv2d_0b_1x7')
          branch_1 = slim.conv2d(branch_1, depth(192), [7, 1],
                                 scope='Conv2d_0c_7x1')
        with tf.variable_scope('Branch_2'):
          branch_2 = slim.conv2d(net, depth(128), [1, 1], scope='Conv2d_0a_1x1')
          branch_2 = slim.conv2d(branch_2, depth(128), [7, 1],
                                 scope='Conv2d_0b_7x1')
          branch_2 = slim.conv2d(branch_2, depth(128), [1, 7],
                                 scope='Conv2d_0c_1x7')
          branch_2 = slim.conv2d(branch_2, depth(128), [7, 1],
                                 scope='Conv2d_0d_7x1')
          branch_2 = slim.conv2d(branch_2, depth(192), [1, 7],
                                 scope='Conv2d_0e_1x7')
        with tf.variable_scope('Branch_3'):
          branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3')
          branch_3 = slim.conv2d(branch_3, depth(192), [1, 1],
                                 scope='Conv2d_0b_1x1')
        net = tf.concat(axis=3, values=[branch_0, branch_1, branch_2, branch_3])
      end_points[end_point] = net
      if end_point == final_endpoint: return net, end_points
      # The module's final size is 17 x 17 x (192+192+192+192) = 17 x 17 x 768.
      # mixed_5: 17 x 17 x 768.
      # The third module of the second group (Mixed_6c) is identical to
      # Mixed_6b except that the leading convolutions of branches 1 and 2 use
      # 160 output channels instead of 128; the final output depth is still
      # 768. Recomputing the features this way adds a lot of richness to the
      # network. Mixed_6d repeats Mixed_6c, and Mixed_6e widens those layers
      # to 192 channels; all three mirror Mixed_6b line for line, so their
      # code is omitted here.

      # mixed_8: 8 x 8 x 1280.
      # The first module of the third group has 3 branches.
      end_point = 'Mixed_7a'
      with tf.variable_scope(end_point):
        with tf.variable_scope('Branch_0'):
          branch_0 = slim.conv2d(net, depth(192), [1, 1], scope='Conv2d_0a_1x1')
          branch_0 = slim.conv2d(branch_0, depth(320), [3, 3], stride=2,
                                 padding='VALID', scope='Conv2d_1a_3x3')
        with tf.variable_scope('Branch_1'):
          branch_1 = slim.conv2d(net, depth(192), [1, 1], scope='Conv2d_0a_1x1')
          branch_1 = slim.conv2d(branch_1, depth(192), [1, 7],
                                 scope='Conv2d_0b_1x7')
          branch_1 = slim.conv2d(branch_1, depth(192), [7, 1],
                                 scope='Conv2d_0c_7x1')
          branch_1 = slim.conv2d(branch_1, depth(192), [3, 3], stride=2,
                                 padding='VALID', scope='Conv2d_1a_3x3')
        # Branch 2 is a 3x3 max pool with stride 2; its output is 8 x 8 x 768.
        with tf.variable_scope('Branch_2'):
          branch_2 = slim.max_pool2d(net, [3, 3], stride=2, padding='VALID',
                                     scope='MaxPool_1a_3x3')
        net = tf.concat(axis=3, values=[branch_0, branch_1, branch_2])
      end_points[end_point] = net
      if end_point == final_endpoint: return net, end_points
      # Output size: 8 x 8 x (320 + 192 + 768) = 8 x 8 x 1280.
  
      # mixed_9: 8 x 8 x 2048.
      end_point = 'Mixed_7b'
      with tf.variable_scope(end_point):
        with tf.variable_scope('Branch_0'):
          branch_0 = slim.conv2d(net, depth(320), [1, 1], scope='Conv2d_0a_1x1')
        with tf.variable_scope('Branch_1'):
          branch_1 = slim.conv2d(net, depth(384), [1, 1], scope='Conv2d_0a_1x1')
          branch_1 = tf.concat(axis=3, values=[
              slim.conv2d(branch_1, depth(384), [1, 3], scope='Conv2d_0b_1x3'),
              slim.conv2d(branch_1, depth(384), [3, 1], scope='Conv2d_0b_3x1')])
        # Branch 2 is more elaborate: first a 448-channel 1x1 convolution, then
        # a 384-channel 3x3 convolution, which then splits into two parallel
        # branches (a 384-channel 1x3 and a 384-channel 3x1 convolution) whose
        # concatenation gives 8 x 8 x 768.
        with tf.variable_scope('Branch_2'):
          branch_2 = slim.conv2d(net, depth(448), [1, 1], scope='Conv2d_0a_1x1')
          branch_2 = slim.conv2d(
              branch_2, depth(384), [3, 3], scope='Conv2d_0b_3x3')
          branch_2 = tf.concat(axis=3, values=[
              slim.conv2d(branch_2, depth(384), [1, 3], scope='Conv2d_0c_1x3'),
              slim.conv2d(branch_2, depth(384), [3, 1], scope='Conv2d_0d_3x1')])
        with tf.variable_scope('Branch_3'):
          branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3')
          branch_3 = slim.conv2d(
              branch_3, depth(192), [1, 1], scope='Conv2d_0b_1x1')
        net = tf.concat(axis=3, values=[branch_0, branch_1, branch_2, branch_3])
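        # Output depth: 320 + (384 + 384) + (384 + 384) + 192 = 2048.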
      end_points[end_point] = net
      if end_point == final_endpoint: return net, end_points

      # mixed_10: 8 x 8 x 2048.
      end_point = 'Mixed_7c'
      with tf.variable_scope(end_point):
        with tf.variable_scope('Branch_0'):
          branch_0 = slim.conv2d(net, depth(320), [1, 1], scope='Conv2d_0a_1x1')
        with tf.variable_scope('Branch_1'):
          branch_1 = slim.conv2d(net, depth(384), [1, 1], scope='Conv2d_0a_1x1')
          branch_1 = tf.concat(axis=3, values=[
              slim.conv2d(branch_1, depth(384), [1, 3], scope='Conv2d_0b_1x3'),
              slim.conv2d(branch_1, depth(384), [3, 1], scope='Conv2d_0c_3x1')])
        with tf.variable_scope('Branch_2'):
          branch_2 = slim.conv2d(net, depth(448), [1, 1], scope='Conv2d_0a_1x1')
          branch_2 = slim.conv2d(
              branch_2, depth(384), [3, 3], scope='Conv2d_0b_3x3')
          branch_2 = tf.concat(axis=3, values=[
              slim.conv2d(branch_2, depth(384), [1, 3], scope='Conv2d_0c_1x3'),
              slim.conv2d(branch_2, depth(384), [3, 1], scope='Conv2d_0d_3x1')])
        with tf.variable_scope('Branch_3'):
          branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3')
          branch_3 = slim.conv2d(
              branch_3, depth(192), [1, 1], scope='Conv2d_0b_1x1')
        net = tf.concat(axis=3, values=[branch_0, branch_1, branch_2, branch_3])
      end_points[end_point] = net
      if end_point == final_endpoint: return net, end_points
    raise ValueError('Unknown final endpoint %s' % final_endpoint)


This completes the core of Inception v3: three groups of Inception modules, each containing several structurally similar Inception modules.

The image size shrinks from 299x299 to 8x8 through five convolutions or poolings of stride 2, while the channel count grows from 3 all the way to 2048.

Each group of Inception modules simplifies the spatial structure while converting spatial information into higher-level, more abstract feature information. This process steadily reduces the size of each layer's output tensor and lowers the computation cost.

A pattern emerges across the Inception modules: there are generally 4 branches. Branch 1 is usually a 1x1 convolution; branch 2 is usually a 1x1 convolution followed by factorized 1xn and nx1 convolutions; branch 3 is similar to branch 2 but generally deeper; and branch 4 usually contains a max pool or average pool.

An Inception module therefore combines four feature transforms of different complexity, a simple abstraction (branch 1), more complex ones (branches 2 and 3), and a simplified pooling path (branch 4), selectively retaining high-level features at different depths and maximizing the network's expressive power, as the sketch below illustrates.
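To make that template concrete, here is a minimal, hypothetical sketch (not from the slim source): a schematic 4-branch module in the style of the code above. The function name and all channel counts are illustrative placeholders; it assumes the same slim.arg_scope defaults as the real code (stride 1, padding 'SAME') and the depth() lambda defined earlier.

def generic_inception_module(net, depth, scope):
  """A schematic 4-branch Inception module; channel counts are placeholders."""
  with tf.variable_scope(scope):
    with tf.variable_scope('Branch_0'):
      # Branch 1: the simplest feature abstraction, a 1x1 convolution.
      branch_0 = slim.conv2d(net, depth(64), [1, 1], scope='Conv2d_0a_1x1')
    with tf.variable_scope('Branch_1'):
      # Branch 2: a 1x1 convolution, then a factorized 1x7/7x1 pair.
      branch_1 = slim.conv2d(net, depth(48), [1, 1], scope='Conv2d_0a_1x1')
      branch_1 = slim.conv2d(branch_1, depth(64), [1, 7], scope='Conv2d_0b_1x7')
      branch_1 = slim.conv2d(branch_1, depth(64), [7, 1], scope='Conv2d_0c_7x1')
    with tf.variable_scope('Branch_2'):
      # Branch 3: like branch 2, but deeper.
      branch_2 = slim.conv2d(net, depth(64), [1, 1], scope='Conv2d_0a_1x1')
      branch_2 = slim.conv2d(branch_2, depth(96), [7, 1], scope='Conv2d_0b_7x1')
      branch_2 = slim.conv2d(branch_2, depth(96), [1, 7], scope='Conv2d_0c_1x7')
    with tf.variable_scope('Branch_3'):
      # Branch 4: a pooling path followed by a 1x1 projection.
      branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3')
      branch_3 = slim.conv2d(branch_3, depth(64), [1, 1], scope='Conv2d_0b_1x1')
    # Merge the four branches along the channel dimension.
    return tf.concat(axis=3, values=[branch_0, branch_1, branch_2, branch_3])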

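One caveat: inception_v3 below calls _reduced_kernel_size_for_small_input, which this excerpt never defines. For completeness, this is the helper as it appears in the TF-slim model library; it shrinks the nominal kernel when the input feature map is smaller than it (e.g. when feeding images smaller than 299x299), and leaves it unchanged when the spatial dims are unknown at graph-build time.

def _reduced_kernel_size_for_small_input(input_tensor, kernel_size):
  """Shrinks kernel_size if the input's spatial dims are smaller than it."""
  shape = input_tensor.get_shape().as_list()
  if shape[1] is None or shape[2] is None:
    kernel_size_out = kernel_size
  else:
    kernel_size_out = [min(shape[1], kernel_size[0]),
                       min(shape[2], kernel_size[1])]
  return kernel_size_out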
def inception_v3(inputs,
                 num_classes=1000,
                 is_training=True,
                 dropout_keep_prob=0.8,
                 min_depth=16,
                 depth_multiplier=1.0,
                 prediction_fn=slim.softmax,
                 spatial_squeeze=True,
                 reuse=None,
                 scope='InceptionV3'):
  """Inception model from http://arxiv.org/abs/1512.00567.

  "Rethinking the Inception Architecture for Computer Vision"

  Christian Szegedy, Vincent Vanhoucke, Sergey Ioffe, Jonathon Shlens,
  Zbigniew Wojna.

  With the default arguments this method constructs the exact model defined in
  the paper. However, one can experiment with variations of the inception_v3
  network by changing arguments dropout_keep_prob, min_depth and
  depth_multiplier.

  The default image size used to train this network is 299x299.

  Args:
    inputs: a tensor of size [batch_size, height, width, channels].
    num_classes: number of predicted classes.
    is_training: whether is training or not.
    dropout_keep_prob: the percentage of activation values that are retained.
    min_depth: Minimum depth value (number of channels) for all convolution ops.
      Enforced when depth_multiplier < 1, and not an active constraint when
      depth_multiplier >= 1.
    depth_multiplier: Float multiplier for the depth (number of channels)
      for all convolution ops. The value must be greater than zero. Typical
      usage will be to set this value in (0, 1) to reduce the number of
      parameters or computation cost of the model.
    prediction_fn: a function to get predictions out of logits.
    spatial_squeeze: if True, logits is of shape [B, C], if false logits is
        of shape [B, 1, 1, C], where B is batch_size and C is number of classes.
    reuse: whether or not the network and its variables should be reused. To be
      able to reuse 'scope' must be given.
    scope: Optional variable_scope.

  Returns:
    logits: the pre-softmax activations, a tensor of size
      [batch_size, num_classes]
    end_points: a dictionary from components of the network to the corresponding
      activation.

  Raises:
    ValueError: if 'depth_multiplier' is less than or equal to zero.
  """
  if depth_multiplier <= 0:
    raise ValueError('depth_multiplier is not greater than zero.')
  depth = lambda d: max(int(d * depth_multiplier), min_depth)
  with tf.variable_scope(scope, 'InceptionV3', [inputs, num_classes],
                         reuse=reuse) as scope:
    with slim.arg_scope([slim.batch_norm, slim.dropout],
                        is_training=is_training):
      net, end_points = inception_v3_base(
          inputs, scope=scope, min_depth=min_depth,
          depth_multiplier=depth_multiplier)

      # Auxiliary Head logits
      # First set the default stride of conv, max pool and average pool to 1,
      # with padding 'SAME'.
      with slim.arg_scope([slim.conv2d, slim.max_pool2d, slim.avg_pool2d],
                          stride=1, padding='SAME'):
        aux_logits = end_points['Mixed_6e']
        # After Mixed_6e, apply a 5x5 average pool with stride 3 and padding
        # 'VALID', reducing the size from 17 x 17 x 768 to 5 x 5 x 768.
        with tf.variable_scope('AuxLogits'):
          aux_logits = slim.avg_pool2d(
              aux_logits, [5, 5], stride=3, padding='VALID',
              scope='AvgPool_1a_5x5')
          # Then attach a 128-channel 1x1 convolution and a 768-channel 5x5
          # convolution.
          aux_logits = slim.conv2d(aux_logits, depth(128), [1, 1],
                                   scope='Conv2d_1b_1x1')

          # Shape of feature map before the final layer.
          kernel_size = _reduced_kernel_size_for_small_input(
              aux_logits, [5, 5])
          aux_logits = slim.conv2d(
              aux_logits, depth(768), kernel_size,
              weights_initializer=trunc_normal(0.01),
              padding='VALID', scope='Conv2d_2a_{}x{}'.format(*kernel_size))
          # Finally, a 1x1 convolution with num_classes output channels, with
          # no activation function and no normalizer; the output becomes
          # 1 x 1 x 1000.
          aux_logits = slim.conv2d(
              aux_logits, num_classes, [1, 1], activation_fn=None,
              normalizer_fn=None, weights_initializer=trunc_normal(0.001),
              scope='Conv2d_2b_1x1')
          # Squeeze out the two spatial dimensions of size 1.
          if spatial_squeeze: 
            aux_logits = tf.squeeze(aux_logits, [1, 2], name='SpatialSqueeze')
          end_points['AuxLogits'] = aux_logits


      # What follows is the standard classification prediction.
      # Final pooling and prediction
      with tf.variable_scope('Logits'):
        kernel_size = _reduced_kernel_size_for_small_input(net, [8, 8])
        net = slim.avg_pool2d(net, kernel_size, padding='VALID',
                              scope='AvgPool_1a_{}x{}'.format(*kernel_size))
        # 1 x 1 x 2048
        net = slim.dropout(net, keep_prob=dropout_keep_prob, scope='Dropout_1b')
        end_points['PreLogits'] = net
        # 2048
        logits = slim.conv2d(net, num_classes, [1, 1], activation_fn=None,
                             normalizer_fn=None, scope='Conv2d_1c_1x1')
        if spatial_squeeze:
          logits = tf.squeeze(logits, [1, 2], name='SpatialSqueeze')
        # 1000
      end_points['Logits'] = logits
      end_points['Predictions'] = prediction_fn(logits, scope='Predictions')
  return logits, end_points
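
To close the loop, here is a minimal, hypothetical training-side sketch (not part of the original post; `images` and `one_hot_labels` are placeholder names). It builds the graph under inception_arg_scope and adds the auxiliary head's cross-entropy to the main loss with a small weight; 0.4 is the value commonly used with this slim implementation, and the down-weighted auxiliary loss improves gradient flow to the lower layers.

import tensorflow as tf
import tensorflow.contrib.slim as slim

images = tf.placeholder(tf.float32, [None, 299, 299, 3])
one_hot_labels = tf.placeholder(tf.float32, [None, 1000])

with slim.arg_scope(inception_arg_scope()):
  logits, end_points = inception_v3(images, num_classes=1000, is_training=True)

# Main classification loss plus the down-weighted auxiliary loss.
tf.losses.softmax_cross_entropy(one_hot_labels, logits)
tf.losses.softmax_cross_entropy(one_hot_labels, end_points['AuxLogits'],
                                weights=0.4)
# get_total_loss() also picks up the L2 regularization losses registered by
# inception_arg_scope.
total_loss = tf.losses.get_total_loss()

Remember to run the tf.GraphKeys.UPDATE_OPS collection (the batch-norm moving-average updates noted earlier) together with the training op.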