
CNN series: tf.nn.conv2d and tf.nn.max_pool explained

As usual, a lyric to share first:

From 《故夢》: "Who stepped on dry twigs with a soft crack; fireflies trace fragrance across the painted screen; for whom do I gather a sleeve of perfume; the red-leaf letter carries affection long and deep."

Next, we look at the TensorFlow operations used to build CNNs.

  • tf.nn.conv2d(input, filter, strides, padding, use_cudnn_on_gpu=True, data_format='NHWC', name=None)

    1. input: [batch, in_height, in_width, in_channels]
    2. filter: [filter_height, filter_width, in_channels, out_channels]
    3. strides: stride along each dimension ([batch, height, width, channels]); usually [1, strides, strides, 1], since we do not stride over batch or channels
    4. padding:
      • 'VALID': p = 0 (no padding)
      • 'SAME': output spatial size equals the input's (for stride 1)

    output: [batch, out_height, out_width, out_channels]

    $$out\_height = \frac{in\_height + 2p - filter\_height}{strides\_height} + 1, \qquad out\_width = \frac{in\_width + 2p - filter\_width}{strides\_width} + 1$$

    (Both formulas are verified in the shape-check sketch after this list.)
  • tf.nn.max_pool(value, ksize, strides, padding, data_format='NHWC', name=None)

    1. value: [batch, in_height, in_width, in_channels]
    2. ksize: usually [1, height, width, 1]; we do not pool over batch or channels, so those entries are 1
    3. strides: stride along each dimension, usually [1, strides, strides, 1]
    4. padding: usually 'VALID', i.e. p = 0; 'SAME' is also allowed

    output: [batch, out_height, out_width, in_channels]

    $$out\_height = \frac{in\_height + 2p - ksize\_height}{strides\_height} + 1, \qquad out\_width = \frac{in\_width + 2p - ksize\_width}{strides\_width} + 1$$
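To make the shape formulas concrete, here is a minimal shape-check sketch (assuming TensorFlow 1.x; the 28x28 input and 5x5 filter are arbitrary example sizes, not from the original post). The shapes are static, so they can be printed without running a session:

import tensorflow as tf

# A single-image batch: 28x28, 1 channel (MNIST-sized, arbitrary example).
x = tf.placeholder(tf.float32, [1, 28, 28, 1])

# 5x5 filter, 1 input channel, 6 output channels.
w = tf.get_variable('w_demo', [5, 5, 1, 6])

# 'VALID' (p = 0): out = (28 + 2*0 - 5)/1 + 1 = 24
conv_valid = tf.nn.conv2d(x, w, strides=[1, 1, 1, 1], padding='VALID')
print(conv_valid.shape)   # (1, 24, 24, 6)

# 'SAME' with stride 1 preserves the spatial size.
conv_same = tf.nn.conv2d(x, w, strides=[1, 1, 1, 1], padding='SAME')
print(conv_same.shape)    # (1, 28, 28, 6)

# 2x2 max pooling, stride 2: out = (24 + 2*0 - 2)/2 + 1 = 12
pool = tf.nn.max_pool(conv_valid, ksize=[1, 2, 2, 1],
                      strides=[1, 2, 2, 1], padding='VALID')
print(pool.shape)         # (1, 12, 12, 6)

Below are the two wrapper functions we will reuse when building the network: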
import tensorflow as tf


def conv_2d(input_data, filter_shape, bias_shape, strides=None, padding='VALID', activation_function=0):
    '''
    Convolution + bias + optional activation.

    :param input_data: [batch, in_height, in_width, in_channels]
    :param filter_shape: [height, width, in_channels, out_channels]
    :param bias_shape: length equals the last entry of filter_shape (out_channels)
    :param strides: a list of ints, defaults to [1, 1, 1, 1]
    :param padding: 'VALID' (p=0) or 'SAME'
    :param activation_function: 1 -> relu, 2 -> sigmoid, 3 -> tanh, otherwise (default 0) -> no activation
    :return: [batch, out_height, out_width, out_channels]
    '''
    # 1. Run the filtering step with tf.nn.conv2d
    # 2. Add the bias (tf.nn.bias_add broadcasts it over batch and spatial dims)
    # 3. Apply the activation function (none by default)


    # Default strides = [1, 1, 1, 1]
    if strides is None:
        strides = [1, 1, 1, 1]

    # He initialization: the variance of the initial weights is 2 / fan_in,
    # where fan_in = filter_height * filter_width * in_channels
    in_data_num = filter_shape[0] * filter_shape[1] * filter_shape[2]
    filter_init = tf.random_normal_initializer(stddev=(2.0 / in_data_num) ** 0.5)

    filter_weights = tf.get_variable('filter', filter_shape, initializer=filter_init)
    bias_init = tf.constant_initializer(value=0)
    b = tf.get_variable('bias', bias_shape, initializer=bias_init)

    conv_out = tf.nn.conv2d(input_data, filter_weights, strides=strides, padding=padding)

    add_bias = tf.nn.bias_add(conv_out, b)

    if activation_function == 1:
        return tf.nn.relu(add_bias)
    elif activation_function == 2:
        return tf.nn.sigmoid(add_bias)
    elif activation_function == 3:
        return tf.nn.tanh(add_bias)
    else:
        return add_bias


def max_pool(input_data, k_size=2, k_strides=2, padding='VALID'):
    '''
    2-D max pooling over the spatial dimensions.

    :param input_data: [batch, in_height, in_width, channels]
    :param k_size: height/width of the pooling window
    :param k_strides: stride in the height/width dimensions; the stride over batch and channels stays 1
    :param padding: 'VALID' (p=0) or 'SAME' (input and output spatial dims match)
    :return: [batch, out_height, out_width, channels]
    '''
    return tf.nn.max_pool(input_data, ksize=[1, k_size, k_size, 1],
                          strides=[1, k_strides, k_strides, 1], padding=padding)
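Before moving on, here is a minimal usage sketch (TensorFlow 1.x). Because conv_2d creates variables named 'filter' and 'bias' through tf.get_variable, each call must live in its own variable scope; the scope names and layer sizes below are illustrative placeholders, not the course's exact architecture:

x = tf.placeholder(tf.float32, [None, 32, 32, 1])

with tf.variable_scope('conv1'):
    # 5x5 filters, 1 -> 6 channels, ReLU (activation_function=1)
    h1 = conv_2d(x, [5, 5, 1, 6], [6], padding='SAME', activation_function=1)
p1 = max_pool(h1)   # 32x32 -> 16x16

with tf.variable_scope('conv2'):
    h2 = conv_2d(p1, [5, 5, 6, 16], [16], activation_function=1)  # 16x16 -> 12x12
p2 = max_pool(h2)   # 12x12 -> 6x6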

Next, we will use these two functions to build the LeNet-5-based network structure from the course.