CNN series -- tf.nn.conv2d and tf.nn.max_pool explained
阿新 • Published: 2018-12-09
You know the drill -- sharing a line of lyrics:
《故夢》: Who stepped on dry twigs with a soft snap, as fireflies traced the incense of the painted screen; for whom do I gather a sleeve of fragrance -- the red-leaf letter carries lasting affection
Next, let's look at how to build CNNs in TensorFlow.
tf.nn.conv2d(input, filter, strides, padding, use_cudnn_on_gpu=True, data_format='NHWC', name=None)
- input: [batch, in_height, in_width, in_channels]
- filter: [height, width, in_channels, out_channels]
- padding:
- 'VALID': p = 0 (no padding)
- 'SAME': output spatial size equals the input's (with stride 1)
- strides: the stride along each dimension ([batch, height, width, channels]); usually set to [1, strides, strides, 1]
output: [batch, out_height, out_width, out_channels]
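The output spatial size follows directly from the padding mode. A minimal pure-Python sketch of TensorFlow's shape rules (the helper name `conv_out_size` is mine, not part of the TensorFlow API):

```python
import math

def conv_out_size(in_size, filter_size, stride, padding):
    # TensorFlow's output-size rules for 'VALID' and 'SAME' padding
    if padding == 'VALID':
        return math.ceil((in_size - filter_size + 1) / stride)
    elif padding == 'SAME':
        return math.ceil(in_size / stride)
    raise ValueError(padding)

print(conv_out_size(28, 5, 1, 'VALID'))  # 24: a 5x5 filter shrinks 28 -> 24
print(conv_out_size(28, 5, 1, 'SAME'))   # 28: 'SAME' preserves the size at stride 1
```

The same rules also give max-pool output sizes, since pooling windows slide the same way filters do.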
tf.nn.max_pool(input, ksize, strides, padding, data_format="NHWC", name=None)
- input: [batch, in_height, in_width, in_channels]
- ksize: usually [1, height, width, 1]; we do not pool over batch or channels, so those entries are 1
- strides: the stride along each dimension, usually [1, strides, strides, 1]
- padding: usually 'VALID' (p = 0); 'SAME' is also allowed
output: [batch, out_height, out_width, in_channels]
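To make the pooling operation concrete, here is a minimal NumPy sketch of 2x2 max pooling with stride 2 on a single-channel map -- a simplification of what tf.nn.max_pool does over the full NHWC layout (the helper name `max_pool_2x2` is mine):

```python
import numpy as np

def max_pool_2x2(x):
    # split the map into non-overlapping 2x2 blocks and keep each block's max
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

x = np.arange(16).reshape(4, 4)
print(max_pool_2x2(x))
# [[ 5  7]
#  [13 15]]
```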
import tensorflow as tf
def conv_2d(input_data, filter_shape, bias_shape, strides=None, padding='VALID', activation_function=0):
    '''
    :param input_data: [batch, in_height, in_width, in_channels]
    :param filter_shape: [height, width, in_channels, out_channels]
    :param bias_shape: its length equals the last entry of filter_shape (out_channels)
    :param strides: a list of ints, defaults to [1, 1, 1, 1]
    :param padding: 'VALID' or 'SAME'
    :param activation_function: 1 -> relu, 2 -> sigmoid, 3 -> tanh, 0 (default) -> no activation
    :return: [batch, out_height, out_width, out_channels]
    '''
    # 1. apply the filter with tf.nn.conv2d
    # 2. add the bias via Python broadcasting
    # 3. apply the activation function (none by default)
    if strides is None:
        strides = [1, 1, 1, 1]
    # Per the paper, the variance of the initial weights should be 2 / fan_in,
    # where fan_in is the number of inputs to each neuron: height * width * in_channels
    fan_in = filter_shape[0] * filter_shape[1] * filter_shape[2]
    filter_init = tf.random_normal_initializer(stddev=(2.0 / fan_in) ** 0.5)
    conv_filter = tf.get_variable('filter', filter_shape, initializer=filter_init)
    bias_init = tf.constant_initializer(value=0)
    b = tf.get_variable('bias', bias_shape, initializer=bias_init)
    conv_out = tf.nn.conv2d(input_data, conv_filter, strides=strides, padding=padding)
    add_bias = tf.nn.bias_add(conv_out, b)
    if activation_function == 3:
        return tf.nn.tanh(add_bias)
    elif activation_function == 2:
        return tf.nn.sigmoid(add_bias)
    elif activation_function == 1:
        return tf.nn.relu(add_bias)
    else:
        return add_bias
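The initializer above follows the He et al. rule: draw the initial weights with variance 2 / fan_in, where fan_in is the number of inputs feeding each neuron (height * width * in_channels). A quick NumPy check of that rule (the sample size and seed are arbitrary choices of mine):

```python
import numpy as np

fan_in = 5 * 5 * 32                      # e.g. a 5x5 filter over 32 input channels
target_var = 2.0 / fan_in                # He initialization: variance = 2 / fan_in
rng = np.random.default_rng(0)
w = rng.normal(0.0, target_var ** 0.5, size=100_000)
print(abs(w.var() - target_var) < 1e-4)  # empirical variance matches the target
```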
def max_pool(input_data, k_size=2, k_strides=2, padding='VALID'):
    '''
    :param input_data: [batch, in_height, in_width, channels]
    :param k_size: side length of the pooling window
    :param k_strides: stride along each spatial dimension; the stride over batch and channels is 1
    :param padding: 'VALID' (p = 0) or 'SAME' (same spatial size, with stride 1)
    :return: [batch, out_height, out_width, channels]
    '''
    return tf.nn.max_pool(input_data, ksize=[1, k_size, k_size, 1],
                          strides=[1, k_strides, k_strides, 1], padding=padding)
Next, we will use these two functions to build the LeNet-5-based network from the course.