
Implementing a CNN with tf.nn.conv2d, tf.nn.bias_add, tf.nn.max_pool, and tf.nn.relu

A quick note

I will first explain each of these APIs. The code at the end uses everything covered here, so if something is not immediately clear while reading, don't worry; it will make sense once you see the code.

tf.nn.conv2d

tf.nn.conv2d(
    input,
    filter,
    strides,
    padding,
    use_cudnn_on_gpu=True,
    data_format='NHWC',
    dilations=[1, 1, 1, 1],
    name=None
)
'''
Args:
	input: A Tensor. Must be one of the following types: half, bfloat16, float32, float64. A 4-D tensor. The dimension order is interpreted according to the value of data_format, see below for details.
	filter: A Tensor. Must have the same type as input. A 4-D tensor of shape [filter_height, filter_width, in_channels, out_channels]
	strides: A list of ints. 1-D tensor of length 4. The stride of the sliding window for each dimension of input. The dimension order is determined by the value of data_format, see below for details.
	padding: A string from: "SAME", "VALID". The type of padding algorithm to use.
	use_cudnn_on_gpu: An optional bool. Defaults to True.
	data_format: An optional string from: "NHWC", "NCHW". Defaults to "NHWC". Specify the data format of the input and output data. With the default format "NHWC", the data is stored in the order of: [batch, height, width, channels]. Alternatively, the format could be "NCHW", the data storage order of: [batch, channels, height, width].
	dilations: An optional list of ints. Defaults to [1, 1, 1, 1]. 1-D tensor of length 4. The dilation factor for each dimension of input. If set to k > 1, there will be k-1 skipped cells between each filter element on that dimension. The dimension order is determined by the value of data_format, see above for details. Dilations in the batch and depth dimensions must be 1.
	name: A name for the operation (optional).
'''

Parameter notes

input: the input to be convolved. It must be a 4-D Tensor with shape [batch_size, in_height, in_width, in_channels] or [batch, in_channels, in_height, in_width]; which layout applies depends on the value of data_format. The dimensions mean [number of examples per training batch, height, width, number of channels]. Its type is typically float32 or float64.

filter: the convolution kernels of the CNN. It must be a Tensor with shape **[filter_height, filter_width, in_channels, out_channels]**, in exactly this order (note that it differs from the order used by input). The dimensions mean [kernel height, kernel width, number of input channels, number of kernels]. Its type must match input, and note that its in_channels must equal the in_channels of input.

strides: the stride of the sliding window in each dimension of the input; a 1-D vector of length 4.

padding: a string, either "SAME" or "VALID"; this choice determines how the convolution pads its input. See the details of CNNs for the specifics.

data_format: an optional string, either "NHWC" or "NCHW", defaulting to "NHWC". It specifies the data format of the input and output. With the default "NHWC", data is stored in the order [batch, height, width, channels]; with "NCHW", the storage order is [batch, channels, height, width].

return: the result is a Tensor; this output is what we usually call the feature map.
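
To make the shapes concrete, here is a minimal sketch (assuming TensorFlow 1.x and the default NHWC format, with made-up random inputs) that convolves a single 28x28 single-channel image with 32 kernels of size 5x5 and compares the feature-map shapes produced by the two padding modes:

import tensorflow as tf

# One 28x28 grayscale image: [batch, height, width, channels] = [1, 28, 28, 1]
image = tf.random_normal([1, 28, 28, 1])
# 32 kernels of size 5x5 over 1 input channel: [filter_height, filter_width, in_channels, out_channels]
kernel = tf.random_normal([5, 5, 1, 32])

same = tf.nn.conv2d(image, kernel, strides=[1, 1, 1, 1], padding="SAME")
valid = tf.nn.conv2d(image, kernel, strides=[1, 1, 1, 1], padding="VALID")

print(same.get_shape())   # (1, 28, 28, 32): SAME keeps the spatial size at stride 1
print(valid.get_shape())  # (1, 24, 24, 32): VALID shrinks it to 28 - 5 + 1 = 24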

tf.nn.max_pool

Defines the pooling operation of a convolutional neural network (max pooling here); for average pooling, use the tf.nn.avg_pool interface instead.

tf.nn.max_pool(
    value,
    ksize,
    strides,
    padding,
    data_format='NHWC',
    name=None
)
'''
Args:
	value: A 4-D Tensor of the format specified by data_format.
	ksize: A list or tuple of 4 ints. The size of the window for each dimension of the input tensor.
	strides: A list or tuple of 4 ints. The stride of the sliding window for each dimension of the input tensor.
	padding: A string, either 'VALID' or 'SAME'. The padding algorithm. See the "returns" section of tf.nn.convolution for details.
	data_format: A string. 'NHWC', 'NCHW' and 'NCHW_VECT_C' are supported.
	name: Optional name for the operation.
Returns:
	A Tensor of format specified by data_format. The max pooled output tensor.
'''

Parameter notes

value: the tensor to be pooled; a 4-D Tensor in NHWC or NCHW layout. A pooling layer usually follows a convolution layer, so the input is typically a feature map.
ksize: the size of the pooling window, a vector of 4 ints, usually [1, height, width, 1]; pooling is normally not applied over the batch or channel dimensions, so those two entries are set to 1.
strides: the stride of the pooling window in each dimension, usually [1, stride, stride, 1].
padding: either 'VALID' or 'SAME'.
data_format: a string, 'NHWC' or 'NCHW'.
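
As a quick illustration (again assuming TensorFlow 1.x and NHWC, with a made-up feature map), pooling a 28x28 feature map with 32 channels using a 2x2 window and stride 2 halves the spatial dimensions while leaving batch and channels untouched:

import tensorflow as tf

feature_map = tf.random_normal([1, 28, 28, 32])   # e.g. the output of a conv layer
pooled = tf.nn.max_pool(feature_map,
                        ksize=[1, 2, 2, 1],       # no pooling over batch or channels
                        strides=[1, 2, 2, 1],
                        padding="SAME")
print(pooled.get_shape())                         # (1, 14, 14, 32)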

tf.nn.bias_add

tf.nn.bias_add(
    value,
    bias,
    data_format=None,
    name=None
)
'''
Args:
	value: A Tensor with type float, double, int64, int32, uint8, int16, int8, complex64, or complex128.
	bias: A 1-D Tensor with size matching the last dimension of value. Must be the same type as value unless value is a quantized type, in which case a different quantized type may be used.
	data_format: A string. 'NHWC' and 'NCHW' are supported.
	name: A name for the operation (optional).
Returns:
   A Tensor with the same type as value.
'''

Notes

What it implements is the "add bias" step of a neural network: tf.nn.bias_add(value, bias).
value: the input to which the bias is added; a Tensor.
bias: the bias, a 1-D Tensor whose size must match the size of the last dimension of value (see the example below).
data_format: the data layout; a string, 'NHWC' and 'NCHW' are supported (with "NHWC" the data is stored in the order [batch, height, width, channels]; with "NCHW" the storage order is [batch, channels, height, width]).

Returns the result of the addition.
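
A minimal sketch of the "last dimension must match" rule (assuming TensorFlow 1.x, with made-up tensors): the bias holds one value per channel and is broadcast over batch, height and width.

import tensorflow as tf

conv_out = tf.random_normal([1, 14, 14, 32])   # NHWC feature map with 32 channels
bias = tf.zeros([32])                          # one bias per channel (last dim of value)

with_bias = tf.nn.bias_add(conv_out, bias)
print(with_bias.get_shape())                   # (1, 14, 14, 32): same shape as the input
# tf.nn.bias_add(conv_out, tf.zeros([16]))     # would fail: 16 does not match 32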

tf.nn.relu

The activation function of the neural network (I will write a separate post later going over the pros and cons of the various activation functions; interested readers can keep an eye out for it).

tf.nn.relu(
    features,
    name=None
)

It computes max(features, 0).
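
For example (TensorFlow 1.x), negative entries are clamped to zero and non-negative entries pass through unchanged:

import tensorflow as tf

x = tf.constant([-2.0, -0.5, 0.0, 3.0])
with tf.Session() as sess:
    print(sess.run(tf.nn.relu(x)))   # [0. 0. 0. 3.]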

Defining a convolutional neural network with the interfaces above

Code source

#!/usr/bin/env python
# coding: utf-8
from __future__ import division, print_function, absolute_import
import tensorflow as tf
# Import MNIST data
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("/tmp/data/", one_hot=True)
# Training Parameters
learning_rate = 0.001
num_steps = 500
batch_size = 128
display_step = 10

# Network Parameters
num_input = 784 # MNIST data input (img shape: 28*28)
num_classes = 10 # MNIST total classes (0-9 digits)
dropout = 0.75 # Dropout, probability to keep units

X=tf.placeholder(tf.float32,[None,num_input])
Y=tf.placeholder(tf.float32,[None,num_classes])
keep_prob=tf.placeholder(tf.float32)

def conv2d(x,W,b,strides=1):
    # Conv2D wrapper: convolution, add bias, then ReLU activation
    x=tf.nn.conv2d(x,W,strides=[1,strides,strides,1],padding="SAME")
    x=tf.nn.bias_add(x,b)
    return tf.nn.relu(x)

def maxpool2d(x,k=2):
    # MaxPool2D wrapper: k x k window with stride k
    return tf.nn.max_pool(x,ksize=[1,k,k,1],strides=[1,k,k,1],padding="SAME")

def conv_net(x, weights, biases, dropout):
    # Reshape the flat 784-pixel MNIST input back into a 28x28 single-channel image
    x=tf.reshape(x,[-1,28,28,1])

    # Convolution layer 1 + max pooling: 28x28x1 -> 28x28x32 -> 14x14x32
    conv1=conv2d(x,weights["wc1"],biases["bc1"])
    conv1=maxpool2d(conv1,k=2)

    # Convolution layer 2 + max pooling: 14x14x32 -> 14x14x64 -> 7x7x64
    conv2=conv2d(conv1,weights["wc2"],biases["bc2"])
    conv2=maxpool2d(conv2,k=2)

    # Flatten to [batch, 7*7*64] and apply the fully connected layer
    fc1=tf.reshape(conv2,[-1,weights["wd1"].get_shape().as_list()[0]])
    fc1=tf.add(tf.matmul(fc1,weights["wd1"]),biases["bd1"])
    fc1=tf.nn.relu(fc1)

    # Apply dropout (the keep probability is fed in at run time)
    fc1=tf.nn.dropout(fc1,dropout)

    # Output layer: unnormalized class scores (logits)
    out=tf.add(tf.matmul(fc1,weights["out"]),biases["out"])
    return out

weights={
    # 5x5 conv, 1 input channel, 32 output channels
    "wc1":tf.Variable(tf.random_normal([5,5,1,32])),
    # 5x5 conv, 32 input channels, 64 output channels
    "wc2":tf.Variable(tf.random_normal([5,5,32,64])),
    # fully connected, 7*7*64 inputs (flattened 7x7x64 feature map), 1024 outputs
    "wd1":tf.Variable(tf.random_normal([7*7*64,1024])),
    # 1024 inputs, num_classes outputs (class prediction)
    "out":tf.Variable(tf.random_normal([1024,num_classes]))
}
biases={
    # one bias per output channel / unit, matching the last dimension of each layer's output
    "bc1":tf.Variable(tf.random_normal([32])),
    "bc2":tf.Variable(tf.random_normal([64])),
    "bd1":tf.Variable(tf.random_normal([1024])),
    "out":tf.Variable(tf.random_normal([num_classes]))
}


logits=conv_net(X,weights,biases,keep_prob)

pred_classes=tf.argmax(tf.nn.softmax(logits),axis=-1)

loss_op=tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=logits,labels=Y))

optimizer=tf.train.AdamOptimizer(learning_rate=learning_rate)
train_op=optimizer.minimize(loss_op)

accuracy_op=tf.reduce_mean(tf.cast(tf.equal(pred_classes,tf.argmax(Y,-1)),tf.float32))

init=tf.global_variables_initializer()


with tf.Session() as sess:
    sess.run(init)
    for step in range(1, num_steps+1):
        batch_x, batch_y = mnist.train.next_batch(batch_size)
        # Run optimization op (backprop)
        sess.run(train_op, feed_dict={X: batch_x, Y: batch_y, keep_prob: dropout})
        if step % display_step == 0 or step == 1:
            # Calculate batch loss and accuracy
            loss, acc = sess.run([loss_op, accuracy_op], feed_dict={X: batch_x,
                                                                    Y: batch_y,
                                                                    keep_prob: 1.0})
            print("Step " + str(step) + ", Minibatch Loss= " +
                  "{:.4f}".format(loss) + ", Training Accuracy= " +
                  "{:.3f}".format(acc))

    print("Optimization Finished!")

    # Calculate accuracy for 256 MNIST test images
    print("Testing Accuracy:",
          sess.run(accuracy_op, feed_dict={X: mnist.test.images[:256],
                                           Y: mnist.test.labels[:256],
                                           keep_prob: 1.0}))
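
One piece of shape bookkeeping the code relies on but never spells out: weights["wd1"] has 7*7*64 rows because the SAME-padded, stride-1 convolutions keep the spatial size while each 2x2 max pool with stride 2 halves it (28x28 -> 14x14 -> 7x7), and the second convolution outputs 64 channels. Below is a small self-contained sketch (assuming TensorFlow 1.x, with random stand-in weights) that verifies this chain:

import tensorflow as tf

x = tf.random_normal([1, 28, 28, 1])   # one MNIST-sized image
w1 = tf.random_normal([5, 5, 1, 32])   # same shapes as weights["wc1"] and weights["wc2"]
w2 = tf.random_normal([5, 5, 32, 64])

h = tf.nn.conv2d(x, w1, strides=[1, 1, 1, 1], padding="SAME")                     # (1, 28, 28, 32)
h = tf.nn.max_pool(h, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding="SAME")   # (1, 14, 14, 32)
h = tf.nn.conv2d(h, w2, strides=[1, 1, 1, 1], padding="SAME")                     # (1, 14, 14, 64)
h = tf.nn.max_pool(h, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding="SAME")   # (1, 7, 7, 64)
print(h.get_shape())   # (1, 7, 7, 64) -> flattens to 7*7*64 = 3136 features per example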