Understanding the Official TensorFlow Documentation (5): Convolutional MNIST

阿新 • Published: 2018-12-12

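The model we built earlier reaches only about 91% accuracy on MNIST, which is rather low, so let's try a convolutional neural network to improve it. If you are not familiar with convolutional neural networks, you can refer to this article of mine: link.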

Weight initialization

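Before building the model, we first create the weights and biases. In general, weights should be initialized with a small amount of noise to break symmetry and to prevent zero gradients. Because we use ReLU activations, it is also good practice to initialize the biases with a slightly positive value, which avoids neurons whose output is stuck at 0 ("dead neurons"). To avoid repeating the initialization code while building the model, we define two helper functions.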

def weight_variable(shape):
    initial = tf.truncated_normal(shape, stddev=0.1)  # draw random values from a truncated normal distribution
    return tf.Variable(initial)
def bias_variable(shape):
    initial = tf.constant(0.1, shape=shape )
    return tf.Variable(initial)
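
To make the "truncated normal" comment concrete, here is a minimal pure-Python sketch of that kind of sampling. It assumes the documented behaviour of tf.truncated_normal, where values more than two standard deviations from the mean are discarded and redrawn. The helper name truncated_normal_samples and the fixed seed are illustrative only and are not part of TensorFlow.

```python
import random


def truncated_normal_samples(n, mean=0.0, stddev=0.1, seed=0):
    """Draw n values from a normal distribution, redrawing any value
    that lies more than two standard deviations from the mean."""
    rng = random.Random(seed)  # fixed seed (an assumption) for repeatability
    samples = []
    while len(samples) < n:
        value = rng.gauss(mean, stddev)
        if abs(value - mean) <= 2 * stddev:
            samples.append(value)
    return samples


weights = truncated_normal_samples(1000)
# With stddev=0.1, every sample lies within [-0.2, 0.2].
print(min(weights), max(weights))
```

The result is small, nonzero, asymmetric noise, which is what the weights need in order to break symmetry.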

Convolution and pooling

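TensorFlow offers a lot of flexibility in convolution and pooling operations: how the boundaries are handled, what stride to use, and so on. Here our convolutions use a stride of 1 and are zero-padded so that the output is the same size as the input. Our pooling is plain max pooling over 2x2 blocks. To keep the code cleaner, we wrap these operations in functions.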

def conv2d(x, w):
    return tf.nn.conv2d(x, w, strides=[1, 1, 1, 1], padding="SAME")
def max_pool_2x2(x):
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2,1], padding="SAME")

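Each entry of strides is the step the window takes along the corresponding dimension of x, whose four dimensions are [batch, in_height, in_width, in_channels]. For example, strides = [1, 2, 3, 4] would mean a step of 1 along batch, 2 along in_height, 3 along in_width, and 4 along in_channels. That example only illustrates which entry belongs to which dimension: in practice the batch and in_channels steps are almost always 1, so the usual form is strides = [1, stride, stride, 1]. Likewise, ksize gives the size of the pooling window along each of the four dimensions of x. With ksize=[1, 2, 2, 1], each pooling window covers one example in the batch and one channel, and is 2 high and 2 wide.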
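
As a small illustration of what ksize=[1, 2, 2, 1] with strides=[1, 2, 2, 1] does to a single channel of a single image, here is a pure-Python sketch. The function max_pool_2x2_single is a hypothetical helper written for this example; it is not how TensorFlow implements pooling, and it assumes the image has even height and width.

```python
def max_pool_2x2_single(image):
    """2x2 max pooling with stride 2 on one single-channel image,
    given as a list of rows with even height and width."""
    pooled = []
    for r in range(0, len(image), 2):          # step 2 along the height
        row = []
        for c in range(0, len(image[0]), 2):   # step 2 along the width
            window = (image[r][c], image[r][c + 1],
                      image[r + 1][c], image[r + 1][c + 1])
            row.append(max(window))
        pooled.append(row)
    return pooled


image = [
    [1, 3, 2, 0],
    [4, 2, 1, 5],
    [0, 1, 7, 2],
    [3, 6, 1, 1],
]
pooled = max_pool_2x2_single(image)
print(pooled)  # [[4, 5], [6, 7]]
```

Each non-overlapping 2x2 block is replaced by its maximum, so the height and width are both halved.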

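padding can be either SAME or VALID. For VALID, the output shape is computed as:

out_height = ceil((in_height - filter_height + 1) / strides[1])
out_width = ceil((in_width - filter_width + 1) / strides[2])

For SAME, the output shape is computed as:

out_height = ceil(in_height / strides[1])
out_width = ceil(in_width / strides[2])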
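
The two padding modes can be checked with a few lines of Python. This sketch assumes the shape rules given in the TensorFlow documentation: ceil(in / stride) for SAME and ceil((in - filter + 1) / stride) for VALID. The function name output_size is made up for this example.

```python
import math


def output_size(in_size, filter_size, stride, padding):
    """Output length along one spatial dimension for a convolution
    or pooling window, following the SAME / VALID shape rules."""
    if padding == "SAME":
        return math.ceil(in_size / stride)
    if padding == "VALID":
        return math.ceil((in_size - filter_size + 1) / stride)
    raise ValueError("padding must be 'SAME' or 'VALID'")


print(output_size(28, 5, 1, "SAME"))   # 28: a 5x5 conv with stride 1 keeps the size
print(output_size(28, 5, 1, "VALID"))  # 24: without padding the image shrinks
print(output_size(28, 2, 2, "SAME"))   # 14: 2x2 pooling with stride 2 halves it
```

This is why the conv2d and max_pool_2x2 helpers in this post keep a 28x28 image at 28x28 after convolution and reduce it to 14x14 after pooling.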

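Now we can implement the first layer. It consists of a convolution followed by max pooling. The convolution computes 32 features for each 5x5 patch. Its weight tensor has shape [5, 5, 1, 32]: the first two dimensions are the patch size, the third is the number of input channels, and the last is the number of output channels. There is also a bias vector with one component per output channel.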

w_conv1 = weight_variable([5, 5, 1, 32])
b_conv1 = bias_variable([32])

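To apply this layer, we first reshape the input image x into a 4-D tensor: the second and third dimensions correspond to the image height and width, and the last dimension is the number of color channels. The -1 means that the size of that dimension (here the batch size) is inferred from the other values.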

x_image = tf.reshape(x, [-1, 28, 28, 1])

We then convolve x_image with the weight tensor, add the bias, apply the ReLU activation, and finally max pool.

h_conv1 = tf.nn.relu(conv2d(x_image, w_conv1) + b_conv1)
h_pool1 = max_pool_2x2(h_conv1)

To build a deeper network, we stack several layers of this type. The second convolutional layer also uses 5x5 kernels and produces 64 features.

w_conv2 = weight_variable([5, 5, 32, 64])
b_conv2 = bias_variable([64])
h_conv2 = tf.nn.relu(conv2d(h_pool1, w_conv2) + b_conv2)
h_pool2 = max_pool_2x2(h_conv2)

Densely connected layer

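28/2/2 = 7: after two rounds of 2x2 pooling, the image has been reduced to 7x7. We now add a fully connected layer with 1024 neurons to process the entire image. We reshape the tensor from the pooling layer into a batch of vectors, multiply by a weight matrix, add a bias, and apply ReLU.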

w_fc1 = weight_variable([7*7*64, 1024])
b_fc1 = bias_variable([1024])

h_pool2_flat = tf.reshape(h_pool2, [-1, 7*7*64])
h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, w_fc1) + b_fc1)
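
The arithmetic behind 7*7*64 and the size of each layer can be verified without TensorFlow. The sketch below uses the layer shapes from this post; the helper names conv_params and dense_params are invented for this example, and the final [1024, 10] layer is the output layer defined later in the post.

```python
def conv_params(kernel, in_channels, out_channels):
    """Weights plus biases of a kernel x kernel convolution layer."""
    return kernel * kernel * in_channels * out_channels + out_channels


def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


side = 28
for _ in range(2):
    # SAME-padded convolutions keep the size; each 2x2 pool halves it.
    side = side // 2
flat_size = side * side * 64  # length of each row of h_pool2_flat

params = {
    "conv1": conv_params(5, 1, 32),
    "conv2": conv_params(5, 32, 64),
    "fc1": dense_params(flat_size, 1024),
    "fc2": dense_params(1024, 10),
}
total = sum(params.values())
print(side, flat_size)  # 7 3136
print(params)           # fc1 holds the vast majority of the parameters
print(total)            # 3274634
```

Almost all of the roughly 3.3 million parameters sit in the first fully connected layer, which is one reason dropout is applied to its output.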

Dropout

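To reduce overfitting, we apply dropout before the output layer. We create a placeholder for the probability that a neuron's output is kept during dropout, which lets us turn dropout on during training and off during testing. TensorFlow's tf.nn.dropout op scales the kept outputs automatically in addition to masking the dropped ones, so no extra scaling is needed when using dropout.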

keep_prob = tf.placeholder("float")
h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)
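
The automatic scaling mentioned above can be sketched in pure Python. This assumes the documented behaviour of tf.nn.dropout with keep_prob: each element is kept with probability keep_prob and scaled by 1 / keep_prob, otherwise set to 0. The function dropout below is an illustrative stand-in, not TensorFlow's implementation.

```python
import random


def dropout(values, keep_prob, rng):
    """Keep each value with probability keep_prob and scale it by
    1 / keep_prob; set it to 0.0 otherwise."""
    return [v / keep_prob if rng.random() < keep_prob else 0.0
            for v in values]


rng = random.Random(0)  # fixed seed (an assumption) for repeatability
activations = [1.0] * 10000

dropped = dropout(activations, 0.5, rng)
mean_after = sum(dropped) / len(dropped)
print(mean_after)  # stays close to 1.0, the mean before dropout

unchanged = dropout(activations, 1.0, rng)
print(unchanged == activations)  # True: keep_prob=1.0 disables dropout
```

Because the expected value of each activation is preserved, the same weights can be used at test time simply by feeding keep_prob=1.0.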

Finally, we add a softmax output layer.

w_fc2 = weight_variable([1024, 10])
b_fc2 = bias_variable([10])

y_conv = tf.nn.softmax(tf.matmul(h_fc1_drop, w_fc2) + b_fc2)
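
For reference, the softmax function itself is short enough to write out by hand. The sketch below is a plain-Python version of the mathematical definition for a single vector of logits. Subtracting the maximum before exponentiating is a common numerical-stability trick added here as an assumption; it does not change the result.

```python
import math


def softmax(logits):
    """Turn a list of real-valued scores into probabilities that sum to 1."""
    shift = max(logits)  # subtracting the max avoids overflow in exp
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax([2.0, 1.0, 0.1])
print(probs)       # the largest logit gets the largest probability
print(sum(probs))  # sums to 1 (up to floating-point rounding)
```

In the network above, each row of y_conv is such a probability vector over the 10 digit classes.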

Next we train and evaluate the model:

cross_entropy = -tf.reduce_sum(y_*tf.log(y_conv))
train_step = tf.train.AdamOptimizer(0.0001).minimize(cross_entropy)
correct_prediction = tf.equal(tf.argmax(y_conv, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
sess.run(tf.global_variables_initializer())
for i in range(20000):
    batch = mnist.train.next_batch(50)
    if i % 100 == 0:
        # report training accuracy every 100 steps, with dropout disabled
        train_accuracy = accuracy.eval(feed_dict={x:batch[0], y_:batch[1], keep_prob:1.0})
        print("step %d, train_accuracy %g" % (i, train_accuracy))
    # run one training step on every iteration, with dropout enabled
    train_step.run(feed_dict={x:batch[0], y_:batch[1], keep_prob:0.5})
print("test_accuracy %g" % accuracy.eval(feed_dict={x:mnist.test.images, y_:mnist.test.labels, keep_prob:1.0}))
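
To see what the cross_entropy and accuracy expressions compute, here is a small pure-Python sketch on a hand-made batch of two examples. The helper names and the toy numbers are invented for illustration. Skipping terms whose label is 0 is a choice made here to avoid evaluating log(0); it gives the same value as the formula above whenever all predicted probabilities are positive.

```python
import math


def cross_entropy(labels, probs):
    """Sum over the batch of -y * log(p), for one-hot labels y."""
    return -sum(y * math.log(p)
                for label_row, prob_row in zip(labels, probs)
                for y, p in zip(label_row, prob_row)
                if y > 0)  # zero-label terms contribute nothing


def argmax(row):
    """Index of the largest entry in a list."""
    return max(range(len(row)), key=row.__getitem__)


def accuracy(labels, probs):
    """Fraction of examples whose predicted class matches the label."""
    hits = [argmax(y) == argmax(p) for y, p in zip(labels, probs)]
    return sum(hits) / len(hits)


labels = [[1, 0, 0], [0, 1, 0]]             # one-hot targets
probs = [[0.7, 0.2, 0.1], [0.6, 0.3, 0.1]]  # toy softmax outputs

loss = cross_entropy(labels, probs)
acc = accuracy(labels, probs)
print(loss)  # -(log 0.7 + log 0.3), about 1.5606
print(acc)   # 0.5: the first example is right, the second is wrong
```

Note that the loss is a sum over the batch rather than a mean, so its magnitude grows with the batch size.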

Full code:

from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets('MNIST_data', one_hot=True)
import tensorflow as tf
sess = tf.InteractiveSession()
x = tf.placeholder("float", shape=[None, 784])
y_ = tf.placeholder("float", shape=[None, 10])
def weight_variable(shape):
    initial = tf.truncated_normal(shape, stddev=0.1)  # draw random values from a truncated normal distribution
    return tf.Variable(initial)
def bias_variable(shape):
    initial = tf.constant(0.1, shape=shape )
    return tf.Variable(initial)
def conv2d(x, w):
    return tf.nn.conv2d(x, w, strides=[1, 1, 1, 1], padding="SAME")
def max_pool_2x2(x):
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2,1], padding="SAME")
w_conv1 = weight_variable([5, 5, 1, 32])
b_conv1 = bias_variable([32])
x_image = tf.reshape(x, [-1, 28, 28, 1])
h_conv1 = tf.nn.relu(conv2d(x_image, w_conv1) + b_conv1)
h_pool1 = max_pool_2x2(h_conv1)

w_conv2 = weight_variable([5, 5, 32, 64])
b_conv2 = bias_variable([64])
h_conv2 = tf.nn.relu(conv2d(h_pool1, w_conv2) + b_conv2)
h_pool2 = max_pool_2x2(h_conv2)

w_fc1 = weight_variable([7*7*64, 1024])
b_fc1 = bias_variable([1024])

h_pool2_flat = tf.reshape(h_pool2, [-1, 7*7*64])
h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, w_fc1) + b_fc1)

keep_prob = tf.placeholder("float")
h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)

w_fc2 = weight_variable([1024, 10])
b_fc2 = bias_variable([10])

y_conv = tf.nn.softmax(tf.matmul(h_fc1_drop, w_fc2) + b_fc2)

cross_entropy = -tf.reduce_sum(y_*tf.log(y_conv))
train_step = tf.train.AdamOptimizer(0.0001).minimize(cross_entropy)
correct_prediction = tf.equal(tf.argmax(y_conv, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
sess.run(tf.global_variables_initializer())
for i in range(20000):
    batch = mnist.train.next_batch(50)
    if i % 100 == 0:
        train_accuracy = accuracy.eval(feed_dict={x:batch[0], y_:batch[1], keep_prob:1.0})
        print("step %d, train_accuracy %g" % (i, train_accuracy))
    train_step.run(feed_dict={x:batch[0], y_:batch[1], keep_prob:0.5})  # one training step per iteration, dropout enabled
print("test_accuracy %g" % accuracy.eval(feed_dict={x:mnist.test.images, y_:mnist.test.labels, keep_prob:1.0}))

The output is as follows: [output screenshot not preserved]
