Training MNIST with TensorFlow (2): Multilayer Neural Network

阿新 • Published: 2018-12-13

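In my previous post I trained a single-layer neural network on MNIST and reached only about 90% accuracy on the test set. This time I train and test a different kind of network: a multilayer neural network.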

1. Getting the MNIST data

The MNIST dataset can be loaded with a single line of code, which is very convenient. For background on MNIST, see my previous post.

from tensorflow.examples.tutorials.mnist import input_data

mnist = input_data.read_data_sets('./data/mnist', one_hot=True)  # labels come back as one-hot vectors
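The one_hot=True argument asks for each label as a 10-element vector rather than a single digit. As a rough illustration of that encoding, here is a plain-Python sketch; the helper names to_one_hot and from_one_hot are made up for this example and are not part of TensorFlow.

```python
def to_one_hot(digit, num_classes=10):
    """Return a list of length num_classes with 1.0 at index `digit`."""
    if not 0 <= digit < num_classes:
        raise ValueError("digit out of range")
    vec = [0.0] * num_classes
    vec[digit] = 1.0
    return vec


def from_one_hot(vec):
    """Recover the class index: the position of the largest entry."""
    return max(range(len(vec)), key=lambda i: vec[i])


label = to_one_hot(3)
print(label)                # 1.0 at index 3, 0.0 elsewhere
print(from_one_hot(label))  # index of the largest entry
```

Recovering the class with an arg-max is the same idea the accuracy check later in the post relies on.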

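2. Basic model structure

The model is a three-layer neural network. The input layer has 784 nodes, matching the length of one row of MNIST data; the output layer has 10 nodes, one per digit class; the hidden layer has 50 nodes. Each training step uses a mini-batch of 64 samples, and training runs for at most 50000 iterations.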

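inputSize  = 784    # values per flattened 28x28 image
outputSize = 10     # digit classes 0-9
hiddenSize = 50     # nodes in the hidden layer
batchSize  = 64     # samples per mini-batch
trainCycle = 50000  # number of training iterations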
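To get a feel for what these numbers imply, the following back-of-the-envelope sketch counts the trainable parameters and the number of passes over the training data. The training-set size of 55000 is an assumption (the default split used by read_data_sets), and dense_params is a helper invented for this example.

```python
input_size, hidden_size, output_size = 784, 50, 10
batch_size, train_cycle = 64, 50000


def dense_params(n_in, n_out):
    """Weights (n_in * n_out) plus one bias per output node."""
    return n_in * n_out + n_out


hidden_params = dense_params(input_size, hidden_size)
output_params = dense_params(hidden_size, output_size)
total_params = hidden_params + output_params

samples_seen = batch_size * train_cycle
train_set_size = 55000  # ASSUMPTION: default train split of read_data_sets
epochs = samples_seen / train_set_size

print(hidden_params, output_params, total_params)
print(round(epochs, 1))
```

Under that assumption, 50000 iterations of 64 samples amount to several dozen passes over the training set, and the whole model has just under forty thousand parameters.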

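3. Input layer

The input layer receives each mini-batch of samples. It is declared as a placeholder, and the actual data is fed in only at training time. Note the 'None' in the shape passed when creating the input tensor: it stands for the number of samples in a batch, which is left unspecified and inferred from the data that is actually fed. This is convenient, with one caveat: the batch size is not available as a concrete number while the graph is being built. The ordinary arithmetic operators (+, -, *, /) still work on such tensors, because TensorFlow overloads them as element-wise operations equivalent to functions such as tf.add and tf.multiply. Matrix multiplication, however, requires tf.matmul, since * is element-wise.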

inputLayer = tf.placeholder(tf.float32, shape=[None, inputSize])

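4. Hidden layer

In a neural network, the hidden layer mainly serves to extract features from the data. The weights here are generated with tensorflow.truncated_normal(), unlike last time, which used tensorflow.random_normal(). Both generate normally distributed random values with a given shape, mean, and standard deviation. The difference is that truncated_normal restricts the range: any value more than 2 standard deviations from the mean is discarded and redrawn. The bias is also defined as a variable here; a constant could be used instead.

The activation function is the sigmoid function.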

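# Weights: truncated normal with mean 0 and stddev 0.1
hiddenWeight = tf.Variable(tf.truncated_normal([inputSize, hiddenSize], mean=0, stddev=0.1))
# Bias: truncated normal with the default mean 0 and stddev 1.0
hiddenBias   = tf.Variable(tf.truncated_normal([hiddenSize]))
# Affine transform followed by the sigmoid activation
hiddenLayer  = tf.add(tf.matmul(inputLayer, hiddenWeight), hiddenBias)
hiddenLayer  = tf.nn.sigmoid(hiddenLayer)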
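The "discard and redraw" behaviour of a truncated normal can be sketched in plain Python. This is only an illustration of the idea using the standard random module, not TensorFlow's implementation; the function below merely borrows the name for readability.

```python
import random


def truncated_normal(n, mean=0.0, stddev=1.0, rng=random):
    """Draw n values from N(mean, stddev^2), redrawing any value that
    falls more than 2 standard deviations from the mean."""
    values = []
    while len(values) < n:
        x = rng.gauss(mean, stddev)
        if abs(x - mean) <= 2.0 * stddev:
            values.append(x)
    return values


random.seed(0)
weights = truncated_normal(1000, mean=0.0, stddev=0.1)
plain = [random.gauss(0.0, 0.1) for _ in range(1000)]

print(max(abs(w) for w in weights))  # never above 0.2 by construction
print(max(abs(p) for p in plain))    # an untruncated draw has no such bound
```

Keeping the initial weights within a bounded range avoids the occasional very large starting value, which is presumably why it is preferred here for the weight matrices.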

5. Output layer

The output layer is similar to the hidden layer; only the number of nodes differs.

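# Weights and bias: truncated normal with mean 0 and stddev 0.1
outputWeight = tf.Variable(tf.truncated_normal([hiddenSize, outputSize], mean=0, stddev=0.1))
outputBias   = tf.Variable(tf.truncated_normal([outputSize], mean=0, stddev=0.1))
outputLayer  = tf.add(tf.matmul(hiddenLayer, outputWeight), outputBias)
# Sigmoid squashes each output into (0, 1); note that this value is later
# passed as `logits` to a softmax cross-entropy loss, which expects unscaled scores
outputLayer  = tf.nn.sigmoid(outputLayer)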
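The two layers together compute sigmoid(sigmoid(x·W1 + b1)·W2 + b2) for every row of the batch. The sketch below spells that out in plain Python with toy sizes standing in for 784, 50 and 10. All helper names are invented for the example, and the zero biases are a simplification.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dense_sigmoid(batch, weight, bias):
    """sigmoid(batch @ weight + bias) for a batch given as a list of rows."""
    n_out = len(bias)
    out = []
    for row in batch:
        out.append([
            sigmoid(sum(x * weight[i][j] for i, x in enumerate(row)) + bias[j])
            for j in range(n_out)
        ])
    return out


def random_matrix(rows, cols, rng):
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
n_in, n_hidden, n_out = 4, 3, 2  # toy sizes standing in for 784, 50, 10
w1, b1 = random_matrix(n_in, n_hidden, rng), [0.0] * n_hidden
w2, b2 = random_matrix(n_hidden, n_out, rng), [0.0] * n_out

for batch_size in (1, 5):  # the batch dimension is free, like the None in the placeholder
    batch = [[rng.random() for _ in range(n_in)] for _ in range(batch_size)]
    hidden = dense_sigmoid(batch, w1, b1)
    output = dense_sigmoid(hidden, w2, b2)
    print(batch_size, len(output), len(output[0]))
```

The same weights serve any batch size, because each row is processed independently. That is what leaving the first placeholder dimension unspecified expresses.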

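6. Output labels

As with the input layer, the labels are declared as a placeholder first, and the actual data is fed in during training. A label is the correct class of a sample.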

outputLabel = tf.placeholder(tf.float32, shape=[None, outputSize])

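7. Loss function

The loss is the cross-entropy loss. Note that the v2 version is used; the first version has been marked as deprecated by TensorFlow and is slated for removal.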

loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits_v2(labels=outputLabel, logits=outputLayer))
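softmax_cross_entropy_with_logits_v2 applies a softmax to its logits argument internally, so it expects unscaled scores. In the listing above, the output layer has already passed through a sigmoid, so every "logit" lies in (0, 1). The plain-Python sketch below (not TensorFlow's implementation) suggests one consequence: with 10 classes, the per-sample loss cannot drop much below about 1.46 in that setup. Because sigmoid is monotonic, the arg-max and therefore the accuracy are not affected by this.

```python
import math


def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(labels, logits):
    """-sum(labels * log(softmax(logits))) for one sample."""
    probs = softmax(logits)
    return -sum(y * math.log(p) for y, p in zip(labels, probs) if y > 0)


label = [0.0] * 10
label[3] = 1.0

# Unscaled logits can let the correct class dominate the softmax
raw = [0.0] * 10
raw[3] = 10.0
loss_raw = cross_entropy(label, raw)

# Best case after a sigmoid: correct class at 1.0, every other class at 0.0
squashed = [0.0] * 10
squashed[3] = 1.0
loss_squashed = cross_entropy(label, squashed)

print(loss_raw, loss_squashed)
```

It may be worth trying the model with the final sigmoid removed from the output layer. That variant was not part of the original experiment, so the accuracy figures in this post do not apply to it.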

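8. Optimizer and objective

The optimizer is Adam. I also tried the plain GradientDescentOptimizer, which did not perform as well as Adam, and Adadelta, which barely converged at all.

The objective is simply to minimize the loss function.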

optimizer = tf.train.AdamOptimizer()
target    = optimizer.minimize(loss)
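For readers unfamiliar with Adam, here is a minimal sketch of the textbook update rule applied to a one-variable toy problem. The default values of lr, beta1, beta2 and eps mirror the documented defaults of tf.train.AdamOptimizer, but the code is only an illustration: adam_minimize is a made-up name, and TensorFlow's internal formulation differs in small details.

```python
import math


def adam_minimize(grad_fn, x0, steps, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """Minimize a one-variable function with the Adam update rule."""
    x, m, v = x0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad_fn(x)
        m = beta1 * m + (1 - beta1) * g      # running mean of gradients
        v = beta2 * v + (1 - beta2) * g * g  # running mean of squared gradients
        m_hat = m / (1 - beta1 ** t)         # bias correction for the zero start
        v_hat = v / (1 - beta2 ** t)
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x


# Toy objective f(x) = (x - 3)^2 with gradient 2 * (x - 3)
x_min = adam_minimize(lambda x: 2.0 * (x - 3.0), x0=0.0, steps=500, lr=0.1)
print(x_min)  # should end up close to 3
```

Dividing by the running root-mean-square of the gradient gives each parameter its own effective step size. That may be part of why Adam behaved better here than plain gradient descent with a fixed learning rate.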

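9. Training

First create a session, then initialize the variables, and finally run the training iterations. The model converges quickly, reaching roughly 90% accuracy by about iteration 1000.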

# Build the accuracy ops once, outside the loop, so the graph does not keep growing
corrected = tf.equal(tf.argmax(outputLabel, 1), tf.argmax(outputLayer, 1))
accuracy = tf.reduce_mean(tf.cast(corrected, tf.float32))

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())

    for i in range(trainCycle):
        batch = mnist.train.next_batch(batchSize)  # (images, labels)
        sess.run(target, feed_dict={inputLayer: batch[0], outputLabel: batch[1]})

        # Every 1000 iterations, report accuracy on the current mini-batch
        if i % 1000 == 0:
            accuracyValue = sess.run(accuracy, feed_dict={inputLayer: batch[0], outputLabel: batch[1]})
            print(i, 'train set accuracy:', accuracyValue)
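The accuracy computation compares the arg-max of each label row with the arg-max of the corresponding output row and averages the matches. A plain-Python sketch of the same logic follows; the toy values are invented for illustration.

```python
def argmax(values):
    """Index of the largest entry (first one wins on ties)."""
    return max(range(len(values)), key=lambda i: values[i])


def accuracy(labels, outputs):
    """Fraction of rows where the predicted class matches the labelled class."""
    hits = [1.0 if argmax(y) == argmax(o) else 0.0 for y, o in zip(labels, outputs)]
    return sum(hits) / len(hits)


# Toy batch: 3 classes, 4 samples
labels = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 1, 0]]
outputs = [[0.7, 0.2, 0.1], [0.3, 0.6, 0.1], [0.5, 0.4, 0.1], [0.2, 0.5, 0.3]]
print(accuracy(labels, outputs))  # 3 of 4 predictions match -> 0.75
```

Since the training-time figure is measured on a single mini-batch of 64 samples, it moves in steps of 1/64 and is a noisy estimate. The test-set evaluation in the next section is the more reliable number.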

[Figure: model training output]

10. Testing the trained model

Now evaluate on the test set. Accuracy reaches 96%, much better than the single-layer network.

    # Runs inside the same `with tf.Session() as sess:` block, after the training loop
    corrected = tf.equal(tf.argmax(outputLabel, 1), tf.argmax(outputLayer, 1))
    accuracy  = tf.reduce_mean(tf.cast(corrected, tf.float32))
    # Feed the whole test set in one go
    accuracyValue = sess.run(accuracy, feed_dict={inputLayer: mnist.test.images, outputLabel: mnist.test.labels})
    print("accuracy on test set:", accuracyValue)

[Figure: output on the test set]

Appendix:

The complete code:

import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

mnist = input_data.read_data_sets('./data/mnist', one_hot=True)

inputSize  = 784
outputSize = 10
hiddenSize = 50
batchSize  = 64
trainCycle = 50000

# Input layer
inputLayer = tf.placeholder(tf.float32, shape=[None, inputSize])

# Hidden layer
hiddenWeight = tf.Variable(tf.truncated_normal([inputSize, hiddenSize], mean=0, stddev=0.1))
hiddenBias   = tf.Variable(tf.truncated_normal([hiddenSize]))
hiddenLayer  = tf.add(tf.matmul(inputLayer, hiddenWeight), hiddenBias)
hiddenLayer  = tf.nn.sigmoid(hiddenLayer)

# Output layer
outputWeight = tf.Variable(tf.truncated_normal([hiddenSize, outputSize], mean=0, stddev=0.1))
outputBias   = tf.Variable(tf.truncated_normal([outputSize], mean=0, stddev=0.1))
outputLayer  = tf.add(tf.matmul(hiddenLayer, outputWeight), outputBias)
outputLayer  = tf.nn.sigmoid(outputLayer)

# Labels
outputLabel = tf.placeholder(tf.float32, shape=[None, outputSize])

# Loss function
loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits_v2(labels=outputLabel, logits=outputLayer))

# Optimizer
optimizer = tf.train.AdamOptimizer()

# Training objective
target = optimizer.minimize(loss)

# Accuracy ops, built once so the graph does not grow inside the loop
corrected = tf.equal(tf.argmax(outputLabel, 1), tf.argmax(outputLayer, 1))
accuracy  = tf.reduce_mean(tf.cast(corrected, tf.float32))

# Training
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())

    for i in range(trainCycle):
        batch = mnist.train.next_batch(batchSize)
        sess.run(target, feed_dict={inputLayer: batch[0], outputLabel: batch[1]})

        if i % 1000 == 0:
            accuracyValue = sess.run(accuracy, feed_dict={inputLayer: batch[0], outputLabel: batch[1]})
            print(i, 'train set accuracy:', accuracyValue)

    # Testing
    accuracyValue = sess.run(accuracy, feed_dict={inputLayer: mnist.test.images, outputLabel: mnist.test.labels})
    print("accuracy on test set:", accuracyValue)
    # No explicit sess.close() is needed: the `with` block closes the session on exit
