MNIST Handwritten Digit Recognition with TensorFlow

阿新 • Published: 2020-07-20

# MNIST Handwritten Digit Recognition with TensorFlow

## Goals

* Understand the MNIST dataset
* Build and test a model

***

## Development environment

* Operating system: Win10
* Python version: 3.6
* IDE: PyCharm
* TensorFlow version: 1.*

***

## Program flowchart

![Program flowchart](https://images.cnblogs.com/cnblogs_com/lyhLive/1808476/o_200720065600流程圖.png)

***

## Understanding the MNIST dataset

MNIST dataset: [MNIST dataset download page](http://yann.lecun.com/exdb/mnist/)

The MNIST dataset comes from the National Institute of Standards and Technology (NIST). The training set consists of digits handwritten by 250 different people, 50% of them high-school students and 50% employees of the Census Bureau. The test set contains handwritten digits in the same proportion. The images are stored as raw bytes, so we need to read them into NumPy arrays before we can train and test the algorithm.

**Reading the MNIST dataset**

```
# input_data comes from tensorflow.examples.tutorials.mnist (TensorFlow 1.x)
mnist = input_data.read_data_sets("mnist_data", one_hot=True)
```

***

## Model structure

### Input layer

```
with tf.variable_scope("data"):
    # 784 = 28*28*1: a flattened 28x28 single-channel image
    x = tf.placeholder(tf.float32, shape=[None, 784], name='x_pred')
    # One-hot labels for the 10 classes
    y_true = tf.placeholder(tf.int32, shape=[None, 10])
```

### First convolutional layer

Now we can implement the first layer. It consists of a convolution followed by max pooling. The convolution computes 32 features for each 5x5 patch. Its weight tensor has shape [5, 5, 1, 32]: the first two dimensions are the patch size, the third is the number of input channels, and the last is the number of output channels. Each output channel also has its own bias. To apply this layer we reshape x into a 4-D tensor whose second and third dimensions are the image height and width and whose last dimension is the number of color channels (1 here because the images are grayscale; it would be 3 for RGB images). We convolve x_reshape with the weight tensor, add the bias, apply the ReLU activation function, and finally apply max pooling.

```
with tf.variable_scope("conv1"):
    # 5x5 kernels, 1 input channel, 32 kernels -> 32 feature maps
    w_conv1 = tf.Variable(tf.random_normal([5, 5, 1, 32]))
    b_conv1 = tf.Variable(tf.constant(0.0, shape=[32]))
    # n single-channel images of size 28x28
    x_reshape = tf.reshape(x, [-1, 28, 28, 1])
    # strides is the filter step; padding="SAME" zero-pads the borders
    conv1 = tf.nn.relu(tf.nn.conv2d(x_reshape, w_conv1, strides=[1, 1, 1, 1], padding="SAME") + b_conv1)
    # ksize is the pooling window, strides is its step;
    # padding="SAME" zero-pads the border when the window does not fit
    pool1 = tf.nn.max_pool(conv1, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding="SAME")
```

### Second convolutional layer

To build a deeper network we stack several layers of this kind. In the second layer, each 5x5 patch produces 64 features.

```
with tf.variable_scope("conv2"):
    # 5x5 kernels, 32 input channels, 64 output channels
    w_conv2 = tf.Variable(tf.random_normal([5, 5, 32, 64]))
    b_conv2 = tf.Variable(tf.constant(0.0, shape=[64]))
    conv2 = tf.nn.relu(tf.nn.conv2d(pool1, w_conv2, strides=[1, 1, 1, 1], padding="SAME") + b_conv2)
    pool2 = tf.nn.max_pool(conv2, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding="SAME")
```

### Densely connected layer

The image size has now been reduced to 7x7. We add a fully connected layer with 1024 neurons that processes the whole image. We reshape the tensor coming out of the pooling layer into a batch of vectors, multiply by a weight matrix, add a bias, and apply ReLU. To reduce overfitting we apply dropout before the output layer. A placeholder holds the probability that a neuron's output is kept during dropout, which lets us turn dropout on during training and off during testing. Besides masking neuron outputs, TensorFlow's tf.nn.dropout op also rescales the outputs that are kept, so no additional scaling is needed.

```
with tf.variable_scope("fc1"):
    # Each 2x2 pooling with stride 2 halves the size: 28 -> 14 -> 7
    w_fc1 = tf.Variable(tf.random_normal([7 * 7 * 64, 1024]))
    b_fc1 = tf.Variable(tf.constant(0.0, shape=[1024]))
    h_pool2_flat = tf.reshape(pool2, [-1, 7 * 7 * 64])
    h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, w_fc1) + b_fc1)
    # Apply dropout before the output layer to reduce overfitting
    keep_prob = tf.placeholder(tf.float32, name="keep_prob")
    h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)
```

### Output layer

Finally we add the output layer, as in a single-layer softmax regression model. The code below produces raw logits; the softmax itself is applied inside the loss function.

```
with tf.variable_scope("fc2"):
    # 1024 hidden units -> 10 classes
    w_fc2 = tf.Variable(tf.random_normal([1024, 10]))
    b_fc2 = tf.Variable(tf.constant(0.0, shape=[10]))
    y_predict = tf.matmul(h_fc1_drop, w_fc2) + b_fc2
    # Stored so the prediction tensor can be fetched after reloading the model
    tf.add_to_collection('pred_network', y_predict)
```

***

## Training and evaluating the model

Training and evaluation use almost the same code as a simple single-layer softmax network. The differences are that we replace plain steepest gradient descent with the more sophisticated ADAM optimizer, and we add the extra parameter keep_prob to feed_dict to control the dropout rate. A log line is printed every 100 iterations.

```
with tf.variable_scope("loss"):
    loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=y_true, logits=y_predict))

with tf.variable_scope("optimizer"):
    # Backpropagation: the optimizer minimizes the loss
    train_op = tf.train.AdamOptimizer(0.001).minimize(loss)

with tf.variable_scope("acc"):
    # A prediction is correct when the index of the largest logit matches the label.
    # tf.argmax(input, axis) returns the index of the largest value along axis:
    # axis=0 searches down each column, axis=1 across each row.
    equal_list = tf.equal(tf.argmax(y_true, 1), tf.argmax(y_predict, 1))
    # Cast the booleans to floats and average them to get the accuracy
    accuracy = tf.reduce_mean(tf.cast(equal_list, tf.float32))

# TensorBoard summaries
# tf.summary.histogram records histograms, tf.summary.scalar records scalars
tf.summary.histogram("weight", w_fc2)
tf.summary.histogram("bias", b_fc2)
tf.summary.scalar("loss", loss)
tf.summary.scalar("acc", accuracy)
merged = tf.summary.merge_all()
saver = tf.train.Saver()

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    filewriter = tf.summary.FileWriter("tfboard", graph=sess.graph)
    if is_train:
        # Training
        for i in range(20001):
            x_train, y_train = mnist.train.next_batch(50)
            if i % 100 == 0:
                # Evaluate accuracy on the current batch, without dropout
                print("step %d, training accuracy %f" % (
                    i + 1, sess.run(accuracy, feed_dict={x: x_train, y_true: y_train, keep_prob: 1.0})))
            # Train with 50% dropout
            sess.run(train_op, feed_dict={x: x_train, y_true: y_train, keep_prob: 0.5})
            summary = sess.run(merged, feed_dict={x: x_train, y_true: y_train, keep_prob: 1.0})
            filewriter.add_summary(summary, i)
        # Save the trained model once training finishes
        saver.save(sess, savemodel)
    else:
        # Predict on the test set
        count = 0.0
        epochs = 300  # number of test images to check
        saver.restore(sess, savemodel)
        for i in range(epochs):
            # Draw from mnist.test so the model is scored on unseen images
            x_test, y_test = mnist.test.next_batch(1)
            # Run the network once and take the argmax with NumPy instead of
            # adding new tf.argmax ops to the graph on every iteration
            logits = sess.run(y_predict, feed_dict={x: x_test, keep_prob: 1.0})
            true_label = int(np.argmax(y_test, 1)[0])
            pred_label = int(np.argmax(logits, 1)[0])
            print("image %d: true label %d, predicted %d" % (i + 1, true_label, pred_label))
            if true_label == pred_label:
                count = count + 1
        print("accuracy: %.2f%%" % (count * 100 / epochs))
```

**Evaluation results**

![Evaluation results](https://images.cnblogs.com/cnblogs_com/lyhLive/1808476/o_200720065716評估結果.jpg)

***

## Predicting a handwritten image with the model

First we use the opencv package to convert the image to a single channel (grayscale), resize it to 28*28, and binarize it. The result of this processing is a flattened array of pixel values in the range 0 to 1, with an extra batch dimension so that it is a two-dimensional array of shape (1, 784).

**Handwritten digit image**

![8](https://images.cnblogs.com/cnblogs_com/lyhLive/1808476/o_2007200656538.jpg)

**Processing the handwritten digit image**

```
def dealFigureImg(imgPath):
    img = cv2.imread(imgPath)  # path of the handwritten digit image
    img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)  # convert to a single channel (grayscale)
    resize_img = cv2.resize(img, (28, 28))  # resize to 28*28
    ret, thresh_img = cv2.threshold(resize_img, 127, 255, cv2.THRESH_BINARY)  # binarize
    cv2.imwrite("image/temp.jpg", thresh_img)
    im = Image.open('image/temp.jpg')
    data = list(im.getdata())  # flattened pixel values
    # Invert (MNIST digits are light on a dark background) and scale 0-255 to 0-1
    # to match the values the model was trained on
    result = [(255 - x) * 1.0 / 255.0 for x in data]
    result = np.expand_dims(result, 0)  # add a batch dimension to match the model input shape
    os.remove('image/temp.jpg')
    return result
```

***

**Loading the model and predicting**

```
def predictFigureImg(imgPath):
    result = dealFigureImg(imgPath)
    with tf.Session() as sess:
        # Rebuild the graph from the saved meta file, then restore the weights
        new_saver = tf.train.import_meta_graph("model/mnist_model.meta")
        new_saver.restore(sess, "model/mnist_model")
        graph = tf.get_default_graph()
        # Look up the placeholders by the names given when the model was built
        x = graph.get_operation_by_name('data/x_pred').outputs[0]
        keep_prob = graph.get_operation_by_name('fc1/keep_prob').outputs[0]
        y = tf.get_collection("pred_network")[0]
        predict = np.argmax(sess.run(y, feed_dict={x: result, keep_prob: 1.0}))
        print("result:", predict)
```

**Prediction result**

![Prediction result](https://images.cnblogs.com/cnblogs_com/lyhLive/1808476/o_200720065633識別結果.jpg)

***

## Complete code

```
import tensorflow as tf
import cv2
import os
import numpy as np
from PIL import Image
from tensorflow.examples.tutorials.mnist import input_data


# Build the model
def getMnistModel(savemodel, is_train):
    """
    :param savemodel: path where the model is saved
    :param is_train: True to train, False to test the model
    :return: None
    """
    mnist = input_data.read_data_sets("mnist_data", one_hot=True)
    with tf.variable_scope("data"):
        # 784 = 28*28*1: a flattened 28x28 single-channel image
        x = tf.placeholder(tf.float32, shape=[None, 784], name='x_pred')
        # One-hot labels for the 10 classes
        y_true = tf.placeholder(tf.int32, shape=[None, 10])

    with tf.variable_scope("conv1"):
        # 5x5 kernels, 1 input channel, 32 kernels -> 32 feature maps
        w_conv1 = tf.Variable(tf.random_normal([5, 5, 1, 32]))
        b_conv1 = tf.Variable(tf.constant(0.0, shape=[32]))
        # n single-channel images of size 28x28
        x_reshape = tf.reshape(x, [-1, 28, 28, 1])
        # strides is the filter step; padding="SAME" zero-pads the borders
        conv1 = tf.nn.relu(tf.nn.conv2d(x_reshape, w_conv1, strides=[1, 1, 1, 1], padding="SAME") + b_conv1)
        # ksize is the pooling window, strides is its step;
        # padding="SAME" zero-pads the border when the window does not fit
        pool1 = tf.nn.max_pool(conv1, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding="SAME")

    with tf.variable_scope("conv2"):
        # 5x5 kernels, 32 input channels, 64 output channels
        w_conv2 = tf.Variable(tf.random_normal([5, 5, 32, 64]))
        b_conv2 = tf.Variable(tf.constant(0.0, shape=[64]))
        conv2 = tf.nn.relu(tf.nn.conv2d(pool1, w_conv2, strides=[1, 1, 1, 1], padding="SAME") + b_conv2)
        pool2 = tf.nn.max_pool(conv2, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding="SAME")

    with tf.variable_scope("fc1"):
        # Each 2x2 pooling with stride 2 halves the size: 28 -> 14 -> 7
        w_fc1 = tf.Variable(tf.random_normal([7 * 7 * 64, 1024]))
        b_fc1 = tf.Variable(tf.constant(0.0, shape=[1024]))
        h_pool2_flat = tf.reshape(pool2, [-1, 7 * 7 * 64])
        h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, w_fc1) + b_fc1)
        # Apply dropout before the output layer to reduce overfitting
        keep_prob = tf.placeholder(tf.float32, name="keep_prob")
        h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)

    with tf.variable_scope("fc2"):
        # 1024 hidden units -> 10 classes
        w_fc2 = tf.Variable(tf.random_normal([1024, 10]))
        b_fc2 = tf.Variable(tf.constant(0.0, shape=[10]))
        y_predict = tf.matmul(h_fc1_drop, w_fc2) + b_fc2
        # Stored so the prediction tensor can be fetched after reloading the model
        tf.add_to_collection('pred_network', y_predict)

    with tf.variable_scope("loss"):
        loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=y_true, logits=y_predict))

    with tf.variable_scope("optimizer"):
        # Backpropagation: the optimizer minimizes the loss
        train_op = tf.train.AdamOptimizer(0.001).minimize(loss)

    with tf.variable_scope("acc"):
        # A prediction is correct when the index of the largest logit matches the label.
        # tf.argmax(input, axis) returns the index of the largest value along axis:
        # axis=0 searches down each column, axis=1 across each row.
        equal_list = tf.equal(tf.argmax(y_true, 1), tf.argmax(y_predict, 1))
        # Cast the booleans to floats and average them to get the accuracy
        accuracy = tf.reduce_mean(tf.cast(equal_list, tf.float32))

    # TensorBoard summaries
    # tf.summary.histogram records histograms, tf.summary.scalar records scalars
    tf.summary.histogram("weight", w_fc2)
    tf.summary.histogram("bias", b_fc2)
    tf.summary.scalar("loss", loss)
    tf.summary.scalar("acc", accuracy)
    merged = tf.summary.merge_all()
    saver = tf.train.Saver()

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        filewriter = tf.summary.FileWriter("tfboard", graph=sess.graph)
        if is_train:
            # Training
            for i in range(20001):
                x_train, y_train = mnist.train.next_batch(50)
                if i % 100 == 0:
                    # Evaluate accuracy on the current batch, without dropout
                    print("step %d, training accuracy %f" % (
                        i + 1, sess.run(accuracy, feed_dict={x: x_train, y_true: y_train, keep_prob: 1.0})))
                # Train with 50% dropout
                sess.run(train_op, feed_dict={x: x_train, y_true: y_train, keep_prob: 0.5})
                summary = sess.run(merged, feed_dict={x: x_train, y_true: y_train, keep_prob: 1.0})
                filewriter.add_summary(summary, i)
            # Save the trained model once training finishes
            saver.save(sess, savemodel)
        else:
            # Predict on the test set
            count = 0.0
            epochs = 300  # number of test images to check
            saver.restore(sess, savemodel)
            for i in range(epochs):
                # Draw from mnist.test so the model is scored on unseen images
                x_test, y_test = mnist.test.next_batch(1)
                # Run the network once and take the argmax with NumPy instead of
                # adding new tf.argmax ops to the graph on every iteration
                logits = sess.run(y_predict, feed_dict={x: x_test, keep_prob: 1.0})
                true_label = int(np.argmax(y_test, 1)[0])
                pred_label = int(np.argmax(logits, 1)[0])
                print("image %d: true label %d, predicted %d" % (i + 1, true_label, pred_label))
                if true_label == pred_label:
                    count = count + 1
            print("accuracy: %.2f%%" % (count * 100 / epochs))


# Preprocess a handwritten digit image for prediction
def dealFigureImg(imgPath):
    img = cv2.imread(imgPath)  # path of the handwritten digit image
    img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)  # convert to a single channel (grayscale)
    resize_img = cv2.resize(img, (28, 28))  # resize to 28*28
    ret, thresh_img = cv2.threshold(resize_img, 127, 255, cv2.THRESH_BINARY)  # binarize
    cv2.imwrite("image/temp.jpg", thresh_img)
    im = Image.open('image/temp.jpg')
    data = list(im.getdata())  # flattened pixel values
    # Invert (MNIST digits are light on a dark background) and scale 0-255 to 0-1
    # to match the values the model was trained on
    result = [(255 - x) * 1.0 / 255.0 for x in data]
    result = np.expand_dims(result, 0)  # add a batch dimension to match the model input shape
    os.remove('image/temp.jpg')
    return result


def predictFigureImg(imgPath):
    result = dealFigureImg(imgPath)
    with tf.Session() as sess:
        # Rebuild the graph from the saved meta file, then restore the weights
        new_saver = tf.train.import_meta_graph("model/mnist_model.meta")
        new_saver.restore(sess, "model/mnist_model")
        graph = tf.get_default_graph()
        # Look up the placeholders by the names given when the model was built
        x = graph.get_operation_by_name('data/x_pred').outputs[0]
        keep_prob = graph.get_operation_by_name('fc1/keep_prob').outputs[0]
        y = tf.get_collection("pred_network")[0]
        predict = np.argmax(sess.run(y, feed_dict={x: result, keep_prob: 1.0}))
        print("result:", predict)


if __name__ == '__main__':
    # Train or test the model
    modelPath = "model/mnist_model"
    getMnistModel(modelPath, True)  # True: train, False: test
    # Pass an image to the model for prediction
    # imgPath = "image/8.jpg"
    # predictFigureImg(imgPath)
```

TensorFlow official documentation, advanced MNIST tutorial: https://www.freesion.com/article/8867776254/
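The 7x7 feature-map size used by the fully connected layer follows from the strides of the four layers before it. With padding="SAME", the output size along one spatial dimension is the input size divided by the stride, rounded up. The sketch below traces that arithmetic in plain Python; the helper names `same_out_size` and `trace_shapes` are illustrative and are not part of TensorFlow or of the article's code.

```python
import math


def same_out_size(size, stride):
    """Output size along one spatial dimension for padding="SAME"."""
    return math.ceil(size / stride)


def trace_shapes(size=28):
    """Follow one spatial dimension through conv1, pool1, conv2, pool2."""
    sizes = [size]
    # Strides used in the article: both convolutions use 1, both poolings use 2
    for stride in (1, 2, 1, 2):
        size = same_out_size(size, stride)
        sizes.append(size)
    return sizes


sizes = trace_shapes(28)
# 64 channels come out of the second convolution
flat_features = sizes[-1] * sizes[-1] * 64
print(sizes, flat_features)
```

The last entry of `sizes` times itself times 64 channels gives the first dimension of `w_fc1`.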

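The dropout step can run without any extra scaling because tf.nn.dropout divides each kept activation by keep_prob, so the expected value of every output stays the same. The following is a minimal plain-Python sketch of that idea (often called inverted dropout), not TensorFlow's implementation; the function name `dropout` and its `rng` parameter are assumptions made for illustration.

```python
import random


def dropout(values, keep_prob, rng):
    """Keep each value with probability keep_prob and scale kept values by 1/keep_prob."""
    if not 0.0 < keep_prob <= 1.0:
        raise ValueError("keep_prob must be in (0, 1]")
    return [v / keep_prob if rng.random() < keep_prob else 0.0 for v in values]


rng = random.Random(0)  # fixed seed so the run is repeatable
activations = [1.0] * 10000

# Training: about half of the outputs are zeroed, the rest are doubled
dropped = dropout(activations, 0.5, rng)
mean_after = sum(dropped) / len(dropped)

# Testing: keep_prob = 1.0 leaves the activations unchanged
unchanged = dropout(activations, 1.0, rng)

print(mean_after)
```

The mean of the surviving activations stays close to the original mean of 1.0, which is why the same weights can be used with keep_prob set to 1.0 at test time.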