Getting Started with TensorFlow: A Handwritten Digit Recognition App (1)

阿新 • Published: 2018-11-17

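I have been studying deep learning for a while, working through all kinds of algorithms (CNN, RNN, LSTM, R-CNN) and reading a pile of papers. Whether I actually understood them is another matter (a little guilty here...). When I came back and wanted to see how a trained model performs in practice, I found that I still did not know many of TensorFlow's basic features. So let's take the MNIST dataset and walk through the full handwritten digit recognition pipeline: capturing a digit drawn with the mouse, image preprocessing, model training, and digit prediction. The focus is on the implementation details of the key techniques in each step. This is a beginner's exercise, so corrections are welcome.
Following the order in which the program is implemented, the material is divided into these parts: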

Drawing a digit with the mouse and saving it
Image preprocessing
Model training
Recognizing the input image with the model

The complete code has been uploaded to GitHub:

sunpro/HandWritingRecognition-Tensorflow

1. Drawing a digit with the mouse and saving it

Input is captured by tracking mouse activity through OpenCV's setMouseCallback() function. The key is to supply your own MouseCallback function. Its C++ declaration is:

typedef void(* cv::MouseCallback) (int event, int x, int y, int flags, void *userdata)

The parameters are:

event: one of the cv::MouseEventTypes constants (the integer code of the mouse event).
x: the x-coordinate of the mouse event (the current x position of the mouse).
y: the y-coordinate of the mouse event (the current y position of the mouse).
flags: one of the cv::MouseEventFlags constants (the mouse event flags).
userdata: the optional parameter.
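
In Python the callback receives the same five arguments. The sketch below is only an illustration, not code from the repository: `on_mouse` and the `state` dictionary are hypothetical names, and the callback is called by hand so that it runs without opening a window. It shows how the optional userdata argument can carry state into the callback.

```python
# Hypothetical callback with the same five parameters as cv::MouseCallback.
def on_mouse(event, x, y, flags, userdata):
    # userdata is whatever object was handed over at registration time;
    # here it is assumed to be a dict holding a "log" list.
    userdata["log"].append((event, x, y, flags))

state = {"log": []}

# With OpenCV the registration would look like this (not executed here):
#   cv2.setMouseCallback("Input Number:", on_mouse, state)

# Simulate two callbacks by calling the function directly.
on_mouse(1, 10, 20, 1, state)  # event 1 = EVENT_LBUTTONDOWN, flag 1 = EVENT_FLAG_LBUTTON
on_mouse(0, 12, 24, 1, state)  # event 0 = EVENT_MOUSEMOVE, left button still held
print(state["log"])
```

Passing a state object this way is one alternative to module-level globals; the script in this post uses globals instead.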

The important point is the difference between event and flags.

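cv::MouseEventTypes holds the integer codes of the mouse events. OpenCV defines ten button and move events, numbered 0 to 9 (recent versions also define mouse-wheel events). Some of them are: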

event	indication	description
EVENT_MOUSEMOVE	indicates that the mouse pointer has moved over the window.	Mouse moved
EVENT_LBUTTONDOWN	indicates that the left mouse button is pressed.	Left button pressed
EVENT_RBUTTONDOWN	indicates that the right mouse button is pressed.	Right button pressed
EVENT_MBUTTONDOWN	indicates that the middle mouse button is pressed.	Middle button pressed
EVENT_LBUTTONUP	indicates that left mouse button is released.	Left button released
EVENT_RBUTTONUP	indicates that right mouse button is released.	Right button released
EVENT_MBUTTONUP	indicates that middle mouse button is released.	Middle button released

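cv::MouseEventFlags describes the state of the mouse buttons and modifier keys during an event, which covers dragging and combined keyboard-and-mouse actions. There are six flag bits, with values 1, 2, 4, 8, 16 and 32, and they can be combined: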

flags	indication	description
EVENT_FLAG_LBUTTON	indicates that the left mouse button is down.	Left button held (drag)
EVENT_FLAG_RBUTTON	indicates that the right mouse button is down.	Right button held (drag)
EVENT_FLAG_MBUTTON	indicates that the middle mouse button is down.	Middle button held (drag)
EVENT_FLAG_CTRLKEY	indicates that CTRL Key is pressed.	Ctrl key held
EVENT_FLAG_SHIFTKEY	indicates that SHIFT Key is pressed.	Shift key held
EVENT_FLAG_ALTKEY	indicates that ALT Key is pressed.	Alt key held
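
Because each flag is a single bit, several of them can be set in the same flags value. The sketch below runs without OpenCV: the constants are copied from the cv::MouseEventFlags enum, while `FLAG_NAMES` and `decode_flags` are hypothetical helpers introduced only for this illustration. It decodes a combined value and shows why an equality test can miss a held button.

```python
# Values copied from OpenCV's cv::MouseEventFlags enum.
EVENT_FLAG_LBUTTON = 1
EVENT_FLAG_RBUTTON = 2
EVENT_FLAG_MBUTTON = 4
EVENT_FLAG_CTRLKEY = 8
EVENT_FLAG_SHIFTKEY = 16
EVENT_FLAG_ALTKEY = 32

# Hypothetical lookup table used only for this illustration.
FLAG_NAMES = {
    EVENT_FLAG_LBUTTON: "LBUTTON",
    EVENT_FLAG_RBUTTON: "RBUTTON",
    EVENT_FLAG_MBUTTON: "MBUTTON",
    EVENT_FLAG_CTRLKEY: "CTRLKEY",
    EVENT_FLAG_SHIFTKEY: "SHIFTKEY",
    EVENT_FLAG_ALTKEY: "ALTKEY",
}

def decode_flags(flags):
    """Return the names of all flag bits that are set in flags."""
    return [name for bit, name in FLAG_NAMES.items() if flags & bit]

# Left button held while Ctrl is pressed: both bits are set.
combined = EVENT_FLAG_LBUTTON | EVENT_FLAG_CTRLKEY
print(decode_flags(combined))
print(combined == EVENT_FLAG_LBUTTON)        # equality fails for a combined value
print(bool(combined & EVENT_FLAG_LBUTTON))   # the bitwise test still sees the button
```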

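Comparing the two tables makes the difference clear: event describes an instantaneous action, while flags describes a lasting state. For example, when you click the left button, the event is the instant of the press. After that the left button stays held, and the state remains "left button down" until you release it (which in turn triggers the left-button-up event).
Since the input comes from the mouse, what we want is the trajectory of the pointer while the left button is held down, so the condition for capturing the trajectory is: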

if event == cv2.EVENT_MOUSEMOVE and flag & cv2.EVENT_FLAG_LBUTTON:
…
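
To see how this condition separates a drag from a plain hover, the sketch below replays a hand-written list of events without opening a window. The constants are copied from OpenCV's enums; `collect_segments` and the sample event list are hypothetical and only illustrate the idea. It tests the button with a bitwise AND, so a modifier key held during the stroke does not interrupt it.

```python
# Values copied from OpenCV's cv::MouseEventTypes and cv::MouseEventFlags enums.
EVENT_MOUSEMOVE = 0
EVENT_LBUTTONDOWN = 1
EVENT_LBUTTONUP = 4
EVENT_FLAG_LBUTTON = 1
EVENT_FLAG_CTRLKEY = 8

def collect_segments(events):
    """Turn (event, x, y, flags) tuples into segments drawn with the left button held."""
    segments = []
    current = None  # last known point of the stroke, or None when no stroke is active
    for event, x, y, flags in events:
        if event == EVENT_LBUTTONDOWN:
            current = (x, y)                      # a stroke starts at the click position
        elif event == EVENT_MOUSEMOVE and flags & EVENT_FLAG_LBUTTON and current is not None:
            segments.append((current, (x, y)))    # connect the previous point to the new one
            current = (x, y)
        elif event == EVENT_LBUTTONUP:
            current = None                        # the stroke ends on release
    return segments

events = [
    (EVENT_MOUSEMOVE, 5, 5, 0),                                         # hover: ignored
    (EVENT_LBUTTONDOWN, 10, 10, EVENT_FLAG_LBUTTON),
    (EVENT_MOUSEMOVE, 12, 14, EVENT_FLAG_LBUTTON),
    (EVENT_MOUSEMOVE, 15, 20, EVENT_FLAG_LBUTTON | EVENT_FLAG_CTRLKEY),  # Ctrl held too
    (EVENT_LBUTTONUP, 15, 20, 0),
    (EVENT_MOUSEMOVE, 30, 30, 0),                                       # hover: ignored
]
print(collect_segments(events))
```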

The complete implementation is as follows:

'''
./Input.py
Handle mouse events
to capture a handwritten digit.
'''
import cv2
import numpy as np

# Create an empty black frame as the (600, 600, 3) drawing area; note the uint8 data type
frame = np.zeros((600, 600, 3), np.uint8)

# Previous and current mouse positions, kept as integer pixel coordinates
last_measurement = current_measurement = (0, 0)

def OnMouseMove(event, x, y, flag, userdata):
    global frame, current_measurement, last_measurement
    if event == cv2.EVENT_LBUTTONDOWN:
        # A stroke starts at the click position
        current_measurement = (x, y)

    # Mouse moved while the left button is held: extend the stroke
    if event == cv2.EVENT_MOUSEMOVE and flag & cv2.EVENT_FLAG_LBUTTON:
        last_measurement = current_measurement  # keep the current point as the previous one
        current_measurement = (x, y)  # new current point
        # cv2.line needs integer point coordinates; draw the digit in white
        cv2.line(frame, last_measurement, current_measurement, (255, 255, 255), thickness=8)

# Initialize the window
cv2.namedWindow("Input Number:")
# OpenCV handles mouse events through setMouseCallback; the first argument of the
# callback identifies the type of event that was triggered (click, move, etc.)
cv2.setMouseCallback("Input Number:", OnMouseMove)
key = 0
# Keep showing the frame until 'q' is pressed
while key != ord('q'):
    cv2.imshow("Input Number:", frame)
    key = cv2.waitKey(1) & 0xFF
cv2.imwrite('number.jpg', frame)
print('number image has been stored and named "number.jpg"')
cv2.destroyAllWindows()

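The logic of the code is straightforward. Two things need attention:
1. The image array must use the uint8 data type that OpenCV uses for 8-bit images. I often forget this, and the program then fails.
2. The use of the "global" keyword in the code, which touches on variable lifetime in Python and on how variables are passed to functions. For details, see the blog post "Python local and global variables: global".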
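
The second point can be sketched in a few lines. The names below (`counter`, `points` and the three functions) are hypothetical and only illustrate the rule: rebinding a module-level name inside a function needs `global`, while modifying an object in place does not. In Input.py, current_measurement and last_measurement are rebound inside the callback and so need the declaration; frame is only modified in place by cv2.line.

```python
counter = 0   # module-level name that a function will rebind
points = []   # module-level list that a function will only mutate

def rebinding_without_global():
    # This assignment creates a new local name; the module-level counter is untouched.
    counter = 99
    return counter

def rebinding_with_global():
    # 'global' makes the assignment target the module-level name.
    global counter
    counter = counter + 1

def mutating_without_global(p):
    # No assignment to the name 'points', so no 'global' declaration is needed.
    points.append(p)

rebinding_without_global()
rebinding_with_global()
mutating_without_global((3, 4))
print(counter, points)
```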

Running it is simple:

python Input.py

Result: [screenshot not available]

References:
1. Extracting a rectangular region of interest from an image: http://blog.csdn.net/songzitea/article/details/16954057
2. [Python+OpenCV] Object tracking, Kalman filter, mouse trajectory tracking: http://m.blog.csdn.net/lwplwf/article/details/74295801
3. OpenCV setMouseCallback mouse event handling: http://blog.csdn.net/dcrmg/article/details/52027847
