LeNet-5 Neural Network Model Analysis and Its TensorFlow Implementation

阿新 • Published: 2019-01-08

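I. Introduction to LeNet-5

LeNet-5 is a neural network architecture proposed by Yann LeCun in 1998 and a pioneer of convolutional neural networks. Although the model has only 7 layers, it reaches a recognition accuracy of up to 99.2% on the MNIST dataset, and it was one of the first successful applications of convolutional neural networks to digit image recognition.

A few points should be noted, however:

(1) LeNet-5 mainly uses tanh and sigmoid as its nonlinear activation functions, whereas ReLU is now generally more effective for convolutional neural networks.

(2) LeNet-5 uses average pooling for downsampling, whereas max pooling is now more widely used.

(3) The last layer of LeNet-5 uses Gaussian connections to output one of the 10 classes 0~9, whereas this kind of classifier has now been replaced by a softmax layer.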
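
To make the three points above more concrete, here is a minimal pure-Python sketch of the activation functions mentioned and of softmax. The function names are illustrative and do not come from the original LeNet-5 code; the input values are arbitrary.

```python
import math

def sigmoid(x):
    """Logistic sigmoid: squashes x into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def relu(x):
    """Rectified linear unit: keeps positive values, zeroes the rest."""
    return max(0.0, x)

def softmax(logits):
    """Turn a list of raw class scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

# tanh and sigmoid flatten out for large |x|, while ReLU keeps growing for x > 0.
for x in (-5.0, 0.0, 5.0):
    print(x, math.tanh(x), sigmoid(x), relu(x))

# Arbitrary scores for three classes; the largest score gets the largest probability.
probs = softmax([2.0, 1.0, 0.1])
print(probs, sum(probs))
```

The saturation of tanh and sigmoid for large inputs is one commonly cited reason why ReLU tends to train more easily in deeper networks.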

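II. LeNet-5 Network Structure

Layer 1: convolutional layer

Kernel size: (height, width) = (5, 5)

Kernel depth: 6 (the number of kernels, which is simply the number of channels of the feature maps produced by the convolution)

Convolution stride: (height, width) = (1, 1)

Padding: padding = 'VALID' (no zero padding at the edges, so the output feature maps are smaller than the input image)

Input size: (height, width) = (32, 32)

Output feature map size: (height, width) = (28, 28). Without zero padding, the output size is ceil((input height or width - kernel height or width + 1) / stride) = ceil((32 - 5 + 1) / 1) = 28

Number of output nodes: 28 * 28 * 6 = 4704 (the number of input nodes of the next layer, matching the size of the output feature maps)

Number of trainable parameters: (5 * 5 * 1 + 1) * 6 = 156 (each kernel has 5 * 5 weights and 1 bias, and there are 6 kernels)

Number of connections in the convolutional layer: (28 * 28 * 6) * (5 * 5 + 1) = 122304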
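
The arithmetic for the first layer can be checked with a short sketch. The helper names below are hypothetical and exist only for this illustration; the formulas are the ones stated above for 'VALID' padding.

```python
import math

def valid_output_size(input_size, kernel_size, stride):
    """Output length along one axis for 'VALID' padding (no zero padding)."""
    return math.ceil((input_size - kernel_size + 1) / stride)

def conv_param_count(kernel_h, kernel_w, in_channels, out_channels):
    """Kernel weights plus one bias, for each output channel."""
    return (kernel_h * kernel_w * in_channels + 1) * out_channels

def conv_connection_count(out_h, out_w, kernel_h, kernel_w, in_channels, out_channels):
    """Each output node connects to one kernel window over all input channels, plus its bias."""
    return out_h * out_w * out_channels * (kernel_h * kernel_w * in_channels + 1)

out = valid_output_size(32, 5, 1)
print(out)                                           # 28
print(out * out * 6)                                 # 4704
print(conv_param_count(5, 5, 1, 6))                  # 156
print(conv_connection_count(out, out, 5, 5, 1, 6))   # 122304
```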

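Layer 2: average pooling layer

Pooling window size: (height, width) = (2, 2)

Pooling depth: 6 (pooling does not change the depth / number of channels)

Pooling stride: (height, width) = (2, 2)

Padding: padding = 'VALID'

Input size: (height, width) = (28, 28)

Output feature map size: (height, width) = (14, 14), from ceil((input height or width - pooling window height or width + 1) / stride) = ceil((28 - 2 + 1) / 2) = ceil(13.5) = 14

Number of output nodes: 14 * 14 * 6 = 1176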
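
As an illustration of what a 2 x 2 pooling window with stride 2 does, the following pure-Python sketch applies both average pooling (as in LeNet-5) and max pooling to a small hand-made 4 x 4 feature map. The function `pool2d` and the sample values are assumptions made for this example only.

```python
def pool2d(feature_map, size, stride, reduce_fn):
    """Slide a size x size window with the given stride ('VALID' padding) and reduce each window."""
    height, width = len(feature_map), len(feature_map[0])
    output = []
    for top in range(0, height - size + 1, stride):
        row = []
        for left in range(0, width - size + 1, stride):
            window = [feature_map[top + i][left + j] for i in range(size) for j in range(size)]
            row.append(reduce_fn(window))
        output.append(row)
    return output

def mean(values):
    return sum(values) / len(values)

feature_map = [
    [1, 2, 5, 6],
    [3, 4, 7, 8],
    [9, 10, 13, 14],
    [11, 12, 15, 16],
]
avg_pooled = pool2d(feature_map, size=2, stride=2, reduce_fn=mean)
max_pooled = pool2d(feature_map, size=2, stride=2, reduce_fn=max)
print(avg_pooled)  # [[2.5, 6.5], [10.5, 14.5]]
print(max_pooled)  # [[4, 8], [12, 16]]
```

Applied to a 28 x 28 feature map, the same window and stride yield a 14 x 14 output, and the number of channels is unchanged because each channel is pooled independently.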

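Layer 3: convolutional layer

Kernel size: (height, width) = (5, 5)

Kernel depth: 16

Convolution stride: (height, width) = (1, 1)

Padding: padding = 'VALID'

Input size: (height, width) = (14, 14)

Output feature map size: (height, width) = (10, 10). Without zero padding, the output size is ceil((input height or width - kernel height or width + 1) / stride) = ceil((14 - 5 + 1) / 1) = 10

Number of output nodes: 10 * 10 * 16 = 1600

Number of trainable parameters: (5 * 5 * 6 + 1) * 16 = 2416 (each kernel has 5 * 5 * 6 weights and 1 bias, and there are 16 kernels)

Number of connections in the convolutional layer: (10 * 10 * 16) * (5 * 5 * 6 + 1) = 241600, assuming every kernel spans all 6 input channels as in the parameter count above. (The original paper connects each output map to only a subset of the input maps and reports 1,516 parameters and 151,600 connections for this layer.)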
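
The parameter count above assumes that each of the 16 output maps is connected to all 6 input maps, which is what a standard convolution with a [5, 5, 6, 16] filter does. The 1998 paper instead connects each output map of this layer to a subset of the input maps. The sketch below compares the two schemes; the per-map input counts (6 maps with 3 inputs, 9 maps with 4 inputs, 1 map with 6 inputs) are taken from the paper's connection table and should be checked against it before being relied on.

```python
KERNEL_WEIGHTS = 5 * 5        # weights per input map in one 5 x 5 kernel
OUTPUT_POSITIONS = 10 * 10    # nodes in each 10 x 10 output feature map

# Full connection: every one of the 16 output maps sees all 6 input maps.
full_inputs_per_map = [6] * 16

# Sparse scheme (assumed from the 1998 paper): 6 maps see 3 inputs, 9 see 4, 1 sees all 6.
sparse_inputs_per_map = [3] * 6 + [4] * 9 + [6] * 1

def param_count(inputs_per_map):
    """Kernel weights over the connected input maps, plus one bias per output map."""
    return sum(n * KERNEL_WEIGHTS + 1 for n in inputs_per_map)

def connection_count(inputs_per_map):
    """Every output position reuses the same parameters, so connections scale with positions."""
    return OUTPUT_POSITIONS * param_count(inputs_per_map)

print(param_count(full_inputs_per_map), connection_count(full_inputs_per_map))      # 2416 241600
print(param_count(sparse_inputs_per_map), connection_count(sparse_inputs_per_map))  # 1516 151600
```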

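Layer 4: average pooling layer

Pooling window size: (height, width) = (2, 2)

Pooling depth: 16

Pooling stride: (height, width) = (2, 2)

Padding: padding = 'VALID'

Input size: (height, width) = (10, 10)

Output feature map size: (height, width) = (5, 5), from ceil((input height or width - pooling window height or width + 1) / stride) = ceil((10 - 2 + 1) / 2) = ceil(4.5) = 5

Number of output nodes: 5 * 5 * 16 = 400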

Layer 5: fully connected layer

Input size: (height, width, channels) = (5, 5, 16)

Number of output nodes: 120

Number of trainable parameters: (5 * 5 * 16) * 120 + 120 = 48120 ((5 * 5 * 16) * 120 connection weights and 120 biases)
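
The step from a (5, 5, 16) feature map to 120 output nodes involves flattening followed by a matrix product. A minimal pure-Python sketch, with constant placeholder values standing in for real activations and weights (both are assumptions for illustration only):

```python
def flatten(feature_maps):
    """Flatten a nested [height][width][channels] list into one flat list."""
    return [v for row in feature_maps for pixel in row for v in pixel]

def dense(inputs, weights, biases):
    """outputs[j] = sum over i of inputs[i] * weights[i][j], plus biases[j]."""
    return [
        sum(x * w_row[j] for x, w_row in zip(inputs, weights)) + biases[j]
        for j in range(len(biases))
    ]

# A constant 5 x 5 x 16 input stands in for the output of the fourth layer.
pooled = [[[1.0] * 16 for _ in range(5)] for _ in range(5)]
flat = flatten(pooled)
print(len(flat))  # 400

# Placeholder weights, used only to exercise the shapes: 400 inputs -> 120 outputs.
weights = [[0.01] * 120 for _ in flat]
biases = [0.1] * 120
outputs = dense(flat, weights, biases)
print(len(outputs))                             # 120
print(len(flat) * len(biases) + len(biases))    # 48120 trainable parameters
```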

Layer 6: fully connected layer

Number of input nodes: 120

Number of output nodes: 84

Number of trainable parameters: 120 * 84 + 84 = 10164 (120 * 84 connection weights and 84 biases)

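Layer 7: Gaussian connection layer (Gaussian Connections) (the last layer is the classifier, which is now usually replaced by a softmax classification layer)

Number of input nodes: 84

Number of output nodes: 10

Number of trainable parameters: 84 * 10 + 10 = 850 (84 * 10 connection weights and 10 biases)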
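
The per-layer figures above can be cross-checked by walking a 32 x 32 x 1 input through all seven layers. This is a sketch under the assumptions used in this section (VALID padding, full connection in every convolution, a plain fully connected output layer); the `LAYERS` table and `summarize` function are hypothetical names introduced here.

```python
import math

def valid_size(size, kernel, stride):
    """Output length along one axis for 'VALID' padding."""
    return math.ceil((size - kernel + 1) / stride)

# (layer name, kind, kernel or window size, stride, output channels or units)
LAYERS = [
    ("conv1", "conv", 5, 1, 6),
    ("pool2", "pool", 2, 2, None),
    ("conv3", "conv", 5, 1, 16),
    ("pool4", "pool", 2, 2, None),
    ("fc5", "fc", None, None, 120),
    ("fc6", "fc", None, None, 84),
    ("fc7", "fc", None, None, 10),
]

def summarize(input_size=32, input_channels=1):
    """Return (name, output shape, trainable parameters) for every layer."""
    size, channels = input_size, input_channels
    units = None  # set once the first fully connected layer has been reached
    rows = []
    for name, kind, kernel, stride, out in LAYERS:
        if kind == "conv":
            params = (kernel * kernel * channels + 1) * out
            size, channels = valid_size(size, kernel, stride), out
            shape = (size, size, channels)
        elif kind == "pool":
            params = 0  # pooling as implemented in this article has no weights
            size = valid_size(size, kernel, stride)
            shape = (size, size, channels)
        else:
            fan_in = units if units is not None else size * size * channels
            params = fan_in * out + out
            units = out
            shape = (out,)
        rows.append((name, shape, params))
    return rows

rows = summarize()
for name, shape, params in rows:
    print(name, shape, params)
print("total", sum(params for _, _, params in rows))  # total 61706
```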

III. LeNet-5 Implementation Code

1. Environment

TensorFlow API r1.12

CUDA 9.2 V9.2.148

cudnn64_7.dll

Python 3.6.3

Windows 10, Mac

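##################################################################################################
# TensorFlow code reproducing the original LeNet network
# Kept as close to the original LeNet as possible, with a few differences
# 1998 [LeNet-5] Gradient-Based Learning Applied to Document Recognition (Proceedings of the IEEE)
# LeNet-5 has 7 layers in total
# Layer 1: convolutional layer
# Layer 2: average pooling layer
# Layer 3: convolutional layer
# Layer 4: average pooling layer
# Layer 5: fully connected layer
# Layer 6: fully connected layer
# Layer 7: Gaussian connection layer (implemented below as a fully connected layer + softmax)
# In TensorFlow a convolution kernel is called a filter
# Filter dimensions: kernel height, kernel width, input channels, output channels after convolution
# Pooling window dimensions: batch, height, width, depth / channels
# TensorBoard data is saved in the summary folder under the current path; on the command line run: tensorboard --logdir="./summary/"
# View the variables in a browser at http://localhost:6006
##################################################################################################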

import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

class LeNet5_Origin(object):
    def __init__(self):
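        self.batch_size = 100   # number of examples per batch
        self.image_height = 32   # input image height
        self.image_width = 32   # input image width
        self.image_channel = 1   # input image depth / number of channels
        self.epoch_num = 1000   # number of training iterations (one mini-batch each)
        self.lr = 0.001   # optimizer learning rate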

    def build_model(self, images):
        # Layer 1: convolutional layer
        with tf.variable_scope("conv1"):
            conv1_weights = tf.get_variable(name="conv1_w", shape=[5, 5, 1, 6], initializer=tf.truncated_normal_initializer(mean=0.0, stddev=1.0))
            conv1_bias = tf.get_variable(name="conv1_b", shape=[6], initializer=tf.constant_initializer(value=0.1))
            conv1 = tf.nn.conv2d(input=images, filter=conv1_weights, strides=[1, 1, 1, 1], padding="VALID")
            conv1 = tf.nn.bias_add(value=conv1,bias=conv1_bias)
            relu1 = tf.nn.relu(features=conv1)

        # Layer 2: average pooling layer
        with tf.variable_scope("pool2"):
            pool2 = tf.nn.avg_pool(value=relu1, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding="VALID")

        # Layer 3: convolutional layer
        with tf.variable_scope("conv3"):
            conv3_weights = tf.get_variable(name="conv3_w", shape=[5, 5, 6, 16], initializer=tf.truncated_normal_initializer(mean=0.0, stddev=1.0))
            conv3_bias = tf.get_variable(name="conv3_b", shape=[16], initializer=tf.constant_initializer(value=0.1))
            conv3 = tf.nn.conv2d(input=pool2, filter=conv3_weights, strides=[1, 1, 1, 1], padding="VALID")
            conv3 = tf.nn.bias_add(value=conv3, bias=conv3_bias)
            relu3 = tf.nn.relu(features=conv3)

        # Layer 4: average pooling layer
        with tf.variable_scope("pool4"):
            pool4 = tf.nn.avg_pool(value=relu3, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding="VALID")

        # Flatten the [batch, 5, 5, 16] pooling output into [batch, 400] for the fully connected layers
        pool4_shapes = pool4.get_shape().as_list()
        batch_size = pool4_shapes[0]
        fc5_nodes_num = pool4_shapes[1]*pool4_shapes[2]*pool4_shapes[3]
        reshape_pool4 = tf.reshape(tensor=pool4,shape=[batch_size, fc5_nodes_num])

        # Layer 5: fully connected layer
        with tf.variable_scope("fc5"):
            fc5_weights = tf.get_variable(name="w5", shape=[fc5_nodes_num, 120], initializer=tf.truncated_normal_initializer(mean=0.0, stddev=0.1))
            fc5_bias = tf.get_variable(name="b5", shape=[120], initializer=tf.constant_initializer(value=0.1))
            fc5 = tf.nn.relu(features=(tf.matmul(a=reshape_pool4, b=fc5_weights) + fc5_bias))

        # Layer 6: fully connected layer
        with tf.variable_scope("fc6"):
            fc6_weights = tf.get_variable(name="w6", shape=[120, 84], initializer=tf.truncated_normal_initializer(mean=0.0, stddev=0.1))
            fc6_bias = tf.get_variable(name="b6", shape=[84], initializer=tf.constant_initializer(value=0.1))
            fc6 = tf.nn.relu(features=(tf.matmul(a=fc5, b=fc6_weights) + fc6_bias))

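        # Layer 7: output layer (stands in for the Gaussian connections; produces one score per class 0~9)
        with tf.variable_scope("softmax"):
            fc7_weights = tf.get_variable(name="w7", shape=[84, 10], initializer=tf.truncated_normal_initializer(mean=0.0, stddev=0.1))
            fc7_bias = tf.get_variable(name="b7", shape=[10], initializer=tf.constant_initializer(value=0.1))
            # Return raw logits: the loss op in train() applies softmax internally, so applying it here would apply it twice.
            # argmax over logits gives the same predicted class as argmax over softmax probabilities.
            fc7 = tf.matmul(a=fc6, b=fc7_weights) + fc7_bias
        return fc7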

    def train(self):
        mnist = input_data.read_data_sets('./mnist_data/', one_hot=True)

        images_holder = tf.placeholder(dtype=tf.float32, shape=[self.batch_size, self.image_height, self.image_width, self.image_channel], name="x")
        labels_holder = tf.placeholder(dtype=tf.float32, shape=[self.batch_size, 10], name="y")
        label_predict = self.build_model(images_holder)

        # Per-example cross-entropy loss; the one-hot labels are converted to class indices with argmax
        loss = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=tf.argmax(labels_holder,1), logits=label_predict)
        correct_predict = tf.equal(tf.argmax(label_predict, 1), tf.argmax(labels_holder, 1))
        accuracy = tf.reduce_mean(input_tensor=tf.cast(x=correct_predict, dtype=tf.float32))

        train_op = tf.train.GradientDescentOptimizer(learning_rate=self.lr).minimize(loss=loss)

        tf.summary.histogram(name='loss', values=loss)
        tf.summary.scalar(name="accuracy", tensor=accuracy)
        merged = tf.summary.merge_all()

        init_op = tf.global_variables_initializer()

        saver = tf.train.Saver(max_to_keep=1)

        with tf.Session() as sess:
            sess.run(init_op)

            # Build the preprocessing ops once, before the loop, so the graph does not grow on every iteration
            raw_images_holder = tf.placeholder(dtype=tf.float32, shape=[self.batch_size, 784], name="raw_x")
            reshaped_images = tf.reshape(tensor=raw_images_holder, shape=[self.batch_size, 28, 28, 1])
            resized_images = tf.image.resize_images(images=reshaped_images, size=(32, 32))

            writer = tf.summary.FileWriter(logdir="./summary/", graph=sess.graph)

            for i in range(self.epoch_num):
                batch_images, batch_labels = mnist.train.next_batch(self.batch_size)
                # Turn the flat 784-pixel rows into 28x28x1 images and resize them to the 32x32 input size
                batch_images = sess.run(resized_images, feed_dict={raw_images_holder: batch_images})
                feed = {images_holder: batch_images, labels_holder: batch_labels}

                # Run one training step and fetch the accuracy and summaries in the same call
                _, accuracy_result, summary_result = sess.run([train_op, accuracy, merged], feed_dict=feed)
                writer.add_summary(summary=summary_result, global_step=i)

                saver.save(sess=sess, save_path="./models/lenet-5.ckpt", global_step=i+1)

                # print(accuracy_result)


if __name__ == "__main__":
    model = LeNet5_Origin()
    model.train()
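
The loss in train() is computed with tf.nn.sparse_softmax_cross_entropy_with_logits, which expects unnormalized scores (logits) and applies softmax internally. The pure-Python sketch below shows, for a single example, what that computation amounts to and what happens if probabilities are passed where logits are expected. The function names and the sample scores are assumptions for illustration; this is not TensorFlow's implementation.

```python
import math

def softmax(logits):
    """Convert raw scores to probabilities."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sparse_softmax_cross_entropy(logits, label):
    """-log(softmax(logits)[label]), computed with log-sum-exp for numerical stability."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_sum_exp - logits[label]

logits = [4.0, 1.0, 0.0]   # hypothetical raw scores for 3 classes
label = 0                  # index of the correct class

loss_from_logits = sparse_softmax_cross_entropy(logits, label)
# Passing probabilities where logits are expected applies softmax a second time,
# which flattens the distribution and inflates the loss of a confident, correct prediction.
loss_from_probs = sparse_softmax_cross_entropy(softmax(logits), label)
print(loss_from_logits, loss_from_probs)

# The predicted class (argmax) is the same for logits and for probabilities.
probs = softmax(logits)
print(logits.index(max(logits)), probs.index(max(probs)))
```

Because the predicted class does not change, a doubled softmax leaves the accuracy metric looking reasonable while the gradients become weaker, which makes this kind of mistake easy to overlook.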

[Figure: TensorBoard graph]

[Figure: TensorBoard Scalars view of accuracy]

[Figure: TensorBoard Histograms view of loss]

References:

[1] LeCun Y, Bottou L, Bengio Y, et al. Gradient-based learning applied to document recognition[J]. Proceedings of the IEEE, 1998, 86(11): 2278-2324.

[2] 《TensorFlow 實戰 Google 深度學習框架》, Chapter 6: Image Recognition and Convolutional Neural Networks
