
Machine Learning with Tensorflow (6) -- Convolutional Neural Networks

This post implements LeNet based on tensorflow's Estimator API and the MNIST dataset. The concrete steps are as follows:

Import the necessary modules

from __future__ import division, print_function, absolute_import

# Import MNIST data
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets('/tmp/data/', one_hot=False)
# Why set one_hot to False? Because sparse_softmax_cross_entropy_with_logits
# (used further down) expects integer class labels (0-9), not one-hot vectors.

import tensorflow as tf
import matplotlib.pyplot as plt
import numpy as np

The notable point in the code above is the __future__ module (a standard module, not a built-in function), which is the standard way to let Python 2.7 code use Python 3.x behavior such as true division and the print() function.
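As a minimal illustration of what these imports change under Python 2.7 (this behavior is already the default under Python 3.x):

from __future__ import division, print_function

# Under plain Python 2.7, 7 / 2 evaluates to 3 (floor division) and
# print is a statement. With the __future__ imports, Python 2.7
# matches Python 3.x semantics:
print(7 / 2)   # 3.5 (true division)
print(7 // 2)  # 3   (explicit floor division)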

Configure the variables

# Training parameters
learning_rate = 0.01
num_steps = 2000
batch_size = 128

# Network Parameters
num_input = 784   # MNIST image size: 28*28 = 784 pixels
num_classes = 10  # MNIST classes: digits 0-9
dropout = 0.75    # fraction of units to drop (passed as tf.layers.dropout's `rate`)

Note the introduction of dropout: during training it randomly zeroes a fraction of the activations (rate is the probability of dropping a unit, and the surviving units are scaled up by 1/(1-rate)), while at test time it is disabled. Dropout can therefore be switched on and off depending on the mode.
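A minimal sketch of that switch (the tensors here are toy values, not part of the pipeline):

# Illustration of tf.layers.dropout's training flag
x = tf.ones([1, 4])
train_out = tf.layers.dropout(x, rate=0.75, training=True)
# ~75% of the units are zeroed; survivors are scaled by 1/(1-0.75) = 4
test_out = tf.layers.dropout(x, rate=0.75, training=False)
# at test time dropout is the identity: test_out equals x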

Define the neural network

# Create the neural network
def conv_net(x_dict, n_classes, dropout, reuse, is_training):

    # Define a scope for reusing the variables.
    with tf.variable_scope('ConvNet', reuse=reuse):
        # TF Estimator input is a dict, in case of multiple inputs
        x = x_dict['images']

        # MNIST data input is a 1-D vector of 784 features (28*28 pixels)
        # Reshape to match picture format [Height x Width x Channel]
        # Tensor input becomes 4-D: [Batch Size, Height, Width, Channel]
        x = tf.reshape(x, shape=[-1, 28, 28, 1])

        # Convolution Layer with 32 filters and a kernel size of 5
        conv1 = tf.layers.conv2d(x, 32, 5, activation=tf.nn.relu)
        # Max Pooling (down-sampling) with strides of 2 and kernel size of 2
        conv1 = tf.layers.max_pooling2d(conv1, 2, 2)

        # Convolution Layer with 64 filters and a kernel size of 3
        conv2 = tf.layers.conv2d(conv1, 64, 3, activation=tf.nn.relu)
        # Max Pooling (down-sampling) with strides of 2 and kernel size of 2
        conv2 = tf.layers.max_pooling2d(conv2, 2, 2)

        # Flatten the data to a 1-D vector for the fully connected layer
        fc1 = tf.contrib.layers.flatten(conv2)

        # Fully connected layer (in tf contrib folder for now)
        fc1 = tf.layers.dense(fc1, 1024)
        # Apply Dropout (if is_training is False, dropout is not applied)
        fc1 = tf.layers.dropout(fc1, rate=dropout, training=is_training)

        # Output layer, class prediction
        out = tf.layers.dense(fc1, n_classes)

    return out

The following points need explanation:

  1. The reuse flag of tf.variable_scope controls whether get_variable() creates new variables or returns existing ones. The first call (the training graph) uses reuse=False so the variables are created; the second call (the test graph) uses reuse=True so the same variables are shared rather than duplicated.
  2. x = x_dict['images']. Since the Estimator passes its input in as a dict (to support multiple inputs), the image tensor is extracted from the dict here.
  3. x = tf.reshape(x, shape=[-1, 28, 28, 1]). tf.reshape turns the 1-D vector into a 4-D tensor whose dimensions are batch size, height, width, and channel. A dimension set to -1 is inferred from the remaining sizes; since LeNet does not fix the batch size in advance, it is set to -1 here. (See the sketch after this list.)
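
As a quick illustration of points 1 and 3, a minimal, self-contained sketch (the scope name 'demo' and the shapes are made up for illustration):

import tensorflow as tf

# Point 1: reuse=True makes get_variable return the existing variable
with tf.variable_scope('demo', reuse=False):
    w1 = tf.get_variable('w', shape=[2, 2])  # created here
with tf.variable_scope('demo', reuse=True):
    w2 = tf.get_variable('w')                # same variable returned
print(w1 is w2)  # True: no second copy of the weights

# Point 3: a -1 dimension is inferred from the other sizes
batch = tf.zeros([128, 784])
imgs = tf.reshape(batch, [-1, 28, 28, 1])
print(imgs.shape)  # (128, 28, 28, 1): the -1 was inferred as 128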

Define the model function

# Define the model function (following TF Estimator Template)
def model_fn(features, labels, mode):

    # Build the network twice over the same weights: one graph applies
    # dropout (training), the other does not (test/prediction)
    logits_train = conv_net(features, num_classes, dropout, reuse=False, is_training=True)
    logits_test = conv_net(features, num_classes, dropout, reuse=True, is_training=False)

    # Predictions
    pred_classes = tf.argmax(logits_test, axis=1)
    pred_probas = tf.nn.softmax(logits_test)

    # If prediction mode, early return
    if mode == tf.estimator.ModeKeys.PREDICT:
        return tf.estimator.EstimatorSpec(mode, predictions=pred_classes)

    # Define loss and optimizer
    loss_op = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(
        logits=logits_train, labels=tf.cast(labels, dtype=tf.int32)))
    optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate)
    train_op = optimizer.minimize(loss_op, global_step=tf.train.get_global_step())

    # Evaluate the accuracy of the model
    acc_op = tf.metrics.accuracy(labels=labels, predictions=pred_classes)

    # TF Estimator requires the model_fn to return an EstimatorSpec that
    # specifies the different ops for training, evaluation, etc.
    estim_specs = tf.estimator.EstimatorSpec(
        mode=mode,
        predictions=pred_classes,
        loss=loss_op,
        train_op=train_op,
        eval_metric_ops={'accuracy' : acc_op})

    return estim_specs
    # Refer to the tf.estimator.EstimatorSpec documentation for details

The following is worth noting:

  1. logits_train and logits_test correspond to the training graph and the test graph respectively; the mode of each is fixed by the is_training argument (dropout on vs. off), while reuse=True ensures both graphs share one set of weights. A quick check is sketched below.
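
A minimal sketch of a check one could run (hypothetical; features here is a dummy dict, not the Estimator's real input):

features = {'images': tf.zeros([8, 784])}  # hypothetical dummy batch

logits_train = conv_net(features, num_classes, dropout, reuse=False, is_training=True)
n_vars = len(tf.trainable_variables())

logits_test = conv_net(features, num_classes, dropout, reuse=True, is_training=False)
# Building the test graph created no new weights: both graphs share one set
assert len(tf.trainable_variables()) == n_vars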

Build the Estimator

# Build the Estimator
model = tf.estimator.Estimator(model_fn)

Define the input function and train

# Define the input function for training
input_fn = tf.estimator.inputs.numpy_input_fn(
    x={'images' : mnist.train.images}, y=mnist.train.labels,
    batch_size=batch_size, num_epochs=None, shuffle=True)
# What do shuffle and num_epochs really mean? (See the note below.)

# Train the Model
model.train(input_fn, steps=num_steps)

Since x must be a dict, the mnist training images are passed under the 'images' key. shuffle is set to True for training (so the sample order varies between passes) and to False for evaluation. num_epochs=None makes the input repeat indefinitely, so the amount of training is controlled entirely by steps. model.train then trains the model.
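To make the num_epochs semantics concrete, a hedged sketch of the alternative (one_epoch_fn is a hypothetical name, not part of the original code):

# With num_epochs=1 the input function yields exactly one pass over the
# data and then signals end-of-input, so training would stop after
# ceil(55000 / batch_size) steps even if `steps` is larger.
one_epoch_fn = tf.estimator.inputs.numpy_input_fn(
    x={'images': mnist.train.images}, y=mnist.train.labels,
    batch_size=batch_size, num_epochs=1, shuffle=True)
# model.train(one_epoch_fn, steps=num_steps)  # ends after one epoch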

Evaluate the model

# Evaluate the model
# Define the input function for evaluating
input_fn = tf.estimator.inputs.numpy_input_fn(
    x={'images': mnist.test.images}, y=mnist.test.labels,
    batch_size=batch_size, shuffle=False)
model.evaluate(input_fn)

The test set is fed in through a new input function (shuffle=False this time), and model.evaluate runs the model over it in batches of batch_size, returning the metrics defined in model_fn.
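model.evaluate returns a dict holding the loss and every entry of eval_metric_ops; a minimal usage sketch (the printed values are placeholders, not actual results):

e = model.evaluate(input_fn)
print("Testing Accuracy:", e['accuracy'])
print("Testing Loss:", e['loss'])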

Predict single images

# Predict single images
n_images = 4
test_images = mnist.test.images[:n_images]
input_fn = tf.estimator.inputs.numpy_input_fn(
    x={'images': test_images}, shuffle=False)
preds = list(model.predict(input_fn))
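Since matplotlib and numpy were imported at the top but never used, a natural finishing touch (a minimal sketch, not part of the original walkthrough) is to display each test image next to its predicted class:

# Show each of the n_images test digits with the model's prediction
for i in range(n_images):
    plt.imshow(np.reshape(test_images[i], [28, 28]), cmap='gray')
    plt.title('Model prediction: %d' % preds[i])
    plt.show()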

That is the complete workflow of a convolutional neural network implemented with Tensorflow. All in all, tensorflow keeps this quite simple and clear. Learning will continue in later posts.