Artificial Intelligence (4) - Implementing a Multilayer Neural Network
阿新 • Published: 2018-12-06
1. Single-layer neural network
2. Multilayer neural network
3. The three steps of MLP learning
MLP learning procedure in three simple steps:
- Starting at the input layer, we forward propagate the patterns of the training data through the network to generate an output.
- Based on the network's output, we calculate the error that we want to minimize using a cost function that we will describe later.
- We backpropagate the error, find its derivative with respect to each weight in the network, and update the model.
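The three steps can be sketched for a tiny network as follows. This is only a minimal illustration, not the NeuralNetMLP implementation used later: the toy data, the 3-5-2 layer sizes, the sigmoid activation, the logistic cost, and all variable names are assumptions made for the example.

```python
import numpy as np


def sigmoid(z):
    """Logistic activation, applied element-wise."""
    return 1.0 / (1.0 + np.exp(-z))


rng = np.random.RandomState(1)

# Toy data (assumption): 4 samples, 3 features, 2 classes, one-hot targets.
X = rng.normal(size=(4, 3))
y_onehot = np.array([[1., 0.], [0., 1.], [1., 0.], [0., 1.]])

# Small random weights and zero biases for a 3-5-2 network.
W_h = rng.normal(scale=0.1, size=(3, 5))
b_h = np.zeros(5)
W_out = rng.normal(scale=0.1, size=(5, 2))
b_out = np.zeros(2)
eta = 0.1  # learning rate

costs = []
for epoch in range(500):
    # Step 1: forward propagate the inputs to get the network output.
    a_h = sigmoid(X @ W_h + b_h)
    a_out = sigmoid(a_h @ W_out + b_out)

    # Step 2: compute the error with a cost function (logistic cost here).
    cost = -np.sum(y_onehot * np.log(a_out)
                   + (1. - y_onehot) * np.log(1. - a_out))
    costs.append(cost)

    # Step 3: backpropagate the error and update every weight.
    delta_out = a_out - y_onehot
    delta_h = (delta_out @ W_out.T) * a_h * (1. - a_h)
    W_out -= eta * (a_h.T @ delta_out)
    b_out -= eta * delta_out.sum(axis=0)
    W_h -= eta * (X.T @ delta_h)
    b_h -= eta * delta_h.sum(axis=0)

print('cost before: %.3f, after: %.3f' % (costs[0], costs[-1]))
```

The cost recorded in the last epoch should be lower than in the first, which is the only thing this sketch is meant to show.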
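Forward propagation
Every unit in the hidden layer is connected to every unit in the input layer; from these connections we compute the activations of the hidden layer.
The activations of the output layer are computed in the same way.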
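Written out, the forward pass for one hidden layer can be expressed as below. The notation is an assumption chosen for this sketch: $A^{(in)}$ is the input matrix (one sample per row), $W^{(h)}$, $b^{(h)}$ and $W^{(out)}$, $b^{(out)}$ are the weights and biases of the hidden and output layers, and $\phi$ is taken to be the sigmoid function.

```latex
\begin{aligned}
Z^{(h)}   &= A^{(in)} W^{(h)} + b^{(h)},   & A^{(h)}   &= \phi\left(Z^{(h)}\right), \\
Z^{(out)} &= A^{(h)} W^{(out)} + b^{(out)}, & A^{(out)} &= \phi\left(Z^{(out)}\right), \\
\phi(z)   &= \frac{1}{1 + e^{-z}}.
\end{aligned}
```

For MNIST with 784 input pixels, 100 hidden units, and 10 classes, $W^{(h)}$ would have shape $784 \times 100$ and $W^{(out)}$ shape $100 \times 10$.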
4. Obtaining the MNIST dataset
We load the 60,000 training samples and the 10,000 test samples, and unroll each raw 28*28 image into a vector of 784 pixels.
# -*- coding: utf-8 -*-
"""
Created on Sat Nov 10 14:30:38 2018
@author:YRP
"""
import os
import struct
import numpy as np


# load_mnist returns two arrays: the images (samples x features) and the labels
def load_mnist(path, kind='train'):
    """Load MNIST data from `path`"""
    labels_path = os.path.join(path, '%s-labels.idx1-ubyte' % kind)
    images_path = os.path.join(path, '%s-images.idx3-ubyte' % kind)

    with open(labels_path, 'rb') as lbpath:
        # Skip the 8-byte header (magic number, number of items)
        magic, n = struct.unpack('>II', lbpath.read(8))
        labels = np.fromfile(lbpath, dtype=np.uint8)

    with open(images_path, 'rb') as imgpath:
        # Skip the 16-byte header (magic number, count, rows, columns)
        magic, num, rows, cols = struct.unpack(">IIII", imgpath.read(16))
        images = np.fromfile(imgpath, dtype=np.uint8).reshape(
            len(labels), 784)
        # Scale pixel values from [0, 255] to [-1, 1]
        images = ((images / 255.) - .5) * 2

    return images, labels


# Read the 60000 training samples and the 10000 test samples
X_train, y_train = load_mnist('', kind='train')
print('Rows: %d, columns: %d' % (X_train.shape[0], X_train.shape[1]))
X_test, y_test = load_mnist('', kind='t10k')
print('Rows: %d, columns: %d' % (X_test.shape[0], X_test.shape[1]))

# Show one example of each digit, 0 to 9
import matplotlib.pyplot as plt

fig, ax = plt.subplots(nrows=2, ncols=5, sharex=True, sharey=True)
ax = ax.flatten()
for i in range(10):
    img = X_train[y_train == i][0].reshape(28, 28)
    ax[i].imshow(img, cmap='Greys')
ax[0].set_xticks([])
ax[0].set_yticks([])
plt.tight_layout()
plt.show()

# Save the training and test sets to a file
np.savez_compressed('mnist_scaled.npz',
                    X_train=X_train,
                    y_train=y_train,
                    X_test=X_test,
                    y_test=y_test)

# Read the file back
mnist = np.load('mnist_scaled.npz')
The resulting plot shows one sample image for each digit.
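The loader unrolls each 28*28 image into 784 values and rescales them with ((images / 255.) - .5) * 2. A quick way to check what that mapping does is to apply it to a synthetic image; the array below is made up for illustration and is not real MNIST data.

```python
import numpy as np

# Synthetic 28x28 "image" with raw byte values (assumption: not real MNIST data).
raw = np.zeros((28, 28), dtype=np.uint8)
raw[10:18, 10:18] = 255  # a white square on a black background

# Unroll to a 784-element vector, then scale from [0, 255] to [-1, 1].
flat = raw.reshape(784)
scaled = ((flat / 255.) - .5) * 2

print(flat.shape, scaled.min(), scaled.max())
```

Black pixels (0) map to -1 and white pixels (255) map to 1, so the inputs are centered around zero.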
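5. Classifying handwritten digits
We implement an MLP with one input layer, one hidden layer, and one output layer to classify the images in the MNIST dataset.
We train on 55,000 samples and hold out the remaining 5,000 for validation.
The parameters set in NeuralNetMLP are: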
- l2: This is the λ parameter for L2 regularization, which decreases the degree of overfitting.
- epochs: This is the number of passes over the training set.
- eta: This is the learning rate η.
- shuffle: This shuffles the training set prior to every epoch to prevent the algorithm from getting stuck in cycles.
- seed: This is a random seed for shuffling and weight initialization.
- minibatch_size: This is the number of training samples in each mini-batch when splitting the training data in each epoch for stochastic gradient descent. The gradient is computed for each mini-batch separately instead of on the entire training data, for faster learning.
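How shuffle, seed, and minibatch_size interact within one epoch can be sketched as below. The helper iterate_minibatches is hypothetical and written for this illustration; it is not part of the mlp module, whose internals are not shown here.

```python
import numpy as np


def iterate_minibatches(n_samples, minibatch_size, rng, shuffle=True):
    """Yield index arrays that together cover range(n_samples) exactly once.

    Hypothetical helper: `rng` is a np.random.RandomState created from the seed.
    """
    indices = np.arange(n_samples)
    if shuffle:
        rng.shuffle(indices)  # new sample order for this epoch
    for start in range(0, n_samples, minibatch_size):
        yield indices[start:start + minibatch_size]


rng = np.random.RandomState(1)  # plays the role of seed=1
batches = list(iterate_minibatches(n_samples=10, minibatch_size=4, rng=rng))

for batch_idx in batches:
    # In a real training loop, the gradient would be computed on
    # X_train[batch_idx], y_train[batch_idx] and the weights updated here.
    print(batch_idx)
```

With 10 samples and a mini-batch size of 4, one epoch consists of three updates on batches of 4, 4, and 2 samples. Reusing the same rng across epochs gives a different order each epoch while keeping the whole run reproducible.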
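Plotting the cost recorded over the 200 epochs gives the following chart.
The training and validation accuracy over the 200 epochs are plotted in the same way.
Finally, we assess the model's generalization ability by comparing the accuracy on the training set with the accuracy on the validation set, and then measure the accuracy on the test set: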
Test accuracy: 97.54%
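We then look at a 5*5 grid of subplots showing misclassified images. In each subplot title, the first number is the plot index, the second is the true class label (t), and the third is the predicted class label (p):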
import os
import mlp
import numpy as np
import matplotlib.pyplot as plt
mnist = np.load('./mnist/mnist_scaled.npz')
X_train, y_train, X_test, y_test = [mnist[f] for f in mnist.files]
n_epochs = 200
if 'TRAVIS' in os.environ:
    n_epochs = 20
nn = mlp.NeuralNetMLP(n_hidden=100,
                      l2=0.01,
                      epochs=n_epochs,
                      eta=0.0005,
                      minibatch_size=100,
                      shuffle=True,
                      seed=1)
# Train on the first 55000 samples, validate on the remaining 5000
nn.fit(X_train=X_train[:55000],
       y_train=y_train[:55000],
       X_valid=X_train[55000:],
       y_valid=y_train[55000:])
plt.plot(range(nn.epochs), nn.eval_['cost'])
plt.ylabel('Cost')
plt.xlabel('Epochs')
plt.savefig('images/costEpochs.png', dpi=300)
plt.show()
plt.plot(range(nn.epochs), nn.eval_['train_acc'], label='training')
plt.plot(range(nn.epochs), nn.eval_['valid_acc'], label='validation', linestyle='--')
plt.ylabel('Accuracy')
plt.xlabel('Epochs')
plt.legend()
plt.savefig('images/accuracyEpochs.png', dpi=300)
plt.show()
y_test_pred = nn.predict(X_test)
# np.float was removed in NumPy 1.24; the built-in float is equivalent here
acc = (np.sum(y_test == y_test_pred)
       .astype(float) / X_test.shape[0])
print('Test accuracy: %.2f%%' % (acc * 100))
miscl_img = X_test[y_test != y_test_pred][:25]
correct_lab = y_test[y_test != y_test_pred][:25]
miscl_lab = y_test_pred[y_test != y_test_pred][:25]
fig, ax = plt.subplots(nrows=5, ncols=5, sharex=True, sharey=True)
ax = ax.flatten()
for i in range(25):
    img = miscl_img[i].reshape(28, 28)
    ax[i].imshow(img, cmap='Greys', interpolation='nearest')
    ax[i].set_title('%d) t: %d p: %d' % (i+1, correct_lab[i], miscl_lab[i]))
ax[0].set_xticks([])
ax[0].set_yticks([])
plt.tight_layout()
plt.savefig('images/misclassifying.png', dpi=300)
plt.show()
Reference: Python Machine Learning (2nd edition)