1. 程式人生 > >機器學習學習筆記:用MiniVGGNet處理Cifar-10資料集

機器學習學習筆記:用MiniVGGNet處理Cifar-10資料集

0. 引言

VGGNet,由Simonyan和Zisserman在2014年提出,論文名字是《Very Deep Learning Convolutional Neural Networks for Large-Scale Image Recognition》。他們做出的貢獻主要是提出了一個只用(3x3)的卷積filters,並且層數為16-19層的神經網路。使用這個網路,在ImageNet分類挑戰中,可以獲得較高的分類精度。(筆者到現在的學習過程中,發現最神經網路的結構越簡單,訓練速度較快,但是識別精度不高。因此是需要使用前人已經驗證過了有較高識別精度的網路結構。) 本文為了先讓大家一起體會一下VGGNet的深度,把VGGNet的複雜度適當的降低到只有兩層結構,並將其命名為MiniVGGNet。以尊重原創。

1. MiniVGGNet的網路結構

第一層:INPUT =>

第二層:CONV => ACT => BN => CONV => ACT =>BN => POOL => DROPOUT =>

第三層:CONV => ACT => BN => CONV => ACT =>BN => POOL => DROPOUT =>

第四層:FC => ACT => BN => DROPOUT =>

第五層:FC => SOFTMAX

MiniVGGNet網路結構
Layer Type Output Size Filter Size / Stride
INPUT IMAGE 32 x 32 x 3
CONV 32 x 32 x 32 3 x 3, K = 32
ACT 32 x 32 x 32
BN 32 x 32 x 32
CONV 32 x 32 x 32 3 x 3, K = 32
ACT 32 x 32 x 32
BN 32 x 32 x 32
POOL 16 x 16 x 32 2 x 2
DROPOUT 16  x 16 x 32
CONV 16  x 16 x 64 3 x 3, K = 64
ACT 16  x 16 x 64
BN 16  x 16 x 64
CONV 16  x 16 x 64 3 x 3, K = 64
ACT 16  x 16 x 64
BN 16  x 16 x 64
POOL 8 x 8 x 64 2 x 2
DROPOUT 8 x 8 x 64
FC 512
ACT 512
BN 5212
DROPOUT 512
FC 10
SOFTMAX 10

VGGNet會不斷重複上述的第2層和第3層,疊加知道總體的層數到達16層到19層之間。這樣大大的增加了網路的複雜性。也增加了訓練所需的消耗。

2. 程式碼

2.1 minivggnet.py

# import the necessary packages
from keras.models import Sequential
from keras.layers.normalization import BatchNormalization
from keras.layers.convolutional import Conv2D
from keras.layers.convolutional import MaxPooling2D
from keras.layers.core import Activation
from keras.layers.core import Flatten
from keras.layers.core import Dropout
from keras.layers.core import Dense
from keras import backend as K

class MiniVGGNet:
	@staticmethod
	def build(width, height, depth, classes):
		# initialize the model along with the input shape to be
		# "channels last" and the channels dimension itself
		model = Sequential()
		inputShape = (height, width, depth)
		chanDim = -1
		
		# if we are using "channels first", update the input shape
		# and channels dimension
		if K.image_data_format() == "channels_first":
			inputShape = (depth, height, width)
			chanDim = 1

		# first CONV => RELU => CONV => RELU => POOL layer set
		model.add(Conv2D(32, (3, 3), padding="same",
			input_shape=inputShape))
		model.add(Activation("relu"))
		model.add(BatchNormalization(axis=chanDim))
		model.add(Conv2D(32, (3, 3), padding="same"))
		model.add(Activation("relu"))
		model.add(BatchNormalization(axis=chanDim))
		model.add(MaxPooling2D(pool_size=(2, 2)))
		model.add(Dropout(0.25))

		# second CONV => RELU => CONV => RELU => POOL layer set
		model.add(Conv2D(64, (3, 3), padding="same"))
		model.add(Activation("relu"))
		model.add(BatchNormalization(axis=chanDim))
		model.add(Conv2D(64, (3, 3), padding="same"))
		model.add(Activation("relu"))
		model.add(BatchNormalization(axis=chanDim))
		model.add(MaxPooling2D(pool_size=(2, 2)))
		model.add(Dropout(0.25))

		# first (and only) set of FC => RELU layers
		model.add(Flatten())
		model.add(Dense(512))
		model.add(Activation("relu"))
		model.add(BatchNormalization())
		model.add(Dropout(0.5))

		# softmax classifier
		model.add(Dense(classes))
		model.add(Activation("softmax"))
		
		# return the constructed network architecture
		return model

2.2 minivggnet_cifar10.py

本程式碼會呼叫keras.datasets的cifar10,當執行cifar10.load_data(),系統會自動下載cifar10資料集。Windows下面會下載到c:\Users\<使用者名稱>\.keras\datasets。

Ubuntu下會自動下載到~/.keras/datasets。我因為已經在Ubuntu18電腦內下載過一次,於是把該tar.gz壓縮檔案複製到datasets資料夾內即可。注意的是,伺服器上,壓縮包名字cifar-10-python.tar.gz。Keras會把檔案改名為cifar-10-batches-py.tar.gz。因此如果自行從伺服器下載的話,需要把名字改為keras所識別的名字。

# set the matplotlib backend so figures can be saved in the background
import matplotlib
matplotlib.use("Agg")

# import the necessary packages
from sklearn.preprocessing import LabelBinarizer
from sklearn.metrics import classification_report
from pyimagesearch.nn.conv import MiniVGGNet
from keras.optimizers import SGD
from keras.datasets import cifar10
import matplotlib.pyplot as plt
import numpy as np
import argparse

# construct the argument parse and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-o", "--output", required=True,
	help="path to the output loss/accuracy plot")
args = vars(ap.parse_args())

# load the training and testing data, then scale it into the
# range [0, 1]
print("[INFO] loading CIFAR-10 data...")
((trainX, trainY), (testX, testY)) = cifar10.load_data()
trainX = trainX.astype("float") / 255.0
testX = testX.astype("float") / 255.0

# convert the labels from integers to vectors
lb = LabelBinarizer()
trainY = lb.fit_transform(trainY)
testY = lb.transform(testY)

# initialize the label names for the CIFAR-10 dataset
labelNames = ["airplane", "automobile", "bird", "cat", "deer",
	"dog", "frog", "horse", "ship", "truck"]

# initialize the optimizer and model
print("[INFO] compiling model...")
opt = SGD(lr=0.01, decay=0.01 / 40, momentum=0.9, nesterov=True)
model = MiniVGGNet.build(width=32, height=32, depth=3, classes=10)
model.compile(loss="categorical_crossentropy", optimizer=opt,
	metrics=["accuracy"])

# train the network
print("[INFO] training network...")
H = model.fit(trainX, trainY, validation_data=(testX, testY),
	batch_size=64, epochs=40, verbose=1)

# evaluate the network
print("[INFO] evaluating network...")
predictions = model.predict(testX, batch_size=64)
print(classification_report(testY.argmax(axis=1),
	predictions.argmax(axis=1), target_names=labelNames))

# plot the training loss and accuracy
plt.style.use("ggplot")
plt.figure()
plt.plot(np.arange(0, 40), H.history["loss"], label="train_loss")
plt.plot(np.arange(0, 40), H.history["val_loss"], label="val_loss")
plt.plot(np.arange(0, 40), H.history["acc"], label="train_acc")
plt.plot(np.arange(0, 40), H.history["val_acc"], label="val_acc")
plt.title("Training Loss and Accuracy on CIFAR-10")
plt.xlabel("Epoch #")
plt.ylabel("Loss/Accuracy")
plt.legend()
plt.savefig(args["output"])

3. 執行結果

執行指令

python minivggnet_cifar10.py --output output/cifar10_minivggnet_with_no_bn.png

經歷了漫長的等待。

我的電腦是i3-6100的桌上型電腦。記憶體為8G。CPU的Tensorflow。一次迭代需要356s。中午差幾分12:00開始跑的,要到15:35左右才完成了40次迭代運算。

Epoch 40/40
   64/50000 [..............................] - ETA: 5:39 - loss: 0.2286 - acc: 0  
   128/50000 [..............................] - ETA: 5:39 - loss: 0.2074 - acc: 0  
   192/50000 [..............................] - ETA: 5:38 - loss: 0.1949 - acc: 0  
   256/50000 [..............................] - ETA: 5:38 - loss: 0.2003 - acc: 0 
   15552/50000 [========>.....................] - ETA: 3:54 - loss: 0.2430 - acc: 0
   48256/50000 [===========================>..] - ETA: 11s - loss: 0.2494 - acc: 0
   49920/50000 [============================>.] - ETA: 0s - loss: 0.2487 - acc: 0.9
   49984/50000 [============================>.] - ETA: 0s - loss: 0.2488 - acc: 0.9
   50000/50000 [==============================] - 356s 7ms/step - loss: 0.2488 - acc: 0.9100 - val_loss: 0.5595 - val_acc: 0.8224
   
[INFO] evaluating network...
             precision    recall  f1-score   support

   airplane       0.83      0.82      0.83      1000
 automobile       0.89      0.93      0.91      1000
       bird       0.74      0.76      0.75      1000
        cat       0.70      0.64      0.67      1000
       deer       0.81      0.78      0.80      1000
        dog       0.75      0.75      0.75      1000
       frog       0.82      0.90      0.86      1000
      horse       0.91      0.84      0.87      1000
       ship       0.88      0.92      0.90      1000
      truck       0.88      0.88      0.88      1000

avg / total       0.82      0.82      0.82     10000

從結果來看,MiniVGGNet在Cifar-10資料集的成績是:平均識別精度達到了82%。跟我之前執行的ShallowNet比起來,高了20個百分點。精度有了很大改進。

程式碼全部來自:《Deep.Learning.for.Computer.Vision.with.Python.Starter.Bundle.2017.9.pdf》。推薦一看。