
Computer Vision (5): Classifying the Cifar-10 Dataset with an SVM

1 - Introduction

In the previous post we classified the Cifar-10 dataset with K-NN and got an accuracy of under 30%, which at least beats the 10% of random guessing. This time we will learn to classify Cifar-10 with an SVM classifier, although the accuracy will again not be very high.

To push the accuracy further, we need to preprocess the images and extract features rather than feed in whole raw images. Looking at the development of computer vision, before deep learning appeared, traditional image recognition relied on a feature extraction plus classifier pipeline. Although such methods are less accurate than deep learning, they are still worth learning and mastering.

2 - Preparation

  1. Create the project.
    Since the dataset is still Cifar-10, the project structure is the same as for K-NN (the data loading and preprocessing are sketched below).
  2. Create linear_svm.py inside the classifiers folder.
  3. Create SVM.py to run the experiment.
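As in the K-NN post, each image is flattened into a row vector, the mean image is subtracted, and a constant bias dimension is appended, which is why the weight matrices below have 3073 = 32*32*3 + 1 rows. A minimal sketch of this preprocessing, assuming the CS231n load_CIFAR10 helper and a dataset path that may differ on your machine:

import numpy as np
from cs231n.data_utils import load_CIFAR10

# Load the raw CIFAR-10 data (adjust the path to your own layout)
X_train, y_train, X_test, y_test = load_CIFAR10('cs231n/datasets/cifar-10-batches-py')

# Split off a validation set and a small dev set for quick experiments
X_val, y_val = X_train[49000:], y_train[49000:]
X_train, y_train = X_train[:49000], y_train[:49000]
mask = np.random.choice(X_train.shape[0], 500, replace=False)
X_dev, y_dev = X_train[mask], y_train[mask]  # fancy indexing copies the data

# Flatten each 32x32x3 image into a 3072-dimensional row vector
X_train = X_train.reshape(X_train.shape[0], -1)
X_val = X_val.reshape(X_val.shape[0], -1)
X_dev = X_dev.reshape(X_dev.shape[0], -1)
X_test = X_test.reshape(X_test.shape[0], -1)

# Subtract the mean image and append a constant 1 column (the bias trick)
mean_image = np.mean(X_train, axis=0)
X_train = np.hstack([X_train - mean_image, np.ones((X_train.shape[0], 1))])
X_val = np.hstack([X_val - mean_image, np.ones((X_val.shape[0], 1))])
X_dev = np.hstack([X_dev - mean_image, np.ones((X_dev.shape[0], 1))])
X_test = np.hstack([X_test - mean_image, np.ones((X_test.shape[0], 1))])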

3 - Detailed Steps

The linear classifier is

$$f(x_i; W) = W x_i$$
The hinge loss is:

$$L_i = \sum_{j \neq y_i} \max(0,\ S_j - S_{y_i} + \Delta) + \frac{\lambda}{2} \lVert W \rVert^2$$
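For concreteness, suppose one sample has scores $S = [3.2,\ 5.1,\ -1.7]$ for three classes, the correct class is the first one, and $\Delta = 1$ (made-up numbers). Ignoring regularization:

$$L_i = \max(0,\ 5.1 - 3.2 + 1) + \max(0,\ -1.7 - 3.2 + 1) = 2.9 + 0 = 2.9$$

Only the second class violates the margin, so only it contributes to the loss.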

We use gradient descent to minimize the loss. For each margin-violating term, the gradient with respect to the columns of $W$ is:

$$j = y_i:\ \frac{\partial L_{ij}}{\partial w_j} = -x_i \qquad\qquad j \neq y_i:\ \frac{\partial L_{ij}}{\partial w_j} = x_i$$

Based on this idea, we can write the function svm_loss_naive:

import numpy as np

def svm_loss_naive(W, X, y, reg):
  """
  SVM loss function, implemented with explicit loops.
  Inputs have dimension D, there are C classes, and we operate on a minibatch of N samples.
  Inputs:
  - W: a numpy array of shape (D, C) containing weights
  - X: a numpy array of shape (N, D) containing a minibatch of data
  - y: a numpy array of shape (N,) containing training labels
  - reg: float, regularization strength

  Returns:
  - loss: the value of the loss function
  - dW: the gradient with respect to W, an array of the same shape as W
  """
  dW = np.zeros(W.shape) # initialize the gradient as zero

  # compute the loss and the gradient
  num_classes = W.shape[1]
  num_train = X.shape[0]
  loss = 0.0
  for i in range(num_train):
    scores = X[i].dot(W)
    correct_class_score = scores[y[i]]

    for j in range(num_classes):
      if j == y[i]:
        continue
      margin = scores[j] - correct_class_score + 1 # note delta = 1
      if margin > 0:
        loss += margin
        # gradient contribution for columns j != y_i
        dW[:, j] += X[i]
        # gradient contribution for the correct class column j = y_i
        dW[:, y[i]]+=(-X[i])

  # Right now the loss is a sum over all training examples, but we want it
  # to be an average instead so we divide by num_train.
  loss /= num_train
  dW /= num_train
  # Add regularization to the loss.
  loss += 0.5*reg * np.sum(W * W)
  dW += reg * W

  return loss, dW
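Before using the gradient it is worth sanity-checking the analytic result against a numerical one. The helper below is our own minimal sketch, written for illustration (the CS231n assignment ships a similar grad_check_sparse utility):

import numpy as np

def numerical_gradient_check(f, W, analytic_grad, num_checks=10, h=1e-5):
    """Compare analytic_grad against centered finite differences of f
    at a few randomly chosen coordinates of W (W is modified and restored)."""
    for _ in range(num_checks):
        ix = tuple(np.random.randint(n) for n in W.shape)
        oldval = W[ix]
        W[ix] = oldval + h
        fxph = f(W)              # f(W + h)
        W[ix] = oldval - h
        fxmh = f(W)              # f(W - h)
        W[ix] = oldval           # restore
        grad_numerical = (fxph - fxmh) / (2 * h)
        grad_analytic = analytic_grad[ix]
        rel_error = (abs(grad_numerical - grad_analytic)
                     / (abs(grad_numerical) + abs(grad_analytic) + 1e-12))
        print('numerical: %f analytic: %f, relative error: %e'
              % (grad_numerical, grad_analytic, rel_error))

# Usage, reusing W, X_dev and y_dev from the experiment below:
loss, grad = svm_loss_naive(W, X_dev, y_dev, 0.0)
numerical_gradient_check(lambda w: svm_loss_naive(w, X_dev, y_dev, 0.0)[0], W, grad)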

However, this loop-based implementation is inefficient. Using vectorization, we can write a version without explicit loops:

def svm_loss_vectorized(W, X, y, reg):
  """
  Structured SVM loss function, vectorized implementation.
  Inputs and outputs are the same as for svm_loss_naive.
  """
  loss = 0.0
  dW = np.zeros(W.shape) # initialize the gradient as zero

  # Compute the structured SVM loss, storing the result in loss
  scores = X.dot(W)

  num_classes = W.shape[1]
  num_train = X.shape[0]

  scores_correct = scores[np.arange(num_train), y]
  scores_correct = np.reshape(scores_correct, (num_train, -1))
  margins = scores - scores_correct + 1
  margins = np.maximum(0, margins)
  margins[np.arange(num_train), y] = 0
  loss += np.sum(margins) / num_train
  loss += 0.5 * reg * np.sum(W * W)

  """
  使用向量計算結構化SVM損失函式的梯度,把結果儲存在dW
  """
  margins[margins > 0] = 1
  row_sum = np.sum(margins, axis=1)  # 1 by N
  margins[np.arange(num_train), y] = -row_sum
  dW += np.dot(X.T, margins) / num_train + reg * W

  return loss, dW
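The gradient computation above is compact, so it helps to watch what happens to margins on a toy case (made-up numbers): every margin-violating class contributes $+x_i$ to its own column of dW, and the correct class accumulates $-x_i$ once per violation.

import numpy as np

# Toy example: 2 samples, 3 classes; margins already thresholded at 0
margins = np.array([[0.0, 2.9, 0.0],   # sample 0, correct class 0
                    [1.3, 0.0, 0.5]])  # sample 1, correct class 1
y = np.array([0, 1])

margins[margins > 0] = 1                # violating classes get coefficient +1
row_sum = np.sum(margins, axis=1)       # violations per sample: [1., 2.]
margins[np.arange(2), y] = -row_sum     # correct class gets -(#violations)
print(margins)
# [[-1.  1.  0.]
#  [ 1. -2.  1.]]
# dW is then X.T.dot(margins) / num_train (+ reg * W)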

We can verify that the two implementations agree, and compare their run times:

from cs231n.classifiers.linear_svm import svm_loss_naive
import time
# generate a small random SVM weight matrix
W = np.random.randn(3073, 10) * 0.0001
tic = time.time()
loss_naive, grad_naive = svm_loss_naive(W, X_dev, y_dev, 0.000005)
toc = time.time()
print('Naive loss: %e computed in %fs' % (loss_naive, toc - tic))

from cs231n.classifiers.linear_svm import svm_loss_vectorized
tic = time.time()
loss_vectorized, _ = svm_loss_vectorized(W, X_dev, y_dev, 0.000005)
toc = time.time()
print('Vectorized loss: %e computed in %fs' % (loss_vectorized, toc - tic))

# The losses should match but your vectorized implementation should be much faster.
print('difference: %f' % (loss_naive - loss_vectorized))

As expected, the vectorized version is clearly faster than the plain loop version, and the computed losses are identical:

Naive loss: 8.846791e+00 computed in 0.157097s
Vectorized loss: 8.846791e+00 computed in 0.006006s
difference: -0.000000
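The gradients should match as well; a quick check using the Frobenius norm of their difference, reusing W, X_dev, and y_dev from above (this norm-based comparison is our own addition):

_, grad_naive = svm_loss_naive(W, X_dev, y_dev, 0.000005)
_, grad_vectorized = svm_loss_vectorized(W, X_dev, y_dev, 0.000005)
difference = np.linalg.norm(grad_naive - grad_vectorized, ord='fro')
print('gradient difference: %f' % difference)  # should be (near) zero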

With the loss and gradient in hand, we use SGD (stochastic gradient descent) to minimize the loss:

    def train(self, X, y, learning_rate=1e-3, reg=1e-5, num_iters=100,
              batch_size=200, verbose=False):
        """
         使用隨機梯度下降來訓練這個分類器
         輸入:
         -X :一個numpy陣列,維數為(N,D)
         -Y : 一個numpy陣列,維數為(N,)
         -learning rate: float ,優化的學習率
         -reg : float,正則化強度
         -num_iters: integer, 優化時訓練的步數
         -batch_size:integer, 每一步使用的訓練樣本數
         -ver bose : boolean, 若為真,優化時列印過程

         輸出:
         一個儲存每次訓練的損失函式值的List
        """
        num_train, dim = X.shape
        num_classes = np.max(y) + 1 # assume y takes values 0...K-1, where K is the number of classes

        if self.W is None:
            # lazily initialize W with small random values
            self.W = 0.001 * np.random.randn(dim,num_classes)

        # Run stochastic gradient descent to optimize W
        loss_history = []
        for it in range(num_iters):
            X_batch = None
            y_batch = None
            """
            從訓練集中取樣batch_size個樣本和對應的標籤,在這一輪梯度下降中使用。
            把資料儲存在X_batch中,把對應的標籤儲存在y_batch中
            取樣後,X_batch的形狀為(dim,batch_size),y_batch的形狀為(batch_size,)
            """
            batch_inx = np.random.choice(num_train,batch_size)
            X_batch = X[batch_inx,:]
            y_batch = y[batch_inx]

            loss,grad = self.loss(X_batch, y_batch,reg)
            loss_history.append(loss)

            """
            使用梯度和學習率更新權重
            """
            self.W = self.W - learning_rate * grad
            if verbose and it % 100 == 0:
                print('iteration %d / %d: loss %f' % (it,num_iters,loss))

        return loss_history
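The train method above belongs to a generic LinearClassifier base class and calls self.loss, which each subclass must supply. In the CS231n scaffold, LinearSVM simply delegates to svm_loss_vectorized; roughly:

class LinearSVM(LinearClassifier):
    """A linear classifier that uses the multiclass SVM (hinge) loss."""

    def loss(self, X_batch, y_batch, reg):
        return svm_loss_vectorized(self.W, X_batch, y_batch, reg)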

Now we can train the weights to minimize the loss function:

from cs231n.classifiers.linear_classifier import LinearSVM
import time
svm = LinearSVM()
tic = time.time()
loss_hist = svm.train(X_train, y_train, learning_rate=1e-7, reg=2.5e4,
                      num_iters=1500, verbose=True)
toc = time.time()
print('That took %fs' % (toc - tic))

# A useful debugging strategy is to plot the loss as a function of
# iteration number:
plt.plot(loss_hist)
plt.xlabel('Iteration number')
plt.ylabel('Loss value')
plt.show()


(Figure: plot of loss value versus iteration number.)

Then we can measure our accuracy on the training and validation sets using the predict function:

    def predict(self, X):
        """
        Use the trained weights of this linear classifier to predict labels for
        data points.
        Inputs:
        - X: A numpy array of shape (N, D) containing training data; there are N
          training samples each of dimension D.
        Returns:
        - y_pred: Predicted labels for the data in X. y_pred is a 1-dimensional
          array of length N, and each element is an integer giving the predicted
          class.
        """
        y_pred = np.zeros(X.shape[0])
        scores = X.dot(self.W)
        y_pred = np.argmax(scores, axis=1)
        return y_pred
y_train_pred = svm.predict(X_train)
print('training accuracy: %f' % (np.mean(y_train == y_train_pred), ))
y_val_pred = svm.predict(X_val)
print('validation accuracy: %f' % (np.mean(y_val == y_val_pred), ))

training accuracy: 0.430408
validation accuracy: 0.358000

We get about 43% accuracy on the training set and about 35% on the validation set. To improve this, we can use the validation set to tune the hyperparameters (regularization strength and learning rate):

from cs231n.classifiers.linear_classifier import LinearSVM
import time

"""
使用驗證集去調整超引數(正則化和學習率)
"""
learning_rates = [2e-7,0.75e-7,1.5e-7,1.25e-7,0.75e-7]
regularation_strengths = [3e4,3.25e4,3.5e4,3.75e4,4e4,4.25e4,4.75e4,5e4]

"""
結果是一個詞典,將形成(learning_rate,regularization_strength)的陣列
"""
results = {}
best_val = -1 #出現的正確率最大值
best_svm = None #達到正確率最大值的SVM物件

"""
通過驗證集選擇最佳超引數,對於每一個超引數的組合在訓練集訓練一個線性SVM,
在訓練集和測試集上計算它的準確度,然後在字典裡儲存這些值,另外,在best_val中
儲存最好的驗證集準確率,在best_svm中儲存達到這個最佳值的SVM物件
"""
for rate in learning_rates:
    for regular in regularation_strengths:
        svm = LinearSVM()
        svm.train(X_train,y_train,learning_rate=rate,reg=regular,num_iters=1000)
        y_train_pred = svm.predict(X_train)
        accuracy_train = np.mean(y_train==y_train_pred)
        y_val_pred = svm.predict(X_val)
        accuracy_val = np.mean(y_val==y_val_pred)
        results[(rate,regular)]= (accuracy_train,accuracy_val)
        if best_val < accuracy_val:
            best_val = accuracy_val
            best_svm = svm
for lr, reg in sorted(results):
    train_accuracy, val_accuracy = results[(lr,reg)]
    print('lr %e reg %e train accuracy: %f val accuracy: %f' % (lr, reg, train_accuracy, val_accuracy))
print('best validation accuracy achieved during cross-validation: %f' % best_val)

The output is as follows:

lr 7.500000e-08 reg 3.000000e+04 train accuracy: 0.418571 val accuracy: 0.351000
lr 7.500000e-08 reg 3.250000e+04 train accuracy: 0.415714 val accuracy: 0.354000
lr 7.500000e-08 reg 3.500000e+04 train accuracy: 0.415918 val accuracy: 0.371000
lr 7.500000e-08 reg 3.750000e+04 train accuracy: 0.418980 val accuracy: 0.364000
lr 7.500000e-08 reg 4.000000e+04 train accuracy: 0.415306 val accuracy: 0.368000
lr 7.500000e-08 reg 4.250000e+04 train accuracy: 0.414082 val accuracy: 0.360000
lr 7.500000e-08 reg 4.750000e+04 train accuracy: 0.417143 val accuracy: 0.369000
lr 7.500000e-08 reg 5.000000e+04 train accuracy: 0.411429 val accuracy: 0.366000
lr 1.250000e-07 reg 3.000000e+04 train accuracy: 0.419184 val accuracy: 0.363000
lr 1.250000e-07 reg 3.250000e+04 train accuracy: 0.419388 val accuracy: 0.367000
lr 1.250000e-07 reg 3.500000e+04 train accuracy: 0.421020 val accuracy: 0.357000
lr 1.250000e-07 reg 3.750000e+04 train accuracy: 0.415510 val accuracy: 0.355000
lr 1.250000e-07 reg 4.000000e+04 train accuracy: 0.417347 val accuracy: 0.364000
lr 1.250000e-07 reg 4.250000e+04 train accuracy: 0.417551 val accuracy: 0.360000
lr 1.250000e-07 reg 4.750000e+04 train accuracy: 0.419796 val accuracy: 0.360000
lr 1.250000e-07 reg 5.000000e+04 train accuracy: 0.406735 val accuracy: 0.368000
lr 1.500000e-07 reg 3.000000e+04 train accuracy: 0.431837 val accuracy: 0.361000
lr 1.500000e-07 reg 3.250000e+04 train accuracy: 0.422653 val accuracy: 0.361000
lr 1.500000e-07 reg 3.500000e+04 train accuracy: 0.410816 val accuracy: 0.356000
lr 1.500000e-07 reg 3.750000e+04 train accuracy: 0.411837 val accuracy: 0.353000
lr 1.500000e-07 reg 4.000000e+04 train accuracy: 0.412653 val accuracy: 0.344000
lr 1.500000e-07 reg 4.250000e+04 train accuracy: 0.406531 val accuracy: 0.362000
lr 1.500000e-07 reg 4.750000e+04 train accuracy: 0.410816 val accuracy: 0.355000
lr 1.500000e-07 reg 5.000000e+04 train accuracy: 0.397347 val accuracy: 0.354000
lr 2.000000e-07 reg 3.000000e+04 train accuracy: 0.422245 val accuracy: 0.363000
lr 2.000000e-07 reg 3.250000e+04 train accuracy: 0.411224 val accuracy: 0.353000
lr 2.000000e-07 reg 3.500000e+04 train accuracy: 0.410816 val accuracy: 0.351000
lr 2.000000e-07 reg 3.750000e+04 train accuracy: 0.409592 val accuracy: 0.356000
lr 2.000000e-07 reg 4.000000e+04 train accuracy: 0.409184 val accuracy: 0.347000
lr 2.000000e-07 reg 4.250000e+04 train accuracy: 0.397347 val accuracy: 0.342000
lr 2.000000e-07 reg 4.750000e+04 train accuracy: 0.408571 val accuracy: 0.361000
lr 2.000000e-07 reg 5.000000e+04 train accuracy: 0.404082 val accuracy: 0.358000
best validation accuracy achieved during cross-validation: 0.371000

Within this hyperparameter grid the best validation accuracy is 0.371, about 2% higher than before. Trying more hyperparameter combinations can push this to around 40%.
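As a final step, the model that achieved the best validation accuracy can be evaluated once on the test set (assuming X_test and y_test were preprocessed the same way as the training data):

y_test_pred = best_svm.predict(X_test)
test_accuracy = np.mean(y_test == y_test_pred)
print('linear SVM on raw pixels: final test set accuracy: %f' % test_accuracy)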