5.4支援向量機（SVM）演算法（下）應用

阿新 • • 發佈：2019-01-15

利用SVM進行人臉識別例項：

# !/usr/bin/env python
# -*- coding: utf-8 -*-
# Author: YaHuYang

#在python2.x版本中要使用Python3.x的特性,可以使用__future__模組匯入相應的介面，減少對當前低版本影響
#from __future__ import print_function

#計時，程式執行時間
from time import time
#列印程式進展時的一些資訊
import logging
#最後識別出來的人臉通過繪圖打印出來
import matplotlib.pyplot as plt

#當import 一個模組比如如下模組cross_validation時，會有刪除橫線，表示該模組在當前版本可能已經被刪除,在新版本中改為model_selection模組
#DeprecationWarning: This module was deprecated in version 0.18 in favor of the model_selection module into which all the refactored classes and functions are moved. Also note that the interface of the new CV iterators are different from that of this module. This module will be removed in 0.20."This module will be removed in 0.20.", DeprecationWarning)
#from sklearn.cross_validation import train_test_split
from sklearn.model_selection import train_test_split
from sklearn.datasets import fetch_lfw_people
#grid_search已經被移除
#from sklearn.grid_search import GridSearchCV
from sklearn.model_selection import GridSearchCV
from sklearn.metrics import classification_report
from sklearn.decomposition import RandomizedPCA
from sklearn.svm import SVC
#匯入混淆矩陣模組confusion_matrix()
from sklearn.metrics import confusion_matrix

print(__doc__)

# Display progress logs on stdout程式進展的資訊打印出來
logging.basicConfig(level=logging.INFO, format='%(asctime)s %(message)s')

###############################################################################
# Download the data, if not already on disk and load it as numpy arrays
#下載人臉庫 http://vis-www.cs.umass.edu/lfw/

lfw_people = fetch_lfw_people(min_faces_per_person=70, resize=0.4)

# introspect the images arrays to find the shapes (for plotting)
n_samples, h, w = lfw_people.images.shape

# for machine learning we use the 2 data directly (as relative pixel
# positions info is ignored by this model)
#獲取特徵向量矩陣
X = lfw_people.data
#特徵向量的維度(列數)或者稱特徵點的個數
n_features = X.shape[1]

# the label to predict is the id of the person
#返回每一組的特徵標記
y = lfw_people.target
target_names = lfw_people.target_names
#返回多少類(多少行)，也就是多少個人進行人臉識別
n_classes = target_names.shape[0]

print("Total dataset size:")
print("n_samples: %d" % n_samples)
print("n_features: %d" % n_features)
print("n_classes: %d" % n_classes)


###############################################################################
# Split into a training set and a test set using a stratified k fold
# split into a training and testing set
#將資料集拆分成四個部分
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25)


###############################################################################
#PCA降維方法，減少特徵值，降低複雜度。
# Compute a PCA (eigenfaces) on the face dataset (treated as unlabeled
# dataset): unsupervised feature extraction / dimensionality reduction
n_components = 150

print("Extracting the top %d eigenfaces from %d faces" % (n_components, X_train.shape[0]))
t0 = time()
pca = RandomizedPCA(n_components=n_components, whiten=True).fit(X_train)

print("done in %0.3fs" % (time() - t0))

#提取特徵值
eigenfaces = pca.components_.reshape((n_components, h, w))

print("Projecting the input data on the eigenfaces orthonormal basis")
t0 = time()
X_train_pca = pca.transform(X_train)
X_test_pca = pca.transform(X_test)
print("done in %0.3fs" % (time() - t0))


###############################################################################
# Train a SVM classification model

print("Fitting the classifier to the training set")
t0 = time()
param_grid = {'C': [1e3, 5e3, 1e4, 5e4, 1e5],'gamma': [0.0001, 0.0005, 0.001, 0.005, 0.01, 0.1], }
#clf = GridSearchCV(SVC(kernel='rbf', class_weight='auto'), param_grid)
clf = GridSearchCV(SVC(kernel='rbf'), param_grid)
clf = clf.fit(X_train_pca, y_train)
print("done in %0.3fs" % (time() - t0))
print("Best estimator found by grid search:")
print(clf.best_estimator_)


###############################################################################
# Quantitative evaluation of the model quality on the test set

print("Predicting people's names on the test set")
t0 = time()
y_pred = clf.predict(X_test_pca)
print("done in %0.3fs" % (time() - t0))

print(classification_report(y_test, y_pred, target_names=target_names))
print(confusion_matrix(y_test, y_pred, labels=range(n_classes)))


###############################################################################
# Qualitative evaluation of the predictions using matplotlib

def plot_gallery(images, titles, h, w, n_row=3, n_col=4):
    """Helper function to plot a gallery of portraits"""
    plt.figure(figsize=(1.8 * n_col, 2.4 * n_row))
    plt.subplots_adjust(bottom=0, left=.01, right=.99, top=.90, hspace=.35)
    for i in range(n_row * n_col):
        plt.subplot(n_row, n_col, i + 1)
        plt.imshow(images[i].reshape((h, w)), cmap=plt.cm.gray)
        plt.title(titles[i], size=12)
        plt.xticks(())
        plt.yticks(())


# plot the result of the prediction on a portion of the test set

def title(y_pred, y_test, target_names, i):
    pred_name = target_names[y_pred[i]].rsplit(' ', 1)[-1]
    true_name = target_names[y_test[i]].rsplit(' ', 1)[-1]
    return 'predicted: %s\ntrue:      %s' % (pred_name, true_name)

prediction_titles = [title(y_pred, y_test, target_names, i)
                     for i in range(y_pred.shape[0])]

plot_gallery(X_test, prediction_titles, h, w)

# plot the gallery of the most significative eigenfaces

eigenface_titles = ["eigenface %d" % i for i in range(eigenfaces.shape[0])]
plot_gallery(eigenfaces, eigenface_titles, h, w)

plt.show()

結果：Total dataset size:
n_samples: 1288
n_features: 1850
n_classes: 7
Extracting the top 150 eigenfaces from 966 faces
D:\anaconda3\anaconda3install\lib\site-packages\sklearn\utils\deprecation.py:58: DeprecationWarning: Class RandomizedPCA is deprecated; RandomizedPCA was deprecated in 0.18 and will be removed in 0.20. Use PCA(svd_solver='randomized') instead. The new implementation DOES NOT store whiten ``components_``. Apply transform to get them.
warnings.warn(msg, category=DeprecationWarning)
done in 3.302s
Projecting the input data on the eigenfaces orthonormal basis
done in 0.020s
Fitting the classifier to the training set
done in 29.999s
Best estimator found by grid search:
SVC(C=1000.0, cache_size=200, class_weight=None, coef0=0.0,
decision_function_shape='ovr', degree=3, gamma=0.005, kernel='rbf',
max_iter=-1, probability=False, random_state=None, shrinking=True,
tol=0.001, verbose=False)
Predicting people's names on the test set
done in 0.068s
precision recall f1-score support

Ariel Sharon 0.91 0.59 0.71 17
Colin Powell 0.82 0.91 0.86 65
Donald Rumsfeld 0.86 0.67 0.75 27
George W Bush 0.80 0.96 0.87 133
Gerhard Schroeder 0.86 0.58 0.69 31
Hugo Chavez 1.00 0.56 0.72 16
Tony Blair 0.81 0.67 0.73 33

avg / total 0.83 0.82 0.81 322

[[ 10 4 0 3 0 0 0]
[ 0 59 1 5 0 0 0]
[ 1 1 18 7 0 0 0]
[ 0 3 0 128 2 0 0]
[ 0 0 1 8 18 0 4]
[ 0 3 0 3 0 9 1]

[ 0 2 1 7 1 0 22]]

結果：

5.4支援向量機（SVM）演算法（下）應用

5.4支援向量機（SVM）演算法（下）應用

4.支援向量機（SVM）演算法(下）

支援向量機之SVM演算法庫(scikit-learn)（三）

支援向量機數學證明與推導（SVM）

SVM支援向量機-SKlearn實現與繪圖（8）

特徵選擇之支援向量機遞迴特徵消除（SVM-RFE）

支援向量機—SMO論文詳解（序列最小最優化演算法）

支援向量機：SVM

機器學習---演算法---支援向量機---線性SVM--第一部分

機器學習4-支援向量機

人工智障學習筆記——機器學習(4)支援向量機

Python 資料科學手冊 5.7 支援向量機

二分類支援向量機模型SVM知識點詳解

支援向量機從原理到演算法的實現

支援向量機與SMO優化演算法

決策樹、貝葉斯、人工神經網路、K-近鄰、支援向量機等常用分類演算法小結

機器學習實戰【5】（SVM-支援向量機）

SVM支援向量機-《機器學習實戰》SMO演算法Python實現（5）

機器學習實戰（五）支援向量機SVM（Support Vector Machine）

吳恩達機器學習（第十三章）---支援向量機SVM

5.4支援向量機（SVM）演算法（下）應用

相關推薦