
Applying the Support Vector Machine (SVM) Algorithm: Face Recognition

Environment: Python 3.6.4

The corrections below were made against the original Python 2.7 version of the code. Running that code under this environment raises the following warnings and errors:

Class RandomizedPCA is deprecated; RandomizedPCA was deprecated in 0.18 and will be removed in 0.20. Use PCA(svd_solver='randomized') instead. The new implementation DOES NOT store whiten ``components_``. Apply transform to get them.
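
The fix is the one the warning itself suggests: switch from the removed RandomizedPCA class to PCA and, if the randomized solver is still wanted, pass svd_solver='randomized'. A minimal before/after sketch (the full code below simply calls PCA(n_components=..., whiten=True) and lets scikit-learn pick the solver; X_train is the training matrix defined there):

# Old (scikit-learn < 0.18):
# from sklearn.decomposition import RandomizedPCA
# pca = RandomizedPCA(n_components=150, whiten=True).fit(X_train)

# New: the same behaviour via PCA with the randomized SVD solver
from sklearn.decomposition import PCA
pca = PCA(n_components=150, svd_solver='randomized', whiten=True).fit(X_train)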

ValueError: class_weight must be dict, 'balanced', or None, got: 'auto'
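
Newer scikit-learn versions no longer accept the string 'auto' for class_weight; pass a dict, 'balanced', or omit the argument. A minimal sketch of the change (the full code below simply drops the argument):

# Old argument that raises the ValueError in newer scikit-learn:
# SVC(kernel='rbf', class_weight='auto')

# Accepted values today: a dict, 'balanced', or None (omit the argument)
from sklearn.svm import SVC
clf_balanced = SVC(kernel='rbf', class_weight='balanced')
clf_default = SVC(kernel='rbf')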

DeprecationWarning: `imread` is deprecated!
`imread` is deprecated in SciPy 1.0.0, and will be removed in 1.2.0.
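
scipy.misc.imread was removed in later SciPy releases; reading images with Pillow (already imported in the code below) avoids the warning. A minimal sketch, assuming a hypothetical image file face.jpg:

# Old (SciPy < 1.2):
# from scipy.misc import imread
# img = imread('face.jpg')

# Replacement using Pillow + numpy ('face.jpg' is only a placeholder name)
import numpy as np
from PIL import Image
img = np.array(Image.open('face.jpg'))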

The code is as follows:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Author: Tanghong

# Under Python 2.x, Python 3.x features can be enabled by importing them from
# the __future__ module, minimizing the impact on the older interpreter.
# from __future__ import print_function

# time(): measure how long each step of the program takes
from time import time
# logging: print progress information while the program runs
import logging
# matplotlib: plot the recognized faces at the end
import matplotlib.pyplot as plt
# DeprecationWarning: `imread` is deprecated! `imread` is deprecated in SciPy
# 1.0.0, and will be removed in 1.2.0.
from PIL import Image
from scipy import ndimage
# When a module such as cross_validation is imported, the IDE shows it struck
# through: the module is deprecated in the current version and has been
# replaced by model_selection in newer releases.
# DeprecationWarning: This module was deprecated in version 0.18 in favor of
# the model_selection module into which all the refactored classes and
# functions are moved. Also note that the interface of the new CV iterators
# is different from that of this module. This module will be removed in 0.20.
# from sklearn.cross_validation import train_test_split
from sklearn.model_selection import train_test_split
from sklearn.datasets import fetch_lfw_people
# grid_search has been removed
# from sklearn.grid_search import GridSearchCV
from sklearn.model_selection import GridSearchCV
from sklearn.metrics import classification_report
# Class RandomizedPCA is deprecated; RandomizedPCA was deprecated in 0.18 and
# will be removed in 0.20. Use PCA(svd_solver='randomized') instead. The new
# implementation DOES NOT store whiten ``components_``. Apply transform to get
# them.
from sklearn.decomposition import PCA
from sklearn.svm import SVC
# Import the confusion matrix helper confusion_matrix()
from sklearn.metrics import confusion_matrix

print(__doc__)

# Display progress logs on stdout
logging.basicConfig(level=logging.INFO, format='%(asctime)s %(message)s')

###############################################################################
# Download the data, if not already on disk, and load it as numpy arrays
# Face database: http://vis-www.cs.umass.edu/lfw/
lfw_people = fetch_lfw_people(min_faces_per_person=70, resize=0.4)

# introspect the images arrays to find the shapes (for plotting)
n_samples, h, w = lfw_people.images.shape

# for machine learning we use the 2 data directly (as relative pixel
# positions info is ignored by this model)
# Feature matrix: one row of pixel values per image
X = lfw_people.data
# Number of features (columns), i.e. the number of pixels per sample
n_features = X.shape[1]

# the label to predict is the id of the person
y = lfw_people.target
target_names = lfw_people.target_names
# Number of classes (rows of target_names), i.e. how many people to recognize
n_classes = target_names.shape[0]

print("Total dataset size:")
print("n_samples: %d" % n_samples)
print("n_features: %d" % n_features)
print("n_classes: %d" % n_classes)

###############################################################################
# Split into a training set and a test set using a stratified k fold
# Split the dataset into four parts: train/test features and labels
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25)

###############################################################################
# PCA dimensionality reduction: fewer features, lower complexity.
# Compute a PCA (eigenfaces) on the face dataset (treated as unlabeled
# dataset): unsupervised feature extraction / dimensionality reduction
n_components = 150

print("Extracting the top %d eigenfaces from %d faces"
      % (n_components, X_train.shape[0]))
t0 = time()
pca = PCA(n_components=n_components, whiten=True).fit(X_train)
print("done in %0.3fs" % (time() - t0))

# Extract the eigenfaces
eigenfaces = pca.components_.reshape((n_components, h, w))

print("Projecting the input data on the eigenfaces orthonormal basis")
t0 = time()
X_train_pca = pca.transform(X_train)
X_test_pca = pca.transform(X_test)
print("done in %0.3fs" % (time() - t0))

###############################################################################
# Train a SVM classification model
print("Fitting the classifier to the training set")
t0 = time()
param_grid = {'C': [1e3, 5e3, 1e4, 5e4, 1e5],
              'gamma': [0.0001, 0.0005, 0.001, 0.005, 0.01, 0.1], }
# clf = GridSearchCV(SVC(kernel='rbf', class_weight='auto'), param_grid)
clf = GridSearchCV(SVC(kernel='rbf'), param_grid)
clf = clf.fit(X_train_pca, y_train)
print("done in %0.3fs" % (time() - t0))
print("Best estimator found by grid search:")
print(clf.best_estimator_)

###############################################################################
# Quantitative evaluation of the model quality on the test set
print("Predicting people's names on the test set")
t0 = time()
y_pred = clf.predict(X_test_pca)
print("done in %0.3fs" % (time() - t0))

print(classification_report(y_test, y_pred, target_names=target_names))
print(confusion_matrix(y_test, y_pred, labels=range(n_classes)))

###############################################################################
# Qualitative evaluation of the predictions using matplotlib
def plot_gallery(images, titles, h, w, n_row=3, n_col=4):
    """Helper function to plot a gallery of portraits"""
    plt.figure(figsize=(1.8 * n_col, 2.4 * n_row))
    plt.subplots_adjust(bottom=0, left=.01, right=.99, top=.90, hspace=.35)
    for i in range(n_row * n_col):
        plt.subplot(n_row, n_col, i + 1)
        plt.imshow(images[i].reshape((h, w)), cmap=plt.cm.gray)
        plt.title(titles[i], size=12)
        plt.xticks(())
        plt.yticks(())


# plot the result of the prediction on a portion of the test set
def title(y_pred, y_test, target_names, i):
    pred_name = target_names[y_pred[i]].rsplit(' ', 1)[-1]
    true_name = target_names[y_test[i]].rsplit(' ', 1)[-1]
    return 'predicted: %s\ntrue: %s' % (pred_name, true_name)


prediction_titles = [title(y_pred, y_test, target_names, i)
                     for i in range(y_pred.shape[0])]

plot_gallery(X_test, prediction_titles, h, w)

# plot the gallery of the most significative eigenfaces
eigenface_titles = ["eigenface %d" % i for i in range(eigenfaces.shape[0])]
plot_gallery(eigenfaces, eigenface_titles, h, w)

plt.show()

The output is as follows:

None
Total dataset size:
n_samples: 1288
n_features: 1850
n_classes: 7
Extracting the top 150 eigenfaces from 966 faces
done in 0.862s
Projecting the input data on the eigenfaces orthonormal basis
done in 0.062s
Fitting the classifier to the training set
done in 34.029s
Best estimator found by grid search:
SVC(C=1000.0, cache_size=200, class_weight=None, coef0=0.0,
  decision_function_shape='ovr', degree=3, gamma=0.005, kernel='rbf',
  max_iter=-1, probability=False, random_state=None, shrinking=True,
  tol=0.001, verbose=False)
Predicting people's names on the test set
done in 0.079s
                   precision    recall  f1-score   support

     Ariel Sharon       1.00      0.63      0.77        19
     Colin Powell       0.74      0.92      0.82        64
  Donald Rumsfeld       0.91      0.70      0.79        30
    George W Bush       0.82      0.92      0.87       130
Gerhard Schroeder       0.92      0.80      0.86        30
      Hugo Chavez       1.00      0.55      0.71        22
       Tony Blair       0.91      0.78      0.84        27

      avg / total       0.85      0.84      0.83       322

[[ 12   4   0   3   0   0   0]
 [  0  59   0   5   0   0   0]
 [  0   2  21   6   0   0   1]
 [  0   8   2 120   0   0   0]
 [  0   2   0   3  24   0   1]
 [  0   4   0   5   1  12   0]
 [  0   1   0   4   1   0  21]]

Figure: downloaded face images (test set with predicted names)
Figure: dimensionality-reduction plot (eigenfaces)