Face Recognition with Principal Component Analysis (PCA)

阿新 • Published: 2019-01-07

For details, see https://www.cnblogs.com/xingshansi/p/6445625.html

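1. Concept

Principal component analysis (PCA) is a statistical method. It applies an orthogonal transformation to convert a set of possibly correlated variables into a set of linearly uncorrelated variables; the transformed variables are called principal components.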

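2. Idea

The idea of PCA is to map n-dimensional features onto k dimensions (k < n). These k dimensions are entirely new orthogonal features, called principal components. They are constructed from the original features rather than obtained by simply discarding n - k of the n original features.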
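In symbols, the mapping can be sketched as follows. The symbols x, μ, W, and y are notation introduced here for illustration and do not appear in the original post: x is one sample, μ is the mean of the samples, and the columns of W are the k orthonormal directions.

```latex
y = W^{\top}(x - \mu), \qquad
x, \mu \in \mathbb{R}^{n}, \quad
W = [\, w_1, \dots, w_k \,] \in \mathbb{R}^{n \times k}, \quad
W^{\top} W = I_k
```

Each new feature y_j = w_j^T (x - μ) is a linear combination of all n original features, which is why the principal components are constructed rather than selected.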

3. The PCA computation

Suppose we have a data set: the midterm Chinese and math scores of five students.

	Student 1	Student 2	Student 3	Student 4	Student 5
Chinese score	50	60	70	80	90
Math score	80	82	84	86	88

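We now use PCA to reduce the dimensionality, that is, to reduce this two-dimensional data to one dimension.

The algorithm proceeds as follows:

Step 1: Center the data by subtracting the mean; normalize as well if the application requires it.
Step 2: Compute the covariance matrix.
Step 3: Obtain the eigenvalues and eigenvectors by eigendecomposition or singular value decomposition.
Step 4: Build the projection matrix from the eigenvectors.
Step 5: Apply the projection matrix to obtain the reduced data.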
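Worked by hand on the five students' scores, the steps give the following. This is a sketch that assumes the sample covariance with the 1/(N-1) factor; dropping that factor rescales the eigenvalues but leaves the eigenvectors unchanged. The symbols X_c, C, w_1, and z are notation introduced here.

```latex
\begin{aligned}
\mu &= \begin{bmatrix} 70 \\ 84 \end{bmatrix}, \qquad
X_c = \begin{bmatrix} -20 & -10 & 0 & 10 & 20 \\ -4 & -2 & 0 & 2 & 4 \end{bmatrix} \\
C &= \frac{1}{N-1} X_c X_c^{\top}
   = \frac{1}{4} \begin{bmatrix} 1000 & 200 \\ 200 & 40 \end{bmatrix}
   = \begin{bmatrix} 250 & 50 \\ 50 & 10 \end{bmatrix} \\
\det(C - \lambda I) &= \lambda^{2} - 260\lambda = 0
   \;\Rightarrow\; \lambda_1 = 260, \ \lambda_2 = 0 \\
w_1 &= \frac{1}{\sqrt{26}} \begin{bmatrix} 5 \\ 1 \end{bmatrix} \\
z &= w_1^{\top} X_c
   = \sqrt{26} \begin{bmatrix} -4 & -2 & 0 & 2 & 4 \end{bmatrix}
   \approx \begin{bmatrix} -20.40 & -10.20 & 0 & 10.20 & 20.40 \end{bmatrix}
\end{aligned}
```

The second eigenvalue is zero because the math score is exactly 0.2 times the Chinese score plus 70, so the five points lie on one line and a single component keeps all of the variance. The sign of w_1 is arbitrary; a solver may return -w_1, which flips the sign of z.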

The MATLAB program is as follows:

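clc; clear all; close all;
set(0,'defaultfigurecolor','w');
x = [50 60 70 80 90]; % Chinese scores
y = [80 82 84 86 88]; % math scores
% Plot the raw data
figure()
subplot(1,2,1)
scatter(x,y,'r*','linewidth',5);
xlim([50,100]);
ylim([50,100]);
grid on;
xlabel('Chinese score');
ylabel('Math score');

data = [x;y]; % 2-by-5: one row per subject, one column per student
% Step 1: center the data
mu = mean(data,2); % mean of each row
data(1,:) = data(1,:)-mu(1); % subtract the mean
data(2,:) = data(2,:)-mu(2);
% Step 2: compute the covariance matrix
% (the 1/(N-1) factor scales the eigenvalues but not the eigenvectors)
R = data*data'/(size(data,2)-1);
% Step 3: compute eigenvalues and eigenvectors by eigendecomposition
[V,D] = eig(R);
[EigR,PosR] = sort(diag(D),'descend'); % sort eigenvalues in descending order
VecR = V(:,PosR); % eigenvectors are the columns of V, so reorder the columns
% Step 4: build the projection matrix from the eigenvectors
K = 1; % reduce to one dimension
Proj = VecR(:,1:K)'; % each row of Proj is one principal direction
% Step 5: apply the projection matrix to obtain the reduced data
DataPCA = Proj*data;
x0 = -30:30;
subplot(1,2,2)
scatter(data(1,:),data(2,:),'r*','linewidth',5); hold on;
plot(x0,Proj(2)/Proj(1)*x0,'b','linewidth',3); % draw the projection direction
xlim([-30,30]);
ylim([-30,30]);
grid on;
xlabel('Chinese score');
ylabel('Math score');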

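Result: a two-panel figure. The left panel is a scatter plot of the raw scores; the right panel shows the centered data with the projection direction drawn as a blue line.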
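Step 3 also allows a singular value decomposition in place of the eigendecomposition. A minimal sketch of that variant on the same scores follows, assuming one column per student as in the program above. The variable names Xc, U, S, lambda, and explained are chosen here for illustration.

```matlab
% PCA of the exam scores via SVD instead of eig (illustrative sketch).
x = [50 60 70 80 90];            % Chinese scores
y = [80 82 84 86 88];            % math scores
data = [x; y];                   % 2-by-5, one column per student

% Center each row (feature) by subtracting its mean.
Xc = data - repmat(mean(data, 2), 1, size(data, 2));

% Economy-size SVD: the columns of U are the principal directions,
% already ordered by decreasing singular value.
[U, S, ~] = svd(Xc, 'econ');

% Eigenvalues of the sample covariance are the squared singular
% values divided by (N - 1).
N = size(data, 2);
lambda = diag(S).^2 / (N - 1);

K = 1;                           % keep one component
Proj = U(:, 1:K)';               % K-by-2 projection matrix
DataPCA = Proj * Xc;             % 1-by-5 reduced data

% Fraction of the total variance kept by the first K components.
explained = sum(lambda(1:K)) / sum(lambda);
```

Because svd returns the singular values in decreasing order, no explicit sort is needed. The sign of each column of U is arbitrary, so DataPCA may come out negated relative to the eig-based program.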

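4. MATLAB code for PCA face recognition

I adapted someone else's code; the program below can be run directly to perform face recognition. The face database used is the ORL face database.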
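The EigenfaceCore function in the listing below works with a P-by-P matrix in place of the MN-by-MN covariance matrix. With 112 × 92 images, MN = 10304, while the training set has only P = 200 rows (40 classes × 5 images each). A short derivation of why this substitution is valid, writing A for the P-by-MN matrix of centered faces (the symbols v_i, u_i, and λ_i are notation introduced here):

```latex
\begin{aligned}
L v_i &= A A^{\top} v_i = \lambda_i v_i \\
(A^{\top} A)(A^{\top} v_i) &= A^{\top} (A A^{\top} v_i) = \lambda_i \, (A^{\top} v_i) \\
u_i &= A^{\top} v_i, \qquad
\lVert u_i \rVert^{2} = v_i^{\top} A A^{\top} v_i = \lambda_i \lVert v_i \rVert^{2}
\end{aligned}
```

So each eigenvector v_i of the small matrix yields an eigenvector u_i of A^T A with the same eigenvalue. For a symmetric matrix, eig returns unit-norm v_i, so u_i has length sqrt(λ_i). The listing uses u_i as is; dividing each u_i by its length would give unit-norm eigenfaces if an orthonormal basis is wanted.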

function T = CreateTrainingSet(TrainingSetPath)
% Build the training matrix T (one row per image), then train and
% recognize a single test image.
TrainFiles = dir(TrainingSetPath);
Train_Class_Number = 0; % number of classes; the data set has 40 classes (40 people) with 10 faces each
for i = 1:size(TrainFiles,1)
    % Count only class folders; skip '.', '..' and stray files such as Thumbs.db
    if TrainFiles(i).isdir && ~ismember(TrainFiles(i).name,{'.','..'})
        Train_Class_Number = Train_Class_Number + 1; % number of classes in the training database
    end
end
%%%%%%%%%%%%%%%%%%%%%%%% Construction of 2D matrix from 1D image vectors
T = [];
Each_Class_Train_Num = 5; % use the first 5 faces of each class for training
for i = 1 : Train_Class_Number
    % Class folders are named s1, s2, ...; str is the folder of class i,
    % not yet an image file
    str = fullfile(TrainingSetPath,['s' int2str(i)]);
    for j = 1:Each_Class_Train_Num
        tmpstr = fullfile(str,[int2str(j) '.pgm']);
        img = imread(tmpstr); % read the image
        if ndims(img) > 2 % convert color images to grayscale
            img = rgb2gray(img);
        end
        vecimg = double(reshape(img,1,[])); % flatten to a 1-by-MN row vector
        T = cat(1,T,vecimg);
    end
end
[MeanFace, MeanNormFaces, EigenFaces] = EigenfaceCore(T);
TestImagePath = 'D:\data\copy\att_faces.tar\att_faces\s40\6.pgm'; % a single test face image
OutputName = Recognition(TestImagePath, MeanFace, MeanNormFaces, EigenFaces);
end

function [MeanFace, MeanNormFaces, EigenFaces] = EigenfaceCore(T)
% Revised by Jianzhu Wang  email:jzwangATbjtuDOTeduDOTcn
% Use Principal Component Analysis (PCA) to determine the most discriminating features between images of faces.
% Description: This function takes a 2D matrix containing all training image vectors.

% Input: T is a 2D matrix containing all 1D image vectors. Suppose we
% choose P training images in total, all of the same size M*N. Each training
% image is vectorized into a row vector of length M*N,
% so T is a P-by-MN matrix.
% Output:
%                MeanFace              - (1-by-MN) mean vector of faces
%                MeanNormFaces         - (P-by-MN) matrix of centered image vectors
%                EigenFaces            - (K-by-MN) matrix whose rows are the K leading
%                                        eigenvectors of the covariance matrix of the training database
%% Calculate mean face
MeanFace = mean(T,1); % each row of T corresponds to one face
TrainNumber = size(T,1); % total number of training images
%% Mean-normalize
MeanNormFaces = zeros(size(T));
for i = 1:TrainNumber
    MeanNormFaces(i,:) = double(T(i,:)-MeanFace);
end
%% Recall some linear algebra theory
% For a matrix A, the matrices A*A' and A'*A have the same non-zero
% eigenvalues, and if x is an eigenvector of A*A', then A'*x is an
% eigenvector of A'*A. Note that if x is an eigenvector of a matrix,
% then a*x (a is a non-zero constant) is also an eigenvector of that matrix,
% so the eigenvectors MATLAB returns for A'*A may differ from A'*x by a scale factor.

% Use L = A*A' in place of the covariance matrix C = A'*A to reduce the dimension
L = MeanNormFaces*MeanNormFaces'; % P-by-P (200-by-200 here) instead of MN-by-MN
[E, D] = eig(L); % eigenvalues and eigenvectors
% Sort eigenvalues and the corresponding eigenvectors in descending order
% (an explicit sort replaces wrev, which requires the Wavelet Toolbox)
[eigenValue, order] = sort(diag(D),'descend');
% According to the eigenvector relationship between A*A' and A'*A
eigenVector = MeanNormFaces'*E(:,order);

SumOfAllEigenValue = sum(eigenValue);
TmpSumOfEigenValue = 0;
for i = 1:numel(eigenValue)
    TmpSumOfEigenValue = TmpSumOfEigenValue+eigenValue(i);
    ChooseEigenValueNum = i;
    if (TmpSumOfEigenValue/SumOfAllEigenValue > 0.85) % cumulative contribution exceeds 85%
        break;
    end
end
EigenFaces = eigenVector(:,1:ChooseEigenValueNum)'; % one eigenface per row
end
%%
function OutputName = Recognition(TestImagePath, MeanFace, MeanNormFaces, EigenFaces)
% Description: This function compares two faces by projecting the images into facespace and
% measuring the Euclidean distance between them.
% Input:         TestImagePath          - Path of test face image
%                MeanFace               -(1*MN) mean vector, which is one
%                                         of the outputs of 'EigenfaceCore.m'
%
%                MeanNormFaces          -(P*MN) matrix in which each row
%                                         is a mean-normalized
%                                         face, which is one of the outputs
%                                         of 'EigenfaceCore.m'
%                EigenFaces             -(K*MN) matrix with one eigenface per row,
%                                         which is one of the outputs of 'EigenfaceCore.m'
%%%%%%%%%%%%%%%%%%%%%%%% Projecting centered image vectors into facespace
% All centered images are projected into facespace by multiplying by the
% eigenface basis. The projected vector of each face is its corresponding
% feature vector.

ProjectedImages = [];
% I think here should be the number of centered training faces rather than
% number of eigenfaces
Train_Number = size(MeanNormFaces,1);
for i = 1 : Train_Number
    temp = (EigenFaces*MeanNormFaces(i,:)')'; % Projection of centered images into facespace
    ProjectedImages(i,:) = temp; % each row is the feature vector of one training face
end
%%%%%%%%%%%%%%%%%%%%%%%% Extracting the PCA features from test image
InputImage = imread(TestImagePath);
VecInput = reshape(InputImage,1,[]);
MeanNormInput = double(VecInput)-MeanFace; % Centered test image
ProjectedTestImage = (EigenFaces*MeanNormInput')'; % Test image feature vector

%%%%%%%%%%%%%%%%%%%%%%%% Calculating Euclidean distances
% Euclidean distances between the projected test image and the projection
% of all centered training images are calculated. The test image is
% expected to have the minimum distance to its corresponding image in the
% training database.
Euc_dist = zeros(1,Train_Number);
for i = 1 : Train_Number
    Euc_dist(i) = norm(ProjectedTestImage - ProjectedImages(i,:))^2;
end
[Euc_dist_min, Recognized_index] = min(Euc_dist);
% Training rows are stored class by class, 5 per class (Each_Class_Train_Num
% in CreateTrainingSet). floor is required here: int2str rounds to the nearest
% integer, so rows 4 and 5 of a class would otherwise map to the next class.
Recognized_class = floor((Recognized_index-1)/5)+1;
OutputName = strcat('s',int2str(Recognized_class),'class');
figure,
subplot(121);
imshow(InputImage,[]);
title('Input face');
subplot(122);
imshow(reshape(MeanNormFaces(Recognized_index,:)+MeanFace,size(InputImage,1),size(InputImage,2)),[]);
title(sprintf('Most similar face, class: %d',Recognized_class));
end
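The recognized row index can be mapped back to a class label because the training matrix is filled class by class, five rows per class. A minimal sketch of that mapping follows; the names perClass, idx, classId, and rounded are chosen here for illustration.

```matlab
% Map training-row indices to class labels (illustrative sketch).
perClass = 5;                          % training images per class, as in CreateTrainingSet
idx = 1:10;                            % row indices covering the first two classes
classId = floor((idx - 1) / perClass) + 1;   % rows 1-5 give 1, rows 6-10 give 2

% int2str rounds to the nearest integer, so applying it to the
% unfloored value mislabels row 4: (4 - 1)/5 + 1 = 1.6 becomes '2'.
rounded = str2double(int2str((4 - 1) / perClass + 1));
```

Rows 4 and 5 of every class are the ones affected by rounding, since their fractional parts (0.6 and 0.8) round up to the next class.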
