Face Recognition with the Principal Component Analysis (PCA) Algorithm
阿新 • Published: 2019-01-07
For more detail, see https://www.cnblogs.com/xingshansi/p/6445625.html
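I. Concept
Principal component analysis (PCA) is a statistical method. It applies an orthogonal transformation to convert a set of possibly correlated variables into a set of linearly uncorrelated variables. The transformed variables are called principal components.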
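II. Idea
The idea of PCA is to map n-dimensional features onto k dimensions (k<n). These k dimensions are entirely new orthogonal features, called principal components. They are constructed from the data, not obtained by simply discarding n-k of the original n features.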
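One way to write this mapping down is shown below. The symbols are notation introduced here for illustration and do not come from the original post: x is an n-dimensional sample, μ is the mean of the samples, and the columns of W_k are the k orthonormal directions that are kept.

```latex
z = W_k^{\top}(x - \mu), \qquad
W_k = [\, w_1 \;\; w_2 \;\; \cdots \;\; w_k \,] \in \mathbb{R}^{n \times k}, \qquad
W_k^{\top} W_k = I_k

z_j = w_j^{\top}(x - \mu) = \sum_{i=1}^{n} w_{ij}\,(x_i - \mu_i), \qquad j = 1, \dots, k
```

Each new feature z_j is a weighted sum of all n centered original features, which is why the k features are new constructions and not a subset of the old ones.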
III. The PCA Computation
Suppose we have a data set containing the midterm Chinese and math scores of five students:
| Subject | Student 1 | Student 2 | Student 3 | Student 4 | Student 5 |
| --- | --- | --- | --- | --- | --- |
| Chinese score | 50 | 60 | 70 | 80 | 90 |
| Math score | 80 | 82 | 84 | 86 | 88 |
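We now use PCA to reduce the dimensionality, that is, to reduce this two-dimensional data to one dimension.
The algorithm proceeds as follows:
- Step 1: center the data by subtracting the mean; normalize as well if the application requires it.
- Step 2: compute the covariance matrix.
- Step 3: obtain the eigenvalues and eigenvectors by eigendecomposition or singular value decomposition.
- Step 4: build the projection matrix from the eigenvectors.
- Step 5: apply the projection matrix to obtain the reduced-dimension data.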
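As a worked illustration, the five steps can be carried out by hand on the score table. The notation (μ, X_c, C, w_1, z) is introduced here for illustration, and the sample covariance with divisor N-1 is an assumption; a different divisor rescales the eigenvalues but leaves the eigenvectors unchanged.

```latex
\begin{aligned}
\mu &= \begin{bmatrix} 70 \\ 84 \end{bmatrix}, \qquad
X_c = \begin{bmatrix} -20 & -10 & 0 & 10 & 20 \\ -4 & -2 & 0 & 2 & 4 \end{bmatrix} \\[4pt]
C &= \frac{1}{5-1}\, X_c X_c^{\top}
   = \begin{bmatrix} 250 & 50 \\ 50 & 10 \end{bmatrix} \\[4pt]
\lambda_1 &= 260, \quad \lambda_2 = 0, \qquad
w_1 = \frac{1}{\sqrt{26}} \begin{bmatrix} 5 \\ 1 \end{bmatrix} \\[4pt]
z &= w_1^{\top} X_c
   = \frac{1}{\sqrt{26}} \begin{bmatrix} -104 & -52 & 0 & 52 & 104 \end{bmatrix}
   \approx \begin{bmatrix} -20.40 & -10.20 & 0 & 10.20 & 20.40 \end{bmatrix}
\end{aligned}
```

The second eigenvalue is zero because, in this table, the math score is an exact linear function of the Chinese score (math = 0.2 × Chinese + 70), so the one-dimensional representation loses no information. The sign of w_1, and therefore of z, is arbitrary.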
The MATLAB program is as follows:
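```matlab
clc; clear; close all;
set(0, 'defaultfigurecolor', 'w');

x = [50 60 70 80 90];   % Chinese scores
y = [80 82 84 86 88];   % math scores

% Plot the raw data
figure();
subplot(1, 2, 1);
scatter(x, y, 'r*', 'linewidth', 5);
xlim([50, 100]); ylim([50, 100]); grid on;
xlabel('Chinese score'); ylabel('Math score');

data = [x; y];          % 2 x 5: one row per subject, one column per student

% Step 1: center the data
mu = mean(data, 2);                 % mean of each row
data(1, :) = data(1, :) - mu(1);    % subtract the mean
data(2, :) = data(2, :) - mu(2);

% Step 2: covariance matrix (sample covariance, divisor N-1)
R = data * data' / (size(data, 2) - 1);

% Step 3: eigenvalues and eigenvectors via eigendecomposition
[V, D] = eig(R);
[EigR, PosR] = sort(diag(D), 'descend');  % eigenvalues in descending order
VecR = V(:, PosR);                        % eigenvectors are the columns of V

% Step 4: build the projection matrix from the eigenvectors
K = 1;                                    % reduce to one dimension
Proj = VecR(:, 1:K)';                     % K x 2, one direction per row

% Step 5: project the centered data to get the reduced data
DataPCA = Proj * data;

% Plot the centered data and the projection direction
x0 = -30:30;
subplot(1, 2, 2);
scatter(data(1, :), data(2, :), 'r*', 'linewidth', 5); hold on;
plot(x0, Proj(2) / Proj(1) * x0, 'b', 'linewidth', 3);
xlim([-30, 30]); ylim([-30, 30]); grid on;
xlabel('Chinese score'); ylabel('Math score');
```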
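Result: the left panel plots the raw scores; the right panel plots the centered data with the projection direction drawn as a blue line. (Figure not reproduced here.)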
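Step 3 mentions singular value decomposition as an alternative to eigendecomposition. The following is a minimal sketch of that route on the same score data, assuming the sample covariance with divisor N-1. The variable names (`centered`, `pc1`, `scores`, `eigvals`, `reconErr`) are chosen here for illustration and are not from the original program.

```matlab
x = [50 60 70 80 90];   % Chinese scores
y = [80 82 84 86 88];   % math scores
data = [x; y];          % 2 x 5: one row per subject

% Center each row
mu = mean(data, 2);
centered = bsxfun(@minus, data, mu);

% SVD of the centered data: the columns of U are the principal directions
[U, S, ~] = svd(centered, 'econ');
pc1 = U(:, 1);                        % first principal direction (sign is arbitrary)

% Squared singular values divided by N-1 are the covariance eigenvalues
eigvals = diag(S).^2 / (size(data, 2) - 1);

% One-dimensional representation and reconstruction back to two dimensions
scores = pc1' * centered;             % 1 x 5
recon = bsxfun(@plus, pc1 * scores, mu);
reconErr = norm(recon - data, 'fro'); % close to zero for this data set
```

Because the two scores in this table are exactly linearly related, the reconstruction error is zero up to rounding. For real data it would equal the variation along the discarded directions.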
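IV. MATLAB Code for PCA Face Recognition
I made some modifications to someone else's code. The program below can be run directly to perform face recognition; the face database is the ORL face database.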
```matlab
function T = CreateTrainingSet(TrainingSetPath)
% Build the training matrix from the face database, train the eigenface
% model, and recognize one test image.
% TrainingSetPath - folder that contains the class subfolders s1, s2, ...

TrainFiles = dir(TrainingSetPath);
Train_Class_Number = 0;  % number of classes; the data set has 40 classes (40 people), 10 faces each
for i = 1:size(TrainFiles, 1)
    % Count class folders only; skip '.', '..' and stray files such as Thumbs.db
    if TrainFiles(i).isdir && ~any(strcmp(TrainFiles(i).name, {'.', '..'}))
        Train_Class_Number = Train_Class_Number + 1;
    end
end

%%%% Construction of 2D matrix from 1D image vectors
T = [];
Each_Class_Train_Num = 5;  % use the first 5 faces of each class for training
for i = 1:Train_Class_Number
    % Class folders are named s1, s2, ...; str is the path of one class folder
    str = fullfile(TrainingSetPath, ['s' int2str(i)]);
    for j = 1:Each_Class_Train_Num
        tmpstr = fullfile(str, [int2str(j) '.pgm']);
        img = imread(tmpstr);            % read the image
        if ndims(img) > 2                % convert color images to grayscale
            img = rgb2gray(img);
        end
        vecimg = double(reshape(img, 1, []));   % image as a row vector
        T = cat(1, T, vecimg);
    end
end

[MeanFace, MeanNormFaces, EigenFaces] = EigenfaceCore(T);
TestImagePath = 'D:\data\copy\att_faces.tar\att_faces\s40\6.pgm';  % single test face image
OutputName = Recognition(TestImagePath, MeanFace, MeanNormFaces, EigenFaces);
end

function [MeanFace, MeanNormFaces, EigenFaces] = EigenfaceCore(T)
% Revised by Jianzhu Wang  email: jzwangATbjtuDOTeduDOTcn
% Use Principal Component Analysis (PCA) to determine the most
% discriminating features between images of faces.
%
% Input:
%   T - P x MN matrix. Each of the P training images (size M x N) is
%       reshaped into a row vector of length M*N and stored as one row.
% Output:
%   MeanFace      - 1 x MN mean vector of the faces
%   MeanNormFaces - P x MN matrix of centered image vectors
%   EigenFaces    - K x MN matrix; each row is one retained eigenvector of
%                   the covariance matrix of the training database

%% Calculate the mean face (each row of T is a face)
MeanFace = mean(T, 1);

%% Mean-normalize
MeanNormFaces = bsxfun(@minus, double(T), MeanFace);

%% Recall some linear algebra
% For any matrix A, A*A' and A'*A have the same non-zero eigenvalues, and
% if x is an eigenvector of A*A', then A'*x is an eigenvector of A'*A.
% Eigenvectors are defined only up to a constant factor, so A'*x may differ
% by scale from the eigenvector MATLAB would return for A'*A.
% With A = MeanNormFaces, use the small P x P matrix L = A*A' in place of
% the MN x MN covariance matrix C = A'*A to reduce the dimension.
L = MeanNormFaces * MeanNormFaces';   % 200 x 200 for 40 classes x 5 images
[E, D] = eig(L);

% Sort the eigenvalues and the matching eigenvectors in descending order
[eigenValue, order] = sort(diag(D), 'descend');
E = E(:, order);

% Map to eigenvectors of A'*A (stored as columns, not rescaled to unit length)
eigenVector = MeanNormFaces' * E;

% Keep the fewest components whose cumulative contribution exceeds 85%
ChooseEigenValueNum = find(cumsum(eigenValue) / sum(eigenValue) > 0.85, 1);
EigenFaces = eigenVector(:, 1:ChooseEigenValueNum)';
end

function OutputName = Recognition(TestImagePath, MeanFace, MeanNormFaces, EigenFaces)
% Compare faces by projecting the images into face space and measuring the
% Euclidean distance between them.
%
% Input:
%   TestImagePath - path of the test face image
%   MeanFace      - 1 x MN mean vector (output of EigenfaceCore)
%   MeanNormFaces - P x MN matrix; each row is a mean-normalized face
%                   (output of EigenfaceCore)
%   EigenFaces    - K x MN matrix of eigenfaces (output of EigenfaceCore)

Each_Class_Train_Num = 5;  % must match the value used in CreateTrainingSet

%%%% Project the centered training images into face space
% Row i of ProjectedImages is the feature vector of training face i.
ProjectedImages = MeanNormFaces * EigenFaces';       % P x K

%%%% Extract the PCA features of the test image
InputImage = imread(TestImagePath);
VecInput = reshape(InputImage, 1, []);
MeanNormInput = double(VecInput) - MeanFace;         % centered test image
ProjectedTestImage = MeanNormInput * EigenFaces';    % 1 x K feature vector

%%%% Squared Euclidean distances to every projected training image
% The test image is expected to be closest to an image of the same person.
Train_Number = size(MeanNormFaces, 1);
Euc_dist = zeros(1, Train_Number);
for i = 1:Train_Number
    Euc_dist(i) = norm(ProjectedTestImage - ProjectedImages(i, :))^2;
end
[~, Recognized_index] = min(Euc_dist);

% Training rows are stored class by class, so map the row index to its class.
% floor is required here: int2str alone would round, not truncate.
Recognized_class = floor((Recognized_index - 1) / Each_Class_Train_Num) + 1;
OutputName = strcat('s', int2str(Recognized_class), 'class');

figure;
subplot(121);
imshow(InputImage, []);
title('Input face');
subplot(122);
imshow(reshape(MeanNormFaces(Recognized_index, :) + MeanFace, size(InputImage)), []);
title(['Most similar face, class: ' int2str(Recognized_class)]);
end
```
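The linear algebra fact that `EigenfaceCore` relies on can be checked numerically on a small matrix. The sketch below uses toy data (four rows of `magic(6)` standing in for four "images" of six pixels each), which is an assumption made purely for illustration. The names `A`, `u`, `C`, and `relResidual` are likewise hypothetical and not part of the program above.

```matlab
% Toy stand-in for the training matrix: P = 4 "images", MN = 6 "pixels" each
M = magic(6);
T = M(1:4, :);
A = bsxfun(@minus, T, mean(T, 1));    % centered data, P x MN

% Small P x P matrix used in place of the MN x MN matrix A'*A
L = A * A';
[E, D] = eig(L);
[lambda, order] = sort(diag(D), 'descend');
E = E(:, order);

% Map the leading eigenvector of L into pixel space
u = A' * E(:, 1);                     % MN x 1, length sqrt(lambda(1))

% u should be an eigenvector of the large matrix with the same eigenvalue
C = A' * A;                           % MN x MN
relResidual = norm(C * u - lambda(1) * u) / (lambda(1) * norm(u));
```

Here `relResidual` comes out at rounding-error level, and `norm(u)^2` matches `lambda(1)`. This shows why the mapped eigenvectors are not unit length unless they are rescaled explicitly. With ORL images (112 x 92 pixels, 200 training faces) the same trick replaces a 10304 x 10304 eigenproblem with a 200 x 200 one.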