
Machine Learning Notes ---- Principal Component Analysis

1. Task of PCA

Find a direction (a line through the data) and project all points onto it, choosing the direction that minimizes the projection error.
Projection error: the sum of squared distances between the points and the line.
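
As a rough illustration of the projection error, the sketch below compares two candidate directions on a small toy data set. It assumes NumPy, one example per column of `X`, and data that is already mean-centered; the helper name `projection_error` and the toy values are made up for this example.

```python
import numpy as np

def projection_error(X, u):
    """Sum of squared distances from the columns of X to the line spanned by u.

    X has shape (n, m): one example per column, assumed already mean-centered.
    """
    u = u / np.linalg.norm(u)          # make sure the direction has unit length
    projections = np.outer(u, u @ X)   # each column projected onto the line
    return float(np.sum((X - projections) ** 2))

# Toy data lying close to the diagonal x1 = x2 (each row sums to zero).
X = np.array([[-2.0, -1.0, 0.0, 1.0, 2.0],
              [-1.9, -1.1, 0.1, 0.9, 2.0]])

err_diagonal = projection_error(X, np.array([1.0, 1.0]))  # along the data
err_axis = projection_error(X, np.array([1.0, 0.0]))      # along the x1 axis
print(err_diagonal, err_axis)
```

The diagonal direction gives a much smaller error than the axis direction, which is the kind of direction PCA is looking for.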

2. Data Preprocessing

Feature Scaling + Mean Normalization
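
A minimal sketch of this preprocessing step, assuming NumPy and one example per column of `X`. Scaling by the standard deviation is one common choice (scaling by the value range is another); the function name `normalize_features` is hypothetical.

```python
import numpy as np

def normalize_features(X):
    """Mean-normalize and scale each feature (row) of X, shape (n, m)."""
    mu = X.mean(axis=1, keepdims=True)      # per-feature mean
    sigma = X.std(axis=1, keepdims=True)    # per-feature standard deviation
    sigma[sigma == 0] = 1.0                 # leave constant features unscaled
    return (X - mu) / sigma, mu, sigma

# Two features on very different scales.
X = np.array([[1.0, 2.0, 3.0, 4.0],
              [100.0, 300.0, 500.0, 700.0]])
X_norm, mu, sigma = normalize_features(X)
print(X_norm)
```

Returning `mu` and `sigma` makes it possible to apply exactly the same transformation to new data later.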

3. PCA Algorithm



Take the first k columns of U and denote them as $U_r$; the result is $Z = U_r^T X$.
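
The projection step can be sketched as follows. The notes do not show how U is obtained, so this example assumes the common recipe of taking the SVD of the covariance matrix $\frac{1}{m} X X^T$, with one preprocessed example per column of `X`. The function name `pca_project` is made up for illustration.

```python
import numpy as np

def pca_project(X, k):
    """Project X (shape (n, m), mean-centered) onto its first k principal directions."""
    m = X.shape[1]
    Sigma = (X @ X.T) / m               # assumed covariance matrix, shape (n, n)
    U, S, _ = np.linalg.svd(Sigma)      # columns of U are ordered by decreasing S
    U_r = U[:, :k]                      # first k columns of U
    Z = U_r.T @ X                       # compressed data, shape (k, m)
    return Z, U_r, S

X = np.array([[-2.0, -1.0, 0.0, 1.0, 2.0],
              [-1.9, -1.1, 0.1, 0.9, 2.0]])
Z, U_r, S = pca_project(X, k=1)
print(Z.shape, U_r.ravel(), S)
```

Here `S` is returned as a 1-D array of the diagonal entries, which is how `np.linalg.svd` reports them.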

4. Reconstruction from PCA


$X_{approx} = U_r Z$
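
A small sketch of the reconstruction, under the same assumptions as before (NumPy, one mean-centered example per column, U taken from the SVD of the covariance matrix). The toy data and the name `reconstruct` are illustrative only.

```python
import numpy as np

def reconstruct(Z, U_r):
    """Map compressed data Z (k, m) back to the original n-dimensional space."""
    return U_r @ Z

X = np.array([[-2.0, -1.0, 0.0, 1.0, 2.0],
              [-1.9, -1.1, 0.1, 0.9, 2.0]])
U, S, _ = np.linalg.svd(X @ X.T / X.shape[1])

U_r = U[:, :1]                          # keep k = 1 direction
Z = U_r.T @ X                           # compress
X_approx = reconstruct(Z, U_r)          # approximate reconstruction
max_abs_error = float(np.max(np.abs(X - X_approx)))

X_full = reconstruct(U.T @ X, U)        # keeping all n directions loses nothing
print(max_abs_error)
```

With k smaller than n the reconstruction is only approximate; with k = n it recovers X up to rounding, because U is orthogonal.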

5. How to Choose the Reduced Dimension



Using $S = \mathrm{diag}(s_1, \ldots, s_n)$, check whether

$1 - \frac{\sum_{i=1}^{k} s_i}{\sum_{i=1}^{n} s_i} \le 0.01$

This check is an $O(n)$ algorithm.
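
One way to run this check for every k in a single pass is a cumulative sum, sketched below. It assumes NumPy and that the diagonal entries of S are given as a 1-D array in decreasing order; the function name `choose_k` and the sample values are hypothetical.

```python
import numpy as np

def choose_k(S, max_variance_lost=0.01):
    """Smallest k with 1 - sum(S[:k]) / sum(S) <= max_variance_lost."""
    retained = np.cumsum(S) / np.sum(S)     # retained[k-1] covers the first k values
    idx = np.flatnonzero(1.0 - retained <= max_variance_lost)
    # Fall back to keeping every dimension if rounding leaves no k satisfying the test.
    return int(idx[0]) + 1 if idx.size else len(S)

S = np.array([8.0, 1.5, 0.45, 0.05])
k_strict = choose_k(S)                          # at most 1% of the variance lost
k_loose = choose_k(S, max_variance_lost=0.1)    # at most 10% lost
print(k_strict, k_loose)
```

A threshold of 0.01 corresponds to retaining 99% of the variance; a looser threshold gives a smaller k.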

6. Speed Up Supervised Learning by PCA

Train the model using data compressed by PCA.
Note: Fit the PCA mapping on the TRAINING SET only!
The same mapping can then be applied to the other sets.

Only use PCA when working with the original data performs badly on your system!
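
The training-set rule can be sketched like this: the mean and the projection matrix are computed from the training data alone and then reused unchanged for any other set. This assumes NumPy, one example per column, and U taken from the SVD of the covariance matrix. The names `fit_pca` and `apply_pca` and the random data are made up for illustration.

```python
import numpy as np

def fit_pca(X_train, k):
    """Learn the mean and the first k principal directions from the training set only."""
    mu = X_train.mean(axis=1, keepdims=True)
    Xc = X_train - mu
    U, _, _ = np.linalg.svd(Xc @ Xc.T / Xc.shape[1])
    return mu, U[:, :k]

def apply_pca(X, mu, U_r):
    """Apply an already fitted mapping to any data set with the same features."""
    return U_r.T @ (X - mu)

rng = np.random.default_rng(0)
X_train = rng.normal(size=(5, 200))     # 5 features, 200 training examples
X_test = rng.normal(size=(5, 50))       # 50 held-out examples

mu, U_r = fit_pca(X_train, k=2)
Z_train = apply_pca(X_train, mu, U_r)   # feed this to the supervised learner
Z_test = apply_pca(X_test, mu, U_r)     # same mapping, nothing refit on the test set
print(Z_train.shape, Z_test.shape)
```

Keeping fitting and applying as separate functions makes it harder to leak information from the held-out data into the mapping by accident.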