Machine Learning Notes ---- Principal Component Analysis

阿新 • Published: 2018-11-18

1. Task of PCA

Find a direction (a line through the data) and project all points onto it, choosing the direction that minimizes the projection error.
Projection error: the sum of squared distances between each point and its projection onto the line.

2. Data Preprocessing

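Feature scaling + mean normalization: subtract each feature's mean and, when features have very different ranges, divide by the feature's standard deviation (or range) so that all features are on a comparable scale.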
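
A minimal sketch of this preprocessing step in NumPy. The function name `normalize_features` and the toy matrix are illustrative, not from the notes. It assumes `X` is stored as an n x m matrix with one example per column, which matches the later formula $Z=U_r^T X$, and it scales by the standard deviation (dividing by the range would be an equally valid choice).

```python
import numpy as np

def normalize_features(X):
    """Mean-normalize and scale each feature (row) of an n x m matrix X."""
    mu = X.mean(axis=1, keepdims=True)    # per-feature mean, shape (n, 1)
    sigma = X.std(axis=1, keepdims=True)  # per-feature standard deviation
    sigma[sigma == 0] = 1.0               # leave constant features unscaled
    return (X - mu) / sigma, mu, sigma

# Toy data: 2 features on very different scales, 4 examples (hypothetical values).
X = np.array([[1.0, 2.0, 3.0, 4.0],
              [10.0, 20.0, 30.0, 40.0]])
X_norm, mu, sigma = normalize_features(X)
```

Returning `mu` and `sigma` alongside the normalized data makes it possible to reuse the same parameters on new data later.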

3. PCA Algorithm

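Compute the covariance matrix $\Sigma=\frac{1}{m} X X^T$ of the preprocessed data (here $X$ is $n \times m$, one example per column) and take its singular value decomposition $\Sigma=U S V^T$. Take the first $k$ columns of $U$ and denote them $U_r$; the result is

$Z=U_r^T X$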
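
The projection can be sketched in NumPy as follows. This assumes, as is standard for this formulation, that $U$ comes from the SVD of the covariance matrix $\frac{1}{m} X X^T$ of mean-normalized data with one example per column. The helper name `pca_project` and the random toy data are illustrative only.

```python
import numpy as np

def pca_project(X, k):
    """Project mean-normalized data X (n x m, columns are examples) onto k directions.

    Returns Z (k x m), U_r (n x k) and the singular values s (length n).
    """
    m = X.shape[1]
    Sigma = (X @ X.T) / m            # n x n covariance matrix (assumed step)
    U, s, _ = np.linalg.svd(Sigma)   # columns of U are the principal directions
    U_r = U[:, :k]                   # first k columns of U
    Z = U_r.T @ X                    # Z = U_r^T X
    return Z, U_r, s

rng = np.random.default_rng(0)
X = rng.normal(size=(3, 50))
X = X - X.mean(axis=1, keepdims=True)  # mean normalization
Z, U_r, s = pca_project(X, k=2)
```

`np.linalg.svd` returns the singular values in descending order, so the first `k` columns of `U` are the directions that capture the most variance.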

4. Reconstruction from PCA

$X_{approx}=U_r Z$
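
A small sketch of compression followed by reconstruction, under the same assumptions as before (columns of `X` are examples, data is mean-normalized, `U_r` holds the leading columns of `U` from the SVD of the covariance matrix). The toy data is made up so that it lies close to a line, which is the case where a 1-D representation loses very little.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 2-D data lying close to a line, one example per column (hypothetical).
t = rng.normal(size=100)
X = np.vstack([t, 2.0 * t + 0.01 * rng.normal(size=100)])
X = X - X.mean(axis=1, keepdims=True)

U, s, _ = np.linalg.svd(X @ X.T / X.shape[1])
U_r = U[:, :1]          # keep a single direction
Z = U_r.T @ X           # compressed representation, shape (1, 100)
X_approx = U_r @ Z      # reconstruction, same shape as X

# Fraction of the total squared norm lost by the reconstruction.
rel_error = np.sum((X - X_approx) ** 2) / np.sum(X ** 2)
```

`X_approx` lives in the original n-dimensional space, but every reconstructed point lies on the subspace spanned by the columns of `U_r`.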

5. How to Choose the Reduced Dimension

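Using $S=\mathrm{diag}(s_1, \dots, s_n)$, check whether

$1- \frac{\sum_{i=1}^k s_i}{\sum_{i=1}^n s_i} \le 0.01$

that is, whether at least 99% of the variance is retained, and choose the smallest $k$ that passes. Since the sums can be accumulated while scanning $k=1, \dots, n$, this is an $O(n)$ algorithm.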
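
The check can be sketched with a cumulative sum, so that every candidate $k$ is evaluated in a single pass over the diagonal of $S$. The function name `choose_k`, the parameter `max_loss` and the example values are illustrative. The sketch assumes `s` is sorted in descending order and `max_loss` is greater than 0.

```python
import numpy as np

def choose_k(s, max_loss=0.01):
    """Smallest k such that 1 - sum(s[:k]) / sum(s) <= max_loss."""
    retained = np.cumsum(s) / np.sum(s)   # variance retained for k = 1..n
    # argmax returns the first index where the condition holds.
    return int(np.argmax(retained >= 1.0 - max_loss)) + 1

# Hypothetical diagonal of S, sorted in descending order.
s = np.array([6.0, 3.0, 0.95, 0.03, 0.02])
k = choose_k(s)
```

With these values the first two components retain 90% of the variance and the first three retain 99.5%, so `choose_k` picks three components.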

6. Speed Up Supervised Learning by PCA

Train the model on data compressed by PCA.
Note: when training, fit PCA (the normalization parameters and $U_r$) on the TRAINING SET only!
The resulting mapping can then be applied unchanged to the other sets (cross-validation and test).
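
A minimal sketch of that workflow, assuming one example per column and omitting feature scaling for brevity. `fit_pca` and `apply_pca` are hypothetical helper names and the data is random. The point is that the mean and $U_r$ are computed from the training set and then reused as-is on other data.

```python
import numpy as np

def fit_pca(X_train, k):
    """Learn the mean and U_r from the training set only (n x m, columns are examples)."""
    mu = X_train.mean(axis=1, keepdims=True)
    Xc = X_train - mu
    U, _, _ = np.linalg.svd(Xc @ Xc.T / Xc.shape[1])
    return mu, U[:, :k]

def apply_pca(X, mu, U_r):
    """Apply the mapping learned on the training set to any data set."""
    return U_r.T @ (X - mu)

rng = np.random.default_rng(0)
X_train = rng.normal(size=(5, 80))
X_test = rng.normal(size=(5, 20))

mu, U_r = fit_pca(X_train, k=2)
Z_train = apply_pca(X_train, mu, U_r)  # used to train the supervised model
Z_test = apply_pca(X_test, mu, U_r)    # same mapping, no refitting on test data
```

Refitting PCA on the test set would produce a different mapping, so features seen by the model at test time would no longer match those it was trained on.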

Only use PCA when training on the original data performs badly on your system (for example, it is too slow or needs too much memory)!
