Machine Learning - week 2 - Multivariate Linear Regression

阿新 • Published: 2017-08-08


Gradient Descent in Practice - Feature Scaling

Make sure features are on a similar scale.

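The smaller the range of the features, the smaller the overall space of possibilities, so the computation gets faster (gradient descent needs fewer iterations to converge).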

Dividing by the range

Divide each feature by its range (feature / range) so that every feature ends up roughly within [-1, 1].

The following quiz question is an example:

[Figure: example quiz question]
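As a minimal sketch of dividing by the range, the snippet below scales one feature column. The house sizes and the helper name `divide_by_range` are hypothetical and chosen only for illustration.

```python
def divide_by_range(values):
    """Divide every value of one feature by that feature's range (max - min)."""
    value_range = max(values) - min(values)
    if value_range == 0:
        raise ValueError("feature is constant, so its range is 0")
    return [v / value_range for v in values]


# Hypothetical house sizes; the range is 2500 - 500 = 2000.
sizes = [500, 1000, 1500, 2500]
scaled_sizes = divide_by_range(sizes)
print(scaled_sizes)
```

After scaling, the spread of the feature (max - min) is exactly 1, but the values themselves are not centred on 0. That is what mean normalization adds.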

Mean normalization

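Shift each feature so that its mean is approximately 0. Do not apply this to x₀, because x₀ is always 1.

x₁ := (x₁ - μ₁) / S₁

μ₁ is the average value of x₁ in the training set;

S₁ is the range of x₁. For example, if the number of bedrooms lies in [0, 5], the range is 5 - 0 = 5.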
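The formula can be sketched in a few lines of Python. The bedroom counts below are hypothetical sample data within the [0, 5] range mentioned above, and `mean_normalize` is an illustrative helper name.

```python
def mean_normalize(values):
    """Return (x - mu) / S for each x, where mu is the mean and S = max - min."""
    mu = sum(values) / len(values)
    s = max(values) - min(values)
    if s == 0:
        raise ValueError("feature is constant, so its range is 0")
    return [(v - mu) / s for v in values]


# Hypothetical bedroom counts: mu = 2.5, S = 5 - 0 = 5.
bedrooms = [0, 1, 2, 3, 4, 5]
normalized = mean_normalize(bedrooms)
print(normalized)
```

The normalized values are centred on 0 and span a range of 1. The intercept column x₀ = 1 is left untouched, since subtracting its mean would turn it into all zeros.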

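Making sure gradient descent is working correctly

[Figure: J(θ) plotted against the number of iterations]

The figure above shows what a correct plot looks like: J(θ) decreases gradually as the number of iterations grows, and after a certain number of iterations the curve flattens out. You can read off the plot when to stop, or stop automatically once J(θ) changes by less than ε in a single iteration.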
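A minimal sketch of this stopping rule, assuming batch gradient descent on the squared-error cost J(θ) = (1/2m) Σ (h(x) - y)². The data set, the learning rate, and the value of ε are made up for illustration.

```python
def cost(theta, X, y):
    """Squared-error cost J(theta) = (1 / 2m) * sum((h(x) - y)^2)."""
    m = len(y)
    total = 0.0
    for row, target in zip(X, y):
        prediction = sum(t * x for t, x in zip(theta, row))
        total += (prediction - target) ** 2
    return total / (2 * m)


def gradient_descent(X, y, alpha, epsilon=1e-9, max_iters=10000):
    """Run batch gradient descent until J changes by less than epsilon."""
    m, n = len(y), len(X[0])
    theta = [0.0] * n
    history = [cost(theta, X, y)]
    for _ in range(max_iters):
        errors = [sum(t * x for t, x in zip(theta, row)) - target
                  for row, target in zip(X, y)]
        gradient = [sum(e * row[j] for e, row in zip(errors, X)) / m
                    for j in range(n)]
        theta = [t - alpha * g for t, g in zip(theta, gradient)]
        history.append(cost(theta, X, y))
        if abs(history[-2] - history[-1]) < epsilon:
            break  # J(theta) has flattened out
    return theta, history


# Hypothetical data generated from y = 2 + 2x; x is already mean-normalized.
X = [[1.0, -0.5], [1.0, -0.25], [1.0, 0.0], [1.0, 0.25], [1.0, 0.5]]
y = [1.0, 1.5, 2.0, 2.5, 3.0]
theta, history = gradient_descent(X, y, alpha=0.5)
print(theta, len(history) - 1)
```

Plotting `history` against the iteration index gives the kind of curve described above: a steady decrease that flattens out before the loop stops.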

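If the plot goes up:

[Figure: J(θ) increasing with the number of iterations]

then α is too large and should be reduced. In practice the plot may look like this:

[Figure: J(θ) against the number of iterations when α is too large]

If α is sufficiently small, J(θ) decreases on every iteration, although convergence may be slow.

If α is too large, J(θ) may not decrease on every iteration, and gradient descent may fail to converge.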
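The effect of α can be reproduced on a toy problem. This is a sketch with a single parameter θ and made-up data (y = 2x); the two learning rates are chosen so that one converges and the other diverges on this particular data set.

```python
def cost_history(alpha, iterations=10):
    """Record J(theta) before each gradient step for h(x) = theta * x."""
    xs = [1.0, 2.0, 3.0]
    ys = [2.0, 4.0, 6.0]  # hypothetical data, y = 2x
    m = len(xs)
    theta = 0.0
    history = []
    for _ in range(iterations):
        history.append(sum((theta * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * m))
        gradient = sum((theta * x - y) * x for x, y in zip(xs, ys)) / m
        theta -= alpha * gradient
    return history


small = cost_history(alpha=0.1)  # J decreases on every iteration
large = cost_history(alpha=0.5)  # J increases on every iteration
print(small[-1], large[-1])
```

For this data the gradient step multiplies the error (θ - 2) by 1 - α · 14/3, so any α above roughly 0.43 makes the error grow instead of shrink.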

Features and polynomial regression

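You can define your own features instead of using the existing ones as they are. For example, if a house has two attributes, length and width, we can create a new feature: area = length × width. The hypothesis then becomes

[Formula: hypothesis using the area feature]

but this curve first decreases and then increases, which does not match the real data (the larger the area, the higher the total price). So it is adjusted to

[Formula: adjusted hypothesis].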
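As a rough sketch of building such features, the snippet below derives the area from length and width and then adds polynomial terms of the area. The house dimensions are hypothetical, and the choice of powers (up to the cube) is only an assumption for illustration, not necessarily the form used in the formulas above.

```python
# Hypothetical (length, width) pairs for three houses.
houses = [(20.0, 30.0), (25.0, 40.0), (30.0, 50.0)]


def make_features(length, width, degree=3):
    """Return [1, area, area^2, ..., area^degree] for one house."""
    area = length * width
    return [area ** power for power in range(degree + 1)]


rows = [make_features(length, width) for length, width in houses]
for row in rows:
    print(row)
```

The columns differ by several orders of magnitude (area is in the hundreds, area³ in the hundreds of millions), so feature scaling becomes especially important when polynomial terms are used with gradient descent.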

Normal equation

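Gradient descent approaches the minimum step by step as the number of iterations increases, as shown in the figure:

[Figure: gradient descent stepping toward the minimum of J(θ)]

The normal equation instead computes θ directly, in a single analytical step.

J(θ) is at its minimum where its partial derivatives are 0:

∂J(θ)/∂θ_j = 0 (for every j)

Then solve for θ₀ through θ_n.

Equation for solving θ

θ = (XᵀX)⁻¹ Xᵀ y

For matrix concepts, see Machine Learning - week 1.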
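A minimal pure-Python sketch of the normal equation θ = (XᵀX)⁻¹Xᵀy. Instead of forming the inverse explicitly, it solves the equivalent linear system (XᵀX)θ = Xᵀy by Gaussian elimination. The helper names and the data (generated from y = 1 + 2x) are assumptions for illustration.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular (non-invertible)")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def normal_equation(X, y):
    """Return theta that satisfies (X^T X) theta = X^T y."""
    columns = list(zip(*X))  # the rows of X^T
    XtX = [[sum(a * b for a, b in zip(ci, cj)) for cj in columns] for ci in columns]
    Xty = [sum(a * b for a, b in zip(ci, y)) for ci in columns]
    return solve(XtX, Xty)


# Hypothetical data from y = 1 + 2x; the first column is x0 = 1.
X = [[1.0, 1.0], [1.0, 2.0], [1.0, 3.0], [1.0, 4.0]]
y = [3.0, 5.0, 7.0, 9.0]
theta = normal_equation(X, y)
print(theta)
```

No learning rate, no iterations, and no feature scaling are needed here. The cost is the elimination step, which grows cubically with the number of features.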

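When to use Gradient Descent or the Normal Equation

[Figure: comparison table, gradient descent on the left and the normal equation on the right]

When n is large, the normal equation (the right-hand column) is slow, because computing (XᵀX)⁻¹ costs O(n³).

When n is small, the normal equation is faster, because it produces the result directly and needs neither iterations nor feature scaling.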

What if XᵀX is non-invertible?

1. Redundant features (are not linearly independent).

E.g. x₁ = size in feet²; x₂ = size in m²

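2. Too many features (e.g. m <= n)

For example, m = 10 and n = 100 means you have only 10 training examples but 100 features. Clearly, that is not enough data to fit all the features.

You can delete some features (keeping only those that are relevant to the data) or use regularization.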
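The redundant-feature case can be checked numerically. This sketch uses exact fractions so that the determinant is exactly 0 rather than a tiny rounding error. The house sizes are hypothetical, and the conversion 1 m = 3.28 ft is an approximation assumed for the example.

```python
from fractions import Fraction

# Assumed conversion factor: 1 m is taken as 3.28 ft, so 1 m^2 = 3.28^2 ft^2.
FT2_PER_M2 = Fraction(328, 100) ** 2

# Hypothetical house sizes in m^2; columns are x0 = 1, x1 = feet^2, x2 = m^2.
sizes_m2 = [Fraction(50), Fraction(80), Fraction(120), Fraction(200)]
X = [[Fraction(1), m2 * FT2_PER_M2, m2] for m2 in sizes_m2]


def gram(rows):
    """Return X^T X for a matrix given as a list of rows."""
    cols = list(zip(*rows))
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in cols] for ci in cols]


def det3(A):
    """Determinant of a 3x3 matrix by cofactor expansion."""
    (a, b, c), (d, e, f), (g, h, i) = A
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


det_full = det3(gram(X))  # x1 is a multiple of x2, so X^T X is singular

# Drop the redundant feet^2 column and X^T X becomes invertible again.
reduced = gram([[row[0], row[2]] for row in X])
det_reduced = reduced[0][0] * reduced[1][1] - reduced[0][1] * reduced[1][0]
print(det_full, det_reduced)
```

A zero determinant means XᵀX has no inverse, which is why removing one of the two size features (or adding regularization) is suggested above.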

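Exercises

1. [Figure: quiz question]

I was not sure how to apply the two methods together. Does the order in which they are applied matter?

First, divide by the range:

range = max - min = 8836 - 4761 = 4075

After vector / range, the values become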

1.9438
1.2721
2.1683
1.1683

Then apply mean normalization to these values:

avg = 1.6382

range = 2.1683 - 1.1683 = 1

x₂⁽⁴⁾ = (1.1683 - 1.6382) / 1 = -0.4699, which rounds to -0.47 at two decimal places.
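On the question of whether the order matters, a quick check suggests it does not: scaling by the range first and then mean-normalizing gives the same result as applying (x - μ) / S to the raw values in one step. The raw values below are an assumption, back-computed from the scaled values listed above (for example 1.9438 × 4075 ≈ 7921).

```python
# Assumed raw feature values, recovered from the scaled values and the range 4075.
raw = [7921, 5184, 8836, 4761]

# Two steps: divide by the range, then mean-normalize the result.
value_range = max(raw) - min(raw)
scaled = [v / value_range for v in raw]
mu_scaled = sum(scaled) / len(scaled)
s_scaled = max(scaled) - min(scaled)
two_step = [(v - mu_scaled) / s_scaled for v in scaled]

# One step: apply (x - mu) / S directly to the raw values.
mu_raw = sum(raw) / len(raw)
one_step = [(v - mu_raw) / value_range for v in raw]

print(round(two_step[3], 2), round(one_step[3], 2))
```

Both routes agree because dividing by the range already leaves the feature with a range of 1, so the second division changes nothing and only the mean subtraction has an effect.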

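5. [Figure: quiz question]

As mentioned above: "The smaller the range of the features, the smaller the overall space of possibilities, so the computation gets faster." (A multiple-choice question can also have just one correct option.)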
