Andrew Ng Machine Learning - Evaluating a Hypothesis

June 22, 2018
Copyright notice: if you find this post useful, please credit the source when reposting: https://blog.csdn.net/wyg1997/article/details/80778511
Exercise link: https://s3.amazonaws.com/spark-public/ml/exercises/on-demand/machine-learning-ex5.zip

Regularized Linear Regression

Visualizing the data:

Code:
load ('ex5data1.mat');
plot(X, y, 'rx', 'MarkerSize', 10, 'LineWidth', 1.5);
  
The result:

[Figure: training data - change in water level (x) vs. water flowing out of the dam (y)]

Cost function:

The formula (with regularization):

$$J(\theta) = \frac{1}{2m}\left(\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)^2\right) + \frac{\lambda}{2m}\left(\sum_{j=1}^{n}\theta_j^2\right)$$

Code (to be filled in linearRegCostFunction.m):
t = X*theta-y;
J = t'*t/(2.0*m) + lambda/(2.0*m)*theta(2:end)'*theta(2:end);
  
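As a quick sanity check (this mirrors what ex5.m does; the reference value is the one the assignment prints), you can evaluate the cost at theta = [1; 1] with lambda = 1:

m = size(X, 1);
theta = [1; 1];
J = linearRegCostFunction([ones(m, 1) X], y, theta, 1);
% The assignment expects J to be about 303.993 here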

Computing the regularized linear regression gradient

The formulas:

$$\frac{\partial J(\theta)}{\partial \theta_0} = \frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)x_0^{(i)}$$

$$\frac{\partial J(\theta)}{\partial \theta_j} = \left(\frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)x_j^{(i)}\right) + \frac{\lambda}{m}\theta_j \qquad (j \geq 1)$$

Code:
% t = X*theta - y, computed above in the cost section
grad(1) = (X(:,1)'*t)/m;
grad(2:end) = (X(:,2:end)'*t./m) + (lambda/m).*theta(2:end);
  
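The same check works for the gradient (again, the reference values are the ones ex5.m prints):

[J, grad] = linearRegCostFunction([ones(size(X, 1), 1) X], y, [1; 1], 1);
% The assignment expects grad to be about [-15.303016; 598.250744]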

Fitting the training data

The result looks like this:

[Figure: best-fit line over the training data]
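For reference, the fit above is produced by roughly the following (a sketch of what ex5.m does; trainLinearReg is the provided helper that minimizes the regularized cost with fmincg):

m = size(X, 1);
lambda = 0;                             % a single feature barely needs regularization
theta = trainLinearReg([ones(m, 1) X], y, lambda);

plot(X, y, 'rx', 'MarkerSize', 10, 'LineWidth', 1.5);
hold on;
plot(X, [ones(m, 1) X] * theta, '--', 'LineWidth', 2);
hold off;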


Bias and Variance

The relationship between the number of training examples and the training / cross-validation errors
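Both errors are just the unregularized cost: the training error is measured on the first i examples, and the cross-validation error is always measured on the entire validation set:

$$J_{train}(\theta) = \frac{1}{2i}\sum_{k=1}^{i}\left(h_\theta(x^{(k)}) - y^{(k)}\right)^2 \qquad J_{cv}(\theta) = \frac{1}{2m_{cv}}\sum_{k=1}^{m_{cv}}\left(h_\theta(x_{cv}^{(k)}) - y_{cv}^{(k)}\right)^2$$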

Code (learningCurve.m):
function [error_train, error_val] = ...
    learningCurve(X, y, Xval, yval, lambda)
%LEARNINGCURVE Generates the train and cross validation set errors needed
%to plot a learning curve
%   [error_train, error_val] = LEARNINGCURVE(X, y, Xval, yval, lambda)
%   returns the train and cross validation set errors for a learning
%   curve. In particular, it returns two vectors of the same length -
%   error_train and error_val. Then, error_train(i) contains the training
%   error for i examples (and similarly for error_val(i)).
%
%   In this function, you will compute the train and test errors for
%   dataset sizes from 1 up to m. In practice, when working with larger
%   datasets, you might want to do this in larger intervals.
%

% Number of training examples
m = size(X, 1);

% You need to return these values correctly
error_train = zeros(m, 1);
error_val   = zeros(m, 1);

% ====================== YOUR CODE HERE ======================
% Instructions: Fill in this function to return training errors in
%               error_train and the cross validation errors in error_val.
%
% Note: Evaluate the training error on the first i training examples
%       (i.e., X(1:i, :) and y(1:i)), but the cross-validation error on
%       the _entire_ cross validation set (Xval and yval).
%
% Note: When using linearRegCostFunction to compute the errors, call it
%       with lambda set to 0; lambda is still needed when running the
%       training to obtain the theta parameters.
%

% ---------------------- Sample Solution ----------------------

for i = 1:m
    % Train on the first i examples, then measure both errors
    theta = trainLinearReg(X(1:i,:), y(1:i), lambda);
    [error_train(i), ~] = linearRegCostFunction(X(1:i,:), y(1:i), theta, 0);
    [error_val(i), ~]   = linearRegCostFunction(Xval, yval, theta, 0);
end

% -------------------------------------------------------------
% =========================================================================
end
Plot (high bias):

[Figure: learning curve - both training error and cross-validation error flatten out at a high value]


Polynomial Regression

Feature expansion (expanding the single linear feature to powers up to p)

Code (polyFeatures.m):
function [X_poly] = polyFeatures(X, p)
%POLYFEATURES Maps X (1D vector) into the p-th power
%   [X_poly] = POLYFEATURES(X, p) takes a data matrix X (size m x 1) and
%   maps each example into its polynomial features where
%   X_poly(i, :) = [X(i) X(i).^2 X(i).^3 ...  X(i).^p];
%


% You need to return the following variables correctly.
X_poly = zeros(numel(X), p);

% ====================== YOUR CODE HERE ======================
% Instructions: Given a vector X, return a matrix X_poly where the p-th 
%               column of X_poly contains the values of X to the p-th power.
%
% 

for i = 1:p
    X_poly(:,i) = X.^i;
end

% =========================================================================

end
  
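The loop above is perfectly fine; as a style note, the same mapping can be written as a one-liner with bsxfun, which broadcasts the exponents 1..p across the column vector:

X_poly = bsxfun(@power, X, 1:p);   % column i holds X.^i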

Checking the result

A note on what happens when ex5 runs next. Because we expand the features up to the 8th power (in this exercise), the last feature takes on enormous values (for example, x = 40 gives x^8 ≈ 6.6 × 10^12), so we need to apply feature normalization here.

The script calls it like this (we could also write this part ourselves, it is not much trouble) (featureNormalize.m):
function [X_norm, mu, sigma] = featureNormalize(X)
%FEATURENORMALIZE Normalizes the features in X 
%   FEATURENORMALIZE(X) returns a normalized version of X where
%   the mean value of each feature is 0 and the standard deviation
%   is 1. This is often a good preprocessing step to do when
%   working with learning algorithms.

mu = mean(X);
X_norm = bsxfun(@minus, X, mu);            % subtract each column's mean

sigma = std(X_norm);
X_norm = bsxfun(@rdivide, X_norm, sigma);  % divide by each column's standard deviation


% ============================================================

end
  
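One detail worth stressing: the validation and test sets must be normalized with the mu and sigma computed from the training set, not with their own statistics, or the learned theta will not transfer. ex5.m does roughly this:

X_poly_val = polyFeatures(Xval, p);
X_poly_val = bsxfun(@minus, X_poly_val, mu);       % training-set mean
X_poly_val = bsxfun(@rdivide, X_poly_val, sigma);  % training-set std
X_poly_val = [ones(size(X_poly_val, 1), 1), X_poly_val];   % add intercept term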

Then we use the functions we wrote earlier to compute the cost:

The resulting plots:

[Figure: polynomial regression fit and its learning curve]

From the plots this looks like high variance (overfitting).

Now let's see how different values of λ affect the result (this step is not graded; it is just to help us understand).

Code (run directly in the console; only the first line needs to be changed):
lambda = 1;
[theta] = trainLinearReg(X_poly, y, lambda);

% Plot training data and fit
figure(1);
plot(X, y, 'rx', 'MarkerSize', 10, 'LineWidth', 1.5);
plotFit(min(X), max(X), mu, sigma, theta, p);
xlabel('Change in water level (x)');
ylabel('Water flowing out of the dam (y)');
title (sprintf('Polynomial Regression Fit (lambda = %f)', lambda));

figure(2);
[error_train, error_val] = ...
    learningCurve(X_poly, y, X_poly_val, yval, lambda);
plot(1:m, error_train, 1:m, error_val);

title(sprintf('Polynomial Regression Learning Curve (lambda = %f)', lambda));
xlabel('Number of training examples')
ylabel('Error')
axis([0 13 0 100])
legend('Train', 'Cross Validation')
  
With λ = 1 (a pretty good fit):

[Figure: polynomial fit with lambda = 1]

With λ = 32 (now underfitting):

[Figure: polynomial fit with lambda = 32]

So let's make it smaller. With λ = 0.1 (the penalty is too weak, so it still overfits a little):

[Figure: polynomial fit with lambda = 0.1]

Using the cross-validation set to select a suitable λ (plotting the λ-Error curve)
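The assignment implements this in validationCurve.m: train once per candidate λ, then evaluate both errors with λ set to 0. A minimal sketch (the candidate list is the one the assignment suggests):

lambda_vec = [0 0.001 0.003 0.01 0.03 0.1 0.3 1 3 10]';
error_train = zeros(length(lambda_vec), 1);
error_val   = zeros(length(lambda_vec), 1);

for i = 1:length(lambda_vec)
    lambda = lambda_vec(i);
    theta = trainLinearReg(X_poly, y, lambda);
    % Errors are always measured without regularization
    [error_train(i), ~] = linearRegCostFunction(X_poly, y, theta, 0);
    [error_val(i), ~]   = linearRegCostFunction(X_poly_val, yval, theta, 0);
end

plot(lambda_vec, error_train, lambda_vec, error_val);
legend('Train', 'Cross Validation');
xlabel('lambda'); ylabel('Error');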

The resulting plot:

[Figure: training and cross-validation error as a function of lambda]

The cross-validation error bottoms out around λ = 3.

Let's look at the fit with λ = 3:

[Figure: polynomial fit with lambda = 3]
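As an optional final check (suggested by the assignment), evaluate the test error once, with the λ chosen on the validation set; this assumes X_poly_test and ytest were prepared the same way as the validation set:

lambda = 3;
theta = trainLinearReg(X_poly, y, lambda);
[error_test, ~] = linearRegCostFunction(X_poly_test, ytest, theta, 0);
% The assignment notes a value of about 3.8599 for lambda = 3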
