Machine Learning from Zero: Linear Regression in Practice
A while ago I worked through Andrew Ng's machine learning course on Coursera. It was quite good and served as a decent introduction.
This time I plan to follow the course assignments as the main thread and summarize the basics of machine learning along the way. My knowledge is limited, so please point out any mistakes.
Original problem description:
You will implement linear regression with one variable to predict profits for a food truck. Suppose you are the CEO of a restaurant franchise and are considering different cities for opening a new outlet. The chain already has trucks in various cities and you have data for profits and populations from the cities.
In short, the task is to predict the profit a food truck could earn from a city's population.
The dataset looks roughly like this:
Each row is one training example: the first column is the population, the second is the profit.
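Each line of ex1data1.txt is a comma-separated number pair, which is what the load call below expects. The values here are made up purely to illustrate the format:

6.2,17.5
5.5,9.1
8.1,13.6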
First, visualize the data.
%% ======================= Part 2: Plotting =======================
fprintf('Plotting Data ...\n')
data = load('ex1data1.txt');
X = data(:, 1); y = data(:, 2);
m = length(y); % number of training examples

% Plot Data
% Note: You have to complete the code in plotData.m
plotData(X, y);

fprintf('Program paused. Press enter to continue.\n');
pause;
function plotData(x, y)
%PLOTDATA Plots the data points x and y into a new figure
%   PLOTDATA(x,y) plots the data points and gives the figure axes labels of
%   population and profit.

% ====================== YOUR CODE HERE ======================
% Instructions: Plot the training data into a figure using the
%               "figure" and "plot" commands. Set the axes labels using
%               the "xlabel" and "ylabel" commands. Assume the
%               population and revenue data have been passed in
%               as the x and y arguments of this function.
%
% Hint: You can use the 'rx' option with plot to have the markers
%       appear as red crosses. Furthermore, you can make the
%       markers larger by using plot(..., 'rx', 'MarkerSize', 10);

figure;                                  % open a new figure window
plot(x, y, 'rx', 'MarkerSize', 10);      % Plot the data
ylabel('Profit in $10,000s');            % Set the y label
xlabel('Population of City in 10,000s'); % Set the x label

% ============================================================
end
Computing the cost function
function J = computeCost(X, y, theta)
%COMPUTECOST Compute cost for linear regression
%   J = COMPUTECOST(X, y, theta) computes the cost of using theta as the
%   parameter for linear regression to fit the data points in X and y

% Initialize some useful values
m = length(y); % number of training examples

% You need to return the following variables correctly
J = 0;

% ====================== YOUR CODE HERE ======================
% Instructions: Compute the cost of a particular choice of theta
%               You should set J to the cost.

H = X * theta; % predictions, one per training example
diff = H - y;  % residuals
%J = sum(diff.^2) / (2*m);
J = sum(diff .* diff) / (2*m);

% =========================================================================

end
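In formula form, the code computes the standard squared-error cost for the hypothesis h_theta(x) = theta_0 + theta_1 x:

J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2

which is exactly what sum(diff .* diff) / (2*m) evaluates, with diff holding all m residuals at once.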
To make the code above easier to follow, here is roughly what each variable looks like.
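The original figure is not reproduced here, but the setup is simple. A minimal sketch of what the course's ex1.m driver script (not shown in this post) does before calling computeCost, assuming it follows the course template; data and m are the values loaded in the plotting snippet above:

X = [ones(m, 1), data(:, 1)]; % m x 2: a column of ones for the intercept term, then population
y = data(:, 2);               % m x 1: profit
theta = zeros(2, 1);          % 2 x 1: [intercept; slope], initialized to zero
% X * theta is m x 1 (one prediction per example), so H - y lines up with y
% and J = sum(diff .* diff) / (2*m) is a single scalar.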
Computing the parameter theta with gradient descent
function [theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters)
%GRADIENTDESCENT Performs gradient descent to learn theta
%   theta = GRADIENTDESCENT(X, y, theta, alpha, num_iters) updates theta by
%   taking num_iters gradient steps with learning rate alpha

% Initialize some useful values
m = length(y); % number of training examples
J_history = zeros(num_iters, 1);

for iter = 1:num_iters

    % ====================== YOUR CODE HERE ======================
    % Instructions: Perform a single gradient step on the parameter vector
    %               theta.
    %
    % Hint: While debugging, it can be useful to print out the values
    %       of the cost function (computeCost) and gradient here.
    %

    H = X * theta - y; % residuals, one per training example
    theta(1) = theta(1) - sum(H .* X(:,1)) * alpha / m; % element-wise form; this feels clumsy
    theta(2) = theta(2) - sum(H .* X(:,2)) * alpha / m;
    %theta = theta - alpha * (X' * (X * theta - y)) / m; % vectorized form

    % ============================================================

    % Save the cost J in every iteration
    J_history(iter) = computeCost(X, y, theta);

end

end
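For context, the function is called from a driver script roughly like this (a sketch following the course's ex1.m; the 1500 iterations and alpha = 0.01 are the course's default settings as I recall them, so treat them as assumptions):

iterations = 1500;    % number of gradient steps (assumed course default)
alpha = 0.01;         % learning rate (assumed course default)
theta = zeros(2, 1);  % initial parameters
[theta, J_history] = gradientDescent(X, y, theta, alpha, iterations);
fprintf('Theta found by gradient descent: %f %f\n', theta(1), theta(2));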
The part that is harder to grasp is the vectorized update theta = theta - alpha * (X' * (X * theta - y)) / m;
First, look at how theta is actually computed.
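The update rule being referred to is the standard simultaneous gradient descent step for each parameter theta_j (with x_0^{(i)} = 1 for the intercept term):

\theta_j := \theta_j - \alpha \, \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}

The two theta(1)/theta(2) lines inside the loop above compute exactly this for j = 0 and j = 1.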
Then look at what each of the variables looks like.
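Tracking the matrix shapes makes the vectorized line easier to follow; a quick dimension check, assuming m training examples and the two-parameter setup sketched earlier:

% Dimension check for: theta = theta - alpha * (X' * (X * theta - y)) / m
%   X              : m x 2  (ones column, then population)
%   X * theta      : m x 1  (predictions)
%   X * theta - y  : m x 1  (errors)
%   X' * (errors)  : 2 x 1  (entry j is the sum over examples of error * x_j)
% The right-hand side is therefore 2 x 1, so both parameters are updated at
% once: it is the two element-wise lines above collapsed into a single step.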
Once theta has been computed, the fitted line can be drawn.
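The plotting code for the fitted line is not included in the post; a minimal sketch in the spirit of the course's ex1.m (assuming X still contains the column of ones added earlier) is:

hold on;                        % keep the scatter plot produced by plotData
plot(X(:, 2), X * theta, '-');  % predicted profit against population
legend('Training data', 'Linear regression');
hold off;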
Note: this article was written by linger. If you repost it, please credit http://blog.csdn.net/lingerlanlan.
Original link: http://blog.csdn.net/lingerlanlan/article/details/32162559