SVM for Linear Regression
Method Analysis
In the sample data set $\{(\mathbf{x}_n, t_n)\}_{n=1}^{N}$, the target $t_n$ is not a simple discrete label but a continuous value, as in linear regression (for example, predicting house prices). As in linear regression, the objective function is the regularized squared error function:

$$\min_{\mathbf{w},b} \; \frac{1}{2}\sum_{n=1}^{N}\bigl(y(\mathbf{x}_n)-t_n\bigr)^2 + \frac{\lambda}{2}\lVert\mathbf{w}\rVert^2, \qquad y(\mathbf{x}) = \mathbf{w}^{\mathrm{T}}\phi(\mathbf{x}) + b$$
In the SVM regression (SVR) algorithm, the goal is to train a hyperplane $y(\mathbf{x}) = \mathbf{w}^{\mathrm{T}}\phi(\mathbf{x}) + b$ and take $y(\mathbf{x})$ as the predicted value. To obtain a sparse solution, i.e., so that computing the hyperplane parameters $\mathbf{w}, b$ does not depend on all the sample data but only on a subset (compare the definition of support vectors in the SVM classification algorithm), we use the $\epsilon$-insensitive error function $E_\epsilon$.
The error function is defined so that if the difference between the predicted value and the true value is smaller than the threshold $\epsilon$, the sample incurs no penalty; once the difference exceeds the threshold, the penalty is the amount of the excess:

$$E_\epsilon\bigl(y(\mathbf{x})-t\bigr) = \begin{cases} 0, & \lvert y(\mathbf{x})-t\rvert < \epsilon \\ \lvert y(\mathbf{x})-t\rvert - \epsilon, & \text{otherwise} \end{cases}$$
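The $\epsilon$-insensitive error function above can be sketched in Python as follows (the function name and the default threshold `eps=0.1` are illustrative choices, not from the original text):

```python
def eps_insensitive_error(y_pred, t, eps=0.1):
    """epsilon-insensitive error E_eps: zero inside the tube of half-width eps,
    linear in the excess deviation outside it."""
    deviation = abs(y_pred - t)
    return 0.0 if deviation < eps else deviation - eps
```

Unlike the squared error, deviations smaller than $\epsilon$ contribute nothing, which is what makes the resulting solution sparse.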
The figure below shows the $\epsilon$-insensitive error function together with the squared error function.
Objective Function
Observing the form of the error function above, the zero-penalty region forms a tube around the regression function: sample points inside the tube incur no penalty, so it is called the $\epsilon$-tube (the red shaded region in the figure below).
Replacing the squared error term with $E_\epsilon$, we can define minimization of the following error function as the optimization objective:

$$\min_{\mathbf{w},b} \; C\sum_{n=1}^{N} E_\epsilon\bigl(y(\mathbf{x}_n)-t_n\bigr) + \frac{1}{2}\lVert\mathbf{w}\rVert^2$$
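For a one-dimensional toy model $y(x) = wx + b$, the regularized objective can be evaluated directly; a minimal sketch (function name, defaults, and toy data are illustrative):

```python
def svr_objective(w, b, X, t, C=1.0, eps=0.1):
    """C * sum of eps-insensitive losses + 0.5 * w^2, for a scalar weight w."""
    def eps_loss(d):
        return max(0.0, abs(d) - eps)
    data_term = C * sum(eps_loss(w * x + b - tn) for x, tn in zip(X, t))
    return data_term + 0.5 * w * w
```

A perfect fit inside the tube leaves only the regularization term $\tfrac{1}{2}w^2$.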
Because the objective above contains absolute-value terms, it is not differentiable. We can convert it into a constrained optimization problem; the standard approach is to define two slack variables $\xi_n \ge 0$ and $\hat{\xi}_n \ge 0$ for each sample, measuring the distance of $t_n$ from the $\epsilon$-tube.
As shown in the figure above:
When a sample point's true value lies above the tube, $t_n > y(\mathbf{x}_n) + \epsilon$, and $\xi_n = t_n - y(\mathbf{x}_n) - \epsilon$.
When a sample point's true value lies below the tube, $t_n < y(\mathbf{x}_n) - \epsilon$, and $\hat{\xi}_n = y(\mathbf{x}_n) - \epsilon - t_n$.
Hence the condition for each sample point to lie inside the tube is:

$$y(\mathbf{x}_n) - \epsilon \le t_n \le y(\mathbf{x}_n) + \epsilon$$

Introducing the slack variables allows points to lie outside the tube:
when a point lies above the tube, $\xi_n > 0$, and $t_n \le y(\mathbf{x}_n) + \epsilon + \xi_n$;
when a point lies below the tube, $\hat{\xi}_n > 0$, and $t_n \ge y(\mathbf{x}_n) - \epsilon - \hat{\xi}_n$.
The error function can then be written as a convex quadratic optimization problem:

$$\min_{\mathbf{w},b,\boldsymbol{\xi},\hat{\boldsymbol{\xi}}} \; C\sum_{n=1}^{N}\bigl(\xi_n + \hat{\xi}_n\bigr) + \frac{1}{2}\lVert\mathbf{w}\rVert^2$$

subject to the constraints:

$$\xi_n \ge 0, \quad \hat{\xi}_n \ge 0, \quad t_n \le y(\mathbf{x}_n) + \epsilon + \xi_n, \quad t_n \ge y(\mathbf{x}_n) - \epsilon - \hat{\xi}_n$$
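For a fixed hyperplane, the smallest slack variables satisfying the tube constraints can be computed in closed form; note that their sum $\xi_n + \hat{\xi}_n$ recovers the $\epsilon$-insensitive error. A sketch with illustrative names:

```python
def min_slacks(y_pred, t, eps=0.1):
    """Smallest xi, xi_hat with t <= y + eps + xi and t >= y - eps - xi_hat."""
    xi = max(0.0, t - y_pred - eps)      # active when the point is above the tube
    xi_hat = max(0.0, y_pred - t - eps)  # active when the point is below the tube
    return xi, xi_hat
```

At most one of the two slacks is nonzero for any sample, since a point cannot be above and below the tube at once.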
Introducing Lagrange multipliers $a_n \ge 0$, $\hat{a}_n \ge 0$, $\mu_n \ge 0$, $\hat{\mu}_n \ge 0$, the Lagrangian is:

$$L = C\sum_{n=1}^{N}(\xi_n + \hat{\xi}_n) + \frac{1}{2}\lVert\mathbf{w}\rVert^2 - \sum_{n=1}^{N}(\mu_n\xi_n + \hat{\mu}_n\hat{\xi}_n) - \sum_{n=1}^{N} a_n\bigl(\epsilon + \xi_n + y(\mathbf{x}_n) - t_n\bigr) - \sum_{n=1}^{N} \hat{a}_n\bigl(\epsilon + \hat{\xi}_n - y(\mathbf{x}_n) + t_n\bigr)$$
The Dual Problem
The problem above is a min–max problem. As in the analysis of SVM classification, we rewrite it as a dual problem. First, set the partial derivatives of $L$ with respect to $\mathbf{w}$, $b$, $\xi_n$, $\hat{\xi}_n$ to zero:

$$\frac{\partial L}{\partial \mathbf{w}} = 0 \;\Rightarrow\; \mathbf{w} = \sum_{n=1}^{N}(a_n - \hat{a}_n)\phi(\mathbf{x}_n)$$
$$\frac{\partial L}{\partial b} = 0 \;\Rightarrow\; \sum_{n=1}^{N}(a_n - \hat{a}_n) = 0$$
$$\frac{\partial L}{\partial \xi_n} = 0 \;\Rightarrow\; a_n + \mu_n = C$$
$$\frac{\partial L}{\partial \hat{\xi}_n} = 0 \;\Rightarrow\; \hat{a}_n + \hat{\mu}_n = C$$

Substituting these back eliminates $\mathbf{w}$, $b$, $\xi_n$, $\hat{\xi}_n$ and yields the dual problem: maximize

$$\tilde{L}(\mathbf{a},\hat{\mathbf{a}}) = -\frac{1}{2}\sum_{n=1}^{N}\sum_{m=1}^{N}(a_n-\hat{a}_n)(a_m-\hat{a}_m)k(\mathbf{x}_n,\mathbf{x}_m) - \epsilon\sum_{n=1}^{N}(a_n+\hat{a}_n) + \sum_{n=1}^{N}(a_n-\hat{a}_n)t_n$$

subject to $0 \le a_n \le C$ and $0 \le \hat{a}_n \le C$, where $k(\mathbf{x}_n,\mathbf{x}_m)=\phi(\mathbf{x}_n)^{\mathrm{T}}\phi(\mathbf{x}_m)$ is the kernel function.
Hyperplane Computation
Substituting the expression for $\mathbf{w}$, predictions for new inputs are made with

$$y(\mathbf{x}) = \sum_{n=1}^{N}(a_n - \hat{a}_n)k(\mathbf{x},\mathbf{x}_n) + b$$

The bias $b$ can be computed from any point with $0 < a_n < C$: such a point lies exactly on the upper boundary of the tube, so $\epsilon + y(\mathbf{x}_n) - t_n = 0$ and

$$b = t_n - \epsilon - \sum_{m=1}^{N}(a_m - \hat{a}_m)k(\mathbf{x}_n,\mathbf{x}_m)$$
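The prediction $y(\mathbf{x}) = \sum_n (a_n - \hat{a}_n)\,k(\mathbf{x},\mathbf{x}_n) + b$ can be sketched directly from the dual coefficients $c_n = a_n - \hat{a}_n$ (the support points, coefficients, and scalar linear kernel below are illustrative, not learned values):

```python
def linear_kernel(x, z):
    return x * z  # scalar inputs for simplicity

def svr_predict(x, support, coeffs, b, kernel=linear_kernel):
    """y(x) = sum_n (a_n - a_hat_n) * k(x, x_n) + b"""
    return sum(c * kernel(x, xn) for c, xn in zip(coeffs, support)) + b
```

Only points with $a_n - \hat{a}_n \ne 0$ (the support vectors) need to be stored, which is the sparsity promised earlier.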
Support Vector Machine - Regression (SVR)
A Support Vector Machine can also be used as a regression method, maintaining all the main features that characterize the algorithm (maximal margin). Support Vector Regression (SVR) uses the same principles as the SVM for classification, with only a few minor differences. First of all, because the output is a real number, it becomes very difficult to predict it exactly: there are infinitely many possibilities. In the regression case, a margin of tolerance (epsilon) is therefore set, in analogy to the margin the SVM would otherwise request from the classification problem. Beyond this, the algorithm itself is also more involved and must be handled accordingly. However, the main idea is always the same: minimize the error by individualizing the hyperplane that maximizes the margin, keeping in mind that part of the error is tolerated.
Linear SVR

$$y = \sum_{i=1}^{N}(\alpha_i - \alpha_i^{*})\langle\mathbf{x}_i,\mathbf{x}\rangle + b$$
Non-linear SVR

$$y = \sum_{i=1}^{N}(\alpha_i - \alpha_i^{*})\,K(\mathbf{x}_i,\mathbf{x}) + b$$
The kernel functions transform the data into a higher-dimensional feature space to make it possible to perform linear separation there.
Kernel functions
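As an example, the Gaussian (RBF) kernel corresponds to an implicit infinite-dimensional feature map, yet is cheap to evaluate; a minimal sketch (the default `gamma=1.0` is an illustrative choice):

```python
import math

def rbf_kernel(x, z, gamma=1.0):
    """Gaussian RBF kernel k(x, z) = exp(-gamma * ||x - z||^2)."""
    sq_dist = sum((xi - zi) ** 2 for xi, zi in zip(x, z))
    return math.exp(-gamma * sq_dist)
```

Because the dual problem and the prediction equation access the data only through $k(\mathbf{x}_n,\mathbf{x}_m)$, swapping in such a kernel turns linear SVR into non-linear SVR without changing the optimization.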