SVM parameter tuning in sklearn

阿新 • Published: 2018-09-24


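Preface

Until now I had only understood SVMs in theory and had never sat down to tune one. Once I actually tried, I found that tuning is a big undertaking (something of a black art). This post summarizes the sklearn SVM parameters and some tuning experience, for future lookup and review.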

Common kernel functions

1. Linear kernel:

$$K(x_i, x_j) = x_i^T x_j$$

2. Polynomial kernel:

$$K(x_i, x_j) = (\gamma x_i^T x_j + r)^d, \quad d > 1$$

3. RBF kernel (Gaussian kernel):

$$K(x_i, x_j) = \exp(-\gamma \|x_i - x_j\|^2), \quad \gamma > 0$$

4. Sigmoid kernel:

$$K(x_i, x_j) = \tanh(\gamma x_i^T x_j + r), \quad \gamma > 0, \; r < 0$$
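The four kernels can be written out directly to see what each formula computes. The following is a minimal sketch in plain Python; the function names and the sample vectors are assumptions made for illustration and are not part of sklearn or libsvm.

```python
import math


def dot(x, y):
    """Inner product x^T y of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def linear_kernel(x, y):
    # K(x, y) = x^T y
    return dot(x, y)


def polynomial_kernel(x, y, gamma=1.0, r=0.0, d=3):
    # K(x, y) = (gamma * x^T y + r)^d
    return (gamma * dot(x, y) + r) ** d


def rbf_kernel(x, y, gamma=1.0):
    # K(x, y) = exp(-gamma * ||x - y||^2)
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def sigmoid_kernel(x, y, gamma=1.0, r=0.0):
    # K(x, y) = tanh(gamma * x^T y + r)
    return math.tanh(gamma * dot(x, y) + r)


# Hypothetical sample vectors, chosen only so the arithmetic is easy to follow.
x_i = [1.0, 2.0]
x_j = [3.0, 0.5]

print(linear_kernel(x_i, x_j))                            # 1*3 + 2*0.5 = 4.0
print(polynomial_kernel(x_i, x_j, gamma=0.5, r=1.0, d=2))  # (0.5*4 + 1)^2 = 9.0
print(rbf_kernel(x_i, x_i, gamma=0.7))                     # distance 0, so 1.0
print(sigmoid_kernel(x_i, x_j, gamma=0.1, r=-1.0))
```

Note that the RBF kernel of a point with itself is always 1, and the value decays toward 0 as the two points move apart, at a rate set by gamma.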

Official description of the sklearn SVM parameters

Parameters:
C : float, optional (default=1.0). Penalty parameter C of the error term.
kernel : string, optional (default=’rbf’). Specifies the kernel type to be used in the algorithm. It must be one of ‘linear’, ‘poly’, ‘rbf’, ‘sigmoid’, ‘precomputed’ or a callable. If none is given, ‘rbf’ will be used. If a callable is given it is used to pre-compute the kernel matrix from data matrices; that matrix should be an array of shape (n_samples, n_samples).

degree : int, optional (default=3). Degree of the polynomial kernel function (‘poly’). Ignored by all other kernels.
gamma : float, optional (default=’auto’). Kernel coefficient for ‘rbf’, ‘poly’ and ‘sigmoid’. If gamma is ‘auto’ then 1/n_features will be used instead.

coef0 : float, optional (default=0.0). Independent term in kernel function. It is only significant in ‘poly’ and ‘sigmoid’.
probability : boolean, optional (default=False). Whether to enable probability estimates. This must be enabled prior to calling fit, and will slow down that method.
shrinking : boolean, optional (default=True). Whether to use the shrinking heuristic.
tol : float, optional (default=1e-3). Tolerance for stopping criterion.
cache_size : float, optional. Specify the size of the kernel cache (in MB).
class_weight : {dict, ‘balanced’}, optional. Set the parameter C of class i to class_weight[i]*C for SVC. If not given, all classes are supposed to have weight one. The “balanced” mode uses the values of y to automatically adjust weights inversely proportional to class frequencies in the input data as n_samples / (n_classes * np.bincount(y))
verbose : bool, default: False. Enable verbose output. Note that this setting takes advantage of a per-process runtime setting in libsvm that, if enabled, may not work properly in a multithreaded context.
max_iter : int, optional (default=-1). Hard limit on iterations within solver, or -1 for no limit.
decision_function_shape : ‘ovo’, ‘ovr’ or None, default=None. Whether to return a one-vs-rest (‘ovr’) decision function of shape (n_samples, n_classes) as all other classifiers, or the original one-vs-one (‘ovo’) decision function of libsvm which has shape (n_samples, n_classes * (n_classes - 1) / 2). The default of None will currently behave as ‘ovo’ for backward compatibility and raise a deprecation warning, but will change to ‘ovr’ in 0.19.
New in version 0.17: decision_function_shape=’ovr’ is recommended.
Changed in version 0.17: Deprecated decision_function_shape=’ovo’ and None.
random_state : int seed, RandomState instance, or None (default). The seed of the pseudo random number generator to use when shuffling the data for probability estimation.
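To make a few of these parameters concrete, here is a small sketch that passes them explicitly to `SVC`. The synthetic four-class dataset is an assumption made for illustration. The parameter text above comes from an older release; newer scikit-learn releases use different defaults for `gamma` and `decision_function_shape`, so the values are spelled out here rather than left to the defaults.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC
from sklearn.utils.class_weight import compute_class_weight

# Synthetic 4-class data (an assumption, used only for illustration).
X, y = make_classification(
    n_samples=200, n_features=10, n_informative=5,
    n_classes=4, random_state=0,
)

# "balanced" class weights, computed by hand from the formula in the
# class_weight description: n_samples / (n_classes * np.bincount(y)).
manual_weights = len(y) / (len(np.unique(y)) * np.bincount(y))
sklearn_weights = compute_class_weight("balanced", classes=np.unique(y), y=y)
print(manual_weights, sklearn_weights)

# Same settings, differing only in the shape of the decision function.
common = dict(C=1.0, kernel="rbf", gamma="auto", class_weight="balanced",
              tol=1e-3, max_iter=-1)
clf_ovr = SVC(decision_function_shape="ovr", **common).fit(X, y)
clf_ovo = SVC(decision_function_shape="ovo", **common).fit(X, y)

ovr_shape = clf_ovr.decision_function(X).shape  # (n_samples, n_classes)
ovo_shape = clf_ovo.decision_function(X).shape  # (n_samples, n_classes*(n_classes-1)/2)
print(ovr_shape, ovo_shape)
```

With four classes, the one-vs-one decision function has 4 * 3 / 2 = 6 columns, while the one-vs-rest form has one column per class.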

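Parameter descriptions in libsvm

Because sklearn's SVM implementation calls libsvm under the hood, the libsvm parameter descriptions apply directly to the sklearn SVM parameters.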

1. Linear kernel:

$$K(x_i, x_j) = x_i^T x_j$$

2. Polynomial kernel:

$$K(x_i, x_j) = (\gamma x_i^T x_j + r)^d, \quad d > 1$$

3. RBF kernel (Gaussian kernel):

$$K(x_i, x_j) = \exp(-\gamma \|x_i - x_j\|^2), \quad \gamma > 0$$

4. Sigmoid kernel:

$$K(x_i, x_j) = \tanh(\gamma x_i^T x_j + r), \quad \gamma > 0, \; r < 0$$

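First, the parameters that correspond to each kernel:
1) The linear kernel has no kernel-specific parameters.
2) The polynomial kernel has three parameters. -d sets the degree of the polynomial, which is d in the formula; the default is 3. -g sets gamma, which is γ in the formula; the default is 1/k (k is the number of features). -r sets coef0, which is r in the formula; the default is 0.
3) The RBF kernel has one parameter. -g sets gamma, which is γ in the formula; the default is 1/k (k is the number of features).
4) The sigmoid kernel has two parameters. -g sets gamma, which is γ in the formula; the default is 1/k (k is the number of features). -r sets coef0, which is r in the formula; the default is 0.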
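The libsvm flags listed above line up with `SVC` keyword arguments. The sketch below records that correspondence in a small dictionary (the dictionary itself is a hypothetical helper, not an sklearn or libsvm object) and checks that `gamma="auto"` behaves like an explicit 1/k, where k is the number of features. The dataset is synthetic and assumed only for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC

# Hypothetical lookup table: libsvm command-line flag -> SVC keyword argument.
LIBSVM_TO_SKLEARN = {"-d": "degree", "-g": "gamma", "-r": "coef0"}

# Synthetic data (an assumption, used only for illustration).
X, y = make_classification(n_samples=120, n_features=8, random_state=0)
n_features = X.shape[1]

# Polynomial kernel with gamma left at "auto", which means 1 / n_features.
poly_auto = SVC(kernel="poly", degree=3, gamma="auto", coef0=0.0).fit(X, y)

# The same kernel with gamma = 1/k written out explicitly.
poly_manual = SVC(kernel="poly", degree=3, gamma=1.0 / n_features,
                  coef0=0.0).fit(X, y)

# Both models should produce the same decision values.
same = np.allclose(poly_auto.decision_function(X),
                   poly_manual.decision_function(X))
print(same)
```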

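A closer look at C and gamma for the RBF kernel:

An SVM model has two very important parameters, C and gamma. C is the penalty coefficient, that is, how little the model tolerates errors. The larger C is, the less training error is tolerated and the more easily the model overfits; the smaller C is, the more easily it underfits. If C is too large or too small, generalization suffers.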
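One way to watch the effect of C is to fit the same RBF model with several values and compare training accuracy with held-out accuracy. This is only a sketch: the two-moons dataset, the 50/50 split, the fixed `gamma=1.0`, and the three C values are all assumptions chosen for illustration, and the exact numbers will differ on other data.

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Noisy two-class toy data (an assumption, used only for illustration).
X, y = make_moons(n_samples=400, noise=0.3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=0
)

c_scores = {}
for C in (0.01, 1.0, 100.0):
    clf = SVC(kernel="rbf", C=C, gamma=1.0).fit(X_train, y_train)
    # Store (training accuracy, held-out accuracy) for this C.
    c_scores[C] = (clf.score(X_train, y_train), clf.score(X_test, y_test))
    print(f"C={C:<6} train={c_scores[C][0]:.3f} test={c_scores[C][1]:.3f}")
```

A large gap between the two accuracies at high C is the overfitting signal described above; low accuracy on both at small C points to underfitting.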

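gamma is a parameter that comes with the RBF function once it is chosen as the kernel. It implicitly determines how the data is distributed after being mapped into the new feature space. gamma also changes the number of support vectors, although how the count moves with gamma depends on the data and on C. The number of support vectors in turn affects training and prediction speed.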

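What deserves attention here is the physical meaning of gamma. The often-mentioned width of the RBF controls the range of influence of the Gaussian centered on each support vector, and therefore affects generalization. My understanding: if gamma is set too large, the variance becomes small (for a Gaussian with standard deviation $\sigma$, $\gamma = 1/(2\sigma^2)$), and a Gaussian with small variance is tall and narrow, so it only acts in the immediate neighborhood of the support vectors and classifies unseen samples poorly. Training accuracy can then be very high while test accuracy is not (in theory, letting the variance go to zero allows a Gaussian-kernel SVM to fit any nonlinear data, but it overfits easily); this is what is usually called overtraining. If gamma is set too small, the smoothing effect is too strong, the model cannot reach a particularly high accuracy on the training set, and test accuracy suffers as well.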
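The behavior described here can be checked with a quick sweep over gamma. The sketch below assumes a noisy two-moons dataset, a 50/50 split, and a fixed `C=1.0`, all chosen only for illustration. It records training accuracy, held-out accuracy, and the number of support vectors (`n_support_` is the per-class count exposed by a fitted `SVC`).

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Noisy two-class toy data (an assumption, used only for illustration).
X, y = make_moons(n_samples=400, noise=0.3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=0
)

gamma_results = {}
for gamma in (0.001, 1.0, 1000.0):
    clf = SVC(kernel="rbf", C=1.0, gamma=gamma).fit(X_train, y_train)
    gamma_results[gamma] = {
        "train": clf.score(X_train, y_train),   # accuracy on the training half
        "test": clf.score(X_test, y_test),      # accuracy on the held-out half
        "n_sv": int(clf.n_support_.sum()),      # total number of support vectors
    }
    print(gamma, gamma_results[gamma])
```

A very large gamma should show the overtraining pattern (high training accuracy, weaker held-out accuracy), while a very small gamma should show the over-smoothed pattern where even training accuracy stays low.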

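In addition, two conclusions can be stated clearly:
Conclusion 1: Having fewer samples than feature dimensions does not necessarily lead to overfitting. See this comment from Yu Kai (余凱):
“That is not the reason, haha. With an RBF kernel, the dimension of the system effectively does not exceed the number of samples, and it has no trivial relationship with the feature dimension.”

Conclusion 2: An RBF kernel should achieve results close to those of a linear kernel (in theory, the RBF kernel can emulate the linear kernel). It may be better or worse than the linear kernel, but the two should not differ by much.
Of course, in many problems, for example when the dimensionality is very high or the number of samples is massive, people prefer the linear kernel: accuracy is comparable, but the linear kernel performs better in speed and model size.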
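A simple way to compare the two kernels on a given problem is to tune each one with cross-validated grid search and look at the best scores side by side. The sketch below is one possible setup, not a benchmark: the synthetic dataset (deliberately given more features than samples), the parameter grids, and the 5-fold setting are all assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# 80 samples with 200 features (an assumption, chosen so that the number
# of features exceeds the number of samples).
X, y = make_classification(
    n_samples=80, n_features=200, n_informative=10, random_state=0
)

# Hypothetical search grids; make_pipeline names the SVC step "svc".
param_grids = {
    "linear": {"svc__C": [0.01, 0.1, 1, 10]},
    "rbf": {"svc__C": [0.1, 1, 10, 100],
            "svc__gamma": [1e-4, 1e-3, 1e-2, 1e-1]},
}

best = {}
for kernel, grid in param_grids.items():
    # Scale features first so that C and gamma act on comparable ranges.
    pipe = make_pipeline(StandardScaler(), SVC(kernel=kernel))
    search = GridSearchCV(pipe, grid, cv=5).fit(X, y)
    best[kernel] = (search.best_score_, search.best_params_)
    print(kernel, best[kernel])
```

If the tuned RBF score falls far below the linear score, that usually suggests the RBF grid does not cover a suitable range of C and gamma rather than a limitation of the kernel itself.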

Reference
http://scikit-learn.org/stable/modules/generated/sklearn.svm.SVC.html#sklearn.svm.SVC
http://blog.csdn.net/lqhbupt/article/details/8610443
http://blog.csdn.net/lujiandong1/article/details/46386201
