Loss functions in Caffe

By 阿新 • Published: 2019-01-22

A loss function generally consists of two terms: a loss term and a regularization term.

J=L+R

Loss terms first, then the regularization term.

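1. 0-1 loss: a correctly classified sample scores 1 and a misclassified one scores 0 (read as a loss, that is 0 for correct and 1 for wrong). This is the gold standard.

2. Hinge loss (for soft-margin SVM): J = 1/2||w||^2 + sum(max(0, 1 - y*f(w,x)))

3. Log loss, the cross-entropy loss function of the logistic regression model: J = lambda||w||^2 + sum(log(1 + exp(-y*f(w,x))))

4. Squared loss, used in linear regression: loss = (y - f(w,x))^2

5. Exponential loss, used in boosting: J = lambda*R + sum(exp(-y*f(w,x)))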
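The five loss terms above can be compared side by side. The following is a minimal NumPy sketch, assuming binary labels y in {-1, +1} and a real-valued score standing in for f(w,x); the function names and the sample values are made up for illustration, and the regularization term is left out.

```python
import numpy as np

def zero_one_loss(y, score):
    # 1 where the sign of the score disagrees with the label, else 0
    return (y * score <= 0).astype(float)

def hinge_loss(y, score):
    # max(0, 1 - y*f): zero once the margin y*f reaches 1
    return np.maximum(0.0, 1.0 - y * score)

def log_loss(y, score):
    # log(1 + exp(-y*f)), written with logaddexp to avoid overflow
    return np.logaddexp(0.0, -y * score)

def squared_loss(y, score):
    # (y - f)^2
    return (y - score) ** 2

def exponential_loss(y, score):
    # exp(-y*f)
    return np.exp(-y * score)

# Hypothetical labels and scores, for illustration only
y = np.array([1.0, -1.0, 1.0])
score = np.array([2.0, 0.5, -1.0])

for fn in (zero_one_loss, hinge_loss, log_loss, squared_loss, exponential_loss):
    print(fn.__name__, fn(y, score))
```

Except for the squared loss, each of these depends on the sample only through the margin y*f(w,x); the hinge, log and exponential losses are convex surrogates that can be minimized by gradient methods, which the 0-1 loss cannot.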

Now for the regularization term.

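The most commonly used are R2 = 1/2||w||^2 and R1 = sum(|w|). Both R1 and R2 are convex. R1 pushes the learned weights toward sparsity, while R2 gives a smoother solution. See the figure below for details: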

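Caffe's loss functions probably cover everything in common use by now. The loss function is determined by the final classifier layer, and a regularization term is sometimes added, so that the error propagates properly during backpropagation.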

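contrastive_loss corresponds to contrastive_loss_layer. From reading the code, its input is a pair of samples used for verification, for example two face images that may show the same person (a positive pair) or different people (a negative pair). The siamese example in Caffe's examples uses this type of loss. For its mathematical form, see LeCun's paper: Raia Hadsell, Sumit Chopra, Yann LeCun, "Dimensionality Reduction by Learning an Invariant Mapping", CVPR 2006.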
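As a rough illustration of the contrastive loss, here is a NumPy sketch. It assumes the convention that label 1 marks a matching pair and 0 a non-matching pair, and a 1/(2N) normalization; the function name, argument names and default margin are illustrative and not taken from the Caffe source.

```python
import numpy as np

def contrastive_loss(a, b, same, margin=1.0):
    """a, b: (N, D) embeddings of each pair.
    same: (N,) array, 1 for a matching pair and 0 otherwise (assumed convention)."""
    d = np.linalg.norm(a - b, axis=1)                       # Euclidean distance per pair
    pos = same * d ** 2                                     # pull matching pairs together
    neg = (1 - same) * np.maximum(margin - d, 0.0) ** 2     # push others apart, up to the margin
    return np.sum(pos + neg) / (2.0 * len(d))

# Hypothetical pairs: one matching pair at distance 5, one non-matching pair at distance 1
a = np.array([[0.0, 0.0], [0.0, 0.0]])
b = np.array([[3.0, 4.0], [0.6, 0.8]])
same = np.array([1, 0])
print(contrastive_loss(a, b, same, margin=2.0))
```

Non-matching pairs that are already farther apart than the margin contribute nothing, so the loss only spends effort on pairs that are still confusable.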

euclidean_loss corresponds to euclidean_loss_layer. The loss is l = (y - f(w,x))^2, the usual loss for linear regression.
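A small sketch of the batch form, assuming the sum of squared differences is scaled by 1/(2N) over a batch of N samples, which is how the Caffe layer documentation states it; the function name here is only illustrative.

```python
import numpy as np

def euclidean_loss(pred, target):
    """pred, target: (N, D) arrays. Returns sum of squared differences / (2N)."""
    diff = pred - target
    return np.sum(diff ** 2) / (2.0 * pred.shape[0])

# Hypothetical predictions and targets
pred = np.array([[1.0, 2.0], [3.0, 4.0]])
target = np.array([[0.0, 0.0], [3.0, 2.0]])
print(euclidean_loss(pred, target))
```

The factor 1/2 is a convenience: it cancels the 2 produced by differentiating the square, so the gradient with respect to pred is simply (pred - target) / N.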

hinge_loss corresponds to hinge_loss_layer. The loss is $\ell(y) = \max(0, 1-t \cdot y)$, used mainly in SVM classifiers.

infogain_loss corresponds to infogain_loss_layer. I could not find an expression for this loss; all I know is that it is used in text processing.
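For what it is worth, the Caffe layer documentation describes infogain_loss as a generalization of multinomial_logistic_loss in which the log probabilities are weighted by an "infogain" matrix H, with the row chosen by the true label. A sketch under that reading, with illustrative names and an arbitrary clipping threshold:

```python
import numpy as np

def infogain_loss(prob, labels, H):
    """prob: (N, K) predicted probabilities; labels: (N,) integer class ids;
    H: (K, K) infogain matrix (assumed layout: row = true label)."""
    log_p = np.log(np.clip(prob, 1e-20, None))   # clip only to avoid log(0); threshold is arbitrary
    # Each sample contributes -sum_k H[label, k] * log p[k]
    return -np.sum(H[labels] * log_p) / prob.shape[0]

# Hypothetical probabilities and labels
prob = np.array([[0.5, 0.5], [0.25, 0.75]])
labels = np.array([0, 1])
print(infogain_loss(prob, labels, np.eye(2)))
```

With H set to the identity matrix, only the log probability of the true class survives, and the expression reduces to the ordinary multinomial logistic loss.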

multinomial_logistic_loss corresponds to multinomial_logistic_loss_layer.

sigmoid_cross_entropy corresponds to sigmoid_cross_entropy_loss_layer; it is the loss used by logistic regression.
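A sketch of how this loss can be computed directly from raw scores without ever forming sigmoid(x), which keeps it finite for large |x|. Targets are assumed to lie in [0, 1]; the function name is illustrative and no batch normalization is applied.

```python
import numpy as np

def sigmoid_cross_entropy(logits, targets):
    """Per-element -[t*log(s(x)) + (1-t)*log(1-s(x))], computed from the logits x.
    Uses the identity: loss = max(x, 0) - x*t + log(1 + exp(-|x|))."""
    x, t = logits, targets
    return np.maximum(x, 0.0) - x * t + np.log1p(np.exp(-np.abs(x)))

# Hypothetical logits and binary targets
x = np.array([-2.0, 0.0, 3.0, 1000.0])
t = np.array([0.0, 1.0, 1.0, 0.0])
print(sigmoid_cross_entropy(x, t))
```

With labels y in {-1, +1} mapped to t = (y + 1) / 2, this is the same quantity as the log loss log(1 + exp(-y*f(w,x))) from the list of loss terms.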

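softmax_loss corresponds to softmax_loss_layer; for the loss itself, see the Softmax chapter of UFLDL. For multi-class classification in Caffe (imagenet, mnist, and so on) the loss is softmax_loss, which is the multi-class generalization of the sigmoid case. What I could not work out at first was how multinomial_logistic_loss differs from it. Judging from the code, the inputs differ: multinomial_logistic_loss expects probabilities, while softmax_loss takes raw scores and applies the softmax itself. But if that is the only difference, aren't the two the same?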

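The linked post explains the difference between the two in detail, with thorough test results, and is well worth reading. Put simply, the multinomial route splits the loss across two layers, while softmax_loss fuses them into one. In other words, multinomial_logistic_loss computes the backward gradient step by step, whereas softmax_loss merges the two steps into a single one and so avoids the precision lost in the intermediate result. For numerical stability softmax_loss is the better choice: multinomial is the textbook approach, and softmax_loss is an optimization of it.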

Quoted from the Caffe documentation:

Softmax

LayerType: SOFTMAX_LOSS

The softmax loss layer computes the multinomial logistic loss of the softmax of its inputs. It’s conceptually identical to a softmax layer followed by a multinomial logistic loss layer, but provides a more numerically stable gradient.
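The stability claim can be reproduced with a few lines of NumPy. This is a sketch of the idea rather than Caffe's actual code: the function names are made up, and the extreme scores are chosen only to force an underflow.

```python
import numpy as np

def softmax(scores):
    # Subtract the row maximum before exponentiating to avoid overflow
    shifted = scores - scores.max(axis=1, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=1, keepdims=True)

def two_step_grad(scores, labels):
    """Backprop through multinomial logistic loss, then through softmax."""
    n = scores.shape[0]
    p = softmax(scores)
    dp = np.zeros_like(p)
    # d(loss)/d(p) is -1/(N * p_label): blows up when p_label underflows to 0
    dp[np.arange(n), labels] = -1.0 / (n * p[np.arange(n), labels])
    # Softmax backward: dz_i = p_i * (dp_i - sum_j dp_j * p_j)
    return p * (dp - (dp * p).sum(axis=1, keepdims=True))

def fused_grad(scores, labels):
    """Gradient of the fused softmax loss: (softmax - one_hot) / N."""
    n = scores.shape[0]
    g = softmax(scores)
    g[np.arange(n), labels] -= 1.0
    return g / n

# Hypothetical extreme scores: the true class (index 1) gets a probability that underflows to 0
scores = np.array([[1000.0, 0.0]])
labels = np.array([1])
with np.errstate(all="ignore"):
    print("two-step:", two_step_grad(scores, labels))
print("fused:   ", fused_grad(scores, labels))
```

On ordinary inputs the two functions agree. On the extreme input the two-step route divides by a probability that has underflowed to zero and then multiplies the result by zero, so the gradient is lost, while the fused form never performs that division.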

References:

http://caffe.berkeleyvision.org/tutorial/layers.html

Bishop, Pattern Recognition and Machine Learning

http://deeplearning.stanford.edu/wiki/index.php/Softmax%E5%9B%9E%E5%BD%92

http://freemind.pluskid.org/machine-learning/softmax-vs-softmax-loss-numerical-stability/
