precision and recall

阿新 • Published: 2019-02-14

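First, note that multi-class and multi-label are different: in multi-class classification each sample belongs to exactly one class, whereas in multi-label classification each sample can carry several class labels, that is, belong to several classes at once.

In pattern recognition, information retrieval, binary classification, and similar problems, results often need to be evaluated. The usual metrics are accuracy, precision, and recall.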

1. Classification problems

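Take a binary classification problem in which the class dog is positive and the class cat is negative. The test set has 12 animals: 7 dogs and 5 cats. The classifier labels 8 of them as dogs and 4 as cats, and 5 of the 8 labeled as dogs really are dogs.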

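TP (true positive): correctly labeled positive. The number of samples that are actually positive and are identified as positive. In the example, 8 animals are labeled as dogs and 5 of them really are dogs, so TP = 5.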

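FP (false positive): incorrectly labeled positive. The number of samples that are actually negative but are identified as positive. In the example, the remaining 3 animals labeled as dogs are actually cats, so FP = 8 - 5 = 3. This is also called a false alarm, or a Type I error.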

TP + FP is the total number of samples identified as positive, regardless of the truth, which is 8 here.

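TN (true negative): correctly labeled negative. The number of samples that are actually negative and are identified as negative. In the example, 3 of the 5 cats were mislabeled as dogs, so the remaining cats were labeled correctly and TN = 5 - 3 = 2. Equivalently, of the 4 animals labeled as cats, only 2 really are cats.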

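FN (false negative): incorrectly labeled negative. The number of samples that are actually positive but are identified as negative. In the example, 2 of the animals labeled as cats are actually dogs, so FN = 7 - 5 = 2. This is also called a miss, or a Type II error.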

The most important thing is to understand clearly what TP, FP, TN, and FN mean.

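TP + FN is the number of samples in the whole test set that are actually positive. In the example, it counts the dogs identified as dogs plus the dogs identified as cats, which is 7.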

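In the abbreviations above, T and F say whether the identification is correct (true or false), while P and N give the class that was predicted (positive or negative).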
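The four counts can be checked mechanically. The sketch below rebuilds the 12-animal example as two label lists and tallies TP, FP, TN, and FN. The ordering of the animals and the helper name `confusion_counts` are choices made here for illustration and are not part of the original example.

```python
# Rebuild the 12-animal example: 1 = dog (positive), 0 = cat (negative).
# The order of the animals is arbitrary and chosen only for illustration.
y_true = [1] * 7 + [0] * 5                        # 7 real dogs, then 5 real cats
y_pred = [1] * 5 + [0] * 2 + [1] * 3 + [0] * 2    # 8 labeled dog, 4 labeled cat


def confusion_counts(y_true, y_pred, positive=1):
    """Return (TP, FP, TN, FN) for a binary problem."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1   # predicted positive, actually positive
            else:
                fp += 1   # predicted positive, actually negative
        else:
            if truth == positive:
                fn += 1   # predicted negative, actually positive
            else:
                tn += 1   # predicted negative, actually negative
    return tp, fp, tn, fn


tp, fp, tn, fn = confusion_counts(y_true, y_pred)
print(tp, fp, tn, fn)  # 5 3 2 2
```

The four counts always sum to the size of the test set (here 12), which is a quick sanity check on any hand-computed values.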

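Precision is the fraction of all samples identified as positive that are correctly identified (identified as positive and actually positive). Among the results identified as positive, some are right and some are wrong, which is why this is a question of how exact the positive predictions are. In formula form, precision = TP / (TP + FP).

Recall is the fraction of all positive samples in the test set that are correctly identified as positive. Not every positive test case is necessarily found, so this is a question of completeness. In formula form, recall = TP / (TP + FN).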

Precision for a class is the number of true positives (i.e. the number of items correctly labeled as belonging to the positive class) divided by the total number of elements labeled as belonging to the positive class (i.e. the sum of true positives and false positives, which are items incorrectly labeled as belonging to the class).

Recall in this context is defined as the number of true positives divided by the total number of elements that actually belong to the positive class (i.e. the sum of true positives and false negatives, which are items which were not labeled as belonging to the positive class but should have been).
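Applied to the dog/cat example, the two definitions work out as in the sketch below. The function names are illustrative, and returning 0.0 when a denominator is zero is a convention assumed here, not something the definitions prescribe.

```python
def precision(tp, fp):
    """Fraction of samples labeled positive that are actually positive."""
    labeled_positive = tp + fp
    # Assumed convention: report 0.0 when nothing was labeled positive.
    return tp / labeled_positive if labeled_positive else 0.0


def recall(tp, fn):
    """Fraction of actually positive samples that were labeled positive."""
    actual_positive = tp + fn
    # Assumed convention: report 0.0 when there are no positive samples.
    return tp / actual_positive if actual_positive else 0.0


# Dog/cat example: 8 animals labeled dog, 5 of them correctly; 7 real dogs.
p = precision(tp=5, fp=3)
r = recall(tp=5, fn=2)
print(f"precision = {p:.3f}")  # precision = 0.625
print(f"recall    = {r:.3f}")  # recall    = 0.714
```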

2. Retrieval problems

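In a literature retrieval application, the user enters a keyword. Some of the documents in the database are truly relevant to it, and the system retrieves a set of documents that it considers relevant. What do the retrieved documents look like? Most likely, only part of them are truly relevant, while the rest are not relevant but were retrieved because the system mistook them for relevant. In this setting, precision and recall are defined as follows.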

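precision = (number of retrieved documents that are truly relevant) / (total number of retrieved documents)

The numerator is the number of documents that were retrieved and are truly relevant (a subset of the retrieved documents); the denominator is the total number of retrieved documents.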

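recall = (number of retrieved documents that are truly relevant) / (total number of truly relevant documents in the database)

The numerator is the number of documents that were retrieved and are truly relevant (a subset of the retrieved documents); the denominator is the total number of truly relevant documents in the database.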

In an information retrieval scenario, the instances are documents and the task is to return a set of relevant documents given a search term; or equivalently, to assign each document to one of two categories, "relevant" and "not relevant". In this case, the "relevant" documents are simply those that belong to the "relevant" category.

Recall is defined as the number of relevant documents retrieved by a search divided by the total number of existing relevant documents, and precision is defined as the number of relevant documents retrieved by a search divided by the total number of documents retrieved by that search.
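In the retrieval setting both quantities reduce to set operations. The following is a minimal sketch, assuming documents are identified by string ids. The ids, the function name `retrieval_precision_recall`, and the zero-denominator convention are all hypothetical and used only for illustration.

```python
def retrieval_precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query, given two collections of ids."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant  # retrieved and truly relevant
    # Assumed convention: report 0.0 when a denominator would be zero.
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical query: 5 relevant documents exist, the system returns 4.
relevant = {"d1", "d2", "d3", "d4", "d5"}
retrieved = ["d1", "d2", "d6", "d7"]

p, r = retrieval_precision_recall(retrieved, relevant)
print(p, r)  # 0.5 0.4
```

Here two of the four retrieved documents are relevant (precision 0.5), and two of the five relevant documents were found (recall 0.4).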

3. The relationship between precision and recall

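Precision and recall usually pull in opposite directions, so the two are rarely discussed in isolation. The common practice is either to fix the value of one and then evaluate and optimize the other, or to combine the two into a single score, which is where the F-measure comes in (more on that later). Similarly, the true positive rate and the false positive rate are often plotted against each other to form the ROC curve, which helps in finding a balance.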
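The opposing behaviour is easiest to see by sweeping a decision threshold over classifier scores. The sketch below uses made-up scores and labels (not data from this article) and also prints F1, the common harmonic-mean form of the F-measure. The helper names and the 0.0 fallback for empty denominators are assumptions for illustration.

```python
# Hypothetical classifier scores (higher means "more likely positive")
# paired with hypothetical true labels; none of this data is from the article.
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.50, 0.40, 0.30]
labels = [1, 1, 0, 1, 0, 1, 0, 0]


def precision_recall_at(threshold, scores, labels):
    """Label every sample with score >= threshold as positive, then score it."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def f1_score(precision, recall):
    """Harmonic mean of precision and recall (the F1 form of the F-measure)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


for threshold in (0.85, 0.65, 0.45):
    p, r = precision_recall_at(threshold, scores, labels)
    print(f"threshold={threshold:.2f}  precision={p:.3f}  "
          f"recall={r:.3f}  F1={f1_score(p, r):.3f}")
# threshold=0.85  precision=1.000  recall=0.500  F1=0.667
# threshold=0.65  precision=0.750  recall=0.750  F1=0.750
# threshold=0.45  precision=0.667  recall=1.000  F1=0.800
```

On this toy data, lowering the threshold labels more samples as positive, so recall rises from 0.5 to 1.0 while precision falls from 1.0 to about 0.667.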

4. With TP, FP, TN, and FN defined above, we can also define other metrics

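Accuracy: the ratio of the number of correctly identified samples (whether positive or negative) to the total number of test samples, that is, accuracy = (TP + TN) / (TP + FP + TN + FN).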
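For the dog/cat example, accuracy can be computed from the four counts as sketched below. The counts follow from the example's numbers (12 animals, 7 dogs, 8 labeled dog, 5 of those correct); the function name and the zero-total fallback are illustrative assumptions.

```python
def accuracy(tp, fp, tn, fn):
    """Fraction of all samples, positive or negative, that were labeled correctly."""
    total = tp + fp + tn + fn
    # Assumed convention: report 0.0 for an empty test set.
    return (tp + tn) / total if total else 0.0


# Dog/cat example: TP = 5, FP = 8 - 5 = 3, FN = 7 - 5 = 2, TN = 5 - 3 = 2.
acc = accuracy(tp=5, fp=3, tn=2, fn=2)
print(f"accuracy = {acc:.3f}")  # accuracy = 0.583
```

So 7 of the 12 animals (5 dogs and 2 cats) are labeled correctly.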