Paper Reading: Anomaly Detection with Partially Observed Anomalies

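Anomaly detection is usually handled in a supervised or an unsupervised way, depending on whether labels are available. The paper addresses a different setting in which anomalies are only partially observed: there is a large amount of unlabeled data plus a small number of samples already labeled as anomalies. For this setting it proposes ADOA, a two-stage detection method. The first stage clusters the observed anomalies and picks out reliable normal samples and potential anomalies from the unlabeled data. The second stage uses weighted multi-class classification to give a confidence for each class.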

For unlabeled data, the commonly used unsupervised approaches are distance-based approaches [26], density-based approaches [3], and isolation-based methods [23].
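
As a rough illustration of the distance-based family, the sketch below scores each point by its mean distance to its k nearest neighbours, so isolated points get high scores. This is a generic example, not the specific method of [26]; the function name `knn_distance_scores` and the toy points are made up for illustration.

```python
import math


def knn_distance_scores(points, k=3):
    """Score each point by its mean Euclidean distance to its k nearest neighbours."""
    scores = []
    for i, p in enumerate(points):
        # Distances from p to every other point, smallest first.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(sum(dists[:k]) / k)
    return scores


# Four points in a tight group and one far away (hypothetical data).
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0)]
scores = knn_distance_scores(points, k=2)

# The index with the largest score is the most isolated point.
most_anomalous = max(range(len(points)), key=scores.__getitem__)
```

Density-based and isolation-based methods replace this score with a local density estimate or an isolation depth, but they are used the same way: rank the samples and flag the highest-scoring ones.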

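The paper uses malicious URL detection as its example. PU (Positive and Unlabeled) learning [17, 19] is a related setting, but it usually treats the positive samples as a single class, whereas the observed anomalies here can be of several different kinds.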

semi-supervised clustering

ADOA follows a two-stage manner. In the first stage, we address that the observed anomalies should not be simply regarded into one concept center, and by assuming that the anomalies belong to k different concept centers, the anomalies are firstly clustered into k clusters. After that, both potential anomalies and reliable normal samples are selected from the unlabeled samples according to the isolation degree and the similarity to the nearest anomaly cluster center. In stage two, a weight is set to each sample according to the confidence of its attached label, and a weighted multi-class classification model is built to distinguish different anomalies from the normal samples, using original anomalies and the selected samples. Experiments on different datasets and a real application task demonstrate the effectiveness of our approach.
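
The two stages described above can be sketched in plain Python. This is only a minimal sketch under stated assumptions, not the authors' implementation. The isolation degree is taken as a precomputed score in [0, 1] (for example from an isolation forest). The similarity `exp(-squared distance)`, the blending parameter `theta`, the thresholds `alpha` and `beta`, and the weight formulas are assumptions chosen for illustration, and every function name is hypothetical.

```python
import math
import random


def nearest_center(x, centers):
    """Index of the center closest to x (Euclidean distance)."""
    return min(range(len(centers)), key=lambda c: math.dist(x, centers[c]))


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means over tuples; returns k cluster centers."""
    centers = random.Random(seed).sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest_center(p, centers)].append(p)
        # Recompute each center as its group's mean; keep it if the group is empty.
        centers = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centers[i]
            for i, g in enumerate(groups)
        ]
    return centers


def total_scores(unlabeled, centers, isolation, theta=0.5):
    """Blend isolation degree with similarity to the nearest anomaly center."""
    scores = []
    for x, iso in zip(unlabeled, isolation):
        # Assumed similarity: exp(-squared distance), maximised over centers.
        similarity = max(math.exp(-math.dist(x, c) ** 2) for c in centers)
        scores.append(theta * iso + (1 - theta) * similarity)
    return scores


def select_and_weight(unlabeled, scores, alpha, beta):
    """Pick potential anomalies (score >= alpha) and reliable normals (score <= beta)."""
    lo, hi = min(scores), max(scores)
    span = (hi - lo) or 1.0  # avoid dividing by zero when all scores are equal
    potential, normals = [], []
    for x, s in zip(unlabeled, scores):
        if s >= alpha:
            potential.append((x, s))  # assumed weight: the score itself
        elif s <= beta:
            normals.append((x, (hi - s) / span))  # assumed weight: lower score, more trust
    return potential, normals


def build_training_set(anomalies, centers, potential, normals):
    """Rows of (sample, label, weight); the label len(centers) means 'normal'."""
    normal_label = len(centers)
    rows = [(x, nearest_center(x, centers), 1.0) for x in anomalies]
    rows += [(x, nearest_center(x, centers), w) for x, w in potential]
    rows += [(x, normal_label, w) for x, w in normals]
    return rows


# Toy data: two kinds of observed anomalies and five unlabeled samples.
anomalies = [(5.0, 5.0), (5.2, 5.1), (-5.0, 5.0), (-5.1, 4.9)]
unlabeled = [(0.0, 0.0), (0.2, -0.1), (-0.1, 0.3), (4.8, 5.1), (-4.9, 5.2)]
isolation = [0.1, 0.1, 0.2, 0.8, 0.9]  # hypothetical precomputed scores in [0, 1]

centers = kmeans(anomalies, k=2)
scores = total_scores(unlabeled, centers, isolation, theta=0.5)
potential, normals = select_and_weight(unlabeled, scores, alpha=0.7, beta=0.2)
training_set = build_training_set(anomalies, centers, potential, normals)
```

The resulting `training_set` has one class per anomaly cluster plus one normal class, with a weight on every row. Any multi-class classifier that accepts per-sample weights could be trained on it for stage two. Unlabeled samples whose score falls between `beta` and `alpha` are left out because neither label can be attached with confidence.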

2. Related Work

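The method works in two stages. The first stage groups the samples using clustering and an outlier score. A weight is then added to each sample, and the labeled samples are used to build the model.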

3. Problem Statement and Algorithm Description