Repost: Paper Notes, a Translation of "Person Re-identification: Past, Present and Future"

阿新 • Published: 2019-02-12

Original article: http://blog.csdn.net/zdh2010xyz/article/details/53741682

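2016, "Person Re-identification: Past, Present and Future"
Liang Zheng, Yi Yang, and Alexander G. Hauptmann

This is a survey paper on person re-ID.

Abstract

Person re-ID is becoming increasingly important. Early work was mostly about hand-crafted algorithms evaluated at small scale. In recent years, large-scale datasets and deep learning systems have emerged. The paper divides current re-ID work into two categories, image-based and video-based, and for each category it reviews both hand-crafted and deep learning systems. It also discusses two new re-ID tasks that are closer to real applications: end-to-end re-ID and fast re-ID in very large galleries.

Contributions of the paper:
1) It introduces the history of person re-ID and its relationship to image classification and instance retrieval.
2) It analyzes in detail the hand-crafted systems and large-scale methods for both image-based and video-based re-ID.
3) It outlines end-to-end re-ID and fast retrieval in large galleries as future directions.
4) It closes with a brief account of some under-developed but important issues.

1 Introduction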

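This section explains what re-ID is. It opens with a passage about the Trojan War that I did not follow; in any case, re-ID matters and has practical value. Technically, a person re-ID system in a real video surveillance setup can be split into three modules: person detection, person tracking, and person retrieval. The first two are independent computer vision tasks, so most re-ID work focuses on the last module.

Paper organization: the paper mainly discusses the vision part of re-ID. Unlike earlier surveys, it focuses on re-ID subtasks (those available now and those likely in the future) rather than going deep into techniques or architectures. It puts special emphasis on deep learning methods, end-to-end re-ID, and large-scale re-ID. Section 1.2 covers the history of re-ID. Section 1.3 covers the relationship between re-ID and classification and retrieval. Sections 2 and 3 review the image-based and video-based literature respectively, each split into hand-crafted and deeply-learned systems. Section 4 reviews techniques related to detection, tracking, and re-ID, and points out future research priorities. Section 5 covers large-scale re-ID, which represents the current best retrieval models and is also a future direction. Section 6 covers some open issues. Section 7 concludes.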

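As for the relationship with classification and retrieval, person re-ID combines the strengths of both. On one hand, during training, discriminative distance metrics or feature embeddings can be learned in the person space. On the other hand, during retrieval, efficient indexing structures and hashing techniques help to find a query in a large gallery.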

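2 Image-based person re-ID

The main setting uses a single image as the query. It can be described as a closed-world model: G is a gallery of N images, whose features are denoted as

$\{g_i\}_{i=1}^{N}$

These N images belong to N different identities. Given a probe (query) q, its identity is obtained by:

$i^{*} = \arg\max_{i \in \{1, 2, \ldots, N\}} \mathrm{sim}(q, g_i) \quad (1)$

where $\mathrm{sim}(\cdot, \cdot)$ is some similarity function.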
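The closed-world matching rule above can be sketched in a few lines. This is only an illustration, assuming the rule is "pick the gallery identity with the highest similarity to the query"; cosine similarity, the function name `identify`, and the toy feature values are my own choices, not something the paper prescribes.

```python
import numpy as np

def identify(query, gallery):
    """Return (index of the most similar gallery feature, all similarities).

    Cosine similarity is an assumed choice for sim(q, g_i).
    """
    q = query / np.linalg.norm(query)
    g = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
    sims = g @ q                      # one similarity score per gallery image
    return int(np.argmax(sims)), sims

# Toy gallery: 3 identities with made-up 4-dim features.
gallery = np.array([[1.0, 0.0, 0.0, 0.0],
                    [0.0, 1.0, 0.0, 0.0],
                    [0.0, 0.0, 1.0, 1.0]])
query = np.array([0.1, 0.9, 0.0, 0.1])

best, sims = identify(query, gallery)
print("predicted gallery index:", best)
```

In a real system the features would come from a descriptor or a network, and the similarity would typically be a learned metric rather than plain cosine similarity.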

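2.1 Hand-crafted Systems

From Eq. (1), a re-ID system has two components: image description and distance metrics.

(1) Pedestrian description

The most commonly used feature is color; texture features are used less often. Typical choices are the weighted color histogram (WH), maximally stable color regions (MSCR), and recurrent high-structured patches (RHSP). WH gives higher weights to pixels near the symmetry axis and produces one color histogram per body part. MSCR works on stable color regions and extracts features such as color, area, and centroid. RHSP is a texture feature that captures recurrent texture patches.

In recent years, hand-crafted features have focused on more or less the same cues. Zhao et al. extract a 32-dim LAB color histogram and a 128-dim SIFT feature from each 10×10 image patch. They then use adjacency constrained search, matching patches within corresponding horizontal stripes, to find the best-matching patch in the gallery image. Many others have worked along these lines; representative examples include SCNCD, LOMO, and BoW.

Besides extracting low-level color and texture features directly, another option is attribute-based features, which can be viewed as mid-level representations. Attributes are believed to be more robust to image translations than low-level descriptors. Many papers have worked on this, with good results.

(2) Distance Metric Learning

In hand-crafted re-ID systems, a good distance metric is critical, because high-dimensional visual features typically do not capture the invariant factors under sample variances.

Metric learning methods have already been surveyed in detail elsewhere. They are categorized by supervised versus unsupervised learning, global versus local learning, and so on. In person re-ID, the focus is mostly on supervised global distance metric learning. Global metric learning generally tries to pull vectors of the same class closer together and push vectors of different classes further apart. The most common formulation is the Mahalanobis distance. In person re-ID, the best-known metric learning method is KISSME (I have not understood its principle yet; to be added later). Many metric learning methods build on the Mahalanobis distance. Weinberger et al. proposed large margin nearest neighbor learning (LMNN), and Davis et al. proposed information-theoretic metric learning (ITML). More recently, Hirzer et al. proposed relaxing the positivity constraint, which lowers the computational cost. Chen et al. add a bilinear similarity to the Mahalanobis distance so that cross-patch similarities can be modeled. And so on.

Besides learning distance metrics, some work focuses on learning discriminative subspaces (I do not understand this yet; to be detailed later). Others use different learning tools, such as SVM and boosting.

2.2 Deeply-learned Systems

Since Krizhevsky et al. won ILSVRC'12, CNN-based deep learning models have become popular. Two types of CNN model are widely used: 1) the classification model, used in image classification and object detection; 2) the siamese model, which takes image pairs or triplets as input. The bottleneck for deep learning in re-ID is the lack of training data. Because most datasets provide only two images per identity, current CNN-based re-ID methods mainly use the siamese model.

One drawback of the siamese model is that it does not make full use of the re-ID annotations: it only uses pairwise (or triplet) labels. Another promising strategy is the classification/identification mode, which makes full use of the re-ID labels. On large datasets such as PRW and MARS, the classification model achieves good performance without careful training sample selection. For the model to converge, however, the identification loss needs more training instances per ID.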
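Going back to the Mahalanobis-style metrics of Section 2.1, the sketch below may help make KISSME more concrete. It assumes the usual KISSME formulation, in which the metric matrix M is the difference between the inverse covariance of feature differences over similar pairs and the inverse covariance over dissimilar pairs, followed by a projection onto the positive semidefinite cone. The function names, the `eps` regularizer, and the synthetic data are my own illustrative choices, not part of the survey.

```python
import numpy as np

def kissme_metric(X, labels, eps=1e-6):
    """Estimate a KISSME-style Mahalanobis matrix from labeled features."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    n, d = X.shape
    i, j = np.triu_indices(n, k=1)           # all unordered sample pairs
    diffs = X[i] - X[j]
    same = labels[i] == labels[j]
    # Covariance of pairwise differences (zero mean by symmetry of the pairs).
    cov_s = diffs[same].T @ diffs[same] / same.sum() + eps * np.eye(d)
    cov_d = diffs[~same].T @ diffs[~same] / (~same).sum() + eps * np.eye(d)
    M = np.linalg.inv(cov_s) - np.linalg.inv(cov_d)
    # Clip negative eigenvalues so that M defines a valid (pseudo-)metric.
    w, V = np.linalg.eigh(M)
    return (V * np.clip(w, 0.0, None)) @ V.T

def mahalanobis_sq(x, y, M):
    """Squared Mahalanobis distance (x - y)^T M (x - y)."""
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float(diff @ M @ diff)

# Synthetic data: dimension 0 encodes identity, dimension 1 is pure noise.
rng = np.random.default_rng(0)
ids = np.repeat(np.arange(5), 4)
X = np.stack([ids + 0.01 * rng.standard_normal(20),
              rng.standard_normal(20)], axis=1)
M = kissme_metric(X, ids)

d_identity = mahalanobis_sq([0, 0], [1, 0], M)   # step along the identity axis
d_noise = mahalanobis_sq([0, 0], [0, 1], M)      # step along the noise axis
print(d_identity > d_noise)
```

The learned metric stretches the direction that separates identities and shrinks the direction that only carries within-identity noise, which is the behavior a global metric is meant to have.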

The works mentioned above learn deep features in an end-to-end manner. Alternatively, low-level features such as SIFT and color histograms can be used as input and aggregated into a Fisher Vector.

2.3 Datasets and Evaluation

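First, dataset scale keeps growing. Second, bounding boxes are increasingly produced by pedestrian detectors such as DPM and ACF. Third, more cameras are being used. For evaluation metrics, the main one is the cumulative matching characteristics (CMC) curve. As research has deepened, and especially when multiple ground truths exist for a query, mean average precision (mAP) has also been proposed. As for re-ID accuracy over the years, it has been improving steadily.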
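The two metrics mentioned above can be sketched as follows. This is a simplified illustration: it ranks the whole gallery for each query and ignores the camera-based filtering that real benchmark protocols usually apply, and the function name `evaluate` and the toy distance matrix are assumptions of mine.

```python
import numpy as np

def evaluate(dist, query_ids, gallery_ids, max_rank=3):
    """Compute a CMC curve (up to max_rank) and mAP from a distance matrix.

    dist has shape (num_query, num_gallery); smaller means more similar.
    """
    query_ids = np.asarray(query_ids)
    gallery_ids = np.asarray(gallery_ids)
    cmc = np.zeros(max_rank)
    aps = []
    for q in range(dist.shape[0]):
        order = np.argsort(dist[q])                  # best match first
        matches = gallery_ids[order] == query_ids[q]
        if not matches.any():
            continue                                 # query has no ground truth
        first_hit = int(np.argmax(matches))
        if first_hit < max_rank:
            cmc[first_hit:] += 1                     # counted at this rank and beyond
        hit_ranks = np.flatnonzero(matches) + 1      # 1-based ranks of true matches
        precisions = np.arange(1, len(hit_ranks) + 1) / hit_ranks
        aps.append(precisions.mean())                # average precision of this query
    return cmc / len(aps), float(np.mean(aps))

# Toy example: 2 queries, 4 gallery images, 2 ground truths per query.
dist = np.array([[0.1, 0.5, 0.3, 0.9],
                 [0.2, 0.8, 0.6, 0.4]])
cmc, mAP = evaluate(dist, query_ids=[1, 2], gallery_ids=[1, 2, 1, 2])
print("CMC:", cmc, "mAP:", mAP)
```

CMC only looks at the first correct match, while mAP rewards ranking all ground truths of a query near the top, which is why it becomes more informative once a query has several ground truths.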

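3 Video-Based Person Re-ID

Video-based methods mainly deal with multi-shot matching schemes and with integrating temporal information.

3.1 Hand-crafted Systems

These mainly use color-based descriptors, as in image-based re-ID. The main difference lies in the distance computation, which now involves two sets of bounding box features; this is called "multi-shot" person re-ID. These methods mostly build appearance models from multiple shots. A newer trend is to incorporate temporal cues in the model. Wang et al. use spatial-temporal descriptors to re-identify pedestrians; the features include HOG3D and the gait energy image (GEI). Gao et al. exploit the periodicity of walking and split the gait into several segments for recognition.

3.2 Deeply-learned Systems

A distinct difference between video-based and image-based re-ID is that, with multiple images for each matching unit (video sequence), one has to use either a multi-match strategy or a single-match strategy after video pooling. Earlier works use the multi-match strategy, which is computationally expensive. Pooling-based methods, on the other hand, pool the multiple vectors of a query into one global vector, which scales better. As a result, current video-based re-ID methods all include a pooling step, which can be max/average pooling or can be learned by a fully connected layer. Another good practice is injecting temporal information in the final representation.

3.3 Datasets and Evaluation

Datasets for multi-shot re-ID include ETH, 3DPES, PRID-2011, iLIDS-VID, and MARS.

4 Future: Detection, Tracking, and Person Re-ID

4.1 Previous Works

Although person re-ID is currently an independent research task, the paper expects it to be combined with pedestrian detection and tracking in the future. In particular, the paper considers end-to-end re-ID systems (spotting a query person from raw videos), which take raw videos as input, integrate pedestrian detection and tracking, and then perform re-identification.

At present, most re-ID work rests on two assumptions: 1) a gallery of pedestrian bounding boxes is given; 2) the bounding boxes are hand-drawn, so their detection quality is very good. In practice neither assumption holds. On one hand, the gallery size varies with the detector threshold. A lower threshold produces more bounding boxes (a larger gallery, higher recall, but lower precision), and vice versa. Re-ID accuracy therefore varies with the threshold. On the other hand, when pedestrian detectors are used, the bounding boxes inevitably contain errors (misalignment, miss-detection, and false alarms), which strongly affect re-ID accuracy; few people have considered this problem so far.

Regarding the second issue, several datasets, such as CUHK03, Market-1501, and MARS, are close to practical scenarios. On these datasets, detected bounding boxes give lower accuracy than hand-drawn ones. In the MARS dataset, although tracking errors and detection errors are present, it is not known how tracking errors affect re-ID accuracy. For an end-to-end person re-ID system, choosing the detector and the tracker will therefore be a difficult problem.

In 2016, Xiao et al. and Zheng et al. almost simultaneously proposed end-to-end re-ID systems based on large-scale datasets. Both take raw video frames and a query bounding box as input, as shown in the figure below: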

The figure shows that, given the same re-ID feature, a better pedestrian detector yields higher re-ID accuracy. Several papers lead to the same conclusion: good pedestrian detection helps person re-ID. However, none of these so-called end-to-end systems has studied pedestrian tracking. Integrating detection, tracking, and retrieval into one framework is regarded as the ultimate goal. This research needs large-scale datasets that provide bounding box annotations for all three tasks.

4.2 Future Issues

1) System performance evaluation

A proper evaluation methodology is critical for the end-to-end re-ID task, which differs from conventional re-ID in that it has dynamic galleries. It is also still unknown how to evaluate detection/tracking performance in the context of person re-ID. Two questions arise.

First, evaluation metrics for pedestrian detection and tracking within re-ID matter a great deal. The evaluation protocol should be able to quantify and rank detector/tracker performance in a realistic and unbiased manner, and be informative of re-ID accuracy. Because the person re-ID task only needs to find the person and does not care much about the accuracy of person detection itself, the paper suggests that miss rate (MR) and average precision (AP) can be used to evaluate pedestrian detection for person re-ID. Another issue is how AP/MR are computed, which involves the IoU value: experiments indicate that an IoU threshold of 0.7 gives more stable accuracy than 0.5. The paper's suggestion is that a larger IoU criterion ensures better localization results, although this depends on the situation.

Although there are now ways to evaluate pedestrian detection, the evaluation of tracking for person re-ID is still largely unknown. Earlier multiple object tracking (MOT) benchmarks commonly use multiple object tracking precision (MOTP), mostly tracked (MT) targets, the total number of false positives (FP), the total number of ID switches (IDS), the total number of times a trajectory is fragmented (Frag), the number of frames processed per second (Hz), and so on. Some of these, such as processing speed, may say little about re-ID accuracy, because tracking in person re-ID is an off-line step. For re-ID, we envision that tracking precision is critical as it is undesirable to have outlier images in the tracklets which compromise the effectiveness of pooling. We also speculate that 80% might not be an optimal threshold for evaluating MT under re-ID.
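To make the IoU criterion used for AP/MR above concrete, here is a minimal sketch of computing IoU between a ground-truth box and a detection, and of how the 0.5 and 0.7 thresholds can disagree. The `(x1, y1, x2, y2)` box format and the example coordinates are assumptions for illustration only.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)   # 0 if boxes do not overlap
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

ground_truth = (0, 0, 100, 200)
tight_detection = (10, 10, 110, 210)   # slightly shifted box
loose_detection = (25, 0, 125, 200)    # shifted by a quarter of the width

for name, det in [("tight", tight_detection), ("loose", loose_detection)]:
    score = iou(ground_truth, det)
    print(name, round(score, 3), "ok@0.5:", score >= 0.5, "ok@0.7:", score >= 0.7)
```

The loosely localized box counts as a correct detection at 0.5 but not at 0.7, which is the kind of misalignment a stricter IoU criterion filters out before the boxes reach the re-ID stage.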
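In future datasets, once tracking for re-ID is taken into account, the first task is to design appropriate metrics for evaluating different trackers.

Second, the evaluation procedure concerns the re-ID accuracy of the entire system. This involves the detector threshold: if it is too strict, the gallery is small and the target may be missing from it; if it is too loose, the gallery is large and more background is included. Both outcomes hurt re-ID. There is no effective solution yet, but one point should be kept in mind: the gallery size is controlled by the detector threshold, and new evaluation metrics should take this into account. Another point is how to locate where the query identity appears in a given video. This task is relatively simpler than detection/tracking plus re-identification: it does not need such high detection accuracy, and being able to localize is enough. For this task a loose IoU can be set, and more effort can go into matching, that is, finding the specific person among a large pool of bounding boxes or spatial-temporal tubes.

2) The Influence of Detector/Tracker on Re-ID

For end-to-end re-ID systems, it is worth studying how detection/tracking methods and data contribute to re-ID.

First, detection/tracking errors do affect re-ID accuracy. Some studies show, however, that detection/tracking errors can be avoided at an earlier stage. For example, in the network proposed by Xiao et al., a localization loss is added to the Fast R-CNN sub-model, which helps the re-ID system localize effectively. Future research could look at making person re-ID independent of detection/tracking quality. Since developing an error-free detector and tracker is unrealistic, the paper suggests integrating detection confidence into the re-ID matching scores. Examples: how to correct errors by effectively identifying outliers, and how to train context models that do not rely solely on detected bounding boxes.

Second, detection and tracking deserve more attention; if designed appropriately, they can greatly benefit re-ID. Although it is not yet directly evident that pedestrian detection/tracking helps re-ID, the relationship between generic image classification and fine-grained classification offers some clues. Being better at telling different identities apart should help in telling pedestrians from background, and vice versa.

Another direction worth studying is unsupervised tracking data. Tracking pedestrians in video is not that hard, although errors are inevitable. Face recognition, color, and non-background information can all help improve tracking accuracy. During tracking, a pedestrian's appearance can change considerably. These image sequences, i.e. the tracking results, can be used to train pedestrian verification/identification models, reducing the reliance on large-scale supervised data.

5 Future: Person Re-ID in Very Large Galleries

Although datasets keep growing, they are clearly still far from practical scale. So person re-ID in very large galleries should be a critical direction in the future.

6 Other Important yet Under-developed Open Issues

6.1 Battle Against Data Volume

Annotating datasets for person re-ID is very hard, because both the bounding boxes and the IDs have to be labeled. In the last two years several large-scale datasets have appeared, such as Market-1501, PRW, LSPS, and MARS. Their creators deserve thanks, but these datasets are still far from practical scale. The paper sees two alternative strategies for improving the situation. First, how to use annotations from tracking and detection remains to be explored in depth. Second, transfer learning, which transfers a trained model from the source to the target domain.

6.2 Re-ranking Re-ID Results

Re-identification can be viewed as a retrieval process, so re-ranking becomes very important for improving retrieval accuracy.

7 Conclusion

Omitted.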
