Mask-guided Contrastive Attention Model for Person Re-Identification: Detailed Notes

阿新 • Published: 2018-11-17

I have been reading Re-ID papers recently, and this post records my notes on this one. Code link

1. Overview

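A binary body mask can contribute to Re-ID in two ways. 1. The mask helps remove pixel-level background clutter, which can greatly improve the robustness of a ReID model under varied background conditions. 2. The mask carries body-shape information, which can be regarded as an important gait feature.
However, directly masking out the background in the image makes performance worse. The experimental results are in Section 4.3 of the paper:
[Figure: experimental results from Section 4.3 of the paper]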
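To make "directly masking out the background" concrete, here is a minimal sketch of pixel-level masking. The function name `mask_background` and the nested-list image layout are assumptions made for illustration; they do not come from the paper or its code.

```python
def mask_background(image, mask):
    """Zero out every pixel whose binary mask value is 0.

    image: H x W nested list of (r, g, b) tuples (hypothetical layout).
    mask:  H x W nested list of 0/1 ints, 1 = body, 0 = background.
    """
    return [
        [pixel if keep else (0, 0, 0) for pixel, keep in zip(img_row, mask_row)]
        for img_row, mask_row in zip(image, mask)
    ]


# Tiny 2 x 2 example: the body occupies the diagonal.
image = [[(10, 20, 30), (40, 50, 60)],
         [(70, 80, 90), (15, 25, 35)]]
mask = [[1, 0],
        [0, 1]]
masked = mask_background(image, mask)
```

This is the hard, image-level masking that the post says hurts performance. MGCAM instead applies the mask information at the feature level.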

2. Network Structure

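To address this problem, the authors use the binary mask to reduce background noise at the feature level, and propose a Mask-Guided Contrastive Attention Model (MGCAM) that learns features contrastively from the body and background regions, as shown in the figure:
[Figure: MGCAM network architecture]
In feature space, features learned from the body region and from the full image should be similar, whereas features learned from the background and from the full image should differ. To achieve this, MGCAM first produces a pair of contrastive attention maps under the guidance of the binary body mask. The contrastive attention maps are then applied to the CNN features to generate body-aware and background-aware features respectively.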
There are two main components: the contrastive attention sub-net and the region-level triplet loss for contrastive feature learning. The first generates a pair of inverse attention masks, which are used for body-aware and background-aware feature learning. The second constrains the distances between the features from the full stream, the body stream and the background stream.
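Of the three streams, the full stream learns features from the whole image, the body stream learns features weighted by the body attention map, and the background stream learns features weighted by the background attention map. Although all three streams learn from the same image, they differ: the features the background stream learns from the background are useless for the Re-ID task, and the influence of the background on the foreground should be removed. The authors therefore use a triplet loss in which the anchor is the full-image feature, the positive sample is the body feature and the negative sample is the background feature. With this loss they want the body feature to provide most of the information while reducing the background's influence on the final result.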

3. Loss

3.1 Mask-guided Contrastive Attention Sub-net

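[Figure: contrastive attention sub-net]
The body and background attention maps are complementary, so for every location $(i, j)$ on the feature map the following must hold:

$$\phi_{body}(i, j) + \phi_{bkgd}(i, j) = 1$$

The body feature and the background feature are then obtained by multiplying $f_{stage_2}$ with each of these two attention maps:
[Figure: equations for the body and background features]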
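The complementary attention maps and the weighting of $f_{stage_2}$ can be sketched as follows. This is only an illustration: it assumes the product is an element-wise spatial weighting applied to every channel, and the names `contrastive_features`, `feature_map` and `body_attention` are hypothetical.

```python
def contrastive_features(feature_map, body_attention):
    """Split a feature map into body-aware and background-aware parts.

    feature_map:    C x H x W nested lists (stand-in for f_stage2).
    body_attention: H x W nested list with values in [0, 1].
    The background attention is taken as 1 - body attention, so the
    two maps sum to 1 at every location (i, j).
    """
    bkgd_attention = [[1.0 - a for a in row] for row in body_attention]

    def weight(attention):
        # Apply the same spatial attention map to every channel.
        return [
            [[v * a for v, a in zip(f_row, a_row)]
             for f_row, a_row in zip(channel, attention)]
            for channel in feature_map
        ]

    return weight(body_attention), weight(bkgd_attention)


feature_map = [[[2.0, 4.0],
                [6.0, 8.0]]]          # one channel, 2 x 2
body_attention = [[1.0, 0.25],
                  [0.5, 0.0]]
f_body, f_bkgd = contrastive_features(feature_map, body_attention)
```

Under this assumption the two weighted features add back up to the original feature map, which is one way to check that the attention maps really are complementary.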

3.2 Region-Level Triplet Loss for Contrastive Feature Learning

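The authors use a loss function to make the body feature and the background feature distinct from each other. The loss is defined as follows:
[Figure: region-level triplet loss equation]
The authors use a triplet loss. The anchor is naturally the full-image feature, the positive sample is the body feature and the negative sample is the background feature. The intuition is straightforward: the loss pushes the body feature to provide most of the information while reducing the background's influence on the final result.
Note: $m$ is a hyperparameter (the margin), empirically set to 10.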
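Since the equation image is not available here, the following is a sketch of a triplet loss with the roles described above (anchor = full feature, positive = body feature, negative = background feature). The hinge form and the squared Euclidean distance are assumptions based on the usual triplet loss, not a transcription of the paper's formula; the function names are hypothetical. The margin default of 10 comes from the note above.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def region_triplet_loss(f_full, f_body, f_bkgd, margin=10.0):
    """Hinge-style triplet loss over region-level features.

    Pulls the full-image feature towards the body feature and pushes
    it at least `margin` further away from the background feature.
    """
    pos = squared_distance(f_full, f_body)   # anchor vs. positive
    neg = squared_distance(f_full, f_bkgd)   # anchor vs. negative
    return max(0.0, pos - neg + margin)


f_full = [1.0, 2.0]
f_body = [1.0, 3.0]

# Background far from the full feature: margin satisfied, loss is zero.
loss_far = region_triplet_loss(f_full, f_body, [4.0, 6.0])
# Background as close as the body: the full margin is paid.
loss_near = region_triplet_loss(f_full, f_body, [2.0, 2.0])
```

With these toy vectors the first case gives a loss of 0.0 and the second a loss of 10.0, which shows how the margin penalizes a background feature that stays close to the full-image feature.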

3.3 Objective Function

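Everything described so far serves the Re-ID objective. The overall framework is:
[Figure: overall MGCAM framework]
The framework resembles a Siamese network. For the two people being compared, the MGCAM network extracts final features $h(p)$ and $h(g)$, and their similarity is compared with the following function:

[Equation: similarity function over $h(p)$ and $h(g)$]

Note: $m$ is the same as above, an empirical value of 10.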

The objective function used to train the whole network is:
[Figure: overall objective function]
Note: λ, α and β are hyperparameters, set to 0.01, 0.01 and 0.1 respectively in the paper's experiments.

4. Summary

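The authors' contributions can be summarized as follows:
1. To reduce background clutter in person images, they design a contrastive attention model guided by the binary mask. It generates a pair of body-aware and background-aware attention maps, which are used to produce the body and background features.

2. They further propose a region-level triplet loss over the features of the full image, the body and the background. It forces the model to learn features that are invariant to background clutter.

3. They explore feeding the body mask as an additional input alongside the RGB image to enhance ReID feature learning. The binary mask has two main advantages: 1) it helps reduce background clutter, and 2) it contains identity-related features such as body-shape information.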
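One simple way to picture "the body mask as an additional input alongside the RGB image" is to append the mask as a fourth channel per pixel. This is a sketch under that assumption; the post does not spell out how the mask is combined with the image, and `stack_mask_channel` is a hypothetical helper.

```python
def stack_mask_channel(image, mask):
    """Append the binary mask value to each RGB pixel.

    image: H x W nested list of (r, g, b) tuples.
    mask:  H x W nested list of 0/1 ints.
    Returns an H x W nested list of (r, g, b, m) tuples.
    """
    return [
        [(r, g, b, m) for (r, g, b), m in zip(img_row, mask_row)]
        for img_row, mask_row in zip(image, mask)
    ]


rgb = [[(10, 20, 30), (40, 50, 60)],
       [(70, 80, 90), (15, 25, 35)]]
body_mask = [[1, 0],
             [0, 1]]
rgbm = stack_mask_channel(rgb, body_mask)
```

Unlike hard masking, this keeps the background pixels intact and lets the network decide how to use the mask.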
