
Re-ID: AlignedReID: Surpassing Human-Level Performance in Person Re-Identification (Paper Walkthrough)

  • A global feature (a C-d vector) is extracted by directly applying global pooling on the feature map.
  • In other words, the global feature is obtained by applying global pooling directly to the whole feature map.
  • For the local features, a horizontal pooling, which is a global pooling in the horizontal direction, is first applied to extract a local feature for each row, and a 1×1 convolution is then applied to reduce the channel number from C to c. In this way, each local feature (a c-d vector) represents a horizontal part of the image of a person.
  • That is, horizontal pooling is applied to the feature map row by row, followed by a 1×1 convolution; each resulting local feature describes one horizontal stripe of the body.
  • As a result, a person image is represented by a global feature and H local features.
  • In the end, an image of a person is represented by one global feature and H local features.
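To make this step concrete, below is a minimal PyTorch sketch of the two branches. PyTorch itself, the class name AlignedFeatureHead, the channel sizes, and the use of average pooling are illustrative assumptions, not the authors' released code.

```python
import torch
import torch.nn as nn


class AlignedFeatureHead(nn.Module):
    """Turn a (N, C, H, W) feature map into one global feature (C-d)
    and H local features (c-d each). Hypothetical sketch."""

    def __init__(self, in_channels=2048, local_channels=128):
        super().__init__()
        # 1x1 convolution that reduces the channel number from C to c
        self.reduce = nn.Conv2d(in_channels, local_channels, kernel_size=1)

    def forward(self, feature_map):
        # Global feature: global (average) pooling over the whole map -> (N, C)
        global_feat = feature_map.mean(dim=(2, 3))
        # Horizontal pooling: global pooling along the horizontal direction,
        # one feature per row of the feature map -> (N, C, H, 1)
        local_feat = feature_map.mean(dim=3, keepdim=True)
        # 1x1 conv reduces channels from C to c -> (N, c, H, 1) -> (N, H, c)
        local_feat = self.reduce(local_feat).squeeze(3).permute(0, 2, 1)
        return global_feat, local_feat


# Example: a ResNet-50-like feature map of size 2048 x 8 x 4
head = AlignedFeatureHead(in_channels=2048, local_channels=128)
g, l = head(torch.randn(2, 2048, 8, 4))   # g: (2, 2048), l: (2, 8, 128)
```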
  • The distance of two person images is the summation of their global and local distances.
  • The distance between two images is the sum of their global distance and their local distance.
  • The global distance is simply the L2 distance of the global features.
  • The global distance is simply the L2 distance between the two global features.
  • For the local distance, we dynamically match the local parts from top to bottom to find the alignment of local features with the minimum total distance.
  • The local distance is the length of the shortest path found by dynamic programming; this shortest path also determines which local features are aligned with each other.
  • This is based on a simple assumption that, for two images of the same person, the local feature from one body part of the first image is more similar to the semantically corresponding body part of the other image.
  • This metric rests on a simple assumption: for the same person, the same body part keeps a high similarity across different images.
  • Given the local features of two images, F = {f_1, ..., f_H} and G = {g_1, ..., g_H}, we first normalize the distance to [0, 1) by an element-wise transformation:
    • d_{i,j} = (e^{||f_i - g_j||_2} - 1) / (e^{||f_i - g_j||_2} + 1),   i, j ∈ {1, ..., H}
    • where d_{i,j} is the distance between the i-th vertical part of the first image and the j-th vertical part of the second image. A distance matrix D is formed based on these distances, where its (i, j)-element is d_{i,j}.
  • We define the local distance between the two images as the total distance of the shortest path from (1, 1) to (H, H) in the matrix D.
  • The formula above defines each element of the distance matrix D.
  • The distance can be calculated through dynamic programming as follows:
    • S_{1,1} = d_{1,1}
    • S_{i,1} = S_{i-1,1} + d_{i,1}  for i > 1
    • S_{1,j} = S_{1,j-1} + d_{1,j}  for j > 1
    • S_{i,j} = min(S_{i-1,j}, S_{i,j-1}) + d_{i,j}  for i > 1, j > 1
    • where S_{i,j} is the total distance of the shortest path when walking from (1, 1) to (i, j) in the distance matrix D, and S_{H,H} is the total distance of the final shortest path between the two images.
  • The formulas above are the state-transition equations used by the dynamic program to compute the shortest path.
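The normalization and the dynamic program can be put together as follows. This is a rough sketch assuming PyTorch; the function name local_distance is hypothetical, and a plain O(H²) loop is used for readability.

```python
import torch


def local_distance(F_parts, G_parts):
    """Aligned local distance between two images.

    F_parts, G_parts: (H, c) tensors holding the H local features of each
    image. Returns the total distance of the shortest path from (1, 1) to
    (H, H) in the normalized distance matrix D.
    """
    H = F_parts.shape[0]
    # Pairwise L2 distances between parts ...
    raw = torch.cdist(F_parts.unsqueeze(0), G_parts.unsqueeze(0)).squeeze(0)  # (H, H)
    # ... normalized element-wise to [0, 1): d = (e^x - 1) / (e^x + 1)
    D = (torch.exp(raw) - 1) / (torch.exp(raw) + 1)

    # Dynamic programming: S[i, j] is the shortest-path distance from (0, 0)
    # to (i, j) when we may only move right or down, so the top-to-bottom
    # order of the parts is preserved.
    S = torch.zeros_like(D)
    for i in range(H):
        for j in range(H):
            if i == 0 and j == 0:
                S[i, j] = D[i, j]
            elif i == 0:
                S[i, j] = S[i, j - 1] + D[i, j]
            elif j == 0:
                S[i, j] = S[i - 1, j] + D[i, j]
            else:
                S[i, j] = torch.min(S[i - 1, j], S[i, j - 1]) + D[i, j]
    return S[H - 1, H - 1]
```

The training distance of an image pair is then the L2 distance of the global features plus this local distance.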
  • Non-corresponding alignments are necessary to maintain the order of vertical alignment, as well as to make the corresponding alignments possible.
  • The shortest path may therefore contain non-corresponding alignments; they do not hurt the result, and they are essential for keeping the vertical alignment order and making the corresponding alignments possible.
  • The reason for using the global distance to mine hard samples is twofold:
    • First, the calculation of the global distance is much faster than that of the local distance.
    • Second, we observe that there is no significant difference in mining hard samples using both distances.
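A possible sketch of this mining step, using only the global L2 distances within a batch. The function name mine_hard_examples and the batch-hard rule (farthest positive, closest negative) are assumptions in the spirit of TriHard-style sampling, not a reproduction of the authors' code.

```python
import torch


def mine_hard_examples(global_feats, labels):
    """Batch-hard mining using only the global L2 distances.

    global_feats: (N, C) global features of a batch, labels: (N,) identity ids.
    Returns, for every anchor, the index of its hardest positive (farthest
    sample of the same identity) and hardest negative (closest sample of a
    different identity).
    """
    dist = torch.cdist(global_feats.unsqueeze(0),
                       global_feats.unsqueeze(0)).squeeze(0)      # (N, N)
    same_id = labels.unsqueeze(0) == labels.unsqueeze(1)          # (N, N)

    # Hardest positive: the largest distance among same-identity samples
    hardest_pos = dist.masked_fill(~same_id, float('-inf')).argmax(dim=1)
    # Hardest negative: the smallest distance among other-identity samples
    hardest_neg = dist.masked_fill(same_id, float('inf')).argmin(dim=1)
    return hardest_pos, hardest_neg
```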
  • Note that in the inference stage, we only use the global features to compute the similarity of two person images. We make this choice mainly because we unexpectedly observed that the global feature itself is also almost as good as the combined features.
  • This somehow counter-intuitive phenomenon might be caused by two factors:
    • the feature map jointly learned is better than learning the global feature only, because we have exploited the structure prior of the person image in the learning stage;
    • with the aid of local feature matching, the global feature can pay more attention to the body of the person, rather than overfitting the background.
  • The above explains why, at inference time, only the global feature distance is used rather than the local distance or a combination of both.
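A minimal sketch of this inference step (the function name rank_gallery is hypothetical; only the global features are compared):

```python
import torch


def rank_gallery(query_feat, gallery_feats):
    """Rank gallery images by the L2 distance between global features only.

    query_feat: (C,) global feature of the query image.
    gallery_feats: (M, C) global features of the gallery.
    Returns gallery indices ordered from best to worst match.
    """
    dists = torch.norm(gallery_feats - query_feat.unsqueeze(0), dim=1)  # (M,)
    return torch.argsort(dists)
```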
  • We apply mutual learning to train models for AlignedReID, which can further improve performance.
  • The authors train the models with mutual learning because it further improves performance.
  • A distillation-based model usually transfers knowledge from a pre-trained large teacher network to a small student network.
  • In conventional distillation, a large teacher network is pre-trained first, and its knowledge is then transferred to a small student network.
  • In this paper, we train a set of student models simultaneously, transferring knowledge between each other.
  • This paper instead trains several student models at the same time and lets them transfer knowledge to each other.
  • We propose a new mutual learning loss for metric learning.
    • The overall loss function includes the metric loss, the metric mutual loss, the classification loss, and the classification mutual loss.
    • The metric loss is decided by both the global distances and the local distances, while the metric mutual loss is decided only by the global distances.
    • The classification mutual loss is the KL divergence for classification.
  • The metric mutual loss is defined on the global distances produced by the two networks.
  • A zero gradient function, whose output equals its input but whose gradient is treated as zero during back-propagation, is applied inside this loss; applying it changes the second-order gradients of the loss.
  • We found that it speeds up the convergence and improves the accuracy compared to a mutual loss without the zero gradient function.
  • The paper thus defines a new mutual learning loss, and the zero gradient function inside it speeds up convergence and improves accuracy.
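As a rough illustration of the two mutual terms (assuming PyTorch): the KL divergence below matches the classification mutual loss described above, while the body of metric_mutual_loss and the use of detach() in place of the zero gradient function are my own approximations, not the paper's exact equation.

```python
import torch
import torch.nn.functional as F


def classification_mutual_loss(logits_a, logits_b):
    """Classification mutual loss: KL divergence between the class
    predictions of two student networks (the "teacher" side is detached)."""
    target = F.softmax(logits_b, dim=1).detach()
    return F.kl_div(F.log_softmax(logits_a, dim=1), target,
                    reduction='batchmean')


def metric_mutual_loss(dist_a, dist_b):
    """Illustrative metric mutual loss on the global distance matrices of
    the two networks. detach() plays the role of a zero gradient function:
    dist_b is treated as a constant, so no gradient flows into network b
    through this term. The exact formula in the paper may differ."""
    return ((dist_a - dist_b.detach()) ** 2).mean()
```

For each network, these mutual terms would be added to its own metric and classification losses to form the overall objective described above.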