
View Invariant Gait Recognition Using Only One Uniform Model: Paper Translation and Notes



The advantage stated in the paper: "The unique advantage is that it can extract view invariant feature from any view using only one model."

II. VIEW INVARIANT GAIT FEATURE EXTRACTION

In gait recognition, when the angle between the walking direction and the camera is 90° (the side view), it is the best view for gait recognition because it carries more dynamic information. We try to transform the gait data from any view to the side view using one uniform non-linear model, and then extract the view-invariant feature. The proposed model is inspired by the one in [14], where a model based on the auto-encoder, named Stacked Progressive Auto-Encoders (SPAE), is proposed to deal with multi-view face recognition. We apply it to multi-view gait recognition. The framework for view-invariant feature extraction is illustrated in Fig. 1. The model is described in the following subsections.


Figure 1

B. Auto-Encoder for Gait View Transformation

Fig. 3. (a) Schematic diagram of an auto-encoder; (b) a larger angle change is much more difficult for one auto-encoder to handle. It is much harder for one auto-encoder to transform a 54° image to a 90° image than to transform a 72° image to a 90° image, but we can gradually transform the 54° image to a 72° image with one auto-encoder and then the 72° image to a 90° image with another auto-encoder, which is much easier.


The auto-encoder [16] is one of the most popular models in recent years. It can be used to extract compact features. As shown in Fig. 3(a), an auto-encoder usually contains three layers: an input layer, a hidden layer and an output layer. An auto-encoder has two parts, an encoder and a decoder. The encoder transforms the input data into a new representation in the hidden layer. It usually consists of a linear transformation followed by a nonlinear transformation, as follows:

$$y = f(x) = s(Wx + b)$$

where $f(\cdot)$ denotes the encoder, $W$ denotes the linear transformation, $b$ denotes the bias and $s(\cdot)$ is the nonlinear transformation, also called the activation function, such as:

$$s(t) = \frac{1}{1 + e^{-t}}$$
The decoder can transform the hidden layer representation back to input data as follows:

$$x' = g(y) = s(W'y + b')$$
where $g(\cdot)$ denotes the decoder, $W'$ and $b'$ denote the linear transformation and bias in the decoder, and $x'$ is the output data. We usually use the least-squares error as the cost function to optimize the parameters $W$, $b$, $W'$ and $b'$:

$$J(W, b, W', b') = \frac{1}{N}\sum_{i=1}^{N} \left\| x_i - x'_i \right\|^{2}$$
where $x_i$ denotes the $i$-th of the $N$ training samples and $x'_i$ is the corresponding output for $x_i$. The traditional auto-encoder reconstructs its input, but if we replace the output with data that differs from the input, the whole auto-encoder can be regarded as a regression function. However, it would be very hard for just one auto-encoder to handle a large angle change.
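As a concrete illustration, the encoder $s(Wx + b)$, the decoder $s(W'y + b')$ and the least-squares cost above can be sketched in NumPy. This is a minimal sketch: the layer sizes, random weights and random stand-in samples are hypothetical, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(t):
    # s(t) = 1 / (1 + e^{-t})
    return 1.0 / (1.0 + np.exp(-t))

d_in, d_hid = 64, 16                        # hypothetical layer sizes
W = rng.normal(0.0, 0.1, (d_hid, d_in))     # encoder weights W
b = np.zeros(d_hid)                         # encoder bias b
W2 = rng.normal(0.0, 0.1, (d_in, d_hid))    # decoder weights W'
b2 = np.zeros(d_in)                         # decoder bias b'

def encode(x):
    # y = f(x) = s(Wx + b)
    return sigmoid(W @ x + b)

def decode(y):
    # x' = g(y) = s(W'y + b')
    return sigmoid(W2 @ y + b2)

X = rng.random((100, d_in))                 # N = 100 random stand-in samples
recon = np.array([decode(encode(x)) for x in X])

# least-squares cost J = (1/N) * sum_i ||x_i - x'_i||^2
J = np.mean(np.sum((X - recon) ** 2, axis=1))
print(J)
```

Training would then minimize $J$ over $W$, $b$, $W'$ and $b'$ by gradient descent; only the forward pass and the cost are shown here.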


As shown in Fig. 3(b), the difference between the 54° image and the 90° image is much larger than that between the 72° image and the 90° image, especially in the leg part. It would be very difficult for just one auto-encoder to transform the 54° image to the 90° image. But if we use one auto-encoder to transform the 54° image to the 72° one, and then use another auto-encoder to transform the 72° image to the 90° image, it is much easier. So we may need more than one auto-encoder to deal with the multi-view challenge.

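The chaining idea can be sketched as plain function composition. The transforms below are stand-in stubs that act on view labels rather than real trained auto-encoders acting on gait energy images; all names are hypothetical.

```python
# Stand-in single-step transforms; in the real system each would be a
# trained auto-encoder acting on a gait energy image, not on a view label.
def ae_54_to_72(view):
    return 72 if view == 54 else view

def ae_72_to_90(view):
    return 90 if view == 72 else view

def to_side_view(view):
    # Two easy adjacent-view steps instead of one hard 54° -> 90° jump
    return ae_72_to_90(ae_54_to_72(view))

print(to_side_view(54))
```

The point is only the structure: each stage solves a small, adjacent-view problem, and composing the stages covers the large view change.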

C. Stacked Progressive Auto-Encoders (SPAE)

In [14], the authors stacked some auto-encoders together to deal with the multi-view problem in face recognition. In model training, the output is synthesized in a progressive way. We stack some auto-encoders together in a similar manner for the multi-view challenge in gait recognition. In gait recognition, the side view contains more dynamic information about the gait, so we try to convert all the gait energy images to the side view. It is difficult for one auto-encoder to deal with all the views, so each auto-encoder converts the gait energy images at a view farther from the side view to the adjacent view closer to it, while the gait energy images at views already closer to the side view are kept unchanged. After several auto-encoders, all the images gradually become side-view images, as shown in Fig. 4, which is very helpful for improving the accuracy of gait recognition. It is assumed that there are 2 × L + 1 views in the dataset, the difference between adjacent angles is Δ = 18° and L = 5, so the view angles of the gait data are {0°, 18°, · · · , 180°}. The auto-encoder in the first layer maps the gait images at 0° to 18° and the gait images at 180° to 162°, while keeping the gait images from 18° to 162° unchanged. The auto-encoder in the second layer then maps the gait images at views smaller than 36° to 36° and those larger than 144° to 144°. The last layer maps all the images to 90° while keeping the images at 90° unchanged. Fig. 4 shows a schematic view of the progressive training phase.

(Note: each layer is trained on a different subset of the data but has the same structure; each layer is first trained with data at its specific views, and then all layers are stacked together and fine-tuned with the whole dataset.)
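The progressive target-view schedule described above can be sketched as follows, under the stated setup Δ = 18° and L = 5. The function name is an assumption for illustration, not from the paper.

```python
DELTA = 18                          # angle step between adjacent views
L = 5                               # number of layers; 2 * L + 1 = 11 views
VIEWS = list(range(0, 181, DELTA))  # {0, 18, ..., 180} in degrees

def target_view(view, layer):
    """Target view produced by layer `layer` (1-based) for a GEI at `view`.

    Layer k pushes views below k * DELTA up to k * DELTA, views above
    180 - k * DELTA down to 180 - k * DELTA, and keeps the rest unchanged,
    so after layer L = 5 every view has been mapped to 90 degrees.
    """
    lo = layer * DELTA
    hi = 180 - layer * DELTA
    return min(max(view, lo), hi)

# Layer 1: 0 -> 18, 180 -> 162, views from 18 to 162 unchanged
print([target_view(v, 1) for v in VIEWS])
# Last layer: everything mapped to 90
print([target_view(v, L) for v in VIEWS])
```

This matches the description in the text: layer 1 maps 0° to 18° and 180° to 162°, layer 2 clamps everything into [36°, 144°], and layer 5 clamps everything to 90°.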
