
Paper Notes: Collaborative Deep Reinforcement Learning for Joint Object Search


Collaborative Deep Reinforcement Learning for Joint Object Search

CVPR 2017

Motivation:

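  Traditional bottom-up object region proposal methods extract a large number of proposals, so the downstream computation depends on powerful hardware such as GPUs. When computing resources are limited, this restricts where such methods can be applied. Active search methods (i.e., RL-based methods) offer a good alternative: they can greatly reduce the number of proposals that need to be evaluated.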


  The paper examines the problem of joint active search for multiple objects under interaction.

  On the one hand, it is interesting to consider such a collaborative detection "game" played by multiple agents under an RL setting;

  On the other hand, it seems especially beneficial in the context of visual object localization, where different objects often appear with certain correlation patterns, such as a person riding a bicycle, a cup on a table, and so on.

  When such objects interact, they provide additional contextual cues, and these cues have good potential to enable more effective search strategies.

  

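  The paper proposes a collaborative multi-agent deep RL algorithm that learns the optimal policy for joint object localization. The proposal follows the existing RL framework but allows multiple agents to collaborate. Two open problems arise in this setting: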

  1. how to make communication between different agents effective;

  2. how to jointly learn good policies for all agents.

  

  The paper proposes to learn inter-agent communication through gated cross connections between the Q-networks.

  

  Contributions:

  1. it is the first collaborative deep RL algorithm in the object detection field;

  2. propose a novel multi-agent Q-learning solution that facilitates learnable inter-agent communication with gated cross connections between the Q-networks;

  3. the method effectively exploits useful contextual information between related objects and further improves detection performance.

  

  3. Collaborative RL for Joint Object Search

    3.1. Single Agent RL Object Localization

      The authors first review the general approach to single-agent object detection; the details are not repeated here.
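For orientation only, here is a minimal sketch of the kind of reward signal commonly used in single-agent active localization. It assumes the agent moves a bounding box step by step and receives +1 when a move increases the IoU with the target box and -1 otherwise. The function names `iou` and `step_reward` and the `(x1, y1, x2, y2)` box layout are illustrative assumptions, not taken from the paper.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Clamp at zero so non-overlapping boxes give an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def step_reward(old_box, new_box, target_box):
    """Assumed reward: +1 if the move increased IoU with the target, else -1."""
    return 1.0 if iou(new_box, target_box) > iou(old_box, target_box) else -1.0
```

With this signal, the agent is rewarded for the direction of change in overlap rather than its magnitude, which keeps the reward scale fixed across images.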

    3.2. Collaborative RL for Joint Object Localization

      The paper extends the single-agent method to the multi-agent setting. The key concepts are:

      --- gated cross connections between different Q-networks;

      --- joint exploitation sampling for generating corresponding training data;

      --- a virtual agent implementation that facilitates easy adaptation to existing deep Q-learning algorithms.

      

      3.2.1 Q-Networks with Gated Cross Connections

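To make the idea of gated cross connections more concrete, the following is a rough NumPy sketch of two small Q-networks that pass hidden features to each other through learnable sigmoid gates. It is an illustration under assumptions, not the architecture from the paper: the class name `GatedPairQ`, the single hidden layer, the layer sizes, and the exact form of the gate are all hypothetical. Only the forward pass is shown, with no training.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class GatedPairQ:
    """Two Q-networks that exchange hidden features through gates (hypothetical layout)."""

    def __init__(self, state_dim, hidden_dim, num_actions, seed=0):
        rng = np.random.default_rng(seed)

        def init(*shape):
            return rng.normal(0.0, 0.1, size=shape)

        # One encoder and one output head per agent (index 0 and 1).
        self.w_in = [init(state_dim, hidden_dim) for _ in range(2)]
        self.w_out = [init(2 * hidden_dim, num_actions) for _ in range(2)]
        # Gate parameters: decide how much of the peer's features to admit.
        self.w_gate = [init(2 * hidden_dim, hidden_dim) for _ in range(2)]
        self.b_gate = [np.zeros(hidden_dim) for _ in range(2)]

    def forward(self, state_0, state_1):
        # Each agent encodes its own state independently.
        h = [np.tanh(s @ w) for s, w in zip((state_0, state_1), self.w_in)]
        q_values = []
        for i in range(2):
            peer = h[1 - i]
            # The gate depends on both agents' features and lies in (0, 1).
            gate = sigmoid(np.concatenate([h[i], peer]) @ self.w_gate[i] + self.b_gate[i])
            message = gate * peer  # element-wise gating of the cross connection
            q_values.append(np.concatenate([h[i], message]) @ self.w_out[i])
        return q_values


# Usage: two agents, 8-dimensional states, 9 actions each (sizes are arbitrary).
net = GatedPairQ(state_dim=8, hidden_dim=16, num_actions=9)
q0, q1 = net.forward(np.ones(8), np.zeros(8))
```

In this sketch, when a gate saturates near zero the corresponding agent's Q-values stop depending on its peer, so each agent can fall back to behaving like an independent single-agent network.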

      3.2.2 Joint Exploitation Sampling
