[Learning Deep Features for Scene Recognition using Places Database] Running the scene classification demo

阿新 • Published: 2019-02-03

Install the following packages first:

numpy
pytorch
opencv-python
Pillow
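
Before running the demos it can help to confirm that these packages are importable. The following is a minimal sketch, not part of the places365 repository; the helper name missing_packages and the REQUIRED mapping are made up for illustration. It relies on the fact that the import names differ from the package names (opencv-python is imported as cv2, Pillow as PIL, pytorch as torch).

```python
import importlib.util

# Package name (as listed above) -> module name used in "import ...".
REQUIRED = {
    "numpy": "numpy",
    "pytorch": "torch",
    "opencv-python": "cv2",
    "Pillow": "PIL",
}


def missing_packages(required=REQUIRED):
    """Return the package names whose import module cannot be found."""
    return [
        package
        for package, module in required.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    print("missing:", ", ".join(missing) if missing else "none")
```

Any package reported as missing needs to be installed before the demo scripts will run.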

First, clone the places365 code to your machine:

git clone https://github.com/CSAILVision/places365.git

The repository is about 1.37 MB.

cd places365

Run the demo script:

python run_placesCNN_basic.py

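The script run_placesCNN_basic.py uses the resnet18 architecture (arch). On first run it downloads over HTTP the pretrained weights resnet18_places365.pth.tar and the label file categories_places365.txt, then fetches the test image 12.jpg and prints the prediction: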

2018-05-18 20:01:26 (82.8 KB/s) - “12.jpg” saved [63736/63736]

resnet18 prediction on 12.jpg
0.621 -> patio
0.296 -> restaurant_patio
0.021 -> porch
0.018 -> beer_garden
0.012 -> courtyard
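
The five scores above look like the largest softmax probabilities over the scene categories, listed best first. A minimal sketch of that ranking step in plain Python follows. It assumes the model produces one raw score (logit) per category; the function names softmax and top_k and the toy inputs are illustrative, not taken from the demo script.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(logits, classes, k=5):
    """Return the k most probable (probability, label) pairs, best first."""
    probs = softmax(logits)
    ranked = sorted(zip(probs, classes), key=lambda pair: pair[0], reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    # Toy logits and labels, not real model output.
    labels = ["patio", "restaurant_patio", "porch", "beer_garden"]
    for prob, label in top_k([2.0, 1.0, 0.1, -1.0], labels, k=3):
        print("{:.3f} -> {}".format(prob, label))
```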

The image is recognized as a patio.
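
The automatic download of the weights, the label file and the test image can be approximated with a "fetch only if missing" helper. This is a sketch under assumptions, not the script's actual code: fetch_if_missing is a hypothetical name, and the URL is left as a parameter because the exact download addresses are not given in this post.

```python
import os
import urllib.request


def fetch_if_missing(url, path, download=urllib.request.urlretrieve):
    """Download url to path unless the file already exists.

    Returns True if a download happened, False if the file was reused.
    The download argument exists only so the helper can be tested offline.
    """
    if os.path.exists(path):
        return False
    download(url, path)
    return True
```

With a helper like this, a second run of the demo reuses the files already on disk instead of downloading them again.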
For a richer demo than the basic one, you can also run the following script:

python run_placesCNN_unified.py

This demo needs categories_places365.txt, IO_places365.txt, labels_sunattribute.txt, W_sceneattribute_wideresnet18.npy, wideresnet18_places365.pth.tar, and the test image test.jpg. The script downloads all of them automatically and prints the result:

RESULT ON http://places.csail.mit.edu/demo/6.jpg
--TYPE OF ENVIRONMENT: indoor
--SCENE CATEGORIES:
0.511 -> food_court
0.085 -> fastfood_restaurant
0.083 -> cafeteria
0.040 -> dining_hall
0.021 -> flea_market/indoor
--SCENE ATTRIBUTES:
no horizon, enclosed area, man-made, socializing, indoor lighting, cloth, congregating, eating, working
Class activation map is saved as cam.jpg

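As the output shows, the ResNet classifies the image as food_court. Below that it lists the related scene attributes (no horizon, enclosed area, man-made, socializing, and so on), and it generates a class activation map, saved as cam.jpg.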

Quite impressive!
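
For readers curious about the class activation map: in the CAM technique, the map for a class is a weighted sum of the last convolutional feature maps, using that class's weights from the final classification layer. The sketch below shows only that core computation on nested lists, followed by min-max scaling. The function name and the input layout are assumptions for illustration; how the demo script resizes the map and overlays it on the image is not covered here.

```python
def class_activation_map(feature_maps, class_weights):
    """Weighted sum of feature maps, min-max scaled to [0, 1].

    feature_maps: list of C maps, each an H x W nested list of floats
    class_weights: list of C weights for the class of interest
    """
    height = len(feature_maps[0])
    width = len(feature_maps[0][0])
    cam = [
        [
            sum(w * fmap[i][j] for w, fmap in zip(class_weights, feature_maps))
            for j in range(width)
        ]
        for i in range(height)
    ]
    lowest = min(min(row) for row in cam)
    highest = max(max(row) for row in cam)
    span = highest - lowest
    if span == 0:  # flat map: nothing to highlight
        return [[0.0] * width for _ in range(height)]
    return [[(value - lowest) / span for value in row] for row in cam]
```

Cells close to 1 mark the regions that contributed most to the chosen class; a heatmap such as cam.jpg is a visualization of values like these.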
