[Deep Learning Paper Notes] Convolutional Neural Networks and Their Applications

In artificial intelligence, Moravec’s paradox (see footnote 1) observes that “high-level reasoning requires very little computation, but low-level sensorimotor skills require enormous computational resources.” It is comparatively easy to make computers exhibit adult-level performance on intelligence tests or at playing checkers, but difficult or impossible to give them the skills of a one-year-old when it comes to perception and mobility.

Computer vision is one such low-level sensorimotor skill. Recognizing an object is trivial for humans, but it is quite hard for computers because of the semantic gap: a computer only sees a collection of integers from 0 to 255. It is hard to write an explicit algorithm that lets a computer identify an object from a 3D array of numbers. Therefore, inspired by the human learning process, we provide the computer with many examples of each class and let it learn from the data. This is called the data-driven approach.
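To make the "3D array of integers" and the data-driven idea concrete, here is a minimal sketch in plain Python. The 2x2 toy images, the labels, and the helper names (`flatten`, `l1_distance`, `nearest_neighbor_predict`) are all made up for illustration. The nearest-neighbor rule is used only because it is about the simplest way to "learn from examples"; it is not a CNN and not a method prescribed by this note.

```python
# An image is a height x width x channels array of integers in 0..255.
# These 2x2 RGB toy images are hypothetical; real images are far larger.
dark = [[[10, 10, 10], [20, 20, 20]],
        [[15, 15, 15], [5, 5, 5]]]
bright = [[[240, 240, 240], [250, 250, 250]],
          [[245, 245, 245], [235, 235, 235]]]


def flatten(image):
    """Turn an H x W x C nested list into a flat list of integers."""
    return [value for row in image for pixel in row for value in pixel]


def l1_distance(image_a, image_b):
    """Sum of absolute per-value differences between two same-sized images."""
    return sum(abs(a - b) for a, b in zip(flatten(image_a), flatten(image_b)))


def nearest_neighbor_predict(train_examples, query):
    """Return the label of the training image closest to `query` (L1 distance)."""
    _, best_label = min(train_examples, key=lambda ex: l1_distance(ex[0], query))
    return best_label


# "Training" here just means storing labeled examples of each class.
train_examples = [(dark, "dark"), (bright, "bright")]

# An unseen image, again just a grid of numbers from the computer's view.
query = [[[200, 210, 220], [230, 225, 215]],
         [[240, 200, 210], [220, 230, 225]]]

print(nearest_neighbor_predict(train_examples, query))  # prints "bright"
```

No rule about what "bright" means is written by hand; the prediction comes entirely from the labeled examples, which is the essence of the data-driven approach.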

Convolutional neural networks (CNNs) are the state-of-the-art approach to object recognition, and they have greatly advanced performance on many computer vision tasks. To gain a deep understanding of CNNs and to inspire ideas for cutting-edge research, I think the most fundamental and effective way is to study recent CNN publications from top-tier vision conferences and journals. I therefore decided to write a note recording the basic ideas of those publications and my understanding of them. At present, this note covers around 60 papers from ICCV, ECCV, CVPR, NIPS, ICML, ICLR, and other venues. The content spans the basic topics in computer vision, including image classification, object localization, object detection, object segmentation, image and language, video classification, GANs, etc.

I would like to acknowledge the following sources for providing excellent materials on CNNs and deep learning.

• Andrew Ng et al. “UFLDL: Deep Learning Tutorial.” Stanford.
• Fei-Fei Li, Andrej Karpathy, and Justin Johnson. “cs231n: Convolutional Neural Networks for Visual Recognition.” Stanford.
• Andrea Vedaldi, Andrew Zisserman. “VGG Convolutional Neural Networks Practical.” Oxford Visual Geometry Group.
• Ian Goodfellow, Aaron Courville, and Yoshua Bengio. “Deep Learning.” Book in preparation for MIT Press. 2015.
• Jianxin Wu. “Introduction to Convolutional Neural Networks.” Nanjing University.

This note is still being updated. If you have any questions or suggestions, please feel free to contact me by email.

The PDF file can be downloaded here.

1 https://en.wikipedia.org/wiki/Moravec’s_paradox.