
A Collection of Papers from Andrew Ng's Deep Learning Courses

A collection of the papers mentioned in Andrew Ng's deep learning courses. These papers cover the fundamentals of deep learning; reading them will give you a deeper understanding of the field.

Most of these papers can be downloaded for free; if you cannot find a free copy, please leave a comment. The video lectures themselves are available on Coursera.

The papers below fall into three main parts:
1. Methods for optimizing neural networks
2. Convolutional neural networks, including various models and object detection papers
3. RNN-style (sequence) neural networks

A collection of papers mentioned in Andrew Ng's deep learning courses

1. Neural Networks and Deep Learning

None

2. Improving Deep Neural Networks: Hyperparameter Tuning, Regularization and Optimization

  • Dropout; regularization;

Srivastava, Nitish, et al. “Dropout: A simple way to prevent neural networks from overfitting.” The Journal of Machine Learning Research 15.1 (2014): 1929-1958.
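As a quick illustration of how dropout regularizes a layer, here is a minimal NumPy sketch of inverted dropout in the style taught in the course; the names `a` and `keep_prob` are assumed placeholders for a layer's activations and the keep probability.

```python
import numpy as np

def inverted_dropout(a, keep_prob=0.8):
    """Zero out units at random, then rescale so the expected activation is unchanged."""
    mask = np.random.rand(*a.shape) < keep_prob   # keep each unit with probability keep_prob
    a = a * mask                                  # drop the remaining units
    return a / keep_prob                          # "inverted" scaling preserves the expectation

# applied only at training time, e.g. a3 = inverted_dropout(a3, keep_prob=0.8)
```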

  • RMSprop; a gradient-descent optimization method. It is an unpublished, adaptive learning-rate method proposed by Geoff Hinton in Lecture 6e of his Coursera class. RMSprop and Adadelta were both developed independently around the same time, stemming from the need to resolve Adagrad's radically diminishing learning rates.

Tieleman, Tijmen, and Geoffrey Hinton. “Lecture 6.5-rmsprop: Divide the gradient by a running average of its recent magnitude.” COURSERA: Neural networks for machine learning 4.2 (2012): 26-31.
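A minimal sketch of one RMSprop parameter update matching the description above ("divide the gradient by a running average of its recent magnitude"); `w`, `dw` and the running average `s` are assumed NumPy arrays, not the lecture's notation.

```python
import numpy as np

def rmsprop_update(w, dw, s, lr=0.001, beta=0.9, eps=1e-8):
    """One RMSprop step on parameters w with gradient dw and running average s."""
    s = beta * s + (1 - beta) * dw ** 2       # exponentially weighted average of squared gradients
    w = w - lr * dw / (np.sqrt(s) + eps)      # scale the step by the root of that average
    return w, s
```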

  • Adam optimization algorithm; an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments (see the sketch after the batch normalization reference below).

Kingma and Ba, 2014. Adam: A method for stochastic optimization.

  • Batch normalization

Ioffe, Sergey, and Christian Szegedy. “Batch normalization: Accelerating deep network training by reducing internal covariate shift.” International conference on machine learning. 2015.
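To make the "adaptive estimates of lower-order moments" in the Adam bullet above concrete, here is a minimal sketch of one Adam step under the same NumPy conventions as the RMSprop sketch; the variable names are assumptions, not the paper's notation.

```python
import numpy as np

def adam_update(w, dw, v, s, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam step (t >= 1): first moment v, second moment s, with bias correction."""
    v = beta1 * v + (1 - beta1) * dw           # first moment: moving average of gradients
    s = beta2 * s + (1 - beta2) * dw ** 2      # second moment: moving average of squared gradients
    v_hat = v / (1 - beta1 ** t)               # bias correction for the first few iterations
    s_hat = s / (1 - beta2 ** t)
    w = w - lr * v_hat / (np.sqrt(s_hat) + eps)
    return w, v, s
```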

3. Structuring Machine Learning Projects

None

4. Convolutional Neural Networks

  • LeNet-5; a classic convolutional neural network architecture;

LeCun et al., 1998. Gradient-based learning applied to document recognition

  • AlexNet; a convolutional neural network architecture;

Krizhevsky et al., 2012. ImageNet classification with deep convolutional neural networks

  • VGG-16;

Simonyan & Zisserman 2015. Very deep convolutional networks for large-scale image recognition

  • ResNet (Residual Network);

He et al., 2015. Deep residual learning for image recognition
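The central idea is the skip (shortcut) connection: a block's input is added to its output before the final non-linearity, which makes it easy for a layer to learn the identity mapping. A minimal fully-connected sketch, assuming NumPy weight matrices shaped so the shortcut dimensions match:

```python
import numpy as np

def relu(z):
    return np.maximum(0, z)

def residual_block(a_prev, W1, b1, W2, b2):
    """Two linear layers plus a skip connection: output = relu(z2 + a_prev)."""
    a1 = relu(W1 @ a_prev + b1)
    z2 = W2 @ a1 + b2
    return relu(z2 + a_prev)   # identity shortcut around the block
```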

  • Network in Network (1x1 convolution); the filter size is 1x1, but the number of filters can be greater than one;

Lin et al., 2013, Network in network.
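A 1x1 convolution applies the same learned linear map to the channel vector at every spatial position, so with many filters it can shrink (or grow) the channel dimension. A minimal NumPy sketch under assumed shapes:

```python
import numpy as np

def conv_1x1(x, W):
    """x: (H, W, C_in) feature map; W: (C_out, C_in) bank of 1x1 filters; returns (H, W, C_out)."""
    h, w, c_in = x.shape
    out = x.reshape(-1, c_in) @ W.T              # same channel mixing at every pixel
    return np.maximum(0, out).reshape(h, w, -1)  # ReLU non-linearity

# e.g. squeeze a 28x28x192 volume down to 28x28x32 with 32 filters of shape 1x1x192
x = np.random.randn(28, 28, 192)
print(conv_1x1(x, np.random.randn(32, 192)).shape)   # (28, 28, 32)
```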

  • Inception network; the motivation for the Inception architecture;

Szegedy et al. 2014. Going deeper with convolutions

  • Object recognition, localization and detection;

Sermanet et al., 2014, OverFeat: Integrated recognition, localization and detection using convolutional networks

  • YOLO (you only look once); real-time object detection;

Redmon et al., 2015. You Only Look Once: Unified, real-time object detection.

  • R-CNN; propose regions, then classify the proposed regions one at a time; output a label plus a bounding box;

Girshick et al., 2013. Rich feature hierarchies for accurate object detection and semantic segmentation.

  • Fast R-CNN; propose regions, then use a convolutional implementation of sliding windows to classify all the proposed regions;

Girshick, 2015. Fast R-CNN.

  • Faster R-CNN; use a convolutional network to propose the regions as well;

Ren et al., 2016. Faster R-CNN: Towards real-time object detection with region proposal networks.

  • Siamese network; face recognition;

Taigman et al., 2014. DeepFace: Closing the gap to human-level performance in face verification.

  • FaceNet; learns a face embedding trained with the triplet loss;

Schroff et al., 2015. FaceNet: A unified embedding for face recognition and clustering.
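FaceNet trains the embedding with a triplet loss: an anchor image should be closer to a positive (same person) than to a negative (different person) by at least a margin. A minimal sketch over precomputed embedding vectors, with the margin value chosen only for illustration:

```python
import numpy as np

def triplet_loss(f_anchor, f_positive, f_negative, alpha=0.2):
    """Squared-distance triplet loss with margin alpha."""
    d_pos = np.sum((f_anchor - f_positive) ** 2)   # distance to an image of the same person
    d_neg = np.sum((f_anchor - f_negative) ** 2)   # distance to an image of a different person
    return max(d_pos - d_neg + alpha, 0.0)         # zero once the margin is satisfied
```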

5. Sequence Models

  • gated recurrent unit;

Cho et al., 2014. On the properties of neural machine translation: Encoder-decoder approaches

  • gated recurrent unit;

Chung et al., 2014. Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling.
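A minimal NumPy sketch of one GRU step as presented in the course, with an update gate Γu and a relevance (reset) gate Γr; the weight and bias names are placeholders:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gru_step(c_prev, x_t, Wu, bu, Wr, br, Wc, bc):
    """The update gate decides how much of the candidate memory replaces the old memory."""
    concat = np.concatenate([c_prev, x_t])
    gamma_u = sigmoid(Wu @ concat + bu)    # update gate
    gamma_r = sigmoid(Wr @ concat + br)    # relevance (reset) gate
    c_tilde = np.tanh(Wc @ np.concatenate([gamma_r * c_prev, x_t]) + bc)  # candidate memory cell
    return gamma_u * c_tilde + (1 - gamma_u) * c_prev   # blend of new candidate and old memory
```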

  • LSTM (long short-term memory);

Hochreiter & Schmidhuber 1997. Long short-term memory

  • Visualizing word embeddings

van der Maaten and Hinton, 2008. Visualizing data using t-SNE

  • About word embeddings;

Mikolov et al., 2013. Linguistic regularities in continuous space word representations

  • Neural language model; predicts the next word.

Bengio et al., 2003. A neural probabilistic language model

  • Skip-gram model; how to learn word embeddings (word vectors) with a neural network.

Mikolov et al., 2013. Efficient estimation of word representations in vector space

  • Negative sampling; similar to the skip-gram model, but much more efficient.

Mikolov et al., 2013. Distributed representations of words and phrases and their compositionality.
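Negative sampling replaces the expensive softmax over the whole vocabulary with one positive and k sampled binary classification problems per training pair. A minimal sketch of the per-pair loss, where `e_c` is the context-word embedding and `theta_pos` / `theta_negs` are the output vectors of the target word and of k sampled words (all names assumed):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def negative_sampling_loss(e_c, theta_pos, theta_negs):
    """Binary log-loss: push the observed pair toward 1 and the k sampled pairs toward 0."""
    loss = -np.log(sigmoid(theta_pos @ e_c))      # the true (context, target) pair
    for theta_k in theta_negs:                    # k randomly sampled "negative" words
        loss -= np.log(sigmoid(-theta_k @ e_c))
    return loss
```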

  • GloVe (global vectors for word representation); has some momentum in the NLP community, though it is not used as much as the Word2Vec / skip-gram models.

Pennington et al., 2014. GloVe: Global vectors for word representation.

  • About the problem of bias in word embeddings.

Bolukbasi et al., 2016. Man is to computer programmer as woman is to homemaker? Debiasing word embeddings

  • CTC (Connectionist temporal classification) cost for speech recognition

Graves et al., 2006. Connectionist Temporal Classification: Labeling unsegmented sequence data with recurrent neural networks
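The decoding convention the course uses to explain the CTC cost: collapse repeated characters that are not separated by the "blank" symbol, then remove the blanks. A tiny illustrative sketch (the blank is written as "_"):

```python
def ctc_collapse(raw, blank="_"):
    """Collapse repeats not separated by blank, then drop the blanks."""
    out, prev = [], None
    for ch in raw:
        if ch != prev:                 # keep a character only when it changes
            out.append(ch)
        prev = ch
    return "".join(c for c in out if c != blank)

print(ctc_collapse("ttt_h_eee___ ___qqq__"))   # -> "the q"
```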

  • Language translation; sequence-to-sequence model

Sutskever et al., 2014. Sequence to sequence learning with neural networks

  • Language translation; sequence-to-sequence model

Cho et al., 2014. Learning phrase representations using RNN encoder-decoder for statistical machine translation

  • Image captioning

Mao et al., 2014. Deep captioning with multimodal recurrent neural networks

Vinyals et al., 2014. Show and tell: A neural image caption generator

Karpathy and Fei-Fei, 2015. Deep visual-semantic alignments for generating image descriptions

  • Evaluating machine translation

Papineni et al., 2002. BLEU: A method for automatic evaluation of machine translation

  • Attention model

Bahdanau et al., 2014. Neural machine translation by jointly learning to align and translate

Xu et al., 2015. Show, attend and tell: Neural image caption generation with visual attention

In the attention model, the attention weight α<t,t'> determines how much attention the output at step t pays to the encoder activation a<t'> (t' = 1, ..., Tx): the larger α<t,t'> is, the more a<t'> is attended to.
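A minimal NumPy sketch of how the attention weights and the context vector are formed at one output step t; `e` holds the alignment scores e<t,t'> produced by the small alignment network and `a` holds the encoder activations a<t'> (both names are assumptions):

```python
import numpy as np

def attention_context(e, a):
    """e: (Tx,) alignment scores for step t; a: (Tx, n_a) encoder activations."""
    alpha = np.exp(e - e.max())
    alpha = alpha / alpha.sum()       # softmax: the weights alpha<t,t'> sum to 1 over t'
    context = alpha @ a               # context c<t> = sum over t' of alpha<t,t'> * a<t'>
    return context, alpha
```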