Sutskever2014_Sequence to Sequence Learning with Neural Networks

阿新 • Published: 2018-11-09

INFO: Sutskever2014_Sequence to Sequence Learning with Neural Networks

ABSTRACT

The paper uses one LSTM to read the input sequence, one timestep at a time, to obtain a large fixed-dimensional vector representation, and then uses another LSTM to extract the output sequence from that vector. The second LSTM is essentially a recurrent neural network language model, except that it is conditioned on the input sequence.
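
The conditioning on the input sequence can be written as a factorization of the output probability. Here v denotes the fixed-dimensional vector produced by the first LSTM, and T and T' are the input and output lengths, which may differ. The notation follows the model section of the paper rather than these notes:

```latex
p(y_1, \dots, y_{T'} \mid x_1, \dots, x_T) = \prod_{t=1}^{T'} p(y_t \mid v, y_1, \dots, y_{t-1})
```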

It is not clear how to apply an RNN to problems whose input and output sequences have different lengths with complicated and non-monotonic relationships.
Reversing the order of the words in the input sentences (but not the target sentences) results in LSTMs with better memory utilization.
(Instead of mapping the sentence a, b, c to the sentence α, β, γ, the LSTM is asked to map c, b, a to α, β, γ, where α, β, γ is the translation of a, b, c.)
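
As a data-preparation step, the reversal could look like the minimal sketch below. The function name `reverse_source` and the token-list representation of a sentence pair are assumptions made for illustration, not something taken from the paper's code:

```python
def reverse_source(pairs):
    """Reverse the token order of each source sentence; targets stay unchanged."""
    return [(list(reversed(src)), tgt) for src, tgt in pairs]


# One (source, target) training pair, using the notation from the note above.
pairs = [(["a", "b", "c"], ["α", "β", "γ"])]
reversed_pairs = reverse_source(pairs)
print(reversed_pairs)  # prints [(['c', 'b', 'a'], ['α', 'β', 'γ'])]
```

After the reversal, a sits right next to α, so the first source word and the first target word are only a few timesteps apart.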

RELEVANT INFORMATION:

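Encoder-Decoder:
1. Encoder-Decoder is not a specific model but a class of frameworks. The encoder and decoder can handle arbitrary text, speech, image, or video data, and the models used can be CNN, RNN, BiRNN, LSTM, GRU, and so on. A wide variety of application algorithms can therefore be designed on top of the Encoder-Decoder framework.
2. The most notable characteristic of the Encoder-Decoder framework is that it is an end-to-end learning algorithm. Such models are often used in machine translation, for example translating French into English. Such models are also called sequence-to-sequence learning.
3. Limitation: the only link between encoding and decoding is a fixed-length semantic vector c. That is, the encoder has to compress the information of the entire sequence into one fixed-length vector. This has two drawbacks:
  1. The semantic vector c cannot fully represent the information of the whole sequence.
  2. Information carried by earlier inputs is diluted or overwritten by later inputs, and the longer the input sequence, the worse this becomes. The decoder therefore starts without enough information about the input sequence, and decoding accuracy suffers accordingly.
    Figure: Encoder-Decoder model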
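
The fixed-length bottleneck can be illustrated with a toy encoder. This is a sketch under stated assumptions: it uses a plain tanh RNN instead of an LSTM to stay short, and the sizes, the random weights, and the names `encode` and `random_sequence` are all hypothetical:

```python
import math
import random

HIDDEN_SIZE = 4  # hypothetical size of the fixed-length vector c
INPUT_SIZE = 3   # hypothetical size of each input embedding


def encode(inputs, w_in, w_rec):
    """Run a toy tanh RNN over `inputs` and return the final hidden state c."""
    hidden = [0.0] * len(w_rec)
    for x in inputs:
        # Each step mixes the new input into the same fixed-size state.
        hidden = [
            math.tanh(
                sum(w_in[i][j] * x[j] for j in range(len(x)))
                + sum(w_rec[i][k] * hidden[k] for k in range(len(hidden)))
            )
            for i in range(len(hidden))
        ]
    return hidden


rng = random.Random(0)
w_in = [[rng.uniform(-0.5, 0.5) for _ in range(INPUT_SIZE)] for _ in range(HIDDEN_SIZE)]
w_rec = [[rng.uniform(-0.5, 0.5) for _ in range(HIDDEN_SIZE)] for _ in range(HIDDEN_SIZE)]


def random_sequence(length):
    """Build a sequence of `length` random input vectors (stand-ins for embeddings)."""
    return [[rng.uniform(-1, 1) for _ in range(INPUT_SIZE)] for _ in range(length)]


c_short = encode(random_sequence(3), w_in, w_rec)
c_long = encode(random_sequence(30), w_in, w_rec)
print(len(c_short), len(c_long))  # prints "4 4"
```

Whether the input has 3 or 30 steps, the decoder receives the same four numbers, which is why longer sequences lose more information.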
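Attention Model
1. To address the limitations of the basic Encoder-Decoder model described above, the NLP community has proposed the Attention Model in the last two years. A typical example is machine translation: instead of letting each generated word attend only to the global semantic vector c, the model adds an "attention range" that indicates which parts of the input sequence to focus on when producing the next word, and the next output is generated from the attended region.
2. Compared with the earlier Encoder-Decoder model, the biggest difference is that the Attention model no longer requires the encoder to encode all input information into a single fixed-length vector. Instead, the encoder encodes the input into a sequence of vectors, and at each decoding step the decoder selectively picks a subset of those vectors for further processing.
  Figure: Attention model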
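
One decoding step of such a mechanism might be sketched as follows. The dot-product scoring used here is a simplifying assumption (attention models differ in their scoring functions), and the names `softmax` and `attend` and the example vectors are hypothetical:

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    largest = max(scores)
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(decoder_state, encoder_states):
    """Score every encoder vector against the decoder state and mix them."""
    scores = [
        sum(d * h for d, h in zip(decoder_state, state))
        for state in encoder_states
    ]
    weights = softmax(scores)
    dim = len(encoder_states[0])
    context = [
        sum(w * state[i] for w, state in zip(weights, encoder_states))
        for i in range(dim)
    ]
    return weights, context


# The encoder output is a sequence of vectors, one per input position.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_state = [0.0, 2.0]
weights, context = attend(decoder_state, encoder_states)
print(weights, context)
```

The weights play the role of the "attention range": positions whose vectors match the current decoder state receive more weight, and a new context vector is computed at every decoding step instead of reusing a single c.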
