TensorFlow for Natural Language Processing: Recurrent Neural Networks (RNN) and the LSTM Model
阿新 • Published: 2019-01-05
(To be continued.)
Preparation
We will train an RNN for a language-modeling task: given a sequence of words, predict the next word. We use the standard benchmark for such models, the PTB (Penn Treebank) dataset. It is small and relatively fast to train on.
The PTB dataset has already been preprocessed and contains 10,000 distinct words in total, including an end-of-sentence marker and a special symbol (<unk>) for rare words.
To make the data easier to work with, reader.py converts each word into a unique integer identifier.
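The conversion itself is simple. Here is a minimal sketch of the idea, assuming a hypothetical helper build_vocab (an illustration only, not the actual code in reader.py, although the tutorial's reader builds its vocabulary in a similar way):

import collections

def build_vocab(words):
    # Give frequent words small ids, breaking ties alphabetically
    # (hypothetical helper, for illustration only).
    counter = collections.Counter(words)
    sorted_words = sorted(counter, key=lambda w: (-counter[w], w))
    return {word: i for i, word in enumerate(sorted_words)}

words = ["the", "fox", "jumped", "over", "the", "fence", "<eos>"]
word_to_id = build_vocab(words)
ids = [word_to_id[w] for w in words]   # [0, 3, 4, 5, 0, 2, 1]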
This tutorial references the following files:

| File | Purpose |
|---|---|
| ptb_word_lm.py | Trains the language model on the PTB dataset |
| reader.py | Reads the dataset |
Click here to download the data.
Building the Model
1. LSTM
The core of the model is an LSTM cell that processes one word at a time and computes probabilities for the possible values of the next word in the sentence. The memory state of the LSTM is initialized with a vector of zeros and is updated after each word is read. For computational efficiency, we process the data in mini-batches of size batch_size. Every word in a batch corresponds to a time step t, and TensorFlow automatically sums the gradients across each batch.
For example:
t=0 t=1 t=2 t=3 t=4
[The, brown, fox, is, quick]
[The, red, fox, jumped, high]
words_in_dataset[0] = [The, The]
words_in_dataset[1] = [brown, red]
words_in_dataset[2] = [fox, fox]
words_in_dataset[3] = [is, jumped]
words_in_dataset[4] = [quick, high]
batch_size = 2, time_steps = 5
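To make this layout concrete, here is a small sketch (with made-up integer ids standing in for the words) showing how the two batch-major sentences become the time-major words_in_dataset above:

import numpy as np

# Two sentences of five word ids each: shape [batch_size, time_steps].
batch_major = np.array([[0, 1, 2, 3, 4],    # The brown fox is     quick
                        [0, 5, 2, 6, 7]])   # The red   fox jumped high
# Transpose to time-major: words_in_dataset[t] holds every sentence's
# word at time step t.
words_in_dataset = batch_major.T            # shape [time_steps, batch_size]
print(words_in_dataset[1])                  # [1 5]  ->  [brown, red]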
The basic pseudocode looks like this:
import tensorflow as tf  # TensorFlow 1.x

words_in_dataset = tf.placeholder(tf.float32, [time_steps, batch_size, num_features])
lstm = tf.contrib.rnn.BasicLSTMCell(lstm_size)
# Initial state of the LSTM memory: zero vectors for the cell and hidden state.
# (In real code this is usually written as lstm.zero_state(batch_size, tf.float32).)
hidden_state = tf.zeros([batch_size, lstm.state_size])
current_state = tf.zeros([batch_size, lstm.state_size])
state = hidden_state, current_state
probabilities = []
loss = 0.0
# Unstack along the time axis so the loop sees one batch of words per time step.
for current_batch_of_words in tf.unstack(words_in_dataset):
    # The value of state is updated after processing each batch of words.
    output, state = lstm(current_batch_of_words, state)

    # The LSTM output can be used to make next-word predictions.
    logits = tf.matmul(output, softmax_w) + softmax_b
    probabilities.append(tf.nn.softmax(logits))
    loss += loss_function(probabilities, target_words)
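The pseudocode leaves loss_function abstract. In a word-level language model it is typically the cross-entropy between the predicted distribution and the actual next word. A minimal per-step sketch, assuming target_words holds the integer ids of the true next words (shape [batch_size]), could be:

# Cross-entropy against the true next words, computed from the raw logits.
# This is one common concrete choice; the pseudocode leaves it open.
step_loss = tf.reduce_mean(
    tf.nn.sparse_softmax_cross_entropy_with_logits(
        labels=target_words, logits=logits))
loss += step_loss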
2. Truncated Backpropagation
By design, the output of a recurrent neural network depends on arbitrarily distant inputs, which makes backpropagation hard to compute. To make the learning process tractable, it is common practice to create an "unrolled" version of the network that contains a fixed number (num_steps) of LSTM inputs and outputs. The model is then trained on this finite approximation: we feed in inputs of length num_steps at a time and perform a backward pass after each such input block.
Here is a simplified version of the code for creating a graph that performs truncated backpropagation:
# Placeholder for the inputs in a given iteration.
words = tf.placeholder(tf.int32, [batch_size, num_steps])

lstm = tf.contrib.rnn.BasicLSTMCell(lstm_size)
# Initial state of the LSTM memory.
initial_state = state = tf.zeros([batch_size, lstm.state_size])

for i in range(num_steps):
    # The value of state is updated after processing each batch of words.
    # (In the full model, the integer ids in words[:, i] would first be
    # mapped to embedding vectors before being fed to the cell.)
    output, state = lstm(words[:, i], state)

    # The rest of the code.
    # ...

final_state = state
And here is how to implement an iteration over the whole dataset:
# A numpy array holding the state of the LSTM after each batch of words.
numpy_state = initial_state.eval()
total_loss = 0.0
for current_batch_of_words in words_in_dataset:
    numpy_state, current_loss = session.run(
        [final_state, loss],
        # Initialize the LSTM state from the previous iteration.
        feed_dict={initial_state: numpy_state,
                   words: current_batch_of_words})
    total_loss += current_loss
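Language models on PTB are usually evaluated by perplexity, the exponential of the average per-word cross-entropy. Assuming the fetched loss is the cross-entropy summed over the num_steps words of each batch (an assumption; the pseudocode above leaves loss_function open), the loop can be extended to report perplexity:

import numpy as np

total_loss, iters = 0.0, 0
for current_batch_of_words in words_in_dataset:
    numpy_state, current_loss = session.run(
        [final_state, loss],
        feed_dict={initial_state: numpy_state,
                   words: current_batch_of_words})
    total_loss += current_loss
    iters += num_steps   # target words processed so far per example
# Exponential of the average per-word loss; lower is better.
perplexity = np.exp(total_loss / iters)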