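Building a Chatbot with TensorFlow

阿新 • Published: 2019-01-13

Last time I mentioned a good resource for learning about chatbots; I wonder whether any of you have gone through it.
Tutorial: "Build Your Own Chatbot"
I have been studying a little of it every day lately, and here I share my notes on it.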

Outline:

1. Architecture of a chatbot
2. Implementing the chatbot model with TensorFlow
3. Preparing the chatbot training data
4. Walking through the chatbot source code

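1. Architecture of a chatbot

A chatbot's workflow is roughly: question analysis, retrieval, answer extraction.

Question analysis: parse the user's question to identify its keywords, the question type, and what the user really wants to know.

Retrieval: search for answers based on the analysis from the previous step.

Answer extraction: what retrieval finds cannot be used directly; it still has to be condensed into a genuinely useful reply.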
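To make the three stages concrete, here is a deliberately tiny sketch. Everything in it is an assumption for illustration only: the function names (analyze, retrieve, extract_answer), the stop-word list, and the keyword-overlap scoring are all made up, and a real system would use the much heavier techniques listed in this section.

```python
def analyze(question):
    """Question analysis: keep lower-cased keywords, drop stop words."""
    stop_words = {'what', 'is', 'the', 'a', 'of', 'in'}  # toy list (assumption)
    tokens = question.lower().replace('?', ' ').split()
    return [t for t in tokens if t not in stop_words]


def retrieve(keywords, documents):
    """Retrieval: pick the document that shares the most keywords."""
    def overlap(doc):
        return len(set(keywords) & set(doc.lower().replace('.', ' ').split()))
    return max(documents, key=overlap)


def extract_answer(keywords, document):
    """Answer extraction: return the sentence with the most keyword hits."""
    sentences = [s.strip() for s in document.split('.') if s.strip()]

    def hits(sentence):
        return len(set(keywords) & set(sentence.lower().split()))
    return max(sentences, key=hits)


documents = [
    'TensorFlow is a machine learning library. It was released by Google.',
    'Lua is a scripting language. Torch uses Lua.',
]
keywords = analyze('What is TensorFlow?')
answer = extract_answer(keywords, retrieve(keywords, documents))
print(answer)  # TensorFlow is a machine learning library
```

The point is only the shape of the pipeline: each stage consumes the output of the previous one, so any stage can be swapped for something smarter without touching the others.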

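The key technologies involved are shown in the figure.

In case the figure is hard to read, here they are as text:

Question analysis:
Chinese word segmentation, part-of-speech tagging, entity tagging, concept category tagging, syntactic analysis, semantic analysis, logical structure tagging, coreference resolution, association relation tagging, question classification, answer type determination

Massive text knowledge representation:
Web text resource acquisition, machine learning methods, large-scale semantic computation and inference, knowledge representation systems, knowledge base construction

Answer generation and filtering:
Candidate answer extraction, relation inference, match scoring, noise filtering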

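2. Implementing the chatbot model with TensorFlow

Earlier I wrote a post based on Siraj's video, "Let's Write a Chatbot Ourselves".
That post only covered the simple flow of the main function (Data, Model, Training) and was implemented in Lua; the detailed code is available on his GitHub.

The article below is implemented with TensorFlow plus the tflearn library, and offers more detail on model building, training, and prediction:

What the two have in common is that both are built on Seq2Seq.

The structure of the LSTM model is:

For the details, go straight to the original article mentioned above; here I post a brief flow chart and description of the model-building stage:

First, the raw data (3 million chat entries) is preprocessed: it is word-segmented and split into question-answer pairs.
Then word2vec is used to train word vectors, producing a binary word-vector file.

These are passed as input data X into the following flow:

The question goes into the LSTM encoder and the answer goes into the decoder,
each producing an output tensor.
The decoder generates its result one word at a time, and every result is appended to a list.
Finally, this list, together with the encoder output, becomes the input to the next stage, Regression, and is passed into the DNN network.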

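3. Preparing the chatbot training data

The training data is generated as follows:

First, each line of the input file is read and split on '|' into a question sentence and an answer sentence.
In each sentence, every word is converted to a word vector with word2vec.
The vector sequence of every sentence is brought to the same size: self.word_vec_dim * self.max_seq_len
Finally, the answer forms the y data and question + answer forms the xy data, which are then fed to the model for training:

model.fit(trainXY, trainY, n_epoch=1000, snapshot_epoch=False, batch_size=1)

The code is as follows: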

import io

def init_seq(input_file):
    """Read the pre-segmented text file and load all word-vector sequences."""
    # word_vector_dict, question_seqs and answer_seqs are module-level globals.
    # io.open decodes UTF-8 on both Python 2 and Python 3.
    with io.open(input_file, 'r', encoding='utf-8') as file_object:
        for line in file_object:
            # Strip the line ending so the last word can still be looked up.
            line_pair = line.rstrip('\r\n').split('|')
            if len(line_pair) < 2:
                continue  # skip lines that have no '|' separator
            line_question, line_answer = line_pair[0], line_pair[1]
            # Keep only the words that have a trained word vector.
            question_seq = [word_vector_dict[word]
                            for word in line_question.split(' ')
                            if word in word_vector_dict]
            answer_seq = [word_vector_dict[word]
                          for word in line_answer.split(' ')
                          if word in word_vector_dict]
            question_seqs.append(question_seq)
            answer_seqs.append(answer_seq)

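# Method of class MySeq2Seq (requires `import numpy as np`).
def generate_trainig_data(self):
    xy_data = []
    y_data = []
    for question_seq, answer_seq in zip(question_seqs, answer_seqs):
        # Skip pairs that do not fit into max_seq_len time steps.
        if len(question_seq) < self.max_seq_len and len(answer_seq) < self.max_seq_len:
            # Encoder part: zero-pad on the left, then the question in reverse order.
            sequence_xy = [np.zeros(self.word_vec_dim)] * (self.max_seq_len - len(question_seq)) + list(reversed(question_seq))
            # Decoder part: the answer, zero-padded on the right.
            sequence_y = answer_seq + [np.zeros(self.word_vec_dim)] * (self.max_seq_len - len(answer_seq))
            # Model input: question followed by answer (2 * max_seq_len steps).
            sequence_xy = sequence_xy + sequence_y
            # Target: an all-ones GO vector followed by the answer (max_seq_len + 1 steps).
            sequence_y = [np.ones(self.word_vec_dim)] + sequence_y
            xy_data.append(sequence_xy)
            y_data.append(sequence_y)
    return np.array(xy_data), np.array(y_data)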
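To see what the padding does to one question-answer pair, here is a small standalone sketch of the same layout. The helper name pad_pair and the toy 3-dimensional "word vectors" are assumptions made up for this example; they are not part of the original source.

```python
import numpy as np


def pad_pair(question_seq, answer_seq, max_seq_len, word_vec_dim):
    """Build one (xy, y) training sample from two lists of word vectors."""
    zero = np.zeros(word_vec_dim)
    # Left-pad the reversed question so it ends right before the answer.
    enc = [zero] * (max_seq_len - len(question_seq)) + list(reversed(question_seq))
    # Right-pad the answer to the same number of time steps.
    dec = list(answer_seq) + [zero] * (max_seq_len - len(answer_seq))
    xy = np.array(enc + dec)
    # The target starts with an all-ones GO vector.
    y = np.array([np.ones(word_vec_dim)] + dec)
    return xy, y


# Toy data: a 2-word question, a 1-word answer, at most 4 words per sentence.
question = [np.full(3, 1.0), np.full(3, 2.0)]
answer = [np.full(3, 5.0)]
xy, y = pad_pair(question, answer, max_seq_len=4, word_vec_dim=3)
print(xy.shape)            # (8, 3)
print(y.shape)             # (5, 3)
print(xy[:, 0].tolist())   # [0.0, 0.0, 2.0, 1.0, 5.0, 0.0, 0.0, 0.0]
print(y[:, 0].tolist())    # [1.0, 5.0, 0.0, 0.0, 0.0]
```

The last two lines print the first component of every time step, which makes the layout visible: zeros, the question reversed, the answer, zeros; and for y the GO marker followed by the padded answer.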

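4. Walking through the chatbot source code

The steps boil down to:

1. Import packages
2. Prepare the data
3. Build the model
4. Train
5. Predict

Steps 2 (prepare the data) and 3 (build the model) are the parts emphasized above.

1. Import packages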

import sys
import math
import tflearn
import tensorflow as tf
from tensorflow.python.ops import rnn_cell
from tensorflow.python.ops import rnn
import chardet
import numpy as np
import struct

2. Prepare the data

def load_word_set()
Splits the 3-million-line corpus into question and answer parts and collects every word that appears.

import io

def load_word_set():
    # word_set is a module-level dict used as a set: {word: 1}.
    # io.open decodes UTF-8 on both Python 2 and Python 3.
    with io.open('./segment_result_lined.3000000.pair.less', 'r', encoding='utf-8') as file_object:
        for line in file_object:
            # Strip the line ending so it does not stick to the last word.
            line_pair = line.rstrip('\r\n').split('|')
            if len(line_pair) < 2:
                continue  # skip lines that have no '|' separator
            for word in line_pair[0].split(' '):
                word_set[word] = 1
            for word in line_pair[1].split(' '):
                word_set[word] = 1

def load_vectors(input)
Loads the word vectors from vectors.bin and returns a dictionary, word_vector_dict, whose keys are words and whose values are 200-dimensional vectors.
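The body of load_vectors is not shown here, so the following is only a sketch of how such a loader might look. It assumes the standard word2vec binary layout (a text header "vocab_size dim", then for each entry the word, a space, and dim little-endian float32 values, with entries separated by a newline). The name load_vectors_from_stream is hypothetical, and unlike the original it takes an open binary stream rather than a path so that it can be tried on in-memory data.

```python
import io
import struct


def load_vectors_from_stream(stream):
    """Parse word2vec binary data from a binary file-like object."""
    header = stream.readline().split()
    vocab_size, dim = int(header[0]), int(header[1])
    word_vector_dict = {}
    for _ in range(vocab_size):
        # Read the word byte by byte up to the separating space.
        chars = []
        while True:
            ch = stream.read(1)
            if ch == b' ' or ch == b'':
                break
            if ch != b'\n':  # newline left over from the previous entry
                chars.append(ch)
        word = b''.join(chars).decode('utf-8')
        # Then read exactly `dim` float32 values (4 bytes each).
        vector = struct.unpack('<%df' % dim, stream.read(4 * dim))
        word_vector_dict[word] = list(vector)
    return word_vector_dict


# Build a tiny two-word "file" in memory to exercise the parser.
data = b'2 3\n'
data += b'hello ' + struct.pack('<3f', 1.0, 2.0, 3.0) + b'\n'
data += u'你好'.encode('utf-8') + b' ' + struct.pack('<3f', 0.5, 0.25, 0.0) + b'\n'
vectors = load_vectors_from_stream(io.BytesIO(data))
print(vectors['hello'])  # [1.0, 2.0, 3.0]
```

With a real file, the stream would come from open('vectors.bin', 'rb'); the file must be opened in binary mode because the vector bytes are not text.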

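def init_seq(input_file)
Looks up the word vector of each word in the questions and answers and stores the resulting sequences in question_seqs and answer_seqs. This is the same function listed in Section 3.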

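def vector_sqrtlen(vector)
Computes the length (Euclidean norm) of a vector.

def vector_sqrtlen(vector):
    # The accumulator is not called `len`, which would shadow the built-in.
    total = 0.0
    for item in vector:
        total += item * item
    return math.sqrt(total)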

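def vector_cosine(v1, v2)
Computes the cosine similarity of two vectors; the larger the value, the closer the two vectors are in direction.

def vector_cosine(v1, v2):
    if len(v1) != len(v2):
        raise ValueError('vectors must have the same dimension')
    sqrtlen1 = vector_sqrtlen(v1)
    sqrtlen2 = vector_sqrtlen(v2)
    if sqrtlen1 == 0 or sqrtlen2 == 0:
        return 0.0  # cosine is undefined for a zero vector
    value = 0.0
    for item1, item2 in zip(v1, v2):
        value += item1 * item2  # dot product
    return value / (sqrtlen1 * sqrtlen2)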

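def vector2word(vector)
Given a word vector, scans the word-vector dictionary for the vector closest to it (highest cosine similarity), remembers the corresponding word, and returns that word together with the cosine value.

def vector2word(vector):
    max_cos = -10000  # lower than any possible cosine value
    match_word = ''
    # Linear scan over the whole vocabulary.
    for word, v in word_vector_dict.items():
        cosine = vector_cosine(vector, v)
        if cosine > max_cos:
            max_cos = cosine
            match_word = word
    return (match_word, max_cos)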
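The scan in vector2word calls a pure-Python cosine once per vocabulary word, which gets slow on a large dictionary. As a possible alternative (not what the original source does), the same lookup can be done with one NumPy matrix product. The helper names build_index and nearest_word and the toy dictionary are assumptions for illustration.

```python
import numpy as np


def build_index(word_vector_dict):
    """Stack all vectors into one L2-normalised matrix (hypothetical helper)."""
    words = list(word_vector_dict)
    matrix = np.array([word_vector_dict[w] for w in words], dtype=np.float64)
    norms = np.linalg.norm(matrix, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # leave all-zero rows unchanged
    return words, matrix / norms


def nearest_word(vector, words, unit_matrix):
    """Return (word, cosine) for the row most similar to `vector`."""
    vector = np.asarray(vector, dtype=np.float64)
    norm = np.linalg.norm(vector)
    if norm == 0:
        return '', 0.0  # cosine is undefined for a zero vector
    # Rows are unit length, so the dot product is the cosine similarity.
    cosines = unit_matrix.dot(vector / norm)
    best = int(np.argmax(cosines))
    return words[best], float(cosines[best])


toy_dict = {'cat': [1.0, 0.0], 'dog': [0.0, 1.0], 'bird': [1.0, 1.0]}
words, unit_matrix = build_index(toy_dict)
word, cosine = nearest_word([0.9, 0.1], words, unit_matrix)
print(word, round(cosine, 3))  # cat 0.994
```

The index is built once and reused for every predicted vector, so the per-lookup cost is a single matrix-vector product.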

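3. Build the model

class MySeq2Seq(object)
These two parts were each written up separately in the previous two notes.

def generate_trainig_data(self)
Turns question_seqs and answer_seqs into xy_data and y_data.

def model(self, feed_previous=False)
Builds encoder_inputs and decoder_inputs (prefixed with a GO vector) from the input data.
encoder_inputs is passed to the encoder, which returns an output (the first value of the predicted sequence) and a state (handed to the decoder).
In the decoder, the encoder's last output serves as the first input; at prediction time, the output of the previous time step is used as the input of the next time step.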
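The body of model() is not reproduced here, so the following NumPy sketch only illustrates how the xy input could be split into encoder_inputs and GO-prefixed decoder_inputs. It assumes the layout produced by generate_trainig_data (question steps first, answer steps second) and an all-ones GO vector; the function name split_inputs is made up, and the real code works on TensorFlow tensors rather than arrays.

```python
import numpy as np


def split_inputs(xy_batch, max_seq_len):
    """Split a (batch, 2 * max_seq_len, dim) array into encoder/decoder inputs."""
    batch, _, dim = xy_batch.shape
    # First half: the (reversed, left-padded) question.
    encoder_inputs = xy_batch[:, :max_seq_len, :]
    # The decoder sees GO first, then the answer shifted right by one step,
    # so the last answer step is dropped from its input.
    answer_steps = xy_batch[:, max_seq_len:2 * max_seq_len - 1, :]
    go = np.ones((batch, 1, dim), dtype=xy_batch.dtype)
    decoder_inputs = np.concatenate([go, answer_steps], axis=1)
    return encoder_inputs, decoder_inputs


# One sample, max_seq_len = 3, 2-dimensional vectors; step k holds [2k, 2k + 1].
xy_batch = np.arange(12, dtype=np.float64).reshape(1, 6, 2)
encoder_inputs, decoder_inputs = split_inputs(xy_batch, max_seq_len=3)
print(encoder_inputs.shape, decoder_inputs.shape)  # (1, 3, 2) (1, 3, 2)
print(decoder_inputs[0, :, 0].tolist())            # [1.0, 6.0, 8.0]
```

Shifting the answer by one step is what lets the decoder learn to predict step t from the ground truth up to step t - 1 during training; with feed_previous=True its own previous output takes the place of that ground truth.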

4. Train

def train(self)
Calls generate_trainig_data() to produce the X and y data, passes them to the model defined above, trains it with model.fit, and then saves it.

    def train(self):
        trainXY, trainY = self.generate_trainig_data()
        model = self.model(feed_previous=False)
        model.fit(trainXY, trainY, n_epoch=1000, snapshot_epoch=False, batch_size=1)
        model.save('./model/model')
        return model

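5. Predict

generate_trainig_data() builds the input data and model.predict runs the prediction. Each sample in the result corresponds to the word-vector sequence of one sentence. For every vector in a sample, the nearest vector in the word-vector dictionary is looked up, and the matching word is returned together with the cosine between the two.

if __name__ == '__main__':
    # sys.argv[1] == 'train' trains the model; any other value runs prediction.
    # An optional sys.argv[2] names the input file.
    phrase = sys.argv[1]
    # word_vec_dim and max_seq_len are module-level settings.
    if len(sys.argv) == 3:
        my_seq2seq = MySeq2Seq(word_vec_dim=word_vec_dim, max_seq_len=max_seq_len, input_file=sys.argv[2])
    else:
        my_seq2seq = MySeq2Seq(word_vec_dim=word_vec_dim, max_seq_len=max_seq_len)
    if phrase == 'train':
        my_seq2seq.train()
    else:
        model = my_seq2seq.load()
        trainXY, trainY = my_seq2seq.generate_trainig_data()
        predict = model.predict(trainXY)
        for sample in predict:
            print("predict answer")
            # Skip the first vector and map each remaining one to its nearest word.
            for w in sample[1:]:
                (match_word, max_cos) = vector2word(w)
                #if vector_sqrtlen(w) < 1:
                #    break
                print('%s %s %s' % (match_word, max_cos, vector_sqrtlen(w)))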
