Deep Learning (Andrew Ng), Course 5, Week 3 Programming Assignment 1: Machine Translation

阿新 • Published: 2019-01-12

PyCharm version

from keras.layers import Bidirectional, Concatenate, Permute, Dot, Input, LSTM, Multiply
from keras.layers import RepeatVector, Dense, Activation, Lambda
from keras.optimizers import Adam
from keras.utils import to_categorical
from keras.models import load_model, Model
import keras.backend as K
import numpy as np

from faker import Faker             # third-party library that generates fake data (cities, names, etc.); also supports Chinese
import random
from tqdm import tqdm               # progress bar for loops
from babel.dates import format_date # A collection of tools for internationalizing Python applications.
from nmt_utils import *
import matplotlib.pyplot as plt



# GRADED FUNCTION: one_step_attention
def one_step_attention(a, s_prev):
    """
    Performs one step of attention: Outputs a context vector computed as a dot product of the attention weights
    "alphas" and the hidden states "a" of the Bi-LSTM.

    Arguments:
    a -- hidden state output of the Bi-LSTM, numpy-array of shape (m, Tx, 2*n_a)
    s_prev -- previous hidden state of the (post-attention) LSTM, numpy-array of shape (m, n_s)

    Returns:
    context -- context vector, input of the next (post-attention) LSTM cell
    """ 


    ### START CODE HERE ###
    # Use repeator to repeat s_prev to be of shape (m, Tx, n_s) so that you can concatenate it with all hidden states "a" (≈ 1 line)
    s_prev = repeator(s_prev)
    # Use concatenator to concatenate a and s_prev on the last axis (≈ 1 line)
    concat = concatenator([a, s_prev])
    # Use densor1 to propagate concat through a small fully-connected neural network to compute the "intermediate energies" variable e. (≈1 lines)
    e = densor1(concat)
    # Use densor2 to propagate e through a small fully-connected neural network to compute the "energies" variable energies. (≈1 lines)
    energies = densor2(e)
    # Use "activator" on "energies" to compute the attention weights "alphas" (≈ 1 line)
    alphas = activator(energies)
    # Use dotor together with "alphas" and "a" to compute the context vector to be given to the next (post-attention) LSTM-cell (≈ 1 line)
    context = dotor([alphas, a])
    ### END CODE HERE ###

    return context


# GRADED FUNCTION: model
def model(Tx, Ty, n_a, n_s, human_vocab_size, machine_vocab_size):
    """
    Arguments:
    Tx -- length of the input sequence
    Ty -- length of the output sequence
    n_a -- hidden state size of the Bi-LSTM
    n_s -- hidden state size of the post-attention LSTM
    human_vocab_size -- size of the python dictionary "human_vocab"
    machine_vocab_size -- size of the python dictionary "machine_vocab"

    Returns:
    model -- Keras model instance
    """

    # Define the inputs of your model with a shape (Tx, human_vocab_size)
    # Define s0 and c0, initial hidden state for the decoder LSTM of shape (n_s,)
    X = Input(shape=(Tx, human_vocab_size))
    s0 = Input(shape=(n_s,), name='s0')
    c0 = Input(shape=(n_s,), name='c0')
    s = s0
    c = c0

    # Initialize empty list of outputs
    outputs = []

    ### START CODE HERE ###

    # Step 1: Define your pre-attention Bi-LSTM. Remember to use return_sequences=True. (≈ 1 line)
    a = Bidirectional(LSTM(n_a, return_sequences=True))(X)

    # Step 2: Iterate for Ty steps
    for t in range(Ty):

        # Step 2.A: Perform one step of the attention mechanism to get back the context vector at step t (≈ 1 line)
        context = one_step_attention(a, s)

        # Step 2.B: Apply the post-attention LSTM cell to the "context" vector.
        # Don't forget to pass: initial_state = [hidden state, cell state] (≈ 1 line)
        s, _, c = post_activation_LSTM_cell(context, initial_state = [s, c])

        # Step 2.C: Apply Dense layer to the hidden state output of the post-attention LSTM (≈ 1 line)
        out = output_layer(s)

        # Step 2.D: Append "out" to the "outputs" list (≈ 1 line)
        outputs.append(out)

    # Step 3: Create model instance taking three inputs and returning the list of outputs. (≈ 1 line)
    model = Model(inputs=[X,s0,c0],outputs=outputs)

    ### END CODE HERE ###

    return model

if __name__ == '__main__':

    # 1 - Translating human readable dates into machine readable dates.
    ## 1.1 - DataSet

    m = 10000
    dataset, human_vocab, machine_vocab, inv_machine_vocab = load_dataset(m)
    print(str(dataset[:10]) + '\n')
    Tx = 30
    Ty = 10
    X, Y, Xoh, Yoh = preprocess_data(dataset, human_vocab, machine_vocab, Tx, Ty)

    print("X.shape:", X.shape)
    print("Y.shape:", Y.shape)
    print("Xoh.shape:", Xoh.shape)
    print("Yoh.shape:", Yoh.shape)
    index = 0
    print("Source date:", dataset[index][0])
    print("Target date:", dataset[index][1])
    print()
    print("Source after preprocessing (indices):", X[index])
    print("Target after preprocessing (indices):", Y[index])
    print()
    print("Source after preprocessing (one-hot):", Xoh[index])
    print("Target after preprocessing (one-hot):", Yoh[index])


    # 2 - Neural machine translation with attention
    ## 2.1 - Attention mechanism
    # Define shared layers as global variables
    ## one_step_attention
    repeator = RepeatVector(Tx)
    concatenator = Concatenate(axis=-1)
    densor1 = Dense(10, activation="tanh")
    densor2 = Dense(1, activation="relu")
    activator = Activation(softmax,
                           name='attention_weights')  # custom softmax(axis=1) imported from nmt_utils
    dotor = Dot(axes=1)
    ## whole model
    n_a = 32
    n_s = 64
    post_activation_LSTM_cell = LSTM(n_s, return_state=True)
    output_layer = Dense(len(machine_vocab), activation=softmax)

    model = model(Tx, Ty, n_a, n_s, len(human_vocab), len(machine_vocab))
    model.summary()


    ### START CODE HERE ### (≈2 lines)
    opt = Adam(lr=0.0005, beta_1=0.9, beta_2=0.999, decay=0.01)
    model.compile(loss='categorical_crossentropy', optimizer=opt, metrics=['accuracy'])
    ### END CODE HERE ###

    s0 = np.zeros((m, n_s))
    c0 = np.zeros((m, n_s))
    outputs = list(Yoh.swapaxes(0, 1))

    # model.fit([Xoh, s0, c0], outputs, epochs=1, batch_size=100)

    model.load_weights('models/model.h5')

    EXAMPLES = ['3 May 1979', '5 April 09', '21th of August 2016', 'Tue 10 Jul 2007', 'Saturday May 9 2018',
                'March 3 2001', 'March 3rd 2001', '1 March 2001']

    for example in EXAMPLES:
        source = string_to_int(example, Tx, human_vocab)

        source = np.array(list(map(lambda x: to_categorical(x, num_classes=len(human_vocab)), source))).swapaxes(0, 1)

        ### These two lines reshape the source array to the shape the model expects, (1, Tx, len(human_vocab)); found to work by experiment
        source = source.T
        source = source[np.newaxis, :]
        print(source.shape)
        ###

        prediction = model.predict([source, s0, c0])
        prediction = np.argmax(prediction, axis=-1)
        output = [inv_machine_vocab[int(i)] for i in prediction]

        print("source:", example)
        print("output:", ''.join(output))


    # 3 - Visualizing Attention (Optional / Ungraded)
    model.summary()
    attention_map = plot_attention_map(model, human_vocab, inv_machine_vocab, "Tuesday 09 Oct 1993", num=7, n_s=64)


    print(" END !!!")
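
The attention step in one_step_attention is built from shared Keras layers, which hides the tensor arithmetic. The following is a minimal NumPy sketch of the same computation, assuming the layer sizes used in the listing (Dense(10, tanh), Dense(1, relu), softmax over the time axis). The function name one_step_attention_np and the weight arguments W1, b1, W2, b2 are hypothetical stand-ins for the weights held by densor1 and densor2; the values are random, not trained.

```python
import numpy as np

def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    shifted = x - np.max(x, axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=axis, keepdims=True)

def one_step_attention_np(a, s_prev, W1, b1, W2, b2):
    """
    a      -- (m, Tx, 2*n_a) Bi-LSTM hidden states
    s_prev -- (m, n_s) previous decoder hidden state
    W1, b1 -- hypothetical weights standing in for densor1 (tanh)
    W2, b2 -- hypothetical weights standing in for densor2 (relu)
    """
    Tx = a.shape[1]
    # RepeatVector(Tx): (m, n_s) -> (m, Tx, n_s)
    s_rep = np.repeat(s_prev[:, np.newaxis, :], Tx, axis=1)
    # Concatenate(axis=-1): (m, Tx, 2*n_a + n_s)
    concat = np.concatenate([a, s_rep], axis=-1)
    # Dense(10, activation="tanh"): (m, Tx, 10)
    e = np.tanh(concat @ W1 + b1)
    # Dense(1, activation="relu"): (m, Tx, 1)
    energies = np.maximum(0.0, e @ W2 + b2)
    # Softmax over the time axis, so the Tx weights of each example sum to 1
    alphas = softmax(energies, axis=1)
    # Dot(axes=1): weighted sum of hidden states over Tx -> (m, 1, 2*n_a)
    context = np.einsum('mti,mtj->mij', alphas, a)
    return context, alphas

# Random inputs with the sizes used in the listing (Tx=30, n_a=32, n_s=64)
rng = np.random.default_rng(0)
m, Tx, n_a, n_s = 2, 30, 32, 64
a = rng.standard_normal((m, Tx, 2 * n_a))
s_prev = rng.standard_normal((m, n_s))
W1 = rng.standard_normal((2 * n_a + n_s, 10)) * 0.1
b1 = np.zeros(10)
W2 = rng.standard_normal((10, 1)) * 0.1
b2 = np.zeros(1)

context, alphas = one_step_attention_np(a, s_prev, W1, b1, W2, b2)
print(context.shape)  # (2, 1, 64)
print(alphas.shape)   # (2, 30, 1)
```

The context keeps a length-1 time axis, which is why it can be fed directly to the post-attention LSTM as a one-step sequence.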

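The prediction loop reshapes each example by trial and error and then decodes a list of Ty output arrays. As a hedged illustration of those two steps without Keras, the sketch below builds the (1, Tx, vocab_size) one-hot input directly and decodes a list of per-step outputs. The helper names one_hot_batch_of_one and decode_outputs, the toy vocabulary, and the fake prediction are all made up for the example; they are not part of nmt_utils.

```python
import numpy as np

def one_hot_batch_of_one(indices, vocab_size):
    """Turn a length-Tx list of integer indices into a (1, Tx, vocab_size) array."""
    indices = np.asarray(indices, dtype=int)
    one_hot = np.eye(vocab_size)[indices]  # (Tx, vocab_size)
    return one_hot[np.newaxis, ...]        # add the batch axis

def decode_outputs(prediction, inv_vocab):
    """
    prediction -- list of Ty arrays, each of shape (1, vocab_size),
                  the layout a Keras model with Ty output heads returns
    inv_vocab  -- dict mapping index -> character
    """
    stacked = np.asarray(prediction)          # (Ty, 1, vocab_size)
    best = np.argmax(stacked, axis=-1)[:, 0]  # (Ty,) most likely index per step
    return ''.join(inv_vocab[int(i)] for i in best)

# Toy vocabulary, invented for illustration only
toy_inv_vocab = {0: '-', 1: '0', 2: '1', 3: '2'}

source = one_hot_batch_of_one([2, 0, 3, 1], vocab_size=4)
print(source.shape)  # (1, 4, 4)

# Fake model output with Ty = 3: each step puts all probability on one index
fake_prediction = [np.eye(4)[[3]], np.eye(4)[[0]], np.eye(4)[[1]]]
decoded = decode_outputs(fake_prediction, toy_inv_vocab)
print(decoded)  # 2-0
```

Under the same batch-of-one assumption, zero initial states shaped (1, n_s) would match the single example; the listing itself passes the (m, n_s) arrays created for training.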