
DeepLearning.ai Assignment (1-4): Deep Neural Networks

  1. Don't copy the assignment!
  2. I'm only writing down my thought process here, for my own study.
  3. Don't copy the assignment!

This week's assignment is split into two parts: the first part builds the basic building-block functions of the neural network, and only the second part assembles the model and makes predictions.

Part 1

The functions to build are:

  • Initialize the parameters
    • two-layer
    • L-layer
  • forward propagation
    • Linear part: first build the purely linear computation
    • linear->activation: then combine the linear step with the activation for a single layer
    • L_model_forward function: finally chain L-1 ReLU layers with one sigmoid layer at the end
  • Compute loss
  • backward propagation
    • Linear part
    • linear->activation
    • L_model_backward function

Initialization

The initialization uses:

w : np.random.randn(shape)*0.01

b : np.zeros(shape)

1. two-layer

First, a two-layer initialization function, which we already wrote last week.

import numpy as np

def initialize_parameters(n_x, n_h, n_y):
    """
    Argument:
    n_x -- size of the input layer
    n_h -- size of the hidden layer
    n_y -- size of the output layer

    Returns:
    parameters -- python dictionary containing your parameters:
                    W1 -- weight matrix of shape (n_h, n_x)
                    b1 -- bias vector of shape (n_h, 1)
                    W2 -- weight matrix of shape (n_y, n_h)
                    b2 -- bias vector of shape (n_y, 1)
    """
    np.random.seed(1)

    ### START CODE HERE ### (≈ 4 lines of code)
    W1 = np.random.randn(n_h, n_x) * 0.01
    b1 = np.zeros((n_h, 1))
    W2 = np.random.randn(n_y, n_h) * 0.01
    b2 = np.zeros((n_y, 1))
    ### END CODE HERE ###

    assert(W1.shape == (n_h, n_x))
    assert(b1.shape == (n_h, 1))
    assert(W2.shape == (n_y, n_h))
    assert(b2.shape == (n_y, 1))

    parameters = {"W1": W1,
                  "b1": b1,
                  "W2": W2,
                  "b2": b2}

    return parameters
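
A quick sanity check of my own (not part of the graded notebook): for a 2-4-1 network the shapes should come out as follows.

params = initialize_parameters(2, 4, 1)
print(params["W1"].shape, params["b1"].shape)   # (4, 2) (4, 1)
print(params["W2"].shape, params["b2"].shape)   # (1, 4) (1, 1)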

2. L-layer

Then an L-layer initialization function. Its input is a list such as [12, 4, 3, 1], meaning 4 layers in total (the first entry being the input size):

def initialize_parameters_deep(layer_dims):
    """
    Arguments:
    layer_dims -- python array (list) containing the dimensions of each layer in our network

    Returns:
    parameters -- python dictionary containing your parameters "W1", "b1", ..., "WL", "bL":
                    Wl -- weight matrix of shape (layer_dims[l], layer_dims[l-1])
                    bl -- bias vector of shape (layer_dims[l], 1)
    """

    np.random.seed(3)
    parameters = {}
    L = len(layer_dims)            # number of layers in the network

    for l in range(1, L):
        ### START CODE HERE ### (≈ 2 lines of code)
        parameters['W' + str(l)] = np.random.randn(layer_dims[l], layer_dims[l-1]) * 0.01
        parameters['b' + str(l)] = np.zeros((layer_dims[l], 1))
        ### END CODE HERE ###

        assert(parameters['W' + str(l)].shape == (layer_dims[l], layer_dims[l-1]))
        assert(parameters['b' + str(l)].shape == (layer_dims[l], 1))


    return parameters
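
To check the [12, 4, 3, 1] example mentioned above (again my own quick test, not from the notebook), we should get three weight matrices and three bias vectors:

params = initialize_parameters_deep([12, 4, 3, 1])
for l in range(1, 4):
    print(params["W" + str(l)].shape, params["b" + str(l)].shape)
# (4, 12) (4, 1)
# (3, 4) (3, 1)
# (1, 3) (1, 1)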

Forward propagation module

1. Linear Forward

Using the formula:

$$Z^{[l]} = W^{[l]} A^{[l-1]} + b^{[l]}$$

where $A^{[0]} = X$.

Here the inputs are A, W, b, and the outputs are the computed Z plus cache = (A, W, b), which is stored for later use.

def linear_forward(A, W, b):
    """
    Implement the linear part of a layer's forward propagation.

    Arguments:
    A -- activations from previous layer (or input data): (size of previous layer, number of examples)
    W -- weights matrix: numpy array of shape (size of current layer, size of previous layer)
    b -- bias vector, numpy array of shape (size of the current layer, 1)

    Returns:
    Z -- the input of the activation function, also called pre-activation parameter 
    cache -- a python dictionary containing "A", "W" and "b" ; stored for computing the backward pass efficiently
    """

    ### START CODE HERE ### (≈ 1 line of code)
    Z = np.dot(W, A) + b
    ### END CODE HERE ###

    assert(Z.shape == (W.shape[0], A.shape[1]))
    cache = (A, W, b)

    return Z, cache
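
A rough shape check of my own: with 3 units in the previous layer, 2 units in the current layer and 5 examples, Z should be (2, 5).

A_prev = np.random.randn(3, 5)
W = np.random.randn(2, 3)
b = np.zeros((2, 1))
Z, cache = linear_forward(A_prev, W, b)
print(Z.shape)   # (2, 5)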

2. Linear-Activation Forward

Here we take the Z just computed and pass it through the activation A = g(Z), merging the two steps into one function.

At this point the notebook already provides ready-made sigmoid and relu functions, so we only need to call them. Their source doesn't seem to be shown there, though; both return A and cache = Z, so I'm pasting them here:

def sigmoid(Z):
    """
    Implements the sigmoid activation in numpy

    Arguments:
    Z -- numpy array of any shape

    Returns:
    A -- output of sigmoid(z), same shape as Z
    cache -- returns Z as well, useful during backpropagation
    """

    A = 1/(1+np.exp(-Z))
    cache = Z

    return A, cache
def relu(Z):
    """
    Implement the RELU function.

    Arguments:
    Z -- Output of the linear layer, of any shape

    Returns:
    A -- Post-activation parameter, of the same shape as Z
    cache -- returns Z, stored for computing the backward pass efficiently
    """

    A = np.maximum(0,Z)

    assert(A.shape == Z.shape)

    cache = Z 
    return A, cache
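
A tiny element-wise example of my own showing what the two activations do:

Z = np.array([[-1.0, 0.0, 2.0]])
print(relu(Z)[0])      # [[0. 0. 2.]]
print(sigmoid(Z)[0])   # ≈ [[0.269 0.5 0.881]]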

Then, using linear_forward from above, we can write the forward function for a single layer. The inputs are $A^{[l-1]}$, W, b, plus a string activation that says whether to use sigmoid or relu.

The outputs are $A^{[l]}$ and cache, where cache now holds four things: $A^{[l-1]}, W^{[l]}, b^{[l]}, Z^{[l]}$.


# GRADED FUNCTION: linear_activation_forward

def linear_activation_forward(A_prev, W, b, activation):
    """
    Implement the forward propagation for the LINEAR->ACTIVATION layer

    Arguments:
    A_prev -- activations from previous layer (or input data): (size of previous layer, number of examples)
    W -- weights matrix: numpy array of shape (size of current layer, size of previous layer)
    b -- bias vector, numpy array of shape (size of the current layer, 1)
    activation -- the activation to be used in this layer, stored as a text string: "sigmoid" or "relu"

    Returns:
    A -- the output of the activation function, also called the post-activation value 
    cache -- a python dictionary containing "linear_cache" and "activation_cache";
             stored for computing the backward pass efficiently
    """

    if activation == "sigmoid":
        # Inputs: "A_prev, W, b". Outputs: "A, activation_cache".
        ### START CODE HERE ### (≈ 2 lines of code)
        Z, linear_cache = linear_forward(A_prev, W, b)
        A, activation_cache = sigmoid(Z)
        ### END CODE HERE ###

    elif activation == "relu":
        # Inputs: "A_prev, W, b". Outputs: "A, activation_cache".
        ### START CODE HERE ### (≈ 2 lines of code)
        Z, linear_cache = linear_forward(A_prev, W, b)
        A, activation_cache = relu(Z)
        ### END CODE HERE ###

    assert (A.shape == (W.shape[0], A_prev.shape[1]))
    cache = (linear_cache, activation_cache)
   # print(cache)
    return A, cache
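
To see that the returned cache really does hold the four values mentioned above, a quick check of my own:

A_prev = np.random.randn(3, 5)
W = np.random.randn(2, 3)
b = np.zeros((2, 1))
A, cache = linear_activation_forward(A_prev, W, b, "relu")
linear_cache, activation_cache = cache   # linear_cache = (A_prev, W, b), activation_cache = Z
print(A.shape, activation_cache.shape)   # (2, 5) (2, 5)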

3. L-Layer Model

This step chains the whole multi-layer network together from start to finish: L-1 ReLU layers in front, with a sigmoid at layer L.

The inputs are X, i.e. $A^{[0]}$, and parameters, which holds each layer's W and b.

The outputs are the last layer's $A^{[L]}$, i.e. the prediction Yhat, together with every layer's cache: $A^{[l-1]}, W^{[l]}, b^{[l]}, Z^{[l]}$.

def L_model_forward(X, parameters):
    """
    Implement forward propagation for the [LINEAR->RELU]*(L-1)->LINEAR->SIGMOID computation

    Arguments:
    X -- data, numpy array of shape (input size, number of examples)
    parameters -- output of initialize_parameters_deep()

    Returns:
    AL -- last post-activation value
    caches -- list of caches containing:
                every cache of linear_activation_forward() (there are L of them, indexed from 0 to L-1)
    """

    caches = []
    A = X
    L = len(parameters) // 2                  # number of layers in the neural network

    # Implement [LINEAR -> RELU]*(L-1). Add "cache" to the "caches" list.
    for l in range(1, L):
        A_prev = A 
        ### START CODE HERE ### (≈ 2 lines of code)
        A, cache = linear_activation_forward(A_prev, parameters['W'+str(l)], parameters['b'+str(l)], 'relu')
        caches.append(cache)
        ### END CODE HERE ###

    # Implement LINEAR -> SIGMOID. Add "cache" to the "caches" list.
    ### START CODE HERE ### (≈ 2 lines of code)
    AL, cache = linear_activation_forward(A, parameters['W'+str(L)], parameters['b'+str(L)],'sigmoid')
    caches.append(cache)
    ### END CODE HERE ###
   # print(AL.shape)
    assert(AL.shape == (1,X.shape[1]))

    return AL, caches
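
Putting the pieces above together on random data (my own sanity check, using the functions defined so far): with layer_dims = [12, 4, 3, 1] there are 3 parameterised layers, so we expect 3 caches and AL of shape (1, m).

X = np.random.randn(12, 7)                              # 7 examples
parameters = initialize_parameters_deep([12, 4, 3, 1])
AL, caches = L_model_forward(X, parameters)
print(AL.shape, len(caches))                            # (1, 7) 3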

Cost function

$$-\frac{1}{m}\sum_{i=1}^{m}\left(y^{(i)}\log\left(a^{[L](i)}\right) + (1-y^{(i)})\log\left(1-a^{[L](i)}\right)\right)$$

Use np.multiply and np.sum to compute the cross-entropy.


def compute_cost(AL, Y):
    """
    Implement the cost function defined by equation (7).

    Arguments:
    AL -- probability vector corresponding to your label predictions, shape (1, number of examples)
    Y -- true "label" vector (for example: containing 0 if non-cat, 1 if cat), shape (1, number of examples)

    Returns:
    cost -- cross-entropy cost
    """

    m = Y.shape[1]

    # Compute loss from aL and y.
    ### START CODE HERE ### (≈ 1 lines of code)
    cost = - np.sum(np.multiply(Y,np.log(AL)) + np.multiply(1-Y,np.log(1-AL))) / m
    print(cost)
    ### END CODE HERE ###
    cost = np.squeeze(cost)      # To make sure your cost's shape is what we expect (e.g. this turns [[17]] into 17).
    assert(cost.shape == ())

    return cost
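
For example (numbers of my own choosing), with three examples:

Y = np.array([[1, 1, 0]])
AL = np.array([[0.8, 0.9, 0.4]])
cost = compute_cost(AL, Y)   # ≈ 0.2798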

Backward propagation module

1. Linear backward

First, assume we already know $dZ^{[l]} = \frac{\partial \mathcal{L}}{\partial Z^{[l]}}$; what we then want to compute is $dW^{[l]}$, $db^{[l]}$ and $dA^{[l-1]}$.
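
For reference, these are the standard gradients from the course for the linear part (stated here without derivation):

$$dW^{[l]} = \frac{1}{m}\, dZ^{[l]} A^{[l-1]T}, \qquad db^{[l]} = \frac{1}{m}\sum_{i=1}^{m} dZ^{[l](i)}, \qquad dA^{[l-1]} = W^{[l]T} dZ^{[l]}$$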