
Andrew Ng Deep Learning Exercise 2.2_Improving Deep Neural Networks_Optimization

Copyright notice: this is an original article by the author and may not be reposted without permission. https://blog.csdn.net/weixin_42432468

Study notes:
1. Watch each week's video lectures once or twice.
2. Take notes.

3. Do each week's programming assignments; they are where the real value is. Work through the notebook first, and once you have it down, type the code out yourself so you can use it fluently later.


1. Load Dataset

2. Algorithm implementation

2.1 Initialize parameters

2.2 Forward propagation functions

2.3 Compute cost

2.4 Backward propagation functions

2.5 Update parameters

3. Prediction

# import packages
import numpy as np
import matplotlib.pyplot as plt
# from reg_utils import sigmoid, relu, plot_decision_boundary, initialize_parameters, load_2D_dataset, predict_dec
# from reg_utils import compute_cost, predict, forward_propagation, backward_propagation, update_parameters
from reg_utils import load_2D_dataset
import sklearn
import sklearn.datasets
import scipy.io
from testCases_improve_regulariation import *

%matplotlib inline
plt.rcParams['figure.figsize'] = (7.0, 4.0) # set default size of plots
plt.rcParams['image.interpolation'] = 'nearest'
plt.rcParams['image.cmap'] = 'gray'

1. Load Dataset

train_X, train_Y, test_X, test_Y = load_2D_dataset()

[Figure: scatter plot of the loaded 2D training dataset]

# Inspect the loaded dataset: type, shape, and the first example
print ('train_X:\n',type(train_X),train_X.shape,'\n')
print (train_X[:,0])
print ('test_X:\n',type(test_X),test_X.shape,'\n')
print (test_X[:,0])
train_X:
 <class 'numpy.ndarray'> (2, 211) 

[-0.158986  0.423977]
test_X:
 <class 'numpy.ndarray'> (2, 200) 

[-0.35306235 -0.67390181]

2. Algorithm implementation

2.1 Initialize parameters

def initialize_parameters(layer_dims,initialization='he'):
    
    np.random.seed(3)
    L = len(layer_dims)
    pars = {}
    if initialization == 'zeros':
        for l in range(1,L):
            pars['W'+str(l)] = np.zeros((layer_dims[l],layer_dims[l-1]))
            pars['b'+str(l)] = np.zeros((layer_dims[l],1))
        
    elif initialization == 'random':
        for l in range(1,L):
#             pars['W'+str(l)] = np.random.randn(layer_dims[l],layer_dims[l-1])*10
            pars['W'+str(l)] = np.random.randn(layer_dims[l],layer_dims[l-1])
            pars['b'+str(l)] = np.zeros((layer_dims[l],1))
          
    elif initialization == 'he':
        for l in range(1,L):
#             pars['W'+str(l)] = np.random.randn(layer_dims[l],layer_dims[l-1])* np.sqrt(2./layer_dims[l-1])   # sqrt(2/n): the standard He scaling for ReLU
            pars['W'+str(l)] = np.random.randn(layer_dims[l],layer_dims[l-1])* np.sqrt(1./layer_dims[l-1])   # note: sqrt(1/n) is the Xavier-style factor, not He
            pars['b'+str(l)] = np.zeros((layer_dims[l],1))
        
    return pars
# test initialize_parameters function
pars_test = initialize_parameters([3,2,1],initialization='he')
print (pars_test)
pars_test = initialize_parameters([3,2,1],initialization='random')
print (pars_test)
{'W1': array([[ 1.03266513,  0.25201908,  0.05571284],
       [-1.07588801, -0.16015015, -0.20482019]]), 'b1': array([[0.],
       [0.]]), 'W2': array([[-0.05850706, -0.44335643]]), 'b2': array([[0.]])}
{'W1': array([[ 1.78862847,  0.43650985,  0.09649747],
       [-1.8634927 , -0.2773882 , -0.35475898]]), 'b1': array([[0.],
       [0.]]), 'W2': array([[-0.08274148, -0.62700068]]), 'b2': array([[0.]])}
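For reference, the scalings implemented above are, per layer (with b^{[l]} = 0 in every case and n^{[l-1]} the number of units in layer l-1). Note that the commented-out line in the 'he' branch uses the standard He factor \sqrt{2/n^{[l-1]}} for ReLU, while the currently active line uses the Xavier-style factor \sqrt{1/n^{[l-1]}}:

W^{[l]}_{\text{zeros}} = 0, \qquad
W^{[l]}_{\text{random}} \sim \mathcal{N}(0,1), \qquad
W^{[l]}_{\text{He}} \sim \mathcal{N}(0,1)\cdot\sqrt{\tfrac{2}{n^{[l-1]}}}, \qquad
W^{[l]}_{\text{Xavier}} \sim \mathcal{N}(0,1)\cdot\sqrt{\tfrac{1}{n^{[l-1]}}}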

2.2 Forward propagation functions

def linear_forward(A,W,b,keep_prob=1,regularization=None):
    
    np.random.seed(1)
    D = np.random.rand(A.shape[0],A.shape[1])
    # this code for dropout
    if regularization == 'dropout':
#         print ('D:\n',D)   # D2 differs from the course notebook's D2 on the second call: np.random.seed(1) above resets
#                            # the RNG on every call, while the notebook seeds once and draws D1 and D2 from the same stream.
#                            # This is why the final results here differ from the notebook's.
        D = np.where(D <= keep_prob,1,0)
        A = np.multiply(A,D)
        A = A/keep_prob
    #####################################
    
    Z = np.dot(W,A) + b
    cache = (A,W,b,D)
    
    return Z,cache
# The first random draw is the same in either case; the later draws differ
np.random.seed(1)  # seed placed here (before the loop): the draws below match the block further down
for i in range(3):
#     np.random.seed(1)     # seed placed here (inside the loop): the later random draws would no longer match
    D = np.random.rand(2,3)
    print (D,'\n')

np.random.seed(1)
print ('- '*30)
D = np.random.rand(2,3)
print (D,'\n')
D = np.random.rand(2,3)
print (D,'\n')
D = np.random.rand(2,3)
print (D,'\n')
[[4.17022005e-01 7.20324493e-01 1.14374817e-04]
 [3.02332573e-01 1.46755891e-01 9.23385948e-02]] 

[[0.18626021 0.34556073 0.39676747]
 [0.53881673 0.41919451 0.6852195 ]] 

[[0.20445225 0.87811744 0.02738759]
 [0.67046751 0.4173048  0.55868983]] 

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
[[4.17022005e-01 7.20324493e-01 1.14374817e-04]
 [3.02332573e-01 1.46755891e-01 9.23385948e-02]] 

[[0.18626021 0.34556073 0.39676747]
 [0.53881673 0.41919451 0.6852195 ]] 

[[0.20445225 0.87811744 0.02738759]
 [0.67046751 0.4173048  0.55868983]] 
def sigmoid_forward(Z):
    '''
    arguments:
    Z --> input array

    returns:
    A --> sigmoid(Z)
    cache --> Z, stored for the backward pass

    '''
    A = 1./(1+np.exp(-Z))
    cache = Z
    
    return A,cache
def relu_forward(Z):
    '''
    arguments:
    Z --> input array

    returns:
    A --> ReLU(Z)
    cache --> Z, stored for the backward pass

    '''
#     s = np.maximum(0.01*x,x)
    A = np.maximum(0,Z)
    cache = Z
    
    return A,cache
def activation_forward(Z,activation):
    
    if activation == 'sigmoid':
        A,cache = sigmoid_forward(Z)
    elif activation == 'relu':
        A,cache = relu_forward(Z)
    
    return A,cache
def linear_activation_forward(A_prev,W,b,activation,keep_prob=1,regularization=None):
    
    Z,linear_cache = linear_forward(A_prev,W,b,keep_prob=keep_prob,regularization=regularization)
    A,activation_cache =  activation_forward(Z,activation)
    cache = (linear_cache,activation_cache)
    
    return A,cache
def L_model_forward(X,pars,keep_prob=1,regularization=None):
    caches = []
    A = X
    L = len(pars)//2 + 1
    np.random.seed(1)
    
    A_prev = A
    A,cache = linear_activation_forward(A_prev,pars['W1'],pars['b1'],activation='relu',keep_prob=1,regularization=None)
    caches.append(cache)
    
#     A_prev = A
#     A,cache = linear_activation_forward(A_prev,pars['W2'],pars['b2'],activation='relu',keep_prob=keep_prob,regularization=regularization)
#     caches.append(cache)
    
    for l in range(2,L-1):
        A_prev = A
        A,cache = linear_activation_forward(A_prev,pars['W'+str(l)],pars['b'+str(l)],activation='relu',keep_prob=keep_prob,regularization=regularization)
        caches.append(cache)
        
    AL,cache = linear_activation_forward(A,pars['W'+str(L-1)],pars['b'+str(L-1)],activation='sigmoid',keep_prob=keep_prob,regularization=regularization)
    caches.append(cache)
    assert(AL.shape == (1,X.shape[1]))

    return AL,caches
X_assess, parameters = forward_propagation_with_dropout_test_case()

A3, cache = L_model_forward(X_assess, parameters, keep_prob = 0.7,regularization='dropout')
print ("A3 = " + str(A3))
A3 = [[0.36974721 0.49683389 0.04565099 0.01446893 0.36974721]]
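The inverted-dropout step inside linear_forward masks the incoming activations and rescales them, so their expected value is unchanged and no extra scaling is needed at test time:

D^{[l]} = \big(\text{rand}(n^{[l-1]}, m) \le \text{keep\_prob}\big), \qquad
A^{[l-1]} \leftarrow \frac{A^{[l-1]} \odot D^{[l]}}{\text{keep\_prob}}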

2.3 Compute cost
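The cost implemented below is the cross-entropy loss; when regularization='L2', the Frobenius-norm penalty over all weight matrices is added:

J = -\frac{1}{m}\sum_{i=1}^{m}\Big[\, y^{(i)}\log a^{[L](i)} + \big(1-y^{(i)}\big)\log\big(1-a^{[L](i)}\big) \Big] + \frac{\lambda}{2m}\sum_{l}\big\lVert W^{[l]}\big\rVert_F^2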

def compute_cost(AL,Y,pars,lambd=0,regularization=None):
    assert(AL.shape[1] == Y.shape[1])

#     cost = -np.mean(Y*np.log(AL)+(1-Y)*np.log(1-AL),axis=1,keepdims=True) # '*' is element-wise for arrays but matrix multiplication for np.matrix

    m = Y.shape[1]
#     cost = (1./m) * (-np.dot(Y,np.log(AL).T) - np.dot(1-Y, np.log(1-AL).T)) # np.dot multiplies element-wise and sums in one step for 1-D arrays
#     cost = np.squeeze(cost)
#     print (AL)

    cost = (1./m) * (-np.multiply(Y,np.log(AL)) - np.multiply(1-Y, np.log(1-AL))) # element-wise products scaled by 1/m, summed below
    cost = np.nansum(cost)        # np.nansum still works when the array contains NaN entries

    # this code for L2 regularization 
    if regularization == 'L2':
        l2 = 0
        L = int(len(pars)/2)
        for l in range(1,L+1):
            a = np.sum(np.square(pars['W'+str(l)]))
            l2 +=a
        l2 = l2*lambd/m/2
        cost = cost + l2
     ##############################
    
#  three kinds of multiplication: *, np.dot, np.multiply
    return cost
# test compute_cost with regularization function 
A3, Y_assess, parameters = compute_cost_with_regularization_test_case()
print("cost = " + str(compute_cost(A3, Y_assess, parameters, lambd = 0.1,regularization='L2')))
cost = 1.786485945159076

2.4 Backward propagation functions
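The gradients implemented below are the standard ones; the L2 branch adds (λ/m)·W to dW, and the dropout branch re-applies the stored mask to dA^{[l-1]}:

dZ^{[l]} = dA^{[l]} \odot g'^{[l]}\big(Z^{[l]}\big), \qquad
dW^{[l]} = \frac{1}{m}\, dZ^{[l]} A^{[l-1]T} + \frac{\lambda}{m} W^{[l]}, \qquad
db^{[l]} = \frac{1}{m}\sum_{i=1}^{m} dZ^{[l](i)}, \qquad
dA^{[l-1]} = W^{[l]T} dZ^{[l]} \;\Big(\text{then } \odot\, \tfrac{D^{[l]}}{\text{keep\_prob}} \text{ with dropout}\Big)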

def sigmoid_backward(dA,activation_cache):
    
    Z = activation_cache
    A = 1./(1 + np.exp(-Z))
    dZ = dA*A*(1-A)
    
    return dZ
def relu_backward(dA,activation_cache):
    
    Z = activation_cache
    dZ = np.array(dA,copy=True)
    assert (dZ.shape == Z.shape)
    dZ[Z <= 0] = 0
    
    return dZ
def activation_backward(dA,activation_cache,activation):
    
    if activation == 'sigmoid':
        dZ = sigmoid_backward(dA,activation_cache)
    elif activation == 'relu':
        dZ = relu_backward(dA,activation_cache)
        
    return dZ
    
def linear_backward(dZ,linear_cache,lambd=0,regularization=None,keep_prob=1):
    
    A_prev, W, b ,D = linear_cache
    m = A_prev.shape[1]
    dA_prev = np.dot(W.T,dZ)
    
    # this code for dropout
    if regularization == 'dropout':
        assert (dA_prev.shape == D.shape)
        dA_prev = np.multiply(dA_prev,D)
        dA_prev = dA_prev/keep_prob
    ######################################
    
    dW = 1./m*np.dot(dZ,A_prev.T)       # forgetting to divide by m here caused wrong results earlier
    
    # this code for regularization
    if regularization == 'L2':
        dW = dW + W*lambd/m
    ######################
    
    db = np.mean(dZ,axis=1,keepdims=True)   # equivalent to 1./m * np.sum(dZ, axis=1, keepdims=True)
#     db = 1./m * np.sum(dZ)  # gives a different result: without axis=1, keepdims=True, np.sum collapses ALL entries of dZ into a scalar
    # this difference in how db was computed is why earlier results did not match the course notebook
    return dA_prev,dW,db
def activation_linear_backward(dA,cache,activation,lambd=0,regularization=None,keep_prob=1):
    
    linear_cache,activation_cache = cache
    
    dZ = activation_backward(dA,activation_cache,activation)
    dA_prev,dW,db = linear_backward(dZ,linear_cache,lambd=lambd,regularization=regularization,keep_prob=keep_prob)

    return dA_prev,dW,db
def L_model_backward(AL,Y,caches,lambd=0,regularization=None,keep_prob=1):
    
    Y = Y.reshape(AL.shape)
    dAL = -(np.divide(Y,AL) - np.divide(1-Y,1-AL))
    grads = {}
    L = len(caches) + 1
    current_cache = caches[L-2]
    
    grads['dA'+str(L-1)],grads['dW'+str(L-1)],grads['db'+str(L-1)] = activation_linear_backward(dAL,current_cache,activation='sigmoid',lambd=lambd,regularization=regularization,keep_prob=keep_prob)
    for l in reversed(range(L-2)):
        current_cache = caches[l]
        dA_prev_temp, dW_temp, db_temp = activation_linear_backward(grads['dA'+str(l+2)],current_cache,activation='relu',lambd=lambd,regularization=regularization,keep_prob=keep_prob)
        grads["dA" + str(l + 1)] = dA_prev_temp
        grads["dW" + str(l + 1)] = dW_temp
        grads["db" + str(l + 1)] = db_temp
    
    return grads

2.5 Update parameters
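update_parameters below applies one step of plain gradient descent with learning rate \alpha:

W^{[l]} \leftarrow W^{[l]} - \alpha\, dW^{[l]}, \qquad b^{[l]} \leftarrow b^{[l]} - \alpha\, db^{[l]}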

def update_parameters(pars,grads,learning_rate):
    
    L = len(pars)//2 + 1
    for l in range(1,L):
        pars['W'+str(l)] = pars['W'+str(l)] - learning_rate*grads['dW'+str(l)]
        pars['b'+str(l)] = pars['b'+str(l)] - learning_rate*grads['db'+str(l)]
    
    return pars

L_layer_model

def L_layer_model(X,Y,layer_dims,learning_rate = 0.01,num_iterations = 3000,print_cost=False,initialization='he',lambd=0,regularization=None,keep_prob = 1):
    
    '''
    1. Initialize parameters
    2. Loop over num_iterations:
        3. Forward propagation
        4. Compute cost
        5. Backward propagation
        6. Update parameters
    7. Return costs and pars
    '''
#     np.random.seed(1)
    
    # initialize parameters
    pars = initialize_parameters(layer_dims,initialization)

    L = len(layer_dims)
    costs = []
    for i in range(0,num_iterations):
        
        # forward propagation (pass the dropout settings through so they take effect during training)
        AL,caches = L_model_forward(X,pars,keep_prob=keep_prob,regularization=regularization)

        # compute cost
        cost = compute_cost(AL,Y,pars,lambd=lambd,regularization=regularization) 

        if print_cost and i % 1000 == 0:
            print ('cost after iteration {}: {}'.format(i, cost))
            costs.append(cost)

        # backward propagation
        grads = L_model_backward(AL,Y,caches,lambd=lambd,regularization=regularization,keep_prob=keep_prob)

        # update parameters
        pars = update_parameters(pars,grads,learning_rate)

    return pars,costs
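A minimal usage sketch for the model above. The layer sizes and hyperparameters here are illustrative assumptions (mirroring the course's 3-layer setup), not values from the original post:

# Hypothetical end-to-end run: train a 3-layer model with He initialization and L2 regularization,
# then plot the recorded costs. The hyperparameters below are assumptions, chosen for illustration only.
layers_dims = [train_X.shape[0], 20, 3, 1]
pars, costs = L_layer_model(train_X, train_Y, layers_dims,
                            learning_rate=0.3, num_iterations=30000, print_cost=True,
                            initialization='he', lambd=0.7, regularization='L2')

plt.plot(costs)
plt.xlabel('iterations (per thousands)')
plt.ylabel('cost')
plt.title('Learning curve')
plt.show()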
