Andrew Ng Deep Learning Exercise 2.1_Improving Deep Neural Networks (Initialization_Regularization_Gradientchecking)

Copyright notice: this is the blogger's original article and may not be reproduced without the blogger's permission. https://blog.csdn.net/weixin_42432468

Study notes:
1. Watch each week's video lectures once or twice.
2. Take notes.

3. Do each week's programming assignment; that is where most of the value lies. Work through the notebook first, and once you understand it, type everything out yourself so you can use it fluently later on.


1. Load Dataset

2. Algorithm implementation

2.1 Initialize parameters

2.2 Forward propagation helpers

2.3 Compute the cost

2.4 Backward propagation helpers

2.5 Gradient checking

2.6 Update parameters

3. Prediction

# import packages
import numpy as np
import matplotlib.pyplot as plt
import sklearn
import sklearn.datasets
import scipy.io
from testCases import *

%matplotlib inline
plt.rcParams['figure.figsize'] = (7.0, 4.0) # set default size of plots
plt.rcParams['image.interpolation'] = 'nearest'
plt.rcParams['image.cmap'] = 'gray'

1. Load Dataset

# train_X, train_Y, test_X, test_Y = load_2D_dataset()
# # inspect what the loaded dataset actually is: its type, shape, and the first example
# print ('train_X:\n',type(train_X),train_X.shape,'\n')
# print (train_X[:,0])
# print ('test_X:\n',type(test_X),test_X.shape,'\n')
# print (test_X[:,0])
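
If you actually want to run the commented-out block above, note that load_2D_dataset is not defined in this post; in the course materials it comes from the reg_utils helper file (this import is an assumption based on the regularization assignment's notebook):

# assumption: the loader lives in the course's reg_utils helper
# from reg_utils import load_2D_dataset
# train_X, train_Y, test_X, test_Y = load_2D_dataset()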

2. Algorithm implementation

2.1 Initialize parameters

def initialize_parameters(layer_dims,initialization='he'):
    
    np.random.seed(3)
    L = len(layer_dims)
    pars = {}
    if initialization == 'zeros':
        for l in range(1,L):
            pars['W'+str(l)] = np.zeros((layer_dims[l],layer_dims[l-1]))
            pars['b'+str(l)] = np.zeros((layer_dims[l],1))
        
    elif initialization == 'random':
        for l in range(1,L):
#             pars['W'+str(l)] = np.random.randn(layer_dims[l],layer_dims[l-1])*10   # the course notebook scales by 10 here to show the symptoms of overly large random weights
            pars['W'+str(l)] = np.random.randn(layer_dims[l],layer_dims[l-1])
            pars['b'+str(l)] = np.zeros((layer_dims[l],1))

    elif initialization == 'he':
        for l in range(1,L):
#             pars['W'+str(l)] = np.random.randn(layer_dims[l],layer_dims[l-1])* np.sqrt(2./layer_dims[l-1])   # this is the actual He formula: scale by sqrt(2/fan_in)
            pars['W'+str(l)] = np.random.randn(layer_dims[l],layer_dims[l-1])* np.sqrt(1./layer_dims[l-1])   # note: sqrt(1/fan_in) is Xavier scaling, not He
            pars['b'+str(l)] = np.zeros((layer_dims[l],1))
        
    return pars
# test initialize_parameters function
pars_test = initialize_parameters([3,2,1],initialization='he')
print (pars_test)
pars_test = initialize_parameters([3,2,1],initialization='random')
print (pars_test)
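
As a quick extra check (my own addition, not from the course notebook): the standard deviation of He-initialized weights should land near sqrt(2/fan_in), while the sqrt(1/fan_in) scaling actually used above is the Xavier value.

# illustrative scaling check, not part of the original notebook
fan_in = 500
W_he = np.random.randn(100, fan_in) * np.sqrt(2. / fan_in)       # He scaling
W_xavier = np.random.randn(100, fan_in) * np.sqrt(1. / fan_in)   # Xavier scaling, as used in initialize_parameters above
print (W_he.std(), np.sqrt(2. / fan_in))        # both close to 0.063
print (W_xavier.std(), np.sqrt(1. / fan_in))    # both close to 0.045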

2.2 Forward propagation helpers

def linear_forward(A,W,b,keep_prob=1,regularization=None):
    
    np.random.seed(1)   # note: re-seeding on every call restarts the RNG stream for each layer's dropout mask
    D = np.random.rand(A.shape[0],A.shape[1])
    # this block applies inverted dropout to the incoming activations
    if regularization == 'dropout':
#         print ('D:\n',D)   # D2 here differs from the course notebook's D2: the notebook seeds once and draws D1 and D2 from a single stream, whereas this function re-seeds on every call, which is why the final results differ from the notebook
        D = np.where(D <= keep_prob,1,0)
        A = np.multiply(A,D)
        A = A/keep_prob
    #####################################
    
    Z = np.dot(W,A) + b
    cache = (A,W,b,D)
    
    return Z,cache
# demo: the first draw after a seed is always the same; where the seed is set decides whether later draws continue the stream or restart it
np.random.seed(1)  # seeding once here reproduces exactly the sequence printed by the block below
for i in range(3):
#     np.random.seed(1)     # seeding inside the loop instead would restart the stream every iteration and give different data
    D = np.random.rand(2,3)
    print (D,'\n')

np.random.seed(1)
print ('- '*30)
D = np.random.rand(2,3)
print (D,'\n')
D = np.random.rand(2,3)
print (D,'\n')
D = np.random.rand(2,3)
print (D,'\n')
def sigmoid_forward(Z):
    '''
    arguments:
    Z --> pre-activation input
    
    returns:
    A --> sigmoid(Z)
    cache --> Z, kept for the backward pass
    '''
    A = 1./(1+np.exp(-Z))
    cache = Z
    
    return A,cache
def relu_forward(Z):
    '''
    arguments:
    Z --> pre-activation input
    
    returns:
    A --> ReLU(Z)
    cache --> Z, kept for the backward pass
    '''
#     A = np.maximum(0.01*Z,Z)   # leaky ReLU variant, not used here
    A = np.maximum(0,Z)
    cache = Z
    
    return A,cache
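
A quick check of the two activations (the input values below are my own, just for illustration):

# sigmoid(0) = 0.5, sigmoid(2) is about 0.8808; ReLU zeroes out the negative entry
print (sigmoid_forward(np.array([0., 2.]))[0])
print (relu_forward(np.array([-1., 3.]))[0])
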
def activation_forward(Z,activation):
    
    if activation == 'sigmoid':
        A,cache = sigmoid_forward(Z)
    elif activation == 'relu':
        A,cache = relu_forward(Z)
    
    return A,cache
def linear_activation_forward(A_prev,W,b,activation,keep_prob=1,regularization=None):
    
    Z,linear_cache = linear_forward(A_prev,W,b,keep_prob=keep_prob,regularization=regularization)
    A,activation_cache =  activation_forward(Z,activation)
    cache = (linear_cache,activation_cache)
    
    return A,cache
def L_model_forward(X,pars,keep_prob=1,regularization=None):
    caches = []
    A = X
    L = len(pars)//2 + 1
    np.random.seed(1)
    
    A_prev = A
    # layer 1: dropout is not applied to the input X, so keep_prob and regularization are deliberately not passed through here
    A,cache = linear_activation_forward(A_prev,pars['W1'],pars['b1'],activation='relu',keep_prob=1,regularization=None)
    caches.append(cache)
    
    
    for l in range(2,L-1):
        A_prev = A
        A,cache = linear_activation_forward(A_prev,pars['W'+str(l)],pars['b'+str(l)],activation='relu',keep_prob=keep_prob,regularization=regularization)
        caches.append(cache)
        
    AL,cache = linear_activation_forward(A,pars['W'+str(L-1)],pars['b'+str(L-1)],activation='sigmoid',keep_prob=keep_prob,regularization=regularization)
    caches.append(cache)
    assert(AL.shape == (1,X.shape[1]))
#     print ('- '*30)
    
    return AL,caches
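
A quick smoke test of the whole forward pass (my own addition): the gradient-checking test case from testCases, which is also used in section 2.5 below, supplies a small X, Y and a three-layer parameter set.

# smoke test for L_model_forward, reusing the test case from testCases
X_t, Y_t, pars_t = gradient_check_n_test_case()
AL_t, caches_t = L_model_forward(X_t, pars_t)
print (AL_t.shape)      # (1, number of examples)
print (len(caches_t))   # one cache tuple per layer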

2.3 Compute the cost

def compute_cost(AL,Y,pars,lambd=0,regularization=None):
    assert(AL.shape[1] == Y.shape[1])
    
#     cost = -np.mean(Y*np.log(AL)+(1-Y)*np.log(1-AL),axis=1,keepdims=True) # * is element-wise for ndarrays but matrix multiplication for np.matrix objects

    m = Y.shape[1]
#     cost = (1./m) * (-np.dot(Y,np.log(AL).T) - np.dot(1-Y, np.log(1-AL).T)) # np.dot does the element-wise product and the sum in one step, then divide by m
#     cost = np.squeeze(cost)
#     print (AL)

    cost = (1./m) * (-np.multiply(Y,np.log(AL)) - np.multiply(1-Y, np.log(1-AL))) # element-wise product divided by m; summed below
    cost = np.nansum(cost)        # np.nansum still sums even if the array contains NaN entries (e.g. 0*log(0) terms)

    # this block adds the L2 regularization penalty to the cost
    if regularization == 'L2':
        l2 = 0
        L = int(len(pars)/2)
        for l in range(1,L+1):
            a = np.sum(np.square(pars['W'+str(l)]))
            l2 +=a
        l2 = l2*lambd/m/2
        cost = cost + l2
     ##############################
    
#  three kinds of multiplication: *, np.dot and np.multiply
    return cost
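
A tiny worked example of the cost (the AL and Y values below are made up for illustration), showing the L2 term being added on top of the plain cross-entropy:

# toy check of compute_cost; the numbers are illustrative only
AL_t = np.array([[0.8, 0.9, 0.4]])
Y_t  = np.array([[1, 1, 0]])
pars_t = initialize_parameters([3,2,1],initialization='he')
print (compute_cost(AL_t, Y_t, pars_t))                                  # plain cross-entropy
print (compute_cost(AL_t, Y_t, pars_t, lambd=0.7, regularization='L2')) # cross-entropy + L2 penalty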

2.4 Backward propagation helpers

def sigmoid_backward(dA,activation_cache):
    
    Z = activation_cache
    A = 1./(1 + np.exp(-Z))
    dZ = dA*A*(1-A)
    
    return dZ
def relu_backward(dA,activation_cache):
    
    Z = activation_cache
    dZ = np.array(dA,copy=True)
    assert (dZ.shape == Z.shape)
    dZ[Z <= 0] = 0
    
    return dZ
def activation_backward(dA,activation_cache,activation):
    
    if activation == 'sigmoid':
        dZ = sigmoid_backward(dA,activation_cache)
    elif activation == 'relu':
        dZ = relu_backward(dA,activation_cache)
        
    return dZ
    
def linear_backward(dZ,linear_cache,lambd=0,regularization=None,keep_prob=1):
    
    A_prev, W, b ,D = linear_cache
    m = A_prev.shape[1]
    dA_prev = np.dot(W.T,dZ)
    
    # this block propagates the dropout mask backward: only units that were kept receive gradient
    if regularization == 'dropout':
        assert (dA_prev.shape == D.shape)
        dA_prev = np.multiply(dA_prev,D)
        dA_prev = dA_prev/keep_prob
    ######################################
    
    dW = 1./m*np.dot(dZ,A_prev.T)       # forgetting the 1/m factor here caused wrong results earlier
    
    # this block adds the gradient of the L2 penalty
    if regularization == 'L2':
        dW = dW + W*lambd/m
    ######################
    
    db = np.mean(dZ,axis=1,keepdims=True)   # equivalent to (1./m) * np.sum(dZ, axis=1, keepdims=True): one value per unit
#     db = 1./m * np.sum(dZ)  # wrong: without axis=1 this sums over ALL entries of dZ and returns a single scalar instead of a column vector
    # the earlier mismatch with the course results came from computing db the second way
    return dA_prev,dW,db
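
A tiny demo of the db pitfall discussed in the comments above (the dZ values are my own toy numbers):

# db must keep one value per unit, i.e. reduce over the example axis only
dZ_t = np.array([[1., 2., 3.],
                 [4., 5., 6.]])
print (np.mean(dZ_t, axis=1, keepdims=True))    # [[2.], [5.]] -> correct, a column vector
print (1./dZ_t.shape[1] * np.sum(dZ_t))         # 7.0          -> wrong, a single scalar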
def activation_linear_backward(dA,cache,activation,lambd=0,regularization=None,keep_prob=1):
    
    linear_cache,activation_cache = cache
    
    dZ = activation_backward(dA,activation_cache,activation)
    dA_prev,dW,db = linear_backward(dZ,linear_cache,lambd=lambd,regularization=regularization,keep_prob=keep_prob)

    return dA_prev,dW,db
def L_model_backward(AL,Y,caches,lambd=0,regularization=None,keep_prob=1):
    
    Y = Y.reshape(AL.shape)
    dAL = -(np.divide(Y,AL) - np.divide(1-Y,1-AL))
    grads = {}
    L = len(caches) + 1
    current_cache = caches[L-2]
    
    grads['dA'+str(L-1)],grads['dW'+str(L-1)],grads['db'+str(L-1)] = activation_linear_backward(dAL,current_cache,activation='sigmoid',lambd=lambd,regularization=regularization,keep_prob=keep_prob)
    for l in reversed(range(L-2)):
        current_cache = caches[l]
        dA_prev_temp, dW_temp, db_temp = activation_linear_backward(grads['dA'+str(l+2)],current_cache,activation='relu',lambd=lambd,regularization=regularization,keep_prob=keep_prob)
        grads["dA" + str(l + 1)] = dA_prev_temp
        grads["dW" + str(l + 1)] = dW_temp
        grads["db" + str(l + 1)] = db_temp
    
    return grads
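
Another quick smoke test (my own addition): run the forward and backward passes on the same test case and check that each gradient has the shape of the quantity it differentiates.

# smoke test for L_model_backward, again using the test case from testCases
X_t, Y_t, pars_t = gradient_check_n_test_case()
AL_t, caches_t = L_model_forward(X_t, pars_t)
grads_t = L_model_backward(AL_t, Y_t, caches_t)
for name in sorted(grads_t):
    print (name, grads_t[name].shape)   # dW/db shapes mirror W/b; dA shapes mirror the activations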

2.5 Gradient checking

def dictionary_to_vector(parameters,flag='pars'):
    """
    Roll all our parameters dictionary into a single vector satisfying our specific required shape.
    """
    shapes = []
    count = 0
    
    if flag == 'pars':
        L = int(len(parameters)/2)
        keylist = []
        for l in range(1,L+1):
            keylist.append('W'+str(l))
            keylist.append('b'+str(l))
    elif flag == 'grads':
        L = int(len(parameters)/3)
        keylist = []
        for l in range(1,L+1):
            keylist.append('dW'+str(l))
            keylist.append('db'+str(l))
            
    for key in keylist:
        new_vector = np.reshape(parameters[key], (-1,1))
        pars_shape = parameters[key].shape
        shapes.append(pars_shape)

        if count == 0:
            theta = new_vector
        else:
            theta = np.concatenate((theta, new_vector), axis=0)
        count = count + 1        

    return theta, shapes

def vector_to_dictionary(theta,shapes,pre_pars):
    """
    Unroll all our parameters dictionary from a single vector satisfying our specific required shape.
    """
    parameters = {}
    L = int(len(pre_pars)/2)
    i = 0
    for l in range(1,L+1):
        a = shapes[2*(l-1)]
        parameters['W'+str(l)] = theta[i:i+a[0]*a[1]].reshape(a)
        i +=a[0]*a[1]
        
        a = shapes[2*l-1]
        parameters['b'+str(l)] = theta[i:i+a[0]*a[1]].reshape(a)
        i +=a[0]*a[1]

    return parameters
# test the dictionary_to_vector and vector_to_dictionary functions
_, _, parameters_test = gradient_check_n_test_case()
# print (parameters)
theta_test,shapes_test = dictionary_to_vector(parameters_test)
# print (theta_test)
pars_test = vector_to_dictionary(theta_test,shapes_test,parameters_test)
print (pars_test)
# GRADED FUNCTION: gradient_check_n

def gradient_check_n(parameters, gradients, X, Y, epsilon=1e-7):
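    # The body below is a hedged reconstruction, not the original post's code:
    # it implements the standard two-sided-difference gradient check and is wired
    # to this post's own helpers (dictionary_to_vector, vector_to_dictionary,
    # L_model_forward, compute_cost); the course notebook uses its own
    # forward_propagation_n instead. Run it with plain passes, i.e.
    # regularization=None and keep_prob=1, so the cost matches the gradients.
    theta, shapes = dictionary_to_vector(parameters, flag='pars')
    grad, _ = dictionary_to_vector(gradients, flag='grads')
    num_parameters = theta.shape[0]
    gradapprox = np.zeros((num_parameters, 1))

    for i in range(num_parameters):
        # nudge the i-th parameter up and down by epsilon and recompute the cost
        theta_plus = np.copy(theta)
        theta_plus[i, 0] += epsilon
        AL_plus, _ = L_model_forward(X, vector_to_dictionary(theta_plus, shapes, parameters))
        J_plus = compute_cost(AL_plus, Y, parameters)

        theta_minus = np.copy(theta)
        theta_minus[i, 0] -= epsilon
        AL_minus, _ = L_model_forward(X, vector_to_dictionary(theta_minus, shapes, parameters))
        J_minus = compute_cost(AL_minus, Y, parameters)

        gradapprox[i] = (J_plus - J_minus) / (2. * epsilon)

    # relative difference between the backprop gradient and the numerical estimate
    numerator = np.linalg.norm(grad - gradapprox)
    denominator = np.linalg.norm(grad) + np.linalg.norm(gradapprox)
    difference = numerator / denominator

    if difference > 2e-7:
        print ("There is probably a mistake in the backward propagation! difference = " + str(difference))
    else:
        print ("The backward propagation seems to work fine. difference = " + str(difference))

    return difference

# usage sketch: check this post's backward pass on the same test case as above
X_t, Y_t, pars_t = gradient_check_n_test_case()
AL_t, caches_t = L_model_forward(X_t, pars_t)
grads_t = L_model_backward(AL_t, Y_t, caches_t)
gradient_check_n(pars_t, grads_t, X_t, Y_t)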
            
           
