
A One-Article Introduction to BP Neural Networks: From Principles to Application (Application Part)

| No. | Formula | Note |
| --- | --- | --- |
| 1 | $Z^{[l]} = W^{[l]} A^{[l-1]} + b^{[l]}$ | |
| 2 | $A^{[l]} = \sigma(Z^{[l]})$ | |
| 3 | $dZ^{[L]} = \dfrac{\partial C}{\partial A^{[L]}} \odot \sigma'(Z^{[L]})$ | |
| 4 | $dZ^{[l]} = \left[ W^{[l+1]\,T} dZ^{[l+1]} \right] \odot \sigma'(Z^{[l]})$ | |
| 5 | $db^{[l]} = \dfrac{\partial C}{\partial b^{[l]}} = \dfrac{1}{m}\sum_{i=1}^{m} dZ^{[l](i)}$ | row-wise mean of $dZ^{[l]}$ over the $m$ examples |
| 6 | $dW^{[l]} = \dfrac{\partial C}{\partial W^{[l]}} = \dfrac{1}{m}\, dZ^{[l]} A^{[l-1]\,T}$ | |
| 7 | $b^{[l]} \leftarrow b^{[l]} - \alpha\, db^{[l]}$ | |
| 8 | $W^{[l]} \leftarrow W^{[l]} - \alpha\, dW^{[l]}$ | |
| 9 | $dA^{[l-1]} = W^{[l]\,T} dZ^{[l]}$ | |
| 10 | $C = -\dfrac{1}{m}\sum_{i=1}^{m}\left( y^{(i)} \log a^{[L](i)} + (1 - y^{(i)}) \log\left(1 - a^{[L](i)}\right) \right)$ | cost function |
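Before moving on to the code, a quick shape check can make the notation concrete. The sketch below is not from the article; it uses hypothetical toy dimensions to confirm that Formulas 1, 2 and 6 are dimensionally consistent, with columns indexing the m training examples and rows indexing the units of layer l.

import numpy as np

# Hypothetical toy sizes: 3 input features, 4 units in layer l, m = 5 examples.
n_prev, n_l, m = 3, 4, 5

A_prev = np.random.randn(n_prev, m)      # A^[l-1]: (n_prev, m)
W = np.random.randn(n_l, n_prev)         # W^[l]:   (n_l, n_prev)
b = np.zeros((n_l, 1))                   # b^[l]:   (n_l, 1), broadcast over the columns

Z = np.dot(W, A_prev) + b                # Formula 1: shape (n_l, m)
A = 1 / (1 + np.exp(-Z))                 # Formula 2 with sigma = sigmoid

dZ = np.random.randn(*Z.shape)           # stand-in for the error from Formula 3/4
dW = (1 / m) * np.dot(dZ, A_prev.T)      # Formula 6: same shape as W
assert Z.shape == (n_l, m) and dW.shape == W.shape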

1. Helper functions

  The helper functions consist mainly of the activation functions and the functions for their backward-propagation step:

The backward-propagation code for the activation functions corresponds to Formulas 4 and 9.

import numpy as np   # used by all the functions below
import h5py          # used to read the HDF5 dataset files

def sigmoid(z):
    """
    Sigmoid activation, implemented with numpy.

    Arguments:
    z -- numpy array
    Returns:
    A -- activation value (same shape as z)
    """
    return 1/(1 + np.exp(-z))

def relu(z):
    """
    Rectified linear unit (ReLU).

    Arguments:
    z -- numpy array
    Returns:
    A -- activation value (same shape as z)
    """
    return np.array(z > 0)*z

def sigmoidBackward(dA, cacheA):
    """
    Backward propagation through the sigmoid.

    Arguments:
    dA -- gradient of this layer's activation output
    cacheA -- this layer's linear output Z
    Returns:
    dZ -- gradient of the linear output
    """
    s = sigmoid(cacheA)
    diff = s*(1 - s)
    dZ = dA * diff
    return dZ

def reluBackward(dA, cacheA):
    """
    Backward propagation through the ReLU.

    Arguments:
    dA -- gradient of this layer's activation output
    cacheA -- this layer's linear output Z
    Returns:
    dZ -- gradient of the linear output
    """
    Z = cacheA
    dZ = np.array(dA, copy=True)
    dZ[Z <= 0] = 0
    return dZ
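A quick way to convince yourself that the backward helpers return the right derivatives is to compare them against a numerical (finite-difference) derivative. This is a minimal sketch using the functions defined above; the test values are arbitrary.

z = np.array([[ 1.5, -0.3, 0.0],
              [-2.0,  0.7, 4.0]])
dA = np.ones_like(z)                 # pretend the upstream gradient is all ones

# Analytic gradients from the helpers above.
dZ_sig  = sigmoidBackward(dA, z)
dZ_relu = reluBackward(dA, z)

# Numerical derivative of sigmoid via central differences.
eps = 1e-6
num_sig = (sigmoid(z + eps) - sigmoid(z - eps)) / (2 * eps)

print(np.max(np.abs(dZ_sig - num_sig)))   # should be tiny (around 1e-10)
print(dZ_relu)                            # gradient passes through only where z > 0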

  Two other important helpers are the data-loading function and the parameter-initialization function:

def loadData(dataDir):
    """
    Load the dataset.

    Arguments:
    dataDir -- path to the dataset directory
    Returns:
    training set, test set and the class labels
    """
    train_dataset = h5py.File(dataDir+'/train.h5', "r")
    train_set_x_orig = np.array(train_dataset["train_set_x"][:]) # your train set features
    train_set_y_orig = np.array(train_dataset["train_set_y"][:]) # your train set labels

    test_dataset = h5py.File(dataDir+'/test.h5', "r")
    test_set_x_orig = np.array(test_dataset["test_set_x"][:]) # your test set features
    test_set_y_orig = np.array(test_dataset["test_set_y"][:]) # your test set labels

    classes = np.array(test_dataset["list_classes"][:]) # the list of classes

    train_set_y_orig = train_set_y_orig.reshape((1, train_set_y_orig.shape[0]))
    test_set_y_orig = test_set_y_orig.reshape((1, test_set_y_orig.shape[0]))

    return train_set_x_orig, train_set_y_orig, test_set_x_orig, test_set_y_orig, classes

def iniPara(laydims):
    """
    Randomly initialize the network parameters.

    Arguments:
    laydims -- a Python list of layer sizes
    Returns:
    parameters -- dictionary of randomly initialized parameters ("W1", "b1", "W2", "b2", ...)
    """
    np.random.seed(1)
    parameters = {}
    for i in range(1, len(laydims)):
        parameters['W'+str(i)] = np.random.randn(laydims[i], laydims[i-1])/ np.sqrt(laydims[i-1])
        parameters['b'+str(i)] = np.zeros((laydims[i], 1))
    return parameters
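For example, initializing a small network and printing the parameter shapes shows the convention used throughout: W of layer l has shape (units in layer l, units in layer l-1) and b has shape (units in layer l, 1). The layer sizes below are only an illustration (12288 would correspond to a flattened 64x64x3 image).

laydims = [12288, 20, 7, 1]          # hypothetical layer sizes
parameters = iniPara(laydims)

for key in sorted(parameters):
    print(key, parameters[key].shape)
# W1 (20, 12288)
# W2 (7, 20)
# W3 (1, 7)
# b1 (20, 1)
# b2 (7, 1)
# b3 (1, 1)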

2. Forward propagation

This corresponds to Formulas 1 and 2.

def forwardLinear(W, b, A_prev):
    """
    Linear part of forward propagation (Formula 1).
    """
    Z = np.dot(W, A_prev) + b
    cache = (W, A_prev, b)
    return Z, cache

def forwardLinearActivation(W, b, A_prev, activation):
    """
    Forward propagation followed by an activation function (Formulas 1 and 2).
    """
    Z, cacheL = forwardLinear(W, b, A_prev)
    cacheA = Z
    if activation == 'sigmoid':
        A = sigmoid(Z)
    elif activation == 'relu':
        A = relu(Z)
    cache = (cacheL, cacheA)
    return A, cache

def forwardModel(X, parameters):
    """
    Complete forward-propagation pass through all layers.
    """
    layerdim = len(parameters)//2
    caches = []
    A_prev = X
    for i in range(1, layerdim):
        A_prev, cache = forwardLinearActivation(parameters['W'+str(i)], parameters['b'+str(i)], A_prev, 'relu')
        caches.append(cache)

    AL, cache = forwardLinearActivation(parameters['W'+str(layerdim)], parameters['b'+str(layerdim)], A_prev, 'sigmoid')
    caches.append(cache)

    return AL, caches
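A minimal smoke test of the forward pass, using iniPara from above on random inputs (the layer sizes and batch size here are arbitrary, not from the article):

parameters = iniPara([12288, 20, 7, 1])     # hypothetical layer sizes
X = np.random.randn(12288, 10)              # 10 random "images" as columns

AL, caches = forwardModel(X, parameters)
print(AL.shape)       # (1, 10): one sigmoid output per example
print(len(caches))    # 3: one cache tuple per layer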

3. Backward propagation

The backward propagation of the linear part corresponds to Formulas 5 and 6.

def linearBackward(dZ, cache):
    """
    Backward propagation through the linear part.

    Arguments:
    dZ -- error (gradient of the linear output) of the current layer
    cache -- the (W, A_prev, b) tuple
    Returns:
    dA_prev -- gradient of the previous layer's activation
    dW -- gradient of the current layer's W
    db -- gradient of the current layer's b
    """
    W, A_prev, b = cache
    m = A_prev.shape[1]

    dW = 1/m*np.dot(dZ, A_prev.T)
    db = 1/m*np.sum(dZ, axis = 1, keepdims=True)
    dA_prev = np.dot(W.T, dZ)

    return dA_prev, dW, db
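To see that linearBackward returns gradients with the expected shapes (dW like W, db like b, dA_prev like A_prev), here is a small sketch with arbitrary dimensions of my own choosing:

n_prev, n_l, m = 4, 3, 6                     # arbitrary layer sizes and batch size
W = np.random.randn(n_l, n_prev)
b = np.zeros((n_l, 1))
A_prev = np.random.randn(n_prev, m)
dZ = np.random.randn(n_l, m)                 # stand-in for this layer's error

dA_prev, dW, db = linearBackward(dZ, (W, A_prev, b))
print(dA_prev.shape, dW.shape, db.shape)     # (4, 6) (3, 4) (3, 1)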

The nonlinear part corresponds to Formulas 3, 4, 5 and 6.

def linearActivationBackward(dA, cache, activation):
    """
    Backward propagation through the nonlinear (activation) part.

    Arguments:
    dA -- gradient of the current layer's activation output
    cache -- the (cacheL, cacheA) tuple, i.e. ((W, A_prev, b), Z)
    activation -- type of activation function
    Returns:
    dA_prev -- gradient of the previous layer's activation
    dW -- gradient of the current layer's W
    db -- gradient of the current layer's b
    """
    cacheL, cacheA = cache

    if activation == 'relu':
        dZ = reluBackward(dA, cacheA)
        dA_prev, dW, db = linearBackward(dZ, cacheL)
    elif activation == 'sigmoid':
        dZ = sigmoidBackward(dA, cacheA)
        dA_prev, dW, db = linearBackward(dZ, cacheL)

    return dA_prev, dW, db

The complete backward-propagation model:

def backwardModel(AL, Y, caches):
    """
    Complete backward-propagation pass.

    Arguments:
    AL -- output of the last layer
    Y -- labels
    caches -- list of (cacheL, cacheA) tuples, one per layer
    Returns:
    diffs -- dictionary of gradients
    """
    layerdim = len(caches)
    Y = Y.reshape(AL.shape)
    L = layerdim

    diffs = {}

    dAL = - (np.divide(Y, AL) - np.divide(1 - Y, 1 - AL))

    currentCache = caches[L-1]
    dA_prev, dW, db =  linearActivationBackward(dAL, currentCache, 'sigmoid')
    diffs['dA' + str(L)], diffs['dW'+str(L)], diffs['db'+str(L)] = dA_prev, dW, db

    for l in reversed(range(L-1)):
        currentCache = caches[l]
        dA_prev, dW, db =  linearActivationBackward(dA_prev, currentCache, 'relu')
        diffs['dA' + str(l+1)], diffs['dW'+str(l+1)], diffs['db'+str(l+1)] = dA_prev, dW, db

    return diffs
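The formula table also lists the cost (Formula 10) and the gradient-descent updates (Formulas 7 and 8), which the functions above do not implement yet. The sketch below is one way to fill that gap and tie everything into a training loop; the names computeCost, updateParameters and trainModel, as well as the learning rate 0.0075, are my own additions rather than code taken from BP.ipynb.

def computeCost(AL, Y):
    """Cross-entropy cost (Formula 10)."""
    m = Y.shape[1]
    cost = -1/m * np.sum(Y*np.log(AL) + (1 - Y)*np.log(1 - AL))
    return np.squeeze(cost)

def updateParameters(parameters, diffs, alpha):
    """One gradient-descent step (Formulas 7 and 8)."""
    L = len(parameters)//2
    for l in range(1, L + 1):
        parameters['W'+str(l)] = parameters['W'+str(l)] - alpha*diffs['dW'+str(l)]
        parameters['b'+str(l)] = parameters['b'+str(l)] - alpha*diffs['db'+str(l)]
    return parameters

def trainModel(X, Y, laydims, alpha=0.0075, iterations=2000):
    """Repeat forward pass, backward pass and parameter update."""
    parameters = iniPara(laydims)
    for i in range(iterations):
        AL, caches = forwardModel(X, parameters)
        diffs = backwardModel(AL, Y, caches)
        parameters = updateParameters(parameters, diffs, alpha)
        if i % 100 == 0:
            print(i, computeCost(AL, Y))
    return parameters

Note that X is expected to have shape (features, m) with values scaled to [0, 1], so the images returned by loadData would first need to be flattened and normalized, for example X = train_set_x_orig.reshape(train_set_x_orig.shape[0], -1).T / 255 (assuming one image per row, as loadData returns them).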

4. Test results

  Open your Jupyter notebook and run our BP.ipynb file. First import the dependencies and the dataset, then use a loop over different iteration counts to find that the best number of iterations is roughly 2000:


[Figure 6]

  Finally, let's look at the model's performance with one example: deciding whether an image shows a cat.


[Figure 7]
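To reproduce a check like the one in Figure 7, you need a prediction helper that thresholds the sigmoid output at 0.5. The predict function below is only a sketch of what that step could look like, not code taken from BP.ipynb; the variable names test_set_x and test_set_y in the usage comment are likewise assumptions (flattened and normalized data, as described above).

def predict(X, parameters):
    """Run the forward pass and threshold the output at 0.5."""
    AL, _ = forwardModel(X, parameters)
    return (AL > 0.5).astype(int)

# Example usage (assuming test_set_x / test_set_y are already preprocessed):
# predictions = predict(test_set_x, parameters)
# print("accuracy:", np.mean(predictions == test_set_y))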

That wraps up the testing. You can also try other network architectures yourself and test other images.