
Shallow Neural Networks


1. Neural Network Overview:


dW[L] = (1/m) * np.dot(dZ[L], A[L-1].T)

db[L] = (1/m) * np.sum(dZ[L], axis=1, keepdims=True)

dZ[L-1] = np.dot(W[L].T, dZ[L]) * g'(Z[L-1])
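
These three formulas map directly onto NumPy. Below is a minimal sketch of one backward step for a generic layer, assuming dZ[L] has already been computed and that g_prime is the derivative of the previous layer's activation; the function name layer_backward is only illustrative and is not part of the assignment code further down.

import numpy as np

def layer_backward(dZ_l, A_prev, W_l, Z_prev, g_prime):
    """One vectorized backprop step for a generic layer L (illustrative sketch):
    given dZ[L], return dW[L], db[L] and dZ[L-1]."""
    m = A_prev.shape[1]                                     # number of examples
    dW_l = (1.0 / m) * np.dot(dZ_l, A_prev.T)               # dW[L] = (1/m) * dZ[L] A[L-1].T
    db_l = (1.0 / m) * np.sum(dZ_l, axis=1, keepdims=True)  # db[L]
    dZ_prev = np.dot(W_l.T, dZ_l) * g_prime(Z_prev)         # dZ[L-1] = W[L].T dZ[L] * g'(Z[L-1])
    return dW_l, db_l, dZ_prev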

2. Activation Functions:


sigmoid(z) = 1/(1 + e^(-z)), tanh(z) = (e^z - e^(-z))/(e^z + e^(-z)), ReLU(z) = max(0, z), Leaky ReLU(z) = max(0.01z, z)

sigmoid'(z) = a(1 - a), tanh'(z) = 1 - a^2, ReLU'(z) = 1 or 0, Leaky ReLU'(z) = 1 or 0.01   (where a = g(z))
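
Written out in NumPy, the four activations and their derivatives are straightforward; the following is a standalone sketch with illustrative function names, not the course's helper API (tanh itself is already provided by np.tanh).

import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def relu(z):
    return np.maximum(0, z)

def leaky_relu(z):
    return np.maximum(0.01 * z, z)

# Derivatives, written in terms of a = g(z) where that simplifies them:
def sigmoid_grad(z):
    a = sigmoid(z)
    return a * (1 - a)                   # sigmoid'(z) = a(1 - a)

def tanh_grad(z):
    a = np.tanh(z)
    return 1 - a ** 2                    # tanh'(z) = 1 - a^2

def relu_grad(z):
    return (z > 0).astype(float)         # 1 where z > 0, else 0

def leaky_relu_grad(z):
    return np.where(z > 0, 1.0, 0.01)    # 1 where z > 0, else 0.01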


Sigmoid activation: rarely used except in the output layer of a binary classification problem;

tanh activation: an excellent choice that works well in almost all cases for hidden layers;

ReLU activation: the most common default; if you are unsure which activation to use, pick ReLU or Leaky ReLU.

3. Random Initialization:


W[L] = np.random.randn(n[L], n[L-1]) * 0.01

b[L] = np.zeros((n[L], 1))
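
A quick sketch of why the weights need small random values while the biases may start at zero: with all-zero weights every hidden unit computes the same activation and receives the same gradient, so the units never differentiate, and the factor 0.01 keeps tanh/sigmoid inputs out of their flat, saturated range. The layer sizes below are arbitrary example values.

import numpy as np

np.random.seed(1)
n_prev, n_l = 3, 4                        # example layer sizes n[L-1], n[L]

W = np.random.randn(n_l, n_prev) * 0.01   # small random weights break symmetry
b = np.zeros((n_l, 1))                    # biases may safely start at zero

# With zero weights, every hidden unit produces the identical output for any input,
# so gradient descent can never make the units learn different features.
W_zero = np.zeros((n_l, n_prev))
X = np.random.randn(n_prev, 5)            # 5 example inputs
print(np.tanh(np.dot(W_zero, X) + b))     # all rows identical (all zeros)
print(np.tanh(np.dot(W, X) + b))          # rows differ, so units can specialize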

4. Programming Practice:



import numpy as np


def sigmoid(z):
    """Sigmoid activation. (Not defined in the original snippet, where it came
    from the course's helper module; added here so the code is self-contained.)"""
    return 1 / (1 + np.exp(-z))


# Defining the neural network structure:
def layer_sizes(X, Y):
    """
    Arguments:
    X -- input dataset of shape (input size, number of examples)
    Y -- labels of shape (output size, number of examples)

    Returns:
    n_x -- the size of the input layer
    n_h -- the size of the hidden layer
    n_y -- the size of the output layer
    """
    n_x = X.shape[0]  # size of input layer
    n_h = 4
    n_y = Y.shape[0]  # size of output layer

    return (n_x, n_h, n_y)


# Initialize the model's parameters
def initialize_parameters(n_x, n_h, n_y):
    """
    Arguments:
    n_x -- size of the input layer
    n_h -- size of the hidden layer
    n_y -- size of the output layer

    Returns:
    parameters -- python dictionary containing your parameters:
        W1 -- weight matrix of shape (n_h, n_x)
        b1 -- bias vector of shape (n_h, 1)
        W2 -- weight matrix of shape (n_y, n_h)
        b2 -- bias vector of shape (n_y, 1)
    """
    np.random.seed(2)  # seed so that the output is reproducible although the initialization is random

    W1 = np.random.randn(n_h, n_x) * 0.01
    b1 = np.zeros((n_h, 1))
    W2 = np.random.randn(n_y, n_h) * 0.01
    b2 = np.zeros((n_y, 1))

    assert (W1.shape == (n_h, n_x))
    assert (b1.shape == (n_h, 1))
    assert (W2.shape == (n_y, n_h))
    assert (b2.shape == (n_y, 1))

    parameters = {"W1": W1,
                  "b1": b1,
                  "W2": W2,
                  "b2": b2}

    return parameters


# Implement forward_propagation()
def forward_propagation(X, parameters):
    """
    Arguments:
    X -- input data of size (n_x, m)
    parameters -- python dictionary containing your parameters (output of initialization function)

    Returns:
    A2 -- the sigmoid output of the second activation
    cache -- a dictionary containing "Z1", "A1", "Z2" and "A2"
    """
    # Retrieve each parameter from the dictionary "parameters"
    W1 = parameters["W1"]
    b1 = parameters["b1"]
    W2 = parameters["W2"]
    b2 = parameters["b2"]

    # Implement forward propagation to calculate A2 (probabilities)
    Z1 = np.dot(W1, X) + b1
    A1 = np.tanh(Z1)
    Z2 = np.dot(W2, A1) + b2
    A2 = sigmoid(Z2)

    assert (A2.shape == (1, X.shape[1]))

    cache = {"Z1": Z1,
             "A1": A1,
             "Z2": Z2,
             "A2": A2}

    return A2, cache


# Implement compute_cost
def compute_cost(A2, Y, parameters):
    """
    Computes the cross-entropy cost

    Arguments:
    A2 -- the sigmoid output of the second activation, of shape (1, number of examples)
    Y -- "true" labels vector of shape (1, number of examples)
    parameters -- python dictionary containing your parameters W1, b1, W2 and b2

    Returns:
    cost -- cross-entropy cost
    """
    m = Y.shape[1]  # number of examples

    # Compute the cross-entropy cost (note the leading minus sign)
    logprobs = np.multiply(np.log(A2), Y) + np.multiply((1 - Y), np.log(1 - A2))
    cost = -np.sum(logprobs) / m

    cost = float(np.squeeze(cost))  # makes sure cost has the dimension we expect, e.g. turns [[17]] into 17
    assert (isinstance(cost, float))

    return cost


# Implement backward_propagation:
def backward_propagation(parameters, cache, X, Y):
    """
    Implement the backward propagation using the formulas above.

    Arguments:
    parameters -- python dictionary containing our parameters
    cache -- a dictionary containing "Z1", "A1", "Z2" and "A2"
    X -- input data of shape (2, number of examples)
    Y -- "true" labels vector of shape (1, number of examples)

    Returns:
    grads -- python dictionary containing your gradients with respect to different parameters
    """
    m = X.shape[1]

    # First, retrieve W1 and W2 from the dictionary "parameters".
    W1 = parameters["W1"]
    W2 = parameters["W2"]

    # Retrieve also A1 and A2 from dictionary "cache".
    A1 = cache["A1"]
    A2 = cache["A2"]

    # Backward propagation: calculate dW1, db1, dW2, db2.
    dZ2 = A2 - Y
    dW2 = (1.0 / m) * np.dot(dZ2, A1.T)
    db2 = (1.0 / m) * np.sum(dZ2, axis=1, keepdims=True)
    dZ1 = np.multiply(np.dot(W2.T, dZ2), 1 - np.power(A1, 2))  # tanh'(Z1) = 1 - A1^2
    dW1 = (1.0 / m) * np.dot(dZ1, X.T)
    db1 = (1.0 / m) * np.sum(dZ1, axis=1, keepdims=True)

    grads = {"dW1": dW1,
             "db1": db1,
             "dW2": dW2,
             "db2": db2}

    return grads


# update_parameters:
def update_parameters(parameters, grads, learning_rate=1.2):
    """
    Updates parameters using the gradient descent update rule given above

    Arguments:
    parameters -- python dictionary containing your parameters
    grads -- python dictionary containing your gradients

    Returns:
    parameters -- python dictionary containing your updated parameters
    """
    # Retrieve each parameter from the dictionary "parameters"
    W1 = parameters["W1"]
    b1 = parameters["b1"]
    W2 = parameters["W2"]
    b2 = parameters["b2"]

    # Retrieve each gradient from the dictionary "grads"
    dW1 = grads["dW1"]
    db1 = grads["db1"]
    dW2 = grads["dW2"]
    db2 = grads["db2"]

    # Update rule for each parameter
    W1 = W1 - learning_rate * dW1
    b1 = b1 - learning_rate * db1
    W2 = W2 - learning_rate * dW2
    b2 = b2 - learning_rate * db2

    parameters = {"W1": W1,
                  "b1": b1,
                  "W2": W2,
                  "b2": b2}

    return parameters


# Build your neural network model
def nn_model(X, Y, n_h, num_iterations=10000, print_cost=False):
    """
    Arguments:
    X -- dataset of shape (2, number of examples)
    Y -- labels of shape (1, number of examples)
    n_h -- size of the hidden layer
    num_iterations -- number of iterations in gradient descent loop
    print_cost -- if True, print the cost every 1000 iterations

    Returns:
    parameters -- parameters learnt by the model. They can then be used to predict.
    """
    np.random.seed(3)
    n_x = layer_sizes(X, Y)[0]
    n_y = layer_sizes(X, Y)[2]

    # Initialize parameters. Inputs: "n_x, n_h, n_y". Outputs: "parameters".
    parameters = initialize_parameters(n_x, n_h, n_y)

    # Loop (gradient descent)
    for i in range(0, num_iterations):

        # Forward propagation. Inputs: "X, parameters". Outputs: "A2, cache".
        A2, cache = forward_propagation(X, parameters)

        # Cost function. Inputs: "A2, Y, parameters". Outputs: "cost".
        cost = compute_cost(A2, Y, parameters)

        # Backpropagation. Inputs: "parameters, cache, X, Y". Outputs: "grads".
        grads = backward_propagation(parameters, cache, X, Y)

        # Gradient descent parameter update. Inputs: "parameters, grads". Outputs: "parameters".
        parameters = update_parameters(parameters, grads)

        # Print the cost every 1000 iterations
        if print_cost and i % 1000 == 0:
            print("Cost after iteration %i: %f" % (i, cost))

    return parameters


# Use your model to predict by building predict(). Use forward propagation to predict results.
def predict(parameters, X):
    """
    Using the learned parameters, predicts a class for each example in X

    Arguments:
    parameters -- python dictionary containing your parameters
    X -- input data of size (n_x, m)

    Returns:
    predictions -- vector of predictions of our model (red: 0 / blue: 1)
    """
    # Compute probabilities using forward propagation, and classify to 0/1 using 0.5 as the threshold.
    A2, cache = forward_propagation(X, parameters)
    predictions = np.where(A2 > 0.5, 1, 0)

    return predictions
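
A hedged usage sketch of the functions above: the synthetic 2-feature dataset here is invented purely for illustration (the original assignment trains on its own planar dataset), but the call sequence is the same.

import numpy as np

np.random.seed(1)

# Toy binary-classification data: 2 features x 400 examples, labels in {0, 1}.
m = 400
X = np.random.randn(2, m)
Y = (X[0, :] * X[1, :] > 0).astype(int).reshape(1, m)  # label = do the two features share a sign?

# Train the 2-layer network with a 4-unit hidden layer.
parameters = nn_model(X, Y, n_h=4, num_iterations=10000, print_cost=True)

# Predict on the training set and report accuracy.
predictions = predict(parameters, X)
accuracy = float(np.mean(predictions == Y)) * 100
print("Train accuracy: %.1f%%" % accuracy)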
