The following code is from Chapter 10 of Deep Learning for Computer Vision with Python.

This example requires four new files in the same folder: 1. perceptron.py; 2. perceptron_or.py; 3. perceptron_and.py; 4. perceptron_xor.py.

1. perceptron.py

# import the necessary packages
import numpy as np

class Perceptron:
	def __init__(self, N, alpha=0.1):
		# initialize the weight matrix and store the learning rate
		self.W = np.random.randn(N + 1) / np.sqrt(N)
		self.alpha = alpha
		
	def step(self, x):
		# apply the step function
		return 1 if x > 0 else 0
		
	def fit(self, X, y, epochs=10):
		# insert a column of 1's as the last entry in the feature
		# matrix -- this little trick allows us to treat the bias
		# as a trainable parameter within the weight matrix
		X = np.c_[X, np.ones((X.shape[0]))]
		
		# loop over the desired number of epochs
		for epoch in np.arange(0, epochs):
			# loop over each individual data point
			for (x, target) in zip(X, y):
				# take the dot product between the input features
				# and the weight matrix, then pass this value
				# through the step function to obtain the prediction
				p = self.step(np.dot(x, self.W))

				#print("[training] self.W={}, x={}, target={}".format(self.W, x, target))
				
				# only perform a weight update if our prediction
				# does not match the target
				if p != target:
					# determine the error
					error = p - target

					# update the weight matrix
					self.W += -self.alpha * error * x

	def predict(self, X, addBias=True):
		# ensure our input is a matrix
		X = np.atleast_2d(X)

		# check to see if the bias column should be added
		if addBias:
			# insert a column of 1's as the last entry in the feature
			# matrix (bias)
			X = np.c_[X, np.ones((X.shape[0]))]

		# take the dot product between the input features and the
		# weight matrix, then pass the value through the step
		# function
		return self.step(np.dot(X, self.W))
		

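Analysis: The Perceptron class is a neural network with a (3-1) structure. (3-1) means the input layer has 3 neurons (two of them take the input parameters x1 and x2, and the third has a fixed input of 1) and the output layer has 1 neuron. A schematic is shown in the figure below.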

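Besides the weights (w1 and w2), the weight matrix also holds the bias (b). There are only two real input parameters, with corresponding weights w1 and w2; the input layer has 3 neurons so that the bias b can be folded into the weight matrix. When the weight matrix is trained, the bias b is updated along with it. The output is y = step(x1*w1 + x2*w2 + b).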
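
The bias trick described above can be checked numerically. The following is a minimal sketch; the weight and bias values are arbitrary numbers chosen for illustration and do not come from the book.

```python
import numpy as np

# The four 2-feature samples used by the logic-gate datasets
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])

# Hypothetical weights and bias, chosen only for illustration
w = np.array([0.5, -0.3])
b = 0.2

# Explicit form: x1*w1 + x2*w2 + b
explicit = X.dot(w) + b

# Bias trick: append a column of ones to X and append b to the weights
X_aug = np.c_[X, np.ones((X.shape[0]))]
W = np.append(w, b)
folded = X_aug.dot(W)

# Both forms give the same pre-activation values
print(np.allclose(explicit, folded))
```

Because the last input is always 1, the last entry of W acts exactly like b, so a single dot product covers both the weights and the bias.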

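The neuron's activation function is the step function. It takes one input and returns one output: 0 when the input is less than or equal to 0, and 1 when the input is greater than 0.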
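
As a quick check of the boundary case, here is a standalone copy of the step function from the class above (written as a plain function for illustration). Because the comparison is strict, an input of exactly 0 maps to 0.

```python
def step(x):
    # same rule as Perceptron.step: 1 only for strictly positive input
    return 1 if x > 0 else 0

# one negative input, the boundary value, and one positive input
for value in (-2.0, 0.0, 0.5):
    print(value, step(value))
```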

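The fit function trains the network. It visits the samples one at a time and adjusts the weights only when a prediction is wrong, in the manner of stochastic gradient descent. The predict function tests a sample: it passes the sample through the network and returns the predicted result.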
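
To make the update inside fit concrete, here is a single update step worked by hand. This is only a sketch: the zero starting weights are an assumption made for readability (the class itself initializes W randomly).

```python
import numpy as np

alpha = 0.1
W = np.array([0.0, 0.0, 0.0])   # assumed starting weights [w1, w2, b]
x = np.array([0.0, 1.0, 1.0])   # sample x1=0, x2=1 plus the fixed bias input 1
target = 1                      # OR(0, 1) = 1

# forward pass: the dot product is 0, so the step function returns 0
p = 1 if np.dot(x, W) > 0 else 0

# the prediction is wrong, so apply the same rule as in fit
if p != target:
    error = p - target            # 0 - 1 = -1
    W = W - alpha * error * x     # W becomes [0, 0.1, 0.1]

# the same sample is now classified correctly
p_after = 1 if np.dot(x, W) > 0 else 0
print(W, p_after)
```

Only the weights attached to non-zero inputs move, and they move in the direction that pushes the dot product toward the correct side of 0.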

2. perceptron_or.py

# import the necessary packages
from perceptron import Perceptron
import numpy as np

# construct the OR dataset
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([[0], [1], [1], [1]])

# define our perceptron and train it
print("[INFO] training perceptron...")
p = Perceptron(X.shape[1], alpha=0.1)
p.fit(X, y, epochs=20)

# now that our perceptron is trained we can evaluate it
print("[INFO] testing perceptron...")

# now that our network is trained, loop over the data points
for (x, target) in zip(X, y):
	# make a prediction on the data point and display the result
	# to our console
	pred = p.predict(x)
	print("[INFO] data={}, ground-truth={}, pred={}".format(
		x, target[0], pred))
		

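For the OR operation, the relationship between the two inputs and the output is shown in the table below.

Logical operation: OR
x1 (input 1)   x2 (input 2)   y (output)
0              0              0
0              1              1
1              0              1
1              1              1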

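The Perceptron constructor creates a 2-layer neural network. Its first argument, X.shape[1], is the number of features in each sample of X. alpha is the learning rate, i.e. the step size used when the weights are updated. The closer it is to 1, the faster the weights change, but a larger value also makes it easier to overshoot the minimum.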

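The first argument of fit is the matrix of sample inputs, the second is the matrix of corresponding target outputs (ground truth), and the third is the number of epochs. In the printed output, data is the sample's input, ground-truth is the true value, and pred is the prediction.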

Running perceptron_or.py with Python gives the following result:

============= RESTART: E:\FENG\workspace_python\perceptron_or.py =============
[INFO] training perceptron...
[INFO] testing perceptron...
[INFO] data=[0 0], ground-truth=0, pred=0
[INFO] data=[0 1], ground-truth=1, pred=1
[INFO] data=[1 0], ground-truth=1, pred=1
[INFO] data=[1 1], ground-truth=1, pred=1

3. perceptron_and.py

# import the necessary packages
from perceptron import Perceptron
import numpy as np

# construct the AND dataset
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([[0], [0], [0], [1]])

# define our perceptron and train it
print("[INFO] training perceptron...")
p = Perceptron(X.shape[1], alpha=0.1)
p.fit(X, y, epochs=20)

# now that our perceptron is trained we can evaluate it
print("[INFO] testing perceptron...")

# now that our network is trained, loop over the data points
for (x, target) in zip(X, y):
	# make a prediction on the data point and display the result
	# to our console
	pred = p.predict(x)
	print("[INFO] data={}, ground-truth={}, pred={}".format(
		x, target[0], pred))
		

This file differs from the previous one only in the target values y, so it is not analyzed separately.

4. perceptron_xor.py

# import the necessary packages
from perceptron import Perceptron
import numpy as np

# construct the XOR dataset
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([[0], [1], [1], [0]])

# define our perceptron and train it
print("[INFO] training perceptron...")
p = Perceptron(X.shape[1], alpha=0.1)
p.fit(X, y, epochs=20)

# now that our perceptron is trained we can evaluate it
print("[INFO] testing perceptron...")

# now that our network is trained, loop over the data points
for (x, target) in zip(X, y):
	# make a prediction on the data point and display the result
	# to our console
	pred = p.predict(x)
	print("[INFO] data={}, ground-truth={}, pred={}".format(
		x, target[0], pred))
		

The result is as follows:

============ RESTART: E:\FENG\workspace_python\perceptron_xor.py ============
[INFO] training perceptron...
[INFO] testing perceptron...
[INFO] data=[0 0], ground-truth=0, pred=1
[INFO] data=[0 1], ground-truth=1, pred=1
[INFO] data=[1 0], ground-truth=1, pred=0
[INFO] data=[1 1], ground-truth=0, pred=0

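As the output shows, the XOR predictions are not accurate. The main reason is that a neural network with only 2 layers of neurons and no hidden layer can only draw a linear decision boundary, so it cannot classify samples that are not linearly separable.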

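The figure above shows how the AND and OR samples are distributed in space. Because a single straight line can separate the samples with output 0 from those with output 1, these two problems are simple, and in practice the 2-layer classifier achieves the expected results. The XOR samples are different: no single straight line separates the two classes.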
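
The difference can be illustrated with a small brute-force search over candidate lines. This is a sketch only: find_separating_line and the coarse grid of values are hypothetical helpers introduced here, and a failed grid search is an illustration rather than a proof (the impossibility for XOR is nevertheless a standard result).

```python
from itertools import product

def step(x):
    return 1 if x > 0 else 0

def find_separating_line(labels, grid):
    # try every (w1, w2, b) on the grid and return the first combination
    # that reproduces all four labels, or None if there is none
    inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
    for w1, w2, b in product(grid, repeat=3):
        preds = [step(x1 * w1 + x2 * w2 + b) for (x1, x2) in inputs]
        if preds == labels:
            return (w1, w2, b)
    return None

# candidate values -2.0, -1.5, ..., 2.0 for each of w1, w2 and b
grid = [i / 2 for i in range(-4, 5)]

print("OR :", find_separating_line([0, 1, 1, 1], grid))
print("AND:", find_separating_line([0, 0, 0, 1], grid))
print("XOR:", find_separating_line([0, 1, 1, 0], grid))
```

The search finds weights for OR and AND but returns None for XOR, matching the behaviour of the trained perceptrons above.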

To classify the XOR samples correctly, the network structure has to change: add a hidden layer, then retrain and test again.
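
As an illustration of why a hidden layer helps, the sketch below builds a tiny 2-2-1 network out of step neurons. The weights are hand-picked for this example and are not learned; the function name xor_net is hypothetical. One hidden neuron behaves like OR, the other like AND, and the output neuron fires when OR is true and AND is false.

```python
def step(x):
    return 1 if x > 0 else 0

def xor_net(x1, x2):
    # hidden layer: two step neurons with hand-picked weights and biases
    h_or = step(x1 + x2 - 0.5)    # 1 if at least one input is 1
    h_and = step(x1 + x2 - 1.5)   # 1 only if both inputs are 1
    # output layer: fires when OR is true and AND is false
    return step(h_or - h_and - 0.5)

for (x1, x2) in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print("data=({}, {}), pred={}".format(x1, x2, xor_net(x1, x2)))
```

Each hidden neuron draws its own straight line, and the output neuron combines the two half-planes, which a single-layer perceptron cannot do.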
