
[Reading 1] [2017] MATLAB and Deep Learning: Pooling Layer (1)

Because pooling is a two-dimensional operation, and a purely textual explanation may cause more confusion, let's go through an example.

Consider a 4×4 pixel input image, represented by the matrix shown in Figure 6-15.

Figure 6-15. The 4×4 pixel input image

We group the pixels of the input image into 2×2 blocks that do not overlap.

Once the input image passes through the pooling layer, it shrinks to a 2×2 pixel image.

Figure 6-16 shows the results of pooling with mean pooling and with max pooling.

Figure 6-16. The results of pooling using two different methods
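The 2×2 non-overlapping pooling described above can be sketched in a few lines. This is a minimal illustration in plain Python rather than the book's MATLAB code. The matrix values are hypothetical, since the figure contents are not reproduced in this text, and the helper names `pool2x2` and `mean` are introduced here only for the example.

```python
# Hypothetical 4x4 input image; the values in Figure 6-15 are not
# reproduced in this text, so these numbers are for illustration only.
image = [
    [1, 1, 1, 3],
    [4, 6, 4, 8],
    [30, 0, 1, 5],
    [0, 2, 2, 4],
]


def pool2x2(img, reduce_fn):
    """Apply reduce_fn to each non-overlapping 2x2 block of img."""
    rows, cols = len(img), len(img[0])
    out = []
    for r in range(0, rows, 2):          # step by 2 so blocks never overlap
        out_row = []
        for c in range(0, cols, 2):
            block = [img[r][c], img[r][c + 1],
                     img[r + 1][c], img[r + 1][c + 1]]
            out_row.append(reduce_fn(block))
        out.append(out_row)
    return out


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


mean_pooled = pool2x2(image, mean)  # mean pooling
max_pooled = pool2x2(image, max)    # max pooling

print(mean_pooled)  # [[3.0, 4.0], [8.0, 3.0]]
print(max_pooled)   # [[6, 8], [30, 5]]
```

Both results are 2×2, matching the shrinkage from 4×4 to 2×2 described in the text. Only the rule for summarizing each block differs.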

In fact, in a mathematical sense, the pooling process is a type of convolution operation.

It differs from the convolution layer in that the convolution filter is fixed rather than learned, and the convolution areas do not overlap.

The example provided in the next section will elaborate on this.
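Until then, the relationship can be sketched for mean pooling. The following is an illustrative Python sketch under stated assumptions, not the book's implementation. The input values and the helper `filter_valid` are hypothetical. Because the averaging kernel is symmetric, sliding it without flipping gives the same result as a true convolution. The sketch covers mean pooling only.

```python
# Hypothetical 4x4 input image (illustrative values only).
image = [
    [1, 1, 1, 3],
    [4, 6, 4, 8],
    [30, 0, 1, 5],
    [0, 2, 2, 4],
]

# Fixed 2x2 averaging filter: every weight is 1/4 and is never trained.
kernel = [[0.25, 0.25],
          [0.25, 0.25]]


def filter_valid(img, ker, stride=1):
    """Slide ker over img without padding, summing elementwise products."""
    kh, kw = len(ker), len(ker[0])
    out = []
    for r in range(0, len(img) - kh + 1, stride):
        row = []
        for c in range(0, len(img[0]) - kw + 1, stride):
            row.append(sum(img[r + i][c + j] * ker[i][j]
                           for i in range(kh) for j in range(kw)))
        out.append(row)
    return out


# Stride 1: windows overlap, as in an ordinary convolution layer (3x3 output).
dense = filter_valid(image, kernel, stride=1)

# Stride 2: windows do not overlap, which is mean pooling (2x2 output).
pooled = filter_valid(image, kernel, stride=2)

# Keeping every second row and column of the dense result gives the same values.
subsampled = [[dense[r][c] for c in (0, 2)] for r in (0, 2)]

print(pooled)      # [[3.0, 4.0], [8.0, 3.0]]
print(subsampled)  # [[3.0, 4.0], [8.0, 3.0]]
```

The two differences named in the text show up directly. The filter weights are constants, and the stride equals the filter width so that no pixel is visited twice.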

The pooling layer compensates, to some extent, for off-center and tilted objects.

For example, the pooling layer can improve the recognition of a cat that is off-center in the input image.
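A toy illustration of this tolerance, and of its limits, is sketched below in Python. The 4×4 images, the single bright pixel, and the helpers `max_pool2x2` and `blank` are all hypothetical and chosen only to make the effect visible. A shift that stays inside one 2×2 pooling block leaves the output unchanged, while a shift that crosses into a neighboring block does not.

```python
def max_pool2x2(img):
    """Max pooling over non-overlapping 2x2 blocks."""
    return [
        [max(img[r][c], img[r][c + 1], img[r + 1][c], img[r + 1][c + 1])
         for c in range(0, len(img[0]), 2)]
        for r in range(0, len(img), 2)
    ]


def blank():
    """Return a 4x4 image of zeros."""
    return [[0] * 4 for _ in range(4)]


# One bright pixel in the top-left 2x2 block.
a = blank()
a[0][0] = 9

# The same pixel shifted diagonally by one, still inside the same block.
b = blank()
b[1][1] = 9

# The pixel shifted across the block boundary into the top-right block.
c = blank()
c[1][2] = 9

pooled_a = max_pool2x2(a)
pooled_b = max_pool2x2(b)
pooled_c = max_pool2x2(c)

print(pooled_a)  # [[9, 0], [0, 0]]
print(pooled_b)  # [[9, 0], [0, 0]]  (unchanged by the small shift)
print(pooled_c)  # [[0, 9], [0, 0]]  (changed once the block boundary is crossed)
```

This is why the compensation holds only "to some extent": small displacements are absorbed, larger ones are not.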

In addition, because pooling reduces the image size, it is highly beneficial for reducing the computational load and helping to prevent overfitting.

Example: MNIST

We implement a neural network that takes an input image and recognizes the digit it represents.

The training data comes from the MNIST (Modified National Institute of Standards and Technology) database, which contains 70,000 images of handwritten digits.

In general, 60,000 images are used for training, and the remaining 10,000 images are used for the validation test.

Each digit image is a 28-by-28 pixel black-and-white image, as shown in Figure 6-17.

Figure 6-17. A 28-by-28 pixel black-and-white image from the MNIST database

To keep the training time manageable, this example uses only 10,000 images, split between training data and validation data in an 8:2 ratio.
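The 8:2 split works out to 8,000 training images and 2,000 validation images. A minimal Python sketch of such a split over image indices follows. The shuffling step and the fixed seed are assumptions made for illustration, since the text does not say how the images are selected or ordered, and the variable names are hypothetical.

```python
import random

num_images = 10_000                  # subset size stated in the text
num_train = num_images * 8 // 10     # 8:2 ratio -> 8,000 training images

# Assumption: shuffle the indices before splitting (not stated in the text).
indices = list(range(num_images))
random.Random(0).shuffle(indices)    # fixed seed so the split is reproducible

train_idx = indices[:num_train]      # indices of the training images
val_idx = indices[num_train:]        # indices of the validation images

print(len(train_idx), len(val_idx))  # 8000 2000
```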

This article is translated from "MATLAB Deep Learning" by Phil Kim.
