
Multilayer Perceptron

This section assumes you are already familiar with classifying MNIST digits using logistic regression. All of the code for this section can also be downloaded here.

The next architecture we will build with Theano is the multilayer perceptron (MLP) with a single hidden layer. An MLP can be viewed as a logistic regression classifier whose input is first transformed by a learned nonlinear mapping; this intermediate layer is called the hidden layer. A single hidden layer is sufficient to make an MLP a universal approximator. Later, however, we will see that there are substantial benefits to using many hidden layers, which is the premise of deep learning. This tutorial introduces the MLP, backpropagation of errors, and how to train MLPs.

The Model

A multilayer perceptron (or artificial neural network, ANN) with a single hidden layer can be represented graphically as follows:

[Figure: a single-hidden-layer MLP, with an input layer fully connected to a hidden layer, which is in turn fully connected to the output layer]

Formally, a single-hidden-layer MLP is a function f: R^D -> R^L, where D is the size of the input vector x and L is the size of the output vector f(x). In matrix notation, the MLP model is:

f(x) = G(b^(2) + W^(2) s(b^(1) + W^(1) x))

Here b^(1) and W^(1) are the bias vector and weight matrix connecting the input layer to the hidden layer, and s is the activation function of the hidden layer; b^(2) and W^(2) are the bias vector and weight matrix connecting the hidden layer to the output layer, and G is the activation function of the output layer. Typically s is chosen to be the sigmoid (or tanh) function and G the softmax function.
To train the parameters of an MLP we use minibatch stochastic gradient descent; the gradients are obtained with the backpropagation algorithm. Since Theano performs automatic differentiation, we do not need to cover backpropagation in this tutorial.
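As a concrete illustration of the formula above, here is a minimal NumPy sketch of the forward pass (for illustration only; it follows the shape convention of the Theano code below, where W has shape (n_in, n_out), and the helper names and toy sizes are our own choices):

import numpy

def softmax(z):
    # numerically stable softmax over the last axis
    e = numpy.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def mlp_forward(x, W1, b1, W2, b2):
    # hidden layer: h = s(b^(1) + W^(1) x), with s = tanh
    h = numpy.tanh(numpy.dot(x, W1) + b1)
    # output layer: G(b^(2) + W^(2) h), with G = softmax
    return softmax(numpy.dot(h, W2) + b2)

# toy example: D = 784 inputs, 500 hidden units, L = 10 classes
rng = numpy.random.RandomState(1234)
W1 = rng.uniform(-0.1, 0.1, size=(784, 500))
b1 = numpy.zeros(500)
W2 = rng.uniform(-0.1, 0.1, size=(500, 10))
b2 = numpy.zeros(10)
probs = mlp_forward(rng.rand(20, 784), W1, b1, W2, b2)  # shape (20, 10); each row sums to 1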

From Logistic Regression to a Multilayer Perceptron

This tutorial focuses on an MLP with a single hidden layer. We start with the implementation of a class for the hidden layer; to build an MLP, we then only need to stack a logistic regression layer on top of it.

class HiddenLayer(object):
    def __init__(self, rng, input, n_in, n_out, W=None, b=None,
                 activation=T.tanh):
        """
        Typical hidden layer of a MLP: units are fully-connected and have
        sigmoidal activation function. Weight matrix W is of shape (n_in,n_out)
        and the bias vector b is of shape (n_out,).

        NOTE : The nonlinearity used here is tanh

        Hidden unit activation is given by: tanh(dot(input,W) + b)

        :type rng: numpy.random.RandomState
        :param rng: a random number generator used to initialize weights

        :type input: theano.tensor.dmatrix
        :param input: a symbolic tensor of shape (n_examples, n_in)

        :type n_in: int
        :param n_in: dimensionality of input

        :type n_out: int
        :param n_out: number of hidden units

        :type activation: theano.Op or function
        :param activation: Non linearity to be applied in the hidden
                           layer
        """
        self.input = input

The weights of a hidden layer should be initialized by sampling uniformly from a symmetric interval that depends on the activation function. For tanh, the interval is [-sqrt(6/(fan_in+fan_out)), sqrt(6/(fan_in+fan_out))]; for the sigmoid function it is [-4*sqrt(6/(fan_in+fan_out)), 4*sqrt(6/(fan_in+fan_out))], where fan_in is the number of units in the (i-1)-th layer and fan_out is the number of units in the i-th layer. This result comes from [Xavier10].
Such an initialization guarantees that, early in training, each neuron operates in the regime of its activation function where information is easily propagated both forward (activations flowing from inputs to outputs) and backward (gradients flowing from outputs to inputs).

        # `W` is initialized with `W_values` which is uniformly sampled
        # from -sqrt(6./(n_in+n_out)) to sqrt(6./(n_in+n_out))
        # for the tanh activation function
        # the output of uniform is converted using asarray to dtype
        # theano.config.floatX so that the code is runnable on GPU
        # Note : optimal initialization of weights is dependent on the
        #        activation function used (among other things).
        #        For example, results presented in [Xavier10] suggest that you
        #        should use 4 times larger initial weights for sigmoid
        #        compared to tanh
        #        We have no info for other functions, so we use the same as
        #        tanh.
        if W is None:
            W_values = numpy.asarray(
                rng.uniform(
                    low=-numpy.sqrt(6. / (n_in + n_out)),
                    high=numpy.sqrt(6. / (n_in + n_out)),
                    size=(n_in, n_out)
                ),
                dtype=theano.config.floatX
            )
            if activation == theano.tensor.nnet.sigmoid:
                W_values *= 4

            W = theano.shared(value=W_values, name='W', borrow=True)

        if b is None:
            b_values = numpy.zeros((n_out,), dtype=theano.config.floatX)
            b = theano.shared(value=b_values, name='b', borrow=True)

        self.W = W
        self.b = b

Note that we usually pass in the nonlinearity to be used as the hidden layer's activation function. The default is tanh, but in many cases you may want to use something else.

        lin_output = T.dot(input, self.W) + self.b
        self.output = (
            lin_output if activation is None
            else activation(lin_output)
        )

If you have followed the hidden-layer output above and the tutorial on classifying MNIST digits with logistic regression, you are ready to look at the implementation of the MLP class below.

class MLP(object):
    """Multi-Layer Perceptron Class

    A multilayer perceptron is a feedforward artificial neural network model
    that has one layer or more of hidden units and nonlinear activations.
    Intermediate layers usually have as activation function tanh or the
    sigmoid function (defined here by a ``HiddenLayer`` class)  while the
    top layer is a softmax layer (defined here by a ``LogisticRegression``
    class).
    """

    def __init__(self, rng, input, n_in, n_hidden, n_out):
        """Initialize the parameters for the multilayer perceptron

        :type rng: numpy.random.RandomState
        :param rng: a random number generator used to initialize weights

        :type input: theano.tensor.TensorType
        :param input: symbolic variable that describes the input of the
        architecture (one minibatch)

        :type n_in: int
        :param n_in: number of input units, the dimension of the space in
        which the datapoints lie

        :type n_hidden: int
        :param n_hidden: number of hidden units

        :type n_out: int
        :param n_out: number of output units, the dimension of the space in
        which the labels lie

        """

        # Since we are dealing with a one hidden layer MLP, this will translate
        # into a HiddenLayer with a tanh activation function connected to the
        # LogisticRegression layer; the activation function can be replaced by
        # sigmoid or any other nonlinear function
        self.hiddenLayer = HiddenLayer(
            rng=rng,
            input=input,
            n_in=n_in,
            n_out=n_hidden,
            activation=T.tanh
        )

        # The logistic regression layer gets as input the hidden units
        # of the hidden layer
        self.logRegressionLayer = LogisticRegression(
            input=self.hiddenLayer.output,
            n_in=n_hidden,
            n_out=n_out
        )

In this tutorial we also use L1/L2 regularization (see the section on L1/L2 regularization). For this, we need to compute the L1 norm and the squared L2 norm of the weight matrices W^(1) and W^(2).

        # L1 norm ; one regularization option is to enforce L1 norm to
        # be small
        self.L1 = (
            abs(self.hiddenLayer.W).sum()
            + abs(self.logRegressionLayer.W).sum()
        )

        # square of L2 norm ; one regularization option is to enforce
        # square of L2 norm to be small
        self.L2_sqr = (
            (self.hiddenLayer.W ** 2).sum()
            + (self.logRegressionLayer.W ** 2).sum()
        )

        # negative log likelihood of the MLP is given by the negative
        # log likelihood of the output of the model, computed in the
        # logistic regression layer
        self.negative_log_likelihood = (
            self.logRegressionLayer.negative_log_likelihood
        )
        # same holds for the function computing the number of errors
        self.errors = self.logRegressionLayer.errors

        # the parameters of the model are the parameters of the two layers it
        # is made out of
        self.params = self.hiddenLayer.params + self.logRegressionLayer.params

As before, we train this model using minibatch stochastic gradient descent. The difference is that we now add the regularization terms to the cost function. The hyperparameters L1_reg and L2_reg control the weight of the regularization applied to the weight matrices. The code that computes the new cost is:

    # the cost we minimize during training is the negative log likelihood of
    # the model plus the regularization terms (L1 and L2); cost is expressed
    # here symbolically
    cost = (
        classifier.negative_log_likelihood(y)
        + L1_reg * classifier.L1
        + L2_reg * classifier.L2_sqr
    )

We then update the model parameters using the gradients, essentially as in logistic regression. We take the list of parameters from the model's params attribute and compute the gradient of the cost with respect to each of them.

    # compute the gradient of cost with respect to theta (stored in params)
    # the resulting gradients will be stored in a list gparams
    gparams = [T.grad(cost, param) for param in classifier.params]

    # specify how to update the parameters of the model as a list of
    # (variable, update expression) pairs

    # given two lists A = [a1, a2, a3, a4] and B = [b1, b2, b3, b4] of the
    # same length, zip generates a list C of the same size, where each element
    # is a pair formed from the two lists :
    #    C = [(a1, b1), (a2, b2), (a3, b3), (a4, b4)]
    updates = [
        (param, param - learning_rate * gparam)
        for param, gparam in zip(classifier.params, gparams)
    ]

    # compiling a Theano function `train_model` that returns the cost, but
    # in the same time updates the parameter of the model based on the rules
    # defined in `updates`
    train_model = theano.function(
        inputs=[index],
        outputs=cost,
        updates=updates,
        givens={
            x: train_set_x[index * batch_size: (index + 1) * batch_size],
            y: train_set_y[index * batch_size: (index + 1) * batch_size]
        }
    )

Putting It All Together

Having explained all the basic concepts, the following code is the complete MLP implementation.

"""
This tutorial introduces the multilayer perceptron using Theano.

 A multilayer perceptron is a logistic regressor where
instead of feeding the input to the logistic regression you insert a
intermediate layer, called the hidden layer, that has a nonlinear
activation function (usually tanh or sigmoid) . One can use many such
hidden layers making the architecture deep. The tutorial will also tackle
the problem of MNIST digit classification.

.. math::

    f(x) = G( b^{(2)} + W^{(2)}( s( b^{(1)} + W^{(1)} x))),

References:

    - textbooks: "Pattern Recognition and Machine Learning" -
                 Christopher M. Bishop, section 5

"""
__docformat__ = 'restructedtext en'


import os
import sys
import time

import numpy

import theano
import theano.tensor as T


from logistic_sgd import LogisticRegression, load_data


# start-snippet-1
class HiddenLayer(object):
    def __init__(self, rng, input, n_in, n_out, W=None, b=None,
                 activation=T.tanh):
        """
        Typical hidden layer of a MLP: units are fully-connected and have
        sigmoidal activation function. Weight matrix W is of shape (n_in,n_out)
        and the bias vector b is of shape (n_out,).

        NOTE : The nonlinearity used here is tanh

        Hidden unit activation is given by: tanh(dot(input,W) + b)

        :type rng: numpy.random.RandomState
        :param rng: a random number generator used to initialize weights

        :type input: theano.tensor.dmatrix
        :param input: a symbolic tensor of shape (n_examples, n_in)

        :type n_in: int
        :param n_in: dimensionality of input

        :type n_out: int
        :param n_out: number of hidden units

        :type activation: theano.Op or function
        :param activation: Non linearity to be applied in the hidden
                           layer
        """
        self.input = input
        # end-snippet-1

        # `W` is initialized with `W_values` which is uniformly sampled
        # from -sqrt(6./(n_in+n_out)) to sqrt(6./(n_in+n_out))
        # for the tanh activation function
        # the output of uniform is converted using asarray to dtype
        # theano.config.floatX so that the code is runnable on GPU
        # Note : optimal initialization of weights is dependent on the
        #        activation function used (among other things).
        #        For example, results presented in [Xavier10] suggest that you
        #        should use 4 times larger initial weights for sigmoid
        #        compared to tanh
        #        We have no info for other functions, so we use the same as
        #        tanh.
        if W is None:
            W_values = numpy.asarray(
                rng.uniform(
                    low=-numpy.sqrt(6. / (n_in + n_out)),
                    high=numpy.sqrt(6. / (n_in + n_out)),
                    size=(n_in, n_out)
                ),
                dtype=theano.config.floatX
            )
            if activation == theano.tensor.nnet.sigmoid:
                W_values *= 4

            W = theano.shared(value=W_values, name='W', borrow=True)

        if b is None:
            b_values = numpy.zeros((n_out,), dtype=theano.config.floatX)
            b = theano.shared(value=b_values, name='b', borrow=True)

        self.W = W
        self.b = b

        lin_output = T.dot(input, self.W) + self.b
        self.output = (
            lin_output if activation is None
            else activation(lin_output)
        )
        # parameters of the model
        self.params = [self.W, self.b]


# start-snippet-2
class MLP(object):
    """Multi-Layer Perceptron Class

    A multilayer perceptron is a feedforward artificial neural network model
    that has one layer or more of hidden units and nonlinear activations.
    Intermediate layers usually have as activation function tanh or the
    sigmoid function (defined here by a ``HiddenLayer`` class)  while the
    top layer is a softmax layer (defined here by a ``LogisticRegression``
    class).
    """

    def __init__(self, rng, input, n_in, n_hidden, n_out):
        """Initialize the parameters for the multilayer perceptron

        :type rng: numpy.random.RandomState
        :param rng: a random number generator used to initialize weights

        :type input: theano.tensor.TensorType
        :param input: symbolic variable that describes the input of the
        architecture (one minibatch)

        :type n_in: int
        :param n_in: number of input units, the dimension of the space in
        which the datapoints lie

        :type n_hidden: int
        :param n_hidden: number of hidden units

        :type n_out: int
        :param n_out: number of output units, the dimension of the space in
        which the labels lie

        """

        # Since we are dealing with a one hidden layer MLP, this will translate
        # into a HiddenLayer with a tanh activation function connected to the
        # LogisticRegression layer; the activation function can be replaced by
        # sigmoid or any other nonlinear function
        self.hiddenLayer = HiddenLayer(
            rng=rng,
            input=input,
            n_in=n_in,
            n_out=n_hidden,
            activation=T.tanh
        )

        # The logistic regression layer gets as input the hidden units
        # of the hidden layer
        self.logRegressionLayer = LogisticRegression(
            input=self.hiddenLayer.output,
            n_in=n_hidden,
            n_out=n_out
        )
        # end-snippet-2 start-snippet-3
        # L1 norm ; one regularization option is to enforce L1 norm to
        # be small
        self.L1 = (
            abs(self.hiddenLayer.W).sum()
            + abs(self.logRegressionLayer.W).sum()
        )

        # square of L2 norm ; one regularization option is to enforce
        # square of L2 norm to be small
        self.L2_sqr = (
            (self.hiddenLayer.W ** 2).sum()
            + (self.logRegressionLayer.W ** 2).sum()
        )

        # negative log likelihood of the MLP is given by the negative
        # log likelihood of the output of the model, computed in the
        # logistic regression layer
        self.negative_log_likelihood = (
            self.logRegressionLayer.negative_log_likelihood
        )
        # same holds for the function computing the number of errors
        self.errors = self.logRegressionLayer.errors

        # the parameters of the model are the parameters of the two layers it
        # is made out of
        self.params = self.hiddenLayer.params + self.logRegressionLayer.params
        # end-snippet-3


def test_mlp(learning_rate=0.01, L1_reg=0.00, L2_reg=0.0001, n_epochs=1000,
             dataset='mnist.pkl.gz', batch_size=20, n_hidden=500):
    """
    Demonstrate stochastic gradient descent optimization for a multilayer
    perceptron

    This is demonstrated on MNIST.

    :type learning_rate: float
    :param learning_rate: learning rate used (factor for the stochastic
    gradient)

    :type L1_reg: float
    :param L1_reg: L1-norm's weight when added to the cost (see
    regularization)

    :type L2_reg: float
    :param L2_reg: L2-norm's weight when added to the cost (see
    regularization)

    :type n_epochs: int
    :param n_epochs: maximal number of epochs to run the optimizer

    :type dataset: string
    :param dataset: the path of the MNIST dataset file from
                 http://www.iro.umontreal.ca/~lisa/deep/data/mnist/mnist.pkl.gz


   """
    datasets = load_data(dataset)

    train_set_x, train_set_y = datasets[0]
    valid_set_x, valid_set_y = datasets[1]
    test_set_x, test_set_y = datasets[2]

    # compute number of minibatches for training, validation and testing
    n_train_batches = train_set_x.get_value(borrow=True).shape[0] / batch_size
    n_valid_batches = valid_set_x.get_value(borrow=True).shape[0] / batch_size
    n_test_batches = test_set_x.get_value(borrow=True).shape[0] / batch_size

    ######################
    # BUILD ACTUAL MODEL #
    ######################
    print '... building the model'

    # allocate symbolic variables for the data
    index = T.lscalar()  # index to a [mini]batch
    x = T.matrix('x')  # the data is presented as rasterized images
    y = T.ivector('y')  # the labels are presented as 1D vector of
                        # [int] labels

    rng = numpy.random.RandomState(1234)

    # construct the MLP class
    classifier = MLP(
        rng=rng,
        input=x,
        n_in=28 * 28,
        n_hidden=n_hidden,
        n_out=10
    )

    # start-snippet-4
    # the cost we minimize during training is the negative log likelihood of
    # the model plus the regularization terms (L1 and L2); cost is expressed
    # here symbolically
    cost = (
        classifier.negative_log_likelihood(y)
        + L1_reg * classifier.L1
        + L2_reg * classifier.L2_sqr
    )
    # end-snippet-4

    # compiling a Theano function that computes the mistakes that are made
    # by the model on a minibatch
    test_model = theano.function(
        inputs=[index],
        outputs=classifier.errors(y),
        givens={
            x: test_set_x[index * batch_size:(index + 1) * batch_size],
            y: test_set_y[index * batch_size:(index + 1) * batch_size]
        }
    )

    validate_model = theano.function(
        inputs=[index],
        outputs=classifier.errors(y),
        givens={
            x: valid_set_x[index * batch_size:(index + 1) * batch_size],
            y: valid_set_y[index * batch_size:(index + 1) * batch_size]
        }
    )

    # start-snippet-5
    # compute the gradient of cost with respect to theta (stored in params)
    # the resulting gradients will be stored in a list gparams
    gparams = [T.grad(cost, param) for param in classifier.params]

    # specify how to update the parameters of the model as a list of
    # (variable, update expression) pairs

    # given two lists A = [a1, a2, a3, a4] and B = [b1, b2, b3, b4] of the
    # same length, zip generates a list C of the same size, where each element
    # is a pair formed from the two lists :
    #    C = [(a1, b1), (a2, b2), (a3, b3), (a4, b4)]
    updates = [
        (param, param - learning_rate * gparam)
        for param, gparam in zip(classifier.params, gparams)
    ]

    # compiling a Theano function `train_model` that returns the cost, but
    # in the same time updates the parameter of the model based on the rules
    # defined in `updates`
    train_model = theano.function(
        inputs=[index],
        outputs=cost,
        updates=updates,
        givens={
            x: train_set_x[index * batch_size: (index + 1) * batch_size],
            y: train_set_y[index * batch_size: (index + 1) * batch_size]
        }
    )
    # end-snippet-5

    ###############
    # TRAIN MODEL #
    ###############
    print '... training'

    # early-stopping parameters
    patience = 10000  # look as this many examples regardless
    patience_increase = 2  # wait this much longer when a new best is
                           # found
    improvement_threshold = 0.995  # a relative improvement of this much is
                                   # considered significant
    validation_frequency = min(n_train_batches, patience / 2)
                                  # go through this many
                                  # minibatche before checking the network
                                  # on the validation set; in this case we
                                  # check every epoch

    best_validation_loss = numpy.inf
    best_iter = 0
    test_score = 0.
    start_time = time.clock()

    epoch = 0
    done_looping = False

    while (epoch < n_epochs) and (not done_looping):
        epoch = epoch + 1
        for minibatch_index in xrange(n_train_batches):

            minibatch_avg_cost = train_model(minibatch_index)
            # iteration number
            iter = (epoch - 1) * n_train_batches + minibatch_index

            if (iter + 1) % validation_frequency == 0:
                # compute zero-one loss on validation set
                validation_losses = [validate_model(i) for i
                                     in xrange(n_valid_batches)]
                this_validation_loss = numpy.mean(validation_losses)

                print(
                    'epoch %i, minibatch %i/%i, validation error %f %%' %
                    (
                        epoch,
                        minibatch_index + 1,
                        n_train_batches,
                        this_validation_loss * 100.
                    )
                )

                # if we got the best validation score until now
                if this_validation_loss < best_validation_loss:
                    #improve patience if loss improvement is good enough
                    if (
                        this_validation_loss < best_validation_loss *
                        improvement_threshold
                    ):
                        patience = max(patience, iter * patience_increase)

                    best_validation_loss = this_validation_loss
                    best_iter = iter

                    # test it on the test set
                    test_losses = [test_model(i) for i
                                   in xrange(n_test_batches)]
                    test_score = numpy.mean(test_losses)

                    print(('     epoch %i, minibatch %i/%i, test error of '
                           'best model %f %%') %
                          (epoch, minibatch_index + 1, n_train_batches,
                           test_score * 100.))

            if patience <= iter:
                done_looping = True
                break

    end_time = time.clock()
    print(('Optimization complete. Best validation score of %f %% '
           'obtained at iteration %i, with test performance %f %%') %
          (best_validation_loss * 100., best_iter + 1, test_score * 100.))
    print >> sys.stderr, ('The code for file ' +
                          os.path.split(__file__)[1] +
                          ' ran for %.2fm' % ((end_time - start_time) / 60.))


if __name__ == '__main__':
    test_mlp()

The expected output is of the form:

Optimization complete. Best validation score of 1.690000 % obtained at iteration 2070000, with test performance 1.650000 %
The code for file mlp.py ran for 97.34m

On a machine with an Intel(R) Core(TM) i7-2600K CPU @ 3.40GHz, the code ran at approximately 10.3 epochs/minute and took 828 epochs to reach a test error of 1.65%.
Readers can also compare this result against other MNIST classification results listed on this page.

Tips and Tricks for Training MLPs

Several of the hyperparameters in the code above cannot be optimized by gradient descent. Strictly speaking, finding an optimal set of values for these hyperparameters is not a feasible problem. First, we cannot simply optimize each of them independently. Second, we cannot readily apply the gradient techniques described earlier (some hyperparameters are discrete, others are real-valued). Third, the optimization problem is non-convex and prone to local minima.
The good news is that over the last 25 years, researchers have devised rules of thumb for choosing hyperparameters in a neural network. A very good overview is Efficient BackProp by LeCun et al. Here we summarize the most important methods and techniques used in our code.

Nonlinearity

The most common activation functions are the sigmoid and tanh functions. As explained in Section 4.4 of that reference, nonlinearities that are symmetric around the origin are preferred because they tend to produce zero-mean outputs (a desirable property). Empirically, we have observed that tanh (the hyperbolic tangent) has better convergence properties.
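As a quick toy check of the zero-mean point (our own illustration, not part of the tutorial code):

import numpy

x = numpy.random.RandomState(0).randn(100000)  # zero-mean inputs
print(numpy.tanh(x).mean())                    # close to 0: tanh is symmetric around the origin
print((1. / (1. + numpy.exp(-x))).mean())      # close to 0.5: sigmoid outputs are not centered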

Weight Initialization

At initialization we want the weights to be around zero and small enough that the activation function operates in its near-linear regime, where gradients are largest. Another desirable property, especially for deep networks, is to preserve the variance of the activations and of the back-propagated gradients from layer to layer. This allows information to flow well both forward and backward through the network and reduces discrepancies between layers. For the mathematical derivation, see [Xavier10].

Learning Rate

A large part of the literature focuses on choosing a good learning rate. The simplest solution is a constant rate. Rule of thumb: try several log-spaced values (0.1, 0.01, ...), then narrow a (logarithmic) grid search to the region where you obtain the lowest validation error.
Decreasing the learning rate over time is sometimes also a good idea. One simple rule is mu / (1 + d*t), where mu is the initial rate (chosen, for example, with the grid search described above), d is the decrease constant controlling how fast the rate decays (typically a small positive number, 0.001 or smaller), and t is the epoch or iteration number; see the sketch after this section.
Section 4.7 of the reference above details procedures for choosing a learning rate for each parameter of the network and adapting them based on the classifier's error.
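A minimal sketch of this decay schedule (the function and variable names are our own, not part of the tutorial code):

def decayed_learning_rate(mu, d, t):
    # learning rate at step t: mu / (1 + d * t)
    return mu / (1. + d * t)

# example: an initial rate of 0.01 decayed with d = 0.001
for t in (0, 1000, 10000, 100000):
    print(decayed_learning_rate(0.01, 0.001, t))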

Number of Hidden Units

This hyperparameter is very dataset-dependent. Loosely speaking, the more complicated the input distribution, the more capacity the network needs in order to model it, and hence the larger the number of hidden units required. In fact, the number of weights in a layer, i.e. the product of its input and output dimensions, is perhaps a more direct measure of capacity (see the count below).
Unless we employ some regularization scheme (early stopping or L1/L2 penalties), a plot of the number of hidden units versus generalization performance is typically U-shaped (that is, beyond some point, adding more hidden units no longer improves generalization).
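For instance, with the MNIST settings used in the code above (784 inputs, 500 hidden units, 10 outputs), a quick count of the weights (our own arithmetic) gives:

n_in, n_hidden, n_out = 28 * 28, 500, 10
print(n_in * n_hidden)   # 392000 weights between the input and hidden layers
print(n_hidden * n_out)  # 5000 weights between the hidden and output layers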

Regularization Parameter

Typical values to try for the L1/L2 regularization parameter lambda are 0.01, 0.001, and so on. Although in the framework described above it did not significantly improve performance, regularization remains a worthwhile hyperparameter to explore.
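A minimal sketch of such a search, reusing the test_mlp function defined above (the candidate values and the shortened run length are our own choices, not part of the tutorial):

# train once per candidate value and compare the printed validation errors
for l2 in (0.01, 0.001, 0.0001):
    print('L2_reg = %f' % l2)
    test_mlp(L2_reg=l2, n_epochs=50)  # shorter runs are usually enough to compare settings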
