Must-Know Tips for Deep Learning (Part 1)

阿新 • Published: 2019-01-17

1. All-Zero Initialization

In the ideal situation, with proper data normalization, it is reasonable to assume that approximately half of the trained weights will be positive and half will be negative. A reasonable-sounding idea might then be to set all the initial weights to zero, which you expect to be the "best guess" in expectation. This turns out to be a mistake: if every neuron in a layer computes the same output, then all of them also receive the same gradients during back-propagation and undergo exactly the same parameter updates. In other words, there is no source of asymmetry between neurons if their weights are initialized to the same value, so they remain copies of one another throughout training.
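The symmetry problem can be checked on a toy network. The following is a minimal sketch, not taken from the original article: the one-hidden-layer sigmoid network, the squared-error loss, the single training example, and the names `train_step` and `train_from_zero` are all assumptions chosen for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_step(W1, W2, x, target, lr=0.1):
    """One gradient-descent step on a one-hidden-layer net (illustrative)."""
    # Forward pass: sigmoid hidden units, linear scalar output.
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in W1]
    y = sum(w2 * hj for w2, hj in zip(W2, h))
    dy = y - target  # gradient of 0.5 * (y - target)**2 with respect to y
    # Backward pass.
    dW2 = [dy * hj for hj in h]
    dW1 = [[dy * w2 * hj * (1.0 - hj) * xi for xi in x]
           for w2, hj in zip(W2, h)]
    # Parameter update.
    new_W1 = [[w - lr * g for w, g in zip(row, grow)]
              for row, grow in zip(W1, dW1)]
    new_W2 = [w - lr * g for w, g in zip(W2, dW2)]
    return new_W1, new_W2


def train_from_zero(n_hidden=4, steps=50):
    """Train from all-zero weights on one arbitrary toy example."""
    x, target = [0.5, -1.2, 2.0], 1.0  # assumed toy data
    W1 = [[0.0] * len(x) for _ in range(n_hidden)]
    W2 = [0.0] * n_hidden
    for _ in range(steps):
        W1, W2 = train_step(W1, W2, x, target)
    return W1, W2


W1, W2 = train_from_zero()
# Every hidden unit ends up with the same incoming and outgoing weights.
print(all(row == W1[0] for row in W1))
print(all(w == W2[0] for w in W2))
```

The weights do move away from zero, but every hidden unit follows the same trajectory, so the layer behaves like a single neuron no matter how many units it has.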

2. Initialization with Small Random Numbers

You still want the weights to be very close to zero, but not identically zero. To achieve this, you can initialize the weights of the neurons to small random numbers close to zero, which is known as symmetry breaking. The idea is that the neurons are all random and unique in the beginning, so they will compute distinct updates and integrate themselves as diverse parts of the full network. The implementation for weights might simply look like

$w \sim \epsilon \cdot \mathcal{N}(0, 1)$, where $\epsilon$ is a small positive constant and $\mathcal{N}(0, 1)$ is a zero-mean, unit-standard-deviation Gaussian. It is also possible to use small numbers drawn from a uniform distribution, but this seems to have relatively little impact on the final performance in practice.
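As a minimal sketch of small random initialization, the weights of a layer can be drawn from a unit Gaussian and multiplied by a small constant. The function name `init_small_random`, the fixed seed, and the scale of 0.01 are assumptions for illustration, not values given in the text.

```python
import random


def init_small_random(n_out, n_in, scale=0.01, seed=0):
    """Return an n_out x n_in weight matrix of small Gaussian values.

    `scale` is an assumed small constant; the text only requires the
    weights to be close to zero without being identical.
    """
    rng = random.Random(seed)
    return [[scale * rng.gauss(0.0, 1.0) for _ in range(n_in)]
            for _ in range(n_out)]


W = init_small_random(n_out=4, n_in=3)
for row in W:
    print(row)
```

Each row (one neuron's incoming weights) is different, so the neurons start from distinct points and receive distinct gradient updates.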
3. Calibrating the Variances

One problem with the above suggestion is that the output of a randomly initialized neuron has a variance that grows with its number of inputs. You can normalize the variance of each neuron's output to 1 by dividing its weight vector by the square root of its fan-in (i.e., its number of inputs): $w = \mathcal{N}(0, 1) / \sqrt{n}$, where $n$ is the neuron's fan-in and $\mathcal{N}(0, 1)$ is the unit Gaussian described above.
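The effect of this calibration can be checked empirically. The sketch below estimates the variance of a single neuron's pre-activation output, the weighted sum of its inputs, with and without dividing the weights by the square root of the fan-in. It assumes unit-Gaussian inputs, and the helper name `output_variance` and the trial count are hypothetical choices.

```python
import math
import random


def output_variance(n_in, calibrated, trials=2000, seed=0):
    """Estimate Var(sum_i w_i * x_i) for a neuron with n_in inputs.

    Inputs x_i are assumed to be unit Gaussian. Weights are unit Gaussian,
    divided by sqrt(n_in) when `calibrated` is True.
    """
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(n_in) if calibrated else 1.0
    outputs = []
    for _ in range(trials):
        w = [scale * rng.gauss(0.0, 1.0) for _ in range(n_in)]
        x = [rng.gauss(0.0, 1.0) for _ in range(n_in)]
        outputs.append(sum(wi * xi for wi, xi in zip(w, x)))
    mean = sum(outputs) / trials
    return sum((o - mean) ** 2 for o in outputs) / trials


for n in (10, 100):
    print(n, output_variance(n, calibrated=False),
          output_variance(n, calibrated=True))
```

Under these assumptions the uncalibrated variance comes out close to the fan-in itself, while the calibrated variance stays close to 1 regardless of the number of inputs.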
