
Building a Residual Network (ResNet) with TensorLayer: an MNIST Handwritten-Digit Recognition Example

I have been studying residual networks recently, and they work remarkably well: even deep networks converge quickly.
The code here builds a 17-layer network that reaches over 96% accuracy within 5 epochs.
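For context, the core of a residual block is the identity shortcut y = F(x) + x; in the TensorLayer 1.x API used below, this can be expressed with an ElementwiseLayer that adds a saved branch back onto the stacked output. A minimal sketch, assuming two dense layers per shortcut for brevity (the helper name and layer names are illustrative, not from the full script below):

import tensorflow as tf
import tensorlayer as tl

def residual_block(network, n_units, idx):
    """One shortcut: save the incoming branch, stack two dense layers, add the branch back."""
    shortcut = network
    network = tl.layers.DenseLayer(network, n_units=n_units, act=tf.nn.elu, name='res%d_dense1' % idx)
    network = tl.layers.DenseLayer(network, n_units=n_units, act=tf.nn.elu, name='res%d_dense2' % idx)
    # Element-wise addition implements y = F(x) + x
    return tl.layers.ElementwiseLayer([network, shortcut], combine_fn=tf.add, name='res%d_add' % idx)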

Notation: loss is the training loss, acc the accuracy.

However, I noticed a few issues:
1. During training, the loss first decreases and then keeps increasing, while the accuracy keeps rising (a numeric sketch after this list shows how this can happen).
2. With PReLU units, the loss grows faster, and with longer training the accuracy actually drops.
3. Compared with PReLU, ELU units show a much slower loss increase and the accuracy keeps improving, whereas the PReLU accuracy falls off late in training.
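Observation 1 is not necessarily a contradiction: accuracy only checks the argmax, while cross-entropy also penalizes confidence, so a handful of confidently wrong predictions can inflate the mean loss even as more samples are classified correctly. A small numeric illustration with made-up values:

import numpy as np

def metrics(probs, labels):
    """Accuracy counts only the argmax; cross-entropy also penalizes confidence."""
    acc = np.mean(np.argmax(probs, axis=1) == labels)
    loss = -np.mean(np.log(probs[np.arange(len(labels)), labels]))
    return acc, loss

labels = np.array([0, 1, 0, 1])

# Early in training: hesitant predictions, only 2 of 4 correct.
early = np.array([[0.55, 0.45],   # correct, low confidence
                  [0.45, 0.55],   # correct, low confidence
                  [0.45, 0.55],   # wrong
                  [0.55, 0.45]])  # wrong

# Later: 3 of 4 correct, but the remaining error is extremely confident.
late = np.array([[0.999, 0.001],
                 [0.001, 0.999],
                 [0.999, 0.001],
                 [0.999, 0.001]])  # confidently wrong: the true class gets prob 0.001

print(metrics(early, labels))  # acc 0.50, loss ~0.70
print(metrics(late, labels))   # acc 0.75, loss ~1.73 -- accuracy and loss rose together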

I avoid plain ReLU here mainly out of concern about the dying-ReLU problem: once a unit's pre-activation stays negative, it outputs zero and receives zero gradient, so it can never recover.
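To make the difference concrete, here is a small comparison of the three activations on negative inputs: ReLU outputs exactly zero (and passes no gradient back), while ELU and PReLU keep a signal alive. A minimal sketch using standard TF 1.x ops; the fixed PReLU slope of 0.2 is an arbitrary choice for illustration (the real PReLU layer learns this slope):

import numpy as np
import tensorflow as tf

x = tf.constant(np.array([-3.0, -1.0, 0.0, 1.0], dtype=np.float32))
relu = tf.nn.relu(x)            # negative inputs -> 0: zero gradient, units can "die"
elu = tf.nn.elu(x)              # negative inputs -> exp(x) - 1: saturates near -1 but stays differentiable
prelu = tf.maximum(0.2 * x, x)  # PReLU with a fixed slope of 0.2 for illustration

with tf.Session() as s:
    print(s.run([relu, elu, prelu]))
    # relu:  [ 0.    0.    0.  1.]
    # elu:   [-0.95 -0.63  0.  1.]
    # prelu: [-0.6  -0.2   0.  1.]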

The complete code:

import tensorflow as tf
import tensorlayer as tl

sess = tf.InteractiveSession()

# Prepare the data
X_train, y_train, X_val, y_val, X_test, y_test = tl.files.load_mnist_dataset(shape=(-1, 784))

# Define the placeholders
x = tf.placeholder(tf.float32, shape=[None, 784], name='x')
y_ = tf.placeholder(tf.int64, shape=[None, ], name='y_')

# Define the model: stacks of dense layers with identity shortcuts added back in.
# (The layer names say "relu", but the activation actually used is ELU.)
network = tl.layers.InputLayer(x, name='input_layer')
res_a = network = tl.layers.DenseLayer(network, n_units=200, act=tf.nn.elu, name='relu1')
network = tl.layers.DenseLayer(network, n_units=200, act=tf.nn.elu, name='relu2')
network = tl.layers.DenseLayer(network, n_units=200, act=tf.nn.elu, name='relu3')
res_a = network = tl.layers.ElementwiseLayer([network, res_a], combine_fn=tf.add, name='res_add1')
network = tl.layers.DenseLayer(network, n_units=200, act=tf.nn.elu, name='relu4')
network = tl.layers.DenseLayer(network, n_units=200, act=tf.nn.elu, name='relu5')
res_a = network = tl.layers.ElementwiseLayer([network, res_a], combine_fn=tf.add, name='res_add2')
network = tl.layers.DenseLayer(network, n_units=200, act=tf.nn.elu, name='relu6')
network = tl.layers.DenseLayer(network, n_units=200, act=tf.nn.elu, name='relu7')
res_a = network = tl.layers.ElementwiseLayer([network, res_a], combine_fn=tf.add, name='res_add3')
network = tl.layers.DenseLayer(network, n_units=200, act=tf.nn.elu, name='relu8')
network = tl.layers.DenseLayer(network, n_units=200, act=tf.nn.elu, name='relu9')
res_a = network = tl.layers.ElementwiseLayer([network, res_a], combine_fn=tf.add, name='res_add4')
network = tl.layers.DenseLayer(network, n_units=200, act=tf.nn.elu, name='relu10')
network = tl.layers.DenseLayer(network, n_units=200, act=tf.nn.elu, name='relu11')
res_a = network = tl.layers.ElementwiseLayer([network, res_a], combine_fn=tf.add, name='res_add5')
network = tl.layers.DenseLayer(network, n_units=10, act=tf.identity, name='output_layer')

# Define the cost function and metrics.
# tl.cost.cross_entropy uses tf.nn.sparse_softmax_cross_entropy_with_logits() internally,
# so the softmax is applied inside the loss and the network outputs raw logits.
y = network.outputs
cost = tl.cost.cross_entropy(y, y_, name='cost')
correct_prediction = tf.equal(tf.argmax(y, 1), y_)
acc = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
y_op = tf.argmax(tf.nn.softmax(y), 1)

# Define the optimizer
train_params = network.all_params
train_op = tf.train.AdamOptimizer(learning_rate=0.003, beta1=0.9, beta2=0.999,
                                  epsilon=1e-08, use_locking=False).minimize(cost, var_list=train_params)

# Initialize all variables in the session
tl.layers.initialize_global_variables(sess)

# Print model information
network.print_params()
network.print_layers()

# Train the model
tl.utils.fit(sess, network, train_op, cost, X_train, y_train, x, y_,
             acc=acc, batch_size=500, n_epoch=500, print_freq=5,
             X_val=X_val, y_val=y_val, eval_train=False)

# Evaluate the model
tl.utils.test(sess, network, acc, X_test, y_test, x, y_, batch_size=None, cost=cost)

# Save the model as an .npz file
tl.files.save_npz(network.all_params, name='model.npz')

sess.close()
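To reuse the saved weights later, TensorLayer 1.x can restore the .npz file into a rebuilt graph. A brief sketch, assuming a fresh session in which network, x, and y_op have been rebuilt with the same layer definitions as above:

# Rebuild the same graph in a new session, then restore the saved parameters into it.
sess = tf.InteractiveSession()
tl.layers.initialize_global_variables(sess)
tl.files.load_and_assign_npz(sess=sess, name='model.npz', network=network)

# Predict with the restored network (y_op is the argmax op defined above).
predictions = sess.run(y_op, feed_dict={x: X_test[:10]})
print(predictions)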