
Training your own image data with tensorflow (3): building the network model

1. Overview

In the previous post, Training your own image data with tensorflow (2), we obtained the training inputs of the network: image_batch and label_batch. The next step is to build the network model. The model structure used here is as follows:

Input data: (batch_size, IMG_W, IMG_H, col_channel) = (20, 64, 64, 3)

Convolution layer 1: (conv_kernel, num_channel, num_out_neure) = (3, 3, 3, 64)

Pooling layer 1: (ksize, strides, padding) = ([1,3,3,1], [1,2,2,1], 'SAME')

Convolution layer 2: (conv_kernel, num_channel, num_out_neure) = (3, 3, 64, 16)

Pooling layer 2: (ksize, strides, padding) = ([1,3,3,1], [1,1,1,1], 'SAME')

Fully connected layer 1: (out_pool2_reshape, num_out_neure) = (dim, 128)

Fully connected layer 2: (fc1_out, num_out_neure) = (128, 128)

softmax layer: (fc2_out, num_classes) = (128, 4)

Activation function: tf.nn.relu

Loss function: tf.nn.sparse_softmax_cross_entropy_with_logits
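
With these settings the shapes propagate as follows (a quick sanity-check sketch, assuming the 64x64 RGB input and the SAME padding / strides listed above):

import math

h = w = 64                      # input spatial size
h = w = math.ceil(h / 2)        # pool1: ksize 3x3, stride 2, SAME -> 32x32
                                # pool2: ksize 3x3, stride 1, SAME -> spatial size unchanged
dim = h * w * 16                # conv2 outputs 16 channels
print(dim)                      # 16384: the input width (dim) of fully connected layer 1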

2. Implementation

#=========================================================================
import tensorflow as tf
#=========================================================================
#Network structure definition
    #Input: images, an image batch; a 4-D tf.float32 tensor of shape [batch_size, width, height, channels]
    #Returns: logits, float, [batch_size, n_classes]
def inference(images, batch_size, n_classes):
#A simple convolutional neural network: two conv + pooling layers, two fully connected layers, and a final softmax layer for classification.
#Convolution layer 1
#64 3x3 kernels (3 input channels); padding='SAME' means the convolved output keeps the spatial size of the input; activation relu()
    with tf.variable_scope('conv1') as scope:
        
        weights = tf.Variable(tf.truncated_normal(shape=[3,3,3,64], stddev = 1.0, dtype = tf.float32), 
                              name = 'weights', dtype = tf.float32)
        
        biases = tf.Variable(tf.constant(value = 0.1, dtype = tf.float32, shape = [64]),
                             name = 'biases', dtype = tf.float32)
        
        conv = tf.nn.conv2d(images, weights, strides=[1,1,1,1], padding='SAME')
        pre_activation = tf.nn.bias_add(conv, biases)
        conv1 = tf.nn.relu(pre_activation, name= scope.name)
        
#Pooling layer 1
#3x3 max pooling with stride 2, followed by lrn() (local response normalization), which benefits training.
    with tf.variable_scope('pooling1_lrn') as scope:
        pool1 = tf.nn.max_pool(conv1, ksize=[1,3,3,1],strides=[1,2,2,1],padding='SAME', name='pooling1')
        norm1 = tf.nn.lrn(pool1, depth_radius=4, bias=1.0, alpha=0.001/9.0, beta=0.75, name='norm1')

#Convolution layer 2
#16 3x3 kernels (64 input channels); padding='SAME' means the convolved output keeps the spatial size of the input; activation relu()
    with tf.variable_scope('conv2') as scope:
        weights = tf.Variable(tf.truncated_normal(shape=[3,3,64,16], stddev = 0.1, dtype = tf.float32), 
                              name = 'weights', dtype = tf.float32)
        
        biases = tf.Variable(tf.constant(value = 0.1, dtype = tf.float32, shape = [16]),
                             name = 'biases', dtype = tf.float32)
        
        conv = tf.nn.conv2d(norm1, weights, strides = [1,1,1,1],padding='SAME')
        pre_activation = tf.nn.bias_add(conv, biases)
        conv2 = tf.nn.relu(pre_activation, name='conv2')

#Pooling layer 2
#Here lrn() is applied first, then 3x3 max pooling with stride 1.
    #pool2 and norm2
    with tf.variable_scope('pooling2_lrn') as scope:
        norm2 = tf.nn.lrn(conv2, depth_radius=4, bias=1.0, alpha=0.001/9.0,beta=0.75,name='norm2')
        pool2 = tf.nn.max_pool(norm2, ksize=[1,3,3,1], strides=[1,1,1,1],padding='SAME',name='pooling2')

#Fully connected layer 3
#128 neurons; the output of the previous pooling layer is reshaped into one row per sample; activation relu()
    with tf.variable_scope('local3') as scope:
        reshape = tf.reshape(pool2, shape=[batch_size, -1])
        dim = reshape.get_shape()[1].value
        weights = tf.Variable(tf.truncated_normal(shape=[dim,128], stddev = 0.005, dtype = tf.float32),
                             name = 'weights', dtype = tf.float32)
        
        biases = tf.Variable(tf.constant(value = 0.1, dtype = tf.float32, shape = [128]), 
                             name = 'biases', dtype=tf.float32)
        
        local3 = tf.nn.relu(tf.matmul(reshape, weights) + biases, name=scope.name)
        
#Fully connected layer 4
#128 neurons, activation relu()
    with tf.variable_scope('local4') as scope:
        weights = tf.Variable(tf.truncated_normal(shape=[128,128], stddev = 0.005, dtype = tf.float32),
                              name = 'weights',dtype = tf.float32)
        
        biases = tf.Variable(tf.constant(value = 0.1, dtype = tf.float32, shape = [128]),
                             name = 'biases', dtype = tf.float32)
        
        local4 = tf.nn.relu(tf.matmul(local3, weights) + biases, name='local4')

#dropout layer
#    with tf.variable_scope('dropout') as scope:
#        drop_out = tf.nn.dropout(local4, 0.8)
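#    (A sketch, not part of the original code: to actually use dropout, create a keep-probability
#    placeholder, e.g. keep_prob = tf.placeholder(tf.float32), compute
#    drop_out = tf.nn.dropout(local4, keep_prob), feed keep_prob=0.8 during training and 1.0
#    during evaluation, and pass drop_out instead of local4 to the softmax layer below.)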
            
        
#Softmax regression layer
#Applies a linear transform to the output of the preceding FC layer to compute a score for each class; with n_classes classes (4 in this post), this layer outputs n_classes scores.
    with tf.variable_scope('softmax_linear') as scope:
        weights = tf.Variable(tf.truncated_normal(shape=[128, n_classes], stddev = 0.005, dtype = tf.float32),
                              name = 'softmax_linear', dtype = tf.float32)
        
        biases = tf.Variable(tf.constant(value = 0.1, dtype = tf.float32, shape = [n_classes]),
                             name = 'biases', dtype = tf.float32)
        
        softmax_linear = tf.add(tf.matmul(local4, weights), biases, name='softmax_linear')

    return softmax_linear

#-----------------------------------------------------------------------------
#Loss computation
    #Input: logits, the network output; labels, the ground-truth class indices (integers in [0, n_classes))
    #Returns: loss, the loss value
def losses(logits, labels):
    with tf.variable_scope('loss') as scope:
        cross_entropy =tf.nn.sparse_softmax_cross_entropy_with_logits(logits=logits, labels=labels, name='xentropy_per_example')
        loss = tf.reduce_mean(cross_entropy, name='loss')
        tf.summary.scalar(scope.name+'/loss', loss)
    return loss

#--------------------------------------------------------------------------
#Loss optimization
    #Input: loss; learning_rate, the learning rate.
    #Returns: train_op, the training op; pass it to sess.run to make the model train.
def trainning(loss, learning_rate):
    with tf.name_scope('optimizer'):
        optimizer = tf.train.AdamOptimizer(learning_rate= learning_rate)
        global_step = tf.Variable(0, name='global_step', trainable=False)
        train_op = optimizer.minimize(loss, global_step= global_step)
    return train_op

#-----------------------------------------------------------------------
#Evaluation / accuracy computation
    #Input: logits, the network output; labels, the ground-truth class indices.
    #Returns: accuracy, the average accuracy for the current step, i.e. the fraction of images in these batches that were classified correctly.
def evaluation(logits, labels):
    with tf.variable_scope('accuracy') as scope:
        correct = tf.nn.in_top_k(logits, labels, 1)
        correct = tf.cast(correct, tf.float16)
        accuracy = tf.reduce_mean(correct)
        tf.summary.scalar(scope.name+'/accuracy', accuracy)
    return accuracy

#========================================================================
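
A minimal usage sketch (my own illustration, not from the original post) of how these functions fit together with the image_batch and label_batch produced in part (2); the batch size, class count and learning rate are just the example values used in this series:

train_logits = inference(image_batch, batch_size=20, n_classes=4)
train_loss = losses(train_logits, label_batch)
train_op = trainning(train_loss, learning_rate=0.0001)
train_acc = evaluation(train_logits, label_batch)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # image_batch/label_batch come from a queue pipeline, so start the queue runners
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(sess=sess, coord=coord)
    try:
        _, loss_val, acc_val = sess.run([train_op, train_loss, train_acc])
        print('loss = %.4f, accuracy = %.2f%%' % (loss_val, acc_val * 100))
    finally:
        coord.request_stop()
        coord.join(threads)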
3. Supplement

The local response normalization function in tensorflow: tf.nn.lrn

tf.nn.lrn(input, depth_radius=None, bias=None, alpha=None, beta=None, name=None)

       input is a 4-D tensor whose dtype must be float.

       depth_radius is an int scalar giving the half-width of the channel window included in the normalization.

       bias is an offset term.

       alpha is a scale factor, applied after the squared activations within the window have been summed.

       beta is the exponent.
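
To make these parameters concrete, here is a NumPy sketch of the computation tf.nn.lrn performs (normalization runs over the last, i.e. channel, dimension; the window covers 2*depth_radius+1 channels):

import numpy as np

def lrn(x, depth_radius=4, bias=1.0, alpha=0.001 / 9.0, beta=0.75):
    # x: 4-D array [batch, height, width, channels]
    out = np.empty_like(x)
    channels = x.shape[-1]
    for d in range(channels):
        lo, hi = max(0, d - depth_radius), min(channels, d + depth_radius + 1)
        sqr_sum = np.sum(x[..., lo:hi] ** 2, axis=-1)   # sum of squares over the channel window
        out[..., d] = x[..., d] / (bias + alpha * sqr_sum) ** beta
    return out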

LRN is one form of normalization; the purpose of normalization here is suppression, i.e. suppressing neuron outputs. The design of LRN borrows a concept from neurobiology called "lateral inhibition".

Lateral inhibition: nearby neurons inhibit one another. When one neuron is stimulated and becomes excited, and a neighboring neuron is then stimulated as well, the excitation of the latter suppresses that of the former. In other words, lateral inhibition is the phenomenon in which adjacent receptors can mutually suppress each other.

Note: for more details, see the blog post http://blog.csdn.net/gzhermit/article/details/75389130