
[ Keras ] Basic usage (2): fine-tune + freezing layers + extracting a layer's output

1. Freezing layers (i.e., fixing a layer's parameters so they do not change during training)

1.1 Method

x = Dense(100,activation='relu',name='dense_100',trainable=False)(inputs)

or

model.trainable = False
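
Either way, note that a change to trainable only takes effect when the model is compiled (or re-compiled). A minimal sketch combining both forms (the optimizer and loss here are illustrative):

from keras.models import Model
from keras.layers import Input, Dense

inputs = Input(shape=(784,))
x = Dense(100, activation='relu', name='dense_100', trainable=False)(inputs)  # freeze one layer
outputs = Dense(10, activation='softmax')(x)
model = Model(inputs=inputs, outputs=outputs)
# model.trainable = False   # or freeze the whole model at once
model.compile(optimizer='sgd', loss='categorical_crossentropy')  # freezing takes effect here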

1.2 Practical notes on freezing:

1. Experiment: how freezing affects the weights during training

       1) Without freezing (reference values, probed right after loading the weights):

# ■■■■■■■■ [2] Model definition ■■■■■■■■
####### Main model #######

inputs = Input(shape=(784,))
x = Dense(100,activation='relu',name='dense_100')(inputs)
outputs = Dense(10,activation='softmax')(x)
model_main = Model(inputs=inputs, outputs=outputs)

####### Main model #######

model_main.load_weights('my_model_weights.h5')
 
# model_main.trainable = False 
 
model = model_main
 
# ****** weight probe ******
a = model.get_weights()
print("the dense_100 layer's bias:", a[1])  # a[1] is the bias vector of 'dense_100'; get_weights() returns [kernel, bias, ...] per layer.
# ****** weight probe ******
 
[Result]
>>> the dense_100 layer's bias:
[ 0.00609367  0.01774433  0.00127991  0.01685369 -0.00588948  0.0022781
  0.00694803  0.00636634 -0.00108383 -0.00480387  0.01123319  0.01685128
  0.0071973   0.00373418  0.0015275  -0.0011526  -0.00451979 -0.00653248
  0.01192301 -0.00078739 -0.00056679 -0.00057205  0.0220937  -0.00158271
 -0.00026968 -0.00664996 -0.00085808 -0.00305471  0.00620055  0.0064344
 -0.00938795  0.00266371  0.00623808  0.0083605  -0.00238177 -0.00048903
  0.00059158  0.00824707  0.00500612  0.00873516 -0.0032067   0.00337419
  0.01087511  0.004928    0.01195703  0.01690748  0.01420193 -0.0064415
  0.00545023  0.01340502 -0.00258121  0.01323839  0.00632899  0.01284719
  0.00555667  0.01261076 -0.00088008  0.01200596  0.00733639  0.01783392
 -0.00440101  0.00118115  0.01178464  0.0074486   0.00896501  0.00357948
  0.00705922  0.00520497  0.01415215 -0.00202574  0.00927804  0.0138014
  0.0098721   0.0129296   0.00189565  0.01651774  0.00946718 -0.00534614
  0.00506906 -0.00030766 -0.00026362  0.00419401  0.00212149 -0.00304823
 -0.00427098  0.0041138   0.01505729  0.00112592 -0.00334759  0.00820872
 -0.01345768 -0.00101386 -0.00698254  0.02179425  0.00819413  0.00404393
 -0.00315165  0.01334981  0.01426365  0.00202925]
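
For contrast: if the model above is compiled and trained without freezing, the same probe shows the values move. A minimal sketch continuing from the code above (x_train/y_train prepared as in section 3):

model.compile(optimizer=SGD(lr=0.2), loss='mse', metrics=['accuracy'])
model.fit(x_train, y_train, batch_size=256, epochs=1)
print("dense_100 bias after unfrozen training:", model.get_weights()[1])  # values now differ from the printout above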

      2) Frozen, with the weight probe placed after model.fit:

# ■■■■■■■■ [2] Model definition ■■■■■■■■

####### Main model #######

inputs = Input(shape=(784,))
x = Dense(100,activation='relu',name='dense_100')(inputs)
outputs = Dense(10,activation='softmax')(x)
model_main = Model(inputs=inputs, outputs=outputs)

####### Main model #######


model_main.load_weights('my_model_weights.h5')

model_main.trainable = False 

model = model_main

# ■■■■■■■■ [3] Model compilation ■■■■■■■■
# define the optimizer
sgd = SGD(lr=0.2)

# compile: set the loss function and track accuracy during training
model.compile(optimizer = sgd,
              loss = 'mse',
              metrics=['accuracy'],
              )


# ■■■■■■■■ [4] Model training ■■■■■■■■

model.fit(x_train,y_train,batch_size=256,epochs=1)  # use this to train the main model

# ****** weight probe ******
a = model.get_weights()
print("after training with the model frozen, the dense_100 layer's bias:", a[1])  # bias vector of 'dense_100'
# ****** weight probe ******

[Result]
>>> after training with the model frozen, the dense_100 layer's bias: (unchanged!)
[ 0.00609367  0.01774433  0.00127991  0.01685369 -0.00588948  0.0022781
  0.00694803  0.00636634 -0.00108383 -0.00480387  0.01123319  0.01685128
  0.0071973   0.00373418  0.0015275  -0.0011526  -0.00451979 -0.00653248
  0.01192301 -0.00078739 -0.00056679 -0.00057205  0.0220937  -0.00158271
 -0.00026968 -0.00664996 -0.00085808 -0.00305471  0.00620055  0.0064344
 -0.00938795  0.00266371  0.00623808  0.0083605  -0.00238177 -0.00048903
  0.00059158  0.00824707  0.00500612  0.00873516 -0.0032067   0.00337419
  0.01087511  0.004928    0.01195703  0.01690748  0.01420193 -0.0064415
  0.00545023  0.01340502 -0.00258121  0.01323839  0.00632899  0.01284719
  0.00555667  0.01261076 -0.00088008  0.01200596  0.00733639  0.01783392
 -0.00440101  0.00118115  0.01178464  0.0074486   0.00896501  0.00357948
  0.00705922  0.00520497  0.01415215 -0.00202574  0.00927804  0.0138014
  0.0098721   0.0129296   0.00189565  0.01651774  0.00946718 -0.00534614
  0.00506906 -0.00030766 -0.00026362  0.00419401  0.00212149 -0.00304823
 -0.00427098  0.0041138   0.01505729  0.00112592 -0.00334759  0.00820872
 -0.01345768 -0.00101386 -0.00698254  0.02179425  0.00819413  0.00404393
 -0.00315165  0.01334981  0.01426365  0.00202925]
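
Rather than comparing the arrays by eye, the freeze can be verified programmatically. A minimal probe (illustrative, not from the original) around model.fit:

import numpy as np

before = [w.copy() for w in model.get_weights()]   # snapshot before training
model.fit(x_train, y_train, batch_size=256, epochs=1)
after = model.get_weights()
print('weights unchanged:', all(np.array_equal(b, a) for b, a in zip(before, after)))  # True when frozen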

2. Caveats when freezing a model restored from disk

      1) To freeze a model restored from disk, save it as [architecture (model.to_json()) + weights (model.save_weights())] rather than as a single file (a save-side sketch follows the second result below).

# parameters of the model when built normally:

=================================================================
Total params: 2,100,362
Trainable params: 2,100,362
Non-trainable params: 0
_________________________________________________________________

              Reason:

              > A model restored via model.save() + load_model() reports wrong parameter counts once you freeze it.

from keras.models import load_model
model1 = load_model('CIFAR10_model_epoch_1.h5')

model1.trainable = False

model1.summary()

# ———— inspect which weights are trainable ————
print('trainable weights:')
for x in model1.trainable_weights:
    print(x.name)
    print('\n')
# —————————————————————————————————

[Result]
>>>
=================================================================
Total params: 4,200,724   (why did the total param count double?)
Trainable params: 2,100,362 (and why are there still trainable parameters?)
Non-trainable params: 2,100,362
_________________________________________________________________

trainable weights:    (yet the list of trainable weights is empty; strange!)
(none)
 

             > With [architecture (model.to_json()) + weights (model.save_weights())], freezing behaves correctly.

from keras.models import model_from_json
model1 = model_from_json(open('my_model_architecture.json').read())

model1.trainable = False

model1.load_weights('model_weight_epoch_1.h5')

model1.summary()

[Result]
>>>
=================================================================
Total params: 2,100,362
Trainable params: 0   (correct!)
Non-trainable params: 2,100,362
_________________________________________________________________
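
For completeness, the save side of the recommended route. A minimal sketch (file names chosen to match the ones loaded above):

# save the architecture as JSON and the weights separately
with open('my_model_architecture.json', 'w') as f:
    f.write(model.to_json())
model.save_weights('model_weight_epoch_1.h5')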

3. If a layer is defined with trainable=False, e.g.:

y = Dense(units=128, activation='relu', kernel_initializer='he_normal',trainable=False)(y)

then model.trainable = True cannot change that layer's frozen state:

x = Input(shape=(32, 32, 3))
y = x
y = Convolution2D(filters=64, kernel_size=3, strides=1, padding='same', activation='relu', kernel_initializer='he_normal')(y)
y = MaxPooling2D(pool_size=2, strides=2, padding='valid')(y)
y = Flatten()(y)
y = Dense(units=128, activation='relu', kernel_initializer='he_normal',trainable=False)(y)
y = Dropout(0.5)(y)
y = Dense(units=nb_classes, activation='softmax', kernel_initializer='he_normal')(y)

model1 = Model(inputs=x, outputs=y, name='model1')

model1.trainable = True  # note: make the whole model trainable

model1.summary()

# ———— inspect which weights are non-trainable ————
print('non-trainable weights:')
for x in model1.non_trainable_weights:
    print(x.name)
print('\n')
# —————————————————————————————————

[Result]
>>>
=================================================================
Total params: 2,100,362
Trainable params: 3,082   
Non-trainable params: 2,097,280 (see: the frozen layer's parameters are still non-trainable)
_________________________________________________________________

non-trainable weights: (the names of the parameters that still cannot train)
dense_1/kernel:0 
dense_1/bias:0

The layer can, however, be un-frozen via model.layers[4].trainable = True:

x = Input(shape=(32, 32, 3))
y = x
y = Convolution2D(filters=64, kernel_size=3, strides=1, padding='same', activation='relu', kernel_initializer='he_normal')(y)
y = MaxPooling2D(pool_size=2, strides=2, padding='valid')(y)
y = Flatten()(y)
y = Dense(units=128, activation='relu', kernel_initializer='he_normal',trainable=False)(y)
y = Dropout(0.5)(y)
y = Dense(units=nb_classes, activation='softmax', kernel_initializer='he_normal')(y)

model1 = Model(inputs=x, outputs=y, name='model1')

model1.layers[4].trainable = True # note: make the Dense(128) layer trainable again

model1.summary()

# ———— inspect which weights are non-trainable ————
print('non-trainable weights:')
for x in model1.non_trainable_weights:
    print(x.name)
print('\n')
# —————————————————————————————————

[Result]
>>>
=================================================================
Total params: 2,100,362
Trainable params: 2,100,362
Non-trainable params: 0    (nothing is frozen any more!)
_________________________________________________________________

non-trainable weights:
(none)
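
Hard-coding index 4 assumes you know the layer order; a quick way to check it (a sketch):

# print each layer's index and name to locate the layer to unfreeze
for i, layer in enumerate(model1.layers):
    print(i, layer.name)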

4. How to inspect trainable and non-trainable weights

Via model.trainable_weights (trainable weights):

print('names of trainable weights:')
for x in model.trainable_weights:
    print(x.name)
print('\n')

Via model.non_trainable_weights (non-trainable weights):

print('names of non-trainable weights:')
for x in model.non_trainable_weights:
    print(x.name)
print('\n')
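The two lists can also be combined into a per-layer report. An illustrative helper (not from the original), assuming a built Keras Model:

def report_trainable(model):
    # print each layer's name, trainable flag, and parameter count
    for layer in model.layers:
        print(layer.name, 'trainable=%s' % layer.trainable,
              'params=%d' % layer.count_params())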

2. Extracting a layer's output

# ■■■■■■■■ [2] Model definition ■■■■■■■■

# ———— Main model
inputs = Input(shape=(784,))
x = Dense(100,activation='relu')(inputs)
outputs = Dense(10,activation='softmax')(x)

model_main = Model(inputs=inputs, outputs=outputs)

model_main.load_weights('my_model_weights.h5')
# model_main.trainable = False

# model = model_main

# ———— extract the output of the 'dense_1' layer
layer_name = 'dense_1'
intermediate_layer_model = Model(inputs=inputs,
                                 outputs=model_main.get_layer(layer_name).output)
x_train_Dense = intermediate_layer_model.predict(x_train) # feed x_train through to get the 'dense_1' output

print('x_train_Dense',x_train_Dense)
print('x_train_Dense.shape',x_train_Dense.shape)
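
An equivalent alternative builds a backend function instead of a second Model. A minimal sketch, assuming model_main from above:

from keras import backend as K

# map the model input directly to the 'dense_1' layer output
get_dense1 = K.function([model_main.input], [model_main.get_layer('dense_1').output])
x_train_Dense = get_dense1([x_train])[0]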

3. Fine-tune

(1) Fine-tune with the main model held fixed (not frozen; it simply never enters training)

import numpy as np
from keras.datasets import mnist
from keras.utils import np_utils
from keras.models import Model
from keras.layers import Input, Dense
from keras.optimizers import SGD
from keras import backend as K

# ■■■■■■■■ [1] Data loading ■■■■■■■■
(x_train,y_train),(x_test,y_test) = mnist.load_data()
# (60000,28,28)
print('x_shape:',x_train.shape)
# (60000)
print('y_shape:',y_train.shape)
# (60000,28,28)->(60000,784)
x_train = x_train.reshape(x_train.shape[0],-1)/255.0
x_test = x_test.reshape(x_test.shape[0],-1)/255.0
# convert to one-hot format
y_train = np_utils.to_categorical(y_train,num_classes=10)
y_test = np_utils.to_categorical(y_test,num_classes=10)


# ■■■■■■■■ [2] Model definition ■■■■■■■■

# ———— Main model
inputs = Input(shape=(784,))
x = Dense(100,activation='relu')(inputs)
outputs = Dense(10,activation='softmax')(x)

model_main = Model(inputs=inputs, outputs=outputs)

model_main.load_weights('my_model_weights.h5')
# model_main.trainable = False

# model = model_main

# ———— extract the output of the 'dense_1' layer
layer_name = 'dense_1'
intermediate_layer_model = Model(inputs=inputs,
                                 outputs=model_main.get_layer(layer_name).output)
x_train_Dense = intermediate_layer_model.predict(x_train) # feed x_train through to get the 'dense_1' output

np.save('bottleneck_features.npy', x_train_Dense)  # save the extracted 'dense_1' features to a .npy file
train_data = np.load('bottleneck_features.npy')    # read the feature vectors back from the .npy file
print('x_train_Dense',train_data)
print('x_train_Dense.shape',train_data.shape)

# ———— Fine-tune model

inputs1 = Input(shape=(100,)) # the 'dense_1' output extracted above is 100-dimensional
x = Dense(100,activation='relu')(inputs1)
outputs1 = Dense(10,activation='softmax')(x)
model = Model(inputs1,outputs1)

model.summary()


# ■■■■■■■■ [3] Model compilation ■■■■■■■■
# define the optimizer
sgd = SGD(lr=0.2)

# compile: set the loss function and track accuracy during training
model.compile(optimizer = sgd,
              loss = 'mse',
              metrics=['accuracy'],
              )

# ■■■■■■■■ [4] Model training ■■■■■■■■

# model.fit(x_train,y_train,batch_size=64,epochs=1)  # use this to train the main model
model.fit(x_train_Dense, y_train,batch_size=64, epochs=1) # use this for the fine-tune model


# ■■■■■■■■ [5] Model evaluation ■■■■■■■■

# loss,accuracy = model.evaluate(x_test,y_test)

# print('\ntest loss',loss)
# print('accuracy',accuracy)

# # save and reload the weights
# model.save_weights('my_model_weights.h5')
# model.load_weights('my_model_weights.h5')

K.clear_session()
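
One caveat of this two-stage setup: at test time, new data must also pass through both stages. A minimal sketch (to be run before K.clear_session()):

# two-stage inference: raw pixels -> bottleneck features -> fine-tuned head
x_test_Dense = intermediate_layer_model.predict(x_test)
loss, accuracy = model.evaluate(x_test_Dense, y_test)
print('\ntest loss', loss)
print('accuracy', accuracy)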

(2) Fine-tune without freezing the main model: this also verifies that weights loaded into the main model carry over into the assembled fine-tune model.

# ■■■■■■■■ [2] Model definition ■■■■■■■■

# —————— Main model ——————
inputs = Input(shape=(784,))
x = Dense(100,activation='relu',name='dense_100')(inputs)
outputs = Dense(10,activation='softmax')(x)

model_main = Model(inputs=inputs, outputs=outputs)

model_main.load_weights('my_model_weights.h5')

# ———— (fine-tune assembly) extract the output of the 'dense_100' layer ——————
layer_name = 'dense_100'
intermediate_layer_model = Model(inputs=inputs,
                                 outputs=model_main.get_layer(layer_name).output)
outputs_inter = Dense(10,activation='softmax')(intermediate_layer_model.output)
model_inter = Model(inputs=inputs, outputs=outputs_inter)
model = model_inter

# ****** weight probe ******
a = model.get_weights()
print("after load_weights on the main model, with a new softmax head attached, the original dense_100 layer's bias:", a[1])  # bias vector of 'dense_100'
# ****** weight probe ******

[Result] (the loaded weights did indeed carry over into the fine-tune model)
>>> after load_weights on the main model, with a new softmax head attached, the original dense_100 layer's bias: 
[ 0.00609367  0.01774433  0.00127991  0.01685369 -0.00588948  0.0022781
  0.00694803  0.00636634 -0.00108383 -0.00480387  0.01123319  0.01685128
  0.0071973   0.00373418  0.0015275  -0.0011526  -0.00451979 -0.00653248
  0.01192301 -0.00078739 -0.00056679 -0.00057205  0.0220937  -0.00158271
 -0.00026968 -0.00664996 -0.00085808 -0.00305471  0.00620055  0.0064344
 -0.00938795  0.00266371  0.00623808  0.0083605  -0.00238177 -0.00048903
  0.00059158  0.00824707  0.00500612  0.00873516 -0.0032067   0.00337419
  0.01087511  0.004928    0.01195703  0.01690748  0.01420193 -0.0064415
  0.00545023  0.01340502 -0.00258121  0.01323839  0.00632899  0.01284719
  0.00555667  0.01261076 -0.00088008  0.01200596  0.00733639  0.01783392
 -0.00440101  0.00118115  0.01178464  0.0074486   0.00896501  0.00357948
  0.00705922  0.00520497  0.01415215 -0.00202574  0.00927804  0.0138014
  0.0098721   0.0129296   0.00189565  0.01651774  0.00946718 -0.00534614
  0.00506906 -0.00030766 -0.00026362  0.00419401  0.00212149 -0.00304823
 -0.00427098  0.0041138   0.01505729  0.00112592 -0.00334759  0.00820872
 -0.01345768 -0.00101386 -0.00698254  0.02179425  0.00819413  0.00404393
 -0.00315165  0.01334981  0.01426365  0.00202925]
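
From here the assembled model trains end to end as usual. A minimal sketch (the reduced learning rate is a common fine-tuning choice, not from the original):

sgd = SGD(lr=0.01)  # smaller than for from-scratch training, so the loaded weights are not destroyed
model.compile(optimizer=sgd, loss='mse', metrics=['accuracy'])
model.fit(x_train, y_train, batch_size=64, epochs=1)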

(3) Keeping certain layers out of training.

           The model.layers approach:

# ——————————————————————— Main model ——————————————————————————

# .... omitted ....

model1 = Model(inputs=x, outputs=y, name='model1')

# ——————————————— only the last 3 of the 14 layers should train ——————————————

print('\n number of layers (parameter-free layers such as relu count too):',len(model1.layers))

model1.trainable = True     # to train only some layers: first make every layer trainable, then freeze the layers to exclude

# freeze the layers that should not be trained
for layer in model1.layers[:11]:
    layer.trainable = False

model1.summary()

# ————————————————————————————————————————————————————————————

[Result]
>>>
Total params: 1,671,114
Trainable params: 525,706
Non-trainable params: 1,145,408
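
Indexing by position breaks as soon as the architecture changes; selecting layers by name is a more robust variant (a sketch; the names are hypothetical):

# freeze everything except the named layers
trainable_names = {'dense_2', 'dense_3'}   # illustrative names for the layers that should train
for layer in model1.layers:
    layer.trainable = layer.name in trainable_names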