[ Keras ] —— Basic usage (2): fine-tuning, freezing layers, and extracting a layer's output
Posted by 阿新 on 2019-01-23
I. Freezing layers (fixing a layer's parameters so they do not change during training)
1.1 Methods:

x = Dense(100, activation='relu', name='dense_100', trainable=False)(inputs)

or

model.trainable = False
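Either route can be verified on a toy model by inspecting `trainable_weights`. A minimal sketch (the 784→100→10 shapes mirror the examples below; this snippet is mine, not from the original post):

```python
from keras.layers import Input, Dense
from keras.models import Model

inputs = Input(shape=(784,))
# Method 1: freeze a single layer at definition time
x = Dense(100, activation='relu', name='dense_100', trainable=False)(inputs)
outputs = Dense(10, activation='softmax')(x)
model = Model(inputs=inputs, outputs=outputs)

# Only the softmax layer's kernel and bias are left trainable
print(len(model.trainable_weights))      # 2
print(len(model.non_trainable_weights))  # 2

# Method 2: freeze the whole model in one go
model.trainable = False
print(len(model.trainable_weights))      # 0
```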
1.2 Lessons learned about freezing:
1. Experiment: how freezing affects the weights during training
1) Without freezing:
# ■■■■■■■■ [2] Model definition ■■■■■■■■
# ——————— main model ———————
inputs = Input(shape=(784,))
x = Dense(100, activation='relu', name='dense_100')(inputs)
outputs = Dense(10, activation='softmax')(x)
model_main = Model(inputs=inputs, outputs=outputs)
# ——————— main model ———————
model_main.load_weights('my_model_weights.h5')
# model_main.trainable = False
model = model_main
# ****** weight probe ******
a = model.get_weights()
print("Weights of layer 'dense_100':", a[1])  # a[1] is the bias vector of 'dense_100'
# ****** weight probe ******

[Result]
>>> Weights of layer 'dense_100':
[ 0.00609367 0.01774433 0.00127991 0.01685369 -0.00588948 0.0022781
 ... (100 values in total) ...
 -0.00315165 0.01334981 0.01426365 0.00202925]
2) With freezing, and the probe moved to after model.fit:
# ■■■■■■■■ [2] Model definition ■■■■■■■■
# ——————— main model ———————
inputs = Input(shape=(784,))
x = Dense(100, activation='relu', name='dense_100')(inputs)
outputs = Dense(10, activation='softmax')(x)
model_main = Model(inputs=inputs, outputs=outputs)
# ——————— main model ———————
model_main.load_weights('my_model_weights.h5')
model_main.trainable = False
model = model_main
# ■■■■■■■■ [3] Compile ■■■■■■■■
# define the optimizer
sgd = SGD(lr=0.2)
# compile: loss function, and report accuracy during training
model.compile(optimizer=sgd,
              loss='mse',
              metrics=['accuracy'])
# ■■■■■■■■ [4] Train ■■■■■■■■
model.fit(x_train, y_train, batch_size=256, epochs=1)  # training of the main model
# ****** weight probe ******
a = model.get_weights()
print("After training with the layer frozen, weights of 'dense_100':", a[1])
# ****** weight probe ******

[Result]
>>> After training with the layer frozen, weights of 'dense_100': (no change!)
[ 0.00609367 0.01774433 0.00127991 0.01685369 -0.00588948 0.0022781
 ... (identical to the values before training) ...
 -0.00315165 0.01334981 0.01426365 0.00202925]
2. Caveats when freezing a model that was saved and reloaded:
1) If you plan to freeze a reloaded model, save it as [architecture (model.to_json()) + weights (model.save_weights())].

# Parameters of the model, saved normally:
=================================================================
Total params: 2,100,362
Trainable params: 2,100,362
Non-trainable params: 0
_________________________________________________________________

Reason:
> A model obtained via model.save() + load_model() reports wrong weight counts when you freeze it.
from keras.models import load_model
model1 = load_model('CIFAR10_model_epoch_1.h5')
model1.trainable = False
model1.summary()
# ———— list the weights that take part in training ————
print('Trainable weights:')
for x in model1.trainable_weights:
    print(x.name)
print('\n')
# —————————————————————————————————
[Result]
>>>
=================================================================
Total params: 4,200,724 (why did the total double?)
Trainable params: 2,100,362 (and why are there still trainable params?)
Non-trainable params: 2,100,362
_________________________________________________________________
Trainable weights: (yet none are actually listed. Strange!)
(none)
> With [architecture (model.to_json()) + weights (model.save_weights())], freezing works as expected.
from keras.models import model_from_json
model1 = model_from_json(open('my_model_architecture.json').read())
model1.trainable = False
model1.load_weights('model_weight_epoch_1.h5')
model1.summary()
[Result]
>>>
=================================================================
Total params: 2,100,362
Trainable params: 0 (this one is correct!)
Non-trainable params: 2,100,362
_________________________________________________________________
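For completeness, the saving side of the recommended [to_json + save_weights] route can be sketched end to end (a toy model; the file names are illustrative, and newer Keras versions require weight files to end in `.weights.h5`):

```python
import os
import tempfile
import numpy as np
from keras.layers import Input, Dense
from keras.models import Model, model_from_json

# A small stand-in model
inputs = Input(shape=(784,))
x = Dense(100, activation='relu', name='dense_100')(inputs)
outputs = Dense(10, activation='softmax')(x)
model = Model(inputs=inputs, outputs=outputs)

# Save architecture and weights separately
workdir = tempfile.mkdtemp()
arch_path = os.path.join(workdir, 'my_model_architecture.json')
weights_path = os.path.join(workdir, 'my_model.weights.h5')
with open(arch_path, 'w') as f:
    f.write(model.to_json())
model.save_weights(weights_path)

# Restore: rebuild from JSON, freeze, then load the weights
with open(arch_path) as f:
    model1 = model_from_json(f.read())
model1.trainable = False
model1.load_weights(weights_path)

print(len(model1.trainable_weights))  # 0 -- the freeze behaves correctly
print(np.allclose(model.get_weights()[1], model1.get_weights()[1]))  # True
```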
3. If the layer definition itself contains trainable=False:
y = Dense(units=128, activation='relu', kernel_initializer='he_normal', trainable=False)(y)
then setting model.trainable = True will not undo that layer's frozen state:
x = Input(shape=(32, 32, 3))
y = x
y = Convolution2D(filters=64, kernel_size=3, strides=1, padding='same', activation='relu', kernel_initializer='he_normal')(y)
y = MaxPooling2D(pool_size=2, strides=2, padding='valid')(y)
y = Flatten()(y)
y = Dense(units=128, activation='relu', kernel_initializer='he_normal',trainable=False)(y)
y = Dropout(0.5)(y)
y = Dense(units=nb_classes, activation='softmax', kernel_initializer='he_normal')(y)
model1 = Model(inputs=x, outputs=y, name='model1')
model1.trainable = True  # look! supposedly this makes every layer trainable
model1.summary()
# ———— list the weights that do NOT take part in training ————
print('Non-trainable weights:')
for x in model1.non_trainable_weights:
    print(x.name)
print('\n')
# —————————————————————————————————
[Result]
>>>
=================================================================
Total params: 2,100,362
Trainable params: 3,082
Non-trainable params: 2,097,280 (some parameters still cannot be trained)
_________________________________________________________________
Non-trainable weights: (the names of the still-frozen parameters)
dense_1/kernel:0
dense_1/bias:0
But it can be changed through model.layers[4].trainable = True (layer index 4 here is the Dense-128 layer):
x = Input(shape=(32, 32, 3))
y = x
y = Convolution2D(filters=64, kernel_size=3, strides=1, padding='same', activation='relu', kernel_initializer='he_normal')(y)
y = MaxPooling2D(pool_size=2, strides=2, padding='valid')(y)
y = Flatten()(y)
y = Dense(units=128, activation='relu', kernel_initializer='he_normal',trainable=False)(y)
y = Dropout(0.5)(y)
y = Dense(units=nb_classes, activation='softmax', kernel_initializer='he_normal')(y)
model1 = Model(inputs=x, outputs=y, name='model1')
model1.layers[4].trainable = True  # look! now the Dense-128 layer is trainable
model1.summary()
# ———— list the weights that do NOT take part in training ————
print('Non-trainable weights:')
for x in model1.non_trainable_weights:
    print(x.name)
print('\n')
# —————————————————————————————————
[Result]
>>>
=================================================================
Total params: 2,100,362
Trainable params: 2,100,362
Non-trainable params: 0 (no frozen parameters left!)
_________________________________________________________________
Non-trainable weights:
(none)
4. How to inspect the trainable and non-trainable weights:
Method: model.trainable_weights (trainable weights)

print('Names of the trainable weights:')
for x in model.trainable_weights:
    print(x.name)
print('\n')

Method: model.non_trainable_weights (non-trainable weights)

print('Names of the non-trainable weights:')
for x in model.non_trainable_weights:
    print(x.name)
print('\n')
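The Trainable/Non-trainable counts that model.summary() prints can also be recomputed from these two lists. A small sketch (the toy 784→100→10 shapes are mine, not from the post):

```python
import numpy as np
from keras.layers import Input, Dense
from keras.models import Model

inputs = Input(shape=(784,))
x = Dense(100, activation='relu', name='dense_100', trainable=False)(inputs)
outputs = Dense(10, activation='softmax')(x)
model = Model(inputs=inputs, outputs=outputs)

# Sum the element counts of each weight tensor in the two groups
n_trainable = sum(int(np.prod(list(w.shape))) for w in model.trainable_weights)
n_frozen = sum(int(np.prod(list(w.shape))) for w in model.non_trainable_weights)
print('Trainable params:', n_trainable)   # 100*10 + 10 = 1010
print('Non-trainable params:', n_frozen)  # 784*100 + 100 = 78500
```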
II. Extracting a layer's output
# ■■■■■■■■ [2] Model definition ■■■■■■■■
# ———— main model
inputs = Input(shape=(784,))
x = Dense(100,activation='relu')(inputs)
outputs = Dense(10,activation='softmax')(x)
model_main = Model(inputs=inputs, outputs=outputs)
model_main.load_weights('my_model_weights.h5')
# model_main.trainable = False
# model = model_main
# ———— extract the output of the 'dense_1' layer
layer_name = 'dense_1'
intermediate_layer_model = Model(inputs=inputs,
                                 outputs=model_main.get_layer(layer_name).output)
x_train_Dense = intermediate_layer_model.predict(x_train)  # feed x_train through to get the 'dense_1' output
print('x_train_Dense', x_train_Dense)
print('x_train_Dense.shape', x_train_Dense.shape)
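The same pattern, condensed into a self-contained sketch that uses random data instead of MNIST (the layer name 'feat' and the batch size are illustrative):

```python
import numpy as np
from keras.layers import Input, Dense
from keras.models import Model

inputs = Input(shape=(784,))
x = Dense(100, activation='relu', name='feat')(inputs)
outputs = Dense(10, activation='softmax')(x)
model_main = Model(inputs=inputs, outputs=outputs)

# A second Model that shares the same layers but stops at the 'feat' layer
intermediate_layer_model = Model(inputs=inputs,
                                 outputs=model_main.get_layer('feat').output)

x_batch = np.random.rand(32, 784).astype('float32')
features = intermediate_layer_model.predict(x_batch)
print(features.shape)  # (32, 100)
```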
III. Fine-tuning
(1) Fine-tuning with the main model held fixed (simply not trained at all, rather than frozen)
import numpy as np
from keras.datasets import mnist
from keras.utils import np_utils
from keras.models import Sequential
from keras.models import Model
from keras.layers import Input,Dense,Conv2D,Activation,MaxPooling2D,Flatten,merge,Conv2DTranspose,ZeroPadding2D
from keras.regularizers import l2
from keras.layers import Dense
from keras.optimizers import SGD
from keras import backend as K
# ■■■■■■■■ [1] Load data ■■■■■■■■
(x_train,y_train),(x_test,y_test) = mnist.load_data()
# (60000, 28, 28)
print('x_shape:',x_train.shape)
# (60000,)
print('y_shape:',y_train.shape)
# (60000,28,28) -> (60000,784), scaled to [0,1]
x_train = x_train.reshape(x_train.shape[0],-1)/255.0
x_test = x_test.reshape(x_test.shape[0],-1)/255.0
# convert labels to one-hot
y_train = np_utils.to_categorical(y_train,num_classes=10)
y_test = np_utils.to_categorical(y_test,num_classes=10)
# ■■■■■■■■ [2] Model definition ■■■■■■■■
# ———— main model
inputs = Input(shape=(784,))
x = Dense(100,activation='relu')(inputs)
outputs = Dense(10,activation='softmax')(x)
model_main = Model(inputs=inputs, outputs=outputs)
model_main.load_weights('my_model_weights.h5')
# model_main.trainable = False
# model = model_main
# ———— extract the output of the 'dense_1' layer
layer_name = 'dense_1'
intermediate_layer_model = Model(inputs=inputs,
                                 outputs=model_main.get_layer(layer_name).output)
x_train_Dense = intermediate_layer_model.predict(x_train)  # feed x_train through to get the 'dense_1' output
np.save('bottleneck_features.npy', x_train_Dense)  # save the extracted 'dense_1' features to a .npy file
train_data = np.load('bottleneck_features.npy')    # read the feature vectors back from the .npy file
print('x_train_Dense', train_data)
print('x_train_Dense.shape', train_data.shape)
# ———— fine-tune model
inputs1 = Input(shape=(100,))  # the extracted 'dense_1' output of model_main is 100-dimensional
x = Dense(100,activation='relu')(inputs1)
outputs1 = Dense(10,activation='softmax')(x)
model = Model(inputs1,outputs1)
model.summary()
# ■■■■■■■■ [3] Compile ■■■■■■■■
# define the optimizer
sgd = SGD(lr=0.2)
# compile: loss function, and report accuracy during training
model.compile(optimizer = sgd,
              loss = 'mse',
              metrics=['accuracy'],
              )
# ■■■■■■■■ [4] Train ■■■■■■■■
# model.fit(x_train,y_train,batch_size=64,epochs=1)         # use this to train the main model
model.fit(x_train_Dense, y_train, batch_size=64, epochs=1)  # use this to train the fine-tune model
# ■■■■■■■■ [5] Evaluate ■■■■■■■■
# loss,accuracy = model.evaluate(x_test,y_test)
# print('\ntest loss',loss)
# print('accuracy',accuracy)
# # save / load the weights
# model.save_weights('my_model_weights.h5')
# model.load_weights('my_model_weights.h5')
K.clear_session()
(2) Fine-tuning without freezing the main model, which also verifies that after load_weights on the main model, the loaded weights carry over into the fine-tune model built on top of it.
# ■■■■■■■■ [2] Model definition ■■■■■■■■
# —————— main model ——————
inputs = Input(shape=(784,))
x = Dense(100,activation='relu',name='dense_100')(inputs)
outputs = Dense(10,activation='softmax')(x)
model_main = Model(inputs=inputs, outputs=outputs)
model_main.load_weights('my_model_weights.h5')
# ———— (fine-tune construction) take the output of the 'dense_100' layer ——————
layer_name = 'dense_100'
intermediate_layer_model = Model(inputs=inputs,
                                 outputs=model_main.get_layer(layer_name).output)
outputs_inter = Dense(10,activation='softmax')(intermediate_layer_model.output)
model_inter = Model(inputs=inputs, outputs=outputs_inter)
model = model_inter
# ****** weight probe ******
a = model.get_weights()
print("After load_weights on the main model, dropping its softmax layer and attaching a new head, the original dense_100 weights:", a[1])  # bias of 'dense_100'
# ****** weight probe ******
[Result] (the weights were indeed carried into the fine-tune model)
>>> After load_weights on the main model, dropping its softmax layer and attaching a new head, the original dense_100 weights:
[ 0.00609367 0.01774433 0.00127991 0.01685369 -0.00588948 0.0022781
0.00694803 0.00636634 -0.00108383 -0.00480387 0.01123319 0.01685128
0.0071973 0.00373418 0.0015275 -0.0011526 -0.00451979 -0.00653248
0.01192301 -0.00078739 -0.00056679 -0.00057205 0.0220937 -0.00158271
-0.00026968 -0.00664996 -0.00085808 -0.00305471 0.00620055 0.0064344
-0.00938795 0.00266371 0.00623808 0.0083605 -0.00238177 -0.00048903
0.00059158 0.00824707 0.00500612 0.00873516 -0.0032067 0.00337419
0.01087511 0.004928 0.01195703 0.01690748 0.01420193 -0.0064415
0.00545023 0.01340502 -0.00258121 0.01323839 0.00632899 0.01284719
0.00555667 0.01261076 -0.00088008 0.01200596 0.00733639 0.01783392
-0.00440101 0.00118115 0.01178464 0.0074486 0.00896501 0.00357948
0.00705922 0.00520497 0.01415215 -0.00202574 0.00927804 0.0138014
0.0098721 0.0129296 0.00189565 0.01651774 0.00946718 -0.00534614
0.00506906 -0.00030766 -0.00026362 0.00419401 0.00212149 -0.00304823
-0.00427098 0.0041138 0.01505729 0.00112592 -0.00334759 0.00820872
-0.01345768 -0.00101386 -0.00698254 0.02179425 0.00819413 0.00404393
-0.00315165 0.01334981 0.01426365 0.00202925]
(3) Excluding certain layers from training.
The model.layers approach:

# ——————————————————————— main model ——————————————————————————
# .... omitted ....
model1 = Model(inputs=x, outputs=y, name='model1')
# ——————— train only the last 3 of the 14 layers ———————
print('\nNumber of layers (parameter-free layers such as relu also count):', len(model1.layers))
model1.trainable = True  # to train some layers, FIRST make all layers trainable, THEN freeze the ones to exclude
# freeze the layers that should not be trained
for layer in model1.layers[:11]:
    layer.trainable = False
model1.summary()
# ————————————————————————————————————————————————————————————
[Result]
>>>
Total params: 1,671,114
Trainable params: 525,706
Non-trainable params: 1,145,408
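The same recipe, applied to a small self-contained stand-in model (the original 14-layer CNN is omitted in the post, so the architecture below is illustrative, not the author's):

```python
from keras.layers import Input, Conv2D, MaxPooling2D, Flatten, Dense, Dropout
from keras.models import Model

# A small stand-in for the omitted CNN
x = Input(shape=(32, 32, 3))
y = Conv2D(filters=64, kernel_size=3, padding='same', activation='relu')(x)
y = MaxPooling2D(pool_size=2)(y)
y = Flatten()(y)
y = Dense(128, activation='relu')(y)
y = Dropout(0.5)(y)
y = Dense(10, activation='softmax')(y)
model1 = Model(inputs=x, outputs=y, name='model1')

print(len(model1.layers))  # 7 layers, counting the Input layer

# Unfreeze everything first, then freeze all but the last three layers
model1.trainable = True
for layer in model1.layers[:-3]:
    layer.trainable = False

# Only Dense(128), Dropout, Dense(10) can now update their weights;
# Dropout has no weights, so 4 trainable tensors remain
print(len(model1.trainable_weights))  # 4
```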