
Fixing the model-loading error "NotFoundError (see above for traceback): Key v1 not found in checkpoint"

NotFoundError (see above for traceback): Restoring from checkpoint failed. This is most likely due to a Variable name or other graph key that is missing from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error:

Key v1 not found in checkpoint
     [[Node: save/RestoreV2 = RestoreV2[dtypes=[DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2/tensor_names, save/RestoreV2/shape_and_slices)]]

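This error usually means that the variable names stored in the checkpoint file do not match the variable names the program asks for when it restores the model. The fix is to inspect the variable names in the checkpoint file and change the names used by the program to match them. The rest of this post shows how to inspect the variable names in a checkpoint file and how to change the names the program requests.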
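The check behind this error can be pictured as a plain set comparison between the names the program requests and the names the checkpoint holds. The sketch below is only an illustration in pure Python: `missing_checkpoint_keys` is a hypothetical helper, not a TensorFlow function, and the two name lists are written by hand rather than read from a real checkpoint.

```python
def missing_checkpoint_keys(requested_keys, checkpoint_keys):
    """Return the requested keys that the checkpoint does not contain.

    Hypothetical helper for illustration; the order of requested_keys is kept.
    """
    available = set(checkpoint_keys)
    return [key for key in requested_keys if key not in available]


# Names the loading program asks for (hand-written for this example).
requested = ["v1", "v2"]
# Names assumed to be stored in the checkpoint file.
stored = ["Variable", "Variable_1"]

for key in missing_checkpoint_keys(requested, stored):
    print("Key %s not found in checkpoint" % key)
```

Any key reported by such a comparison is one that a restore would fail on, so it is worth running a check like this before calling `saver.restore`.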


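The example below is the model-persistence example from the book 《TensorFlow實戰Google深度學習框架》. It also resolves the problem with loading renamed variables in chapter 5 of that book: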

The code that saves the model:

#!/usr/bin/env python
# -*- coding:utf-8 -*-

import tensorflow as tf

# Save a model that computes the sum of two variables
v1 = tf.Variable(tf.random_normal([1], stddev=1, seed=1))
v2 = tf.Variable(tf.random_normal([1], stddev=1, seed=1))
result = v1 + v2

init_op = tf.global_variables_initializer()
saver = tf.train.Saver()

with tf.Session() as sess:
    sess.run(init_op)
    saver.save(sess, "Saved_model/model.ckpt")

The code that loads the model (restoring all variables):

#!/usr/bin/env python
# -*- coding:utf-8 -*-
 
import tensorflow as tf
 
# Rebuild the same graph that was saved: the sum of two variables
v1 = tf.Variable(tf.random_normal([1], stddev=1, seed=1))
v2 = tf.Variable(tf.random_normal([1], stddev=1, seed=1))
result = v1 + v2

saver = tf.train.Saver()
 
# Restore the saved model (all variables)
with tf.Session() as sess:
    saver.restore(sess, "Saved_model/model.ckpt")
    print(sess.run(result))

This code runs normally, without errors.

Output: [image not preserved]

The code that loads the model with renamed variables:

#!/usr/bin/env python
# -*- coding:utf-8 -*-

import tensorflow as tf

# tf.reset_default_graph()

# Declare the variables
V1 = tf.Variable(tf.constant(1.0, shape=[1]), name="a1")
V2 = tf.Variable(tf.constant(2.0, shape=[1]), name="a2")
result = V1 + V2

saver = tf.train.Saver({"v1": V1, "v2": V2})

# Restore the saved model (all variables)
with tf.Session() as sess:
    saver.restore(sess, "Saved_model/model.ckpt")
    print(sess.run(result))

Running this code produces the following error:

NotFoundError (see above for traceback): Restoring from checkpoint failed. This is most likely due to a Variable name or other graph key that is missing from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error:

Key v1 not found in checkpoint
     [[Node: save/RestoreV2 = RestoreV2[dtypes=[DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2/tensor_names, save/RestoreV2/shape_and_slices)]]

The error occurs because the keys "v1" and "v2" given in saver = tf.train.Saver({"v1": V1, "v2": V2}) do not match the variable names stored in the checkpoint file.
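One likely reason for the mismatch is that the saving script creates its variables without a name= argument, so the graph assigns default names instead of "v1" and "v2". The sketch below is a simplified pure-Python model of that numbering scheme, written only to make the pattern visible. `DefaultNamer` is a hypothetical class, not TensorFlow's own code, and the real implementation handles more cases (name scopes, collisions with explicitly chosen names).

```python
from collections import Counter


class DefaultNamer:
    """Simplified model of how a TF 1.x graph makes repeated names unique."""

    def __init__(self):
        self._seen = Counter()

    def unique_name(self, base="Variable"):
        # First use keeps the bare name; later uses get _1, _2, ...
        index = self._seen[base]
        self._seen[base] += 1
        return base if index == 0 else "{}_{}".format(base, index)


namer = DefaultNamer()
# Two variables created without name=, as in the saving script.
saved_names = [namer.unique_name(), namer.unique_name()]
print(saved_names)
```

Under this model, a Python identifier such as v1 never reaches the checkpoint. Only the graph-level name does.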

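Run the code below to list the variable names in the checkpoint file (for details, see the blog post "TensorFlow中檢視checkpoint檔案中的變數名和對應值", on viewing variable names and their values in a TensorFlow checkpoint file):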

#!/usr/bin/env python
# -*- coding:utf-8 -*-
import os

from tensorflow.python import pywrap_tensorflow

# Same directory and file prefix that were passed to saver.save()
model_dir = "Saved_model"
checkpoint_path = os.path.join(model_dir, "model.ckpt")

# Open the checkpoint and print every stored tensor name with its value
reader = pywrap_tensorflow.NewCheckpointReader(checkpoint_path)
var_to_shape_map = reader.get_variable_to_shape_map()
for key in var_to_shape_map:
    print("tensor_name: ", key, end=' ')
    print(reader.get_tensor(key))

Output: [image not preserved]

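The output shows that the variable names in the checkpoint file are Variable and Variable_1, not v1 and v2. Changing the keys v1 and v2 in saver = tf.train.Saver({"v1": V1, "v2": V2}) in the renamed-variable loading code to Variable and Variable_1, respectively, resolves the error.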
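With more than two variables, writing the mapping by hand becomes error-prone, so it can help to build the dictionary from the names found in the checkpoint. The sketch below is one possible approach in pure Python. `creation_order` and `build_restore_map` are hypothetical helpers, and they assume every checkpoint name follows the default pattern Variable, Variable_1, Variable_2, ... and that the variables are passed in the order in which they were originally created.

```python
import re


def creation_order(name, base="Variable"):
    """Sort key: 'Variable' -> 0, 'Variable_1' -> 1, 'Variable_10' -> 10."""
    match = re.fullmatch(re.escape(base) + r"(?:_(\d+))?", name)
    if match is None:
        raise ValueError("not a default variable name: %r" % name)
    return int(match.group(1) or 0)


def build_restore_map(checkpoint_names, variables):
    """Pair checkpoint names (in creation order) with the given variables."""
    ordered = sorted(checkpoint_names, key=creation_order)
    if len(ordered) != len(variables):
        raise ValueError("expected %d variables, got %d"
                         % (len(ordered), len(variables)))
    return dict(zip(ordered, variables))


# Strings stand in for the tf.Variable objects V1 and V2 here.
restore_map = build_restore_map(["Variable_1", "Variable"], ["V1", "V2"])
print(restore_map)
```

In the loading script, the resulting dictionary would take the place of the hand-written one, for example tf.train.Saver(build_restore_map(names, [V1, V2])), where names is the list of keys read from the checkpoint.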

The modified loading code:

#!/usr/bin/env python
# -*- coding:utf-8 -*-

import tensorflow as tf

# tf.reset_default_graph()

# Declare the variables
V1 = tf.Variable(tf.constant(1.0, shape=[1]), name="a1")
V2 = tf.Variable(tf.constant(2.0, shape=[1]), name="a2")
result = V1 + V2

saver = tf.train.Saver({"Variable": V1, "Variable_1": V2})

# Restore the saved model (all variables)
with tf.Session() as sess:
    saver.restore(sess, "Saved_model/model.ckpt")
    print(sess.run(result))

Output: [image not preserved]