[Translation] Effective TensorFlow Chapter 11: Debugging TensorFlow Models

TensorFlow · Published 2019-03-01 13:47:56

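This article is a translation of "Debugging TensorFlow models". If it infringes any rights, please get in touch and it will be removed. It is intended for academic exchange only; please do not use it commercially. If you spot any mistakes, please point them out.

Compared with regular Python code, the symbolic nature of TensorFlow makes TensorFlow code relatively hard to debug. Here I introduce a few tools that ship with TensorFlow and make debugging easier.

Probably the most common error when using TensorFlow is passing a tensor of the wrong shape to an op. Many TensorFlow ops can operate on tensors of different ranks and shapes. This is convenient when using the API, but it can cause extra headaches when something goes wrong.

For example, consider the tf.matmul op, which multiplies two matrices: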

import tensorflow as tf  # the examples in this article use the TensorFlow 1.x API

a = tf.random_uniform([2, 3])
b = tf.random_uniform([3, 4])
c = tf.matmul(a, b)  # c is a tensor of shape [2, 4]

But the same function also performs batched matrix multiplication:

a = tf.random_uniform([10, 2, 3])
b = tf.random_uniform([10, 3, 4])
c = tf.matmul(a, b)  # c is a tensor of shape [10, 2, 4]

Another example, which we discussed earlier in the broadcasting section, is the add op, which supports broadcasting:

a = tf.constant([[1.], [2.]])
b = tf.constant([1., 2.])
c = a + b  # c is a tensor of shape [2, 2]
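To make it clearer why shapes [2, 1] and [2] silently combine into [2, 2], here is a minimal pure-Python sketch of the trailing-dimension broadcasting rule. The helper `broadcast_shape` is a hypothetical name introduced for illustration and is not part of TensorFlow; it assumes the NumPy-style rule that TensorFlow's element-wise ops follow.

```python
def broadcast_shape(shape_a, shape_b):
    """Return the shape that results from broadcasting two shapes.

    Hypothetical helper for illustration only; not a TensorFlow function.
    """
    result = []
    # Compare dimensions from the trailing one backwards; a missing
    # dimension is treated as size 1.
    for i in range(1, max(len(shape_a), len(shape_b)) + 1):
        dim_a = shape_a[-i] if i <= len(shape_a) else 1
        dim_b = shape_b[-i] if i <= len(shape_b) else 1
        if dim_a != dim_b and dim_a != 1 and dim_b != 1:
            raise ValueError("incompatible shapes: %r and %r" % (shape_a, shape_b))
        # A size-1 dimension is stretched to match the other one.
        result.append(dim_b if dim_a == 1 else dim_a)
    return result[::-1]


shape_c = broadcast_shape([2, 1], [2])
print(shape_c)
```

Neither operand has the shape of the result here, which is exactly the kind of silent behavior that is easy to miss when a tensor has an unexpected rank.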

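Validating your tensors with tf.assert* ops

One way to reduce the likelihood of unwanted behavior is to explicitly verify the rank or shape of intermediate tensors with tf.assert* ops.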

a = tf.constant([[1.], [2.]])
b = tf.constant([1., 2.])
check_a = tf.assert_rank(a, 1)  # This check fails: a has rank 2, not 1
check_b = tf.assert_rank(b, 1)
with tf.control_dependencies([check_a, check_b]):
    c = a + b  # c is a tensor of shape [2, 2]

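Remember that assertion nodes, like any other op, are part of the TensorFlow graph, and if nothing depends on them they are pruned during Session.run(). So make sure to create explicit dependencies on assertion ops to force TensorFlow to execute them.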
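The pruning behavior can be illustrated with a toy dataflow graph in plain Python. This is only a conceptual sketch, not TensorFlow's executor: `make_node`, `run`, and `check_rank_1` are hypothetical names, and node values are simply shapes. It shows that a check node runs only when the fetched node depends on it.

```python
def make_node(name, fn, inputs=(), control_inputs=()):
    """Build a toy graph node (hypothetical structure, for illustration)."""
    return {"name": name, "fn": fn,
            "inputs": list(inputs), "control_inputs": list(control_inputs)}


def run(node, executed, cache=None):
    """Evaluate `node`, running only the nodes it transitively depends on."""
    cache = {} if cache is None else cache
    if node["name"] in cache:
        return cache[node["name"]]
    # Control inputs must run first, but their values are not passed on.
    for dep in node["control_inputs"]:
        run(dep, executed, cache)
    args = [run(dep, executed, cache) for dep in node["inputs"]]
    value = node["fn"](*args)
    executed.append(node["name"])
    cache[node["name"]] = value
    return value


def check_rank_1(shape):
    if len(shape) != 1:
        raise ValueError("expected rank 1, got shape %r" % (shape,))
    return True


a = make_node("a", lambda: [2, 1])  # node values are shapes here
b = make_node("b", lambda: [2])
check_a = make_node("check_a", check_rank_1, inputs=[a])

# Without a dependency, the check is never reached and never fails.
c_unchecked = make_node("c", lambda x, y: "a + b", inputs=[a, b])
ran_unchecked = []
run(c_unchecked, ran_unchecked)
print(ran_unchecked)

# With a control dependency, the check runs before c and raises.
c_checked = make_node("c", lambda x, y: "a + b", inputs=[a, b],
                      control_inputs=[check_a])
try:
    run(c_checked, [])
    raised = False
except ValueError:
    raised = True
print(raised)
```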

You can also use assertions to validate the values of tensors at run time:

check_pos = tf.assert_positive(a)

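See the official documentation for details on the assertion ops.

Logging tensor values with tf.Print

Another built-in function that is useful for debugging is tf.Print, which logs the given tensors to standard error: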

input_copy = tf.Print(input, tensors_to_print_list)

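Note that tf.Print returns a copy of its first argument as its output. One way to force tf.Print to run is to pass its output to another op that gets executed. For example, if we want to print the values of tensors a and b before adding them, we can do this: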

a = ...
b = ...
a = tf.Print(a, [a, b])  # a is now the output of the print op
c = a + b                # evaluating c forces the print op to run

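Alternatively, we can define a control dependency manually.

Checking your gradients with tf.test.compute_gradient_error

Not all ops in TensorFlow come with gradients, and it is easy to unintentionally build a graph for which TensorFlow cannot compute the gradients.

Let's look at an example: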

import tensorflow as tf

def non_differentiable_softmax_entropy(logits):
    # The softmax output is passed in as the labels.
    probs = tf.nn.softmax(logits)
    return tf.nn.softmax_cross_entropy_with_logits(labels=probs, logits=logits)

w = tf.get_variable("w", shape=[5])
y = -non_differentiable_softmax_entropy(w)

# Minimizing the negative entropy maximizes the entropy.
opt = tf.train.AdamOptimizer()
train_op = opt.minimize(y)

sess = tf.Session()
sess.run(tf.global_variables_initializer())
for i in range(10000):
    sess.run(train_op)

print(sess.run(tf.nn.softmax(w)))

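We are using tf.nn.softmax_cross_entropy_with_logits to define the entropy of a categorical distribution. We then use the Adam optimizer to find the weights with maximum entropy. If you have taken a course on information theory, you know that the uniform distribution has maximum entropy, so you would expect the result to be [0.2, 0.2, 0.2, 0.2, 0.2]. But if you run this code, you get an unexpected result: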

[ 0.34081486  0.24287023  0.23465775  0.08935683  0.09230034]

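It turns out that tf.nn.softmax_cross_entropy_with_logits has undefined gradients with respect to its labels! But how would we spot this if we did not already know about it?

Fortunately, TensorFlow comes with a numerical differentiator that can be used to find errors in symbolic gradients. Let's see how to use it: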

with tf.Session():
    diff = tf.test.compute_gradient_error(w, [5], y, [])
    print(diff)

If you run this, you will see that the difference between the numerical and symbolic gradients is quite large (about 0.06 to 0.1 in my tries).
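The idea behind such a check can be sketched in plain Python, without TensorFlow: estimate the gradient with central finite differences and compare it to the gradient the framework would use. This is only an illustration of the principle, not the implementation of tf.test.compute_gradient_error. All function names below are hypothetical, the logits are arbitrary, and the all-zero "bad" gradient is an assumed stand-in for a gradient path that is silently dropped.

```python
import math


def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(logits):
    return -sum(p * math.log(p) for p in softmax(logits))


def analytic_grad(logits):
    # dH/dz_i = -p_i * (log p_i + H), derived by hand from H = -sum p log p.
    probs = softmax(logits)
    h = entropy(logits)
    return [-p * (math.log(p) + h) for p in probs]


def numeric_grad(fn, x, eps=1e-5):
    # Central finite differences, one coordinate at a time.
    grad = []
    for i in range(len(x)):
        hi = list(x)
        lo = list(x)
        hi[i] += eps
        lo[i] -= eps
        grad.append((fn(hi) - fn(lo)) / (2 * eps))
    return grad


def max_abs_diff(u, v):
    return max(abs(a - b) for a, b in zip(u, v))


logits = [0.5, -1.0, 2.0, 0.0, 1.5]  # arbitrary example values
numeric = numeric_grad(entropy, logits)

good_error = max_abs_diff(numeric, analytic_grad(logits))
# Assumed stand-in for a missing gradient: zero everywhere.
bad_error = max_abs_diff(numeric, [0.0] * len(logits))
print(good_error, bad_error)
```

A correct gradient should agree with the numerical estimate up to a tiny error, while a missing one shows up as a much larger discrepancy.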

Now let's fix our function and run it again:

import tensorflow as tf

def softmax_entropy(logits, dim=-1):
    # Compute the entropy directly as -sum(p * log p) so every term is differentiable.
    plogp = tf.nn.softmax(logits, dim) * tf.nn.log_softmax(logits, dim)
    return -tf.reduce_sum(plogp, dim)

w = tf.get_variable("w", shape=[5])
y = -softmax_entropy(w)

print(w.get_shape())
print(y.get_shape())

with tf.Session() as sess:
    diff = tf.test.compute_gradient_error(w, [5], y, [])
    print(diff)

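The difference should be around 0.0001, which looks much better.

Now, if you run the optimizer again with the correct version, you can see that the final weights are: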

[ 0.2  0.2  0.2  0.2  0.2]

That is exactly the answer we wanted.
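As a sanity check on the claim that the maximum-entropy solution is uniform, here is a small pure-Python sketch that maximizes the softmax entropy with plain gradient ascent. It is not the Adam-based TensorFlow program from the article: the starting logits, the learning rate, and the step count are arbitrary assumptions, and the gradient formula is derived by hand.

```python
import math


def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy_grad(logits):
    # dH/dz_i = -p_i * (log p_i + H), with H = -sum p log p.
    probs = softmax(logits)
    h = -sum(p * math.log(p) for p in probs)
    return [-p * (math.log(p) + h) for p in probs]


logits = [0.5, -1.0, 2.0, 0.0, 1.5]  # arbitrary starting point
learning_rate = 0.5                  # assumed value, not from the article

for _ in range(5000):
    grad = entropy_grad(logits)
    # Gradient ascent: move in the direction that increases the entropy.
    logits = [z + learning_rate * g for z, g in zip(logits, grad)]

final_probs = softmax(logits)
print([round(p, 4) for p in final_probs])
```

With a correct gradient the probabilities move toward the uniform distribution, which matches the result of the fixed TensorFlow version.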

TensorFlow summaries and tfdbg (TensorFlow Debugger) are two other tools for debugging; see the official documentation to learn more.
