Stanford CS20 TensorFlow Study Notes (2): TensorFlow Ops

阿新 • Published: 2019-01-09

In the previous section we covered graphs, tensors, and sessions. This section focuses on operations. The main topics are:

Basic usage of TensorBoard
Basic operations
Tensor types
Importing data
Lazy loading

1- Visualize it with TensorBoard

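TensorFlow provides TensorBoard, a tool for visualizing graphs. The graph diagrams we saw earlier were generated by TensorBoard. Here is how to use it.

1.1- Generating the event file

To use TensorBoard, you first need to write the graph to an event file. This can be done with tf.summary.FileWriter, as in the sample code below: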

import tensorflow as tf
a = tf.constant(2)
b = tf.constant(3)
x = tf.add(a, b)
writer = tf.summary.FileWriter('./graphs', tf.get_default_graph())
with tf.Session() as sess:
	print(sess.run(x))
writer.close() # close the writer when you’re done using it

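After the graph is defined and before the session runs, tf.summary.FileWriter writes the graph to an event file. Note that producing the event file does not depend on whether a session is run; it depends only on the graph. For example: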

import tensorflow as tf
a = tf.constant(2)
b = tf.constant(3)
x = tf.add(a, b)
writer = tf.summary.FileWriter('./graphs', tf.get_default_graph())
writer.close()

1.2- Running the tensorboard command

Once tf.summary.FileWriter has generated the event file, read it with the tensorboard command:

tensorboard --logdir="./graphs"

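The tensorboard command is installed along with TensorFlow. The command above starts an HTTP server on the default port 6006 and reads the event files under ./graphs. Open http://localhost:6006 in a browser to see the TensorBoard UI, which looks similar to this:

[Screenshot: TensorBoard UI showing the graph]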

For more about TensorBoard, see the official site: https://www.tensorflow.org/guide/graph_viz

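1.3- Naming

In TensorBoard every node has a name. The name can be set in code; if it is not, one is usually assigned automatically. In the code below, for example, nodes are named automatically by node type plus a sequence number: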

import tensorflow as tf
a = tf.constant(2)
b = tf.constant(3)
x = tf.add(a, b)
writer = tf.summary.FileWriter('./graphs', tf.get_default_graph())
writer.close()

[Screenshot: TensorBoard graph with automatically named nodes]

We can use the name argument in code to give each node a name:

import tensorflow as tf
a = tf.constant(2, name='a')
b = tf.constant(3, name='b')
x = tf.add(a, b, name='add')
writer = tf.summary.FileWriter('./graphs', tf.get_default_graph())
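with tf.Session() as sess:
    print(sess.run(x)) # >> 5
writer.close() # close the writer when you are done using it

[Screenshot: TensorBoard graph with the explicitly named nodes a, b, and add]

Finally, note that the above covers only TensorBoard's graph visualization. It can do much more, and it will be a tool we use often.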

2- Constants, Variables, Ops

The constants and Variables introduced here are, in essence, operations (ops for short). An operation appears as a node in the graph.

2.1- Constants

Use tf.constant to define a constant in the graph; a constant's value cannot be changed. The signature of tf.constant is:

tf.constant(
    value,
    dtype=None,
    shape=None,
    name='Const',
    verify_shape=False
)

Example:

a = tf.constant([2, 2], name='a')
b = tf.constant([[0, 1], [2, 3]], name='b')

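tf.constant defines an operation that returns a fixed value. The slightly confusing part is that tf.constant() itself returns a tf.Tensor: the operation is created inside tf.constant(), and what you get back is that operation's output, a tf.Tensor. A look at the TensorFlow source code gives a rough idea:

[Screenshot: excerpt from the TensorFlow source code of tf.constant]

2.1.1- Shortcuts for creating common constants

Besides the standard way of defining a constant shown above, tf provides helper functions for common constants (such as all zeros or all ones). Like their NumPy counterparts of the same names, they create such tensors quickly: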

tf.zeros(shape, dtype=tf.float32, name=None)

tf.zeros([2, 3], tf.int32) ==> [[0, 0, 0], [0, 0, 0]]

tf.zeros_like(input_tensor, dtype=None, name=None, optimize=True)

# input_tensor is [[0, 1], [2, 3], [4, 5]]
tf.zeros_like(input_tensor) ==> [[0, 0], [0, 0], [0, 0]]

Similarly, for all ones:

tf.ones(shape, dtype=tf.float32, name=None)
tf.ones_like(input_tensor, dtype=None, name=None, optimize=True)

Or a tensor filled with a given value:

tf.fill(dims, value, name=None) 
tf.fill([2, 3], 8) ==> [[8, 8, 8], [8, 8, 8]]

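2.1.2- Constant sequences

The following functions generate constant sequences, similar to NumPy.

tf.lin_space generates a closed linear interval from start to stop containing num values in total. Both start and stop are included.

Signature: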

tf.lin_space(start, stop, num, name=None)

Example:

tf.lin_space(10.0, 13.0, 4) ==> [10. 11. 12. 13.]

tf.range generates a sequence from start up to limit. start is always included and limit is never included. delta controls the step size.

Signature:

tf.range(limit, delta=1, dtype=None, name='range')
tf.range(start, limit, delta=1, dtype=None, name='range')

Example:

tf.range(3, 18, 3) ==> [3 6 9 12 15]
tf.range(5) ==> [0 1 2 3 4]

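What is the difference between lin_space and range?

lin_space: start, stop, and the number of values are exact; the step is derived from them, so the spacing between neighbors may not be exactly equal.
range: start and the step are exact; limit is only an upper bound.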
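The contrast can be sketched in plain Python. The lin_space helper below is a hypothetical stand-in written for illustration, not TensorFlow's implementation: it derives the step from start, stop, and num, whereas Python's built-in range takes the step as given and stops before the limit.

```python
def lin_space(start, stop, num):
    """Return num evenly spaced floats from start to stop, both included."""
    if num < 1:
        raise ValueError("num must be at least 1")
    if num == 1:
        return [float(start)]
    # The step is a derived quantity: it follows from start, stop, and num.
    step = (stop - start) / (num - 1)
    return [start + i * step for i in range(num)]


# start, stop, and the count are fixed; the step (1.0 here) is computed.
print(lin_space(10.0, 13.0, 4))   # [10.0, 11.0, 12.0, 13.0]

# start and the step are fixed; 18 is only an upper bound and is excluded.
print(list(range(3, 18, 3)))      # [3, 6, 9, 12, 15]
```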

Note in particular that tensors are not iterable, so the following is illegal:

for _ in tf.range(4): # TypeError

2.1.3- Random-number functions

Here are some common functions for generating random values:

tf.random_normal # normal distribution
tf.truncated_normal
tf.random_uniform
tf.random_shuffle
tf.random_crop
tf.multinomial
tf.random_gamma

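Of these:
tf.truncated_normal is very common. It discards random values that lie more than 2 standard deviations from the mean and re-draws them.
tf.random_shuffle randomly shuffles the given tensor along its first dimension (dimension 0).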
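The truncation rule described for tf.truncated_normal can be illustrated with a small pure-Python sketch. This is an assumption-laden approximation using rejection sampling with the standard library, not TensorFlow's actual implementation; the function name and its seed parameter are made up for the example.

```python
import random


def truncated_normal_sketch(n, mean=0.0, stddev=1.0, seed=None):
    """Draw n normal samples, re-drawing any that fall more than
    2 standard deviations away from the mean."""
    rng = random.Random(seed)
    samples = []
    while len(samples) < n:
        value = rng.gauss(mean, stddev)
        # Keep the value only if it lies within 2 standard deviations.
        if abs(value - mean) <= 2 * stddev:
            samples.append(value)
    return samples


values = truncated_normal_sketch(1000, mean=0.0, stddev=1.0, seed=0)
print(min(values) >= -2.0 and max(values) <= 2.0)   # True
```

Because extreme values never appear, this kind of distribution is often used to initialize weights.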

You can also set a random seed so that the generated random numbers are reproducible.

tf.set_random_seed(seed)

2.1.4- broadcasting

Tensors support broadcasting just like NumPy, as in the element-wise multiplication below:

import tensorflow as tf
a = tf.constant([2, 2], name='a')
b = tf.constant([[0, 1], [2, 3]], name='b')
x = tf.multiply(a, b, name='mul')
with tf.Session() as sess:
	print(sess.run(x))
#  >>  [[0 2]
#	   [4 6]]
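As a rough mental model, assuming the simple case of a length-n vector and an m-by-n matrix, broadcasting reuses the vector for every row of the matrix. The pure-Python helper below (a hypothetical name, not a TensorFlow function) reproduces the result above:

```python
def broadcast_multiply(vector, matrix):
    """Multiply every row of matrix element-wise by vector."""
    for row in matrix:
        if len(row) != len(vector):
            raise ValueError("row length must match vector length")
    # The same vector is paired with each row in turn.
    return [[v * m for v, m in zip(vector, row)] for row in matrix]


a = [2, 2]
b = [[0, 1], [2, 3]]
print(broadcast_multiply(a, b))   # [[0, 2], [4, 6]]
```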

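2.1.5- verify_shape

About verify_shape: by default, the shape of value is not checked against the shape argument. If they differ, value is expanded to fill the dimensions given by shape. For example, tf.constant(2, shape=[2,2]) is equivalent to tf.constant([[2,2],[2,2]], shape=[2,2]).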

2.2- Operations

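Common operations:

[Table: common TensorFlow operations by category]

Note that, according to the table above, tf.Variable counts as an operation.

The arithmetic operations are listed below and closely resemble NumPy's:

[Table: TensorFlow arithmetic operations]

For more math operations, see: https://www.tensorflow.org/api_guides/python/math_ops

Division needs special attention. Here are a few examples: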

import tensorflow as tf
a = tf.constant([2, 2], name='a')
b = tf.constant([[0, 1], [2, 3]], name='b')
with tf.Session() as sess:
    print(sess.run(tf.div(b, a)))             # ⇒ [[0 0] [1 1]]
    print(sess.run(tf.divide(b, a)))          # ⇒ [[0. 0.5] [1. 1.5]]
    print(sess.run(tf.truediv(b, a)))         # ⇒ [[0. 0.5] [1. 1.5]]
    print(sess.run(tf.floordiv(b, a)))        # ⇒ [[0 0] [1 1]]
    # print(sess.run(tf.realdiv(b, a)))       # ⇒ Error: only works for real values
    print(sess.run(tf.truncatediv(b, a)))     # ⇒ [[0 0] [1 1]]
    print(sess.run(tf.floor_div(b, a)))       # ⇒ [[0 0] [1 1]]

In short, tf.div is TensorFlow-style division, while tf.divide corresponds to Python-style division.
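The example outputs above use only non-negative integers, so flooring and truncating division give the same answer. A plain-Python sketch (not TensorFlow code; the helper names are made up for illustration) shows where rounding toward negative infinity and rounding toward zero part ways:

```python
import math


def true_div(x, y):
    """Python 3 style division: always returns a float."""
    return x / y


def floor_div(x, y):
    """Round the quotient toward negative infinity."""
    return math.floor(x / y)


def truncate_div(x, y):
    """Round the quotient toward zero."""
    return math.trunc(x / y)


for x, y in [(7, 2), (-7, 2)]:
    print(x, y, true_div(x, y), floor_div(x, y), truncate_div(x, y))
# 7 2 3.5 3 3
# -7 2 -3.5 -4 -3
```

The two integer variants only disagree when the quotient is negative and not a whole number.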

2.3- TensorFlow Data Types

2.3.1- Python native types

First, TensorFlow accepts Python native types: boolean, numeric (int, float), and strings.

A single value is converted to a 0-d tensor, a list to a 1-d tensor, a list of lists to a 2-d tensor, and so on.

t_0 = 19                                # scalars are treated like 0-d tensors
tf.zeros_like(t_0)                      # ==> 0
tf.ones_like(t_0)                       # ==> 1

t_1 = [b"apple", b"peach", b"grape"]    # 1-d arrays are treated like 1-d tensors
tf.zeros_like(t_1)                      # ==> [b'' b'' b'']
tf.ones_like(t_1)                       # ==> TypeError: Expected string, got 1 of type 'int' instead.

t_2 = [[True, False, False],
       [False, False, True],
       [False, True, False]]            # 2-d arrays are treated like 2-d tensors

tf.zeros_like(t_2)                      # ==> 3x3 tensor, all elements are False
tf.ones_like(t_2)                       # ==> 3x3 tensor, all elements are True
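The nesting rule (scalar to 0-d, list to 1-d, list of lists to 2-d) amounts to counting how deeply the value is nested. The helper below is a hypothetical pure-Python illustration of that rule, assuming regular (non-ragged) nesting; it is not how TensorFlow performs the conversion.

```python
def nesting_rank(value):
    """Count list/tuple nesting depth: 0 for a scalar, 1 for a flat list, ..."""
    depth = 0
    while isinstance(value, (list, tuple)):
        depth += 1
        if not value:          # an empty list cannot be descended into
            break
        value = value[0]       # assume all elements share the same nesting
    return depth


print(nesting_rank(19))                                  # 0
print(nesting_rank([b"apple", b"peach", b"grape"]))      # 1
print(nesting_rank([[True, False], [False, True]]))      # 2
```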

2.3.2- TensorFlow data types

TensorFlow provides the following data types:

tf.float16: 16-bit half-precision floating-point.
tf.float32: 32-bit single-precision floating-point.
tf.float64: 64-bit double-precision floating-point.
tf.bfloat16: 16-bit truncated floating-point.
tf.complex64: 64-bit single-precision complex.
tf.complex128: 128-bit double-precision complex.
tf.int8: 8-bit signed integer.
tf.uint8: 8-bit unsigned integer.
tf.uint16: 16-bit unsigned integer.
tf.uint32: 32-bit unsigned integer.
tf.uint64: 64-bit unsigned integer.
tf.int16: 16-bit signed integer.
tf.int32: 32-bit signed integer.
tf.int64: 64-bit signed integer.
tf.bool: Boolean.
tf.string: String.
tf.qint8: Quantized 8-bit signed integer.
tf.quint8: Quantized 8-bit unsigned integer.
tf.qint16: Quantized 16-bit signed integer.
tf.quint16: Quantized 16-bit unsigned integer.
tf.qint32: Quantized 32-bit signed integer.
tf.resource: Handle to a mutable resource.
tf.variant: Values of arbitrary types.

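2.3.3- TF data types and NumPy data types

The data types defined by tf correspond almost one-to-one to NumPy's, so the two integrate almost seamlessly.

Comparing the two types directly for equality even returns True: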

tf.int32 == np.int32 			# ⇒ True

NumPy types are also accepted when specifying the data type of an argument passed to an operation:

tf.ones([2, 2], np.float32) 	# ⇒ [[1.0 1.0], [1.0 1.0]]

Moreover, TensorFlow represents tensor values as NumPy ndarrays: for tf.Session.run(fetches), if fetches is a Tensor, the return value is a NumPy ndarray.

sess = tf.Session()
a = tf.zeros([2, 3], np.int32)
print(type(a))  			# ⇒ <class 'tensorflow.python.framework.ops.Tensor'>
a = sess.run(a)
print(type(a))  			# ⇒ <class 'numpy.ndarray'>

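Even so, it is recommended to use TF data types wherever possible, for these reasons:

With Python native types, TensorFlow has to infer the corresponding data type.
Most importantly, NumPy is not GPU-compatible.

2.4- Caveats when using constants

The value of a constant is stored and serialized as part of the graph definition. With too many constants, loading the graph becomes expensive. You can verify this with as_graph_def():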

my_const = tf.constant([1.0, 2.0], name="my_const")
with tf.Session() as sess:
	print(sess.graph.as_graph_def())

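[Screenshot: output of as_graph_def(), showing the constant's value embedded in the graph definition]

Two guidelines for using constants:

Only use constants for primitive types.
For data that needs more memory, use Variables or readers.

Question: Does using a Variable really help? It also needs an initializer, so isn't the initial value stored with the model as well?
Answer: A constant's value is stored in the graph definition, so it is copied every time the graph is loaded somewhere. Variables are stored separately and may even live on a dedicated parameter server.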

2.5- Variables

2.5.1- Two ways to create a Variable:

# create variables with tf.Variable
s = tf.Variable(2, name="scalar")  # with scalar value
m = tf.Variable([[0, 1], [2, 3]], name="matrix") # with list value
W = tf.Variable(tf.zeros([784,10])) # with tensor 

# create variables with tf.get_variable
s = tf.get_variable("scalar", initializer=tf.constant(2)) 
m = tf.get_variable("matrix", initializer=tf.constant([[0, 1], [2, 3]]))
W = tf.get_variable("big_matrix", shape=(784, 10), initializer=tf.zeros_initializer())

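Both approaches create tf.Variable objects. tf.Variable() looks more concise, but it is the old-style call and is not recommended here. tf.get_variable wraps tf.Variable and makes variable sharing easier.

Why is tf.constant lowercase while tf.Variable is capitalized? Because tf.constant is just an op, whereas tf.Variable is a class that contains several ops: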

x = tf.Variable(...) 

x.initializer # init op
x.value() # read op
x.assign(...) # write op
x.assign_add(...) 
# and more

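TensorBoard shows that a Variable is a subgraph:

[Screenshot: TensorBoard view of a Variable expanded into a subgraph]

2.5.2- Variable initialization

A Variable must be initialized before use; otherwise an error is raised: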

# create variables with tf.get_variable
s = tf.get_variable("scalar", initializer=tf.constant(2)) 
m = tf.get_variable("matrix", initializer=tf.constant([[0, 1], [2, 3]]))
W = tf.get_variable("big_matrix", shape=(784, 10), initializer=tf.zeros_initializer())

with tf.Session() as sess:
    print(sess.run(W))   # >> FailedPreconditionError: Attempting to use uninitialized value Variable

The simplest way is to run the op tf.global_variables_initializer(), which initializes all Variables in the graph:

with tf.Session() as sess:
	sess.run(tf.global_variables_initializer())

You can also initialize only a subset of variables with tf.variables_initializer():

sess.run(tf.variables_initializer([a, b]))

Or run a single Variable's own initializer directly:

W = tf.Variable(tf.zeros([784,10]))
with tf.Session() as sess:
	sess.run(W.initializer)

Another trick prints the list of all uninitialized Variables:

print(sess.run(tf.report_uninitialized_variables()))

2.5.3- Eval() a variable

# W is a random 700 x 10 variable object
W = tf.Variable(tf.truncated_normal([700, 10]))
with tf.Session() as sess:
	sess.run(W.initializer)
	print(W.eval())				# Similar to print(sess.run(W))

t.eval() is just a shortcut for run: tf.get_default_session().run(t).

2.5.4- tf.Variable.assign()

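assign acts as an assignment statement for a variable. Note that assign() is also an op, so it must be run or eval()'d. In the code below, for example, the assign has no effect: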

W = tf.Variable(10)
W.assign(100)
with tf.Session() as sess:
	sess.run(W.initializer)
	print(W.eval()) 	# >> 10

It should be done like this:

W = tf.Variable(10)
assign_op = W.assign(100)
with tf.Session() as sess:
    sess.run(W.initializer)
    sess.run(assign_op)
    print(W.eval()) 				# >> 100

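Also, once an assign op whose value does not depend on the variable itself has been run, the Variable's initializer need not be called; the initializer is essentially an assign op too.

When an assign op is run repeatedly, its effect accumulates:

# create a variable whose original value is 2
my_var = tf.Variable(2, name="my_var")

# assign my_var * 2 to my_var and call that op my_var_times_two
my_var_times_two = my_var.assign(2 * my_var)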

with tf.Session() as sess:
	sess.run(my_var.initializer)
	sess.run(my_var_times_two) 				# >> the value of my_var now is 4
	sess.run(my_var_times_two) 				# >> the value of my_var now is 8
	sess.run(my_var_times_two) 				# >> the value of my_var now is 16

2.5.5- assign_add() and assign_sub()

my_var = tf.Variable(10)
with tf.Session() as sess:
    sess.run(my_var.initializer)

    # increment by 10
    sess.run(my_var.assign_add(10)) # >> 20
    # decrement by 2
    sess.run(my_var.assign_sub(2)) # >> 18

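2.5.6- Each session maintains its own copy of a Variable

As the code below shows, the current values of the same Variable object in two sessions do not affect each other: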

W = tf.Variable(10)

sess1 = tf.Session()
sess2 = tf.Session()

sess1.run(W.initializer)
sess2.run(W.initializer)

print(sess1.run(W.assign_add(10))) 		# >> 20
print(sess2.run(W.assign_sub(2))) 		# >> 8

print(sess1.run(W.assign_add(100))) 		# >> 120
print(sess2.run(W.assign_sub(50))) 		# >> -42

sess1.close()
sess2.close()

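2.6- Session and InteractiveSession

You will sometimes see InteractiveSession. Its only difference from Session is that an InteractiveSession installs itself as the default session when it is created, much like entering sess.as_default(). You can therefore call run() and eval() without passing the session explicitly. InteractiveSession is commonly used in a command-line shell or a Jupyter notebook.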

sess = tf.InteractiveSession()
a = tf.constant(5.0)
b = tf.constant(6.0)
c = a * b
print(c.eval()) # we can use 'c.eval()' without explicitly stating a session
sess.close()

Likewise, inside a with tf.Session() as sess: block, that session is set as the default session.

tf.get_default_session() returns the default session for the current thread.

2.7- Importing Data

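2.7.1- placeholder

As mentioned earlier, a TF program has two phases:

Assemble a graph
Use a session to execute operations in the graph

Assembling the graph does not require knowing the concrete values that will take part in the computation, just as defining a function does not require knowing the concrete values of its parameters.

Once the graph is assembled, we (or our client code) can supply the data at the time the computation is executed.

To define a placeholder: tf.placeholder(dtype, shape=None, name=None).

shape may be None, meaning the shape is not fixed and is determined by the data eventually fed in. Even so, it is recommended to define the shape as precisely as possible, at least partially, e.g. shape=(None, 3).

A small question arises here: what is the difference between a Variable and a placeholder?

A Variable can be modified repeatedly in the graph; a placeholder cannot.
In a machine learning model, Variables are usually the weights to be learned, while placeholders usually carry the training data.
A Variable is initialized with an initializer, while a placeholder gets its value through feed_dict at run time.
Put another way, a Variable is like a variable defined inside a function, while a placeholder is like a parameter in the function's signature.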
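The analogy in the last point can be made concrete with a toy class in plain Python. Everything here (ToyModel, predict, update) is hypothetical and only mirrors the roles: the weight is state that persists and changes between calls, like a Variable, while x must be supplied on every call, like a placeholder fed through feed_dict.

```python
class ToyModel:
    """Toy stand-in: weight plays the Variable, x plays the placeholder."""

    def __init__(self, initial_weight):
        # Set once up front, like running a Variable's initializer.
        self.weight = initial_weight

    def predict(self, x):
        # x has no stored value; the caller provides it on each call.
        return self.weight * x

    def update(self, delta):
        # The stored state is modified in place, like an assign_add op.
        self.weight += delta


model = ToyModel(2.0)
print(model.predict(3.0))   # 6.0
model.update(1.0)
print(model.predict(3.0))   # 9.0
```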

2.7.1- feed_dict

If you use a placeholder, you must supply an actual value at run time; otherwise an error is raised:

# signature: tf.placeholder(dtype, shape=None, name=None)
# create a placeholder for a vector of 3 elements, type tf.float32
a = tf.placeholder(tf.float32, shape=[3])

b = tf.constant([5, 5, 5], tf.float32)

# use the placeholder as you would a constant or a variable
c = a + b  # short for tf.add(a, b)

with tf.Session() as sess:
    print(sess.run(c)) 			# >> InvalidArgumentError: a doesn't have an actual value

Use the feed_dict argument to give the placeholder a value:

# create a placeholder for a vector of 3 elements, type tf.float32
a = tf.placeholder(tf.float32, shape=[3])

b = tf.constant([5, 5, 5], tf.float32)

# use the placeholder as you would a constant or a variable
c = a + b  # short for tf.add(a, b)

with tf.Session() as sess:
    print(sess.run(c, feed_dict={a: [1, 2, 3]})) 	# the tensor a is the key, not the string 'a'

# >> [6. 7. 8.]

Note that the keys of feed_dict are the placeholder objects themselves, not strings. A placeholder is also a valid op, and tf.placeholder likewise returns a tf.Tensor object.

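2.7.2- Feeding data multiple times

The loop below, for example, feeds different data on each iteration:

with tf.Session() as sess:
    for a_value in list_of_values_for_a:
        print(sess.run(c, {a: a_value}))

This is not only correct but very common: in machine learning algorithms you define a training op once and then repeatedly feed it different training data. The placeholders receive new data on every run, while the model parameters, held in Variables, keep being updated across iterations.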

2.7.3- is_feedable

In fact, feed_dict can feed more than placeholders: any feedable tensor can be fed. A placeholder is just a way of indicating that something must be fed.

You can check whether a tensor can be fed with is_feedable:

tf.Graph.is_feedable(tensor)
# True if and only if tensor is feedable.

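This is especially useful for testing. When a graph is very large and we only want to test part of it, we can feed dummy values this way and skip unnecessary computation.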

# create operations, tensors, etc (using the default graph)
a = tf.add(2, 5)
b = tf.multiply(a, 3)

with tf.Session() as sess:
	# compute the value of b given a is 15
	sess.run(b, feed_dict={a: 15}) 				# >> 45

2.7.4- tf.data

Placeholders are a simple, older mechanism. A better option is tf.data, which the next chapter introduces using linear and logistic regression as examples.

2.8- lazy loading

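Lazy loading, meaning that an op is not created until the moment it is run, is a common mistake.

The following, for example, is normal loading:

[Screenshot: normal-loading code, with the add op created once before the loop]

And this is lazy loading:

[Screenshot: lazy-loading code, with the add op created inside the loop]

The two programs appear to produce the same result, and the latter even seems to save a line of code. But the latter creates a new add node on every loop iteration, bloating the graph definition (imagine a loop of one million iterations: the cost becomes very high).

After running both programs and inspecting the two graph definitions, we find that the former has only one add node: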

node {
  name: "Add"
  op: "Add"
  input: "x/read"
  input: "y/read"
  attr {
    key: "T"
    value {
      type: DT_INT32
    }
  }
}

The latter generates ten add nodes, Add_1 through Add_10:

node {
  name: "Add_1"
  op: "Add"
  ...
  }
...
node {
  name: "Add_10"
  op: "Add"
  ...
}
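The graph growth can be reproduced with a toy recorder in plain Python. This is a sketch, not TensorFlow: ToyGraph is a hypothetical class that only records node names, its naming scheme (Add, Add_1, ...) merely approximates real auto-naming, and the values 10 and 20 are arbitrary. It does capture the mechanism: defining the op once adds one node, while creating it inside the loop adds one node per iteration.

```python
class ToyGraph:
    """Hypothetical stand-in for a graph: it only records node names."""

    def __init__(self):
        self.nodes = []

    def add(self, x, y):
        # Every call registers a new node, whether or not one already exists.
        count = sum(1 for n in self.nodes if n.split("_")[0] == "Add")
        self.nodes.append("Add" if count == 0 else "Add_%d" % count)
        # Return a deferred computation, like an op that is run later.
        return lambda: x + y


def normal_loading(steps):
    graph = ToyGraph()
    z = graph.add(10, 20)                  # op defined once, before the loop
    results = [z() for _ in range(steps)]  # the loop only runs it
    return graph.nodes, results


def lazy_loading(steps):
    graph = ToyGraph()
    # op created at the moment it is run: one new node per iteration
    results = [graph.add(10, 20)() for _ in range(steps)]
    return graph.nodes, results


print(len(normal_loading(10)[0]))   # 1
print(len(lazy_loading(10)[0]))     # 10
```

Both functions return the same results; only the size of the recorded graph differs.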

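We should avoid lazy loading by:

Separating the definition of ops from their execution.
Using a Python property to ensure a function is only loaded the first time it is called, as in the following:

[Screenshot: code using a Python property so that an op is created only on first access]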
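The property-based approach can be sketched as a caching decorator. The names lazy_property, Model, and prediction are illustrative assumptions, and the string returned stands in for a real op; the point is only that the wrapped method body runs once and later accesses reuse the cached result.

```python
import functools


def lazy_property(function):
    """Turn a method into a property whose body runs only on first access."""
    attribute = "_cache_" + function.__name__

    @property
    @functools.wraps(function)
    def wrapper(self):
        if not hasattr(self, attribute):
            # First access: build the value and remember it on the instance.
            setattr(self, attribute, function(self))
        return getattr(self, attribute)

    return wrapper


class Model:
    def __init__(self):
        self.build_calls = 0

    @lazy_property
    def prediction(self):
        # In a real model this is where the op would be added to the graph.
        self.build_calls += 1
        return "prediction_op"


model = Model()
model.prediction
model.prediction
print(model.build_calls)   # 1
```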

For more on lazy loading, see: https://danijar.com/structuring-your-tensorflow-models/

[Reposted from]: http://imshuai.com/cs20-tensorflow-notes-2/
