Tensorflow (2): Understanding the tf.slim Library

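Main purpose

  • Wraps recurring boilerplate, mainly high-level layers and variables, so that users can write more compact code
  • Includes several widely used CV models (VGG, AlexNet)
  • Provides high-level routines for training (losses, learning) and evaluation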

Main components

Items marked [o] are covered below; items marked [x] have no official documentation yet

  • [o]variables: provides convenience wrappers for variable creation and manipulation.
  • [o]layers: contains high level layers for building models using tensorflow.
  • [o]arg_scopes
  • [o]losses: contains commonly used loss functions.
  • [o]learning
  • [o]metrics: contains popular evaluation metrics.
  • [o]evaluation
  • [o]data: contains TF-slim’s dataset definition, data providers, parallel_reader, and decoding utilities.
  • [x]nets: contains popular network definitions such as VGG and AlexNet models.
  • [x]queues: provides a context manager for easily and safely starting and closing QueueRunners.
  • [x]regularizers: contains weight regularizers.

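Defining a model

Variable

  1. Wrapping variables
    Example:
weights = slim.variable('weights',
                        shape=[10, 10, 3, 3],
                        initializer=tf.truncated_normal_initializer(stddev=0.1),
                        regularizer=slim.l2_regularizer(0.05),
                        device='/CPU:0')
  2. Managing model variables
    Native TF has two kinds of variables: regular variables and local variables. A regular variable can be saved with a saver; a local variable exists only for the duration of a session and cannot be saved.
    Slim distinguishes two further kinds: model variables and non-model variables. Model variables are the learned parameters that must be loaded for evaluation or inference, such as the parameters of slim.fully_connected or slim.conv2d layers. Non-model variables are needed during training or evaluation but not for inference, such as global_step.
    Example: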
# Model Variables
weights = slim.model_variable('weights',
                              shape=[10, 10, 3 , 3],
                              initializer=tf.truncated_normal_initializer(stddev=0.1),
                              regularizer=slim.l2_regularizer(0.05),
                              device='/CPU:0')
model_variables = slim.get_model_variables()

# Regular variables
my_var = slim.variable('my_var',
                       shape=[20, 1],
                       initializer=tf.zeros_initializer())
regular_variables_and_model_variables = slim.get_variables()

How slim.get_model_variables() works: when a model variable is created through slim, slim adds it to the tf.GraphKeys.MODEL_VARIABLES collection. A variable created by your own code can be handed over to slim as follows:

my_model_variable = CreateViaCustomCode()
# Letting TF-Slim know about the additional variable.
slim.add_model_variable(my_model_variable)
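
The collection bookkeeping described above can be pictured with a small pure-Python sketch. It is an illustration only, not TF-Slim's implementation: the `_collections` dictionary, the collection keys and the use of plain strings as variables are all assumptions standing in for TensorFlow's graph collections and real variables.

```python
from collections import defaultdict

# Hypothetical stand-ins for graph collections; not TF-Slim's real data structures.
MODEL_VARIABLES = "model_variables"
GLOBAL_VARIABLES = "variables"
_collections = defaultdict(list)


def variable(name):
    """Create a regular variable: tracked only in the global collection."""
    _collections[GLOBAL_VARIABLES].append(name)
    return name


def model_variable(name):
    """Create a model variable: tracked globally and in the model collection."""
    variable(name)
    _collections[MODEL_VARIABLES].append(name)
    return name


def add_model_variable(name):
    """Register a variable created elsewhere as a model variable."""
    if name not in _collections[MODEL_VARIABLES]:
        _collections[MODEL_VARIABLES].append(name)


def get_model_variables():
    return list(_collections[MODEL_VARIABLES])


def get_variables():
    return list(_collections[GLOBAL_VARIABLES])


model_variable("weights")
variable("my_var")
add_model_variable("custom_var")  # stands for a variable built by custom code
```

In this sketch `get_model_variables()` sees `weights` and `custom_var`, while `my_var` shows up only in `get_variables()`, which mirrors the model / non-model split in the text.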

Layer

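Layers wrapped by slim
Creating a convolutional layer in native TF takes several low-level steps:
* create the weight and bias variables
* convolve the weights with the output of the previous layer
* add the biases to the result of the convolution
* apply the activation function
Example: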

input = ...
with tf.name_scope('conv1_1') as scope:
  kernel = tf.Variable(tf.truncated_normal([3, 3, 64, 128], dtype=tf.float32,
                                           stddev=1e-1), name='weights')
  conv = tf.nn.conv2d(input, kernel, [1, 1, 1, 1], padding='SAME')
  biases = tf.Variable(tf.constant(0.0, shape=[128], dtype=tf.float32),
                       trainable=True, name='biases')
  bias = tf.nn.bias_add(conv, biases)
  conv1 = tf.nn.relu(bias, name=scope)

Slim provides a compact replacement for the code above:

input = ...
net = slim.conv2d(input, 128, [3, 3], scope='conv1_1')

It also wraps other commonly used layers, such as slim.batch_norm and slim.fully_connected.

Scope

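To keep the computation graph modular and easy to manage, TF uses scopes, which group variables by adding a prefix to their names. Native TF provides name_scope and variable_scope.
The difference: for variables created with tf.Variable(), both tf.name_scope() and tf.variable_scope() add a prefix to the name of the variable and of ops;
for variables created with tf.get_variable(), tf.name_scope() adds no prefix to the variable name.
The main difference between get_variable and Variable: every call to Variable creates a new variable, so reuse=True has no effect on it, whereas get_variable, inside a variable_scope with reuse=True, returns the existing variable of that name instead of creating a new one.

On top of the native scopes, slim introduces arg_scope, which passes a set of default arguments to every listed operation called inside the scope. Example:

# Without arg_scope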
padding = 'SAME'
initializer = tf.truncated_normal_initializer(stddev=0.01)
regularizer = slim.l2_regularizer(0.0005)
net = slim.conv2d(inputs, 64, [11, 11], 4,
                  padding=padding,
                  weights_initializer=initializer,
                  weights_regularizer=regularizer,
                  scope='conv1')
net = slim.conv2d(net, 128, [11, 11],
                  padding='VALID',
                  weights_initializer=initializer,
                  weights_regularizer=regularizer,
                  scope='conv2')
net = slim.conv2d(net, 256, [11, 11],
                  padding=padding,
                  weights_initializer=initializer,
                  weights_regularizer=regularizer,
                  scope='conv3')

# Equivalent code using slim.arg_scope
with slim.arg_scope([slim.conv2d], padding='SAME',
                    weights_initializer=tf.truncated_normal_initializer(stddev=0.01),
                    weights_regularizer=slim.l2_regularizer(0.0005)):
  net = slim.conv2d(inputs, 64, [11, 11], 4, scope='conv1')
  net = slim.conv2d(net, 128, [11, 11], padding='VALID', scope='conv2')
  net = slim.conv2d(net, 256, [11, 11], scope='conv3')

Example: building VGG16

With these slim operations, the VGG16 network can be defined very easily:

def vgg16(inputs):
  with slim.arg_scope([slim.conv2d, slim.fully_connected],
                      activation_fn=tf.nn.relu,
                      weights_initializer=tf.truncated_normal_initializer(0.0, 0.01),
                      weights_regularizer=slim.l2_regularizer(0.0005)):
    net = slim.repeat(inputs, 2, slim.conv2d, 64, [3, 3], scope='conv1')
    net = slim.max_pool2d(net, [2, 2], scope='pool1')
    net = slim.repeat(net, 2, slim.conv2d, 128, [3, 3], scope='conv2')
    net = slim.max_pool2d(net, [2, 2], scope='pool2')
    net = slim.repeat(net, 3, slim.conv2d, 256, [3, 3], scope='conv3')
    net = slim.max_pool2d(net, [2, 2], scope='pool3')
    net = slim.repeat(net, 3, slim.conv2d, 512, [3, 3], scope='conv4')
    net = slim.max_pool2d(net, [2, 2], scope='pool4')
    net = slim.repeat(net, 3, slim.conv2d, 512, [3, 3], scope='conv5')
    net = slim.max_pool2d(net, [2, 2], scope='pool5')
    net = slim.fully_connected(net, 4096, scope='fc6')
    net = slim.dropout(net, 0.5, scope='dropout6')
    net = slim.fully_connected(net, 4096, scope='fc7')
    net = slim.dropout(net, 0.5, scope='dropout7')
    net = slim.fully_connected(net, 1000, activation_fn=None, scope='fc8')
  return net
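
The slim.repeat calls above chain the same layer several times. A rough pure-Python sketch of that idea follows; it is an assumption-laden illustration rather than the library code. `repeat` and `toy_conv2d` are hypothetical, and the `scope/scope_N` naming imitates the numbered scopes slim assigns to each repetition.

```python
def repeat(inputs, repetitions, layer, *args, scope="Repeat", **kwargs):
    """Apply layer repeatedly, feeding each output into the next call."""
    outputs = inputs
    for i in range(repetitions):
        # Each call gets its own numbered scope, e.g. conv3/conv3_1.
        kwargs["scope"] = "%s/%s_%d" % (scope, scope, i + 1)
        outputs = layer(outputs, *args, **kwargs)
    return outputs


def toy_conv2d(inputs, num_outputs, kernel_size, scope=None):
    # Hypothetical layer: appends a description of itself to a list of layers.
    return inputs + [(scope, num_outputs, tuple(kernel_size))]


net = repeat([], 3, toy_conv2d, 256, [3, 3], scope="conv3")
```

After the call, `net` holds three entries scoped `conv3/conv3_1` to `conv3/conv3_3`, one per repetition.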

Training a model

Training a model in TF typically requires a model, a loss function, the gradient computation and a training routine.
The sections above showed how to define a model concisely; slim also provides common loss functions and training and evaluation routines.

Loss

In slim, a loss can be defined easily with the losses module:

import tensorflow as tf
import tensorflow.contrib.slim.nets as nets

slim = tf.contrib.slim
vgg = nets.vgg

# Load the images and labels.
images, labels = ...

# Create the model.
predictions, _ = vgg.vgg_16(images)

# Define the loss functions and get the total loss.
loss = slim.losses.softmax_cross_entropy(predictions, labels)

A multi-task loss can be defined as well:

# Load the images and labels.
images, scene_labels, depth_labels = ...

# Create the model.
scene_predictions, depth_predictions = CreateMultiTaskModel(images)

# Define the loss functions and get the total loss.
classification_loss = slim.losses.softmax_cross_entropy(scene_predictions, scene_labels)
sum_of_squares_loss = slim.losses.sum_of_squares(depth_predictions, depth_labels)

# The following two lines have the same effect:
total_loss = classification_loss + sum_of_squares_loss
total_loss = slim.losses.get_total_loss(add_regularization_losses=False)

Slim's losses work by maintaining a special TensorFlow collection of loss functions, so that all losses in the program can be managed in one place. A custom loss can be added to this collection as well:

# Load the images and labels.
images, scene_labels, depth_labels, pose_labels = ...

# Create the model.
scene_predictions, depth_predictions, pose_predictions = CreateMultiTaskModel(images)

# Define the loss functions and get the total loss.
classification_loss = slim.losses.softmax_cross_entropy(scene_predictions, scene_labels)
sum_of_squares_loss = slim.losses.sum_of_squares(depth_predictions, depth_labels)
pose_loss = MyCustomLossFunction(pose_predictions, pose_labels)
slim.losses.add_loss(pose_loss) # Letting TF-Slim know about the additional loss.

# The following two ways to compute the total loss are equivalent:
regularization_loss = tf.add_n(slim.losses.get_regularization_losses())
total_loss1 = classification_loss + sum_of_squares_loss + pose_loss + regularization_loss

# (Regularization Loss is included in the total loss by default).
total_loss2 = slim.losses.get_total_loss()
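
The loss collection can be mimicked in a few lines of plain Python. This is only a sketch of the bookkeeping under simplifying assumptions: losses are floats instead of tensors, and `_losses`, `_regularization_losses` and the functions below are hypothetical stand-ins, not the slim.losses API.

```python
_losses = []                 # hypothetical stand-in for the losses collection
_regularization_losses = []  # hypothetical stand-in for regularization losses


def add_loss(loss):
    """Register a loss value so that get_total_loss() can find it."""
    _losses.append(loss)
    return loss


def sum_of_squares(predictions, labels):
    """A built-in style loss: computes its value and registers it itself."""
    loss = sum((p - l) ** 2 for p, l in zip(predictions, labels))
    return add_loss(loss)


def get_total_loss(add_regularization_losses=True):
    total = sum(_losses)
    if add_regularization_losses:
        total += sum(_regularization_losses)
    return total


depth_loss = sum_of_squares([1.0, 2.0], [0.0, 4.0])  # registered automatically
pose_loss = add_loss(0.5)             # custom loss, registered by hand
_regularization_losses.append(0.25)   # e.g. an L2 weight penalty
```

Built-in style losses register themselves, a custom loss has to be added explicitly, and the regularization terms are included in the total unless `add_regularization_losses=False` is passed.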

Training Loop

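In slim, slim.learning.create_train_op builds a single op that computes the loss, computes the gradients and updates the parameters, and returns the loss. slim.learning.train then runs the training iterations.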

g = tf.Graph()

# Create the model and specify the losses...
...

total_loss = slim.losses.get_total_loss()
optimizer = tf.train.GradientDescentOptimizer(learning_rate)

# create_train_op ensures that each time we ask for the loss, the update_ops
# are run and the gradients being computed are applied too.
train_op = slim.learning.create_train_op(total_loss, optimizer)
logdir = ... # Where checkpoints are stored.

slim.learning.train(
    train_op,
    logdir,
    number_of_steps=1000,
    save_summaries_secs=300,
    save_interval_secs=600)
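
The division of labour between a train op and the training loop can be sketched without TensorFlow. The toy below is an assumption for illustration, not slim.learning itself: it minimizes (w - 3)^2 with plain gradient descent, and the hypothetical `create_train_op` returns a callable that performs one update and reports the loss, which `train` then calls repeatedly.

```python
def create_train_op(params, loss_fn, grad_fn, learning_rate):
    """Return a callable that does one update step and returns the loss."""
    def train_op():
        loss = loss_fn(params["w"])  # loss at the current parameters
        params["w"] -= learning_rate * grad_fn(params["w"])  # gradient step
        return loss
    return train_op


def train(train_op, number_of_steps):
    """Run train_op repeatedly and return the last reported loss."""
    loss = None
    for _ in range(number_of_steps):
        loss = train_op()
    return loss


params = {"w": 0.0}
train_op = create_train_op(
    params,
    loss_fn=lambda w: (w - 3.0) ** 2,
    grad_fn=lambda w: 2.0 * (w - 3.0),
    learning_rate=0.1)
final_loss = train(train_op, number_of_steps=100)
```

Checkpointing and summary writing, which slim.learning.train also handles, are deliberately left out of this sketch.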

Example: training VGG16

import tensorflow as tf
import tensorflow.contrib.slim.nets as nets

slim = tf.contrib.slim
vgg = nets.vgg

...

train_log_dir = ...
if not tf.gfile.Exists(train_log_dir):
  tf.gfile.MakeDirs(train_log_dir)

with tf.Graph().as_default():
  # Set up the data loading:
  images, labels = ...

  # Define the model:
  predictions, _ = vgg.vgg_16(images, is_training=True)

  # Specify the loss function:
  slim.losses.softmax_cross_entropy(predictions, labels)

  total_loss = slim.losses.get_total_loss()
  tf.summary.scalar('losses/total_loss', total_loss)

  # Specify the optimization scheme:
  optimizer = tf.train.GradientDescentOptimizer(learning_rate=.001)

  # create_train_op that ensures that when we evaluate it to get the loss,
  # the update_ops are done and the gradient updates are computed.
  train_tensor = slim.learning.create_train_op(total_loss, optimizer)

  # Actually runs training.
  slim.learning.train(train_tensor, train_log_dir)

Fine-tuning a model

  1. In native TF, parameters are restored with tf.train.Saver():
# Create some variables.
v1 = tf.Variable(..., name="v1")
v2 = tf.Variable(..., name="v2")
...
# Add ops to restore all the variables.
restorer = tf.train.Saver()

# Add ops to restore some variables.
restorer = tf.train.Saver([v1, v2])

# Later, launch the model, use the saver to restore variables from disk, and
# do some work with the model.
with tf.Session() as sess:
  # Restore variables from disk.
  restorer.restore(sess, "/tmp/model.ckpt")
  print("Model restored.")
  # Do some work with the model
  ...
  2. Slim makes it easy to restore only a subset of the parameters
# Create some variables.
v1 = slim.variable(name="v1", ...)
v2 = slim.variable(name="nested/v2", ...)
...

# Get list of variables to restore (which contains only 'v2'). These are all
# equivalent methods:
variables_to_restore = slim.get_variables_by_name("v2")
# or
variables_to_restore = slim.get_variables_by_suffix("2")
# or
variables_to_restore = slim.get_variables(scope="nested")
# or
variables_to_restore = slim.get_variables_to_restore(include=["nested"])
# or
variables_to_restore = slim.get_variables_to_restore(exclude=["v1"])

# Create the saver which will be used to restore the variables.
restorer = tf.train.Saver(variables_to_restore)
  3. Example: Fine-Tuning VGG16
# Load the Pascal VOC data
image, label = MyPascalVocDataLoader(...)
images, labels = tf.train.batch([image, label], batch_size=32)

# Create the model
predictions, _ = vgg.vgg_16(images)

train_op = slim.learning.create_train_op(...)

# Specify where the Model, trained on ImageNet, was saved.
model_path = '/path/to/pre_trained_on_imagenet.checkpoint'

# Specify where the new model will live:
log_dir = '/path/to/my_pascal_model_dir/'

# Restore only the convolutional layers:
variables_to_restore = slim.get_variables_to_restore(exclude=['fc6', 'fc7', 'fc8'])
init_fn = slim.assign_from_checkpoint_fn(model_path, variables_to_restore)

# Start training.
slim.learning.train(train_op, log_dir, init_fn=init_fn)
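
The include/exclude selection used when restoring a subset of variables can be pictured as filtering names by scope prefix. The sketch below works on plain name strings and is an assumption for illustration; the real slim.get_variables_to_restore operates on graph variables and its exact matching rules may differ.

```python
def get_variables_to_restore(all_names, include=None, exclude=None):
    """Select variable names by scope prefix (toy version, names only)."""
    def in_scope(name, scope):
        return name == scope or name.startswith(scope.rstrip("/") + "/")

    selected = [n for n in all_names
                if include is None or any(in_scope(n, s) for s in include)]
    if exclude:
        selected = [n for n in selected
                    if not any(in_scope(n, s) for s in exclude)]
    return selected


names = ["v1", "nested/v2", "nested/v3"]
only_nested = get_variables_to_restore(names, include=["nested"])
without_v1 = get_variables_to_restore(names, exclude=["v1"])
```

Both calls select the two variables under `nested`, which is the sense in which the include and exclude forms in the text are interchangeable for that example.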

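Evaluating a model

Built-in metrics

images, labels = LoadTestData(...)
predictions = MyModel(images)

# Each value_op is idempotent and returns the current value of the metric.
# Each update_op folds the current batch into the running state and returns the updated value.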
mae_value_op, mae_update_op = slim.metrics.streaming_mean_absolute_error(predictions, labels)
mre_value_op, mre_update_op = slim.metrics.streaming_mean_relative_error(predictions, labels)
pl_value_op, pl_update_op = slim.metrics.streaming_percentage_less(mean_relative_errors, 0.3)
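
The value_op / update_op pair is easier to reason about with a toy running metric. The class below is a hypothetical pure-Python analogue, not the slim.metrics implementation: `update` plays the role of the update_op and `value` the role of the value_op.

```python
class StreamingMeanAbsoluteError:
    """Toy running mean of |prediction - label| over all batches seen so far."""

    def __init__(self):
        self.total = 0.0  # sum of absolute errors
        self.count = 0    # number of samples

    def update(self, predictions, labels):
        """Fold one batch into the running state and return the new mean."""
        for p, l in zip(predictions, labels):
            self.total += abs(p - l)
            self.count += 1
        return self.value()

    def value(self):
        """Idempotent read of the current mean; does not change any state."""
        return self.total / self.count if self.count else 0.0


mae = StreamingMeanAbsoluteError()
first = mae.update([1.0, 2.0], [0.0, 4.0])   # absolute errors 1 and 2
second = mae.update([5.0, 5.0], [5.0, 6.0])  # absolute errors 0 and 1
```

Calling `value()` any number of times returns the same result, while each `update()` changes the accumulated state; this is why an evaluation loop runs the update once per batch and reads the value only at the end.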

Automatic updates

Use slim.metrics.aggregate_metric_map to collect the value_ops and update_ops of several metrics into two dictionaries.

import tensorflow as tf
import tensorflow.contrib.slim.nets as nets

slim = tf.contrib.slim
vgg = nets.vgg


# Load the data
images, labels = load_data(...)

# Define the network
predictions, _ = vgg.vgg_16(images)

# Choose the metrics to compute:
names_to_values, names_to_updates = slim.metrics.aggregate_metric_map({
    "eval/mean_absolute_error": slim.metrics.streaming_mean_absolute_error(predictions, labels),
    "eval/mean_squared_error": slim.metrics.streaming_mean_squared_error(predictions, labels),
})

# Evaluate the model using 1000 batches of data:
num_batches = 1000

with tf.Session() as sess:
  sess.run(tf.global_variables_initializer())
  sess.run(tf.local_variables_initializer())

  for batch_id in range(num_batches):
    sess.run(list(names_to_updates.values()))

  metric_values = sess.run(list(names_to_values.values()))
  for metric, value in zip(names_to_values.keys(), metric_values):
    print('Metric %s has value: %f' % (metric, value))

evaluation_loop

import math

import tensorflow as tf

slim = tf.contrib.slim

# Load the data
images, labels = load_data(...)

# Define the network
predictions = MyModel(images)

# Choose the metrics to compute:
names_to_values, names_to_updates = slim.metrics.aggregate_metric_map({
    'accuracy': slim.metrics.streaming_accuracy(predictions, labels),
    'precision': slim.metrics.streaming_precision(predictions, labels),
    'recall': slim.metrics.streaming_recall(predictions, labels),
})

# Create the summary ops such that they also print out to std output:
summary_ops = []
for metric_name, metric_value in names_to_values.items():
  op = tf.summary.scalar(metric_name, metric_value)
  op = tf.Print(op, [metric_value], metric_name)
  summary_ops.append(op)

num_examples = 10000
batch_size = 32
num_batches = math.ceil(num_examples / float(batch_size))

# Setup the global step.
slim.get_or_create_global_step()

checkpoint_dir = ... # Where the model checkpoints are stored.
output_dir = ... # Where the summaries are stored.
eval_interval_secs = ... # How often to run the evaluation.
slim.evaluation.evaluation_loop(
    'local',
    checkpoint_dir,
    output_dir,
    num_evals=num_batches,
    eval_op=names_to_updates.values(),
    summary_op=tf.summary.merge(summary_ops),
    eval_interval_secs=eval_interval_secs)

Data

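A slim dataset is a tuple that encapsulates the specific components of a dataset. It consists of the following parts:
* data_sources: the file paths that make up the dataset
* reader: a reader suited to the data type of data_sources
* decoder: a decoder for the data read from the files
* num_samples: the number of samples in the dataset
* items_to_descriptions: a map from the items the dataset provides to their descriptions
In short, a slim dataset opens the data_sources files with the reader class (which yields serialized records), decodes them with the decoder, and lets the user request a list of items to be returned as Tensors.

Decoder example: TFExampleDecoder

The purpose of TFExampleDecoder is to map TFExample records to item(s), such as images or labels.
TFExample protocol buffers map keys (strings) to features, specified as tf.FixedLenFeature or tf.VarLenFeature. TFExampleDecoder defines the key-to-feature mapping and, to decode those features into items, a set of ItemHandlers. Each ItemHandler maps an item to one or more keys and produces the final item.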

keys_to_features = {
    'image/encoded': tf.FixedLenFeature((), tf.string, default_value=''),
    'image/format': tf.FixedLenFeature((), tf.string, default_value='raw'),
    'image/class/label': tf.FixedLenFeature(
        [1], tf.int64, default_value=tf.zeros([1], dtype=tf.int64)),
}

items_to_handlers = {
    'image': tfexample_decoder.Image(
      image_key = 'image/encoded',
      format_key = 'image/format',
      shape=[28, 28],
      channels=1),
    'label': tfexample_decoder.Tensor('image/class/label'),
}

decoder = tfexample_decoder.TFExampleDecoder(
    keys_to_features, items_to_handlers)

This parses a TFExample using three keys (image/encoded, image/format and image/class/label) and maps the first two keys to a single item named image. The decoder ultimately provides two items ('image' and 'label').
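
The two-stage decoding (keys to features, then handlers to items) can be imitated with dictionaries. Everything in the sketch below is a hypothetical pure-Python analogue written for illustration: `ToyExampleDecoder` and the toy `Image` and `Tensor` handlers are not the tfexample_decoder classes, and records are plain dicts instead of serialized protocol buffers.

```python
class Tensor:
    """Toy handler: an item that maps directly to one key."""

    def __init__(self, key):
        self.keys = [key]

    def decode(self, values):
        return values[0]


class Image:
    """Toy handler: combines an encoded payload and its format into one item."""

    def __init__(self, image_key, format_key):
        self.keys = [image_key, format_key]

    def decode(self, values):
        payload, fmt = values
        return {"format": fmt, "data": payload}


class ToyExampleDecoder:
    def __init__(self, keys_to_defaults, items_to_handlers):
        self.keys_to_defaults = keys_to_defaults
        self.items_to_handlers = items_to_handlers

    def list_items(self):
        return list(self.items_to_handlers)

    def decode(self, example, items=None):
        # Step 1: parse the record into features, filling in defaults.
        features = {k: example.get(k, d)
                    for k, d in self.keys_to_defaults.items()}
        # Step 2: let each requested handler build its item from its keys.
        items = items or self.list_items()
        return [
            self.items_to_handlers[i].decode(
                [features[k] for k in self.items_to_handlers[i].keys])
            for i in items
        ]


decoder = ToyExampleDecoder(
    {"image/encoded": b"", "image/format": "raw", "image/class/label": 0},
    {"image": Image("image/encoded", "image/format"),
     "label": Tensor("image/class/label")})
image, label = decoder.decode(
    {"image/encoded": b"\x00\x01", "image/class/label": 7},
    items=["image", "label"])
```

The record in this sketch omits `image/format`, so the default `'raw'` is used, and three keys collapse into the two items `image` and `label`, as in the example above.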

DataProvider example: DatasetDataProvider

dataset = GetDataset(...)
data_provider = tf.contrib.slim.dataset_data_provider.DatasetDataProvider(
    dataset, common_queue_capacity=32, common_queue_min=8)

DatasetDataProvider provides control over num_readers, num_epochs and shuffle.