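1. Language model perplexity is an important metric for judging the quality of a language model. It is computed as P(sentence)^(-1/N), where N is the number of words in the sentence; a lower perplexity means a better model.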
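As a small worked illustration of the formula above, the following sketch computes the perplexity of one sentence in two equivalent ways: directly as P(sentence)^(-1/N), and as the exponential of the mean negative log-probability (the form that a cross-entropy loss gives you). The function names and the probability values are made up for illustration and are not part of any library.

```python
import math


def sentence_perplexity(token_probs):
    """Perplexity from the probability the model assigned to each token."""
    n = len(token_probs)
    # P(sentence) is the product of the per-token conditional probabilities.
    p_sentence = math.prod(token_probs)
    return p_sentence ** (-1.0 / n)


def sentence_perplexity_from_logs(token_probs):
    """Same quantity, computed as exp(mean negative log-probability)."""
    n = len(token_probs)
    mean_nll = -sum(math.log(p) for p in token_probs) / n
    return math.exp(mean_nll)


# Hypothetical per-token probabilities for a 4-token sentence.
probs = [0.25, 0.5, 0.125, 0.5]
print(sentence_perplexity(probs))
print(sentence_perplexity_from_logs(probs))
```

The second form is the one usually used in practice, because multiplying many small probabilities underflows quickly while summing their logarithms does not.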


legacy_seq2seq.sequence_loss_by_example(logits, targets, weights, average_across_timesteps=True, softmax_loss_function=None, name=None)
loss = legacy_seq2seq.sequence_loss_by_example([self.logits],
                [tf.reshape(self.targets, [-1])],
                [tf.ones([args.batch_size * args.seq_length])],
                _size)
# Average log-perplexity over all sentences in this batch
self.cost = tf.reduce_sum(loss) / args.batch_size
# Average perplexity over all sentences in this batch
self.perplexity = tf.exp(self.cost)


from tensorflow.python.framework import ops
from tensorflow.python.ops import array_ops
from tensorflow.python.ops import math_ops
from tensorflow.python.ops import nn_ops


def sequence_loss_by_example(logits,
                             targets,
                             weights,
                             average_across_timesteps=True,
                             softmax_loss_function=None,
                             name=None):
    """Weighted cross-entropy loss for a sequence of logits (per example).

    Args:
      logits: List of 2D Tensors of shape [batch_size x num_decoder_symbols].
      targets: List of 1D batch-sized int32 Tensors of the same length as logits.
      weights: List of 1D batch-sized float-Tensors of the same length as logits.
      average_across_timesteps: If set, divide the returned cost by the total
        label weight.
      softmax_loss_function: Function (labels-batch, inputs-batch) -> loss-batch
        to be used instead of the standard softmax (the default if this is None).
      name: Optional name for this operation, default: "sequence_loss_by_example".

    Returns:
      1D batch-sized float Tensor: The log-perplexity for each sequence.

    Raises:
      ValueError: If len(logits) is different from len(targets) or len(weights).
    """
    if len(targets) != len(logits) or len(weights) != len(logits):
        raise ValueError("Lengths of logits, weights, and targets must be the same "
                         "%d, %d, %d." % (len(logits), len(weights), len(targets)))
    with ops.name_scope(name, "sequence_loss_by_example",
                        logits + targets + weights):
        log_perp_list = []
        for logit, target, weight in zip(logits, targets, weights):
            if softmax_loss_function is None:
                # TODO(irving,ebrevdo): This reshape is needed because
                # sequence_loss_by_example is called with scalars sometimes, which
                # violates our general scalar strictness policy.
                target = array_ops.reshape(target, [-1])
                crossent = nn_ops.sparse_softmax_cross_entropy_with_logits(
                    labels=target, logits=logit)
            else:
                crossent = softmax_loss_function(target, logit)
            # Weighted loss at this timestep, one value per example.
            log_perp_list.append(crossent * weight)
        # Sum the per-timestep losses for each example.
        log_perps = math_ops.add_n(log_perp_list)
        if average_across_timesteps:
            total_size = math_ops.add_n(weights)
            total_size += 1e-12  # Just to avoid division by 0 for all-0 weights.
            log_perps /= total_size
    return log_perps

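1. Compute the loss at each time step (RNN timestep) of the sentence, then sum the losses over all time steps.
2. Take the sentence length (the number of timesteps) and divide the summed loss by it. (This averaging is applied by default and is controlled by average_across_timesteps.)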
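The two steps above can be sketched in plain NumPy. This is only an illustrative re-implementation under simplifying assumptions (standard softmax, no custom loss function), not the TensorFlow code itself; the function name log_perplexity_per_example and the toy inputs are hypothetical.

```python
import numpy as np


def log_perplexity_per_example(logits, targets, weights,
                               average_across_timesteps=True):
    """NumPy sketch of the per-example log-perplexity.

    logits:  list (one entry per timestep) of [batch, vocab] float arrays.
    targets: list of [batch] int arrays.
    weights: list of [batch] float arrays.
    """
    log_perps = np.zeros(logits[0].shape[0])
    for logit, target, weight in zip(logits, targets, weights):
        # Numerically stable log-softmax over the vocabulary axis.
        shifted = logit - logit.max(axis=1, keepdims=True)
        log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
        # Step 1: cross-entropy of the target at this timestep, weighted and summed.
        crossent = -log_probs[np.arange(len(target)), target]
        log_perps += crossent * weight
    if average_across_timesteps:
        # Step 2: divide by the total weight, which is the sentence length
        # when every weight is 1. The small constant avoids division by zero.
        total_size = np.sum(weights, axis=0) + 1e-12
        log_perps = log_perps / total_size
    return log_perps


# Toy input: a model that outputs uniform logits over a 5-symbol vocabulary.
vocab, steps, batch = 5, 3, 2
logits = [np.zeros((batch, vocab)) for _ in range(steps)]
targets = [np.array([1, 4]) for _ in range(steps)]
weights = [np.ones(batch) for _ in range(steps)]

avg = log_perplexity_per_example(logits, targets, weights)
print(np.exp(avg))  # per-example perplexity
```

For a model that assigns uniform probability to every symbol, the perplexity comes out close to the vocabulary size, which is a convenient sanity check for this kind of code.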




