
tf.nn.embedding_lookup in TensorFlow

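import tensorflow as tf  # TensorFlow 1.x API (graph mode)

src_vocab_size = 10  # number of ids in the vocabulary
src_embed_size = 5   # length of each embedding vector
source = [1, 3]      # ids to look up

with tf.variable_scope("encoder"):
    # Trainable embedding matrix of shape [src_vocab_size, src_embed_size]
    embedding_encoder = tf.get_variable(
        "embedding_encoder", [src_vocab_size, src_embed_size], tf.float32)

# Select rows 1 and 3 of the embedding matrix
encoder_emb_inp = tf.nn.embedding_lookup(embedding_encoder, source)

init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    # Print the full embedding matrix, one row per line
    emb_mat = sess.run(embedding_encoder)
    for line in emb_mat:
        print(line)
    # Print a blank line, then the rows returned by the lookup
    en_input = sess.run(encoder_emb_inp)
    print()
    for line in en_input:
        print(line)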

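Output (the first ten rows are the embedding matrix; the last two are the lookup result, identical to rows 1 and 3 of the matrix):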

[ 0.56113797  0.04369807  0.18308383 -0.48125005 -0.43450889]
[-0.6047132  -0.21060479  0.40796143 -0.40531671  0.55036896]
[ 0.31311834 -0.4060598   0.36560428  0.2722581  -0.02451819]
[ 0.18635517 -0.12266624 -0.39344144 -0.1277926  -0.45468265]
[ 0.30129766  0.56903845 -0.03529584 -0.33247966  0.45404953]
[-0.58887643  0.50933784 -0.19886917 -0.03041148 -0.44376266]
[ 0.35494697  0.25374722  0.41377074  0.06932443 -0.21179438]
[ 0.10084659 -0.60172981  0.49977249 -0.28413546 -0.33590576]
[-0.01577765  0.41795093  0.43442172  0.59790486  0.58752233]
[ 0.42998117 -0.0969131  -0.34563044  0.16796118  0.62855309]

[-0.6047132  -0.21060479  0.40796143 -0.40531671  0.55036896]
[ 0.18635517 -0.12266624 -0.39344144 -0.1277926  -0.45468265]

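1. embedding_lookup is the embedding operation for one-hot encoded ids.
2. embedding_lookup selects rows of the matrix by id. It is more than a plain table lookup, though: the vector stored for each id is trainable, so the operation is effectively a fully connected layer applied to the one-hot encoding of the id.
3. Classification models often use id-type features because we want the model to memorize information about individual items. However, the id space is very high-dimensional and each item appears only a few times, so an item embedding can be used in place of the raw id.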
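
The equivalence between a row lookup and a fully connected layer on one-hot input can be checked with a small sketch. This is only an illustration in plain Python, not TensorFlow's implementation: the helper names (lookup, one_hot, one_hot_matmul) are hypothetical, the matrix values are arbitrary random numbers, and the layer is assumed to have no bias term.

```python
# Pure-Python sketch: selecting rows by id gives the same result as
# multiplying one-hot vectors by the embedding matrix (a bias-free linear layer).
import random

vocab_size, embed_size = 10, 5  # same sizes as in the TensorFlow example
random.seed(0)
# Stand-in for the trainable embedding matrix (values are arbitrary).
embedding = [[random.uniform(-1.0, 1.0) for _ in range(embed_size)]
             for _ in range(vocab_size)]


def lookup(matrix, ids):
    """Return a copy of row `i` of `matrix` for each id `i` (plain row selection)."""
    return [list(matrix[i]) for i in ids]


def one_hot(i, depth):
    """Return a list of length `depth` with 1.0 at position `i` and 0.0 elsewhere."""
    return [1.0 if j == i else 0.0 for j in range(depth)]


def one_hot_matmul(matrix, ids):
    """Compute one_hot(id) x matrix for each id, i.e. a linear layer without bias."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    result = []
    for i in ids:
        h = one_hot(i, n_rows)
        result.append([sum(h[k] * matrix[k][c] for k in range(n_rows))
                       for c in range(n_cols)])
    return result


source = [1, 3]
by_index = lookup(embedding, source)
by_matmul = one_hot_matmul(embedding, source)
print(by_index == by_matmul)
```

Both paths return rows 1 and 3 of the matrix. The lookup form just skips the multiplications by zero, which matters when the vocabulary is large.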