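Understanding and Practicing the ELMo Model (2)

The pretrained word vectors have already been released, so this post describes how to obtain ELMo word vectors directly. In PyTorch, ELMo can be used through the AllenNLP package.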

I. Environment setup

1) Create an allennlp environment in conda:

conda create -n allennlp python=3.6

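2) Install allennlp (the `allennlp.commands.elmo` module used below appears to have been removed in allennlp 1.0, so pinning an older release such as 0.9.0 may be necessary):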

pip install allennlp
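Before going further, it may be worth checking that the package actually imports inside the new environment. The snippet below is only a minimal sketch: `has_module` is a hypothetical helper written for this post, not part of AllenNLP, and it treats any failure during import as "missing".

```python
import importlib


def has_module(name: str) -> bool:
    """Return True if `name` can be imported in the current environment."""
    try:
        importlib.import_module(name)
    except Exception:
        # Any failure (package absent, broken dependency, ...) counts as missing.
        return False
    return True


# Report whether the package and the ELMo helper module used later are importable.
for module_name in ("allennlp", "allennlp.commands.elmo"):
    status = "available" if has_module(module_name) else "missing"
    print(f"{module_name}: {status}")
```

If the first line reports "available" but the second reports "missing", the installed allennlp release most likely does not ship the `ElmoEmbedder` class used later in this post.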

II. Download the pretrained weights and model options

Weights:

https://s3-us-west-2.amazonaws.com/allennlp/models/elmo/2x4096_512_2048cnn_2xhighway/elmo_2x4096_512_2048cnn_2xhighway_weights.hdf5

Model options (the JSON file describing the model configuration):

https://s3-us-west-2.amazonaws.com/allennlp/models/elmo/2x4096_512_2048cnn_2xhighway/elmo_2x4096_512_2048cnn_2xhighway_options.json
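The two files can also be fetched from a script instead of a browser. The following is a rough sketch using only the Python standard library. The helper name `download_elmo_files` and the default target directory `/files` (chosen to match the paths used in the next section) are assumptions made for illustration, and the sketch assumes the URLs above are still reachable.

```python
import os
import urllib.request

# Common prefix of the two URLs listed above.
BASE_URL = "https://s3-us-west-2.amazonaws.com/allennlp/models/elmo/2x4096_512_2048cnn_2xhighway/"
FILE_NAMES = [
    "elmo_2x4096_512_2048cnn_2xhighway_options.json",
    "elmo_2x4096_512_2048cnn_2xhighway_weights.hdf5",
]


def download_elmo_files(target_dir: str = "/files") -> list:
    """Download both files into target_dir, skipping any that already exist.

    Returns the local paths in the same order as FILE_NAMES.
    """
    os.makedirs(target_dir, exist_ok=True)
    paths = []
    for file_name in FILE_NAMES:
        path = os.path.join(target_dir, file_name)
        if not os.path.exists(path):
            urllib.request.urlretrieve(BASE_URL + file_name, path)
        paths.append(path)
    return paths
```

Calling `options_file, weight_file = download_elmo_files("/files")` would then give the two paths needed by the embedder. The function is deliberately not called at import time, since the weights file can take a while to download.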

III. Obtaining the word vectors

from allennlp.commands.elmo import ElmoEmbedder

# Local paths to the two files downloaded above
options_file = "/files/elmo_2x4096_512_2048cnn_2xhighway_options.json"
weight_file = "/files/elmo_2x4096_512_2048cnn_2xhighway_weights.hdf5"

elmo = ElmoEmbedder(options_file, weight_file)

# batch_to_embeddings takes a batch of tokenized sentences, converts the tokens
# to character ids internally, and returns the embeddings plus a padding mask
context_tokens = [['I', 'love', 'you', '.'], ['Sorry', ',', 'I', 'don', "'t", 'love', 'you', '.']]
elmo_embedding, elmo_mask = elmo.batch_to_embeddings(context_tokens)

print(elmo_embedding)
print(elmo_mask)
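The second return value is a padding mask. According to the AllenNLP docstring for `batch_to_embeddings`, the embeddings should have shape (batch_size, 3, num_timesteps, 1024) and the mask shape (batch_size, num_timesteps). Since the two example sentences have different lengths, the shorter one is padded. The sketch below rebuilds the mask one would expect for this batch in plain Python, without allennlp. `padding_mask` is a hypothetical helper written for illustration, and it assumes the mask marks real tokens with 1 and padded positions with 0.

```python
from typing import List


def padding_mask(batch: List[List[str]]) -> List[List[int]]:
    """1 for each real token, 0 for each padded position (padded to the longest sentence)."""
    max_len = max(len(sentence) for sentence in batch)
    return [[1] * len(s) + [0] * (max_len - len(s)) for s in batch]


context_tokens = [['I', 'love', 'you', '.'],
                  ['Sorry', ',', 'I', 'don', "'t", 'love', 'you', '.']]

mask = padding_mask(context_tokens)
for row in mask:
    print(row)
# [1, 1, 1, 1, 0, 0, 0, 0]
# [1, 1, 1, 1, 1, 1, 1, 1]

# The number of real tokens per sentence is the row sum of the mask;
# it can be used to slice the padding off the embeddings.
lengths = [sum(row) for row in mask]
print(lengths)  # [4, 8]
```

With the real tensors, and under the shape assumption above, `elmo_embedding[i, :, :lengths[i], :]` would keep only the vectors for the real tokens of sentence `i` across the three ELMo layers.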

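Because of environment problems, I have not been able to install the allennlp package on my machine, so the code above is untested. The overall procedure is roughly as described.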