Question about additional training with Word2vec

Asked 2 years ago, Updated 2 years ago, 147 views

I have a question about additional (incremental) training with Word2Vec.

If I train on a corpus once to generate vectors and then add an unknown word to the corpus, do I need to retrain on the whole corpus from scratch to get a vector for the unknown word?
(In short, I'd like to keep adding to the corpus and training on it repeatedly.)

According to Yasukazu Nishio's "Natural Language Processing with word2vec," negative sampling is "useful if you add corpus and repeat it." What command-line options should I specify so that training builds on the previous results and handles unknown words?

The -negative option only takes an <int> argument. I thought I could use the -read-vocab <file> option to extract unknown words, but there doesn't seem to be any option for reusing previous training results, is there?

I would appreciate it if someone could let me know.

python c algorithm natural-language-processing word2vec

2022-09-29 22:43

1 Answer

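With gensim, you should be able to continue training by loading the old model as follows, adding the new words to its vocabulary, and then training on the new data.

from gensim.models import word2vec
model = word2vec.Word2Vec.load("old_model")
model.build_vocab(sentences, update=True)  # add unknown words to the existing vocabulary
model.train(sentences, total_examples=model.corpus_count, epochs=model.epochs)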


2022-09-29 22:43



© 2024 OneMinuteCode. All rights reserved.