Question about additional (incremental) training with Word2Vec
If I train on a corpus once to generate vectors and later add text containing unknown words to the corpus, do I need to retrain on the whole corpus from scratch to get vectors for those unknown words?
(In short, I'd like to keep adding to the corpus and continue training each time.)
According to Yasukazu Nishio's Natural Language Processing with word2vec, negative sampling is "useful if you add corpus and repeat it." How should I specify the command so that training starts from the previous results and also handles unknown words?
The -negative option only takes an <int> argument. I thought I could use the -read-vocab <file> option to pick up unknown words, but there doesn't seem to be any option for reusing previous training results, is there?
I would appreciate it if someone could let me know.
python c algorithm natural-language-processing word2vec
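With gensim, you should be able to continue training by loading the saved model and then feeding it the new data. Note that words missing from the original vocabulary are only added if you update the vocabulary with build_vocab(..., update=True) before calling train(), and that train() needs to be told the number of examples and epochs.
from gensim.models import word2vec
model = word2vec.Word2Vec.load("old_model")
model.build_vocab(sentences, update=True)  # add previously unseen words to the vocabulary
model.train(sentences, total_examples=model.corpus_count, epochs=model.epochs)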