We're working on some simple code related to information retrieval.
I'm using TF-IDF for it, but I have a question about how I compute the IDF.
I read in an article that, when computing the IDF, it is common to add 1 to the denominator, because the denominator becomes 0 if a given word does not appear in any document:
log(total number of documents / (1 + number of documents containing word t))
I have written the code according to the formula above.
What I'm curious about is the case where the word t appears in every document.
Then the denominator becomes larger than the numerator, so the logarithm, and therefore the IDF, becomes negative. Is there any problem with the algorithm if the IDF is negative?
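To make the case concrete, here is a minimal sketch of the formula above, assuming the natural logarithm and a hypothetical helper name `idf` (neither is specified in the question). With 10 documents, a word that appears in all 10 gives log(10 / 11), which is below zero.

```python
import math

def idf(total_docs, docs_with_term):
    # Hypothetical helper: the formula from the question,
    # log(N / (1 + df)), using the natural logarithm (an assumption).
    return math.log(total_docs / (1 + docs_with_term))

# Word appears in every document: 10 / 11 < 1, so the log is negative.
print(idf(10, 10))

# Word appears in all but one document: 10 / 10 = 1, so the log is 0.0.
print(idf(10, 9))

# Word appears in no document: the +1 avoids division by zero.
print(idf(10, 0))
```

So with this formula the IDF is negative only when the word appears in every document, and it is exactly zero when the word is missing from just one.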
For now the results come out the way I want, but I couldn't find anything about this case in the TF-IDF articles I searched, so I'm posting the question here. Thank you.
algorithm
If the IDF comes out negative, you can clamp it to zero.
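A minimal sketch of that clamping, assuming the formula from the question with the natural logarithm; the names `clamped_idf` and `tf_idf` are hypothetical, chosen only for illustration.

```python
import math

def clamped_idf(total_docs, docs_with_term):
    # Formula from the question: log(N / (1 + df)).
    raw = math.log(total_docs / (1 + docs_with_term))
    # Replace any negative value with zero.
    return max(0.0, raw)

def tf_idf(term_count, total_docs, docs_with_term):
    # Hypothetical scoring helper: raw term count times the clamped IDF.
    return term_count * clamped_idf(total_docs, docs_with_term)

# A word found in all 10 documents now gets IDF 0.0 instead of a
# negative value, so it contributes nothing to the score.
print(clamped_idf(10, 10))
print(tf_idf(5, 10, 10))

# A rarer word is unaffected by the clamp.
print(clamped_idf(10, 2))
```

With the clamp, a word that appears in every document simply drops out of the TF-IDF score, which matches the intuition that such a word does not help distinguish one document from another.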