What are the advantages and disadvantages of these methods for classifying sentences?
■ Input
·Multiple Japanese texts of about 200 characters each
·Classification destinations (about 10 items defined in advance, such as romance, horror, suspense, etc.)
■ Output
Sentence A->Romance
Sentence B->Suspense
Sentence C->Horror
...
■ How to categorize
After a little research, I thought the following two methods looked easy to implement for someone like me (just a Rails engineer) who is an amateur at machine learning.
I would like to ask about the advantages and disadvantages of each. I would also appreciate it if you could let me know of any other good approaches.
·Extract feature words from each text with TF-IDF, use them as tags, and then manually assign the tags to categories (romance, horror, suspense, etc.). I've heard that TF should be computed from my own texts, while IDF should come from a general-purpose corpus.
·Classify with a naive Bayes classifier (sorry, I'm not very familiar with it).
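
To make the first approach concrete, here is a minimal sketch in plain Python. It rests on several assumptions that are not part of the question: the texts are already tokenized (Japanese would normally need a morphological analyzer first), the toy documents are English stand-ins, IDF is computed from the same small set of documents rather than from a general-purpose corpus, and `tag_to_category` is a hypothetical hand-made mapping.

```python
import math
from collections import Counter

def tfidf_top_terms(docs, top_n=3):
    """docs: list of token lists. Returns the top_n highest TF-IDF terms per document."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    results = []
    for doc in docs:
        tf = Counter(doc)
        scores = {
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        }
        # Highest score first; ties are broken alphabetically for determinism.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        results.append(ranked[:top_n])
    return results

def categorize(top_terms, tag_to_category):
    """Vote among the categories of the tags that appear in the hand-made mapping."""
    votes = Counter(tag_to_category[t] for t in top_terms if t in tag_to_category)
    return votes.most_common(1)[0][0] if votes else None

# Hypothetical, pre-tokenized toy documents.
docs = [
    ["the", "ghost", "screamed", "in", "the", "dark"],
    ["the", "couple", "kissed", "in", "the", "rain"],
    ["the", "detective", "found", "the", "clue"],
]
# Hypothetical mapping from tags to categories, maintained by hand.
tag_to_category = {"ghost": "horror", "kissed": "romance", "detective": "suspense"}

categories = [categorize(terms, tag_to_category) for terms in tfidf_top_terms(docs)]
print(categories)
```

The manual mapping is where most of the ongoing work in this approach tends to end up: any extracted tag that is missing from the mapping contributes nothing to the vote.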
■ Supplemental
I'm implementing this in Rails, so I'd appreciate it if you could point me to a gem.
The benefits of using TF-IDF are:
(There are probably others; these are just the ones that come to mind.)
If you use naive Bayes, you can obtain the importance of each feature as described in the link below.
https://stackoverflow.com/questions/50526898/how-to-get-feature-importance-in-naive-bayes
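
As a rough illustration of the idea, the following is a minimal plain-Python sketch of a multinomial naive Bayes classifier with add-one smoothing. It is not the code from the link above. The `NaiveBayes` class, the toy training data, and the definition of "importance" used here (the gap between a word's log-likelihood in one class and its best log-likelihood in any other class) are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict

class NaiveBayes:
    def fit(self, docs, labels):
        self.class_counts = Counter(labels)       # documents per class
        self.word_counts = defaultdict(Counter)   # word counts per class
        for tokens, label in zip(docs, labels):
            self.word_counts[label].update(tokens)
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def log_likelihood(self, word, label):
        # log P(word | label) with Laplace (add-one) smoothing.
        total = sum(self.word_counts[label].values())
        return math.log((self.word_counts[label][word] + 1) / (total + len(self.vocab)))

    def predict(self, tokens):
        n_docs = sum(self.class_counts.values())
        def score(label):
            prior = math.log(self.class_counts[label] / n_docs)
            # Words never seen in training are ignored.
            return prior + sum(self.log_likelihood(t, label) for t in tokens if t in self.vocab)
        return max(self.class_counts, key=score)

    def top_features(self, label, k=3):
        # Assumed importance measure: how much more likely a word is under
        # `label` than under its strongest competing class (needs >= 2 classes).
        others = [c for c in self.class_counts if c != label]
        def importance(word):
            return self.log_likelihood(word, label) - max(
                self.log_likelihood(word, c) for c in others
            )
        return sorted(self.vocab, key=lambda w: (-importance(w), w))[:k]

# Hypothetical, pre-tokenized training data.
train_docs = [
    ["ghost", "scream", "dark"],
    ["ghost", "blood", "night"],
    ["kiss", "love", "rain"],
    ["love", "date", "night"],
]
train_labels = ["horror", "horror", "romance", "romance"]

model = NaiveBayes().fit(train_docs, train_labels)
print(model.predict(["ghost", "night"]))
print(model.top_features("romance", k=1))
```

Unlike the TF-IDF tagging approach, this needs labeled example texts for every category up front, but it does not require a hand-maintained mapping from tags to categories.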
If you want to stay on Rails, calling Python from Rails is relatively easy with pycall.rb.
https://github.com/mrkn/pycall.rb
The disadvantages are:
(There are probably others; these are just the ones that come to mind.)
Other options include: