I'd like to tokenize Japanese text with MeCab like this.
import MeCab

m = MeCab.Tagger("-d/usr/local/lib/mecab/dic/mecab-ipadic-neologd")
c = m.parse('Kobayashi Pharmaceutical Hifumid formula! For serious moisturizing skin.Kobayashi Pharmaceutical Co., Ltd. for the first time 980 yen').splitlines()
for s in c:
    print(s)
The result is:
Kobayashi Pharmaceutical Noun, Proper Noun, General, *, *, *, Kobayashi Pharmaceutical, Kobayashi Seiyaku, Kobayashi Seiyaku
Hifmid nouns, general, *, *, *, *, *
official noun, adjective verb stem, *, *, *, *, formula, koshiki, koshiki
! symbols, general, *, *, *, *, !, !, !
serious noun, general, *, *, *, *, serious, honki, honki
the postposition, conjugation, *, *, *, *, no, no
moist noun, general, *, *, *, *, moist, uruoi, uruoi
Skin noun, general, *, *, *, *, skin, haggard, haggard
(ii) postpositions, case postpositions, general, *, *, *, Ni, D, D
is a particle, an associative particle, *, *, *, *, C, C
。 symbols, punctuation marks, *, *, *, *, , , , .
Kobayashi Pharmaceutical Noun, Proper Noun, General, *, *, Kobayashi Pharmaceutical, Kobayashi Seiyaku, Kobayashi Seiyaku
first noun, general, *, *, *, *, first, Shokai, Shokai
980 yen nouns, proper nouns, general, *, *, *, 980 yen, kyuhyakuhachijuen, kyuhyakuhachijuen, [:_: 3180 3148 7806]
EOS
Why does [:_:318031487806] appear at the end?
Could you tell me, please?
python python3 mecab
It's not an error. That string is simply part of the entry in the mecab-ipadic-neologd dictionary, and MeCab's default output format prints every field of the entry, which is why it shows up.
http://taku910.github.io/mecab/dic-detail.html
The fifth column onward consists of user-defined CSV fields. Basically, you can add any content, as long as CSV allows it.
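If the extra field gets in the way, one option is to split each output line yourself and keep only the fields you need. The following is a minimal sketch, not a definitive fix. It assumes the default output layout (surface form, a tab, then comma-separated features) and the usual nine IPADIC-style feature fields. The helper name `split_mecab_line` and the sample line are hypothetical, modeled on the romanized output in the question.

```python
import csv

# Assumption: IPADIC-style entries carry 9 feature fields
# (POS, 3 POS subcategories, conjugation type, conjugation form,
# base form, reading, pronunciation). Anything after that is extra.
IPADIC_FIELD_COUNT = 9


def split_mecab_line(line, n_standard=IPADIC_FIELD_COUNT):
    """Split one default-format MeCab line into (surface, standard, extra)."""
    # Default output is "surface<TAB>feature,feature,..."; "EOS" has no tab.
    surface, _, feature_csv = line.partition("\t")
    # The features are CSV, so let the csv module handle any quoting.
    fields = next(csv.reader([feature_csv])) if feature_csv else []
    return surface, fields[:n_standard], fields[n_standard:]


# Hypothetical line, romanized like the output shown in the question.
sample = (
    "980yen\tnoun,proper noun,general,*,*,*,"
    "980yen,kyuhyakuhachijuen,kyuhyakuhachijuen,[:_:318031487806]"
)

surface, standard, extra = split_mecab_line(sample)
print(surface)   # the token itself
print(standard)  # the nine standard fields
print(extra)     # whatever the dictionary entry added beyond them
```

In the loop from the question, each `s` could be passed to `split_mecab_line(s)` and only the first two values printed. MeCab also has output-format options such as `--node-format` that can limit which fields are printed; the MeCab format documentation is the place to check the exact syntax.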