I want to extract only the nouns, verbs, and adjectives from readtextlist,
but every part of speech ends up in the output.
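import MeCab

# readtextlist is assumed to be defined earlier: a list of rows whose first element is the text.
words_list = []
t = MeCab.Tagger('-d /usr/lib/x86_64-linux-gnu/mecab/dic/mecab-ipadic-neologd')
for s in readtextlist:
    s_parsed = t.parse(s[0])
    words_s = []
    # Each output line is "surface<TAB>comma-separated features"; the last line is EOS.
    for line in s_parsed.splitlines()[:-1]:
        word = line.split("\t")[0]
        if word == 'EOS':
            break
        else:
            pos = line.split('\t')[1]
            slice = pos.split(',')
            # IPAdic-based dictionaries report POS in Japanese: 名詞 (noun), 動詞 (verb), 形容詞 (adjective).
            if slice[0] in ['名詞', '動詞', '形容詞']:
                words_s.append(slice[6])  # base (dictionary) form
            else:
                words_s.append(word)
    words_list.append(words_s)
print(words_list)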
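For reference, here is a minimal sketch of how one output line breaks down in the IPAdic feature layout, and what a trailing else branch does to the result. The sample lines are hand-written in the IPAdic format for illustration (an assumption, not real tagger output), and the names TARGET_POS, with_else, and without_else are hypothetical.

```python
# Hand-written lines in the IPAdic output format (illustrative assumption,
# not real tagger output): surface form, a tab, then comma-separated features.
TARGET_POS = ["名詞", "動詞", "形容詞"]  # noun, verb, adjective
sample_lines = [
    "ご飯\t名詞,一般,*,*,*,*,ご飯,ゴハン,ゴハン",
    "を\t助詞,格助詞,一般,*,*,*,を,ヲ,ヲ",
    "食べ\t動詞,自立,*,*,一段,連用形,食べる,タベ,タベ",
]

with_else, without_else = [], []
for line in sample_lines:
    surface, feature_str = line.split("\t")
    features = feature_str.split(",")
    # features[0] is the top-level part of speech, features[6] the base form.
    if features[0] in TARGET_POS:
        with_else.append(features[6])
        without_else.append(features[6])
    else:
        # This branch also appends particles and other parts of speech.
        with_else.append(surface)

print(with_else)     # ['ご飯', 'を', '食べる']
print(without_else)  # ['ご飯', '食べる']
```

With the else branch in place, the particle を is appended as well, so the list contains every token regardless of its part of speech.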
I'm not sure of the exact cause, but the slice did not seem to be working as intended. The version below solved the problem.
Thank you.
import MeCab

words_list = []
t = MeCab.Tagger('-d /usr/lib/x86_64-linux-gnu/mecab/dic/mecab-ipadic-neologd')
for s in readtextlist:
    s_parsed = t.parse(s[0])
    words_s = []
    for line in s_parsed.splitlines()[:-1]:
        word = line.split("\t")[0]
        if word == 'EOS':
            break
        else:
            # Take the feature string after the last tab and split it into fields.
            pos = line.split('\t')[-1]
            slice = pos.split(',')
            # Keep only nouns (名詞), verbs (動詞), and adjectives (形容詞); skip everything else.
            if slice[0] in ['名詞', '動詞', '形容詞']:
                words_s.append(word)
    words_list.append(words_s)
print(words_list)
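The same filtering can be wrapped in a small helper that works on the text returned by t.parse(), which makes it easy to check without a dictionary installed. This is only a sketch: the function name extract_words, its parameters, and the sample text are hypothetical, and the sample is hand-written in the IPAdic format rather than real tagger output.

```python
from typing import Iterable, List

CONTENT_POS = ("名詞", "動詞", "形容詞")  # noun, verb, adjective


def extract_words(parsed: str,
                  keep_pos: Iterable[str] = CONTENT_POS,
                  use_base_form: bool = False) -> List[str]:
    """Return the tokens of MeCab-style output whose top-level POS is in keep_pos."""
    keep = set(keep_pos)
    words = []
    for line in parsed.splitlines():
        # Skip the EOS marker and any line without a surface/feature separator.
        if line == "EOS" or "\t" not in line:
            continue
        surface, feature_str = line.split("\t", 1)
        features = feature_str.split(",")
        if features[0] not in keep:
            continue
        # In the IPAdic layout index 6 is the base form; "*" means it is unknown.
        if use_base_form and len(features) > 6 and features[6] != "*":
            words.append(features[6])
        else:
            words.append(surface)
    return words


# Hand-written sample in the IPAdic format (assumption, for illustration only).
sample = "\n".join([
    "美しい\t形容詞,自立,*,*,形容詞・イ段,基本形,美しい,ウツクシイ,ウツクシイ",
    "花\t名詞,一般,*,*,*,*,花,ハナ,ハナ",
    "が\t助詞,格助詞,一般,*,*,*,が,ガ,ガ",
    "咲い\t動詞,自立,*,*,五段・カ行イ音便,連用タ接続,咲く,サイ,サイ",
    "た\t助動詞,*,*,*,特殊・タ,基本形,た,タ,タ",
    "EOS",
])

surface_words = extract_words(sample)
base_words = extract_words(sample, use_base_form=True)
print(surface_words)  # ['美しい', '花', '咲い']
print(base_words)     # ['美しい', '花', '咲く']
```

In the loop above, this would be called as extract_words(t.parse(s[0])). Passing use_base_form=True returns dictionary forms, which is what the first version was doing with slice[6].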
© 2024 OneMinuteCode. All rights reserved.