Error using MeCab TypeError: in method 'Tagger_parse', argument 2 of type 'char const *'

Asked 1 year ago, Updated 1 year ago, 438 views

As the title suggests, I want to use MeCab in Python to remove stopwords from a list of sentences read from a text file.

However, when I call tagger.parse() I get TypeError: in method 'Tagger_parse', argument 2 of type 'char const *'.

The environment is

Python 3.9.7
mecab-python3

Example Code:

import urllib
from urllib.request import urlopen
import MeCab
import re

# slothlib
slothlib_path="http://svn.sourceforge.jp/svnroot/slothlib/CSharp/Version1/SlothLib/NLP/Filter/StopWord/word/Japanese.txt"
slothlib_file=urllib.request.urlopen(slothlib_path)

# stopwordsiso
iso_path="https://raw.githubusercontent.com/stopwords-iso/stopwords-ja/master/stopwords-ja.txt"
iso_file=urllib.request.urlopen(iso_path)
stopwords = [line.decode("utf-8").strip() for line in iso_file]

# drop empty strings and duplicates
stopwords = [ss for ss in stopwords if not ss == u'']
stopwords = list(set(stopwords))

with open("/Desktop/cleaned-stp.txt", encoding='utf8') as f:
    cleanedlist=f.readlines()
    cleanedlist=list(cleanedlist)

tagger=MeCab.Tagger("-Owakati")
token_text=tagger.parse(cleanedlist)

ws = re.compile(" ")
words = [word for word in ws.split(token_text)]
if words[-1]==u"\n":
    words = words[:-1]
ws = [w for w in words if w not in stopwords]

print(words)
print(ws)

Example List (.txt):

 Done!My score is 100 How much do you know about magic?Let's do a test!You'll also get a chance to get a surprise reward like collaboration ride skin!wilderness magic test magic battle wilderness behavior
"""Girls' Wars: Fantasy Unification Battle"" is being pre-registered!" Participate in the reserved-only gacha and get SSR characters and items!43172 total revolutions!! Top 10 Advance Reservation Pre-Gacha Reservations
I will watch the magic round-the magic-play the magic round.
The 2nd Women's Cup officially hosted by Wilderness CUP!! Here's what to see! - A series of unique combinations! The most luxurious camp ever! - There are many nostalgic combinations that trace the history of Wilderness! - The battle to decide the last queen of the year begins!Distribution URL: Wilderness Behavior

I'm sorry for asking such a basic question; I'm a beginner.
Thank you for your help.

python python3 mecab

2022-12-19 08:20

2 Answers

The cause is that you are passing a list of strings to a function that expects a single string: f.readlines() returns a list, but tagger.parse() only accepts a str.
Pass a string instead, as shown in https://taku910.github.io/mecab/bindings.html.

with open("/Desktop/cleaned-stp.txt", encoding='utf8') as f:
    cleaned_text=f.read()

tagger=MeCab.Tagger("-Owakati")
token_text=tagger.parse(cleaned_text)
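
If you would rather keep the file as a list and filter each line separately, something like the following might work. This is only a rough sketch: remove_stopwords and main are names I made up for illustration, the two stopwords in main are placeholders, and I am assuming the tagger is created with "-Owakati" so that parse() returns space-separated tokens.

```python
def remove_stopwords(lines, tagger, stopwords):
    """Tokenize each line separately and drop tokens found in stopwords.

    `tagger` is anything with a parse(str) -> str method that returns
    space-separated tokens, e.g. MeCab.Tagger("-Owakati").
    """
    stopwords = set(stopwords)  # set membership tests are fast
    cleaned = []
    for line in lines:
        text = line.strip()
        if not text:
            continue  # skip blank lines
        # parse() gets one str at a time; split() also drops the trailing newline
        tokens = tagger.parse(text).split()
        cleaned.append([t for t in tokens if t not in stopwords])
    return cleaned


def main():
    # Imported here so remove_stopwords itself does not depend on MeCab.
    import MeCab

    tagger = MeCab.Tagger("-Owakati")
    with open("/Desktop/cleaned-stp.txt", encoding="utf8") as f:
        lines = f.readlines()
    # Placeholder stopwords; in practice use the list built from stopwords-ja.txt.
    print(remove_stopwords(lines, tagger, ["の", "は"]))
```

Calling main() prints one list of tokens per non-empty line of the file, so the result keeps the line structure instead of merging everything into a single string.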


2022-12-19 08:33

I'm also a beginner, so I don't know if it will be helpful, but when I was using MeCab (just now), I got the same error and had a hard time, so I'll share it with you just in case!

In my case, I got this error when I was tokenizing (word-splitting) the columns of a data frame. When I changed the type of the column from object to str, it worked. I don't know about lists, but I think the type of the value you pass is probably related. Good luck!
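
For what it's worth, here is a small sketch of the kind of type check I mean. The helper names to_text and parse_any are made up for illustration, and treating None and NaN as empty text is just my assumption about what you would want for missing values.

```python
import math


def to_text(value):
    """Return value as a str that is safe to hand to tagger.parse()."""
    if value is None:
        return ""
    if isinstance(value, float) and math.isnan(value):
        return ""  # missing values in a data frame often show up as NaN
    return value if isinstance(value, str) else str(value)


def parse_any(tagger, value):
    """Convert value to str first, then tokenize it."""
    return tagger.parse(to_text(value))
```

With a real tagger you would call parse_any(tagger, cell) for each cell instead of tagger.parse(cell), so a number or a missing value no longer triggers the TypeError.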


2022-12-19 09:01



© 2024 OneMinuteCode. All rights reserved.