RMeCab doesn't work with Japanese text.

Asked 2 years ago, Updated 2 years ago, 37 views

ASK: I want to make RMeCab work well with Japanese.

When I run a function from the RMeCab package on Japanese text data, it fails as shown below.
It reports that "Paradise" was not found, even though the word actually appears in the file. The same function works fine on English-only text data.

> library("RMeCab")
> r <- collocate("kumo.txt", node="Paradise", span=3)
file=kumo.txt 
Paradise not found

I confirmed that kumo.txt was saved with the UTF-8 character encoding.

I also confirmed that the character encoding in RStudio is set to UTF-8.

I checked the encoding using the software called Notepad++.
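
The same encoding check could also be done from R itself. The following is only a minimal sketch using base R: `is_utf8_file` is a hypothetical helper name (it is not part of RMeCab), and the check only says whether the bytes are valid UTF-8. A pure-ASCII file passes it regardless of how it was saved.

```r
# Hypothetical helper (not part of RMeCab):
# TRUE if every line of the file is a valid UTF-8 byte sequence.
is_utf8_file <- function(path) {
  lines <- readLines(path, warn = FALSE)  # bytes are read without conversion
  all(validUTF8(lines))
}

# Show the locale and encoding the current R session is using
print(l10n_info())

# Example call (kumo.txt is the file from the question):
# is_utf8_file("kumo.txt")
```

A result of TRUE here together with a non-UTF-8 session locale would suggest that the file and the R session disagree about the encoding.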

r

2022-09-30 16:46

1 Answer

RMeCab on Windows is designed for Shift-JIS rather than UTF-8.
I re-saved kumo.txt, the text file to be parsed, as ANSI instead of UTF-8 (on Japanese Windows, ANSI is the Shift-JIS-based code page CP932), and it worked.
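
If re-saving by hand in an editor is inconvenient, the conversion could also be done in R. This is a minimal sketch using only base R functions. The helper name `convert_utf8_to_cp932` and the output file name `kumo_sjis.txt` are assumptions made for illustration, and it assumes the input file has no byte order mark.

```r
# Hypothetical helper (not part of RMeCab):
# re-encode a UTF-8 text file as CP932 (Shift-JIS on Windows).
convert_utf8_to_cp932 <- function(src, dest) {
  # Read the lines and mark them as UTF-8
  lines <- readLines(src, encoding = "UTF-8", warn = FALSE)

  # Convert each line; iconv() returns NA for lines it cannot convert
  converted <- iconv(lines, from = "UTF-8", to = "CP932")
  if (anyNA(converted)) {
    stop("Some lines contain characters that CP932 cannot represent.")
  }

  # Write the converted bytes unchanged
  con <- file(dest, open = "wb")
  on.exit(close(con))
  writeLines(converted, con, useBytes = TRUE)
  invisible(dest)
}

# Example usage with the file from the question (output name is an assumption):
# convert_utf8_to_cp932("kumo.txt", "kumo_sjis.txt")
# r <- collocate("kumo_sjis.txt", node = "Paradise", span = 3)
```

Writing to a separate file keeps the original UTF-8 copy intact in case the conversion stops on a character that CP932 cannot represent.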



2022-09-30 16:46



© 2024 OneMinuteCode. All rights reserved.