I use Rails 3.2.13+MySQL5.
Collation in MySQL is collate utf8_unicode_ci.
I didn't know this function existed, so I was surprised that I searched and matched the data in full-width katakana in full-width hiragana.
So, one customer's data was registered in half-width katakana (bad smell!?)
They still search for it in a good way, but somehow I found out that Suzuki and Suzuki don't match.
>User.where("kana LIKE'%ski%'").count
(4.1ms) SELECT COUNT(*) FROM `users` WHERE(kana LIKE '%ski%')
=>42
>User.where("kana LIKE'% Suzuki%").count
(2.6ms) SELECT COUNT(*) FROM `users` WHERE(kana LIKE '% Suzuki %')
= > 0
>User.where("kana LIKE'%Suzuki%'").count
(2.6ms) SELECT COUNT(*) FROM `users` WHERE(kana LIKE '%Suzuki %')
= > 0
>pp User.where("kana LIKE'%ski%'").limit(10).map {|x|x.kana.sub(/...$/, 'xxx')}
User Load (1.1ms) SELECT `users`.* FROM `users` WHERE(kana LIKE '%ski%') LIMIT10
["Ski xxx",
"Ski xxx",
"Skitage xxx",
"Ski xxx",
"Skiノxxx",
"Scream xxx",
"Ski xxx",
US>"Sus xxx",
"Skitter xxx",
"Ski xxx"]
Why is this?I would appreciate it if you could let me know.
"By the way, ""Fukuda/Fukuda"" matches..."
User.where("kana LIKE'% factor %").count#=>9
User.where("kana LIKE'% Fukuda%'").count#=>9
User.where("kana LIKE'%Fukuda%'").count#=>9
I am curious about the recent article "Japanese Developers' Opinion on utf8_unicode_ci.
ruby-on-rails mysql rails-activerecord
"Since ""su"" in half-width kana has two characters, ""su"" and ""cho,"" I don't think it matches ""zu""."
"I think ""Fukuda"" matches ""Fukuda"" and ""O"" matches ""%"" wildcard."
"Also, ""Suki"" and ""Suzuki"" match for =, not LIKE."
LIKE matches one character at a time, but = seems to be for normalization and matching.
http://mysql.gr.jp/mysqlml/mysql/msg/15636 may be helpful.
(Half-width kana has been converted to full-width kana, so it's a little hard to understand, but please read it with your heart's eyes.)
© 2024 OneMinuteCode. All rights reserved.