import re
kor_txt = """가나다 ㄱㄴㄷㅐ| | ㅊㅗㅢdfㅑ ㅒ# fd|jkaFSA 498130$#@!*&^ %)(-_=+ []{}<>,./?'"`"')"""
# kor_txt = kor_txt.replace("|","")
print(re.sub('[^ㄱ-ㅣ가-힣|a-zA-Z|0-9|`!@#$%^&*()\[\]{}\-\_\'\"><\/\?]', '', kor_txt))
Results:
가나다 ㄱㄴㄷㅐ| | ㅊㅗㅢdfㅑ ㅒ# fd|jkaFSA 498130$#@!*&^ %)(-_ []{}<>/?\'"`"\')'
The desired result:
가나다 ㄱㄴㄷㅐ  ㅊㅗㅢdfㅑ ㅒ# fdjkaFSA 498130$#@!*&^ %)(-_ []{}<>/?\'"`"\')'
I wrote the regular expression shown above.
I want to remove every character from the sentence except Korean, English, numbers, and the special characters I can type on my keyboard.
Because the pattern uses the regular expression OR operator, it contains the "|" character. The "|" character in my text keeps causing errors, so I want to remove it as well.
If I leave the | operator in the pattern, the | character survives in the output, as shown under Results above.
The desired result is the same sentence with | removed, which is what the commented-out line would do.
I heard that you have to write ㄱ-ㅎ|ㅏ-ㅣ|가-힣 to include all Korean letters, so how do I solve this case?
regex=r"(?:[^ㄱ-ㅎㅏ-ㅣ가-힣a-zA-Z0-9`!@#$%^&*()\[\]{}\-\_\'\"><\/\?]|\|)"
I put the [^blah] block and a single \| character into a (?:this side|that side) group, so the pattern matches whenever either side matches.
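Here is a minimal sketch of that idea, assuming Python's re module and a sample string reconstructed from the question. Inside a character class, | is a literal pipe rather than the OR operator, so simply leaving it out of the negated class should already remove it; the grouped form with an explicit \| branch is expected to behave the same way. The name KEEP and the leading space in it are assumptions added for illustration, not part of the original pattern.

```python
import re

# Sample input reconstructed from the question; the exact text is illustrative.
kor_txt = """가나다 ㄱㄴㄷㅐ| | ㅊㅗㅢdfㅑ ㅒ# fd|jkaFSA 498130$#@!*&^ %)(-_=+ []{}<>,./?'"`"')"""

# Characters to keep (hypothetical helper name). Inside [...] a "|" is a
# literal pipe, not alternation, so it is simply left out of this list.
# The leading space is an assumption so that word spacing survives.
KEEP = r" ㄱ-ㅎㅏ-ㅣ가-힣a-zA-Z0-9`!@#$%^&*()\[\]{}\-_'\"><\/?"

# Variant 1: negated class alone. "|" is not in KEEP, so it gets removed.
simple = re.compile("[^" + KEEP + "]")

# Variant 2: the grouped form from the answer, with an explicit \| branch.
grouped = re.compile(r"(?:[^" + KEEP + r"]|\|)")

cleaned = simple.sub("", kor_txt)
print(cleaned)
print(grouped.sub("", kor_txt) == cleaned)
```

If both variants give the same output on your data, the shorter negated class is probably easier to maintain; the grouped form is mainly useful when the extra branch has to match something a character class cannot express.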