How to write a regular expression that matches when the same word appears n or more times

Asked 1 year ago, Updated 1 year ago, 271 views

I want to create a condition that matches the same word repeatedly.

Example)

@Apple@Tangerine@Banana@Apple@Vine

Given a string like this, treat each run that starts with @ and ends with a half-width space as a group. How should I write a condition that matches when the same group appears two or more times?

In the example, @Apple appears twice, so I would like @Apple to be detected as a match.

Thank you for your cooperation.

regular-expression

2022-12-05 01:48

3 Answers

The exact syntax depends on the regular expression engine, but this can be achieved with a backreference to a captured group.

Example: @Apple@Tangerine@Banana@Apple@Vine
Regular expression: (@\S+).+\1
Regular Expression Checker

This answer adapts a similar answer.
\w may not match Japanese characters in some regular expression engines, so I used \S (any non-whitespace character) in this answer.
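
As a minimal sketch of the backreference approach using Python's re module: the sample string below is an assumption (it puts half-width spaces between the groups, as the question describes) and is not the asker's exact input.

```python
import re

# Assumed sample text: each group starts with "@" and groups are separated
# by half-width spaces.
text = "@Apple @Tangerine @Banana @Apple @Vine"

# (@\S+) captures "@" followed by a run of non-whitespace characters;
# \1 then requires the same captured text to appear again later.
pattern = re.compile(r"(@\S+).+\1")

match = pattern.search(text)
repeated = match.group(1) if match else None
print(repeated)  # @Apple

# A string with no repeated group produces no match at all.
no_repeat = pattern.search("@Apple @Tangerine @Banana")
print(no_repeat)  # None
```

Only the first repeated group is reported here, because search stops at the first match.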

By the way, I had a hard time verifying this answer because I could not get the expected results with the example in your question.
This is because the two occurrences of @Apple in your example differ in their use of full-width and half-width characters.
Computers and regular expression engines treat full-width and half-width characters as completely different characters, so I advise you not to mix them when coding.
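
One way to guard against mixed widths, offered as an assumption rather than something this answer prescribes, is to normalize the text with Python's unicodedata.normalize in NFKC form before matching. NFKC folds the full-width ＠ (U+FF20) and the full-width space (U+3000) into their half-width equivalents; it also rewrites other compatibility characters, so it may not suit every input. The sample string is hypothetical.

```python
import re
import unicodedata

# Hypothetical input: the third group starts with a full-width "@" (U+FF20)
# and is followed by a full-width space (U+3000).
raw = "@Apple @Banana \uff20Apple\u3000@Vine"

pattern = re.compile(r"(@\S+).+\1")

# Without normalization the two "Apple" groups are different strings,
# so the backreference never finds a repeat.
before = pattern.search(raw)

# NFKC maps U+FF20 to "@" and U+3000 to an ordinary space.
normalized = unicodedata.normalize("NFKC", raw)
after = pattern.search(normalized)

print(before)                             # None
print(after.group(1) if after else None)  # @Apple
```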


2022-12-05 04:19

Payaneco's answer, (@\S+).+\1, is mostly fine, but did you notice the problem?

As mjy pointed out, the question requires each group to end with a space. If that space is included in the group, however,

@Grape @Grape

does not match, because there is no space after the last group. To work around this, the space must be moved out of the group.

(@\S+).+\1

But now a string such as

@Grape @GrapeJuice

also matches, because nothing checks what follows \1. To work around this, check what comes after \1.

(@\S+).+\1( |$)
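
To illustrate the trailing check, here is a small Python sketch. It assumes the intended alternation is "a half-width space or the end of the string", written as ( |$); the sample strings are made up for illustration.

```python
import re

# Assumption: the final group is "( |$)", i.e. a space or the end of the string.
pattern = re.compile(r"(@\S+).+\1( |$)")

samples = [
    "@Grape @Grape",       # same group twice, no trailing space
    "@Grape @Grape ",      # same group twice, with a trailing space
    "@Grape @GrapeJuice",  # second group only starts with "@Grape"
    "@GrapeJuice @Grape",  # first group only starts with "@Grape"
]

results = {s: bool(pattern.search(s)) for s in samples}
for s, matched in results.items():
    print(repr(s), matched)
```

The first two samples match and the third does not. The fourth still matches, because the capture can backtrack to a prefix of the first group; the check only constrains what follows the later occurrence.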

By the way, the question title asks how to match the same word when it appears n or more times, but the patterns so far only handle two occurrences. A repetition quantifier is needed to address this.

(@\S+)(.+\1){m}( |$) where m is the integer n-1
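
The repetition idea can be sketched in Python as follows. The helper name build_pattern and the sample strings are hypothetical, and the trailing group is assumed to be ( |$) (a space or the end of the string).

```python
import re


def build_pattern(n: int) -> re.Pattern:
    """Build a pattern for a group that appears at least n times (n >= 2)."""
    if n < 2:
        raise ValueError("n must be at least 2")
    m = n - 1  # occurrences still required after the first one
    return re.compile(r"(@\S+)(.+\1){%d}( |$)" % m)


three_times = build_pattern(3)

hit = three_times.search("@A @B @A @C @A")  # "@A" appears three times
miss = three_times.search("@A @B @A @C")    # "@A" appears only twice

print(hit.group(1) if hit else None)  # @A
print(miss)                           # None
```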

However, some regular expression engines may not support {}. Therefore, please specify which regular expression engine you are using when you ask a question.
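
If the target engine lacks the {m} quantifier, one possible workaround (an assumption, not something this answer spells out) is to write the repeated part out m times when building the pattern string. The sketch below does this in Python purely for illustration, since Python itself does support {m}. The helper name expand_pattern is hypothetical, and the trailing group is assumed to be ( |$).

```python
import re


def expand_pattern(n: int) -> str:
    """Return a pattern string with the repeated part written out n-1 times."""
    if n < 2:
        raise ValueError("n must be at least 2")
    # Writing "(.+\1)" out n-1 times stands in for "(.+\1){n-1}".
    return r"(@\S+)" + r"(.+\1)" * (n - 1) + r"( |$)"


expanded = expand_pattern(3)
print(expanded)  # (@\S+)(.+\1)(.+\1)( |$)

found = re.search(expanded, "@A @B @A @C @A")
not_found = re.search(expanded, "@A @B @A @C")
print(found.group(1) if found else None)  # @A
print(not_found)                          # None
```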


2022-12-05 05:14

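>>> import re

>>> # (?=.+\1\b) looks ahead for the same tag appearing again as a whole word
>>> text = '@Apple@Tangerine@Banana@Apple@Vine'
>>> re.findall(r'(@\w+)(?=.+\1\b)', text)
['@Apple']

>>> text = '@Apple@Tangerine@Banana@Apple@Vine@Banana'
>>> re.findall(r'(@\w+)(?=.+\1\b)', text)
['@Apple', '@Banana']

>>> # @BananaChocolate does not count as a repeat of @Banana because of \b
>>> text = '@Apple@Tangerine@Banana@Apple@Vine@BananaChocolate'
>>> re.findall(r'(@\w+)(?=.+\1\b)', text)
['@Apple']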


2022-12-05 05:22


