Search for OR with wildcards in egrep

Asked 2 years ago, Updated 2 years ago, 117 views

I tried to extract server names with egrep from the file below.

===test.txt===
tokyoAAsv01
tokyoBBsv01
tokyoCCsv01
tokyoAAsv02
tokyoBBsv02
=====end====

cat /tmp/test.txt | egrep '*sv01|*sv02'

This works, but

cat /tmp/test.txt | egrep 'tokyo*sv01|tokyo*sv02'

fails.
Also, if I want to extract only lines with AA and CC from test.txt, how can I write them?

Thank you for your cooperation.

Note:
Sorry, my last sentence about extracting only the lines with AA and CC did not give enough explanation.

===test.txt===
tokyoAAsv01
tokyoBBsv01
tokyoCCsv01
tokyoAAsv02
tokyoBBsv02
osakaAAsv01
osakaBBsv01
osakaCCsv01
=====end====

As shown above, servers from other locations are mixed in, and there are more than 10 patterns like AA and BB.
Therefore, I would like to extract only AA, CC, FF, and HH of tokyo.
In that case, is it difficult to nest OR conditions?
I tried the following, but it didn't work.

cat /tmp/test.txt | egrep 'tokyo|AA|CC|FF|HH|sv01|tokyo|AA|CC|FF|HH|sv02'

I'm sorry for the delay.

grep

2022-09-30 21:11

3 Answers

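cat /tmp/test.txt | egrep 'tokyo*sv01|tokyo*sv02'

You probably want to find the lines where sv01 or sv02 comes after tokyo. * means zero or more repetitions of the preceding regular expression (which may be a single character), so tokyo*sv01 only matches strings such as tokysv01, tokyosv01, and tokyoosv01, where nothing but repeated o characters sits between toky and sv01.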

To represent any string, place . (any single character) followed by * as follows:

egrep 'tokyo.*sv01|tokyo.*sv02' /tmp/test.txt

Or

egrep 'tokyo.*sv0[12]' /tmp/test.txt
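
As a quick check, the sketch below runs the failing pattern and the corrected one against the sample lines. It uses grep -E, which is equivalent to egrep, and feeds the data through printf rather than /tmp/test.txt. The all-lowercase tokyoAAsv01-style names are an assumption about the intended data.

```shell
# Sample data (assumed lowercase, doubled-letter names).
data='tokyoAAsv01
tokyoBBsv01
tokyoCCsv01
tokyoAAsv02
tokyoBBsv02'

# "o*" repeats only the letter o, so AA/BB/CC between tokyo and sv01 cannot match.
star_only=$(printf '%s\n' "$data" | grep -E 'tokyo*sv01|tokyo*sv02' || true)

# ".*" stands for any string, so every tokyo...sv01/sv02 line matches.
dot_star=$(printf '%s\n' "$data" | grep -E 'tokyo.*sv0[12]' || true)

echo "tokyo*  matched: ${star_only:-nothing}"
echo "tokyo.* matched:"
printf '%s\n' "$dot_star"
```

With this data, the first pattern should select no lines and the second should select all five.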

Also, if I want to extract only lines with AA and CC from test.txt, how can I write them?

Connecting patterns with | means OR, as follows. Do not put spaces around the |, because they would become part of the pattern.

egrep 'AA|CC' /tmp/test.txt

Addendum

I would like to extract only AA, CC, FF, HH of tokyo.

Use (...) to group the alternatives.

egrep 'tokyo(AA|CC|FF|HH)sv0[12]' /tmp/test.txt
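
To see why grouping matters, here is a minimal sketch comparing a flat alternation with the grouped pattern. It uses grep -E (equivalent to egrep), and the sample lines, including osakaAAsv01, are assumed for illustration.

```shell
# Sample data with another location mixed in (names are assumed).
data='tokyoAAsv01
tokyoBBsv01
tokyoCCsv01
tokyoAAsv02
tokyoBBsv02
osakaAAsv01
osakaBBsv01
osakaCCsv01'

# Flat alternation: a line matches if it contains tokyo OR AA OR CC ... OR sv01.
flat=$(printf '%s\n' "$data" | grep -E 'tokyo|AA|CC|FF|HH|sv01' || true)

# Grouped alternation: tokyo, then one of AA/CC/FF/HH, then sv01 or sv02.
grouped=$(printf '%s\n' "$data" | grep -E 'tokyo(AA|CC|FF|HH)sv0[12]' || true)

printf '%s\n' "$grouped"
```

The flat version selects every line here, since each one contains at least one of the alternatives, while the grouped version keeps only the tokyo AA and CC servers.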


2022-09-30 21:11

Answers have already been posted, so I will just add a little.

Repetition

A regular expression may be followed by one of several repetition operators:

* The preceding item will be matched zero or more times.

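As for the first regular expression,

cat /tmp/test.txt | egrep '*sv01|*sv02'

the item preceding each * in *sv01|*sv02 is the empty string (""). However many times you repeat an empty string, it is still an empty string, so in the end this is the same as

cat /tmp/test.txt | egrep 'sv01|sv02'

Note that POSIX leaves a * at the start of an extended regular expression undefined, so this behavior depends on the grep implementation.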

Addendum

Therefore, I would like to extract only AA, CC, FF, and HH of tokyo, but

If the same character is always repeated (twice), you can also write it with a back-reference:

cat /tmp/test.txt | egrep 'tokyo([ACFH])\1'

* The actual data probably looks different, but this is one way to write it.
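
A small sketch of the back-reference idea, using grep -E (equivalent to egrep). Back-references in extended regular expressions go beyond what POSIX defines, so this assumes a grep that supports them, such as GNU grep. The line tokyoACsv01 is a hypothetical entry added only to show a non-match.

```shell
# Sample data; tokyoACsv01 is hypothetical, added to show a mixed pair.
data='tokyoAAsv01
tokyoBBsv01
tokyoCCsv01
tokyoACsv01'

# ([ACFH]) captures one letter; \1 requires that same letter again.
doubled=$(printf '%s\n' "$data" | grep -E 'tokyo([ACFH])\1' || true)

printf '%s\n' "$doubled"
```

Under that assumption, only tokyoAAsv01 and tokyoCCsv01 should be printed: BB is outside the bracket list, and AC is not the same letter twice.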


2022-09-30 21:11

tokyo*sv01 does not work because * means zero or more repetitions of the preceding character. The pattern matches when:
- the string toky is present,
- o appears zero or more times immediately after it, and
- sv01 follows immediately after that.
To illustrate, strings that match include:
tokysv01, tokyosv01, tokyoosv01, tokyooosv01, tokyoooosv01
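
The behavior of tokyo*sv01 can be checked with a few hand-made candidate strings. This is only a sketch: the candidates are made up for illustration, and grep -E stands in for egrep.

```shell
# Candidates: the first three differ only in how many times o repeats.
candidates='tokysv01
tokyosv01
tokyoosv01
tokyoAAsv01'

# toky, then zero or more o, then sv01 with nothing else in between.
matched=$(printf '%s\n' "$candidates" | grep -E 'tokyo*sv01' || true)

printf '%s\n' "$matched"
```

The first three candidates match, while tokyoAAsv01 does not, because AA is not a repetition of o.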

What you want to match is tokyo, then any string, then sv01, so:

egrep 'tokyo.*sv01' /tmp/test.txt

Here, . is any single character.

AA and CC are simpler. I think you mean:
egrep 'AA|CC' /tmp/test.txt

Addendum after the original question was edited

So it is tokyo AND (AA OR BB) AND sv01, right? That is AND rather than OR.
For readability, I personally would chain greps with pipes.

egrep 'tokyo' /tmp/test.txt | egrep 'AA|BB' | egrep 'sv01'
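
As a rough comparison, the sketch below runs the piped form next to a single grouped pattern on assumed sample data, using grep -E in place of egrep. Each grep in the pipe acts as one AND condition.

```shell
# Sample data (assumed names, with another location mixed in).
data='tokyoAAsv01
tokyoBBsv01
tokyoCCsv01
tokyoAAsv02
tokyoBBsv02
osakaAAsv01
osakaBBsv01
osakaCCsv01'

# AND by chaining: each stage keeps only the lines that also satisfy its pattern.
piped=$(printf '%s\n' "$data" | grep -E 'tokyo' | grep -E 'AA|BB' | grep -E 'sv01' || true)

# The same selection written as one pattern with a group.
single=$(printf '%s\n' "$data" | grep -E 'tokyo(AA|BB)sv01' || true)

printf '%s\n' "$piped"
```

On this data both forms select the same two lines. The piped form does not check the order or position of the parts, so the two can differ on other inputs.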


2022-09-30 21:11



© 2024 OneMinuteCode. All rights reserved.