[Python] Extracting the string next to a particular string

Asked 1 year ago, Updated 1 year ago, 115 views

I have the following string, retrieved from Excel.

Member||*NAYEON||*0.00||*0000036$$Member||*JEONGYEON||*0.00||*0000037$$Member||*MOMO||*0.00||*0000038$$Member||*SANA||*0.00||*0000039$$Member||*JIHYO||*0.00||*0000040$$Member||*MINA||*0.00||Member||*DAHYUN||*0.00||*0000041$$Member||*CHAEYOUNG||*0.00||*0000042$$Member||*TZUYU||*0.00||*0000043$$

I want to extract only certain words (TWICE member names) from here.

So I split the string and tried to get the values based on the repeated value "0.00".

s = 'Member||*NAYEON||*0.00||*0000036$$Member||*JEONGYEON||*0.00||*0000037$$Member||*MOMO||*0.00||*0000038$$Member||*SANA||*0.00||*0000039$$Member||*JIHYO||*0.00||*0000040$$Member||*MINA||*0.00||Member||*DAHYUN||*0.00||*0000041$$Member||*CHAEYOUNG||*0.00||*0000042$$Member||*TZUYU||*0.00||*0000043$$'
s2 = s.split("||*")

s3 = s2[s2.index("0.00")-1]

This only gets the first name, NAYEON, which comes before the first "0.00". What should I do to get the names of all the other members?

Thank you.

python-3.x

2022-09-21 11:51

1 Answer

The index method you used returns only the first matching index, and raises a ValueError exception if there is no match. To find every match, call it repeatedly, starting each search just past the previous match:

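s3 = []

start = 0
while True:
    try:
        # Search for the next '0.00', starting just past the previous match
        start = s2.index('0.00', start + 1)
    except ValueError:
        # No more matches
        break
    else:
        # The name is the element right before '0.00'
        s3.append(s2[start - 1])
print(s3)
# ['NAYEON', 'JEONGYEON', 'MOMO', 'SANA', 'JIHYO', 'DAHYUN', 'CHAEYOUNG', 'TZUYU']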
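The repeated index search can also be packaged as a small function. The following is only a sketch: the helper name items_before and the shortened two-record sample string are assumptions made for illustration, not part of the original question.

```python
def items_before(tokens, marker):
    """Return the element that precedes each occurrence of marker in tokens."""
    found = []
    pos = 0
    while True:
        try:
            # list.index(value, start) searches from index `start` onward.
            # Starting at pos + 1 skips index 0, which has no predecessor anyway.
            pos = tokens.index(marker, pos + 1)
        except ValueError:
            # index raises ValueError when no further match exists
            break
        found.append(tokens[pos - 1])
    return found


# Shortened sample in the same layout as the question's string (assumption)
sample = "Member||*NAYEON||*0.00||*0000036$$Member||*JEONGYEON||*0.00||*0000037$$"
tokens = sample.split("||*")
names = items_before(tokens, "0.00")
print(names)
```

Passing the previous position plus one as the start argument is what keeps the loop from finding the same "0.00" again.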

With index, the exception handling makes the code longer and harder to read.

A list comprehension lets you build s3 in one line.

s3 = [s2[i-1] for i, v in enumerate(s2) if v == '0.00']
# ['NAYEON', 'JEONGYEON', 'MOMO', 'SANA', 'JIHYO', 'DAHYUN', 'CHAEYOUNG', 'TZUYU']
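One detail of the question's string is worth noting: the MINA record has no "||*" separator after its 0.00, so split("||*") leaves the token "0.00||Member", and an exact comparison with '0.00' skips it. The sketch below shows this on a shortened three-record sample (an assumption for illustration) and a more tolerant variant that uses startswith. Whether the tolerant matching is appropriate depends on the real data.

```python
# Shortened sample; the MINA record lacks the "||*<id>$$" tail, as in the question
sample = (
    "Member||*JIHYO||*0.00||*0000040$$"
    "Member||*MINA||*0.00||"
    "Member||*DAHYUN||*0.00||*0000041$$"
)
tokens = sample.split("||*")

# Pair each token with the one that follows it, so no index arithmetic is needed
pairs = list(zip(tokens, tokens[1:]))

# Exact comparison: the token after MINA is "0.00||Member", so MINA is skipped
strict = [prev for prev, cur in pairs if cur == "0.00"]

# Tolerant comparison: accept any token that begins with "0.00"
tolerant = [prev for prev, cur in pairs if cur.startswith("0.00")]

print(strict)
print(tolerant)
```

Using zip over adjacent tokens also avoids the edge case where i - 1 would wrap around to the last element if the marker appeared at index 0.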

When you need to extract a particular string pattern, regular expressions are worth considering.
As you wrote,

you tried to get the values based on the repeated value "0.00".

Expressed as a regular expression, that looks like this.

import re

re.findall(r"(\w+)\|{2}\*0\.00", s)
# ['NAYEON', 'JEONGYEON', 'MOMO', 'SANA', 'JIHYO', 'MINA', 'DAHYUN', 'CHAEYOUNG', 'TZUYU']
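A slightly stricter variant of the pattern also anchors on the "Member||*" prefix, so that only the field directly after "Member" is captured. This is a sketch on a shortened sample (an assumption for illustration). Note that \w+ matches only letters, digits, and underscores, so names containing spaces or hyphens would need a different character class.

```python
import re

# Shortened sample in the same layout as the question's string (assumption)
sample = (
    "Member||*JIHYO||*0.00||*0000040$$"
    "Member||*MINA||*0.00||"
    "Member||*DAHYUN||*0.00||*0000041$$"
)

# Literal "Member||*", then the captured name, then literal "||*0.00".
# The characters | * . are escaped because they are regex metacharacters.
pattern = re.compile(r"Member\|\|\*(\w+)\|\|\*0\.00")

# With a single capture group, findall returns a list of the captured strings
names = pattern.findall(sample)
print(names)
```

Because the regular expression works on the raw string rather than on split tokens, the MINA record is matched even though its trailing fields are missing.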


2022-09-21 11:51


