HOW TO GET THE DATA AFTER A REPEATED KEYWORD, AND THE TIME DATA, IN ORDER

Asked 2 years ago, Updated 2 years ago, 395 views

I would like to extract data from the following text in Python 3.10. Is there a way to retrieve the numbers in order by looping over the text?

Text
Event A Number of participants 10 people Time: 18:00~21:00, Event B Number of participants 5 people Time: 18:00~21:00

Supplement
The text is stored in the variable a.

Data I want to obtain
Number of people: 10 and 5

print(a.count('Number of participants'))
joinuser_count = a.count('Number of participants')
for i in range(joinuser_count):
    pos = a.find('Number of participants')
    b = a[pos:]
    pos2 = b.find('people')
    c = b[:pos2]

I would also like to get the start time and end time of each event together with the above, but I cannot detect the "~" to split on, so I cannot split the string.
I would like to know how to detect it, because replacing special characters would also affect the ":".

time_get=a.find('~')
time_get_start=a[:time_get]

I thought this would be simple and searched a lot, but I could not find a way to make it work, so I would appreciate any advice.

python python3

2022-09-30 22:04

2 Answers

There are a few problems with the code in the question:

  • str.find returns the position where "Number of participants" starts, so slicing from that position up to "people" keeps the keyword itself and does not yield just the number
  • Every pass of the loop searches the same string from the beginning, so the second iteration ends up processing the same substring again
  • The full-width "~" and the half-width "~" are being confused
a = 'Event A Number of participants 10 people Time: 18:00~21:00, Event B Number of participants 5 people Time: 18:00~21:10'
k = 'Number of participants'
for _ in range(a.count(k)):
    p = a.find(k)
    if p >= 0:
        a = a[p + len(k):]  # drop the keyword itself so the next find() moves on to the next event
    p = a.find('people')
    if p >= 0:
        mbr = a[:p].strip()
        print(f'number of people {mbr}')
    p = a.find('~')  # must be the same (full-width) character that appears in the text
    if p >= 0:
        print(a[p - 5:p + 1 + 5])  # 5 characters (HH:MM) on each side of the separator
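
As a variation on the loop above, str.find also accepts a start position, so the search can move forward without re-slicing the string each time. The following is only a sketch: the function name extract_counts and its parameters are made up for illustration, and it assumes each count sits between the keyword and the word "people".

```python
def extract_counts(text, keyword='Number of participants', unit='people'):
    """Collect the text between each keyword and the unit word that follows it."""
    counts = []
    pos = 0
    while True:
        start = text.find(keyword, pos)   # search from where the last match ended
        if start < 0:
            break
        start += len(keyword)             # skip the keyword itself
        end = text.find(unit, start)
        if end < 0:
            break
        counts.append(text[start:end].strip())
        pos = end + len(unit)             # continue after this event's unit word
    return counts


sample = ('Event A Number of participants 10 people Time: 18:00~21:00, '
          'Event B Number of participants 5 people Time: 18:00~21:10')
print(extract_counts(sample))
```

Because the original string is never overwritten, it stays available afterwards, for example to look up the times separately.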

Regular expressions (regex) make this kind of processing relatively easy.

  • If you are not used to them, str.find may be easier to work with, so choose whichever fits the case
  • They are usually slower than using str methods
import re
a = 'Event A Number of participants 10 people Time: 18:00~21:00, Event B Number of participants 5 people Time: 18:00~21:10'
mbr = r'Number of participants (\d+)'
tm = r'\d\d:\d\d~\d\d:\d\d'

for m in re.finditer(mbr, a):
    print(f'number of people {m[1]}')

    b = a[m.end():]  # look for the time only after this match
    t = re.search(tm, b)
    if t:
        print(t[0])

Reference: (docs.python.org) re --- Regular expression operations
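
If it is unclear whether the text contains the full-width or the half-width tilde, one option is to normalize the text before matching. This is a sketch and not part of the code above: find_time_ranges is a hypothetical helper, and it assumes the separator is either "~" (U+FF5E) or "~". NFKC normalization folds full-width characters (including ":" and full-width digits) to their ASCII forms, so it changes more than just the tilde. The similar-looking wave dash (U+301C) is a different character and is not folded.

```python
import re
import unicodedata


def find_time_ranges(text):
    """Return (start, end) pairs after folding full-width characters to ASCII."""
    normalized = unicodedata.normalize('NFKC', text)
    # With two groups in the pattern, findall returns a list of tuples.
    return re.findall(r'(\d\d:\d\d)~(\d\d:\d\d)', normalized)


# One range uses the full-width tilde (U+FF5E), the other the half-width one.
mixed = 'Event A 18:00\uff5e21:00, Event B 18:00~21:10'
print(find_time_ranges(mixed))
```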


2022-09-30 22:04

import re

text = '''
Event A Number of participants 10 people Time: 18:00~21:00, Event B Number of participants 5 people Time: 18:00~21:00
'''.strip()

# Group names must be valid Python identifiers, so they cannot contain spaces.
for m in re.finditer(
        r'Event.+?Number of participants (?P<participants>\d+).*?'
        r'Time: (?P<start_time>\d{2}:\d{2})[~~](?P<end_time>\d{2}:\d{2})',  # accepts either tilde
        text):
    print(m.groupdict())

# Output
{'participants': '10', 'start_time': '18:00', 'end_time': '21:00'}
{'participants': '5', 'start_time': '18:00', 'end_time': '21:00'}
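
The values in groupdict() are all strings. If numbers and time objects are more convenient, they can be converted after matching. The sketch below is one possible way to do that: the names PATTERN and parse_events and the group names participants, start_time and end_time are chosen here for illustration, and the times are assumed to always be in HH:MM form.

```python
import re
from datetime import datetime

# Hypothetical pattern: named groups for the count and both times,
# accepting either the half-width or the full-width tilde.
PATTERN = re.compile(
    r'Number of participants (?P<participants>\d+).*?'
    r'Time: (?P<start_time>\d{2}:\d{2})[~~](?P<end_time>\d{2}:\d{2})'
)


def parse_events(text):
    """Return one dict per event with an int count and datetime.time values."""
    events = []
    for m in PATTERN.finditer(text):
        events.append({
            'participants': int(m['participants']),
            'start_time': datetime.strptime(m['start_time'], '%H:%M').time(),
            'end_time': datetime.strptime(m['end_time'], '%H:%M').time(),
        })
    return events


sample = ('Event A Number of participants 10 people Time: 18:00~21:00, '
          'Event B Number of participants 5 people Time: 18:00~21:00')
for event in parse_events(sample):
    print(event)
```

datetime.strptime raises ValueError for a value such as 25:00, which also works as a basic sanity check on the extracted times.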


2022-09-30 22:04
