About extracting phone numbers and email addresses using regular expressions

Asked 2 years ago, Updated 2 years ago, 17 views

I have a question about the code below. It comes from the book "Let Python do the boring things" and extracts phone numbers and email addresses from text copied to the clipboard, collecting them into a list.

for groups in phone_regex.findall(text):

In this part, what values does groups take on each iteration?
(For example, in for i in range(0, 100), i takes the values 0 through 99.)

Also, in

phone_num='-'.join([groups[1], groups[3], groups[5]])

why is it groups[1] (with an s) rather than group(1)?

import pyperclip as pcl
import re

phone_regex = re.compile(r'''(
    (\d{1,4}|\(\d{1,4}\))            # Area code
    (\s|-)                           # Separator
    (\d{1,4})                        # Local office number
    (\s|-)                           # Separator
    (\d{3,4})                        # Subscriber number
    (\s*(ext|x|ext\.)\s*(\d{2,5}))?  # Extension
    )''', re.VERBOSE)

email_regex = re.compile(r'''(
    [a-zA-Z0-9._%+-]+      # Username
    @
    [a-zA-Z0-9.-]+         # Domain name
    (\.[a-zA-Z]{2,4})      # Dot and what follows it
    )''', re.VERBOSE)

# Search the clipboard text
text = pcl.paste()
matches = []
for groups in phone_regex.findall(text):
    phone_num = '-'.join([groups[1], groups[3], groups[5]])
    if groups[8] != '':
        phone_num += 'x' + groups[8]
    matches.append(phone_num)
for groups in email_regex.findall(text):
    matches.append(groups[0])

if len(matches) > 0:
    pcl.copy('\n'.join(matches))
    print('Copied')
    print('\n'.join(matches))
else:
    print('No phone number or email address found')

python

2022-09-29 22:26

1 Answer

In short, phone_regex.findall(text) returns a list of tuples.

For example, if the text contains 03-1111-2222, 03-3333-4444, the result is as follows. Each tuple consists of the string that matched the whole regular expression, followed by the strings matched by the eight groups inside it.

[('03-1111-2222', '03', '-', '1111', '-', '2222', '', '', ''), ('03-3333-4444', '03', '-', '3333', '-', '4444', '', '', '')]

Therefore, on each iteration groups is assigned one of these tuples, which holds the matched phone number and the strings captured by each group in the regular expression. The tuples arrive in the order in which the matches appear in the text. Since a tuple is a sequence, you get an element by index, as in groups[1]. group(1), by contrast, is a method of match objects, and findall does not return match objects. The variable is probably named groups rather than group because each tuple holds the results of multiple groups in the regular expression.
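To make the difference between groups[1] and group(1) concrete, here is a minimal sketch. It reuses the phone pattern from the question (with the closing parenthesis and quotes restored), and the sample text and the names tuples and match_objects are made up for illustration. It compares the plain tuples from findall with the match objects from finditer.

```python
import re

# Phone pattern from the question, with the closing ")" and quotes restored.
phone_regex = re.compile(r'''(
    (\d{1,4}|\(\d{1,4}\))            # area code
    (\s|-)                           # separator
    (\d{1,4})                        # local office number
    (\s|-)                           # separator
    (\d{3,4})                        # subscriber number
    (\s*(ext|x|ext\.)\s*(\d{2,5}))?  # extension
    )''', re.VERBOSE)

# Illustrative sample text (an assumption, not from the question).
text = '03-1111-2222, 03-3333-4444 ext 55'

# findall returns a list of plain tuples, one tuple per match.
tuples = phone_regex.findall(text)
for groups in tuples:
    # Tuples are indexed with square brackets; unmatched groups are ''.
    print(len(groups), groups[1], groups[3], groups[5], repr(groups[8]))

# finditer yields match objects; group(n) is a method of those objects.
match_objects = list(phone_regex.finditer(text))
for m in match_objects:
    # Tuple index i corresponds to m.group(i + 1); unmatched groups are None.
    print(m.group(2), m.group(4), m.group(6), m.group(9))
```

Note the offset: element i of a findall tuple corresponds to m.group(i + 1), because group(0) of a match object is always the entire match.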

For more information, see the (...) and re.findall entries in the official documentation for the re module (regular expression operations).

(...)
Matches whatever regular expression is inside the parentheses, and indicates the start and end of a group. The contents of a group can be retrieved after a match has been performed, and can be matched later in the string with the \number special sequence. To match the literals '(' or ')', use \( or \), or enclose them inside a character class: [(], [)].
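The three points in that entry can be tried out with a short sketch. The sample strings and variable names here are invented for illustration. It reads a group back from a match object, refers back to a group with \1, and matches literal parentheses in the two ways described.

```python
import re

# 1. A group's contents can be read back from the match object.
m = re.search(r'(\d+)-(\d+)', 'tel 03-1111')
area, local = m.group(1), m.group(2)
print(area, local)

# 2. \1 matches again whatever group 1 matched (here, a doubled word).
repeated = re.search(r'\b(\w+) \1\b', 'this is is a test')
print(repeated.group(0))

# 3. Literal parentheses: escape them, or put them in a character class.
escaped = re.search(r'\(\d+\)', 'call (03) now')
in_class = re.search(r'[(]\d+[)]', 'call (03) now')
print(escaped.group(0), in_class.group(0))
```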

re.findall(pattern, string, flags=0)
Returns all non-overlapping matches of pattern in string, as a list of strings. The string is scanned from left to right, and matches are returned in the order in which they are found. If one or more groups are present in the pattern, returns a list of groups; this will be a list of tuples if the pattern has more than one group. Empty matches are included in the result.
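As a rough illustration of how the return value depends on the number of groups, the sketch below runs three toy patterns (not from the question) over the same string.

```python
import re

text = 'a1 b2 c3'

# No groups: a list of the whole matches.
no_groups = re.findall(r'[a-z]\d', text)

# Exactly one group: a list of the strings captured by that group.
one_group = re.findall(r'[a-z](\d)', text)

# More than one group: a list of tuples, one element per group.
two_groups = re.findall(r'([a-z])(\d)', text)

print(no_groups)   # ['a1', 'b2', 'c3']
print(one_group)   # ['1', '2', '3']
print(two_groups)  # [('a', '1'), ('b', '2'), ('c', '3')]
```

The phone pattern in the question has nine groups in total (the outer one plus eight inside it), so it falls into the third case.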


2022-09-29 22:26
