I want to match strings in a CSV file using a regular expression

Asked 1 year ago, Updated 1 year ago, 61 views

I want to delete everything from the second hyphen onward in each URL in out.csv.

The following error appears. I know it is a TypeError, but
I do not understand what is causing it or how to fix it.

TypeError: findall() missing 1 required positional argument: 'string'
import re
import csv

with open('out.csv', encoding='utf-8') as f:
    reader=csv.reader(f)
    URL = re.findall(r'^[^-]*-[^-]*')
    for URL in reader:
        print(f'{URL}')
https://www.abcde.com/-0w69e7e1w00- Ayeo
https://www.abcde.com/-0w69e7e9w70- Kakikakeko Corporation
https://www.abcde.com/-0w08e1e0w00- This is the last time I'm sorry.
https://www.abcde.com/-0w69e7e1w70- Right away
https://www.abcde.com/-0w69e6e2w54- What's wrong with you?

python3 regular-expression csv

2022-09-29 22:29

1 Answer

re.findall has the signature shown below, so you must pass the string to be searched as the second argument. The error occurs because that second argument is missing.

re.findall(pattern, string, flags=0)
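
To make the signature concrete, here is a minimal sketch that calls re.findall with both arguments on one of the sample lines from the question. The variable names line and matches are only illustrative and do not come from the original code.

```python
import re

# One sample line taken from the question's out.csv.
line = "https://www.abcde.com/-0w69e7e1w00- Ayeo"

# re.findall(pattern) alone raises the TypeError from the question;
# passing the string to search as the second argument avoids it.
matches = re.findall(r'^[^-]*-[^-]*', line)
print(matches)  # ['https://www.abcde.com/-0w69e7e1w00']
```

Because the pattern contains no capture groups, re.findall returns a list of whole matches, so the trimmed URL is matches[0].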

Moreover, the code in the question assigns the result of re.findall to URL and then immediately overwrites URL with each CSV row in the for loop, so the result is never used and the re.findall call has no effect.

URL=re.findall(r'^[^-]*-[^-]*')
for URL in reader:

Perhaps what you actually meant to use is re.compile?
It would not be surprising to call that with a single argument.

re.compile(pattern,flags=0)

Even if you switch to it, you must use different variable names for the compiled pattern object returned by re.compile and for the variable that holds each CSV row.
You then use the compiled pattern object inside the for loop.
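
As a rough sketch of that approach, the following compiles the question's pattern once and applies it to each row from csv.reader. It assumes the URL sits in the first column of every row, and it uses an in-memory io.StringIO with two of the question's lines as a stand-in for the real out.csv. The names pattern, row, and trimmed are illustrative.

```python
import csv
import io
import re

# Compiled once, under a name the loop variable does not overwrite.
pattern = re.compile(r'^[^-]*-[^-]*')

# Stand-in for open('out.csv', encoding='utf-8'), using lines from the question.
sample = io.StringIO(
    "https://www.abcde.com/-0w69e7e1w00- Ayeo\n"
    "https://www.abcde.com/-0w69e7e9w70- Kakikakeko Corporation\n"
)

trimmed = []
for row in csv.reader(sample):
    if not row:
        continue  # skip blank lines
    # Assumption: the URL is in the first column of each row.
    m = pattern.match(row[0])
    if m:
        trimmed.append(m.group(0))

print(trimmed)
```

With a real file, replacing sample with the file object from open('out.csv', encoding='utf-8', newline='') should behave the same way.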

Alternatively, if you want to keep re.findall, you should either read and process the file as plain text rather than through the csv module, or call re.findall on each row inside the for loop.
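
One way to keep re.findall, sketched under the assumption that the whole file fits in memory, is to read it as a single string and let re.MULTILINE anchor the pattern at every line start. The string text below stands in for the result of open('out.csv', encoding='utf-8').read(), and the pattern is a slight variation of the question's that also excludes newlines.

```python
import re

# Stand-in for the whole file content read as one string.
text = (
    "https://www.abcde.com/-0w08e1e0w00- This is the last time I'm sorry.\n"
    "https://www.abcde.com/-0w69e7e1w70- Right away\n"
)

# re.MULTILINE makes ^ match at the start of every line, not only at the
# start of the text. Excluding \n keeps a match from running across lines.
urls = re.findall(r'^[^-\n]*-[^-\n]*', text, flags=re.MULTILINE)
print(urls)
# ['https://www.abcde.com/-0w08e1e0w00', 'https://www.abcde.com/-0w69e7e1w70']
```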

By the way, I think these articles will help you with regular expressions for URLs.
How do I match Python regular expressions to a specific URL?
Check and extract URLs with python regular expressions
Retrieve domain in url regular expression (python)
Python regular expression again-match url
gruber/Liberal Regex Pattern for Web URLs
Characters allowed and not allowed in the URL
Detecting URLs by Regular Expressions

For a start, here is a version that stays close to the original code. It does not use the csv module, so it is a little rough, but it looks like this.

import re

pattern=re.compile(r"(https?:\/\/[\w:%#$&\?\(\)~\.=\+\-]+\/-[^-,]*)-[^-,]*")

with open('out.csv', encoding='utf-8') as f:
    for row in f.readlines():
        m=pattern.match(row)
        if m:
            print(m.group(1))

Here's the result:

https://www.abcde.com/-0w69e7e1w00
https://www.abcde.com/-0w69e7e9w70
https://www.abcde.com/-0w08e1e0w00
https://www.abcde.com/-0w69e7e1w70
https://www.abcde.com/-0w69e6e2w54


2022-09-29 22:29
