Removing part of an HTML <a target=...> tag with a regular expression

Asked 2 years ago, Updated 2 years ago, 45 views

import re


a = '<a target="_blank" href="https://ccc"><img src="bbb" border="0"></a>'

b =re.sub("<a+href=[^>]",'',a)

print(b)

I only want to delete the part of a from <a target= through href="address">.

The address after href is different every time, so I wanted to remove it with a regular expression, and I tried the pattern <a+href=[^>]

It's not working... How should I write the regular expression?

Thank you.

python-2.x

2022-09-20 22:13

1 Answer

Regular expressions can vary widely depending on the data they have to match.

We don't know all the patterns in the data you're dealing with, so please treat the expressions below as a reference only.

import re

a = '<a target="_blank" href="https://ccc"><img src="bbb" border="0"></a>'

# Use lookbehind and lookahead, which do not consume characters, so only the text between "<a" and ">" is substituted
b = re.sub(r"(?<=<a).*?(?=>)", "", a)
print(b)

# The opposite approach: match the parts to keep with capture groups, then join them
m = re.search(r"(^<\w+).*?(>.*)", a)
if m:
    print(''.join(m.group(1, 2)))
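As a side note on why the pattern in the question finds nothing: in <a+href=[^>] the + applies to the letter a (one or more a), and [^>] without a quantifier matches only a single character. The sketch below shows this and, as one possible correction (my own variation, assuming the attributes never contain a > character), uses [^>]* to cover the attribute text. The names attrs_removed and whole_tag_removed are made up for illustration.

```python
import re

a = '<a target="_blank" href="https://ccc"><img src="bbb" border="0"></a>'

# The pattern from the question: "a+" means "one or more letter a",
# so "<a+href=" can only match text such as "<ahref=" or "<aahref=".
print(re.search(r"<a+href=[^>]", a))  # None, so re.sub leaves the string unchanged

# "[^>]*" matches any run of characters that are not ">".
# Keep "<a" and ">" via capture groups and drop the attributes in between.
attrs_removed = re.sub(r"(<a)\s[^>]*(>)", r"\1\2", a)
print(attrs_removed)  # <a><img src="bbb" border="0"></a>

# If the whole opening tag should go instead, drop the groups.
whole_tag_removed = re.sub(r"<a\s[^>]*>", "", a)
print(whole_tag_removed)  # <img src="bbb" border="0"></a>
```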

The answer below was based on a misreading of the question...

I thought you wanted to delete the entire highlighted part. Sigh... Just ignore it. -_-;

import re

# The closing tag at the end also needs to be removed.
a = '<a target="_blank" href="https://ccc"><img src="bbb" border="0"></a>'

# If removing both at once is hard, one option is to remove them separately.
b = re.sub("</a>$", "", re.sub("^<a.*?>", "", a))
print(b)

# Generalized to any tag name (raw strings avoid invalid-escape warnings for \w and \s)
b = re.sub(r"</\w+>$", "", re.sub(r"^<\w+(\s.*?)?>", "", a))
print(b)

# More general still:
# capture the inner HTML of the outermost element by matching its opening tag and the closing tag with the same name
m = re.search(r"<(\w+)(\s.*?)?>(.*)</\1>", a)
if m:
    print(m.group(3))

# (Explanation of the regular expression omitted)
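For anyone who wants the last pattern broken down, here is a rough sketch of the same expression written with re.VERBOSE so each piece can carry a comment. The names INNER_HTML and inner_html are hypothetical, chosen only for this illustration, and the second sample string is an assumed input that does not come from the question.

```python
import re

# Same pattern as above, spread out with re.VERBOSE.
# INNER_HTML and inner_html are illustrative names, not part of the original answer.
INNER_HTML = re.compile(
    r"""
    <(\w+)      # group 1: name of the opening tag, e.g. a
    (\s.*?)?    # group 2: optional attributes, starting with whitespace (lazy)
    >           # end of the opening tag
    (.*)        # group 3: everything inside the element (greedy)
    </\1>       # closing tag whose name repeats group 1
    """,
    re.VERBOSE,
)


def inner_html(html):
    """Return the inner HTML of the first matching element, or None."""
    m = INNER_HTML.search(html)
    return m.group(3) if m else None


a = '<a target="_blank" href="https://ccc"><img src="bbb" border="0"></a>'
print(inner_html(a))  # <img src="bbb" border="0">

# Assumed extra sample to show that the backreference follows the tag name.
print(inner_html('<div class="x"><p>hi</p></div>'))  # <p>hi</p>
```

Note that . does not match newlines unless re.DOTALL is passed, so this only handles single-line input as written, and like any regular expression it is not a full HTML parser.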


2022-09-20 22:13

© 2024 OneMinuteCode. All rights reserved.