I used Beautiful Soup to collect newspaper article text from a site, but I can't get a line break between the paragraphs.
temp = ""
for n in c:
    # Concatenates the text of every element with nothing in between
    temp = temp + str(n.get_text())
#html
<p>
ABC. DEF.
</p>
<p>
GH. JK.
</p>
<p>
LMN. OPQ.
</p>
I got a result using get_text(), but it isn't what I wanted.
Result obtained
ABC. DEF.GH.JK.LMN. OPQ. <-- the problem: the text that follows each <p></p> is attached directly to the previous paragraph
Desired result
ABC. DEF.
<--- Line break
GH. JK.
<--- Line break
LMN. OPQ.
<--- Line break
You can get the text of each paragraph separately, as shown below, store it in a list, and then use it however you like.
from bs4 import BeautifulSoup
html = """<p>
ABC. DEF.
</p>
<p>
GH. JK.
</p>
<p>
LMN. OPQ.
</p> """
soup = BeautifulSoup(html, "html5lib")
paragraphs = [p.get_text() for p in soup.find_all("p")]
for i, p in enumerate(paragraphs):
    print(f'paragraph {i} -------------')
    print(p)
paragraph 0 -------------
ABC. DEF.
paragraph 1 -------------
GH. JK.
paragraph 2 -------------
LMN. OPQ.
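To get the single string from the question, with one paragraph per line, the list can be joined with a newline. This is only a minimal sketch under a few assumptions: it uses the built-in "html.parser" instead of "html5lib" so nothing extra has to be installed, it passes strip=True to get_text() to trim the newlines around each paragraph, and the variable name text is just for illustration.

```python
from bs4 import BeautifulSoup

html = """<p>
ABC. DEF.
</p>
<p>
GH. JK.
</p>
<p>
LMN. OPQ.
</p>"""

# Assumption: "html.parser" (bundled with Python) is used here;
# "html5lib" from the answer above should behave the same for this input.
soup = BeautifulSoup(html, "html.parser")

# strip=True trims the leading/trailing whitespace of each paragraph's text
paragraphs = [p.get_text(strip=True) for p in soup.find_all("p")]

# Join with a newline so each paragraph ends up on its own line
text = "\n".join(paragraphs)
print(text)
```

If a line break is also wanted after the last paragraph, appending "\n" to text would be one way to do it.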
© 2024 OneMinuteCode. All rights reserved.