Question about tags missing from the HTML when crawling with Python

Asked 2 years ago, Updated 2 years ago, 20 views

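#!/usr/bin/python
# -*- coding: utf-8 -*-
import requests
from bs4 import BeautifulSoup

# Fetch the raw HTML exactly as the server sends it (no JavaScript is executed)
url = "http://~~~~"
response = requests.get(url)
plain_text = response.text

soup = BeautifulSoup(plain_text, "lxml")

# Look up the div with id="grid"; find_all is the current name for findAll
ranks = soup.find_all("div", id="grid")
print(ranks)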

Output result

<div class="inner" id="grid"></div>

The part I need to crawl is table > tbody > td inside that div. The contents of the div only show up when I inspect the page with the browser's developer tools.

When I use "View page source" on the page, the div comes out empty, just like in the output above.

In reality there is a table inside the div. How do I get at it when I crawl?

python

2022-09-22 15:07

1 Answer

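I think the page uses AJAX to fill the div dynamically after it loads, which is why the HTML that requests receives contains only the empty div. In this case you have to approach it differently: find the AJAX URL the page calls (for example, in the Network tab of the developer tools) and request that URL directly to get the content.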
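A rough sketch of the direct-request approach, under some assumptions: the endpoint URL is a placeholder (the real one has to be looked up in the Network tab), and I am assuming it returns an HTML fragment containing the table rows. Many endpoints return JSON instead, in which case response.json() would replace the parsing step. parse_rows and fetch_rows are hypothetical helper names, not part of any library.

```python
import requests
from bs4 import BeautifulSoup

# Placeholder: the real AJAX URL must be taken from the browser's Network tab.
AJAX_URL = "http://~~~~"


def parse_rows(html):
    """Return the text of every <td>, grouped by <tr>, from an HTML fragment."""
    soup = BeautifulSoup(html, "html.parser")
    return [
        [td.get_text(strip=True) for td in tr.find_all("td")]
        for tr in soup.find_all("tr")
    ]


def fetch_rows(ajax_url):
    """Request the AJAX endpoint directly and parse the rows it returns."""
    # Some servers only answer requests that look like XHR calls; this header
    # is an assumption and may be unnecessary for a given site.
    headers = {"X-Requested-With": "XMLHttpRequest"}
    response = requests.get(ajax_url, headers=headers, timeout=10)
    response.raise_for_status()
    return parse_rows(response.text)
```

Calling fetch_rows(AJAX_URL) with the real endpoint would then give one list of cell texts per table row, without starting a browser.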

Alternatively, you can use a browser automation tool such as Selenium to obtain the HTML after the page's scripts have finished running. This drives an external browser process, so it is slow and resource intensive.
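A minimal sketch of the Selenium route, assuming Chrome and a matching driver are available on the machine. The div#grid selector comes from the question; the 10-second timeout and the helper names render_page and extract_grid_cells are my own choices for illustration.

```python
from bs4 import BeautifulSoup


def render_page(url, timeout=10):
    """Load the page in headless Chrome and return the HTML after scripts ran."""
    # Imported here so the parsing helper below works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    options = webdriver.ChromeOptions()
    options.add_argument("--headless=new")
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        # Wait until the table has actually been inserted into the div.
        WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, "div#grid table"))
        )
        return driver.page_source
    finally:
        driver.quit()


def extract_grid_cells(html):
    """Return the text of every <td> inside the div with id="grid"."""
    soup = BeautifulSoup(html, "html.parser")
    return [td.get_text(strip=True) for td in soup.select("div#grid table td")]
```

With these, extract_grid_cells(render_page(url)) would replace the requests.get call from the question. The explicit wait matters: reading page_source right after get() can still return the empty div.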

Instead of a heavyweight browser, you can also load the page in an in-memory QtWebKit instance and read the content from there once it has finished loading.


2022-09-22 15:07



© 2024 OneMinuteCode. All rights reserved.