How do I read dynamically loaded data when crawling with Python 2.7 requests?

Asked 1 year ago, updated 1 year ago, 135 views

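import requests
from bs4 import BeautifulSoup

url = "https://www.whoscored.com/Players/11119/Show/Lionel-Messi"
res = requests.get(url, headers=headers, proxies=proxies)  # headers and proxies are defined elsewhere
page_parser = BeautifulSoup(res.content, "html.parser")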

As practice, I'm trying to scrape Messi's ratings from WhoScored. The table on the site that holds his ratings is loaded dynamically, so when I print the page_parser variable I only get the outer div tag, without the table, tr, and td tags that contain the ratings.

I assume the data arrives via Ajax, so I opened the developer tools and looked through the Network tab, but I couldn't find the request.

Question 1) How can I quickly tell whether the site's rating data is loaded via Ajax or just produced by JavaScript in the page?

Question 2) How can I get the site's rating information without Selenium? (I already have it working with Selenium, but I want to do it with requests.)

python-2.7 python requests crawling

2022-09-22 15:55

1 Answer

First: the developer tools are the most efficient way to find out. Open the Network tab, filter by XHR, and reload the page.

The link below looks like the request that fetches the stats:

https://www.whoscored.com/StatisticsFeed/1/GetPlayerStatistics?category=summary&subcategory=all&statsAccumulationType=0&isCurrent=true&playerId=11119&teamIds=&matchId=&stageId=&tournamentOptions=&sortBy=Rating&sortAscending=&age=&ageComparisonType=&appearances=&appearancesComparisonType=&field=Overall&nationality=&positionOptions=&timeOfTheGameEnd=&timeOfTheGameStart=&isMinApp=false&page=&includeZeroValues=true&numberOfPlayersToPick=
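If that URL really is the stats feed, calling it with requests might look like the sketch below. The query parameters are the non-empty ones from the URL above. Everything else is an assumption: the header values, the `referer` argument, the guess that the server treats omitted parameters the same as empty ones, and the guess that the response body is JSON. The names `build_params` and `fetch_player_stats` are made up for this example.

```python
# Sketch only: parameters come from the XHR URL above; headers and the
# JSON response format are assumptions, not confirmed behaviour of the site.
BASE_URL = "https://www.whoscored.com/StatisticsFeed/1/GetPlayerStatistics"


def build_params(player_id):
    """Return the non-empty query parameters seen in the XHR URL."""
    return {
        "category": "summary",
        "subcategory": "all",
        "statsAccumulationType": 0,
        "isCurrent": "true",
        "playerId": player_id,
        "sortBy": "Rating",
        "field": "Overall",
        "isMinApp": "false",
        "includeZeroValues": "true",
    }


def fetch_player_stats(player_id, referer, proxies=None):
    """Request the stats feed and return the decoded JSON (assumed format)."""
    import requests  # third-party; imported here so build_params needs nothing extra

    headers = {
        # Assumed headers that make the call resemble the browser's XHR.
        "User-Agent": "Mozilla/5.0",
        "X-Requested-With": "XMLHttpRequest",
        "Referer": referer,
    }
    res = requests.get(BASE_URL, params=build_params(player_id),
                       headers=headers, proxies=proxies, timeout=10)
    res.raise_for_status()  # fail loudly if the site rejects the request
    return res.json()
```

Calling `fetch_player_stats(11119, "https://www.whoscored.com/Players/11119/Show/Lionel-Messi")` would then return whatever the feed sends back. If the site rejects the request, compare these headers and parameters with the ones the browser sends in the XHR tab.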

Second: if you don't want to use a browser, you have to handle the JavaScript part yourself. Getting a string such as JSON is easy, but if the page has logic that receives the result and processes it with JavaScript, you need to find a way to run that JavaScript.
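For the easy case, where the data sits in the page as a JSON string assigned to a variable in an inline script, a rough sketch using only the standard library could look like this. The variable name `playerStats` and the sample HTML are invented for illustration and are not taken from WhoScored. The helper only works when the assigned value is strict JSON. A JavaScript literal with unquoted keys or single quotes makes it return None, and that is the case where you would need to run the JavaScript.

```python
import json
import re


def extract_json_assignment(html, var_name):
    """Return the JSON value assigned to var_name in the HTML, or None."""
    # Find "var_name =" and remember where the assigned value starts.
    match = re.search(r"\b" + re.escape(var_name) + r"\s*=\s*", html)
    if match is None:
        return None
    try:
        # raw_decode parses one JSON value and ignores the text after it.
        value, _ = json.JSONDecoder().raw_decode(html, match.end())
    except ValueError:
        return None  # not strict JSON, e.g. a plain JavaScript literal
    return value


# Made-up sample markup, not real WhoScored HTML.
sample = '<script>var playerStats = {"name": "example", "rating": 7.5};</script>'
data = extract_json_assignment(sample, "playerStats")
print(data["rating"])  # prints 7.5
```

The same helper also gives a quick check for Question 1: if the value you see in the browser is not anywhere in res.content, it was most likely fetched by a separate request.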


2022-09-22 15:55
