Am I understanding crawling correctly?

Hello, I'm posting a question because I got stuck while crawling.

The system only opens after you log in. When I press the query button there, Excel-style data is displayed, and that is the data I want to crawl.

Looking at the Network tab of the browser's developer tools, it seems the Excel data is being transferred from the site.

So, to crawl this, is it right to look up JSON crawling techniques?

I copied the request URL and opened it directly from the address bar, and it said the page was inaccessible... so I wonder whether I can crawl it at all. Can I crawl data that is shown only when the user is logged in?

So is the Ajax request handing a session value over to the server where the data is stored?

I'm posting this question because I'm not sure whether I've understood it correctly TT..

python crawling selenium ajax json

2022-09-20 22:32

1 Answer

Conclusion: use the Selenium + headless Chrome combination, which mimics the behavior of a user logging in through a browser themselves.
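As a rough idea of what that looks like, here is a minimal sketch using Selenium 4 with headless Chrome. The URLs, the form field names (`username`, `password`), the submit-button selector, and the assumption that a successful login redirects to another page are all hypothetical; they have to be replaced with whatever the real site uses.

```python
from typing import List


def cookies_to_header(cookies: List[dict]) -> str:
    """Join Selenium-style cookie dicts into one Cookie header value."""
    return "; ".join(f"{c['name']}={c['value']}" for c in cookies)


def fetch_after_login(login_url, data_url, user_id, password):
    """Log in with headless Chrome, then open the page that needs a session."""
    # Imported here so the helper above can be used without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    options = Options()
    options.add_argument("--headless")  # run Chrome without a visible window
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(login_url)
        # Hypothetical selectors: inspect the real login form to find yours.
        driver.find_element(By.NAME, "username").send_keys(user_id)
        driver.find_element(By.NAME, "password").send_keys(password)
        driver.find_element(By.CSS_SELECTOR, "button[type=submit]").click()
        # Assumption: a successful login navigates away from the login page.
        WebDriverWait(driver, 10).until(EC.url_changes(login_url))
        # The browser now holds the session, so this request is "logged in".
        driver.get(data_url)
        return driver.page_source, cookies_to_header(driver.get_cookies())
    finally:
        driver.quit()
```

The returned cookie string is only there to show that the session lives in the browser's cookies; the page source is what you would parse for the data.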

The term 'JSON crawling' you mentioned is a little vague...

JSON is just one of the formats a server can use to return data in response to a request. Still, if I break down the 'JSON crawling' technique you mentioned, I think there are two ways:

1. Sending GET and POST requests with an ordinary HTTP request module

2. Using Selenium + headless Chrome to imitate a user accessing the site directly with a browser

Judging from your question, I'm guessing the data isn't accessible by way 1, so you will need to go with way 2. Be aware that coding way 2 is more complicated than way 1.
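To show why opening the data URL directly fails, and what way 1 needs in order to work, here is a small self-contained sketch using only the Python standard library. It does not talk to your site: `DemoHandler` is a stand-in server I made up, and the `/login` and `/data` paths, the `sid` cookie, and the sample rows are all hypothetical. A real site may add further checks (tokens, JavaScript-driven logins), which is when way 2 becomes necessary.

```python
import json
import threading
import urllib.error
import urllib.request
from http.cookiejar import CookieJar
from http.server import BaseHTTPRequestHandler, HTTPServer

SESSION_ID = "demo-session"  # hypothetical session value


class DemoHandler(BaseHTTPRequestHandler):
    """Stand-in site: POST /login sets a cookie, GET /data requires it."""

    def do_POST(self):
        if self.path != "/login":
            self.send_error(404)
            return
        # Read and discard the submitted credentials.
        self.rfile.read(int(self.headers.get("Content-Length", 0)))
        self.send_response(200)
        self.send_header("Set-Cookie", f"sid={SESSION_ID}; Path=/")
        self.send_header("Content-Length", "0")
        self.end_headers()

    def do_GET(self):
        has_session = f"sid={SESSION_ID}" in self.headers.get("Cookie", "")
        if self.path == "/data" and has_session:
            body = json.dumps({"rows": [[1, "a"], [2, "b"]]}).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(403)  # no session cookie: refuse

    def log_message(self, *args):
        pass  # keep the demo quiet


def run_demo():
    """Return (status without a session, rows fetched with a session)."""
    server = HTTPServer(("127.0.0.1", 0), DemoHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    base = f"http://127.0.0.1:{server.server_port}"
    try:
        # Like pasting the URL into the address bar: no session cookie.
        try:
            urllib.request.urlopen(base + "/data").close()
            anonymous_status = 200
        except urllib.error.HTTPError as err:
            anonymous_status = err.code
            err.close()

        # Way 1: log in first and keep the cookie for later requests.
        opener = urllib.request.build_opener(
            urllib.request.HTTPCookieProcessor(CookieJar())
        )
        opener.open(base + "/login", data=b"id=user&pw=secret").close()
        with opener.open(base + "/data") as resp:
            rows = json.load(resp)["rows"]
        return anonymous_status, rows
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(run_demo())
```

The same endpoint refuses the request that carries no session cookie and returns JSON to the one that does, which is the difference between the address bar attempt and the Ajax call made by the logged-in page.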

2022-09-20 22:32
