Question about scraping Google search results (beginner)

Asked 2 years ago, Updated 2 years ago, 131 views

Hello. I would like to search Google for "daeboo", as in the code below, and store the movie title from the results in a variable.

Q1. Why does the first script print None?
Q2. The second script, which uses Selenium, extracts the title correctly. What is the difference between the two?

import requests
from bs4 import BeautifulSoup


url = "https://www.google.com/search?q=daeboo"
html = requests.get(url)
soup = BeautifulSoup(html.text, 'html.parser')

title = soup.find("div", {"class" :"kno-ecr-pt"})

print(title)

=====================================================


from bs4 import BeautifulSoup
from selenium import webdriver


url = "https://www.google.com/search?q=daeboo"
driver = webdriver.Chrome('/Users/jameskwon/Downloads/chromedriver')
driver.get(url)

html = driver.page_source
soup = BeautifulSoup(html, 'html.parser')

r = soup.select_one('div .kno-ecr-pt').text

print(r)

driver.close()

python beautifulsoup selenium

2022-09-22 19:08

1 Answer

A1. title = soup.find("div", {"class" :"kno-ecr-pt"}) returns None because the HTML that requests receives contains no matching element. You are probably assuming that requests returns the same HTML you see on Google's search results page in the browser, but it does not.

After the initial HTML is downloaded, the page loads more data asynchronously with JavaScript. That step happens in the browser, so requests alone cannot give you the result you want.
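
One way to confirm this is to check whether the class appears at all in the HTML you actually received. The sketch below is only an illustration: it uses the standard library's html.parser so it runs without extra packages, the names ClassTextFinder and find_text_by_class are made up for this example, and the two HTML strings are hand-written stand-ins for "what requests gets" and "what the browser ends up with", not real Google responses.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class ClassTextFinder(HTMLParser):
    """Collects the text inside the first element carrying a given CSS class."""

    def __init__(self, target_class):
        super().__init__()
        self.target_class = target_class
        self.depth = 0        # > 0 while we are inside the matching element
        self.found = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self.depth:
            self.depth += 1
        elif not self.found:
            classes = (dict(attrs).get("class") or "").split()
            if self.target_class in classes:
                self.found = True
                self.depth = 1

    def handle_endtag(self, tag):
        if self.depth and tag not in VOID_TAGS:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.chunks.append(data)


def find_text_by_class(html, target_class):
    """Return the element's text, or None when no element has that class."""
    finder = ClassTextFinder(target_class)
    finder.feed(html)
    finder.close()
    return "".join(finder.chunks).strip() if finder.found else None


# Hypothetical stand-in for the raw response: the container is still empty.
static_html = (
    "<html><body><div id='main'></div>"
    "<script>/* fills #main later */</script></body></html>"
)
# Hypothetical stand-in for the page after JavaScript has run in a browser.
rendered_html = (
    "<html><body><div id='main'><div class='kno-ecr-pt'>"
    "<span>Example Title</span></div></div></body></html>"
)

print(find_text_by_class(static_html, "kno-ecr-pt"))    # None
print(find_text_by_class(rendered_html, "kno-ecr-pt"))  # Example Title
```

With the real scripts, the same check amounts to testing whether "kno-ecr-pt" occurs in html.text. It also helps to test the result of soup.find for None before reading .text from it, since calling .text on None raises AttributeError.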

A2. Selenium launches a real browser and returns the page after the browser has processed it, so page_source includes the content produced by JavaScript. That is why the second script finds the element.

You might conclude that Selenium is all you need, but each approach has trade-offs. Above all, Selenium has to run a browser, so it uses a lot of resources and is slow. requests, by contrast, is lightweight and fast.
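
Given that trade-off, one possible pattern is to try the cheap method first and start a browser only when the element is missing. The sketch below is an assumption-laden illustration, not a recommended implementation: first_successful, the fake fetchers, and extract_title are all hypothetical names, the fetchers return canned strings instead of calling requests or Selenium, and the string search in extract_title merely stands in for a real parser call such as soup.select_one.

```python
def first_successful(url, fetchers, extract):
    """Try each (name, fetch) pair in order.

    Returns (name, value) for the first fetcher whose HTML yields a value,
    or (None, None) if none of them does.
    """
    for name, fetch in fetchers:
        value = extract(fetch(url))
        if value is not None:
            return name, value
    return None, None


calls = []  # records which fetchers actually ran


def fake_requests_fetch(url):
    # Stand-in for requests.get(url).text: fast, but JavaScript has not run.
    calls.append("requests")
    return "<div id='main'></div>"


def fake_browser_fetch(url):
    # Stand-in for driver.page_source: slow, but fully rendered.
    calls.append("browser")
    return "<div id='main'><div class='kno-ecr-pt'>Example Title</div></div>"


def extract_title(html):
    # Crude stand-in for a real HTML parser lookup; returns None when absent.
    marker = "class='kno-ecr-pt'>"
    start = html.find(marker)
    if start == -1:
        return None
    start += len(marker)
    return html[start:html.find("<", start)]


fetchers = [("requests", fake_requests_fetch), ("browser", fake_browser_fetch)]
source, title = first_successful("https://example.com", fetchers, extract_title)
print(source, title)  # browser Example Title
print(calls)          # ['requests', 'browser']
```

Ordering the fetchers from cheapest to most expensive means the browser only starts for pages that really need it.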


2022-09-22 19:08



© 2024 OneMinuteCode. All rights reserved.