I went to https://store.kakao.com/search/result/product?q=%EC%83%B4%ED%91%B8, pressed F12, and took this request URL from the Network tab: https://store.kakao.com/a/search/products?timestamp=1628246554084&q=%EC%83%B4%ED%91%B8&sort=&td=&size=100&page=0&_=1628246554102
In the Network tab, the response shows content/last/page/timestamp/totalCount
inside data.
from selenium import webdriver
from selenium.webdriver.common.keys import Keys  # Keyboard control
import time  # Library containing time-related functions
from bs4 import BeautifulSoup  # Used when scraping HTML documents
import requests
import json
import pandas as pd
path = "../drive/chromedriver.exe"
driver = webdriver.Chrome(path)
driver.get("https://store.kakao.com/")
time.sleep(1)
search_box = driver.find_element_by_class_name("wrap_util")  # Search area class
search_box.click()  # Trigger a mouse click event
search_box = driver.find_element_by_name("tfSearch")  # Search term input box
search_box.click()  # Trigger a mouse click event
search_box.send_keys("shampoo")  # Search term
search_box.send_keys(Keys.RETURN)  # Send the Enter key to run the search
time.sleep(1)  # Wait 1 second
url="https://store.kakao.com/a/search/products?timestamp=1628246554084&q=%EC%83%B4%ED%91%B8&sort=&td=&size=100&page=0&_=1628246554102"
customer_header = {
"referer":"https://store.kakao.com",
"user-agent":"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36"
}
response = requests.get(url,headers=customer_header)
data = json.loads(response.content)
print(data)
When I print the response like this, I think the contents value is inside data.
print(data["totalCount"])
This does not work, and neither does print(data.get('totalCount')).
I need the totalCount value, but I can't get it no matter what I try, which is frustrating.
json python crawling
print(data['data']['totalCount'])
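The line above works because, as the question itself notes, totalCount sits inside the nested data object rather than at the top level of the response. Here is a minimal sketch of that lookup on a hand-written sample payload. The key names come from the question, but the values are made up, and it is an assumption that the top level holds only data.

```python
import json

# Hypothetical sample shaped like the response described in the question.
# Key names come from the question; the values are invented for illustration.
raw = (
    '{"data": {"contents": [], "last": false, "page": 0, '
    '"timestamp": 1628246554084, "totalCount": 1234}}'
)

data = json.loads(raw)

# Inspect the top-level keys first: here only "data" is present (assumed),
# so a direct lookup of "totalCount" at this level finds nothing.
print(list(data.keys()))
print(data.get("totalCount"))

# Step into the nested "data" object, then read "totalCount".
total_count = data["data"]["totalCount"]
print(total_count)

# Variant that returns None instead of raising if either key is missing.
safe_total = data.get("data", {}).get("totalCount")
print(safe_total)
```

Printing list(data.keys()) on the real response is a quick way to see which level a field lives at before indexing into it.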