Python Selenium crawling question (URL modified)

Asked 2 years ago, Updated 2 years ago, 57 views

I'm trying to crawl the products that match certain conditions on a site. I wrote the code with Selenium because the product list seemed to be loaded dynamically.

On this site, the URL stays the same whether you are viewing the full product list or a list filtered by a condition. ex) Product/data (full product list), Product/data (product color: pink checked)

So the filtering is controlled by JavaScript, right?

So I'm using Selenium to enter a condition, run the search, and get the list of matching products... But when I read the HTML with driver.page_source, I get the HTML for the full product list!

In other words, with the pink condition set, I should get HTML containing only the pink products, but I get the HTML for the full product list instead.

Strangely, when Selenium runs, the browser does move to the filtered page, and at that point only the pink products are listed. Inspecting the page with the developer tools also shows the right products.

Yet the HTML I get from crawling contains the entire product list. The filtered results are not there; it only has the products shown when the page first loads.

What is going on, and what should I do? I couldn't find an answer by searching, so I'm posting the question here.

I'll post the code, too.

The site URL, id, and password below are placeholder values.

from selenium import webdriver
from selenium.webdriver.common.by import By


def get_product():
    driver = webdriver.Chrome()  # assumes chromedriver can be found, e.g. on PATH
    driver.implicitly_wait(3)

    driver.get('http://carmanager.co.kr/')

    driver.find_element(By.NAME, 'userid').send_keys('id')
    driver.find_element(By.NAME, 'userpwd').send_keys('pwd')

    driver.find_element(By.XPATH, '//*[@id="ui_loginarea"]/tbody/tr/td[2]/button').click()  # Log in
    driver.implicitly_wait(5)

    driver.get('http://carmanager.co.kr/Car/Data')
    driver.find_element(By.NAME, 'search_num').send_keys('12da4506')

    driver.find_element(By.XPATH, '//*[@id="search"]/div/div[4]/table/tbody/tr/td[2]/button').click()  # Search for products
    driver.implicitly_wait(20)

    html = driver.page_source  # Expected: HTML of the search results only
    return html
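
One way to confirm what the returned HTML actually contains is to count the table rows that mention the search term. This is only a minimal sketch using the standard library: it assumes the products are rendered as <tr> rows, which has not been verified for this site, and RowTextCollector and rows_containing are names made up for illustration.

```python
from html.parser import HTMLParser


class RowTextCollector(HTMLParser):
    """Collect the visible text of every top-level <tr> in an HTML document."""

    def __init__(self):
        super().__init__()
        self.rows = []      # text of each completed row
        self._depth = 0     # greater than 0 while inside a <tr>
        self._chunks = []   # text pieces of the row being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._depth += 1
            if self._depth == 1:
                self._chunks = []

    def handle_endtag(self, tag):
        if tag == "tr" and self._depth > 0:
            self._depth -= 1
            if self._depth == 0:
                self.rows.append(" ".join(self._chunks))

    def handle_data(self, data):
        if self._depth > 0 and data.strip():
            self._chunks.append(data.strip())


def rows_containing(html, term):
    """Return (matching_rows, total_rows) for a case-insensitive search term."""
    parser = RowTextCollector()
    parser.feed(html)
    matching = [row for row in parser.rows if term.lower() in row.lower()]
    return len(matching), len(parser.rows)
```

Calling rows_containing(html, '12da4506') on the string returned by get_product() gives a quick picture: many rows with few matches would suggest the HTML was captured before the filter took effect.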

python selenium crawling

2022-09-22 13:03

1 Answer

Try it in headless mode.

options = webdriver.ChromeOptions()
options.add_argument('--headless')

driver = webdriver.Chrome(options=options)

Also, try putting the driver.implicitly_wait(5) call right after driver.get('http://carmanager.co.kr/').

driver.get('http://carmanager.co.kr/')
driver.implicitly_wait(5)
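
For context, implicitly_wait does not pause the script; it only sets how long later find_element calls keep retrying. One possible explanation for the stale HTML (an assumption, not confirmed for this site) is that page_source is read before the search results have been rendered. A minimal sketch of an explicit polling wait follows; wait_for_source_change is a hypothetical helper, not part of Selenium, and it works with any object that exposes a page_source attribute.

```python
import time


def wait_for_source_change(driver, old_source, timeout=10.0, poll=0.2):
    """Poll driver.page_source until it differs from old_source.

    Returns the new source, or raises TimeoutError once `timeout` seconds
    have passed without any change.
    """
    deadline = time.monotonic() + timeout
    while True:
        current = driver.page_source
        if current != old_source:
            return current
        if time.monotonic() >= deadline:
            raise TimeoutError("page source did not change within the timeout")
        time.sleep(poll)


# Possible usage with the question's code (not run here):
#   before = driver.page_source
#   search_button.click()            # search_button: the located search button
#   html = wait_for_source_change(driver, before)
```

Selenium's own WebDriverWait, combined with a condition on a specific result element, is usually more precise, since page_source can also change for reasons unrelated to the search.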


2022-09-22 13:03



© 2024 OneMinuteCode. All rights reserved.