A crawling question using Selenium and Beautiful Soup.

Hello, I'm a beginner who is practicing automatic login and crawling using Python.

I tried to log in to a specific site and crawl the information I wanted, but I got stuck.

I logged in automatically through Selenium.

The problems are as follows:

The information I want is the value 20121206-504, shown at the bottom of the page.

If you look at the page source for 20121206-504 in the second image, it looks like the first image above.

What code should I write to extract this value with Beautiful Soup? My code so far is as follows:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from time import sleep
from selenium.webdriver.support.ui import Select
from bs4 import BeautifulSoup

driver = webdriver.Chrome()
driver.get("https://address")
sleep(1)

# Fill in the login form and submit it.
# find_element(By..., ...) replaces the find_element_by_* helpers, which newer Selenium releases no longer provide.
driver.find_element(By.NAME, 'regi_no').send_keys('ID')
driver.find_element(By.NAME, 'pass').send_keys('password')
driver.find_element(By.XPATH, '/html/body/div/form/center/input[1]').click()
sleep(3)

# Close the four modal dialogs that appear after login.
driver.find_element(By.XPATH, '//*[@id="myModal01"]/div/div/div[3]/button').click()
sleep(2)
driver.find_element(By.XPATH, '//*[@id="myModal02"]/div/div/div[3]/button').click()
sleep(2)
driver.find_element(By.XPATH, '//*[@id="myModal03"]/div/div/div[3]/button').click()
sleep(2)
driver.find_element(By.XPATH, '//*[@id="myModal04"]/div/div/div[3]/button').click()
sleep(2)

# Open the 'grcode' dropdown and pick its second option.
driver.find_element(By.NAME, 'grcode').click()
sleep(2)
driver.find_element(By.XPATH, '/html/body/table[3]/tbody/tr[3]/td[1]/p/font/span/select/option[2]').click()

crawling python

2022-09-20 17:53

1 Answer

Once the web driver has reached the page you want, get the page source from the driver and parse it with Beautiful Soup.

html = driver.page_source
soup = BeautifulSoup(html, 'html.parser')
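
Once the page source is parsed, one way to pull out a value such as 20121206-504 is to search the text nodes for its pattern. The following is only a minimal sketch: the real markup is not shown in the question, so sample_html is a made-up stand-in for driver.page_source, and the names CODE_PATTERN and find_codes are hypothetical. It also assumes the value always has the form of 8 digits, a hyphen, and 3 digits.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical stand-in for driver.page_source; the real page markup is not shown in the question.
sample_html = """
<table>
  <tr>
    <td><font><span>Registration no.</span></font></td>
    <td><font><span> 20121206-504 </span></font></td>
  </tr>
</table>
"""

# Assumption: the wanted value is 8 digits, a hyphen, then 3 digits.
CODE_PATTERN = re.compile(r"\d{8}-\d{3}")

def find_codes(html):
    """Return every substring of the page text that matches CODE_PATTERN."""
    soup = BeautifulSoup(html, "html.parser")
    # find_all(string=...) returns the text nodes whose content matches the regex.
    matching_nodes = soup.find_all(string=CODE_PATTERN)
    return [CODE_PATTERN.search(node).group() for node in matching_nodes]

print(find_codes(sample_html))
```

With the real page you would call find_codes(driver.page_source) instead. If the value sits in an element with a stable id or class, selecting that element directly with soup.find or soup.select_one would likely be more reliable than matching on the text pattern.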

2022-09-20 17:53
