I launched JupyterLab 3.0.14 from Anaconda Navigator on Windows 10. When I copied and pasted the code from "[Selenium] How to break through the scraping measures" and ran it, an error was displayed.
I searched for the error and tried to fix it, but I don't have enough skill to resolve it.
Please advise.
error messages:
NameError                                 Traceback (most recent call last)
<ipython-input-20-ed48c50edfbb> in <module>
----> 1 html=get_page_from_amazon(url)

NameError: name 'url' is not defined
source code:
from selenium import webdriver
from selenium.webdriver.chrome.options import Options

def get_page_from_amazon(url):
    text = ""
    # Launch the browser in headless mode
    options = Options()
    options.add_argument('--headless')
    # Launch the browser
    driver = webdriver.Chrome("chromedriver.exe", options=options)
    driver.get(url)
    driver.implicitly_wait(10)  # Wait up to 10 seconds if an element is not found
    text = driver.page_source
    # Close the browser
    driver.quit()
    return text

html = get_page_from_amazon(url)
soup = BeautifulSoup(html, 'lxml')

html = get_page_from_amazon(url)
reference site URL:
[Selenium] How to break through the scraping measures
The cause of the error is exactly what the message says: the name "url" is not defined. You just need to define the URL to be scraped before calling the function.
url = "https://amazon.co.jp/XXX/XXX/"  # <- Added
html = get_page_from_amazon(url)
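As a minimal sketch of why the order matters, the snippet below reproduces the same kind of NameError without a browser. Note that fetch_page, target_url, and the example.com URL are hypothetical stand-ins I made up for illustration; they are not part of the question's code.

```python
def fetch_page(url):
    # Hypothetical stand-in for get_page_from_amazon(); it only echoes the URL
    # so the example runs without a browser or network access.
    return f"<html>{url}</html>"


def call_without_defining():
    # target_url is never assigned anywhere, so looking it up raises NameError
    # before fetch_page is even called.
    try:
        return fetch_page(target_url)  # noqa: F821
    except NameError as exc:
        return f"NameError: {exc}"


error_text = call_without_defining()
print(error_text)

# Assigning the name before the call is the whole fix.
url = "https://example.com/"  # placeholder URL, not from the question
html = fetch_page(url)
print(html)
```

The parameter name url inside the function definition exists only while the function runs, so the notebook cell that calls the function still needs its own variable holding the address.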
If the error message is in English, it is recommended that you try translating it first.
Machine Translation Results (Example):
NameError: No name 'url' defined
However, the code on the page referenced in the question appears to be only a lightly modified copy of code from another site, so I am not sure whether the scraping itself works properly.