Hello, everyone. I'm learning Python from scratch, and I've gotten stuck while crawling information from a web page. I searched Google at length but couldn't find a clear solution, so I'm asking for your help.
Question: When I run the code below, if the response code is 200 but the page has no URL (homepage) information, the entire result is printed as None, even when a name or phone number exists. I think something is wrong with the exception handling. How can I change the code so that only the missing URL information is printed as None?
Note: in the return statement (return find_nm.text, find_cat.text, find_tell.text, find_adr.text, find_url.text), removing .text leaves a lot of extraneous information in the output, so I want to keep .text.
//
import requests
from bs4 import BeautifulSoup
def crawl(url):
    headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.100 Safari/537.36'}
    data = requests.get(url=url, headers=headers)
    print(data, "/", url, "/", end=' ')
    return data.content

def parse(pageString):
    try:
        bsObj = BeautifulSoup(pageString, "html.parser")
        # Name
        find_nm = bsObj.find("strong", {"class": "name"})
        # Category
        find_cat = bsObj.find("span", {"class": "category"})
        # Phone number
        find_tell = bsObj.find("div", {"class": "txt"})
        # Address
        find_adr = bsObj.find("span", {"class": "addr"})
        # Homepage
        find_url = bsObj.find("a", {"class": "biz_url"})
        return find_nm.text, find_cat.text, find_tell.text, find_adr.text, find_url.text
    except:
        pass

def printCompanyInfo(code):
    url = "https://store.naver.com/restaurants/detail?id={}".format(code)
    pageString = crawl(url)
    companyInfo = parse(pageString)
    print(companyInfo)

printCompanyInfo("33696029")
printCompanyInfo("13317484")
printCompanyInfo("32287256")
printCompanyInfo("37322689")
printCompanyInfo("36772108")
printCompanyInfo("413454114")
printCompanyInfo("31621852")
printCompanyInfo("13303181")
printCompanyInfo("37127150")
printCompanyInfo("34565498")
//
except Exception as e:
    print(e)
    print(find_nm, find_cat, find_tell, find_adr, find_url)
If you change the except block like this, the printed error shows the cause: the listing has no homepage, so find_url is None, and find_url.text raises an error.
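To see why the whole result becomes None and not just the URL, here is a minimal sketch of the same failure that needs no network access. The function parse_like_question is a hypothetical stand-in for the question's parse, and the SimpleNamespace objects only imitate the .text attribute of a found tag; they are not real BeautifulSoup objects.

```python
from types import SimpleNamespace


def parse_like_question(find_nm, find_url):
    """Hypothetical stand-in for parse(): same try / bare-except shape."""
    try:
        # If find_url is None, find_url.text raises AttributeError here,
        # so the return statement never completes.
        return find_nm.text, find_url.text
    except:  # bare except, as in the question: the error is silently swallowed
        pass
    # Falling off the end of a function returns None implicitly.


name_tag = SimpleNamespace(text="Some Restaurant")    # imitates a found tag
url_tag = SimpleNamespace(text="http://example.com")  # imitates a found link

with_url = parse_like_question(name_tag, url_tag)
without_url = parse_like_question(name_tag, None)  # find() found no homepage

print(with_url)     # ('Some Restaurant', 'http://example.com')
print(without_url)  # None
```

Because the exception is raised while the return tuple is being built, the name that was already found is lost as well, which matches the behavior described in the question.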
hp_url = find_url.text if find_url else ''
I think you can handle it as in the line above, which falls back to an empty string when find_url is None.
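Applied to every field, the same idea could look like the sketch below. The helper name text_or_none and the inline sample HTML are assumptions made for illustration (the real page markup may differ); parse keeps the selectors from the question. It returns None for a missing field, as the question asks; return '' instead if you prefer the form in the line above.

```python
from bs4 import BeautifulSoup


def text_or_none(tag):
    """Return the tag's stripped text, or None when find() found nothing."""
    return tag.get_text(strip=True) if tag is not None else None


def parse(page_string):
    soup = BeautifulSoup(page_string, "html.parser")
    # Same selectors as in the question; each lookup may return None.
    find_nm = soup.find("strong", {"class": "name"})
    find_cat = soup.find("span", {"class": "category"})
    find_tell = soup.find("div", {"class": "txt"})
    find_adr = soup.find("span", {"class": "addr"})
    find_url = soup.find("a", {"class": "biz_url"})
    tags = (find_nm, find_cat, find_tell, find_adr, find_url)
    return tuple(text_or_none(tag) for tag in tags)


# Made-up sample page with no homepage link (no <a class="biz_url">).
sample = """
<strong class="name">Some Restaurant</strong>
<span class="category">Korean</span>
<div class="txt">02-000-0000</div>
<span class="addr">Seoul</span>
"""

info = parse(sample)
print(info)  # ('Some Restaurant', 'Korean', '02-000-0000', 'Seoul', None)
```

With each field checked on its own, the try/except around the whole function is no longer needed to cope with a missing element, so a missing homepage no longer hides the other fields.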