Data crawling question!

Asked 1 year ago, Updated 1 year ago, 365 views

I'm practicing crawling the store information (store name, address, phone number) of Goobne Chicken and saving it as a CSV file, but I'm stuck because I don't understand the page's HTML structure (tags). So far I have opened a Chrome window with WebDriver, triggered the search button, and reached the page that lists stores nationwide, but I haven't been able to extract anything from there. I'm using Selenium with the Chrome WebDriver. I know there are easier ways, but I want to stick with Selenium and WebDriver.

Here is Goobne Chicken's store search page: https://www.goobne.co.kr/store/search_store

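import time

from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from webdriver_manager.chrome import ChromeDriverManager


def ChickenGoobne_store(result):
    ChickenGoobne = "https://www.goobne.co.kr/store/search_store.asp"
    wd = webdriver.Chrome(service=Service(ChromeDriverManager().install()))
    wd.get(ChickenGoobne)
    time.sleep(5)
    wd.get(ChickenGoobne)
    time.sleep(5)
    wd.execute_script("goSearch('S')")  # run the page's own search to list stores nationwide
    time.sleep(5)

    html = wd.page_source
    soupGN = BeautifulSoup(html, 'html.parser')
    storeGN = soupGN.find('div', attrs={'class': 'desc'})
    store_GN = storeGN[5:]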
    ## Output of store_GN, showing the data structure:
    ## <div class="desc"><dl><dt class="name">Gagapsudong</dt><dd class="local">716, Biryong-ro, Sudong-myeon, Namyangju-si, Gyeonggi-do</dd>...

    for i in range(1, 1131):
        store_name_dt = soupGN.select("div.desc > dt.name")  ## TypeError: unhashable type: 'slice'
        store_name = store_name_dt.string
        store_info = store_GN.findAll('dd')
        store_tag = list(store_info.string)
        store_address = store_tag[0]
        store_phone = store_tag[1]

        result.append([store_name]+[store_address]+[store_phone])
    return

I think I only need to fix the for loop, but I don't understand the page structure. How can I crawl the information for every store nationwide?

python crawling webdriver selenium-webdrive selenium

2022-10-25 11:41

1 Answer

Since you say this is stitched-together code that you didn't write yourself, I'll only point you this far so you can study the rest on your own.

https://www.google.com/search?q=beautifulsoup+select+select_one
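
The distinction behind that search can be sketched as follows. The HTML here is a made-up sample modeled on the snippet in the question (store names and addresses are placeholders), so the live page may differ. find() and select_one() return a single Tag, while find_all() and select() return a list of Tags; slicing or looping only makes sense on the list, which is probably why storeGN[5:] fails in the question. Note also that dt.name sits inside a dl, so the child selector div.desc > dt.name matches nothing in this sample.

```python
from bs4 import BeautifulSoup

# Hypothetical markup modeled on the question's snippet; the real page may differ.
SAMPLE_HTML = """
<div class="desc"><dl><dt class="name">Store A</dt><dd class="local">Address A</dd></dl></div>
<div class="desc"><dl><dt class="name">Store B</dt><dd class="local">Address B</dd></dl></div>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

first = soup.select_one("div.desc")  # one Tag (or None), like find()
every = soup.select("div.desc")      # a list of Tags, like find_all()

# "a > b" only matches direct children; dt is a child of dl, not of div.desc.
direct_children = soup.select("div.desc > dt.name")

first_name = first.select_one("dt.name").get_text(strip=True)
all_names = [d.select_one("dt.name").get_text(strip=True) for d in every]

print(first_name)
print(all_names)
print(len(direct_children))
```
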

# select() returns a list of every matching element, so it can be looped over.
storeGN = soupGN.select('div#content div.result-list div.desc')
for desc in storeGN:
    print(desc)
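
Building on that loop, here is a rough sketch of turning each div.desc into a row and saving the rows as CSV. It assumes every div.desc holds a dt.name followed by two dd elements (address first, then phone), as the question's code implies; I have not checked that against the live page. The sample HTML, the phone numbers, the second store, and the helper names parse_stores and write_csv are all made up for illustration.

```python
import csv
import io

from bs4 import BeautifulSoup

# Hypothetical sample; phone numbers and the second store are placeholders.
SAMPLE_HTML = """
<div id="content"><div class="result-list">
  <div class="desc"><dl>
    <dt class="name">Gagapsudong</dt>
    <dd class="local">716, Biryong-ro, Sudong-myeon, Namyangju-si, Gyeonggi-do</dd>
    <dd>000-0000-0000</dd>
  </dl></div>
  <div class="desc"><dl>
    <dt class="name">Sample Store</dt>
    <dd class="local">Sample address</dd>
    <dd>000-0000-0001</dd>
  </dl></div>
</div></div>
"""


def parse_stores(html):
    """Return [name, address, phone] rows, one per div.desc block."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for desc in soup.select("div#content div.result-list div.desc"):
        name = desc.select_one("dt.name")
        details = [dd.get_text(strip=True) for dd in desc.select("dd")]
        # Skip blocks that do not match the assumed layout.
        if name is None or len(details) < 2:
            continue
        rows.append([name.get_text(strip=True), details[0], details[1]])
    return rows


def write_csv(rows, fileobj):
    """Write a header line and the store rows to an open text file object."""
    writer = csv.writer(fileobj)
    writer.writerow(["store_name", "address", "phone"])
    writer.writerows(rows)


rows = parse_stores(SAMPLE_HTML)
buffer = io.StringIO()  # stands in for a real file in this sketch
write_csv(rows, buffer)
print(buffer.getvalue())
```

In the real script you would pass wd.page_source to parse_stores() after the goSearch('S') call, and replace the in-memory buffer with something like open('goobne_stores.csv', 'w', newline='', encoding='utf-8-sig'); the file name is only an example.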


2022-10-25 11:41


© 2024 OneMinuteCode. All rights reserved.