Duplicate Output and TypeError: Unhashable Type: 'ResultSet'

Asked 2 years ago, Updated 2 years ago, 18 views

I wrote the code below. I meant for it to fetch up to 10 items, but instead the same value is printed 100 times, and I also get the message TypeError: unhashable type: 'ResultSet'.

I don't know what to change. Any help would be appreciated.

import requests
from bs4 import BeautifulSoup
from apscheduler.schedulers.blocking import BlockingScheduler

sched = BlockingScheduler()

old_reports = []

def extract_reports(old_reports=[]):
    url = 'http://dart.fss.or.kr/dsac001/mainAll.do'
    r = requests.get(url)
    soup = BeautifulSoup(r.text, 'html.parser')

    lists = soup.select('div.table_list > table > tr')

    reports = []
    for dart in lists[:10]:
        report = dart.find_all("a")
        reports.append(report)

    new_reports = []
    for dart in lists:
        if report not in old_reports:
            new_reports.append(report)

    return new_reports

def send_reports():
    global old_reports
    new_reports = extract_reports(old_reports)
    if new_reports:
        for report in new_reports:
            if ('href' in report[1].attrs):
                print('[{}][{}][{}]'.format(report[0].text.strip(), report[1].text.strip(), report[1].attrs['href'].strip()))
    else :
        print ('[No Announcement]')

    old_reports += new_reports.copy()
    old_reports = list(set(old_reports))

send_reports()

sched.add_job(send_reports, 'interval', minute=5)

sched.start()

python

2022-09-21 12:37

1 Answer

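1. Why the same value is printed repeatedly

In extract_reports(), the final loop that checks for duplicates should be for report in reports rather than for dart in lists. As written, the loop body never uses dart, so report still holds the last value assigned in the previous loop, and that single value is appended once for every row in lists. This looks like a simple slip.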
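The effect of the stale loop variable can be reproduced without any scraping. The following is only a minimal sketch: it uses plain strings in place of the scraped rows, and the names rows, buggy, and fixed are made up for illustration.

```python
# Stand-in data: plain strings instead of scraped <tr> rows (an assumption
# made purely for illustration; the real code works on BeautifulSoup objects).
rows = ["r0", "r1", "r2", "r3", "r4"]
old_reports = []

# First loop: build the list of reports from the first three rows.
reports = []
for row in rows[:3]:
    report = row.upper()
    reports.append(report)

# Buggy second loop: it iterates over rows but never uses `row`, so `report`
# is still the last value left over from the loop above.
buggy = []
for row in rows:
    if report not in old_reports:
        buggy.append(report)

# Fixed second loop: iterate over the reports themselves.
fixed = []
for report in reports:
    if report not in old_reports:
        fixed.append(report)

print(buggy)  # ['R2', 'R2', 'R2', 'R2', 'R2']
print(fixed)  # ['R0', 'R1', 'R2']
```

The buggy loop yields one copy of the last report per row, which matches the symptom of a single value being printed many times.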

2. Cause of TypeError: unhashable type: 'ResultSet'

The error occurs when old_reports is converted to a set. Each element is a BeautifulSoup ResultSet, which is a list subclass and therefore not hashable.

Converting each ResultSet to a tuple first makes it hashable, and the conversion to a set then works fine.

You can convert every ResultSet in old_reports at once with map(tuple, old_reports).
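The same behavior can be seen with an ordinary list subclass. This is a rough sketch only: FakeResultSet is a hypothetical stand-in for the real ResultSet, and the string entries stand in for the tags that find_all() would return.

```python
class FakeResultSet(list):
    """Hypothetical stand-in for a ResultSet: a plain list subclass."""


old_reports = [
    FakeResultSet(["Company A", "Report 1"]),
    FakeResultSet(["Company A", "Report 1"]),  # duplicate entry
    FakeResultSet(["Company B", "Report 2"]),
]

# A list subclass is unhashable, so building a set from it directly fails.
try:
    set(old_reports)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)  # the message mentions "unhashable type"

# Tuples of hashable items are hashable, so convert each entry first,
# deduplicate with a set, then convert back to lists.
unique = set(map(tuple, old_reports))
deduped = sorted(map(list, unique))  # sorted only to get a stable order

print(deduped)  # [['Company A', 'Report 1'], ['Company B', 'Report 2']]
```

Note that a set does not preserve insertion order, which is why the sketch sorts the result before printing.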

3. Additional error

In the call sched.add_job(send_reports, 'interval', minute=5), the keyword argument must be minutes, not minute.

The full code with the changes above applied is as follows.

import requests
from bs4 import BeautifulSoup
from apscheduler.schedulers.blocking import BlockingScheduler

sched = BlockingScheduler()

old_reports = []

def extract_reports(old_reports=[]):
    url = 'http://dart.fss.or.kr/dsac001/mainAll.do'
    r = requests.get(url)
    soup = BeautifulSoup(r.text, 'html.parser')

    lists = soup.select('div.table_list > table > tr')

    reports = []
    for dart in lists[:10]:
        report = dart.find_all("a")
        reports.append(report)

    new_reports = []
    for report in reports: # Changed
        if report not in old_reports:
            new_reports.append(report)

    return new_reports

def send_reports():
    global old_reports
    new_reports = extract_reports(old_reports)
    if new_reports:
        for report in new_reports:
            if ('href' in report[1].attrs):
                print('[{}][{}][{}]'.format(report[0].text.strip(), report[1].text.strip(), report[1].attrs['href'].strip()))
    else :
        print ('[No Announcement]')

    old_reports += new_reports.copy()
    old_reports = list(map(list, set(map(tuple, old_reports)))) # Changed

send_reports()

sched.add_job(send_reports, 'interval', minutes=5) # Changed

sched.start()


2022-09-21 12:37
