Crawling the comment count of a news article with Python

Asked 1 year ago, Updated 1 year ago, 50 views

'https://news.v.daum.net/v/20190728165812603'
I'd like to crawl the number of comments in this article.

import requests

url = 'https://comment.daum.net/apis/v1/ui/single/main/@20190728165812603'

headers = {
    'Authorization': 'Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJncmFudF90eXBlIjoiYWxleF9jcmVkZW50aWFscyIsInNjb3BlIjpbXSwiZXhwIjoxNTY0NjcxNDAwLCJhdXRob3JpdGllcyI6WyJST0xFX0NMSUVOVCJdLCJqdGkiOiI3MDllNDI5MC0yZmJjLTRmOTUtOTJlOC1mMTAzMDk5ZjYyYTciLCJjbGllbnRfaWQiOiIyNkJYQXZLbnk1V0Y1WjA5bHI1azc3WTgifQ.fQU2739LvY9EZLlNs-Go1VlCVEtz-I-JdS_kKJeOLDc',
    'Origin': 'https://news.v.daum.net',
    'Referer': 'https://news.v.daum.net/v/20190728165812603',
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.142 Safari/537.36'
}

resp = requests.get('https://comment.daum.net/apis/v1/posts/@20190728165812603', headers=headers)
data = resp.json()
print(data['commentCount'])

When I run this code, I get 'JSONDecodeError: Expecting value: line 1 column 1 (char 0)', and print(resp) shows <Response [401]>. Can you tell me what's wrong?

crawling requests

2022-09-20 19:02

1 Answer

Printing the parsed JSON response shows the following structure (values omitted):

{'post': 
    {'id': value,
    'forumId': value,
    'postKey': value,
    'flags': value,
    'title': value,
    'url': value,
    'icon': value,
    'commentCount': value,
    'childCount': value,
    'popularOpened': value
    }
}

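commentCount is nested inside the post object, so you have to read post first and then read commentCount from it. Also note that a 401 response means the request was not authorized. In that case the body is evidently not valid JSON, which is why resp.json() raises JSONDecodeError.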
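To illustrate the nesting, here is a minimal sketch that uses a made-up dictionary shaped like the structure above. The values and the helper name get_comment_count are hypothetical and are not part of the Daum API.

```python
# Made-up sample shaped like the structure above; the values are hypothetical.
data = {
    'post': {
        'id': 1,
        'commentCount': 42,
        'childCount': 7,
    }
}


def get_comment_count(data):
    """Return data['post']['commentCount'], or None if either key is missing."""
    post = data.get('post')
    if not isinstance(post, dict):
        return None
    return post.get('commentCount')


print(get_comment_count(data))   # 42
print('commentCount' in data)    # False: the key is not at the top level
print(get_comment_count({}))     # None
```

Using .get() here is only a design choice so that a missing key gives None instead of a KeyError; plain indexing works just as well when the response is known to be complete.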

Try the following:

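import requests

url = 'https://comment.daum.net/apis/v1/ui/single/main/@20190728165812603'

headers = {
    # Placeholder: paste a current 'Bearer ...' value here, e.g. copied from the
    # request headers shown in the browser's developer tools. The token in the
    # question has most likely expired, which would explain the 401.
    'Authorization': 'Bearer <your token>',
    'Origin': 'https://news.v.daum.net',
    'Referer': 'https://news.v.daum.net/v/20190728165812603',
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.142 Safari/537.36'
}

r = requests.get(url, headers=headers)
r.raise_for_status()  # fail early on 401 and other HTTP errors instead of inside r.json()
j = r.json()['post']  # commentCount is nested under 'post'
print(j)
print(j['commentCount'])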
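If you are not sure whether your token is still valid, you can look at its expiry yourself. The Authorization value in the question is a JWT, and its payload carries an exp claim of 1564671400, which is 1 August 2019 (UTC). The sketch below decodes the payload of such a token using only the standard library. The names jwt_expiry and make_sample_token are hypothetical helpers, the sample token is made up, and the code does not verify the signature, so treat it as a quick diagnostic rather than a proper JWT implementation.

```python
import base64
import json
from datetime import datetime, timezone


def jwt_expiry(token):
    """Return the 'exp' claim of a JWT as a UTC datetime, or None if absent.

    Only the payload is decoded; the signature is NOT verified.
    """
    # Accept either the raw token or a full 'Bearer <token>' header value.
    if token.startswith('Bearer '):
        token = token[len('Bearer '):]
    payload_b64 = token.split('.')[1]
    # JWT segments are base64url without padding, so restore the padding.
    payload_b64 += '=' * (-len(payload_b64) % 4)
    payload = json.loads(base64.urlsafe_b64decode(payload_b64))
    exp = payload.get('exp')
    if exp is None:
        return None
    return datetime.fromtimestamp(exp, tz=timezone.utc)


def make_sample_token(payload):
    """Build an unsigned, made-up token for demonstration only."""
    def encode(obj):
        raw = json.dumps(obj).encode()
        return base64.urlsafe_b64encode(raw).rstrip(b'=').decode()
    return '.'.join([encode({'alg': 'none'}), encode(payload), 'signature'])


# Same exp value as the token in the question, wrapped in a made-up token.
sample = make_sample_token({'exp': 1564671400})
expiry = jwt_expiry('Bearer ' + sample)
print(expiry)                               # 2019-08-01 14:56:40+00:00
print(expiry < datetime.now(timezone.utc))  # True: already expired
```

If the expiry is in the past, the server will most likely answer 401 regardless of the other headers, and you need a fresh token before retrying.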


2022-09-20 19:02
