Node.js Request Module Cache-Control Question

We are currently crawling data from the URL below.

https://project-team.upbit.com/api/v1/disclosure?region=kr&per_page=10

The problem is that the crawled content lags about 5 minutes behind what is actually posted on the homepage. I checked and found that the response has a Cache-Control header set.

So, to avoid receiving cached data, I added cache-control options to the request headers of the Request module, as below.

headers: {
  'cache-control': 'private,no-cache,no-store,must-revalidate,max-age=0',
  'pragma': 'no-cache',
  'expires': 0
}

However, the delay is still there, so it seems the cache-control settings I added are not working. Can you tell me what the problem is?

Or I would appreciate it if you could tell me how to request data that has not been cached. Crawling is so hard ㅜㅜ

crawling node.js

2022-09-21 11:34

1 Answer

Apart from caching on the server, the client may reuse a previous response when the request is exactly the same. I'm not sure whether the Request module behaves that way. Maybe it's because the response header allows caching, but...

url += "&time=" + (new Date()).getTime();

Add a query parameter to the URL whose value changes on every request.
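
Here is a slightly fuller sketch of the same idea, in case it helps. It uses Node's built-in URL class, so it works whether or not the URL already has a query string. The helper name withCacheBuster and the parameter name time are just illustrative, and this assumes the server ignores query parameters it does not recognize, which I have not verified for this API.

```javascript
// Hypothetical helper: returns the URL with a timestamp query parameter
// so that each request URL is unique and cannot match an earlier cached one.
function withCacheBuster(rawUrl, paramName = 'time', now = Date.now()) {
  const url = new URL(rawUrl);
  // searchParams.set adds the parameter, or overwrites it if already present.
  url.searchParams.set(paramName, String(now));
  return url.toString();
}

const base = 'https://project-team.upbit.com/api/v1/disclosure?region=kr&per_page=10';

// Build a fresh URL right before each request.
console.log(withCacheBuster(base));
```

Pass the returned string as the url when you make the request. If an intermediate cache keys on the full URL, every call then looks like a new resource to it.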

Anyway, why doesn't Cache-Control work as intended? 😒
It's a problem I run into all the time. I'd love for someone who knows to explain it.

Just in case, could you try changing 'cache-control' to 'Cache-Control'?

2022-09-21 11:34
