I want to send and receive data through Python's requests module.
If you use the request module, it is revealed that it is a scripted connection to a user-agent or a header like this, but if you arbitrarily modify the host, user-agent, origin, referrer, etc.,
Can I hide it like it's a normal web browser request?
Especially, host, origin, and referrer seem to come out only when you enter the menu in the browser.Is that correct?
python web-crawling http requests
Tell the server which browser to use. In Python's requests
module, the default values are as follows:
python-requests/2.22.0
Anyone can tell you're connecting to Python. Some servers may block this form of user agent string if it is detected. If you replace this user agent string with a Chrome browser, for example, it looks like this.
Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.80 Safari/537.36
In this case, the user agent string alone cannot distinguish whether the server is a Chrome browser or a Python requests
module.
The domain name of the server. For example, if you send an http request to this site, it will be set as follows:
Host: hashcode.co.kr
An error occurs if this header is missing, multiple, or set to an invalid value.
Indicates from which address the request came from when sending a POST request. If the address you receive the request with origin
is different, you may have a problem. Anyway, it's one of the two things that works the same as when you don't set it up, or there's an error, so there's no reason to change it.
Header that tells you which page you have passed and reached. You can use it if you want to fool the server, but user-agent
would normally be enough to change.
In addition to what we've covered here, http has a variety of headers. It's good to look up each header or post a question, but the best way is to try it yourself. Please change the header and send the request. You can use the following code.
import requests
headers = requests.utils.default_headers()
headers.update ({'Header Name to Replace': 'Changed Header Value'})
requests.get('url', headers=headers)
requests.post('url', ..., heaers=headers)
# And so on...
© 2024 OneMinuteCode. All rights reserved.