Question about scraping Job Planet after logging in ㅜㅜ

Asked 2 years ago, Updated 2 years ago, 147 views

I need to scrape the annual salary for each position on the Job Planet site, which is only visible after logging in. I tried the code below, but I really can't get it to work.

from bs4 import BeautifulSoup
import http.cookiejar
import urllib.parse
import urllib.request
cj = http.cookiejar.LWPCookieJar()
opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(cj)) 
urllib.request.install_opener(opener)

headers = {'User-Agent': 'Mozilla/5.0'}

params = urllib.parse.urlencode({"mode":"login", "user_email":"*******", "user_password":"******"})
params = params.encode('utf-8')
req = urllib.request.Request("https://www.jobplanet.co.kr/users/sign_in", headers=headers)
rej = urllib.request.Request("https://www.jobplanet.co.kr/companies/20575/salaries/", headers=headers)
res = opener.open(rej)

html = res.read()

python urllib login crawling scraping

2022-09-22 20:25

1 Answer

The site in the question does not authenticate through form data.

Instead, authentication is handled by sending a JSON string containing the credentials in the request body.

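import requests

URL = 'https://www.jobplanet.co.kr/users/sign_in'

user_agent = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.97 Safari/537.36'
# Declare a JSON body so the server parses the credentials as JSON
headers = {'Content-type': 'application/json', 'Accept': 'text/plain', 'User-Agent': user_agent}
# Fill in your own email and password
login_data = {'user': {'email': '', 'password': '', 'remember_me': 'true'}}

# A session keeps the cookies set at login and resends them on later requests
client = requests.Session()
# json= serializes login_data and sends it as the request body
login_response = client.post(URL, json=login_data, headers=headers)
print(login_response.content.decode('utf-8'))

# Request the same URL again with the session to check the logged-in state
index = client.get(URL)
print(index.content.decode('utf-8'))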
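
For anyone who wants to stay with urllib as in the question, the same idea can be sketched with the standard library only. This is a minimal sketch, not a verified solution: it assumes the JSON payload shape from the answer above is still what the site accepts, and the helper names (`build_login_request`, `fetch_salary_html`, `make_opener`) are made up for illustration. The point is that the login request has to be sent through the cookie-aware opener before the salaries page is requested.

```python
import http.cookiejar
import json
import urllib.request

BASE_URL = "https://www.jobplanet.co.kr"
HEADERS = {
    "Content-Type": "application/json",
    "Accept": "text/plain",
    "User-Agent": "Mozilla/5.0",
}


def build_login_request(email, password):
    """Build a POST request whose body is JSON rather than form-encoded data."""
    # Payload shape taken from the answer above; not verified against the live site.
    payload = {"user": {"email": email, "password": password, "remember_me": "true"}}
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + "/users/sign_in", data=body, headers=HEADERS, method="POST"
    )


def make_opener():
    """Create an opener that stores cookies and resends them on later requests."""
    jar = http.cookiejar.CookieJar()
    return urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))


def fetch_salary_html(opener, email, password, company_id):
    """Log in first, then fetch the salaries page with the same opener."""
    # Sending the login request lets the opener's cookie jar pick up session cookies.
    with opener.open(build_login_request(email, password)):
        pass
    page_request = urllib.request.Request(
        f"{BASE_URL}/companies/{company_id}/salaries/",
        headers={"User-Agent": HEADERS["User-Agent"]},
    )
    with opener.open(page_request) as response:
        return response.read().decode("utf-8")


# Usage (needs real credentials and network access; 20575 is the ID from the question):
# html = fetch_salary_html(make_opener(), "you@example.com", "your-password", 20575)
```

The returned HTML can then be passed to BeautifulSoup as the question intended. Which tags hold the salary figures depends on the page markup, so inspect the page before writing any selectors.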


2022-09-22 20:25



© 2024 OneMinuteCode. All rights reserved.