How do I crawl the contents of an HTTPS page in PHP?

Asked 2 years ago, Updated 2 years ago, 113 views

I am trying to scrape the contents of an HTTPS page with PHP. HTTP pages were easy to fetch with the Snoopy class, but much of what I have read says HTTPS should be handled with cURL, so I am writing the access code with cURL.

Current code.

<?php
$ch = curl_init();
curl_setopt ($ch, CURLOPT_URL,"https://access address");
curl_setopt ($ch, CURLOPT_SSL_VERIFYPEER, FALSE);
curl_setopt ($ch, CURLOPT_SSLVERSION,3);
curl_setopt ($ch, CURLOPT_HEADER, 0);
curl_setopt ($ch, CURLOPT_POST, 1);
curl_setopt ($ch, CURLOPT_POSTFIELDS, "ID name=ID value&PW name=PW value"));
curl_setopt ($ch, CURLOPT_TIMEOUT, 30);
curl_setopt ($ch, CURLOPT_RETURNTRANSFER, 1);
$result = curl_exec ($ch);
curl_close ($ch);
echo $result;
?>

// Passing the account ID and password as an array instead of a string gave the same result.

The result is a 500 status code. As far as I know, a 500 code means there is an error in the PHP source code. The current environment is Apache with PHP 7. I uncommented the line in php.ini that enables the curl DLL, and I confirmed that the DLL file exists. If you have experienced the same thing or know how to solve it, please give me some advice!

php https crawling

2022-09-21 21:12

1 Answer

There can be many reasons why it doesn't work.

(1) Many sites use a CSRF token to block login attempts that send only an ID/PW, as the code you posted does. That could be the problem.

(2) Is the address you used the one that actually receives the account information via POST? It should be the address the login form submits the ID/PW to, not the address of the page you visit.
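
Points (1) and (2) can be sketched together as follows. This is only a rough outline under assumptions, not the target site's actual login flow: the helper names (extractHiddenField, request, login), the token field name, and both URLs are hypothetical and have to be read from the real login form. The idea is to load the form page first, keep its cookies, copy the hidden token into the POST data, and send everything to the address the form submits to.

```php
<?php
// Hypothetical helper: read the value of a hidden <input> from a login form.
// Which field holds the CSRF token is an assumption; check the real form.
function extractHiddenField(string $html, string $fieldName): ?string
{
    if ($html === '') {
        return null;
    }
    $doc = new DOMDocument();
    // Suppress warnings caused by imperfect real-world HTML.
    @$doc->loadHTML($html);
    foreach ($doc->getElementsByTagName('input') as $input) {
        if ($input->getAttribute('name') === $fieldName) {
            return $input->getAttribute('value');
        }
    }
    return null;
}

// Hypothetical helper: one request that shares cookies through $cookieFile.
// Pass $postFields to send a POST; leave it null for a GET.
function request(string $url, string $cookieFile, ?array $postFields = null)
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_COOKIEJAR, $cookieFile);  // where cookies are saved
    curl_setopt($ch, CURLOPT_COOKIEFILE, $cookieFile); // where cookies are read from
    curl_setopt($ch, CURLOPT_TIMEOUT, 30);
    if ($postFields !== null) {
        curl_setopt($ch, CURLOPT_POST, true);
        curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($postFields));
    }
    $body = curl_exec($ch); // string on success, false on failure
    curl_close($ch);
    return $body;
}

// Hypothetical flow: the caller supplies both URLs and the field names.
function login(string $formUrl, string $postUrl, array $credentials, string $tokenField)
{
    $cookieFile = tempnam(sys_get_temp_dir(), 'cookie');
    $formHtml = request($formUrl, $cookieFile);
    if ($formHtml === false) {
        return false;
    }
    $token = extractHiddenField($formHtml, $tokenField);
    if ($token !== null) {
        $credentials[$tokenField] = $token; // send the token back with the ID/PW
    }
    return request($postUrl, $cookieFile, $credentials);
}
```

A call would look something like login($formUrl, $postUrl, ["ID name" => "ID value", "PW name" => "PW value"], "csrf_token"), where every argument is a placeholder. Some sites deliver the token in a cookie or a meta tag instead of a hidden input, in which case the extraction step would need to change.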

If you just want to access an HTTPS page and read its contents, you can do it like this:

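<?php
// Fetch a page over HTTPS with cURL and print what comes back.
function getHTML($url){
   $ch = curl_init($url);
   // Return the response as a string instead of printing it directly.
   curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
   // Ask for only the first 101 bytes (0-100); remove this line to get the whole page.
   curl_setopt($ch, CURLOPT_RANGE, '0-100');
   // Skip certificate checks. Convenient for testing, but insecure.
   curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
   curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
   $content = curl_exec($ch);
   curl_close($ch);
   echo $url."<br>";

   echo $content;
}

getHTML("https://hashcode.co.kr");
?>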
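
When a request fails it helps to see where it failed, since a plain echo of the result prints nothing in that case. The following is a minimal sketch, assuming hypothetical helper names (describeResult, getHTMLWithDiagnostics), that separates a cURL-level failure from an HTTP error status returned by the remote server. If neither shows up and the browser still gets a 500 from your own script, the PHP error itself is usually recorded in the Apache error log. Two details in the question's code may also be worth checking: the CURLOPT_POSTFIELDS line ends with an extra closing parenthesis, which PHP would reject as a parse error, and a CURLOPT_SSLVERSION of 3 requests SSLv3, which most servers no longer accept.

```php
<?php
// Hypothetical helper: turn the raw cURL outcome into a readable message.
function describeResult($body, int $httpCode, string $curlError): string
{
    if ($body === false) {
        // curl_exec() returned false: DNS, TLS, timeout and similar problems.
        return "cURL failed before any HTTP response: " . $curlError;
    }
    if ($httpCode >= 400) {
        // The request went through, but the remote server reported an error.
        return "Remote server answered with HTTP " . $httpCode;
    }
    return "OK (HTTP " . $httpCode . ", " . strlen($body) . " bytes)";
}

// Hypothetical variant of getHTML() that reports why a request failed.
function getHTMLWithDiagnostics(string $url): string
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_TIMEOUT, 30);
    $body = curl_exec($ch);
    $httpCode = (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);
    $curlError = curl_error($ch);
    curl_close($ch);
    return describeResult($body, $httpCode, $curlError);
}
```

Calling echo getHTMLWithDiagnostics("https://hashcode.co.kr"); would print one of the three messages, which narrows down whether the problem is in the connection, on the remote site, or in the local script.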


2022-09-21 21:12


© 2024 OneMinuteCode. All rights reserved.