Building a PHP page that blocks harmful sites containing certain words

Hi, everyone. I'm building a page for my website that blocks harmful sites. For now, I'm trying to use the Snoopy class to fetch and parse pages.

/* Start Snoopy */
$snoopy = new Snoopy;

/* Parsing */
$url = "http://www.naver.com";
$snoopy->fetch($url);

I fetch the web page as shown above. If it contains a specific string, matched with a regular expression, I want to show a page telling the user that the website is blocked.
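A minimal sketch of the check described above, assuming the blocked words are kept in a plain array. The function name `containsBlockedWord` and the sample words are hypothetical and do not come from the original post.

```php
<?php
/**
 * Return true if $html contains any of the blocked words.
 * Hypothetical helper for illustration; matching is case-insensitive
 * for ASCII letters only, because the pattern has no /u modifier.
 */
function containsBlockedWord(string $html, array $blockedWords): bool
{
    if ($blockedWords === []) {
        return false;
    }

    // Drop the markup so tag names and attribute values are not matched.
    $text = strip_tags($html);

    // preg_quote() escapes regex metacharacters inside each word.
    $escaped = array_map(
        static function ($word) {
            return preg_quote($word, '/');
        },
        $blockedWords
    );

    // Build one alternation pattern: /word1|word2|.../i
    $pattern = '/' . implode('|', $escaped) . '/i';

    return preg_match($pattern, $text) === 1;
}
```

With Snoopy, the fetched body should be available in `$snoopy->results` after `fetch()`, so the call would look like `containsBlockedWord($snoopy->results, $blockedWords)`. When it returns true, output the blocking notice instead of the page.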

But with this method, I think only the page fetched from the URL is inspected (in the code above, the Naver main page). Is there a way to check for the string even after the user moves from the fetched site to another page, for example by following an article link?

php dom parser crawling html

2022-09-22 08:07

1 Answer

You have to do it the hard way.

1. After fetching the site, you scan it for words to block, right? In that same step, rewrite the href attribute of every a tag in the page like this:

$new_href = '/parse_and_block?url='.urlencode($href);

2. Make sure your application serves the /parse_and_block route. The route reads the url value from the GET parameters, applies urldecode() if needed (PHP already decodes $_GET values, so an explicit call is usually unnecessary), fetches that address with cURL, parses the contents, and repeats step 1 above.
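The two steps above can be sketched roughly as follows. This is only an illustration under stated assumptions. The function names `rewriteLinks`, `fetchWithCurl`, and `handleParseAndBlock` are hypothetical. Relative links are resolved naively against the site origin. Real pages may also need explicit character-encoding handling before `DOMDocument::loadHTML()`.

```php
<?php
/**
 * Step 1: point every <a href> back at the filtering route.
 * Assumption: relative links are treated as root-relative to the origin.
 */
function rewriteLinks(string $html, string $pageUrl, string $route = '/parse_and_block'): string
{
    $parts  = parse_url($pageUrl);
    $origin = ($parts['scheme'] ?? 'http') . '://' . ($parts['host'] ?? '');
    if (isset($parts['port'])) {
        $origin .= ':' . $parts['port'];
    }

    $dom = new DOMDocument();
    libxml_use_internal_errors(true);   // real-world HTML is rarely valid
    $dom->loadHTML($html);
    libxml_clear_errors();

    foreach ($dom->getElementsByTagName('a') as $a) {
        $href = $a->getAttribute('href');
        if ($href === '' || $href[0] === '#') {
            continue;                   // empty or in-page anchor
        }
        if (!preg_match('#^https?://#i', $href)) {
            if (preg_match('#^[a-z][a-z0-9+.\-]*:#i', $href)) {
                continue;               // mailto:, javascript:, tel:, ...
            }
            $href = $origin . '/' . ltrim($href, '/');
        }
        $a->setAttribute('href', $route . '?url=' . urlencode($href));
    }

    return $dom->saveHTML();
}

/** Fetch a page body with cURL; returns '' on failure. */
function fetchWithCurl(string $url): string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_TIMEOUT        => 10,
    ]);
    $body = curl_exec($ch);
    curl_close($ch);

    return $body === false ? '' : $body;
}

/**
 * Step 2: body of the /parse_and_block route.
 * $query is normally $_GET (already URL-decoded by PHP);
 * $fetch is any callable that maps a URL to its HTML.
 */
function handleParseAndBlock(array $query, callable $fetch, array $blockedWords): string
{
    $url = $query['url'] ?? '';
    if (!preg_match('#^https?://#i', $url)) {
        return 'Invalid URL';
    }

    $html = $fetch($url);
    foreach ($blockedWords as $word) {
        if (stripos($html, $word) !== false) {
            return 'This site is blocked.';
        }
    }

    // Clean page: serve it with links rewritten so the next click is filtered too.
    return rewriteLinks($html, $url);
}
```

In the route script itself this would be called as `echo handleParseAndBlock($_GET, 'fetchWithCurl', $blockedWords);`. Passing the fetcher as a callable is only a design choice here, so the handler can be exercised without network access.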

2022-09-22 08:07
