What to do if the CSS attribute selector is not available in Goutte?

Asked 1 year ago, Updated 1 year ago, 63 views

Hello.

I'm writing a tool that collects information using Laravel 5 and Goutte.
On some sites I'm struggling to get the URL of the next page, because the a tags have no class and are instead marked with a custom attribute such as mode="next".

HTML to scrape

<div id="pagenation">
    <ul>
        <li>
            <a mode="prev" href="hogehoge/1">Return</a>
        </li>
        <li>
            <a mode="current" href="hogehoge/2">-</a>
        </li>
        <li>
            <a mode="next" href="hogehoge/3"> proceed</a>
        </li>
    </ul>
</div>

Retrieval code

$url=$crawler->filter('div.pagenation')->filter("a[rel='next']")->attr('href');

If the a tag had a name to match on, I would consider using selectLink(), but
for now I would like to know whether there is a way to do this with a CSS selector in the filter() call.

Could you give me some advice?

php laravel-5 web-scraping

2022-09-30 21:19

1 Answer

I solved it myself.
It is a roundabout approach that does not use a CSS attribute selector: I loop over the list items with each(),
check the text of each a tag, and treat the one whose text matches as the next link.

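$url = null; // set by the inner closure when the "next" link is found
$crawler->filter('ul')->each(function ($row) use (&$url) {
    $row->filter('li')->each(function ($pagenate) use (&$url) {
        // Identify the "next" link by its visible text (" proceed" in the HTML above)
        if (trim($pagenate->text()) == 'proceed') {
            $url = $pagenate->filter('a')->attr('href');
        }
    });
});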
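
For reference, the attribute selector itself should also be usable here. Goutte's filter() converts CSS to XPath through Symfony's CssSelector component, which accepts attribute selectors on arbitrary attributes, so something like $crawler->filter('#pagenation a[mode="next"]')->attr('href') would be expected to work. The code in the question selects div.pagenation (a class) and a[rel='next'], while the HTML has id="pagenation" and mode="next", which is probably why nothing matched. Below is a minimal sketch of the same match using only PHP's built-in DOM extension so it can be run without Goutte; findPagerHref() is a hypothetical helper name, and the XPath is my assumption of what the CSS selector corresponds to.

```php
<?php
// Sample markup from the question (with the id attribute spelled out).
$html = <<<'HTML'
<div id="pagenation">
    <ul>
        <li><a mode="prev" href="hogehoge/1">Return</a></li>
        <li><a mode="current" href="hogehoge/2">-</a></li>
        <li><a mode="next" href="hogehoge/3"> proceed</a></li>
    </ul>
</div>
HTML;

// Hypothetical helper (not part of Goutte): returns the href of the first
// <a> inside #pagenation whose "mode" attribute equals $mode, or null.
// $mode is interpolated into the XPath as-is, so pass trusted values only.
function findPagerHref(string $html, string $mode): ?string
{
    // Collect libxml parse warnings internally instead of printing them.
    libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    $doc->loadHTML($html);

    $xpath = new DOMXPath($doc);
    $nodes = $xpath->query(
        sprintf('//div[@id="pagenation"]//a[@mode="%s"]', $mode)
    );

    if ($nodes === false || $nodes->length === 0) {
        return null;
    }

    return $nodes->item(0)->getAttribute('href');
}

$url = findPagerHref($html, 'next');
echo $url, PHP_EOL; // prints "hogehoge/3"
```

Matching on the attribute avoids depending on the link text, which may change or carry stray whitespace.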


2022-09-30 21:19
