Questions about parsing while crawling using jsoup (Android)

Asked 1 year ago, Updated 1 year ago, 109 views

I'm a beginner developer who wants to create an Android app. I have a question about parsing while crawling with jsoup.

I wrote the following code using the select method:

Elements elements = doc.select("tr[tabindex=3]");

As you can see in the screenshot, the Android screen then shows the prices for every country.

What I want is to extract only the price for one specific country, like the one circled in blue. What argument should I pass to the select method in that case?

android java jsoup parsing crawling

2022-09-22 19:09

1 Answer

For quickly testing Java code, I recommend jshell or Groovy (groovysh). Please refer to the session below.

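The selector you want combines the row and the cell: table tr[tabindex=3] td:eq(1). In jsoup, :eq(n) matches on the zero-based sibling index, and the th holding the game title takes index 0, so td:eq(1) is the first price cell, td:eq(2) the second, and so on.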

groovy:000> @Grab(group='org.jsoup', module='jsoup', version='1.11.3')
groovy:001> go
===> null
groovy:000> import org.jsoup.*
===> org.jsoup.*
groovy:000> import org.jsoup.select.*
===> org.jsoup.*, org.jsoup.select.*
groovy:000> doc = Jsoup.connect("https://eshop-prices.com/").get()
groovy:000> doc.select("table tr[tabindex=3]")
===> [<tr data-table-searchable="" tabindex="3">
 <th><a href="https://eshop-prices.com/games/379-1-2-switch">1-2-Switch</a></th>
 <td>$69.95</td>
 <td>€49,99</td>
 <td>€49,99</td>
 <td>€49,99</td>
 <td>$68.24</td>
 <td>€49,99</td>
 <td>€49,99</td>
 <td>1 249,00 Kč</td>
 <td>399,00 kr.</td>
 <td>€49,99</td>
 <td>€49,99</td>
 <td>€49,99</td>
 <td>€49,99</td>
 <td>€49,99</td>
 <td>€49,99</td>
 <td>€49,99</td>
 <td>€49,99</td>
 <td class="l">¥5,378</td>
 <td>€49,99</td>
 <td>€49,99</td>
 <td>€49,99</td>
 <td>€49,99</td>
 <td class="h">$1,399.00</td>
 <td>€49,99</td>
 <td>$79.95</td>
 <td>429,00 kr</td>
 <td>209,80 zł</td>
 <td>€49,99</td>
 <td>€49,99</td>
 <td>3.749,00 ₽</td>
 <td>€49,99</td>
 <td>€49,99</td>
 <td>R779.00</td>
 <td>€49,99</td>
 <td>449,00 kr</td>
 <td>CHF64.90</td>
 <td>£39.99</td>
 <td>$49.99</td>
</tr>]
groovy:000> doc.select("table tr[tabindex=3] td")
===> [<td>$69.95</td>, <td>€49,99</td>, <td>€49,99</td>, <td>€49,99</td>, <td>$68.24</td>, <td>€49,99</td>, <td>€49,99</td>, <td>1 249,00 Kč</td>, <td>399,00 kr.</td>, <td>€49,99</td>, <td>€49,99</td>, <td>€49,99</td>, <td>€49,99</td>, <td>€49,99</td>, <td>€49,99</td>, <td>€49,99</td>, <td>€49,99</td>, <td class="l">¥5,378</td>, <td>€49,99</td>, <td>€49,99</td>, <td>€49,99</td>, <td>€49,99</td>, <td class="h">$1,399.00</td>, <td>€49,99</td>, <td>$79.95</td>, <td>429,00 kr</td>, <td>209,80 zł</td>, <td>€49,99</td>, <td>€49,99</td>, <td>3.749,00 ₽</td>, <td>€49,99</td>, <td>€49,99</td>, <td>R779.00</td>, <td>€49,99</td>, <td>449,00 kr</td>, <td>CHF64.90</td>, <td>£39.99</td>, <td>$49.99</td>]
groovy:000> doc.select("table tr[tabindex=3] td:eq(1)")
===> [<td>$69.95</td>]
groovy:000> doc.select("table tr[tabindex=3] td:eq(2)")
===> [<td>€49,99</td>]
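
For reference, the same lookup in plain Java might look roughly like the sketch below. It parses a shortened copy of the row from the session above instead of fetching the live page, so the markup is only an approximation of the real site. The class name PriceCellExample, the constant SAMPLE_HTML, and the helper priceAt are made up for illustration.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class PriceCellExample {
    // Shortened copy of the row from the session (only the first three price cells).
    // \u20ac is the euro sign, escaped to avoid source-encoding issues.
    static final String SAMPLE_HTML =
        "<table><tbody>"
        + "<tr data-table-searchable=\"\" tabindex=\"3\">"
        + "<th><a href=\"https://eshop-prices.com/games/379-1-2-switch\">1-2-Switch</a></th>"
        + "<td>$69.95</td><td>\u20ac49,99</td><td>\u20ac49,99</td>"
        + "</tr>"
        + "</tbody></table>";

    // Hypothetical helper: returns the text of the td whose zero-based sibling
    // index equals siblingIndex in the tabindex=3 row, or null if nothing matches.
    static String priceAt(Document doc, int siblingIndex) {
        Element cell = doc.selectFirst("table tr[tabindex=3] td:eq(" + siblingIndex + ")");
        return cell == null ? null : cell.text();
    }

    public static void main(String[] args) {
        Document doc = Jsoup.parse(SAMPLE_HTML);
        System.out.println(priceAt(doc, 1)); // prints $69.95
        System.out.println(priceAt(doc, 0)); // prints null: index 0 is the th, not a td
    }
}
```

In an Android app you would build the Document with Jsoup.connect(url).get() instead of Jsoup.parse, and that call has to run off the main thread. selectFirst is available from jsoup 1.11.1, so it works with the 1.11.3 version used in the session.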


2022-09-22 19:09

© 2024 OneMinuteCode. All rights reserved.