Reading information from the Billboard chart website with Java

As the title says, this is code that uses Java to scrape information from the Billboard chart website.

I'm a beginner, so I'm not sure, but as I understand it, I create a URL object with the URL class and read from it through a BufferedReader. Java told me that the URL class requires exception handling, so I wrapped the code in a try/catch. When I run it, the only output is the "parsing error" message from my own catch block. Why can't I read the page?

package s89;

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;


public class BillboardMain3 {
    public static void main(String[] args) {
        String newUrls = "https://www.billboard.com/charts/hot-100/";
        try {
            URL url = new URL(newUrls); // Build the URL object from the address
            // Open a stream on the address; the page is served as UTF-8, not EUC-KR
            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(url.openStream(), "UTF-8"))) {
                String line;
                while ((line = reader.readLine()) != null) { // Read one line at a time
                    if (!line.trim().equals("")) { // Print only non-blank lines
                        System.out.println(line.trim());
                    }
                }
            }
        } catch (Exception e) {
            // Print the exception too, so the actual cause is visible
            System.out.println("Billboard Parsing error!!! " + e);
        }
    }
}

parsing java

2022-09-22 19:31

2 Answers

You must be working through the 200 Java examples. I'm doing that, too. lol
I think the Billboard chart site blocks this kind of scraping. The same code works fine if you put in the address of another site.
If you tweak the code a little, the rest of it seems to work well. I think it's because it has been a while since the book was published.
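
If the site really is rejecting the request, one thing worth trying is sending a browser-like User-Agent header, since the default one Java sends is sometimes refused. This is only a sketch under that assumption: the class name BillboardFetchSketch, the helper names, the "Mozilla/5.0" value and the 10-second timeouts are made up for illustration, and I have not confirmed that Billboard accepts it. It throws on a non-200 status so the real cause shows up instead of a generic message.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class BillboardFetchSketch {

    // Collects the trimmed, non-blank lines from any Reader and closes it.
    static List<String> nonBlankLines(Reader source) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                String trimmed = line.trim();
                if (!trimmed.isEmpty()) {
                    lines.add(trimmed);
                }
            }
        }
        return lines;
    }

    // Fetches the page with an explicit User-Agent (assumed value, see above).
    static List<String> fetch(String address) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(address).openConnection();
        conn.setRequestProperty("User-Agent", "Mozilla/5.0"); // assumption, not verified
        conn.setConnectTimeout(10_000); // hypothetical timeout values
        conn.setReadTimeout(10_000);

        int status = conn.getResponseCode();
        if (status != HttpURLConnection.HTTP_OK) {
            // Report the status so a block (for example 403) is visible
            throw new IOException("HTTP status " + status + " for " + address);
        }
        return nonBlankLines(
                new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        try {
            for (String line : fetch("https://www.billboard.com/charts/hot-100/")) {
                System.out.println(line);
            }
        } catch (IOException e) {
            System.out.println("Billboard request failed: " + e);
        }
    }
}
```

The line filtering is kept separate from the network call so it can be checked on its own, for example by passing it a StringReader.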

2022-09-22 19:31

Here is a Groovy sample for reference.

groovy:000> @Grab(group='org.apache.httpcomponents', module='httpclient', version='4.4')
groovy:001> go
===> null
groovy:000> import org.apache.http.impl.client.*
groovy:000> import org.apache.http.client.methods.*
groovy:000> import org.apache.http.util.*
groovy:000> httpClient = HttpClients.createDefault()
groovy:000> httpGet = new HttpGet("https://www.billboard.com/charts/hot-100/")
groovy:000> response = httpClient.execute(httpGet)
===> HttpResponseProxy{HTTP/1.1 200 OK [Date: Sat, 28 Sep 2019 17:51:51 GMT, Content-Type: text/html; charset=UTF-8, Transfer-Encoding: chunked, Connection: keep-alive, CF-Cache-Status: HIT, Cache-Control: max-age=1, public, s-maxage=300, CF-Ray: 51d7915b2b73a261-ICN, Age: 275, Expect-CT: max-age=604800, report-uri="https://report-uri.cloudflare.com/cdn-cgi/beacon/expect-ct", Last-Modified: Fri, 27 Sep 2019 19:50:29 GMT, Set-Cookie: PGMINFO=cc:kr-ip:116.37.93.4; Max-Age=3600; Path=/; Domain=.billboard.com, Vary: Accept-Encoding, Via: 1.1 varnish (Varnish/5.2), X-Cache-Hits: HIT (25), X-Debug-Cookies: , X-Debug-Log: Removed cookies, X-NX-Host: www.billboard.com, X-Varnish: 917770121 932119920, Server: cloudflare] org.apache.http.client.entity.DecompressingEntity@60d6fdd4}
groovy:000> contents = EntityUtils.toString(response.getEntity(), "UTF-8")
===> <!doctype html>
<html class="" lang="">
<head>

<script>
        _udn = "billboard.com";
    </script>
<script>function utmx_section(){}function utmx(){}(function(){var
                k='67942495-39',d=document,l=d.location,c=d.cookie;
            if(l.search.indexOf('utm_expid='+k)>0)return;
            function f(n){if(c){var i=c.indexOf(n+'=');if(i>-1){var j=c.
                    indexOf(';',i);return escape(c.substring(i+n.length+1,j<0?c.
                    length:j))}}}var x=f('__utmx'),xx=f('__utmxx'),h=l.hash;d.write(
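
For anyone who wants to stay in plain Java, a rough counterpart of the Groovy session might look like the following. Note that it swaps Apache HttpClient for the java.net.http client built into JDK 11 and later, so it is not the same library as above. The class name, the buildRequest helper and the User-Agent value are assumptions for illustration. It prints the status code and the first 500 characters of the body.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

public class BillboardHttpClientSketch {

    // Builds a GET request with an explicit User-Agent (assumed value).
    static HttpRequest buildRequest(String address) {
        return HttpRequest.newBuilder(URI.create(address))
                .header("User-Agent", "Mozilla/5.0") // assumption, not verified
                .GET()
                .build();
    }

    public static void main(String[] args) {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = buildRequest("https://www.billboard.com/charts/hot-100/");
        try {
            // Decode the body as UTF-8, matching the charset in the response above
            HttpResponse<String> response = client.send(
                    request, HttpResponse.BodyHandlers.ofString(StandardCharsets.UTF_8));
            System.out.println("Status: " + response.statusCode());
            String body = response.body();
            System.out.println(body.substring(0, Math.min(500, body.length())));
        } catch (IOException e) {
            System.out.println("Billboard request failed: " + e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // restore the interrupt flag
            System.out.println("Billboard request interrupted");
        }
    }
}
```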

2022-09-22 19:31
