I am trying to read a website using the java.net package classes. The site

Question

Asked: May 26, 2026

I am trying to read a website using the java.net package classes. The site has content, and I can see it manually with the browser's view-source tools. When I request the site from Java and check its response code, the connection succeeds, but the response says the page has no content (HTTP 204). What is going on, and is there a way around this so I can fetch the HTML automatically?

Thanks for your responses. Do you need the URL?

Here is the code:

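   // needs: java.io.BufferedReader, java.io.InputStreamReader,
   //        java.net.HttpURLConnection, java.net.URL
   // "the website" stands for the site's URL string (not given in the question)
   URL hef = new URL(the website);

   // Use a single connection: every openConnection() call creates a new
   // connection object, and each one sends its own request when used
   HttpURLConnection g = (HttpURLConnection) hef.openConnection();
   int kjkj = g.getResponseCode();
   System.out.println(kjkj);
   String j = g.getResponseMessage();
   System.out.println(j);

   BufferedReader kj = null;
   try {
       kj = new BufferedReader(new InputStreamReader(g.getInputStream()));

       // Read each line exactly once; calling readLine() in both the loop
       // test and the loop body would skip every other line
       String y;
       while ((y = kj.readLine()) != null) {
           System.out.println(y);
       }
   } finally {
       if (kj != null) {
           kj.close();
       }
   }

   }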

1 Answer

Editorial Team · Answer 1 · 2026-05-26T01:08:44+00:00

Suggestions:

- Verify that when you access the site manually (with a web browser) you really are getting a 200 response code.
- Make sure that the HTTP request issued by the automated (Java-based) logic matches what an interactive web browser sends. In particular, make sure the User-Agent is identical (some sites purposely alter their responses depending on the agent).
- You can use a packet sniffer, or a tool like Fiddler2, to see exactly what is being sent to and received from the server.
- ~~I'm not sure whether the java.net package is robot-aware, but that could be a factor as well (can you check whether the site has a robots.txt file?).~~

Edit:
Assuming you are using the java.net package's HttpURLConnection class, the "robot" hypothesis doesn't apply.
On the other hand, you'll probably want to use the connection's setRequestProperty() method to set the desired HTTP headers on the request, so that they match those sent by the web browser.
Maybe you can post the relevant portions of your code.
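
To illustrate the setRequestProperty() suggestion, here is a minimal sketch, not a confirmed fix for this particular site. It assumes the server answers differently depending on the User-Agent header; by default HttpURLConnection identifies itself as "Java/<version>". The class name BrowserLikeFetch, the Page holder, and the BROWSER_UA string are made up for the example. Copy the real header values from your own browser (for instance from the request Fiddler2 shows you).

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class BrowserLikeFetch {

    // Example value only (an assumption): replace with the string your browser sends.
    static final String BROWSER_UA =
            "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
            + "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36";

    /** Hypothetical holder for the status code and the body text. */
    static final class Page {
        final int status;
        final String body;

        Page(int status, String body) {
            this.status = status;
            this.body = body;
        }
    }

    /**
     * Requests the address once, on a single connection.
     * If userAgent is null, the default Java User-Agent is sent.
     */
    static Page fetch(String address, String userAgent) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(address).openConnection();
        if (userAgent != null) {
            // Headers must be set before the request is sent
            conn.setRequestProperty("User-Agent", userAgent);
        }
        conn.setRequestProperty("Accept", "text/html,application/xhtml+xml,*/*;q=0.8");

        int status = conn.getResponseCode(); // sends the request
        StringBuilder body = new StringBuilder();
        try {
            // Only a 200 response is read here; a 204 has no body by definition
            if (status == HttpURLConnection.HTTP_OK) {
                try (BufferedReader in = new BufferedReader(
                        new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
                    String line;
                    while ((line = in.readLine()) != null) {
                        body.append(line).append('\n');
                    }
                }
            }
        } finally {
            conn.disconnect();
        }
        return new Page(status, body.toString());
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java BrowserLikeFetch <url>");
            return;
        }
        // Compare the two responses to see whether the User-Agent matters
        Page asJava = fetch(args[0], null);
        Page asBrowser = fetch(args[0], BROWSER_UA);
        System.out.println("default agent: " + asJava.status);
        System.out.println("browser agent: " + asBrowser.status);
        System.out.print(asBrowser.body);
    }
}
```

If both requests still come back as 204, the difference is probably in another header (cookies, Accept-Language, Referer), so compare the full browser request against the Java one. The sketch also assumes the page is UTF-8; adjust the charset if the site declares a different one.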
