
The Archive Base


Editorial Team
Asked: May 29, 2026 (2026-05-29T09:27:20+00:00)

I am making a crawler, and need to get the data from the stream


I am making a crawler and need to read the response body from the stream whether or not the status is 200. curl does this, as does any standard browser.

The following code does not return the content of the response, even though there is some: an exception is thrown carrying the HTTP error status code. I want the output regardless. Is there a way? I would prefer to keep using this library because it handles persistent connections, which is perfect for the type of crawling I am doing.

package test;

import java.net.*;
import java.io.*;

public class Test {

    public static void main(String[] args) {

         try {

            URL url = new URL("http://github.com/XXXXXXXXXXXXXX");
            URLConnection connection = url.openConnection();

            DataInputStream inStream = new DataInputStream(connection.getInputStream());
            String inputLine;

            while ((inputLine = inStream.readLine()) != null) {
                System.out.println(inputLine);
            }
            inStream.close();
        } catch (MalformedURLException me) {
            System.err.println("MalformedURLException: " + me);
        } catch (IOException ioe) {
            System.err.println("IOException: " + ioe);
        }
    }
}

Update: that worked, thanks. Here is what I came up with, just as a rough proof of concept:

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.MalformedURLException;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class Test {

    public static void main(String[] args) {
        HttpURLConnection connection = null;
        StringBuilder body = new StringBuilder();

        try {
            URL url = new URL("http://verelo.com/asdfrwdfgdg");
            connection = (HttpURLConnection) url.openConnection();

            // getInputStream() throws an IOException for 4xx/5xx responses.
            appendAll(connection.getInputStream(), body);
        } catch (MalformedURLException me) {
            System.err.println("MalformedURLException: " + me);
        } catch (IOException ioe) {
            System.err.println("IOException: " + ioe);

            // The body of an error response is available on the error stream.
            // It is null if the connection failed or the server sent no body.
            InputStream error = (connection != null) ? connection.getErrorStream() : null;
            if (error != null) {
                try {
                    appendAll(error, body);
                } catch (IOException ignored) {
                    // Keep whatever was read before the failure.
                }
            }
        }

        System.out.println(body);
    }

    // Reads the whole stream as text (UTF-8 assumed) and closes it.
    private static void appendAll(InputStream in, StringBuilder out) throws IOException {
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                out.append(line).append('\n');
            }
        }
    }
}
1 Answer

  1. Editorial Team
    Added an answer on May 29, 2026 at 9:27 am

    Simple:

    URLConnection connection = url.openConnection();
    InputStream is;
    if (connection instanceof HttpURLConnection) {
       HttpURLConnection httpConn = (HttpURLConnection) connection;
       int statusCode = httpConn.getResponseCode();
       // getInputStream() throws for 4xx/5xx responses, so check the status first.
       is = (statusCode >= 400) ? httpConn.getErrorStream() : httpConn.getInputStream();
    } else {
       is = connection.getInputStream();
    }
    

    See the HttpURLConnection Javadoc for an explanation. The way I would handle this is as follows:

    URLConnection connection = url.openConnection();
    InputStream is = null;
    try {
        is = connection.getInputStream();
    } catch (IOException ioe) {
        if (connection instanceof HttpURLConnection) {
            HttpURLConnection httpConn = (HttpURLConnection) connection;
            int statusCode = httpConn.getResponseCode();
            if (statusCode != 200) {
                is = httpConn.getErrorStream();
            }
        }
    }
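
For reference, here is a minimal self-contained sketch that puts the pieces above together: check the status code first, pick the matching stream, and read it to the end. The class name `BodyReader`, the `Result` holder, and the `fetch` method are hypothetical names chosen for illustration, and the sketch assumes the URL is HTTP or HTTPS and that the body is UTF-8 text. Reading each stream to the end and closing it is also generally what allows the JDK to reuse the underlying connection, which may matter for a crawler that relies on persistent connections.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class BodyReader {

    /** Status code plus response body, whatever the status was. */
    public static final class Result {
        public final int status;
        public final String body;

        Result(int status, String body) {
            this.status = status;
            this.body = body;
        }
    }

    /** Fetches the URL and returns the body for success and error responses alike. */
    public static Result fetch(URL url) throws IOException {
        // Assumption: url uses http or https, so this cast succeeds.
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        // getResponseCode() sends the request but does not throw on 4xx/5xx.
        int status = conn.getResponseCode();
        // For status >= 400 the body is on the error stream; it may be null
        // when the server sent no body at all.
        InputStream in = (status >= 400) ? conn.getErrorStream() : conn.getInputStream();
        if (in == null) {
            return new Result(status, "");
        }
        try (InputStream stream = in) {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[8192];
            int n;
            while ((n = stream.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
            // Assumption: the body is UTF-8 text.
            return new Result(status, new String(buffer.toByteArray(), StandardCharsets.UTF_8));
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.err.println("usage: java BodyReader <http-url>");
            return;
        }
        Result r = fetch(new URL(args[0]));
        System.out.println(r.status);
        System.out.println(r.body);
    }
}
```

Called as `BodyReader.fetch(new URL("http://verelo.com/asdfrwdfgdg"))`, this would be expected to return the error page body together with its status code instead of throwing.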
    