The Archive Base

Editorial Team
Asked: May 26, 2026


I am trying to parse an HTML file that is not well formed (it declares an XHTML DTD). I retrieve it as an InputStream, and I want to use Jsoup to get all the data in the TD cells.
How can I do that with Jsoup?
I have already looked at http://jsoup.org/cookbook/ but I need an example to get started.

Thank you in advance.

I already tried a SAX parser, but I can't get the DTD to work.

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="nl" lang="nl"> 
<TABLE class=personaltable cellSpacing=0 cellPadding=0> 
 <TBODY> 
  <TR class=alternativerow> 
   <TD>Nieuw beltegoed:</TD> 
   <TD>€ 1,00</TD></TR> 
  <TR> 
   <TD>Tegoed vorige periode:  
   <TD>€ 2,00</TD></TD></TR> 
  <TR class=alternativerow> 
   <TD>Tegoed tot 09-11-2011:  
   <TD>€ 10,00</TD></TD></TR> 
  <TR> 
   <TD> 
   <TD height=25></TD> 
  <TR class=alternativerow> 
   <TD>Verbruik sinds nieuw tegoed:</TD> 
   <TD>€ 0,33</TD></TR> 
  <TR> 
   <TD>Ongebruikt tegoed:</TD> 
   <TD>€ 12,00</TD></TR> 
  <TR class=alternativerow> 
   <TD class=f-Orange>Verbruik boven bundel:</TD> 
   <TD class=f-Orange>€ 0,00</TD></TR> 
  <TR> 
   <TD>Verbruik dat niet in de bundel zit*:</TD> 
   <TD>€ 0,00</TD></TR> 
  </TBODY> 
 </TABLE> 
</html> 

Edit:
I am getting a force close. I need to run Jsoup inside my AsyncTask.
Here is the Logcat:

10-20 21:07:36.679: ERROR/AndroidRuntime(1396): FATAL EXCEPTION: main
10-20 21:07:36.679: ERROR/AndroidRuntime(1396): java.lang.NullPointerException
10-20 21:07:36.679: ERROR/AndroidRuntime(1396):     at com.sencide.AndroidLogin$MyTask.onPostExecute(AndroidLogin.java:276)
10-20 21:07:36.679: ERROR/AndroidRuntime(1396):     at com.sencide.AndroidLogin$MyTask.onPostExecute(AndroidLogin.java:1)
10-20 21:07:36.679: ERROR/AndroidRuntime(1396):     at android.os.AsyncTask.finish(AsyncTask.java:417)
10-20 21:07:36.679: ERROR/AndroidRuntime(1396):     at android.os.AsyncTask.access$300(AsyncTask.java:127)
10-20 21:07:36.679: ERROR/AndroidRuntime(1396):     at android.os.AsyncTask$InternalHandler.handleMessage(AsyncTask.java:429)
10-20 21:07:36.679: ERROR/AndroidRuntime(1396):     at android.os.Handler.dispatchMessage(Handler.java:99)
10-20 21:07:36.679: ERROR/AndroidRuntime(1396):     at android.os.Looper.loop(Looper.java:130)
10-20 21:07:36.679: ERROR/AndroidRuntime(1396):     at android.app.ActivityThread.main(ActivityThread.java:3835)
10-20 21:07:36.679: ERROR/AndroidRuntime(1396):     at java.lang.reflect.Method.invokeNative(Native Method)
10-20 21:07:36.679: ERROR/AndroidRuntime(1396):     at java.lang.reflect.Method.invoke(Method.java:507)
10-20 21:07:36.679: ERROR/AndroidRuntime(1396):     at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:847)
10-20 21:07:36.679: ERROR/AndroidRuntime(1396):     at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:605)
10-20 21:07:36.679: ERROR/AndroidRuntime(1396):     at dalvik.system.NativeStart.main(Native Method)

Here is the AsyncTask code:

public class MyTask extends AsyncTask<String, Integer, String> {
    private Elements tdsFromSecondColumn = null;

    protected String doInBackground(String... params) {
        InputStream inputStreamActivity = response.getEntity().getContent();

        BufferedReader reader = new BufferedReader(new InputStreamReader(inputStreamActivity));
        StringBuilder sb = new StringBuilder();
        String line = null;

        while ((line = reader.readLine()) != null) {
            sb.append(line + "\n");
        }

        /******* CLOSE CONNECTION AND STREAM *******/

        System.out.println(sb);
        inputStreamActivity.close();

        String kpn;
        kpn = sb.toString();

        Document doc = Jsoup.parse(kpn);
        Elements tdsFromSecondColumn = doc.select("table.personaltable td:eq(1)");
    }

    @Override
    protected void onPostExecute(String result) {
        //publishProgress(false);
        TextView tv = (TextView)findViewById(R.id.lbl_top);

        for (Element tdFromSecondColumn : tdsFromSecondColumn) {
            //System.out.println(tdFromSecondColumn.text());
            tv.setText("");
            tv.setText(tdFromSecondColumn.text());
        }
    }
}
1 Answer

  1. Editorial Team
    Added an answer on May 26, 2026 at 10:37 am

    So, you have an InputStream and not a URL? Then you should use the Jsoup#parse() method which takes an InputStream:

    Document document = Jsoup.parse(inputStream, charsetName, baseUri);
    // ...
    

    The charsetName should be the charset the document was originally encoded in. You can leave it null to let Jsoup detect it, falling back to UTF-8. The baseUri should be the URL from which the HTML was originally served. You can leave it null; you just won't be able to resolve relative links.
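
To illustrate why both arguments matter, here is a minimal sketch in plain Java. It deliberately does not call Jsoup; the class name `CharsetAndBaseUriSketch`, the helpers `readAll` and `resolve`, and the `example.com` URL are made up for illustration. It decodes the same bytes with a matching and a non-matching charset, and resolves a relative link against a base URI, which is the kind of resolution a base URI makes possible:

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.net.URI;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class CharsetAndBaseUriSketch {

    // Reads the whole stream as text, decoding the bytes with the given charset.
    static String readAll(InputStream in, Charset charset) {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(in, charset))) {
            char[] buffer = new char[4096];
            int n;
            while ((n = reader.read(buffer)) != -1) {
                sb.append(buffer, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    // Resolves a relative link against the URL the page was served from.
    static String resolve(String baseUri, String relativeLink) {
        return URI.create(baseUri).resolve(relativeLink).toString();
    }

    public static void main(String[] args) {
        // "\u20AC" is the euro sign; the bytes below are its UTF-8 encoding.
        String cell = "<td>\u20AC 1,00</td>";
        byte[] utf8 = cell.getBytes(StandardCharsets.UTF_8);

        // Matching charset: the text round-trips unchanged.
        System.out.println(readAll(new ByteArrayInputStream(utf8), StandardCharsets.UTF_8));
        // Non-matching charset: the euro sign is decoded as three unrelated characters.
        System.out.println(readAll(new ByteArrayInputStream(utf8), StandardCharsets.ISO_8859_1));
        // Hypothetical base URL and relative link.
        System.out.println(resolve("http://example.com/account/overview.html", "details.html"));
    }
}
```

Note that the question's `new InputStreamReader(inputStreamActivity)` decodes with the platform default charset. Handing the stream directly to Jsoup#parse() with an explicit charsetName avoids depending on that default.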

    But if you actually have the original URL, then you could also just use Jsoup#connect():

    Document document = Jsoup.connect(url).get();
    // ...
    

    Regardless of how you obtained the Document, you can use CSS selectors to select the elements of interest. See also the Jsoup cookbook on that subject. Here's an example which extracts all the data from the second column (td:eq(1); the index is zero-based) of the <table> with a class name of personaltable:

    Elements tdsFromSecondColumn = document.select("table.personaltable td:eq(1)");
    
    for (Element tdFromSecondColumn : tdsFromSecondColumn) {
        System.out.println(tdFromSecondColumn.text());
    }
    

    which results in:

    € 1,00
    € 2,00
    € 10,00
    
    € 0,33
    € 12,00
    € 0,00
    € 0,00
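
As for the NullPointerException in the question's edit: the most likely cause is that doInBackground() declares a new local variable named tdsFromSecondColumn, so the field with the same name is never assigned and is still null when onPostExecute() loops over it. The sketch below reproduces that pattern in plain Java, without Android or Jsoup; the names `ShadowingSketch`, `BrokenTask` and `FixedTask` are hypothetical, and a `List<String>` stands in for Jsoup's `Elements`:

```java
import java.util.ArrayList;
import java.util.List;

final class ShadowingSketch {

    // Mirrors the question's task: a field is meant to be filled in one
    // method and read in another.
    static class BrokenTask {
        private List<String> cells = null;

        void doInBackground() {
            // BUG: the type in front of the name declares a new local variable
            // that shadows the field, so the field stays null.
            List<String> cells = new ArrayList<>();
            cells.add("\u20AC 1,00");
        }

        int onPostExecute() {
            int count = 0;
            for (String cell : cells) { // iterating over null throws NullPointerException
                count++;
            }
            return count;
        }
    }

    static class FixedTask {
        private List<String> cells = new ArrayList<>();

        void doInBackground() {
            // Assign to the field itself: no type in front of the name.
            cells = new ArrayList<>();
            cells.add("\u20AC 1,00");
        }

        int onPostExecute() {
            int count = 0;
            for (String cell : cells) {
                count++;
            }
            return count;
        }
    }

    public static void main(String[] args) {
        FixedTask fixed = new FixedTask();
        fixed.doInBackground();
        System.out.println("fixed task saw " + fixed.onPostExecute() + " cell(s)");

        BrokenTask broken = new BrokenTask();
        broken.doInBackground();
        try {
            broken.onPostExecute();
        } catch (NullPointerException e) {
            System.out.println("broken task threw NullPointerException");
        }
    }
}
```

Applied to the question's code, that would mean writing `tdsFromSecondColumn = doc.select(...)` without the leading `Elements`. Also note that calling tv.setText() inside the loop overwrites the text on every iteration, so only the last cell would remain visible.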
    