I want to extract this table with the jsoup library and store its content in a “table” array. The first tr tag is the table header. All following rows (the header not included) describe the content.
<table style=h2 width=100% cellspacing="0" cellpadding="4" border="1" bgColor="#FFFFFF">
<tr>
<td align="left" bgcolor="#9999FF" >
<!-- 0 -->
Kl.
</td>
<td align="left" bgcolor="#9999FF" >
<!-- 3 -->
Std.
</td>
<td align="left" bgcolor="#9999FF" >
<!-- 4 -->
Lehrer
</td>
<td align="left" bgcolor="#9999FF" >
<!-- 5 -->
Fach
</td>
<td align="left" bgcolor="#9999FF" >
<!-- 6 -->
Raum
</td>
<td align="left" bgcolor="#9999FF" >
<!-- 7 -->
VLehrer
</td>
<td align="left" bgcolor="#9999FF" >
<!-- 8 -->
VFach
</td>
<td align="left" bgcolor="#9999FF" >
<!-- 9 -->
VRaum
</td>
<td align="left" bgcolor="#9999FF" >
<!-- 13 -->
Info
</td>
</tr>
<tr>
<!-- 1 0 -->
<td align="left" bgcolor="#FFFFFF" >
</td>
<!-- 1 3 -->
<td align="left" bgcolor="#FFFFFF" >
4
</td>
<!-- 1 4 -->
<td align="left" bgcolor="#FFFFFF" >
Méta
</td>
<!-- 1 5 -->
<td align="left" bgcolor="#FFFFFF" >
HU
</td>
<!-- 1 6 -->
<td align="left" bgcolor="#FFFFFF" >
</td>
<!-- 1 7 -->
<td align="left" bgcolor="#FFFFFF" >
Shne
</td>
<!-- 1 8 -->
<td align="left" bgcolor="#FFFFFF" >
</td>
<!-- 1 9 -->
<td align="left" bgcolor="#FFFFFF" >
</td>
<!-- 1 13 -->
<td align="left" bgcolor="#FFFFFF" >
</td>
</tr>
I already tried this one and some others, but I couldn’t get them to work for me:
Using JSoup To Extract HTML Table Contents
Here’s some example code showing how you can select only the header:
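A minimal sketch of such a header selection, assuming jsoup is on the classpath. The class name `HeaderExample`, the method `headerCells`, and the shortened HTML string are illustrative and not taken from the original post:

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

// Illustrative class name; not part of the original answer.
class HeaderExample {

    // Returns the text of every cell in the first <tr> of the document.
    static List<String> headerCells(Document doc) {
        List<String> cells = new ArrayList<>();
        Element tableHeader = doc.select("tr").first();
        if (tableHeader == null) {
            return cells; // the document contains no rows at all
        }
        for (Element cell : tableHeader.children()) {
            // text() returns only text nodes, so the HTML comments are skipped
            cells.add(cell.text());
        }
        return cells;
    }

    public static void main(String[] args) {
        // Shortened stand-in for the table from the question
        String html = "<table>"
                + "<tr><td><!-- 0 -->Kl.</td><td><!-- 3 -->Std.</td></tr>"
                + "<tr><td></td><td>4</td></tr>"
                + "</table>";
        Document doc = Jsoup.parse(html);
        for (String cell : headerCells(doc)) {
            System.out.println(cell);
        }
    }
}
```

Because only the first `tr` is selected, the content rows are ignored here; the same `Document` can be reused later to read them.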
You get the Document by …
parsing a file:
Document doc = Jsoup.parse(f, null);
(where f is the File and null the charset; please see the jsoup documentation for more info)
parsing a website:
Document doc = Jsoup.connect("http://your.url.here").get();
(don’t miss the http://)
The output:
Now, if you need an array (or better, a List) of all entries, you can create a new class in which all the information for each entry is stored. Next, you parse the HTML via jsoup, fill all fields of the class, and add each instance to a list.
Next, the code which fills your entries (including the list where they are stored):
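The following is a sketch of how such a class and the filling code could look, not the original listing. The field names of `Entry` (derived from the header cells), the helper class `EntryParser`, and the rule of skipping rows that do not have exactly nine cells are all assumptions made for illustration:

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

// Hypothetical holder for one content row; fields follow the header order.
class Entry {
    final String kl, std, lehrer, fach, raum, vLehrer, vFach, vRaum, info;

    Entry(List<String> c) {
        kl = c.get(0);
        std = c.get(1);
        lehrer = c.get(2);
        fach = c.get(3);
        raum = c.get(4);
        vLehrer = c.get(5);
        vFach = c.get(6);
        vRaum = c.get(7);
        info = c.get(8);
    }

    @Override
    public String toString() {
        return "Entry{kl=" + kl + ", std=" + std + ", lehrer=" + lehrer
                + ", fach=" + fach + ", raum=" + raum + ", vLehrer=" + vLehrer
                + ", vFach=" + vFach + ", vRaum=" + vRaum + ", info=" + info + "}";
    }
}

// Hypothetical helper that turns the content rows into Entry objects.
class EntryParser {
    static final int COLUMNS = 9; // number of header cells in the question's table

    static List<Entry> parse(Document doc) {
        List<Entry> entries = new ArrayList<>();
        Elements rows = doc.select("tr");
        // Index 0 is the header row, so the content starts at index 1.
        for (int i = 1; i < rows.size(); i++) {
            Elements tds = rows.get(i).select("td");
            if (tds.size() != COLUMNS) {
                continue; // assumption: skip rows that do not match the layout
            }
            List<String> cells = new ArrayList<>();
            for (Element td : tds) {
                cells.add(td.text()); // an empty cell yields an empty string
            }
            entries.add(new Entry(cells));
        }
        return entries;
    }

    public static void main(String[] args) {
        String header = "<tr><td>Kl.</td><td>Std.</td><td>Lehrer</td><td>Fach</td>"
                + "<td>Raum</td><td>VLehrer</td><td>VFach</td><td>VRaum</td><td>Info</td></tr>";
        String row = "<tr><td></td><td>4</td><td>Méta</td><td>HU</td><td></td>"
                + "<td>Shne</td><td></td><td></td><td></td></tr>";
        Document doc = Jsoup.parse("<table>" + header + row + "</table>");
        System.out.println(parse(doc));
    }
}
```

The cells are read one by one with `text()` so that empty cells are kept as empty strings and the column positions stay aligned with the header.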
If you use the HTML from your first post, you’ll get this output:
Note: I simply used System.out.println(entries); for that, so the format of the output comes from the toString() method of Entry.
Please see the jsoup documentation, especially the part on the jsoup selector API.