I need some help extracting substrings from the table at this link (http://www.informatik.uni-trier.de/~ley/pers/hd/k/Kumar:G=_Praveen.htm).
I need to extract ONLY the names of the authors and store them in a 2D array, one row per table entry.
For example:
a[0][0] = G. Praveen Kumar, a[0][1] = Anirban Sakar
a[1][0] = G. Praveen Kumar, a[1][1] = Arjun Kumar Murmu, a[1][2] = Biswas Parajuli, a[1][3] = Prasenjit Choudhury
and so on for each following row, until the end of the table.
In each entry the author names are separated by commas, and the last name is followed by a colon (:) and then the title of the article.
I do not want the article title stored in the 2D array, only the names of the people.
The code I have tried is given below.
Any help would be appreciated. Thanks in advance.
package codetrial;

import java.io.*;
import java.lang.String.*;
import org.jsoup.*;
import org.jsoup.nodes.*;
import java.io.BufferedWriter.*;
import java.io.FileWriter.*;
import java.io.IOException.*;
import java.util.*;
import org.apache.commons.lang.StringUtils;

public class Main {
    public static void main(String[] args) {
        try {
            String a;
            final String url="http://www.informatik.unitrier.de/~ley/pers/hd/k/Kumar:G=_Praveen.html";
            Document doc = Jsoup.connect(url).get();
            for (Element element : doc.select("table div.data")) {
                a = element.text();
                String[] names = a.split(", "); // comma and space
                String name_one = StringUtils.substringBetween(url, " ", ",");
                String name_two = StringUtils.substringBetween(url, ",", ":");
                System.out.println("person1 = " + name_one);
                System.out.println("person2 = " + name_two);
                for (String name : names) {
                    System.out.println(name);
                }
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
You can use the Jsoup library to do this. See my example:
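A minimal sketch of the string-handling step only, assuming each entry's text (what `element.text()` returns inside the question's loop) has the form `Author One, Author Two: Title`. The class name `AuthorExtractor`, the methods `authorsOf` and `extractAuthors`, and the placeholder titles are hypothetical and chosen for illustration. The input list is hard-coded here so the sketch runs without a network connection. Whether the selector `table div.data` matches the page's current markup has not been checked.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class AuthorExtractor {

    // Returns the author names of one entry: everything before the first ':'
    // split on commas and trimmed. Assumes the "Authors: Title" layout.
    static String[] authorsOf(String entry) {
        int colon = entry.indexOf(':');
        // Assumption: if there is no colon, the whole text is the author list.
        String authorPart = (colon >= 0) ? entry.substring(0, colon) : entry;
        List<String> names = new ArrayList<>();
        for (String piece : authorPart.split(",")) {
            String name = piece.trim();
            if (!name.isEmpty()) {
                names.add(name);
            }
        }
        return names.toArray(new String[0]);
    }

    // Builds the jagged 2D array: one row per entry, one column per author.
    static String[][] extractAuthors(List<String> entries) {
        String[][] result = new String[entries.size()][];
        for (int i = 0; i < entries.size(); i++) {
            result[i] = authorsOf(entries.get(i));
        }
        return result;
    }

    public static void main(String[] args) {
        // Placeholder entries; with Jsoup these strings would be collected
        // from element.text() for each element the selector returns.
        List<String> entries = Arrays.asList(
            "G. Praveen Kumar, Anirban Sakar: Placeholder title one.",
            "G. Praveen Kumar, Arjun Kumar Murmu, Biswas Parajuli, "
                + "Prasenjit Choudhury: Placeholder title two.");

        String[][] a = extractAuthors(entries);
        for (int i = 0; i < a.length; i++) {
            for (int j = 0; j < a[i].length; j++) {
                System.out.println("a[" + i + "][" + j + "] = " + a[i][j]);
            }
        }
    }
}
```

Cutting at the first colon before splitting on commas keeps commas and colons inside the title from producing extra "names". The rows have different lengths, so the result is a jagged array rather than a rectangular one.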
Output: