Hi I want to scrape some text from a website using the JSoup library.

Question

Asked: 2026-06-17T15:23:56+00:00

Hi, I want to scrape some text from a website using the JSoup library. The code below gives me the whole web page, but I only want to extract one specific line. Here is the code I am using:

Document doc = null;
try {
    doc = Jsoup.connect("http://www.example.com").get();
} catch (IOException e) {
    e.printStackTrace();
}
String text = doc.html();

System.out.println(text);

That prints the following:

<html>
 <head></head>
 <body>
  Martin,James,28,London,20k
  <br /> Sarah,Jackson,43,Glasgow,32k
  <br /> Alex,Cook,22,Liverpool,18k
  <br /> Jessica,Adams,34,London,27k
  <br /> 
 </body>
</html>

How can I extract just the 6th line, which reads Alex,Cook,22,Liverpool,18k, and put it into an array where each element is one of the comma-separated values (e.g. [0] = Alex, [1] = Cook, etc.)?

1 Answer

Editorial Team · Answer 1 · 2026-06-17T15:23:57+00:00

You may need to clean up the result a bit:

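    // imports: org.jsoup.Jsoup, org.jsoup.nodes.Document, org.jsoup.nodes.Node,
    //          org.jsoup.nodes.TextNode, java.util.Arrays
    Document doc = Jsoup.connect("http://www.example.com").get();
    int count = 0; // number of text nodes seen so far

    for( Node n : doc.body().childNodes() )
    {
        if( n instanceof TextNode )
        {
            if( count == 2 ) // third text node: the 'Alex' line
            {
                // text() returns the node's text; trim() drops the surrounding whitespace
                String[] t = ((TextNode) n).text().trim().split(",");

                System.out.println(Arrays.toString(t)); // example output
            }
            count++;
        }
    }

Output:

[Alex, Cook, 22, Liverpool, 18k]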
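The splitting step can also be tried on its own, without jsoup. This is only a minimal sketch: it assumes the line has already been extracted as a plain string, and the `CsvLine` class and its `fields` method are hypothetical names, not part of jsoup. Trimming first and then splitting on `\s*,\s*` removes any whitespace around the values.

```java
import java.util.Arrays;

// Hypothetical helper for illustration; not part of jsoup.
class CsvLine {
    // Trim the whole line, then split on commas, swallowing spaces around each comma.
    static String[] fields(String line) {
        return line.trim().split("\\s*,\\s*");
    }

    public static void main(String[] args) {
        // Leading and trailing spaces mimic the whitespace a text node may carry.
        String[] t = fields(" Alex,Cook,22,Liverpool,18k ");
        System.out.println(Arrays.toString(t)); // [Alex, Cook, 22, Liverpool, 18k]
        System.out.println(t[0]);               // Alex
    }
}
```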

Edit:

Since you can't select TextNodes by their content (that is only possible with Elements), you need a small workaround:

for( Node n : doc.body().childNodes() )
{
    if( n instanceof TextNode )
    {
        // text() returns the node's text; trim() drops the surrounding whitespace
        String str = ((TextNode) n).text().trim();

        if( str.toLowerCase().startsWith("alex") ) // Node 'Alex'
        {
            String[] t = str.split(","); // array with one value per element

            System.out.println(Arrays.toString(t)); // example output
        }
    }
}
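The same content-based lookup can be sketched without jsoup, so it can be run and tested in isolation. It assumes the text of each TextNode has already been collected into a list of strings; `RecordFinder` and `findByFirstField` are hypothetical names. Unlike the `startsWith` check above, this variant compares the whole first value, so a search for "alex" would not also match a line starting with "Alexander".

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Hypothetical helper for illustration; not part of jsoup.
class RecordFinder {
    // Returns the values of the first line whose first value equals firstField
    // (case-insensitive), or an empty Optional if no line matches.
    static Optional<String[]> findByFirstField(List<String> lines, String firstField) {
        for (String raw : lines) {
            String[] fields = raw.trim().split("\\s*,\\s*");
            // A line consisting only of commas splits into an empty array, so check the length.
            if (fields.length > 0 && fields[0].equalsIgnoreCase(firstField)) {
                return Optional.of(fields);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // The sample lines from the question, with the whitespace text nodes tend to carry.
        List<String> lines = Arrays.asList(
                "Martin,James,28,London,20k",
                " Sarah,Jackson,43,Glasgow,32k",
                " Alex,Cook,22,Liverpool,18k",
                " Jessica,Adams,34,London,27k",
                " ");
        findByFirstField(lines, "alex")
                .ifPresent(f -> System.out.println(Arrays.toString(f)));
        // prints [Alex, Cook, 22, Liverpool, 18k]
    }
}
```

Returning an `Optional` makes the "no such line" case explicit instead of silently printing nothing.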
