Sign Up

Sign Up to our social questions and Answers Engine to ask questions, answer people’s questions, and connect with other people.

Have an account? Sign In

Have an account? Sign In Now

Sign In

Login to our social questions & Answers Engine to ask questions answer people’s questions & connect with other people.

Sign Up Here

Forgot Password?

Don't have account, Sign Up Here

Forgot Password

Lost your password? Please enter your email address. You will receive a link and will create a new password via email.

Have an account? Sign In Now

You must login to ask a question.

Forgot Password?

Need An Account, Sign Up Here

Please briefly explain why you feel this question should be reported.

Please briefly explain why you feel this answer should be reported.

Please briefly explain why you feel this user should be reported.

Sign InSign Up

The Archive Base

The Archive Base Logo The Archive Base Logo

The Archive Base Navigation

  • SEARCH
  • Home
  • About Us
  • Blog
  • Contact Us
Search
Ask A Question

Mobile menu

Close
Ask a Question
  • Home
  • Add group
  • Groups page
  • Feed
  • User Profile
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Buy Points
  • Users
  • Help
  • Buy Theme
  • SEARCH
Home/ Questions/Q 6756699
In Process

The Archive Base Latest Questions

Editorial Team
  • 0
Editorial Team
Asked: May 26, 20262026-05-26T13:33:22+00:00 2026-05-26T13:33:22+00:00

What would be an optimal way, using Jsoup, to extract all HTML (either to

  • 0

What would be an optimal way, using Jsoup, to extract all HTML (either to a String, Document or Elements) between two blocks that conform to this pattern:

<strong>
 {any HTML could appear here, except for a <strong> pair}
</strong>

 ...
 {This is the HTML I need to extract. 
  any HTML could appear here, except for a <strong> pair}
 ... 

<strong>
 {any HTML could appear here, except for a <strong> pair}
</strong>

Using a regex this could be simple, if I apply it on the entire body.html():

(<strong>.+</strong>)(.+)(<strong>.+</strong>)
                       ^
                       +----- There I have my HTML content

But as I learned from a similar challenge, performance could be improved (even if the code is slightly longer) if I use an already Jsoup-parsed DOM — except that this time neither Element.nextSibling() nor Element.nextElementSibling() can come to the rescue.

I searched for something like jQuery’s nextUntil in Jsoup, for example, but couldn’t really find something similar.

Is it possible to come up with something better than the above regex-based approach?

  • 1 1 Answer
  • 0 Views
  • 0 Followers
  • 0
Share
  • Facebook
  • Report

Leave an answer
Cancel reply

You must login to add an answer.

Forgot Password?

Need An Account, Sign Up Here

1 Answer

  • Voted
  • Oldest
  • Recent
  • Random
  1. Editorial Team
    Editorial Team
    2026-05-26T13:33:22+00:00Added an answer on May 26, 2026 at 1:33 pm

    I don’t know if it’s faster but maybe something like this will work:

    Elements strongs = doc.select("strong");
    Element f = strongs.first();
    Element l = strongs.last();
    Elements siblings = f.siblingElements();
    List<Element> result = siblings.subList(siblings.firstIndexOf(f) + 1,siblings.lastIndexOf(l));
    
    • 0
    • Reply
    • Share
      Share
      • Share on Facebook
      • Share on Twitter
      • Share on LinkedIn
      • Share on WhatsApp
      • Report

Sidebar

Related Questions

What would be the optimal way to develop a basic graphical application for Windows
I believe conversion exactly to BigInteger[] would be optimal in my case. Anyone had
What would be the most optimal algorithm (performance-wise) to calculate the number of divisors
Would it be possible to print Hello twice using single condition ? if condition
Would it be possible to show an image in full screen mode using silverlight.
I'm trying to find the most optimal way to return a recordset containing a
I realize that parameterized SQL queries is the optimal way to sanitize user input
There's a common way to store multiple values in one variable, by using a
I would like to place some unknown number (normally between 3 and 9) of
I'm trying to find a manageable way to translate every visible string in an

Explore

  • Home
  • Add group
  • Groups page
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Users
  • Help
  • SEARCH

Footer

© 2021 The Archive Base. All Rights Reserved
With Love by The Archive Base

Insert/edit link

Enter the destination URL

Or link to existing content

    No search term specified. Showing recent items. Search or use up and down arrow keys to select an item.