The Archive Base


Editorial Team
Asked: May 11, 2026


Using the following text as a sample, I need to be able to extract the text between LI tags. Notice that the first LI is intentionally malformed, as this may be the case in real input. Said another way, I want everything from an LI tag to either its closing LI tag or the next LI opening tag.

    <UL> <LI class='test'>This is the first ListItem Text.  <LI>This is the second ListItem Test. </LI></UL> 

So far I have come up with:

<[Ll][Ii].*>(.*?)((?:<[Ll][Ii]>)|(?:</[Ll][Ii]>)) 

But this appears to match from the first LI tag through the closing tag as a single match, with the group containing the text of the second LI tag. I’ve managed to get it to return the first item, but never both. I’m using the ‘Dot matches newline’ option as well, and this needs to work in .NET. Thanks!

UPDATE

I did some research before posting this question, and I did see and understand that using regexes to parse HTML is a bad idea. That being said, I only need to get the text from a couple of LI tags here and there to determine what text to bulletize on a PowerPoint slide. I thought there might be a simpler way to do it than dealing with a separate library, especially since using third-party libraries is tricky where I work. Unfortunately, it appears that the HTML can end up malformed in certain situations when using an HTML rich text entry box on a page that allows you to bulletize text. Thanks for all of the recommendations against using regex to parse HTML. I should have specified up front that I had already read a lot of similar advice but was looking for a quick workaround for a simple set of circumstances.


1 Answer

  1. Added an answer on May 11, 2026 at 3:09 pm

    If this is a recurring scenario, I would rather use an HTML parser. Parsing HTML with regex can take a tremendous amount of time to get right, and it might still turn out buggy because of the malformed input you mentioned.

    Here’s one I found with a basic Google search:
    http://www.netomatix.com/products/Documentmanagement/HtmlParserNet.aspx
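
To illustrate the parser approach without a third-party library, here is a minimal sketch in Python using the standard library's `html.parser`. It is not the linked .NET product and not a definitive implementation: the names `ListItemExtractor` and `extract_list_items` are made up for this example, and it assumes flat (non-nested) lists. An opening `<li>` implicitly closes any item that is still open, which is how the malformed first item in the question gets handled.

```python
from html.parser import HTMLParser


class ListItemExtractor(HTMLParser):
    """Hypothetical helper: collects the text of each <li>,
    tolerating missing </li> tags. Assumes lists are not nested."""

    def __init__(self):
        super().__init__()
        self.items = []        # finished list-item texts
        self._current = None   # text chunks of the open <li>, or None

    def _flush(self):
        # Close the currently open item, if any.
        if self._current is not None:
            self.items.append("".join(self._current).strip())
            self._current = None

    def handle_starttag(self, tag, attrs):
        # html.parser reports tag names in lower case.
        if tag == "li":
            self._flush()          # a new <li> ends an unclosed one
            self._current = []

    def handle_endtag(self, tag):
        if tag in ("li", "ul", "ol"):
            self._flush()

    def handle_data(self, data):
        if self._current is not None:
            self._current.append(data)


def extract_list_items(html):
    parser = ListItemExtractor()
    parser.feed(html)
    parser.close()     # process any buffered trailing text
    parser._flush()    # an item may still be open at end of input
    return parser.items


sample = ("<UL> <LI class='test'>This is the first ListItem Text.  "
          "<LI>This is the second ListItem Test. </LI></UL> ")
print(extract_list_items(sample))
# ['This is the first ListItem Text.', 'This is the second ListItem Test.']
```

The same event-driven idea (start tag, text, end tag) is what most HTML parsers expose, so the logic should translate to whichever parser is available on the .NET side.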

    UPDATE:

    Here are some related posts on StackOverflow:
    How do you parse a poorly formatted HTML file?
    What is the best way to parse html in C#?
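
If a regex is still wanted as a quick workaround for the simple case in the question, the following sketch shows one way it might be done. The pattern in the question returns a single match because the greedy `.*>` after the first `<LI` runs forward to the last `>` that still lets the rest of the pattern succeed, which is the `>` of the second `<LI>`. The sketch replaces `.*` with `[^>]*` so the opening tag cannot run past its own `>`, and uses a lookahead so the next `<li` is not consumed. It is written in Python for illustration; the pattern uses only constructs that .NET's `Regex` also supports, so it should carry over with `RegexOptions.IgnoreCase | RegexOptions.Singleline`, but that port is an untested assumption. `LI_PATTERN` and `li_texts` are hypothetical names.

```python
import re

sample = ("<UL> <LI class='test'>This is the first ListItem Text.  "
          "<LI>This is the second ListItem Test. </LI></UL> ")

# The pattern from the question: the greedy ".*>" swallows the first item.
original = re.compile(
    r"<[Ll][Ii].*>(.*?)((?:<[Ll][Ii]>)|(?:</[Ll][Ii]>))", re.DOTALL
)
print(original.search(sample).group(1).strip())
# This is the second ListItem Test.

# Sketch of a fix:
#   <li\b[^>]*>   an opening <li ...> tag, stopping at its own ">"
#   (.*?)         the item text, as short as possible
#   (?=</?li\b)   up to, but not including, the next <li or </li
LI_PATTERN = re.compile(
    r"<li\b[^>]*>(.*?)(?=</?li\b)", re.IGNORECASE | re.DOTALL
)


def li_texts(html):
    """Return the trimmed text of each list item found by LI_PATTERN."""
    return [m.group(1).strip() for m in LI_PATTERN.finditer(html)]


print(li_texts(sample))
# ['This is the first ListItem Text.', 'This is the second ListItem Test.']
```

Note the limits: a final item with no closing `</li>` and no following `<li>` is not matched, and nested lists or a `>` inside an attribute value will confuse the pattern, which is where a real parser earns its keep.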

© 2021 The Archive Base. All Rights Reserved