I need to extract this text: Line 1 text. Line 2 text. Line 2

Asked: June 5, 2026 (2026-06-05T20:41:17+00:00)

I need to extract this text:

Line 1 text.
Line 2 text. Line 2 some more text.
Line 3 text,
Line 4 text

from this HTML:

...
<tr><td class="td_my_custom_text">Line 1 text. 
<br>Line 2 text. Line 2 some more text.
<br>Line 3 text, 
<br>Line 4 text
<br></td></tr><tr><td>&nbsp;</td></tr>
...

Using this RegEx: <td\ class="td_my_custom_text">[\s\S]*?</td> I have managed to get something close but not close enough. <td class="td_my_custom_text">, <br> and </td> are still inside and I am stuck.

What needs to be changed in my regular expression to get rid of them?
Is there some Windows tool to automate this job and copy just extracted data to new file(s)? I have 5000+ files like this one and I am thinking about making a small program using regex or html parser but I would like to know if there is a better approach first.

1 Answer

Editorial Team · Answer 1 · 2026-06-05T20:41:18+00:00

It looks like you’re better off just stripping out the tags, because that’s essentially what you’re doing.
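
To illustrate the tag-stripping idea, here is a minimal sketch in Python (my choice of language, the question does not name one). It assumes each file contains one cell with class td_my_custom_text and that the cell has no nested tags other than <br>. The names CELL_RE, BR_RE and extract_lines are made up for this example. The only change to the original regex is a capture group around the lazy part, so the opening and closing tags are matched but not returned.

```python
import re

# Same pattern as in the question, plus a capture group around the
# cell body so the <td ...> and </td> tags stay out of the result.
CELL_RE = re.compile(r'<td class="td_my_custom_text">([\s\S]*?)</td>')

# Matches <br>, <br/>, <br /> in any letter case.
BR_RE = re.compile(r'<br\s*/?>', re.IGNORECASE)


def extract_lines(html):
    """Return the text lines of the first matching cell, without tags."""
    match = CELL_RE.search(html)
    if match is None:
        return []
    # Split the cell body on <br> and drop empty or whitespace-only parts.
    parts = BR_RE.split(match.group(1))
    return [part.strip() for part in parts if part.strip()]


sample = '''<tr><td class="td_my_custom_text">Line 1 text. 
<br>Line 2 text. Line 2 some more text.
<br>Line 3 text, 
<br>Line 4 text
<br></td></tr><tr><td>&nbsp;</td></tr>'''

print("\n".join(extract_lines(sample)))
```

On the sample from the question this prints the four lines without any markup. If the real files have attributes in a different order, extra whitespace inside the tag, or nested markup, the regex will miss them, which is the usual reason to prefer a parser.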

You should also look at dasbinkenlight’s link in his comment to understand more about HTML parsing.
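
For the 5000+ files, a small script built on an HTML parser is one possible route. The sketch below uses only Python's standard library (html.parser and pathlib). The class CellTextParser, the functions extract_lines_parsed and convert_folder, the *.html file pattern, and the UTF-8 encoding are all assumptions for illustration, not something stated in the question.

```python
from html.parser import HTMLParser
from pathlib import Path


class CellTextParser(HTMLParser):
    """Collects text from <td class="td_my_custom_text">, one entry per <br>."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.in_cell = False
        self.lines = []
        self.current = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs.
        if tag == "td" and ("class", "td_my_custom_text") in attrs:
            self.in_cell = True
        elif tag == "br" and self.in_cell:
            self._flush()

    def handle_endtag(self, tag):
        if tag == "td" and self.in_cell:
            self._flush()
            self.in_cell = False

    def handle_data(self, data):
        if self.in_cell:
            self.current.append(data)

    def _flush(self):
        # Close the line collected so far; skip it if it is empty.
        text = "".join(self.current).strip()
        if text:
            self.lines.append(text)
        self.current = []


def extract_lines_parsed(html):
    parser = CellTextParser()
    parser.feed(html)
    parser.close()
    return parser.lines


def convert_folder(src_dir, dst_dir):
    """Write one .txt file per .html file (assumed extension and encoding)."""
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    for path in Path(src_dir).glob("*.html"):
        html = path.read_text(encoding="utf-8", errors="replace")
        lines = extract_lines_parsed(html)
        (dst / (path.stem + ".txt")).write_text("\n".join(lines), encoding="utf-8")
```

Calling convert_folder with a source and a destination folder (both hypothetical paths you would supply) processes every file in one go. This version does not handle a td nested inside the target cell; if the real pages have that, the in_cell flag would need to become a depth counter.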
