I have this block of html: <div> <p>First, nested paragraph</p> </div> <p>First, non-nested paragraph.</p>

Question

Asked: May 27, 2026 (2026-05-27T11:53:16+00:00)

I have this block of html:

<div>
  <p>First, nested paragraph</p>
</div>
<p>First, non-nested paragraph.</p>
<p>Second paragraph.</p>
<p>Last paragraph.</p>

I’m trying to select the first non-nested paragraph in that block. I’m using PHP’s (Perl-compatible) preg_match to find it, but I can’t figure out how to ignore the p tag contained within the div.

This is what I have so far, but it selects the contents of the first paragraph above, the one nested inside the div.

/<p>(.+?)<\/p>/is

Thanks!

EDIT

Unfortunately, I don’t have the luxury of a DOM Parser.

I completely appreciate the suggestions not to use RegEx to parse HTML, but that doesn’t really help with my particular use case. I have a very controlled case where an internal application generated structured text, and I’m trying to replace some text if it matches a certain pattern. This is a simplified case where I’m trying to ignore text nested within other text, and HTML was the simplest example I could think of to explain it. My actual case looks a little more like this (but with a lot more data, and minified):

#[BILLINGCODE|12345|11|15|2001|15|26|50]#
[ITEM1|{{Escaped Description}}|1|1|4031|NONE|15]
#[{{Additional Details }}]#
[ITEM2|{{Escaped Description}}|3|1|7331|NONE|15]
[ITEM3|{{Escaped Description}}|1|1|9431|NONE|15]
[ITEM4|{{Escaped Description}}|1|1|5131|NONE|15]

I have to reformat a certain column in certain rows, across a ton of rows similar to those. Help with my first question would help with the actual project.

1 Answer

Editorial Team · Answer 1 · 2026-05-27T11:53:17+00:00

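Your regex won’t work because it knows nothing about nesting: it simply matches the first <p>...</p> pair it finds, which is the one inside the div. The lazy .+? matters too; with a greedy .+, even if you had only non-nested paragraphs, your capturing parentheses would match First, non-nested ... Last paragraph. in one go.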

Try:

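<([^>]+)>([^<]*(?:<(?!/?\1)[^<]*)*)</\1>

This matches an element only if no tag of the same name opens or closes inside it. Run it with preg_match_all and grab \2 from the first match where \1 is p. Use a delimiter other than /, such as ~, so the / in the pattern needs no escaping.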
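A minimal sketch of how that could look in PHP, assuming attribute-free tags as in the sample. The pattern used here closes with </\1> and keeps the whole element body in group 2; the helper name firstTopLevelParagraph is made up for illustration.

```php
<?php
// Sample markup from the question.
$html = <<<'HTML'
<div>
  <p>First, nested paragraph</p>
</div>
<p>First, non-nested paragraph.</p>
<p>Second paragraph.</p>
<p>Last paragraph.</p>
HTML;

// Group 1: tag name. Group 2: everything up to the matching close tag,
// provided no tag of the same name opens or closes in between.
// "~" is the delimiter, so "/" needs no escaping.
$pattern = '~<([^>]+)>([^<]*(?:<(?!/?\1)[^<]*)*)</\1>~';

// Hypothetical helper: body of the first top-level <p>, or null if none.
function firstTopLevelParagraph(string $html, string $pattern): ?string
{
    // Each match is one whole top-level element; the <div> is consumed
    // as a single match, so the <p> inside it is never seen on its own.
    preg_match_all($pattern, $html, $matches, PREG_SET_ORDER);
    foreach ($matches as $m) {
        if ($m[1] === 'p') {
            return $m[2];
        }
    }
    return null;
}

$first = firstTopLevelParagraph($html, $pattern);
echo $first, PHP_EOL; // prints "First, non-nested paragraph."
```

This only holds for input as regular as the sample. Tags with attributes, or an element nested inside another element of the same name (a div inside a div), would need a different pattern.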

But an HTML parser would do a better job of that imho.
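For the bracketed rows in the question's edit, one possible approach is to let the regex match the parts to ignore as well, and only rewrite the other kind of match in a callback. The sketch below rests on several assumptions: that the #[...]# rows are the ones to leave alone, that the escaped descriptions never contain ] or |, and that the column to change is index 5. The lowercasing is only a placeholder for the real reformatting.

```php
<?php
// Sample rows from the question, joined without line breaks ("minified").
$data = '#[BILLINGCODE|12345|11|15|2001|15|26|50]#'
      . '[ITEM1|{{Escaped Description}}|1|1|4031|NONE|15]'
      . '#[{{Additional Details }}]#'
      . '[ITEM2|{{Escaped Description}}|3|1|7331|NONE|15]';

// First alternative: a "#[...]#" row, matched only so it can be skipped.
// Second alternative: a plain "[...]" row, with its body in group 1.
$pattern = '~#\[.*?\]#|\[([^\]]*)\]~s';

$result = preg_replace_callback($pattern, function (array $m): string {
    if (!isset($m[1])) {
        return $m[0]; // "#[...]#" row: leave untouched
    }
    $cols = explode('|', $m[1]);
    // Assumption: column index 5 ("NONE") is the one to rewrite.
    if (isset($cols[5])) {
        $cols[5] = strtolower($cols[5]); // placeholder transformation
    }
    return '[' . implode('|', $cols) . ']';
}, $data);

echo $result, PHP_EOL;
```

Because the #[...]# alternative comes first, those rows are consumed whole and returned unchanged, so the second alternative never sees the brackets inside them.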
