So I have many large text paragraphs to parse. The end goal is to

Question

Asked: June 18, 2026 (2026-06-18T11:33:03+00:00)

So I have many large text paragraphs to parse.
The end goal is to separate the paragraphs into smaller postings, so I can insert them into MySQL.

Here’s a very short example of one of the paragraphs in a string:

<?php
$longstring = '

(<b>John Smith</b>) at <b class="datetimeGMT">2011-01-10 22:13:01 GMT</b><hr>
Lots of text entered here under the first line.<br>And most of it is html, since it is for displaying in a web browser.<br></br></br>

(<b>Alan Slappy</b>) at <b class="datetimeGMT">2011-01-11 13:12:00 GMT</b><hr>
Forgot to put one more thing in the notes.........<br>blah blah blah
(<b>Joe Mama</b>) at <b class="datetimeGMT">2011-01-13 10:15:00 GMT</b><hr>
Groceries list:<br>Watermelons<br>Floss<br><br>email doctor
';

?>

Yep, I have a freaky project of parsing these strings for each entry.
Yes, I agree with anyone who says this is not a cool task. The original developer allowed new text to be appended to the original text. Not a bad idea on some occasions, but for me it is.

I do need help with how to RegEx this beast and place it into a foreach loop so I can start cleaning it up.

Here’s how far I got:

<?php

if (preg_match_all('/\(<b>.*?<hr>/', $longstring, $matches)) {
    print_r($matches);
}
/* output: 
Array 
( 
    [0] => Array 
        ( 
         [0] => (<b>John Smith</b>) at <b class="datetimeGMT">2011-01-10 22:13:01 GMT</b><hr>
         [1] => (<b>Alan Slappy</b>) at <b class="datetimeGMT">2011-01-11 13:12:00 GMT</b><hr> 
         [2] => (<b>Joe Mama</b>) at <b class="datetimeGMT">2011-01-13 10:15:00 GMT</b><hr> 
        ) 
) 
*/ 
?>

So, I’m actually doing pretty well at looping through the top of each entry. I’m kinda proud I figured that out (regex is my nemesis).

So now I’m stuck figuring out how to include the actual text below each iteration.

Anyone have an idea on how I can adjust the preg_match_all to account for the text below each “header”?

1 Answer

Editorial Team · Answer 1 · 2026-06-18T11:33:04+00:00

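Try this. Your pattern stops at <hr>, so it only ever captures the header line. The pattern below starts at each (<b> and then keeps consuming characters (newlines included, thanks to the s modifier) until the next (<b> begins or the string ends: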

if(preg_match_all('/\(<b>(?:(?!\(<b>).)*/s', $longstring, $matches)){
  print_r($matches);
}
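
If you also want the name, the timestamp and the text as separate fields inside a foreach loop, the same idea can be extended with named groups. This is only a minimal sketch, assuming every header has exactly the shape shown in the question. The group names author, posted and body and the $postings array are made up for illustration, and the sample string is shortened.

```php
<?php
// Sample input copied from the question (slightly shortened).
$longstring = '
(<b>John Smith</b>) at <b class="datetimeGMT">2011-01-10 22:13:01 GMT</b><hr>
Lots of text entered here under the first line.<br>And most of it is html.<br>

(<b>Alan Slappy</b>) at <b class="datetimeGMT">2011-01-11 13:12:00 GMT</b><hr>
Forgot to put one more thing in the notes.........<br>blah blah blah
(<b>Joe Mama</b>) at <b class="datetimeGMT">2011-01-13 10:15:00 GMT</b><hr>
Groceries list:<br>Watermelons<br>Floss<br><br>email doctor
';

// Header: (<b>NAME</b>) at <b class="datetimeGMT">TIMESTAMP</b><hr>
// Body:   every character up to the next "(<b>" or the end of the string.
// The s modifier lets "." match newlines as well.
$pattern = '/\(<b>(?<author>[^<]+)<\/b>\) at <b class="datetimeGMT">(?<posted>[^<]+)<\/b><hr>(?<body>(?:(?!\(<b>).)*)/s';

$postings = [];
// PREG_SET_ORDER returns one array per entry, which is handy for foreach.
if (preg_match_all($pattern, $longstring, $matches, PREG_SET_ORDER)) {
    foreach ($matches as $m) {
        $postings[] = [
            'author' => $m['author'],
            'posted' => $m['posted'],
            'body'   => trim($m['body']), // drop the surrounding newlines
        ];
    }
}

print_r($postings);
```

Each element of $postings then holds one entry, ready for cleaning up before it goes into the database. One caveat: if a body ever contains a literal (<b> of its own, the match will be cut short at that point, so it is worth spot-checking the results against the real data.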
