I have a string like the following:
$string = "
<paragraph>apples are red...</paragraph>
<paragraph>john is a boy..</paragraph>
<paragraph>this is dummy text......</paragraph>
";
I would like to split this string into an array containing the text found between the <paragraph></paragraph> tags, e.g. something like this:
$string = "
<paragraph>apples are red...</paragraph>
<paragraph>john is a boy..</paragraph>
<paragraph>this is dummy text......</paragraph>
";
$paragraphs = splitParagraphs($string);
/* $paragraphs now contains:
$paragraphs[0] = apples are red...
$paragraphs[1] = john is a boy..
$paragraphs[2] = this is dummy text......
*/
Any ideas?
P.S. It should be case-insensitive: <paragraph>, <PARAGRAPH> and <Paragraph> should all be treated the same way.
Edit: This is not XML. There are a lot of things here that would break the structure of XML, so I cannot use SimpleXML etc. I need a regular expression that will parse this out.
If this is actually XML, then I agree with the other answers. But if it isn't valid XML and just looks vaguely like XML, you should not try to parse it with an XML parser. Instead, you can use a regular expression:
Output:
Note that the i modifier makes the match case-insensitive and the s modifier allows the dot to match newlines, so the text inside a paragraph can span several lines. All text not inside paragraph tags will be ignored.
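A minimal sketch of a regular expression along these lines, assuming PHP 7 or later, tags written without attributes, and no nested paragraphs. The helper name splitParagraphs is taken from the question and is hypothetical; preg_match_all with the i and s modifiers does the actual work.

```php
<?php
// Hypothetical helper; the name comes from the question, not from any library.
function splitParagraphs(string $text): array
{
    // i: tag names match in any letter case.
    // s: "." also matches newlines, so a paragraph may span lines.
    // (.*?) is lazy, so each match stops at the nearest closing tag.
    preg_match_all('~<paragraph>(.*?)</paragraph>~is', $text, $matches);

    // $matches[1] holds the text captured by the first group, in order.
    return $matches[1];
}

$string = "
<paragraph>apples are red...</paragraph>
<PARAGRAPH>john is a boy..</PARAGRAPH>
<Paragraph>this is dummy text......</Paragraph>
";

print_r(splitParagraphs($string));
```

With this sample string, the returned array should hold the three paragraph texts at indexes 0, 1 and 2, with the tags themselves removed. If nothing matches, the function returns an empty array.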