I’ve read a few questions on here re parsing HTML with regex, and I

Question

Asked: May 13, 2026 (2026-05-13T13:31:14+00:00)

I’ve read a few questions on here re parsing HTML with regex, and I understand that this is, on the whole, a terrible idea.

Having said this, I have a very specific problem that I think regex might be the answer to. I’ve been fumbling around trying to work out the answer, but I’m new (as of today) to regex, and I was hoping some kind-hearted person might be able to help me out.

I have an array of strings that always follow the format

STUFF HERE<a href="somewhere" title="something" target="_blank">name of thing</a>STUFF HERE

What I’m hoping to achieve is to be left with just the ‘somewhere’ and the ‘name of thing’, so that I can output just <a href="somewhere">name of thing</a>.

The array of strings comes from an RSS feed of links on my Facebook profile, if you happen to be interested.

Many, many thanks for any help.

Jack

1 Answer

Editorial Team · Answer 1 · 2026-05-13T13:31:14+00:00

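// Sample input in the format described in the question.
$str = 'STUFF HERE<a href="somewhere" title="something" target="_blank">name of thing</a>STUFF HERE';
// Group 1 captures the href value; group 2 captures the link text.
$success = preg_match('/.*href=\"([^\"]+)\".*>([^<]+)<.*/i', $str, $matches);
if ($success) {
    echo $matches[1]; // the href value
    echo $matches[2]; // the link text
} else {
    echo "Parsing failed.";
}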

The parenthesized groups capture portions of the match into the $matches array. If the pattern matches the string at all, then $matches[1] contains your href and $matches[2] contains your link text.
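To see how the capture groups land in $matches, here is a small sketch that prints every entry for the sample string from the question. It assumes the input has exactly the format shown there; index 0 holds the text matched by the whole pattern, and the numbered entries after it follow the groups from left to right.

```php
<?php
// Sample input in the format from the question.
$str = 'STUFF HERE<a href="somewhere" title="something" target="_blank">name of thing</a>STUFF HERE';
preg_match('/.*href=\"([^\"]+)\".*>([^<]+)<.*/i', $str, $matches);

// $matches[0] is the text matched by the whole pattern;
// $matches[1] and $matches[2] are the two parenthesized groups.
foreach ($matches as $index => $value) {
    echo $index, ' => ', $value, "\n";
}
```

Because the pattern starts and ends with .*, entry 0 here is the entire input string, which is why only entries 1 and 2 are useful.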

Inside the parentheses, I’m defining the meat of the segments you’re interested in using negated character classes. The first one is [^\"]+, which is one or more of any character except a double quote. The second is [^<]+, which is one or more of any character except a less-than sign. This ensures that, as long as the markup is consistently in the format you provided, you have well-defined boundaries on either side of the portions you’re interested in.
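Since you mentioned an array of strings, here is a minimal sketch of applying the same pattern to each item and rebuilding the shorter link. The function name simplify_link and the $items array are made up for illustration, and it assumes every string contains at most one link in the format you gave.

```php
<?php
// Hypothetical helper: returns the stripped-down anchor, or null when the
// string does not contain a link in the expected format.
function simplify_link(string $str): ?string
{
    // Same pattern and capture groups as in the answer above.
    $pattern = '/.*href=\"([^\"]+)\".*>([^<]+)<.*/i';
    if (preg_match($pattern, $str, $matches) !== 1) {
        return null;
    }
    // Rebuild the anchor from the href (group 1) and the link text (group 2).
    return '<a href="' . $matches[1] . '">' . $matches[2] . '</a>';
}

// Illustrative input; the second item has no link at all.
$items = [
    'STUFF HERE<a href="somewhere" title="something" target="_blank">name of thing</a>STUFF HERE',
    'no link in this one',
];

foreach ($items as $item) {
    $link = simplify_link($item);
    echo ($link ?? 'Parsing failed.'), "\n";
}
```

Returning null for a non-matching string keeps the failure case explicit, so the caller decides whether to skip the item or report it.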
