I am trying to parse an HTTP document to extract portions of the document


Asked: June 12, 2026

I am trying to parse an HTTP document to extract portions of it, but I am unable to get the desired result. Here is what I have so far:

<?php

// a sample of HTTP document that I am trying to parse
$http_response = <<<'EOT'
<dl><dt>Server Version: Apache</dt>
<dt>Server Built: Apr  4 2010 17:19:54
</dt></dl><hr /><dl>
<dt>Current Time: Wednesday, 10-Oct-2012 06:14:05 MST</dt>
</dl>
I do not need anything below this, including this line itself
......
EOT;

echo $http_response;
echo '********************';
$count = -1;
$a = preg_replace("/(Server Version)([\s\S]*?)(MST)/", "$1$2$3", $http_response, -1, $count);
echo "<br> count: $count" . '<br>';
echo $a;

I still see the string “I do not need …” in the output. I do not need that string. What am I doing wrong?
How do I easily remove all other HTML tags as well?

Thanks for your help.

-Amit


1 Answer

Editorial Team · Answer 1 · 2026-06-12T14:36:35+00:00

Your pattern matches everything from Server Version through MST, and preg_replace only modifies the part that was matched. Everything the regex does not cover remains untouched.

So to drop the text before your first anchor and the text after the last one, the pattern must match them as well.

$a = preg_replace("/^.*(Server Version)(.*?)(MST).*$/s", "$1$2$3", $http_response);

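Note the ^.* and .*$. Both are matched, but because they are not captured and not referenced in the replacement string, they get dropped. The /s modifier lets . match newlines too, so a plain . can stand in for [\s\S].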
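For reference, here is a minimal, self-contained sketch of that replacement applied to the sample from the question. The variable names $http_response and $a are taken from the question; single-quoted strings are used for the pattern and replacement only to avoid any doubt about PHP variable interpolation.

```php
<?php
// Sample input copied from the question.
$http_response = <<<'EOT'
<dl><dt>Server Version: Apache</dt>
<dt>Server Built: Apr  4 2010 17:19:54
</dt></dl><hr /><dl>
<dt>Current Time: Wednesday, 10-Oct-2012 06:14:05 MST</dt>
</dl>
I do not need anything below this, including this line itself
......
EOT;

// ^.* and .*$ consume the text on either side of the wanted span.
// The /s modifier lets "." match newlines. Only groups 1 to 3 are
// written back, so everything else is discarded.
$a = preg_replace('/^.*(Server Version)(.*?)(MST).*$/s', '$1$2$3', $http_response);

echo $a;
```

With this input, the result should begin at "Server Version" and end at "MST"; the HTML tags inside that span are still present.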

Also, of course, it might be simpler to just use preg_match() in such cases …
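A rough sketch of the preg_match() route, under the assumption that only the span from Server Version to MST is wanted. The names $wanted and $text are illustrative and not from the original post. The strip_tags() call is one possible way to handle the second part of the question (removing the remaining HTML tags); it is an addition here, not something the answer above prescribes.

```php
<?php
// Sample input copied from the question.
$http_response = <<<'EOT'
<dl><dt>Server Version: Apache</dt>
<dt>Server Built: Apr  4 2010 17:19:54
</dt></dl><hr /><dl>
<dt>Current Time: Wednesday, 10-Oct-2012 06:14:05 MST</dt>
</dl>
I do not need anything below this, including this line itself
......
EOT;

// preg_match() returns only what the pattern matched, so there is no
// need to match (and then discard) the surrounding text.
$wanted = null; // hypothetical variable name
if (preg_match('/Server Version.*?MST/s', $http_response, $m) === 1) {
    $wanted = $m[0];
}

// strip_tags() removes the HTML tags left inside the matched span.
$text = ($wanted === null) ? null : strip_tags($wanted);

echo ($text === null) ? 'No match found' : $text;
```

The explicit === 1 check matters because preg_match() returns 0 when nothing matches and false on error, and neither case should be treated as a result.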
