I’m trying to remove everything after and including ‘.html’ in a web address string. Current (failing) code is:
$input = 'http://example.com/somepage.html?foo=bar&baz=x';
$result = preg_replace("/(.html)[^.html]+$/i",'',$input);
Desired outcome:
value of $result is 'http://example.com/somepage'
Some other examples of $input that should lead to the same value of $result:
http://example.com/somepage
http://example.com/somepage.html
http://example.com/somepage.html?url=http://example.com/index.html
Your regular expression is wrong: it only matches strings that end with
<any one char> "html" <one or more chars that are not ., h, t, m or l>. The unescaped dot matches any character, and [^.html] is a negated character class, not the literal text. Since preg_replace just returns the string “as-is” if there was no match, you’d be fine with matching the literal .html and ignoring anything after it.
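A minimal sketch of that approach, assuming PHP 7 or later. The helper name stripFromHtmlExtension is hypothetical and only used for illustration; the pattern escapes the dot so it matches a literal ".html" and lets .* consume the rest of the string.

```php
<?php
// Hypothetical helper (not from the original post): cut the string at the
// first literal ".html" and drop everything that follows it.
function stripFromHtmlExtension(string $url): string
{
    // \.   matches a literal dot (an unescaped . would match any character)
    // .*   consumes the rest of the string after ".html"
    // i    keeps the match case-insensitive, as in the question
    // s    lets . also match newlines, in case the input contains any
    // preg_replace returns null on error, so fall back to the input then.
    return preg_replace('/\.html.*/is', '', $url) ?? $url;
}

$inputs = [
    'http://example.com/somepage.html?foo=bar&baz=x',
    'http://example.com/somepage',
    'http://example.com/somepage.html',
    'http://example.com/somepage.html?url=http://example.com/index.html',
];

foreach ($inputs as $input) {
    // Each of the four inputs prints http://example.com/somepage
    echo stripFromHtmlExtension($input), "\n";
}
```

Because the regex engine takes the leftmost match, the last input is cut at the first ".html" rather than the one inside the query string. Note that this sketch would also cut a URL whose host or path contains ".html" earlier on, so it assumes inputs shaped like the examples in the question.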