I am trying to make a so called text cleaner so that I could

Asked: June 12, 2026 (2026-06-12T18:11:04+00:00)

I am trying to make a so-called text cleaner so that I can get rid of a few HTML elements without using the strip_tags() function.

My regex looks like this: <em>|</em>|<p[^>]*>|</p[^>]*>|<span[^>]*>|</span[^>]*>|<div[^>]*>|</div[^>]*>|&nbsp;|<table[^>]*>(.*?)</table[^>]*>

My code looks like this:

$string = "some very messy string here ";
$pattern = '<em>|</em>|<p[^>]*>|</p[^>]*>|<span[^>]*>|</span[^>]*>|<div[^>]*>|</div[^>]*>|&nbsp;|<table[^>]*>(.*?)</table[^>]*>';
$replace = ' ';

$clean =  preg_replace($pattern, $replace, $string);

echo $clean;

For reasons that are beyond my understanding, the echo prints nothing.

Thank you for your time

UPDATE #1

If you are asking whether I want to get rid of the tables together with all the content inside them, the answer is yes.

1 Answer

Editorial Team · Answer 1 · 2026-06-12T18:11:05+00:00

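Your regular expression needs delimiters. Without them, PHP treats the leading < as an opening bracket-style delimiter, the pattern fails to compile, and preg_replace() returns NULL, which is why the echo prints nothing. For example: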

$pattern = '~<em>|</em>|<p[^>]*>|</p[^>]*>|<span[^>]*>|</span[^>]*>|<div[^>]*>|</div[^>]*>|&nbsp;|<table[^>]*>(.*?)</table[^>]*>~';
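As a rough sketch of the delimiter fix in use, the snippet below wraps a shortened version of the pattern in ~ delimiters and checks the return value before printing. The function name clean_fragment() and the sample input are made up for illustration and do not come from the question.

```php
<?php
// Hypothetical helper: applies a shortened version of the pattern above.
function clean_fragment(string $html): ?string
{
    // The ~ characters are the delimiters; everything between them is the regex.
    $pattern = '~<em>|</em>|<p[^>]*>|</p[^>]*>|&nbsp;~';

    // preg_replace() returns null if the pattern cannot be compiled.
    return preg_replace($pattern, ' ', $html);
}

// Sample input invented for this example.
$clean = clean_fragment('<p class="x">Hello&nbsp;<em>world</em></p>');

if ($clean === null) {
    echo 'preg_replace() failed; check the pattern and its delimiters', PHP_EOL;
} else {
    echo $clean, PHP_EOL;
}
```

Checking for null explicitly makes a broken pattern visible instead of silently echoing an empty string.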

The PCRE section of the PHP manual covers delimiters in more detail.

Also note that some HTML specifications (all but XHTML, as far as I know) allow uppercase tags too, so consider adding the case-insensitivity modifier i to your regular expression. Furthermore, removing tables might not work if there are line breaks between the opening and closing tags, because . does not match line breaks by default. Add the DOTALL modifier s to solve this:

$pattern = '~<em>|</em>|<p[^>]*>|</p[^>]*>|<span[^>]*>|</span[^>]*>|<div[^>]*>|</div[^>]*>|&nbsp;|<table[^>]*>(.*?)</table[^>]*>~is';
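To see what the two modifiers change, here is a small sketch that runs the pattern on input with uppercase tags and a table spanning several lines. The helper name strip_markup() and the input string are assumptions made for this example only.

```php
<?php
// Hypothetical helper using the full pattern with the i and s modifiers.
function strip_markup(string $html): ?string
{
    $pattern = '~<em>|</em>|<p[^>]*>|</p[^>]*>|<span[^>]*>|</span[^>]*>'
             . '|<div[^>]*>|</div[^>]*>|&nbsp;|<table[^>]*>(.*?)</table[^>]*>~is';

    return preg_replace($pattern, ' ', $html);
}

// Uppercase tags need the i modifier; the line breaks inside the table need s.
$input = "<P>Intro</P><TABLE>\n<tr><td>cell</td></tr>\n</TABLE><div>Outro</div>";

$clean = strip_markup($input);

// Collapse the runs of spaces left behind by the replacements.
echo trim(preg_replace('~\s+~', ' ', $clean)), PHP_EOL;
```

Without the s modifier the table in this input would be left in place, because (.*?) could not cross the line breaks.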

One final note: as the others pointed out, regex solutions to HTML problems should be taken with a grain of salt. Nested tables will cause issues, as will comments. If you know the data you are dealing with very well, the problem might be much less complex than general HTML. But be sure your markup is at least valid and that you know about all the oddities, such as nested structures and HTML characters in comments.
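The nested-table caveat can be illustrated with a short sketch. The lazy (.*?) stops at the first closing tag it finds, so the tail of the outer table is left behind. The function name remove_tables() and the input are invented for this example.

```php
<?php
// Hypothetical helper: only the table part of the pattern, with i and s.
function remove_tables(string $html): ?string
{
    return preg_replace('~<table[^>]*>(.*?)</table[^>]*>~is', ' ', $html);
}

// An outer table that contains an inner table (made-up input).
$nested = '<table><tr><td>'
        . '<table><tr><td>inner</td></tr></table>'
        . 'leftover</td></tr></table>';

// The match runs from the outer <table> to the inner </table>,
// so the rest of the outer table survives the replacement.
echo remove_tables($nested), PHP_EOL;
// prints " leftover</td></tr></table>"
```

If nested tables can occur in the data, a regex like this one is not enough on its own and the input would need to be handled by something that understands the nesting.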
