i have some simple code that does a preg match:
$bad_words = array('dic', 'tit', 'fuc',); //for this example i replaced the bad words
for($i = 0; $i < sizeof($bad_words); $i++)
{
if(preg_match("/$bad_words[$i]/", $str, $matches))
{
$rep = str_pad('', strlen($bad_words[$i]), '*');
$str = str_replace($bad_words[$i], $rep, $str);
}
}
echo $str;
So, if $str was "dic" the result will be ‘*‘ and so on.
Now there is a small problem if $str == f.u.c. The solution might be to use:
$pattern = '~f(.*)u(.*)c(.*)~i';
$replacement = '***';
$foo = preg_replace($pattern, $replacement, $str);
In this case i will get ***, in any case. My issue is putting all this code together.
I’ve tried:
$pattern = '~f(.*)u(.*)c(.*)~i';
$replacement = 'fuc';
$fuc = preg_replace($pattern, $replacement, $str);
$bad_words = array('dic', 'tit', $fuc,);
for($i = 0; $i < sizeof($bad_words); $i++)
{
if(preg_match("/$bad_words[$i]/", $str, $matches))
{
$rep = str_pad('', strlen($bad_words[$i]), '*');
$str = str_replace($bad_words[$i], $rep, $str);
}
}
echo $str;
The idea is that $fuc becomes fuc then I place it in the array then the array does its jobs, but this doesn’t seem to work.
First of all, you can do all of the bad word replacements with one (dynamically generated) regex, like this:
Now, you have the problem of people adding periods in between the word, which you can search for with another regex and replace them as well. However, you must keep in mind that
.matches any character in a regex, and must be escaped (withpreg_quote()or a backslash).This will create a
$bad_wordsarray similar to:Now, you can use this new
$bad_wordsarray just like the above one to replace these obfuscated ones.Hint: You can make this
array_map()call “better” in the sense that it can be smarter to catch more obfuscations. For example, if you wanted to catch a bad word separated with either a period or a whitespace character or a comma, you can do:Now if you make that obfuscation group optional, you’ll catch a lot more bad words:
Now, bad words should match:
And many more.