I have to implement a “bad words” filter on my website, which is a classifieds website.
I have a big list of “bad words” but don’t know the best way to compare user input against it.
In my case, a textarea inside a form needs to be checked for “bad words”:
<form name="test" action="test.php" method="post">
Inside test.php I fetch the textarea, and need to compare it…
My question is: would you compare it against an external text file of bad words, or against an array of bad words?
I think the array is better, since I wouldn’t need any external functions, but I want to be sure…
What do you think?
Thanks
An array/list would be quicker overall if you are checking many words. You only have to read the file once and then each check will be against the list.
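As a rough sketch of the array approach, something like the following could work. The function name `findBadWords`, the sample words, and the whole-word, case-insensitive matching rule are my assumptions, not anything from the original post. Note that `strtolower` only lowercases ASCII letters in recent PHP versions, so a non-English list would need more care.

```php
<?php
// Hypothetical word list; in practice this would be your own list.
$badWords = ['darn', 'heck'];

/**
 * Return the distinct bad words found in $text.
 * Matches whole words only, case-insensitively (ASCII).
 */
function findBadWords(string $text, array $badWords): array
{
    // Flip the list into a lookup table so each check is a hash lookup
    // rather than a scan of the whole list.
    $lookup = array_flip(array_map('strtolower', $badWords));

    // Split the input on anything that is not a letter, digit, or apostrophe.
    // preg_split returns false on invalid UTF-8, so fall back to an empty list.
    $tokens = preg_split("/[^\\p{L}\\p{N}']+/u", strtolower($text), -1, PREG_SPLIT_NO_EMPTY) ?: [];

    $found = [];
    foreach ($tokens as $token) {
        if (isset($lookup[$token])) {
            $found[] = $token;
        }
    }
    return array_values(array_unique($found));
}

// In test.php this might be called with the posted textarea, e.g.
// findBadWords($_POST['description'] ?? '', $badWords),
// where the field name 'description' is an assumption.
```

Matching whole words avoids flagging innocent words that merely contain a listed word, at the cost of missing deliberate misspellings.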
However, in your application (assuming you want to go ahead despite the pitfalls) it might be better to read the file only when you need to. That way the file could be updated while the application is still running and you wouldn’t have to stop and restart the application or call some admin function to reparse the file.
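Reading the file on each submission might look roughly like this. It assumes one word per line, and the function name `loadBadWords` and the file name in the comment are made up for illustration.

```php
<?php
/**
 * Read one word per line from $path, skipping blank lines.
 * Returns an empty list if the file cannot be read.
 */
function loadBadWords(string $path): array
{
    if (!is_readable($path)) {
        return [];
    }

    $lines = file($path, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
    if ($lines === false) {
        return [];
    }

    // Trim stray whitespace and normalise case, then drop lines that
    // contained only whitespace.
    $words = array_map(static fn (string $w): string => strtolower(trim($w)), $lines);
    return array_values(array_filter($words, static fn (string $w): bool => $w !== ''));
}

// e.g. $badWords = loadBadWords(__DIR__ . '/badwords.txt');  // hypothetical file name
```

Because the file is read at submission time, editing it takes effect on the next form post without touching the code.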
The user probably won’t notice the delay on submission anyway, though caching the parsed list and re-reading the file only when it has changed would minimise it.
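One way to detect a changed file is to compare its modification time with the one recorded when the list was last parsed. The sketch below keeps the cache in a static variable, which only helps within a single PHP process; under the usual one-process-per-request setup you would need a shared store instead, and that part is not shown. The function name `cachedBadWords` is hypothetical.

```php
<?php
/**
 * Return the word list from $path, re-reading the file only when its
 * modification time differs from that of the cached copy.
 */
function cachedBadWords(string $path): array
{
    // In-process cache, keyed by path: ['mtime' => int, 'words' => string[]].
    static $cache = [];

    if (!is_file($path)) {
        return [];
    }

    // PHP caches stat results itself, so clear that before asking for the mtime.
    clearstatcache(true, $path);
    $mtime = filemtime($path);
    if ($mtime === false) {
        return [];
    }

    if (!isset($cache[$path]) || $cache[$path]['mtime'] !== $mtime) {
        // File is new to us or has changed: parse it again.
        $lines = file($path, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES) ?: [];
        $cache[$path] = ['mtime' => $mtime, 'words' => array_map('trim', $lines)];
    }

    return $cache[$path]['words'];
}
```

Modification times usually have one-second resolution, so two edits within the same second could go unnoticed until the next change.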