I fetch a file and parse its contents into an array. I already filter out words shorter than 4 characters with if(strlen($value) < 4): unset($content[$key]); endif;
My question is this: I want to remove common words from the array, but there are quite a few of them. Instead of running these checks over and over on each array value, I was wondering whether there is a more efficient way to do this.
Here’s a sample of the code I am currently using. The list could get huge, and I am thinking there has to be a better (more efficient) way.
foreach ($content as $key=>$value) {
    if(strlen($value) < 4): unset($content[$key]); endif;
    if($value == 'that'): unset($content[$key]); endif;
    if($value == 'have'): unset($content[$key]); endif;
    if($value == 'with'): unset($content[$key]); endif;
    if($value == 'this'): unset($content[$key]); endif;
    if($value == 'your'): unset($content[$key]); endif;
    if($value == 'will'): unset($content[$key]); endif;
    if($value == 'they'): unset($content[$key]); endif;
    if($value == 'from'): unset($content[$key]); endif;
    if($value == 'when'): unset($content[$key]); endif;
    if($value == 'then'): unset($content[$key]); endif;
    if($value == 'than'): unset($content[$key]); endif;
    if($value == 'into'): unset($content[$key]); endif;
}
Here’s how I’d do it:
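As a minimal sketch of the idea (the $remove map and the sample $string are hypothetical, picked only for illustration, and the whitespace clean-up at the end is an extra step assumed here because str_replace leaves the surrounding spaces behind):

```php
<?php
// Hypothetical stop-word map: the keys are the words to strip,
// the values are their replacements (empty strings).
$remove = array(
    'that' => '',
    'have' => '',
    'with' => '',
);

// Hypothetical input string.
$string = 'some words that have to be replaced';

// The keys are the search terms; the array itself supplies the replacements.
// str_replace pairs them up in order and ignores the keys of the second array.
$result = str_replace(array_keys($remove), $remove, $string);

// Extra tidy-up (an assumption, not part of the replacement itself):
// collapse the runs of spaces left where words were removed.
$clean = trim(preg_replace('/\s+/', ' ', $result));

echo $clean, PHP_EOL;
```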
The way it works: make an array full of empty strings, where the keys are the substrings you want to remove or replace. Then just use str_replace: pass the keys as the first argument and the array itself as the second argument. The result in this case is: some words to be replaced. This code has been tested and works just fine.

When dealing with an array, just implode it with some wacky delimiter (like %@%@% or something), str_replace the lot, explode the lot again, and Bob’s your uncle.

When it comes to removing all words with fewer than 4 characters (which I forgot about in my original answer), that’s something a regex is good at. I’d say something like

preg_replace('/(\b|[^a-z])[a-z]{1,3}(\b|[^a-z])/i','$1$2',implode(',',$targetArray));

or something like that. You might want to test this one out, because this is just off the top of my head and untested, but it should be enough to get you started.
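Applied to an array, the implode / replace / explode round trip could look roughly like the sketch below. The sample $content, the $remove map and the $delimiter value are hypothetical, and the final array_filter step is an assumption added here to drop the empty strings that the removed words leave behind.

```php
<?php
// Hypothetical word list, standing in for the parsed file contents.
$content = array('that', 'example', 'with', 'the', 'words', 'from', 'file');

// Hypothetical stop-word map: keys are removed, values are the empty replacements.
$remove = array('that' => '', 'with' => '', 'from' => '');

// Delimiter assumed never to occur inside a word.
$delimiter = '%@%@%';

// 1. Glue the array into one string.
$joined = implode($delimiter, $content);

// 2. Strip the stop words in a single str_replace call.
$joined = str_replace(array_keys($remove), $remove, $joined);

// 3. Strip remaining words of 1 to 3 letters, keeping whatever surrounded them.
$joined = preg_replace('/(\b|[^a-z])[a-z]{1,3}(\b|[^a-z])/i', '$1$2', $joined);

// 4. Split back into an array and drop the empty entries left by removed words.
$words = array_values(array_filter(
    explode($delimiter, $joined),
    function ($word) {
        return $word !== '';
    }
));

print_r($words);
```

Keep in mind that str_replace matches substrings, so a key such as 'that' would also be cut out of a longer word like 'thatch'. If that matters for your word list, the stop words need a whole-word match instead.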