Good morning, coding lads,
I'm writing a small regex to strip special characters (&, *, etc.) from filenames.
This is what my code looks like:
public function clean($string, $bool = false)
{
    $string = html_entity_decode($string, ENT_QUOTES);
    $string = str_replace("'", "", $string);
    $string = str_replace('"', "", $string);
    $string = str_replace("&", "en", $string);
    $string = str_replace("-", "_", $string);
    // Replace all odd characters with _
    $weirdChars = Proces::normalInput($string, true);
    if (count($weirdChars[0]) > 0)
    {
        foreach ($weirdChars[0] as $char)
        {
            $string = str_replace($char, "_", $string);
        }
    }
    if ($bool)
        $string = ucfirst(preg_replace('!_+!', '_', strtolower($string)));
    else
        $string = preg_replace('!_+!', '_', strtolower($string));
    if (isset($string[0]) && $string[0] == "_")
        $string = substr($string, 1);
    if (substr($string, -1) == "_")
        return substr($string, 0, -1);
    return $string;
}
public function normalInput($string, $bool = false) // STRING
{
    $patern = '/[^_a-zA-Z0-9-]/';
    if (preg_match_all($patern, $string, $matches))
    {
        if ($bool)
            return $matches;
        else
            return false;
    }
    else
    {
        if ($bool)
            return $matches;
        else
            return true;
    }
}
These two methods work together and do their job, but I noticed a small problem.
The pattern I use in the normalInput method is:
$patern = '/[^_a-zA-Z0-9-]/';
This works, but I want dots in a filename to be left alone (otherwise my file extension ends up as blaatfoo_pdf instead of blaatfoo.pdf).
Can you help me with this one?
Kind regards,
Jordy Suos (have a cup of coffee and a nice cigarette on this beautiful morning.. ON ME)
Good morning. 😉
You can either use whitelisting or blacklisting:
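A minimal sketch of what the two approaches could look like, assuming every unwanted character should become an underscore. The function names and the particular blacklist of characters are illustrative assumptions, not part of the original code:

```php
<?php
// Whitelist: replace everything that is NOT a letter, digit,
// underscore, dot or hyphen. The hyphen is placed last in the
// character class so it is taken literally, not as a range.
function whitelistFilename($filename)
{
    return preg_replace('/[^a-zA-Z0-9_.-]/', '_', $filename);
}

// Blacklist: replace only the characters listed explicitly
// (here: &, *, both quote types and whitespace, as an example).
function blacklistFilename($filename)
{
    return preg_replace('/[&*\'"\s]/', '_', $filename);
}
```

Because the dot is on the whitelist (and absent from the blacklist), the file extension survives in both variants.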
That’s all there is to it, so you don’t need two functions spanning 50 lines. I prefer the whitelist method, because you never know what characters you’re going to receive as input, and there are quite a few characters you don’t want in your filenames.
I’d also suggest looking into your variable and function naming, because $bool isn’t really descriptive. Call it $ucFirst if you want.
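As a rough illustration of how the whole thing might look in a single function with more descriptive names, here is a sketch under a few assumptions: the name cleanFilename is made up, hyphens are still turned into underscores as in the question, and ucfirst is applied after trimming the underscores rather than before, which differs slightly from the original:

```php
<?php
function cleanFilename($filename, $ucFirst = false)
{
    $filename = html_entity_decode($filename, ENT_QUOTES);
    // Drop quotes entirely and spell out the ampersand ("en" is Dutch for "and").
    $filename = str_replace(array("'", '"'), '', $filename);
    $filename = str_replace('&', 'en', $filename);
    // Whitelist: anything that is not a letter, digit, underscore or dot becomes "_".
    $filename = preg_replace('/[^a-zA-Z0-9_.]/', '_', $filename);
    // Collapse runs of underscores and lowercase the result.
    $filename = preg_replace('/_+/', '_', strtolower($filename));
    // Remove leading and trailing underscores.
    $filename = trim($filename, '_');

    return $ucFirst ? ucfirst($filename) : $filename;
}
```

Calling cleanFilename($name, true) then reads as "clean this and capitalise the first letter", which a bare $bool does not convey.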