I am trying to sanitize a string so that it can be used in a URL. It is only for display in the URL. Until now I was using this PHP code, which worked fine:
$CleanString = iconv('UTF-8', 'ASCII//TRANSLIT//IGNORE', $String);
$CleanString = preg_replace("/[^a-zA-Z0-9\/_|+ -]/", '', $CleanString);
$CleanString = strtolower(trim($CleanString, '-'));
$CleanString = preg_replace("/[\/_|+ -]+/", $Delimiter, $CleanString);
Now I am trying to port this to C#. The regexes are no problem, but the first line is a bit tricky. What is a safe way to replace characters such as é, á, ó with their plain equivalents e, a, o?
For example, the code above would transform:
The cát ís running & getting away
to
the-cat-is-running-getting-away
The CharUnicodeInfo.GetUnicodeCategory(c) method can tell you if a character is a “non-spacing mark”. This can only be used when the string is in a form where accents (“diacritics”) are separated from their letter, which can be obtained with Normalize(NormalizationForm.FormD).
Here is the full string extension method:
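A minimal sketch of such an extension method, following the approach described above: decompose with FormD, drop every non-spacing mark, then recompose. The names StringExtensions, RemoveDiacritics and ToSlug are assumptions chosen for illustration, and ToSlug simply mirrors the PHP regex steps from the question. Note that letters without a canonical decomposition (for example ø or ß) pass through RemoveDiacritics unchanged, so this is not a full replacement for iconv's TRANSLIT.

```csharp
using System;
using System.Globalization;
using System.Text;
using System.Text.RegularExpressions;

public static class StringExtensions
{
    // Hypothetical helper: strips accents by removing non-spacing marks
    // from the canonically decomposed (FormD) string.
    public static string RemoveDiacritics(this string text)
    {
        if (string.IsNullOrEmpty(text))
            return text;

        // FormD separates each base letter from its combining accents.
        string decomposed = text.Normalize(NormalizationForm.FormD);
        var builder = new StringBuilder(decomposed.Length);

        foreach (char c in decomposed)
        {
            // Keep everything except the combining accent characters.
            if (CharUnicodeInfo.GetUnicodeCategory(c) != UnicodeCategory.NonSpacingMark)
                builder.Append(c);
        }

        // Recompose whatever is left into the usual FormC representation.
        return builder.ToString().Normalize(NormalizationForm.FormC);
    }

    // Hypothetical helper mirroring the PHP steps from the question.
    public static string ToSlug(this string text, string delimiter = "-")
    {
        if (string.IsNullOrEmpty(text))
            return text;

        string clean = text.RemoveDiacritics();
        // Drop every character that is not allowed in the slug.
        clean = Regex.Replace(clean, @"[^a-zA-Z0-9/_|+ -]", "");
        clean = clean.ToLowerInvariant().Trim('-');
        // Collapse runs of separator characters into one delimiter.
        return Regex.Replace(clean, @"[/_|+ -]+", delimiter);
    }
}
```

With these helpers, "The cát ís running & getting away".ToSlug() should give the-cat-is-running-getting-away, matching the PHP output in the question.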