I have a set of 40 characters, each with its own code point, for example U+0678, U+0679 and so on. How can I retrieve the words, strings and substrings in a text that contain only those characters, based on their code points, ignoring all other characters? I’m struggling with my old code:
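// Requires: using System.Linq; using System.Text.RegularExpressions;
private string token(string x)
{
    // Tokens containing a digit, whitespace, '/', '-' or a Latin letter are dropped.
    Regex exclude = new Regex(@"\d|\s+|/|-|[A-Za-z]", RegexOptions.Compiled);
    // Split on punctuation and whitespace, then join the surviving tokens with spaces.
    return string.Join(" ",
        (from s in Regex.Split(x, "([ \\t{}():;.,!ـ؛،؟ \"\n])")
         where !exclude.IsMatch(s)
         select s).ToArray());
}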
Edit: Assume I have the string “aaa bbb ccc ddd”. I would like to retrieve only the words aaa and bbb, using something like:
Regex regEx = new Regex(@"\u0041|\u0042");
Match match = regEx.Match(mystring);
if (match.Success)
{
    // do something
}
But I have 40 characters.
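To make the goal concrete, here is a rough sketch of what I imagine the 40-character version could look like: build one character class from the list of code points instead of writing 40 alternatives by hand. The class name CodePointFilter and the methods BuildClass, ExtractRuns and ExtractWords are made up for illustration, and I am assuming all 40 characters fit in a single char (no surrogate pairs).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper, not an existing API.
public static class CodePointFilter
{
    // Turns the allowed characters into a class such as [\u0061\u0062].
    // Assumes every allowed character is a single UTF-16 code unit.
    static string BuildClass(IEnumerable<char> allowed) =>
        "[" + string.Concat(allowed.Select(c => "\\u" + ((int)c).ToString("X4"))) + "]";

    // Every maximal run (substring) made up only of allowed characters.
    public static List<string> ExtractRuns(string text, IEnumerable<char> allowed) =>
        Regex.Matches(text, BuildClass(allowed) + "+")
             .Cast<Match>()
             .Select(m => m.Value)
             .ToList();

    // Whitespace-separated words that consist entirely of allowed characters.
    public static List<string> ExtractWords(string text, IEnumerable<char> allowed)
    {
        var whole = new Regex(@"\A" + BuildClass(allowed) + @"+\z");
        return Regex.Split(text, @"\s+")
                    .Where(w => whole.IsMatch(w))
                    .ToList();
    }
}
```

With the string from my example, CodePointFilter.ExtractWords("aaa bbb ccc ddd", new[] { 'a', 'b' }) should give aaa and bbb. For a mixed word such as abc, ExtractRuns would return the partial run ab, while ExtractWords would skip the word entirely, so which one fits depends on whether substrings are wanted.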
1 Answer