I am trying to remove all the vowels from a string, except when a vowel is the first or last character. I have tried two expressions, each applied in two ways, but in vain. I have described them below. Does anybody have a regular expression for this?
e.g.
original string — source = apeaple
after regex — source_modified = apple (this is what is expected)
I tried the expression ([a-zA-Z])[aeiouAEIOU]([a-zA-Z]), but it also removes the repeated character. This is what happens when I apply it:
code used —
Regex reg = new Regex("([a-zA-Z])[aeiouAEIOU]([a-zA-Z])");
string source_modified = reg.Replace(source, "");
original string — source = apeaple
after code execution — source_modified = aple (repeating character removed)
code used — string source_modified = Regex.Replace(source, "([a-zA-Z])[aeiouAEIOU]([a-zA-Z])", "$1" + "$2");
original string — source = apeaple
after code execution — source_modified = apaple (just 1 vowel gets removed)
I also tried ([a-zA-Z])[aeiouAEIOU]*([a-zA-Z]), but this removes just one vowel and not all of them. This is what happens when I apply it:
code used —
Regex reg = new Regex("([a-zA-Z])[aeiouAEIOU]*([a-zA-Z])");
string source_modified = reg.Replace(source, "");
original string — source = apeaple
after code execution — source_modified = “” (all characters are removed)
code used — string source_modified = Regex.Replace(source, "([a-zA-Z])[aeiouAEIOU]*([a-zA-Z])", "$1" + "$2");
original string — source = apeaple
after code execution — source_modified = apeple
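You need lookaround assertions for this: match a vowel only when it is neither the first nor the last character of the string.
C# supports lookaround, and it is very powerful.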
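A minimal sketch of the lookaround idea, assuming the vowel set from the question. The pattern and the names `InnerVowels` and `Strip` are illustrative assumptions, not necessarily the exact expression originally posted here. Because lookarounds do not consume characters, neighbouring vowels can each be matched on their own, which is where the capture-group attempts in the question fall short.

```csharp
using System;
using System.Text.RegularExpressions;

public static class InnerVowels
{
    // (?<!^) : negative lookbehind, the vowel is not at the start of the string
    // (?!$)  : negative lookahead, the vowel is not at the end of the string
    // Assumed pattern for illustration only.
    private static readonly Regex Inner = new Regex("(?<!^)[aeiouAEIOU](?!$)");

    public static string Strip(string source)
    {
        // Only the vowel itself is matched, so replacing with "" removes
        // nothing but the vowel.
        return Inner.Replace(source, "");
    }

    public static void Main()
    {
        Console.WriteLine(Strip("apeaple")); // prints "apple"
    }
}
```

Note that in .NET `$` also matches just before a trailing newline; if the input may end in `\n`, `\z` is probably the stricter choice.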
Update 1
T.W.R.Cole points out a special case in the English language: “this doesn’t work for words like ‘Anyanka’ where an inner ‘y’ is used as a consonant”.
A negative lookahead should handle this.
This time, enable the case-insensitive regex option; it makes the regex simpler than the original.
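One way this could look, as a sketch under assumptions: `y` is treated as a vowel unless a vowel follows it, in which case it counts as a consonant and is kept. The exact pattern and the names `InnerVowelsY` and `Strip` are hypothetical; `RegexOptions.IgnoreCase` is the standard .NET option for case-insensitive matching.

```csharp
using System;
using System.Text.RegularExpressions;

public static class InnerVowelsY
{
    // [aeiou]        : a plain vowel
    // y(?![aeiou])   : a y that is NOT followed by a vowel (y acting as a vowel)
    // (?<!^), (?!$)  : never touch the first or last character
    // Assumed pattern for illustration only.
    private static readonly Regex Inner = new Regex(
        "(?<!^)(?:[aeiou]|y(?![aeiou]))(?!$)",
        RegexOptions.IgnoreCase);

    public static string Strip(string source)
    {
        return Inner.Replace(source, "");
    }

    public static void Main()
    {
        Console.WriteLine(Strip("Anyanka")); // prints "Anynka": the y before "a" is kept
        Console.WriteLine(Strip("rhythm"));  // prints "rhthm": the y before "t" is removed
    }
}
```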
If a y followed by another y still means that the first y is a consonant (er... is there such a word?) and thus should not disappear, then y must be listed in the last character class as well.
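A sketch of that variant, assuming the same hypothetical pattern shape as before with `y` added to the class inside the negative lookahead. The test word "sayyid" is only an illustrative input chosen because it contains a double y.

```csharp
using System;
using System.Text.RegularExpressions;

public static class InnerVowelsYY
{
    // y(?![aeiouy]) : a y is removed only when neither a vowel nor another y follows it.
    // Assumed pattern for illustration only.
    private static readonly Regex Inner = new Regex(
        "(?<!^)(?:[aeiou]|y(?![aeiouy]))(?!$)",
        RegexOptions.IgnoreCase);

    public static string Strip(string source)
    {
        return Inner.Replace(source, "");
    }

    public static void Main()
    {
        // Both y's are kept: the first is followed by y, the second by a vowel.
        Console.WriteLine(Strip("sayyid"));  // prints "syyd"
        Console.WriteLine(Strip("Anyanka")); // prints "Anynka"
    }
}
```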
To repeat: I used C# as my regex dialect, which has good support for lookaround techniques.