How do I convert a string to upper case with String.toUpperCase() while ignoring special characters like &nbsp; and all the other HTML entities? The problem is that &nbsp; becomes &NBSP;, and the browser does not recognize that as a special HTML character.
I came up with this, but it does not cover all special characters:
    public static String toUpperCaseIgnoreHtmlSymbols(String str) {
        if (str == null) return "";
        str = str.trim();
        // Upper-case first, then restore the entities that toUpperCase() mangled.
        str = str.toUpperCase();
        str = str.replace("&NBSP;", "&nbsp;");
        str = str.replace("&QUOT;", "&quot;");
        str = str.replace("&AMP;", "&amp;");
        // etc.
        return str;
    }
Are you only interested in skipping HTML entities, or do you also want to skip tags? What about chunks of JavaScript? URLs in links?
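If entities really are the only concern, one possible approach is a single regex pass that copies every character reference through untouched and upper-cases the text in between. This is only a sketch: the class name EntityAwareUpperCase, the method name toUpperCaseKeepEntities, and the choice of Locale.ROOT are my own assumptions, and the pattern only recognizes references that end in a semicolon.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class EntityAwareUpperCase {
    // Named (&nbsp;), decimal (&#160;) and hexadecimal (&#xA0;) character references.
    private static final Pattern ENTITY =
            Pattern.compile("&(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#[xX][0-9A-Fa-f]+);");

    public static String toUpperCaseKeepEntities(String str) {
        if (str == null) return "";
        str = str.trim();
        StringBuilder out = new StringBuilder(str.length());
        Matcher m = ENTITY.matcher(str);
        int last = 0;
        while (m.find()) {
            // Upper-case the plain text before the entity...
            out.append(str.substring(last, m.start()).toUpperCase(Locale.ROOT));
            // ...and copy the entity itself unchanged.
            out.append(m.group());
            last = m.end();
        }
        // Upper-case whatever follows the last entity.
        out.append(str.substring(last).toUpperCase(Locale.ROOT));
        return out.toString();
    }

    public static void main(String[] args) {
        // prints: FISH&nbsp;&amp; CHIPS &quot;TO GO&quot;
        System.out.println(toUpperCaseKeepEntities("fish&nbsp;&amp; chips &quot;to go&quot;"));
    }
}
```

This covers any entity without listing them one by one, but it knows nothing about tags, scripts, or URLs.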
If you need to support that kind of stuff, you won’t be able to avoid using a ‘real’ HTML parser instead of a regex. For example, parse the document using jsoup, manipulate the resulting Document, and convert it back to HTML:
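A minimal sketch of that idea, assuming jsoup is on the classpath. The class name JsoupUpperCase and the method name toUpperCaseHtml are hypothetical. Only text nodes are upper-cased, so tag names, attribute values such as href, and script contents (which jsoup keeps as data nodes rather than text nodes) are left as they are.

```java
import java.util.Locale;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.nodes.TextNode;

public class JsoupUpperCase {
    public static String toUpperCaseHtml(String html) {
        // Parse the fragment into a document with an implicit <body>.
        Document doc = Jsoup.parseBodyFragment(html);
        // Turn off pretty-printing so jsoup does not re-indent the markup.
        doc.outputSettings().prettyPrint(false);

        // Visit the body and every element below it.
        for (Element el : doc.body().getAllElements()) {
            // textNodes() returns only the text that is a direct child of el.
            for (TextNode text : el.textNodes()) {
                text.text(text.getWholeText().toUpperCase(Locale.ROOT));
            }
        }
        // Serialize back to HTML; jsoup re-escapes special characters on output.
        return doc.body().html();
    }

    public static void main(String[] args) {
        System.out.println(toUpperCaseHtml(
                "<a href=\"page.html\">fish&nbsp;&amp; chips</a>"));
    }
}
```

Because the entities are decoded while parsing and escaped again on output, there is nothing left to special-case. Note that jsoup may write some characters differently from the input; a double quote in text, for example, comes back as a plain " rather than &quot;.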
now:
will produce: