I’m trying to match a control character in the form \^c where c is any valid character for control characters. I have this regular expression, but it’s not currently working: \\[^][@-z]
I think the problem lies with the fact that the caret character (^) is part of the regular expressions parsing engine.
To match an ASCII text string of the form ^X, the pattern \^. is all you need. To match an ASCII text string of the form \^X, use the pattern \\\^. instead. You may wish to constrain that dot to [A-Z?@_\[\]^\\], giving \\\^[A-Z?@_\[\]^\\]. The bracketed character class is easier to read as [?\x40-\x5F], hence \\\^[?\x40-\x5F] for a literal BACKSLASH, followed by a literal CIRCUMFLEX, followed by something that turns into one of the valid control characters.
Note that this is the result of printing out the pattern, or what you’d read from a file. It’s what you need to pass to the regex compiler. If you have it as a string literal, you must of course double each of those backslashes:
"\\\\\\^[?\\x40-\\x5F]"
Yes, it is insane looking, but that is because Java does not support regexes directly as Groovy and Scala (or Perl and Ruby) do. Regex work is always easier without the extra bbaacckksslllllaasshheesssssess. 🙂
If you had real control characters instead of indirect representations of them, you would use \pC for all literal code points with the property GC=Other, or \p{Cc} for just GC=Control.
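Here is a minimal sketch of both cases in Java, assuming you want to test whole strings with matches(). The class name CaretNotation and the helper methods isCaretEscape and isRealControl are made up for illustration; only java.util.regex.Pattern comes from the standard library.

```java
import java.util.regex.Pattern;

public class CaretNotation {
    // Regex text is \\\^[?\x40-\x5F]; every backslash is doubled for the Java string literal.
    static final Pattern CARET_ESCAPE = Pattern.compile("\\\\\\^[?\\x40-\\x5F]");

    // Matches a single real control character (General_Category=Control).
    static final Pattern REAL_CONTROL = Pattern.compile("\\p{Cc}");

    // Hypothetical helper: true if s is exactly backslash, caret, valid control letter.
    static boolean isCaretEscape(String s) {
        return CARET_ESCAPE.matcher(s).matches();
    }

    // Hypothetical helper: true if s is exactly one real control character.
    static boolean isRealControl(String s) {
        return REAL_CONTROL.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isCaretEscape("\\^C"));  // three characters: backslash, caret, C
        System.out.println(isCaretEscape("\\^?"));  // the DEL notation
        System.out.println(isCaretEscape("\\^c"));  // lowercase c is outside 0x40-0x5F
        System.out.println(isRealControl(String.valueOf((char) 3)));  // a real ETX character
    }
}
```

If you need to find these sequences inside longer text rather than test a whole string, the same compiled patterns should work with matcher(...).find() instead of matches().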