I need to remove or replace characters in a QString based on their
QChar::category. In stdlib terms I want to write:
    string.erase(std::remove_if(begin(string), end(string),
                                [](QChar c) {
                                    QChar::Category cat = c.category();
                                    return cat == .... || cat == ...;
                                }),
                 string.end());
Alternatively, I’m happy with a regexp that works on Unicode character
categories and that I can use with QString::replace.
Is that possible with QString, or do I really need to turn the string
into a std::vector<QChar> and back?
Edit: The categories I want to keep:
- for the first character: $, _, or any character in the Unicode categories “Uppercase letter (Lu)”, “Lowercase letter (Ll)”, “Titlecase letter (Lt)”, “Modifier letter (Lm)”, “Other letter (Lo)”, or “Letter number (Nl)”
- for the rest: the first bullet plus any U+200C zero width non-joiner characters, U+200D zero width joiner characters, and characters in the Unicode categories “Non-spacing mark (Mn)”, “Spacing combining mark (Mc)”, “Decimal digit number (Nd)”, or “Connector punctuation (Pc)”.
I can do first/rest in multiple passes.
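Qt provides its own ways to do such things. Whether that is good or not is debatable, but the Qt-idiomatic approach would be to build a new string, reserving space up front and appending only the characters you want to keep.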
Of course, you can do it with lambdas and std::for_each, but that is not Qt idiomatic.

Note that removing characters from a string is slower than appending them to a new string whose space has been reserved, which is why building a new string this way is fast.