I have a console application on Linux that doesn’t handle Unicode/UTF-8, and I have no plans to implement it, as there is just no need other than this problem.
Users are entering commands at the prompt that contain non-ASCII characters, and this is causing grief: mysterious problems with data, and stray characters in text config files that should only contain ASCII.
What is the best way to go about dealing with this problem?
Is there a reasonably simple way of converting the Unicode strings back to ASCII, or of removing any characters that can’t be printed using the visible part of the ASCII character set?
Unicode just seems like a complete nightmare to me.
UTF-8 and many single-byte character sets are ASCII-compatible: byte values 0-127 represent the corresponding ASCII characters.
(In UTF-8, every byte of a multi-byte sequence has its high bit set, so it falls outside this range.) Filtering out the rest solves your issue. If you only want the visible part of ASCII, also drop the control characters (0-31 and 127).
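A minimal sketch of that filter, in Python purely for illustration (your application may well be in another language, but the logic is the same per-byte range check). The function name `strip_to_printable_ascii` is made up for this example, and it assumes the input reaches you as raw bytes. It simply drops anything outside 0x20-0x7E; it does not try to transliterate, so "é" is removed rather than turned into "e".

```python
def strip_to_printable_ascii(raw: bytes) -> str:
    """Keep only bytes in the printable ASCII range (space through tilde).

    Every byte of a UTF-8 multi-byte sequence is >= 0x80, so those bytes
    are dropped along with ASCII control characters (0x00-0x1F and 0x7F).
    """
    kept = bytes(b for b in raw if 0x20 <= b <= 0x7E)
    # Safe to decode: only ASCII bytes remain after filtering.
    return kept.decode("ascii")


if __name__ == "__main__":
    # Hypothetical user input containing non-ASCII characters.
    user_input = "set name=José".encode("utf-8")
    print(strip_to_printable_ascii(user_input))
```

Whether to silently drop the bytes or reject the whole command with an error is a design choice; rejecting is probably safer for anything that ends up in a config file.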
You should definitely change your attitude and support UTF-8 though.