I have a large text file that contains a few Unicode characters that make LaTeX crash. How can I find the non-ASCII characters in a file using sed or similar tools in a Linux bash shell?
Try:
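A minimal sketch of such a command, assuming GNU grep. The function name nonascii and the pass-through of extra arguments are illustrative assumptions:

```shell
# Highlight runs of bytes outside the printable ASCII range (space through ~).
# LC_ALL=C makes grep treat input as single bytes, so the range is byte-based.
# Note: tabs and other control characters also fall outside this range.
nonascii() {
    LC_ALL=C grep --color=always '[^ -~]\+' "$@"
}
```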
Which can be used like:
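For instance, the same pattern can be run directly on a file, with -n added to print line numbers so the offending lines are easy to locate. The file name sample.tex and its contents are hypothetical:

```shell
# Create a small sample file: line 1 is pure ASCII, line 2 contains UTF-8 bytes.
printf 'plain ASCII line\nna\303\257ve caf\303\251\n' > sample.tex

# Print only the lines containing non-ASCII bytes, prefixed with line numbers.
LC_ALL=C grep --color=always -n '[^ -~]\+' sample.tex
```

Only the second line of the sample file should be reported.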
Within [], ^ means "not", so [^ -~] matches any character not between space and ~. Leaving control characters aside, this matches non-ASCII characters, and is a more portable though slightly less accurate version of [^\x00-\x7f] below. The \+ means "1 or more", so a multibyte character is highlighted as a whole, rather than having the color codes interspersed between its bytes, which would corrupt the multibyte sequence.
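The stricter byte-range pattern can be sketched with GNU grep's -P (Perl-compatible regex) option, assuming grep was built with PCRE support. The function name nonascii_strict is hypothetical:

```shell
# Match only bytes 0x80-0xFF, so tabs and other ASCII control characters
# are no longer flagged. Requires GNU grep with PCRE support (-P).
nonascii_strict() {
    LC_ALL=C grep --color=always -P '[^\x00-\x7F]+' "$@"
}

# The tab in the first line is ASCII and is ignored; the second line matches.
printf 'tab\there is fine\nna\303\257ve\n' | nonascii_strict
```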