How can I correctly read files in encodings other than UTF-8 in awk?
I have a file in Hebrew (Windows-1255) encoding.
A simple awk '{print $0}' prints characters like �.
How can I make it read the file correctly?
awk itself doesn’t have any support for handling different encodings. It will honor the locale specified in the environment, but your best bet is to transcode the input to the proper encoding before handing it off to awk.
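The transcoding step could look something like the sketch below. The file name sample-1255.txt is made up for the demo, and the encoding name is an assumption about your system: some iconv builds spell it CP1255 rather than WINDOWS-1255 (iconv -l lists the names yours accepts).

```shell
# Build a tiny Windows-1255 test file (hypothetical name): the Hebrew
# word "shalom" as the raw bytes F9 EC E5 ED, written as octal escapes.
printf '\371\354\345\355\n' > sample-1255.txt

# Transcode to UTF-8 first, then hand the result to awk.
# Prints the word as UTF-8 text, which a UTF-8 terminal displays correctly.
iconv -f WINDOWS-1255 -t UTF-8 -c sample-1255.txt | awk '{ print $0 }'
```

If the output has to go back into the original encoding, a second iconv with -f and -t swapped can be added at the end of the pipeline.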
With iconv, -f is the encoding you want to convert from, -t is the target encoding, and -c discards any invalid characters, which would otherwise make iconv stop prematurely. Of course, iconv --help will give more details.