I have a bunch of files in a mixture of encodings, mainly ISO-8859-1 and UTF-8.
I would like to convert all of them to UTF-8, but when I try to batch-convert these files with iconv, some problems arise (files cut in half, etc.).
I suppose the reason is that iconv needs to know the ‘from’ encoding, so if the command looks like this
iconv -f ISO-8859-1 -t UTF-8 in.php -o out.php
but ‘in.php’ is already UTF-8 encoded, that causes problems (correct me if I’m wrong).
Is there a way to list all the files whose encoding is not UTF-8?
You can’t find files that are definitely ISO-8859-1, because any sequence of bytes can be read as ISO-8859-1, but you can find files that are valid UTF-8 (which, unlike most multibyte encodings, gives you reasonable assurance that they are in fact UTF-8). moreutils has a tool, isutf8, which can do this for you. Or you can write your own; it would be fairly simple.
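If you go the write-your-own route, here is a minimal sketch in Python. It is not how isutf8 is implemented; it just tries a strict UTF-8 decode of each file and prints the names of the files that fail. The function names is_valid_utf8 and non_utf8_files are made up for this example.

```python
import sys
from pathlib import Path


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode as strict UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def non_utf8_files(paths):
    """Yield each path whose contents are not valid UTF-8 (hypothetical helper)."""
    for path in paths:
        # Read the raw bytes; opening in text mode would decode them for us.
        if not is_valid_utf8(Path(path).read_bytes()):
            yield path


if __name__ == "__main__":
    # Print one file name per line for every argument that fails the check.
    for name in non_utf8_files(sys.argv[1:]):
        print(name)
```

Saved under a name of your choosing and run with the files as arguments, it prints only the ones that are not valid UTF-8. Those are the candidates for the iconv -f ISO-8859-1 -t UTF-8 command from the question, assuming they really are ISO-8859-1. Pure ASCII files count as valid UTF-8, so they are not listed and need no conversion.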