I am trying to import a file into some software, but it complains that the file is not saved as UTF-8. I've checked my editor, gedit, and it claims the file is being saved as UTF-8. I also tried saving it as a Windows file instead of a Linux one, but this did not help. So I cut the file into parts and found that 99% of it is fine, but somewhere within about 3 lines of text, something is upsetting the software. The file contains many different languages, so it has lots of unusual symbols. Is it possible for some symbols in a document not to be valid UTF-8?
The character “A” that you mention in a comment is:
And in UTF-8 is encoded as:
You can check whether these are the bytes you actually have in the file (most likely they are).
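One way to do that check is sketched below in Python. It is only an illustration: the helper names `utf8_hex`, `find_invalid_utf8` and `check_file` are made up for this answer, and the euro sign is an arbitrary example character, not the one from your file. The idea is to print the bytes a character should have in UTF-8, and to ask the decoder where (if anywhere) the raw file bytes stop being valid UTF-8.

```python
def utf8_hex(text: str) -> str:
    """Return the UTF-8 bytes of `text` as space-separated hex pairs."""
    return " ".join(f"{b:02x}" for b in text.encode("utf-8"))


def find_invalid_utf8(data: bytes):
    """Return (offset, offending_bytes) for the first invalid UTF-8
    sequence in `data`, or None if everything decodes cleanly."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        # err.start and err.end delimit the bytes the decoder rejected.
        return err.start, data[err.start:err.end]
    return None


def check_file(path: str):
    """Read a file as raw bytes (no decoding by open) and check it.
    `path` is whatever file you want to inspect."""
    with open(path, "rb") as f:
        return find_invalid_utf8(f.read())


# Illustrative character only: the euro sign, U+20AC.
print(utf8_hex("\u20ac"))  # e2 82 ac
```

If `check_file` returns `None` for the suspicious part of the file, the bytes are valid UTF-8 and the problem is more likely in the importing software than in the file.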
If so, then it is a bug in your software. Maybe it tries to autodetect the encoding or type of the file by looking at the first bytes of the file, and it gets confused somehow.
Maybe it sees the first byte (0xEF) and cluelessly expects a BOM (Byte Order Mark), which in UTF-8 is 0xEF 0xBB 0xBF. But the BOM is not there, so it throws an error.
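To rule the BOM theory in or out, you could look at the first three bytes yourself. The snippet below is a minimal sketch; `starts_with_utf8_bom` is a hypothetical helper name, while `codecs.BOM_UTF8` and the `utf-8-sig` codec are part of the Python standard library.

```python
import codecs


def starts_with_utf8_bom(data: bytes) -> bool:
    """True if `data` begins with the UTF-8 BOM (0xEF 0xBB 0xBF)."""
    return data.startswith(codecs.BOM_UTF8)


# Sample byte strings for illustration, not taken from the file in question.
with_bom = codecs.BOM_UTF8 + "hello".encode("utf-8")
without_bom = "hello".encode("utf-8")

print(starts_with_utf8_bom(with_bom))
print(starts_with_utf8_bom(without_bom))

# The "utf-8-sig" codec strips a leading BOM if one is present
# and otherwise behaves like plain "utf-8".
print(with_bom.decode("utf-8-sig") == without_bom.decode("utf-8-sig"))
```

If the file has no BOM and the software only fails when the first character starts with 0xEF, that would support the guess above, though it remains a guess without seeing the software's code.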