Occasionally I get errors like this:
An invalid character was found in the mail header: ''
This didn’t make any sense at first, but upon investigation it seems there’s an invisible character in the address.
I know which user this is, so I select them from the DB:
select email from user where email = 'their@address.com'
The user’s email appears as their@address.com, but copying it into a text editor shows a weird leading character:

So why does the SQL equality operator match when it isn’t the same string? Is it because the extra character is invisible?
If I save just that leading char in the text file as unicode and open it in a hex editor, I see this:
FF FE 0E 20
Update: the offending bytes are:
E2 80 8E
What is this craziness, and how did it get there?
How can I prevent this in future, and how can I clean my database (as there are a few of these)?
These are the relevant headers from when the user was created:
Content-Type: application/x-www-form-urlencoded
Accept: application/json, text/javascript, */*; q=0.01
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.3
Accept-Encoding: gzip,deflate,sdch
Accept-Language: en-US,en;q=0.8
Thanks
The bytes FF FE are U+FEFF BYTE ORDER MARK in UTF-16LE (little-endian) encoding, and 0E 20 are U+200E LEFT-TO-RIGHT MARK in the same encoding. The bytes E2 80 8E from the update are the same U+200E character encoded as UTF-8. At the start of a file, these characters are harmless, at least if the content is in a left-to-right writing system, like the Latin alphabet.
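To double-check this reading of the bytes, here is a minimal Python sketch. Python is only my choice for illustration (the question does not say what the application is written in), and the variable names are made up. It decodes both byte sequences from the question and prints the Unicode name of the resulting character.

```python
import unicodedata

# Bytes seen in the hex editor after saving the character as "unicode".
utf16_bytes = bytes.fromhex("FF FE 0E 20")
# Bytes reported in the update.
utf8_bytes = bytes.fromhex("E2 80 8E")

# The generic "utf-16" codec reads the leading FF FE as a little-endian
# byte order mark and consumes it, leaving only the actual character.
from_utf16 = utf16_bytes.decode("utf-16")
from_utf8 = utf8_bytes.decode("utf-8")

for label, text in (("UTF-16", from_utf16), ("UTF-8", from_utf8)):
    for ch in text:
        # Print the code point, its Unicode name and its general category.
        print(label, f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch))
```

Both sequences decode to the same single character, whose general category is Cf (format), which is why it takes up no visible space on screen.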
I cannot guess at their origin, especially since it is not clear to me which file is being discussed and how it was created (from a form post? from the database? some other way?).
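Whatever the origin, one possible way to keep such characters out is to strip invisible format characters from the address before storing it, and to run the same function over the existing rows. The following is only a sketch under assumptions: `strip_format_chars` is a hypothetical helper name, and removing every character in Unicode category Cf is my choice, which suits email addresses but could be too aggressive for free text in scripts that rely on such characters.

```python
import unicodedata


def strip_format_chars(value: str) -> str:
    """Return value without invisible format characters (Unicode category Cf).

    Category Cf covers U+200E LEFT-TO-RIGHT MARK and U+FEFF, among others.
    Hypothetical helper for illustration, not part of any library.
    """
    return "".join(ch for ch in value if unicodedata.category(ch) != "Cf")


# Example: an address with a leading U+200E, as in the question.
dirty = "\u200etheir@address.com"
clean = strip_format_chars(dirty)
print(len(dirty), len(clean))
```

Applying this to incoming form values before the insert would cover prevention, and looping over the stored addresses and updating those where the cleaned value differs would cover the cleanup.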