Almost 5 years ago Joel Spolsky wrote this article, ‘The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)’.
Like many, I read it carefully, realizing it was high time I got to grips with this ‘replacement for ASCII’. Unfortunately, 5 years later I feel I have slipped back into a few bad habits in this area. Have you?
I don’t write many specifically international applications; however, I have helped build many internet-facing ASP.NET websites, so I guess that’s not an excuse.
So for my benefit (and, I believe, that of many others), can I get some input from people on the following:
- How to ‘get over’ ASCII once and for all
- Fundamental guidance when working with Unicode.
- Recommended (recent) books and websites on Unicode (for developers).
- Current state of Unicode (5 years after Joel’s article)
- Future directions.
I must admit I have a .NET background and so would also be happy for information on Unicode in the .NET framework. Of course this shouldn’t stop anyone with a differing background from commenting though.
Update: See also this related question, asked previously on Stack Overflow.
Ever since I read Joel’s article and some other i18n articles, I have kept a close eye on my character encoding, and it actually works if you do it consistently. If you work in a company where using UTF-8 is the standard and everybody knows and follows it, it will work.
Here are some interesting articles (besides Joel’s article) on the subject:
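To make "consistently" concrete, here is a minimal sketch in Python (chosen only for brevity; the same idea applies in .NET). It assumes the team standard is UTF-8, and the sample string and the helper name `write_then_read` are made up for illustration. The point is to name the encoding explicitly at every boundary instead of relying on a platform default.

```python
import os
import tempfile

# Sample text with non-ASCII characters (illustrative only).
text = "naïve café, 日本語"

# Encode explicitly; do not rely on the platform default encoding.
data = text.encode("utf-8")

# Decoding with the same encoding gives back the original string.
round_tripped = data.decode("utf-8")

# Decoding the same bytes with a different encoding does not raise an
# error here, but it silently produces the wrong characters.
garbled = data.decode("latin-1")


def write_then_read(path, content):
    """Hypothetical helper: write and read a file, naming UTF-8 both times."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(content)
    with open(path, "r", encoding="utf-8") as f:
        return f.read()


# Usage: round-trip the text through a temporary file.
tmp_dir = tempfile.mkdtemp()
tmp_path = os.path.join(tmp_dir, "sample.txt")
from_file = write_then_read(tmp_path, text)
os.remove(tmp_path)
os.rmdir(tmp_dir)
```

The mismatch case is the one that tends to bite in practice: nothing fails, the text is just wrong, which is why everyone agreeing on one encoding matters more than which encoding it is.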
A quote from the first article, Tips for using Unicode: