I’m working on a project that needs to be Unicode-aware. PHP provides a bunch of useful functions such as str_word_count() to count the words in some input, but they don’t work on UTF-8 data in PHP < 6, which is a shame. The same applies to strlen(), strrev(), etc.
What should I do about this? PHP 6 isn’t even out yet, so I can’t require people to have it in order to use my software…
Should I just write a wrapper library for string functions that will either use PHP 6’s functions or my own in case the version is below 6?
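You could use the multibyte string functions from the mbstring extension, such as mb_strlen() and mb_substr(). Note that mbstring has no direct counterparts for strrev() or str_word_count(), so those would still need your own code.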
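To illustrate the difference, here is a minimal sketch, assuming the mbstring extension is loaded. The sample string is only an illustration, written with byte escapes so the result does not depend on the encoding of the source file.

```php
<?php
// Assumption: the mbstring extension is available.
// "naïve café" spelled out as UTF-8 bytes (ï = C3 AF, é = C3 A9).
$s = "na\xC3\xAFve caf\xC3\xA9";

// strlen() counts bytes, so each accented letter counts twice.
$bytes = strlen($s);              // 12

// mb_strlen() counts characters (code points) when told the encoding.
$chars = mb_strlen($s, 'UTF-8');  // 10

// mb_substr() cuts on character boundaries instead of byte offsets.
$first = mb_substr($s, 0, 5, 'UTF-8'); // "naïve"

echo $bytes, ' bytes, ', $chars, ' characters, first word: ', $first, "\n";
```

Passing the encoding explicitly avoids depending on whatever mbstring's internal encoding happens to be configured as on the server.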
Another good idea might be to look at how others do it, especially well-established and mature systems like WordPress and Drupal. As far as I am aware, they all have their own wrappers around the multibyte functions.
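A wrapper of that kind could look roughly like the sketch below. The function names (utf8_strlen, utf8_strrev, utf8_word_count) are hypothetical and are not taken from WordPress, Drupal, or any other library. It assumes the input is valid UTF-8 and that PCRE was built with UTF-8 and Unicode property support.

```php
<?php
// Hypothetical helper names; a sketch, not any project's actual API.

// Character count: prefer mbstring, otherwise count matches of "." in UTF-8 mode.
function utf8_strlen($s)
{
    if (function_exists('mb_strlen')) {
        return mb_strlen($s, 'UTF-8');
    }
    return preg_match_all('/./us', $s, $unused);
}

// Reverse by code point. Combining marks are not kept with their base
// letter, so decomposed text may come out wrong.
function utf8_strrev($s)
{
    $chars = preg_split('//u', $s, -1, PREG_SPLIT_NO_EMPTY);
    return implode('', array_reverse($chars));
}

// Rough word count: a word starts with a letter and may continue with
// letters, combining marks, apostrophes, or hyphens. This is an
// assumption about what counts as a word, not str_word_count()'s exact rules.
function utf8_word_count($s)
{
    return preg_match_all('/\p{L}[\p{L}\p{M}\'-]*/u', $s, $unused);
}

echo utf8_strlen("h\xC3\xA9llo"), "\n";                  // 5
echo utf8_strrev("a\xC3\xB1b"), "\n";                    // "bña"
echo utf8_word_count("Caf\xC3\xA9 au lait, s'il vous pla\xC3\xAEt"), "\n"; // 6
```

Checking function_exists() at call time keeps the calling code the same whether or not mbstring is installed, which is the point of having a wrapper in the first place.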
Additional possibly interesting resources: