I have the following test script on my server:
<?php
echo "Test is: " . $_GET['test'];
?>
If I call it with a URL like example.com/script.php?test=ɿ (ɿ being a multibyte character), the resulting page looks like this:
Test is: É¿
If I try to do anything with the value in $_GET['test'], such as save it to a MySQL database, I have the same problem. What do I need to do to make PHP handle this value correctly?
Have you told the user agent your HTTP response is UTF-8?
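As a minimal sketch of what that could look like for the test script: send a Content-Type header with an explicit charset before any output. The helper name render_test_value is made up for this example, and the HTML escaping is an extra precaution, not something the original script did.

```php
<?php
// Hypothetical helper: builds the output line, escaping the value as UTF-8 text.
function render_test_value(string $value): string {
    return "Test is: " . htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
}

// Tell the user agent the response body is UTF-8.
// This must run before any output is sent.
header('Content-Type: text/html; charset=utf-8');

echo render_test_value($_GET['test'] ?? '');
```

Without a declared charset, the browser has to guess, and it may fall back to a single-byte encoding.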
You might also want to ensure your HTML markup declares the encoding, e.g. with a <meta charset="utf-8"> element in the <head>.
For your database, are your tables and MySQL client settings set up for UTF-8? If you check your database using the mysql command-line client, is your terminal environment set up to expect UTF-8?
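On the PHP side, the connection itself also needs a character set. Here is a rough sketch using PDO, assuming the pdo_mysql driver is available. The helper names, the table test_values and the column content are placeholders, not anything from your setup.

```php
<?php
// Hypothetical helper: builds a PDO DSN that requests utf8mb4
// (MySQL's full UTF-8 character set) for the connection.
function utf8_dsn(string $host, string $dbname): string {
    return "mysql:host={$host};dbname={$dbname};charset=utf8mb4";
}

// Hypothetical helper: opens a connection and stores one value.
// Table `test_values` and column `content` are placeholder names.
function save_test_value(string $value, string $host, string $dbname, string $user, string $pass): void {
    $pdo = new PDO(utf8_dsn($host, $dbname), $user, $pass, [
        PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION,
    ]);
    // A prepared statement keeps the value out of the SQL text.
    $stmt = $pdo->prepare('INSERT INTO test_values (content) VALUES (?)');
    $stmt->execute([$value]);
}
```

The connection charset only covers the link between PHP and MySQL. The table and column character sets still have to be able to hold the characters you send.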
In a nutshell, you must check every step: the raw source data, the code that touches it, the storage systems that retain it, and the tools you use to display and debug it.
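When checking each step, it helps to look at the raw bytes instead of trusting how a browser or terminal renders them. The sketch below is one way to do that. The helper name inspect_bytes is made up for this example, and the preg_match('//u', ...) check is just one common way to test for valid UTF-8.

```php
<?php
// Hypothetical helper: reports the raw bytes of a string
// and whether they form valid UTF-8.
function inspect_bytes(string $value): array {
    return [
        'hex'        => bin2hex($value),
        // With the /u modifier, preg_match() fails on invalid UTF-8 input.
        'valid_utf8' => preg_match('//u', $value) === 1,
    ];
}

$sample = "\xC9\xBF"; // the two UTF-8 bytes of ɿ (U+027F)
print_r(inspect_bytes($sample));
```

If $_GET['test'] shows up as c9bf here, PHP received the character correctly. Those same two bytes read as ISO-8859-1 are É and ¿, which matches the output in the question and suggests the problem is how the response is being decoded, not the value itself.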