Using PHP 5.3.2, I’m having trouble handling a request for a page whose name contains an umlaut: ö
Making the request for the test_ö_test.htm page with Firefox + Live HTTP Headers, I can see that Firefox automatically percent-encodes the umlaut when it sends the request:
GET /test_%C3%B6_test.htm HTTP/1.1
Now, using http://meyerweb.com/eric/tools/dencoder/ I am able to encode/decode between test_ö_test.htm and test_%C3%B6_test.htm, so I figure that encoding is correct.
Using PHP’s urldecode(), I get test_Ã¶_test.htm
And the hated 404 is returned. Note that test_ö_test.htm does exist on the file system.
When I test with JavaScript’s escape(), I get test_%F6_test.htm. When I plug that into my browser, the content page is returned successfully. urldecode() turns that back into the umlaut.
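To make the difference between the two encoded forms concrete, here is a minimal sketch comparing the legacy escape() with encodeURIComponent(). It assumes a JavaScript engine where the legacy escape() global is available (browsers and Node.js); the variable names are only for illustration.

```javascript
// "\u00f6" is ö (U+00F6), written as an escape so the file's own encoding does not matter.
const pageName = "test_\u00f6_test.htm";

// escape() is a legacy function: a code point below 256 becomes a single %XX,
// which matches the ISO-8859-1 (Latin-1) byte for that character.
const latin1Style = escape(pageName);

// encodeURIComponent() percent-encodes the UTF-8 bytes of the character,
// which is the form Firefox sent in the GET request.
const utf8Style = encodeURIComponent(pageName);

console.log(latin1Style); // test_%F6_test.htm
console.log(utf8Style);   // test_%C3%B6_test.htm
```

So %F6 is the single Latin-1 byte for ö, while %C3%B6 is its two-byte UTF-8 form. Both are valid encodings of the same character, just in different character sets.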
Your page is declared as ISO-8859-1, while your data is UTF-8 encoded. As a result, the browser interprets the two-byte UTF-8 sequence 0xC3 0xB6 as two Latin-1 characters: “LATIN CAPITAL LETTER A WITH TILDE” (Ã) and “PILCROW SIGN” (¶). The encoding of your data and the encoding declared for the page need to agree.
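The misreading can be reproduced outside the browser. The following is a rough sketch that assumes Node.js (for its Buffer class); it takes the UTF-8 bytes of ö and decodes them once as Latin-1 and once as UTF-8.

```javascript
// UTF-8 encoding of ö (U+00F6) is the two bytes 0xC3 0xB6.
const umlautBytes = Buffer.from("\u00f6", "utf8");

// Decoding those bytes as Latin-1 turns each byte into its own character.
const misread = umlautBytes.toString("latin1");

// Decoding them as UTF-8 gives the original single character back.
const roundTrip = umlautBytes.toString("utf8");

console.log(umlautBytes.toString("hex")); // c3b6
console.log(misread);                     // Ã¶
console.log(roundTrip === "\u00f6");      // true
```

If the page has to stay ISO-8859-1, the decoded string would need converting from UTF-8 to Latin-1 on the PHP side instead (utf8_decode() or iconv() are the usual candidates in PHP 5.3); otherwise, declaring the page as UTF-8 makes the two agree.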