Something I still don’t understand when performing an HTTP GET request to the server is what the advantage is of using the JS function encodeURIComponent to encode each component of the request.
Doing some tests, I saw that the server (using PHP) gets the values of the HTTP GET request properly even if I don’t use encodeURIComponent!
Obviously I still need to encode the special characters & ? = / : on the client, otherwise a value like “peace&love=virtue” would be treated as a new key-value pair of the HTTP GET request instead of one single value.
But why does encodeURIComponent also encode many other characters, like ‘è’ for example, which is translated into %C3%A8 and must then be decoded on the PHP server using the utf8_decode function?
By using encodeURIComponent, all values of the HTTP GET request are UTF-8 encoded, so when reading them in PHP I have to call utf8_decode on each $_GET value, which is quite annoying.
Why can’t we just encode only the & ? = / : characters?
see also: JS encodeURIComponent result different from the one created by FORM
It shows that encodeURIComponent does not even encode properly, because a simple browser FORM GET encodes characters like ‘€’ in a different way. So I still wonder what encodeURIComponent is for.
This is a character encoding issue (again). As Gaby stated, URIs are a sequence of ASCII characters (thus only bytes in the range 0–127). So any character that is not in ASCII needs to be encoded with the Percent-Encoding.
And since UTF-8 is the new “universal character encoding”, nowadays user agents interpret the URI to be UTF-8 encoded. But these UTF-8 encoded bytes are themselves also encoded with the Percent-Encoding, since URIs cannot contain any characters except those in ASCII.
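As a rough illustration of the two steps just described (UTF-8 first, Percent-Encoding second), here is a minimal sketch. The helper name `percentEncodeUtf8` is made up for this example and is not part of any library; it only assumes a runtime that provides `TextEncoder` (modern browsers and Node.js do).

```javascript
// Hypothetical helper for illustration: percent-encode EVERY UTF-8 byte
// of a string. Unlike encodeURIComponent, it does not leave unreserved
// ASCII characters alone, so use it only on non-ASCII input here.
function percentEncodeUtf8(text) {
  const bytes = new TextEncoder().encode(text); // step 1: UTF-8 bytes
  return Array.from(
    bytes,
    (b) => "%" + b.toString(16).toUpperCase().padStart(2, "0") // step 2: %XX
  ).join("");
}

const euro = "\u20AC"; // the Euro sign
const manual = percentEncodeUtf8(euro);
const builtin = encodeURIComponent(euro);

console.log(manual);  // %E2%82%AC
console.log(builtin); // %E2%82%AC

// Reserved characters inside a value are escaped as well, so the value
// from the question stays a single value:
const value = encodeURIComponent("peace&love=virtue");
console.log(value); // peace%26love%3Dvirtue
```

Both routes give the same result for the Euro sign, which suggests that encodeURIComponent is doing nothing more exotic than the UTF-8 plus Percent-Encoding combination described above.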
That means, when you enter http://en.wikipedia.org/wiki/€ into your browser’s address field, your browser looks up the UTF-8 code for € (0xE282AC) and applies the Percent-Encoding to it (%E2%82%AC). So http://en.wikipedia.org/wiki/€ will actually result in http://en.wikipedia.org/wiki/%E2%82%AC.
To show you that this is true, just enter http://en.wikipedia.org/wiki/%E2%82%AC into your address field and your browser will probably turn that into http://en.wikipedia.org/wiki/€. That is because nowadays user agents interpret the URI to be UTF-8 encoded.
Now back to your initial question, why you should apply the Percent-Encoding explicitly: Imagine you have a web page where you want to link to the Wikipedia article on the Euro sign. If you just write the URI with a plain €, your browser will use the character encoding of the document for the € character. That means, if your document’s encoding is Windows-1252 (as in your other question), the € will be encoded as 0x80 and the URI would be http://en.wikipedia.org/wiki/%80 (this actually works because Wikipedia is clever enough to guess Windows-1252, the most popular character encoding with a printable character at 0x80).
But if your document’s encoding is ISO 8859-15, the € will be encoded as 0xA4, which represents the currency sign ¤ in ISO 8859-1 (Wikipedia will choose ISO 8859-1 because 0xA4 is an invalid byte sequence in UTF-8 and HTTP specifies ISO 8859-1 as default character encoding).
So I recommend always using the Percent-Encoding to avoid mistakes. Don’t let the user agents guess what you mean.
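To make the guessing problem concrete, here is a small sketch of what a UTF-8-expecting receiver sees for the three byte sequences mentioned above. The helper `tryDecodeUtf8` is a hypothetical name introduced for this example; it relies only on the fact that `decodeURIComponent` throws a `URIError` when the percent-encoded bytes are not valid UTF-8.

```javascript
// Hypothetical helper: decode a percent-encoded string as UTF-8,
// returning null instead of throwing when the bytes are not valid UTF-8.
function tryDecodeUtf8(encoded) {
  try {
    return decodeURIComponent(encoded);
  } catch (err) {
    if (err instanceof URIError) return null;
    throw err;
  }
}

const fromUtf8 = tryDecodeUtf8("%E2%82%AC");  // Euro sign encoded via UTF-8
const fromWindows1252 = tryDecodeUtf8("%80"); // Euro sign byte in Windows-1252
const fromIso885915 = tryDecodeUtf8("%A4");   // Euro sign byte in ISO 8859-15

console.log(fromUtf8);        // €
console.log(fromWindows1252); // null (not valid UTF-8, receiver must guess)
console.log(fromIso885915);   // null (not valid UTF-8, receiver must guess)

// If the receiver falls back to ISO 8859-1, byte 0xA4 maps directly to
// code point U+00A4, the generic currency sign rather than the Euro sign.
const latin1Guess = String.fromCharCode(0xA4);
console.log(latin1Guess); // ¤
```

Only the UTF-8 form decodes unambiguously; the other two leave the receiver to pick a fallback encoding, which is exactly the guesswork the explicit Percent-Encoding avoids.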