Does anyone know the full list of characters that can be used in the URL of a GET request without being encoded? At the moment I am using A-Z, a-z and 0-9, but I would like to find out the full list.
I am also interested in whether a specification has been released for the upcoming addition of Chinese and Arabic URLs (as that will obviously have a big impact on my question).
EDIT: As @Jukka K. Korpela correctly points out, RFC 1738 was updated by RFC 3986.
It expanded and clarified the characters valid for host. Unfortunately the grammar is not easily copied and pasted, but I'll do my best.
In first matched order:
Original answer from RFC 1738 specification:
^ obsolete since 1998.
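For the original question about GET parameters, the relevant RFC 3986 character classes can be sketched in Python as below. This is only an illustration, not a replacement for reading the RFC: the set names and the `encode_query_value` helper are hypothetical, and the helper takes the conservative approach of percent-encoding everything outside the "unreserved" set, which is stricter than the query grammar requires.

```python
import string

# RFC 3986 section 2.3: unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")

# RFC 3986 section 2.2: sub-delims
SUB_DELIMS = set("!$&'()*+,;=")

# RFC 3986 section 3.4: query = *( pchar / "/" / "?" )
# where pchar = unreserved / pct-encoded / sub-delims / ":" / "@"
QUERY_LITERALS = UNRESERVED | SUB_DELIMS | set(":@/?")


def encode_query_value(value: str) -> str:
    """Percent-encode every character that is not unreserved.

    Non-ASCII characters are encoded as their UTF-8 bytes.
    (Hypothetical helper for illustration.)
    """
    out = []
    for ch in value:
        if ch in UNRESERVED:
            out.append(ch)
        else:
            out.extend(f"%{b:02X}" for b in ch.encode("utf-8"))
    return "".join(out)


if __name__ == "__main__":
    print(encode_query_value("a b&c=d~"))
    print(encode_query_value("中"))
```

Characters such as `&`, `=` and `+` are legal in a query string by the grammar, but most applications give them a meaning as delimiters, so encoding them inside a parameter value is usually the safe choice.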