What’s the most reliable, generic way to construct a self-referential URL? In other words, I want to generate the http://www.site.com[:port] portion of the URL that the user’s browser is hitting. I’m using PHP running under Apache.
A few complications:
- Relying on $_SERVER['HTTP_HOST'] is dangerous, because that seems to come straight from the HTTP Host header, which someone can forge.
- There may or may not be virtual hosts.
- There may be a port specified using Apache’s Port directive, but that might not be the port that the user specified, if it’s behind a load-balancer or proxy.
- The port may not actually be part of the URL. For example, 80 and 443 are usually omitted.
- PHP’s $_SERVER['HTTPS'] doesn’t always give a reliable value, especially if you’re behind a load-balancer or proxy.
- Apache has a UseCanonicalName directive, which affects the values of the SERVER_NAME and SERVER_PORT environment variables. We can assume this is turned on, if that helps.
The most reliable way is to provide it yourself.
The site should be coded to be hostname-neutral, but to know about a special configuration file. This file doesn’t get put into source control with the codebase, because it belongs to the webserver’s configuration. The file is used to set things like the hostname and other webserver-specific parameters. You can accommodate load balancers, changing ports, etc., because you are declaring exactly what the code may assume when an HTTP request reaches it, instead of having it infer those values from the request.
This trick also helps development, incidentally. 🙂
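As a rough illustration of the configuration-file idea, here is a minimal sketch. The array keys (`scheme`, `host`, `port`) and the function name `baseUrl` are made up for this example, and the settings are shown inline only so the snippet runs on its own. In a real setup they would presumably be loaded from a per-server file kept outside source control.

```php
<?php
// Hypothetical per-server settings. In practice this array would live in a
// file outside source control, so each web server or developer machine
// supplies its own values.
$siteConfig = [
    'scheme' => 'https',
    'host'   => 'www.site.com',
    'port'   => 443, // the public-facing port, not necessarily the one Apache listens on
];

/**
 * Build the scheme://host[:port] prefix from trusted configuration only.
 * Nothing here reads $_SERVER, so a forged Host header cannot affect it.
 */
function baseUrl(array $config): string
{
    $scheme = strtolower($config['scheme']);
    $port   = (int) $config['port'];

    // Ports 80 (http) and 443 (https) are the defaults and are usually omitted.
    $isDefaultPort = ($scheme === 'http' && $port === 80)
                  || ($scheme === 'https' && $port === 443);

    return $scheme . '://' . $config['host'] . ($isDefaultPort ? '' : ':' . $port);
}

echo baseUrl($siteConfig), "\n"; // prints https://www.site.com
```

On a development machine the same code would just get different settings, for example `http`, `localhost`, and `8080`, which yields `http://localhost:8080`.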