I’ve always believed that the HTTP Content-Type header should correctly identify the contents of a returned resource. I recently noticed a resource from google.com, with a filename similar to /extern_chrome/799678fbd1a8a52d.js, that was served with the following HTTP headers:
HTTP/1.1 200 OK
Expires: Mon, 05 Sep 2011 00:00:00 GMT
Last-Modified: Mon, 07 Sep 2009 00:00:00 GMT
Content-Type: text/html; charset=UTF-8
Date: Tue, 07 Sep 2010 04:30:09 GMT
Server: gws
Cache-Control: private, x-gzip-ok=""
X-XSS-Protection: 1; mode=block
Content-Length: 19933
The content is not HTML, but is pure JavaScript. When I load the resource using a local proxy (Burp Suite), the proxy states that the MIME type is “script”.
Is there an accepted method for determining what a web server has returned? The Content-Type header usually seems to be correct. File extensions are also an indicator, but they are not always accurate. Is examining the contents of the file the only accurate method? Is this what web browsers do to determine how to handle the content?
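To make the mismatch concrete, here is a minimal Python sketch that compares two of the signals mentioned above: the media type declared in the Content-Type header and the type suggested by the URL's extension. The extension table and the function names are my own illustration, not any standard or browser algorithm, and the sketch does not look at the body at all.

```python
from posixpath import splitext
from urllib.parse import urlsplit

# Hypothetical, deliberately tiny extension table (illustration only).
EXTENSION_TYPES = {
    ".js": "text/javascript",
    ".html": "text/html",
    ".css": "text/css",
}


def declared_type(content_type_header):
    """Return the media type from a Content-Type value, dropping parameters."""
    return content_type_header.split(";", 1)[0].strip().lower()


def extension_type(url):
    """Guess a media type from the URL path's extension, or None if unknown."""
    ext = splitext(urlsplit(url).path)[1].lower()
    return EXTENSION_TYPES.get(ext)


def signals_agree(url, content_type_header):
    """True only when the extension is known and matches the declared type."""
    guess = extension_type(url)
    return guess is not None and guess == declared_type(content_type_header)


# Values taken from the response shown in the question.
url = "/extern_chrome/799678fbd1a8a52d.js"
header = "text/html; charset=UTF-8"

print(declared_type(header))       # text/html
print(extension_type(url))         # text/javascript
print(signals_agree(url, header))  # False
```

For this response the two signals disagree, which is exactly the situation described: neither the header nor the extension can be trusted on its own.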
The browser knows it’s JavaScript because it reached it via a <script src="..."> tag. If you typed the URL of a .js file into your browser’s address bar, then even if the server did return the correct Content-Type, your browser wouldn’t treat the file as JavaScript to be executed. (Instead, you would probably either see the .js source code in your browser window or be prompted to save it as a file, depending on your browser.)
Browsers don’t do anything with JavaScript unless it’s referenced by a <script> tag, plain and simple. No content-sniffing is required.