
Question 1

Let’s say a browser encounters a link like this:

<a href='stackoverflowhome.html'>home</a>

This is clearly a relative URL pointing to an HTML file in the current directory, but how does the browser know that .html is a file extension and not a TLD (top-level domain)? Does it keep a list of common file extensions, or a list of TLDs? If so, is that list manually updated whenever a new file format becomes common, or whenever the set of accepted TLDs changes, for example with brand TLDs?

Question 2

Editorial Team · Answer 1 · 2026-06-10T18:36:44+00:00

No list of file extensions or TLDs is needed: the browser follows RFC 3986, which specifies how URI references are parsed. If the reference does not begin with a scheme (a run of characters followed by a colon, e.g. http: or gopher:), it must be treated as a relative reference. Quoting from section 4.1 of the RFC:

A URI-reference is either a URI or a relative reference. If the
URI-reference’s prefix does not match the syntax of a scheme followed
by its colon separator, then the URI-reference is a relative
reference.
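
The rule quoted above can be sketched in a few lines of Python. This is only an illustration, not how any particular browser implements it: the names `SCHEME_PREFIX`, `is_relative_reference`, `resolve`, and the base URL `http://example.com/questions/index.html` are assumptions made up for the example. The regular expression follows the scheme grammar in RFC 3986 section 3.1, and `urllib.parse.urljoin` from the standard library does the resolution against the base.

```python
import re
from urllib.parse import urljoin

# RFC 3986 section 3.1: scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
# A reference is absolute only if it starts with such a scheme plus a colon.
SCHEME_PREFIX = re.compile(r"^[A-Za-z][A-Za-z0-9+.-]*:")

# Hypothetical URL of the page that contains the link (an assumption).
BASE = "http://example.com/questions/index.html"


def is_relative_reference(ref: str) -> bool:
    """Return True when ref does not start with a scheme and colon."""
    return SCHEME_PREFIX.match(ref) is None


def resolve(ref: str, base: str = BASE) -> str:
    """Resolve ref against base, as is done for the href of a link."""
    return urljoin(base, ref)


if __name__ == "__main__":
    for ref in ("stackoverflowhome.html", "stackoverflow.com",
                "//stackoverflow.com/", "http://stackoverflow.com/"):
        print(ref, is_relative_reference(ref), resolve(ref))
```

Note that `stackoverflow.com` is resolved as a file name in the current directory, exactly like `stackoverflowhome.html`: the dot and the letters after it play no part in the decision, only the presence or absence of a scheme does.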

User agents are allowed to make a best guess about what the user meant (see section 4.5), especially where the context is ambiguous, such as a browser's URL bar. The RFC recommends against relying on this for URIs that will be around for a long time: the heuristics user agents apply change over time, so the same text could resolve to different resources depending on when it is accessed and which user agent is used.
