I’m just curious. Is there a reason why the “higher powers that be” have never added to the HTML spec (or the XML specs, for that matter) a shorthand method for closing tags that contain content?
For instance, let’s say I have the following HTML table:
<table>
<tr><td>foo</td><td>bar</td><td>foo</td></tr>
<tr><td>bar</td><td>foo</td><td>bar</td></tr>
<tr><td>1</td><td>2</td><td>3</td></tr>
</table>
Is there any reason why a shorthand tag couldn’t be used to close each tag?
Maybe something like this:
<table>
<tr><td>foo</><td>bar</><td>foo</></>
<tr><td>bar</><td>foo</><td>bar</></>
<tr><td>1</><td>2</><td>3</></>
</>
This could save a few bytes of data, and the only downside that I can see is that you can’t quickly tell which tag (or tags) have been closed when the closing tag doesn’t name the element. However, this could be a useful option if you are dynamically generating content and want to save a few bytes in your data.
I’m positive I’m not the first person to think of this. Why hasn’t such functionality been added to any HTML or XML specification?
HTML, as defined by specifications up to and including HTML 4.01, does have shorthand methods for closing elements with content. They exist only in theory, though, so the real question is why browsers did not implement them. The features are described, among other things, in the HTML 4.01 spec in B.3.7 Shorthand markup, and they include NET (= Null End Tag), e.g.
<em/foo/
which is by the formal definitions equivalent to
<em>foo</em>
The spec says: “Documents that use them are conforming SGML documents, but are unlikely to work with many existing HTML tools.” This is an understatement, in the sense that no browser ever implemented HTML by those specifications, i.e. as an SGML application (though some very rare browsers made some attempts in that direction). The issue is still reflected in HTML validation (in the classic sense, excluding HTML5 validation, which plays by its own rules); see the Saga of Slashed Validators.
So why didn’t browsers implement the specs in this respect? It would have been easy, in a sense, to take some existing SGML parser and include it in a browser. But the explanation is that browsers were written in ad hoc ways, fast and loose and pragmatically, paying little or no attention to issues like generalized markup. HTML specifications were written quite some time after the first browsers, mostly standardizing existing practice but also throwing in some new principles. HTML was kind of retrofitted to SGML in the formal sense, but this was never taken seriously by browser vendors.
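As an aside, the empty end tag </> proposed in the question is itself one of the SGML shorthand forms listed in B.3.7, and resolving it is mechanically simple: it closes whichever element was opened most recently. Here is a minimal sketch of that idea in Python. It is not a real SGML or HTML parser; the function name expand_empty_end_tags is made up for illustration, and it assumes well-nested markup with no comments, no omitted end tags, and no ">" inside attribute values.

```python
import re

# Matches an empty end tag "</>", a named end tag, or a start tag
# (attributes allowed, as long as they contain no "<" or ">").
TAG = re.compile(r"</>|</([A-Za-z][\w-]*)\s*>|<([A-Za-z][\w-]*)[^<>]*>")


def expand_empty_end_tags(markup: str) -> str:
    """Replace each "</>" with the end tag of the most recently opened element."""
    open_elements = []  # stack of element names that are still open

    def replace(match):
        text = match.group(0)
        if text == "</>":
            if not open_elements:
                raise ValueError("empty end tag with no open element")
            return f"</{open_elements.pop()}>"
        if match.group(1) is not None:
            # Named end tag: assumed to close the innermost open element.
            if open_elements and open_elements[-1] == match.group(1):
                open_elements.pop()
            return text
        if not text.endswith("/>"):
            # Ordinary start tag; XML-style self-closing tags are not pushed.
            open_elements.append(match.group(2))
        return text

    return TAG.sub(replace, markup)


if __name__ == "__main__":
    short = "<table>\n<tr><td>1</><td>2</><td>3</></>\n</>"
    print(expand_empty_end_tags(short))
```

Run on the table from the question, this gives back the fully named end tags. So the shorthand is unambiguous for a parser that tracks open elements; as the question notes, the cost falls on the human reader.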