(I’m interested in both HTML 4.01 and HTML5, in case they differ.)
Does the lang attribute on an img element apply to the src attribute, too? Or is it only for the alt and title attributes?
Example:
<img src="example.png" alt="a red foobar" lang="en" />
Is the image “example.png” considered to be in English? (think of screenshots of a forum thread, or a graphical representation of a word, or a scan of a document)
If so, images with non-linguistic content would need to get lang="zxx". But that would apply to the alt and title attributes too, which would be incorrect.
HTML 4.01 defines the lang attribute as specifying “the base language of an element’s attribute values and text content”, whereas HTML5 defines it as “the primary language for the element’s contents and for any of the element’s attributes that contain text”. The difference is apparently in the formulation only. The lang attribute specifies the language of the alt and title attributes as well as other attributes that may contain prose text, as opposed to code-like values such as URLs or style attribute values, where (human) language is not applicable.
The src attribute itself is not in any (human) language, logically. So the question is whether the lang attribute extends to the image denoted by the src attribute. This is a fairly theoretical question: what impact on software behavior could the answer possibly have? Anyway, the answer depends on what we understand as “text content” (images are treated like text in a sense, namely in formatting, but HTML 4.01 probably means to refer to actual character data only) and as “element’s contents” (is an image part of the img element’s contents?). Overall, it seems that the language of the image (though a feasible concept) cannot be specified in HTML.
So there is no need to worry about images with non-linguistic content. For text content that is “non-linguistic” (i.e. not text in any human language but e.g. some code notation, or a random sequence of characters), using lang="" is what HTML5 recommends. It’s also the practical approach. In the few cases where the lang attribute has any impact, as in automatic hyphenation, lang="" effectively means that no language rules are applied (e.g., no hyphenation). This is different from omitting the attribute, which means that the element inherits language information from its parent.
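A minimal sketch of the three cases discussed above: inheriting the language, resetting it with lang="", and setting it on an img. The German alt and title text, the random string, and the narrow column width are made up for illustration; how (and whether) a browser hyphenates depends on its support for the CSS hyphens property and on the hyphenation dictionaries it ships.

```html
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>lang attribute sketch</title>
<style>
  /* Automatic hyphenation is one of the few places where lang has a visible effect.
     The narrow width is only there to force line breaks. */
  p { hyphens: auto; width: 8em; }
</style>
</head>
<body>
  <!-- No lang attribute: the paragraph inherits "en" from the html element. -->
  <p>Internationalization considerations</p>

  <!-- lang="": the language is declared unknown, so English rules should not be applied. -->
  <p lang="">x9f3kq7zt2mw8rb4ly6c</p>

  <!-- lang here describes the alt and title text; it says nothing about the picture itself. -->
  <p><img src="example.png" alt="ein roter Foobar" title="Beispielbild" lang="de"></p>
</body>
</html>
```

The second paragraph is the case that differs from simply leaving the attribute out: without lang="", the random string would be treated as English like everything else in the document.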