Tristan Miller said:
What is the correct semantics of a lang parameter in an anchor tag?
The semantics of the lang attribute is very complex. I don't mean what
the specifications say (which has been cited and summarized here). They
don't say very much, and that is exactly the problem. As soon as you
really start using language markup, you run into all kinds of
problems, and there's really no good summary of even the problems. (Or,
rather, there is, but it's available in Finnish only, and I don't think
I have the time and energy to translate it, especially given the minuscule
practical effect that language markup has at present, or will have in
the near future.)
<a href="foo" lang="ru">bar</a>
To summarize the situation: the word "bar" is declared as being Russian,
whereas nothing is said about the linked document's language. In
principle, "foo" is declared Russian too, and this might be relevant to
a speech browser that is asked to give information about a link,
including its URL. URLs _can_ be spoken, and sometimes need to be.
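For comparison, HTML 4 does define a separate attribute, hreflang, for
stating the language of the resource that a link points to. A minimal
sketch, reusing the same placeholder URL "foo" and assuming (purely for
illustration) that the linked document is in Russian while the link text
is not:

```html
<!-- hreflang describes the link target; "foo" is a placeholder URL -->
<a href="foo" hreflang="ru">bar</a>
```

Here nothing is declared about the language of "bar" itself; only the
linked document is claimed to be Russian. Whether any browser does
anything useful with that claim is a separate question.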
But if you write Russian words in transliteration, using Latin
letters (such as "bar", taken literally), I would advise against using
the lang attribute at all. Be aware that I am now advising you to break
a WCAG 1.0 priority 1 requirement (which is, in fact, broken by the
WCAG 1.0 document itself, too, and by virtually all W3C documents) - the
requirement that language changes be indicated in markup.
I have two reasons for my advice:
1. Browsers, such as IE 6, are known to let the lang attribute affect
fonts too. They may even go wild and frustrate the user while
they look for a font containing Cyrillic letters, despite the fact
that the text contains Latin letters only. And if they find such
a font, they may use it for the transliterated Russian text, making
it look different from the rest of the text. So
<p>My favorite author is <span lang="ru">Pushkin</span></p>
may result in "Pushkin" displayed in a considerably different font.
So although browsers don't do much _useful_ with lang attributes,
they surely know how to mess things up.
2. There is no way to indicate the transliteration method. What would
<span lang="ru">chas</span> mean? Should it be spoken (and
spell-checked, and indexed, etc.) according to the transliteration that
is common in English-language contexts, or by French rules, or by
German rules, or by standard (ISO 9) rules? It would be just a wild
guess that the language of the enclosing text dictates this.
If you actually write Russian in Cyrillic letters, then different cans of
worms are opened.