lang parameter in anchor tag

Discussion in 'HTML' started by Tristan Miller, Sep 26, 2003.

  1. Greetings.

    What is the correct semantics of a lang parameter in an anchor tag? For
    example,

    <a href="foo" lang="ru">bar</a>

    Does this mean

    (a) The word "bar" is in Russian (and, for example, should be pronounced as
    such by a voice browser), or

    (b) The document "foo" is in Russian, but "bar" is still in whatever
    language its container element is?

    If the correct interpretation is (b), then I take it that to get the
    semantics of (a), I need to write something like this:

    <span lang="ru"><a href="foo">bar</a></span>

    Correct?

    --
    _
    _V.-o Tristan Miller [en,(fr,de,ia)] >< Space is limited
    / |`-' -=-=-=-=-=-=-=-=-=-=-=-=-=-=-= <> In a haiku, so it's hard
    (7_\\ http://www.nothingisreal.com/ >< To finish what you
    Tristan Miller, Sep 26, 2003
    #1

  2. Tristan Miller said...

    > What is the correct semantics of a lang parameter in an anchor tag? For
    > example,
    >
    > <a href="foo" lang="ru">bar</a>


    lang = language-code [CI]
    This attribute specifies the base language of an element's attribute
    values and text content. The default value of this attribute is
    unknown. http://www.w3.org/TR/html401/struct/dirlang.html#adef-lang

    > Does this mean
    >
    > (a) The word "bar" is in Russian (and, for example, should be pronounced as
    > such by a voice browser), or


    Language information specified via the lang attribute may be used by a
    user agent to control rendering in a variety of ways. Some situations
    where author-supplied language information may be helpful include:
    Assisting search engines
    Assisting speech synthesizers
    Helping a user agent select glyph variants for high quality typography
    Helping a user agent choose a set of quotation marks
    Helping a user agent make decisions about hyphenation, ligatures, and spacing
    Assisting spell checkers and grammar checkers
    http://www.w3.org/TR/html401/struct/dirlang.html#adef-lang

    > (b) The document "foo" is in Russian, but "bar" is still in whatever
    > language its container element is?

    "bar" is in Russian, and so is the URL text "foo" if it is read out.

    > If the correct interpretation is (b), then I take it that to get the
    > semantics of (a), I need to write something like this:
    >
    > <span lang="ru"><a href="foo">bar</a></span>


    What you had first is right, as lang applies to an "element's attribute
    values and text content." One day browsers may even support it.

    Also have a look at sending Content-Language headers:

    The "Content-Language" header is intended for use in the case where
    one desires to indicate the language(s) of something that has RFC
    822-like headers, such as MIME body parts or Web documents.
    http://www.ietf.org/rfc/rfc3282.txt
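
    As a rough sketch of what sending such a header looks like, here is a
    tiny WSGI application written against Python's standard library. The
    names app and call, and the sample page body, are made up for this
    example; a real site would more likely set the header in its web server
    configuration.

```python
from wsgiref.util import setup_testing_defaults


def app(environ, start_response):
    """Serve a tiny Russian-language page and label it with Content-Language."""
    body = '<p lang="ru">Привет</p>'.encode("utf-8")
    headers = [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Language", "ru"),  # language of the response body as a whole
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]


def call(wsgi_app):
    """Invoke the app once, without a network socket, and return its response."""
    environ = {}
    setup_testing_defaults(environ)  # fills in a minimal fake request
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = dict(headers)

    body = b"".join(wsgi_app(environ, start_response))
    return captured["status"], captured["headers"], body


status, headers, body = call(app)
print(status, headers["Content-Language"])
```

    The header describes the document as a whole; lang attributes inside the
    markup are still needed for passages in a different language.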

    --
    27/September/2003 12:14:07 am
    =?iso-8859-1?Q?brucie?=, Sep 26, 2003
    #2

  3. Tristan Miller wrote:

    > <a href="foo" lang="ru">bar</a>
    >
    > Does this mean
    >
    > (a) The word "bar" is in Russian (and, for example, should be pronounced as
    > such by a voice browser), or
    >
    > (b) The document "foo" is in Russian, but "bar" is still in whatever
    > language its container element is?


    Why not just *read* the specs? They say:

    (c) The word "bar" is in Russian. The word "foo" is also in Russian. The
    document that "foo" points to could be in any language.

    If you want to say that the document that foo points to is in Russian, use
    hreflang="ru".

    > If the correct interpretation is (b), then I take it that to get the
    > semantics of (a), I need to write something like this:
    >
    > <span lang="ru"><a href="foo">bar</a></span>


    Well, (b) isn't correct. Technically neither is (a), but for all intents
    and purposes, as it doesn't really matter which language a URL is in, (c)
    and (a) are close enough.
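
    As a rough illustration of that reading, Python's standard html.parser
    can pull apart what each attribute on the anchor declares. This is a
    sketch only: the class name AnchorLangReader and the dictionary keys are
    made up for this example, and it looks only at attributes written on the
    <a> element itself, ignoring any lang inherited from ancestors.

```python
from html.parser import HTMLParser


class AnchorLangReader(HTMLParser):
    """Collect, for each <a> element, what lang and hreflang say about it."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._current = None

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            a = dict(attrs)
            self._current = {
                "href": a.get("href"),
                "link_text": "",
                # lang: language of the element's own text and attribute values
                "text_lang": a.get("lang"),
                # hreflang: language of the resource the link points to
                "target_lang": a.get("hreflang"),
            }

    def handle_data(self, data):
        if self._current is not None:
            self._current["link_text"] += data

    def handle_endtag(self, tag):
        if tag == "a" and self._current is not None:
            self.links.append(self._current)
            self._current = None


def describe(markup):
    """Return one dictionary per anchor found in the markup."""
    reader = AnchorLangReader()
    reader.feed(markup)
    reader.close()
    return reader.links


for link in describe('<a href="foo" lang="ru">bar</a>'
                     '<a href="foo" hreflang="ru">bar</a>'):
    print(link)
```

    The first anchor reports a text language and no target language; the
    second reports the reverse, which is interpretation (b) spelled with
    hreflang.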

    --
    Toby A Inkster BSc (Hons) ARCS
    Contact Me - http://www.goddamn.co.uk/tobyink/?id=132
    playing://rem/new_adventures_in_hifi/08_bittersweet_me.ogg
    Toby A Inkster, Sep 26, 2003
    #3
  4. Greetings.

    Toby A Inkster wrote:
    > Why not just *read* the specs? They say:


    Because my web proxy server was down and I didn't have the foresight to
    download a copy for viewing offline? Or is that not a good enough reason?

    I'm sorry if answering this question took up too much of your time, but
    then, you were never obligated to respond.

    Tristan Miller, Sep 26, 2003
    #4
  5. DU Guest

    Tristan Miller wrote:

    > Greetings.
    >
    > What is the correct semantics of a lang parameter in an anchor tag? For
    > example,
    >
    > <a href="foo" lang="ru">bar</a>
    >
    > Does this mean
    >
    > (a) The word "bar" is in Russian (and, for example, should be pronounced as
    > such by a voice browser), or
    >
    > (b) The document "foo" is in Russian, but "bar" is still in whatever
    > language its container element is?
    >


    The markup says nothing about the language of the document that "foo"
    refers to; it could be in any language. Had you added hreflang="ru", it
    would have meant that the referenced "foo" resource is written in
    Russian. Likewise, charset="koi8-r" would have identified the character
    encoding of that resource.

    > If the correct interpretation is (b), then I take it that to get the
    > semantics of (a), I need to write something like this:
    >
    > <span lang="ru"><a href="foo">bar</a></span>
    >


    Just a word of caution. If no available font can render the text in
    the declared language, browsers usually pop up a modal font-download
    dialog. Russian is widely covered by Unicode fonts, so it should not be
    a problem here, but that is not the case for some Asian languages.

    > Correct?
    >


    The handling of language related attributes is not obvious IMO. As
    others mentioned (I agree with them on this), you should check the specs.

    DU
    --
    Javascript and Browser bugs:
    http://www10.brinkster.com/doctorunclear/
    - Resources, help and tips for Netscape 7.x users and Composer
    - Interactive demos on Popup windows, music (audio/midi) in Netscape 7.x
    http://www10.brinkster.com/doctorunclear/Netscape7/Netscape7Section.html
    DU, Sep 26, 2003
    #5
  7. Tristan Miller wrote:

    > What is the correct semantics of a lang parameter in an anchor tag?


    The semantics of the lang attribute is very complex. I don't mean what
    the specifications say (which has been cited and summarized here). They
    don't say very much, and that exactly is the problem. As soon as you
    really start using language markup, you start encountering all kinds of
    problems. And there's really no good summary of even the problems. (Or,
    rather, there is, but it's available in Finnish only, and I don't think
    I have time and energy to translate it, especially due to the minuscule
    practical effect that language markup has at present, or in the near
    future.)

    > <a href="foo" lang="ru">bar</a>


    To summarize the situation: the word "bar" is declared as being Russian,
    whereas nothing is said about the linked document's language. In
    principle, "foo" is declared Russian too, and this might be relevant to
    a speech browser that is asked to tell information about a link,
    including its URL. URLs _can_ be spoken, and sometimes need to.

    But if you write Russian words in a transliteration, using Latin
    letters, such as "bar" literally, I would advise against using the lang
    attribute at all. Beware that I am now advising you to break a WCAG 1.0
    priority 1 requirement (which is, in fact, broken by the WCAG 1.0
    document itself, too, and by virtually all W3C documents) - the
    requirement that language changes be indicated in markup.

    I have two reasons for my advice:

    1. Browsers, such as IE 6, are known to let the lang attribute affect
    fonts too. They may even get wild and frustrate the user when
    they look for a font containing Cyrillic letters, despite the fact
    that the text contains Latin letters only. And if they find such
    a font, they may use it for the transliterated Russian text, making
    it look different from the rest of the text. So
    <p>My favorite author is <span lang="ru">Pushkin</span></p>
    may result in "Pushkin" displayed in a considerably different font.
    So although browsers don't do much _useful_ with lang attributes,
    they surely know how to mess things up.

    2. There is no way to indicate the transliteration method. What would
    <span lang="ru">chas</span> mean? Should it be spoken (and spelling
    checked, and indexed, etc.) according to the transliteration that
    is common in English language context, or by French rules, or by
    German rules, or by standard (ISO 9) rules? It would be just a wild
    guess that the language of the enclosing text dictates this.
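
    The ambiguity in the second point can be made concrete with a small
    sketch. The table below is a deliberately simplified assumption covering
    only the three letters of the Russian word for "hour"; it is not a
    complete implementation of any of these transliteration conventions.

```python
# How the Cyrillic letter "ч" is commonly romanized in four conventions.
# Simplified for illustration; real schemes have many more rules.
CH = {
    "English": "ch",
    "French": "tch",
    "German": "tsch",
    "ISO 9": "č",
}


def transliterate(word, scheme):
    """Romanize a word made only of the letters ч, а and с."""
    table = {"ч": CH[scheme], "а": "a", "с": "s"}
    return "".join(table[letter] for letter in word)


spellings = {scheme: transliterate("час", scheme) for scheme in CH}
for scheme, spelling in spellings.items():
    print(f'{scheme:8} <span lang="ru">{spelling}</span>')
```

    All four spans carry the same lang="ru", yet each would need a different
    reading rule to recover the same Russian word, and nothing in the markup
    says which rule applies.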

    If you actually write Russian in Cyrillic letters, then different cans of
    worms are opened.

    --
    Yucca, http://www.cs.tut.fi/~jkorpela/
    Pages about Web authoring: http://www.cs.tut.fi/~jkorpela/www.html
    Jukka K. Korpela, Sep 26, 2003
    #7
  8. Greetings.

    In article <Xns940319F5DC04jkorpelacstutfi@130.133.1.4>, Jukka K. Korpela
    wrote:
    > If you actually write Russian in Cyrillic letters, then different cans of
    > worms are opened.


    Well, yes, but since it's a short amount of text in an overwhelmingly
    iso-8859-1 document, I was planning on using HTML entities rather than
    using Unicode or mixing character encodings. That is, I had originally
    intended to write something like this:

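    <a lang="ru" href="http://www.cs.toronto.edu/~kol/" title="Antonina
    Kolokolova's web
    page">&#1040;&#1085;&#1090;&#1086;&#1085;&#1080;&#1085;&#1072;
    &#1050;&#1086;&#1083;&#1086;&#1082;&#1086;&#1083;&#1086;&#1074;&#1072;</a>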

    And have it show up as follows, with a mouseover tooltip (or whatever other
    mechanism the browser provides, if any) displaying the English title: (Note
    that this message is in UTF-8.)

    Антонина Колоколова
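
    Numeric character references of this kind, which let Cyrillic text sit
    inside an iso-8859-1 document, do not have to be typed by hand. A minimal
    Python sketch (the function name to_char_refs is made up for this
    example) relies on the standard "xmlcharrefreplace" error handler:

```python
import html


def to_char_refs(text):
    """Replace every non-ASCII character with a decimal character reference."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


name = "Антонина Колоколова"
refs = to_char_refs(name)
print(refs)

# Decoding the references gives back the original text.
roundtrip = html.unescape(refs)
print(roundtrip == name)
```

    ASCII characters pass through untouched, so the function can be applied
    to the link text alone or to a whole line of markup.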

    In light of what I have learned from this thread, though, I suppose that the
    browser will consider the anchor's title, "Antonina Kolokolova's web page",
    to be in Russian rather than English. I don't suppose there's any way
    around this; I seem to recall reading that one isn't allowed to put markup
    inside anchor tags. That is, I would not be allowed to write the
    following, correct?

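    <a href="http://www.cs.toronto.edu/~kol/" title="Antonina
    Kolokolova's web page"><span
    lang="ru">&#1040;&#1085;&#1090;&#1086;&#1085;&#1080;&#1085;&#1072;
    &#1050;&#1086;&#1083;&#1086;&#1082;&#1086;&#1083;&#1086;&#1074;&#1072;</span></a>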

    So assuming I drop the anchor title and end up with just

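    <a lang="ru"
    href="http://www.cs.toronto.edu/~kol/">&#1040;&#1085;&#1090;&#1086;&#1085;&#1080;&#1085;&#1072;
    &#1050;&#1086;&#1083;&#1086;&#1082;&#1086;&#1083;&#1086;&#1074;&#1072;</a>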

    am I still exposing myself to potential cans of worms?

    Tristan Miller, Sep 28, 2003
    #8
  9. Dylan Parry Guest

    Tristan Miller wrote:

    > I seem to recall reading that one isn't allowed to put markup
    > inside anchor tags.


    Span within the anchor is perfectly valid.

    --
    Dylan Parry
    http://www.webpageworkshop.co.uk - FREE Web tutorials and references
    Now playing: Beethoven - Symphony No. 3 in E flat major, Op. 55 "Eroica"
    Dylan Parry, Sep 28, 2003
    #9
  10. Tristan Miller wrote:

    >> If you actually write Russian in Cyrillic letters, then different
    >> cans of worms are opened.

    >
    > Well, yes, but since it's a short amount of text in an overwhelmingly
    > iso-8859-1 document, I was planning on using HTML entities rather than
    > using Unicode or mixing character encodings.


    I see. Then the page will depend on browser support for Cyrillic letters.
    The "HTML entities", or character references to be exact, are relatively
    well supported, but people often use systems with insufficient fonts.
    This is one reason why transliterations are often used. Another reason is
    that people who don't know Russian still get some idea of a text when
    it's written in transliteration. (I remember starting elementary Russian
    three times at the university and quitting after a few lessons each time,
    since those odd characters were all Greek to me.)

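    > <a lang="ru" href="http://www.cs.toronto.edu/~kol/" title="Antonina
    > Kolokolova's web
    > page">&#1040;&#1085;&#1090;&#1086;&#1085;&#1080;&#1085;&#1072;
    > &#1050;&#1086;&#1083;&#1086;&#1082;&#1086;&#1083;&#1086;&#1074;&#1072;</a>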


    Technically, this is incorrect information since the lang attribute
    specifies the language of the element content and all attributes.
    But why not write the title attribute value in Russian, since the page is
    in Russian?

    > In light of what I have learned from this thread, though, I suppose
    > that the browser will consider the anchor's title, "Antonina
    > Kolokolova's web page", to be in Russian rather than English.


    Well, it should. But useful support for lang attributes is very limited.

    > That is, I would not
    > be allowed to write the following, correct?
    >
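    > <a href="http://www.cs.toronto.edu/~kol/" title="Antonina
    > Kolokolova's web page"><span
    > lang="ru">&#1040;&#1085;&#1090;&#1086;&#1085;&#1080;&#1085;&#1072;
    > &#1050;&#1086;&#1083;&#1086;&#1082;&#1086;&#1083;&#1086;&#1074;&#1072;</span></a>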


    That would be allowed, since <span> elements (and other text-level markup)
    are allowed inside <a> elements.

    But it would say that "Antonina Kolokolova's web page" is in English,
    although we know that the name is Russian, and in principle it should be
    pronounced with this in mind. There's no way around this, since attribute
    values are plain text by definition. Oh well, in the deepest theory,
    Unicode has some fancy tools for indicating language with special
    characters, but that gets far too theoretical even for me (and I will eat
    a worm if there is a browser that supports such things).
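
    The difference between the two variants can be checked mechanically. The
    following is a sketch under stated assumptions: the name LangTracker, the
    helper effective_languages and the default language "en" are made up for
    this example, the markup is assumed to be well-formed, and the only rule
    modelled is that an element without its own lang takes the language of
    its nearest ancestor.

```python
from html.parser import HTMLParser

VOID = {"br", "img", "hr", "input", "meta", "link"}  # elements with no end tag


class LangTracker(HTMLParser):
    """Record the language in effect for each title attribute and text run."""

    def __init__(self, default_lang="en"):
        super().__init__()
        self.stack = [default_lang]  # language in effect per open element
        self.found = []              # (kind, value, language) tuples

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        lang = a.get("lang") or self.stack[-1]  # own lang, else inherited
        if "title" in a:
            # An attribute value shares the language of its own element.
            self.found.append(("title", a["title"], lang))
        if tag not in VOID:
            self.stack.append(lang)

    def handle_endtag(self, tag):
        if tag not in VOID and len(self.stack) > 1:
            self.stack.pop()

    def handle_data(self, data):
        if data.strip():
            self.found.append(("text", data.strip(), self.stack[-1]))


def effective_languages(markup, default_lang="en"):
    tracker = LangTracker(default_lang)
    tracker.feed(markup)
    tracker.close()
    return tracker.found


lang_on_anchor = ('<a lang="ru" href="x" title="Antonina Kolokolova\'s web page">'
                  'Антонина Колоколова</a>')
lang_on_span = ('<a href="x" title="Antonina Kolokolova\'s web page">'
                '<span lang="ru">Антонина Колоколова</span></a>')

for variant in (lang_on_anchor, lang_on_span):
    print(effective_languages(variant))
```

    With lang on the anchor, both the title and the link text come out as
    "ru"; with lang on the inner span, the title falls back to the
    surrounding "en" and only the link text is "ru".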

    > So assuming I drop the anchor title and end up with just
    >
    > <a lang="ru"
    > href="http://www.cs.toronto.edu/~kol/">&#1040;&#1085;&#1090;&#1086;&#1085;&#1080;&#1085;&#1072;
    > &#1050;&#1086;&#1083;&#1086;&#1082;&#1086;&#1083;&#1086;&#1074;&#1072;</a>
    >
    > am I still exposing myself to potential cans of worms?


    The only problem I can see with that is the potential lack of Cyrillic
    fonts on people's browsers, and naturally the fact that most of the
    world's population doesn't understand Russian. But if you know a _useful_
    value for it, go ahead and use it without worrying too much about lang
    attributes.

    For a general audience, which may or may not understand Russian, I would
    use a link like the above, with Russian link text, followed by a simple
    textual explanation like "(Antonina Kolokolova's web page, in Russian)".

    Jukka K. Korpela, Sep 28, 2003
    #10
  11. DU Guest

    Tristan Miller wrote:

    > Greetings.
    >
    > In article <Xns940319F5DC04jkorpelacstutfi@130.133.1.4>, Jukka K. Korpela
    > wrote:
    >
    >>If you actually write Russian in Cyrillic letters, then different cans of
    >>worms are opened.

    >
    >
    > Well, yes, but since it's a short amount of text in an overwhelmingly
    > iso-8859-1 document, I was planning on using HTML entities rather than
    > using Unicode or mixing character encodings. That is, I had originally
    > intended to write something like this:
    >
    > <a lang="ru" href="http://www.cs.toronto.edu/~kol/" title="Antonina
    > Kolokolova's web
    > page">Антонина
    > Колоколова</a>
    >


    [snipped]

    The above is inconsistent: lang declares the language of the content as
    well as of the advisory title attribute value, and here the two are in
    different languages.
    The referenced resource uses koi8-r but is written entirely in English,
    which is also inconsistent, although it might be workable (I doubt it).

    I think it would be a lot more sensible to offer 2 links here. How about:

    <a lang="ru" href="http://www.cs.toronto.edu/~kol/" charset="koi8-r"
    hreflang="ru" title="Страничка Антонина Колоколова">
    Антонина
    Колоколова</a>
    which would lead to a document written in Russian in koi8-r

    and

    <a lang="en" href="http://www.cs.toronto.edu/otherFile..."
    charset="iso-8859-1" hreflang="en" title="Antonina Kolokolova's web
    page"> Antonina Kolokolova</a>
    which would lead to a document written in English in iso-latin

    Links, title and attributes would be clear and consistent.



    Note that the document at
    http://www.cs.toronto.edu/~kol/
    uses an incorrect doctype declaration.

    { Note that the public identifier section of the DOCTYPE declaration is
    case sensitive. Some versions of Netscape Composer are known to insert
    the lower-case "-//w3c//dtd html 4.0 transitional//en", rather than the
    correct mixed-case "-//W3C//DTD HTML 4.0 Transitional//EN".
    }
    http://www.htmlhelp.org/faq/html/basics.html#doctype

    http://www.w3.org/QA/2002/04/valid-dtd-list.html

    DU
    DU, Sep 28, 2003
    #11
  12. Greetings.

    In article <bl7env$ama$>, DU wrote:
    > The referenced resource used koi8-r but is entirely written in English:
    > not consistent but this might be doable (I doubt this).
    >
    > I think it would be a lot more sensible to offer 2 links here. How about:
    >
    > <a lang="ru" href="http://www.cs.toronto.edu/~kol/" charset="koi8-r"
    > hreflang="ru" title="Страничка Антонина Колоколова">
    > Антонина
    > Колоколова</a>
    > which would lead to a document written in Russian in koi8-r


    Well, this would not be appropriate given, as you said, that Antonina's
    index page is in English, not Russian. (Other parts of her site are in
    Russian, but not the specific file I'm linking to.)

    > Note that the document at
    > http://www.cs.toronto.edu/~kol/
    > uses an incorrect doctype declaration.


    Unfortunately, I don't have any control over other people's HTML coding
    skills (or lack thereof). If this error really irks you, you can e-mail
    her yourself. :)

    Regards,
    Tristan

    Tristan Miller, Sep 29, 2003
    #12