Diacritical marks in HTML?

Discussion in 'HTML' started by Girish Sharma, Nov 27, 2004.

  1. Is it possible to somehow encode diacritical marks such as a dot above and
    below any letter, a tilde, a bar, or an accent above any letter? I didn't
    find them in the ISO-8859-1 HTML entities.

    I want to do it in a way that can be viewed on any browser.

    Thanks.

    Girish Sharma
     
    Girish Sharma, Nov 27, 2004
    #1

  2. Girish Sharma

    mscir Guest

    Girish Sharma wrote:

    > Is it possible to somehow encode diacritical marks such as a dot above and
    > below any letter, a tilde, a bar, or an accent above any letter? I didn't
    > find them in the ISO-8859-1 HTML entities.
    >
    > I want to do it in a way that can be viewed on any browser.
    >
    > Thanks.
    > Girish Sharma


    Maybe this approach will work for you: UTF-8

    http://www.music.indiana.edu/tfm/diacrits.html
    http://www.tony-franks.co.uk/UTF-8.htm
    http://www.slovo.info/unifonts.htm
    http://www.ioplex.com/~miallen/encdec/dl/tests/utf8.html
     
    mscir, Nov 27, 2004
    #2

  3. Girish Sharma

    Philip Ronan Guest

    Girish Sharma wrote:

    > Is it possible to somehow encode diacritical marks such as a dot above and
    > below any letter, a tilde, a bar, or an accent above any letter? I didn't
    > find them in the ISO-8859-1 HTML entities.
    >
    > I want to do it in a way that can be viewed on any browser.
    >
    > Thanks.
    >
    > Girish Sharma


    ANY browser? I think that's going to be difficult.

    If the characters aren't part of the Latin1 character set (iso-8859-1), you
    might have better luck with Unicode (UTF-8).

    If the characters you want aren't widely available, then you can use
    "combining diacritical marks" to assemble them. I'm not sure how many
    browsers support this, but here's a link anyway.
    http://www.alanwood.net/unicode/combining_diacritical_marks.html

    Phil
    --
    phil [dot] ronan @ virgin [dot] net
    http://vzone.virgin.net/phil.ronan/
     
    Philip Ronan, Nov 27, 2004
    #3
  4. Girish Sharma

    Richard Guest

    Philip Ronan wrote:

    > Girish Sharma wrote:


    >> Is it possible to somehow encode diacritical marks such as a dot above
    >> and below any letter, a tilde, a bar, or an accent above any letter? I
    >> didn't find them in the ISO-8859-1 HTML entities. I want to do it in a
    >> way that can be viewed on any browser. Thanks. Girish Sharma


    > ANY browser? I think that's going to be difficult.


    > If the characters aren't part of the Latin1 character set (iso-8859-1),
    > you might have better luck with Unicode (UTF-8).


    > If the characters you want aren't widely available, then you can use
    > "combining diacritical marks" to assemble them. I'm not sure how many
    > browsers support this, but here's a link anyway.
    > http://www.alanwood.net/unicode/combining_diacritical_marks.html


    But wouldn't the user have to have his browser set to interpret that
    UTF-8 encoding?
    It's like a site using a font that is not in general use: if he doesn't
    have it, he won't see the characters.
     
    Richard, Nov 27, 2004
    #4
  5. Girish Sharma

    mscir Guest

    Richard wrote:

    > Philip Ronan wrote:
    >
    > > Girish Sharma wrote:

    >
    > >> Is it possible to somehow encode diacritical marks such as a dot above
    > >> and below any letter, a tilde, a bar, or an accent above any letter? I
    > >> didn't find them in the ISO-8859-1 HTML entities. I want to do it in a
    > >> way that can be viewed on any browser. Thanks. Girish Sharma

    >
    > > ANY browser? I think that's going to be difficult.

    >
    > > If the characters aren't part of the Latin1 character set (iso-8859-1),
    > > you might have better luck with Unicode (UTF-8).

    >
    > > If the characters you want aren't widely available, then you can use
    > > "combining diacritical marks" to assemble them. I'm not sure how many
    > > browsers support this, but here's a link anyway.
    > > http://www.alanwood.net/unicode/combining_diacritical_marks.html

    >
    > But wouldn't the user have to have his browser set to interpret that
    > UTF-8 encoding? It's like a site using a font that is not in general
    > use: if he doesn't have it, he won't see the characters.


    I thought that any browser that supported utf-8 would show the
    characters correctly if the page included:

    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=UTF-8">
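    For what it's worth, here is a rough Python sketch of that idea. The
    declaration only helps if the bytes of the page really are UTF-8, so the
    sketch writes the META line and encodes the text in one place. The helper
    names build_page and fits_latin1 are made up for illustration, and the
    sample characters are just examples. It says nothing about whether the
    reader's fonts can draw the result.

```python
# Hypothetical helper: build a minimal HTML page and return its UTF-8 bytes.
def build_page(body_text: str) -> bytes:
    html = (
        "<html><head>\n"
        '<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=UTF-8">\n'
        "</head><body><p>" + body_text + "</p></body></html>\n"
    )
    # The declared charset must match the encoding of the bytes written out.
    return html.encode("utf-8")


# Hypothetical helper: can this text be stored as ISO-8859-1 at all?
def fits_latin1(text: str) -> bool:
    try:
        text.encode("iso-8859-1")
        return True
    except UnicodeEncodeError:
        return False


# U+1E63 is LATIN SMALL LETTER S WITH DOT BELOW; U+00E3 is "a" with tilde.
page = build_page("\u1e63")
latin1_ok_for_a_tilde = fits_latin1("\u00e3")      # inside ISO-8859-1
latin1_ok_for_s_dot_below = fits_latin1("\u1e63")  # outside ISO-8859-1
```

    So "a" with tilde fits in ISO-8859-1, but "s" with dot below needs
    either UTF-8 bytes or a character reference.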

    Mike
     
    mscir, Nov 27, 2004
    #5
  6. "Girish Sharma" <> wrote:

    > Is it possible to somehow encode diacritical marks such as a dot above
    > and below any letter, a tilde, a bar, or an accent above any letter? I
    > didn't find them in the ISO-8859-1 HTML entities.


    Just a few of them belong to ISO-8859-1 (e.g., "a" with tilde).

    > I want to do it in a way that can be viewed on any browser.


    Impossible.

    See e.g. http://www.cs.tut.fi/~jkorpela/html/chars.var for a general
    discussion. The most practical way might be to use a Unicode-capable editor
    and author your documents in UTF-8. That way you would see the characters
    themselves while working with a document. For casual occurrences of
    diacritic marks, it might be simplest to write them using character
    references.

    To write e.g. letter "a" with dot above, you could use either the
    precomposed character LATIN SMALL LETTER A WITH DOT ABOVE as
    &#x227;
    or normal letter "a" followed by COMBINING DOT ABOVE:
    a&#x307;

    Generally the former works more often, and qualitatively better, when
    available - but only a relatively small number of such precomposed
    characters exist in Unicode.

    Browser support for _simple_ composition (base letter and one diacritic)
    is tolerable in modern browsers (IE 6, Firefox, etc.), but nasty surprises
    should be expected for many combinations, especially if you try to put
    several diacritics on a character.
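    As a minimal sketch of the two spellings, the following Python uses the
    standard unicodedata module to check whether a base letter plus one
    combining mark has a precomposed form, and prints both as hexadecimal
    character references. The function names char_refs and has_precomposed
    are invented for this example, and the length check is only a heuristic
    for a single base letter with a single mark.

```python
import unicodedata

precomposed = "\u0227"   # LATIN SMALL LETTER A WITH DOT ABOVE
combining = "a\u0307"    # "a" followed by COMBINING DOT ABOVE


def char_refs(text: str) -> str:
    """Write every non-ASCII character as a hex character reference."""
    return "".join(c if ord(c) < 128 else "&#x%X;" % ord(c) for c in text)


def has_precomposed(base: str, mark: str) -> bool:
    """True if NFC normalization folds base + mark into a single code point."""
    return len(unicodedata.normalize("NFC", base + mark)) == 1


# NFC composes the pair; NFD splits the precomposed character again.
composed = unicodedata.normalize("NFC", combining)
decomposed = unicodedata.normalize("NFD", precomposed)

print(char_refs(precomposed))   # &#x227;
print(char_refs(combining))     # a&#x307;
print(has_precomposed("a", "\u0307"))
```

    When has_precomposed returns False, the combining sequence is the only
    way to write the character, and the display depends on how well the
    browser and font handle composition.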

    --
    Yucca, http://www.cs.tut.fi/~jkorpela/
    Pages about Web authoring: http://www.cs.tut.fi/~jkorpela/www.html
     
    Jukka K. Korpela, Nov 27, 2004
    #6
  7. Girish Sharma

    Liz Guest

    In message <Xns95AEF274B4118jkorpelacstutfi@193.229.0.31>
    "Jukka K. Korpela" <> wrote:

    > "Girish Sharma" <> wrote:
    >
    > > Is it possible to somehow encode diacritical marks such as a dot above
    > > and below any letter, a tilde, a bar, or an accent above any letter? I
    > > didn't find them in the ISO-8859-1 HTML entities.

    >


    >
    > To write e.g. letter "a" with dot above, you could use either the
    > precomposed character LATIN SMALL LETTER A WITH DOT ABOVE as
    > &#x227;


    Can you explain why this is better than &aring; please?
    (This isn't a challenge, it's genuine ignorance, plus I've got loads of
    &aacute;s in my Galapagos pages)

    Thanks

    Slainte

    Liz

    --
     
    Liz, Nov 28, 2004
    #7
  8. Liz <> wrote:

    >> To write e.g. letter "a" with dot above, you could use either the
    >> precomposed character LATIN SMALL LETTER A WITH DOT ABOVE as &#x227;

    >
    > Can you explain why this is better than &aring; please?


    If the text to be presented contains a with dot above, then a with dot
    above is the correct character, and quite distinct from a with ring above.
    Of course it was just an example - letter a with dot is pretty rare
    (probably used only in Ulithian, which is spoken by 3,000 people, though it
    could appear e.g. as a mathematical symbol, too).

    > (This isn't a challenge, it's genuine ignorance, plus I've got loads of
    > &aacute;s in my Galapagos pages)


    No problem with that; the diacritic used in Spanish is the acute accent,
    and &aacute; is one way of presenting letter a with acute.

    --
    Yucca, http://www.cs.tut.fi/~jkorpela/
    Pages about Web authoring: http://www.cs.tut.fi/~jkorpela/www.html
     
    Jukka K. Korpela, Nov 28, 2004
    #8
  9. Girish Sharma

    Liz Guest

    In message <Xns95AFA808FC668jkorpelacstutfi@193.229.0.31>
    "Jukka K. Korpela" <> wrote:

    > Liz <> wrote:
    >
    > >> To write e.g. letter "a" with dot above, you could use either the
    > >> precomposed character LATIN SMALL LETTER A WITH DOT ABOVE as &#x227;

    > >
    > > Can you explain why this is better than &aring; please?

    >
    > If the text to be presented contains a with dot above, then a with dot
    > above is the correct character, and quite distinct from a with ring above.
    > Of course it was just an example - letter a with dot is pretty rare
    > (probably used only in Ulithian, which is spoken by 3,000 people, though it
    > could appear e.g. as a mathematical symbol, too).

    Aaah.
    I was even more ignorant than I thought. :-(
    >
    > > (This isn't a challenge, it's genuine ignorance, plus I've got loads of
    > > &aacute;s in my Galapagos pages)

    >
    > No problem with that; the diacritic used in Spanish is the acute accent,
    > and &aacute; is one way of presenting letter a with acute.

    Thank goodness. :)

    Thanks and slainte

    Liz

    --
    Virtual Liz now at http://www.v-liz.com
    Kenya; Tanzania; Namibia; India; Seychelles; Galapagos
    "I speak of Africa and golden joys"
     
    Liz, Nov 28, 2004
    #9
  10. Thanks to all who replied to my request. I have tried a test using UTF-8 as
    suggested, but commonly used Sanskrit transliteration diacritical marks did
    not work well in either IE or Mozilla.

    Girish Sharma

    "Girish Sharma" <> wrote in message
    news:8ZTpd.1415$...
    > Is it possible to somehow encode diacritical marks such as a dot above and
    > below any letter, a tilde, a bar, or an accent above any letter? I didn't
    > find them in the ISO-8859-1 HTML entities.
    >
    > I want to do it in a way that can be viewed on any browser.
    >
    > Thanks.
    >
    > Girish Sharma
     
    Girish Sharma, Nov 29, 2004
    #10
  11. Girish Sharma

    mscir Guest

    Girish Sharma wrote:
    > Thanks to all who replied to my request. I have tried a test using UTF-8 as
    > suggested, but commonly used Sanskrit transliteration diacritical marks did
    > not work well in either IE or Mozilla.


    Would you post the URL for the site? I want to learn more about this.
    Apparently using UTF-8 is more complicated than I thought. I'm surprised
    it's not more straightforward to include characters from different
    character sets in web pages.

    Mike
     
    mscir, Nov 29, 2004
    #11
  12. mscir <> wrote:

    > Girish Sharma wrote:
    >> Thanks to all who replied to my request. I have tried a test using
    >> UTF-8 as suggested, but commonly used Sanskrit transliteration
    >> diacritical marks did not work well in either IE or Mozilla.

    >
    > Would you post the url for the site? I want to learn more about this,
    > apparently using utf-8 is more complicated than I thought. I'm
    > surprised it's not more straight-forward to include different character
    > sets in web pages.


    How could things be more straightforward than including the character
    itself in utf-8 encoding?

    The encoding isn't really the issue. You can use any encoding, if desired,
    and present the characters using character references, say &#x1E63; for
    letter s with dot below. There is, of course, the problem that the user's
    browser might not have a suitable font, or might be unable to use it.
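    To illustrate the point about character references, here is a small
    Python sketch that turns text into pure ASCII with decimal character
    references, so the page can be served in any ASCII-compatible encoding.
    The function name to_ascii_html and the sample word are assumptions made
    for the example. Note that Python's "xmlcharrefreplace" handler writes
    decimal references, so s with dot below comes out as &#7779;, which
    names the same character as &#x1E63;.

```python
import html


def to_ascii_html(text: str) -> str:
    """Replace every non-ASCII character with a decimal character reference."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


# Illustrative sample: a transliterated word using r, s and n with dot below.
word = "K\u1e5b\u1e63\u1e47a"
encoded = to_ascii_html(word)

print(to_ascii_html("\u1e63"))   # &#7779;

# html.unescape resolves the references the way a browser's parser would,
# which confirms that no information was lost.
restored = html.unescape(encoded)
```

    The font problem remains either way: the reference only identifies the
    character, it does not guarantee a glyph for it.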

    --
    Yucca, http://www.cs.tut.fi/~jkorpela/
    Pages about Web authoring: http://www.cs.tut.fi/~jkorpela/www.html
     
    Jukka K. Korpela, Dec 1, 2004
    #12