Which charset to use?

Discussion in 'HTML' started by richard, May 1, 2012.

  1. richard

    richard Guest

    The w3.org validator is having fits with special characters.
    The page is currently using utf-8.
    When it comes to an e with the apostrophe over it, or some other character
    with a secondary character over it, it fails to validate.
    So which charset is proper for an xhtml transitional page for these
    characters?

    http://www.1littleworld.net/songs/Asongs.html
    richard, May 1, 2012
    #1

  2. On Tue, 01 May 2012 20:31:56 +0200, richard <> wrote:

    > The w3.org validator is having fits with special characters.
    > The page is currently using utf-8.
    > When it comes an e with the apostrophe over it, or some other character
    > with a secondary character over it, it fails to validate.
    > So which charset is proper for an xhtml transitional page for these
    > characters?
    >
    > http://www.1littleworld.net/songs/Asongs.html


    ISO-8859-1 might be a good choice, as long as the file is actually saved
    in that encoding. Alternatively, replace all such special characters with
    HTML entities (using &eacute; for é, for instance).

    A list of entities can be found here:
    http://en.wikipedia.org/wiki/List_of_XML_and_HTML_character_entity_references
    or here:
    http://www.w3schools.com/tags/ref_entities.asp
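
As a rough illustration of the entity approach suggested above, here is a minimal Python sketch that swaps every non-ASCII character for a character reference. The helper name `to_entities` is made up for this example; `html.entities.codepoint2name` is the standard-library table of HTML 4 entity names, and anything missing from it falls back to a numeric reference.

```python
from html.entities import codepoint2name


def to_entities(text: str) -> str:
    """Replace every non-ASCII character with a character reference."""
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 128:
            out.append(ch)  # plain ASCII passes through unchanged
        elif cp in codepoint2name:
            out.append(f"&{codepoint2name[cp]};")  # named entity, e.g. &eacute;
        else:
            out.append(f"&#{cp};")  # numeric fallback for unnamed characters
    return "".join(out)


print(to_entities("café"))     # caf&eacute;
print(to_entities("Lefèvre"))  # Lef&egrave;vre
```

The result is pure ASCII, so it reads the same whichever charset the page declares. Note that the sketch leaves `&`, `<` and `>` alone, on the assumption that the input is already markup.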

    --
    Kim André Akerø
    Kim André Akerø, May 1, 2012
    #2

  3. Doug Miller

    Doug Miller Guest

    richard <> wrote in news:ty1lmmgy8tv2$.3dluxr9xmy65$.dlg@
    40tude.net:

    > The w3.org validator is having fits with special characters.
    > The page is currently using utf-8.
    > When it comes an e with the apostrophe over it, or some other character
    > with a secondary character over it, it fails to validate.


    This is probably because you coded the character incorrectly in your HTML.

    If you want, for example, an e with an accent grave (not "e with the apostrophe over it"), you
    should write &egrave; or &#232; instead of attempting to insert the actual hex representation
    of the e-grave (0xE8) in your source document.
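
A quick way to convince yourself that the named and numeric forms are interchangeable: the sketch below uses Python's standard `html.unescape` to decode `&egrave;`, the decimal reference `&#232;` and the hexadecimal reference `&#xE8;` (232 is 0xE8 in decimal). The variable names are only for illustration.

```python
import html

named = html.unescape("&egrave;")       # named entity
decimal = html.unescape("&#232;")       # decimal numeric reference
hexadecimal = html.unescape("&#xE8;")   # hexadecimal numeric reference

print(named, decimal, hexadecimal)
print(named == decimal == hexadecimal == "\u00e8")  # True
```

All three are written in plain ASCII in the source file, which is why they survive any mismatch between the declared charset and the encoding the file was saved in.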

    > So which charset is proper for an xhtml transitional page for these
    > characters?


    UTF-8, combined with valid HTML.
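
If the page really was saved in a legacy single-byte encoding while declaring UTF-8 (an assumption; the thread does not say which editor or encoding was used), one way to fix it is to re-encode the file. A minimal sketch, with a hypothetical helper `reencode_to_utf8` and `cp1252` as the guessed source encoding:

```python
def reencode_to_utf8(raw: bytes, source_encoding: str = "cp1252") -> bytes:
    """Interpret raw bytes in a legacy encoding and return them as UTF-8.

    Only call this on data known NOT to be UTF-8 already; running it on
    UTF-8 input would double-encode the accented characters.
    """
    return raw.decode(source_encoding).encode("utf-8")


legacy = "Lefèvre".encode("cp1252")
fixed = reencode_to_utf8(legacy)

print(legacy)  # b'Lef\xe8vre'
print(fixed)   # b'Lef\xc3\xa8vre'
```

In practice the same result usually comes from the editor's "save as UTF-8" option; the sketch only shows what that option does to the bytes.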

    >
    > http://www.1littleworld.net/songs/Asongs.html
    >
    Doug Miller, May 1, 2012
    #3
  4. Doug Miller

    Doug Miller Guest

    Doug Miller <> wrote in
    news:XnsA046B43AC1FBEdougmilmaccom@88.198.244.100:

    Sorry, hit 'send' just a bit too soon. Table of valid HTML
    characters can be found here:

    http://www.lookuptables.com/


    > richard <> wrote in
    > news:ty1lmmgy8tv2$.3dluxr9xmy65$.dlg@ 40tude.net:
    >
    >> The w3.org validator is having fits with special characters.
    >> The page is currently using utf-8.
    >> When it comes an e with the apostrophe over it, or some other
    >> character with a secondary character over it, it fails to
    >> validate.

    >
    > This is probably because you coded the character incorrectly in
    > your HTML.
    >
    > If you want, for example, an e with an accent grave (not "e with
    > the apostrophe over it"), you should write &egrave; or &#232;
    > instead of attempting to insert the actual hex representation
    > of the e-grave (0xE8) in your source document.
    >
    >> So which charset is proper for an xhtml transitional page for
    >> these characters?

    >
    > UTF-8, combined with valid HTML.
    >
    >>
    >> http://www.1littleworld.net/songs/Asongs.html
    >>

    >
    >
    Doug Miller, May 1, 2012
    #4
  5. richard

    richard Guest

    On Tue, 1 May 2012 21:44:23 +0000 (UTC), Doug Miller wrote:

    > Doug Miller <> wrote in
    > news:XnsA046B43AC1FBEdougmilmaccom@88.198.244.100:
    >
    > Sorry, hit 'send' just a bit too soon. Table of valid HTML
    > characters can be found here:
    >
    > http://www.lookuptables.com/
    >
    >
    >> richard <> wrote in
    >> news:ty1lmmgy8tv2$.3dluxr9xmy65$.dlg@ 40tude.net:
    >>
    >>> The w3.org validator is having fits with special characters.
    >>> The page is currently using utf-8.
    >>> When it comes an e with the apostrophe over it, or some other
    >>> character with a secondary character over it, it fails to
    >>> validate.

    >>
    >> This is probably because you coded the character incorrectly in
    >> your HTML.
    >>
    >> If you want, for example, an e with an accent grave (not "e with
    >> the apostrophe over it"), you should write &egrave; or &#232;
    >> instead of attempting to insert the actual hex representation
    >> of the e-grave (0xE8) in your source document.
    >>
    >>> So which charset is proper for an xhtml transitional page for
    >>> these characters?

    >>
    >> UTF-8, combined with valid HTML.
    >>
    >>>
    >>> http://www.1littleworld.net/songs/Asongs.html
    >>>

    >>
    >>


    Explain, then, how come when I replaced the e' with the regular e, the code
    now shows a ? in inverted colors?
    "e" is not a standard utf-8 character?
    Since when?
    richard, May 1, 2012
    #5
  6. Doug Miller

    Doug Miller Guest

    richard <> wrote in news:1h4f9mzzj2yjb.yr6t6w4e5u3f$.dlg@
    40tude.net:

    > On Tue, 1 May 2012 21:44:23 +0000 (UTC), Doug Miller wrote:
    >
    >> Doug Miller <> wrote in
    >> news:XnsA046B43AC1FBEdougmilmaccom@88.198.244.100:
    >>
    >> Sorry, hit 'send' just a bit too soon. Table of valid HTML
    >> characters can be found here:
    >>
    >> http://www.lookuptables.com/
    >>
    >>
    >>> richard <> wrote in
    >>> news:ty1lmmgy8tv2$.3dluxr9xmy65$.dlg@ 40tude.net:
    >>>
    >>>> The w3.org validator is having fits with special characters.
    >>>> The page is currently using utf-8.
    >>>> When it comes an e with the apostrophe over it, or some other
    >>>> character with a secondary character over it, it fails to
    >>>> validate.
    >>>
    >>> This is probably because you coded the character incorrectly in
    >>> your HTML.
    >>>
    >>> If you want, for example, an e with an accent grave (not "e with
    >>> the apostrophe over it"), you should write &egrave; or &#232;
    >>> instead of attempting to insert the actual hex representation
    >>> of the e-grave (0xE8) in your source document.
    >>>
    >>>> So which charset is proper for an xhtml transitional page for
    >>>> these characters?
    >>>
    >>> UTF-8, combined with valid HTML.
    >>>
    >>>>
    >>>> http://www.1littleworld.net/songs/Asongs.html
    >>>>
    >>>
    >>>

    >
    > explain then how come when I replaced the e' with the regular e,
    > the code now shows a ? in inverted colors?


    Maybe *you* should explain why the validator is still complaining about an invalid character
    that you *claim* you replaced.

    Look at your file. IT'S STILL THERE. Or perhaps it's another one. Regardless, you still have
    bytes in the document that are not valid UTF-8.

    > "e" is not a standard utf-8 character?


    Of course it is. But è, the way it is stored in your file, is NOT valid UTF-8, and it's still in
    there, along with at least one other character that *also* is not valid:

    title="Raymond Lefèvre - Soul Coaxing (Ame Câline)"
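
To make the complaint concrete: è itself is a perfectly good Unicode character, but its single-byte Latin-1 form (0xE8) is not a valid UTF-8 sequence. The sketch below assumes the file holds that single byte, which is a guess based on the validator errors described in this thread; `is_valid_utf8` is a hypothetical helper.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


latin1_bytes = "è".encode("latin-1")  # b'\xe8', one byte
utf8_bytes = "è".encode("utf-8")      # b'\xc3\xa8', two bytes

print(is_valid_utf8(latin1_bytes))  # False
print(is_valid_utf8(utf8_bytes))    # True

# A lenient decoder substitutes U+FFFD, the replacement character.
print(latin1_bytes.decode("utf-8", errors="replace") == "\ufffd")  # True
```

U+FFFD is what browsers typically draw as a question mark in a dark diamond, which is most likely the "? in inverted colors" mentioned earlier in the thread.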
    Doug Miller, May 2, 2012
    #6
  7. Doug Miller

    Doug Miller Guest

    In article <>, d wrote:
    >On Tue, 1 May 2012 23:37:30 +0000 (UTC), Doug Miller
    ><> wrote:
    >
    >>Maybe *you* should explain why the validator is still complaining about an
    >>invalid character that you *claim* you replaced.
    >>
    >>Look at your file. IT'S STILL THERE. Or perhaps it's another one. Regardless,
    >>you still have invalid UTF-8 characters in the document.
    >>
    >>> "e" is not a standard utf-8 character?
    >>
    >>Of course it is. But è is NOT, and it's still in there, along with at least
    >>one other character that *also* is not valid:
    >>
    >>title="Raymond Lefèvre - Soul Coaxing (Ame Câline)"

    >
    >ROTFLOL, st00pid strikes again.


    Evan, you're becoming a real pain in the ass. You NEVER contribute anything
    useful to this newsgroup. The only posts you ever make are derogatory of
    Richard. Yes, Richard is an ass -- but we don't need you to keep pointing it
    out. We're all aware of it. Now just **** off.
    Doug Miller, May 2, 2012
    #7
  8. 2012-05-02 0:44, Doug Miller wrote:

    > Table of valid HTML
    > characters can be found here:
    >
    > http://www.lookuptables.com/


    No, it is just a poorly presented table of entities for some characters,
    with some disinformation that is not pragmatically fatal but plain wrong
    anyway.

    The set of valid characters in HTML consists of all ISO 10646 and
    Unicode characters except for a handful of control characters, though
    XHTML imposes some more limitations.
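
For the XHTML side, the limits come from XML 1.0, whose `Char` production allows tab, line feed, carriage return, and the ranges U+0020 to U+D7FF, U+E000 to U+FFFD and U+10000 to U+10FFFF. A minimal sketch of that check, with `is_xml10_char` as a made-up helper name (HTML 4 has its own, slightly different rules that are not modelled here):

```python
def is_xml10_char(ch: str) -> bool:
    """Check one character against the XML 1.0 Char production."""
    cp = ord(ch)
    return (
        cp in (0x9, 0xA, 0xD)          # tab, line feed, carriage return
        or 0x20 <= cp <= 0xD7FF        # excludes the surrogate block
        or 0xE000 <= cp <= 0xFFFD      # excludes U+FFFE and U+FFFF
        or 0x10000 <= cp <= 0x10FFFF   # supplementary planes
    )


print(is_xml10_char("è"))     # True
print(is_xml10_char("\x0b"))  # False: vertical tab is a disallowed control
```

So accented letters such as è and â are always legal characters; whether the validator accepts them depends only on how they are encoded in the file.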

    --
    Yucca, http://www.cs.tut.fi/~jkorpela/
    Jukka K. Korpela, May 2, 2012
    #8
  9. Andreas Prilop, May 3, 2012
    #9
  10. Doug Miller

    Doug Miller Guest

    Andreas Prilop <> wrote in
    news:p-hannover.de:

    > On Tue, 1 May 2012, richard wrote:
    >
    >> So which charset is proper for an xhtml transitional page
    >> for these characters?
    >> http://www.1littleworld.net/songs/Asongs.html

    >
    > Fix your silly dates first:


    There's nothing the matter with his dates. This is purely a vanity site, apparently intended to be
    viewed only by its author -- who is an American, and quite sensibly uses the date format most
    commonly used in the United States.
    Doug Miller, May 3, 2012
    #10
  11. Tim Streater

    Tim Streater Guest

    In article
    <-hannover.de>,
    Andreas Prilop <> wrote:

    > On Tue, 1 May 2012, richard wrote:
    >
    > > So which charset is proper for an xhtml transitional page
    > > for these characters?
    > > http://www.1littleworld.net/songs/Asongs.html

    >
    > Fix your silly dates first:
    > http://www.w3.org/QA/Tips/iso-date


    He doesn't have to use silly ISO dates. He can use any local format as
    long as it's *unambiguous*. That is all that matters.

    --
    Tim

    "That excessive bail ought not to be required, nor excessive fines imposed,
    nor cruel and unusual punishments inflicted" -- Bill of Rights 1689
    Tim Streater, May 3, 2012
    #11
  12. dorayme

    dorayme Guest

    In article
    <-hannover.de>,
    Andreas Prilop <> wrote:
    ....

    > Fix your silly dates first:
    > http://www.w3.org/QA/Tips/iso-date


    It lists some problems with different formats. It says optimistically
    "Fortunately, there is one solution in the ISO-developed international
    date format" and "The international format defined by ISO (ISO 8601)
    tries to address all these problems by defining a numerical date
    system as follows: YYYY-MM-DD".

    However it does not fix things for people who do not happen to know
    this definition. I like how it eggs a big fat pudding arguing:

    "In most cases, writing the date in full letters would be better...
    .... easy to understand for any English-speaking audience.

    "But this system does not cross borders much better than its numerical
    counterparts: does the french 12 Aout 2042 actually mean something for
    a Japanese person? Or when you notice a 昭和44年03月16日 in Japanese
    which is 16 March 1969 in English."

    Notice the words "cross borders"? If a website is written in English
    and the unambiguous long date form is used, order not being so
    important, there are many borders it crosses just fine. In fact, the
    date crosses the borders in at least as much comfort as the rest of
    its fellow travelling text in the website, and it is just as
    hospitably treated and understood. If '23 April 2012' is not
    understood in Buginese, but the rest of the site is, then maybe a cat
    can really smile without having a face.

    It would be easier to teach a robot translator how to translate '7
    April 2012' or 'April 7 2012' (or any arrangement that was unambiguous
    to an English speaker who knew basic things about the meanings of the
    words and the dating system of days, months, years) than to teach
    billions of people a standard.

    It might well be true however, that using the ISO standard would make
    the job of robot translators easier. But they need the least help! The
    only help they need is unambiguity and the long-form English dates are
    that.
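
The ambiguity being argued about is easy to show with Python's standard `datetime` module. In this sketch the string "05/01/2012" is an invented example, not taken from the page under discussion; it parses to two different dates depending on which convention the reader assumes, while the ISO 8601 and long-form renderings each have one reading.

```python
from datetime import date, datetime

d = date(2012, 5, 1)
print(d.isoformat())           # 2012-05-01 (ISO 8601, YYYY-MM-DD)
print(d.strftime("%d %B %Y"))  # long form; month name follows the locale

ambiguous = "05/01/2012"
us = datetime.strptime(ambiguous, "%m/%d/%Y").date()  # month first
eu = datetime.strptime(ambiguous, "%d/%m/%Y").date()  # day first
print(us, eu)  # 2012-05-01 2012-01-05
```

Either unambiguous form settles the matter for a human reader; the all-numeric slash form is the only one of the three that needs outside knowledge to interpret.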

    --
    dorayme
    dorayme, May 3, 2012
    #12
