page encoding question

Discussion in 'HTML' started by Tony Vella, Dec 19, 2005.

  1. Tony Vella

    Tony Vella Guest

    I am preparing a series of philatelic html pages (lots of text and a few
    scans of stamps) which will include alpha-characters (accents) in Italian,
    French, Spanish, Portuguese and Danish. The pages I have finished in draft
    form so far I have encoded UTF-8 but I have just been told that 99% of the
    world will not be able to read them and that I should go through all the
    pages and re-encode them "western european - windows (1252)". I guess what
    I would like to know is what encoding would be most effective for these
    particular languages. Any advice and pointers will be appreciated.
    --
    Tony Vella in Ottawa, Canada
     
    Tony Vella, Dec 19, 2005
    #1

  2. Tony Vella wrote:

    > I am preparing a series of philatelic html pages (lots of text and a few
    > scans of stamps) which will include alpha-characters (accents) in Italian,
    > French, Spanish, Portuguese and Danish. The pages I have finished in draft
    > form so far I have encoded UTF-8 but I have just been told that 99% of the
    > world will not be able to read them


    That is rubbish. UTF-8 is very well supported (so much so, that I can't
    remember the last time I came across a system that couldn't handle it).

    --
    David Dorward <http://blog.dorward.me.uk/> <http://dorward.me.uk/>
    Home is where the ~/.bashrc is
     
    David Dorward, Dec 19, 2005
    #2

  3. Dan

    Dan Guest

    Tony Vella wrote:
    > I am preparing a series of philatelic html pages (lots of text and a few
    > scans of stamps) which will include alpha-characters (accents) in Italian,
    > French, Spanish, Portuguese and Danish. The pages I have finished in draft
    > form so far I have encoded UTF-8 but I have just been told that 99% of the
    > world will not be able to read them and that I should go through all the
    > pages and re-encode them "western european - windows (1252)". I guess what
    > I would like to know is what encoding would be most effective for these
    > particular languages. Any advice and pointers will be appreciated.


    While UTF-8 is actually very widely supported, and thus there's no
    reason to change your encoding from this (if your server sends a proper
    content-type header indicating the encoding), the Western European
    languages you are using should work all right in the standard Western
    encoding iso-8859-1 as well. Avoid windows-1252; it's a proprietary
    Microsoft set.

    --
    Dan
     
    Dan, Dec 19, 2005
    #3
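
Dan's condition (a proper Content-Type header that names the encoding) can be sketched in Python. This is only an illustration and not something posted in the thread: the function name `build_html_response` and the sample text are made up here, and a real site would usually set the header in the web server configuration rather than in code.

```python
def build_html_response(text, charset="utf-8"):
    """Return (headers, body) with the body encoded in the declared charset."""
    # Encode the page with the same charset that the header announces;
    # a mismatch between the two is what produces garbled accents.
    body = text.encode(charset)
    headers = {
        "Content-Type": f"text/html; charset={charset}",
        "Content-Length": str(len(body)),
    }
    return headers, body


headers, body = build_html_response("<p>Francobolli d'Italia: più di cento</p>")
print(headers["Content-Type"])  # text/html; charset=utf-8
```

The point of the sketch is that the declared charset and the bytes actually sent come from the same value, so they cannot drift apart.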
  4. "Dan" <> wrote in message
    news:...
    > Tony Vella wrote:
    > > I am preparing a series of philatelic html pages (lots of text and a few
    > > scans of stamps) which will include alpha-characters (accents) in Italian,
    > > French, Spanish, Portuguese and Danish. The pages I have finished in draft
    > > form so far I have encoded UTF-8 but I have just been told that 99% of the
    > > world will not be able to read them and that I should go through all the
    > > pages and re-encode them "western european - windows (1252)". I guess what
    > > I would like to know is what encoding would be most effective for these
    > > particular languages. Any advice and pointers will be appreciated.
    >
    > While UTF-8 is actually very widely supported, and thus there's no
    > reason to change your encoding from this (if your server sends a proper
    > content-type header indicating the encoding), the Western European
    > languages you are using should work all right in the standard Western
    > encoding iso-8859-1 as well. Avoid windows-1252; it's a proprietary
    > Microsoft set.
    >
    > --
    > Dan


    I am using iso-8859-1 at the moment, but I am going to switch to UTF-8
    so that I can add pages in Russian and Chinese (right now I have little
    in these languages).

    --
    Luigi Donatello Asero
    https://www.scaiecat-spa-gigi.com/sv/oversattning.php
     
    Luigi Donatello Asero, Dec 19, 2005
    #4
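
The reason for Luigi's switch can be checked with a few lines of Python. This is a minimal sketch: the helper `can_encode` and the sample words are chosen here for illustration and do not come from the thread.

```python
def can_encode(text, charset):
    """Return True if every character of text exists in the given charset."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


# Sample words picked for this example only.
samples = {
    "Italian": "città",
    "Russian": "марки",
    "Chinese": "邮票",
}

for language, word in samples.items():
    # Columns: language, fits in iso-8859-1?, fits in utf-8?
    print(language, can_encode(word, "iso-8859-1"), can_encode(word, "utf-8"))
```

Only the Italian word fits in iso-8859-1, while all three fit in UTF-8.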
  5. "Tony Vella" <> wrote:

    > I am preparing a series of philatelic html pages (lots of text and a few
    > scans of stamps) which will include alpha-characters (accents) in Italian,
    > French, Spanish, Portuguese and Danish.


    They are all covered by the ISO-8859-1 encoding, except for some punctuation
    marks and letters like the oe ligature. If you use windows-1252, you get the
    punctuation marks and the ligature, too.

    >The pages I have finished in draft
    > form so far I have encoded UTF-8 but I have just been told that 99% of the
    > world will not be able to read them


    Nonsense. More probably, 99 % of the WWW users _are_ able to read them. Well,
    let's say 97.6 %. After all, 96,3 % of all percentages have just been made
    up, and the remaining 4,7 % have been miscalculated.

    > and that I should go through all the
    > pages and re-encode them "western european - windows (1252)".


    I wouldn't do that at this point, unless you have good tools that do such
    things for you with minimal effort.

    > I guess what
    > I would like to know is what encoding would be most effective for these
    > particular languages.


    If you were just about to start the project, I would recommend ISO-8859-1 (or
    windows-1252 if you need those extras) - not because of wider browser
    coverage (though there is a _small_ improvement to be gained there) but
    because those encodings are somewhat more efficient (one byte per character,
    whereas UTF-8 uses two bytes for some of the characters you'd use).

    UTF-8 is certainly simpler in the future if you'll ever need to add
    characters in other languages.

    --
    Yucca, http://www.cs.tut.fi/~jkorpela/
    Pages about Web authoring: http://www.cs.tut.fi/~jkorpela/www.html
     
    Jukka K. Korpela, Dec 19, 2005
    #5
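
Jukka's two points (which characters each encoding covers, and how many bytes they take) can be verified with a short Python sketch. The helper name `encoded_size` and the sample strings are assumptions made for this illustration; the codec names are the ones Python's standard library accepts.

```python
def encoded_size(text, charset):
    """Return the byte length of text in charset, or None if it cannot be encoded."""
    try:
        return len(text.encode(charset))
    except UnicodeEncodeError:
        return None


# An accented letter, a Danish letter, the oe ligature, and curly quotes.
for text in ("è", "ø", "œ", "\u201cciao\u201d"):
    sizes = {
        cs: encoded_size(text, cs)
        for cs in ("iso-8859-1", "windows-1252", "utf-8")
    }
    print(repr(text), sizes)
```

The accented letters take one byte in the single-byte encodings and two in UTF-8, and the ligature and curly quotes are missing from iso-8859-1 but present in windows-1252. Unaccented ASCII letters still take one byte each in UTF-8, so for mostly plain text the size difference is small.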
  6. On Mon, 19 Dec 2005, David Dorward wrote:

    > Tony Vella wrote:
    >
    > > I am preparing a series of philatelic html pages (lots of text and
    > > a few scans of stamps) which will include alpha-characters
    > > (accents) in Italian, French, Spanish, Portuguese and Danish. The
    > > pages I have finished in draft form so far I have encoded UTF-8
    > > but I have just been told that 99% of the world will not be able
    > > to read them

    >
    > That is rubbish.


    Agreed

    > UTF-8 is very well supported (so much so, that I can't remember the
    > last time I came across a system that couldn't handle it).


    Broad agreement with that, but there are exceptions...

    Well, Netscape 4.* versions do a pretty good job of *rendering* utf-8,
    but do keep in mind that, if any form submission is required, then NN4
    badly mangles utf-8. Whether it's worth understanding how to
    implement a workaround for that old zombie is debatable, of course:
    I'm just mentioning that it's not without a problem.

    cheers

    (The original WebTV is also hopeless at rendering anything other than
    a subset of Windows-1252, but ho hum.)
     
    Alan J. Flavell, Dec 20, 2005
    #6