i18n - how best to provide multilingual content

Discussion in 'HTML' started by CptDondo, Jan 20, 2007.

  1. CptDondo

    CptDondo Guest

    I have a small, embedded app that uses a webserver to serve up pages
    showing status, etc.

    Right now all the pages are hard-coded in English. We need to provide
    multi-lingual support.

    All of the pages are PHP generated. Ideally, I'd like the PHP backend
    to serve up the language based on a) the user's locale, and b) if that
    is not set, its own locale.

    The PHP backend creates the pages on the fly from XML templates, so it
    wouldn't be that hard for us to change the language.

    But... I don't know the best way to do that. What is the current 'state
    of the art' for language on demand in web content?

    Thanks....
    CptDondo, Jan 20, 2007
    #1

  2. CptDondo

    Toby Inkster Guest

    CptDondo wrote:

    > But... I don't know the best way to do that. What is the current 'state
    > of the art' for language on demand in web content?


    Do you mean automatic translation; or do you mean serving up the best
    choice of human-written translations?

    Automatic translations are rubbish -- they are laughably bad, and will
    present an entirely unprofessional image. Do not even consider using them,
    except on a site that's intended to be ridiculed.

    That's not to say that they're completely useless -- tools like Babelfish
    are useful for the *visitor* if they find a foreign site that they would
    like to read -- you can usually get the gist of it. But for the author,
    they are rubbish.

    For human-written translations, assuming you have got good translators,
    the situation is much better. Catering for a visitor in their own language
    shows that you're willing to make the extra effort to do business with
    them.

    Many companies will offer entirely different sites for each language. If
    you have the resources to manage such a layout, it is often the best
    choice because:

    1. It allows URLs to be tailored to the language. e.g.
    /en/information/garden/lawn-mower
    /fr/renseignement/jardin/fauchage
    which should help with multi-lingual search engine optimisation.

    2. It allows for a different information focus in each language.
    For example, I was once told by a translator that translating
    technical manuals between cultures involves so much more than
    word-for-word translation. People of different cultures expect
    to find different things in their documentation. Americans expect
    the manual to be a tour-de-force of the product's unique features,
    virtually an advertisement for the product; Western Europeans
    expect a fairly dry step-by-step explanation of how to use the
    product to accomplish different aims; Eastern Europeans expect
    information on how to repair the product when it breaks, as in
    their experience, these things inevitably do.

    3. It allows you to take baby-steps. Say, you've decided you want
    to expand into the German market, but you're not sure how much
    business you'll do there, so don't want to invest a lot of money
    having your entire site translated into German. You may want to
    just create a single page site in German, with basic information
    about your company, explain that the site's German translation is
    still pending, that there is more information on the English
    version of the site, and provide the telephone extension for Gunther,
    who works in your New York office, but was born and raised in Munich.
    As your German sales take off, you then plough back some of the money
    into improving the German site. Perhaps one day, the German market
    will be so important to you that you open an office in Berlin, and
    allow them to maintain the German site directly.

    The other approach with human-written translations is to have a single
    site available in multiple languages. For example, you ask/detect a user's
    preferred language, and then when they go to:

    /information/garden/lawn-mower

    a PHP script serves up the information in the correct language. If a
    translation is not available for that particular page (say, it's a new
    product, so the translators haven't finished with it yet), then you just
    serve up the English page. This is a reasonably good method, but it
    doesn't have advantages #1 and #2 above. It kind of has #3, but your
    baby-steps look a little silly because they end up as a mixture of, in the
    above example, German and English. This method can ease maintenance though.
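    A minimal sketch of that fallback, assuming a hypothetical
    <baseDir>/<lang>/<slug>.html layout on disk. The function name, the
    directory layout and the two-letter language codes are illustrative
    assumptions, not something described in this thread:

```php
<?php
// Hypothetical layout: <baseDir>/<lang>/<slug>.html, e.g. pages/de/lawn-mower.html.
function resolve_page(string $baseDir, string $slug, string $lang, string $fallback = 'en'): ?string
{
    // Reject anything that could escape the pages directory.
    if (!preg_match('/^[a-z0-9-]+$/', $slug)) {
        return null;
    }
    // Try the visitor's language first, then the fallback language.
    foreach (array_unique([$lang, $fallback]) as $candidate) {
        $path = "$baseDir/$candidate/$slug.html";
        if (preg_match('/^[a-z]{2}$/', $candidate) && is_file($path)) {
            return $path;
        }
    }
    return null; // no version of this page exists
}
```

    If the German file is missing, the English path comes back instead, which
    is exactly the mixed-language result described above.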

    Always be careful not to let the translated versions of the site fall too
    far behind the English version.

    --
    Toby A Inkster BSc (Hons) ARCS
    Contact Me ~ http://tobyinkster.co.uk/contact
    Toby Inkster, Jan 20, 2007
    #2

  3. CptDondo

    Andy Dingley Guest

    CptDondo wrote:

    > Right now all the pages are hard-coded in English. We need to provide
    > multi-lingual support.


    For serious non-XSLT work, consider JSP instead of PHP. The i18n tools
    are vastly better. Read the O'Reilly Java internationalization book,
    just for a guide to web i18n.

    > backend to serve up the language based a) the user's locale, and if that
    > is not set, its own locale.


    Make the selection completely user-selectable, with cookie persistence,
    with the methods you describe setting the default. It works just the
    same by default, but it's more flexible for casual users finding
    themselves using other people's computers. It's a real nuisance
    otherwise!

    > The PHP backend creates the pages on the fly from XML templates, so it
    > wouldn't be that hard for us to change the language.


    XML or XSLT ? If you structure the data model reasonably well, it's
    not hard to extract text strings stored in groups for each function,
    one for each language. It's easier to manage the translation and
    deployment though if the text is grouped by language into separate
    files and identified by a short identifier. The XSLT document()
    function is especially handy.
    Andy Dingley, Jan 20, 2007
    #3
  4. CptDondo

    Rik Guest

    Andy Dingley wrote:
    >> backend to serve up the language based a) the user's locale, and if
    >> that is not set, its own locale.

    >
    > Make the selection completely user-selectable, with cookie
    > persistence, with the methods you describe setting the default. It
    > works just the
    > same by default, but it's more flexible for casual users finding
    > themselves using other people's computers. It's a real nuisance
    > otherwise!


    Check, the order in which I determine language:
    - Explicitly set (by a GET variable, or pseudo one like /en/ or /de/ etc.
    taken into a rewrite)
    - Cookie
    - HTTP-Accept-Language in the header
    - Geo-IP info (there are free databases available, which are mostly
    accurate enough to determine the country most of the time)
    - System default

    After determining the language the cookie will be sent/overwritten with the
    current choice.
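    The order above can be sketched roughly as follows. The parameter name
    "lang", the SUPPORTED list, and the $geoGuess argument (a stand-in for
    whatever a Geo-IP lookup returns) are assumptions for illustration; the
    Accept-Language handling here is deliberately crude and only looks at the
    first two letters of the header:

```php
<?php
// Hypothetical list of languages the site actually has translations for.
const SUPPORTED = ['en', 'de', 'fr'];

/**
 * Pick a language using the precedence order described above.
 * $get, $cookie and $server stand in for $_GET, $_COOKIE and $_SERVER.
 */
function pick_language(array $get, array $cookie, array $server,
                       ?string $geoGuess = null, string $default = 'en'): string
{
    $candidates = [
        $get['lang'] ?? null,                     // 1. explicitly set
        $cookie['lang'] ?? null,                  // 2. cookie from an earlier visit
        isset($server['HTTP_ACCEPT_LANGUAGE'])    // 3. browser preference (first tag only)
            ? substr($server['HTTP_ACCEPT_LANGUAGE'], 0, 2)
            : null,
        $geoGuess,                                // 4. Geo-IP guess
    ];
    foreach ($candidates as $candidate) {
        if (is_string($candidate) && in_array(strtolower($candidate), SUPPORTED, true)) {
            return strtolower($candidate);
        }
    }
    return $default;                              // 5. system default
}

// In a real request the result would then be written back, for example:
// setcookie('lang', $lang, time() + 365 * 24 * 3600, '/');
```

    Passing the superglobals in as arguments keeps the function easy to
    exercise outside a web request.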

    Grtz,
    --
    Rik Wasmus
    Rik, Jan 20, 2007
    #4
  5. CptDondo

    J.O. Aho Guest

    Andy Dingley wrote:

    > XML or XSLT ? If you structure the data model reasonably well, it's
    > not hard to extract text strings stored in groups for each function,
    > one for each language. It's easier to manage the translation and
    > deployment though if the text is grouped by language into separate
    > files and identified by a short identifier. The XSLT document()
    > function is especially handy.


    I've seen such files in the Gnome2 application desktop icons. They only
    have one short line in each language, the application description, but
    those files are big. Think how large the files will become if you have
    20-30 languages, and you have to replace the big file each time a language
    is updated or added. It's easier IMHO to handle files that hold only one
    language; in the server-side script it's easy to select the right language
    file and use a backup if a translation is missing.
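    A minimal sketch of one-file-per-language with a backup, assuming each
    language lives in a hypothetical <dir>/<code>.php file that returns an
    array of short identifier => string. The file layout and function name
    are assumptions, not anything prescribed in this thread:

```php
<?php
// Hypothetical layout: lang/en.php, lang/de.php, ... each containing
//   <?php return ['login' => 'Login', ...];
function load_strings(string $dir, string $lang, string $fallback = 'en'): array
{
    $strings = [];
    // Load the fallback first so the requested language overrides it
    // and any identifier missing from the translation keeps the backup text.
    foreach ([$fallback, $lang] as $code) {
        $file = "$dir/$code.php";
        if (preg_match('/^[a-z]{2}$/', $code) && is_file($file)) {
            $loaded = include $file;
            if (is_array($loaded)) {
                $strings = array_merge($strings, $loaded);
            }
        }
    }
    return $strings;
}
```

    Adding or updating a language then means touching only that language's
    file.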

    --

    //Aho
    J.O. Aho, Jan 20, 2007
    #5
  6. On Sat, 20 Jan 2007 14:27:37 +0100, Rik wrote:

    > Andy Dingley wrote:
    >>> backend to serve up the language based a) the user's locale, and if
    >>> that is not set, its own locale.

    >>
    >> Make the selection completely user-selectable, with cookie
    >> persistence, with the methods you describe setting the default. It
    >> works just the
    >> same by default, but it's more flexible for casual users finding
    >> themselves using other people's computers. It's a real nuisance
    >> otherwise!

    >
    > Check, the order in which I determine language:
    > - Explicitly set (by a GET variable, or pseudo one like /en/ or /de/ etc.
    > taken into a rewrite)
    > - Cookie
    > - HTTP-Accept-Language in the header
    > - Geo-IP info (there are free databases available, which are mostly
    > accurate enough to determine the country most of the time)
    > - System default
    >
    > After determining the language the cookie will be sent/overwritten with the
    > current choice.
    >


    Thanks. I'll probably do something like that - I've thought about
    using the 'HTTP-Accept-Language' var from the header. I just don't know
    how many people actually set those correctly.

    I guess I didn't phrase my question accurately enough; it has been a long
    week.

    I have XML templates that define item labels in a form. The XML has
    various tags that provide nav info and so on. This is on an embedded
    system, with only a small number of phrases that would need translation; I
    probably have fewer than 200 phrases, mostly one or two words.

    I have XML templates of the following form:

    <item id="myname" value="" index="5" type="text">My Name</item>

    The PHP backend reads that line, and creates a form entry for myname, with
    the label "My Name". What I want to do is to replace the English "My
    Name" with the appropriate words in the user's language.

    I'm thinking of a mechanism similar to the .po files, where the PHP
    backend looks up the text in a translation file. Or even something like
    this:

    <item id="myname" value="" index="5" type="text" text="My Name"/>

    and the PHP backend would look up the text value for "My Name" in a lookup
    table for the user's language.
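    A rough sketch of that lookup, assuming PHP's SimpleXML extension is
    available on the device. The $translations table, its German entry and
    the item_label() helper are made up for illustration:

```php
<?php
// Hypothetical lookup table: language => (English source text => translation).
$translations = [
    'de' => ['My Name' => 'Mein Name'],
];

// Return the label for an <item>, falling back to the English text
// when the table has no entry for it.
function item_label(SimpleXMLElement $item, array $table): string
{
    $english = (string) $item['text'];
    return $table[$english] ?? $english;
}

$item = simplexml_load_string(
    '<item id="myname" value="" index="5" type="text" text="My Name"/>'
);
$lang  = 'de'; // would come from the language detection step
$label = item_label($item, $translations[$lang] ?? []);

printf(
    '<label for="%s">%s</label>' . "\n",
    htmlspecialchars((string) $item['id']),
    htmlspecialchars($label)
);
```

    Using the English text as the key is the same idea gettext uses for its
    message IDs, so a table like this can be swapped for gettext later.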

    (Aside: I guess I failed to use Google correctly yesterday.... PHP has
    support for gettext! <http://us3.php.net/gettext> So that's how I think I
    will go...)

    --Yan
    Captain Dondo, Jan 20, 2007
    #6
  7. CptDondo

    Rik Guest

    Captain Dondo wrote:
    > On Sat, 20 Jan 2007 14:27:37 +0100, Rik wrote:
    >
    >> Andy Dingley wrote:
    >>>> backend to serve up the language based a) the user's locale, and if
    >>>> that is not set, its own locale.
    >>>
    >>> Make the selection completely user-selectable, with cookie
    >>> persistence, with the methods you describe setting the default. It
    >>> works just the
    >>> same by default, but it's more flexible for casual users finding
    >>> themselves using other people's computers. It's a real nuisance
    >>> otherwise!

    >>
    >> Check, the order in which I determine language:
    >> - Explicitly set (by a GET variable, or pseudo one like /en/ or /de/
    >> etc. taken into a rewrite)
    >> - Cookie
    >> - HTTP-Accept-Language in the header
    >> - Geo-IP info (there are free databases available, which are mostly
    >> accurate enough to determine the country most of the time)
    >> - System default
    >>
    >> After determining the language the cookie will be sent/overwritten
    >> with the current choice.
    >>

    >
    > Thanks. I'll probably do something like that - I've thought about
    > using the 'HTTP-Accept-Language' var from the header. I just don't
    > know
    > how many people actually set those correctly.


    Not that many set it themselves; however, most browsers will set it during
    installation to the most probable language (based on the install-language
    choice or, for instance, the OS locale).
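    For reference, the header the browser sends looks something like
    "fr-CH, fr;q=0.9, en;q=0.8". A minimal sketch of turning that into an
    ordered preference list follows; the function name is hypothetical and
    the parsing is simplified rather than a complete implementation of the
    HTTP specification:

```php
<?php
/**
 * Parse an Accept-Language header value into lowercase language tags,
 * ordered from most to least preferred by q-value.
 */
function parse_accept_language(string $header): array
{
    $weighted = [];
    foreach (explode(',', $header) as $position => $part) {
        $pieces = explode(';', trim($part));
        $tag = strtolower(trim($pieces[0]));
        if ($tag === '' || $tag === '*') {
            continue; // skip empty entries and the wildcard
        }
        $q = 1.0; // default weight when no q parameter is given
        foreach (array_slice($pieces, 1) as $param) {
            if (preg_match('/^\s*q\s*=\s*([0-9.]+)\s*$/i', $param, $m)) {
                $q = (float) $m[1];
            }
        }
        if ($q > 0) { // q=0 means "not acceptable"
            $weighted[] = [$tag, $q, $position];
        }
    }
    // Sort by descending q, keeping header order for ties.
    usort($weighted, function ($a, $b) {
        return ($b[1] <=> $a[1]) ?: ($a[2] <=> $b[2]);
    });
    return array_column($weighted, 0);
}
```

    The first tag in the result that the site has a translation for would be
    the one to serve.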

    > This is on an embedded
    > system, with only a small number of phrases that would need
    > translation; I probably have less than 200 phrases, mostly one and
    > two words.


    > I'm thinking of a mechanism similar to the .po files, where the PHP
    > backend look up the text in a translation file. Or even something
    > like this:
    >
    > <item id="myname" value="" index="5" type="text" text="My Name"/>
    >
    > and the PHP backend would look up the text value for "My Name" in a
    > lookup table for the user's language.
    >
    > (Aside: I guess I failed to use Google correctly yesterday.... PHP
    > has support for gettext! <http://us3.php.net/gettext> So that's how
    > I think I will go...)


    Check, with a limited number of phrases that would be my choice. It's a lot
    harder to maintain when translating entire pages/documents, though.
    --
    Rik Wasmus
    Rik, Jan 20, 2007
    #7
  8. CptDondo

    aa Guest

    "CptDondo" <> wrote in message
    news:...

    > Right now all the pages are hard-coded in English. We need to provide
    > multi-lingual support.
    >
    > All of the pages are PHP generated. Ideally, I'd like for the PHP
    > backend to serve up the language based a) the user's locale, and if that
    > is not set, its own locale.
    >
    > The PHP backend creates the pages on the fly from XML templates, so it
    > wouldn't be that hard for us to change the language.
    >
    > But... I don't know the best way to do that. What is the current 'state
    > of the art' for language on demand in web content?
    >
    > Thanks....


    If by XML templates you mean structured content in different languages,
    then all you need is a presentational template in Unicode.
    The problem usually arises with non-European languages, which probably would
    not fit into a European page layout.
    As to language selection, you might want to consider an explicit selection
    of a language in the menu, for the language detected automatically is not
    always what a visitor wants.
    aa, Jan 21, 2007
    #8
  9. CptDondo

    John Murtari Guest

    CptDondo <> writes:

    > I have a small, embedded app that uses a webserver to serve up pages
    > showing status, etc.
    >
    > Right now all the pages are hard-coded in English. We need to provide
    > multi-lingual support.
    >
    > All of the pages are PHP generated. Ideally, I'd like for the PHP
    > backend to serve up the language based a) the user's locale, and if
    > that is not set, its own locale.
    >
    > The PHP backend creates the pages on the fly from XML templates, so it
    > wouldn't be that hard for us to change the language.
    >
    > But... I don't know the best way to do that. What is the current
    > 'state of the art' for language on demand in web content?
    >
    > Thanks....


    Like you said, go with gettext(). We just finished a
    fairly large app that was to be multilingual. We used gettext
    for the small stuff like "Login here". In cases where there
    were larger blocks of text we would set a variable $defLang
    based on the language the user was using, and in the code,

    include(TEXT_DIR."/$defLang/this-usage.txt");

    whenever we needed it. There was a root TEXT_DIR, with sub
    dirs for each locale. File names were the same and it made
    it easy for the end users to update each file for each language.

    For graphic buttons it was a similar approach:

    <img src="<?=BUTTON_DIR."/$defLang/login.gif"?>" ......>

    It worked well for us. Probably the biggest thing we
    did for the end user was create a simple PHP page that would
    scan all the source files for gettext and put up a tabular
    display of each phrase and the translation, if any, in each
    of the target languages, and they could enter the translations
    right there. They did not have to deal with the raw message
    files.
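    The scanning step of such a page might look roughly like this. It is only
    a regex-based sketch, not the tool described above: the function name is
    invented, and it only finds simple quoted literals passed to gettext() or
    its _() alias, not concatenated strings, heredocs or variables:

```php
<?php
/**
 * Collect the string literals passed to gettext() or _() in PHP source text.
 * Returns the raw literal contents (escape sequences are left as written),
 * de-duplicated and in order of first appearance.
 */
function extract_gettext_phrases(string $source): array
{
    // The lookbehind keeps names such as ngettext() or $obj->_() from matching.
    $pattern = '/(?<![\w$>:])(?:gettext|_)\s*\(\s*'
             . '(?:"((?:[^"\\\\]|\\\\.)*)"|\'((?:[^\'\\\\]|\\\\.)*)\')'
             . '\s*\)/';
    preg_match_all($pattern, $source, $matches, PREG_SET_ORDER);

    $phrases = [];
    foreach ($matches as $m) {
        // Group 1 holds double-quoted literals, group 2 single-quoted ones.
        $phrases[] = ($m[1] ?? '') !== '' ? $m[1] : ($m[2] ?? '');
    }
    return array_values(array_unique($phrases));
}
```

    Feeding each source file through file_get_contents() and this function
    gives the list of phrases to show in the translation table.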

    Hope this helps.
    --
    John
    ___________________________________________________________________
    John Murtari Software Workshop Inc.
    jmurtari@following domain 315.635-1968(x-211) "TheBook.Com" (TM)
    http://thebook.com/
    John Murtari, Jan 24, 2007
    #9
