multilingual code

Philipp Kraus

Hi,

I would like to make my code multilingual. Specifically, I would like to
select the language of the exception text at compile time.
Is there a toolbox for designing the language files, and how do I
include them in my code?

My first attempt was to put the language texts into an enum and select
the include with #ifdef.

Thanks

Phil
 
Thomas Jollans

GNU gettext, maybe?
 
James Kanze

Philipp Kraus writes:
In C++, this is handled by the localization library. The
general approach used in C and C++ is that application
messages are written in one language, usually English, and are
looked up in a message catalog for a different language. If
found, the translated message string gets returned, otherwise
the application's native message string is returned by
default, so that the original message gets used.

Except that the standard library interface in locale takes an
int as the message id, with a separate parameter for the
default in case the message isn't found:-(.

Your C++ implementation should have some tool for extracting
application messages from the C++ source. Your application
messages probably need to be marked up with some syntactic
sugar so that the message extraction tool picks them up. You
would then translate your application messages and create
a message catalog file, using your C++ implementation's tools.

Back in your C++ code, after instantiating a std::locale, you
use it to obtain a std::messages<char> or std::messages<wchar_t>
facet, then use the facet's open() to open a message catalog,
and get() to look up your application message in the catalog.

The C++ language standardizes only the message retrieval part,
the std::locale and std::messages facets. The creation of
message catalogs is out of scope and is implementation-defined.
On Unix and Linux, the GNU gettext package provides
the necessary tools. GNU gettext also offers an alternative
C API in place of the standard C++ classes. GNU gettext's
C API includes some nice enhanced functionality that's not
supported by the C++ library.
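
For reference, the open()/get() sequence described above might look roughly like the sketch below. The helper name lookup_message and its parameters are made up for illustration. What the catalog name, set and msgid actually mean (and whether set and msgid are used at all) depends on the implementation, so treat this as an outline rather than a recipe.

```cpp
#include <locale>
#include <string>

// Look up one message through the std::messages<char> facet of `loc`.
// `catalog_name`, `set` and `msgid` are hypothetical inputs; their
// interpretation is implementation-defined.
std::string lookup_message(const std::locale& loc,
                           const std::string& catalog_name,
                           int set, int msgid,
                           const std::string& fallback)
{
    typedef std::messages<char> messages_facet;
    const messages_facet& facet = std::use_facet<messages_facet>(loc);

    // open() returns a negative value if the catalog cannot be opened.
    messages_facet::catalog cat = facet.open(catalog_name, loc);
    if (cat < 0)
        return fallback;

    // get() returns `fallback` when the message is not in the catalog.
    std::string text = facet.get(cat, set, msgid, fallback);
    facet.close(cat);
    return text;
}
```

A call such as lookup_message(std::locale(""), "myapp", 1, 42, "File not found") would then return either the translated text or the English default, where "myapp", 1 and 42 are placeholders.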

I wonder how implementations map the int message identifier
to the string that the Unix interface expects.

In practice, it really doesn't matter, because just picking up
a different message is far from sufficient. In the old days,
people accepted messages like "1 Error(s) found", but today, one
rather expects something more "human". Which means respecting
the grammar when generating messages. And since each language
has a different grammar, you're pretty much stuck with writing
a separate dynamically linked module for each language, and
loading the appropriate one.

(A simple example of the sort of problems you'll encounter:
std::cerr << errorCount << " " << error[errorCount != 1] << " found";
works well in English, provided that error is initialized:
char const* error[] = { "error", "errors" };
In French, however, "found" becomes "trouvée". Or "trouvées",
depending on the count---suddenly found also needs a table
lookup. In German, found must come before the noun (in this
case, at least), and found changes depending on the number,
whereas error ("Fehler") doesn't. And in Russian, from what
I've been told, numbers like 21 or 31 also take a singular.)
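
To make the "one module per language" idea concrete, here is a minimal sketch with one formatting function per language. The function names and the select_formatter helper are invented for illustration; in a real program each function would live in its own dynamically loaded module. The French rule assumes that 0 and 1 take the singular.

```cpp
#include <string>

// English: only the noun depends on the count.
std::string errors_found_en(unsigned count)
{
    return std::to_string(count) + (count == 1 ? " error" : " errors") + " found";
}

// French: noun and participle both agree with the count.
// Assumption: 0 and 1 take the singular, everything else the plural.
std::string errors_found_fr(unsigned count)
{
    return std::to_string(count)
         + (count < 2 ? " erreur trouvée" : " erreurs trouvées");
}

typedef std::string (*error_formatter)(unsigned);

// Hypothetical selector standing in for "load the module for this language".
// Anything other than "fr" falls back to English.
error_formatter select_formatter(const std::string& language)
{
    return language == "fr" ? errors_found_fr : errors_found_en;
}
```

The point of the design is that the caller only passes the count; word order, agreement and the singular/plural rule stay inside the language-specific function.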
 
James Kanze

Yes, I was still thinking in gettext terms.
… and not to mention that gcc's std::messages facet is broken. It uses
global state. Instantiating two std::messages facets in different locales
will result in chaos.
Sadly, for the moment, it appears that std::messages is broken goods.

In general, I fear that there is no really workable portable
solution. There's Windows, and Unix, and it wouldn't surprise
me if Apple handled the issue differently than most of the other
Unixes (and if there were subtle variations between the Unixes,
despite Posix having standardized a good deal).
Recent GNU gettext libraries actually have a mechanism for
handling this, that is, having multiple plural forms and
selecting the correct one for a given numerical quantity.

And handling different genders. And different rules for
agreement in number and gender. Including for verbs. (Some
languages mark verbs with gender.)

A lot of people are making significant progress, but human
languages are extremely complex and varied. For the moment, if
you want linguistically correct output in various languages,
I think the only solution is to create a dynamically linked
object for each language, programming all of the special cases
by hand.
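
As an illustration of the plural-form selection discussed above, the sketch below maps a count to the index of the form to use, one function per language. The function names are invented, and the Russian rule is the three-form rule commonly quoted for that language, taken here as an assumption rather than checked against a grammar. It covers only number, not gender or verb agreement.

```cpp
// Each function maps a count to the index of the plural form to use,
// in the spirit of a gettext "Plural-Forms" rule.

// English-style rule: two forms, singular only for exactly 1.
unsigned plural_index_en(unsigned long n)
{
    return n != 1 ? 1 : 0;
}

// Russian-style rule (assumption: the commonly quoted three-form rule):
// 1, 21, 31, ... -> 0; 2-4, 22-24, ... -> 1; everything else -> 2.
unsigned plural_index_ru(unsigned long n)
{
    if (n % 10 == 1 && n % 100 != 11)
        return 0;
    if (n % 10 >= 2 && n % 10 <= 4 && (n % 100 < 10 || n % 100 >= 20))
        return 1;
    return 2;
}
```

The returned index would then select one of the translated strings for that language, for example from an array of two forms for English or three for Russian.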
 
