JKop said:
Pekka Jarvela posted:
I am using Visual Studio C++ .NET and when I try to print words with
umlaut letters, for instance
printf("Pässinpää-ääliö");
letters with dots over them, äö, will not be printed correctly on the
screen. I tried the trick
[...]
Windows 95 -> Windows ME were all ANSI, ie. 8-bit characters = 255
possible different characters. If you wanted foreign characters, eg.
Arabic, Chinese, then you had to install a different codepage. You'd
have to switch between codepages and could not display them both at once.
Not quite. Win9x uses multi-byte character sets in some locales, certainly
Chinese. So you can have more than 256 characters, but each character can
take more than one char.
Also, if you've got "Microsoft Layer for Unicode" installed, you can use
Unicode on Win9x.
All versions of Windows NT, including Windows 2000 were Unicode, ie.
16-Bit characters = 65,535 possible different characters.
With Windows XP came hope, all versions are Unicode, both home and
professional edition.
XP is NT. Internally, everything is done with UCS-2, but applications
compiled for ANSI still get ANSI of some flavor.
Practically, it doesn't matter too much for the application.
But still, here comes a bit of irony: On my system, WinXP
Professional, the following
MessageBoxA(blah,"€6.72",blah,blah); //ANSI version
works perfectly, ie. the euro sign _is_ displayed, but:
MessageBoxW(blah,L"€6.72",blah,blah); //Unicode version
does _not_ display the euro sign!!
This is because your source code is ANSI, so you're entering the euro
symbol using the Microsoft-specific code 128. In ANSI mode, Windows maps
that to the appropriate Unicode codepoint 0x20AC before displaying it.
I'd guess that in Unicode mode, the compiler naively maps that to Unicode
codepoint 0x0080, which is not the euro symbol.
Try using L"\x20AC" L"6.72" in the Unicode version. (The literal is split
so that the hex escape stops at 20AC instead of swallowing the 6.)
Umlauted characters _are_ included in ANSI, so I presume your problemo
may simply be that the umlauted characters are _not_ in the font
you're using. Try changing font.
Or not in the ANSI codepage you're using. Actually, in console windows,
it tends to use the OEM codepage, which will be distinct from any ANSI
codepage (in particular the one that the IDE is probably using).
I would recommend not using non-ASCII characters in source code, and in
console windows.
And if you just want to make it work now, look for a font that uses the
OEM codepage, like Terminal or Lucida ConsoleP (note the P), in your
editor. (or in charmap, since it may be hard to enter accented characters
in the OEM codepage)
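#include <cstdio>
int main()
{
    // OEM codepage bytes (octal 204 = a-umlaut, 224 = o-umlaut)
    std::printf("P\204ssinp\204\204-\204\204li\224\n");
    // ANSI "Windows: Western" bytes (octal 344 = a-umlaut, 366 = o-umlaut)
    std::printf("P\344ssinp\344\344-\344\344li\366\n");
    return 0;
}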
At the console window, the first line will be correct. Piped to a file
and opened in notepad, at least with the "Windows: Western" (almost ISO-
8859-1) codepage, the second line will be correct. (The second is
identical to what Pekka Jarvela posted.)
wprintf(L"P\344ssinp\344\344-\344\344li\366\n");
should look right when opened in Unicode-capable notepad, although you'd
need to make sure your stdout was Unicode. (I'm not really familiar with
wprintf...)
-josh