Umlaut letters in C++

Pekka Jarvela · Apr 28, 2004

I am using Visual Studio C++ .NET and when I try to print words with
umlaut letters, for instance

printf("Pässinpää-ääliö");

letters with dots over them, äö, will not be printed correctly on the
screen. I tried the trick

#ifdef _UNICODE
int wmain(void)
#else
int main(void)
#endif

but it didn't help. How can I get printf to produce umlaut letters
correctly?

Pekka

Stewart Gordon · Apr 28, 2004

Pekka said:
I am using Visual Studio C++ .NET and when I try to print words with
umlaut letters, for instance

printf("Pässinpää-ääliö");

letters with dots over them, äö, will not be printed correctly on the
screen.

<snip>

Assuming you mean it's printing 'different' characters, it's a character
set issue.

Windows uses the ANSI character set (with a few additions), at least
when it isn't using Unicode.

Your program is obviously running in a DOS window. DOS uses the IBM
character set (one of various versions thereof). So what you are
probably seeing is the IBM characters with the same codes as the ANSI
characters you're typing in your (presumably) Windows-based editor.

Look up the codes here:

http://www.i18nguy.com/unicode/codepages.html#ibmdos

Stewart.

JKop · Apr 28, 2004

Pekka Jarvela posted:

I am using Visual Studio C++ .NET and when I try to print words with
umlaut letters, for instance

printf("Pässinpää-ääliö");

letters with dots over them, äö, will not be printed correctly on the
screen. I tried the trick

#ifdef _UNICODE
int wmain(void)
#else
int main(void)
#endif

but it didn't help. How can I get printf to produce umlaut letters
correctly?

Pekka

Firstly,

Windows 95 -> Windows ME were all ANSI, i.e. 8-bit characters = 256 possible
different characters. If you wanted foreign characters, e.g. Arabic,
Chinese, then you had to install a different codepage. You had to switch
between codepages and could not display them both at once.

All versions of Windows NT, including Windows 2000, were Unicode, i.e. 16-bit
characters = 65,536 possible different characters.

With Windows XP came hope, all versions are Unicode, both home and
professional edition.

But still, here comes a bit of irony: On my system, WinXP Professional, the
following

MessageBoxA(blah,"€6.72",blah,blah); //ANSI version

works perfectly, ie. the euro sign _is_ displayed, but:

MessageBoxW(blah,L"€6.72",blah,blah); //Unicode version

does _not_ display the euro sign!!

josh · Apr 29, 2004

JKop said:
Pekka Jarvela posted:

I am using Visual Studio C++ .NET and when I try to print words with
umlaut letters, for instance

printf("Pässinpää-ääliö");

letters with dots over them, äö, will not be printed correctly on the
screen. I tried the trick

[...]
Windows 95 -> Windows ME were all ANSI, i.e. 8-bit characters = 256
possible different characters. If you wanted foreign characters, e.g.
Arabic, Chinese, then you had to install a different codepage. You had
to switch between codepages and could not display them both at once.

Not quite. Win9x use multi-byte character sets in some locales, certainly
Chinese. So you can have more than 256 characters, but each character can
take more than one char.

Also, if you've got "Microsoft Layer for Unicode" installed, you can use
Unicode on Win9x.

All versions of Windows NT, including Windows 2000, were Unicode, i.e.
16-bit characters = 65,536 possible different characters.

With Windows XP came hope, all versions are Unicode, both home and
professional edition.

XP is NT. Internally, everything is done with UCS-2, but applications
compiled for ANSI still get ANSI of some flavor.

Practically, it doesn't matter too much for the application.

But still, here comes a bit of irony: On my system, WinXP
Professional, the following

MessageBoxA(blah,"€6.72",blah,blah); //ANSI version

works perfectly, ie. the euro sign _is_ displayed, but:

MessageBoxW(blah,L"€6.72",blah,blah); //Unicode version

does _not_ display the euro sign!!

This is because your source code is ANSI, so you're entering the euro
symbol using the Microsoft-specific code 128. In ANSI mode, Windows maps
that to the appropriate Unicode codepoint 0x20AC before displaying it.
I'd guess that in Unicode mode, the compiler naively maps that to Unicode
codepoint 0x0080, which is not the euro symbol.

Try using '\x20AC' in the Unicode version.

Umlauted characters _are_ included in ANSI, so I presume your problemo
may simply be that the umlauted characters are _not_ in the font
you're using. Try changing font.

Or not in the ANSI codepage you're using. Actually, console windows
tend to use the OEM codepage, which will be distinct from any ANSI
codepage (in particular the one that the IDE is probably using).

I would recommend not using non-ASCII characters in source code, and in
console windows.

And if you just want to make it work now, look for a font that uses the
OEM codepage, like Terminal or Lucida ConsoleP (note the P), in your
editor. (or in charmap, since it may be hard to enter accented characters
in the OEM codepage)

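#include <cstdio>

int main()
{
    // OEM (console) codepage: \204 = a-umlaut, \224 = o-umlaut
    std::printf("P\204ssinp\204\204-\204\204li\224\n");
    // ANSI (Windows Western) codepage: \344 = a-umlaut, \366 = o-umlaut
    std::printf("P\344ssinp\344\344-\344\344li\366\n");
    return 0;
}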

At the console window, the first line will be correct. Piped to a file
and opened in notepad, at least with the "Windows: Western" (almost ISO-
8859-1) codepage, the second line will be correct. (The second is
identical to what Pekka Jarvela posted.)

wprintf(L"P\344ssinp\344\344-\344\344li\366\n");

should look right when opened in Unicode-capable notepad, although you'd
need to make sure your stdout was Unicode. (I'm not really familiar with
wprintf...)

-josh
