toupper and locale

gelbeiche

I have a question regarding the following
small C program.

#include <ctype.h>
#include <locale.h>
#include <stdio.h>

int main()
{
    char* loc = 0;
    char before, after;
    int i;

    loc = setlocale(LC_ALL, "de_DE.iso88591");
    printf("locale:%s\n", loc);
    before = 'ä';
    printf("decimal value:%d \n", before);
    printf("hex value:%x \n", before);
    printf("char value:%c \n", before);
    i = toupper(before);
    printf("i=%i\n", i);
    after = (char) i;
    printf("decimal value:%d \n", after);
    printf("hex value:%x \n", after);
    printf("after=%c\n", after);
}

On Linux the output of the program looks OK.
It converts the German umlaut 'ä' to the upper-case
'Ä'.
Output on Linux:
locale:de_DE.iso88591
decimal value:-28
hex value:ffffffe4
char value:ä
i=196
decimal value:-60
hex value:ffffffc4
after=Ä

But now look at the output on HP-UX (compiled
with HP-UX's C compiler, cc):
locale:de_DE.iso88591 de_DE.iso88591 de_DE.iso88591 de_DE.iso88591 de_DE.iso88591 de_DE.iso88591
decimal value:-28
hex value:ffffffe4
char value:ä
i=0
decimal value:0
hex value:0
after=

The locale is printed multiple times
and the character is not converted properly.

When I run `locale -a` on HP-UX, the de_DE.iso88591
locale is listed.

What is the reason for this misbehaviour?
 
Alexei A. Frounze

....
I recommend using Unicode, without any code pages or locales whatsoever.

Alex
 
Eric Sosman

gelbeiche said:
i = toupper(before);

This should be:

i = toupper((unsigned char)before);

gelbeiche said:
The locale is printed multiple times
and the character is not converted properly.
What is the reason for this misbehaviour?

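The character conversion is incorrect because the
argument to toupper() is incorrect. The argument to a
<ctype.h> function must be an int whose value is representable
as an `unsigned char`, or the negative value EOF; the behaviour
is undefined for any other argument. Your output shows that plain
char is signed on both systems, so 'ä' has the value -28, which
is neither.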
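
As a minimal sketch of that rule, the cast can be wrapped in a small helper so that it is not forgotten at each call site. The names upper_char and upper_string are hypothetical and not part of any library; the result still depends on the locale selected with setlocale().

```c
#include <ctype.h>

/* Hypothetical helper: upper-case one plain char.
 * The cast to unsigned char turns a negative char such as -28
 * into 228, a value that toupper() is defined for. */
char upper_char(char c)
{
    return (char)toupper((unsigned char)c);
}

/* Hypothetical helper: upper-case a NUL-terminated string in place. */
void upper_string(char *s)
{
    for (; *s != '\0'; ++s)
        *s = upper_char(*s);
}
```

In the original program, `after = upper_char(before);` would replace the toupper() call and the assignment through i.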

The locale string shows that the current locale is
made up of several components (LC_COLLATE, LC_CTYPE, ...),
and each component can be set independently to a different
set of conventions. The Linux library seems to recognize
the special case that all components are set the same way;
HP-UX appears to list each component separately. Try this
experiment:

setlocale(LC_ALL, "C");
setlocale(LC_MONETARY, "de_DE.iso88591");
printf ("Locale = %s\n", setlocale(LC_ALL, NULL));

... and I think you will see what is happening.
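
The format of the string returned for LC_ALL is implementation-defined, so a program that wants to know whether all categories agree may be better off querying each one separately. The following is only a sketch: the helper name all_categories_same and the 256-byte buffer are assumptions, and only the categories defined by standard C are checked.

```c
#include <locale.h>
#include <string.h>

/* Categories defined by standard C, apart from LC_ALL itself. */
static const int categories[] = {
    LC_COLLATE, LC_CTYPE, LC_MONETARY, LC_NUMERIC, LC_TIME
};

/* Hypothetical helper: returns 1 if every category currently
 * reports the same locale name, 0 otherwise. */
int all_categories_same(void)
{
    char first[256];  /* assumed large enough for a locale name */
    const char *name = setlocale(categories[0], NULL);
    size_t i;

    if (name == NULL || strlen(name) >= sizeof first)
        return 0;
    /* Copy the name: a later setlocale() call may overwrite it. */
    strcpy(first, name);

    for (i = 1; i < sizeof categories / sizeof categories[0]; ++i) {
        name = setlocale(categories[i], NULL);
        if (name == NULL || strcmp(name, first) != 0)
            return 0;
    }
    return 1;
}
```

After the two setlocale() calls in the experiment above, this function should return 0 if the German locale was installed successfully, because LC_MONETARY then differs from the other categories.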
 
gelbeiche

Eric Sosman said:
The character conversion is incorrect because the
argument to toupper() is incorrect. The argument to a
<ctype.h> function must be an `unsigned char` value or
the negative value EOF; the functions are not defined
for any other argument values.

Thanks for your explanation. The function works now.

Eric Sosman said:
The locale string shows that the current locale is
made up of several components (LC_COLLATE, LC_CTYPE, ...),
and each component can be set independently to a different
set of conventions. The Linux library seems to recognize
the special case that all components are set the same way;
HP-UX appears to list each component separately. Try this
experiment:

setlocale(LC_ALL, "C");
setlocale(LC_MONETARY, "de_DE.iso88591");
printf ("Locale = %s\n", setlocale(LC_ALL, NULL));

... and I think you will see what is happening.

Yes, I see. Thanks again.
 
