The C execution character set

Roger Leigh

This question pertains to both Standard C and platform-dependent
features, hence the crosspost.

I'm trying to understand exactly how the "execution character set"
works. On GNU/Linux, using GCC >= 3.4, if I compile a C source file
(any encoding), by default the execution character set is UTF-8, and
the wide execution character set is UTF-32.

What I want to understand is what the implications of this are on the
various operations I might want to perform on any of the strings. As
an example:

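#include <locale.h>
#include <stdio.h>
#include <wchar.h>

int
main (void)
{
  setlocale (LC_ALL, "");
  printf("‘Name1’\n");                  /* narrow -> narrow (stdout) */
  printf("%ls\n", L"‘Name2’");          /* wide -> narrow */
  fwide(stderr, 1);                     /* make stderr wide-oriented */
  fwprintf(stderr, L"‘Name3’\n");       /* wide -> wide (stderr) */
  fwprintf(stderr, L"%s\n", "‘Name4’"); /* narrow -> wide */
  printf("‘Name5’\n");                  /* narrow -> narrow (stdout) */
  return 0;
}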

If I run in a normal (UTF-8) locale:

$ ./test
‘Name1’
‘Name2’
‘Name3’
‘Name4’
‘Name5’

Now, running in a C locale:

$ ./test
'Name3'
‘Name1’
‘Name5’

"‘Name1’" and "‘Name5’" are the same. These passed through
byte-for-byte identical. No conversions took place, I think.

"‘Name2’" (wide→narrow) was lost. Why?
"‘Name4’" (narrow→wide) was lost. Why?

"‘Name3’" (wide→wide) was *not* lost. Moreover, it was
transliterated (UTF-32→US-ASCII) into a readable form for the locale.
Where does the conversion take place, and how does the C runtime know
what the source and destination charsets are? I can't replicate the
conversion with iconv(), so I'd like to know how to do it by hand.

I'd like to understand why each of these cases works the way it
does.


Thanks,
Roger

- --
Roger Leigh
Printing on GNU/Linux? http://gimp-print.sourceforge.net/
Debian GNU/Linux http://www.debian.org/
GPG Public Key: 0x25BFB848. Please sign and encrypt your mail.