Unicode is a very large character set in which each character has its
own index (its code point). The set contains well over a hundred
thousand characters. Unicode is a standard, not a character encoding.
There is also a standard named ISO 10646, which defines the same
character set. In theory ISO 10646 can address about two billion
characters (a 31-bit code space). The first 65 536 characters of
ISO 10646 are identical to those of the Unicode standard. The advantage
of Unicode and ISO 10646 is that they cover almost every character you
would ever need.
Non-Wide Characters - represented with char:
Many charsets (ISO 8859-1, ISO 8859-2, ...) contain only 256 characters,
so no single one of them can cover every language. Many applications
still cannot handle Unicode, however, so they use one of the
encodings/character representations available in the OS:
standardized charsets ISO 8859...
or windows-125X ...
or Mac x-mac-ce ... etc.
or UTF-8.
UTF? Yes, but apart from UTF-8 it is usually represented with wide
characters.
UTF-8 is a way of writing a character to a file: ASCII characters are
represented with one byte and all other characters are represented with
two to four bytes.
example: 11000011-10101101 (the character í, U+00ED)
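
The byte layout described above can be sketched in C++. This is only a minimal illustration, not a complete implementation: the function name encode_utf8 is hypothetical, and the sketch does not reject surrogate values. The two bytes in the example, 11000011-10101101, are 0xC3 0xAD, which is what the sketch produces for code point U+00ED.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: encode one Unicode code point as UTF-8 bytes.
std::string encode_utf8(char32_t cp) {
    assert(cp <= 0x10FFFF);  // largest valid Unicode code point
    std::string out;
    if (cp < 0x80) {                 // 1 byte:  0xxxxxxx (ASCII)
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {         // 2 bytes: 110xxxxx 10xxxxxx
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {       // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {                         // 4 bytes: 11110xxx + three 10xxxxxx
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}
```

Calling encode_utf8(0xED) returns a two-byte string, while encode_utf8(U'A') returns the single ASCII byte unchanged.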
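UTF-16: Most characters are represented with two bytes. Some two-byte
values have a special meaning: they are surrogates, and a pair of them
(four bytes) represents one character beyond the first 65 536.
example: 11101101-00000000 (the character í, U+00ED, low byte first)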
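
As a rough sketch of how UTF-16 maps a code point to 16-bit units, the following C++ may help. The function name encode_utf16 is hypothetical and byte order is left out: the sketch produces 16-bit units, and whether the low or the high byte is written first is decided when those units are stored.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: encode one code point as UTF-16 code units.
std::u16string encode_utf16(char32_t cp) {
    assert(cp <= 0x10FFFF);  // largest valid Unicode code point
    std::u16string out;
    if (cp < 0x10000) {
        // Fits into a single 16-bit unit (two bytes).
        out += static_cast<char16_t>(cp);
    } else {
        // Needs a surrogate pair (four bytes): 20 bits split 10 + 10.
        cp -= 0x10000;
        out += static_cast<char16_t>(0xD800 | (cp >> 10));    // high surrogate
        out += static_cast<char16_t>(0xDC00 | (cp & 0x3FF));  // low surrogate
    }
    return out;
}
```

For U+00ED the result is the single unit 0x00ED; for a code point such as U+1F600 it is the pair 0xD83D 0xDE00.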
To represent as many languages as possible, use wchar_t (one character)
and wstring (string). The w stands for wide characters. wchar_t is
__usually__ 4 bytes, which can hold every character in the Unicode
standard, but it can also be 2 bytes (as on Windows). To use these types
you have to use the streams for wide characters (wcout, wofstream, ...).
Please see std::locale and std::locale::facet. When using w-objects you
have to be sure about your current encoding/charset.
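
A small sketch of the wide types in use, assuming nothing beyond the standard library. The names make_greeting and wide_char_size are hypothetical. A wide string stream is used here because it stays in memory, so no locale conversion is involved.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical helper: build a wide string through a wide stream.
std::wstring make_greeting(const std::wstring& name) {
    std::wostringstream out;  // wide counterpart of std::ostringstream
    out << L"Hello, " << name << L'!';
    return out.str();
}

// sizeof(wchar_t) is implementation-defined: commonly 4 bytes on Linux
// and 2 bytes on Windows.
constexpr std::size_t wide_char_size = sizeof(wchar_t);
```

make_greeting(L"World") yields the wide string L"Hello, World!". Checking wide_char_size is one way to see which of the two sizes a given compiler uses.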
Usually we express text in programs with chars, and often that is
enough. Sometimes, though, we want to use a very different language that
is not covered by the available encoding (256 characters: windows-125X,
ISO 8859-...). Inside the program we can handle the text as Unicode
without trouble, but in C++ we usually still write it to a file in the
(non-Unicode) encoding available in our OS, because the std:: wide
streams convert wide characters to that narrow encoding on output. One
way around this is
http://www.codeproject.com/vcpp/stl/upgradingstlappstounicode.asp?print=true
another way is to use the C function fwrite:
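wchar_t myWString[] = L"Some strange characters.";
// myFile: a FILE* opened for binary writing, e.g. fopen(name, "wb")
// Subtract 1 so that the terminating L'\0' is not written to the file.
fwrite(myWString, sizeof(wchar_t), sizeof(myWString)/sizeof(wchar_t) - 1,
myFile );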
but it is not portable, because the size and byte order of wchar_t
differ between platforms.
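
One possible alternative, which is not one of the two ways mentioned above, is to convert the wide string to UTF-8 by hand and write the resulting bytes. The sketch below assumes that wchar_t holds either UTF-32 (4 bytes) or UTF-16 (2 bytes). The names to_utf8 and write_utf8_file are hypothetical, and invalid input such as an unpaired surrogate is not rejected.

```cpp
#include <cstddef>
#include <fstream>
#include <string>

// Hypothetical helper: convert a wide string to UTF-8.
// Assumes wchar_t holds UTF-32 (4 bytes) or UTF-16 (2 bytes).
std::string to_utf8(const std::wstring& ws) {
    std::string out;
    for (std::size_t i = 0; i < ws.size(); ++i) {
        unsigned long cp = static_cast<unsigned long>(ws[i]);
        // A UTF-16 surrogate pair encodes one code point above U+FFFF.
        if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < ws.size()) {
            unsigned long lo = static_cast<unsigned long>(ws[i + 1]);
            if (lo >= 0xDC00 && lo <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
                ++i;  // the low surrogate has been consumed
            }
        }
        if (cp < 0x80) {                 // 1 byte (ASCII)
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {         // 2 bytes
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {       // 3 bytes
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {                         // 4 bytes
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}

// Hypothetical helper: write the UTF-8 bytes to a file in binary mode.
// Returns true if the stream is still in a good state after writing.
bool write_utf8_file(const char* path, const std::wstring& text) {
    std::ofstream file(path, std::ios::binary);
    const std::string bytes = to_utf8(text);
    file.write(bytes.data(), static_cast<std::streamsize>(bytes.size()));
    return file.good();
}
```

Because UTF-8 is a sequence of single bytes, the file written this way does not depend on the size or byte order of wchar_t on the machine that produced it.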