How to create a UTF-16 text file with iostream?

Timothy Madden · Dec 20, 2009

Hello

I would like to export data from my application to CSV (comma separated
values) in Unicode because I have Asian characters in my text, but
wofstream writes all basic characters in plain ASCII!

That is, until it reaches an extended character; then I get an exception
(I set badbit to throw exceptions).

I use Visual Studio 2008 but I would like to use the standard C++
library for this task (and all other tasks).

By searching Google I could see I should explicitly define a null
conversion for my wide-character stream and then imbue it on my
wofstream but the code sample was quite large and when I used it my
application crashed.

Why does this have to be so complicated? Do I need to explicitly
define this null conversion? Do I need to define all those conversion
methods and know about locales and facets to write the file in UTF-16?

Is there no easy way in the standard for this?

How can I write binary data to the wofstream? I tried myFile.write()
but that takes a wofstream::char_type * pointer and still undergoes
the damn conversion that narrows the characters and throws on extended
ones...

Thank you,
Timothy Madden

AnonMail2005 · Dec 20, 2009

Use iconv to convert from one character set to another. It's
cross-platform and freely available (I'm not sure under which license).
Code your CSV reader/writer so that it is the only place where this
conversion takes place, meaning no other code knows or cares about the
conversion. Either have different CSV readers/writers for different
conversions, or supply a converter to the CSV reader/writer class as an
input to the constructor. In the latter case, you could set up a default
converter if one is not supplied.

I don't know about UTF-16 but we used this strategy in our C++ wrapper
of xml2. We converted ASCII to UTF-8 and it worked properly. No wide
character stuff. Others can chime in if the strategy works for UTF-16.

HTH

Stefan Ram · Dec 20, 2009

Timothy Madden said:
Is there no easy way in the standard for this?

Converting a sequence of Unicode 5.1.0 code points to a
sequence of octets using UTF-16 should not be so difficult
to write.

I do not know whether support for this is already in the
standard library of C++ or in boost, but see

http://site.icu-project.org/
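
As a rough illustration of the conversion described above, here is a minimal sketch that turns Unicode code points into UTF-16 code units, splitting anything above U+FFFF into a surrogate pair. The function name `encode_utf16` and the use of the C++11 types `std::u32string` and `std::u16string` are assumptions made for the example, not something from the thread. Turning the code units into octets additionally requires choosing a byte order.

```cpp
#include <stdexcept>
#include <string>

// Encode a sequence of Unicode code points as UTF-16 code units.
// Hypothetical helper for illustration; requires C++11 or later.
std::u16string encode_utf16(const std::u32string& code_points)
{
    std::u16string out;
    for (char32_t cp : code_points) {
        // Surrogates and values beyond U+10FFFF are not valid scalar values.
        if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            throw std::invalid_argument("not a Unicode scalar value");

        if (cp < 0x10000) {
            // Basic Multilingual Plane: one code unit, value unchanged.
            out.push_back(static_cast<char16_t>(cp));
        } else {
            // Supplementary planes: split the 20-bit offset into two halves.
            const char32_t v = cp - 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (v >> 10)));    // high surrogate
            out.push_back(static_cast<char16_t>(0xDC00 + (v & 0x3FF)));  // low surrogate
        }
    }
    return out;
}
```

For example, U+4E2D stays a single code unit 0x4E2D, while U+1F600 becomes the pair 0xD83D 0xDE00.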

Timothy Madden · Dec 20, 2009

Stefan said:
Converting a sequence of Unicode 5.1.0 code points to a
sequence of octets using UTF-16 should not be so difficult
to write.

I do not know whether support for this is already in the
standard library of C++ or in boost, but see

http://site.icu-project.org/

I got my hands on "C++ Standard Library: A tutorial and reference" by
Nicolai M. Josuttis (I have it in .chm) and then I saw the sample code
from the net might need some adjustments. Now my wofstream object writes
UTF-16 files with the corrected null codecvt facet on the imbued locale.

I can post the corrected facet if anyone is interested.

Thank you,
Timothy Madden
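
The corrected facet was never posted in this thread, so the following is only a guess at the general shape of a "null" codecvt facet, not Timothy's code. The class name `NullCodecvt` and the helper `write_raw_wide` are hypothetical. The facet copies each `wchar_t` to the file byte for byte, so the result is UTF-16 only on platforms where `wchar_t` is 16 bits wide and holds UTF-16 (as on Windows); elsewhere the file simply contains raw `wchar_t` values in native byte order. The signatures follow C++11.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <cwchar>
#include <fstream>
#include <locale>
#include <string>

// Sketch of a codecvt facet that performs no narrowing: every wchar_t is
// written to (and read from) the external byte sequence unchanged.
class NullCodecvt : public std::codecvt<wchar_t, char, std::mbstate_t> {
protected:
    result do_out(state_type&, const intern_type* from, const intern_type* from_end,
                  const intern_type*& from_next, extern_type* to, extern_type* to_end,
                  extern_type*& to_next) const override
    {
        const std::size_t in = static_cast<std::size_t>(from_end - from);
        const std::size_t room = static_cast<std::size_t>(to_end - to) / sizeof(intern_type);
        const std::size_t n = std::min(in, room);
        std::memcpy(to, from, n * sizeof(intern_type));
        from_next = from + n;
        to_next = to + n * sizeof(intern_type);
        return n == in ? ok : partial;
    }

    result do_in(state_type&, const extern_type* from, const extern_type* from_end,
                 const extern_type*& from_next, intern_type* to, intern_type* to_end,
                 intern_type*& to_next) const override
    {
        const std::size_t in = static_cast<std::size_t>(from_end - from) / sizeof(intern_type);
        const std::size_t room = static_cast<std::size_t>(to_end - to);
        const std::size_t n = std::min(in, room);
        std::memcpy(to, from, n * sizeof(intern_type));
        from_next = from + n * sizeof(intern_type);
        to_next = to + n;
        return from_next == from_end ? ok : partial;
    }

    result do_unshift(state_type&, extern_type* to, extern_type*,
                      extern_type*& to_next) const override
    {
        to_next = to;   // stateless encoding: nothing to flush
        return noconv;
    }

    int do_length(state_type&, const extern_type* from, const extern_type* end,
                  std::size_t max) const override
    {
        const std::size_t whole = static_cast<std::size_t>(end - from) / sizeof(intern_type);
        return static_cast<int>(std::min(whole, max) * sizeof(intern_type));
    }

    int do_encoding() const noexcept override { return sizeof(intern_type); }
    bool do_always_noconv() const noexcept override { return false; }
    int do_max_length() const noexcept override { return sizeof(intern_type); }
};

// Hypothetical helper: imbue the facet before opening, then write the text.
bool write_raw_wide(const char* path, const std::wstring& text)
{
    std::wofstream out;
    out.imbue(std::locale(std::locale::classic(), new NullCodecvt));  // locale owns the facet
    out.open(path, std::ios::binary);
    out << text;
    out.close();
    return !out.fail();
}
```

The facet is imbued before the file is opened, and the file is opened in binary mode so that no newline translation is applied. A byte order mark, if wanted, would have to be written as the first character (`L'\xFEFF'`) of the text.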

zindorsky · Dec 21, 2009

Juha said:
(or UTF-8, which is way more popular; why do you want precisely UTF-16?)

Probably because he's encoding Asian scripts. Codepoints in that range
require 3 bytes to encode in UTF-8, instead of the 2 UTF-16 requires.
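
The size difference mentioned above can be checked with a small sketch. The function names `utf8_length` and `utf16_length` are made up for this example; they return the number of bytes one code point occupies in each encoding and assume the argument is a valid Unicode scalar value.

```cpp
#include <cstddef>

// Bytes needed to encode one code point in UTF-8.
std::size_t utf8_length(char32_t cp)
{
    if (cp < 0x80) return 1;      // ASCII
    if (cp < 0x800) return 2;
    if (cp < 0x10000) return 3;   // rest of the BMP, including most CJK
    return 4;                     // supplementary planes
}

// Bytes needed to encode one code point in UTF-16.
std::size_t utf16_length(char32_t cp)
{
    return cp < 0x10000 ? 2 : 4;  // one code unit, or a surrogate pair
}
```

For U+4E2D this gives 3 bytes in UTF-8 and 2 in UTF-16, while an ASCII character such as a comma takes 1 byte in UTF-8 and 2 in UTF-16, so which encoding is smaller depends on the mix of characters in the file.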

Timothy Madden · Dec 21, 2009

Juha said:
Encoding unicode characters in UTF-16 (or UTF-8, which is way more
popular; why do you want precisely UTF-16?) is not that complicated, but
if you want to save yourself the work (there are a few gotchas,
especially if you want to fully support the entire unicode), you can use
a conversion library, such as this one:

http://utfcpp.sourceforge.net/

I would like to use UTF-16 so I can start my text file with the byte
order mark and then any text editor, and also M$ Excel, would know my
charset and encoding.

If I write a plain text file with a majority of ASCII characters but a
few Asian ones in UTF-8, then an editor would not know my encoding and
would misinterpret the few extended characters in my text.

And I already have my text as a stream of wchar_t, so what should I do
with a conversion library?

My problem was to convince the wofstream to write them as such in-file
on the disk (instead of trying to convert them to narrow characters,
which fails upon encountering any extended character).

Thank you,
Timothy Madden
