Reading Unicode text files

Wx

Hello.

I'm trying to read a text file written by the NTBackup utility on
Windows 2003 SBS. The problem is that when I print the output, it
looks like this:

S t a t o : b a c k u p
O p e r a z i o n e : b a c k u p
D e s t i n a z i o n e b a c k u p a t t i v o : F i l e
N o m e s u p p o r t o : " l u m e v e . b k f c r e a t o i
l 2 1 / 0 5 / 2 0 0 7 a l l e 2 3 . 0 0 "

As you can see, there is a space before every character. I know that
Unicode characters use two bytes, so... could the problem be related to
a different character set?

If I read a newly created text file, there is no problem.

This is the relevant portion of the code:


try {
    ifstream infile(strLogFile.c_str());

    if (infile.is_open()) {
        string line;
        while (getline(infile, line)) {
            cout << line << endl;
        }

        infile.close();

    } else {
        cerr << "Impossibile aprire " << strLogFile << endl;  // "Unable to open"
        return false;
    }

Please excuse my English.
Thanks

Wx
 
Alf P. Steinbach

* Wx:
I'm trying to read a text file written by the NTBackup utility on
Windows 2003 SBS. The problem is that when I print the output, it
looks like this:

S t a t o : b a c k u p
O p e r a z i o n e : b a c k u p
D e s t i n a z i o n e b a c k u p a t t i v o : F i l e
N o m e s u p p o r t o : " l u m e v e . b k f c r e a t o i
l 2 1 / 0 5 / 2 0 0 7 a l l e 2 3 . 0 0 "

As you can see, there is a space before every character. I know that
Unicode characters use two bytes, so... could the problem be related to
a different character set?

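Yes. The "spaces" are, at least before they end up in your program,
zero bytes. The file is most likely encoded as UTF-16, where each
character in the ASCII range is stored as two bytes, one of which is
zero.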
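
One way to confirm this is to look at the raw bytes. The following is only a minimal sketch: hexDump is a hypothetical helper name, not something from the original code, and it assumes the bytes have already been read into a std::string from a stream opened in binary mode.

```cpp
#include <cstddef>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: render a byte buffer as space-separated hex pairs.
std::string hexDump(const std::string& bytes) {
    std::ostringstream out;
    out << std::hex << std::uppercase << std::setfill('0');
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        if (i != 0) {
            out << ' ';
        }
        // Go through unsigned char so bytes >= 0x80 are not sign-extended.
        out << std::setw(2)
            << static_cast<unsigned>(static_cast<unsigned char>(bytes[i]));
    }
    return out.str();
}
```

If the log really is UTF-16 in little-endian byte order, the text "St" would show up as 53 00 74 00, possibly preceded by the byte order mark FF FE.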

If I read a newly created text file, there is no problem.

This is the relevant portion of the code:


try {
ifstream infile(strLogFile.c_str());

Well, switching to a wide character stream doesn't help by itself,
because by default such a stream simply converts to/from external
narrow character data.

What you can do is open the file in binary mode.

Then read the contents as binary data and treat them as a sequence of
wchar_t values (e.g., you can just store them in a std::wstring). This
works directly on Windows, where wchar_t is 16 bits wide.
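
A minimal sketch of that approach follows. It rests on assumptions the thread does not confirm: that the file is UTF-16 in little-endian byte order, possibly starting with the byte order mark FF FE. The names decodeUtf16le and readUtf16leFile are hypothetical. Surrogate pairs are copied through as individual code units, which is adequate where wchar_t is 16 bits but is not a full conversion elsewhere.

```cpp
#include <cstddef>
#include <fstream>
#include <iterator>
#include <string>

// Decode UTF-16 little-endian bytes into a std::wstring.
// Assumption: little-endian input; an optional FF FE byte order mark is skipped.
std::wstring decodeUtf16le(const std::string& bytes) {
    std::size_t pos = 0;
    if (bytes.size() >= 2 &&
        static_cast<unsigned char>(bytes[0]) == 0xFF &&
        static_cast<unsigned char>(bytes[1]) == 0xFE) {
        pos = 2;
    }

    std::wstring text;
    text.reserve((bytes.size() - pos) / 2);
    // Combine each byte pair into one 16-bit code unit; a trailing odd byte is ignored.
    for (; pos + 1 < bytes.size(); pos += 2) {
        const unsigned lo = static_cast<unsigned char>(bytes[pos]);
        const unsigned hi = static_cast<unsigned char>(bytes[pos + 1]);
        text.push_back(static_cast<wchar_t>(lo | (hi << 8)));
    }
    return text;
}

// Read a whole file in binary mode and decode it.
// Returns false if the file cannot be opened.
bool readUtf16leFile(const std::string& path, std::wstring& text) {
    std::ifstream infile(path.c_str(), std::ios::binary);
    if (!infile.is_open()) {
        return false;
    }
    const std::string bytes((std::istreambuf_iterator<char>(infile)),
                            std::istreambuf_iterator<char>());
    text = decodeUtf16le(bytes);
    return true;
}
```

With the original code, a call such as readUtf16leFile(strLogFile, text) would take the place of the getline loop. Splitting the resulting wide string into lines and printing it are left out here.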

Essentially this means implementing the machinery that the standard
library provides for narrow character streams. Alternatively, you can
buy an existing implementation, or look for one on the net (though I
doubt you'll find one). I think Dinkumware offers such an
implementation.

Note that handling wchar_t in Windows leads you into compiler-specific
territory, since e.g. MinGW g++ 3.4.4 doesn't support wide character
streams.
 
