Reading unicode text files

Discussion in 'C++' started by Wx, May 22, 2007.

  1. Wx

    Wx Guest

    Hello.

    I'm trying to read a text file written by the NTBackup utility on
    Windows 2003 SBS. The problem is that when I print the output, it
    looks like this:

    S t a t o : b a c k u p
    O p e r a z i o n e : b a c k u p
    D e s t i n a z i o n e b a c k u p a t t i v o : F i l e
    N o m e s u p p o r t o : " l u m e v e . b k f c r e a t o i
    l 2 1 / 0 5 / 2 0 0 7 a l l e 2 3 . 0 0 "

    As you can see, there is a space before every character. I know that
    Unicode characters use two bytes, so... could the problem be related to
    a different charset?

    If I try to read a new text file, there is no problem.

    This is the relevant portion of the code:


    try {
        ifstream infile(strLogFile.c_str());

        if (infile.is_open()) {
            string line;
            // Read the log line by line and echo it to the console.
            while (getline(infile, line)) {
                cout << line << endl;
            }

            infile.close();

        } else {
            // "Impossibile aprire" means "Cannot open".
            cerr << "Impossibile aprire " << strLogFile << endl;
            return false;
        }

    Excuse my English.
    Thanks

    Wx
     
    Wx, May 22, 2007
    #1

  2. * Wx:
    >
    > I'm trying to read a text file written by the NTBackup utility on
    > Windows 2003 SBS. The problem is that when I print the output, it
    > looks like this:
    >
    > S t a t o : b a c k u p
    > O p e r a z i o n e : b a c k u p
    > D e s t i n a z i o n e b a c k u p a t t i v o : F i l e
    > N o m e s u p p o r t o : " l u m e v e . b k f c r e a t o i
    > l 2 1 / 0 5 / 2 0 0 7 a l l e 2 3 . 0 0 "
    >
    > As you can see, there is a space before every character. I know that
    > Unicode characters use two bytes, so... could the problem be related to
    > a different charset?


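    Yes. The "spaces" are, at least before they end up in your program,
    zero bytes: the log is most likely UTF-16, where each ASCII character
    occupies two bytes and one of them is zero.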
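
    To make the byte layout concrete, here is a minimal sketch. It assumes
    the log is UTF-16 little-endian, which the post does not actually
    state, and both helper names are made up for illustration. Whether a
    console shows a zero byte as a space depends on the console.

```cpp
#include <cstddef>
#include <string>

// Encode plain ASCII text the way a UTF-16 little-endian file stores it:
// each character becomes its ASCII byte followed by a zero byte.
// (encodeAsciiAsUtf16Le is a hypothetical helper, not a library function.)
std::string encodeAsciiAsUtf16Le(const std::string& ascii) {
    std::string bytes;
    for (std::size_t i = 0; i < ascii.size(); ++i) {
        bytes.push_back(ascii[i]);  // low byte: the ASCII code
        bytes.push_back('\0');      // high byte: zero for ASCII
    }
    return bytes;
}

// Replace zero bytes with spaces to mimic how a console may display them.
// (showZerosAsSpaces is likewise a hypothetical name.)
std::string showZerosAsSpaces(std::string bytes) {
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        if (bytes[i] == '\0') bytes[i] = ' ';
    }
    return bytes;
}
```

    Passing "Stato" through both helpers yields the letters separated by
    spaces, much like the output quoted above.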


    > If I try to read a new textfile, there are no problem.
    >
    > This is the relevant portion of the code:
    >
    >
    > try {
    > ifstream infile(strLogFile.c_str());


    Well, switching to a wide character stream doesn't help by itself,
    because by default wide streams simply convert to/from external narrow
    character data.

    What you can do is open the file in binary mode.

    Then read the contents as binary data and treat them as a sequence of
    wchar_t values (e.g., you can just store them in a std::wstring).
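
    A rough sketch of that approach follows. It assumes the file is
    UTF-16 little-endian, possibly starting with a byte order mark; the
    function names readAllBytes and decodeUtf16Le are invented here. The
    bytes are combined explicitly rather than cast, so the result does not
    depend on the size of wchar_t (16 bits on Windows, often 32 elsewhere).
    Surrogate pairs are stored as two separate code units, unchanged.

```cpp
#include <cstddef>
#include <fstream>
#include <iterator>
#include <string>

// Read the whole file as raw bytes, with no newline or charset translation.
// Returns an empty string if the file cannot be opened.
std::string readAllBytes(const std::string& path) {
    std::ifstream in(path.c_str(), std::ios::binary);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}

// Combine byte pairs into 16-bit code units, assuming little-endian order.
// A leading byte order mark (FF FE) is skipped; a trailing odd byte is ignored.
std::wstring decodeUtf16Le(const std::string& bytes) {
    std::size_t i = 0;
    if (bytes.size() >= 2 &&
        static_cast<unsigned char>(bytes[0]) == 0xFF &&
        static_cast<unsigned char>(bytes[1]) == 0xFE) {
        i = 2;
    }
    std::wstring text;
    for (; i + 1 < bytes.size(); i += 2) {
        unsigned lo = static_cast<unsigned char>(bytes[i]);
        unsigned hi = static_cast<unsigned char>(bytes[i + 1]);
        text.push_back(static_cast<wchar_t>(lo | (hi << 8)));
    }
    return text;
}
```

    With these, something like decodeUtf16Le(readAllBytes(strLogFile))
    would give the log as a std::wstring.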

    Essentially this means implementing the machinery that the standard
    library provides for narrow character streams. Alternatively, you can
    buy an existing implementation or look for a free one on the net
    (though I doubt you'll find one). I think Dinkumware offers such an
    implementation.
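
    As one small piece of that machinery, a getline-like split over
    already decoded text might look like the sketch below. The name
    splitLines is hypothetical, and the code assumes the text is held in a
    std::wstring and that lines end in either "\r\n" or "\n".

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Split decoded text into lines, accepting both "\r\n" and "\n" endings.
// A trailing newline does not produce an extra empty line.
// (splitLines is a hypothetical helper name.)
std::vector<std::wstring> splitLines(const std::wstring& text) {
    std::vector<std::wstring> lines;
    std::size_t start = 0;
    while (start < text.size()) {
        std::size_t end = text.find(L'\n', start);
        if (end == std::wstring::npos) end = text.size();
        std::size_t len = end - start;
        if (len > 0 && text[end - 1] == L'\r') --len;  // drop the CR of CRLF
        lines.push_back(text.substr(start, len));
        start = end + 1;
    }
    return lines;
}
```

    Printing the resulting lines still requires a wide output stream or a
    conversion back to a narrow encoding, which is compiler-specific.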

    Note that handling wchar_t in Windows leads you into compiler-specific
    territory, since e.g. MinGW g++ 3.4.4 doesn't support wide character
    streams.


    --
    A: Because it messes up the order in which people normally read text.
    Q: Why is it such a bad thing?
    A: Top-posting.
    Q: What is the most annoying thing on usenet and in e-mail?
     
    Alf P. Steinbach, May 22, 2007
    #2