istream altering text

Discussion in 'C++' started by dohboy, Mar 1, 2008.

  1. dohboy

    dohboy Guest

    A kinda newbie here.
    I've done a simple little program that reads a text file and counts the number of lines and words.
    I had a heck of a time getting it to count properly, until I finally discovered the problem was not my coding but the istream altering the incoming text.
    What I was doing was checking each incoming character (seekg) and comparing it to hex '0a'. What I found was that text files end their lines with a '0d' (CR) and a '0a' (line feed). However, both were being read off the istream as '0a'. It had changed the CR.
    My questions are: are there any other little istream quirks like this I should be aware of? And is there some way to set the stream to not alter what is read?
    TIA
    -doh
    dohboy, Mar 1, 2008
    #1

  2. Alf P. Steinbach

    * dohboy:
    > a kinda newbie here.
    > I've done a simple little program that reads a text file and counts the number
    > of lines and words.
    > I had a heck of a time getting it to count properly when I finally discovered
    > the problem was not my coding, but the istream altering the incoming text.
    > What I was doing was checking each incoming character (seekg) and comparing it
    > to a 'h0a' . What I found was that text files end their lines with a '0d' (CR)
    > and a '0a' (line feed). However, it was reading them off the istream as both
    > being '0a'. It had changed the CR.
    > My questions are, Is there any other little istream quirks like this I should be
    > aware of? And is there some way to set the stream to not alter what is read?


    By default text streams translate the OS convention for newline into
    '\n' on input, and vice versa, translate '\n' to OS convention on output.

    Since you're counting lines it's probably best to work with that feature
    instead of trying to turn it off.
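
    As a rough illustration of working with the translation rather than against it, here is a minimal sketch of a line and word counter. The names `Counts` and `count_lines_and_words` are made up for this example and do not come from the original program. It assumes the stream was opened in the default text mode, so each line ending, whatever the OS convention, has already arrived as a single '\n'.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical result type for this sketch.
struct Counts {
    std::size_t lines = 0;
    std::size_t words = 0;
};

// Counts lines and whitespace-separated words from any input stream.
// A final line without a trailing '\n' still counts as a line.
Counts count_lines_and_words(std::istream& in) {
    Counts result;
    std::string line;
    while (std::getline(in, line)) {     // getline consumes the '\n' itself
        ++result.lines;
        std::istringstream fields(line);
        std::string word;
        while (fields >> word) {         // operator>> skips leading whitespace
            ++result.words;
        }
    }
    return result;
}
```

    Passing a `std::ifstream` opened without `std::ios_base::binary` should then give the same counts for the same text, whichever newline convention the platform uses on disk.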


    Cheers, & hth.,

    - Alf

    --
    A: Because it messes up the order in which people normally read text.
    Q: Why is it such a bad thing?
    A: Top-posting.
    Q: What is the most annoying thing on usenet and in e-mail?
    Alf P. Steinbach, Mar 1, 2008
    #2

  3. Ian Collins

    Ian Collins Guest

    dohboy wrote:
    > a kinda newbie here.
    > I've done a simple little program that reads a text file and counts the number of lines and words.
    > I had a heck of a time getting it to count properly when I finally discovered the problem was not my coding, but the istream altering the incoming text.
    > What I was doing was checking each incoming character (seekg) and comparing it to a 'h0a' . What I found was that text files end their lines with a '0d' (CR) and a '0a' (line feed). However, it was reading them off the istream as both being '0a'. It had changed the CR.
    > My questions are, Is there any other little istream quirks like this I should be aware of? And is there some way to set the stream to not alter what is read?


    It's more of a normalisation than a quirk. It saves the programmer from
    tedious platform-specific conversion code.

    I guess another is the use of the eof flag to hide platform-specific
    file endings.

    --
    Ian Collins.
    Ian Collins, Mar 2, 2008
    #3
  4. Ron AF Greve

    Ron AF Greve Guest

    Hi,

    Didn't test the following code. But you might want to use something like
    this:

    #include <algorithm>
    #include <fstream>
    #include <iostream>
    #include <string>

    int main()
    {
        const std::string Filename = "input.txt";  // placeholder file name

        // Binary mode switches off the newline translation.
        std::ifstream Input( Filename.c_str(), std::ios_base::binary );
        if( !Input )
        {
            std::cerr << "Cannot open " << Filename << '\n';
            return 1;
        }

        // So that files are read the same on whatever system, get rid of
        // any carriage returns (char 13) in each line.
        std::string Line;
        while( std::getline( Input, Line ) )
        {
            Line.erase( std::remove( Line.begin(), Line.end(), '\r' ), Line.end() );

            // From here on Line is the same on unix and ms-windows
        }
    }


    Regards, Ron AF Greve

    http://www.InformationSuperHighway.eu

    Ron AF Greve, Mar 2, 2008
    #4
