White space and >>

Discussion in 'C++' started by Grumble, Nov 13, 2003.

  1. Grumble

    Grumble Guest

    Hello all,

    [ Disclaimer: I am a complete C++ newbie ]

    I want to read lines from a text file, where each line has the
    following syntax:

    token1:token2:token3

    There could be white space between tokens and ':'
    There could be white space before token1 or after token3.

    Because I will need to access every line several times, later in my
    program, I first store every line in a string vector:

    // Do you guys put the & near the type or near the parameter name?
    static void read_lines(vector<string> &v)
    {
    ifstream ifs(INFILE); // input file stream

    if (!ifs) // the stream failed to open
    {
    cerr << "Unable to open input file " << INFILE << ".\n";
    exit(-1);
    }

    string line;

    while (getline(ifs, line))
    {
    // Ignore empty lines and comments.
    if (line.empty() || line[0]==HASH) continue;

    v.push_back(line);
    }
    }

    Does that part look OK?

    Later on, when I am dealing with a specific line, I create a
    stringstream object so I can use the >> operator.

    Ideally, I would simply write:

    {
    istringstream myss(mystring);
    string token1, token2, token3;

    myss >> token1;
    myss >> token2;
    myss >> token3;
    }

    But this doesn't work because ':' is not treated as white space. Is
    there a simple solution?

    Is my approach completely wrong?

    Nudge
    Grumble, Nov 13, 2003
    #1

  2. Grumble

    Grumble Guest

    Grumble wrote:
    > Hello all,
    >
    > [ Disclaimer: I am a complete C++ newbie ]
    >
    > I want to read lines from a text file, where each line has the following
    > syntax:
    >
    > token1:token2:token3
    >
    > There could be white space between tokens and ':'
    > There could be white space before token1 or after token3.


    I forgot to mention that it is valid for token1 to be empty, but it
    is not valid for token2 and token3 to be empty.

    I see a problem. Consider

    : t2 : t3

    myss >> token1;
    myss >> token2;
    myss >> token3;

    If the >> operator considers ':' to be white space, then I will end
    up with token1 = "t2" which is not what I want...

    On the other hand, consider

    t1:t2:t3

    If ':' is not treated as white space, or perhaps some kind of
    special delimiter, then I will end up with token1="t1:t2:t3" which
    is wrong too...

    Errr, how can I get the "ignore white space" behavior, along with
    the "split at the delimiter" behavior together?

    Nudge
    Grumble, Nov 13, 2003
    #2

  3. Grumble wrote:
    >
    > If ':' is not treated as white space, or perhaps some kind of
    > special delimiter, then I will end up with token1="t1:t2:t3" which
    > is wrong too...
    >
    > Errr, how can I get the "ignore white space" behavior, along with
    > the "split at the delimiter" behavior together?


    I think you are barking up the wrong tree.

    Take your string.

    Locate the 2 ':' characters.

    Split the string into 3 separate strings using the ':' positions
    you have determined earlier.

    You now have 3 strings, each one containing maybe some
    leading whitespace, the token, maybe some trailing whitespace.

    Get rid of leading and trailing whitespace in each string
    and you are left with the tokens alone.

    Not every problem is worth solving with clever uses of streams.
    Sometimes simple string manipulation is simpler.

    --
    Karl Heinz Buchegger
    Karl Heinz Buchegger, Nov 13, 2003
    #3
  4. Grumble

    Grumble Guest

    Karl Heinz Buchegger wrote:
    >
    > Grumble wrote:
    >
    >>If ':' is not treated as white space, or perhaps some kind of
    >>special delimiter, then I will end up with token1="t1:t2:t3" which
    >>is wrong too...
    >>
    >>Errr, how can I get the "ignore white space" behavior, along with
    >>the "split at the delimiter" behavior together?

    >
    >
    > I think you are barking up the wrong tree.
    >
    > Take your string.
    >
    > Locate the 2 ':' characters.
    >
    > Split the string into 3 separate strings using the ':' positions
    > you have determined earlier.
    >
    > You now have 3 strings, each one containing maybe some
    > leading whitespace, the token, maybe some trailing whitespace.
    >
    > Get rid of leading and trailing whitespace in each string
    > and you are left with the tokens alone.
    >
    > Not every problem is worth solving with clever uses of streams.
    > Sometimes simple string manipulation is simpler.


    How disappointing :)

    What you describe is what I have done, but I was hoping for a
    shorter solution (in terms of lines of code).


    void extract_field(string &field, string &line, size_t lpos, size_t rpos)
    {
    string temp = line.substr(lpos, rpos-lpos);

    lpos = temp.find_first_not_of(WHITESPACE);
    rpos = temp.find_first_of(WHITESPACE, lpos);

    if (lpos == string::npos) // temp contains only white space.
    {
    field.erase();
    }
    else
    {
    field = temp.substr(lpos, rpos-lpos);
    }
    }

    {
    string opt_name, opt_type, opt_val;

    size_t lpos = 0, rpos; // left and right position.

    // Extract option name from line and strip white space.
    rpos = line.find_first_of(COLON, lpos);
    extract_field(opt_name, line, lpos, rpos);
    lpos = rpos+1;

    // Extract option type from line and strip white space.
    rpos = line.find_first_of(COLON, lpos);
    extract_field(opt_type, line, lpos, rpos);
    lpos = rpos+1;

    // Extract option value list from line.
    opt_val = line.substr(lpos);
    }

    IMO, the above is far less elegant than:

    myss >> opt_name;
    myss >> opt_type;
    myss >> opt_val;
    // modulo error handling of course

    I might use getline() to split my line into 3 strings... then use an
    istringstream to strip leading and trailing white space...

    I have a related question: at some point I have a string, and I want
    to concatenate an int at the end.

    string s("toto");
    int n=7;

    s = s + n; // It would be nice if this resulted in s = "toto7" :)

    Am I supposed to use C's sprintf? A stringstream?

    Nudge
    Grumble, Nov 13, 2003
    #4
  5. Hi Grumble,

    "Grumble" <> wrote in message
    news:bp0dck$l7s$...
    > How disappointing :)
    >
    > I was hoping for a
    > shorter solution (in terms of lines of code).


    You could take an in-depth look at the C++ stream library. There are ways
    to do it, if you really want to. ;-)

    If it's not a life-or-death matter to do it in an object-oriented way, or
    if you want to keep it short, reading a single text line using "cin"
    followed by a sscanf() on the input buffer might be shorter than writing
    classes for sorting out stream input.

    The best solution would be using a class for regular expressions (perhaps
    with streams support).

    Your problem could be parsed by a regular expression like
    "/\w+[ \t]*:[ \t]*\w+[ \t]*:[ \t]*\w+/", which means "one or more word
    characters followed by zero or more blank or tab characters, followed by
    a colon, followed by ... etc."

    Languages like Perl or PHP have regular expression support at the language
    or library level, and I'm sure there's a regexp library for C++ as well. :)
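
    For what it's worth, the standard library has shipped <regex> since
    C++11, which postdates this thread. A minimal sketch of the idea,
    assuming tokens consist only of word characters; it uses \w* for the
    first token because that one is allowed to be empty. The name
    match_line is made up for illustration:

```cpp
#include <regex>
#include <string>

// Match "token1 : token2 : token3", allowing blanks or tabs around each token.
// token1 may be empty (\w*); token2 and token3 must not be (\w+).
// Assumption: tokens contain only word characters (letters, digits, '_').
bool match_line(const std::string& line,
                std::string& t1, std::string& t2, std::string& t3)
{
    static const std::regex pattern(
        R"([ \t]*(\w*)[ \t]*:[ \t]*(\w+)[ \t]*:[ \t]*(\w+)[ \t]*)");

    std::smatch m;
    if (!std::regex_match(line, m, pattern))   // the whole line must match
        return false;

    t1 = m[1].str();
    t2 = m[2].str();
    t3 = m[3].str();
    return true;
}
```

    If the third field is really a list of values, the last \w+ would
    have to be loosened accordingly.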

    I hope that helps.

    Regards,
    Ekkehard Morgenstern.
    Ekkehard Morgenstern, Nov 13, 2003
    #5
  6. Jon Bell

    Jon Bell Guest

    In article <bp0602$inc$>,
    Grumble <> wrote:
    >
    >I want to read lines from a text file, where each line has the
    >following syntax:
    >
    >token1:token2:token3


    [snip code that reads the file into a vector of strings, one line per
    string]

    >Does that part look OK?


    Looks OK to me.

    >Later on, when I am dealing with a specific line, I create a
    >stringstream object so I can use the >> operator.
    >
    >Ideally, I would simply write:
    >
    >{
    > istringstream myss(mystring);
    > string token1, token2, token3;
    >
    > myss >> token1;
    > myss >> token2;
    > myss >> token3;
    >}
    >
    >But this doesn't work because ':' is not treated as white space. Is
    >there a simple solution?


    Use getline() on myss, and tell it to use ':' as the separator, where
    appropriate.

    getline (myss, token1, ':');
    getline (myss, token2, ':');
    getline (myss, token3);

    The tokens you pick up will also include whatever whitespace happens to
    lie in between the colons.
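
    One way to get rid of that whitespace, along the lines Grumble
    suggested earlier, is to re-read each piece through its own
    istringstream. A rough sketch, assuming a token never contains
    whitespace itself; first_word and parse_line are illustrative names,
    not something from the thread:

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: return the first whitespace-delimited word of
// `field`, or an empty string if `field` is empty or all white space.
std::string first_word(const std::string& field)
{
    std::istringstream iss(field);
    std::string word;   // stays empty if the extraction fails
    iss >> word;        // operator>> skips leading white space
    return word;
}

// Split `line` at ':' with getline(), then strip white space from each part.
// Returns false if a colon is missing or if token2 or token3 is empty.
bool parse_line(const std::string& line,
                std::string& token1, std::string& token2, std::string& token3)
{
    std::istringstream myss(line);
    std::string f1, f2, f3;

    // A delimited read that runs into end-of-input sets eofbit, which is
    // how a missing ':' is detected here.
    if (!std::getline(myss, f1, ':') || myss.eof()) return false;
    if (!std::getline(myss, f2, ':') || myss.eof()) return false;
    std::getline(myss, f3);   // rest of the line; f3 is empty if nothing is left

    token1 = first_word(f1);  // allowed to be empty
    token2 = first_word(f2);
    token3 = first_word(f3);
    return !token2.empty() && !token3.empty();
}
```

    With the line " : t2 : t3 " this leaves token1 empty and yields "t2"
    and "t3" for the other two.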

    --
    Jon Bell <> Presbyterian College
    Dept. of Physics and Computer Science Clinton, South Carolina USA
    Jon Bell, Nov 13, 2003
    #6
  7. Grumble wrote:
    >
    > Karl Heinz Buchegger wrote:
    > >
    > > Grumble wrote:
    > >
    > >>If ':' is not treated as white space, or perhaps some kind of
    > >>special delimiter, then I will end up with token1="t1:t2:t3" which
    > >>is wrong too...
    > >>
    > >>Errr, how can I get the "ignore white space" behavior, along with
    > >>the "split at the delimiter" behavior together?

    > >
    > >
    > > I think you are barking up the wrong tree.
    > >
    > > Take your string.
    > >
    > > Locate the 2 ':' characters.
    > >
    > > Split the string into 3 separate strings using the ':' positions
    > > you have determined earlier.
    > >
    > > You now have 3 strings, each one containing maybe some
    > > leading whitespace, the token, maybe some trailing whitespace.
    > >
    > > Get rid of leading and trailing whitespace in each string
    > > and you are left with the tokens alone.
    > >
    > > Not every problem is worth solving with clever uses of streams.
    > > Sometimes simple string manipulation is simpler.

    >
    > How disappointing :)


    Depends :)

    >
    > What you describe is what I have done, but I was hoping for a
    > shorter solution (in terms of lines of code).
    >
    > void extract_field(string &field, string &line, size_t lpos, size_t rpos)
    > {
    > string temp = line.substr(lpos, rpos-lpos);
    >
    > lpos = temp.find_first_not_of(WHITESPACE);
    > rpos = temp.find_first_of(WHITESPACE, lpos);
    >
    > if (lpos == string::npos) // temp contains only white space.
    > {
    > field.erase();
    > }
    > else
    > {
    > field = temp.substr(lpos, rpos-lpos);
    > }
    > }
    >


    I would refactor the above into 2 functions:

    A function TrimWhitespace
    and a function ExtractField (which uses TrimWhitespace)

    The reason?
    A function for trimming a string is a good thing to have in your
    toolbox and will come in handy hundreds of times.

    And the function has gotten shorter and your toolbox has grown
    by one additional function :)
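
    As a sketch of what that split might look like (the signatures and
    the set of whitespace characters are assumptions, not Grumble's
    original code). Note that it trims with find_last_not_of, so a field
    with interior white space is kept whole instead of being cut at the
    first blank:

```cpp
#include <string>

// Remove leading and trailing white space from `s`.
// Which characters count as white space is an assumption here.
std::string TrimWhitespace(const std::string& s)
{
    static const char* const kWhitespace = " \t\r\n";
    const std::string::size_type first = s.find_first_not_of(kWhitespace);
    if (first == std::string::npos)
        return std::string();               // empty or only white space
    const std::string::size_type last = s.find_last_not_of(kWhitespace);
    return s.substr(first, last - first + 1);
}

// Return the trimmed text of `line` in the half-open range [lpos, rpos).
// Passing std::string::npos as rpos means "up to the end of the line".
std::string ExtractField(const std::string& line,
                         std::string::size_type lpos,
                         std::string::size_type rpos)
{
    const std::string::size_type count =
        (rpos == std::string::npos) ? std::string::npos : rpos - lpos;
    return TrimWhitespace(line.substr(lpos, count));
}
```

    The calling code stays the same shape as before: find a colon, call
    ExtractField(line, lpos, rpos), then move lpos to rpos+1.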

    [snip]

    >
    > I might use getline() to split my line into 3 strings...


    OK

    > then use an
    > istringstream to strip leading and trailing white space...


    Or use your new function TrimWhitespace() from your personal
    toolbox :)
    A good programmer has collected a bag of little helper functions
    like this one over the years.

    >
    > I have a related question: at some point I have a string, and I want
    > to concatenate an int at the end.
    >
    > string s("toto");
    > int n=7;
    >
    > s = s + n; // It would be nice if this resulted in s = "toto7" :)
    >
    > Am I supposed to use C's sprintf? A stringstream?


    stringstream.
    You might also look at Boost for its lexical_cast.
    www.boost.org
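
    A minimal sketch of the stringstream route; append_int is just an
    illustrative name:

```cpp
#include <sstream>
#include <string>

// Return `s` with the decimal text of `n` appended.
std::string append_int(const std::string& s, int n)
{
    std::ostringstream oss;
    oss << s << n;      // operator<< formats the int as text
    return oss.str();
}
```

    With this, s = append_int(s, n) turns "toto" and 7 into "toto7".
    boost::lexical_cast<std::string>(n) does the conversion in a single
    expression, and C++11 later added std::to_string(n) for the same job.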

    --
    Karl Heinz Buchegger
    Karl Heinz Buchegger, Nov 14, 2003
    #7
