Beginners: Count occurrences of a string within a string

Discussion in 'C++' started by yogi_bear_79, Feb 27, 2008.

  1. yogi_bear_79

    yogi_bear_79 Guest

    I'm sure I have a few things wrong here, but I am stuck on how to do
    a repeated search. Also, my statement cin >> quote; acts weird. If I
    enter more than one word, it blows right past cin >> findMe; and
    runs to completion and exits. If the string entered for cin >> quote;
    is one word, it behaves correctly, at least in that regard!


    #include <iostream>
    #include <string>
    using namespace std;

    string quote;
    string findMe;
    int foundIt(string str1, string str2);

    int main ()
    {
        cout << " \n Enter a sentence or two:" << endl;
        cin >> quote;
        cout << " \n Enter string to search for:" << endl;
        cin >> findMe;

        cout << foundIt(quote, findMe);
    }

    int foundIt(string str1, string str2)
    {
        size_t pos;
        int x = 0;

        pos = str1.find(str1);
        if (pos!=string::npos)
            x++;

        return x;
    }
    yogi_bear_79, Feb 27, 2008
    #1

  2. Christopher

    Christopher Guest

    On Feb 26, 9:37 pm, yogi_bear_79 <> wrote:
    > I'm sure I have a few things wrong here. But I am stuck on how to do
    > a recurring search. Also my statement cin >> quote; acts weird. If I
    > enter more than one word it blows right past cin >> findMe; and
    > completes and exits the code. If you string for cin >> quote; is one
    > word it behaves correctly or at least in that regard!
    >
    > #include <iostream>
    > #include <string>
    > using namespace std;
    >
    > string quote;
    > string findMe;


    Is there a reason to make these global when you're passing them as
    parameters anyway?

    > int foundIt(string str1, string str2);
    >
    > int main ()
    > {
    > cout << " \n Enter a sentence or two:" << endl;
    > cin >> quote;


    You mean 'one word'. Using the stream extraction operator (>>) is not
    the same as reading everything in the stream. You want to "get
    unformatted text from a stream" or "Copy the contents of a stream
    buffer". I put those in quotes for you to search for.

    > cout << " \n Enter string to search for:" << endl;
    > cin >> findMe;


    Same as above

    > cout << foundIt(quote, findMe);}


    Globals being passed as parameters. Weird.

    > int foundIt(string str1, string str2)
    > {
    > size_t pos;
    > int x = 0;
    >
    > pos = str1.find(str1);
    > if (pos!=string::npos)


    You found something, so perhaps:

    - increment a count
    - get the position in the string of the end of the substring you
      found (this would probably require that you obtain the length of
      the substring)
    - search again starting from that position

    > x++;
    >
    > return x;
    >
    > }
    Christopher, Feb 27, 2008
    #2

  3. Micah Cowan

    Micah Cowan Guest

    yogi_bear_79 wrote:
    > I'm sure I have a few things wrong here. But I am stuck on how to do
    > a recurring search. Also my statement cin >> quote; acts weird. If I
    > enter more than one word it blows right past cin >> findMe; and
    > completes and exits the code.


    The >> operator, when applied to an input stream on the left and a
    string on the right, skips any whitespace on the input stream, and then
    reads in a single word, up until the next whitespace character (or the
    end of the input stream). Thus, if you enter two words on a single line,
    your code will print the first prompt, read the first word into quote
    (leaving the second), print the second prompt, and read the next word
    into findMe (leaving any remaining input, such as the final newline, in
    the input stream). It won't wait to receive a second line, because it
    already has a second word to read in from the first line.

    If reading in a line is what you wanted, rather than words, you should
    consider the getline() function.
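
    To make the difference concrete, here is a minimal sketch that feeds
    the same input through both operator>> and getline(). It uses a
    std::istringstream in place of cin so the behaviour can be checked
    without typing anything; the helper names firstWord and firstLine are
    made up for this illustration.

    ```cpp
    #include <sstream>
    #include <string>

    // Hypothetical helper: reads with operator>>, which skips leading
    // whitespace and then stops at the next whitespace character.
    std::string firstWord(const std::string& input)
    {
        std::istringstream in(input);
        std::string word;
        in >> word;
        return word;
    }

    // Hypothetical helper: reads with getline(), which stops at the
    // newline (and discards it).
    std::string firstLine(const std::string& input)
    {
        std::istringstream in(input);
        std::string line;
        std::getline(in, line);
        return line;
    }
    ```

    Given the input "hello world\n", firstWord() yields "hello" and
    firstLine() yields "hello world".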

    > int foundIt(string str1, string str2)


    Note that str1 and str2 are not the same strings as quote and findMe,
    but are instead entirely new strings to which have been copied the
    _contents_ of quote and findMe. To avoid the needless copying, you
    should use references-to-string as the parameter types, rather than
    plain strings.

    Also, since you're not going to modify the strings, you'd do well to
    make them references to _const_ strings, to tell the caller that you
    won't be changing them (which is particularly helpful in cases when the
    caller itself has promised _its_ caller not to modify these strings):

    int foundIt(const string &str1, const string &str2)

    > {
    > size_t pos;


    size_t pos = 0;

    > int x = 0;
    >
    > pos = str1.find(str1);
    > if (pos!=string::npos)
    > x++;


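    // Assign the result of find() back to pos; otherwise pos never
    // changes and the loop cannot end.
    while ((pos = str1.find(str2, pos)) != string::npos) {
        cout << "Found " << str2 << " at " << pos << '.' << endl;
        pos += str2.size();   // resume the search after this match
    }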

    >
    > return x;
    > }



    --
    Micah J. Cowan
    Programmer, musician, typesetting enthusiast, gamer...
    http://micah.cowan.name/
    Micah Cowan, Feb 27, 2008
    #3
  4. James Kanze

    James Kanze Guest

    On Feb 27, 4:37 am, yogi_bear_79 <> wrote:
    > I'm sure I have a few things wrong here. But I am stuck on
    > how to do a recurring search. Also my statement cin >> quote;
    > acts weird. If I enter more than one word it blows right past
    > cin >> findMe; and completes and exits the code. If you
    > string for cin >> quote; is one word it behaves correctly or
    > at least in that regard!


    So how do you want to decide how much text to read with the
    first >>? If you want a word (skipping any white space which
    precedes it), use >>. If you want a line, use getline(). If
    you want the entire file, you can read it using
    istreambuf_iterators, or with something like s <<
    std::cin.rdbuf(), where s is an ostringstream, but then you'll
    have read to end of file, and won't be able to enter the search
    string. If you want some other convention, you'll probably have
    to program it. (Under Unix, a line consisting of a single '.'
    is a frequent convention. This could be done something like:

    std::string line ;
    while ( std::getline( std::cin, line ) && line != "." ) {
    quote += line + '\n' ;
    }

    . I'd definitely put this in a separate function, however.)
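
    As a rough sketch of what that separate function might look like
    (the name readUntilDot and the choice to take a std::istream& are
    assumptions for illustration, not something from the code above):

    ```cpp
    #include <istream>
    #include <string>

    // Hypothetical helper: collects lines from `in` until a line holding
    // only "." (or end of input) and returns them, each followed by '\n'.
    std::string readUntilDot(std::istream& in)
    {
        std::string text;
        std::string line;
        while (std::getline(in, line) && line != ".") {
            text += line + '\n';
        }
        return text;
    }
    ```

    Called as readUntilDot(std::cin), this stops at the '.' line without
    reaching end of file, so the search string can still be read
    afterwards with another getline().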

    For the lookup, I'd generally prefer the standard algorithms
    over the member functions of std::string. If you're learning,
    I'd especially prefer them; they represent the usual idiom for
    processing any container, and using them will get you used to
    iterators in general. Something like:

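    #include <algorithm>    // std::search
    #include <string>

    // Counts non-overlapping occurrences of toMatch in text.
    // An empty toMatch is not handled here (the loop would not advance).
    int
    countMatches(
        std::string const& text,
        std::string const& toMatch )
    {
        int result = 0 ;
        for ( std::string::const_iterator current
                  = std::search( text.begin(), text.end(),
                                 toMatch.begin(), toMatch.end() ) ;
              current != text.end() ;
              current = std::search(
                  current + toMatch.size(), text.end(),
                  toMatch.begin(), toMatch.end() ) ) {
            ++ result ;
        }
        return result ;
    }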

    --
    James Kanze (GABI Software) email:
    Conseils en informatique orientée objet/
    Beratung in objektorientierter Datenverarbeitung
    9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
    James Kanze, Feb 27, 2008
    #4
  5. yogi_bear_79

    yogi_bear_79 Guest

    On Feb 26, 10:58 pm, Micah Cowan <> wrote:
    [...]


    Thanks to everyone for their help & tips. I managed to get it really
    close. Below is my entire code; the current problem is that the while
    loop is infinite. My task is to count the number of times the
    substring appears in the string quote. I planned to use the int x to
    count. That is where I am stuck. The code as is does find the
    occurrences I am looking for, but I think I need an if statement to
    check whether or not it found something and, if it did, increment x.
    I'd assume that if it didn't find anything then the search is over,
    so it would be something like: if found, then increment x and loop,
    else exit the loop. So what I need is an if statement that will
    verify whether something was found or not.

    James, I think your code was a bit further ahead than I am supposed to
    know at this point!

    #include <iostream>
    #include <string>
    using namespace std;


    int findIt(const string &str1, const string &str2);

    int main ()
    {
        string quote;
        string findMe;

        cout << " \n Enter a sentence or two:" << endl;
        getline (cin,quote);
        cout << " \n Enter string to search for:" << endl;
        getline (cin,findMe);

        cout << findIt(quote, findMe);
    }

    int findIt(const string &str1, const string &str2)
    {
        size_t pos = 0;
        int x = 0;

        while (pos != string::npos) {
            pos = str1.find(str2, pos);
            cout << pos<<endl;
            pos = pos+str2.size();
        }

        return x;
    }
    yogi_bear_79, Feb 28, 2008
    #5
  6. yogi_bear_79

    yogi_bear_79 Guest

    On Feb 27, 8:00 pm, yogi_bear_79 <> wrote:
    [...]


    Final Working Code:

    #include <iostream>
    #include <string>
    using namespace std;


    int findIt(const string &str1, const string &str2);

    int main ()
    {
        string quote;
        string findMe;

        cout << " \n Enter a sentence or two:" << endl;
        getline (cin,quote);
        cout << " \n Enter string to search for:" << endl;
        getline (cin,findMe);

        cout << "\n There are " << findIt(quote, findMe) <<
            " occurrences of '" << findMe << "' in the "
            "string you entered."<<endl;
    }

    int findIt(const string &str1, const string &str2)
    {
        size_t pos = 0;
        int x = 0;

        while (pos != string::npos) {
            pos = str1.find(str2, pos);
            if (pos!=string::npos){
                x++;
                pos = pos+str2.size();
            }
        }

        return x;
    }
    yogi_bear_79, Feb 28, 2008
    #6
  7. James Kanze

    James Kanze Guest

    On Feb 28, 2:00 am, yogi_bear_79 <> wrote:
    > On Feb 26, 10:58 pm, Micah Cowan <> wrote:
    > > yogi_bear_79 wrote:


    [...]
    > James, I think your code was a bit further ahead than I am
    > supposed to know at this point!


    I'm not sure how? You should be learning the standard
    algorithms long before the specific functions of std::string
    (which I probably wouldn't bother teaching at all), and you
    should be learning iterators before dealing with pos in a
    string. (But of course, you can apply the exact same algorithm
    using the position and std::string::find instead of the
    iterators and std::search.)

    > #include <iostream>
    > #include <string>
    > using namespace std;


    > int findIt(const string &str1, const string &str2);


    > int main ()
    > {
    > string quote;
    > string findMe;


    > cout << " \n Enter a sentence or two:" << endl;
    > getline (cin,quote);
    > cout << " \n Enter string to search for:" << endl;
    > getline (cin,findMe);


    > cout << findIt(quote, findMe);}


    > int findIt(const string &str1, const string &str2)
    > {
    > size_t pos = 0;
    > int x = 0;


    > while (pos != string::npos) {
    > pos = str1.find(str2, pos);
    > cout << pos<<endl;
    > pos = pos+str2.size();


    What happens in this line when str1.find returns
    std::string::npos?

    > }


    You really do want the loop I proposed:

    for ( size_t pos = str1.find( str2 ) ;
          pos != std::string::npos ;
          pos = str1.find( str2, pos + str2.size() ) ) {
        // Whatever you do when you find a match...
    }

    As a general rule: prefer for, with no modification of the
    control variable outside of the third part. It's much easier to
    get right. (If you need the value of the control variable after
    the loop, either move the first part of the for before the loop,
    or rewrite the loop as a while. And of course, don't try to
    force things which don't fit into this pattern.)

    Your problem, of course, is that you're modifying the control
    variable twice in the loop, and the first modification might
    cause the loop invariant (pos != std::string::npos) to be
    invalid. The only operations which might invalidate the loop
    invariant should be at the end of the loop. A for makes this
    clearer: the only operations which might invalidate the loop
    invariant should be in the third part of the for.

    > return x;
    > }


    James Kanze, Feb 28, 2008
    #7
  8. James Kanze

    James Kanze Guest

    On Feb 28, 3:32 am, yogi_bear_79 <> wrote:

    [...]
    > Final Working Code:


    > int findIt(const string &str1, const string &str2)
    > {
    > size_t pos = 0;
    > int x = 0;


    > while (pos != string::npos) {
    > pos = str1.find(str2, pos);
    > if (pos!=string::npos){
    > x++;
    > pos = pos+str2.size();
    > }
    > }
    > return x;
    > }


    Sort of. I think you're still missing the basic principles of
    how to write a loop. For example, what is the loop invariant
    here? And how do you prove that each pass through the loop
    "advances"? (You can do the latter, but it's far more
    complicated than it should be.)

    Suppose we take as an invariant that pos points to a match (and
    thus is not std::string::npos). That results in something like
    the loop I just proposed, where the invariant is always true in
    the loop (no need for a special if to verify it, since the test is
    part of the loop condition). And it's trivial to prove
    advancement, because the pos + str2.size() is executed every
    time we enter the loop; it's not in any conditional block.

    (FWIW: both your code and mine do have one error. What are the
    results if someone enters an empty string as the search string?
    I think this problem needs to be handled first at the level of
    specification, however; there are an infinite number of empty
    strings in any string you care to consider.)
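
    One possible way to settle the specification is simply to say that
    an empty search string occurs zero times. The sketch below assumes
    that rule (it is a choice made for this example, not something the
    assignment states) and otherwise uses the same for loop on
    std::string::find; the name countNonEmpty is invented here.

    ```cpp
    #include <cstddef>
    #include <string>

    // Counts non-overlapping occurrences of `needle` in `text`.
    // Assumption: an empty needle is defined to occur zero times.
    int countNonEmpty(const std::string& text, const std::string& needle)
    {
        if (needle.empty()) {
            return 0;   // without this, the loop below would never advance
        }
        int count = 0;
        for (std::size_t pos = text.find(needle);
             pos != std::string::npos;
             pos = text.find(needle, pos + needle.size())) {
            ++count;
        }
        return count;
    }
    ```

    Other rules are just as defensible (reporting an error, for
    instance); the point is only that the case has to be decided before
    the loop runs.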

    James Kanze, Feb 28, 2008
    #8
  9. Joe Greer

    Joe Greer Guest


    > On Feb 28, 3:32 am, yogi_bear_79 <> wrote:
    >
    > [...]
    >> Final Working Code:

    >
    >> int findIt(const string &str1, const string &str2)
    >> {
    >> size_t pos = 0;
    >> int x = 0;

    >
    >> while (pos != string::npos) {
    >> pos = str1.find(str2, pos);
    >> if (pos!=string::npos){
    >> x++;
    >> pos = pos+str2.size();
    >> }
    >> }
    >> return x;
    >> }

    >


    One thing that I haven't seen discussed is the following scenario:

    str1 = "strstrstr"
    str2 = "strstr"

    Is there one match or two matches here? Certainly your current code will
    return 1, but (strstr)str and str(strstr) are possibilities. Only your
    requirements document knows for sure whether 1 or 2 is the correct answer.

    joe
    Joe Greer, Feb 28, 2008
    #9
  10. James Kanze

    James Kanze Guest

    On Feb 28, 4:44 pm, Joe Greer <> wrote:
    > > On Feb 28, 3:32 am, yogi_bear_79 <> wrote:


    > > [...]
    > >> Final Working Code:


    > >> int findIt(const string &str1, const string &str2)
    > >> {
    > >> size_t pos = 0;
    > >> int x = 0;


    > >> while (pos != string::npos) {
    > >> pos = str1.find(str2, pos);
    > >> if (pos!=string::npos){
    > >> x++;
    > >> pos = pos+str2.size();
    > >> }
    > >> }
    > >> return x;
    > >> }


    > One thing that I haven't seen discussed is the following scenario:


    > str1 = "strstrstr"
    > str2 = "strstr"


    > Is there one match or two matches here? Certainly your
    > current code will return 1, but (strstr)str and str(strstr)
    > are possibilities. Only your requirements document knows for
    > sure whether 1 or 2 is the correct answer.


    Agreed. I didn't enter into this aspect, but the problem is not
    sufficiently specified. The difference that this makes in the
    code is minor, however: in one case, you add str2.size(), in the
    other you simply add 1.
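
    A minimal sketch of that one-line difference, with the step chosen by
    a flag. The function name countWithStep and the allowOverlap
    parameter are hypothetical, and treating an empty search string as
    zero matches is an assumption made so the loop always advances.

    ```cpp
    #include <cstddef>
    #include <string>

    // Counts occurrences of `needle` in `text`.
    // allowOverlap == false: resume after the whole match (needle.size()).
    // allowOverlap == true:  resume one character on, so matches may overlap.
    // Assumption: an empty needle counts as zero occurrences.
    int countWithStep(const std::string& text, const std::string& needle,
                      bool allowOverlap)
    {
        if (needle.empty()) {
            return 0;
        }
        const std::size_t step = allowOverlap ? 1 : needle.size();
        int count = 0;
        for (std::size_t pos = text.find(needle);
             pos != std::string::npos;
             pos = text.find(needle, pos + step)) {
            ++count;
        }
        return count;
    }
    ```

    For str1 = "strstrstr" and str2 = "strstr", this gives 1 with
    allowOverlap set to false and 2 with it set to true.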

    James Kanze, Feb 28, 2008
    #10
