C++ solution for K & R(2nd Ed) Ex.6-4 - better solution needed

Discussion in 'C++' started by subramanian100in@yahoo.com, India, Sep 29, 2007.

  1. , India

    , India Guest

    As a beginner in C++, I have attempted the C++ solution for the
    following:

    Consider the Ex. 6-4 in K & R(ANSI C) 2nd Edition in Page 143 :

    Write a program that prints the distinct words in its input sorted
    into descending order of frequency of occurrence. Precede each word by
    its count.

    (Here I assume that we should NOT sort the words first. Instead sort
    in decreasing order as per frequency.)

    Following is my C++ solution which works fine. Kindly suggest better
    way of doing it.

    #include <iostream>
    #include <vector>
    #include <string>
    #include <utility>
    #include <algorithm>

    using namespace std;

    typedef pair<int, string> is;

    bool cmp_fn(is arg1, is arg2)
    {
    return (arg2.first < arg1.first) ? true : false;
    }

    void print(const is & arg)
    {
    cout << arg.second << ": " << arg.first << endl;
    }

    int main()
    {
    vector<string> unique_words;
    vector<is> v;

    string word;

    while (cin >> word)
    {
    if (find(unique_words.begin(), unique_words.end(),
    word) == unique_words.end())
    {
    unique_words.push_back(word);
    v.push_back(make_pair(1, word));
    }
    else
    {
    for (vector<is>::iterator i = v.begin(); i !=
    v.end(); ++i)
    if (i->second == word)
    {
    ++i->first;
    break;
    }

    }
    }

    sort(v.begin(), v.end(), cmp_fn);

    for_each(v.begin(), v.end(), print);

    return 0;
    }

    Kindly help.

    Thanks
    V.Subramanian
    , India, Sep 29, 2007
    #1
    1. Advertising

  2. , India

    Barry Guest

    , India wrote:
    > As a beginner in C++, I have attempted the C++ solution for the
    > following:
    >
    > Consider the Ex. 6-4 in K & R(ANSI C) 2nd Edition in Page 143 :
    >
    > Write a program that prints the distinct words in its input sorted
    > into descending order of frequency of occurrence. Precede each word by
    > its count.
    >
    > (Here I assume that we should NOT sort the words first. Instead sort
    > in decreasing order as per frequency.)
    >
    > Following is my C++ solution which works fine. Kindly suggest better
    > way of doing it.
    >
    > #include <iostream>
    > #include <vector>
    > #include <string>
    > #include <utility>
    > #include <algorithm>
    >
    > using namespace std;
    >
    > typedef pair<int, string> is;
    >
    > bool cmp_fn(is arg1, is arg2)
    > {
    > return (arg2.first < arg1.first) ? true : false;
    > }
    >
    > void print(const is & arg)
    > {
    > cout << arg.second << ": " << arg.first << endl;
    > }
    >
    > int main()
    > {
    > vector<string> unique_words;
    > vector<is> v;
    >
    > string word;
    >
    > while (cin >> word)
    > {
    > if (find(unique_words.begin(), unique_words.end(),
    > word) == unique_words.end())
    > {
    > unique_words.push_back(word);
    > v.push_back(make_pair(1, word));
    > }
    > else
    > {
    > for (vector<is>::iterator i = v.begin(); i !=
    > v.end(); ++i)
    > if (i->second == word)
    > {
    > ++i->first;
    > break;
    > }
    >
    > }
    > }
    >
    > sort(v.begin(), v.end(), cmp_fn);
    >
    > for_each(v.begin(), v.end(), print);
    >
    > return 0;
    > }
    >
    > Kindly help.
    >


    I see that the problem needs mutli_index, where we need string as key,
    but sorted by another key of int.

    I think Boost has something for you.

    Anyway, if you don't want to use big tool to deal with small thing,
    I suggest you use std::vector<pair<int, string> > as your associative
    container, you wrap this as your member, when you add a key word, find a
    right place to insert.

    In this case, insertion, searching, is of O(N), not considering the cost
    of caused by insertion.


    --
    Thanks
    Barry
    Barry, Sep 29, 2007
    #2
    1. Advertising

  3. On 2007-09-29 13:41, , India wrote:
    > As a beginner in C++, I have attempted the C++ solution for the
    > following:
    >
    > Consider the Ex. 6-4 in K & R(ANSI C) 2nd Edition in Page 143 :
    >
    > Write a program that prints the distinct words in its input sorted
    > into descending order of frequency of occurrence. Precede each word by
    > its count.
    >
    > (Here I assume that we should NOT sort the words first. Instead sort
    > in decreasing order as per frequency.)
    >
    > Following is my C++ solution which works fine. Kindly suggest better
    > way of doing it.
    >
    > #include <iostream>
    > #include <vector>
    > #include <string>
    > #include <utility>
    > #include <algorithm>


    I have not looked at your code, but for this you should only need to
    include <iostream>, <string> and <map>.

    Some hints:

    Use std::map to store the word as a key and the number of occurrences.

    The [] operator of std::map can be used like this:

    map[key] = value;

    One nice feature of the [] operator is that if there is no key/value
    pair already in the map it will be created, and the value will be
    default initialised (meaning that if the value is an int it will be set
    to 0).

    --
    Erik Wikström
    =?UTF-8?B?RXJpayBXaWtzdHLDtm0=?=, Sep 29, 2007
    #3
  4. , India

    Kai-Uwe Bux Guest

    Erik Wikström wrote:

    > On 2007-09-29 13:41, , India wrote:
    >> As a beginner in C++, I have attempted the C++ solution for the
    >> following:
    >>
    >> Consider the Ex. 6-4 in K & R(ANSI C) 2nd Edition in Page 143 :
    >>
    >> Write a program that prints the distinct words in its input sorted
    >> into descending order of frequency of occurrence. Precede each word by
    >> its count.
    >>
    >> (Here I assume that we should NOT sort the words first. Instead sort
    >> in decreasing order as per frequency.)
    >>
    >> Following is my C++ solution which works fine. Kindly suggest better
    >> way of doing it.
    >>
    >> #include <iostream>
    >> #include <vector>
    >> #include <string>
    >> #include <utility>
    >> #include <algorithm>

    >
    > I have not looked at your code, but for this you should only need to
    > include <iostream>, <string> and <map>.


    Really? You are probably thinking of using

    std::map< std::string, unsigned >

    But that will not magially sort the strings by frequency.

    It think, I still would have <vector> and <algorithm>. (Of course, you could
    first build the map and then use an inverse multimap, but I think that
    would be more obscure than just sorting a vector.)

    > Some hints:
    >
    > Use std::map to store the word as a key and the number of occurrences.
    >
    > The [] operator of std::map can be used like this:
    >
    > map[key] = value;
    >
    > One nice feature of the [] operator is that if there is no key/value
    > pair already in the map it will be created, and the value will be
    > default initialised (meaning that if the value is an int it will be set
    > to 0).



    Best

    Kai-Uwe Bux
    Kai-Uwe Bux, Sep 29, 2007
    #4
  5. , India

    red floyd Guest

    Kai-Uwe Bux wrote:
    > Erik Wikström wrote:
    >
    >> On 2007-09-29 13:41, , India wrote:
    >>> As a beginner in C++, I have attempted the C++ solution for the
    >>> following:
    >>>
    >>> Consider the Ex. 6-4 in K & R(ANSI C) 2nd Edition in Page 143 :
    >>>
    >>> Write a program that prints the distinct words in its input sorted
    >>> into descending order of frequency of occurrence. Precede each word by
    >>> its count.
    >>>
    >>> (Here I assume that we should NOT sort the words first. Instead sort
    >>> in decreasing order as per frequency.)
    >>>
    >>> Following is my C++ solution which works fine. Kindly suggest better
    >>> way of doing it.
    >>>
    >>> #include <iostream>
    >>> #include <vector>
    >>> #include <string>
    >>> #include <utility>
    >>> #include <algorithm>

    >> I have not looked at your code, but for this you should only need to
    >> include <iostream>, <string> and <map>.

    >
    > Really? You are probably thinking of using
    >
    > std::map< std::string, unsigned >
    >


    // comparator for *DECREASING* order
    struct compare {
    bool operator()(const std::pair<string, unsigned>& lhs,
    const std::pair<string, unsigned& rhs) const
    {
    return lhs.second > rhs.second;
    }
    };

    std::map<std::string, unsigned> words_and_freqs;

    // fill map, then do this. Creates a set sorted by frequency

    std::set<std::pair<std::string, unsigned> >
    freqs_first(words_and_freqs.begin(), words_and_freqs.end(),
    compare());

    the set is sorted by the frequency in descending order.
    red floyd, Sep 30, 2007
    #5
  6. , India

    Kai-Uwe Bux Guest

    red floyd wrote:

    > Kai-Uwe Bux wrote:
    >> Erik Wikström wrote:
    >>
    >>> On 2007-09-29 13:41, , India wrote:
    >>>> As a beginner in C++, I have attempted the C++ solution for the
    >>>> following:
    >>>>
    >>>> Consider the Ex. 6-4 in K & R(ANSI C) 2nd Edition in Page 143 :
    >>>>
    >>>> Write a program that prints the distinct words in its input sorted
    >>>> into descending order of frequency of occurrence. Precede each word by
    >>>> its count.
    >>>>
    >>>> (Here I assume that we should NOT sort the words first. Instead sort
    >>>> in decreasing order as per frequency.)
    >>>>
    >>>> Following is my C++ solution which works fine. Kindly suggest better
    >>>> way of doing it.
    >>>>
    >>>> #include <iostream>
    >>>> #include <vector>
    >>>> #include <string>
    >>>> #include <utility>
    >>>> #include <algorithm>
    >>> I have not looked at your code, but for this you should only need to
    >>> include <iostream>, <string> and <map>.

    >>
    >> Really? You are probably thinking of using
    >>
    >> std::map< std::string, unsigned >
    >>

    >
    > // comparator for *DECREASING* order
    > struct compare {
    > bool operator()(const std::pair<string, unsigned>& lhs,
    > const std::pair<string, unsigned& rhs) const
    > {
    > return lhs.second > rhs.second;
    > }
    > };
    >
    > std::map<std::string, unsigned> words_and_freqs;
    >
    > // fill map, then do this. Creates a set sorted by frequency
    >
    > std::set<std::pair<std::string, unsigned> >
    > freqs_first(words_and_freqs.begin(), words_and_freqs.end(),
    > compare());
    >
    > the set is sorted by the frequency in descending order.


    And, if I see this correctly, it will contain exactly one word for each
    frequency, because two pairs with equal frequency counts will be treated as
    comparing equal by std::set<>.

    However, that can be fixed, in which case you would throw in the header
    <set> instead of <vector> and <algorithm>. Well, that's fine, too.


    Best

    Kai-Uwe Bux
    Kai-Uwe Bux, Sep 30, 2007
    #6
  7. , India

    , India Guest

    First I express my thanks to all of you for replying.

    Looks like, some of you have suggested map data structure. This cannot
    be used because it sorts the input based on string which is the key.
    But the input words should not be sorted. Their order should be
    retained as they appear in the input. Later, they should be sorted
    based on frequency of their occurrence.

    Let me put the logic that I have used, in words.
    Instead of going through the code, kindly go through this and give me
    your feedback as help.

    I create "vector<string> unique_words;" to store each input word as it
    arrives(after checking if it is already not found in this vector). I
    also create "vector< pair<int, string> > v;" along with the above
    vector<string>. Whenever a word arrives, first it is stored in
    vector<string> if it is a new word and in this case, make_pair(1,
    word) is stored in vector<pair<int,string>>. If the word has been
    previously stored, then its count is incremented in
    vector<pair<int,string>>. After reading all words, I do

    sort(v.begin(), v.end(), cmp_fn);

    for_each(v.begin(), v.end(), print);

    The disadvantage is the addition of two global functions cmp_fn and
    print. There can be other disadvantages also.

    Kindly give your feedback.

    Thanks
    V.Subramanian
    , India, Sep 30, 2007
    #7
  8. , India

    Kai-Uwe Bux Guest

    , India wrote:

    >
    > First I express my thanks to all of you for replying.
    >
    > Looks like, some of you have suggested map data structure. This cannot
    > be used because it sorts the input based on string which is the key.


    It still can (and should!) be used to compile the frequency data. You'r just
    not done, yet :)

    > But the input words should not be sorted. Their order should be
    > retained as they appear in the input. Later, they should be sorted
    > based on frequency of their occurrence.


    So, there needs to be a second step in the process. Assume you had a map

    std::map< std::string, unsigned int >

    with frequency data. How would you go about sorting them by frequency. There
    are several options:

    a) convert the list into a vector of pairs. (Easy since vector has a
    constructor that takes a pair of iterators.) Then sort by frequency. Then
    write to screen.

    b) convert the list into a std::multimap< unsigned count, std::string >.

    > Let me put the logic that I have used, in words.
    > Instead of going through the code, kindly go through this and give me
    > your feedback as help.
    >
    > I create "vector<string> unique_words;" to store each input word as it
    > arrives(after checking if it is already not found in this vector). I
    > also create "vector< pair<int, string> > v;" along with the above
    > vector<string>. Whenever a word arrives, first it is stored in
    > vector<string> if it is a new word and in this case, make_pair(1,
    > word) is stored in vector<pair<int,string>>. If the word has been
    > previously stored, then its count is incremented in
    > vector<pair<int,string>>. After reading all words, I do
    >
    > sort(v.begin(), v.end(), cmp_fn);


    good thinking.


    > for_each(v.begin(), v.end(), print);


    You might want to have a look into ostream_iterator. Then you can use

    std::copy( v.begin(),v.end(), ... );


    > The disadvantage is the addition of two global functions cmp_fn and
    > print. There can be other disadvantages also.
    >
    > Kindly give your feedback.


    You'r on the right track.



    Best

    Kai-Uwe Bux
    Kai-Uwe Bux, Sep 30, 2007
    #8
  9. , India

    Barry Guest

    , India wrote:
    > First I express my thanks to all of you for replying.
    >
    > Looks like, some of you have suggested map data structure. This cannot
    > be used because it sorts the input based on string which is the key.
    > But the input words should not be sorted. Their order should be
    > retained as they appear in the input. Later, they should be sorted
    > based on frequency of their occurrence.
    >
    > Let me put the logic that I have used, in words.
    > Instead of going through the code, kindly go through this and give me
    > your feedback as help.
    >
    > I create "vector<string> unique_words;" to store each input word as it
    > arrives(after checking if it is already not found in this vector). I
    > also create "vector< pair<int, string> > v;" along with the above
    > vector<string>. Whenever a word arrives, first it is stored in
    > vector<string> if it is a new word and in this case, make_pair(1,
    > word) is stored in vector<pair<int,string>>. If the word has been
    > previously stored, then its count is incremented in
    > vector<pair<int,string>>. After reading all words, I do
    >
    > sort(v.begin(), v.end(), cmp_fn);
    >
    > for_each(v.begin(), v.end(), print);
    >
    > The disadvantage is the addition of two global functions cmp_fn and
    > print. There can be other disadvantages also.


    I think you can wrap them up,

    You can check this out:

    #include <string>
    #include <iostream>
    #include <vector>
    #include <algorithm>

    class AssocVector
    {
    public:
    typedef std::vector<std::pair<int, std::string> > ContainerType;
    typedef ContainerType::iterator Iterator;
    typedef ContainerType::const_iterator ConstIterator;
    public:
    void Insert(std::string const& word)
    {
    bool found = false;
    Iterator iter = c.begin();
    for (; iter != c.end(); ++iter)
    {
    if (iter->second == word)
    {
    ++(iter->first);
    found = true;
    break;
    }
    }

    if (found && iter != c.begin())
    {
    for (; iter != c.begin(); --iter)
    {
    Iterator prev = iter;
    --prev;
    if (prev->first < iter->first)
    std::iter_swap(prev, iter);
    else
    break;
    }
    }

    if (!found)
    c.push_back(std::make_pair(1, word));
    }

    Iterator begin()
    {
    return c.begin();
    }

    Iterator end()
    {
    return c.end();
    }

    ConstIterator begin() const
    {
    return c.begin();
    }

    ConstIterator end() const
    {
    return c.end();
    }
    protected:
    ContainerType c;
    };

    struct Printer
    {
    void operator() (std::pair<int, std::string> const& p) const
    {
    std::cout << p.first << ' ' << p.second << std::endl;
    }

    };

    int main()
    {
    AssocVector assocVec;

    std::string word;
    while (std::cin >> word)
    assocVec.Insert(word);

    std::for_each (assocVec.begin(), assocVec.end(), (Printer()));
    }

    --
    Thanks
    Barry
    Barry, Sep 30, 2007
    #9
  10. , India

    Kai-Uwe Bux Guest

    Barry wrote:

    > , India wrote:
    >> First I express my thanks to all of you for replying.
    >>
    >> Looks like, some of you have suggested map data structure. This cannot
    >> be used because it sorts the input based on string which is the key.
    >> But the input words should not be sorted. Their order should be
    >> retained as they appear in the input. Later, they should be sorted
    >> based on frequency of their occurrence.
    >>
    >> Let me put the logic that I have used, in words.
    >> Instead of going through the code, kindly go through this and give me
    >> your feedback as help.
    >>
    >> I create "vector<string> unique_words;" to store each input word as it
    >> arrives(after checking if it is already not found in this vector). I
    >> also create "vector< pair<int, string> > v;" along with the above
    >> vector<string>. Whenever a word arrives, first it is stored in
    >> vector<string> if it is a new word and in this case, make_pair(1,
    >> word) is stored in vector<pair<int,string>>. If the word has been
    >> previously stored, then its count is incremented in
    >> vector<pair<int,string>>. After reading all words, I do
    >>
    >> sort(v.begin(), v.end(), cmp_fn);
    >>
    >> for_each(v.begin(), v.end(), print);
    >>
    >> The disadvantage is the addition of two global functions cmp_fn and
    >> print. There can be other disadvantages also.

    >
    > I think you can wrap them up,
    >
    > You can check this out:
    >
    > #include <string>
    > #include <iostream>
    > #include <vector>
    > #include <algorithm>
    >
    > class AssocVector
    > {
    > public:
    > typedef std::vector<std::pair<int, std::string> > ContainerType;
    > typedef ContainerType::iterator Iterator;
    > typedef ContainerType::const_iterator ConstIterator;
    > public:
    > void Insert(std::string const& word)
    > {
    > bool found = false;
    > Iterator iter = c.begin();
    > for (; iter != c.end(); ++iter)
    > {
    > if (iter->second == word)
    > {
    > ++(iter->first);
    > found = true;
    > break;
    > }
    > }
    >
    > if (found && iter != c.begin())
    > {
    > for (; iter != c.begin(); --iter)
    > {
    > Iterator prev = iter;
    > --prev;
    > if (prev->first < iter->first)
    > std::iter_swap(prev, iter);
    > else
    > break;
    > }
    > }
    >
    > if (!found)
    > c.push_back(std::make_pair(1, word));
    > }


    Wow: is that bubble sort on the fly?

    Anyway, I don't like the mix of responsibilities that issues from putting
    the increment of the frequency count within the search loop. Consider:


    void Insert(std::string const& word)
    {
    Iterator iter = c.begin();
    while ( iter != c.end() && iter->second != word ) {
    ++ iter;
    }
    if ( iter == c.end() ) {
    c.push_back(std::make_pair(1, word));
    } else {
    ++ iter->first;
    while ( iter != c.begin() ) {
    Iterator next = iter;
    --iter;
    if (iter->first < next->first) {
    std::iter_swap(next, iter);
    } else {
    return;
    }
    }
    }
    }


    >
    > Iterator begin()
    > {
    > return c.begin();
    > }
    >
    > Iterator end()
    > {
    > return c.end();
    > }
    >
    > ConstIterator begin() const
    > {
    > return c.begin();
    > }
    >
    > ConstIterator end() const
    > {
    > return c.end();
    > }
    > protected:
    > ContainerType c;
    > };
    >
    > struct Printer
    > {
    > void operator() (std::pair<int, std::string> const& p) const
    > {
    > std::cout << p.first << ' ' << p.second << std::endl;
    > }
    >
    > };
    >
    > int main()
    > {
    > AssocVector assocVec;
    >
    > std::string word;
    > while (std::cin >> word)
    > assocVec.Insert(word);
    >
    > std::for_each (assocVec.begin(), assocVec.end(), (Printer()));
    > }


    I think this is overkill (and inefficient due to the use of bubble sort :)

    Instead, let me suggest some spagetty code (all goes into main):


    #include <iostream>
    #include <string>
    #include <map>

    int main ( void ) {
    std::string word;
    typedef std::map< std::string, unsigned long >
    frequency_table;
    frequency_table frequency;

    // reading and counting:
    while ( std::cin >> word ) {
    ++ frequency[ word ];
    }

    // reverting and resorting:
    typedef std::multimap< unsigned long, std::string >
    inverse_table;
    inverse_table inverse;
    for ( frequency_table::const_iterator iter
    = frequency.begin();
    iter != frequency.end(); ++ iter ) {
    inverse.insert
    ( inverse_table::value_type
    ( iter->second, iter->first ) );
    }

    // output:
    for ( inverse_table::const_reverse_iterator iter
    = inverse.rbegin();
    iter != inverse.rend(); ++ iter ) {
    std::cout << iter->second << " " << iter->first << '\n';
    }
    }


    Best

    Kai-Uwe Bux
    Kai-Uwe Bux, Sep 30, 2007
    #10
  11. , India

    Barry Guest

    Kai-Uwe Bux wrote:
    > Barry wrote:
    >
    >> , India wrote:
    >>> First I express my thanks to all of you for replying.
    >>>
    >>> Looks like, some of you have suggested map data structure. This cannot
    >>> be used because it sorts the input based on string which is the key.
    >>> But the input words should not be sorted. Their order should be
    >>> retained as they appear in the input. Later, they should be sorted
    >>> based on frequency of their occurrence.
    >>>
    >>> Let me put the logic that I have used, in words.
    >>> Instead of going through the code, kindly go through this and give me
    >>> your feedback as help.
    >>>
    >>> I create "vector<string> unique_words;" to store each input word as it
    >>> arrives(after checking if it is already not found in this vector). I
    >>> also create "vector< pair<int, string> > v;" along with the above
    >>> vector<string>. Whenever a word arrives, first it is stored in
    >>> vector<string> if it is a new word and in this case, make_pair(1,
    >>> word) is stored in vector<pair<int,string>>. If the word has been
    >>> previously stored, then its count is incremented in
    >>> vector<pair<int,string>>. After reading all words, I do
    >>>
    >>> sort(v.begin(), v.end(), cmp_fn);
    >>>
    >>> for_each(v.begin(), v.end(), print);
    >>>
    >>> The disadvantage is the addition of two global functions cmp_fn and
    >>> print. There can be other disadvantages also.

    >> I think you can wrap them up,
    >>
    >> You can check this out:
    >>
    >> #include <string>
    >> #include <iostream>
    >> #include <vector>
    >> #include <algorithm>
    >>
    >> class AssocVector
    >> {
    >> public:
    >> typedef std::vector<std::pair<int, std::string> > ContainerType;
    >> typedef ContainerType::iterator Iterator;
    >> typedef ContainerType::const_iterator ConstIterator;
    >> public:
    >> void Insert(std::string const& word)
    >> {
    >> bool found = false;
    >> Iterator iter = c.begin();
    >> for (; iter != c.end(); ++iter)
    >> {
    >> if (iter->second == word)
    >> {
    >> ++(iter->first);
    >> found = true;
    >> break;
    >> }
    >> }
    >>
    >> if (found && iter != c.begin())
    >> {
    >> for (; iter != c.begin(); --iter)
    >> {
    >> Iterator prev = iter;
    >> --prev;
    >> if (prev->first < iter->first)
    >> std::iter_swap(prev, iter);
    >> else
    >> break;
    >> }
    >> }
    >>
    >> if (!found)
    >> c.push_back(std::make_pair(1, word));
    >> }

    >
    > Wow: is that bubble sort on the fly?
    >


    Nah,
    Actually, this is more like /insertion sort/

    I shouldn't keep swapping, I just need swap once

    if (found && iter != c.begin())
    {
    Iterator hole = iter;

    for (; iter != c.begin(); --iter)
    {
    Iterator prev = iter;
    --prev;
    if (prev->first < iter->first)
    //std::iter_swap(prev, iter);
    ;
    else
    break;
    }
    if (iter != hole)
    std::iter_swap(iter, hole);
    }

    > Anyway, I don't like the mix of responsibilities that issues from putting
    > the increment of the frequency count within the search loop. Consider:
    >
    >
    > void Insert(std::string const& word)
    > {
    > Iterator iter = c.begin();
    > while ( iter != c.end() && iter->second != word ) {
    > ++ iter;
    > }
    > if ( iter == c.end() ) {
    > c.push_back(std::make_pair(1, word));
    > } else {
    > ++ iter->first;
    > while ( iter != c.begin() ) {
    > Iterator next = iter;
    > --iter;
    > if (iter->first < next->first) {
    > std::iter_swap(next, iter);
    > } else {
    > return;
    > }
    > }
    > }
    > }
    >


    Nah,
    *found* variable is not needed.

    >
    >> Iterator begin()
    >> {
    >> return c.begin();
    >> }
    >>
    >> Iterator end()
    >> {
    >> return c.end();
    >> }
    >>
    >> ConstIterator begin() const
    >> {
    >> return c.begin();
    >> }
    >>
    >> ConstIterator end() const
    >> {
    >> return c.end();
    >> }
    >> protected:
    >> ContainerType c;
    >> };
    >>
    >> struct Printer
    >> {
    >> void operator() (std::pair<int, std::string> const& p) const
    >> {
    >> std::cout << p.first << ' ' << p.second << std::endl;
    >> }
    >>
    >> };
    >>
    >> int main()
    >> {
    >> AssocVector assocVec;
    >>
    >> std::string word;
    >> while (std::cin >> word)
    >> assocVec.Insert(word);
    >>
    >> std::for_each (assocVec.begin(), assocVec.end(), (Printer()));
    >> }

    >
    > I think this is overkill (and inefficient due to the use of bubble sort :)
    >
    > Instead, let me suggest some spagetty code (all goes into main):
    >
    >
    > #include <iostream>
    > #include <string>
    > #include <map>
    >
    > int main ( void ) {
    > std::string word;
    > typedef std::map< std::string, unsigned long >
    > frequency_table;
    > frequency_table frequency;
    >
    > // reading and counting:
    > while ( std::cin >> word ) {
    > ++ frequency[ word ];
    > }
    >
    > // reverting and resorting:
    > typedef std::multimap< unsigned long, std::string >
    > inverse_table;
    > inverse_table inverse;
    > for ( frequency_table::const_iterator iter
    > = frequency.begin();
    > iter != frequency.end(); ++ iter ) {
    > inverse.insert
    > ( inverse_table::value_type
    > ( iter->second, iter->first ) );
    > }
    >
    > // output:
    > for ( inverse_table::const_reverse_iterator iter
    > = inverse.rbegin();
    > iter != inverse.rend(); ++ iter ) {
    > std::cout << iter->second << " " << iter->first << '\n';
    > }
    > }
    >


    yeh, using map for sorting is better, but have to buy another one for
    indexing word while increment the frequency of a certain word.

    So, this is Space VS. Time problem


    --
    Thanks
    Barry
    Barry, Sep 30, 2007
    #11
  12. , India

    Kai-Uwe Bux Guest

    Barry wrote:

    > Kai-Uwe Bux wrote:
    >> Barry wrote:
    >>
    >>> , India wrote:
    >>>> First I express my thanks to all of you for replying.
    >>>>
    >>>> Looks like, some of you have suggested map data structure. This cannot
    >>>> be used because it sorts the input based on string which is the key.
    >>>> But the input words should not be sorted. Their order should be
    >>>> retained as they appear in the input. Later, they should be sorted
    >>>> based on frequency of their occurrence.
    >>>>
    >>>> Let me put the logic that I have used, in words.
    >>>> Instead of going through the code, kindly go through this and give me
    >>>> your feedback as help.
    >>>>
    >>>> I create "vector<string> unique_words;" to store each input word as it
    >>>> arrives(after checking if it is already not found in this vector). I
    >>>> also create "vector< pair<int, string> > v;" along with the above
    >>>> vector<string>. Whenever a word arrives, first it is stored in
    >>>> vector<string> if it is a new word and in this case, make_pair(1,
    >>>> word) is stored in vector<pair<int,string>>. If the word has been
    >>>> previously stored, then its count is incremented in
    >>>> vector<pair<int,string>>. After reading all words, I do
    >>>>
    >>>> sort(v.begin(), v.end(), cmp_fn);
    >>>>
    >>>> for_each(v.begin(), v.end(), print);
    >>>>
    >>>> The disadvantage is the addition of two global functions cmp_fn and
    >>>> print. There can be other disadvantages also.
    >>> I think you can wrap them up,
    >>>
    >>> You can check this out:
    >>>
    >>> #include <string>
    >>> #include <iostream>
    >>> #include <vector>
    >>> #include <algorithm>
    >>>
    >>> class AssocVector
    >>> {
    >>> public:
    >>> typedef std::vector<std::pair<int, std::string> > ContainerType;
    >>> typedef ContainerType::iterator Iterator;
    >>> typedef ContainerType::const_iterator ConstIterator;
    >>> public:
    >>> void Insert(std::string const& word)
    >>> {
    >>> bool found = false;
    >>> Iterator iter = c.begin();
    >>> for (; iter != c.end(); ++iter)
    >>> {
    >>> if (iter->second == word)
    >>> {
    >>> ++(iter->first);
    >>> found = true;
    >>> break;
    >>> }
    >>> }
    >>>
    >>> if (found && iter != c.begin())
    >>> {
    >>> for (; iter != c.begin(); --iter)
    >>> {
    >>> Iterator prev = iter;
    >>> --prev;
    >>> if (prev->first < iter->first)
    >>> std::iter_swap(prev, iter);
    >>> else
    >>> break;
    >>> }
    >>> }
    >>>
    >>> if (!found)
    >>> c.push_back(std::make_pair(1, word));
    >>> }

    >>
    >> Wow: is that bubble sort on the fly?
    >>

    >
    > Nah,
    > Actually, this is more like /insertion sort/
    >
    > I shouldn't keep swapping, I just need swap once
    >
    > if (found && iter != c.begin())
    > {
    > Iterator hole = iter;
    >
    > for (; iter != c.begin(); --iter)
    > {
    > Iterator prev = iter;
    > --prev;
    > if (prev->first < iter->first)
    > //std::iter_swap(prev, iter);
    > ;
    > else
    > break;
    > }
    > if (iter != hole)
    > std::iter_swap(iter, hole);
    > }


    a) The code obscures the logic. Why are you using a break here?

    b) The code is incorrect. You keep changing iter, but leaving out the swap
    means that the value is not moving along with the --iter moves. Thus you
    are comparing the wrong values.


    >> Anyway, I don't like the mix of responsibilities that issues from putting
    >> the increment of the frequency count within the search loop. Consider:
    >>
    >>
    >> void Insert(std::string const& word)
    >> {
    >> Iterator iter = c.begin();
    >> while ( iter != c.end() && iter->second != word ) {
    >> ++ iter;
    >> }
    >> if ( iter == c.end() ) {
    >> c.push_back(std::make_pair(1, word));
    >> } else {
    >> ++ iter->first;
    >> while ( iter != c.begin() ) {
    >> Iterator next = iter;
    >> --iter;
    >> if (iter->first < next->first) {
    >> std::iter_swap(next, iter);
    >> } else {
    >> return;
    >> }
    >> }
    >> }
    >> }
    >>

    >
    > Nah, *found* variable is not needed.


    That's part of what I was trying to point out. The code I posted does not
    use it.


    >>
    >>> Iterator begin()
    >>> {
    >>> return c.begin();
    >>> }
    >>>
    >>> Iterator end()
    >>> {
    >>> return c.end();
    >>> }
    >>>
    >>> ConstIterator begin() const
    >>> {
    >>> return c.begin();
    >>> }
    >>>
    >>> ConstIterator end() const
    >>> {
    >>> return c.end();
    >>> }
    >>> protected:
    >>> ContainerType c;
    >>> };
    >>>
    >>> struct Printer
    >>> {
    >>> void operator() (std::pair<int, std::string> const& p) const
    >>> {
    >>> std::cout << p.first << ' ' << p.second << std::endl;
    >>> }
    >>>
    >>> };
    >>>
    >>> int main()
    >>> {
    >>> AssocVector assocVec;
    >>>
    >>> std::string word;
    >>> while (std::cin >> word)
    >>> assocVec.Insert(word);
    >>>
    >>> std::for_each (assocVec.begin(), assocVec.end(), (Printer()));
    >>> }

    >>
    >> I think this is overkill (and inefficient due to the use of bubble sort
    >> :)
    >>
    >> Instead, let me suggest some spagetty code (all goes into main):
    >>
    >>
    >> #include <iostream>
    >> #include <string>
    >> #include <map>
    >>
    >> int main ( void ) {
    >> std::string word;
    >> typedef std::map< std::string, unsigned long >
    >> frequency_table;
    >> frequency_table frequency;
    >>
    >> // reading and counting:
    >> while ( std::cin >> word ) {
    >> ++ frequency[ word ];
    >> }
    >>
    >> // reverting and resorting:
    >> typedef std::multimap< unsigned long, std::string >
    >> inverse_table;
    >> inverse_table inverse;
    >> for ( frequency_table::const_iterator iter
    >> = frequency.begin();
    >> iter != frequency.end(); ++ iter ) {
    >> inverse.insert
    >> ( inverse_table::value_type
    >> ( iter->second, iter->first ) );
    >> }
    >>
    >> // output:
    >> for ( inverse_table::const_reverse_iterator iter
    >> = inverse.rbegin();
    >> iter != inverse.rend(); ++ iter ) {
    >> std::cout << iter->second << " " << iter->first << '\n';
    >> }
    >> }
    >>

    >
    > yeh, using map for sorting is better, but have to buy another one for
    > indexing word while increment the frequency of a certain word.
    >
    > So, this is Space VS. Time problem


    Not really: you can dismantle the map while building the multimap. That way,
    you keep the space roughly constant. On the other hand, map and multimap
    will probably both be larger than the vector you are using.


    #include <iostream>
    #include <string>
    #include <map>

    int main ( void ) {
    std::string word;
    typedef std::map< std::string, unsigned long >
    frequency_table;
    frequency_table frequency;

    // reading and counting:
    while ( std::cin >> word ) {
    ++ frequency[ word ];
    }

    // reverting and resorting:
    typedef std::multimap< unsigned long, std::string >
    inverse_table;
    inverse_table inverse;
    while ( ! frequency.empty() ) {
    frequency_table::iterator start = frequency.begin();
    inverse.insert
    ( inverse_table::value_type
    ( start->second, start->first ) );
    frequency.erase ( start );
    }

    // output:
    for ( inverse_table::const_reverse_iterator iter
    = inverse.rbegin();
    iter != inverse.rend(); ++ iter ) {
    std::cout << iter->second << " " << iter->first << '\n';
    }
    }


    Best

    Kai-Uwe Bux
    Kai-Uwe Bux, Sep 30, 2007
    #12
  13. , India

    Barry Guest

    Kai-Uwe Bux wrote:
    > Barry wrote:
    >
    >> Kai-Uwe Bux wrote:
    >>> Barry wrote:
    >>>
    >>>> , India wrote:
    >>>>> First I express my thanks to all of you for replying.
    >>>>>
    >>>>> Looks like, some of you have suggested map data structure. This cannot
    >>>>> be used because it sorts the input based on string which is the key.
    >>>>> But the input words should not be sorted. Their order should be
    >>>>> retained as they appear in the input. Later, they should be sorted
    >>>>> based on frequency of their occurrence.
    >>>>>
    >>>>> Let me put the logic that I have used, in words.
    >>>>> Instead of going through the code, kindly go through this and give me
    >>>>> your feedback as help.
    >>>>>
    >>>>> I create "vector<string> unique_words;" to store each input word as it
    >>>>> arrives(after checking if it is already not found in this vector). I
    >>>>> also create "vector< pair<int, string> > v;" along with the above
    >>>>> vector<string>. Whenever a word arrives, first it is stored in
    >>>>> vector<string> if it is a new word and in this case, make_pair(1,
    >>>>> word) is stored in vector<pair<int,string>>. If the word has been
    >>>>> previously stored, then its count is incremented in
    >>>>> vector<pair<int,string>>. After reading all words, I do
    >>>>>
    >>>>> sort(v.begin(), v.end(), cmp_fn);
    >>>>>
    >>>>> for_each(v.begin(), v.end(), print);
    >>>>>
    >>>>> The disadvantage is the addition of two global functions cmp_fn and
    >>>>> print. There can be other disadvantages also.
    >>>> I think you can wrap them up,
    >>>>
    >>>> You can check this out:
    >>>>
    >>>> #include <string>
    >>>> #include <iostream>
    >>>> #include <vector>
    >>>> #include <algorithm>
    >>>>
    >>>> class AssocVector
    >>>> {
    >>>> public:
    >>>> typedef std::vector<std::pair<int, std::string> > ContainerType;
    >>>> typedef ContainerType::iterator Iterator;
    >>>> typedef ContainerType::const_iterator ConstIterator;
    >>>> public:
    >>>> void Insert(std::string const& word)
    >>>> {
    >>>> bool found = false;
    >>>> Iterator iter = c.begin();
    >>>> for (; iter != c.end(); ++iter)
    >>>> {
    >>>> if (iter->second == word)
    >>>> {
    >>>> ++(iter->first);
    >>>> found = true;
    >>>> break;
    >>>> }
    >>>> }
    >>>>
    >>>> if (found && iter != c.begin())
    >>>> {
    >>>> for (; iter != c.begin(); --iter)
    >>>> {
    >>>> Iterator prev = iter;
    >>>> --prev;
    >>>> if (prev->first < iter->first)
    >>>> std::iter_swap(prev, iter);
    >>>> else
    >>>> break;
    >>>> }
    >>>> }
    >>>>
    >>>> if (!found)
    >>>> c.push_back(std::make_pair(1, word));
    >>>> }
    >>> Wow: is that bubble sort on the fly?
    >>>

    >> Nah,
    >> Actually, this is more like /insertion sort/
    >>
    >> I shouldn't keep swapping, I just need swap once
    >>
    >> if (found && iter != c.begin())
    >> {
    >> Iterator hole = iter;
    >>
    >> for (; iter != c.begin(); --iter)
    >> {
    >> Iterator prev = iter;
    >> --prev;
    >> if (prev->first < iter->first)
    >> //std::iter_swap(prev, iter);
    >> ;
    >> else
    >> break;
    >> }
    >> if (iter != hole)
    >> std::iter_swap(iter, hole);
    >> }

    >
    > a) The code obscures the logic. Why are you using a break here?
    >


    I rewrote the code without thinking
    actually,
    for (; iter != c.begin() && iter->first < hole->first;
    --iter)
    ;

    the loop is try to the word with frequency that is less than the current
    word, the case when we need swapping is that the current word frequency
    has left neighbor whose frequency is the same, as Insert only add 1 to
    the frequency of the current word.

    say the frequency sequence is like the following:

    3 2 2 2 ...
    ^
    current word frequency

    now we add 1 to the current word frequency, then it's 3, we only have to
    swap the second and the current slots. The sequency now becomes

    3 3 2 2
    ^

    right?

    > b) The code is incorrect. You keep changing iter, but leaving out the swap
    > means that the value is not moving along with the --iter moves. Thus you
    > are comparing the wrong values.
    >


    As I explained, I did a careless mistake.



    --
    Thanks
    Barry
    Barry, Sep 30, 2007
    #13
  14. , India

    Barry Guest

    Barry wrote:
    > Kai-Uwe Bux wrote:
    >> Barry wrote:
    >>
    >>> Kai-Uwe Bux wrote:
    >>>> Barry wrote:
    >>>>
    >>>>> , India wrote:
    >>>>>> First I express my thanks to all of you for replying.
    >>>>>>


    forget what I posted, still wrong
    So careless.
    Good luck to myself when I do job interview


    --
    Thanks
    Barry
    Barry, Sep 30, 2007
    #14
  15. On 2007-09-30 04:37, , India wrote:
    > First I express my thanks to all of you for replying.
    >
    > Looks like, some of you have suggested map data structure. This cannot
    > be used because it sorts the input based on string which is the key.
    > But the input words should not be sorted. Their order should be
    > retained as they appear in the input. Later, they should be sorted
    > based on frequency of their occurrence.
    >
    > Let me put the logic that I have used, in words.
    > Instead of going through the code, kindly go through this and give me
    > your feedback as help.
    >
    > I create "vector<string> unique_words;" to store each input word as it
    > arrives(after checking if it is already not found in this vector). I
    > also create "vector< pair<int, string> > v;" along with the above
    > vector<string>. Whenever a word arrives, first it is stored in
    > vector<string> if it is a new word and in this case, make_pair(1,
    > word) is stored in vector<pair<int,string>>. If the word has been
    > previously stored, then its count is incremented in
    > vector<pair<int,string>>. After reading all words, I do
    >
    > sort(v.begin(), v.end(), cmp_fn);
    >
    > for_each(v.begin(), v.end(), print);
    >
    > The disadvantage is the addition of two global functions cmp_fn and
    > print. There can be other disadvantages also.


    It is a good solution, however it does not take full advantage of the
    containers provided in the standard library. For example unique_words
    should probably be a set instead of a vector. Using the vector to store
    both frequency and word have some advantages over a map, since it can be
    sorted by both frequency and word making it useful in both stages of the
    program. My only complaint is that your solution requires you to write
    more code than one that takes better advantage of the standard
    containers, it will however probably perform better and sometimes that
    is more important.

    My suggestion would have been to use first a map to read in the words
    and counting their frequency, and then construct a multimap sorted by
    frequency which is then printed:

    #include <iostream>
    #include <string>
    #include <map>

    int main()
    {
    std::map<std::string, size_t> words;

    // Read words and count frequency
    std::string w;
    while (std::cin >> w)
    {
    ++words[w];
    }

    // Construct multimap, notice the use of greater to sort it in
    // decending order instead of ascending
    std::multimap<size_t, std::string, std::greater<size_t> > freq;
    std::map<std::string, size_t>::iterator i;
    for (i = words.begin(); i != words.end(); ++i)
    {
    freq.insert(std::make_pair(i->second, i->first));
    }

    // Print, someone can probably come up with a way to use std::copy
    // to print it instead, but I prefer an explicit loop
    std::multimap<size_t, std::string, std::greater<size_t>>::iterator j;
    for (j = freq.begin(); j != freq.end(); ++j)
    {
    std::cout << j->first << "\t" << j->second << "\n";
    }
    }

    --
    Erik Wikström
    =?UTF-8?B?RXJpayBXaWtzdHLDtm0=?=, Sep 30, 2007
    #15
  16. , India

    , India Guest

    On Sep 30, 7:10 am, Erik Wikström <> wrote:

    > It is a good solution, however it does not take full advantage of the
    > containers provided in the standard library. For example unique_words
    > should probably be a set instead of a vector. Using the vector to store
    > both frequency and word have some advantages over a map, since it can be
    > sorted by both frequency and word making it useful in both stages of the
    > program. My only complaint is that your solution requires you to write
    > more code than one that takes better advantage of the standard
    > containers, it will however probably perform better and sometimes that
    > is more important.
    >
    > My suggestion would have been to use first a map to read in the words
    > and counting their frequency, and then construct a multimap sorted by
    > frequency which is then printed:
    >
    > #include <iostream>
    > #include <string>
    > #include <map>
    >
    > int main()
    > {
    > std::map<std::string, size_t> words;
    >
    > // Read words and count frequency
    > std::string w;
    > while (std::cin >> w)
    > {
    > ++words[w];
    > }
    >
    > // Construct multimap, notice the use of greater to sort it in
    > // decending order instead of ascending
    > std::multimap<size_t, std::string, std::greater<size_t> > freq;
    > std::map<std::string, size_t>::iterator i;
    > for (i = words.begin(); i != words.end(); ++i)
    > {
    > freq.insert(std::make_pair(i->second, i->first));
    > }
    >
    > // Print, someone can probably come up with a way to use std::copy
    > // to print it instead, but I prefer an explicit loop
    > std::multimap<size_t, std::string, std::greater<size_t>>::iterator j;
    > for (j = freq.begin(); j != freq.end(); ++j)
    > {
    > std::cout << j->first << "\t" << j->second << "\n";
    > }
    >
    > }
    >
    > --
    > Erik Wikström


    Let me restate the exercise:

    Consider the Ex. 6-4 in K & R(ANSI C) 2nd Edition in Page 143 :

    Write a program that prints the distinct words in its input sorted
    into descending order of frequency of occurrence. Precede each word by
    its count.

    (Here I assumed that we should NOT sort the words first. Instead sort
    in decreasing order as per frequency.)

    If we use a set or a map, won't it store input words in sorted
    manner ? If so, it is not accepted in this problem. Order of input
    words can be changed only for sorting by their frequency. That is why
    I had to use vector. Correct me if I am wrong.

    However thanks for giving a simple solution and I will use it if there
    is no condition imposed on the sorting of words(ie the words may or
    may not maintain their input order).

    Thanks
    V.Subramanian
    , India, Sep 30, 2007
    #16
  17. On 2007-09-30 16:03, , India wrote:
    > On Sep 30, 7:10 am, Erik Wikström <> wrote:
    >
    >> It is a good solution, however it does not take full advantage of the
    >> containers provided in the standard library. For example unique_words
    >> should probably be a set instead of a vector. Using the vector to store
    >> both frequency and word have some advantages over a map, since it can be
    >> sorted by both frequency and word making it useful in both stages of the
    >> program. My only complaint is that your solution requires you to write
    >> more code than one that takes better advantage of the standard
    >> containers, it will however probably perform better and sometimes that
    >> is more important.
    >>
    >> My suggestion would have been to use first a map to read in the words
    >> and counting their frequency, and then construct a multimap sorted by
    >> frequency which is then printed:
    >>
    >> #include <iostream>
    >> #include <string>
    >> #include <map>
    >>
    >> int main()
    >> {
    >> std::map<std::string, size_t> words;
    >>
    >> // Read words and count frequency
    >> std::string w;
    >> while (std::cin >> w)
    >> {
    >> ++words[w];
    >> }
    >>
    >> // Construct multimap, notice the use of greater to sort it in
    >> // decending order instead of ascending
    >> std::multimap<size_t, std::string, std::greater<size_t> > freq;
    >> std::map<std::string, size_t>::iterator i;
    >> for (i = words.begin(); i != words.end(); ++i)
    >> {
    >> freq.insert(std::make_pair(i->second, i->first));
    >> }
    >>
    >> // Print, someone can probably come up with a way to use std::copy
    >> // to print it instead, but I prefer an explicit loop
    >> std::multimap<size_t, std::string, std::greater<size_t>>::iterator j;
    >> for (j = freq.begin(); j != freq.end(); ++j)
    >> {
    >> std::cout << j->first << "\t" << j->second << "\n";
    >> }
    >>
    >> }
    >>
    >> --
    >> Erik Wikström

    >
    > Let me restate the exercise:
    >
    > Consider the Ex. 6-4 in K & R(ANSI C) 2nd Edition in Page 143 :
    >
    > Write a program that prints the distinct words in its input sorted
    > into descending order of frequency of occurrence. Precede each word by
    > its count.
    >
    > (Here I assumed that we should NOT sort the words first. Instead sort
    > in decreasing order as per frequency.)
    >
    > If we use a set or a map, won't it store input words in sorted
    > manner ? If so, it is not accepted in this problem. Order of input
    > words can be changed only for sorting by their frequency. That is why
    > I had to use vector. Correct me if I am wrong.


    I see nothing in the requirements that tells you how to implement it,
    just on the observable behaviour, so anything should be allowed as long
    as the results are correct.

    > However thanks for giving a simple solution and I will use it if there
    > is no condition imposed on the sorting of words(ie the words may or
    > may not maintain their input order).


    While reading the words into the map they will be sorted in whatever
    manner strings are sorted, then I copy the contents of the map over to
    the multimap but this time they are stored sorted by frequency. So when
    you iterate over the multimap you will go through the words in the order
    of most used word to least used word (I do not know how two words which
    have the same frequency are sorted internally, but I imagine that they
    are sorted by comparing the words).

    --
    Erik Wikström
    =?UTF-8?B?RXJpayBXaWtzdHLDtm0=?=, Sep 30, 2007
    #17
  18. , India

    James Kanze Guest

    On Sep 30, 5:28 pm, Erik Wikström <> wrote:
    > On 2007-09-30 16:03, , India wrote:

    [...]
    > > Let me restate the exercise:
    > > Consider the Ex. 6-4 in K & R(ANSI C) 2nd Edition in Page 143 :


    > > Write a program that prints the distinct words in its input sorted
    > > into descending order of frequency of occurrence. Precede each word by
    > > its count.


    > > (Here I assumed that we should NOT sort the words first. Instead sort
    > > in decreasing order as per frequency.)


    > > If we use a set or a map, won't it store input words in sorted
    > > manner ? If so, it is not accepted in this problem. Order of input
    > > words can be changed only for sorting by their frequency. That is why
    > > I had to use vector. Correct me if I am wrong.


    > I see nothing in the requirements that tells you how to implement it,
    > just on the observable behaviour, so anything should be allowed as long
    > as the results are correct.


    > > However thanks for giving a simple solution and I will use it if there
    > > is no condition imposed on the sorting of words(ie the words may or
    > > may not maintain their input order).


    > While reading the words into the map they will be sorted in whatever
    > manner strings are sorted, then I copy the contents of the map over to
    > the multimap but this time they are stored sorted by frequency.


    I rather suspect that it would be preferrable to copy to an
    std::vector, and then sort that, rather than use multimap. Both
    simpler to program, and faster.

    It's also possible to maintain everything in a vector, keeping
    the vector sorted and using std::lower_bound for the search.
    Doing this, insertion will be significantly slower, but you save
    the copy at the end. Off hand, I couldn't say which would be
    fastest, but using the map, then copying to the vector, would
    certainly be the easiest to write and maintain. (One might also
    consider using hash_map once it's available.)

    > So when you iterate over the multimap you will go through the
    > words in the order of most used word to least used word (I do
    > not know how two words which have the same frequency are
    > sorted internally, but I imagine that they are sorted by
    > comparing the words).


    In multimap, no, or at least not directly. Multimap and
    multiset preserve the relative order of equivalent elements.
    But if you're copying from a container which was sorted on the
    words, of course, the relative order of equivalent elements will
    be the order of the words.

    --
    James Kanze (GABI Software) email:
    Conseils en informatique orientée objet/
    Beratung in objektorientierter Datenverarbeitung
    9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
    James Kanze, Oct 1, 2007
    #18
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. FN
    Replies:
    1
    Views:
    485
    [MVP] SmileSeo
    Oct 18, 2003
  2. Peter Bencsik
    Replies:
    2
    Views:
    826
  3. Steve Hershoff
    Replies:
    3
    Views:
    333
    Steve Hershoff
    Aug 6, 2007
  4. Andrew Thompson
    Replies:
    8
    Views:
    149
    Premshree Pillai
    Jun 7, 2005
  5. Christos Kokaliaris
    Replies:
    45
    Views:
    293
    Jorgen Grahn
    Jun 16, 2014
Loading...

Share This Page