Is incrementing iterators actually this slow?

Discussion in 'C++' started by James Aguilar, Mar 29, 2005.

  1. Hey all,

    I have a bit of code that implements a suffix array, which is essentially an
    array of iterators or indexes into a string. The array starts out like this

    String: ABAB
    Array: 0123

    But then gets sorted

    Array: 2031
    2 AB
    0 ABAB
    3 B
    1 BAB

    All of this is well and good, but I have a little problem. Currently, I am
    using the STL's sort algorithm and providing a custom comparison function,
    which ends up sorting this vector<string::iterator> alphabetically. Here is
    the comparison and other associated functions:

    bool itLess(string::const_iterator it1, string::const_iterator it2, int
    skip)
    {
    advanceIterators(it1, it2, skip);
    return (*it1 < *it2 || *it1 == '\0');
    }

    int advanceIterators(string::const_iterator &it1, string::const_iterator
    &it2,
    int skip)
    {
    string::const_iterator ref(it1);
    advanceNoCheck(it1, it2, skip);

    while (*it1 == *it2 && !(*it1 == '\0')) {
    ++it1;
    ++it2;
    }

    return it1 - ref;
    }

    void advanceNoCheck(string::const_iterator &it1, string::const_iterator
    &it2,
    int skip)
    {
    if (skip) {
    it1 += skip;
    it2 += skip;
    }
    }

    The way the sorting works is that one method goes through the text string
    and makes an iterator for each position, putting it into the array.
    Afterwards, I call 'sort(m_array.begin(), m_array.end(), &itLess).' It
    takes entirely too long.

    I have profiled my code with gprof (I know, it's platform specific, but you
    should be able to read the output). The code was compiled with all
    optimizations turned on, and run on a text of 10 million characters that did
    not have a high incidence of repetition. Here is the output of the
    profiler:

    % cumulative self self total
    time seconds seconds calls Ks/call Ks/call name
    int advanceIterators(string::const_iterator &, string::const_iterator &):
    98.92 1196.55 1196.55 302747982 0.00 0.00
    bool itLess(string::const_iterator, string::const_iterator):
    0.42 1201.59 5.04 282747983 0.00 0.00
    std::__unguarded_partition([template junk]):
    0.21 1204.10 2.51 1074094 0.00 0.00
    void advanceNoCheck(string::const_iterator &, string::const_iterator &):
    0.09 1205.20 1.10 302747982 0.00 0.00

    Clearly, my advanceIterators (and hence, all of my iterator comparison) is
    far too slow. If I run my code without optimizations (that is, if I do not
    inline iterator::eek:perator ++ and iterator::eek:perator *), you would see that
    most of the time spent in advanceIterators is actually spent in those two
    methods. Does anyone know a faster way to sort string::const_iterators
    lexicographically?

    - JFA1
    James Aguilar, Mar 29, 2005
    #1
    1. Advertising

  2. "James Aguilar" <> wrote in message
    news:d2buav$23b$...

    > bool itLess(string::const_iterator it1, string::const_iterator it2, int
    > skip)
    > {
    > advanceIterators(it1, it2, skip);
    > return (*it1 < *it2 || *it1 == '\0');
    > }


    I suspect that this code doesn't do what you think it does.

    I am guessing that you are using the comparison *it1 == '\0' as a way of
    checking whether it1 has run off the end of the string.

    It doesn't do that. It is your responsibility to know where the end of the
    string is, and to dereference only iterators that you know refer to elements
    of the string.

    >
    > int advanceIterators(string::const_iterator &it1, string::const_iterator
    > &it2,
    > int skip)
    > {
    > string::const_iterator ref(it1);
    > advanceNoCheck(it1, it2, skip);
    >
    > while (*it1 == *it2 && !(*it1 == '\0')) {
    > ++it1;
    > ++it2;
    > }
    >
    > return it1 - ref;
    > }


    Same problem here: It looks like you are assuming that once you run off the
    end of the string, *it1 will be 0. It won't, in general. You'll get what
    you get.
    Andrew Koenig, Mar 29, 2005
    #2
    1. Advertising

  3. "Andrew Koenig" <> wrote in message
    news:iif2e.21182$...
    >
    > Same problem here: It looks like you are assuming that once you run off
    > the end of the string, *it1 will be 0. It won't, in general. You'll get
    > what you get.


    It seems to work on small problems (sorting them correctly. Also, a little
    test code shows that, at least with g++ 3.3.3, it works:

    #include <string>
    #include <iostream>

    using namespace std;

    int main()
    {
    string a;

    cout << (*(a.begin() + a.length()) == '\0') << '\n';

    a += "asad";
    cout << (*(a.begin() + a.length()) == '\0') << '\n';

    a = "OMGZ?";
    cout << (*(a.begin() + a.length()) == '\0') << '\n';

    return 0;
    }

    prints 1, 1, 1 on my system. Even if this is non-standard, I'm almost
    positive that it is not the problem.

    - JFA1
    James Aguilar, Mar 29, 2005
    #3
  4. "James Aguilar" <> wrote in message
    news:d2c058$2o1$...

    > It seems to work on small problems (sorting them correctly. Also, a
    > little test code shows that, at least with g++ 3.3.3, it works:


    No, it happens to do something that appears to work in this particular case.

    > prints 1, 1, 1 on my system. Even if this is non-standard, I'm almost
    > positive that it is not the problem.


    It doesn't matter. As written, the program's behavior is undefined.
    There's no point in investigating further until this problem is corrected.
    Andrew Koenig, Mar 29, 2005
    #4
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Replies:
    3
    Views:
    3,027
  2. HK
    Replies:
    3
    Views:
    447
  3. mike
    Replies:
    3
    Views:
    384
    Virgil Green
    Jul 11, 2005
  4. Marcin Kaliciñski

    Iterators and reverse iterators

    Marcin Kaliciñski, May 8, 2005, in forum: C++
    Replies:
    1
    Views:
    485
    Kai-Uwe Bux
    May 8, 2005
  5. , India
    Replies:
    10
    Views:
    1,072
    James Kanze
    Aug 8, 2009
Loading...

Share This Page