map vs list performance


barcaroller

I have an STL list that contains about 1500 objects. To check if an object
exists in my list I iterate over it and check if a "key" matches. A key is
just a struct that contains four integers.

    typedef struct
    {
        int a;
        int b;
        int c;
        int d;
    } abcd;

    class C
    {
        abcd key;
        char data[100];
    };

    for (std::list<C>::iterator i = list.begin(); i != list.end(); ++i)
        if ( mya == i->key.a && myb == i->key.b &&
             myc == i->key.c && myd == i->key.d )
            // object has been found; access data[]



To improve performance, I converted the list to an STL map. I used the
struct as the map's key and I added the operator<() and operator==()
functions to allow the map to sort and find objects. I also replaced the
list's for loop with map's find() function.

if (map.find(mykey) != map.end())
// object has been found; access data[]


Everything is working correctly (adding, finding, deleting, etc) but the map
solution is over 100% slower than the list for searches (say 100,000
searches). I understand that a map has some additional overhead but I would
have expected this overhead to be negligible for 1500 elements. I can't
explain why a sequential list search is faster (the found objects are
randomly dispersed and are not at the beginning of the list).

Has anyone else noticed this performance issue? Using gcc on Linux.
 

AnonMail2005


Sounds fishy. Show the operator< function that you wrote. Also, for
a map, you do not need operator==. You would need that operator if
you used the find algorithm with the list.

HTH
 

AnonMail2005


The operator< member function should be something like this:

bool abcd::operator<(const abcd & rhs) const
{
    bool res = false;
    if (a < rhs.a)
    {
        res = true;
    }
    else if (a == rhs.a)
    {
        if (b < rhs.b)
        {
            res = true;
        }
        else if (b == rhs.b)
        {
            if (c < rhs.c)
            {
                res = true;
            }
            else if (c == rhs.c)
            {
                if (d < rhs.d)
                {
                    res = true;
                }
            }
        }
    }
    return res;
}

HTH
 

Branimir Maksimovic


Because in a map, the search jumps around in memory at random,
while in a list it goes in sequential order. I have discovered
that by using a bidirectional iterator for quicksort instead of
random access I gained double the speed of quicksort...
it seems the CPU reads ahead into the memory
cache while instructions are executing.
Of course, the nodes in the list have to be sorted by address.



Yes. On a list whose nodes are sorted by address, a pass through the list
is twice as fast as with unordered nodes, no matter the size of the list
and where the nodes are. If the nodes are allocated one behind another
you get an additional speed boost, perhaps another doubling.
That's why a search over a list of 1500 elements is faster than a map:
the sequential search probably runs through the level-1 cache with
read-ahead, which is 10-30 times faster than main memory.

Greets
 

Horace Cochran

Hi there, did you know the GNU time command can be used to clock the runtime of
your executable a.out? For your case, you could write two programs, each using a
different STL container, then run each with 'time ./a.out'.
 

Naive Group User


Dear Sir,

I have just posted a question about speaking skill in sci.math - a very
necessary skill I would like to acquire via your experience, Sir.
I went to an interview recently for a job as a business
consultant and I definitely failed, but I am not entirely in a bad mood,
because those interviewers were a little limited in viewpoints and
evaluation process; admittedly there are things I forget, things I
(sadly) missed while I was at school, but I definitely lack a particular
speaking skill that I strongly need to acquire over the next few
weeks or more. Thereby, seeking Sir, The Professor and Bear's advice
as well as experience is a must-do right now. A few posts don't
build tall pyramids, but they are priceless and appreciated
things from an online community, Sir.

About your question, Sir:
the built-in functions in those containers aim to be as
versatile as possible, to 'feed' a large number of coders with varied
needs. So a hefty import of different headers and libraries
is performed during compiling and linking. Containers
inherited from base classes (e.g. iostream) reimplement
several virtual functions to expose their polymorphism
for current and future use with templates. The list,
once objects are assigned to each of its placeholders, is more
limited in functionality than the map container, with its
first-key-second-value pairwise implementation. I think
that is why it is much faster than the map in terms of runtime
speed.
 

Naive Group User


Ohhh, I read what I wrote and I don't understand what I wrote either :-D

I am sorry Sir but
I have been online too long
I better go now

Next time next better
Remember me, remember you
 

barcaroller

Sounds fishy. Show the operator< function that you wrote. Also, for
a map, you do not need operator==. You would need that operator if
you used the find algorithm with the list.



Thank you. My operator<() function looks like this:

bool operator<(const abcd& k) const
{
    if (a < k.a)
        return true;
    else if (a > k.a)
        return false;

    if (b < k.b)
        return true;
    else if (b > k.b)
        return false;

    if (c < k.c)
        return true;
    else if (c > k.c)
        return false;

    if (d < k.d)
        return true;

    return false;
}
 

barcaroller


Thank you for that explanation. It would explain what I'm seeing. When I
first familiarized myself with the STL (many years ago) I was led to believe
that map and set searches (as opposed to insertions) are generally faster
than list searches for large containers. The theory behind that reasoning
is well defined. It would indeed be unfortunate if the performance of these
container types depends on the CPU architecture.
 

Lars Tetzlaff

On 31.12.2009 00:17, barcaroller wrote:

I think you missed something. With the following code

#include <cstdlib>
#include <list>
#include <map>
#include <iostream>
#include <vector>
#include <time.h>

using namespace std;

struct abcd
{
    int a;
    int b;
    int c;
    int d;

    abcd( int a, int b, int c, int d ) : a(a), b(b), c(c), d(d) {}
    abcd() {}

    bool operator<(const abcd& k) const
    {
        if (a < k.a)
            return true;
        else if (a > k.a)
            return false;

        if (b < k.b)
            return true;
        else if (b > k.b)
            return false;

        if (c < k.c)
            return true;
        else if (c > k.c)
            return false;

        if (d < k.d)
            return true;

        return false;
    }
};

struct C
{
    abcd key;
    char data[100];
};

std::list<C> l;
std::map<abcd,C> m;
vector<abcd> k;

void init()
{
    srand( 42 );

    for( int i = 0; i < 1500; ++i ) {
        abcd aABCD( rand(), rand(), rand(), rand() );
        C aC;
        aC.key = aABCD;
        l.push_back( aC );
        m.insert( make_pair( aABCD, aC ) );
        if( i%10 == 0 ) k.push_back( aABCD );
    }
}

const int count = 1000;

int main()
{
    init();

    clock_t start = clock();

    int x = 0;
    for( int z = 0; z < count; ++z ) {
        x = 0;
        for( vector<abcd>::iterator j = k.begin(); j != k.end(); j++ ) {
            for( list<C>::iterator i = l.begin(); i != l.end(); i++ ) {
                if ( j->a == i->key.a && j->b == i->key.b &&
                     j->c == i->key.c && j->d == i->key.d ) {
                    x++;
                    continue;
                }
            }
        }
    }

    clock_t listtime = clock();

    int y = 0;

    for( int z = 0; z < count; ++z ) {
        y = 0;
        for( vector<abcd>::iterator j = k.begin(); j != k.end(); j++ ) {
            if( m.find( *j ) != m.end() ) {
                y++;
            }
        }
    }

    clock_t maptime = clock();

    std::cerr << "List: " << listtime - start << " " << x << std::endl;
    std::cerr << "Map: " << maptime - listtime << " " << y << std::endl;
}



I get:

List: 1780000 150
Map: 10000 150

so the map is 178 times faster than the list!

Tested on Linux with gcc 4.3.2, with the -O3 option, processor at a fixed clock.

Lars
 

Jeff Flinn

Stephen said:
or

bool abcd::operator<(const abcd & rhs) const
{
    if (a != rhs.a)
        return (a < rhs.a);
    else if (b != rhs.b)
        return (b < rhs.b);
    else if (c != rhs.c)
        return (c < rhs.c);
    else
        return (d < rhs.d);
}

or my favorite:

#include "boost/tuple/tuple.hpp"
#include "boost/tuple/tuple_comparison.hpp"

bool abcd::operator<(const abcd & rhs) const
{
    using boost::tuples::tie; // or std::tr1::tie from <tr1/tuple>

    return tie(a, b, c, d) < tie(rhs.a, rhs.b, rhs.c, rhs.d);
}

Jeff
 

Branimir Maksimovic

Lars said:
I get:

List: 1780000 150
Map: 10000 150

so the map is 178 times faster than the list!

Tested on Linux with gcc 4.3.2, with the -O3 option, processor at a fixed clock.

An n^2 algorithm has to be slower than an n log n one, no matter what ;)

Greets
 

Daniel Pitts

Branimir said:
n^2 algorithm have to be slower than nlogn, no matter what ;)
Your statement is not true.

If an algorithm is O(f(n)), then the number of operations can be k*f(n) +
g(n), where g(n) contributes less than f(n) for large n. So for small
n, an O(n log n) algorithm may be slower than an O(n^2) algorithm. For
example, compare heap-sort to insertion-sort on a list of 3 elements.
 

Branimir Maksimovic

Daniel said:
Your statement is not true.


Agreed. btw happy new year ;)


Greets
 

Maxim Yegorushkin


Branimir Maksimovic said:
Because in map, search goes with random access in memory,
and with list it goes in sequential order.

Not so.

Both map and list store elements in nodes. Each node is allocated
separately.

Branimir Maksimovic said:
I have discovered
that by using bidirectional iterator for quick sort instead of
random access I gain double the speed of quick sort...

Cannot be true. Quicksort requires random-access iterators, which is why you
can't pass std::list<> iterators to std::sort(). std::list<> has a sort()
member function for sorting, which employs mergesort, because quicksort
cannot be used for lists.

Branimir Maksimovic said:
Yes. On list where nodes are sorted by address pass through list
is twice as fast as unordered nodes, no matter the size of list
and where nodes are.

Sounds unlikely. Can't ignore CPU cache effects.
 
