__gnu_cxx::hash_set efficient union

M

mark.dufour

hi all,

as part of an 'implicitly statically typed Python' to C++ compiler
called Shedskin I'm working on (http://mark.dufour.googlepages.com),
I'm using __gnu_cxx::hash_set to implement Python sets. unfortunately
I can't get the performance to quite match that of set. for example,
this is how I thought set union would be reasonably fast:

template<class T> set<T> *set<T>::_union(set<T> *s) {
set<T> *c = new set<T>();
c->units = units;

it2 = s->units.end();
for(it1 = s->units.begin(); it1 != it2; it1++)
c->units.insert(*it1);

return c;
}

(some context:

units is of type __gnu_cxx::hash_set<T, hashfunc<T>, hasheq<T>,
gc_allocator<T> > // Boehm GC

template<class T> class hashfunc
{
public: int operator()(T t) const { return hasher<T>(t); }
};

template<> int hasher(int a) { return a; }

template<class T> class hasheq {
public: int operator()(T t, T v) const { return __eq(t,v); }
};

template<> int __eq(int a, int b) { return a == b; }
)

...but for a simple union of two large sets of ints, this approach is
about three times slower than under the Python interpreter. just
copying the left argument (c->units = units;) is slower than the
complete union in Python..

am I doing something really silly here, or is hash_set just not that
efficient? perhaps the new unordered_set will be much more
performant..? (btw, will unordered_set have union etc. members?)

here is a link to the blog posting that triggered this, with some
Python code that becomes much slower after compilation:
http://pyinsci.blogspot.com/2007/08/set-implementation-performance.html

any help is much appreciated! suggestions on how to improve
performance of the implementations for the other Python builtins are
also very much welcome.


thanks in advance,
mark dufour.
 
?

=?ISO-8859-1?Q?Erik_Wikstr=F6m?=

hi all,

as part of an 'implicitly statically typed Python' to C++ compiler
called Shedskin I'm working on (http://mark.dufour.googlepages.com),
I'm using __gnu_cxx::hash_set to implement Python sets. unfortunately
I can't get the performance to quite match that of set. for example,
this is how I thought set union would be reasonably fast:

template<class T> set<T> *set<T>::_union(set<T> *s) {
set<T> *c = new set<T>();
c->units = units;

it2 = s->units.end();
for(it1 = s->units.begin(); it1 != it2; it1++)
c->units.insert(*it1);

return c;
}

(some context:

units is of type __gnu_cxx::hash_set<T, hashfunc<T>, hasheq<T>,
gc_allocator<T> > // Boehm GC

template<class T> class hashfunc
{
public: int operator()(T t) const { return hasher<T>(t); }
};

template<> int hasher(int a) { return a; }

template<class T> class hasheq {
public: int operator()(T t, T v) const { return __eq(t,v); }
};

template<> int __eq(int a, int b) { return a == b; }
)

..but for a simple union of two large sets of ints, this approach is
about three times slower than under the Python interpreter. just
copying the left argument (c->units = units;) is slower than the
complete union in Python..

am I doing something really silly here, or is hash_set just not that
efficient? perhaps the new unordered_set will be much more
performant..? (btw, will unordered_set have union etc. members?)

here is a link to the blog posting that triggered this, with some
Python code that becomes much slower after compilation:
http://pyinsci.blogspot.com/2007/08/set-implementation-performance.html

any help is much appreciated! suggestions on how to improve
performance of the implementations for the other Python builtins are
also very much welcome.

I don't know how Python's sets works, but what's wrong with std::set and
std::set_union()? I can see no reason to use a hash when dealing with
ints, since they can easily be compared as they are, without having to
compute their hashes first.

Also note that for performance std::vector is often best, so put the
ints in the two sets into two vectors and sort them (if they are not
sorted during insertion). Then create a new vector large enough to
contain all the ints in both sets and use std::set_union() to get the union.
 
M

mark.dufour

I don't know how Python's sets works, but what's wrong with std::set and
std::set_union()? I can see no reason to use a hash when dealing with
ints, since they can easily be compared as they are, without having to
compute their hashes first.

I like to stay close to algorithmic complexity, and I'm not just
dealing withs ints but arbitrary types.

anyway, Python integer hashing consists of just taking the integer, so
I don't think this explains the large performance difference..
Also note that for performance std::vector is often best, so put the
ints in the two sets into two vectors and sort them (if they are not
sorted during insertion). Then create a new vector large enough to
contain all the ints in both sets and use std::set_union() to get the union.

thanks. but I don't think this will work for me, as I'm not using
sorted sets, and sorting (of ints) already takes longer than the union
in Python..


mark dufour.
 
T

tragomaskhalos

hi all,

as part of an 'implicitly statically typed Python' to C++ compiler
called Shedskin I'm working on (http://mark.dufour.googlepages.com),
I'm using __gnu_cxx::hash_set to implement Python sets. unfortunately
I can't get the performance to quite match that of set. for example,
this is how I thought set union would be reasonably fast:

template<class T> set<T> *set<T>::_union(set<T> *s) {
set<T> *c = new set<T>();
c->units = units;

it2 = s->units.end();
for(it1 = s->units.begin(); it1 != it2; it1++)
c->units.insert(*it1);

return c;

}

I'm sure the effect is negligible, but is there any reason
you're dynamically allocating the set here? I'd have though
set itself is a fairly small object so creating it on the
stack and returning that (and relying on RVO) might be better.

Also, I have read that support for multithreading, injudiciously
applied by library providers, can have a ruinous effect upon
performance of standard containers. So it may be worth having a
look at the documentation (or on google) to see if there are
any such issues surrounding __gnu_cxx::hash_set.
 
M

mark.dufour

I'm sure the effect is negligible, but is there any reason
you're dynamically allocating the set here? I'd have though
set itself is a fairly small object so creating it on the
stack and returning that (and relying on RVO) might be better.

relying on optimizations is dangerous business :) besides I'm not
much of a C++ expert, so copying things around all the time sounds a
bit scary..
Also, I have read that support for multithreading, injudiciously
applied by library providers, can have a ruinous effect upon
performance of standard containers. So it may be worth having a
look at the documentation (or on google) to see if there are
any such issues surrounding __gnu_cxx::hash_set.

I cannot find any useful information.

I did write another test, without using a GC and default hash
function, and I cannot get it to run much faster than this (still
several times slower than Python):

#include <ext/hash_set>

int main() {
typedef __gnu_cxx::hash_set<int> HS;
HS *ha = new HS();
HS *hb = new HS();
HS *hc;

HS::iterator it1, it2;

for(int i = 0; i<10000; i++)
ha->insert(rand() % 100000);
for(int i = 0; i<10000; i++)
hb->insert(rand() % 100000);

for(int i = 0; i<1000; i++) {
hc = new HS(*ha);
hc->insert(hb->begin(), hb->end());

delete hc;
}

}

does anyone have an idea how to speed this up..?

this is the python equivalent:

import random

seta = set([random.randint(0,100000) for n in xrange(10000)])
setb = set([random.randint(0,100000) for n in xrange(10000)])

for i in xrange(1000):
seta | setb


thanks!
mark dufour.
 
M

mark.dufour

hi all,

I gave google's hash_map/set implementation a try. it provides a
superset of the sgi api, but does not support custom allocators
(unfortunately for me, as I use the Boehm GC), and you have to provide
an 'empty key' for every instantiation.

for the same program I now actually get a speedup versus CPython:

#include <google/dense_hash_set>

int main() {
typedef google::dense_hash_set<int> HS;

HS *ha = new HS();
ha->set_empty_key(-1);
HS *hb = new HS();
hb->set_empty_key(-1);
HS *hc;

HS::iterator it1, it2;

for(int i = 0; i<10000; i++)
ha->insert(rand() % 100000);
for(int i = 0; i<10000; i++)
hb->insert(rand() % 100000);

for(int i = 0; i<1000; i++) {
hc = new HS();
hc->set_empty_key(-1);

*hc = *ha;
hc->insert(hb->begin(), hb->end());

delete hc;
}

}


mark dufour.
 
M

mark.dufour

I gave google's hash_map/set implementation a try. it provides a
superset of the sgi api, but does not support custom allocators
(unfortunately for me, as I use the Boehm GC), and you have to provide
an 'empty key' for every instantiation.

for the same program I now actually get a speedup versus CPython:

then again, it seems this won't help much to implement Python sets,
because of the 'empty key' restriction.

I also tried std::tr1::unordered_set, but performance was the same as
for __gnu_cxx::hash_set.


mark.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,769
Messages
2,569,578
Members
45,052
Latest member
LucyCarper

Latest Threads

Top