The SETT Programming Contest: The fastest set<T> implementation

Discussion in 'C++' started by SETT Programming Contest, Jun 3, 2008.

  1. The SETT Programming Contest: The fastest set<T> implementation

    Write the fastest set<T> implementation using only standard C++/C.
    Ideally it should have the same interface like std::set.
    At least the following methods must be implemented:
    insert(), find(), begin(), end(), erase(), size(), operator<(),
    and at least the forward iterator.

    Here, speed and correctness are the 2 most important factors.
    Functionally it should behave similar to std::set,
    have practically no limitation on the number of items,
    and be significantly faster, either always or under specified
    conditions (for example for random items etc.).

    The class should be placed in an own namespace.
    Supply also a benchmark routine which compares your
    implementation with that of std::set, and briefly document
    and comment these results.

    Hint: Since items in a set are always in an ordered sequence
    (as determined by operator<()) you need to use some tree
    routines (AVL, red-black, splay, etc.) or use your own tree
    method in the heart of your implementation.

    See also this paper:
    Ben Pfaff, "Performance Analysis of BST in System Software", Stanford University
    http://www.stanford.edu/~blp/papers/libavl.pdf

    Publish your finished project somewhere on the Web
    (for example at your own homepage or at http://sourceforge.net/ )
    and announce it in comp.lang.c++ .

    There is no deadline on this contest. The fastest method
    will be the winner of the SETT Programming Contest,
    until someone comes up with a faster method.

    Good Luck!
     
    SETT Programming Contest, Jun 3, 2008
    #1
    1. Advertising

  2. "Chris Thomasson" <> wrote:
    > "SETT Programming Contest" wrote:
    > >
    > > The SETT Programming Contest: The fastest set<T> implementation

    >
    > What's the prize?


    The winner will get the title "Winner of the SETT Programming Contest".
    Also a wiki page will be created and the winner presented there.
    Sponsors from the industry, academic institutions, and individuals
    can decide on their own to honor the winner with additional prizes.
     
    SETT Programming Contest, Jun 3, 2008
    #2
    1. Advertising

  3. SETT Programming Contest wrote:
    > The SETT Programming Contest: The fastest set<T> implementation


    Considerable speedups can be achieved by simply using the default
    std::set with a custom memory allocator. I have done exactly that here:

    http://warp.povusers.org/FSBAllocator/

    I suppose that could be my entry.
     
    Juha Nieminen, Jun 3, 2008
    #3
  4. SETT Programming Contest

    Lars Uffmann Guest

    SETT Programming Contest wrote:
    > The winner will get the title "Winner of the SETT Programming Contest".
    > Also a wiki page will be created and the winner presented there.
    > Sponsors from the industry, academic institutions, and individuals
    > can decide on their own to honor the winner with additional prizes.


    [x] Do your own homework
     
    Lars Uffmann, Jun 3, 2008
    #4
  5. SETT Programming Contest

    shablool Guest

    On Jun 3, 4:55 pm, "SETT Programming Contest"
    <> wrote:
    > The SETT Programming Contest: The fastest set<T> implementation
    >
    > Write the fastest set<T> implementation using only standard C++/C.
    > Ideally it should have the same interface like std::set.
    > At least the following methods must be implemented:
    > insert(), find(), begin(), end(), erase(), size(), operator<(),
    > and at least the forward iterator.
    >
    > Here, speed and correctness are the 2 most important factors.
    > Functionally it should behave similar to std::set,
    > have practically no limitation on the number of items,
    > and be significantly faster, either always or under specified
    > conditions (for example for random items etc.).
    >
    > The class should be placed in an own namespace.
    > Supply also a benchmark routine which compares your
    > implementation with that of std::set, and briefly document
    > and comment these results.
    >
    > Hint: Since items in a set are always in an ordered sequence
    > (as determined by operator<()) you need to use some tree
    > routines (AVL, red-black, splay, etc.) or use your own tree
    > method in the heart of your implementation.
    >
    > See also this paper:
    > Ben Pfaff, "Performance Analysis of BST in System Software", Stanford Universityhttp://www.stanford.edu/~blp/papers/libavl.pdf
    >
    > Publish your finished project somewhere on the Web
    > (for example at your own homepage or athttp://sourceforge.net/)
    > and announce it in comp.lang.c++ .
    >
    > There is no deadline on this contest. The fastest method
    > will be the winner of the SETT Programming Contest,
    > until someone comes up with a faster method.
    >
    > Good Luck!


    Interesting contest. However, the assumption that there should be no
    limitation on the number of items implies that the underlying memory
    management model will be similar to the one used by std::set. It
    appears that a change to this memory model (particularly within modern
    multi-threaded environments) may have a more dramatic affect on
    performance benchmarks, then the selection of the underlying data
    structure, as demonstrated by the String library
    (see: http://strinx.sourceforge.net/docs/strinx.html).

    ShacharS.
     
    shablool, Jun 3, 2008
    #5
  6. "SETT Programming Contest" <> writes:
    > The SETT Programming Contest: The fastest set<T> implementation
    >
    > Write the fastest set<T> implementation using only standard C++/C.
    > Ideally it should have the same interface like std::set.

    [snip]

    What is "C++/C"?

    Since C doesn't have C++-style templates or namespaces, the problem as
    stated cannot be solved in C (though you could probably do the
    equivalent).

    Why not just say "standard C++" and leave C out of it?

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    Nokia
    "We must do something. This is something. Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
     
    Keith Thompson, Jun 3, 2008
    #6
  7. SETT Programming Contest

    Christopher Guest

    On Jun 3, 8:55 am, "SETT Programming Contest"
    <> wrote:
    > The SETT Programming Contest: The fastest set<T> implementation
    >

    [snip drabble]

    I am running a SCTMM contest: Send Chris the most money contest.
    The contest has no deadline.
    The person who sends the most money will be the winner.
    Sponsors will decide on prizes.
    I'll create a wiki page and be sure to thank you.
    Send entries to:

    SCTMM Contest
    P.O box 50001
    Austin TX 78735
     
    Christopher, Jun 3, 2008
    #7
  8. "Keith Thompson" <> wrote:
    > "SETT Programming Contest" writes:
    > >
    > > The SETT Programming Contest: The fastest set<T> implementation
    > >
    > > Write the fastest set<T> implementation using only standard C++/C.
    > > Ideally it should have the same interface like std::set.

    > [snip]
    >
    > What is "C++/C"?
    >
    > Since C doesn't have C++-style templates or namespaces, the problem as
    > stated cannot be solved in C (though you could probably do the
    > equivalent).
    >
    > Why not just say "standard C++" and leave C out of it?


    It means that you can also use pure C for the underlying methods,
    ie. by using the 'extern "C"' interface mechanism of the C++ compiler.
    Only the public interface must be in C++, the rest can also be
    in portable standard C. For example if you already have some
    working C code (for example the tree routines) for re-use here...
     
    SETT Programming Contest, Jun 3, 2008
    #8
  9. "Richard Harter" <> wrote:
    > "SETT Programming Contest" wrote:
    > >
    > >The SETT Programming Contest: The fastest set<T> implementation
    > >
    > >Write the fastest set<T> implementation using only standard C++/C.
    > >Ideally it should have the same interface like std::set.
    > >At least the following methods must be implemented:
    > >insert(), find(), begin(), end(), erase(), size(), operator<(),
    > >and at least the forward iterator.
    > >
    > >Here, speed and correctness are the 2 most important factors.
    > >Functionally it should behave similar to std::set,
    > >have practically no limitation on the number of items,
    > >and be significantly faster, either always or under specified
    > >conditions (for example for random items etc.).
    > >
    > >The class should be placed in an own namespace.
    > >Supply also a benchmark routine which compares your
    > >implementation with that of std::set, and briefly document
    > >and comment these results.
    > >
    > >Hint: Since items in a set are always in an ordered sequence
    > >(as determined by operator<()) you need to use some tree
    > >routines (AVL, red-black, splay, etc.) or use your own tree
    > >method in the heart of your implementation.
    > >
    > >See also this paper:
    > >Ben Pfaff, "Performance Analysis of BST in System Software", Stanford University
    > >http://www.stanford.edu/~blp/papers/libavl.pdf
    > >
    > >Publish your finished project somewhere on the Web
    > >(for example at your own homepage or at http://sourceforge.net/ )
    > >and announce it in comp.lang.c++ .
    > >
    > >There is no deadline on this contest. The fastest method
    > >will be the winner of the SETT Programming Contest,
    > >until someone comes up with a faster method.
    > >
    > >The winner will get the title "Winner of the SETT Programming Contest".
    > >Also a wiki page will be created and the winner presented there.
    > >Sponsors from the industry, academic institutions, and individuals
    > >can decide on their own to honor the winner with additional prizes.
    > >
    > >Good Luck!

    >
    > I take it that you really mean C++ and not C.


    It means that you can also use pure C for the underlying methods,
    ie. by using the 'extern "C"' interface mechanism of the C++ compiler.
    Only the public interface must be in C++, the rest can also be
    in portable standard C. For example if you already have some
    working C code (for example the tree routines) for re-use here...

    > Must entries use binary trees? What about T trees or radix trees?


    You are free to use any tree mechanism you like. It's your choice.

    > Are we talking about in-memory methods here or
    > do we have to account for disk based methods too?


    In-memory methods only; much like std::set .
     
    SETT Programming Contest, Jun 3, 2008
    #9
  10. On Jun 4, 4:37 am, "SETT Programming Contest"
    <> wrote:
    > "Richard Harter" <> wrote:
    > > "SETT Programming Contest" wrote:
    > > >Write the fastest set<T> implementation using only standard C++/C.

    >
    > > >Here, speed and correctness are the 2 most important factors.

    >
    > In-memory methods only; much like std::set


    Too bad the contest doesn't include memory needs as a
    criterion; I've a method with the same compaction as
    Cleary's Compact Tree, but faster -- though not as fast as
    non-compact methods.

    Obviously, a compact method will be faster than a "fast"
    method, if the former fits in core and the latter doesn't.

    James Dow Allen
     
    James Dow Allen, Jun 4, 2008
    #10
  11. SETT Programming Contest

    James Kanze Guest

    On Jun 3, 3:55 pm, "SETT Programming Contest"
    <> wrote:
    > The SETT Programming Contest: The fastest set<T> implementation


    > Write the fastest set<T> implementation using only standard
    > C++/C.


    Fastest at what? On what machine? With which implementation of
    C++?

    [...]
    > See also this paper:
    > Ben Pfaff, "Performance Analysis of BST in System Software", Stanford Universityhttp://www.stanford.edu/~blp/papers/libavl.pdf


    Which more or less establishes that "fastest" is meaningless
    unless you define at what?

    --
    James Kanze (GABI Software) email:
    Conseils en informatique orientée objet/
    Beratung in objektorientierter Datenverarbeitung
    9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
     
    James Kanze, Jun 4, 2008
    #11
  12. SETT Programming Contest

    Bartc Guest

    "SETT Programming Contest" <> wrote
    in message news:g23kml$ld8$...
    > The SETT Programming Contest: The fastest set<T> implementation
    >
    > Write the fastest set<T> implementation using only standard C++/C.
    > Ideally it should have the same interface like std::set.
    > At least the following methods must be implemented:
    > insert(), find(), begin(), end(), erase(), size(), operator<(),
    > and at least the forward iterator.


    I'm not going to enter this (since, until today, I'd never heard of set<T>
    or BST).

    But I'm interested in clearing up what the task is.

    A set<T> is just an arbitrary collection of values (each unique) of type T?
    Like a Pascal set? (OK a Pascal set is limited to scalars and has a pre-set
    range).

    And T can be anything at all, provided operator < is meaningful for it?

    In this case it all sounds pretty straightforward, so why limit the contest
    to C++ only? Is it to eliminate languages that already have very fast
    implementations built-in?

    --
    Bartc
     
    Bartc, Jun 4, 2008
    #12
  13. SETT Programming Contest

    Ben Pfaff Guest

    "SETT Programming Contest" <>
    writes:
    > Write the fastest set<T> implementation using only standard C++/C.
    > Ideally it should have the same interface like std::set.
    > At least the following methods must be implemented:
    > insert(), find(), begin(), end(), erase(), size(), operator<(),
    > and at least the forward iterator.


    Wouldn't it be easier to compare the implementations if you
    required them to meet all the requirements placed on std::set by
    the C++ standard?

    > See also this paper:
    > Ben Pfaff, "Performance Analysis of BST in System Software", Stanford University
    > http://www.stanford.edu/~blp/papers/libavl.pdf


    How amusing.
    --
    Ben Pfaff
    http://benpfaff.org
     
    Ben Pfaff, Jun 4, 2008
    #13
  14. SETT Programming Contest

    Mike Guest

    In article <g23kml$ld8$>, lid says...
    > The SETT Programming Contest: The fastest set<T> implementation
    >
    > Write the fastest set<T> implementation using only standard C++/C.
    > Ideally it should have the same interface like std::set.
    > At least the following methods must be implemented:
    > insert(), find(), begin(), end(), erase(), size(), operator<(),
    > and at least the forward iterator.
    >
    > Here, speed and correctness are the 2 most important factors.


    Does this mean that a totally incorect but exceedingly fast implementation can still get a 50% pass mark?
     
    Mike, Jun 5, 2008
    #14
  15. SETT Programming Contest

    shablool Guest

    On Jun 4, 7:56 pm, Ben Pfaff <> wrote:
    > "SETT Programming Contest" <>
    > writes:
    >
    > > Write the fastest set<T> implementation using only standard C++/C.
    > > Ideally it should have the same interface like std::set.
    > > At least the following methods must be implemented:
    > > insert(), find(), begin(), end(), erase(), size(), operator<(),
    > > and at least the forward iterator.

    >
    > Wouldn't it be easier to compare the implementations if you
    > required them to meet all the requirements placed on std::set by
    > the C++ standard?
    >
    > > See also this paper:
    > > Ben Pfaff, "Performance Analysis of BST in System Software", Stanford University
    > >http://www.stanford.edu/~blp/papers/libavl.pdf

    >
    > How amusing.
    > --
    > Ben Pfaffhttp://benpfaff.org



    As stated in previous post, this requirement alone assumes certain
    memory model, which may affect the actual performance. Nevertheless, I
    would like propose the following code fragment as possible candidate
    for the actual test (after-all, we should start somewhere):


    template <typename SetT>
    void sett_contest(SetT& s, int n)
    {
    typedef typename SetT::value_type value_type;

    int i;
    std::vector<value_type> v(n);

    for (i = 0; i < n; ++i)
    {
    v = value_type(i);
    }

    std::random_shuffle(v.begin(), v.end());
    s.insert(v.begin(), v.end());

    std::random_shuffle(v.begin(), v.end());
    for (i = 0; i < n; ++i)
    {
    s.erase(s.find(v));
    }
    }

    ShacharS.
     
    shablool, Jun 5, 2008
    #15
  16. SETT Programming Contest

    Ben Pfaff Guest

    shablool <> writes:

    >> "SETT Programming Contest" <>
    >> writes:
    >>
    >> > Write the fastest set<T> implementation using only standard C++/C.
    >> > Ideally it should have the same interface like std::set.
    >> > At least the following methods must be implemented:
    >> > insert(), find(), begin(), end(), erase(), size(), operator<(),
    >> > and at least the forward iterator.


    > I would like propose the following code fragment as possible
    > candidate for the actual test (after-all, we should start
    > somewhere):


    [...]

    > std::random_shuffle(v.begin(), v.end());
    > s.insert(v.begin(), v.end());
    >
    > std::random_shuffle(v.begin(), v.end());
    > for (i = 0; i < n; ++i)
    > {
    > s.erase(s.find(v));
    > }


    That fragment is going to strongly favor implementations that do
    little or no balancing of the tree, because balancing is just a
    waste of time when you know that insertions and deletions occur
    in random order. Is that your intention?

    I don't see any specifications on iterator invalidation in the
    original post. Perhaps a hash table implementation would be
    acceptable. If so, that's probably going to beat any tree-based
    implementation, especially as the sets grow larger. I wonder
    whether that is the intention, too.
    --
    "While the Melissa license is a bit unclear, Melissa aggressively
    encourages free distribution of its source code."
    --Kevin Dalley <>
     
    Ben Pfaff, Jun 5, 2008
    #16
  17. SETT Programming Contest

    shablool Guest

    On Jun 5, 3:38 pm, Ben Pfaff <> wrote:
    > shablool <> writes:
    > >> "SETT Programming Contest" <>
    > >> writes:

    >
    > >> > Write the fastest set<T> implementation using only standard C++/C.
    > >> > Ideally it should have the same interface like std::set.
    > >> > At least the following methods must be implemented:
    > >> > insert(), find(), begin(), end(), erase(), size(), operator<(),
    > >> > and at least the forward iterator.

    > > I would like propose the following code fragment as possible
    > > candidate for the actual test (after-all, we should start
    > > somewhere):

    >
    > [...]
    >
    > > std::random_shuffle(v.begin(), v.end());
    > > s.insert(v.begin(), v.end());

    >
    > > std::random_shuffle(v.begin(), v.end());
    > > for (i = 0; i < n; ++i)
    > > {
    > > s.erase(s.find(v));
    > > }

    >
    > That fragment is going to strongly favor implementations that do
    > little or no balancing of the tree, because balancing is just a
    > waste of time when you know that insertions and deletions occur
    > in random order. Is that your intention?
    >
    > I don't see any specifications on iterator invalidation in the
    > original post. Perhaps a hash table implementation would be
    > acceptable. If so, that's probably going to beat any tree-based
    > implementation, especially as the sets grow larger. I wonder
    > whether that is the intention, too.
    > --
    > "While the Melissa license is a bit unclear, Melissa aggressively
    > encourages free distribution of its source code."
    > --Kevin Dalley <>


    The original post said nothing about the underlying data-structure
    representation. In particular, it did not assume usage of self
    balancing binary search trees. The above code fragment is merely a
    proposal for the actual test, which implicitly defines the interface
    requirements. If you think it is incompatible with the spirit of the
    original post, why don't you propose an alternative one?

    ShacharS.
     
    shablool, Jun 5, 2008
    #17
  18. On Jun 3, 3:55 pm, "SETT Programming Contest"
    <> wrote:
    > Write the fastest set<T> implementation using only standard C++/C.
    > Ideally it should have the same interface like std::set.
    > At least the following methods must be implemented:
    > insert(), find(), begin(), end(), erase(), size(), operator<(),
    > and at least the forward iterator.


    Is that you Razii? What are you trying to prove?

    Regards,
    Anders
     
    Anders Dalvander, Jun 5, 2008
    #18
  19. SETT Programming Contest

    Ben Pfaff Guest

    shablool <> writes:

    > On Jun 5, 3:38 pm, Ben Pfaff <> wrote:
    >> shablool <> writes:
    >> >> "SETT Programming Contest" <>
    >> >> writes:

    >>
    >> >> > Write the fastest set<T> implementation using only standard C++/C.
    >> >> > Ideally it should have the same interface like std::set.
    >> >> > At least the following methods must be implemented:
    >> >> > insert(), find(), begin(), end(), erase(), size(), operator<(),
    >> >> > and at least the forward iterator.
    >> > I would like propose the following code fragment as possible
    >> > candidate for the actual test (after-all, we should start
    >> > somewhere):

    >> That fragment is going to strongly favor implementations that do
    >> little or no balancing of the tree, because balancing is just a
    >> waste of time when you know that insertions and deletions occur
    >> in random order. Is that your intention?

    >
    > The original post said nothing about the underlying data-structure
    > representation. In particular, it did not assume usage of self
    > balancing binary search trees.


    What non-tree-like data structures fulfill the asymptotic
    requirements for std::set?

    > The above code fragment is merely a proposal for the actual
    > test, which implicitly defines the interface requirements.


    The interface requirements were defined by the OP.

    > If you think it is incompatible with the spirit of the original
    > post, why don't you propose an alternative one?


    I already proposed an alternative for the interface requirements,
    in a direct followup to the OP.
    --
    Ben Pfaff
    http://benpfaff.org
     
    Ben Pfaff, Jun 5, 2008
    #19
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Replies:
    0
    Views:
    348
  2. Jonathan
    Replies:
    6
    Views:
    1,116
    Kelsey Bjarnason
    Nov 29, 2003
  3. Replies:
    0
    Views:
    400
  4. Pumkin

    Sett absolute path problem

    Pumkin, Oct 12, 2006, in forum: ASP .Net
    Replies:
    9
    Views:
    579
    Pumkin
    Oct 12, 2006
  5. Replies:
    0
    Views:
    138
Loading...

Share This Page