need an efficient data structure

Discussion in 'C++' started by Raghavendra Mahuli, May 27, 2004.

  1. i need to store a lot of integers(say 10,000) in a datastructure.
    The constraints are:
    1. It should be fast.
    2. It should be orderded or sorted.
    3.Insterting and deleting should be fast.
    4. The no. of elements can change at runtime and hence it should be dynamic.
    Right now i am using SortedVector , but it is very slow.
    So, can u pls suggest a suitable Datastructure.
    Raghavendra Mahuli, May 27, 2004
    #1
    1. Advertising

  2. "Raghavendra Mahuli" <> wrote in message
    news:c9485s$pkd$...
    >
    > i need to store a lot of integers(say 10,000) in a datastructure.
    > The constraints are:
    > 1. It should be fast.
    > 2. It should be orderded or sorted.
    > 3.Insterting and deleting should be fast.
    > 4. The no. of elements can change at runtime and hence it should be

    dynamic.
    > Right now i am using SortedVector , but it is very slow.
    > So, can u pls suggest a suitable Datastructure.
    >


    std::set or std::multiset if you want duplicates.

    john
    John Harrison, May 27, 2004
    #2
    1. Advertising

  3. Raghavendra Mahuli

    tom_usenet Guest

    On Thu, 27 May 2004 09:30:01 +0100, "John Harrison"
    <> wrote:

    >
    >"Raghavendra Mahuli" <> wrote in message
    >news:c9485s$pkd$...
    >>
    >> i need to store a lot of integers(say 10,000) in a datastructure.
    >> The constraints are:
    >> 1. It should be fast.
    >> 2. It should be orderded or sorted.
    >> 3.Insterting and deleting should be fast.
    >> 4. The no. of elements can change at runtime and hence it should be

    >dynamic.
    >> Right now i am using SortedVector , but it is very slow.
    >> So, can u pls suggest a suitable Datastructure.
    >>

    >
    >std::set or std::multiset if you want duplicates.


    And make sure you have a fast memory allocator! A pool allocator would
    help a lot here, if your std::allocator isn't already optimized in
    that way.

    Tom
    --
    C++ FAQ: http://www.parashift.com/c -faq-lite/
    C FAQ: http://www.eskimo.com/~scs/C-faq/top.html
    tom_usenet, May 27, 2004
    #3
  4. Raghavendra Mahuli

    Jeff Schwab Guest

    Raghavendra Mahuli wrote:
    > i need to store a lot of integers(say 10,000) in a datastructure.
    > The constraints are:
    > 1. It should be fast.


    There will be trade-offs. Which operations should be fastest?

    > 2. It should be orderded or sorted.


    Must that property be maintained at all times, or do you plan to do a
    bunch of insertions or deletions between reads? What I mean is, if you
    will insert 100 elements in a row, does the collection really need to
    re-sort itself after each insertion?

    > 3.Insterting and deleting should be fast.


    Inserting and deleting where? If you need to maintain the ordering
    after absolutely every insertion, a std::set or std::multiset may be
    best, as JohnH suggested. If you only need it to be sorted once all
    elements have been inserted, a std::vector may be more efficient by a
    constant factor in both time and space.

    > 4. The no. of elements can change at runtime and hence it should be dynamic.


    The standard library has dynamically growable collections for almost all
    your containment needs. :)
    Jeff Schwab, May 27, 2004
    #4
  5. Raghavendra Mahuli

    Siemel Naran Guest

    "Raghavendra Mahuli" <> wrote in message

    > i need to store a lot of integers(say 10,000) in a datastructure.
    > The constraints are:
    > 1. It should be fast.
    > 2. It should be orderded or sorted.
    > 3.Insterting and deleting should be fast.
    > 4. The no. of elements can change at runtime and hence it should be

    dynamic.
    > Right now i am using SortedVector , but it is very slow.
    > So, can u pls suggest a suitable Datastructure.


    In addition to the fine points raised by everyone else, will the integers be
    in a range (say 0 to 65535), and can each number occur more than once?
    Siemel Naran, May 27, 2004
    #5
  6. I'll clarify my situation further.....
    1. Having a fast structure, what i mean is insertion of elements anywhere
    (and deletion anywhere) should not take much time....Also access time, i.e.
    time required to read a particular element should also be fast....I know it
    is a tough requirement (But thats why i posted the question to this group).
    2. Also each integer can occur more than once but not more than twice.....

    If no standard structure satisfies these constraints, can u suggest a new
    idea that would satisfy the needs.....
    Raghavendra Mahuli, May 28, 2004
    #6
  7. "Jeff Schwab" <> wrote in message
    news:...
    > Raghavendra Mahuli wrote:
    > > i need to store a lot of integers(say 10,000) in a datastructure.
    > > The constraints are:
    > > 1. It should be fast.

    >
    > There will be trade-offs. Which operations should be fastest?

    --------- Insertion , deletion & searching should be fast
    > > 2. It should be orderded or sorted.

    >
    > Must that property be maintained at all times, or do you plan to do a
    > bunch of insertions or deletions between reads? What I mean is, if you
    > will insert 100 elements in a row, does the collection really need to
    > re-sort itself after each insertion?


    ------ At a time, i'll be inserting only 2 elemnts in a row.....it should
    reorder itself immediately.
    > > 3.Insterting and deleting should be fast.

    >
    > Inserting and deleting where? If you need to maintain the ordering
    > after absolutely every insertion, a std::set or std::multiset may be
    > best, as JohnH suggested. If you only need it to be sorted once all
    > elements have been inserted, a std::vector may be more efficient by a
    > constant factor in both time and space.

    --------I need to maintain ordering after every 2 insertions...
    Raghavendra Mahuli, May 28, 2004
    #7
  8. Raghavendra Mahuli

    Jeff Schwab Guest

    Raghavendra Mahuli wrote:
    > "Jeff Schwab" <> wrote in message
    > news:...
    >
    >>Raghavendra Mahuli wrote:
    >>
    >>>i need to store a lot of integers(say 10,000) in a datastructure.
    >>>The constraints are:
    >>>1. It should be fast.

    >>
    >>There will be trade-offs. Which operations should be fastest?

    >
    > --------- Insertion , deletion & searching should be fast


    Is O(ln2 n) good enough for all of those operations?

    >>>2. It should be orderded or sorted.

    >>
    >>Must that property be maintained at all times, or do you plan to do a
    >>bunch of insertions or deletions between reads? What I mean is, if you
    >>will insert 100 elements in a row, does the collection really need to
    >>re-sort itself after each insertion?

    >
    >
    > ------ At a time, i'll be inserting only 2 elemnts in a row.....it should
    > reorder itself immediately.


    Sounds like std::multiset might be your best bet.

    >
    >>>3.Insterting and deleting should be fast.

    >>
    >>Inserting and deleting where? If you need to maintain the ordering
    >>after absolutely every insertion, a std::set or std::multiset may be
    >>best, as JohnH suggested. If you only need it to be sorted once all
    >>elements have been inserted, a std::vector may be more efficient by a
    >>constant factor in both time and space.

    >
    > --------I need to maintain ordering after every 2 insertions...


    Is the number of possible values bounded? If so, how many possible
    values of elements are there? 10,000? 2^32? I don't mean how many
    elements you need to store in your container, but how many different
    values might exist for the type of your integer; for example, if you're
    using 16-bit integers, there are 2^16=65536 possible values.
    Jeff Schwab, May 28, 2004
    #8
  9. > Is O(ln2 n) good enough for all of those operations?
    --------I think that is sufficient


    > Is the number of possible values bounded? If so, how many possible
    > values of elements are there? 10,000? 2^32? I don't mean how many
    > elements you need to store in your container, but how many different
    > values might exist for the type of your integer; for example, if you're
    > using 16-bit integers, there are 2^16=65536 possible values.


    -----i am storing memory addresses....So all values in the range are
    possible.....but they'll generally be multiples of 4...........And
    maximum
    possible value is 2^32.
    Raghavendra Mahuli, May 28, 2004
    #9
  10. Raghavendra Mahuli

    Siemel Naran Guest

    "Raghavendra Mahuli" <> wrote in message
    news:c96ls6$ju6

    > -----i am storing memory addresses....So all values in the range are
    > possible.....but they'll generally be multiples of 4...........And
    > maximum
    > possible value is 2^32.


    Try the map idea suggested by John, along with the allocator suggested by
    tom_usenet.

    A direct address table, hinted at in my post, is out of the question since
    the range of values is so big (basically you have an array whose size equals
    to the number of different possible elements).

    You can use a hybrid approach, which could very well be a hashtable.
    Something like: all of the numbers in [0, 2^32] hash to one of 32 different
    values with equal probability, which still means many numbers could hash to
    the same value, and so you store the different numbers in a map (with custom
    pool allocator) data structure. See if your implementation has a
    hash_table.
    Siemel Naran, May 28, 2004
    #10
  11. Raghavendra Mahuli

    Jeff Schwab Guest

    Raghavendra Mahuli wrote:
    >>Is O(ln2 n) good enough for all of those operations?

    >
    > --------I think that is sufficient
    >
    >
    >
    >>Is the number of possible values bounded? If so, how many possible
    >>values of elements are there? 10,000? 2^32? I don't mean how many
    >>elements you need to store in your container, but how many different
    >>values might exist for the type of your integer; for example, if you're
    >>using 16-bit integers, there are 2^16=65536 possible values.

    >
    >
    > -----i am storing memory addresses....So all values in the range are
    > possible.....but they'll generally be multiples of 4...........And
    > maximum
    > possible value is 2^32.


    For the values that aren't multiples of 4, stick them in a std::set.
    The access time for them will be logarithmic with the cardinality of the
    set, but since that cardinality is small, the performance should be good.

    This leaves 2^(32-2) = 2^30 values to store; you could use a sequence of
    bits, initialized to zero, to represent whether each value is in the
    collection. Since you allow up to two of each element, you'll probably
    want an extra bit per element to show whether it has been inserted more
    than once. This means you need 2^31 bits. I would recommend using two
    std::vector<bool>: one to represent whether each value had been
    inserted at least once, and another to represent whether each value had
    been inserted more than once. You'll need at least 256 MB of RAM, plus
    space for the std::set. The machines where I work typically have
    multiple GB of RAM, so this is probably the solution I would choose if
    performance were critical.
    Jeff Schwab, May 28, 2004
    #11
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Knews
    Replies:
    5
    Views:
    363
    E.C. B├Ąck
    Sep 4, 2003
  2. John Galt
    Replies:
    4
    Views:
    525
    Chris Smith
    Feb 13, 2004
  3. John
    Replies:
    5
    Views:
    694
    Karl Heinz Buchegger
    Jan 23, 2004
  4. Dongsheng Ruan

    need help on a data structure problem

    Dongsheng Ruan, Feb 2, 2007, in forum: Python
    Replies:
    3
    Views:
    267
    Neil Cerutti
    Feb 2, 2007
  5. A
    Replies:
    27
    Views:
    1,592
    Jorgen Grahn
    Apr 17, 2011
Loading...

Share This Page