STL container for sorted items

Discussion in 'C++' started by Hunk, Mar 14, 2007.

  1. Hunk

    Hunk Guest

    Would like some advice on the fillowing
    I have a sorted list of items on which i require to search and
    retrieve the said item and also modify an item based on its identity.
    I think an Map stl container would be most suited. Am i correct in my
    assumptions
    Problem
    Reocrds are already stored with a unique identity in a binary
    format.The structure is roughly

    ----------------------
    Record Count
    --------------------
    Record1 -------------------------------------- String Identifier
    ------------------
    --------------------
    Obj
    pointer
    Record 2
    ------------------
    Record n
    --------------------

    -------------
    Obj1
    ------------
    Obj2
    ------------

    I propose to have a map of <strings , Obj Pointer>. The map would be
    initialised with the file item initially and then routines would be
    provided to search an item based on the identifier and update an item
    based on the identifier.
    Also if dealing with a file stored in memory , is it advisable to use
    a STL in the first place instead of say directly using address
    manipulations to get to the items since the identifiers are already
    sorted?
    Hunk, Mar 14, 2007
    #1
    1. Advertising

  2. Hunk

    Hunk Guest

    On Mar 14, 4:16 pm, "Hunk" <> wrote:
    > Would like some advice on the fillowing
    > I have a sorted list of items on which i require to search and
    > retrieve the said item and also modify an item based on its identity.
    > I think an Map stl container would be most suited. Am i correct in my
    > assumptions
    > Problem
    > Reocrds are already stored with a unique identity in a binary
    > format.The structure is roughly
    >
    > ----------------------
    > Record Count
    > --------------------
    > Record1 -------------------------------------- String Identifier
    > ------------------
    > --------------------
    > Obj
    > pointer
    > Record 2
    > ------------------
    > Record n
    > --------------------
    >
    > -------------
    > Obj1
    > ------------
    > Obj2
    > ------------
    >
    > I propose to have a map of <strings , Obj Pointer>. The map would be
    > initialised with the file item initially and then routines would be
    > provided to search an item based on the identifier and update an item
    > based on the identifier.
    > Also if dealing with a file stored in memory , is it advisable to use
    > a STL in the first place instead of say directly using address
    > manipulations to get to the items since the identifiers are already
    > sorted?


    I think the formattin above is screwed up.I'm pasting the format once
    again
    ----------------------
    Record Count
    --------------------
    Record1
    Record 2
    ------------------
    Record n
    --------------------


    -------------
    Obj1
    ------------
    Obj2
    ------------

    Each Record is made up of
    Identity (unique)
    Obj Pointer (pointer to objects obj1,ibj2)
    Hunk, Mar 14, 2007
    #2
    1. Advertising

  3. Hunk

    coosa Guest

    On Mar 14, 7:23 pm, "Hunk" <> wrote:
    > On Mar 14, 4:16 pm, "Hunk" <> wrote:
    >
    >
    >
    > > Would like some advice on the fillowing
    > > I have a sorted list of items on which i require to search and
    > > retrieve the said item and also modify an item based on its identity.
    > > I think an Map stl container would be most suited. Am i correct in my
    > > assumptions
    > > Problem
    > > Reocrds are already stored with a unique identity in a binary
    > > format.The structure is roughly

    >
    > > ----------------------
    > > Record Count
    > > --------------------
    > > Record1 -------------------------------------- String Identifier
    > > ------------------
    > > --------------------
    > > Obj
    > > pointer
    > > Record 2
    > > ------------------
    > > Record n
    > > --------------------

    >
    > > -------------
    > > Obj1
    > > ------------
    > > Obj2
    > > ------------

    >
    > > I propose to have a map of <strings , Obj Pointer>. The map would be
    > > initialised with the file item initially and then routines would be
    > > provided to search an item based on the identifier and update an item
    > > based on the identifier.
    > > Also if dealing with a file stored in memory , is it advisable to use
    > > a STL in the first place instead of say directly using address
    > > manipulations to get to the items since the identifiers are already
    > > sorted?

    >
    > I think the formattin above is screwed up.I'm pasting the format once
    > again
    > ----------------------
    > Record Count
    > --------------------
    > Record1
    > Record 2
    > ------------------
    > Record n
    > --------------------
    >
    > -------------
    > Obj1
    > ------------
    > Obj2
    > ------------
    >
    > Each Record is made up of
    > Identity (unique)
    > Obj Pointer (pointer to objects obj1,ibj2)


    I believe a Set is more appropriate;
    The sorting itself should be made in such that the < operator should
    be overloaded.
    For instance, since sorted elements is important to you, you could
    probably create a class which contains both the identity and object
    content; in that class you should overload the less operator compared
    to the identity.

    Regards
    coosa, Mar 14, 2007
    #3
  4. Hunk

    coosa Guest

    On Mar 15, 12:30 am, "coosa" <> wrote:
    > On Mar 14, 7:23 pm, "Hunk" <> wrote:
    >
    >
    >
    > > On Mar 14, 4:16 pm, "Hunk" <> wrote:

    >
    > > > Would like some advice on the fillowing
    > > > I have a sorted list of items on which i require to search and
    > > > retrieve the said item and also modify an item based on its identity.
    > > > I think an Map stl container would be most suited. Am i correct in my
    > > > assumptions
    > > > Problem
    > > > Reocrds are already stored with a unique identity in a binary
    > > > format.The structure is roughly

    >
    > > > ----------------------
    > > > Record Count
    > > > --------------------
    > > > Record1 -------------------------------------- String Identifier
    > > > ------------------
    > > > --------------------
    > > > Obj
    > > > pointer
    > > > Record 2
    > > > ------------------
    > > > Record n
    > > > --------------------

    >
    > > > -------------
    > > > Obj1
    > > > ------------
    > > > Obj2
    > > > ------------

    >
    > > > I propose to have a map of <strings , Obj Pointer>. The map would be
    > > > initialised with the file item initially and then routines would be
    > > > provided to search an item based on the identifier and update an item
    > > > based on the identifier.
    > > > Also if dealing with a file stored in memory , is it advisable to use
    > > > a STL in the first place instead of say directly using address
    > > > manipulations to get to the items since the identifiers are already
    > > > sorted?

    >
    > > I think the formattin above is screwed up.I'm pasting the format once
    > > again
    > > ----------------------
    > > Record Count
    > > --------------------
    > > Record1
    > > Record 2
    > > ------------------
    > > Record n
    > > --------------------

    >
    > > -------------
    > > Obj1
    > > ------------
    > > Obj2
    > > ------------

    >
    > > Each Record is made up of
    > > Identity (unique)
    > > Obj Pointer (pointer to objects obj1,ibj2)

    >
    > I believe a Set is more appropriate;
    > The sorting itself should be made in such that the < operator should
    > be overloaded.
    > For instance, since sorted elements is important to you, you could
    > probably create a class which contains both the identity and object
    > content; in that class you should overload the less operator compared
    > to the identity.
    >
    > Regards


    Assuming the class is called A and it contains id as int and the
    pointers to your objects;
    Hence:
    typedef set <A, less<A>> A_set;
    ..
    ..
    ..
    For your A class definition:
    bool A::eek:perator < (const A & a) const {return this->id < a.id;}
    now A_set s;
    s takes simply an object of A and it is ensured that all the object of
    A inserted into the set are sorted in ascending order based on the id
    inside each object of A.

    Hope it helps
    coosa, Mar 14, 2007
    #4
  5. Hunk

    Hunk Guest

    On Mar 14, 9:40 pm, "coosa" <> wrote:
    > On Mar 15, 12:30 am, "coosa" <> wrote:
    >
    >
    >
    >
    >
    > > On Mar 14, 7:23 pm, "Hunk" <> wrote:

    >
    > > > On Mar 14, 4:16 pm, "Hunk" <> wrote:

    >
    > > > > Would like some advice on the fillowing
    > > > > I have a sorted list of items on which i require to search and
    > > > > retrieve the said item and also modify an item based on its identity.
    > > > > I think an Map stl container would be most suited. Am i correct in my
    > > > > assumptions
    > > > > Problem
    > > > > Reocrds are already stored with a unique identity in a binary
    > > > > format.The structure is roughly

    >
    > > > > ----------------------
    > > > > Record Count
    > > > > --------------------
    > > > > Record1 -------------------------------------- String Identifier
    > > > > ------------------
    > > > > --------------------
    > > > > Obj
    > > > > pointer
    > > > > Record 2
    > > > > ------------------
    > > > > Record n
    > > > > --------------------

    >
    > > > > -------------
    > > > > Obj1
    > > > > ------------
    > > > > Obj2
    > > > > ------------

    >
    > > > > I propose to have a map of <strings , Obj Pointer>. The map would be
    > > > > initialised with the file item initially and then routines would be
    > > > > provided to search an item based on the identifier and update an item
    > > > > based on the identifier.
    > > > > Also if dealing with a file stored in memory , is it advisable to use
    > > > > a STL in the first place instead of say directly using address
    > > > > manipulations to get to the items since the identifiers are already
    > > > > sorted?

    >
    > > > I think the formattin above is screwed up.I'm pasting the format once
    > > > again
    > > > ----------------------
    > > > Record Count
    > > > --------------------
    > > > Record1
    > > > Record 2
    > > > ------------------
    > > > Record n
    > > > --------------------

    >
    > > > -------------
    > > > Obj1
    > > > ------------
    > > > Obj2
    > > > ------------

    >
    > > > Each Record is made up of
    > > > Identity (unique)
    > > > Obj Pointer (pointer to objects obj1,ibj2)

    >
    > > I believe a Set is more appropriate;
    > > The sorting itself should be made in such that the < operator should
    > > be overloaded.
    > > For instance, since sorted elements is important to you, you could
    > > probably create a class which contains both the identity and object
    > > content; in that class you should overload the less operator compared
    > > to the identity.

    >
    > > Regards

    >
    > Assuming the class is called A and it contains id as int and the
    > pointers to your objects;
    > Hence:
    > typedef set <A, less<A>> A_set;
    > .
    > .
    > .
    > For your A class definition:
    > bool A::eek:perator < (const A & a) const {return this->id < a.id;}
    > now A_set s;
    > s takes simply an object of A and it is ensured that all the object of
    > A inserted into the set are sorted in ascending order based on the id
    > inside each object of A.
    >
    > Hope it helps- Hide quoted text -
    >
    > - Show quoted text -


    How would a set help in returning an object compared to a map?
    Since the items given to me are already sorted, and i would have to
    retrieve an object based on its identity, wouldnt map be more useful
    as in the set i would have to decode the object first?
    Hunk, Mar 15, 2007
    #5
  6. Hunk

    coosa Guest

    On Mar 15, 1:17 pm, "Hunk" <> wrote:
    > On Mar 14, 9:40 pm, "coosa" <> wrote:
    >
    >
    >
    > > On Mar 15, 12:30 am, "coosa" <> wrote:

    >
    > > > On Mar 14, 7:23 pm, "Hunk" <> wrote:

    >
    > > > > On Mar 14, 4:16 pm, "Hunk" <> wrote:

    >
    > > > > > Would like some advice on the fillowing
    > > > > > I have a sorted list of items on which i require to search and
    > > > > > retrieve the said item and also modify an item based on its identity.
    > > > > > I think an Map stl container would be most suited. Am i correct in my
    > > > > > assumptions
    > > > > > Problem
    > > > > > Reocrds are already stored with a unique identity in a binary
    > > > > > format.The structure is roughly

    >
    > > > > > ----------------------
    > > > > > Record Count
    > > > > > --------------------
    > > > > > Record1 -------------------------------------- String Identifier
    > > > > > ------------------
    > > > > > --------------------
    > > > > > Obj
    > > > > > pointer
    > > > > > Record 2
    > > > > > ------------------
    > > > > > Record n
    > > > > > --------------------

    >
    > > > > > -------------
    > > > > > Obj1
    > > > > > ------------
    > > > > > Obj2
    > > > > > ------------

    >
    > > > > > I propose to have a map of <strings , Obj Pointer>. The map would be
    > > > > > initialised with the file item initially and then routines would be
    > > > > > provided to search an item based on the identifier and update an item
    > > > > > based on the identifier.
    > > > > > Also if dealing with a file stored in memory , is it advisable to use
    > > > > > a STL in the first place instead of say directly using address
    > > > > > manipulations to get to the items since the identifiers are already
    > > > > > sorted?

    >
    > > > > I think the formattin above is screwed up.I'm pasting the format once
    > > > > again
    > > > > ----------------------
    > > > > Record Count
    > > > > --------------------
    > > > > Record1
    > > > > Record 2
    > > > > ------------------
    > > > > Record n
    > > > > --------------------

    >
    > > > > -------------
    > > > > Obj1
    > > > > ------------
    > > > > Obj2
    > > > > ------------

    >
    > > > > Each Record is made up of
    > > > > Identity (unique)
    > > > > Obj Pointer (pointer to objects obj1,ibj2)

    >
    > > > I believe a Set is more appropriate;
    > > > The sorting itself should be made in such that the < operator should
    > > > be overloaded.
    > > > For instance, since sorted elements is important to you, you could
    > > > probably create a class which contains both the identity and object
    > > > content; in that class you should overload the less operator compared
    > > > to the identity.

    >
    > > > Regards

    >
    > > Assuming the class is called A and it contains id as int and the
    > > pointers to your objects;
    > > Hence:
    > > typedef set <A, less<A>> A_set;
    > > .
    > > .
    > > .
    > > For your A class definition:
    > > bool A::eek:perator < (const A & a) const {return this->id < a.id;}
    > > now A_set s;
    > > s takes simply an object of A and it is ensured that all the object of
    > > A inserted into the set are sorted in ascending order based on the id
    > > inside each object of A.

    >
    > > Hope it helps- Hide quoted text -

    >
    > > - Show quoted text -

    >
    > How would a set help in returning an object compared to a map?
    > Since the items given to me are already sorted, and i would have to
    > retrieve an object based on its identity, wouldnt map be more useful
    > as in the set i would have to decode the object first?


    Yes, map would be a better choice assuming you don't have to add any
    more to your collection;
    In simple words, if your application requires later that you add more
    into your collection which needs to be always sorted, then I suggest
    using a set; however, if you no longer need to add and the current
    data you have is already sorted before inserting into your new
    collection, then a map is a good choice for fast lookup, even though
    it doesn't play role in maps whether the keys are sorted or not (some
    one corrects me if I'm wrong).
    coosa, Mar 16, 2007
    #6
  7. Hunk

    coosa Guest

    On Mar 16, 9:06 am, "coosa" <> wrote:
    > On Mar 15, 1:17 pm, "Hunk" <> wrote:
    >
    >
    >
    > > On Mar 14, 9:40 pm, "coosa" <> wrote:

    >
    > > > On Mar 15, 12:30 am, "coosa" <> wrote:

    >
    > > > > On Mar 14, 7:23 pm, "Hunk" <> wrote:

    >
    > > > > > On Mar 14, 4:16 pm, "Hunk" <> wrote:

    >
    > > > > > > Would like some advice on the fillowing
    > > > > > > I have a sorted list of items on which i require to search and
    > > > > > > retrieve the said item and also modify an item based on its identity.
    > > > > > > I think an Map stl container would be most suited. Am i correct in my
    > > > > > > assumptions
    > > > > > > Problem
    > > > > > > Reocrds are already stored with a unique identity in a binary
    > > > > > > format.The structure is roughly

    >
    > > > > > > ----------------------
    > > > > > > Record Count
    > > > > > > --------------------
    > > > > > > Record1 -------------------------------------- String Identifier
    > > > > > > ------------------
    > > > > > > --------------------
    > > > > > > Obj
    > > > > > > pointer
    > > > > > > Record 2
    > > > > > > ------------------
    > > > > > > Record n
    > > > > > > --------------------

    >
    > > > > > > -------------
    > > > > > > Obj1
    > > > > > > ------------
    > > > > > > Obj2
    > > > > > > ------------

    >
    > > > > > > I propose to have a map of <strings , Obj Pointer>. The map would be
    > > > > > > initialised with the file item initially and then routines would be
    > > > > > > provided to search an item based on the identifier and update an item
    > > > > > > based on the identifier.
    > > > > > > Also if dealing with a file stored in memory , is it advisable to use
    > > > > > > a STL in the first place instead of say directly using address
    > > > > > > manipulations to get to the items since the identifiers are already
    > > > > > > sorted?

    >
    > > > > > I think the formattin above is screwed up.I'm pasting the format once
    > > > > > again
    > > > > > ----------------------
    > > > > > Record Count
    > > > > > --------------------
    > > > > > Record1
    > > > > > Record 2
    > > > > > ------------------
    > > > > > Record n
    > > > > > --------------------

    >
    > > > > > -------------
    > > > > > Obj1
    > > > > > ------------
    > > > > > Obj2
    > > > > > ------------

    >
    > > > > > Each Record is made up of
    > > > > > Identity (unique)
    > > > > > Obj Pointer (pointer to objects obj1,ibj2)

    >
    > > > > I believe a Set is more appropriate;
    > > > > The sorting itself should be made in such that the < operator should
    > > > > be overloaded.
    > > > > For instance, since sorted elements is important to you, you could
    > > > > probably create a class which contains both the identity and object
    > > > > content; in that class you should overload the less operator compared
    > > > > to the identity.

    >
    > > > > Regards

    >
    > > > Assuming the class is called A and it contains id as int and the
    > > > pointers to your objects;
    > > > Hence:
    > > > typedef set <A, less<A>> A_set;
    > > > .
    > > > .
    > > > .
    > > > For your A class definition:
    > > > bool A::eek:perator < (const A & a) const {return this->id < a.id;}
    > > > now A_set s;
    > > > s takes simply an object of A and it is ensured that all the object of
    > > > A inserted into the set are sorted in ascending order based on the id
    > > > inside each object of A.

    >
    > > > Hope it helps- Hide quoted text -

    >
    > > > - Show quoted text -

    >
    > > How would a set help in returning an object compared to a map?
    > > Since the items given to me are already sorted, and i would have to
    > > retrieve an object based on its identity, wouldnt map be more useful
    > > as in the set i would have to decode the object first?

    >
    > Yes, map would be a better choice assuming you don't have to add any
    > more to your collection;
    > In simple words, if your application requires later that you add more
    > into your collection which needs to be always sorted, then I suggest
    > using a set; however, if you no longer need to add and the current
    > data you have is already sorted before inserting into your new
    > collection, then a map is a good choice for fast lookup, even though
    > it doesn't play role in maps whether the keys are sorted or not (some
    > one corrects me if I'm wrong).


    But I just recalled ...
    Why not using hash_map?
    I guess it's the fastest way to retrieve elements by a certain key
    Check http://www.sgi.com/tech/stl/hash_map.html or by MSDN
    http://msdn2.microsoft.com/en-us/library/6x7w9f6z(VS.80).aspx for more
    reference.
    coosa, Mar 16, 2007
    #7
  8. Hunk

    Hunk Guest

    On Mar 16, 6:21 am, "coosa" <> wrote:
    > On Mar 16, 9:06 am, "coosa" <> wrote:
    >
    >
    >
    >
    >
    > > On Mar 15, 1:17 pm, "Hunk" <> wrote:

    >
    > > > On Mar 14, 9:40 pm, "coosa" <> wrote:

    >
    > > > > On Mar 15, 12:30 am, "coosa" <> wrote:

    >
    > > > > > On Mar 14, 7:23 pm, "Hunk" <> wrote:

    >
    > > > > > > On Mar 14, 4:16 pm, "Hunk" <> wrote:

    >
    > > > > > > > Would like some advice on the fillowing
    > > > > > > > I have a sorted list of items on which i require to search and
    > > > > > > > retrieve the said item and also modify an item based on its identity.
    > > > > > > > I think an Map stl container would be most suited. Am i correct in my
    > > > > > > > assumptions
    > > > > > > > Problem
    > > > > > > > Reocrds are already stored with a unique identity in a binary
    > > > > > > > format.The structure is roughly

    >
    > > > > > > > ----------------------
    > > > > > > > Record Count
    > > > > > > > --------------------
    > > > > > > > Record1 -------------------------------------- String Identifier
    > > > > > > > ------------------
    > > > > > > > --------------------
    > > > > > > > Obj
    > > > > > > > pointer
    > > > > > > > Record 2
    > > > > > > > ------------------
    > > > > > > > Record n
    > > > > > > > --------------------

    >
    > > > > > > > -------------
    > > > > > > > Obj1
    > > > > > > > ------------
    > > > > > > > Obj2
    > > > > > > > ------------

    >
    > > > > > > > I propose to have a map of <strings , Obj Pointer>. The map would be
    > > > > > > > initialised with the file item initially and then routines would be
    > > > > > > > provided to search an item based on the identifier and update an item
    > > > > > > > based on the identifier.
    > > > > > > > Also if dealing with a file stored in memory , is it advisable to use
    > > > > > > > a STL in the first place instead of say directly using address
    > > > > > > > manipulations to get to the items since the identifiers are already
    > > > > > > > sorted?

    >
    > > > > > > I think the formattin above is screwed up.I'm pasting the format once
    > > > > > > again
    > > > > > > ----------------------
    > > > > > > Record Count
    > > > > > > --------------------
    > > > > > > Record1
    > > > > > > Record 2
    > > > > > > ------------------
    > > > > > > Record n
    > > > > > > --------------------

    >
    > > > > > > -------------
    > > > > > > Obj1
    > > > > > > ------------
    > > > > > > Obj2
    > > > > > > ------------

    >
    > > > > > > Each Record is made up of
    > > > > > > Identity (unique)
    > > > > > > Obj Pointer (pointer to objects obj1,ibj2)

    >
    > > > > > I believe a Set is more appropriate;
    > > > > > The sorting itself should be made in such that the < operator should
    > > > > > be overloaded.
    > > > > > For instance, since sorted elements is important to you, you could
    > > > > > probably create a class which contains both the identity and object
    > > > > > content; in that class you should overload the less operator compared
    > > > > > to the identity.

    >
    > > > > > Regards

    >
    > > > > Assuming the class is called A and it contains id as int and the
    > > > > pointers to your objects;
    > > > > Hence:
    > > > > typedef set <A, less<A>> A_set;
    > > > > .
    > > > > .
    > > > > .
    > > > > For your A class definition:
    > > > > bool A::eek:perator < (const A & a) const {return this->id < a.id;}
    > > > > now A_set s;
    > > > > s takes simply an object of A and it is ensured that all the object of
    > > > > A inserted into the set are sorted in ascending order based on the id
    > > > > inside each object of A.

    >
    > > > > Hope it helps- Hide quoted text -

    >
    > > > > - Show quoted text -

    >
    > > > How would a set help in returning an object compared to a map?
    > > > Since the items given to me are already sorted, and i would have to
    > > > retrieve an object based on its identity, wouldnt map be more useful
    > > > as in the set i would have to decode the object first?

    >
    > > Yes, map would be a better choice assuming you don't have to add any
    > > more to your collection;
    > > In simple words, if your application requires later that you add more
    > > into your collection which needs to be always sorted, then I suggest
    > > using a set; however, if you no longer need to add and the current
    > > data you have is already sorted before inserting into your new
    > > collection, then a map is a good choice for fast lookup, even though
    > > it doesn't play role in maps whether the keys are sorted or not (some
    > > one corrects me if I'm wrong).

    >
    > But I just recalled ...
    > Why not using hash_map?
    > I guess it's the fastest way to retrieve elements by a certain key
    > Checkhttp://www.sgi.com/tech/stl/hash_map.htmlor by MSDNhttp://msdn2.microsoft.com/en-us/library/6x7w9f6z(VS.80).aspxfor more
    > reference.- Hide quoted text -
    >
    > - Show quoted text -


    The more i think of it i feel we should use a vector with a pair
    instead of any associative container. The reasons as explained in item
    23 in Effective Stl by Scott meyers says that associative containers
    would lead to page faults and more memory due to the fact that they
    use 3 pointers internally and they are not stored in a contiguos
    location in memory.
    So when the decision is
    Insert elements
    Sort Elements -- skip this in my case
    Look up elements
    the best decision is to use a pair in a vector to achieve the same
    thing as a map<string,structure> would do.
    Do let me know if i'm wrong in my explanations
    Hunk, Mar 16, 2007
    #8
  9. Hunk

    Lionel B Guest

    On Fri, 16 Mar 2007 04:59:02 -0700, Hunk wrote:

    > On Mar 16, 6:21 am, "coosa" <> wrote:
    >> On Mar 16, 9:06 am, "coosa" <> wrote:
    >>
    >>
    >>
    >>
    >>
    >> > On Mar 15, 1:17 pm, "Hunk" <> wrote:

    >>
    >> > > On Mar 14, 9:40 pm, "coosa" <> wrote:

    >>
    >> > > > On Mar 15, 12:30 am, "coosa" <> wrote:

    >>
    >> > > > > On Mar 14, 7:23 pm, "Hunk" <> wrote:

    >>
    >> > > > > > On Mar 14, 4:16 pm, "Hunk" <> wrote:

    >>
    >> > > > > > > Would like some advice on the fillowing
    >> > > > > > > I have a sorted list of items on which i require to search and
    >> > > > > > > retrieve the said item and also modify an item based on its identity.
    >> > > > > > > I think an Map stl container would be most suited. Am i correct in my
    >> > > > > > > assumptions


    [snip]

    >> > Yes, map would be a better choice assuming you don't have to add any
    >> > more to your collection;
    >> > In simple words, if your application requires later that you add more
    >> > into your collection which needs to be always sorted, then I suggest
    >> > using a set; however, if you no longer need to add and the current
    >> > data you have is already sorted before inserting into your new
    >> > collection, then a map is a good choice for fast lookup, even though
    >> > it doesn't play role in maps whether the keys are sorted or not (some
    >> > one corrects me if I'm wrong).

    >>
    >> But I just recalled ...
    >> Why not using hash_map?
    >> I guess it's the fastest way to retrieve elements by a certain key
    >> Check http://www.sgi.com/tech/stl/hash_map.htmlor by
    >> MSDNhttp://msdn2.microsoft.com/en-us/library/6x7w9f6z(VS.80).aspxfor
    >> more reference.

    >
    > The more i think of it i feel we should use a vector with a pair instead
    > of any associative container. The reasons as explained in item 23 in
    > Effective Stl by Scott meyers says that associative containers would
    > lead to page faults and more memory due to the fact that they use 3
    > pointers internally and they are not stored in a contiguos location in
    > memory.
    > So when the decision is
    > Insert elements
    > Sort Elements -- skip this in my case Look up elements
    > the best decision is to use a pair in a vector to achieve the same thing
    > as a map<string,structure> would do. Do let me know if i'm wrong in my
    > explanations


    My inclination would be to use the data structure that most closely
    reflects the problem you are addressing; in your case this sounds like a
    map. You won't know until you try it whether efficiency is an issue. If it
    is, you could try other ideas, such as a vector of pairs.

    --
    Lionel B
    Lionel B, Mar 16, 2007
    #9
  10. Hunk

    coosa Guest

    On Mar 16, 8:20 pm, Lionel B <> wrote:
    > On Fri, 16 Mar 2007 04:59:02 -0700, Hunk wrote:
    > > On Mar 16, 6:21 am, "coosa" <> wrote:
    > >> On Mar 16, 9:06 am, "coosa" <> wrote:

    >
    > >> > On Mar 15, 1:17 pm, "Hunk" <> wrote:

    >
    > >> > > On Mar 14, 9:40 pm, "coosa" <> wrote:

    >
    > >> > > > On Mar 15, 12:30 am, "coosa" <> wrote:

    >
    > >> > > > > On Mar 14, 7:23 pm, "Hunk" <> wrote:

    >
    > >> > > > > > On Mar 14, 4:16 pm, "Hunk" <> wrote:

    >
    > >> > > > > > > Would like some advice on the fillowing
    > >> > > > > > > I have a sorted list of items on which i require to search and
    > >> > > > > > > retrieve the said item and also modify an item based on its identity.
    > >> > > > > > > I think an Map stl container would be most suited. Am i correct in my
    > >> > > > > > > assumptions

    >
    > [snip]
    >
    >
    >
    > >> > Yes, map would be a better choice assuming you don't have to add any
    > >> > more to your collection;
    > >> > In simple words, if your application requires later that you add more
    > >> > into your collection which needs to be always sorted, then I suggest
    > >> > using a set; however, if you no longer need to add and the current
    > >> > data you have is already sorted before inserting into your new
    > >> > collection, then a map is a good choice for fast lookup, even though
    > >> > it doesn't play role in maps whether the keys are sorted or not (some
    > >> > one corrects me if I'm wrong).

    >
    > >> But I just recalled ...
    > >> Why not using hash_map?
    > >> I guess it's the fastest way to retrieve elements by a certain key
    > >> Checkhttp://www.sgi.com/tech/stl/hash_map.htmlorby
    > >> MSDNhttp://msdn2.microsoft.com/en-us/library/6x7w9f6z(VS.80).aspxfor
    > >> more reference.

    >
    > > The more i think of it i feel we should use a vector with a pair instead
    > > of any associative container. The reasons as explained in item 23 in
    > > Effective Stl by Scott meyers says that associative containers would
    > > lead to page faults and more memory due to the fact that they use 3
    > > pointers internally and they are not stored in a contiguos location in
    > > memory.
    > > So when the decision is
    > > Insert elements
    > > Sort Elements -- skip this in my case Look up elements
    > > the best decision is to use a pair in a vector to achieve the same thing
    > > as a map<string,structure> would do. Do let me know if i'm wrong in my
    > > explanations

    >
    > My inclination would be to use the data structure that most closely
    > reflects the problem you are addressing; in your case this sounds like a
    > map. You won't know until you try it whether efficiency is an issue. If it
    > is, you could try other ideas, such as a vector of pairs.
    >
    > --
    > Lionel B


    Agree with you; a map or precisely a hashed map.
    Trust me on not using vectors; I've worked on the netflix competition
    with a dataset containing over 100 millions of ratings and I used Java
    for it, but it won't play a big role if I used C++; for some
    processing the code ran and lasted around 2-3 days to finish; when I
    decided to switch to HashMap it took less than 6 hours to process.
    Vectors vs Maps, I also have the book of scott meyers and it's a great
    one and I don't see a contradiction for using Maps or hashed maps; you
    worry much about retrieving the elements fast and I guess Maps are
    best suited for that.
    Let me repeat again:
    You need to ensure sorted elements -> sets
    you need uniqueness -> sets and maps
    you need fast lookup -> maps and hashes
    you need insertions and deletions from the beginning, ending or middle
    of container -> lists
    you need random access -> vectors
    See what's most crucial for your application and decide for a
    container.

    Good luck
    coosa, Mar 16, 2007
    #10
  11. Hunk

    Matteo Guest

    On Mar 16, 11:29 am, "coosa" <> wrote:
    > Agree with you; a map or precisely a hashed map.
    > Trust me on not using vectors; I've worked on the netflix competition
    > with a dataset containing over 100 millions of ratings and I used Java
    > for it, but it won't play a big role if I used C++; for some
    > processing the code ran and lasted around 2-3 days to finish; when I
    > decided to switch to HashMap it took less than 6 hours to process.


    Were you using sorted vectors with a binary search? Or were you just
    scanning through the vector? True, a hash map would still be faster
    (in the average case) than a sorted vector, but I would like to know
    how you used vectors in this case before I take your advice.

    > Vectors vs Maps, I also have the book of scott meyers and it's a great
    > one and I don't see a contradiction for using Maps or hashed maps; you
    > worry much about retrieving the elements fast and I guess Maps are
    > best suited for that.
    > Let me repeat again:
    > You need to ensure sorted elements -> sets
    > you need uniqueness -> sets and maps
    > you need fast lookup -> maps and hashes
    > you need insertions and deletions from the beginning, ending or middle
    > of container -> lists
    > you need random access -> vectors
    > See what's most crucial for your application and decide for a
    > container.
    >
    > Good luck



    I think there is a little bit of misinformation here. Both std::map
    and std::set are sorted, and they both have fairly fast (i.e. O(log
    N)) lookup. Hash maps have even faster lookup (in the average case,
    O(1)).

    The reason that the OP might prefer a map over a set his case is that
    he wants to modify the data in the records. While this can be done
    with std::set (by storing a pointer to the record, and providing a
    comparison operator), it isn't a natural fit for the problem.

    However, as the data is already sorted, and if you will not be
    inserting or deleting data, then using a std::vector is a fine idea.
    To search a sorted vector, use std::lower_bound, which should do a
    binary search (avoid std::binary_search, as it only returns a boolean
    indicating containment).

    If you need a reference:
    http://www.sgi.com/tech/stl/lower_bound.html

    -matt
    Matteo, Mar 16, 2007
    #11
  12. Hunk

    Hunk Guest

    On Mar 17, 2:58 am, "Matteo" <> wrote:
    > On Mar 16, 11:29 am, "coosa" <> wrote:
    >
    > > Agree with you; a map or precisely a hashed map.
    > > Trust me on not using vectors; I've worked on the netflix competition
    > > with a dataset containing over 100 millions of ratings and I used Java
    > > for it, but it won't play a big role if I used C++; for some
    > > processing the code ran and lasted around 2-3 days to finish; when I
    > > decided to switch to HashMap it took less than 6 hours to process.

    >
    > Were you using sorted vectors with a binary search? Or were you just
    > scanning through the vector? True, a hash map would still be faster
    > (in the average case) than a sorted vector, but I would like to know
    > how you used vectors in this case before I take your advice.
    >
    > > Vectors vs Maps, I also have the book of scott meyers and it's a great
    > > one and I don't see a contradiction for using Maps or hashed maps; you
    > > worry much about retrieving the elements fast and I guess Maps are
    > > best suited for that.
    > > Let me repeat again:
    > > You need to ensure sorted elements -> sets
    > > you need uniqueness -> sets and maps
    > > you need fast lookup -> maps and hashes
    > > you need insertions and deletions from the beginning, ending or middle
    > > of container -> lists
    > > you need random access -> vectors
    > > See what's most crucial for your application and decide for a
    > > container.

    >
    > > Good luck

    >
    > I think there is a little bit of misinformation here. Both std::map
    > and std::set are sorted, and they both have fairly fast (i.e. O(log
    > N)) lookup. Hash maps have even faster lookup (in the average case,
    > O(1)).
    >
    > The reason that the OP might prefer a map over a set his case is that
    > he wants to modify the data in the records. While this can be done
    > with std::set (by storing a pointer to the record, and providing a
    > comparison operator), it isn't a natural fit for the problem.
    >
    > However, as the data is already sorted, and if you will not be
    > inserting or deleting data, then using a std::vector is a fine idea.
    > To search a sorted vector, use std::lower_bound, which should do a
    > binary search (avoid std::binary_search, as it only returns a boolean
    > indicating containment).
    >

    Exactly my point Matt. The data I have is already sorted.
    In short my operations involve
    1) Put the sorted data into a container ( I dont have to sort the
    data, its already in sorted form)
    2) Provide a look up to the data to the user based on unique identity
    3) May have to modify the data (the pointer is stored in the vector).
    Modifying the data will not result in a resort as the identity will
    retain the same, only the data pointed will be different)
    4)As matt pointed out, i'm using a std::lower_bound for lookup with a
    predicate logic that compares strings for equality.
    Something like
    lower_bound(vector_data.begin(),vector_data.end(),identity,DataCompare())
    Where vector_data refers to vector containing a pair of
    (identity,,pointer) and DataCompare is the predicate logic defining
    equality of identity which is in this case a string

    > If you need a reference:http://www.sgi.com/tech/stl/lower_bound.html
    >
    > -matt
    Hunk, Mar 20, 2007
    #12
  13. Hunk

    Hunk Guest

    On Mar 20, 2:06 pm, "Hunk" <> wrote:
    > On Mar 17, 2:58 am, "Matteo" <> wrote:
    >
    >
    >
    > > On Mar 16, 11:29 am, "coosa" <> wrote:

    >
    > > > Agree with you; a map or precisely a hashed map.
    > > > Trust me on not using vectors; I've worked on the netflix competition
    > > > with a dataset containing over 100 millions of ratings and I used Java
    > > > for it, but it won't play a big role if I used C++; for some
    > > > processing the code ran and lasted around 2-3 days to finish; when I
    > > > decided to switch to HashMap it took less than 6 hours to process.

    >
    > > Were you using sorted vectors with a binary search? Or were you just
    > > scanning through the vector? True, a hash map would still be faster
    > > (in the average case) than a sorted vector, but I would like to know
    > > how you used vectors in this case before I take your advice.

    >
    > > > Vectors vs Maps, I also have the book of scott meyers and it's a great
    > > > one and I don't see a contradiction for using Maps or hashed maps; you
    > > > worry much about retrieving the elements fast and I guess Maps are
    > > > best suited for that.
    > > > Let me repeat again:
    > > > You need to ensure sorted elements -> sets
    > > > you need uniqueness -> sets and maps
    > > > you need fast lookup -> maps and hashes
    > > > you need insertions and deletions from the beginning, ending or middle
    > > > of container -> lists
    > > > you need random access -> vectors
    > > > See what's most crucial for your application and decide for a
    > > > container.

    >
    > > > Good luck

    >
    > > I think there is a little bit of misinformation here. Both std::map
    > > and std::set are sorted, and they both have fairly fast (i.e. O(log
    > > N)) lookup. Hash maps have even faster lookup (in the average case,
    > > O(1)).

    >
    > > The reason that the OP might prefer a map over a set his case is that
    > > he wants to modify the data in the records. While this can be done
    > > with std::set (by storing a pointer to the record, and providing a
    > > comparison operator), it isn't a natural fit for the problem.

    >
    > > However, as the data is already sorted, and if you will not be
    > > inserting or deleting data, then using a std::vector is a fine idea.
    > > To search a sorted vector, use std::lower_bound, which should do a
    > > binary search (avoid std::binary_search, as it only returns a boolean
    > > indicating containment).

    >
    > Exactly my point Matt. The data I have is already sorted.
    > In short my operations involve
    > 1) Put the sorted data into a container ( I dont have to sort the
    > data, its already in sorted form)
    > 2) Provide a look up to the data to the user based on unique identity
    > 3) May have to modify the data (the pointer is stored in the vector).
    > Modifying the data will not result in a resort as the identity will
    > retain the same, only the data pointed will be different)
    > 4)As matt pointed out, i'm using a std::lower_bound for lookup with a
    > predicate logic that compares strings for equality.
    > Something like
    > lower_bound(vector_data.begin(),vector_data.end(),identity,DataCompare())
    > Where vector_data refers to vector containing a pair of
    > (identity,,pointer) and DataCompare is the predicate logic defining
    > equality of identity which is in this case a string
    >
    >
    >
    > > If you need a reference:http://www.sgi.com/tech/stl/lower_bound.html

    >
    > > -matt- Hide quoted text -

    >
    > - Show quoted text -- Hide quoted text -
    >
    > - Show quoted text -


    A quick question on predicate logic used in lower_bound. Does it have
    to be less than comparitor? What happens if we define the predicate to
    be an equal to operator? How do i locate an entry in a vector
    effectively. There always seems to be range check algorithms for STL
    and not any lookup for vector,list et.al
    Hunk, Mar 21, 2007
    #13
  14. Hunk

    P.J. Plauger Guest

    "Hunk" <> wrote in message
    news:...

    > A quick question on predicate logic used in lower_bound. Does it have
    > to be less than comparitor?


    It has to be a strict weak ordering. < works, as does >; but none of
    the other four comparisons are strict weak orderings.

    > What happens if we define the predicate to
    > be an equal to operator?


    lower_bound will go mildly crazy.

    > How do i locate an entry in a vector
    > effectively.


    Order the contents of the vector and search it using the same
    ordering rule.

    > There always seems to be range check algorithms for STL
    > and not any lookup for vector,list et.al


    The general algorithms work on pairs of iterators, including pairs
    of container iterators. No need for specialized algorithms, in
    general.

    P.J. Plauger
    Dinkumware, Ltd.
    http://www.dinkumware.com
    P.J. Plauger, Mar 21, 2007
    #14
  15. Hunk

    Hunk Guest

    On Mar 21, 2:51 pm, "P.J. Plauger" <> wrote:
    > "Hunk" <> wrote in message
    >
    > news:...
    >
    > > A quick question on predicate logic used in lower_bound. Does it have
    > > to be less than comparitor?

    >
    > It has to be a strict weak ordering. < works, as does >; but none of
    > the other four comparisons are strict weak orderings.
    >
    > > What happens if we define the predicate to
    > > be an equal to operator?

    >
    > lower_bound will go mildly crazy.


    Yep it did go crazy when i put in an equal to operator;). I had a
    sorted vector of strings and i tried to locate a string in that using
    lower_bound with a predicate logic defined to return a string
    comparison.

    >
    > > How do i locate an entry in a vector
    > > effectively.

    >
    > Order the contents of the vector and search it using the same
    > ordering rule.
    >
    > > There always seems to be range check algorithms for STL
    > > and not any lookup for vector,listet.al

    >
    > The general algorithms work on pairs of iterators, including pairs
    > of container iterators. No need for specialized algorithms, in
    > general.
    >


    I used the lower_bound to do a lookup on the vector items in the
    following way
    my_vector (contains a pair of identity,pointers) identity is a string
    lower_bound(my_vector.begin(),my_vector.end(),identity,datacompare())
    my datacompare defines a less than comparitor for strings

    My question is is this better compared to a map ?
    My operations are in the order mentioned below
    1) Put an already sorted list in a container (sorted on unique
    strings)
    2) do some look ups
    3) Update the data (This would not modify the string identifiers. Only
    the data associated)

    So for these set of operations would a vector be more efficient than a
    map container?

    > P.J. Plauger
    > Dinkumware, Ltd.http://www.dinkumware.com
    Hunk, Mar 22, 2007
    #15
  16. Hunk

    P.J. Plauger Guest

    "Hunk" <> wrote in message
    news:...

    > On Mar 21, 2:51 pm, "P.J. Plauger" <> wrote:
    >> "Hunk" <> wrote in message
    >>
    >> news:...
    >>
    >> > A quick question on predicate logic used in lower_bound. Does it have
    >> > to be less than comparitor?

    >>
    >> It has to be a strict weak ordering. < works, as does >; but none of
    >> the other four comparisons are strict weak orderings.
    >>
    >> > What happens if we define the predicate to
    >> > be an equal to operator?

    >>
    >> lower_bound will go mildly crazy.

    >
    > Yep it did go crazy when i put in an equal to operator;). I had a
    > sorted vector of strings and i tried to locate a string in that using
    > lower_bound with a predicate logic defined to return a string
    > comparison.
    >
    >>
    >> > How do i locate an entry in a vector
    >> > effectively.

    >>
    >> Order the contents of the vector and search it using the same
    >> ordering rule.
    >>
    >> > There always seems to be range check algorithms for STL
    >> > and not any lookup for vector,listet.al

    >>
    >> The general algorithms work on pairs of iterators, including pairs
    >> of container iterators. No need for specialized algorithms, in
    >> general.
    >>

    >
    > I used the lower_bound to do a lookup on the vector items in the
    > following way
    > my_vector (contains a pair of identity,pointers) identity is a string
    > lower_bound(my_vector.begin(),my_vector.end(),identity,datacompare())
    > my datacompare defines a less than comparitor for strings
    >
    > My question is is this better compared to a map ?
    > My operations are in the order mentioned below
    > 1) Put an already sorted list in a container (sorted on unique
    > strings)
    > 2) do some look ups
    > 3) Update the data (This would not modify the string identifiers. Only
    > the data associated)
    >
    > So for these set of operations would a vector be more efficient than a
    > map container?


    If you're not going to change the ordering of the data by updating
    it, then a vector works fine:

    -- If you can really trust that the data ordered on the way in,
    you don't even have to sort it.

    -- lower_bound on a random-access series has the same logarithmic
    time complexity as lookups in a (nearly) balanced binary tree.

    -- vector has lower space overhead than map.

    For my part, however, I'd sort the vector after filling it, just in
    case. A decent sort will verify that a sequence is already in sort
    in linear time.

    P.J. Plauger
    Dinkumware, Ltd.
    http://www.dinkumware.com
    P.J. Plauger, Mar 22, 2007
    #16
  17. Hunk

    coosa Guest

    On Mar 21, 1:45 pm, "Hunk" <> wrote:
    > On Mar 20, 2:06 pm, "Hunk" <> wrote:
    >
    >
    >
    > > On Mar 17, 2:58 am, "Matteo" <> wrote:

    >
    > > > On Mar 16, 11:29 am, "coosa" <> wrote:

    >
    > > > > Agree with you; a map or precisely a hashed map.
    > > > > Trust me on not using vectors; I've worked on the netflix competition
    > > > > with a dataset containing over 100 millions of ratings and I used Java
    > > > > for it, but it won't play a big role if I used C++; for some
    > > > > processing the code ran and lasted around 2-3 days to finish; when I
    > > > > decided to switch to HashMap it took less than 6 hours to process.

    >
    > > > Were you using sorted vectors with a binary search? Or were you just
    > > > scanning through the vector? True, a hash map would still be faster
    > > > (in the average case) than a sorted vector, but I would like to know
    > > > how you used vectors in this case before I take your advice.

    >
    > > > > Vectors vs Maps, I also have the book of scott meyers and it's a great
    > > > > one and I don't see a contradiction for using Maps or hashed maps; you
    > > > > worry much about retrieving the elements fast and I guess Maps are
    > > > > best suited for that.
    > > > > Let me repeat again:
    > > > > You need to ensure sorted elements -> sets
    > > > > you need uniqueness -> sets and maps
    > > > > you need fast lookup -> maps and hashes
    > > > > you need insertions and deletions from the beginning, ending or middle
    > > > > of container -> lists
    > > > > you need random access -> vectors
    > > > > See what's most crucial for your application and decide for a
    > > > > container.

    >
    > > > > Good luck

    >
    > > > I think there is a little bit of misinformation here. Both std::map
    > > > and std::set are sorted, and they both have fairly fast (i.e. O(log
    > > > N)) lookup. Hash maps have even faster lookup (in the average case,
    > > > O(1)).

    >
    > > > The reason that the OP might prefer a map over a set his case is that
    > > > he wants to modify the data in the records. While this can be done
    > > > with std::set (by storing a pointer to the record, and providing a
    > > > comparison operator), it isn't a natural fit for the problem.

    >
    > > > However, as the data is already sorted, and if you will not be
    > > > inserting or deleting data, then using a std::vector is a fine idea.
    > > > To search a sorted vector, use std::lower_bound, which should do a
    > > > binary search (avoid std::binary_search, as it only returns a boolean
    > > > indicating containment).

    >
    > > Exactly my point Matt. The data I have is already sorted.
    > > In short my operations involve
    > > 1) Put the sorted data into a container ( I dont have to sort the
    > > data, its already in sorted form)
    > > 2) Provide a look up to the data to the user based on unique identity
    > > 3) May have to modify the data (the pointer is stored in the vector).
    > > Modifying the data will not result in a resort as the identity will
    > > retain the same, only the data pointed will be different)
    > > 4)As matt pointed out, i'm using a std::lower_bound for lookup with a
    > > predicate logic that compares strings for equality.
    > > Something like
    > > lower_bound(vector_data.begin(),vector_data.end(),identity,DataCompare())
    > > Where vector_data refers to vector containing a pair of
    > > (identity,,pointer) and DataCompare is the predicate logic defining
    > > equality of identity which is in this case a string

    >
    > > > If you need a reference:http://www.sgi.com/tech/stl/lower_bound.html

    >
    > > > -matt- Hide quoted text -

    >
    > > - Show quoted text -- Hide quoted text -

    >
    > > - Show quoted text -

    >
    > A quick question on predicate logic used in lower_bound. Does it have
    > to be less than comparitor? What happens if we define the predicate to
    > be an equal to operator? How do i locate an entry in a vector
    > effectively. There always seems to be range check algorithms for STL
    > and not any lookup for vector,list et.al


    Then again,
    I stick to using a set since you can overload the greater operator or
    the equal operator to make the comparison other than less operator ...

    regards
    coosa, Mar 23, 2007
    #17
  18. In message <>, coosa
    <> writes
    >On Mar 21, 1:45 pm, "Hunk" <> wrote:
    >> On Mar 20, 2:06 pm, "Hunk" <> wrote:
    >>
    >>
    >>
    >> > On Mar 17, 2:58 am, "Matteo" <> wrote:

    >>
    >> > > On Mar 16, 11:29 am, "coosa" <> wrote:

    >>
    >> > > > Agree with you; a map or precisely a hashed map.
    >> > > > Trust me on not using vectors; I've worked on the netflix competition
    >> > > > with a dataset containing over 100 millions of ratings and I used Java
    >> > > > for it, but it won't play a big role if I used C++; for some
    >> > > > processing the code ran and lasted around 2-3 days to finish; when I
    >> > > > decided to switch to HashMap it took less than 6 hours to process.

    >>
    >> > > Were you using sorted vectors with a binary search? Or were you just
    >> > > scanning through the vector? True, a hash map would still be faster
    >> > > (in the average case) than a sorted vector, but I would like to know
    >> > > how you used vectors in this case before I take your advice.

    >>
    >> > > > Vectors vs Maps, I also have the book of scott meyers and it's a great
    >> > > > one and I don't see a contradiction for using Maps or hashed maps; you
    >> > > > worry much about retrieving the elements fast and I guess Maps are
    >> > > > best suited for that.
    >> > > > Let me repeat again:
    >> > > > You need to ensure sorted elements -> sets
    >> > > > you need uniqueness -> sets and maps
    >> > > > you need fast lookup -> maps and hashes
    >> > > > you need insertions and deletions from the beginning, ending or middle
    >> > > > of container -> lists
    >> > > > you need random access -> vectors
    >> > > > See what's most crucial for your application and decide for a
    >> > > > container.

    >>
    >> > > > Good luck

    >>
    >> > > I think there is a little bit of misinformation here. Both std::map
    >> > > and std::set are sorted, and they both have fairly fast (i.e. O(log
    >> > > N)) lookup. Hash maps have even faster lookup (in the average case,
    >> > > O(1)).

    >>
    >> > > The reason that the OP might prefer a map over a set his case is that
    >> > > he wants to modify the data in the records. While this can be done
    >> > > with std::set (by storing a pointer to the record, and providing a
    >> > > comparison operator), it isn't a natural fit for the problem.

    >>
    >> > > However, as the data is already sorted, and if you will not be
    >> > > inserting or deleting data, then using a std::vector is a fine idea.
    >> > > To search a sorted vector, use std::lower_bound, which should do a
    >> > > binary search (avoid std::binary_search, as it only returns a boolean
    >> > > indicating containment).

    >>
    >> > Exactly my point Matt. The data I have is already sorted.
    >> > In short my operations involve
    >> > 1) Put the sorted data into a container ( I dont have to sort the
    >> > data, its already in sorted form)
    >> > 2) Provide a look up to the data to the user based on unique identity
    >> > 3) May have to modify the data (the pointer is stored in the vector).
    >> > Modifying the data will not result in a resort as the identity will
    >> > retain the same, only the data pointed will be different)
    >> > 4)As matt pointed out, i'm using a std::lower_bound for lookup with a
    >> > predicate logic that compares strings for equality.
    >> > Something like
    >> > lower_bound(vector_data.begin(),vector_data.end(),identity,DataCompare())
    >> > Where vector_data refers to vector containing a pair of
    >> > (identity,,pointer) and DataCompare is the predicate logic defining
    >> > equality of identity which is in this case a string

    >>
    >> > > If you need a reference:http://www.sgi.com/tech/stl/lower_bound.html

    >>
    >> > > -matt- Hide quoted text -

    >>
    >> > - Show quoted text -- Hide quoted text -

    >>
    >> > - Show quoted text -

    >>
    >> A quick question on predicate logic used in lower_bound. Does it have
    >> to be less than comparitor? What happens if we define the predicate to
    >> be an equal to operator? How do i locate an entry in a vector
    >> effectively. There always seems to be range check algorithms for STL
    >> and not any lookup for vector,list et.al

    >
    >Then again,
    >I stick to using a set since you can overload the greater operator or
    >the equal operator to make the comparison other than less operator ...
    >

    You can equally well do that on the sort and lower_bound algorithms
    (make sure you use the same comparison for both!), so in that regard
    they are no different from using an associative container.

    --
    Richard Herring
    Richard Herring, Mar 27, 2007
    #18
  19. Hunk

    Hunk Guest

    On Mar 27, 4:19 pm, Richard Herring <junk@[127.0.0.1]> wrote:
    > In message <>, coosa
    > <> writes
    >
    >
    >
    > >On Mar 21, 1:45 pm, "Hunk" <> wrote:
    > >> On Mar 20, 2:06 pm, "Hunk" <> wrote:

    >
    > >> > On Mar 17, 2:58 am, "Matteo" <> wrote:

    >
    > >> > > On Mar 16, 11:29 am, "coosa" <> wrote:

    >
    > >> > > > Agree with you; a map or precisely a hashed map.
    > >> > > > Trust me on not using vectors; I've worked on the netflix competition
    > >> > > > with a dataset containing over 100 millions of ratings and I used Java
    > >> > > > for it, but it won't play a big role if I used C++; for some
    > >> > > > processing the code ran and lasted around 2-3 days to finish; when I
    > >> > > > decided to switch to HashMap it took less than 6 hours to process.

    >
    > >> > > Were you using sorted vectors with a binary search? Or were you just
    > >> > > scanning through the vector? True, a hash map would still be faster
    > >> > > (in the average case) than a sorted vector, but I would like to know
    > >> > > how you used vectors in this case before I take your advice.

    >
    > >> > > > Vectors vs Maps, I also have the book of scott meyers and it's a great
    > >> > > > one and I don't see a contradiction for using Maps or hashed maps; you
    > >> > > > worry much about retrieving the elements fast and I guess Maps are
    > >> > > > best suited for that.
    > >> > > > Let me repeat again:
    > >> > > > You need to ensure sorted elements -> sets
    > >> > > > you need uniqueness -> sets and maps
    > >> > > > you need fast lookup -> maps and hashes
    > >> > > > you need insertions and deletions from the beginning, ending or middle
    > >> > > > of container -> lists
    > >> > > > you need random access -> vectors
    > >> > > > See what's most crucial for your application and decide for a
    > >> > > > container.

    >
    > >> > > > Good luck

    >
    > >> > > I think there is a little bit of misinformation here. Both std::map
    > >> > > and std::set are sorted, and they both have fairly fast (i.e. O(log
    > >> > > N)) lookup. Hash maps have even faster lookup (in the average case,
    > >> > > O(1)).

    >
    > >> > > The reason that the OP might prefer a map over a set his case is that
    > >> > > he wants to modify the data in the records. While this can be done
    > >> > > with std::set (by storing a pointer to the record, and providing a
    > >> > > comparison operator), it isn't a natural fit for the problem.

    >
    > >> > > However, as the data is already sorted, and if you will not be
    > >> > > inserting or deleting data, then using a std::vector is a fine idea.
    > >> > > To search a sorted vector, use std::lower_bound, which should do a
    > >> > > binary search (avoid std::binary_search, as it only returns a boolean
    > >> > > indicating containment).

    >
    > >> > Exactly my point Matt. The data I have is already sorted.
    > >> > In short my operations involve
    > >> > 1) Put the sorted data into a container ( I dont have to sort the
    > >> > data, its already in sorted form)
    > >> > 2) Provide a look up to the data to the user based on unique identity
    > >> > 3) May have to modify the data (the pointer is stored in the vector).
    > >> > Modifying the data will not result in a resort as the identity will
    > >> > retain the same, only the data pointed will be different)
    > >> > 4)As matt pointed out, i'm using a std::lower_bound for lookup with a
    > >> > predicate logic that compares strings for equality.
    > >> > Something like
    > >> > lower_bound(vector_data.begin(),vector_data.end(),identity,DataCompare())
    > >> > Where vector_data refers to vector containing a pair of
    > >> > (identity,,pointer) and DataCompare is the predicate logic defining
    > >> > equality of identity which is in this case a string

    >
    > >> > > If you need a reference:http://www.sgi.com/tech/stl/lower_bound.html

    >
    > >> > > -matt- Hide quoted text -

    >
    > >> > - Show quoted text -- Hide quoted text -

    >
    > >> > - Show quoted text -

    >
    > >> A quick question on predicate logic used in lower_bound. Does it have
    > >> to be less than comparitor? What happens if we define the predicate to
    > >> be an equal to operator? How do i locate an entry in a vector
    > >> effectively. There always seems to be range check algorithms for STL
    > >> and not any lookup for vector,list et.al

    >
    > >Then again,
    > >I stick to using a set since you can overload the greater operator or
    > >the equal operator to make the comparison other than less operator ...

    >
    > You can equally well do that on the sort and lower_bound algorithms
    > (make sure you use the same comparison for both!), so in that regard
    > they are no different from using an associative container.
    >
    > --
    > Richard Herring- Hide quoted text -
    >
    > - Show quoted text -


    Thanks a lot guys for the suggestions. It was a great help. Have
    already tried out with vector, have to try out coosa suggestion of a
    set.. but i think vector would anyway suffice unless there is an added
    advantage in memory and speed in using a set
    Hunk, Mar 29, 2007
    #19
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Maitre Bart
    Replies:
    2
    Views:
    520
    Maitre Bart
    Feb 11, 2004
  2. Replies:
    6
    Views:
    553
  3. Replies:
    4
    Views:
    797
    Daniel T.
    Feb 16, 2006
  4. wolverine
    Replies:
    2
    Views:
    449
    Marcus Kwok
    Jul 24, 2006
  5. Gary Wessle

    sorted container

    Gary Wessle, Mar 7, 2007, in forum: C++
    Replies:
    7
    Views:
    287
Loading...

Share This Page