Algorithm to find pairs in array [can not order]

Discussion in 'Java' started by obaqueiro@gmail.com, Nov 17, 2006.

  1. Guest

    Hello, so I am trying to make a better (with less complexity,
    prefferably without a nested loop) to find something like pairs of
    items in an ArrayList:

    So far I've got this:

    // shuffle Option offers
    Collections.shuffle(offeredOptions);
    for (int i=0;i<offeredOptions.size()-1;i++)
    {
    Offer of1 = offeredOptions.get(i);
    // look for a matching option offer in the list
    for (int j=1;j<offeredOptions.size();j++)
    {
    Offer of2 = offeredOptions.get(j);
    /* we are looking for 2 offers for the same Option contract with
    opposite
    * offer types (write and hold)
    */
    if (of1.getOption() == of1.getOption() &&
    (of1.type == OfferType.hold && of2.type == OfferType.write) ||
    (of1.type == OfferType.write && of2.type == OfferType.hold))
    {
    offeredOptions.remove(i);

    offeredOptions.remove(j);
    // do something else
    ....
    // ...
    break; // get to the
    i loop

    }
    }
    }

    I know it is not optimal, but basically I need to match 2 items from
    the array (which is randomized), and then remove them.
    I am sure there must be a more efficient way to do this but I cant
    remember the proper algorithm...

    I know it is not a *proper* Java question but didn't knew which group
    to address

    thanks anyway!
     
    , Nov 17, 2006
    #1
    1. Advertising

  2. wrote:
    > Hello, so I am trying to make a better (with less complexity,
    > prefferably without a nested loop) to find something like pairs of
    > items in an ArrayList:
    >
    > So far I've got this:
    >
    > // shuffle Option offers
    > Collections.shuffle(offeredOptions);
    > for (int i=0;i<offeredOptions.size()-1;i++)
    > {
    > Offer of1 = offeredOptions.get(i);
    > // look for a matching option offer in the list
    > for (int j=1;j<offeredOptions.size();j++)
    > {
    > Offer of2 = offeredOptions.get(j);
    > /* we are looking for 2 offers for the same Option contract with
    > opposite
    > * offer types (write and hold)
    > */
    > if (of1.getOption() == of1.getOption() &&
    > (of1.type == OfferType.hold && of2.type == OfferType.write) ||
    > (of1.type == OfferType.write && of2.type == OfferType.hold))
    > {
    > offeredOptions.remove(i);
    >
    > offeredOptions.remove(j);
    > // do something else
    > ...
    > // ...
    > break; // get to the
    > i loop
    >
    > }
    > }
    > }
    >
    > I know it is not optimal, but basically I need to match 2 items from
    > the array (which is randomized), and then remove them.
    > I am sure there must be a more efficient way to do this but I cant
    > remember the proper algorithm...
    >
    > I know it is not a *proper* Java question but didn't knew which group
    > to address
    >
    > thanks anyway!
    >


    There is a quick halving of the work. The inner loop does not need to
    begin from index 1. It can start from i+1, because any match between a
    pair either of which has an index value less than i would have been
    found on a previous outer loop iteration.

    If you rely on direct comparison, using the unordered array, the work is
    going to be O(n*n).

    What is the type of a getOption() result? Can there be more than two
    offers with the same contract?

    Depending on those answers, there may be much more efficient solutions
    for large n in which the data gets copied to a HashSet or HashMap during
    the search.

    Patricia
     
    Patricia Shanahan, Nov 17, 2006
    #2
    1. Advertising

  3. Guest

    Thanks for your answer,

    Patricia Shanahan wrote:
    >
    > There is a quick halving of the work. The inner loop does not need to
    > begin from index 1. It can start from i+1, because any match between a
    > pair either of which has an index value less than i would have been
    > found on a previous outer loop iteration.


    If you look closely at the end of the inner loop I remove the matched
    objects from the list:
    >>offeredOptions.remove(i);
    >>offeredOptions.remove(j);


    One think that I missed to add is the instruction
    i=0 ;

    After the pair is found.

    // shuffle Option offers
    Collections.shuffle(offeredOptions);
    for (int i=0;i<offeredOptions.size()-1;i++)
    {
    Offer of1 = offeredOptions.get(i);
    // look for a matching option offer in the list
    for (int j=1;j<offeredOptions.size();j++)
    {
    Offer of2 = offeredOptions.get(j);
    /* we are looking for 2 offers for the same Option contract with
    opposite
    * offer types (write and hold)
    */
    if (of1.getOption() == of1.getOption() &&
    (of1.type == OfferType.hold && of2.type ==
    OfferType.write) ||
    (of1.type == OfferType.write && of2.type ==
    OfferType.hold))
    {
    offeredOptions.remove(i);
    offeredOptions.remove(j);
    // do something else
    ...
    // ...
    i=0; // THIS WAS LEFT OUT IN THE PREVIOUS
    POST
    break; // get to the i loop
    }
    }
    }

    So basically I remove the matched objects AND start from the begginning
    of the resulting list
    (and compare the object 0 to the object 1,2,3...).

    > If you rely on direct comparison, using the unordered array, the work is
    > going to be O(n*n).


    The issue is that the matching of pairs *must* be random, as it is
    possible to have 2 objects [offers]
    that match a third one, and the selection of the 2 to be matched has to
    be random (if you have not
    infered from the code, it is some kind of random clearing between
    traders).

    > What is the type of a getOption() result? Can there be more than two
    > offers with the same contract?
    >

    Basically, getOption() returns an object with the type Option which is
    a kind of contract, there CAN be two offers (made by different traders)
    with the same contract (as the Option object lists just a *type* of
    contract, and there could be several offers with the same *type* of
    contract, the type of contract is defined by some parameters in the
    Option class)].

    And then, what will make two offers match (as stated in the code) is to
    have the SAME Option type (hence the comparisson of the two getOption
    objects) AND that the offers have opposite positions ('write' and
    'hold' which can be seen as sell and buy the contract).

    > Depending on those answers, there may be much more efficient solutions
    > for large n in which the data gets copied to a HashSet or HashMap during
    > the search.
    >
    > Patricia


    At the end I believe given the restrictions a O(n*n) is the best I can
    achieve. Not that it is wrong but I thought something better could be
    done.

    Thank you again!
    Omar
     
    , Nov 17, 2006
    #3
  4. Guest

    Thanks for your answer,

    Patricia Shanahan wrote:
    >
    > There is a quick halving of the work. The inner loop does not need to
    > begin from index 1. It can start from i+1, because any match between a
    > pair either of which has an index value less than i would have been
    > found on a previous outer loop iteration.


    If you look closely at the end of the inner loop I remove the matched
    objects from the list:
    >>offeredOptions.remove(i);
    >>offeredOptions.remove(j);


    One think that I missed to add is the instruction
    i=0 ;

    After the pair is found.

    // shuffle Option offers
    Collections.shuffle(offeredOptions);
    for (int i=0;i<offeredOptions.size()-1;i++)
    {
    Offer of1 = offeredOptions.get(i);
    // look for a matching option offer in the list
    for (int j=1;j<offeredOptions.size();j++)
    {
    Offer of2 = offeredOptions.get(j);
    /* we are looking for 2 offers for the same Option contract with
    opposite
    * offer types (write and hold)
    */
    if (of1.getOption() == of1.getOption() &&
    (of1.type == OfferType.hold && of2.type ==
    OfferType.write) ||
    (of1.type == OfferType.write && of2.type ==
    OfferType.hold))
    {
    offeredOptions.remove(i);
    offeredOptions.remove(j);
    // do something else
    ...
    // ...
    i=0; // THIS WAS LEFT OUT IN THE PREVIOUS
    POST
    break; // get to the i loop
    }
    }
    }

    So basically I remove the matched objects AND start from the begginning
    of the resulting list
    (and compare the object 0 to the object 1,2,3...).

    > If you rely on direct comparison, using the unordered array, the work is
    > going to be O(n*n).


    The issue is that the matching of pairs *must* be random, as it is
    possible to have 2 objects [offers]
    that match a third one, and the selection of the 2 to be matched has to
    be random (if you have not
    infered from the code, it is some kind of random clearing between
    traders).

    > What is the type of a getOption() result? Can there be more than two
    > offers with the same contract?
    >

    Basically, getOption() returns an object with the type Option which is
    a kind of contract, there CAN be two offers (made by different traders)
    with the same contract (as the Option object lists just a *type* of
    contract, and there could be several offers with the same *type* of
    contract, the type of contract is defined by some parameters in the
    Option class)].

    And then, what will make two offers match (as stated in the code) is to
    have the SAME Option type (hence the comparisson of the two getOption
    objects) AND that the offers have opposite positions ('write' and
    'hold' which can be seen as sell and buy the contract).

    > Depending on those answers, there may be much more efficient solutions
    > for large n in which the data gets copied to a HashSet or HashMap during
    > the search.
    >
    > Patricia


    At the end I believe given the restrictions a O(n*n) is the best I can
    achieve. Not that it is wrong but I thought something better could be
    done.

    Thank you again!
    Omar
     
    , Nov 17, 2006
    #4
  5. Guest

    skreiv:
    > At the end I believe given the restrictions a O(n*n) is the best I can
    > achieve. Not that it is wrong but I thought something better could be
    > done.


    Your code is extremely inefficient if you have many matching pairs -
    not O(n*n)!! This is due to the following reasons:

    - You restart the whole algorithm once you remove two elements!!
    - You use remove(i) with ArrayLists - causes all subsequent elements to
    be shifted!
    - You are using a bad algorithm.

    It is easy to fix the first two points, and it might make a huge
    difference in execution time. You can for instance mark each offer that
    should be removed. Don't compare marked offers and at the end you have
    a separate loop to copy the unmarked offers to a new ArrayList.

    If you still need it faster you can make the algo O(n log n) quite
    easily by making option types comparable (doesn't matter how), then you
    just sort the offers for their option type - this is O(n log n). For
    two offers with the same option type put hold before write (or vice
    versa). After the sorting you can go through the sorted list and remove
    duplicates as they will be next to eachother. This stage is O(n) if
    done properly. Random matching of pairs is easy to implement in the
    compare function.

    Daniel
     
    , Nov 17, 2006
    #5
  6. wrote:
    > skreiv:
    >> At the end I believe given the restrictions a O(n*n) is the best I can
    >> achieve. Not that it is wrong but I thought something better could be
    >> done.

    >
    > Your code is extremely inefficient if you have many matching pairs -
    > not O(n*n)!! This is due to the following reasons:
    >
    > - You restart the whole algorithm once you remove two elements!!
    > - You use remove(i) with ArrayLists - causes all subsequent elements to
    > be shifted!


    And using a LinkedList would not solve this because that would make the
    indexed access slow.

    > - You are using a bad algorithm.


    Indeed. It is even worse than I thought at first.

    >
    > It is easy to fix the first two points, and it might make a huge
    > difference in execution time. You can for instance mark each offer that
    > should be removed. Don't compare marked offers and at the end you have
    > a separate loop to copy the unmarked offers to a new ArrayList.
    >
    > If you still need it faster you can make the algo O(n log n) quite
    > easily by making option types comparable (doesn't matter how), then you
    > just sort the offers for their option type - this is O(n log n). For
    > two offers with the same option type put hold before write (or vice
    > versa). After the sorting you can go through the sorted list and remove
    > duplicates as they will be next to eachother. This stage is O(n) if
    > done properly. Random matching of pairs is easy to implement in the
    > compare function.


    In conjunction with this. note that the standard Collections.sort is
    stable. That is, if two items are equal in the sort order, they appear
    in the result in the same order as in the input. As far as I can tell,
    order only needs to be preserved among options for the same direction
    for the same contract.

    If the current algorithm is anywhere near to livable performance, the
    O(n log n) performance from those ideas should be fine. However, if the
    list is very long, consider the following:

    Have a pair of HashMap instances, each mapping from contract to a queue
    of offers, one for unmatched holds and one for unmatched writes.

    Do a single scan of the input list. I'll describe processing for a write
    - exchange "write" and "hold" to get the hold processing.

    Look up the contract in the holds map. If there is queue, poll to get
    the head item. If that is non-null, you have a match.

    If no match, add the offer to the holds map, making it the tail item of
    the queue for its contract. That may involve creating a new queue and
    inserting in the map.

    As you go along, the unmatched offers can be appended to the tail of a list.

    This reduces the cost from O(n log n) to O(n). However, unless the list
    is very long, there will be little or no gain in net performance because
    the constant factor will be bigger, and the simpler approach is better
    just because it is simpler.

    Patricia
     
    Patricia Shanahan, Nov 17, 2006
    #6
  7. bugbear Guest

    wrote:
    >
    > If you still need it faster you can make the algo O(n log n) quite
    > easily by making option types comparable (doesn't matter how), then you
    > just sort the offers for their option type - this is O(n log n). For
    > two offers with the same option type put hold before write (or vice
    > versa). After the sorting you can go through the sorted list and remove
    > duplicates as they will be next to eachother. This stage is O(n) if
    > done properly. Random matching of pairs is easy to implement in the
    > compare function.


    Can't the whole thing be made O(n) by using HashMap?
    Make equals() and hashCode() of Offer depend (only) on option,
    then make the value of the map a linked list of the Offers.

    The meat:

    List l;
    if(map.containsKey(thisOffer)) {
    l = map.get(thisOffer);
    } else {
    l = new LinkedList();
    map.put(thisOffer, l);
    }
    l.add(thisOffer);

    (with apologies for typos)

    That bunches everything up in linear time.

    Then simply iterate over the map, and (sub)iterate
    over each list.

    Uses more storage - time vs space payoff, who'd 'ave though it?

    BugBear
     
    bugbear, Nov 17, 2006
    #7
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Markus Dehmann
    Replies:
    6
    Views:
    377
    Andrew Koenig
    Feb 12, 2005
  2. ma740988
    Replies:
    6
    Views:
    712
    Neelesh Bodas
    Jan 3, 2006
  3. cesco
    Replies:
    5
    Views:
    473
    Maxim Yegorushkin
    Feb 13, 2006
  4. Eduardo Yáñez Parareda

    Pairs tournament algorithm

    Eduardo Yáñez Parareda, Aug 30, 2006, in forum: Ruby
    Replies:
    16
    Views:
    324
    James Edward Gray II
    Aug 31, 2006
  5. mark4asp
    Replies:
    3
    Views:
    172
    mark4asp
    Mar 5, 2007
Loading...

Share This Page