# Algorithm to find pairs in array [can not order]

Discussion in 'Java' started by obaqueiro@gmail.com, Nov 17, 2006.

1. ### Guest

Hello, so I am trying to make a better (with less complexity,
prefferably without a nested loop) to find something like pairs of
items in an ArrayList:

So far I've got this:

// shuffle Option offers
Collections.shuffle(offeredOptions);
for (int i=0;i<offeredOptions.size()-1;i++)
{
Offer of1 = offeredOptions.get(i);
// look for a matching option offer in the list
for (int j=1;j<offeredOptions.size();j++)
{
Offer of2 = offeredOptions.get(j);
/* we are looking for 2 offers for the same Option contract with
opposite
* offer types (write and hold)
*/
if (of1.getOption() == of1.getOption() &&
(of1.type == OfferType.hold && of2.type == OfferType.write) ||
(of1.type == OfferType.write && of2.type == OfferType.hold))
{
offeredOptions.remove(i);

offeredOptions.remove(j);
// do something else
....
// ...
break; // get to the
i loop

}
}
}

I know it is not optimal, but basically I need to match 2 items from
the array (which is randomized), and then remove them.
I am sure there must be a more efficient way to do this but I cant
remember the proper algorithm...

I know it is not a *proper* Java question but didn't knew which group

thanks anyway!

, Nov 17, 2006

2. ### Patricia ShanahanGuest

wrote:
> Hello, so I am trying to make a better (with less complexity,
> prefferably without a nested loop) to find something like pairs of
> items in an ArrayList:
>
> So far I've got this:
>
> // shuffle Option offers
> Collections.shuffle(offeredOptions);
> for (int i=0;i<offeredOptions.size()-1;i++)
> {
> Offer of1 = offeredOptions.get(i);
> // look for a matching option offer in the list
> for (int j=1;j<offeredOptions.size();j++)
> {
> Offer of2 = offeredOptions.get(j);
> /* we are looking for 2 offers for the same Option contract with
> opposite
> * offer types (write and hold)
> */
> if (of1.getOption() == of1.getOption() &&
> (of1.type == OfferType.hold && of2.type == OfferType.write) ||
> (of1.type == OfferType.write && of2.type == OfferType.hold))
> {
> offeredOptions.remove(i);
>
> offeredOptions.remove(j);
> // do something else
> ...
> // ...
> break; // get to the
> i loop
>
> }
> }
> }
>
> I know it is not optimal, but basically I need to match 2 items from
> the array (which is randomized), and then remove them.
> I am sure there must be a more efficient way to do this but I cant
> remember the proper algorithm...
>
> I know it is not a *proper* Java question but didn't knew which group
>
> thanks anyway!
>

There is a quick halving of the work. The inner loop does not need to
begin from index 1. It can start from i+1, because any match between a
pair either of which has an index value less than i would have been
found on a previous outer loop iteration.

If you rely on direct comparison, using the unordered array, the work is
going to be O(n*n).

What is the type of a getOption() result? Can there be more than two
offers with the same contract?

Depending on those answers, there may be much more efficient solutions
for large n in which the data gets copied to a HashSet or HashMap during
the search.

Patricia

Patricia Shanahan, Nov 17, 2006

3. ### Guest

Patricia Shanahan wrote:
>
> There is a quick halving of the work. The inner loop does not need to
> begin from index 1. It can start from i+1, because any match between a
> pair either of which has an index value less than i would have been
> found on a previous outer loop iteration.

If you look closely at the end of the inner loop I remove the matched
objects from the list:
>>offeredOptions.remove(i);
>>offeredOptions.remove(j);

One think that I missed to add is the instruction
i=0 ;

After the pair is found.

// shuffle Option offers
Collections.shuffle(offeredOptions);
for (int i=0;i<offeredOptions.size()-1;i++)
{
Offer of1 = offeredOptions.get(i);
// look for a matching option offer in the list
for (int j=1;j<offeredOptions.size();j++)
{
Offer of2 = offeredOptions.get(j);
/* we are looking for 2 offers for the same Option contract with
opposite
* offer types (write and hold)
*/
if (of1.getOption() == of1.getOption() &&
(of1.type == OfferType.hold && of2.type ==
OfferType.write) ||
(of1.type == OfferType.write && of2.type ==
OfferType.hold))
{
offeredOptions.remove(i);
offeredOptions.remove(j);
// do something else
...
// ...
i=0; // THIS WAS LEFT OUT IN THE PREVIOUS
POST
break; // get to the i loop
}
}
}

So basically I remove the matched objects AND start from the begginning
of the resulting list
(and compare the object 0 to the object 1,2,3...).

> If you rely on direct comparison, using the unordered array, the work is
> going to be O(n*n).

The issue is that the matching of pairs *must* be random, as it is
possible to have 2 objects [offers]
that match a third one, and the selection of the 2 to be matched has to
be random (if you have not
infered from the code, it is some kind of random clearing between

> What is the type of a getOption() result? Can there be more than two
> offers with the same contract?
>

Basically, getOption() returns an object with the type Option which is
a kind of contract, there CAN be two offers (made by different traders)
with the same contract (as the Option object lists just a *type* of
contract, and there could be several offers with the same *type* of
contract, the type of contract is defined by some parameters in the
Option class)].

And then, what will make two offers match (as stated in the code) is to
have the SAME Option type (hence the comparisson of the two getOption
objects) AND that the offers have opposite positions ('write' and
'hold' which can be seen as sell and buy the contract).

> Depending on those answers, there may be much more efficient solutions
> for large n in which the data gets copied to a HashSet or HashMap during
> the search.
>
> Patricia

At the end I believe given the restrictions a O(n*n) is the best I can
achieve. Not that it is wrong but I thought something better could be
done.

Thank you again!
Omar

, Nov 17, 2006
4. ### Guest

Patricia Shanahan wrote:
>
> There is a quick halving of the work. The inner loop does not need to
> begin from index 1. It can start from i+1, because any match between a
> pair either of which has an index value less than i would have been
> found on a previous outer loop iteration.

If you look closely at the end of the inner loop I remove the matched
objects from the list:
>>offeredOptions.remove(i);
>>offeredOptions.remove(j);

One think that I missed to add is the instruction
i=0 ;

After the pair is found.

// shuffle Option offers
Collections.shuffle(offeredOptions);
for (int i=0;i<offeredOptions.size()-1;i++)
{
Offer of1 = offeredOptions.get(i);
// look for a matching option offer in the list
for (int j=1;j<offeredOptions.size();j++)
{
Offer of2 = offeredOptions.get(j);
/* we are looking for 2 offers for the same Option contract with
opposite
* offer types (write and hold)
*/
if (of1.getOption() == of1.getOption() &&
(of1.type == OfferType.hold && of2.type ==
OfferType.write) ||
(of1.type == OfferType.write && of2.type ==
OfferType.hold))
{
offeredOptions.remove(i);
offeredOptions.remove(j);
// do something else
...
// ...
i=0; // THIS WAS LEFT OUT IN THE PREVIOUS
POST
break; // get to the i loop
}
}
}

So basically I remove the matched objects AND start from the begginning
of the resulting list
(and compare the object 0 to the object 1,2,3...).

> If you rely on direct comparison, using the unordered array, the work is
> going to be O(n*n).

The issue is that the matching of pairs *must* be random, as it is
possible to have 2 objects [offers]
that match a third one, and the selection of the 2 to be matched has to
be random (if you have not
infered from the code, it is some kind of random clearing between

> What is the type of a getOption() result? Can there be more than two
> offers with the same contract?
>

Basically, getOption() returns an object with the type Option which is
a kind of contract, there CAN be two offers (made by different traders)
with the same contract (as the Option object lists just a *type* of
contract, and there could be several offers with the same *type* of
contract, the type of contract is defined by some parameters in the
Option class)].

And then, what will make two offers match (as stated in the code) is to
have the SAME Option type (hence the comparisson of the two getOption
objects) AND that the offers have opposite positions ('write' and
'hold' which can be seen as sell and buy the contract).

> Depending on those answers, there may be much more efficient solutions
> for large n in which the data gets copied to a HashSet or HashMap during
> the search.
>
> Patricia

At the end I believe given the restrictions a O(n*n) is the best I can
achieve. Not that it is wrong but I thought something better could be
done.

Thank you again!
Omar

, Nov 17, 2006
5. ### Guest

skreiv:
> At the end I believe given the restrictions a O(n*n) is the best I can
> achieve. Not that it is wrong but I thought something better could be
> done.

Your code is extremely inefficient if you have many matching pairs -
not O(n*n)!! This is due to the following reasons:

- You restart the whole algorithm once you remove two elements!!
- You use remove(i) with ArrayLists - causes all subsequent elements to
be shifted!
- You are using a bad algorithm.

It is easy to fix the first two points, and it might make a huge
difference in execution time. You can for instance mark each offer that
should be removed. Don't compare marked offers and at the end you have
a separate loop to copy the unmarked offers to a new ArrayList.

If you still need it faster you can make the algo O(n log n) quite
easily by making option types comparable (doesn't matter how), then you
just sort the offers for their option type - this is O(n log n). For
two offers with the same option type put hold before write (or vice
versa). After the sorting you can go through the sorted list and remove
duplicates as they will be next to eachother. This stage is O(n) if
done properly. Random matching of pairs is easy to implement in the
compare function.

Daniel

, Nov 17, 2006
6. ### Patricia ShanahanGuest

wrote:
> skreiv:
>> At the end I believe given the restrictions a O(n*n) is the best I can
>> achieve. Not that it is wrong but I thought something better could be
>> done.

>
> Your code is extremely inefficient if you have many matching pairs -
> not O(n*n)!! This is due to the following reasons:
>
> - You restart the whole algorithm once you remove two elements!!
> - You use remove(i) with ArrayLists - causes all subsequent elements to
> be shifted!

And using a LinkedList would not solve this because that would make the
indexed access slow.

> - You are using a bad algorithm.

Indeed. It is even worse than I thought at first.

>
> It is easy to fix the first two points, and it might make a huge
> difference in execution time. You can for instance mark each offer that
> should be removed. Don't compare marked offers and at the end you have
> a separate loop to copy the unmarked offers to a new ArrayList.
>
> If you still need it faster you can make the algo O(n log n) quite
> easily by making option types comparable (doesn't matter how), then you
> just sort the offers for their option type - this is O(n log n). For
> two offers with the same option type put hold before write (or vice
> versa). After the sorting you can go through the sorted list and remove
> duplicates as they will be next to eachother. This stage is O(n) if
> done properly. Random matching of pairs is easy to implement in the
> compare function.

In conjunction with this. note that the standard Collections.sort is
stable. That is, if two items are equal in the sort order, they appear
in the result in the same order as in the input. As far as I can tell,
order only needs to be preserved among options for the same direction
for the same contract.

If the current algorithm is anywhere near to livable performance, the
O(n log n) performance from those ideas should be fine. However, if the
list is very long, consider the following:

Have a pair of HashMap instances, each mapping from contract to a queue
of offers, one for unmatched holds and one for unmatched writes.

Do a single scan of the input list. I'll describe processing for a write
- exchange "write" and "hold" to get the hold processing.

Look up the contract in the holds map. If there is queue, poll to get
the head item. If that is non-null, you have a match.

If no match, add the offer to the holds map, making it the tail item of
the queue for its contract. That may involve creating a new queue and
inserting in the map.

As you go along, the unmatched offers can be appended to the tail of a list.

This reduces the cost from O(n log n) to O(n). However, unless the list
is very long, there will be little or no gain in net performance because
the constant factor will be bigger, and the simpler approach is better
just because it is simpler.

Patricia

Patricia Shanahan, Nov 17, 2006
7. ### bugbearGuest

wrote:
>
> If you still need it faster you can make the algo O(n log n) quite
> easily by making option types comparable (doesn't matter how), then you
> just sort the offers for their option type - this is O(n log n). For
> two offers with the same option type put hold before write (or vice
> versa). After the sorting you can go through the sorted list and remove
> duplicates as they will be next to eachother. This stage is O(n) if
> done properly. Random matching of pairs is easy to implement in the
> compare function.

Can't the whole thing be made O(n) by using HashMap?
Make equals() and hashCode() of Offer depend (only) on option,
then make the value of the map a linked list of the Offers.

The meat:

List l;
if(map.containsKey(thisOffer)) {
l = map.get(thisOffer);
} else {
map.put(thisOffer, l);
}

(with apologies for typos)

That bunches everything up in linear time.

Then simply iterate over the map, and (sub)iterate
over each list.

Uses more storage - time vs space payoff, who'd 'ave though it?

BugBear

bugbear, Nov 17, 2006