Comparing 2 unequal arrays...

R

Raistlin

Hey, for c++ does any body have a way to compare two arrays? Say one
has the numbers 1-5000, the other has *most* of the numbers from
1-5000, is there a way to display the numbers that are in the 1st
array, but not in the 2nd array? Or load those numbers in a 3rd array?

Thats the simple version...there are a couple other problems: ex:
the 10th value of the 2nd array corresponds to the 12th value of the
first.
- you cant simply make a for loop and compare each individual one...

The arrays are different sizes...one having 5000 values, the other
like 4700ish.

Any ideas? I was thinking of something like writing the smaller array
into a file, and then having something read the values of the first
array, and if it doesnt find it in the file, then it sends it to a
diff file or something...but I have NO IDEA how to code that...

Any ideas/thoughts/etc would be very welcome =)
 
C

Conrad Weyns

Raistlin said:
Hey, for c++ does any body have a way to compare two arrays? Say one
has the numbers 1-5000, the other has *most* of the numbers from
1-5000, is there a way to display the numbers that are in the 1st
array, but not in the 2nd array? Or load those numbers in a 3rd array?

Thats the simple version...there are a couple other problems: ex:
the 10th value of the 2nd array corresponds to the 12th value of the
first.
- you cant simply make a for loop and compare each individual one...

The arrays are different sizes...one having 5000 values, the other
like 4700ish.

Any ideas? I was thinking of something like writing the smaller array
into a file, and then having something read the values of the first
array, and if it doesnt find it in the file, then it sends it to a
diff file or something...but I have NO IDEA how to code that...

Any ideas/thoughts/etc would be very welcome =)

Play with the following:

---- main.cpp ----
#include <iostream>
#include <map>

int main()
{
using namespace std;
int a[] = { 1, 3, 5, 600, 600, 123, 145 };
int b[] = { 3, 3, 1, 600, 5000, 145 };

map<int, pair<bool, bool> > m;

for (int i = 0; i < sizeof(a) / sizeof(a[0]); ++i)
{
pair<bool, bool>& p = m[a];
p.first = true;
}

for (int i = 0; i < sizeof(b) / sizeof(b[0]); ++i)
{
pair<bool, bool>& p = m[b];
p.second = true;
}

for (map<int, pair<bool, bool> >::const_iterator cit = m.begin(); cit !=
m.end(); ++cit)
{
int key = cit->first;
pair<bool, bool> const& p = cit->second;
cout << key << " " << p.first << " " << p.second << endl;
}

return 0;
}

/* Output:

1 1 1
3 1 1
5 1 0
123 1 0
145 1 1
600 1 1
5000 0 1

thus:
1 is in a and b
3 is in a and b
5 is in a not b
....
5000 is not in a but in b
etc etc
*/
 
M

Mattia Belletti

Well, this is not inherently C++, it's more a general algorithm question.
I add another way over that suggested by Conrad: sort the two arrays (I
know qsort C method, but surely there's something equivalent in C++
standard libraries), and then you can simply run with two pointers over
the two arrays: when the value pointed by one is greater that the
other, increase by one the other (pointer math...) - whenever the two
pointers swap role (the one which before pointed a number greater now
points a number lesser than the other) it means that the value given by
the pointer which has remained fixed isn't present in the other array;
if they are the same, you've found an exact match, you can print it, and
then increase both.

This is very C-like, but it should work.

Time needed:
sorting (n lg n) + comparisons (n) = n lg n. I think it's the best you can
sort out, if the arrays are unordered.

--
/**
* Mattia Belletti - Undergraduate student @ cs.unibo.it
* ICQ: 33292311 - email: (e-mail address removed)
* IRC: BluShine - site(s): http://cs.unibo.it/~mbellett
* Linux registered user 299762 @ Linux registered machine 213003
*/
 
I

Ivan Vecerina

Raistlin said:
Hey, for c++ does any body have a way to compare two arrays? Say one
has the numbers 1-5000, the other has *most* of the numbers from
1-5000, is there a way to display the numbers that are in the 1st
array, but not in the 2nd array? Or load those numbers in a 3rd array?

Thats the simple version...there are a couple other problems: ex:
the 10th value of the 2nd array corresponds to the 12th value of the
first.
- you cant simply make a for loop and compare each individual one...

The arrays are different sizes...one having 5000 values, the other
like 4700ish.

Any ideas? I was thinking of something like writing the smaller array
into a file, and then having something read the values of the first
array, and if it doesnt find it in the file, then it sends it to a
diff file or something...but I have NO IDEA how to code that...

Sorting both sequences first is the most efficient approach.

The standard C++ algorithm std::set_difference outputs the
'difference' of two sorted sequences (that is, items of the first
one that are not present in the second one).

#include <vector>
#include <algorithm>
#include <iterator>
#include <iostream>

typedef std::vector<int> Items;

void printDiff(Items& a, Items& b)
{
std::sort( a.begin(), a.end() );
std::sort( b.begin(), b.end() );
std::set_difference
( a.begin(), a.end(), b.begin(), b.end()
, std::eek:stream_iterator<int>(std::cout,"\n") );
}

Instead of printing the output to cout using ostream_iterator,
you could as well store it into a new collection:
Items diff;
std::set_difference
( a.begin(), a.end(), b.begin(), b.end()
, std::back_inserter(diff) );
// diff now contains the requested items.


I hope this helps,
Ivan
 
C

Conrad Weyns

Ivan Vecerina said:
Sorting both sequences first is the most efficient approach.

The standard C++ algorithm std::set_difference outputs the
'difference' of two sorted sequences

The key here is "difference".
(that is, items of the first one that are not present in the second one).

I think this is only true if there are no duplicates in a collection else
you'll get the number of times one specific value occurs *more* often than
in the other sequence. Ehm, I should be able to phrase that better, so:

a[] = { 7, 7, 7 };
b[] = { 7 };

difference = { 7, 7 };

I think anyway..
Regards,
Conrad Weyns
 
C

Conrad Weyns

Conrad Weyns said:
Ivan Vecerina said:
[]
Sorting both sequences first is the most efficient approach.

The standard C++ algorithm std::set_difference outputs the
'difference' of two sorted sequences

The key here is "difference".
(that is, items of the first one that are not present in the second one).

Sorry! I see it now, that's exactly what you meant....
conrad weyns

I think this is only true if there are no duplicates in a collection else
you'll get the number of times one specific value occurs *more* often than
in the other sequence. Ehm, I should be able to phrase that better, so:

a[] = { 7, 7, 7 };
b[] = { 7 };

difference = { 7, 7 };

I think anyway..
Regards,
Conrad Weyns
[]
I hope this helps,
Ivan
 
R

Ron Natalie

Raistlin said:
Hey, for c++ does any body have a way to compare two arrays?

Well the equality operator doesn't really work on arrays at all (due to the misimplementation
of arrays in the language). There are various algorithms for doing element-wise comparisons.

See the Equal library function for simple equality tests.
Say one
has the numbers 1-5000, the other has *most* of the numbers from
1-5000, is there a way to display the numbers that are in the 1st
array, but not in the 2nd array? Or load those numbers in a 3rd array?

Are the arrays sorted? It makes a difference as to the algorithm.
Thats the simple version...there are a couple other problems: ex:
the 10th value of the 2nd array corresponds to the 12th value of the
first.
- you cant simply make a for loop and compare each individual one...

Is the order of the elements found important? Are the numbers unique?
There are solutions, but we need a more conscise description of what you want.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,769
Messages
2,569,580
Members
45,054
Latest member
TrimKetoBoost

Latest Threads

Top