# New FAQ: How do I compute the difference of two arrays?

Discussion in 'Perl Misc' started by The Poor, Sep 26, 2003.

1. ### The PoorGuest

Here is my new code changed from the old code in FAQ.
Please add it to the perldoc for the next release of perl.

OLD:
+
How do I compute the difference of two arrays? How do I compute the
intersection of two arrays?

Use a hash. Here's code to do both and more. It assumes that each
element is unique in a given array:

@union = @intersection = @difference = ();
%count = ();
foreach \$element (@array1, @array2) { \$count{\$element}++ }
foreach \$element (keys %count) {
push @union, \$element;
push @{ \$count{\$element} > 1 ? \@intersection :
\@difference }, \$element;
}

Note that this is the *symmetric difference*, that is, all
elements in
either A or in B but not in both. Think of it as an xor operation.

-
New:
+
How do I compute the difference of two arrays? How do I compute the
intersection of two arrays?

Use a hash. Here's code to do both and more. It assumes that each
element is unique in a given array:

@union = @intersection = @in1notin2 = @in2notin1();
%count = ();
foreach \$element (@array1) { \$count{\$element}++ }
foreach \$element (@array2) { \$count{\$element}-- }
foreach \$element (keys %count) {
push @union, \$element;
push (@intersection, \$element) if \$count{\$element} == 0;
push (@in1notin2, \$element) if \$count{\$element} == 1;
push (@in2notin1, \$element) if \$count{\$element} == -1;
}
-

2 Questions:
how to fix it when the array is not uniqu?
how to simplfy the last 3 lines using case, ?: or something else?

push @intersection, \$element if \$count{\$element} == 0;
push @in1notin2, \$element if \$count{\$element} == 1;
push @in2notin1, \$element if \$count{\$element} == -1;

http://edealseek.com

The Poor, Sep 26, 2003

2. ### Steve GrazziniGuest

The Poor <.2y.net> wrote:
> Here is my new code changed from the old code in FAQ.
> Please add it to the perldoc for the next release of perl.

This is a discussion group. FAQ patches (preferably in the form
of unified or context diffs) can be mailed to perlfaq-workers at
perl.org or posted from nntp.perl.org.

You should also include some rationale -- what's wrong with the
I'm not sure that it is an improvement, but you'll have to make
a case for it. (And fix the syntax error, etc.)

--
Steve

Steve Grazzini, Sep 26, 2003

3. ### Jay TiltonGuest

Steve Grazzini <> wrote:

: The Poor <.2y.net> wrote:
: > Here is my new code changed from the old code in FAQ.
: > Please add it to the perldoc for the next release of perl.
:
: This is a discussion group. FAQ patches (preferably in the form
: of unified or context diffs) can be mailed to perlfaq-workers at
: perl.org or posted from nntp.perl.org.
:
: You should also include some rationale -- what's wrong with the
: I'm not sure that it is an improvement, but you'll have to make
: a case for it. (And fix the syntax error, etc.)

As well, when making that case it is good to keep in mind that the FAQ
answers are not intended to form a cookbook of pre-fab do-anything
solutions. That's more the domain of CPAN, or the actual Perl Cookbook.

Recipe 4.7 in Perl Cookbook (1st ed.), by the way, directly addresses
the problem of "Finding Elements in One Array but Not Another."

A more useful (and easier) change to the FAQ would be to add mention of
a module on CPAN that handles permutations of the question that the FAQ
module appears to be especially capable.

Jay Tilton, Sep 27, 2003
4. ### Bob WaltonGuest

The Poor wrote:

....
> 2 Questions:
> how to fix it when the array is not uniqu?

Make it unique:

{my %h or @array1_unique=(@h{@array1}=1 and keys %h)}

> how to simplfy the last 3 lines using case, ?: or something else?
>
> push @intersection, \$element if \$count{\$element} == 0;
> push @in1notin2, \$element if \$count{\$element} == 1;
> push @in2notin1, \$element if \$count{\$element} == -1;

One way:
push @{\$result[\$count{\$element}+1]},\$element;
#@{\$result[0][...]} is in2notin1
#@{\$result[1][...]} is intersection
#@{\$result[2][...]} is in1notin2

--
Bob Walton

Bob Walton, Sep 27, 2003
5. ### Randal L. SchwartzGuest

>>>>> "The" == The Poor <.2y.net> writes:

The> -
The> New:
The> +
The> How do I compute the difference of two arrays? How do I compute the
The> intersection of two arrays?

The> Use a hash. Here's code to do both and more. It assumes that each
The> element is unique in a given array:

Here's code that doesn't require that uniqueness, and would be a better
candidate for the FAQ:

my %tally;
\$tally{\$_} .= "a" for @array_a;
\$tally{\$_} .= "b" for @array_b;
my @union = keys %tally;
my @intersection = grep \$tally{\$_} =~ /ab/, @union;
my @a_not_b = grep \$tally{\$_} =~ /a\$/, @union;
my @b_not_a = grep \$tally{\$_} =~ /^b/, @union;

print "Just another Perl hacker,";

--
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<> <URL:http://www.stonehenge.com/merlyn/>
Perl/Unix/security consulting, Technical writing, Comedy, etc. etc.
See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl training!

Randal L. Schwartz, Sep 27, 2003
6. ### Gregory ToomeyGuest

It was a dark and stormy night, and Randal L. Schwartz managed to scribble:

>>>>>> "The" == The Poor <.2y.net> writes:

>
> The> -
> The> New:
> The> +
> The> How do I compute the difference of two arrays? How do I compute
> the The> intersection of two arrays?
>
> The> Use a hash. Here's code to do both and more. It assumes that each
> The> element is unique in a given array:
>
> Here's code that doesn't require that uniqueness, and would be a better
> candidate for the FAQ:
>
> my %tally;
> \$tally{\$_} .= "a" for @array_a;
> \$tally{\$_} .= "b" for @array_b;
> my @union = keys %tally;
> my @intersection = grep \$tally{\$_} =~ /ab/, @union;
> my @a_not_b = grep \$tally{\$_} =~ /a\$/, @union;
> my @b_not_a = grep \$tally{\$_} =~ /^b/, @union;
>
> print "Just another Perl hacker,";

Having coded this problem myself, your solution is minimal & very elegant (as expected).

gtoomey

Gregory Toomey, Sep 27, 2003
7. ### Anno SiegelGuest

Gregory Toomey <> wrote in comp.lang.perl.misc:
> It was a dark and stormy night, and Randal L. Schwartz managed to scribble:
>
> >>>>>> "The" == The Poor <.2y.net> writes:

> >
> > The> -
> > The> New:
> > The> +
> > The> How do I compute the difference of two arrays? How do I compute
> > the The> intersection of two arrays?
> >
> > The> Use a hash. Here's code to do both and more. It assumes that each
> > The> element is unique in a given array:
> >
> > Here's code that doesn't require that uniqueness, and would be a better
> > candidate for the FAQ:
> >
> > my %tally;
> > \$tally{\$_} .= "a" for @array_a;
> > \$tally{\$_} .= "b" for @array_b;
> > my @union = keys %tally;
> > my @intersection = grep \$tally{\$_} =~ /ab/, @union;
> > my @a_not_b = grep \$tally{\$_} =~ /a\$/, @union;
> > my @b_not_a = grep \$tally{\$_} =~ /^b/, @union;
> >
> > print "Just another Perl hacker,";

>
> Having coded this problem myself, your solution is minimal & very
> elegant (as expected).

....a variation on the theme:

my %tally;
\$tally{\$_} += 1 for @array_a; # or |=
\$tally{\$_} += 2 for @array_b;
# use more powers of two for more arrays
my @union = keys %tally;
my @intersection = grep \$tally{\$_} == 3, @union;
my \$a_not_b = grep \$tally{\$_} == 1, @union;
my \$b_not_a = grep \$tally{\$_} == 2, @union;
# etc.

Anno

Anno Siegel, Sep 29, 2003
8. ### David K. WallGuest

Anno Siegel <-berlin.de> wrote:

> Gregory Toomey <> wrote in comp.lang.perl.misc:
>> It was a dark and stormy night, and Randal L. Schwartz managed to
>> scribble:
>>
>> >>>>>> "The" == The Poor <.2y.net> writes:
>> >
>> > The> -
>> > The> New:
>> > The> +
>> > The> How do I compute the difference of two arrays? How do I
>> > compute the The> intersection of two arrays?
>> >
>> > The> Use a hash. Here's code to do both and more. It
>> > assumes that each The> element is unique in a given array:
>> >
>> > Here's code that doesn't require that uniqueness, and would be
>> > a better candidate for the FAQ:
>> >
>> > my %tally;
>> > \$tally{\$_} .= "a" for @array_a;
>> > \$tally{\$_} .= "b" for @array_b;
>> > my @union = keys %tally;
>> > my @intersection = grep \$tally{\$_} =~ /ab/, @union;
>> > my @a_not_b = grep \$tally{\$_} =~ /a\$/, @union;
>> > my @b_not_a = grep \$tally{\$_} =~ /^b/, @union;
>> >
>> > print "Just another Perl hacker,";

>>
>> Having coded this problem myself, your solution is minimal & very
>> elegant (as expected).

>
> ...a variation on the theme:
>
> my %tally;
> \$tally{\$_} += 1 for @array_a; # or |=
> \$tally{\$_} += 2 for @array_b;
> # use more powers of two for more arrays
> my @union = keys %tally;
> my @intersection = grep \$tally{\$_} == 3, @union;

This assumes there are no duplicated values in either array, right?

> my \$a_not_b = grep \$tally{\$_} == 1, @union;

I think you mean @a_not_b.

> my \$b_not_a = grep \$tally{\$_} == 2, @union;

^
> # etc.

Randal's solution allowed for non-unique values in @array_a and
@array_b. The above doesn't produce correct results for, say,

my @array_a = qw(a a a y z);
my @array_b = qw(b b b x y z);

my @array_c = qw(c c c x z);

my %tally;
\$tally{\$_} .= "a" for @array_a;
\$tally{\$_} .= "b" for @array_b;
\$tally{\$_} .= "c" for @array_c;
my @union = keys %tally;
my @intersection = grep \$tally{\$_} =~ /a+b+c/, @union;
my @a_only = grep \$tally{\$_} =~ /^a+\$/, @union;
my @b_only = grep \$tally{\$_} =~ /^b+\$/, @union;
my @c_only = grep \$tally{\$_} =~ /^c+\$/, @union;
my @a_and_b = grep \$tally{\$_} =~ /a+b/, @union;
my @a_b_not_c = grep \$tally{\$_} =~ /a+b+(?!c)/, @union;

David K. Wall, Sep 29, 2003
9. ### Anno SiegelGuest

David K. Wall <> wrote in comp.lang.perl.misc:
> Anno Siegel <-berlin.de> wrote:

[union, intersection of arrays]

> > ...a variation on the theme:
> >
> > my %tally;
> > \$tally{\$_} += 1 for @array_a; # or |=
> > \$tally{\$_} += 2 for @array_b;
> > # use more powers of two for more arrays
> > my @union = keys %tally;
> > my @intersection = grep \$tally{\$_} == 3, @union;

>
> This assumes there are no duplicated values in either array, right?

Yes, if you use +=, as in the code above. Not if you use |= instead,
as indicated in a comment. I must have been thinking of sets, not lists.

> > my \$a_not_b = grep \$tally{\$_} == 1, @union;

>
> I think you mean @a_not_b.

More sloppiness.

Thanks. The code should definitely use "\$tally{\$_} |= ...".

Anno

Anno Siegel, Sep 29, 2003
10. ### David K. WallGuest

Anno Siegel <-berlin.de> wrote:

> David K. Wall <> wrote in
> comp.lang.perl.misc:
>> Anno Siegel <-berlin.de> wrote:

>
> [union, intersection of arrays]
>
>> > ...a variation on the theme:
>> >
>> > my %tally;
>> > \$tally{\$_} += 1 for @array_a; # or |=
>> > \$tally{\$_} += 2 for @array_b;
>> > # use more powers of two for more arrays
>> > my @union = keys %tally;
>> > my @intersection = grep \$tally{\$_} == 3, @union;

>>
>> This assumes there are no duplicated values in either array,
>> right?

>
> Yes, if you use +=, as in the code above. Not if you use |=
> instead, as indicated in a comment. I must have been thinking of
> sets, not lists.

Aha. I didn't notice the comment. It makes sense now.

As an exercise for myself I found a number of subsets of three sets
this way. Although I didn't do any benchmarking I'm pretty sure this
is faster than the string method, but I must say I can think more