Deleting duplicates in an array using references

billb

I have a multidimensional array, but I want to delete duplicate
entries based on the first element of each 'row'.

My array is:

my @array = (
    ['UK9004411', 'A140',  'B',  0.040],
    ['UK0030239', 'H7140', 'H',  0.030],
    ['UK0030239', 'S1393', 'M1', 0.030],
    ['UK0012821', 'H4030', 'H',  0.010],
    ['UK0012821', 'H4060', 'H',  0.010],
);

and I want to end up with:

(
    ['UK9004411', 'A140',  'B', 0.040],
    ['UK0030239', 'H7140', 'H', 0.030],
    ['UK0012821', 'H4030', 'H', 0.010],
)

(No real preference as to which row is dropped; first come, first
served is fine.)

That is, take out the duplicate codes based on the first element of
each row, $array[$row]->[0].

I looked into the splice() function based on the index, but I'm not
sure this is the best way, or what the syntax for it is:

splice(@array, $row, 1); ?

Thanks.
 
Paul Lalli

> I have a multidimensional array, but I want to delete duplicate
> entries based on the first element of each 'row'.

$ perldoc -q duplicate
Found in /opt2/Perl5_8_4/lib/perl5/5.8.4/pod/perlfaq4.pod
How can I remove duplicate elements from a list or array?

$ perl -MData::Dumper -e'
my @array = (
    ["UK9004411", "A140",  "B",  0.040],
    ["UK0030239", "H7140", "H",  0.030],
    ["UK0030239", "S1393", "M1", 0.030],
    ["UK0012821", "H4030", "H",  0.010],
    ["UK0012821", "H4060", "H",  0.010],
);
my %seen;
# keep a row only the first time its first element is seen
my @nodups = grep { !$seen{$_->[0]}++ } @array;
print Dumper(\@nodups);
'
$VAR1 = [
          [
            'UK9004411',
            'A140',
            'B',
            '0.04'
          ],
          [
            'UK0030239',
            'H7140',
            'H',
            '0.03'
          ],
          [
            'UK0012821',
            'H4030',
            'H',
            '0.01'
          ]
        ];

> I looked into the splice() function based on the index, but I'm not
> sure this is the best way, or what the syntax for it is:
>
> splice(@array, $row, 1); ?

splice() is fine for removing the elements once you know which ones
you want to remove, but it's useless for actually finding which
elements to remove.
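
To make that division of labour concrete, here is a minimal sketch (not something posted in the thread) that first finds the duplicate rows with a hash and only then calls splice() to remove them in place. The variable name @dup_idx is just an illustration.

```perl
use strict;
use warnings;

# Sample rows from the question; only the first column matters here.
my @array = (
    ['UK9004411', 'A140',  'B',  0.040],
    ['UK0030239', 'H7140', 'H',  0.030],
    ['UK0030239', 'S1393', 'M1', 0.030],
    ['UK0012821', 'H4030', 'H',  0.010],
    ['UK0012821', 'H4060', 'H',  0.010],
);

# Step 1: find the indices of rows whose first element was already seen.
my %seen;
my @dup_idx = grep { $seen{ $array[$_][0] }++ } 0 .. $#array;

# Step 2: splice them out, highest index first, so that removing one
# row does not shift the positions of the rows still to be removed.
splice(@array, $_, 1) for reverse @dup_idx;

print join(',', map { $_->[0] } @array), "\n";
```

Unlike the grep version, this modifies @array itself rather than building a new array, which is probably only worth it if other code holds a reference to the original array.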

Paul Lalli
 
billb

Ah, very simple and very fast as well! I'll have to understand how
this is working. It uses a hash, I see. Many thanks.
 
Paul Lalli

>> my %seen;
>> my @nodups = grep { !$seen{$_->[0]}++ } @array;
>
> Ah, very simple and very fast as well! I'll have to understand how
> this is working. It uses a hash, I see.

It helps if you expand it out to remove all the "shortcuts":

my %seen;
my @nodups;
foreach my $elem (@array) {
    if (! $seen{$elem->[0]}) {
        push @nodups, $elem;
    }
    $seen{$elem->[0]}++;
}

So we're looping through the 2d array, and we check to see if the
first element of the current array reference has been "seen" yet. If
not, we add this array reference to our list of no duplicates. Then
we increment the number of times we've "seen" this element, so that if
the same element is seen again, we won't add it next time.
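
As a rough illustration of what the walkthrough above describes, the following sketch runs the expanded loop on the sample data and then prints the counts left in %seen. The $key variable and the printed wording are additions for this example only.

```perl
use strict;
use warnings;

my @array = (
    ['UK9004411', 'A140',  'B',  0.040],
    ['UK0030239', 'H7140', 'H',  0.030],
    ['UK0030239', 'S1393', 'M1', 0.030],
    ['UK0012821', 'H4030', 'H',  0.010],
    ['UK0012821', 'H4060', 'H',  0.010],
);

my %seen;
my @nodups;
foreach my $elem (@array) {
    my $key = $elem->[0];                   # the code we deduplicate on
    push @nodups, $elem unless $seen{$key}; # first sighting: keep the row
    $seen{$key}++;                          # count every sighting
}

# %seen now holds how many times each code occurred.
printf "%s seen %d time(s)\n", $_, $seen{$_} for sort keys %seen;
print scalar(@nodups), " rows kept\n";
```

This should report UK0012821 and UK0030239 as seen twice, UK9004411 as seen once, and three rows kept.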

The shortcuts:
* A foreach-if-push combination is equivalent to grep(). grep selects
  only those elements from a list for which the if condition holds.
* In the grep, $_ represents the current element of the array (rather
  than $elem as in the above expansion).
* The ++ operator is applied to the same expression we are testing,
  $seen{$_->[0]}, because a postfix ++ increments the value *after*
  returning that value. That is:
      $x = $foo++;
  is equivalent to:
      $x = $foo;
      $foo++;

  In contrast,
      $x = ++$foo;
  is equivalent to:
      $foo++;
      $x = $foo;
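
A small sketch to try the difference yourself. The hash key "code" and the three-pass loop are made up for the demonstration; they stand in for one row key being tested three times by the grep.

```perl
use strict;
use warnings;

my $foo = 0;
my $x = $foo++;    # $x gets the old value (0); $foo becomes 1
print "post: x=$x foo=$foo\n";

$foo = 0;
my $y = ++$foo;    # $foo becomes 1 first; $y gets the new value (1)
print "pre:  y=$y foo=$foo\n";

# The same effect inside the grep test: the first check of a key sees
# an undefined (false) count, every later check sees a true one.
my %seen;
for my $pass (1 .. 3) {
    my $keep = !$seen{code}++;
    print "pass $pass: ", ($keep ? "kept" : "dropped"), "\n";
}
```

Only the first pass prints "kept", which is why the grep lets through exactly one row per code.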

> Many thanks.

You're welcome.

Paul Lalli
 
anno4000

[Paul's solution snipped]

> Ah, very simple and very fast as well! I'll have to understand how
> this is working. It uses a hash, I see. Many thanks.

On hearing the word "duplicate", like a Pavlovian dog a Perl programmer
goes "Hash, hash, hash...". The word "unique" has the same effect.

Anno
 
