deleting duplicates in array using references

billb · Jul 30, 2007

I have a multidimensional array, and I want to delete duplicate
entries based on the first element of each 'row'.

my array is:

my @array = (
    ['UK9004411', 'A140',  'B',  0.040],
    ['UK0030239', 'H7140', 'H',  0.030],
    ['UK0030239', 'S1393', 'M1', 0.030],
    ['UK0012821', 'H4030', 'H',  0.010],
    ['UK0012821', 'H4060', 'H',  0.010],
);

and I want to end up with

(
    ['UK9004411', 'A140',  'B', 0.040],
    ['UK0030239', 'H7140', 'H', 0.030],
    ['UK0012821', 'H4030', 'H', 0.010],
)

(no real preference in which row is dropped...just on a first come
first served basis.)

i.e. take out the duplicate codes based on the first element of each
row, $array[$row]->[0].

I looked into the splice() function, removing rows by index, but I'm not
sure whether this is the best way, or whether this is the right syntax:

splice(@array, $row, 1);

thanks.

Paul Lalli · Jul 30, 2007

$ perldoc -q duplicate
Found in /opt2/Perl5_8_4/lib/perl5/5.8.4/pod/perlfaq4.pod
How can I remove duplicate elements from a list or array?
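
For a flat list, the idiom that FAQ entry is usually cited for can be sketched as below. This is a minimal illustration rather than a quotation of the FAQ text, and the list of codes is made up for the example:

```perl
use strict;
use warnings;

# Hypothetical flat list of codes, with some repeats.
my @codes = qw(UK9004411 UK0030239 UK0030239 UK0012821 UK0012821);

# Keep an element only the first time its value is seen.
my %seen;
my @unique = grep { !$seen{$_}++ } @codes;

print "@unique\n";
```

For an array of array references the only change is the hash key: the first field of each row instead of the element itself.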

$ perl -MData::Dumper -e'
my @array = (
    [UK9004411, A140, B, 0.040] ,
    [UK0030239, H7140, H, 0.030] ,
    [UK0030239, S1393, M1, 0.030] ,
    [UK0012821, H4030, H, 0.010] ,
    [UK0012821, H4060, H, 0.010] ,
);
my %seen;
my @nodups = grep { !$seen{$_->[0]}++ } @array;
print Dumper(\@nodups);
'
$VAR1 = [
          [
            'UK9004411',
            'A140',
            'B',
            '0.04'
          ],
          [
            'UK0030239',
            'H7140',
            'H',
            '0.03'
          ],
          [
            'UK0012821',
            'H4030',
            'H',
            '0.01'
          ]
        ];

splice() is fine for removing the elements once you know which ones
you want to remove, but it's useless for actually finding which
elements to remove.
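
If the rows really do have to be removed from @array in place, one way to combine the two steps might look like the sketch below. It assumes the strings are quoted so that it runs under strict, and @dup_rows is just a name chosen for the example. The hash does the finding; splice() only does the removing:

```perl
use strict;
use warnings;

# Sample rows from the original question, with the strings quoted.
my @array = (
    [ 'UK9004411', 'A140',  'B',  0.040 ],
    [ 'UK0030239', 'H7140', 'H',  0.030 ],
    [ 'UK0030239', 'S1393', 'M1', 0.030 ],
    [ 'UK0012821', 'H4030', 'H',  0.010 ],
    [ 'UK0012821', 'H4060', 'H',  0.010 ],
);

# First pass: a hash finds the indexes of the rows to drop
# (any row whose first field has already been seen).
my %seen;
my @dup_rows = grep { $seen{ $array[$_][0] }++ } 0 .. $#array;

# Second pass: splice them out, highest index first, so that
# removing one row does not shift the rows still to be removed.
splice( @array, $_, 1 ) for reverse @dup_rows;

print join( ' ', @$_ ), "\n" for @array;
```

Building a new array with grep is usually simpler; this form is only worth it when other code holds a reference to the original array.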

Paul Lalli

billb · Jul 30, 2007

ah, very simple and very fast as well! I'll have to understand how
this is working. It uses a hash I see. Many thanks.

Paul Lalli · Jul 31, 2007

my %seen;
my @nodups = grep { !$seen{$_->[0]}++ } @array;

ah, very simple and very fast as well! I'll have to understand how
this is working. It uses a hash I see.

It helps if you expand it out to remove all the "shortcuts":

my %seen;
my @nodups;
foreach my $elem (@array) {
    if (! $seen{$elem->[0]}) {
        push @nodups, $elem;
    }
    $seen{$elem->[0]}++;
}

So we're looping through the 2d array, and we check to see if the
first element of the current array reference has been "seen" yet. If
not, we add this array reference to our list of no duplicates. Then
we increment the number of times we've "seen" this element, so that if
the same element is seen again, we won't add it next time.

The shortcuts:
* a foreach-if-push combination is equivalent to grep(). grep selects
  only those elements from a list for which the if condition holds.
* in the grep, $_ is used to represent the current element of the
  array (rather than $elem as in the above expansion)
* The ++ operator is applied to the same expression as when we're
  checking the current value of $seen{$_->[0]}, because a post-fix ++
  increments the value *after* returning that value. That is:
      $x = $foo++;
  is equivalent to:
      $x = $foo;
      $foo++;

In contrast,
    $x = ++$foo;
is equivalent to
    $foo++;
    $x = $foo;
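
To see the points above side by side, here is a small sketch that runs the long form, the grep form, and a deliberately wrong prefix-increment form over the same rows. The shortened rows and the variable names are made up for the illustration:

```perl
use strict;
use warnings;

# Illustrative rows: only the first field matters here.
my @rows = (
    [ 'UK0030239', 'H7140' ],
    [ 'UK0030239', 'S1393' ],
    [ 'UK0012821', 'H4030' ],
);

# Long form: test first, then count.
my ( %seen_loop, @loop_result );
foreach my $elem (@rows) {
    push @loop_result, $elem if !$seen_loop{ $elem->[0] };
    $seen_loop{ $elem->[0] }++;
}

# Short form: the postfix ++ returns the old count, then increments it.
my %seen_grep;
my @grep_result = grep { !$seen_grep{ $_->[0] }++ } @rows;

# With a prefix ++ the count is already 1 when it is tested,
# so every row is rejected.
my %seen_prefix;
my @prefix_result = grep { !++$seen_prefix{ $_->[0] } } @rows;

printf "loop: %d, grep: %d, prefix: %d\n",
    scalar @loop_result, scalar @grep_result, scalar @prefix_result;
```

The loop and the grep keep the same two rows, while the prefix version keeps none, which is why the idiom depends on the postfix form.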

Many thanks.

You're welcome

Paul Lalli

anno4000 · Jul 31, 2007

[Paul's solution snipped]

ah, very simple and very fast as well! I'll have to understand how
this is working. It uses a hash I see. Many thanks.

On hearing the word "duplicate", like a Pavlovian dog a Perl programmer
goes "Hash, hash, hash...". The word "unique" has the same effect.

Anno
