Removing rows based on duplicate fields


heylow

Gurus:

I have merged the /etc/passwd files from many systems and am trying to
build a master passwd file. I concatenated them and sorted uniquely.
The resulting file looks like this:

smith:*:100:100:8A-74(office):/home/smith:/bin/ksh
smith:*:100:100:8A-74(office):/home/smith:/etc/fakesh <-- duplicate
rob:*:101:101:8A-75(office):/home/smith:/bin/ksh
don:*:102:102:B25:/home/don:/bin/fakesh
don:*:102:102:B25:/home/don:/bin/fakesh <-- duplicate
ele:*:255:255:A45:/home/ele:/bin/ksh
rod:*:300:300:B456:/home/rod:/bin/ksh


I want to delete the duplicates; that is, I want to keep only one row
for every uid.

I have tried with ksh, but in vain. Can you shed some light on how it
can be done in Perl?

Thanks, Pedro
 

usenet

I want to delete the duplicates; that is, I want to keep only one row
for every uid.

If you always want to assume the first instance wins, something like
this should work:

#!/usr/local/bin/perl
use strict; use warnings;

open (my $in, '<', 'passwd.merged');
open (my $out, '>', 'passwd.cleaned');

my %seen;
while (<$in>) {
    /^(.*?):/;
    print $out $_ unless $seen{$1};
    $seen{$1}++;
}

__END__
 

Tad McClellan

I want to delete the duplicates;
Can you shed some light on how it can be done in Perl?


It is done the way outlined in the Frequently Asked Questions.

perldoc -q duplicate

How can I remove duplicate elements from a list or array?


Modify it for your particular case:

---------------------------------
#!/usr/bin/perl
use warnings;
use strict;

my @unique = ();
my %seen = ();

while ( <DATA> ) {
    my $elem = (split /:/)[2];
    next if $seen{ $elem }++;
    push @unique, $_;
}

print for @unique;

__DATA__
smith:*:100:100:8A-74(office):/home/smith:/bin/ksh
smith:*:100:100:8A-74(office):/home/smith:/etc/fakesh <-- duplicate
rob:*:101:101:8A-75(office):/home/smith:/bin/ksh
don:*:102:102:B25:/home/don:/bin/fakesh
don:*:102:102:B25:/home/don:/bin/fakesh <-- duplicate
ele:*:255:255:A45:/home/ele:/bin/ksh
rod:*:300:300:B456:/home/rod:/bin/ksh
 

Tad McClellan

open (my $in, '<', 'passwd.merged');


You should always, yes *always*, check the return value from open().

/^(.*?):/;
print $out $_ unless $seen{$1};


You should never use the dollar-digit variables unless you have
first ensured that the match _succeeded_.

The OP said he wanted duplicate uids removed. The first field
is not the uid; the uid is the third.
 

Martijn Lievaart

I want to delete the duplicates; that is, I want to keep only one row
for every uid.
I would not use Perl for this but GNU sort:

# sort -t: -k3,3 -u passwd
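To see the effect on a few sample lines (GNU sort assumed): -t: makes ':' the field separator, -k3,3 restricts the sort key to the third field (the uid), and -u keeps one line per distinct key. Note the key comparison is lexical; append n (-k3,3n) to compare uids numerically.

```shell
# Sample lines standing in for the merged passwd file.
printf '%s\n' \
  'smith:*:100:100:8A-74(office):/home/smith:/bin/ksh' \
  'smith:*:100:100:8A-74(office):/home/smith:/etc/fakesh' \
  'don:*:102:102:B25:/home/don:/bin/fakesh' > passwd.sample

# One line per uid survives; among duplicates the first input line
# is kept, since -u disables the last-resort comparison.
sort -t: -k3,3 -u passwd.sample
```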

 

Ben Morrow

Quoth Joe Smith:
Why that, instead of the simple

print @unique;

It's a habit one gets into when $\ is set to something useful (such as
when using -l).

Ben
 

Tad McClellan

Joe Smith said:
Why that, instead of the simple

print @unique;


No good reason in this particular case, since they all have newlines.

I originally had

print "$_\n" for @unique;

then removed print()'s argument when it came out double-spaced. :)
 
