phillyfan
I have a .csv file that I have pulled into an array. I have searched
for a way to remove duplicate lines from the array. I have tried a
couple of different coding techniques, but because they use the hash
key/value technique I end up removing lines I need. Here is a sample of
my file. The fields are class code, start time, end time, building
number, days of week, class title, professor ID, and professor name.
They are comma delimited in the .csv file.
ACCT2101TS1 1305 1355 172 103 MWF Accounting I 901463900 Michael Ely
ACCT2101TS1 920 1030 172 222 MWF Accounting I 901063085 Arnold Schneider
ACCT2101TS1 1305 1355 172 103 MWF Accounting I 901463900 Michael Ely
ACCT2101TS2 1005 1055 172 300 MWF Accounting I 901790899 Robert Dunn
ACCT2101TS2 1005 1055 172 300 MWF Accounting I 901790899 Robert Dunn
ACCT2101TS3 1635 1755 172 300 TR Accounting I 900255352 Michael Kilgore
ACCT2101TS3 1635 1755 172 300 TR Accounting I 900255352 Michael Kilgore
ACCT2101TSA 1635 1755 172 200 TR Accounting I 900255352 Michael Kilgore
ACCT2101TSB 1105 1155 172 200 MWF Accounting I 901063046 Deborah Turner
ACCT2101TSC 1205 1255 172 200 MWF Accounting I 901063046 Deborah Turner
ACCT2102TS1 1305 1355 172 201 MWF Accounting II 901790899 Robert Dunn
ACCT2102TS1 1040 1150 172 222 MWF Accounting II 901063085 Arnold Schneider
ACCT2102TS1 1305 1355 172 201 MWF Accounting II 901790899 Robert Dunn
If I use:
#! /perl/bin/perl
use strict;
use warnings;
$| = 1;
my @bannerfile = ();
open(INTO, '<', 'data-banner.csv')
    or die "Can't open data-banner.csv for reading: $!\n";
chomp(@bannerfile = <INTO>);
close(INTO) or die "Can't close data-banner.csv: $!\n";
my %seen = ();
my $item;
my @uniq = @bannerfile;
# Variant 1: keep the first occurrence of each line
@uniq = do { my %seen; grep { !$seen{$_}++ } @uniq };
or
# Variant 2: the same idea written as an explicit loop
@uniq = ();
foreach $item (@bannerfile) {
    # Record each line as seen, otherwise nothing is ever filtered out
    push(@uniq, $item) unless $seen{$item}++;
}
What happens, as I am sure you already know, is that because the same
class code is found, the line is removed regardless of whether the
information after it is different. My goal is to strip only the
duplicate records from the file. Example:
ACCT2101TS1 1305 1355 172 103 MWF Accounting I 901463900 Michael Ely
shows up twice, so I want to keep just one instance of this record and
also keep
ACCT2101TS1 920 1030 172 222 MWF Accounting I 901063085 Arnold Schneider
because it is a different record.
I hope I have made sense of what I am trying to achieve. Thank you for
your help and tutelage.
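A minimal sketch of the goal described above, assuming each record sits on a single line and that two records count as duplicates only when the whole line matches. The helper name unique_records is hypothetical and not part of the original script. Stripping trailing whitespace before comparing is also an assumption, added in case stray carriage returns or spaces make identical records look different.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns the records with exact duplicates removed,
# keeping the first occurrence and preserving the original order.
# The hash key is the ENTIRE record, not just the class code.
sub unique_records {
    my @records = @_;
    my %seen;
    my @unique;
    for my $record (@records) {
        # Assumption: trailing whitespace (including "\r\n") should not
        # make two otherwise identical records count as different.
        (my $key = $record) =~ s/\s+\z//;
        next if $seen{$key}++;    # skip records already kept
        push @unique, $key;
    }
    return @unique;
}

# Three of the sample records from the post, as shown there.
my @sample = (
    'ACCT2101TS1 1305 1355 172 103 MWF Accounting I 901463900 Michael Ely',
    'ACCT2101TS1 920 1030 172 222 MWF Accounting I 901063085 Arnold Schneider',
    'ACCT2101TS1 1305 1355 172 103 MWF Accounting I 901463900 Michael Ely',
);

my @uniq = unique_records(@sample);
print "$_\n" for @uniq;
```

With these three records, the Michael Ely line is printed once and the Arnold Schneider line is kept, because the two lines share a class code but differ as whole records.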