To parse a text file...

Jim Carter

Hi all,

I have the file below (C:\text1.txt) on a Windows platform:

---------------------------
Rob: ThirdParty 2002.05
Rob: ThirdParty 4.2
Mike: ThirdParty 2002.05
Mike: ThirdParty 4.2
Ed: ThirdParty 2002.05
Dave: ThirdParty 4.2
Tom: ThirdParty 2002.05
Vince: ThirdParty 4.2
Mary: ThirdParty 2002.05
Mary: ThirdParty 4.2
Lisa: ThirdParty 2002.05
Maria: ThirdParty 4.2
-------------------------

Now, I want to remove the duplicate entries: when a name appears twice,
the line that has 2002.05 should be dropped. For example, the output
should be as follows:

--------------------
Rob: ThirdParty 4.2
Mike: ThirdParty 4.2
Ed: ThirdParty 2002.05
Dave: ThirdParty 4.2
Tom: ThirdParty 2002.05
Vince: ThirdParty 4.2
Mary: ThirdParty 4.2
Lisa: ThirdParty 2002.05
Maria: ThirdParty 4.2
------------------------


How can I do that with a Perl script?

Thanks,
Carter
 
Ben Morrow

Use a hash, viz:

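my %lines;

open my $fh, '<', 'C:\text1.txt' or die "Couldn't open file: $!";
while (my $line = <$fh>) {
    # Split each line into three fields: name, product, version
    my @fields = split ' ', $line;
    next unless @fields == 3;    # skips the dashed separator lines

    # A 4.2 line always wins
    $fields[2] eq '4.2'
        and $lines{$fields[0]} = join ' ', @fields;

    # A 2002.05 line is kept only if nothing is stored for that name yet
    $fields[2] eq '2002.05'
        and not $lines{$fields[0]}
        and $lines{$fields[0]} = join ' ', @fields;
}
close $fh;

# Now print out the stuff in %lines (hash order, not file order)
print "$_\n" for values %lines;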

Ben
 
Bob Walton

That's a FAQ:

perldoc -q duplicate

....
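
For reference, the FAQ entry that command brings up deals with removing duplicate elements from a list, and the idiom it is known for is a %seen hash combined with grep. A minimal sketch of that idiom follows, applied to the names only. The unique_in_order helper and the @names list are made up for illustration and are not from the thread. It keeps the first occurrence of each element, so by itself it does not prefer the 4.2 line over the 2002.05 one.

```perl
use strict;
use warnings;

# Hypothetical helper: return the list with later repeats dropped,
# keeping the first occurrence of each element in its original position.
sub unique_in_order {
    my %seen;
    return grep { !$seen{$_}++ } @_;
}

# Illustrative data only: the names as they would appear line by line.
my @names = qw(Rob Rob Mike Mike Ed Dave);
print join(' ', unique_in_order(@names)), "\n";
```

The post-increment is what makes this work: the first time a name is tested its count is still 0, so grep keeps it, and every later test sees a non-zero count.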
 
Tore Aursand

What have you tried so far? What didn't work? You shouldn't let us do
your job; try it yourself first.

Anyway. First of all you should look up 'perldoc -q duplicate', then you
should look into this untested code:

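#!/usr/bin/perl
#
use strict;
use warnings;

my %seen  = ();    # name => line to keep for that name
my @order = ();    # names in order of first appearance
open( my $txt, '<', 'C:\text1.txt' ) or die "Couldn't open file; $!\n";
while ( <$txt> ) {
    chomp;
    # Capture the name before the colon and the version at the end
    if ( /^(\w+):.*?(\d+\.\d+)$/ ) {
        my ( $name, $version ) = ( $1, $2 );
        push( @order, $name ) unless ( exists $seen{$name} );
        # A 2002.05 line never replaces a line already stored
        next if ( exists $seen{$name} && $version eq '2002.05' );
        $seen{$name} = $_;
    }
}
close( $txt );
print $seen{$_} . "\n" for @order;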
 
gnari

assuming you know the basics:
you split each line at the start of the digits
build a hash using the first part as key, the second part as value,
except do not store a new value if the key exists and the value is 2002.05
if you need to keep the same order, you push the keys to an array
whenever the key is new.

after all lines are done, print all the keys and values in your hash (using
the array, if original order is important)

try to implement that, and if you have problems, show us
what does not work

gnari
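
The steps gnari describes could be sketched roughly as below. This is one possible reading of that outline, not code posted in the thread. The dedupe_lines helper, its variable names, and the short @sample list are assumptions made for illustration, and the regex assumes the only digits on a line are the trailing version number.

```perl
use strict;
use warnings;

# Hypothetical helper: takes the input lines and returns the lines to keep,
# in the order in which each key first appeared.
sub dedupe_lines {
    my @input = @_;
    my %version_for;    # key ("Rob: ThirdParty ") => version kept so far
    my @order;          # keys in first-seen order

    for my $line (@input) {
        # Split at the start of the trailing digits; lines without
        # digits (such as the dashed separators) are skipped.
        next unless $line =~ /^(.*?)(\d[\d.]*)\s*$/;
        my ( $key, $version ) = ( $1, $2 );

        # Remember the key the first time it shows up.
        push @order, $key unless exists $version_for{$key};

        # Do not store a new value if the key exists and the value is 2002.05.
        next if exists $version_for{$key} && $version eq '2002.05';
        $version_for{$key} = $version;
    }

    # Rebuild each line from its key and the version that survived.
    return map { $_ . $version_for{$_} } @order;
}

# Illustrative input only; the real data would be read from C:\text1.txt.
my @sample = (
    'Rob: ThirdParty 2002.05',
    'Rob: ThirdParty 4.2',
    'Ed: ThirdParty 2002.05',
);
print "$_\n" for dedupe_lines(@sample);
```

Keeping the keys in a separate array is what preserves the file order, since a Perl hash does not return its keys in insertion order.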
 
