To parse a text file...

Jim Carter · Jan 10, 2004

Hi all,

I have the file below (C:\text1.txt) on a Windows platform:

---------------------------
Rob: ThirdParty 2002.05
Rob: ThirdParty 4.2
Mike: ThirdParty 2002.05
Mike: ThirdParty 4.2
Ed: ThirdParty 2002.05
Dave: ThirdParty 4.2
Tom: ThirdParty 2002.05
Vince: ThirdParty 4.2
Mary: ThirdParty 2002.05
Mary: ThirdParty 4.2
Lisa: ThirdParty 2002.05
Maria: ThirdParty 4.2
-------------------------

Now, I want to drop the 2002.05 entry for anyone who also has a 4.2
entry. The output should be as follows:

--------------------
Rob: ThirdParty 4.2
Mike: ThirdParty 4.2
Ed: ThirdParty 2002.05
Dave: ThirdParty 4.2
Tom: ThirdParty 2002.05
Vince: ThirdParty 4.2
Mary: ThirdParty 4.2
Lisa: ThirdParty 2002.05
Maria: ThirdParty 4.2
------------------------

How can I do that with a Perl script?

Thanks,
Carter

Ben Morrow · Jan 10, 2004

Use a hash, viz:

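my (%lines, @order);    # @order keeps names in first-seen order

open my $in, '<', 'C:\text1.txt' or die "Couldn't open file: $!\n";
while (<$in>) {
    # Split each line into three fields in @fields
    my @fields = split;
    next unless @fields == 3;    # skip separator and blank lines
    my $seen = exists $lines{$fields[0]};

    # A 4.2 entry always wins
    $fields[2] eq '4.2'
        and $lines{$fields[0]} = join ' ', @fields;

    # A 2002.05 entry is stored only if the name has no entry yet
    $fields[2] eq '2002.05'
        and not $seen
        and $lines{$fields[0]} = join ' ', @fields;

    push @order, $fields[0] if !$seen and exists $lines{$fields[0]};
}
close $in;

# Now print out the stuff in %lines
print "$lines{$_}\n" for @order;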

Ben

Bob Walton · Jan 10, 2004

That's a FAQ:

perldoc -q duplicate
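
For reference, the idiom that FAQ entry describes is, as far as I recall, a %seen hash combined with grep. A minimal sketch follows; the sample list is made up for illustration and is not taken from the original file. Note that it only removes exact duplicates, so on its own it would not pick 4.2 over 2002.05 for the same name:

```perl
use strict;
use warnings;

# Sample data for illustration only (not from the original file).
my @names = qw(Rob Rob Mike Mike Ed Dave);

# Keep an element only the first time it is seen:
# $seen{$_}++ is 0 (false) on first sight, so !0 lets it through.
my %seen;
my @unique = grep { !$seen{$_}++ } @names;

print "@unique\n";    # prints "Rob Mike Ed Dave"
```

The order of first appearance is preserved because grep walks the list from left to right.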

....

Tore Aursand · Jan 10, 2004

What have you tried so far? What didn't work? You shouldn't let us do
your job; try it yourself first.

Anyway. First of all you should look up 'perldoc -q duplicate', then you
should look into this untested code:

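#!/usr/bin/perl
#
use strict;
use warnings;

my %seen  = ();    # name => line to print for that name
my @names = ();    # names in order of first appearance
open( my $txt, '<', 'C:\text1.txt' ) or die "Couldn't open file; $!\n";
while ( <$txt> ) {
    chomp;
    # Name before the colon, version number at the end of the line
    if ( /^([^:]+):.*\s(\d+\.\d+)\s*$/ ) {
        my ( $name, $version ) = ( $1, $2 );
        # Never let a 2002.05 line replace one that is already stored
        next if ( exists $seen{$name} and $version eq '2002.05' );
        push( @names, $name ) unless ( exists $seen{$name} );
        $seen{$name} = $_;
    }
}
close( $txt );

print $seen{$_} . "\n" for @names;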

gnari · Jan 10, 2004

Jim Carter said:
Hi all,

I have the below file (C:\text1.txt) on Windows platform:

---------------------------
Rob: ThirdParty 2002.05
Rob: ThirdParty 4.2
Mike: ThirdParty 2002.05
....

Now, I want to avoid the duplicate entry that has 2002.05. For
example, the output should be as follows:

Rob: ThirdParty 4.2
Mike: ThirdParty 4.2

....

assuming you know the basics:
you split each line at the start of the digits,
build a hash using the first part as key and the second part as value,
except do not store the new value if the key exists and the value is 2002.05.
if you need to keep the same order, you push the keys to an array
whenever the key is new.

after all lines are done, print all the keys and values in your hash (using
the array, if original order is important).

try to implement that, and if you have problems, show us
what does not work
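
One way the steps above might be sketched, for comparison with your own attempt. The sub name prefer_newer and the inline sample lines are hypothetical, chosen only for illustration; the sketch also assumes that the part before the version contains no digits:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): takes a list of lines and
# returns the lines to keep, in order of first appearance of each key.
sub prefer_newer {
    my @input = @_;
    my ( %value, @order );
    for my $line (@input) {
        # Split at the start of the digits:
        # key = "Rob: ThirdParty ", value = "4.2"
        # Lines without a trailing version (separators, blanks) are skipped.
        my ( $key, $val ) = $line =~ /^(\D+)(\d[\d.]*)\s*$/ or next;

        if ( !exists $value{$key} ) {
            push @order, $key;      # new key: remember its position
            $value{$key} = $val;
        }
        elsif ( $val ne '2002.05' ) {
            $value{$key} = $val;    # existing key: never overwrite with 2002.05
        }
    }
    return map { $_ . $value{$_} } @order;
}

# Sample lines for illustration only.
my @sample = (
    'Rob: ThirdParty 2002.05',
    'Rob: ThirdParty 4.2',
    'Ed: ThirdParty 2002.05',
    'Dave: ThirdParty 4.2',
);
print "$_\n" for prefer_newer(@sample);
```

To run it against the real file, the lines read from C:\text1.txt would be passed to the sub in place of @sample.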

gnari
