Parsing a text file

clearguy02

Hi all,

Below is the scenario:

I have a file (c:\test.txt) with the below lines:

===============================
JSMITH 2002.05.00
JSMITH 2003.06.10
BBRAND 2002.05.00
JCARTER 2002.05.00
JCARTER 4.2
MSUBBA 2003.06.10
SSUSAN 2002.05.00
SSUSAN 2003.06.10
JPORTER 2003.06.10
================================
The rule is: if the same name appears with two or more different numeric
values (for example, JSMITH has 2002.05.00 and 2003.06.10), all lines for
that name should be skipped.

So the output file should be as follows:
==================================
BBRAND 2002.05.00
MSUBBA 2003.06.10
JPORTER 2003.06.10
=================================

So, how can I get the above output from the above input file?

Below is my stupid piece of code which doesn't work:
====================================
while (<DATA>)
{
    ($name, $version) = split (/\s+/, $_);
    next if (($name =~ /$version/) && ($name !~ /$version/));
    print $_;
}
__DATA__
JSMITH 2002.05.00
JSMITH 2003.06.10
BBRAND 2002.05.00
JCARTER 2002.05.00
JCARTER 4.2
MSUBBA 2003.06.10
SSUSAN 2002.05.00
SSUSAN 2003.06.10
JPORTER 2003.06.10
===================================

I guess using hashes is the way to go. Can someone suggest how I can
use hashes in this case?

Thanks,
Rider.
 
Arndt Jonasson

clearguy02 said:

while (<DATA>)
{
    ($name, $version) = split (/\s+/, $_);
    next if (($name =~ /$version/) && ($name !~ /$version/));
    print $_;
}

Your code seems to say "skip the line if $name both matches and does
not match $version". That condition can never be true, so every line
is printed.

clearguy02 said:
I guess using hashes is a way to go. Can someone suggest how I can
use hashes in this case?

Here is my attempt. You may want to add error handling when the input
lines don't conform to your specification.

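#!/usr/local/bin/perl

use warnings;
use strict;

my %count;     # number of lines seen for each name
my %values;    # most recent full input line for each name

while (<DATA>) {
    my ($name, $value) = split;
    $count{$name}++;
    $values{$name} = $_;
}

# Print only the names that appeared exactly once (in sorted name order).
for my $name (sort keys %count) {
    if ($count{$name} == 1) {
        print $values{$name};
    }
}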
__DATA__
JSMITH 2002.05.00
JSMITH 2003.06.10
BBRAND 2002.05.00
JCARTER 2002.05.00
JCARTER 4.2
MSUBBA 2003.06.10
SSUSAN 2002.05.00
SSUSAN 2003.06.10
JPORTER 2003.06.10
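
On the error-handling remark above, here is a minimal sketch of what a line check might look like. The helper name parse_line and the exact pattern are assumptions of mine (uppercase letters for the name, dot-separated digits for the value); adjust them to whatever the real file allows.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# parse_line is a hypothetical helper, not part of the posted script.
# It returns (name, value) for a well-formed line, or an empty list.
sub parse_line {
    my ($line) = @_;
    # Assumed format: an uppercase name, whitespace, then a dotted
    # number such as 2002.05.00 or 4.2, and nothing else on the line.
    return $line =~ /^\s*([A-Z]+)\s+(\d+(?:\.\d+)*)\s*$/ ? ($1, $2) : ();
}

for my $line ("JSMITH 2002.05.00\n", "\n", "JCARTER\n", "JCARTER 4.2\n") {
    my ($name, $value) = parse_line($line);
    if (defined $name) {
        print "ok: $name $value\n";
    }
    else {
        chomp(my $shown = $line);
        warn "skipping malformed line: '$shown'\n";
    }
}
```

In the read loop, a malformed line would then be skipped with "next" before it touches either hash.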
 
Dov Wasserman

Looks good, but the original post seemed to want the output lines in the
same order as in the input. The code you provided prints them in
alphabetical order (sort keys %count).

If keeping the original order is important, I would try something like:

#!/usr/local/bin/perl

use warnings;
use strict;

my %count;
my @values;

while (<>) {
    my ($name, $value) = split;
    $count{$name}++;
    # append anonymous array of name/value pair on first occurrence of name
    push @values, [$name, $value] if $count{$name} == 1;
}

for my $val (@values) {
    my ($name, $value) = @$val;    # dereference anonymous array
    print "$name\t$value\n" if $count{$name} == 1;
}


-Dov Wasserman
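
One caveat on the counting approach: the scripts in this thread count how many lines a name has, while the original post talks about "two or more different" values. If the same name could repeat with the same value, a hash of hashes that records the distinct values per name may be closer to the stated rule. The following is only a sketch under that reading; single_valued is a hypothetical helper name, and collapsing repeated identical lines into one output line is my assumption, not something the original post specifies.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# single_valued is a hypothetical helper: given input lines, it returns
# "name value" strings for names that have exactly one distinct value,
# in the order the names were first seen.
sub single_valued {
    my @lines = @_;
    my (%seen, @order);
    for my $line (@lines) {
        my ($name, $value) = split ' ', $line;
        next unless defined $value;              # ignore blank or short lines
        push @order, $name unless $seen{$name};  # remember first-seen order
        $seen{$name}{$value} = 1;                # record each distinct value
    }
    my @out;
    for my $name (@order) {
        my @values = keys %{ $seen{$name} };
        push @out, "$name $values[0]" if @values == 1;
    }
    return @out;
}

my @input = (
    "JSMITH 2002.05.00\n", "JSMITH 2003.06.10\n", "BBRAND 2002.05.00\n",
    "JCARTER 2002.05.00\n", "JCARTER 4.2\n",      "MSUBBA 2003.06.10\n",
    "SSUSAN 2002.05.00\n",  "SSUSAN 2003.06.10\n", "JPORTER 2003.06.10\n",
);
print "$_\n" for single_valued(@input);
```

On the sample data this prints the BBRAND, MSUBBA and JPORTER lines in input order, the same result as the counting version; the two only differ when a name repeats with an identical value.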
 
