Extract text from a file and write to another.

Sesa Woruban

Hiya,

I'm very new to Perl and my brain is dead. I'm trying to create a
simple programme that will extract the pertinent lines (only those
with a : in them) from a plain flat text file (that represents my
inbox) and write only those lines to another text file. This is what
I've got so far:

use strict;
use warnings;

my $key;
my %hash;
my $infile = '/home/sesaworu/mail/sesaworuban.net/test/input'; # store the file
my $outfile = '>/home/sesaworu/mail/sesaworuban.net/test/output.txt';
open (INFILE, $infile) or die "cannot open $infile: $!"; # opens the file
open (OUTFILE, $outfile) or die "cannot open $outfile: $!"; # opens the file

while()
{
chomp;
$key='',next if /^\s*$/;
if(/([\w\s]+):(.*)/){
$key=$1;
push @{$hash{$key}},$2;
}
else
{
push @{$hash{$key}},$_ if $key;
}
}
for(sort keys %hash)
{
print OUTFILE "$_ : ".join("\n",@{$hash{$_}})."\n";
}

How does that look?

Cheers
Sesa
 
Ben Morrow

use strict;
use warnings;
Good.

my $key;
my %hash;

Give this a better name, like %headers.
my $infile = '/home/sesaworu/mail/sesaworuban.net/test/input'; # store the file

There's no need for that comment: it is perfectly clear from the code.
my $outfile = '>/home/sesaworu/mail/sesaworuban.net/test/output.txt';
open (INFILE, $infile) or die "cannot open $infile: $!"; # opens the file

I would recommend using a lexical FH:

open my $IN, $infile or die "cannot open $infile: $!";
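
For illustration, a minimal sketch of the lexical-filehandle form. It assumes the three-argument form of open (the line above uses two arguments), and the helper name read_lines is hypothetical, not something from this thread:

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): read a file into a list of lines.
sub read_lines {
    my ($path) = @_;

    # Lexical filehandle $in; the mode '<' is passed separately from the name.
    open my $in, '<', $path or die "cannot open $path: $!";

    my @lines;
    while (my $line = <$in>) {
        chomp $line;              # drop the trailing newline
        push @lines, $line;
    }
    close $in;
    return @lines;
}
```

A lexical handle such as $in is also closed automatically once it goes out of scope, so the explicit close is optional here.
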
open (OUTFILE, $outfile) or die "cannot open $outfile: $!"; # opens the file

while()
{

You mean

while (<INFILE>)

chomp;
$key='',next if /^\s*$/;

Use undef rather than ''... it's a better representation of 'no
value'.
if(/([\w\s]+):(.*)/){

This is a mail message, right? In which case, don't you mean something
more like /(.+?) \s* : \s* (.*)/x ? '-' is not included in \w. Also,
do you know that a continuation header line can (and often will) contain
':': it is the whitespace at the start which marks it as a
continuation?
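
A rough sketch of how the two points above might fit together: the looser pattern for the header name, plus a check for leading whitespace so that a continuation line is appended to the previous value even when it contains ':'. The helper name parse_headers, the choice to join continuations with a single space, and the reset of the key on a blank line (as in the original loop) are all assumptions, and this is no substitute for a real mail parser:

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): collect header lines into a
# hash of array refs, keyed by header name.
sub parse_headers {
    my (@lines) = @_;
    my (%headers, $key);

    for my $line (@lines) {
        chomp(my $text = $line);

        # A blank line resets the current header, as in the original loop.
        if ($text =~ /^\s*$/) {
            undef $key;
            next;
        }

        if ($text =~ /^\s+(.*)/) {
            # Leading whitespace marks a continuation, even if it has a ':'.
            # Assumption: join it to the previous value with a single space.
            $headers{$key}[-1] .= " $1" if defined $key;
        }
        elsif ($text =~ /^(.+?) \s* : \s* (.*)/x) {
            # Name is everything up to the first ':', so '-' is allowed.
            $key = $1;
            push @{ $headers{$key} }, $2;
        }
    }
    return \%headers;
}
```

Called with the lines of a message, it returns a hash reference, so a value is reached with something like $headers->{Subject}[0].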

If this is a mailbox, you would be better off using one of the Mail::
modules on CPAN to parse it.
$key=$1;
push @{$hash{$key}},$2;
}
else
{
push @{$hash{$key}},$_ if $key;
}
}
for(sort keys %hash)
{
print OUTFILE "$_ : ".join("\n",@{$hash{$_}})."\n";
}

Ben
 
Jürgen Exner

Sesa said:
I'm very new to Perl and my brain is dead. I'm trying to create a
simple programme that will extract the pertinent lines (only those
with a : in them) from a plain flat text file (that represents my
inbox) and write only those lines to another text file. This is what
I've got so far:

use strict;
use warnings;

Good and good!
my $key;
my %hash;
my $infile = '/home/sesaworu/mail/sesaworuban.net/test/input'; # store the file
my $outfile = '>/home/sesaworu/mail/sesaworuban.net/test/output.txt';

Personally I find this misleading. Intuitively I would assume that $outfile
contains the file name of the output file. However, it doesn't. It also
contains a weird additional chevron.
open (INFILE, $infile) or die "cannot open $infile: $!"; # opens the file
open (OUTFILE, $outfile) or die "cannot open $outfile: $!"; # opens the file

And here I am missing the chevron, which would indicate that you are opening
the file for writing. No big deal, it works the way you coded it, but a few
months from now it will confuse you and it will cost extra time to figure out
why the chevron is missing here.
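
One way to keep the file name and the open mode apart, sketched with three-argument open so that the '>' sits in the open call itself. The helper name write_lines is hypothetical, and the three-argument form is an assumption about how to express this, not something stated above:

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): write each item as one line.
sub write_lines {
    my ($outfile, @lines) = @_;

    # $outfile holds only the file name; '>' (write, truncate) is the mode.
    open my $out, '>', $outfile or die "cannot open $outfile: $!";
    print {$out} "$_\n" for @lines;
    close $out or die "cannot close $outfile: $!";
    return;
}
```

With this layout the die message prints the plain file name, without a leading chevron.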

While what? I guess you meant

while (<INFILE>)

Spaces for indentation are cheap, use them liberally.
chomp;
$key='',next if /^\s*$/;
if(/([\w\s]+):(.*)/){
$key=$1;
push @{$hash{$key}},$2;
}
else
{
push @{$hash{$key}},$_ if $key;
}
}

No idea what this loop body is supposed to do. It looks way too complicated
to me.
A simple single-line
{ print OUTFILE if (/:/); }
appears to be all you would need according to your spec above. "Copy a line
if it contains a colon sign".
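
Spelled out as a complete routine, that single-line filter might look like the sketch below. The helper name copy_colon_lines, the lexical filehandles, and the three-argument open are assumptions added for illustration; only the "print if the line contains a colon" test comes from the suggestion above:

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): copy every line that contains
# a colon from $in_path to $out_path, in the original order.
sub copy_colon_lines {
    my ($in_path, $out_path) = @_;

    open my $in,  '<', $in_path  or die "cannot open $in_path: $!";
    open my $out, '>', $out_path or die "cannot open $out_path: $!";

    while (my $line = <$in>) {
        print {$out} $line if $line =~ /:/;   # keep the line's own newline
    }

    close $in;
    close $out or die "cannot close $out_path: $!";
    return;
}
```

No hash and no sorting are involved, so the output keeps the order of the input.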
for(sort keys %hash)
{
print OUTFILE "$_ : ".join("\n",@{$hash{$_}})."\n";
}

Well, why? Your spec doesn't say anything about sorting.

jue
 
Tad McClellan

Sesa Woruban said:
I'm trying to create a
simple programme that will extract the pertinent lines (only those
with a : in them) from a plain flat text file (that represents my
inbox) and write only those lines to another text file.


perl -ne 'print if /:/' infile >outfile
 
Sesa Woruban

If this is a mailbox, you would be better off using one of the Mail::
modules on CPAN to parse it.

Possibly, and I went to have a look at these modules but I think the
learning curve for those will be even higher than just getting this
simple programme cobbled together... unless you can give me a quick
intro?

Cheers
Sesa
 
