Good email parser?

Jack

Hi, I haven't had any luck with the CPAN email modules. I just want to
parse multipart MIME messages and base64, but with all the varieties of
email files out there, these modules just don't work. Does anyone know
of a free or low-cost command-line email client or parser that can do
the job?

Thank you,

Jack
 
rabbits77

Jack said:
> Hi, I haven't had any luck with the CPAN email modules. I just want to
> parse multipart MIME messages and base64, but with all the varieties of
> email files out there, these modules just don't work. Does anyone know
> of a free or low-cost command-line email client or parser that can do
> the job?
>
> Thank you,

I have done some work parsing email in the (fairly distant) past.
Email really isn't that varied!
In order for email to work at all, in fact, it needs to be pretty
predictable!
I bet that you could do this yourself.
Where are your sticking points?
If I understand your question correctly, do you just want to remove all
email attachments?
 
Peter J. Holzer

> Hi, I haven't had any luck with the CPAN email modules. I just want to
> parse multipart MIME messages and base64, but with all the varieties of
> email files out there, these modules just don't work...

MIME::Parser works for me. It is a bit slow and tends to use ridiculous
amounts of memory if you want to avoid temporary files, but I have yet
to find a (syntactically correct) email which it can't parse.

hp
 
Jack

> MIME::Parser works for me. It is a bit slow and tends to use ridiculous
> amounts of memory if you want to avoid temporary files, but I have yet
> to find a (syntactically correct) email which it can't parse.
>
>         hp

Thanks, Peter, for the posting. Can you provide some guidance, then? I
tried the code below and figured the skeleton would report the base64
image attachments in a MIME message, but it isn't picking them up. I need
to be able to deal with text bodies, base64 bodies, and image attachments,
and I want to parse them out correctly. I can do the base64 decoding,
etc. How do I accomplish this with MIME::Parser?

Code:
use MIME::Parser;

if (@ARGV[0] eq undef) {
    $filename1="no dest filename" ;
} else {
    $filename1=@ARGV[0];
}

### Create a new parser object:
my $parser = new MIME::Parser;

### Tell it where to put things:
$parser->output_under("e:\\tmp");

### Parse an input filehandle:
$entity = $parser->parse($filename1);

### Congratulations: you now have a (possibly multipart) MIME entity!
$entity->dump_skeleton;

#### HERE'S THE OUTPUT
Content-type: text/plain
Effective-type: text/plain
Content-encoding: 7bit
Body-location: (IN CORE)
Body-size: 0
--

####
It appears not to be picking up this part of the email itself:
Content-Type: image/jpeg; name="cardamage1.jpg"
Content-Disposition: attachment; filename="cardamage1.jpg"
Content-Transfer-Encoding: base64
X-Attachment-Id: f_fqzhlhly0


###
Also, I tried to build my own parser based on the "boundary" definition,
but as you can see from the example below, it's not clear why I have
more than one boundary!

Date: Sun, 24 Aug 2008 06:46:48 -0700
From: "Ben Brewster" <[email protected]>
To: (e-mail address removed)
Subject: car for sale two images
MIME-Version: 1.0
Content-Type: multipart/mixed;
boundary="----=_Part_13503_152406.1219585608169"

------=_Part_13503_152406.1219585608169
Content-Type: multipart/alternative;
boundary="----=_Part_13504_19292996.1219585608169"

------=_Part_13504_19292996.1219585608169
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

Hi


------=_Part_13504_19292996.1219585608169
Content-Type: text/html; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

<div dir="ltr"></div>

------=_Part_13504_19292996.1219585608169--

------=_Part_13503_152406.1219585608169
Content-Type: image/jpeg; name=masertione.jpg
Content-Transfer-Encoding: base64
X-Attachment-Id: f_fk9pr8s20
Content-Disposition: attachment; filename=masertione.jpg
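
On the question of why there is more than one boundary: the sample is a nested message. The outer multipart/mixed container declares one boundary, and its first child is itself a multipart/alternative container (the plain-text and HTML versions of the same body) that declares its own. A minimal sketch in core Perl, run against an abbreviated copy of the headers above, lists each container with the boundary it declares. The regular expression is only an illustration and is no substitute for a real MIME parser:

```perl
use strict;
use warnings;

# Abbreviated copy of the sample message above (bodies shortened).
my $message = <<'EOM';
Content-Type: multipart/mixed;
boundary="----=_Part_13503_152406.1219585608169"

------=_Part_13503_152406.1219585608169
Content-Type: multipart/alternative;
boundary="----=_Part_13504_19292996.1219585608169"

------=_Part_13504_19292996.1219585608169
Content-Type: text/plain; charset=ISO-8859-1

Hi
------=_Part_13504_19292996.1219585608169--
------=_Part_13503_152406.1219585608169--
EOM

# Collect every multipart container together with the boundary it declares.
my @containers;
while ( $message =~ /^Content-Type:\s*(multipart\/[\w-]+);\s*boundary="([^"]+)"/mgi ) {
    push @containers, { type => $1, boundary => $2 };
}

# One line per container: the outer one first, then the nested one.
printf "%-22s %s\n", $_->{type}, $_->{boundary} for @containers;
```

Each container's parts are delimited only by that container's own boundary, so a hand-written parser has to recurse whenever a part's Content-Type is itself multipart/*.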
 
John W. Krahn

Jack said:
> Thanks, Peter, for the posting. Can you provide some guidance, then? I
> tried the code below and figured the skeleton would report the base64
> image attachments in a MIME message, but it isn't picking them up. I need
> to be able to deal with text bodies, base64 bodies, and image attachments,
> and I want to parse them out correctly. I can do the base64 decoding,
> etc. How do I accomplish this with MIME::Parser?
>
> Code:

use warnings;
use strict;

> use MIME::Parser;
>
> if (@ARGV[0] eq undef) {

You cannot use undef in a comparison. Perl will just convert it
internally to a numeric, or in this case, a string representation of
"false", 0 or '' respectively. You shouldn't use a list in scalar
context. If you had warnings enabled then perl would have warned about
this.

if ( not defined $ARGV[ 0 ] ) {

> $filename1="no dest filename" ;
> } else {
> $filename1=@ARGV[0];

$filename1 = $ARGV[ 0 ];

Or if you have Perl version 5.10 installed you could write that as:

my $filename1 = $ARGV[ 0 ] // 'no dest filename';

For older perls that would be:

my $filename1 = defined $ARGV[ 0 ] ? $ARGV[ 0 ] : 'no dest filename';
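
A small sketch of how these ways of supplying a default differ in practice. The `first_arg_*` helper names are made up for illustration. `||` falls back on any false value, including the strings "0" and "", whereas `//` and the `defined` test fall back only on undef:

```perl
use strict;
use warnings;
use 5.010;    # the // (defined-or) operator requires Perl 5.10 or later

my $default = 'no dest filename';

# Hypothetical helpers, each taking the argument list explicitly.
sub first_arg_or      { my @args = @_; return $args[0] || $default }
sub first_arg_dor     { my @args = @_; return $args[0] // $default }
sub first_arg_defined { my @args = @_; return defined $args[0] ? $args[0] : $default }

# Compare the three on: no argument, a normal name, and a file called "0".
for my $case ( [], ['mail.eml'], ['0'] ) {
    printf "%-12s ||: %-18s //: %-18s defined: %s\n",
        "(@$case)",
        first_arg_or(@$case),
        first_arg_dor(@$case),
        first_arg_defined(@$case);
}
```

The difference only matters for unusual file names such as "0", but it is the reason the defined-based forms are the safer habit.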



John
 
Uri Guttman

"JWK" == John W Krahn said:
> use MIME::Parser;
> if (@ARGV[0] eq undef) {

JWK> You cannot use undef in a comparison. Perl will just convert it
JWK> internally to a numeric, or in this case, a string representation of
JWK> "false", 0 or '' respectively. You shouldn't use a list in scalar
JWK> context. If you had warnings enabled then perl would have warned
JWK> about this.

couple of nits to pick. undef is coerced to '' with eq since it is
string context. and @ARGV[0] is a slice but it will return a single
value here. sure it is incorrect but it will work.

JWK> if ( not defined $ARGV[ 0 ] ) {
> $filename1="no dest filename" ;
> } else {
> $filename1=@ARGV[0];

JWK> $filename1 = $ARGV[ 0 ];

JWK> Or if you have Perl version 5.10 installed you could write that as:

JWK> my $filename1 = $ARGV[ 0 ] // 'no dest filename';

JWK> For older perl's that would be:

JWK> my $filename1 = defined $ARGV[ 0 ] ? $ARGV[ 0 ] : 'no dest filename';

you should know better. the best way to check for elements in an array
is checking its count. since he wants only one arg this should do fine:

@ARGV or die "missing file name argument" ;
my $filename = shift ;

and to the OP, you can never have an undef in @ARGV unless you put it
there yourself. @ARGV is passed in from the exec call (the shell does
this for command line programs) and shell doesn't know about undef.

uri
 
Hans Mulder

Jack said:
> Thanks, Peter, for the posting. Can you provide some guidance, then? I
> tried the code below and figured the skeleton would report the base64
> image attachments in a MIME message, but it isn't picking them up.

The parse() method takes a file handle argument. So you'll have to
open the file yourself and pass the resulting handle to parse():

use warnings;
use strict;

use MIME::Parser;

my $dir = "e:\\tmp";

if (not -d $dir) {
    mkdir $dir or die "Can't create directory $dir: $!\n";
}

my $filename1 = $ARGV[0] || "no input filename";

### Create a new parser object:
my $parser = MIME::Parser->new;

### Tell it where to put things:
$parser->output_under($dir);

### Open the file:
open my $fh, '<', $filename1 or die "Can't read $filename1: $!\n";

### Parse an input filehandle:
my $entity = $parser->parse($fh);

### Congratulations: you now have a (possibly multipart) MIME entity!
$entity->dump_skeleton;
__END__

This prints:

Content-type: multipart/mixed
Effective-type: multipart/mixed
Body-file: NONE
Subject: car for sale two images
Num-parts: 2
--
     Content-type: multipart/alternative
     Effective-type: multipart/alternative
     Body-file: NONE
     Num-parts: 2
     --
         Content-type: text/plain
         Effective-type: text/plain
         Body-file: e:\tmp/msg-1234304022-16083-0/msg-16083-1.txt
         --
         Content-type: text/html
         Effective-type: text/html
         Body-file: e:\tmp/msg-1234304022-16083-0/msg-16083-2.html
         --
     Content-type: image/jpeg
     Effective-type: image/jpeg
     Body-file: e:\tmp/msg-1234304022-16083-0/masertione.jpg
     Recommended-filename: masertione.jpg
     --

> I need
> to be able to deal with text bodies, base64 bodies, and image attachments,
> and I want to parse them out correctly. I can do the base64 decoding,
> etc.

MIME::Parser will do the base64 decoding for you.

> How do I accomplish this with MIME::Parser?

Read the documentation carefully:

parse INSTREAM
    Instance method. Takes a MIME-stream and splits it into its
    component entities.

    The INSTREAM can be given as a readable FileHandle, an IO::File, a
    globref filehandle (like "\*STDIN"), or as any blessed object
    conforming to the IO:: interface (which minimally implements
    getline() and read()).

It does not mention the possibility of passing a filename and parse()
opening it on your behalf. This suggests that this feature does not
exist in this version of MIME::Parser.
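
For what it's worth, the same module also documents a parse_open() method that takes a filename (anything acceptable to open()) and a parts() method for walking the resulting tree. The following is a minimal sketch, not a definitive recipe: the two-part sample message, the temporary directory, and the `leaf_parts` helper are all made up for illustration. It shows that the body handle of each leaf part already holds the decoded content, so no manual base64 step is needed:

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use MIME::Parser;    # part of the MIME-tools distribution on CPAN

# Write a tiny two-part sample message (made up for this sketch) to disk.
my $dir      = tempdir( CLEANUP => 1 );
my $filename = "$dir/sample.eml";
open my $out, '>', $filename or die "Can't write $filename: $!\n";
print {$out} <<'EOM';
From: sender@example.com
To: recipient@example.com
Subject: sample
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="outer"

--outer
Content-Type: text/plain

Hi
--outer
Content-Type: application/octet-stream; name="hello.bin"
Content-Transfer-Encoding: base64
Content-Disposition: attachment; filename="hello.bin"

aGVsbG8=
--outer--
EOM
close $out or die "Can't close $filename: $!\n";

my $parser = MIME::Parser->new;
$parser->output_dir($dir);    # decoded bodies are written into this directory

# parse_open() takes a filename and opens it on your behalf.
my $entity = $parser->parse_open($filename);

# Return the leaf (non-multipart) parts of an entity, depth first.
sub leaf_parts {
    my ($ent) = @_;
    my @children = $ent->parts;
    return map { leaf_parts($_) } @children if @children;
    return $ent;
}

my @leaves = leaf_parts($entity);
for my $part (@leaves) {
    my $body = $part->bodyhandle;    # handle onto the already-decoded body
    printf "%s -> %s (%d bytes decoded)\n",
        $part->effective_type, $body->path, length( $body->as_string );
}
```

Because the walk is recursive, it handles the nested multipart/alternative inside multipart/mixed from the example message in the same way.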

Hope this helps,

-- HansM
 
Jack

Thanks, Hans. Can you tell me whether MIME::Parser will handle / process
RFC (non-MIME) emails?
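
Assuming "RFC" here means a plain RFC 822 message with no MIME headers: as far as I know, such a message is simply treated as a single text/plain entity, which matches the default that RFC 2045 prescribes when Content-Type is absent. A minimal sketch to check this on your own installation follows. The sample message is made up, and output_to_core keeps the body in memory so that no files are written:

```perl
use strict;
use warnings;
use MIME::Parser;    # part of the MIME-tools distribution on CPAN

# A plain RFC 822 message: no MIME-Version, no Content-Type.
my $raw = join "\n",
    'From: sender@example.com',
    'To: recipient@example.com',
    'Subject: plain old mail',
    '',
    'Just one line of text.',
    '';

my $parser = MIME::Parser->new;
$parser->output_to_core(1);    # keep bodies in memory instead of in files

# parse_data() parses a message held in a scalar.
my $entity = $parser->parse_data($raw);
my @parts  = $entity->parts;

print 'Type:  ', $entity->effective_type, "\n";
print 'Parts: ', scalar(@parts), "\n";
print 'Body:  ', $entity->bodyhandle->as_string;
```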
 
Jack

Also, how does one capture the name of the directory it creates on the
fly in a variable?
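
One way to sidestep the question is to choose the directory yourself. output_under() creates a fresh msg-... subdirectory per message, whereas output_dir() writes straight into the directory you pass, so the name is already in your variable. The sketch below takes that route, using File::Temp to create a unique directory per message. The `parse_into_fresh_dir` helper, the `msg-XXXXXX` template, and the sample message are assumptions made for illustration. If you stay with output_under(), the directory can instead be derived from a part's body path with dirname():

```perl
use strict;
use warnings;
use File::Basename qw(dirname);
use File::Temp qw(tempdir);
use MIME::Parser;    # part of the MIME-tools distribution on CPAN

# Hypothetical helper: parse one message file into a fresh directory
# under $base and return both the entity and the directory used.
sub parse_into_fresh_dir {
    my ( $filename, $base ) = @_;
    my $dir = tempdir( 'msg-XXXXXX', DIR => $base );    # unique per message
    my $parser = MIME::Parser->new;
    $parser->output_dir($dir);
    my $entity = $parser->parse_open($filename);
    return ( $entity, $dir );
}

# Demo with a one-line sample message (made up for this sketch).
my $base = tempdir( CLEANUP => 1 );
my $file = "$base/sample.eml";
open my $out, '>', $file or die "Can't write $file: $!\n";
print {$out} "Subject: demo\n\nHello\n";
close $out or die "Can't close $file: $!\n";

my ( $entity, $dir ) = parse_into_fresh_dir( $file, $base );
print "Bodies were written to: $dir\n";

# Alternative: recover the directory from where the body actually landed.
print "Derived from body path: ", dirname( $entity->bodyhandle->path ), "\n";
```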
 
