FormatText/TreeBuilder Removes Line Breaks

afrinspray

I'm working on a program that removes HTML formatting from an IM
conversation. Right now, I'm storing the conversation in a variable,
where each line of the conversation is broken up by line feeds (a
single \n). Then I do the following:

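use HTML::TreeBuilder;
use HTML::FormatText;

my $formatter = HTML::FormatText->new;
my $tree = HTML::TreeBuilder->new;
$tree->parse($body);
$tree->eof;    # signal end of input so the parse is finalized
if ($tree) {
    $body = $formatter->format($tree);
    $tree->delete;    # release the tree's memory
}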

where $body is the entire IM conversation.


This strips the line feeds, but I need to keep them in. Does
anyone have any other suggestions?

Thanks,
Mike
 
afrinspray

I just found the FAQ in comp.lang.perl.misc and I'm considering the
line:

s/<(?:[^>'"]*|(['"]).*?\1)*>//gs

Does anyone have any objections?

Thanks,
Mike
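
For reference, here is a minimal sketch of how the FAQ substitution behaves on line-feed-separated text. The helper name strip_tags and the sample conversation are made up for illustration and are not from the original post. Because the pattern only deletes things that look like tags, any "\n" outside a tag is left alone.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): apply the FAQ regex to a string.
sub strip_tags {
    my ($text) = @_;
    # Match '<', then any mix of plain characters or complete quoted
    # strings, then '>'. The /s flag lets '.' cross newlines inside quotes.
    $text =~ s/<(?:[^>'"]*|(['"]).*?\1)*>//gs;
    return $text;
}

# Made-up sample conversation: one message per line, separated by "\n".
my $body = qq{<b>Mike:</b> hi there\n<font color="red">Ann:</font> hello\n};

print strip_tags($body);
```

Note that this approach does not decode entities such as &lt; and it is not a real HTML parser, so it is probably only suitable for simple, well-formed markup.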
 
A. Sinan Unur

afrinspray said:
I just found the FAQ in comp.lang.perl.misc and I'm considering the
line:

s/<(?:[^>'"]*|(['"]).*?\1)*>//gs

Does anyone have any objections?

Uhmmmm ... to what?

Sinan
 
John W. Krahn

afrinspray said:
I just found the FAQ in comp.lang.perl.misc and I'm considering the
line:

s/<(?:[^>'"]*|(['"]).*?\1)*>//gs

Does anyone have any objections?

Yes, I strenuously object!


John
 
