Reinhard Glauber
Hi Perl-Gurus,
I need to clean an HTML file so that I get plain text.
Now that I know there is something called perldoc, I searched it and found this:

$html =~ s/<(?:[^>'"]*|(['"]).*?\1)*>//gs;
$html =~ s/\t//gs;
$html =~ s/\r//gs;

This works great, BUT when I open the cleaned file in vi I get a lot of blue ^M signs. Also, there are way too many blanks in there. How do I get them out?

I know this really sounds like a bad newbie question, and of course it is ;-) Hopefully it's not too bad.

Screenshot: http://www.sabineschulte.de/perl.jpg
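For reference, here is a minimal sketch of one way to deal with both symptoms described above. It assumes the ^M marks are carriage returns left over from Windows-style line endings, and that the extra blanks are runs of spaces, tabs, or &nbsp; entities that remain once the tags are gone. The helper name clean_text is made up for illustration and does not come from perldoc.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): tidy up the text
# that is left after the HTML tags have been stripped.
sub clean_text {
    my ($text) = @_;
    $text =~ s/\r\n?/\n/g;      # CRLF or a lone CR (shown as ^M in vi) becomes LF
    $text =~ s/&nbsp;/ /g;      # assumption: some of the blanks are &nbsp; entities
    $text =~ s/[ \t]+/ /g;      # collapse runs of spaces and tabs into one space
    $text =~ s/^ +| +$//mg;     # trim blanks at the start and end of every line
    $text =~ s/\n{3,}/\n\n/g;   # allow at most one empty line in a row
    return $text;
}

# Small demonstration on an inline string
print clean_text("Hello \t world\r\n\r\n\r\n\r\n  next&nbsp;line  \r\n");
```

If the ^M marks still show up after this, it may be worth checking that the substitutions run on the same string that gets written out. Also, if the script runs on Windows, Perl's text-mode output turns every "\n" back into CRLF, and calling binmode on the output filehandle switches that translation off.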