strip newlines in TD cell ?

Discussion in 'Perl Misc' started by Richard A. DeVenezia, Sep 29, 2003.

  1. Can't figure this one out...

    How can I strip all the newlines from the text between <TD and </TD>?

    I read in and join some HTML

    that I want to process as

    <TABLE><TR><TD>1 2 3</TD>
    <TR><TD>A B C</TD></TR></TABLE>

    Richard A. DeVenezia, Sep 29, 2003
  2. Maybe by using an HTML parser to parse HTML?
    Contrary to popular belief, parsing HTML correctly is close to rocket
    science, and nobody in their right mind would attempt to do it using REs.

    For further details please see the FAQ. 'perldoc -q HTML':
    "How do I remove HTML from a string?"

    Jürgen Exner, Sep 29, 2003
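As a rough illustration of the parser route suggested above, here is a minimal sketch that assumes the non-core HTML::TokeParser module (part of the HTML-Parser distribution on CPAN) is installed. The helper name flatten_td_text and the sample input are hypothetical, since the original input is not shown in the thread. The sketch rebuilds the document token by token and turns newlines into spaces only in text that sits inside a TD element.

```perl
use strict;
use warnings;
use HTML::TokeParser;    # non-core; assumed to be installed from CPAN

# Hypothetical helper: copy every token back out unchanged, except that
# newlines in text found inside a TD element become spaces.
sub flatten_td_text {
    my ($html) = @_;
    my $parser = HTML::TokeParser->new(\$html)
        or die "cannot create parser";
    my ($out, $depth) = ('', 0);

    while (my $token = $parser->get_token) {
        my $type = $token->[0];
        if ($type eq 'S') {          # start tag: [S, tag, attr, attrseq, text]
            $depth++ if $token->[1] eq 'td';    # tag names arrive lowercased
            $out .= $token->[4];
        }
        elsif ($type eq 'E') {       # end tag: [E, tag, text]
            $depth-- if $token->[1] eq 'td' && $depth > 0;
            $out .= $token->[2];
        }
        elsif ($type eq 'T') {       # text: [T, text, is_data]
            my $text = $token->[1];
            $text =~ tr/\n/ / if $depth;
            $out .= $text;
        }
        else {                       # comment, declaration, PI: source text is last
            $out .= $token->[-1];
        }
    }
    return $out;
}

# Assumed sample input with one value per line inside each cell.
my $sample = "<TABLE><TR><TD>1\n2\n3</TD>\n<TR><TD>A\nB\nC</TD></TR></TABLE>\n";
print flatten_td_text($sample);
```

Because the depth counter only looks at TD start and end tags, newlines between cells and rows are left alone, and attributes on the TD tag do not matter.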
  3. Use a module that can properly parse HTML.

    Then why did you say you wanted to _strip_ newlines?

    If you stripped newlines, you'd end up with:


    It appears that what you actually want is to replace newlines
    with spaces...

    s#(<TD>.*?</TD>)# (my $cell = $1) =~ tr/\n/ /; $cell #gse;

    But that does not produce output like your example either.

    I'll leave it to you to make it do whatever it is that you want done...
    Tad McClellan, Sep 29, 2003
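For readers who still want the quick regex route, the substitution above can be wrapped into a small self-contained sketch. It is not a substitute for a real parser: it assumes simple markup with no nested tables and a closing </TD> for every cell. The helper name flatten_td_cells and the sample input are hypothetical, as the original input is not shown in the thread. Compared with the one-liner, it also accepts attributes on the TD tag, matches case-insensitively, and uses a lexical variable rather than the global $a.

```perl
use strict;
use warnings;

# Hypothetical helper: replace newlines with spaces inside each
# <TD ...>...</TD> span. Assumes no nested tables and that every
# cell has an explicit closing tag.
sub flatten_td_cells {
    my ($html) = @_;
    # /s lets . match newlines, /i ignores tag case,
    # /e evaluates the replacement as Perl code, /g handles every cell.
    $html =~ s{(<TD\b[^>]*>.*?</TD>)}{ (my $cell = $1) =~ tr/\n/ /; $cell }gsie;
    return $html;
}

# Assumed sample input with one value per line inside each cell.
my $joined = "<TABLE><TR><TD>1\n2\n3</TD>\n<TR><TD>A\nB\nC</TD></TR></TABLE>\n";
print flatten_td_cells($joined);
```

Newlines outside the matched cells are untouched, so the line break between the two rows survives.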
