Discussion in 'Perl Misc' started by vivek_12315, Feb 13, 2013.

    I'm working on some Perl regex code in which I have to parse an HTML line like:

    <a href="/question?id=15422849"><p>MY text here 1</p><p>MY text here 2</p><p>MY text here 3</p></a>

    I am doing something like:
    $string =~ m/(.*)href(.*)/;

    But this does not give me what I want. I want something closer to the following text:

    "MY text here 1 MY text here 2 MY text here 3"

    Can someone give me some ideas?
    vivek_12315, Feb 13, 2013
  2. Your Question used to be Asked Frequently. Please see

    perldoc -q "remove html"

    Jürgen Exner, Feb 13, 2013
    brian d foy, Feb 13, 2013
  4. Actually for this particular example it is almost trivial(*):
    Of course this is going to fail as soon as the HTML code becomes a tiny
    bit more complex.

    *: almost because it doesn't add the space characters between the
    individual paragraph elements.
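
    The code from this post is not preserved in this copy of the thread. As a rough illustration only, and not necessarily what was posted, a regex approach for this exact input might look like the sketch below. It assumes flat <p>...</p> elements with no attributes and no nested tags; $string is the variable from the question, while @paragraphs and $text are hypothetical names. Unlike the version described above, this sketch joins the pieces with spaces.

```perl
use strict;
use warnings;

# Sample line from the question.
my $string = '<a href="/question?id=15422849"><p>MY text here 1</p><p>MY text here 2</p><p>MY text here 3</p></a>';

# In list context, a /g match returns every captured group.
# [^<]* keeps each capture inside a single <p> element,
# which is why this breaks on nested tags.
my @paragraphs = $string =~ m{<p>([^<]*)</p>}g;

# Join with single spaces to get the format asked for in the question.
my $text = join ' ', @paragraphs;

print "$text\n";
```

    For anything beyond a fixed, known line like this one, the modules mentioned in the FAQ entry are the safer route, since a pattern like this stops matching as soon as a paragraph contains another tag or the <p> carries an attribute.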

    Jürgen Exner, Feb 13, 2013
