Deleting element tags

Discussion in 'Perl Misc' started by Rafal Konopka, Jan 4, 2007.

  1. Hi,

    I need to delete element tags from many HTML files. The elements in
    question are 'b' and 'strong', but only if they have a child element 'a'.

    I'm using the HTML::TreeBuilder module and HTML::Element methods. The
    code correctly identifies the elements that I need. For debugging
    purposes, I create a hash associating each file name with all the
    elements that match my condition. Now the big question is: how do I
    remove the tags? I looked in several modules, but I couldn't find a
    method like $bx->starttag->delete()/$bx->endtag->delete() (where $bx
    is the element object in the code below).

    A secondary question: how can I output newlines after some element
    tags, to prettify the HTML output?

    Here's my solution so far:

    #!perl -w

    use strict;
    use HTML::TreeBuilder;

    # recursive listing of *.htm files (it's run on Windows XP)
    chomp(my @filelist = `DIR *.htm /s /b`);
    my %main_hash = ();    # file name => HTML of each matching element

    foreach my $f (@filelist) {

        my $tree = HTML::TreeBuilder->new();
        $tree->parse_file($f);
        my @bs = $tree->find_by_tag_name('strong','b');

        foreach my $bx (@bs) {

            # keep only 'b'/'strong' elements that contain an 'a'
            if ( $bx->find_by_tag_name('a') ) {
                push(@{$main_hash{$f}},$bx->as_HTML);
            }
        }
        print $tree->as_HTML(''," "), "\n";
        $tree->delete;
    }

    # debugging report: the matches found in each file
    foreach my $f (keys %main_hash) {
        print "File $f\n";
        print join("",@{$main_hash{$f}}), "\n";
    }
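
    One possible direction for the first question, offered as a sketch
    rather than a tested solution for these files: HTML::Element has a
    replace_with_content() method that replaces an element in its parent
    with the element's own content, which amounts to dropping the start
    and end tags. The helper name unwrap_bold_links and the sample HTML
    are made up for illustration. Like the code above, it matches an 'a'
    at any depth, not only a direct child.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical helper (the name is illustrative): unwrap every <b>/<strong>
# that contains an <a>, leaving the wrapped content in place.
sub unwrap_bold_links {
    my ($html) = @_;

    my $tree = HTML::TreeBuilder->new;
    $tree->parse_content($html);    # parse the string and finish the tree

    # Collect the matches first, then modify the tree.
    my @targets = grep { $_->find_by_tag_name('a') }
                  $tree->find_by_tag_name('strong', 'b');

    # replace_with_content() returns the now-empty element; delete it.
    $_->replace_with_content->delete for @targets;

    # A non-empty second argument asks as_HTML to indent its output.
    my $out = $tree->as_HTML('<>&', '  ');
    $tree->delete;
    return $out;
}

# Made-up sample input: one bold link and one plain bold word.
print unwrap_bold_links(
    '<p><b><a href="x.htm">link</a></b> and <b>plain</b></p>'
), "\n";
```

    On the secondary question, the second argument to as_HTML is the
    indent string; when it is given, the output is indented and broken
    across lines, which may already be enough prettifying.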
     
    Rafal Konopka, Jan 4, 2007
    #1