Reading text file

Discussion in 'Perl' started by Kevin B, Oct 16, 2003.

  1. Kevin B

    Kevin B Guest

    I have the following short script that I'm using to clean up the source of a
    web page in order to index and search the page:

    #!/usr/bin/perl
    #striphtml.pl

    undef $/;
    open FD, "< testfile1.txt" or die $!;

    while (<FD>) {
        #s/\r\n//gs;

        #s/^\s+$//;
        s/<.*?>//gs;
        trim();
        print "$_";
    }

    sub trim {
        my @out = @_ ? @_ : $_;
        $_ = join(' ', split(' ')) for @out;
        return wantarray ? @out : "@out";
    }


    The problem is that it leaves blank lines in the output, and using chomp
    does not remove them. What am I missing to clean up the lines?

    Kevin
     
    Kevin B, Oct 16, 2003
    #1
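A side note on the script above: the `trim()` call in the loop appears to have no effect on the output. The sub copies `$_` into `@out`, cleans up the copy, and returns it, but the call site discards the return value. The sketch below reuses the poster's `trim` unchanged; the sample string is made up for illustration.

```perl
use strict;
use warnings;

# trim() exactly as posted in the question.
sub trim {
    my @out = @_ ? @_ : $_;
    $_ = join(' ', split(' ')) for @out;   # modifies the copies in @out only
    return wantarray ? @out : "@out";
}

# Hypothetical sample text with extra whitespace and a blank line.
$_ = "  a   b  \n\n c ";
my $before = $_;

trim();                    # void context: the result is thrown away
my $unchanged = ($_ eq $before);

my $trimmed = trim();      # scalar context: the cleaned-up copy is returned

print $unchanged ? "unchanged\n" : "changed\n";
print "$trimmed\n";
```

Note that assigning the result back (`$_ = trim();`) on a slurped file would also replace every newline with a space, because `split(' ')` splits on any run of whitespace.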

  2. Roy Johnson

    Roy Johnson Guest

    This newsgroup is defunct. You will reach more people if you post in
    comp.lang.perl.misc instead.

    "Kevin B" <> wrote in message news:<GlAjb.17261$>...
    > undef $/;


    Ok, you're slurping the whole file in at once...

    > open FD, "< testfile1.txt" or die $!;
    >
    > while (<FD>) {


    No real point in a while, if you're getting the whole file in one
    read. Just do
    $_ = <FD>;
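A minimal sketch of that single-read approach. The lexical filehandle, the three-argument `open`, and the helper name `slurp_file` are assumptions for illustration and are not part of the original script.

```perl
use strict;
use warnings;

# Hypothetical helper: return the whole contents of a file from one read.
sub slurp_file {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    local $/;               # undefine the record separator inside this sub only
    my $content = <$fh>;    # a single read now returns the entire file
    close $fh;
    return $content;
}
```

Using `local $/` instead of a bare `undef $/;` keeps the change from leaking into later reads elsewhere in the program.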

    > s/<.*?>//gs;


    strip out all the tags...

    > print "$_";


    No need for the quotes. In this case, no need for an argument at all.
    Just
    print;

    > the problem is that it leaves blank lines in the output and the use of chomp
    > does not clean up. What am I missing to clean up the lines?


    Maybe something like
    tr/\n//s;
    or
    s/\n\s*\n/\n/g;
    ?
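The two suggestions above are not equivalent: `tr/\n//s` only squeezes runs of consecutive newlines, so a line holding nothing but spaces or tabs survives, while `s/\n\s*\n/\n/g` also drops whitespace-only lines. A small sketch of the difference, where the helper names and the sample text are hypothetical:

```perl
use strict;
use warnings;

# Hypothetical helper: squeeze each run of newlines down to one newline.
sub squeeze_newlines {
    my ($text) = @_;
    $text =~ tr/\n//s;
    return $text;
}

# Hypothetical helper: also remove lines that contain only whitespace.
sub drop_blank_lines {
    my ($text) = @_;
    $text =~ s/\n\s*\n/\n/g;
    return $text;
}

# Made-up sample, as it might look after the tags have been stripped.
my $sample = "Title\n\n\nfirst\n  \nsecond\n";

print squeeze_newlines($sample);   # keeps the line that holds two spaces
print drop_blank_lines($sample);   # removes that line as well
```

Since stripping tags from indented HTML tends to leave whitespace-only lines behind, the substitution is probably the better fit for this script.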
     
    Roy Johnson, Oct 16, 2003
    #2
