Concatenation

Discussion in 'Perl Misc' started by Daniel Bergquist, Jul 13, 2004.

  1. Consider the following chunk of code:
    --------------------------------------------------
    open (IN, "<:raw", "test2.txt") or die "Can't open test.txt";

    chomp($line = <IN>);

    # Capture excerpt
    $line =~ m/>([^<]+)/;

    # Copy first line of excerpt
    $pExcerpt = $1;

    # Next line
    chomp($line = <IN>);

    # Until we have reached the end of the section
    until($line =~ m/<\/p>/i) {

    # Capture useful text
    $line =~ m/([^<]+)/;
    chomp($line = <IN>);
    }

    # Capture the rest of the useful text
    $line =~ m/([^<]+)/;

    $pExcerpt = "$pExcerpt $1";

    print "final: $pExcerpt\n";
    -----------------------------------------------------------



    The file test2.txt is as follows:
    -------------------------------------------------
    <p class=p1>I consider myself fortunate to stand before you today as I
    make my
    defense against all the accusations of the Jews. <i>Acts 26:2</i></p>


    ----------------------------------------------

    When run:
    P:\WEBPOP\EXPERI~1>excerpt.pl
    defense against all the accusations of the Jews. you today as I make
    my

    P:\WEBPOP\EXPERI~1>


    When I change the concatenation as follows:
    $pExcerpt = "$1 $pExcerpt";
    The result is:
    P:\WEBPOP\EXPERI~1>excerpt.pl
    final: defense against all the accusations of the Jews. I consider
    myself fortunate
    to stand before you today as I make my

    P:\WEBPOP\EXPERI~1>

    Which is how I would expect it to work. Why does it not work the first
    way (which is the way I need it)?


    Perl reports itself as v5.8.3 built for MSWin32-x86-multi-thread,
    binary build 809 provided by ActiveState Corp.


    Thanks!

    Daniel Bergquist
     
    Daniel Bergquist, Jul 13, 2004
    #1

  2. Daniel Bergquist wrote:
    > Consider the following chunk of code:
    > --------------------------------------------------
    > open (IN, "<:raw", "test2.txt") or die "Can't open test.txt";
    >
    > chomp($line = <IN>);
    >
    > # Capture excerpt
    > $line =~ m/>([^<]+)/;
    >
    > # Copy first line of excerpt
    > $pExcerpt = $1;



    You should never use the dollar-digit variables unless you have
    first ensured that the match *succeeded*, since the variables
    are only changed when the match succeeds.


    $pExcerpt = $1 if $line =~ m/>([^<]+)/;
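
    A minimal sketch of that advice, using lines from the sample file in the original post. The helper name first_text is hypothetical, not part of anyone's code in this thread; it returns undef on a failed match, so a stale $1 left over from an earlier match cannot leak through.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the text between the first '>' and the
# next '<', or undef if the line does not match at all.
sub first_text {
    my ($line) = @_;
    # $1 is only read when the match itself succeeded.
    return $line =~ m/>([^<]+)/ ? $1 : undef;
}

my @lines = (
    '<p class=p1>I consider myself fortunate to stand before you today as I',
    'make my',
);

for my $line (@lines) {
    my $text = first_text($line);
    print defined $text ? "matched: $text\n" : "no match: $line\n";
}
```

    The second line contains no '>' character, so the match fails and the loop reports it instead of silently reusing the text captured from the first line.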


    (I hope you are not trying to parse HTML or XML with regular expressions...)


    > # Next line
    > chomp($line = <IN>);
    >
    > # Until we have reached the end of the section
    > until($line =~ m/<\/p>/i) {



    You should use a module that understands HTML for processing HTML data.


    > # Capture useful text
    > $line =~ m/([^<]+)/;



    The above line of code is useless. You don't put the captured text anywhere.

    What do you think that pattern match is doing for you?


    > chomp($line = <IN>);
    > }
    >
    > # Capture the rest of the useful text
    > $line =~ m/([^<]+)/;
    >
    > $pExcerpt = "$pExcerpt $1";



    $pExcerpt .= " $1";



    > Why does it not work the first
    > way(which is the way I need it)?



    It won't matter if you process it properly (with an HTML module rather
    than with regexes).
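
    For illustration, here is a minimal sketch using HTML::TokeParser (part of the HTML-Parser distribution on CPAN), assuming the module is installed. The helper name paragraph_excerpt and the choice to stop at the first <i> or </p> tag are assumptions made to mirror what the original regex was after; this is one possible approach, not a definitive one.

```perl
use strict;
use warnings;
use HTML::TokeParser;

# Hypothetical helper: returns the text of the first <p> element up to
# the first <i> or </p> tag, or undef if the document has no <p> tag.
sub paragraph_excerpt {
    my ($html) = @_;
    my $parser = HTML::TokeParser->new(\$html)
        or die "Can't create parser";
    return undef unless $parser->get_tag('p');
    # get_trimmed_text collapses runs of whitespace (line breaks included)
    # into single spaces and trims both ends.
    return $parser->get_trimmed_text('i', '/p');
}

# The sample paragraph from the original post, split over three lines.
my $html = join "\n",
    '<p class=p1>I consider myself fortunate to stand before you today as I',
    'make my',
    'defense against all the accusations of the Jews. <i>Acts 26:2</i></p>';

my $excerpt = paragraph_excerpt($html);
print "final: $excerpt\n" if defined $excerpt;
```

    Passing a file name such as "test2.txt" to new() instead of a scalar reference makes the parser read the document from that file, so the line-by-line loop and the manual concatenation are no longer needed.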


    --
    Tad McClellan SGML consulting
    Perl programming
    Fort Worth, Texas
     
    Tad McClellan, Jul 13, 2004
    #2

  3. > > # Capture useful text
    > > $line =~ m/([^<]+)/;
    >
    > The above line of code is useless. You don't put the captured text anywhere.
    >
    > What do you think that pattern match is doing for you?


    Heh, sorry, I had cut out a bunch of code and this is a work in
    progress so some lines don't look useful at the moment.


    > > chomp($line = <IN>);
    > > }
    > >
    > > # Capture the rest of the useful text
    > > $line =~ m/([^<]+)/;
    > >
    > > $pExcerpt = "$pExcerpt $1";
    >
    > $pExcerpt .= " $1";
    >
    > > Why does it not work the first
    > > way(which is the way I need it)?
    >
    > It won't matter if you process it properly (with an HTML module rather
    > than with regexes).


    Oops, after a couple of Google searches I now realize I have committed a
    crime against the Perl community. Pardon me, as I have been using Perl
    for only a few months. Although I have found the modules HTML::Parser
    and HTML::TokeParser, I am not entirely sure which to use, and I would
    like to read through a tutorial or two.


    Thanks!

    Daniel Bergquist
     
    Daniel Bergquist, Jul 14, 2004
    #3
  4. Never mind, I think I'm figuring one of the parsers out.

    Thanks!


    Daniel Bergquist
     
    Daniel Bergquist, Jul 14, 2004
    #4
