Endless loop in Text::Format::reformat()

Discussion in 'Perl Misc' started by Gunnar Hjalmarsson, Sep 24, 2006.

  1. It has happened recently (twice) that invoking CGI::ContactForm at my
    web site has resulted in endless loops, which have made the web server
    error log grow to about 700 Mb in a couple of seconds, thus consuming
    the remaining space of my VPS. :(

    This is what was written repeatedly to the error log: "Use of
    uninitialized value in concatenation (.) or string at
    /usr/lib/perl5/site_perl/5.8.1/CGI/ContactForm.pm line 600."

    Line 600 of ContactForm.pm consists of:

    push @output, "$1 ";

    and to see it in context, a URL to the CVS repository:
    http://gunnar.cc/cgi-bin/cvsweb.cgi/contactform/lib/CGI/ContactForm.pm?annotate=1.45;cvsroot=gunnar

    The code in question is the reformat() subroutine, which was copied from
    the module Text::Format. It's used to format the body nicely before
    sending a message.

    I have failed to figure out what kind of data that might cause
    reformat() to go into those endless loops. Therefore, any idea of what
    the problem is would be much appreciated.

    --
    Gunnar Hjalmarsson
    Email: http://www.gunnar.cc/cgi-bin/contact.pl
     
    Gunnar Hjalmarsson, Sep 24, 2006
    #1
    1. Advertising

  2. Gunnar Hjalmarsson wrote:
    > It has happened recently (twice) that invoking CGI::ContactForm at my
    > web site has resulted in endless loops, which have made the web server
    > error log grow to about 700 Mb in a couple of seconds, thus consuming
    > the remaining space of my VPS. :(
    >
    > This is what was written repeatedly to the error log: "Use of
    > uninitialized value in concatenation (.) or string at
    > /usr/lib/perl5/site_perl/5.8.1/CGI/ContactForm.pm line 600."
    >
    > Line 600 of ContactForm.pm consists of:
    >
    > push @output, "$1 ";
    >
    > and to see it in context,


    if ( $line =~ /^(.{$min,$opt1}) (.*)/ ||
    $line =~ /^(.{$min,$max1}) (.*)/ || $line =~ /^(.{$min,})? (.*)/ )
    {
    push @output, "$1 ";
    $line = $2;
    }

    The way that $1 can be undef is if the first to two patterns don't
    match and the third pattern matches but the optional subexpression does
    not. This will happen if the first character of $line is a space.

    I can't immediately see the cause of the endless looping but there's
    your warning.
     
    Brian McCauley, Sep 24, 2006
    #2
    1. Advertising

  3. Gunnar Hjalmarsson

    Bob Walton Guest

    Gunnar Hjalmarsson wrote:
    > It has happened recently (twice) that invoking CGI::ContactForm at my
    > web site has resulted in endless loops, which have made the web server
    > error log grow to about 700 Mb in a couple of seconds, thus consuming
    > the remaining space of my VPS. :(
    >
    > This is what was written repeatedly to the error log: "Use of
    > uninitialized value in concatenation (.) or string at
    > /usr/lib/perl5/site_perl/5.8.1/CGI/ContactForm.pm line 600."
    >
    > Line 600 of ContactForm.pm consists of:
    >
    > push @output, "$1 ";
    >
    > and to see it in context, a URL to the CVS repository:
    > http://gunnar.cc/cgi-bin/cvsweb.cgi/contactform/lib/CGI/ContactForm.pm?annotate=1.45;cvsroot=gunnar
    >
    >
    > The code in question is the reformat() subroutine, which was copied from
    > the module Text::Format. It's used to format the body nicely before
    > sending a message.
    >
    > I have failed to figure out what kind of data that might cause
    > reformat() to go into those endless loops. Therefore, any idea of what
    > the problem is would be much appreciated.
    >


    Hmmm...almost certainly this is happening because $1 is undef at line
    600, since that is the only concatenation (interpolation actually does a
    concatenation). Looking at your code:

    ....
    595:} elsif ( $line =~ /^(.{$min,$opt1}) (.*)/ ||
    596: $line =~ /^(.{$min,$max1}) (.*)/ ||
    $line =~ /^(.{$min,})? (.*)/ ) {
    597: # 1. Try to find a string as long as opt_length.
    598: # 2. Try to find a string as long as max_length.
    599: # 3. Take the first word.
    600: push @output, "$1 ";
    ....

    it looks like the third alternative could match with $1 undef if $line
    started with a space character and there were no other space characters
    in $line (in which case the first two alternatives would fail, assuming
    $min>0). Is that a possibility?
    --
    Bob Walton
    Email: http://bwalton.com/cgi-bin/emailbob.pl
     
    Bob Walton, Sep 24, 2006
    #3
  4. Brian McCauley wrote:
    > Gunnar Hjalmarsson wrote:
    > > It has happened recently (twice) that invoking CGI::ContactForm at my
    > > web site has resulted in endless loops, which have made the web server
    > > error log grow to about 700 Mb in a couple of seconds, thus consuming
    > > the remaining space of my VPS. :(
    > >
    > > This is what was written repeatedly to the error log: "Use of
    > > uninitialized value in concatenation (.) or string at
    > > /usr/lib/perl5/site_perl/5.8.1/CGI/ContactForm.pm line 600."
    > >
    > > Line 600 of ContactForm.pm consists of:
    > >
    > > push @output, "$1 ";
    > >
    > > and to see it in context,

    >
    > if ( $line =~ /^(.{$min,$opt1}) (.*)/ ||
    > $line =~ /^(.{$min,$max1}) (.*)/ || $line =~ /^(.{$min,})? (.*)/ )
    > {
    > push @output, "$1 ";
    > $line = $2;
    > }
    >
    > The way that $1 can be undef is if the first to two patterns don't
    > match and the third pattern matches but the optional subexpression does
    > not. This will happen if the first character of $line is a space.
    >
    > I can't immediately see the cause of the endless looping but there's
    > your warning.


    Got it...

    Look at _stuff.

    sub _stuff {
    my ($text, $num_quotes) = @_;
    if ($text =~ /^ / || $text =~ /^>/ || $text =~ /^From / ||
    $num_quotes > 0) {
    return " $text";
    }
    $text;
    }

    If $num_quotes=0 and $line starts with a space then the line

    $line = '>' x $num_quotes . _stuff($line, $num_quotes);

    Simply prepends a space to $line and then the code quoted in my
    previous message simply strips of a leading space.

    There's your infinite loop.
     
    Brian McCauley, Sep 24, 2006
    #4
  5. Brian McCauley wrote:
    > Brian McCauley wrote:
    >>
    >>if ( $line =~ /^(.{$min,$opt1}) (.*)/ ||
    >> $line =~ /^(.{$min,$max1}) (.*)/ || $line =~ /^(.{$min,})? (.*)/ )
    >>{
    >> push @output, "$1 ";
    >> $line = $2;
    >>}
    >>
    >>The way that $1 can be undef is if the first to two patterns don't
    >>match and the third pattern matches but the optional subexpression does
    >>not. This will happen if the first character of $line is a space.
    >>
    >>I can't immediately see the cause of the endless looping but there's
    >>your warning.

    >
    > Got it...
    >
    > Look at _stuff.
    >
    > sub _stuff {
    > my ($text, $num_quotes) = @_;
    > if ($text =~ /^ / || $text =~ /^>/ || $text =~ /^From / ||
    > $num_quotes > 0) {
    > return " $text";
    > }
    > $text;
    > }
    >
    > If $num_quotes=0 and $line starts with a space then the line
    >
    > $line = '>' x $num_quotes . _stuff($line, $num_quotes);
    >
    > Simply prepends a space to $line and then the code quoted in my
    > previous message simply strips of a leading space.


    No, in that case it strips off both the leading spaces, since $min is 1,
    so the contents of $line won't be identical at the next iteration of the
    while loop.

    > There's your infinite loop.


    Unfortunately I think it's more subtle than that. Please note that
    CGI::ContactForm has worked just fine for about 3.5 years, and the
    infinite loop problem has just appeared twice. I suspect it happened
    when some spam robot invoked the script. Maybe I'd better concentrate on
    preventing such robots from successfully submitting their crap...

    Anyway, thanks for trying!

    --
    Gunnar Hjalmarsson
    Email: http://www.gunnar.cc/cgi-bin/contact.pl
     
    Gunnar Hjalmarsson, Sep 25, 2006
    #5
  6. Bob Walton wrote:
    > Hmmm...almost certainly this is happening because $1 is undef at line
    > 600, since that is the only concatenation (interpolation actually does a
    > concatenation). Looking at your code:
    >
    > ...
    > 595:} elsif ( $line =~ /^(.{$min,$opt1}) (.*)/ ||
    > 596: $line =~ /^(.{$min,$max1}) (.*)/ ||
    > $line =~ /^(.{$min,})? (.*)/ ) {
    > 597: # 1. Try to find a string as long as opt_length.
    > 598: # 2. Try to find a string as long as max_length.
    > 599: # 3. Take the first word.
    > 600: push @output, "$1 ";
    > ...
    >
    > it looks like the third alternative could match with $1 undef if $line
    > started with a space character and there were no other space characters
    > in $line (in which case the first two alternatives would fail, assuming
    > $min>0). Is that a possibility?


    Don't think so. Because of this line:

    $line = '>' x $num_quotes . _stuff($line, $num_quotes);

    $line will not begin with a single space when the regex at line 596 is run.

    See also reply to Brian in this thread.

    Thanks for trying, Bob!

    --
    Gunnar Hjalmarsson
    Email: http://www.gunnar.cc/cgi-bin/contact.pl
     
    Gunnar Hjalmarsson, Sep 25, 2006
    #6
  7. Gunnar Hjalmarsson

    Bob Walton Guest

    Gunnar Hjalmarsson wrote:
    > Bob Walton wrote:

    ....
    Aha. Here is an input line that causes an infinite loop complete with
    your warning :

    $line=' >'.('a' x 114);

    Try input to reformat() that contains a line like that. The number of
    a's isn't important as long as there are enough of them.

    Before the while loop, _unstuff gets rid of the leading space, then in
    the while loop, _stuff puts the space back in front because of the >,
    and then the third alternative in the elseif matches with $1 undef and
    $2 the string starting with >. Then the while repeats on exactly the
    same $line forever.
    --
    Bob Walton
    Email: http://bwalton.com/cgi-bin/emailbob.pl
     
    Bob Walton, Sep 26, 2006
    #7
  8. Bob Walton wrote:
    > Aha. Here is an input line that causes an infinite loop complete with
    > your warning :
    >
    > $line=' >'.('a' x 114);
    >
    > Try input to reformat() that contains a line like that. The number of
    > a's isn't important as long as there are enough of them.
    >
    > Before the while loop, _unstuff gets rid of the leading space, then in
    > the while loop, _stuff puts the space back in front because of the >,
    > and then the third alternative in the elseif matches with $1 undef and
    > $2 the string starting with >. Then the while repeats on exactly the
    > same $line forever.


    I can't believe it was that simple!! Thought I'd tried everything and
    anything; obviously not. :(

    Thank you _very_ much, Bob!

    Since I can't figure out why the first capturing parenthesis is
    optional, I guess I'll simply remove the question mark.

    $line =~ /^(.{$min,})? (.*)/
    -------------------------^

    Thanks again!

    --
    Gunnar Hjalmarsson
    Email: http://www.gunnar.cc/cgi-bin/contact.pl
     
    Gunnar Hjalmarsson, Sep 26, 2006
    #8
  9. Gunnar Hjalmarsson wrote:
    > Brian McCauley wrote:
    > > Brian McCauley wrote:
    > >>
    > >>if ( $line =~ /^(.{$min,$opt1}) (.*)/ ||
    > >> $line =~ /^(.{$min,$max1}) (.*)/ || $line =~ /^(.{$min,})? (.*)/ )
    > >>{
    > >> push @output, "$1 ";
    > >> $line = $2;
    > >>}
    > >>

    > > If $num_quotes=0 and $line starts with a space then the line
    > >
    > > $line = '>' x $num_quotes . _stuff($line, $num_quotes);
    > >
    > > Simply prepends a space to $line and then the code quoted in my
    > > previous message simply strips of a leading space.

    >
    > No, in that case it strips off both the leading spaces, since $min is 1,
    > so the contents of $line won't be identical at the next iteration of the
    > while loop.


    Ah, well since $1 is null we know that /^(.+)? (.*)/ is matching but
    the first subexpression is not. As far as I can see this can only be is
    there's only one space. Which as you said means I can't explain the
    infinite loop.
     
    Brian McCauley, Sep 27, 2006
    #9
  10. Brian McCauley wrote:
    > Ah, well since $1 is null we know that /^(.+)? (.*)/ is matching but
    > the first subexpression is not. As far as I can see this can only be is
    > there's only one space. Which as you said means I can't explain the
    > infinite loop.


    Right, but you were close - see Bob's explanation in another sub-thread.

    I'm about to update CGI::ContactForm, and have just reported the bug in
    Text::Flowed: http://rt.cpan.org/Ticket/Display.html?id=21739

    --
    Gunnar Hjalmarsson
    Email: http://www.gunnar.cc/cgi-bin/contact.pl
     
    Gunnar Hjalmarsson, Sep 27, 2006
    #10
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. =?Utf-8?B?QmlsbCBCb3Jn?=

    Endless loop on delete auth cookie/abandon session

    =?Utf-8?B?QmlsbCBCb3Jn?=, Feb 2, 2005, in forum: ASP .Net
    Replies:
    0
    Views:
    714
    =?Utf-8?B?QmlsbCBCb3Jn?=
    Feb 2, 2005
  2. Alexandre
    Replies:
    4
    Views:
    641
    Lasse Reichstein Nielsen
    Apr 12, 2005
  3. Leo
    Replies:
    1
    Views:
    471
    Thomas Heller
    Jul 4, 2003
  4. Claudio Grondi
    Replies:
    75
    Views:
    1,184
    Tim Peters
    Jan 21, 2006
  5. Isaac Won
    Replies:
    9
    Views:
    458
    Ulrich Eckhardt
    Mar 4, 2013
Loading...

Share This Page