Perl reliability?

MikeJohnson

Hi,

I have written a Perl application of around 4000 lines.

It parses through a log file, builds up a large hash, then writes to a
SQL Server database using DBI.

For efficiency, I have broken up the processing into two ithreads.
The main thread parses the log file and writes scalars to a
Thread::Queue.
A database writing thread reads the scalars from the Thread::Queue,
and writes records to a local SQL Server database using DBI.
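In outline, the two-thread layout looks like this (a simplified sketch: the real log parsing and the DBI writes are elided, and an undef sentinel tells the writer to finish):

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

my $queue = Thread::Queue->new();

# Writer thread: dequeue scalars until the undef sentinel arrives.
my $writer = threads->create(sub {
    my $count = 0;
    while (defined(my $record = $queue->dequeue())) {
        # In the real program a DBI $sth->execute(...) would go here.
        $count++;
    }
    return $count;
});

# Main thread: parse the log and enqueue one scalar per record.
for my $line ("line 1", "line 2", "line 3") {
    $queue->enqueue($line);
}
$queue->enqueue(undef);    # sentinel: no more records coming

my $written = $writer->join();
print "wrote $written records\n";
```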

The program uses a second DBI connection to a SQLite database,
and Win32::Mutexes to coordinate multiple instances.
It also writes to the Windows Event Log with Win32::EventLog.

We are running ActivePerl build 820 on Windows XP.

THE PROBLEM IS THAT PERL ITSELF KEEPS CRASHING!

"Application Error Application Failure perl.exe 5.8.8.819 in
perl58.dll 5.8.8.819 at offset 00085af7 perl.exe 5.8.8.819 perl58.dll
5.8.8.819 00085af7"

These crashes are intermittent. Sometimes the program will crash on
one log file, and sometimes it won't. :(

My questions are-

1. Is Perl more stable on other platforms?

2. Or should we drop the money and purchase ActiveState Enterprise
Edition Perl, with hopefully good support?

3. If the program has bugs, could this be the cause of the Perl
interpreter crashing?
Or does Perl try to insulate application programmers from errors
akin to a Java or .NET runtime?


Thank you very much for any advice or recommendations,

Mike
 
Jamie

MikeJohnson said:
I have written a Perl application of around 4000 lines.

It parses through a log file, builds up a large hash, then writes to a
SQL Server database using DBI.

For efficiency, I have broken up the processing into two ithreads.
[snip]

THE PROBLEM IS THAT PERL ITSELF KEEPS CRASHING!
[snip]

These crashes are intermittent. Sometimes the program will crash on
one log file, and sometimes it won't. :(

My questions are-

1. Is Perl more stable on other platforms?

2. Or should we drop the money and purchase ActiveState Enterprise
Edition Perl, with hopefully good support?

3. If the program has bugs, could this be the cause of the Perl
interpreter crashing?

Threading issues, maybe?
Or does Perl try to insulate application programmers from errors
akin to a Java or .NET runtime?

Sounds to me like it's a problem with one (or more) of the loaded modules.
I've seen crashes related to BerkeleyDB in the long-distant past (corrupt
files, generally). I've also seen problems (again, years ago) related to
DBDs; in my case, DB2.

If you could try it somehow without threading, that might tell you something
about the problem (a module may not be thread safe, or may be thread safe
but have some contention over memory access; I'm not sure, as I generally
stick to fork() with Perl code).

I'd try to make it crash without threads if I were trying to solve a problem
like that. The next step would be removing all module-related functions (as
much as possible) and gradually bringing them back online until it crashes.
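For instance, a first de-threading experiment might just call the writer code inline instead of going through the queue (hypothetical sketch; WriteRecord stands in for the real DBI insert routine):

```perl
use strict;
use warnings;

# Hypothetical stand-in for the real DBI-based insert routine.
sub WriteRecord {
    my ($record) = @_;
    # A DBI $sth->execute(...) would go here in the real program.
    return 1;
}

my @lines = ("first log line", "second log line");

my $written = 0;
for my $line (@lines) {
    # Parse, then write immediately: no queue, no second thread.
    $written += WriteRecord($line);
}
print "wrote $written records\n";
```

If the crashes disappear in this form, the threading layer (or a module that is not thread safe) is the prime suspect.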

Jamie
 
gf

It would help if you showed code. Debugging apps without code is
tough.

Do you have warnings and strict enabled?

Do you have PrintError and RaiseError enabled in the DBI connect()
statement when you create your database handles? If not, are you
testing every call to the DBI to handle errors?

Have you read the "Threads and Thread Safety" section of the DBI docs,
especially the part that says "Using DBI with perl threads is not yet
recommended for production environments. For more information see
http://www.perlmonks.org/index.pl?node_id=288022"?

I built a multi-threaded spider using DBI and LWP that runs a minimum
of 10 LWP sessions, each feeding data to another thread that handles
all the database writes. It's been stable, even when chewing on REALLY
big customer sites analyzing their pages. (I've had a lot more LWP
sessions running than 10, but I don't remember if it was on our
biggest customers or just during testing. Either way it ran without
crashing.)

Because you're seeing intermittent errors, I'd suggest making sure
all your DBI calls have their error status checked afterward to make
sure they succeed, or make sure you've enabled PrintError and
RaiseError. If you go (or have gone) with the latter, then the DBI
will crash out when something happens that it doesn't like. Well...
not really crash... more a controlled cessation of program
execution. :)

You could try wrapping your DBI calls in eval{} blocks as an
additional attempt to keep the code running, but sometimes it's better
to just get the inevitable over with quickly so you can get to the
debugging phase with the real error messages intact.
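The eval{} pattern itself is just this (plain-Perl sketch; a failing DBI call with RaiseError enabled would die in the same way):

```perl
use strict;
use warnings;

# Stand-in for a DBI call with RaiseError enabled: it dies on failure.
sub risky_insert {
    my ($ok) = @_;
    die "insert failed: constraint violation\n" unless $ok;
    return 1;
}

my $result = eval { risky_insert(0) };
if ($@) {
    # The real error text is still intact here - log it, then decide
    # whether to retry the operation or bail out.
    warn "caught: $@";
    $result = 0;
}
print "result=$result\n";
```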

Anyway, that's the sort of stuff I'd check without having access to a
minimum code sample reproducing the problem.
 
gf

On Apr 4, 5:15 pm, MikeJohnson wrote:
It parses through a log file, builds up a large hash, then writes to a
SQL Server database using DBI.

This is a pretty common task - something we do here a lot, but I've
never seen a need for threads or even for building a large data
structure to handle the log parsing and insertion.

Are you doing some sort of checks and cross-referencing of the
contents of the log file before insertion? A lot of times I think in
that way, and then later remember (or get "remembered" by our VP who's
a DB whiz) that the database has a lot of built-in functionality that
can replace front-end coding. That reduces the problem to simply
loading the data, cleaning it up, inserting it, and letting the
database handle uniqueness or collating.

Just something to think about.
 
MikeJohnson

My thanks to Jamie and gf for your excellent informative replies.

I plan on removing the threading now.

I saw the ActiveState graphical debugger crash at program-exit time
once I added threading-
not an auspicious sign.

I carefully read the threading provisos in `perldoc threads`,
although not the "Things you need to know before programming Perl
ithreads" info referenced above.

The program uses a very simple threading paradigm- one reader and one
writer.
I do not share DBI handles across threads
(although I do use DBI on separate connections in both threads).

I enable PrintError, but not RaiseError.
I check the result of every DBI call explicitly.
My code to obtain a database connection is below for example.

# Note: the attribute hash must be the FOURTH argument to connect();
# user and password are undef here for a trusted connection.
my $dbh = DBI->connect ($dbi_connectionstring, undef, undef,
    { PrintError => 1, RaiseError => 0 });

if (! defined($dbh))
{
    my $debug_msg = "Error connecting!\n";
    $debug_msg .= "Error was:\n";
    $debug_msg .= "$DBI::errstr\n";  # $DBI::errstr is the error received from the SQL server
    &LogErrorMessage ($debug_msg);

    return 0;
}

# Note: Disabling AutoCommit here, as opposed to as a connection
# parameter, seems to work.
$dbh->{AutoCommit} = 0;

# Note: from: http://backpan.cpan.org/authors/id/T/TI/TIMB/DBI_WhatsNewTalk_200607.pdf
$dbh->{ShowErrorStatement} = 1;


I also added a $SIG{__WARN__} handler to try to catch and log all
warnings normally sent to STDERR.
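A minimal version of such a handler looks like this (sketch: the real program would call its LogWarningMessage() routine instead of pushing to an array):

```perl
use strict;
use warnings;

my @logged;

# Route every warning through our own logger instead of STDERR.
$SIG{__WARN__} = sub {
    my ($msg) = @_;
    push @logged, $msg;    # the real program logs via LogWarningMessage()
};

warn "something odd happened\n";
print "captured: $logged[0]";
```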

Thanks again for your replies.
Hopefully the program will run stably once threading is removed.

I considered looking into creating a debug build of Perl from source on
Mac OS X or Linux.
That way, if the program crashed, I would have some symbolic stack trace
info.
I do not know if a debug build with symbol tables is available through
ActiveState.

Thank you,
Mike
 
MikeJohnson

FYI here is my code to write records to a database.
I have not had time to add prototypes to functions.
Your comments on my errors are welcomed...
Thanks,
Mike

# Insert all saved rows in the database.
# (Hopefully DBD::ODBC will attempt to insert all these records efficiently.)
# NOTE: The record-at-a-time retry code currently uses a hard-coded field count.
sub WriteRecordsToDatabase
{
    my $dbConnection = $_[0];

    my $insert_statement_handle = $_[1];

    # Third parameter is a reference to an array of rows.
    my $ref_array_of_rows = $_[2];

    # Fourth parameter is a reference to a scalar.
    my $records_written = $_[3];

    my ($debug_msg, $rc);

    #
    # Sanity-check parameters
    #
    if ((!$dbConnection) || (!$insert_statement_handle) ||
        (!$ref_array_of_rows) || (!$records_written))
    {
        $debug_msg = "WriteRecordsToDatabase: Bad parameter(s)!\n";
        &LogErrorMessage ($debug_msg);
        return 0;
    }

    my @tuple_status;
    my $next_record_to_write = 0;
    my $records_just_written = 0;

    eval
    {
        # Internally calls an anonymous subroutine which continually shifts
        # one element at a time off the dereferenced array reference.
        # $rc = $insert_statement_handle->execute_for_fetch ( sub { shift @$ref_array_of_rows }, \@tuple_status );

        # Internally calls an anonymous subroutine which returns one element
        # at a time from the dereferenced array reference.
        $rc = $insert_statement_handle->execute_for_fetch (
            sub { return $ref_array_of_rows->[$next_record_to_write++] },
            \@tuple_status );
    };
    if ($@)
    {
        # NOTE: Generate a more descriptive error message using @tuple_status!
        $debug_msg = "WriteRecordsToDatabase: " . $@ . "\n";
        &LogErrorMessage ($debug_msg);
        return $rc;
    }
    if (!$rc)
    {
        $debug_msg = "WriteRecordsToDatabase: bad return code: " . $DBI::errstr . "\n";
        &LogWarningMessage ($debug_msg);

        #==========================================================================
        # After sleeping a short random time (to try to let any DB congestion
        # dissipate), re-insert just those records that failed in
        # execute_for_fetch, individually using execute.
        #==========================================================================
        #
        # From the DBI documentation:
        # "If \@tuple_status is passed then the execute_for_fetch method uses
        # it to return status information. The tuple_status array holds one
        # element per tuple. If the corresponding execute() did not fail then
        # the element holds the return value from execute(), which is
        # typically a row count. If the execute() did fail then the element
        # holds a reference to an array containing
        # ($sth->err, $sth->errstr, $sth->state)."
        #
        # To grab a list of those tuples that failed:
        #   my @errors = grep { ref $_ } @tuple_status;
        #
        # To print a sample list of those tuples that failed:
        #   if (!$rc)
        #   {
        #       for my $tuple (0..@last_names-1)
        #       {
        #           my $status = $tuple_status[$tuple];
        #           $status = [0, "Skipped"] unless defined $status;
        #           next unless ref $status;
        #           printf "Failed to insert (%s, %s): %s\n",
        #               $first_names[$tuple], $last_names[$tuple], $status->[1];
        #       }
        #   }

        # Sleep a random period between 0 and 60 seconds, to try to let any
        # DB congestion dissipate.
        sleep (int (rand (60)));

        for my $retry_index (0..@$ref_array_of_rows - 1)
        {
            my $retry_tuple = $tuple_status[$retry_index];

            if (ref $retry_tuple)
            {
                &LogWarningMessage ("WriteRecordsToDatabase: Retrying individual insert after execute_for_fetch failed with: $retry_tuple->[1]\n");

                my $a = $ref_array_of_rows->[$retry_index][0];
                my $b = $ref_array_of_rows->[$retry_index][1];
                my $c = $ref_array_of_rows->[$retry_index][2];
                my $d = $ref_array_of_rows->[$retry_index][3];
                my $e = $ref_array_of_rows->[$retry_index][4];

                if (($insert_statement_handle->execute ($a, $b, $c, $d, $e)) != 1)
                {
                    $debug_msg = "WriteRecordsToDatabase: execute failed with: " . $DBI::errstr . "\n";
                    &LogErrorMessage ($debug_msg);
                    return -1;
                }

                eval
                {
                    $rc = $dbConnection->commit();
                };

                #
                # Note: Can commit fail with a retry-able error on SQL Server?
                #
                if ($@)
                {
                    $debug_msg = "WriteRecordsToDatabase: execute: commit failed: " . $@ . "\n";
                    &LogErrorMessage ($debug_msg);
                    return $rc;
                }
                if (!$rc)
                {
                    $debug_msg = "WriteRecordsToDatabase: execute: commit: bad return code: " . $DBI::errstr . "\n";
                    &LogErrorMessage ($debug_msg);
                    return -1;
                }

                $records_just_written++;

            } # tuple failed

            else
            {
                # Tuple succeeded - the record must've been written.
                $records_just_written++;
            }

        } # for every row in array of rows

    } # if execute_for_fetch failed

    else
    {
        $records_just_written = $rc;

        eval
        {
            $rc = $dbConnection->commit();
        };

        #***********************************************************************
        # Notes:
        #
        # 1. In the case of SQLite, commit can fail with SQLITE_BUSY.
        #    If so, consider sleeping then retrying the commit.
        #
        # 2. However, also in the case of SQLite, if all access to the local
        #    database is controlled by mutex, commit is more likely to
        #    succeed, because only one process at a time will be reading or
        #    writing the database.
        #
        # 3. Can commit fail with a retry-able error on SQL Server?
        #***********************************************************************

        if ($@)
        {
            $debug_msg = "WriteRecordsToDatabase: execute_for_fetch: commit failed: " . $@ . "\n";
            &LogErrorMessage ($debug_msg);
            return $rc;
        }
        if (!$rc)
        {
            $debug_msg = "WriteRecordsToDatabase: execute_for_fetch: commit: bad return code: " . $DBI::errstr . "\n";
            &LogErrorMessage ($debug_msg);
            return -1;
        }
    }

    $$records_written += $records_just_written;

    return $records_just_written;

} # WriteRecordsToDatabase
 
Dr.Ruud

MikeJohnson schreef:
my $debug_msg = "Error connecting!\n";
$debug_msg .= "Error was:\n";
$debug_msg .= "$DBI::errstr\n"; # $DBI::errstr is the error
received from the SQL server
&LogErrorMessage ($debug_msg);

Alternative:

LogErrorMessage(<<"EOT");
Error connecting!
Error was:
$DBI::errstr
received from the SQL server
EOT
 
