Perl program and database locked but alive

TF

Is there any way to intervene in a program that is still running, but
locked up during its database read/write subs, to see where the lockup is
coming from?
The problem cannot be reproduced on demand, and appears to be the result
of corrupted web page data, or of the storage of that data.

Thanks
 
gnari

TF said:
Is there any way to intervene in a program that is still running, but
locked up during its database read/write subs, to see where the lockup is
coming from?

Print debug info into a log file; when the deadlock occurs, read the log
file.

You give no info about your application, or what kind of locking is used,
but often the risk of deadlocks can be reduced by being careful about the
order of updates within transactions.
For example, if in one place you have a transaction that updates a, then b,
and in another place a transaction that updates b, then a, you have a
greatly increased risk of deadlocks.
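
A minimal sketch of that ordering rule, assuming DBI with transactions;
the DSN, credentials, table names, and values below are all hypothetical:

use strict;
use warnings;
use DBI;

# Hypothetical connection; substitute your own DSN and credentials.
my $dbh = DBI->connect('dbi:Pg:dbname=test', 'user', 'pass',
    { RaiseError => 1, AutoCommit => 0 });

# Every transaction updates table a before table b. If another code
# path updated b first and then a, two processes could each hold one
# lock while waiting for the other: a classic deadlock.
$dbh->do('UPDATE a SET val = ? WHERE id = ?', undef, 42, 1);
$dbh->do('UPDATE b SET val = ? WHERE id = ?', undef, 43, 1);
$dbh->commit;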

gnari
 
James Willmore

Is there any way to intervene in a program that is still running, but
locked up during its database read/write subs, to see where the lockup is
coming from?
The problem cannot be reproduced on demand, and appears to be the result
of corrupted web page data, or of the storage of that data.

I'm thinking you're using the DBI module, but can't be sure ... because
you have posted no code related to your issue :)

If you're using the DBI module and think that it's having trouble with the
database it's connecting to, use the 'trace' method to get more
information; see `perldoc DBI` for the details.
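
For instance, a minimal trace setup might look like this (the trace
levels and log path are just examples):

use strict;
use warnings;
use DBI;

# Level 2 logs each DBI method call with arguments and results; the
# second argument redirects trace output from STDERR into a file.
DBI->trace(2, '/tmp/dbi-trace.log');

my $dbh = DBI->connect('dbi:SQLite:dbname=test.db', '', '',
    { RaiseError => 1 });
$dbh->trace(1);    # or trace just this one handle, at its own level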

Another way to do CGI script debugging is to run the script at the command
line to see what happens (again ... no code, no more specific help :) ).

HTH

--
Jim

Copyright notice: all code written by the author in this post is
released under the GPL. http://www.gnu.org/licenses/gpl.txt
for more information.

a fortune quote ...
Common sense is the collection of prejudices acquired by age
eighteen. -- Albert Einstein
 
TF

TF said:
Is there any way to intervene in a program that is still running, but
locked up during its database read/write subs, to see where the lockup is
coming from?
The problem cannot be reproduced on demand, and appears to be the result
of corrupted web page data, or of the storage of that data.

Thanks

O.K., more info.
Can't post any code, as it is over 15K lines.

I'm getting data from the web and storing it into an MLDBM database.

The hang is most likely one of the following:

1) A non-response from a web request, though I'm not sure this can happen.
2) The data-checking routine has gone into a loop.
3) Corrupt data has been written to the database, causing a lockup on
later access.

The ties & unties are all checked and O.K., so it can't be that.

If I could just intervene and find out what line of code it is hung on,
that would be great.
No CGI is involved here.

Thanks
 
Jay Tilton

: > Is there any way to intervene in a program that is still running, but
: > locked up during its database read/write subs, to see where the lockup
: > is coming from?
: > The problem cannot be reproduced on demand, and appears to be the
: > result of corrupted web page data, or of the storage of that data.
:
: O.K., more info.
: Can't post any code, as it is over 15K lines.
:
: I'm getting data from the web and storing it into an MLDBM database.
:
: The hang is most likely one of the following:
:
: 1) A non-response from a web request, though I'm not sure this can happen.
: 2) The data-checking routine has gone into a loop.
: 3) Corrupt data has been written to the database, causing a lockup on
: later access.
:
: The ties & unties are all checked and O.K., so it can't be that.
:
: If I could just intervene and find out what line of code it is hung on,
: that would be great.

A simple signal handler could help there.

use Carp qw(confess);
# confess() dies with a full stack backtrace, showing the exact line
# the program was sitting on when the signal arrived.
$SIG{INT} = sub { confess "Signal caught" };

Then the process will terminate with a call stack backtrace when you send
that signal to it. See perlvar and perlipc for details.

: No CGI is involved here.

Huh? Where is the "web request" coming from if not through CGI?
 
gnari

Jay Tilton said:
: No CGI is involved here.

Huh? Where is the "web request" coming from if not through CGI?

I understand it as: the script is doing a series of HTTP client requests,
doing some checks, and storing some results to a database.

In that case it is not a question of the database deadlocks the original
post gave me the impression of, unless multiple parallel requests are
being made, in which case file locking on the database needs to be done.

To TF:
I suggest you just do more debugging: log extensively to a file, store the
last input on disk, and so on. Then, when the lockup happens, tail the log
file to see where in the script you were, look at the last input, and so on.

Then you might feed your script the last stored input again, to see if it
is the data that does it. I am sure you have better ideas than we do on
how to debug your program, but the main rule is to strip away everything
that is not needed to exhibit the problem, until what is left is so little
that the cause becomes obvious.

gnari
 
TF

Jay Tilton said:
: > Is there any way to intervene in a program that is still running, but
: > locked up during its database read/write subs, to see where the lockup
: > is coming from?
: > The problem cannot be reproduced on demand, and appears to be the
: > result of corrupted web page data, or of the storage of that data.
:
: O.K., more info.
: Can't post any code, as it is over 15K lines.
:
: I'm getting data from the web and storing it into an MLDBM database.
:
: The hang is most likely one of the following:
:
: 1) A non-response from a web request, though I'm not sure this can happen.
: 2) The data-checking routine has gone into a loop.
: 3) Corrupt data has been written to the database, causing a lockup on
: later access.
:
: The ties & unties are all checked and O.K., so it can't be that.
:
: If I could just intervene and find out what line of code it is hung on,
: that would be great.

A simple signal handler could help there.

use Carp qw(confess);
# confess() dies with a full stack backtrace, showing the exact line
# the program was sitting on when the signal arrived.
$SIG{INT} = sub { confess "Signal caught" };

Then the process will terminate with a call stack backtrace when you send
that signal to it. See perlvar and perlipc for details.

Thanks, I will.
 
TF

gnari said:
I understand it as: the script is doing a series of HTTP client requests,
doing some checks, and storing some results to a database.

In that case it is not a question of the database deadlocks the original
post gave me the impression of, unless multiple parallel requests are
being made, in which case file locking on the database needs to be done.

To TF:
I suggest you just do more debugging: log extensively to a file, store the
last input on disk, and so on. Then, when the lockup happens, tail the log
file to see where in the script you were, look at the last input, and so on.

Then you might feed your script the last stored input again, to see if it
is the data that does it. I am sure you have better ideas than we do on
how to debug your program, but the main rule is to strip away everything
that is not needed to exhibit the problem, until what is left is so little
that the cause becomes obvious.

gnari

I've been basically doing this for two weeks, but it takes a whole day to
get to this point, and the data from the last request and the last stored
place isn't helping much.
Once I kill the process, the DBM is screwed, so I can't really see anything.
All it says is something like "can't find the end (closing quote) of data
entry": an MLDBM error.
However, I think that is the result of killing the process while it had
the db open.
I can add more logging, but the log file is liable to fill the disk before
the error occurs.
 
gnari

TF said:
I've been basically doing this for two weeks, but it takes a whole day to
get to this point, and the data from the last request and the last stored
place isn't helping much.
Once I kill the process, the DBM is screwed, so I can't really see anything.
All it says is something like "can't find the end (closing quote) of data
entry": an MLDBM error.
However, I think that is the result of killing the process while it had
the db open.
I can add more logging, but the log file is liable to fill the disk before
the error occurs.

You might want to try rotating logs, each one containing one request
cycle. Similarly, you could make a copy of your database after each
successful request.
(This would mean a close/backup/reopen sequence that will probably hurt
your performance, but I suggest it just as a debugging tool; you could
also do it only after every 10 requests.)
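
A rough sketch of the close/backup/reopen idea, assuming MLDBM over
DB_File with Storable (the filename and the every-10-requests counter
are just illustrations):

use strict;
use warnings;
use Fcntl;
use MLDBM qw(DB_File Storable);    # assumed backend and serializer
use File::Copy qw(copy);

my $file = 'data.db';              # hypothetical database filename
my %db;
tie %db, 'MLDBM', $file, O_CREAT | O_RDWR, 0640 or die "open $file: $!";

my $requests = 0;
sub maybe_backup {
    return if ++$requests % 10;    # act only on every 10th request
    untie %db;                     # close, so the file on disk is consistent
    copy($file, "$file.bak") or die "backup failed: $!";
    tie %db, 'MLDBM', $file, O_CREAT | O_RDWR, 0640
        or die "reopen $file: $!";
}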

gnari
 
Bob Walton

TF wrote:

....
I've been basically doing this for two weeks, but it takes a whole day to
get to this point, and the data from the last request and the last stored
place isn't helping much.
Once I kill the process, the DBM is screwed, so I can't really see anything.
All it says is something like "can't find the end (closing quote) of data
entry": an MLDBM error.
However, I think that is the result of killing the process while it had
the db open.
I can add more logging, but the log file is liable to fill the disk before
the error occurs.

It sounds to me like you just need some basic debugging techniques. If
you are (as previous notes in the thread stated) using LWP to grab info
from a whole bunch of web pages, make a simple log file that writes out
the URL of every page accessed. Set the log file filehandle to
autoflush ($|=1;) so you don't miss the last bufferful when you
terminate the process.
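
A minimal sketch of that logging, with a hypothetical log file name:

use strict;
use warnings;
use IO::Handle;

open my $log, '>>', 'urls.log' or die "open urls.log: $!";
$log->autoflush(1);    # same effect as $|=1, but only for this handle

# ... then, just before each fetch ($url is whatever you are fetching):
# print $log "$url\n";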

Then, when you know the URL that's generating the problem, retrieve just
that page and use the Perl debugger to see what's up with your code on
that data.

You might note that LWP by default should time out after 180 seconds, so
if there is a web page that's not responding, the job should still
proceed, albeit slowly. You could change the timeout if it is just a
matter of insufficient patience.
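
For example, the timeout can be set when the user agent is constructed
(the URL below is only a placeholder):

use strict;
use warnings;
use LWP::UserAgent;

my $ua  = LWP::UserAgent->new(timeout => 30);    # default is 180 seconds
my $res = $ua->get('http://example.com/');
warn 'request failed: ', $res->status_line, "\n"
    unless $res->is_success;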

Also, you might check to see if the DBM-type implementation you are
using is compatible with your data. Some of the implementations (like
SDBM_File, for example) have limitations on key and value lengths which
MLDBM could very easily exceed. Consider using DB_File instead (if you
aren't already). You'll probably have to refer to the docs for the
specific DBM-type implementation you are using on your platform and OS.

Finally, note that you could close the MLDBM connection in a $SIG{INT}
handler. That way you won't corrupt your database when you kill your
program via the console, assuming you follow proper file locking
protocols in your program and in the signal handler.
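
A rough sketch of such a handler, again assuming MLDBM over DB_File,
with a hypothetical filename (real code would also release any file
locks before exiting):

use strict;
use warnings;
use Fcntl;
use MLDBM qw(DB_File Storable);    # assumed backend; adjust to yours

my %db;
tie %db, 'MLDBM', 'data.db', O_CREAT | O_RDWR, 0640
    or die "open data.db: $!";

# Untie (closing the underlying DBM file) before exiting, so an
# interrupt doesn't leave a half-written database behind.
$SIG{INT} = sub {
    untie %db;
    die "caught SIGINT; database closed cleanly\n";
};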

HTH.
 
TF

You might want to try rotating logs, each one containing one request
cycle. Similarly, you could make a copy of your database after each
successful request.
(This would mean a close/backup/reopen sequence that will probably hurt
your performance, but I suggest it just as a debugging tool; you could
also do it only after every 10 requests.)

gnari

The script is still locked up and running, and it is using all available
CPU cycles. This tells me it's probably in an endless loop.
I discovered that as long as I don't kill this process, I can access and
populate the database O.K. from another script.
I'm not sure I understand this part.
Anyway, I'll turn on some logging around the fault and let it run again on
Monday. It can only run on weekdays, at least for live runs; the data is
different on weekends.
I also still suspect a job that runs on the network at noon every day as a
possible cause.

What happens if the reply to an LWP web request is dropped completely?
Does it time out?
Can I add a timeout?
Many thanks for your time!

TF
 
