Help troubleshooting Perl issue

Eric Martin

I have an image gallery script that people have used since 2003.
Recently I've been getting a lot of messages about hosts disabling the
script because of high CPU usage.

I've done some investigating and can't pinpoint the problem. I ran gdb
against one of the processes and got the following:

Reading symbols from /usr/bin/perl...(no debugging symbols found)...done.
Using host libthread_db library "/lib/tls/libthread_db.so.1".
Reading symbols from /lib/libnsl.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib/libnsl.so.1
Reading symbols from /lib/libdl.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib/libdl.so.2
Reading symbols from /lib/tls/libm.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib/tls/libm.so.6
Reading symbols from /lib/libcrypt.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib/libcrypt.so.1
Reading symbols from /lib/libutil.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib/libutil.so.1
Reading symbols from /lib/tls/libc.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib/tls/libc.so.6
Reading symbols from /lib/ld-linux.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib/ld-linux.so.2
Reading symbols from /usr/lib/perl5/5.8.8/i686-linux/auto/Cwd/Cwd.so...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/perl5/5.8.8/i686-linux/auto/Cwd/Cwd.so
Reading symbols from /usr/lib/perl5/5.8.8/i686-linux/auto/Fcntl/Fcntl.so...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/perl5/5.8.8/i686-linux/auto/Fcntl/Fcntl.so
0x0021597e in __xstat64@GLIBC_2.1 () from /lib/tls/libc.so.6

Am I correct in thinking that the issue is with Fcntl/flock? Are there
any other suggestions on what I can check?

Thanks!
 
Ted Zlatanov

EM> I have an image gallery script that people have used since 2003.
EM> Recently I've been getting a lot of messages about hosts disabling the
EM> script because of high CPU usage.

EM> I've done some investigating and can't pinpoint the problem. I ran gdb
EM> against one of the processes and got the following:
....
EM> Loaded symbols for /usr/lib/perl5/5.8.8/i686-linux/auto/Fcntl/Fcntl.so
EM> 0x0021597e in __xstat64@GLIBC_2.1 () from /lib/tls/libc.so.6

EM> Am I correct in thinking that the issue is with Fcntl/flock? Are there
EM> any other suggestions on what I can check?

You should look through the source code and look for potential problems.

We could do it too, if you showed it to us. As it is, you're showing us
a black box and asking us to guess what's inside and why it's not
running well.

Ted
 
smallpond

Eric Martin said:
I have an image gallery script that people have used since 2003.
Recently I've been getting a lot of messages about hosts disabling the
script because of high CPU usage.

I've done some investigating and can't pinpoint the problem. I ran gdb
against one of the processes and got the following:
[...]

Am I correct in thinking that the issue is with Fcntl/flock? Are there
any other suggestions on what I can check?

flock on Linux blocks, so it could be. Why do you need to flock?
There is probably a better way to provide exclusive access to the
resource which needs to be shared.
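
The post does not say which alternative it has in mind. One common technique, shown here only as a sketch, is to write the new contents to a temporary file and then rename() it over the data file, so readers never see a half-written file. The helper name write_atomically is hypothetical, and the sketch assumes a POSIX filesystem where rename() within one directory is atomic.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use File::Basename qw(dirname);

# Hypothetical helper: replace the file at $path with @lines in one step.
sub write_atomically {
    my ($path, @lines) = @_;

    # Create the temporary file in the same directory as the target so
    # that rename() stays on a single filesystem.
    # Note: File::Temp creates the file with 0600 permissions.
    my ($tmp_fh, $tmp_path) = tempfile('.tmpXXXXXX', DIR => dirname($path));

    print {$tmp_fh} @lines or die "write $tmp_path: $!";
    close $tmp_fh          or die "close $tmp_path: $!";

    # Readers see either the old file or the new one, never a mix.
    rename $tmp_path, $path or die "rename $tmp_path -> $path: $!";
    return 1;
}
```

This only protects readers from partial writes. Two writers that each read, modify, and rewrite the same file can still overwrite each other's changes, so some coordination is still needed for that case.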
 
Eric Martin

Ted Zlatanov said:
[...]
You should look through the source code and look for potential problems.

We could do it too, if you showed it to us.  As it is, you're showing us
a black box and asking us to guess what's inside and why it's not
running well.

Thanks Ted. Problem is, the source code is very large, and admittedly,
pretty bad. I was hoping that the output of gdb would provide some
clues.

What I don't get is why this is all of a sudden an issue. Not sure if
it's related to a Perl upgrade or OS, etc. The other problem is that
it doesn't happen consistently, so I haven't been able to reproduce it
yet.

I was wondering if there were some troubleshooting techniques, but it
sounds like a code review and more debugging might be in order.
 
Eric Martin

smallpond said:
[...]
flock on Linux blocks, so it could be.  Why do you need to flock?
There is probably a better way to provide exclusive access to the
resource which needs to be shared.

Why? I guess when I wrote the program, I thought it was needed to
prevent data collisions. Here's an example of an open and write:

open(TMP,$template_path) || error("100|$template_path");
eval q{ flock(TMP,LOCK_EX); };
my @template = <TMP>;
close(TMP);

open(IMG2,">$cat_path") || error_log(__FILE__, __LINE__, "could not add information to a data file ($cat_path): $!");
eval q{ flock(IMG2,LOCK_EX); };
print IMG2 @cat_info;
close(IMG2);

If you have any suggestions on a better way, I'm all ears. Thanks for
your help.
 
Jürgen Exner

Eric Martin said:
I have an image gallery script that people have used since 2003.
Recently I've been getting a lot of messages about hosts disabling the
script because of high CPU usage.
[...]
I was wondering if there were some troubleshooting techniques, but it
sounds like a code review and more debugging might be in order.

No, not debugging but profiling.
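
A dedicated profiler such as Devel::NYTProf from CPAN (run with `perl -d:NYTProf script.pl') is the usual tool for this. Where installing modules on a shared host is not an option, a rough first pass can be made with core modules only. The sketch below is an assumption about how that might look, not part of the gallery script: the helpers timed and report are hypothetical names, and they measure wall-clock time rather than CPU time.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

my (%elapsed, %calls);

# Hypothetical helper: run $code and charge its wall-clock time to $label.
sub timed {
    my ($label, $code) = @_;
    my $start  = time();
    my @result = $code->();          # the code is always called in list context
    $elapsed{$label} += time() - $start;
    $calls{$label}++;
    return wantarray ? @result : $result[0];
}

# Print every label to STDERR, slowest first.
sub report {
    for my $label (sort { $elapsed{$b} <=> $elapsed{$a} } keys %elapsed) {
        printf STDERR "%-20s %6d calls %10.4f s\n",
            $label, $calls{$label}, $elapsed{$label};
    }
}

# Example: wrap a suspect section of the script.
my $sum = timed('sum', sub { my $t = 0; $t += $_ for 1 .. 1000; $t });
report();
```

Wrapping the file reads, the template processing, and any directory scans separately should show which section dominates before any code is rewritten.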

jue
 
Ted Zlatanov

EM> Why? I guess when I wrote the program, I thought it was needed to
EM> prevent data collisions. Here's an example of an open and write:

EM> open(TMP,$template_path) || error("100|$template_path");
EM> eval q{ flock(TMP,LOCK_EX); };
EM> my @template = <TMP>;
EM> close(TMP);

EM> open(IMG2,">$cat_path") || error_log(__FILE__, __LINE__, "could not add information to a data file ($cat_path): $!");
EM> eval q{ flock(IMG2,LOCK_EX); };
EM> print IMG2 @cat_info;
EM> close(IMG2);

EM> If you have any suggestions on a better way, I'm all ears. Thanks for
EM> your help.

You probably don't need LOCK_EX on read, but LOCK_SH instead. Look at
`perldoc -f flock' and especially

    To avoid the possibility of miscoordination, Perl now flushes
    FILEHANDLE before locking or unlocking it.

That may be the problem. Now how can you tell if flock() is a problem?

Step 1: write a benchmark with and without flock. This is really easy;
there are modules that handle all the boring details. Make sure you get
at least 100K iterations on an active system and you'll probably hit the
flock() trouble spots if they exist.

Step 2: reconsider using plain files as databases. Perhaps SQLite is
more suitable for your needs, or perhaps one of the *dbm solutions will
work. Maybe you can use an external DB server; DBI supports practically
everything (including SQLite). If you must have a file, break it down
into multiple files and lock only the ones you use.
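
Step 1 could be sketched with the core Benchmark module roughly as follows. The temporary file stands in for the real template, and the sub names read_plain and read_locked are hypothetical. The locked variant takes LOCK_SH, as suggested above for reads.

```perl
use strict;
use warnings;
use Benchmark qw(timethese cmpthese);
use Fcntl qw(:flock);
use File::Temp qw(tempfile);

# Stand-in data file; the real script would point at its own template.
my ($out, $path) = tempfile(UNLINK => 1);
print {$out} "line $_\n" for 1 .. 20;
close $out or die "close $path: $!";

# Read the whole file without any locking.
sub read_plain {
    open my $in, '<', $path or die "open $path: $!";
    my @lines = <$in>;
    close $in;
    return scalar @lines;
}

# Same read, but take a shared lock first.
sub read_locked {
    open my $in, '<', $path or die "open $path: $!";
    flock($in, LOCK_SH) or die "flock $path: $!";
    my @lines = <$in>;
    close $in;                      # closing the handle releases the lock
    return scalar @lines;
}

# Iteration count can be overridden on the command line.
my $iterations = @ARGV ? shift @ARGV : 100_000;
my $results = timethese($iterations, {
    plain  => \&read_plain,
    locked => \&read_locked,
});
cmpthese($results);
```

Run from a single process, this measures only the overhead of the flock() call itself. To see contention, several copies would have to run at once against the same file, which is what an active system would provide.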


EM> Thanks Ted. Problem is, the source code is very large, and admittedly,
EM> pretty bad. I was hoping that the output of gdb would provide some
EM> clues.

EM> What I don't get is why this is all of a sudden an issue. Not sure if
EM> it's related to a Perl upgrade or OS, etc. The other problem is that
EM> it doesn't happen consistently, so I haven't been able to reproduce it
EM> yet.

EM> I was wondering if there were some troubleshooting techniques, but it
EM> sounds like a code review and more debugging might be in order.

I would reconsider hand-rolled file-based databases. Nowadays that's
not likely to give you good performance or reliability compared to the
many available third-party solutions on CPAN and elsewhere.
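
As a small illustration of the *dbm option mentioned earlier, here is a sketch using SDBM_File, which ships with Perl. The keys, values, and file name are made up for the example, and a temporary directory stands in for the gallery's data directory.

```perl
use strict;
use warnings;
use Fcntl qw(O_RDWR O_CREAT);
use SDBM_File;
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);   # stand-in for the real data directory
my $base = "$dir/categories";       # SDBM adds the .pag and .dir suffixes

# Store two records; each assignment updates the file on disk.
tie my %cat, 'SDBM_File', $base, O_RDWR | O_CREAT, 0644
    or die "cannot tie $base: $!";
$cat{landscapes} = 'Landscapes|12 images';
$cat{portraits}  = 'Portraits|7 images';
untie %cat;

# Reopen the same files and read the records back.
tie my %again, 'SDBM_File', $base, O_RDWR, 0644
    or die "cannot re-tie $base: $!";
my %copy = %again;
untie %again;

print "$_ => $copy{$_}\n" for sort keys %copy;
```

Only the records that change are rewritten, instead of the whole file on every update. SDBM limits the size of each key and value pair to roughly 1 KB and does no locking of its own, so concurrent writers still need coordination. SQLite through DBI handles that itself.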

Ted
 
Randal L. Schwartz

Tad> Why do you use eval() there rather than just

Tad> flock(TMP,LOCK_EX);

Tad> ??

There *do* exist machines on which flock() is not implemented, and
therefore Perl won't even support the syntax. These are admittedly
rare, but in the early days of Perl, the advice was to use flock() in an
eval-string just to make sure the program would compile.

print "Just another Perl hacker,"; # the original
 
Ilya Zakharevich

Randal L. Schwartz said:
Tad> Why do you use eval() there rather than just

Tad> flock(TMP,LOCK_EX);

Tad> ??

There *do* exist machines on which flock() is not implemented, and
therefore Perl won't even support the syntax. These are admittedly
rare, but in the early days of Perl, the advice was to use flock() in an
eval-string just to make sure the program would compile.

Even if flock() would die with "unimplemented", LOCK_EX must be
"protected inside a string" as well. (Well, frankly speaking, I do not
know what this would do in a `use strict' context; probably one would
need to guard against 2 modes of failure if one wants a meaningful
report to the user.)
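
The two failure modes could be separated along these lines. This is a sketch under assumptions, not code from the gallery script: it assumes a Perl where Fcntl exports the LOCK_* constants at compile time and where flock() compiles everywhere but dies at run time on a platform that lacks it. The helper name lock_or_explain is hypothetical.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);   # LOCK_SH, LOCK_EX, LOCK_UN, LOCK_NB
use File::Temp qw(tempfile);

# Hypothetical helper: returns '' on success, otherwise a message
# saying which of the two failure modes occurred.
sub lock_or_explain {
    my ($fh, $mode) = @_;
    my $locked;

    # Failure mode 1: flock() itself dies because it is unimplemented.
    my $survived = eval { $locked = flock($fh, $mode); 1 };
    return "flock is not available here: $@" unless $survived;

    # Failure mode 2: flock() exists but could not take the lock.
    return "flock failed: $!" unless $locked;

    return '';
}

# Example use on a scratch file.
my ($fh, $path) = tempfile(UNLINK => 1);
my $problem = lock_or_explain($fh, LOCK_EX);
print $problem eq '' ? "locked $path\n" : "$problem\n";
close $fh;
```

A block eval is used here instead of a string eval, so the code is still checked at compile time under `use strict'.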

Yours,
Ilya
 
