Long script "just stops" sometimes

Discussion in 'Perl Misc' started by Jerry Krinock, Sep 24, 2010.

  1. I've written a 1500-line script which processes several dozen files of
    source text written in Markdown to html. It takes several minutes to
    run, indicating progress by printf statements. However, about 20% of
    the time, in the middle of processing a Markdown file, it just stops
    progressing, as though it is in an infinite loop. If I kill the
    process and restart, it always completes successfully.

    My script is, of course, being a script, not particularly efficient.
    I was thinking that maybe Perl was running out of memory or something,
    although that's not supposed to happen nowadays (Perl 5.10.0, Mac OS X
    10.6). And when I check it in Apple's Activity Monitor during normal
    operation, I find that its CPU and memory usage are hardly noticeable,
    maybe 3% and a few tens of megabytes.

    Are there any conditions under which Perl would "just stop"?

    Any suggestions to troubleshoot this would be appreciated.

    Thanks!

    Jerry Krinock
    Jerry Krinock, Sep 24, 2010
    #1
  2. On 2010-09-24, Jerry Krinock <> wrote:
    > operation, I find that its CPU and memory usage are hardly noticeable,
    > maybe 3% and a few tens of megabytes.
    >
    > Are there any conditions under which Perl would "just stop"?


    With no CPU usage? I would say it reads from STDIN. Did you try to
    press Enter or C-d?

    > Any suggestions to troubleshoot this would be appreciated.


    Haven't you heard of the debugger? If this happens often, you can
    just wait for it to happen in the debugger. If worse comes to worst,
    and interactive debugging does not help, you can always start in
    NonStop mode with `tracing', and look at the last several thousand
    lines when the `stop' happens.

    Hope this helps,
    Ilya
    Ilya Zakharevich, Sep 25, 2010
    #2
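One way to test the "it reads from STDIN" hypothesis above is to make sure a stray read cannot block. The following is only a minimal sketch, not something taken from the original script: it reattaches STDIN to the null device, so any accidental read returns end-of-file at once. If the stalls disappear with this in place (or when the script is run with its input redirected from /dev/null), a stray read is the likely culprit.

```perl
use strict;
use warnings;
use File::Spec;

# Reattach STDIN to the null device so that any read returns EOF at once.
open STDIN, '<', File::Spec->devnull
    or die "Cannot reopen STDIN: $!";

# An accidental read now returns immediately instead of waiting for input.
my $line = <STDIN>;
print defined $line ? "got input\n" : "STDIN is at EOF\n";
```

Child processes started afterwards inherit the redirected STDIN unless they are given their own input explicitly.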
  3. On Sep 24, 3:11 pm, Jerry Krinock <> wrote:
    > I've written a 1500-line script which processes several dozen files of
    > source text written in Markdown to html.  It takes several minutes to
    > run, indicating progress by printf statements.  However, about 20% of
    > the time, in the middle of processing a Markdown file, it just stops
    > progressing, as though it is in an infinite loop.  If I kill the
    > process and restart, it always completes successfully.
    >
    > My script is, of course, being a script, not particularly efficient.
    > I was thinking that maybe Perl was running out of memory or something,
    > although that's not supposed to happen nowadays (Perl 5.10.0, Mac OS X
    > 10.6).  And when I check it in Apple's Activity Monitor during normal
    > operation, I find that its CPU and memory usage are hardly noticeable,
    > maybe 3% and a few tens of megabytes.
    >
    > Are there any conditions under which Perl would "just stop"?
    >
    > Any suggestions to troubleshoot this would be appreciated.
    >


    You might try just setting a timeout around
    whatever code section turns up in a stack
    trace. (perldoc -f alarm).

    Then exec the program again (perldoc -f exec)
    if there's a timeout. Of course, if memory's
    the problem, you may be able to find some way
    to reduce memory usage and eliminate the timeout
    workaround.

    --
    Charles DeRykus
    C.DeRykus, Sep 26, 2010
    #3
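The timeout idea above can be sketched roughly as follows. This is an illustration under assumptions, not the poster's code: `run_with_timeout` is a hypothetical helper name, the durations are arbitrary, and the stalled task is simulated with a 10-second `select`.

```perl
use strict;
use warnings;

# Hypothetical helper: run $code, but give up after $seconds.
# Returns 1 if $code finished, 0 if the alarm fired first.
sub run_with_timeout {
    my ($seconds, $code) = @_;
    my $ok = eval {
        local $SIG{ALRM} = sub { die "timeout\n" };   # "\n" keeps the message exact
        alarm $seconds;
        $code->();
        alarm 0;
        1;
    };
    my $err = $@;
    alarm 0;                                  # make sure no alarm is left pending
    return 1 if $ok;
    die $err unless $err eq "timeout\n";      # rethrow unrelated errors
    return 0;
}

# Demonstration: a quick task finishes, a stalled one is cut off.
my $fast = run_with_timeout(5, sub { 1 });
my $slow = run_with_timeout(1, sub { select(undef, undef, undef, 10) });
print "fast=$fast slow=$slow\n";

# In the real script, a timeout could trigger the re-exec suggested above:
#   exec $^X, $0, @ARGV or die "exec failed: $!";
```

Wrapping only the section that the stack trace points at keeps the timeout short; re-executing the whole program on every timeout only makes sense if a restart reliably succeeds, as reported in the original post.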
  4. On 2010-09-25 20:50, Ilya Zakharevich <> wrote:
    > On 2010-09-24, Jerry Krinock <> wrote:
    >> operation, I find that its CPU and memory usage are hardly noticeable,
    >> maybe 3% and a few tens of megabytes.
    >>
    >> Are there any conditions under which Perl would "just stop"?

    >
    > With no CPU usage? I would say it reads from STDIN. Did you try to
    > press Enter or C-d?


    Two tools I find indispensable when trying to figure out "strange"
    behaviour of programs are lsof and strace.

    lsof lists the open files of a process. It has been ported to lots of
    unixoid systems (I first encountered it on HP-UX) and should be
    available on MacOS.

    strace prints out the system calls a process invokes. Unfortunately,
    while most unixoid systems have a program which does this, it seems to
    have a different name on each ("truss" on Solaris, "tusc" on HP-UX, ...)
    so the OP will have to find out himself how it's called on MacOS.

    In this case, if your guess is right, strace would show that the process
    is currently waiting for a read on fd 0 to complete, and then lsof could
    be used to find out which file fd 0 is (ok, so for fd 0 you may know
    that it's the tty, but for (say) fd 43 you want a tool to look it up).

    hp
    Peter J. Holzer, Sep 26, 2010
    #4
  5. Peter J. Holzer wrote:
    > On 2010-09-25 20:50, Ilya Zakharevich <> wrote:
    >> On 2010-09-24, Jerry Krinock <> wrote:
    >>> operation, I find that its CPU and memory usage are hardly noticeable,
    >>> maybe 3% and a few tens of megabytes.
    >>>
    >>> Are there any conditions under which Perl would "just stop"?

    >> With no CPU usage? I would say it reads from STDIN. Did you try to
    >> press Enter or C-d?

    >
    > Two tools I find indispensable when trying to figure out "strange"
    > behaviour of programs are lsof and strace.
    >
    > lsof lists the open files of a process. It has been ported to lots of
    > unixoid systems (I first encountered it on HP-UX) and should be
    > available on MacOS.
    >
    > strace prints out the system calls a process invokes. Unfortunately,
    > while most unixoid systems have a program which does this, it seems to
    > have a different name on each ("truss" on Solaris, "tusc" on HP-UX, ...)
    > so the OP will have to find out himself how it's called on MacOS.
    >
    > In this case, if your guess is right, strace would show that the process
    > is currently waiting for a read on fd 0 to complete, and then lsof could
    > be used to find out which file fd 0 is (ok, so for fd 0 you may know
    > that it's the tty, but for (say) fd 43 you want a tool to look it up).


    Unfortunately, if you use strace's "-p" option to attach to an
    already-running process, it often doesn't show you what call the process
    was waiting on at the time of the attachment. You would have to start
    stracing the process from the beginning, which is inconvenient if the
    situation in question only happens occasionally.

    Xho
    Xho Jingleheimerschmidt, Sep 27, 2010
    #5
  6. with <> Ben Morrow wrote:
    > Quoth Xho Jingleheimerschmidt <>:
    >>
    >> Unfortunately, if you use strace's "-p" option to attach to an
    >> already-running process, it often doesn't show you what call the process
    >> was waiting on at the time of the attachment. You would have to start
    >> stracing the process from the beginning, which is inconvenient if the
    >> situation in question only happens occasionally.

    >
    > You can find out what call the process is currently blocking in with ps.


    I've just checked (Linux 2.6.30, Debian). Neither '-n
    /boot/System.map-2.6.30*' nor '-n /proc/*/wchan' helps 'ps' find the
    syscall. 'ps' fails with its default routine too. Thus the output
    for the 'WCHAN' column is always either '-' or '?'. I remember it used
    to be there. Now it's missing.

    --
    Torvalds' goal for Linux is very simple: World Domination
    Stallman's goal for GNU is even simpler: Freedom
    Eric Pozharski, Sep 27, 2010
    #6
  7. Ben Morrow wrote:
    > Quoth Xho Jingleheimerschmidt <>:
    >> Peter J. Holzer wrote:
    >>
    >> Unfortunately, if you use strace's "-p" option to attach to an
    >> already-running process, it often doesn't show you what call the process
    >> was waiting on at the time of the attachment. You would have to start
    >> stracing the process from the beginning, which is inconvenient if the
    >> situation in question only happens occasionally.

    >
    > You can find out what call the process is currently blocking in with ps.


    Well, maybe *you* can. I've never been able to.

    Xho
    Xho Jingleheimerschmidt, Sep 28, 2010
    #7
  8. with <> Eric Pozharski wrote:
    *SKIP*
    > I've just checked (Linux 2.6.30, Debian). Neither '-n
    > /boot/System.map-2.6.30*' nor '-n /proc/*/wchan' helps 'ps' find the
    > syscall. 'ps' fails with its default routine too. Thus the output
    > for the 'WCHAN' column is always either '-' or '?'. I remember it used
    > to be there. Now it's missing.


    I've checked more thoroughly, all 'wchan' files (there're
    '/proc/*/task/*/wchan' too) are always '0'. BTW, they don't have
    trailing newline either. R.I.P.



    --
    Torvalds' goal for Linux is very simple: World Domination
    Stallman's goal for GNU is even simpler: Freedom
    Eric Pozharski, Sep 28, 2010
    #8
  9. On 2010-09-26 23:06, Xho Jingleheimerschmidt <> wrote:
    > Peter J. Holzer wrote:
    >> In this case, if your guess is right, strace would show that the process
    >> is currently waiting for a read on fd 0 to complete, and then lsof could
    >> be used to find out which file fd 0 is (ok, so for fd 0 you may know
    >> that it's the tty, but for (say) fd 43 you want a tool to look it up).

    >
    > Unfortunately, if you use strace's "-p" option to attach to an
    > already-running process, it often doesn't show you what call the process
    > was waiting on at the time of the attachment.


    I don't think that ever happened to me in many years of using strace.
    Except with multithreaded processes, but for those strace doesn't work
    reliably anyway (I don't understand why).

    hp
    Peter J. Holzer, Sep 28, 2010
    #9
  10. Peter J. Holzer wrote:
    > On 2010-09-26 23:06, Xho Jingleheimerschmidt <> wrote:
    >> Peter J. Holzer wrote:
    >>> In this case, if your guess is right, strace would show that the process
    >>> is currently waiting for a read on fd 0 to complete, and then lsof could
    >>> be used to find out which file fd 0 is (ok, so for fd 0 you may know
    >>> that it's the tty, but for (say) fd 43 you want a tool to look it up).

    >> Unfortunately, if you use strace's "-p" option to attach to an
    >> already-running process, it often doesn't show you what call the process
    >> was waiting on at the time of the attachment.

    >
    > I don't think that ever happened to me in many years of using strace.
    > Except with multithreaded processes, but for those strace doesn't work
    > reliably anyway (I don't understand why).


    This is what I get:

    $ perl -le '<>' &
    [1] 17657
    $ strace -p 17657
    Process 17657 attached - interrupt to quit

    And then no output until I break out of strace.

    Xho
    Xho Jingleheimerschmidt, Sep 29, 2010
    #10
  11. Ben Morrow wrote:
    > Quoth Xho Jingleheimerschmidt <>:
    >> Ben Morrow wrote:
    >>> You can find out what call the process is currently blocking in with ps.

    >> Well, maybe *you* can. I've never been able to.

    >
    > Really?
    >
    > ~% uname
    > FreeBSD
    > ~% perl -MPOSIX=pause -epause &
    > [3] 68357
    > ~% ps -lp 68357
    > UID PID PPID CPU PRI NI VSZ RSS MWCHAN STAT TT TIME COMMAND
    > 1001 68357 14199 0 61 0 5204 3160 pause S 3 0:00.06 perl -MPOSI
    > ~%
    >
    > ~$ uname
    > Linux
    > ~$ perl -MPOSIX=pause -epause &
    > [1] 10335
    > ~$ ps -lp 10335
    > F S UID PID PPID C PRI NI ADDR SZ WCHAN TTY TIME CMD
    > 0 S 1051 10335 10321 0 80 0 - 5759 pause pts/1 00:00:00 perl
    > ~$
    >
    > It only tells you *which* syscall, of course, not what arguments it was
    > called with, but that's something.


    On my Linux:

    $ perl -MPOSIX=pause -epause &
    [2] 26855
    $ ps -lp 26855
    Warning: /boot/System.map-2.6.21.4-eeepc not parseable as a System.map
    F S UID PID PPID C PRI NI ADDR SZ WCHAN TTY TIME CMD
    0 S 1000 26855 1790 0 77 0 - 943 11ff41 pts/0 00:00:00 perl

    I don't have access to a more conventional Linux right now, but I don't
    recall seeing anything other than '-' and '?' listed under wchan. I'll
    have to try to remember to give it a try.

    Thanks,

    Xho
    Xho Jingleheimerschmidt, Sep 29, 2010
    #11
  12. I really appreciate all the thought that went into this thread. But
    the first solution seems to work…

    On Sep 24, 3:56 pm, Ben Morrow <> wrote:
    > Quoth Jerry Krinock <>:
    > Add the following somewhere early on:
    >
    >     require Carp;
    >     $SIG{INFO} = sub { Carp::cluck("SIGINFO") };
    >     $SIG{QUIT} = sub { Carp::confess("SIGQUIT") };
    >
    > Now you can press ^T to get a backtrace, and ^\ to get a backtrace and
    > kill the program.


    Actually, it works better than advertised. When I hit ^T, it prints a
    backtrace to the console, and then, surprise!, the script starts
    running again, and eventually exits successfully. This happened twice in
    the last few days.

    The backtrace tells me that it's sticking when I invoke IPC::Run to
    invoke another perl script which I have written. I haven't dug into
    it yet, because, well, it's not a big issue if all I need to do is type
    ^T to unstick it. It might be mis-processing one little line of text
    or something.

    Thanks for all the help,

    Jerry
    Jerry Krinock, Sep 29, 2010
    #12
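The backtrace handlers quoted above rely on SIGINFO, which exists on BSD-derived systems such as Mac OS X but not on Linux. The sketch below is a hedged variant rather than the original advice: it falls back to USR1 (an assumed substitute, sent with `kill -USR1 <pid>` from another terminal) when INFO is not available, and the `$dumps` counter and `nested` sub exist only for the demonstration.

```perl
use strict;
use warnings;
use Carp ();
use Config;

# Signal names known to this perl build (see perlipc).
my %known = map { $_ => 1 } split ' ', $Config{sig_name};

# SIGINFO (^T) exists on BSD-derived systems such as Mac OS X;
# USR1 is an assumed stand-in elsewhere.
my $dump_sig = $known{INFO} ? 'INFO' : 'USR1';

my $dumps = 0;   # illustrative counter, not part of the original advice
$SIG{$dump_sig} = sub { $dumps++; Carp::cluck("SIG$dump_sig") };
$SIG{QUIT}      = sub { Carp::confess("SIGQUIT") };

# Demonstration: the process signals itself from inside a nested call,
# so a backtrace is written to STDERR.
sub nested { my $n = shift; return $n ? nested($n - 1) : kill($dump_sig => $$) }
nested(2);
print "backtraces printed: $dumps\n";
```

As in the quoted version, the first handler only reports and lets the script continue, while the QUIT handler (^\) prints a backtrace and dies.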
  13. On 2010-09-29 01:07, Xho Jingleheimerschmidt <> wrote:
    > Peter J. Holzer wrote:
    >> On 2010-09-26 23:06, Xho Jingleheimerschmidt <> wrote:
    >>> Peter J. Holzer wrote:
    >>>> In this case, if your guess is right, strace would show that the
    >>>> process is currently waiting for a read on fd 0 to complete, and
    >>>> then lsof could be used to find out which file fd 0 is (ok, so for
    >>>> fd 0 you may know that it's the tty, but for (say) fd 43 you want a
    >>>> tool to look it up).
    >>> Unfortunately, if you use strace's "-p" option to attach to an
    >>> already-running process, it often doesn't show you what call the process
    >>> was waiting on at the time of the attachment.

    >>
    >> I don't think that ever happened to me in many years of using strace.
    >> Except with multithreaded processes, but for those strace doesn't work
    >> reliably anyway (I don't understand why).

    >
    > This is what I get:
    >
    > $ perl -le '<>' &
    > [1] 17657


    Here I get

    [1] + suspended (tty input) perl -le '<>'

    (I expected that. A background process should not be able to
    read from the TTY)

    strace then very rapidly prints

    read(0, 0x8bc1c90, 4096) = ? ERESTARTSYS (To be restarted)
    --- SIGTTIN (Stopped (tty input)) @ 0 (0) ---
    --- SIGTTIN (Stopped (tty input)) @ 0 (0) ---
    read(0, 0x8bc1c90, 4096) = ? ERESTARTSYS (To be restarted)
    --- SIGTTIN (Stopped (tty input)) @ 0 (0) ---
    --- SIGTTIN (Stopped (tty input)) @ 0 (0) ---
    ....


    If I try that again without the "&", and start strace in another
    terminal, I get:

    % strace -p 27233
    Process 27233 attached - interrupt to quit
    read(0,

    and the cursor is to the right of "read(0, ", indicating that the read
    system call is still in progress. The line is completed as soon as the
    system call returns.

    (Linux 2.6.32-5-686, strace 4.5.20-2, but AFAIR it was always like this)

    hp
    Peter J. Holzer, Sep 29, 2010
    #13
  14. with <4ca2b9dc$0$29845$> Xho Jingleheimerschmidt wrote:
    *SKIP*
    > I don't have access to a more conventional Linux right now, but I
    > don't recall seeing anything other than '-' and '?' listed under
    > wchan. I'll have to try to remember to give it a try.


    Please note that observations could depend on the distribution and/or on
    the kernel being self-built or stock. I believe it's a kind of
    security-through-obscurity idealism. And that's not the first time I have
    had to fight that, in Debian in particular.

    --
    Torvalds' goal for Linux is very simple: World Domination
    Stallman's goal for GNU is even simpler: Freedom
    Eric Pozharski, Sep 30, 2010
    #14