any issues with long running python apps?

Discussion in 'Python' started by Les Schaffer, Jul 9, 2010.

  1. Les Schaffer

    Les Schaffer Guest

    i have been asked to guarantee that a proposed Python application will
    run continuously under MS Windows for two months time. And i am looking
    to know what i don't know.

    The app would read instrument data from a serial port, store the data in
    a file, and display it in matplotlib. typical sampling times are about once
    per hour. our design would be to read the serial port once a second
    (long story, but it worked for ten days straight so far) and just store
    and display the values once an hour. so basically we'd have a long
    running for-loop. maybe running in a separate thread.

    i have thought through the issues on our end: we have to properly
    handle the serial port and make sure our GUI doesn't hang easily. we'd
    have to be sure we don't have any memory leaks in numpy or matplotlib
    and we're not storing data in memory stupidly. i've never looked into
    Windows serial port driver reliability over the long term. But we think
    if the serial port hangs, we can detect that and react accordingly.
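One way to detect a hung port, sketched under assumptions: the port is opened with a read timeout, so that a read returns empty bytes when nothing arrives (this is how a pyserial port with a timeout is expected to behave). `StallDetector` and `read_once` are hypothetical names, and the `port` argument is any object with a `readline()` method.

```python
import time


class StallDetector:
    """Tracks how long it has been since the port last produced data."""

    def __init__(self, max_silence_s, clock=time.monotonic):
        self.max_silence_s = max_silence_s
        self.clock = clock
        self.last_data_at = clock()

    def feed(self, chunk):
        """Record one read result; empty bytes means the read timed out."""
        if chunk:
            self.last_data_at = self.clock()

    def stalled(self):
        """True once the port has been silent longer than max_silence_s."""
        return self.clock() - self.last_data_at > self.max_silence_s


def read_once(port, detector):
    """Read one line from a port-like object and update the detector.

    Assumption: port.readline() returns b"" when the read times out.
    """
    line = port.readline()
    detector.feed(line)
    return line
```

The once-a-second loop would call `read_once` and, when `detector.stalled()` turns true, close and reopen the port and log the event.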

    but none of this has anything to do with Python itself. i am sure python
    servers have been running reliably for long periods of time, but i've
    never had to deal with a two-month guarantee before. is there something
    else i am missing here that i should be concerned about on the
    pure-Python side of things? something under the hood of the python
    interpreter that could be problematic when run for a long time?

    or need we only concern ourselves with the nuts behind the wheel: that
    is, we the developers?

    thanks

    Les
     
    Les Schaffer, Jul 9, 2010
    #1

  2. It never hurts to separate the data collection and
    visualization/analysis parts into separate programs. That way you can
    keep the critical, long-running data collection program running, and if
    you get an exception in the GUI because of some divide by zero
    programming error or some other problem, you can restart that part
    without impacting the overall system.
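A minimal sketch of that separation, assuming the two programs share nothing but a CSV file. The function names and the file layout are illustrative, not part of any design described in this thread.

```python
import csv
import os


def append_sample(path, timestamp, value):
    """Collector side: append one row and push it to disk before returning."""
    with open(path, "a", newline="") as f:
        csv.writer(f).writerow([timestamp, value])
        f.flush()
        os.fsync(f.fileno())


def read_samples(path):
    """Viewer side: re-read the file from scratch on every refresh."""
    if not os.path.exists(path):
        return []
    with open(path, newline="") as f:
        # Skip any row that is not a complete (timestamp, value) pair.
        return [(row[0], float(row[1])) for row in csv.reader(f) if len(row) == 2]
```

The collector only ever calls `append_sample`, and the GUI only ever calls `read_samples`, so an exception in the plotting code cannot take the collector down with it.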
     
    Michael Torrie, Jul 9, 2010
    #2

  3. I have been running zope apps for about 10 years now and they normally run for
    many months between being restarted so python has no inherent problems with
    running that long. Your specific python program might though.

    You have to make sure you don't have any reference leaks that keep your program
    growing in memory, but you also have to deal with windows. The program can not
    be any more reliable than the os it is running on. Personally I would never
    make a guarantee like that on a windows box. I would have to hand choose every
    piece of hardware and run some kind of unix on it with a guarantee like that.

    Last I ran a program for a long time on windows it ran into some kind of
    address space fragmentation and eventually it would die on windows. There is
    some kind of problem with the windows VM system. 64bit windows will solve that
    mostly by having an address space so large you can't fragment it enough to
    kill the program in any reasonable time frame. If your program is not
    allocating and destroying large data structures all the time you probably
    don't have to worry about that but you do have to test it.
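For the "you do have to test it" part, a hedged sketch using the standard library's `tracemalloc` module (Python 3.4 and later, so newer than the interpreters discussed in this thread). It only sees allocations made through Python's allocator and says nothing about address space fragmentation; `measure_growth` is a hypothetical helper name.

```python
import tracemalloc


def measure_growth(work, limit=5):
    """Run work() between two snapshots and report where memory grew.

    Returns up to `limit` StatisticDiff entries with a positive size_diff,
    largest change first.
    """
    was_tracing = tracemalloc.is_tracing()
    if not was_tracing:
        tracemalloc.start()
    before = tracemalloc.take_snapshot()
    work()
    after = tracemalloc.take_snapshot()
    if not was_tracing:
        tracemalloc.stop()
    stats = after.compare_to(before, "lineno")
    return [s for s in stats[:limit] if s.size_diff > 0]
```

Calling this around a few thousand iterations of the acquisition loop in a test run, and checking that nothing keeps growing from one run to the next, is one way to catch a reference leak before the two months start.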
     
    William Heymann, Jul 9, 2010
    #3
  4. Terry Reedy

    Terry Reedy Guest

    Is this a dedicated machine, so you do not have anything else going that
    can delay for more than a second?
    I read the IBM PC serial port BIOS code in the early 80s. It was pretty
    simple then. I would be very surprised if it has been messed up since
    and not fixed.
    Python has been used for long-running background processes, at least on
    *nix, so the Python devs are sensitive to memory leak issues and respond
    to leak reports. That still leaves memory fragmentation. To try to avoid
    that, I would allocate all the needed data arrays immediately on startup
    (with dummy None pointers) and reuse them instead of deleting and
    regrowing them. Keeping explicit pointers is, of course more tedious and
    slightly error prone.
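A small sketch of the allocate-once-and-reuse idea, written as a fixed-size ring buffer over a plain list. `RingBuffer` is an illustrative name; the standard library's `collections.deque(maxlen=n)` gives similar bounded behaviour with less code.

```python
class RingBuffer:
    """Fixed-size buffer allocated once; old samples are overwritten in place."""

    def __init__(self, capacity):
        self._slots = [None] * capacity  # allocated once at startup
        self._next = 0                   # index of the slot to write next
        self._count = 0                  # number of valid samples held

    def append(self, value):
        """Store value, overwriting the oldest sample once the buffer is full."""
        self._slots[self._next] = value
        self._next = (self._next + 1) % len(self._slots)
        self._count = min(self._count + 1, len(self._slots))

    def values(self):
        """Return the held samples, oldest first (as a new list)."""
        if self._count < len(self._slots):
            return self._slots[:self._count]
        return self._slots[self._next:] + self._slots[:self._next]
```

The backing list never grows or shrinks after construction, which is the point of the advice; only `values()` creates a temporary copy, for plotting.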

    I hope someone with actual experience also answers.
     
    Terry Reedy, Jul 9, 2010
    #4
  5. I normally use Linux for this sort of thing, so YMMV on the following advice.
    I'd keep the two timers, the process that actually checks and logs the data,
    and any postprocessing code completely separate. I'd also use something
    like the logging module to double up on where your data is stored- one on
    the local machine, another physically separated in case you lose a hard
    drive. It will also give you more information about where a failure might
    have occurred if/when it does. I'd also handle output/display on a separate
    machine.
    I would launch this as a subprocess and log so that even if you miss a
    measurement you still get what you need further on in the process.
    Just make sure that you can detect it at the time and that you also
    log an error when it occurs.

    This also minimizes the amount of memory your application directly
    handles and the susceptibility of your code to non-fatal problems with
    the serial port.
    Just ask all the what-ifs and you'll probably be ok.
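Doubling up the storage with the logging module might look like the sketch below. Both paths are placeholders; the second is assumed to sit on a different physical disk or a mounted network share, and `make_data_logger` is a hypothetical helper.

```python
import logging


def make_data_logger(local_path, remote_path, name="instrument"):
    """Return a logger that writes every record to two files.

    Assumption: remote_path lives on separate hardware from local_path.
    """
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep data records out of the root logger
    fmt = logging.Formatter("%(asctime)s %(levelname)s %(message)s")
    for path in (local_path, remote_path):
        handler = logging.FileHandler(path)
        handler.setFormatter(fmt)
        logger.addHandler(handler)
    return logger
```

Measurements would go through `logger.info(...)` and missed samples through `logger.error(...)`, so both copies carry the data and the record of what went wrong.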

    Geremy Condra
     
    geremy condra, Jul 9, 2010
    #5
  6. Get a good lawyer and put it into the contract; the last thing you want is
    a windows update that restarts the host, leaving you responsible because
    you guaranteed that it would run continuously.

    On the technical side; as Christian Heimes already pointed out, split
    the programs. Specifically I would have 1 service for data gathering and
    two separate watchdog services (each checking that the other watchdog
    and the 'core' service are still running).
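As a rough stand-in for the watchdog idea, here is a plain-process supervisor that restarts a child when it dies. It is not a Windows service and does not cover the watchdogs watching each other; `supervise` is a hypothetical helper.

```python
import subprocess
import time


def supervise(cmd, max_restarts, delay_s=1.0):
    """Run cmd, restarting it whenever it exits with a non-zero status.

    Returns the list of exit codes seen. Stops after a clean exit or once
    max_restarts restarts have been used up.
    """
    exit_codes = []
    while True:
        code = subprocess.call(cmd)  # blocks until the child exits
        exit_codes.append(code)
        if code == 0 or len(exit_codes) > max_restarts:
            return exit_codes
        time.sleep(delay_s)  # brief pause so a crash loop does not spin
```

With `import sys`, a call such as `supervise([sys.executable, "collector.py"], max_restarts=100)` would keep a collector script going, where `collector.py` is a placeholder name.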

    The GUI should be a user-side app and the services could be too,
    however consider running the services under the appropriate system
    account, as in the past I have seen some strange things happening with
    services under a user account, especially if there are password policies.

    From the interpreter's point of view I don't see any reason why it couldn't
    work; it is much more likely your host system will mess things up (even
    if it weren't windows).
    <cut rest>
     
    Martin P. Hellwig, Jul 9, 2010
    #6
  7. John Nagle

    John Nagle Guest

    What if Master Control in Redmond decides to reboot your machine
    to install an update? They've been known to do that even when you
    thought you had remote update turned off.

    If you have to run Windows in a data acquisition application,
    you should be running Windows Embedded. See

    http://www.microsoft.com/windowsembedded

    This is a version of Windows 7 which comes with a tool for
    building customized boot images. Basically, you take out
    everything except what your application needs.

    Do you get Embedded Systems Journal? You should.

    John Nagle
     
    John Nagle, Jul 9, 2010
    #7
  8. On 7/9/2010 12:13 PM Les Schaffer said...
    Keep users off the machine, turn off automated updates, and point dns to
    127.0.0.1. Oh, put it on a UPS. I've got a handful or two of these
    automated systems in place and rarely have trouble. Add a watchdog
    scheduled task process to restart the job and check disk space or memory
    usage and push out a heartbeat.
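The heartbeat and disk-space checks could be sketched as below, assuming the job touches a heartbeat file on every loop and a separate scheduled task runs the check. File names and thresholds are placeholders, and both function names are hypothetical.

```python
import os
import shutil
import time


def write_heartbeat(path):
    """Job side: record the current time so a checker can see we are alive."""
    with open(path, "w") as f:
        f.write(str(time.time()))


def health_problems(heartbeat_path, data_dir, max_age_s, min_free_bytes):
    """Checker side: return a list of problems, empty if everything looks fine."""
    problems = []
    try:
        age = time.time() - os.path.getmtime(heartbeat_path)
        if age > max_age_s:
            problems.append("heartbeat is %.0f s old" % age)
    except OSError:
        problems.append("heartbeat file is missing")
    free = shutil.disk_usage(data_dir).free
    if free < min_free_bytes:
        problems.append("only %d bytes free" % free)
    return problems
```

A scheduled task that gets a non-empty list back would restart the job or raise an alarm, depending on what went wrong.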

    I found Chris Liechti's serial module years ago and have used it
    successfully since.

    The only times that come to mind when I've had problems on the python side
    were related to memory usage, when the system started thrashing. Adding
    memory fixed it.

    HTH,

    Emile
     
    Emile van Sebille, Jul 10, 2010
    #8
  9. Roy Smith

    Roy Smith Guest

    Heh. The OS won't stay up that long.
     
    Roy Smith, Jul 10, 2010
    #9
  10. Tim Chase

    Tim Chase Guest

    While I'm not sure how much of Roy's comment was "hah, hah, just
    serious", this has been my biggest issue with long-running Python
    processes on Win32 -- either power outages the UPS can't handle,
    or (more frequently) the updates (whether Microsoft-initiated or
    by other vendors' update tools) require a reboot for every
    ${EXPLETIVE}ing thing. The similar long-running Python processes
    I have on my Linux boxes have about 0.1% of the reboots/restarts
    for non-electrical reasons (just kernel and Python updates).

    As long as you're not storing an ever-increasing quantity of data
    in memory (write it out to disk storage and you should be fine),
    I've not had problems with Python-processes running for months.
    If you want belt+suspenders with that, you can take others'
    recommendations for monitoring processes and process separation
    of data-gathering vs. front-end GUI/web interface.

    -tkc
     
    Tim Chase, Jul 10, 2010
    #10
  11. Les Schaffer wrote:
    Zope is (rightly) considered as a memory/resources hog, and I have a
    Zope instance hosting two apps on a cheap dedicated server that has not
    been restarted for the past 2 or 3 years. So as long as your code is
    clean you should not have much problem with the Python runtime itself,
    at least on a linux box. Can't tell how it would work on Windows.
     
    Bruno Desthuilliers, Jul 10, 2010
    #11
  12. John Nagle

    John Nagle Guest

    If the device you're listening to is read-only, and you're just
    listening, make a cable to feed the serial data into two machines,
    and have them both log it. Put them on separate UPSs and in
    a place where nobody can knock them over or mess with them.

    John Nagle
     
    John Nagle, Jul 10, 2010
    #12
  13. * John Nagle, on 10.07.2010 20:54:
    "The Ramans do everything in triplicate" - Old jungle proverb


    Cheers,

    - Alf
     
    Alf P. Steinbach /Usenet, Jul 10, 2010
    #13
  14. sturlamolden

    sturlamolden Guest

    Win32 is also the only OS in common use known to fragment memory
    enough to make long-running processes crash or hang (including system
    services), and require reboots on a regular basis. Algorithms haven't
    changed, but it takes a bit "longer" for the heap to go fubar with
    Win64. (That is, "longer" as in "you're dead long before it happens".)
    For processes that need to run that long, I would really recommend
    using Win64 and Python compiled for amd64.
     
    sturlamolden, Jul 11, 2010
    #14
  15. ^^^^^^^^^^

    IMO, that's going to be your main problem.
     
    Grant Edwards, Jul 12, 2010
    #15
  16. John Nagle

    John Nagle Guest

    If you're doing a real-time job, run a real-time OS. QNX,
    a real-time variant of Linux, Windows CE, Windows Embedded, LynxOS,
    etc. There's too much background junk going on in a consumer OS
    today.

    Yesterday, I was running a CNC plasma cutter that's controlled
    by Windows XP. This is a machine that moves around a plasma torch that
    cuts thick steel plate. A "New Java update is available" window
    popped up while I was working. Not good.

    John Nagle
     
    John Nagle, Jul 12, 2010
    #16
  17. CM

    CM Guest

    I'm not sure I can like that example any better.
     
    CM, Jul 12, 2010
    #17
  18. John Bokma

    John Bokma Guest

    You can blame that one on Sun (or Oracle nowadays). Good example though.
     
    John Bokma, Jul 12, 2010
    #18
  19. Tim Chase

    Tim Chase Guest

    <Clippy> Hi, it looks like you're attempting to cut something
    with a plasma torch. Would you like help?

    (_) inserting a steel plate to cut

    (_) severing the tip of your finger

    [ ] Don't show me this tip again.




    -tkc
     
    Tim Chase, Jul 12, 2010
    #19
  20. That's downright frightening.

    --

    Stephen Hansen
    ... Also: Ixokai
    ... Mail: me+list/python (AT) ixokai (DOT) io
    ... Blog: http://meh.ixokai.io/


     
    Stephen Hansen, Jul 13, 2010
    #20