Intermittent Failure on Serial Port

Discussion in 'Python' started by H J van Rooyen, Jun 10, 2006.

  1. Hi All,

    I am writing a polling controller for an RS-485 line that has
    several addressable devices connected.
    It is a small access control system.
    All is well: the code runs for anything from three hours to three days, then
    sometimes, when I get a comms error and have to send out a nak character, it
    fails hard...

    The traceback below pops up. The first lines are just some debug prints,
    and the four records show reader number, id number, name...

    _______________start of Konsole Messages __________________

    /dev/ttyS0 set to 9600 sane cread raw -echo
    .../logs/composite/rawlog
    Pipe exists already
    we get here - thread identity is: 1079298992
    New Thread identity printed by new thread is: 1079298992
    we get here too
    5 0123456789012345 Sefie Sewenstein is in
    2 DE0000085ABF8A01 Error record - catch her, catch him
    2 8A00000870BEDE01 Bertus Bierdrinker is in
    5 0123456789012345 Sefie Sewenstein is out
    Traceback (most recent call last):
      File "portofile.py", line 232, in ?
        ret_val = main_routine(port, pollstruct, pfifo)
      File "portofile.py", line 108, in main_routine
        send_nak(port, timeout) # so bad luck - comms error
      File "/home/hvr/Polling/lib/readerpoll.py", line 125, in send_nak
        port.flush()
    IOError: [Errno 29] Illegal seek
    close failed: [Errno 29] Illegal seek

    _______________end of Konsole Messages ___________________


    The additional thread referred to is used to maintain a file of
    people on the premises. It is not relevant to this,
    as it just writes out a global dictionary when
    something has changed, and it works quite merrily...

    Some of what I think are the relevant code snippets:

    This one works all the time, as I send out an ack on the
    successful receipt of a message from one of the readers, and
    I have not seen it fail yet:

    def send_ack(port, timeout):
        """Routine to send out an ack on port"""

        ack_string = ack # Ascii ack = '\x06'
        s = ''

        port.write(ack_string)
        port.flush()
        flush_out(port, timeout) # eat the echoed ack

    This one is called after a comms error, and sometimes falls over in the above
    manner...

    def send_nak(port, timeout):
        """Routine to send out a nak on port"""

        nak_string = nak # Ascii nak = '\x15'
        s = ''

        port.write(nak_string)
        port.flush()
        flush_out(port, timeout) # eat the echoed nak


    # here we read to end, to flush a port buffer

    import time

    def flush_out(file, time_out):
        """Reads the port till no more chars come in for a while"""

        start_time = time.time()
        s = ''
        while (time.time() - start_time < time_out):
            s, ret_val = read_char(file, s)
            if ret_val == 0: # if there is input...
                start_time = time.time() # ...restart the quiet timer


    # We make a function to read a char of file data

    def read_char(file, s):
        """Reads file data returns string, ret_val is 0 on success,
        1 for KbdInt, 2 no input."""

        ret_val = 0 # No error yet
        k = '' # Nothing read yet
        try:
            k = file.read(1) # read one char in as a string
        except KeyboardInterrupt:
            print "\n^C - Returning to Shell - Error is:", KeyboardInterrupt
            ret_val = 1 # ^C while waiting for input
            return k, ret_val # go out on error
        except IOError:
            ret_val = 2 # IOError on input - no record available
            return s, ret_val # so no extra chars
        if k == '': # empty string is nfg
            ret_val = 2
            return s, ret_val # so no extra chars
        s = s + k # something to add in to passed buffer
        return s, ret_val # ok exit no error
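
    For comparison, the same read-until-quiet idea can be sketched with select
    on the raw file descriptor instead of polling a non-blocking file object.
    This is only an illustration, written for Python 3 rather than the 2.4 used
    above; the name drain_until_quiet and the 256-byte chunk size are
    assumptions, not part of the original code.

```python
import os
import select
import time


def drain_until_quiet(fd, quiet_time):
    """Read and return bytes from fd until none arrive for quiet_time seconds."""
    collected = b""
    deadline = time.monotonic() + quiet_time
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break
        # Sleep in the kernel until input arrives or the quiet period ends.
        readable, _, _ = select.select([fd], [], [], remaining)
        if not readable:
            break  # timed out with no input
        chunk = os.read(fd, 256)
        if not chunk:
            break  # end of file: nothing more will arrive
        collected += chunk
        deadline = time.monotonic() + quiet_time  # restart the quiet timer
    return collected
```

    Waiting in select avoids the busy loop, and reading with os.read keeps the
    buffered file object out of the receive path.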


    Note that the file is made non-blocking with some magic from the fcntl module...
    The file is the serial port /dev/ttyS0
    There is hardware connected to the port that has the effect of a loopback - you
    hear what you say..
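
    The fcntl "magic" is not shown in this post. A minimal sketch of how a
    descriptor is commonly put into non-blocking mode follows; it is written
    for Python 3, and the helper names set_nonblocking and read_available are
    hypothetical, not taken from the original program.

```python
import fcntl
import os


def set_nonblocking(fd):
    """Add O_NONBLOCK to the descriptor's status flags, keeping the others."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFL)
    fcntl.fcntl(fd, fcntl.F_SETFL, flags | os.O_NONBLOCK)


def read_available(fd, max_bytes=1):
    """Return up to max_bytes from fd, or b'' if nothing is waiting."""
    try:
        return os.read(fd, max_bytes)
    except BlockingIOError:
        # EAGAIN/EWOULDBLOCK: the descriptor is fine, there is just no input.
        return b""
```

    Note that os.read also returns b'' at end of file, so this helper cannot
    tell "no input yet" from "closed"; for a serial port that stays open this
    is usually acceptable.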

    Out of the box distribution - SuSe 10:

    Python 2.4.1 (#1, Sep 13 2005, 00:39:20)
    [GCC 4.0.2 20050901 (prerelease) (SUSE Linux)] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>>


    Where can I find out what the Errno 29 really means?
    Is this Python, the OS or maybe hardware?

    Any ideas or suggestions will be appreciated.

    (I am doing this via email - so I am not on line all the time- so my responses
    may be slow...)

    - Hendrik van Rooyen
     
    H J van Rooyen, Jun 10, 2006
    #1

  2. Serge Orlov

    H J van Rooyen wrote:
    > Traceback (most recent call last):
    >   File "portofile.py", line 232, in ?
    >     ret_val = main_routine(port, pollstruct, pfifo)
    >   File "portofile.py", line 108, in main_routine
    >     send_nak(port, timeout) # so bad luck - comms error
    >   File "/home/hvr/Polling/lib/readerpoll.py", line 125, in send_nak
    >     port.flush()
    > IOError: [Errno 29] Illegal seek
    > close failed: [Errno 29] Illegal seek
    >



    > Where can I find out what the Errno 29 really means?
    > Is this Python, the OS or maybe hardware?


    It is from the kernel:

    grep -w 29 `locate errno`
    /usr/include/asm-generic/errno-base.h: #define ESPIPE 29 /* Illegal seek */

    man lseek:

    ERRORS:
    ESPIPE fildes is associated with a pipe, socket, or FIFO.

    RESTRICTIONS:
    Linux specific restrictions: using lseek on a tty device
    returns ESPIPE.
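
    The same lookup can be done from Python itself with the errno module, and
    the lseek behaviour from the man page can be reproduced on a pipe. This is
    a small sketch for Python 3; the function names describe_errno and
    seek_on_pipe_errno are made up for the illustration.

```python
import errno
import os


def describe_errno(code):
    """Return (symbolic name, message) for a numeric errno value."""
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


def seek_on_pipe_errno():
    """Try to lseek on a pipe and return the errno the kernel reports."""
    read_fd, write_fd = os.pipe()
    try:
        os.lseek(read_fd, 0, os.SEEK_CUR)
    except OSError as exc:
        return exc.errno
    finally:
        os.close(read_fd)
        os.close(write_fd)
    return None  # the seek unexpectedly succeeded
```

    Calling describe_errno(29) on the Linux box in question gives the name and
    message for the number in the traceback, and seek_on_pipe_errno() shows
    that any seek on a non-seekable descriptor produces the same error.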
     
    Serge Orlov, Jun 10, 2006
    #2

  3. Serge Orlov wrote:

    | It is from kernel: grep -w 29 `locate errno`
    | /usr/include/asm-generic/errno-base.h: #define ESPIPE 29
    | /* Illegal seek */
    |
    | man lseek:
    |
    | ERRORS:
    | ESPIPE fildes is associated with a pipe, socket, or FIFO.
    |
    | RESTRICTIONS:
    | Linux specific restrictions: using lseek on a tty device
    | returns ESPIPE.


    Thanks for the info - so the Kernel sometimes bombs me out - does anybody know
    why the python flush sometimes calls lseek?

    - Hendrik
     
    H J van Rooyen, Jun 11, 2006
    #3
  4. Serge Orlov

    H J van Rooyen wrote:
    > Thanks for the info - so the Kernel sometimes bombs me out - does anybody know
    > why the python flush sometimes calls lseek?


    I thought it was your own flush method. If it is the file.flush method, that
    makes the issue more complicated, since the stdlib file.flush doesn't call
    lseek. I suggest you run your program under strace to log system
    calls; without such a log it's pretty hard to say what's going on. The
    most interesting part is the end, but make sure you have enough space
    for the whole log; it's going to be big.
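
    While waiting for the strace log, one experiment is to take the buffered
    file object out of the transmit path and write the nak directly to the
    descriptor, so that no flush() is involved at all. The sketch below is for
    Python 3 and is only an assumption about what might help; nothing in this
    thread establishes that it avoids the Illegal seek. The names write_all
    and send_nak_unbuffered are hypothetical.

```python
import os

NAK = b"\x15"  # ASCII NAK, the same byte send_nak writes


def write_all(fd, data):
    """Write every byte of data to fd, retrying on short writes."""
    view = memoryview(data)
    while view:
        written = os.write(fd, view)
        view = view[written:]


def send_nak_unbuffered(fd):
    """Send a NAK straight to the descriptor; there is no buffer to flush."""
    write_all(fd, NAK)
```

    With an existing file object the descriptor is available as port.fileno().
    On a non-blocking descriptor os.write can raise BlockingIOError when the
    output queue is full, which a real implementation would have to handle.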
     
    Serge Orlov, Jun 11, 2006
    #4
  5. Serge Orlov wrote:

    | I thought it was your own flush method. If it is file.flush method that
    | makes the issue more complicated, since stdlib file.flush doesn't call
    | lseek method. I suggest you run your program using strace to log system
    | calls, without such log it's pretty hard to say what's going on. The
    | most interesting part is the end, but make sure you have enough space
    | for the whole log, it's going to be big.

    Thanks - I will research and use strace. I haven't used it before - I have
    about 30 gig disk space left...
    Trouble is that the silly thing works for anything from some hours to some days
    before it falls over - ugly...
    Will bleat again when I have some more results...

    In the meantime I have a datascope attached to the line, and it appears as if it
    was on the point of sending a nak in response to a perfectly well formed
    message - almost as if either an interrupt was missed - unlikely at 9600 baud
    and a pentium 3 at some 2 GHz - or there is a weird hardware error - also
    unlikely - hardware normally just breaks, it does not work for millions of chars
    and then miss one... - I don't like the implication...

    I also don't really understand the second reference - to a close that failed -
    anyway we have to wait for the trace...


    - Hendrik
     
    H J van Rooyen, Jun 12, 2006
    #5
