CPU usage while reading a named pipe

Discussion in 'Python' started by Miguel P, Sep 12, 2009.

  1. Miguel P

    Miguel P Guest

    Hey everyone,

    I've been working on parsing (tailing) a named pipe which is the
    syslog output of the traffic for a rather busy haproxy instance. It's
    a fair bit of traffic (up to 3k hits/s per server), but I am finding
    that simply tailing the file in Python, without any processing, is
    taking up 15% of a CPU core. In contrast, HAProxy takes 25% and syslogd
    takes 5% with the same load. `cat < /named.pipe` takes 0-2%.

    Am I just doing things horribly wrong or is this normal?

    Here is my code:

    from collections import deque
    import io, sys

    WATCHED_PIPE = '/var/log/haproxy.pipe'

    if __name__ == '__main__':
        try:
            log_pool = deque([],10000)
            fd = io.open(WATCHED_PIPE)
            for line in fd:
                log_pool.append(line)
        except KeyboardInterrupt:
            sys.exit()

    Deque appends are O(1) so that's not it. And I am using 2.6's io
    module because it's supposed to handle named pipes better. I have
    commented out the deque append line and it still takes about the
    same CPU.
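
One way to check that observation is to time the loop with and without the append. The sketch below is only an illustration: the helper name `cpu_seconds_to_consume` and the in-memory sample are hypothetical, and it uses `time.process_time()`, which exists in Python 3.3 and later rather than in the 2.6 interpreter discussed here.

```python
import io
import time
from collections import deque


def cpu_seconds_to_consume(lines, keep=True, maxlen=10000):
    """Return the CPU seconds spent iterating over `lines`.

    With keep=True each line is appended to a bounded deque, as in the
    script above; with keep=False the loop only iterates.
    """
    pool = deque([], maxlen)
    start = time.process_time()  # CPU time of this process, not wall time
    for line in lines:
        if keep:
            pool.append(line)
    return time.process_time() - start


if __name__ == '__main__':
    # Hypothetical in-memory sample standing in for the named pipe.
    sample = 'client request line\n' * 200000
    with_append = cpu_seconds_to_consume(io.StringIO(sample), keep=True)
    iterate_only = cpu_seconds_to_consume(io.StringIO(sample), keep=False)
    print('with append: %.3fs, iterate only: %.3fs'
          % (with_append, iterate_only))
```

If the two figures come out close, the cost is in producing the lines rather than in storing them, which would match what the post reports.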

    The system is running Ubuntu 9.04 with kernel 2.6.28 and ext4 (not
    sure the FS is relevant).

    Any help bringing down the CPU usage would be really appreciated, and
    if it can't be done I guess that's OK too; the server has 6 cores not
    doing much.
     
    Miguel P, Sep 12, 2009
    #1

  2. MRAB

    MRAB Guest


    Is this any faster?

    log_pool.extend(fd)
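
A minimal sketch of that suggestion, assuming the same bounded deque as in the original script. The function name `tail_into_pool` is hypothetical, and an in-memory stream stands in for the pipe so the example terminates.

```python
import io
from collections import deque


def tail_into_pool(lines, maxlen=10000):
    """Fill a bounded deque from any iterable of lines in one call."""
    pool = deque([], maxlen)
    # In CPython, extend() runs the append loop in C, so no Python-level
    # loop body executes per line. Producing each line costs the same.
    pool.extend(lines)
    return pool


if __name__ == '__main__':
    # In-memory stand-in for the pipe, so the call returns immediately.
    demo = tail_into_pool(io.StringIO('one\ntwo\nthree\n'), maxlen=2)
    print(list(demo))  # ['two\n', 'three\n']
```

Note that `extend()` only returns once the iterable is exhausted. On a real named pipe that means it blocks until every writer has closed the pipe, so the pool cannot be inspected from the same thread while reading is in progress.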
     
    MRAB, Sep 12, 2009
    #2

  3. Ned Deily

    Ned Deily Guest


    Be aware that the io module in Python 2.6 is written in Python and was
    viewed as a prototype. In the current svn trunk, what will be Python
    2.7 has a much faster C implementation of the io module backported from
    Python 3.1.
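
A possible workaround on 2.6, not something proposed in this thread, is to bypass the io layer and read large chunks with `os.read`, splitting lines by hand. The sketch below assumes that per-line overhead in the pure-Python io code is what dominates, which has not been measured here. The function name `read_lines_chunked` and the chunk size are hypothetical choices, and lines are kept as bytes without their trailing newline.

```python
import os
from collections import deque


def read_lines_chunked(fd, pool, chunk_size=65536):
    """Read raw bytes from descriptor `fd`, appending whole lines to `pool`."""
    pending = b''
    while True:
        chunk = os.read(fd, chunk_size)
        if not chunk:          # EOF: every writer has closed the pipe
            break
        pending += chunk
        lines = pending.split(b'\n')
        pending = lines.pop()  # last piece is an incomplete line, or empty
        pool.extend(lines)
    if pending:
        pool.append(pending)   # trailing data without a final newline
    return pool


if __name__ == '__main__':
    # Demonstrated on an anonymous pipe; for the FIFO in the original post
    # the descriptor would come from os.open(WATCHED_PIPE, os.O_RDONLY).
    r, w = os.pipe()
    os.write(w, b'first\nsecond\npart')
    os.write(w, b'ial\n')
    os.close(w)
    print(list(read_lines_chunked(r, deque([], 10000))))
    os.close(r)
```

Each `os.read` call can return many log lines at once, so the number of Python-level operations scales with the number of chunks rather than the number of lines.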

    --
    Ned Deily
     
    Ned Deily, Sep 12, 2009
    #3
  4. Miguel P

    Miguel P Guest


    Aha, I will test with trunk and see if the performance is better. If
    so, I'll use 2.6 in production until 2.7 comes out. I will report back
    once I have run the tests.

    Thanks,
    Miguel Pilar
     
    Miguel P, Sep 12, 2009
    #4
