Question about unreasonable slowness

Discussion in 'Python' started by allenjo5@mail.northgrum.com, Nov 16, 2006.

  1. Guest

    [ Warning: I'm new to Python. Don't know it at all really yet, but had
    to examine some 3rd party code because of performance problems with it.
    ]

    Here's a code snippet:

    i = 0
    while (i < 20):
        i = i + 1
        # for testing, the spawned shell does nothing
        (shellIn, shellOut) = os.popen4("/bin/sh -c ':'")
        print 'next'
        # for line in shellOut:
        #     print line

    On my system (AIX 5.1 if it matters, with Python 2.4.3), this simple
    loop spawning 20 subshells takes 0.75 sec. OK, that's reasonable. Now,
    if I uncomment the two commented lines, which loop over the empty
    shellOut array, the program takes 11 secs. That slowdown seems
    very hard to believe. Why should it slow down so much?

    John.
     
    allenjo5, Nov 16, 2006
    #1

  2. allenjo5 wrote:
    > i = 0
    > while (i < 20):
    >     i = i + 1


    for i in xrange(20):

    >     # for testing, the spawned shell does nothing
    >     (shellIn, shellOut) = os.popen4("/bin/sh -c ':'")
    >     print 'next'
    >     # for line in shellOut:
    >     #     print line
    >
    > On my system (AIX 5.1 if it matters, with Python 2.4.3), this simple
    > loop spawning 20 subshells takes .75 sec. Ok, that's reasonable. Now,
    > if I uncomment the two commented lines, which loop over the empty
    > shellOut array, the program now takes 11 secs. That slowdown seems
    > very hard to believe. Why should it slow down so much?


    The key fact here is that shellOut isn't an array; it's a living,
    breathing file object. If you don't iterate over it, you can run all 20
    shell processes in parallel if necessary; but if you do iterate over it,
    you're waiting for sh's stdout pipe to reach EOF, which effectively
    means you can only run one process at a time.

    On my system (OS X 10.4 with Python 2.5 installed), your code runs in
    0.187 secs with the loop commented out, and in 0.268 secs otherwise. But I
    guess AIX's sh is slower than OS X's.
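
    A minimal sketch of the effect described above, written for a current
    Python 3 with subprocess rather than the os.popen4 call from the thread.
    The 0.2 second sleep is an assumption standing in for a child that is
    slow to finish; the function names are made up for illustration. Reading
    each pipe to EOF forces the children to run one after another, while
    spawning without reading lets them overlap.

```python
import subprocess
import time

# Assumption: a child that takes about 0.2 s, standing in for a slow spawn.
CMD = ["/bin/sh", "-c", "sleep 0.2"]
N = 5


def spawn_only():
    """Start N children without reading their output; they run concurrently."""
    start = time.perf_counter()
    procs = [subprocess.Popen(CMD, stdout=subprocess.PIPE) for _ in range(N)]
    spawn_time = time.perf_counter() - start
    for proc in procs:
        proc.communicate()  # clean-up only, not part of the measured time
    return spawn_time


def spawn_and_read():
    """Read each child's pipe to EOF before starting the next child."""
    start = time.perf_counter()
    for _ in range(N):
        proc = subprocess.Popen(CMD, stdout=subprocess.PIPE)
        for _line in proc.stdout:  # blocks until the child closes its stdout
            pass
        proc.stdout.close()
        proc.wait()
    return time.perf_counter() - start


if __name__ == "__main__":
    print("spawn only:     %.3f s" % spawn_only())
    print("spawn and read: %.3f s" % spawn_and_read())
```

    The second figure should be roughly N times the child's run time, since
    each iteration waits for EOF before the next spawn.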
     
    Leif K-Brooks, Nov 16, 2006
    #2

  4. On Thu, 16 Nov 2006 12:45:18 -0800, allenjo5 wrote:

    > [ Warning: I'm new to Python. Don't know it at all really yet, but had
    > to examine some 3rd party code because of performance problems with it.
    > ]
    >
    > Here's a code snippet:
    >
    > i = 0
    > while (i < 20):
    >     i = i + 1



    You probably want to change that to:

    for i in range(20):

    If 20 is just a place-holder, and the real value is much bigger, change
    the range() to xrange().


    >     # for testing, the spawned shell does nothing
    >     (shellIn, shellOut) = os.popen4("/bin/sh -c ':'")
    >     print 'next'
    >     # for line in shellOut:
    >     #     print line
    >
    > On my system (AIX 5.1 if it matters, with Python 2.4.3), this simple
    > loop spawning 20 subshells takes .75 sec. Ok, that's reasonable. Now,
    > if I uncomment the two commented lines, which loop over the empty
    > shellOut array, the progam now takes 11 secs. That slowdown seems
    > very hard to believe. Why should it slow down so much?


    What are you using to time the code?

    Replacing print statements with "pass", I get these results:

    >>> import os
    >>> import timeit
    >>>
    >>> def test():
    ...     i = 0
    ...     while (i < 20):
    ...         i = i + 1
    ...         (shellIn, shellOut) = os.popen4("/bin/sh -c ':'")
    ...         pass # print 'next'
    ...         for line in shellOut:
    ...             pass # print line
    ...
    >>> timeit.Timer("test()", "from __main__ import test\nimport os").timeit(1)
    0.49781703948974609
    >>> timeit.Timer("test()", "from __main__ import test\nimport os").timeit(100)
    54.894074201583862

    About 0.5 second to open and dispose of 20 subshells, even with the "for
    line in shellOut" loop.


    I think you need some more fine-grained testing to determine whether the
    slowdown is actually happening inside the "for line in shellOut" loop or
    inside the while loop or when the while loop completes.
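
    One way to do that finer-grained timing, sketched for a current Python 3
    with subprocess instead of os.popen4. The function name time_phases and
    its defaults are hypothetical; the idea is simply to put a timestamp
    between the spawn and the read so the two costs can be told apart.

```python
import subprocess
import time


def time_phases(n=20, cmd=("/bin/sh", "-c", ":")):
    """Return (spawn_seconds, read_seconds), each summed over n children."""
    spawn_total = 0.0
    read_total = 0.0
    for _ in range(n):
        t0 = time.perf_counter()
        # stdin/stdout pipes with stderr merged, roughly what popen4 set up
        proc = subprocess.Popen(
            list(cmd),
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
        )
        t1 = time.perf_counter()
        for _line in proc.stdout:  # returns once the child closes its output
            pass
        t2 = time.perf_counter()
        proc.stdin.close()
        proc.stdout.close()
        proc.wait()
        spawn_total += t1 - t0
        read_total += t2 - t1
    return spawn_total, read_total


if __name__ == "__main__":
    spawn_s, read_s = time_phases()
    print("spawning: %.3f s, reading to EOF: %.3f s" % (spawn_s, read_s))
```

    If nearly all the time lands in the read phase even though the child
    prints nothing, the wait is for the child to start and exit, not for
    the loop body itself.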



    --
    Steven.
     
    Steven D'Aprano, Nov 16, 2006
    #4
  5. On 16 Nov 2006 12:45:18 -0800, allenjo5 declaimed the
    following in comp.lang.python:

    > [ Warning: I'm new to Python. Don't know it at all really yet, but had
    > to examine some 3rd party code because of performance problems with it.
    > ]
    >

    No comment on your timing problem, but...

    > Here's a code snippet:
    >
    > i = 0
    > while (i < 20):
    >     i = i + 1


    for i in xrange(20): #replaces all three of the above lines
    --
    Wulfraed Dennis Lee Bieber KD6MOG

    HTTP://wlfraed.home.netcom.com/
    (Bestiaria Support Staff: )
    HTTP://www.bestiaria.com/
     
    Dennis Lee Bieber, Nov 17, 2006
    #5
  6. Guest

    Leif K-Brooks wrote:
    > allenjo5 wrote:
    > > i = 0
    > > while (i < 20):
    > >     i = i + 1

    >
    > for i in xrange(20):
    >
    > >     # for testing, the spawned shell does nothing
    > >     (shellIn, shellOut) = os.popen4("/bin/sh -c ':'")
    > >     print 'next'
    > >     # for line in shellOut:
    > >     #     print line
    > >
    > > On my system (AIX 5.1 if it matters, with Python 2.4.3), this simple
    > > loop spawning 20 subshells takes .75 sec. Ok, that's reasonable. Now,
    > > if I uncomment the two commented lines, which loop over the empty
    > > shellOut array, the program now takes 11 secs. That slowdown seems
    > > very hard to believe. Why should it slow down so much?

    >
    > The key fact here is that shellOut isn't an array; it's a living,
    > breathing file object. If you don't iterate over it, you can run all 20
    > shell processes in parallel if necessary; but if you do iterate over it,
    > you're waiting for sh's stdout pipe to reach EOF, which effectively
    > means you can only run one process at a time.


    Aha! I now notice that with the second loop commented out, I see many
    python processes running for a little while after the main program
    ends. So that confirms what you stated.

    > On my system (OS X 10.4 with Python 2.5 installed), your code runs in
    > .187 secs with the loop commented out, and in .268 secs otherwise. But I
    > guess AIX's sh is slower than OS X's.


    Ok, I built Python 2.5 (same AIX 5.1 machine). With the "for line in
    shellOut" loop in, it now takes "only" 7 secs instead of the 11 secs in
    python 2.4.3. So, that's better, but still unreasonably slow. And to
    answer another's question, I'm using the ksh builtin 'time' command to
    time the overall script.

    BTW, I don't think the AIX /bin/sh (actually ksh) is inherently slow.
    This naively translated pure shell version of my python test script
    completes in .1 secs:

    i=1
    while ((i<20))
    do ((i+=1))
       print next
       print "$shellIn" | /bin/sh -c ':' |
       while read line
       do print $line
       done
    done

    Has anyone tried this on a true unix box (AIX, HPUX, Solaris, Linux)?
    It seems to be functioning differently (and faster) on Windows and OS X
    (though I guess at its heart, OS X is essentially unix).

    John.
     
    allenjo5, Nov 17, 2006
    #6
  7. allenjo5 wrote:
    > Ok, I built Python 2.5 (same AIX 5.1 machine). With the "for line in
    > shellOut" loop in, it now takes "only" 7 secs instead of the 11 secs in
    > python 2.4.3. So, that's better, but still unreasonably slow. And to
    > answer another's question, I'm using the ksh builtin 'time' command to
    > time the overall script.
    >
    > BTW, I don't think the AIX /bin/sh (actually ksh) is inherently slow.
    > This naively translated pure shell version of my python test script
    > completes in .1 secs:
    >
    > i=1
    > while ((i<20))
    > do ((i+=1))
    >    print next
    >    print "$shellIn" | /bin/sh -c ':' |
    >    while read line
    >    do print $line
    >    done
    > done
    >
    > Has anyone tried this on a true unix box (AIX, HPUX, Solaris, Linux)?
    > It seems to be functioning differently (and faster) on Windows and OS X
    > (though I guess at its heart, OS X is essentially unix).
    >
    > John.
    >


    Linux 2.6.17-1.2142_FC4smp #1 SMP Tue Jul 11 22:57:02 EDT 2006 i686 i686
    i386 GNU/Linux

    # <code>

    import os
    import timeit

    def test():
        for i in xrange(20):
            (shellIn, shellOut) = os.popen4("/bin/sh -c ':'")
            print 'next'
            for line in shellOut:
                print line

    print timeit.Timer("test()", "from __main__ import test\nimport os").timeit(1)

    # </code>


    This returns in 0.4 seconds. If I time 50 runs, it returns
    after 20.2 - 20.5 seconds, even if I replace the for i in xrange()
    construct with your sh-like while statement. And all that over a
    network, with print statements intact. Guess your true Unix box has some
    features unavailable on Fedora Core or MacOS X ;-)

    Regards,
    Łukasz Langa
     
    Łukasz Langa, Nov 17, 2006
    #7
  8. Guest

    Łukasz Langa wrote:
    > allenjo5 wrote:
    > > Ok, I built Python 2.5 (same AIX 5.1 machine). With the "for line in
    > > shellOut" loop in, it now takes "only" 7 secs instead of the 11 secs in
    > > python 2.4.3. So, that's better, but still unreasonably slow. And to
    > > answer another's question, I'm using the ksh builtin 'time' command to
    > > time the overall script.
    > >
    > > BTW, I don't think the AIX /bin/sh (actually ksh) is inherently slow.
    > > This naively translated pure shell version of my python test script
    > > completes in .1 secs:
    > >
    > > i=1
    > > while ((i<20))
    > > do ((i+=1))
    > > print next
    > > print "$shellIn" | /bin/sh -c ':' |
    > > while read line
    > > do print $line
    > > done
    > > done
    > >
    > > Has anyone tried this on a true unix box (AIX, HPUX, Solaris, Linux)?
    > > It seems to be functioning differently (and faster) on Windows and OS X
    > > (though I guess at its heard, OS X is essentially unix).
    > >
    > > John.
    > >

    >
    > Linux 2.6.17-1.2142_FC4smp #1 SMP Tue Jul 11 22:57:02 EDT 2006 i686 i686
    > i386 GNU/Linux
    >
    > # <code>
    >
    > import os
    > import timeit
    >
    > def test():
    >     for i in xrange(20):
    >         (shellIn, shellOut) = os.popen4("/bin/sh -c ':'")
    >         print 'next'
    >         for line in shellOut:
    >             print line
    >
    > print timeit.Timer("test()", "from __main__ import test\nimport os").timeit(1)
    >
    > # </code>
    >
    >
    > This returns in 0.4 seconds. If I time it to do 50 tests, it returns
    > after 20.2 - 20.5 seconds. Even if I substitute the for i in xrange()
    > construct to your sh-like while statement. And all that through a
    > network, with print statements intact. Guess your true Unix box has some
    > features unavailable on Fedora Core or MacOS X ;-)


    Yeah, apparently this is an AIX specific issue - perhaps the python
    implementation of popen4() needs to do something special for AIX?

    I've since tested my script on SunOS 5.9 with Python 2.4.2, and it took
    only about 1.5 sec with or without the second for loop, but without it,
    there were no extra python processes running in the background when the
    main one ends, unlike what I saw on AIX. This might be a clue to
    someone who knows more than I do... any Python gurus out there running
    AIX?

    John.
     
    allenjo5, Nov 17, 2006
    #8
  9. James Antill Guest

    On Fri, 17 Nov 2006 12:39:16 -0800, allenjo5 wrote:

    >> :
    >> > Ok, I built Python 2.5 (same AIX 5.1 machine). With the "for line in
    >> > shellOut" loop in, it now takes "only" 7 secs instead of the 11 secs in
    >> > python 2.4.3. So, that's better, but still unreasonably slow. And to
    >> > answer another's question, I'm using the ksh builtin 'time' command to
    >> > time the overall script.
    >> >
    >> > BTW, I don't think the AIX /bin/sh (actually ksh) is inherently slow.
    >> > This naively translated pure shell version of my python test script
    >> > completes in .1 secs:
    >> >
    >> > i=1
    >> > while ((i<20))
    >> > do ((i+=1))
    >> > print next
    >> > print "$shellIn" | /bin/sh -c ':' |
    >> > while read line
    >> > do print $line
    >> > done
    >> > done

    >>

    > Yeah, apparently this is an AIX specific issue - perhaps the python
    > implementation of popen4() needs to do something special for AIX?


    This seems likely a more general issue, rather than just a python issue
    (although the huge speed up from moving to 2.5.x). A
    couple of things I'd try:

    1. Split the spawn/IO apart, twenty procs. should be fine.
    2. Try making the pipe buffer size bigger (optional third argument to
    os.popen4).
    3. Note that you might well be spawning three processes, and
    are definitely running two shells. Any shell init slowness is going to be
    no fun. Use an array to run /bin/true, and time that.
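
    A rough sketch of the third suggestion, assuming a current Python 3 with
    subprocess in place of os.popen4. The helper name time_runs is made up,
    and "true" is looked up on PATH because its location differs between
    systems. Passing a string with shell=True wraps the command in an extra
    /bin/sh, so "/bin/sh -c ':'" ends up running two shells; passing a list
    executes the program directly.

```python
import subprocess
import time


def time_runs(args, n=20, **kwargs):
    """Run the command n times in sequence and return the elapsed seconds."""
    start = time.perf_counter()
    for _ in range(n):
        subprocess.run(args, stdout=subprocess.DEVNULL, check=True, **kwargs)
    return time.perf_counter() - start


if __name__ == "__main__":
    # String + shell=True: /bin/sh -c "/bin/sh -c ':'" (a shell inside a shell)
    via_shell = time_runs("/bin/sh -c ':'", shell=True)
    # List: the program is exec'd directly, with no shell at all
    direct = time_runs(["true"])
    print("two shells: %.3f s, no shell: %.3f s" % (via_shell, direct))
```

    If the two figures are close, shell start-up is not the bottleneck and
    the cost lies in the spawn itself.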

    --
    James Antill --
    http://www.and.org/and-httpd/ -- $2,000 security guarantee
    http://www.and.org/vstr/
     
    James Antill, Nov 20, 2006
    #9
  10. Guest

    James Antill wrote:
    > On Fri, 17 Nov 2006 12:39:16 -0800, allenjo5 wrote:
    >
    > >> :
    > >> > Ok, I built Python 2.5 (same AIX 5.1 machine). With the "for line in
    > >> > shellOut" loop in, it now takes "only" 7 secs instead of the 11 secs in
    > >> > python 2.4.3. So, that's better, but still unreasonably slow. And to
    > >> > answer another's question, I'm using the ksh builtin 'time' command to
    > >> > time the overall script.
    > >> >
    > >> > BTW, I don't think the AIX /bin/sh (actually ksh) is inherently slow.
    > >> > This naively translated pure shell version of my python test script
    > >> > completes in .1 secs:
    > >> >
    > >> > i=1
    > >> > while ((i<20))
    > >> > do ((i+=1))
    > >> > print next
    > >> > print "$shellIn" | /bin/sh -c ':' |
    > >> > while read line
    > >> > do print $line
    > >> > done
    > >> > done
    > >>

    > > Yeah, apparently this is an AIX specific issue - perhaps the python
    > > implementation of popen4() needs to do something special for AIX?

    >
    > This seems likely a more general issue, rather than just a python issue
    > (although the huge speed up from moving to 2.5.x). A
    > couple of things I'd try:


    With help from c.u.aix, I've discovered the problem. Python (in
    popen2.py) is attempting to close file descriptors 3 through 32767
    before running the /bin/sh. This is because os.sysconf('SC_OPEN_MAX')
    is returning 32767. So far, it looks like SC_OPEN_MAX is being set
    correctly to 4 in posixmodule.c, and indeed, os.sysconf_names seems to
    also have SC_OPEN_MAX set to 4:

    python -c 'import os; print os.sysconf_names'

    ...
    'SC_XOPEN_XCU_VERSION': 109, 'SC_OPEN_MAX': 4, 'SC_PRIORITIZED_IO': 91,
    ...

    In fact, none of the values that sysconf_names has set for the various
    constants are being returned by os.sysconf(). For example, the 2
    others I just listed:

    $ ./python -c 'import os; print os.sysconf("SC_XOPEN_XCU_VERSION")'
    4

    $ ./python -c 'import os; print os.sysconf("SC_PRIORITIZED_IO")'
    -1

    This makes no sense to me... unless there is some memory alignment or
    endian issue going on here?
     
    allenjo5, Nov 21, 2006
    #10
  11. Guest

    allenjo5 wrote:
    > James Antill wrote:
    > > On Fri, 17 Nov 2006 12:39:16 -0800, allenjo5 wrote:
    > >
    > > >> :
    > > >> > Ok, I built Python 2.5 (same AIX 5.1 machine). With the "for line in
    > > >> > shellOut" loop in, it now takes "only" 7 secs instead of the 11 secs in
    > > >> > python 2.4.3. So, that's better, but still unreasonably slow. And to
    > > >> > answer another's question, I'm using the ksh builtin 'time' command to
    > > >> > time the overall script.
    > > >> >
    > > >> > BTW, I don't think the AIX /bin/sh (actually ksh) is inherently slow.
    > > >> > This naively translated pure shell version of my python test script
    > > >> > completes in .1 secs:
    > > >> >
    > > >> > i=1
    > > >> > while ((i<20))
    > > >> > do ((i+=1))
    > > >> > print next
    > > >> > print "$shellIn" | /bin/sh -c ':' |
    > > >> > while read line
    > > >> > do print $line
    > > >> > done
    > > >> > done
    > > >>
    > > > Yeah, apparently this is an AIX specific issue - perhaps the python
    > > > implementation of popen4() needs to do something special for AIX?

    > >
    > > This seems likely a more general issue, rather than just a python issue
    > > (although the huge speed up from moving to 2.5.x). A
    > > couple of things I'd try:

    >
    > With help from c.u.aix, I've discovered the problem. Python (in
    > popen2.py) is attempting to close filedescriptors 3 through 32767
    > before running the /bin/sh. This is because os.sysconf('SC_OPEN_MAX')
    > is returning 32767. So far, it looks like SC_OPEN_MAX is being set
    > correctly to 4 in posixmodule.c, and indeed, os.sysconf_names seems to
    > also have SC_OPEN_MAX set to 4:
    >
    > python -c 'import os; print os.sysconf_names'
    >
    > ...
    > 'SC_XOPEN_XCU_VERSION': 109, 'SC_OPEN_MAX': 4, 'SC_PRIORITIZED_IO': 91,
    > ...
    >
    > In fact, none of the values that sysconf_names has set for the various
    > constants are being returned by os.sysconf(). For example, the 2
    > others I just listed:
    >
    > $ ./python -c 'import os; print os.sysconf("SC_XOPEN_XCU_VERSION")'
    > 4
    >
    > $ ./python -c 'import os; print os.sysconf("SC_PRIORITIZED_IO")'
    > -1
    >
    > This makes no sense to me... unless there is some memory alignment or
    > endian issue going on here?


    More info: clearly I had no idea what I was talking about :)

    The numbers associated with the names returned by os.sysconf_names are
    the indices to an array that the C sysconf() function uses to return
    the value of the name. So, the fact that os.sysconf("SC_OPEN_MAX")
    was returning 32767 on AIX is correct. However, the slowness this
    causes is still an issue. This is because python is closing all these
    file descriptors in python code, not C code - specifically, in
    popen2.py:

    try:
        MAXFD = os.sysconf('SC_OPEN_MAX')
    except (AttributeError, ValueError):
        MAXFD = 256

    ...

    def _run_child(self, cmd):
        if isinstance(cmd, basestring):
            cmd = ['/bin/sh', '-c', cmd]
        for i in range(3, MAXFD):
            try:
                os.close(i)
            except OSError:
                pass
        try:
            os.execvp(cmd[0], cmd)
        finally:
            os._exit(1)
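
    The name/index/value distinction described above can be checked with a
    few lines. This is a small sketch for a current Python 3 on a Unix-like
    system; the helper name describe is made up for illustration.

```python
import os


def describe(name):
    """Return (index, value) for a sysconf setting given by name."""
    index = os.sysconf_names[name]  # integer key the C sysconf() call takes
    value = os.sysconf(name)        # the setting's actual value on this system
    return index, value


if __name__ == "__main__":
    idx, val = describe("SC_OPEN_MAX")
    print("SC_OPEN_MAX: index %d, value %d" % (idx, val))
    # os.sysconf also accepts the integer index directly
    print("looked up by index:", os.sysconf(idx))
```

    The index is only a lookup key; the value is what ends up in MAXFD and
    sets the length of the close loop.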


    Any chance the "for i in range(3, MAXFD):" loop could be done in C
    instead? Even having, say, an os.rclose(x,y) low level function to
    close all file descriptors in range [x,y] would be great.
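
    For later readers: Python 2.6 and newer provide os.closerange(fd_low,
    fd_high), which closes descriptors from fd_low up to but not including
    fd_high in C and ignores errors, much like the os.rclose() asked for
    here. The sketch below, for a current Python 3, compares it with the
    pure-Python loop. It runs the comparison in a throwaway child
    interpreter so that no descriptors of the calling process are closed;
    the CHILD script and the helper name compare_close are illustrative
    only.

```python
import subprocess
import sys

# Script run in a separate interpreter; it closes fds 3..hi two ways.
CHILD = r"""
import os, sys, time
hi = int(sys.argv[1])
t0 = time.perf_counter()
for fd in range(3, hi):          # the popen2.py-style loop
    try:
        os.close(fd)
    except OSError:
        pass
t1 = time.perf_counter()
os.closerange(3, hi)             # same range, done in C, errors ignored
t2 = time.perf_counter()
print(t1 - t0, t2 - t1)
"""


def compare_close(hi=32767):
    """Return (python_loop_seconds, closerange_seconds) measured in a child."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD, str(hi)],
        capture_output=True,
        text=True,
        check=True,
    )
    loop_s, range_s = (float(x) for x in result.stdout.split())
    return loop_s, range_s


if __name__ == "__main__":
    loop_s, range_s = compare_close()
    print("python loop: %.4f s, os.closerange: %.4f s" % (loop_s, range_s))
```

    The absolute numbers depend heavily on the platform; the 32767 default
    is just the SC_OPEN_MAX value reported on the AIX box in this thread.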

    John.
     
    allenjo5, Nov 21, 2006
    #11