ulimit on open sockets?

Discussion in 'Python' started by Maxim Veksler, Apr 10, 2007.

  1. Hi,

    I've written this code; the general idea is to listen for connections
    on all 65535 TCP ports.
    """
    #!/usr/bin/env python

    import socket, select

    def get_non_blocking_socket(port_number):
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        s.setblocking(0)
        s.bind(('0.0.0.0', port_number))
        s.listen(1)
        return s

    all_sockets = map(get_non_blocking_socket, xrange(10000, 15000))

    while 1:
        ready_to_read, ready_to_write, in_error = select.select(all_sockets, [], [], 0)
        for nb_active_socket in all_sockets:
            if nb_active_socket in ready_to_read:
                conn, addr = nb_active_socket.accept()
                while 1:
                    data = conn.recv(1024)
                    if not data: break
                    conn.send(data)
                conn.close()
    """

    The thing is that when I tried to run this at first I got
    """
    python non_blocking_range.py
    Traceback (most recent call last):
      File "non_blocking_range.py", line 12, in ?
        all_sockets = map(get_non_blocking_socket, xrange(10000, 15000))
      File "non_blocking_range.py", line 6, in get_non_blocking_socket
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
      File "/usr/lib/python2.4/socket.py", line 148, in __init__
        _sock = _realsocket(family, type, proto)
    socket.error: (24, 'Too many open files')
    """

    So I set ulimit -n 500000, now I'm getting
    """
    python non_blocking_range.py
    Traceback (most recent call last):
      File "non_blocking_range.py", line 15, in ?
        ready_to_read, ready_to_write, in_error = select.select(all_sockets, [], [], 0)
    ValueError: filedescriptor out of range in select()
    """

    Should I be using a different version of select or something? Or
    should I implement this some other way? If so, please suggest how.

    Thank you very much,
    (enthusiastically learning python) Maxim.

    --
    Cheers,
    Maxim Veksler

    "Free as in Freedom" - Do u GNU ?
     
    Maxim Veksler, Apr 10, 2007
    #1

  2. Maxim Veksler wrote:

    > I've written this code, the general idea was to listen on all
    > 65535 port of tcp for connection.


    Please excuse the question: Why would anyone want to do such a manic
    thing (instead of, e. g., using raw sockets)?

    Regards,


    Björn

    --
    BOFH excuse #326:

    We need a licensed electrician to replace the light bulbs in the
    computer room.
     
    Bjoern Schliessmann, Apr 10, 2007
    #2

  3. Maxim Veksler wrote:

    > ValueError: filedescriptor out of range in select()
    > """
    >
    > Should I be using a different version of select or something? Or


    select typically supports 1024 FDs at most (a design limit of the
    underlying operating system). You may want to try poll instead (epoll
    might be better but I doubt Python supports it yet).
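
    A minimal sketch of the poll approach, written for current Python 3 rather than the Python 2.4 used in this thread. The helper names make_listener, build_poller and accept_ready are made up for illustration. select.poll is not available on every platform (Windows lacks it), and the process still needs a high enough `ulimit -n` to open that many sockets.

```python
import select
import socket


def make_listener(port, host="127.0.0.1"):
    """Return a non-blocking TCP socket listening on (host, port)."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    s.bind((host, port))
    s.listen(5)
    s.setblocking(False)
    return s


def build_poller(listeners):
    """Register every listener for read events; return (poller, fd -> socket map)."""
    poller = select.poll()
    by_fd = {}
    for s in listeners:
        poller.register(s.fileno(), select.POLLIN)
        by_fd[s.fileno()] = s
    return poller, by_fd


def accept_ready(poller, by_fd, timeout_ms=1000):
    """Wait up to timeout_ms, then accept one connection per ready listener."""
    accepted = []
    for fd, event in poller.poll(timeout_ms):
        if event & select.POLLIN:
            try:
                accepted.append(by_fd[fd].accept())  # (conn, addr) pair
            except BlockingIOError:
                continue  # the pending connection vanished before accept
    return accepted
```

    With the thread's port range this would be something like `listeners = [make_listener(p, host="0.0.0.0") for p in range(10000, 15000)]`, followed by repeated calls to accept_ready. poll is handed a list of descriptors rather than a fixed-size bit set, so descriptor values above 1023 are not a problem for it.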


    Alex
     
    Alex Martelli, Apr 10, 2007
    #3
  4. On 4/10/07, Alex Martelli wrote:
    > Maxim Veksler wrote:
    >
    > > ValueError: filedescriptor out of range in select()
    > > """
    > >
    > > Should I be using a different version of select or something? Or

    >
    > select typically supports 1024 FDs at most (a design limit of the
    > underlying operating system). You may want to try poll instead (epoll
    > might be better but I doubt Python supports it yet).
    >


    The other day I read a post by someone who faced a similar problem, and
    it turns out {e,}poll is limited as well. Besides, I don't know how to
    use it, so an example would be great.

    Now, someone I work with suggested a simple workaround: "Pass the list
    objects in groups of 1024 each time to the select.select structure". I
    think it's acceptable and good advice; the thing is, I don't know how
    to implement this "the Python way" (that is, without it being ugly).

    Can I do a modulo ( % 1024 ) on the main iterator loop?
    Something like:

    for nb_active_socket in (all_sockets % 1024):
        if nb_active_socket in ready_to_read:
            conn, addr = nb_active_socket.accept()
            while 1:
                data = conn.recv(1024)
                if not data: break
                conn.send(data)
            conn.close()

    ?

    Thanks for helping,
    Maxim.




     
    Maxim Veksler, Apr 12, 2007
    #4
  5. On Apr 12, 2007, at 1:17 PM, Maxim Veksler wrote:
    ...
    > Now, someone I work with suggested a simple workaround: "Pass the list
    > objects in groups of 1024 each time to the select.select structure". I
    > think it's acceptable and good advice; the thing is, I don't know how
    > to implement this "the Python way" (that is, without it being ugly).


    I don't understand how you're going to make it work (I see no select
    calls in your code and I don't understand how you'd get one in there
    by polling), but I'm going to just explain how to get slices of 1024
    items at a time from a long list.

    Simplest way:

    for i in xrange(0, len(longlist), 1024):
        shortlist = longlist[i:i+1024]
        # rest of the body goes here

    More elegant/reusable:

    def sliceby(longlist, N=1024):
        # step and slice width both follow N, so other chunk sizes work too
        for i in xrange(0, len(longlist), N):
            yield longlist[i:i+N]

    for shortlist in sliceby(longlist):
        # body goes here

    If you want to show off, itertools.groupby may be suitable for that:

    for _, g in itertools.groupby(enumerate(longlist), lambda (i, j): i//1024):
        shortlist = list(a for b, a in g)
        # rest of the body goes here

    but I wouldn't recommend it in this case; groupby is meant for other purposes.


    Alex
     
    Alex Martelli, Apr 13, 2007
    #5
  6. On 4/13/07, Alex Martelli wrote:
    >
    > On Apr 12, 2007, at 1:17 PM, Maxim Veksler wrote:
    > ...
    > > Now, someone I work with suggested a simple workaround: "Pass the list
    > > objects in groups of 1024 each time to the select.select structure". I
    > > think it's acceptable and good advice; the thing is, I don't know how
    > > to implement this "the Python way" (that is, without it being ugly).

    >
    > I don't understand how you're going to make it work (I see no select
    > calls in your code and I don't understand how you'd get one in there
    > by polling), but I'm going to just explain how to get slices of 1024
    > items at a time from a long list.
    >


    Thank you. I'm attaching the full code so far for reference; sadly, it
    still doesn't work. It seems that select.select gets its count of
    fds not from the number passed to it in the sub-list but from the
    kernel (or whatever) count for the process. The main issue here is
    that it seems I won't be able to use select for the simple
    non-blocking process and am forced to check poll to see if that helps.

    The error I'm getting is still the same:

    # ulimit -n
    500000
    # python listener_sockets_range.py
    Traceback (most recent call last):
      File "listener_sockets_range.py", line 22, in ?
        ready_to_read, ready_to_write, in_error = select.select(select_cap_sockets, [], [], 0)
    ValueError: filedescriptor out of range in select()


    """
    #!/usr/bin/env python

    import socket, select

    def get_non_blocking_socket(port_number):
        print port_number

        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        s.setblocking(0)
        s.bind(('0.0.0.0', port_number))
        s.listen(1)
        return s

    def slice_by_fd_limit(longlist, N=1024):
        for i in xrange(0, len(longlist), N):
            yield longlist[i:i+N]

    all_sockets = map(get_non_blocking_socket, xrange(10000, 20000))

    while 1:
        for select_cap_sockets in slice_by_fd_limit(all_sockets):
            ready_to_read, ready_to_write, in_error = select.select(select_cap_sockets, [], [], 0)
            for nb_active_socket in all_sockets:
                if nb_active_socket in ready_to_read:
                    conn, addr = nb_active_socket.accept()
                    while 1:
                        data = conn.recv(1024)
                        if not data: break
                        conn.send(data)
                    conn.close()
    """

     
    Maxim Veksler, Apr 14, 2007
    #6
  7. Maxim Veksler wrote:
    ...
    > Thank you. I'm attaching the full code so far for reference; sadly, it
    > still doesn't work. It seems that select.select gets its count of
    > fds not from the number passed to it in the sub-list but from the
    > kernel (or whatever) count for the process. The main issue here is


    It's not a problem of COUNT of FD's, i.e., how many you're passing to
    select; the problem is the value of the _highest_ number you can pass.
    It's an API-level limitation, not an issue with Python per se: the
    select API takes a "bit vector" of N bits, representing a set of FDs in
    that way, and N is fixed at kernel-compilation time (normally to 1024).
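
    To make the "highest value, not count" point concrete, here is a toy pure-Python model of such a bit vector. It is only an illustration, not how the C fd_set is implemented; the class name ToyFdSet is made up, and FD_SETSIZE = 1024 is assumed as the typical value.

```python
FD_SETSIZE = 1024  # assumed typical value; the real constant is platform-specific


class ToyFdSet:
    """Pure-Python model of a fixed-size fd_set bit vector (illustration only)."""

    def __init__(self, size=FD_SETSIZE):
        self.size = size
        self.bits = 0  # bit n is set when descriptor n is in the set

    def add(self, fd):
        # The limit applies to the descriptor's numeric value, not to how
        # many descriptors are stored.
        if not 0 <= fd < self.size:
            raise ValueError("filedescriptor out of range in select()")
        self.bits |= 1 << fd

    def __contains__(self, fd):
        return 0 <= fd < self.size and bool((self.bits >> fd) & 1)
```

    A single descriptor numbered 1024 overflows the set even if it is the only one passed, which is why splitting the list into groups of 1024 does not help once the process holds descriptors with higher numbers.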

    The poll system call does not have this particular limitation, which is
    why select.poll may be better for you.

    Moreover, your code has other performance problems:


    > while 1:
    >     for select_cap_sockets in slice_by_fd_limit(all_sockets):
    >         ready_to_read, ready_to_write, in_error = select.select(select_cap_sockets, [], [], 0)
    >         for nb_active_socket in all_sockets:
    >             if nb_active_socket in ready_to_read:


    A small issue is with the last two lines -- instead of looping directly
    on the small "ready-to-read" list, you're looping on the large
    all_sockets one and looking each up in the small list -- that's just
    throwing performance out of the window, and adding complexity, for no
    benefit whatsoever.
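
    Looping directly on the ready list might look like this in current Python 3. This is a sketch: drain_ready is a hypothetical helper, and it reads from already-connected sockets rather than accepting, just to keep the example short.

```python
import select
import socket


def drain_ready(sockets, timeout=1.0):
    """Return {socket: data} for every socket that select reports readable."""
    ready_to_read, _, _ = select.select(sockets, [], [], timeout)
    received = {}
    # Iterate over the (usually short) ready list only; there is no need to
    # walk every socket and test membership in ready_to_read.
    for sock in ready_to_read:
        received[sock] = sock.recv(1024)
    return received
```

    The work per call is then proportional to the number of ready sockets instead of the product of both list lengths.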

    The big issue is that you are "ceaselessly polling". If no socket is
    ready to read, you force select to return immediately anyway, and
    basically call select at once afterwards. You churn on the CPU without
    surcease, using 100% of it, hogging it for this "busy wait", possibly to
    the point of crowding out the kernel from some of the CPU time it needs
    to do useful work in the TCP-IP stack. Busy-wait is a bad thing...
    never call select with a timeout of 0 in a tight loop. This
    recommendation also applies to the polling-object that you can build
    with select.poll, and any other situation where you're waiting for
    another thread or process to deliver some data -- ideally you should
    wait in a blocking way, if that's unfeasible at least make sure you're
    letting some time pass between such calls, by using small but non-0
    timeout (or even by inserting calls to time.sleep if that's what it
    takes).
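
    A rough way to see the difference, assuming current Python 3: count_wakeups (a made-up helper) counts how often a loop around select comes back within a fixed period when nothing is ready.

```python
import select
import socket
import time


def count_wakeups(sockets, duration, timeout):
    """Call select in a loop for `duration` seconds; return the number of calls."""
    deadline = time.monotonic() + duration
    wakeups = 0
    while time.monotonic() < deadline:
        # timeout=0 returns immediately (busy wait); a positive timeout lets
        # the process sleep in the kernel until data arrives or time runs out.
        select.select(sockets, [], [], timeout)
        wakeups += 1
    return wakeups
```

    With idle sockets, `count_wakeups(socks, 0.3, 0.1)` comes back a handful of times, while `count_wakeups(socks, 0.3, 0)` spins as fast as the CPU allows.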

    The risk of such "antipatterns" is a good reason why it would be better
    to use a well-designed, well-coded, well-debugged existing framework,
    such as Twisted, rather than roll your own, btw. With twisted, you can
    choose among many appropriate implementations of "reactor" (the key
    design pattern for async programming) and activate the one that is most
    suitable for your needs (including, e.g., one based on epoll, which
    gives better performance than poll on suitable operating systems).

    If you're adamant on "rolling your own", though, you can find a Python
    epoll module at <http://cheeseshop.python.org/pypi/pyepoll/0.2> (it's
    said to be in alpha status, though; I believe there are other such
    modules around, but pyepoll seems to be the only one on Cheese Shop).
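
    Since this thread was written, epoll support has reached the standard library: select.epoll was added in Python 2.6, and the selectors module (Python 3.4 and later) picks an efficient mechanism such as epoll, kqueue or poll depending on the platform. A minimal echo sketch with selectors follows. It is neither Twisted nor pyepoll, and the helper names listen_on and serve_once and the "listener"/"conn" tags are assumptions for illustration.

```python
import selectors
import socket


def listen_on(selector, port, host="127.0.0.1"):
    """Open a non-blocking listening socket and register it with the selector."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    s.bind((host, port))
    s.listen(5)
    s.setblocking(False)
    selector.register(s, selectors.EVENT_READ, data="listener")
    return s


def serve_once(selector, timeout=1.0):
    """Wait up to `timeout` seconds, then handle every socket that is ready."""
    for key, _events in selector.select(timeout):
        sock = key.fileobj
        if key.data == "listener":
            conn, _addr = sock.accept()
            conn.setblocking(False)
            selector.register(conn, selectors.EVENT_READ, data="conn")
        else:
            data = sock.recv(1024)
            if data:
                # Good enough for a sketch; large payloads on a non-blocking
                # socket would need write buffering.
                sock.sendall(data)
            else:
                # An empty read means the peer closed the connection.
                selector.unregister(sock)
                sock.close()
```

    A server would create `selectors.DefaultSelector()`, call listen_on once per port, and then call serve_once in a loop; the non-zero timeout keeps that loop from busy-waiting.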


    Alex
     
    Alex Martelli, Apr 14, 2007
    #7