result sorting in socket.getaddrinfo?

Discussion in 'Python' started by Bernhard Schmidt, Oct 9, 2004.

  1. Hello,

    sorry for bothering you. I'm not a programmer and I don't do much Python;
    I'm more of a networking guy trying to get his favourite Linux distribution
    to update over the shiny new protocol IPv6 again (for those who are
    interested, I'm talking about Gentoo Linux).

    Gentoo's portage system is implemented in Python and calls rsync to sync
    with a mirror server. There are rotations (metahostnames with many
    address records) where portage has to decide which IP it wants to
    connect to.

    Basically the program needs a list of all available IP addresses and
    will cycle through those until the sync is finished successfully.
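
    A minimal sketch of that cycling behaviour, not portage's actual code:
    `sync_with_failover` and `try_sync` are hypothetical names, and `try_sync`
    stands in for whatever runs rsync against one address and reports success.
    The addresses are placeholders from the documentation range.

```python
def sync_with_failover(addresses, try_sync):
    """Try each address in order; return the first that syncs, else None.

    `try_sync` is a hypothetical callable standing in for the rsync call.
    It is assumed to return True on success and False on failure.
    """
    for address in addresses:
        if try_sync(address):
            return address
    return None


# Stand-in sync function that only "succeeds" for one placeholder address.
addresses = ["192.0.2.1", "192.0.2.2", "192.0.2.3"]
print(sync_with_failover(addresses, lambda ip: ip == "192.0.2.2"))
```

    Because the loop always starts at the head of the list, the order of the
    list decides which mirror gets tried first.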

    The old code looked like this:

    | ips = socket.gethostbyname_ex(hostname)[2]

    if you test this for example with as hostname you
    will get a list of addresses that changes its order with every call.
    This behaviour is used for loadbalancing and failover through the

    Now to support IPv6 addresses one has to use socket.getaddrinfo. This is
    my current try (don't laugh :) ):

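    | import socket
    |
    | ips = []
    | # Each entry is (family, socktype, proto, canonname, sockaddr).
    | ipsockets = socket.getaddrinfo(hostname, None, 0, socket.SOCK_STREAM)
    | for entry in ipsockets:  # don't name this "socket", it would shadow the module
    |     if entry[0] == socket.AF_INET6:  # AF_INET6 is 10 on Linux
    |         # Wrap IPv6 literals in brackets.
    |         ips.append('[' + entry[4][0] + ']')
    |     else:
    |         ips.append(entry[4][0])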

    Big problem: The result of getaddrinfo(), and therefore ips, is sorted
    in some way. If you again do

    | >>> socket.getaddrinfo("",None,0,socket.SOCK_STREAM);

    and have a closer look at the resulting list you will observe two
    things (at least I do on my box)

    a) The two IPv6 addresses ([2001:andsoon]) are always in front of the
    IPv4 addresses. This is expected behaviour and is consistent with
    most applications/stacks supporting IPv6

    b) The records within one address family (IPv4 or IPv6) are not really
    in randomized order. I called it several hundred times now and the
    order of the IPv6 records is always

    '2001:638:500:101::21', '2001:7b0:11ff:1::1:1'

    Even worse, there are 15 IPv4 records in that list, and so far I have
    seen only two of them at the beginning of the list.

    When I debug the on-wire format of the DNS queries I can see that the
    resolver server indeed answers with randomized order, so the sorting
    seems to appear either somewhere in Python or somewhere in the

    The consequence of this would be that the two servers at the front of
    the list get hammered with traffic while the others sit idle.

    1.) Is it possible to change this behaviour?

    2.) If not, does someone have a code snippet available for randomizing
    the resulting list or another idea how to solve this?

    Python 2.3.3 (#1, Jun 4 2004, 00:57:34)
    [GCC 3.3.2 20031218 (Gentoo Linux 3.3.2-r5, propolice-3.3-7)] on linux2

    [ some time later ]
    Gnah, I just found a way that even I, being a non-programmer (especially
    regarding C), could test this ... using OpenSSH I verified that C
    programs also suffer from this problem, so it's not Python's fault.
    Disregard question one above :)
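
    For question two, one possible approach is to shuffle the entries inside
    each address family while keeping the families in the order getaddrinfo()
    returned them, so IPv6 stays in front. This is only a sketch:
    `shuffle_within_family` is a hypothetical helper name, it assumes a modern
    Python (3.7+, where dicts keep insertion order) rather than 2.3, and the
    sample entries use placeholder addresses from the documentation ranges.

```python
import random
import socket


def shuffle_within_family(addrinfos, rng=random):
    """Shuffle getaddrinfo()-style entries, keeping families in first-seen order."""
    groups = {}  # family -> entries; relies on dict insertion order (3.7+)
    for entry in addrinfos:
        groups.setdefault(entry[0], []).append(entry)
    shuffled = []
    for entries in groups.values():
        rng.shuffle(entries)  # randomize only inside one address family
        shuffled.extend(entries)
    return shuffled


# Hand-made entries shaped like getaddrinfo() results (placeholder addresses).
sample = [
    (socket.AF_INET6, socket.SOCK_STREAM, 6, "", ("2001:db8::1", 0, 0, 0)),
    (socket.AF_INET6, socket.SOCK_STREAM, 6, "", ("2001:db8::2", 0, 0, 0)),
    (socket.AF_INET, socket.SOCK_STREAM, 6, "", ("192.0.2.1", 0)),
    (socket.AF_INET, socket.SOCK_STREAM, 6, "", ("192.0.2.2", 0)),
    (socket.AF_INET, socket.SOCK_STREAM, 6, "", ("192.0.2.3", 0)),
]
for entry in shuffle_within_family(sample):
    print(entry[4][0])
```

    Passing the real result of socket.getaddrinfo() instead of `sample` should
    work the same way, since only the first tuple element is inspected.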

    Thanks a lot
    Bernhard Schmidt, Oct 9, 2004
