socket setdefaulttimeout

Discussion in 'Python' started by Sheila King, Aug 13, 2005.

  1. Sheila King

    Sheila King Guest

    I'm doing DNS lookups on common spam blacklists (such as SpamCop and
    others) in an email filtering script. Because the DNS server that
    resolves the lookups can sometimes go down, it is important to make sure
    that the socket doesn't just hang there waiting for a response.

    After a recent system upgrade to Python 2.4.1 (from 2.2.2) I thought I
    could take advantage of the setdefaulttimeout in the socket module, to
    limit the amount of time the sockets take for a lookup.

    As a test, I set the default timeout ridiculously low. But it doesn't
    seem to be having any effect. The sockets will take several seconds to
    make the connection and complete the lookups, even though I've set the
    timeout to millionths of a second, which I thought would ensure a
    timeout (sample script below).

    Am I doing something wrong? Do I misunderstand something? Is what I
    want to do simply not possible?

    Thanks for any tips. Example code follows signature...

    --
    Sheila King
    http://www.thinkspot.net/sheila/

    #!/usr/local/bin/python2.4
    import socket
    import sys
    from time import time, asctime, localtime

    socket.setdefaulttimeout(.00001)
    debugfile = "socketdebug.txt"


    def debug(data):
        timestamp = str(asctime(localtime(time())))
        try:
            f = open(debugfile, 'a')
            f.write('\n*** %s ***\n' % timestamp)
            f.write('%s\n' % data)  # 24-Dec-2003 -ctm- removed one linefeed
            f.close()
        except IOError:
            pass
            # do nothing if the file cannot be opened


    IPaddy = '220.108.204.114'

    if IPaddy:
        IPquads = IPaddy.split('.')
        IPquads.reverse()
        reverseIP = '.'.join(IPquads)

        bl_list = {
            'bl.spamcop.net': 'IP Address %s Rejected - see: http://spamcop.net/bl.shtml' % IPaddy,
            'relays.ordb.org': 'IP Address %s Rejected - see: http://ordb.org/' % IPaddy,
            'list.dsbl.org': 'IP Address %s Rejected - see: http://dsbl.org' % IPaddy}

        timing_done = 0
        start_time = time()
        for host in bl_list.keys():
            if host in bl_list.keys():
                IPlookup = "%s.%s" % (reverseIP, host)
                try:
                    debug(" IPlookup=%s=" % IPlookup)
                    resolvesIP = socket.gethostbyname(IPlookup)
                    debug(" resolvesIP=%s=" % resolvesIP)
                    if resolvesIP.startswith('127.'):
                        end_time = time()
                        elapsed_time = end_time - start_time
                        timing_done = 1
                        debug("Time elapsed for rDNS on bl_list: %f secs" % elapsed_time)
                        debug("exiting--SPAM! id'd by %s" % host)
                        print bl_list[host]
                        sys.exit(0)
                except socket.gaierror:
                    pass
        if not timing_done:
            end_time = time()
            elapsed_time = end_time - start_time
            debug("2nd try:Time elapsed for rDNS on bl_list: %f secs" % elapsed_time)
     
    Sheila King, Aug 13, 2005
    #1

  2. Sheila King

    Sheila King Guest

    I do note that the setdefaulttimeout is accomplishing something in my
    full program.

    I am testing some error handling in the code at the moment, and am
    raising an exception to make the code go into the "except" blocks...

    The part that sends an error email notice bombed due to socket timeout.
    (well, until I raised the timeout to 3 seconds...)

    The last time I asked about this topic in the newsgroups...was a while
    back (like 2 or so years) and someone said that because the socket
    function that I'm trying to use is coded in C, rather than Python, that
    I could not use the timeoutsocket module (which was the only way prior
    to Python 2.3 to set timeouts on sockets).

    I wonder...is it possible this applies to the particular timeouts I'm
    trying to enforce on the "gethostbyname" DNS lookups?

    Insights appreciated....
     
    Sheila King, Aug 13, 2005
    #2

  3. Bryan Olson

    Bryan Olson Guest

    The timeout applies to network communication on that socket, but
    not to calls such as socket.gethostbyname. The gethostbyname
    function actually goes to the operating system, which can look
    up the name in a cache, or a hosts file, or query DNS servers
    on sockets of its own.

    Modern OSes generally have reasonable TCP/IP implementations,
    and the OS will handle applying a reasonable timeout. Still,
    gethostbyname and its brethren can be a pain for single-threaded,
    event-driven programs, because they can block for a
    significant time.

    Under some older threading systems, any system call would block
    every thread in the process, and gethostbyname was notorious for
    holding things up. Some systems offer an asynchronous
    gethostbyname, but that doesn't help users of Python's library.
    Some programmers would keep around a few extra processes to
    handle their hosts lookups. Fortunately, threading systems are
    now much better, and should only block the thread waiting for
    gethostbyname.
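
A minimal sketch of the distinction described above, written for Python 3 rather than the Python 2.4 used in this thread. The helper name `new_socket_timeout` is made up for illustration. It shows that the module-wide default is inherited by socket objects created afterwards. `socket.gethostbyname` takes only a hostname and is never handed one of those sockets, which is consistent with the lookup ignoring the setting.

```python
import socket


def new_socket_timeout(default):
    """Set the module default, then report the timeout a fresh socket inherits.

    Hypothetical helper for illustration; restores the previous default on exit.
    """
    previous = socket.getdefaulttimeout()
    socket.setdefaulttimeout(default)
    try:
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            # The default applies to socket objects created after the call.
            return s.gettimeout()
        finally:
            s.close()
    finally:
        socket.setdefaulttimeout(previous)
```

Calling `new_socket_timeout(0.5)` should report `0.5` for the new socket. A call such as `socket.gethostbyname(name)` has no socket argument for that value to attach to, so any limit on the lookup comes from the OS resolver instead.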
     
    Bryan Olson, Aug 13, 2005
    #3
  4. Sheila King

    Sheila King Guest

    Thanks, Bryan. I'm not doing any threading. But we are running this script on
    incoming email as it arrives at the SMTP server, and scripts have a 16 second
    max time of execution. Generally they run in much less time. However, we have
    seen incidents where, due to issues with the DNS servers for the blacklists,
    the script exceeded its max time to run and the process was killed by
    the OS. This results in the email being placed back into the mail queue for
    attempted re-delivery later. Of course, if this issue goes undetected, the
    mail can eventually be "returned to sender". There's no effective way to check
    from within the running filter script that the time is not exceeded if the
    gethostbyname blocks and doesn't return. :(

    As I said, normally this isn't a problem. But there have been a handful of
    incidents where it did cause issues briefly over a few days. I was hoping to
    address it. :/

    Sounds like I'm out of luck.
     
    Sheila King, Aug 13, 2005
    #4
  5. Bryan Olson

    Bryan Olson Guest

    The separate thread-or-process trick should work. Start a daemon
    thread to do the gethostbyname, and have the main thread give up
    on the check if the daemon thread doesn't report (via a lock or
    another socket) within, say, 8 seconds.

    If you have decent thread support, you might do it as
    follows. (Obviously I didn't have time to test this well.)



    from threading import Thread
    from Queue import Queue, Empty
    import socket


    def start_daemon_thread(func):
        """ Run func -- a callable of zero args -- in a daemon thread.
        """
        thread = Thread(target = func)
        thread.setDaemon(True)
        thread.start()

    def gethostbyname_or_timeout(hostname, timeout_secs = 8):
        """ Return IP address from gethostbyname, or None on timeout.
        """
        queue = Queue(1)

        def attempt_ghbn():
            # Runs in the daemon thread; may block as long as the OS allows.
            queue.put(socket.gethostbyname(hostname))

        start_daemon_thread(attempt_ghbn)
        try:
            # The main thread waits at most timeout_secs for a result.
            result = queue.get(block = True, timeout = timeout_secs)
        except Empty:
            result = None
        return result
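
For readers on Python 3, here is a rough equivalent of the same daemon-thread idea. It is a sketch, not a tested drop-in: the name `resolve_with_timeout` and the `resolver` parameter are assumptions added so the lookup function can be swapped out when experimenting, and it also returns None when the lookup itself fails, so the caller does not wait out the full timeout for a name that does not resolve.

```python
import queue
import socket
import threading


def resolve_with_timeout(hostname, timeout_secs=8.0, resolver=socket.gethostbyname):
    """Return the resolver's result, or None on timeout or lookup failure.

    `resolver` is a hypothetical hook; by default it is socket.gethostbyname.
    """
    results = queue.Queue(1)

    def worker():
        try:
            results.put(resolver(hostname))
        except OSError:
            # socket.gaierror is a subclass of OSError in Python 3.
            results.put(None)

    # A daemon thread will not keep the process alive if the lookup hangs.
    threading.Thread(target=worker, daemon=True).start()
    try:
        return results.get(timeout=timeout_secs)
    except queue.Empty:
        return None
```

In a blacklist check, a None result would be treated the same way as "not listed", so a slow blacklist server costs at most `timeout_secs` per lookup. The abandoned thread keeps running in the background until the OS resolver gives up.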
     
    Bryan Olson, Aug 13, 2005
    #5
  6. Sheila King

    Sheila King Guest

    Bryan: Thanks for the tips/suggestion.

    I will definitely look into that. (It will be my first foray into
    coding with threads...I do appreciate that you've laid a great deal of
    it out. I will certainly refer to my references and do substantial
    testing on this...)

    Thanks!
     
    Sheila King, Aug 13, 2005
    #6
  7. I've got the opposite problem -- I'm on a dial-up (well, for a few
    more weeks -- until the DSL gear arrives). For some reason DNS lookups
    seem to be low priority and, if I'm downloading messages in Agent (from
    three servers yet) and email (Eudora), Firefox often comes back with a
    "host not found"; enter the same URL/bookmark again, and it finds the
    page. It seems my system times out DNS requests when 1) I have a lot of
    traffic on the dial-up connection, and 2) the request may need to be
    traversed (not cached on Earthlink).
    --
     
    Dennis Lee Bieber, Aug 14, 2005
    #7
  8. John Machin

    John Machin Guest

    Dragging the thread off-topic a little:

    I was having a similar DNS problem (host not found, try again
    immediately, 2nd try successful) but even in a no-other-load situation,
    and with this set of gear:

    * ADSL with Netgear DG834G v2 "wireless ADSL firewall router"
    * both hardwired and wireless LAN connection to router
    * Firefox, Thunderbird [normally]
    * IE6, Outlook Express [just to test if it was a Mozilla problem]
    * Windows XP Professional SP2, Windows 2000 SP4

    The problem stopped after I upgraded the router firmware from version
    1.something to 2.something, but this may be yet another instance of the
    "waved a dead chicken at the volcano" syndrome :)
     
    John Machin, Aug 14, 2005
    #8
  9. Wouldn't an alarm be much simpler than a whole thread just for this?

    Mike

    --
    Michael P. Soulier <>
    "Those who would give up essential liberty for temporary safety deserve
    neither liberty nor safety." --Benjamin Franklin

     
    Michael P. Soulier, Aug 15, 2005
    #9
  10. You mean a Unix-specific signal? If so that would be much less
    portable. As for simpler, I'd have to see your code.
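
For comparison, one possible reading of the "alarm" suggestion, sketched for Python 3 on Unix. The names `LookupTimeout` and `call_with_alarm` are invented here. Two caveats apply: signal handlers can only be installed from the main thread, and CPython runs the Python-level handler only once control returns to the interpreter, so a lookup blocked inside the C resolver may not actually be cut short. Whether it works for gethostbyname would need testing on the target platform.

```python
import signal


class LookupTimeout(Exception):
    """Raised by the SIGALRM handler when the time budget runs out."""


def _on_alarm(signum, frame):
    raise LookupTimeout()


def call_with_alarm(func, arg, timeout_secs):
    """Call func(arg); return its result, or None if SIGALRM fires first.

    Unix-only sketch; must be called from the main thread.
    """
    previous = signal.signal(signal.SIGALRM, _on_alarm)
    signal.setitimer(signal.ITIMER_REAL, timeout_secs)
    try:
        return func(arg)
    except LookupTimeout:
        return None
    finally:
        # Cancel any pending alarm and put the old handler back.
        signal.setitimer(signal.ITIMER_REAL, 0)
        if previous is not None:
            signal.signal(signal.SIGALRM, previous)
```

A call would look like `call_with_alarm(socket.gethostbyname, name, 8)`. It is shorter than the thread version, but it is tied to Unix signals and to the main thread, which is the portability concern raised above.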
     
    bryanjugglercryptographer, Aug 15, 2005
    #10
  11. Steve Holden

    Steve Holden Guest

    I don't believe that gethostbyname()'s use of socket technology can be
    expected to raise socket timeout exceptions, since in general it's a
    call to a library that uses standard system calls. This would at least
    explain the behaviour you were seeing.

    It might just be easier to do the DNS work yourself using the rather
    nifty "dnspython" module. This does allow you to easily implement
    timeouts for specific interactions.
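
A sketch of what that might look like with a current dnspython (2.x, where `Resolver.resolve` replaced the older `query` method), in Python 3. The function names `dnsbl_name` and `is_listed` and the 3-second default are assumptions for illustration, not anything from the posts above.

```python
def dnsbl_name(ip, zone):
    """Build the blacklist query name: reversed IPv4 octets, then the zone."""
    return ".".join(reversed(ip.split("."))) + "." + zone


def is_listed(ip, zone, timeout_secs=3.0):
    """Return True if listed, False if not, None if no usable answer in time."""
    # Imported here so dnsbl_name() stays usable without dnspython installed.
    import dns.exception
    import dns.resolver

    resolver = dns.resolver.Resolver()
    resolver.timeout = timeout_secs   # how long to wait on each nameserver
    resolver.lifetime = timeout_secs  # total time budget for the whole query
    try:
        answer = resolver.resolve(dnsbl_name(ip, zone), "A")
    except dns.resolver.NXDOMAIN:
        # The name does not exist, so the address is not on this list.
        return False
    except (dns.exception.Timeout,
            dns.resolver.NoNameservers,
            dns.resolver.NoAnswer):
        return None
    return any(rdata.address.startswith("127.") for rdata in answer)
```

Because the query goes through dnspython's own sockets rather than the OS resolver, the `lifetime` setting bounds how long each blacklist check can take, which is the limit the original script was after.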

    regards
    Steve
     
    Steve Holden, Aug 17, 2005
    #11
