socket setdefaulttimeout

Sheila King

I'm doing DNS lookups on common spam blacklists (such as SpamCop and
others) in an email filtering script. Because the DNS server that is
resolving the lookups can sometimes go down, it is important to make sure
that the socket doesn't just hang there waiting for a response.

After a recent system upgrade to Python 2.4.1 (from 2.2.2) I thought I
could take advantage of the setdefaulttimeout in the socket module, to
limit the amount of time the sockets take for a lookup.

As a test, I set the default timeout ridiculously low. But it doesn't
seem to be having any effect. The sockets will take several seconds to
make the connection and complete the lookups, even though I've set the
timeout to millionths of a second, which I thought would ensure a
timeout (sample script below).

Am I doing something wrong? Do I misunderstand something? Is what I
want to do simply not possible?

Thanks for any tips. Example code follows signature...

--
Sheila King
http://www.thinkspot.net/sheila/

#!/usr/local/bin/python2.4
import socket
import sys
from time import time, asctime, localtime

socket.setdefaulttimeout(.00001)
debugfile = "socketdebug.txt"


def debug(data):
    timestamp = str(asctime(localtime(time())))
    try:
        f = open(debugfile, 'a')
        f.write('\n*** %s ***\n' % timestamp)
        f.write('%s\n' % data)  # 24-Dec-2003 -ctm- removed one linefeed
        f.close()
    except IOError:
        pass
        # do nothing if the file cannot be opened


IPaddy = '220.108.204.114'

if IPaddy:
    IPquads = IPaddy.split('.')
    IPquads.reverse()
    reverseIP = '.'.join(IPquads)

bl_list = { 'bl.spamcop.net' : 'IP Address %s Rejected - see: http://spamcop.net/bl.shtml' % IPaddy,
            'relays.ordb.org' : 'IP Address %s Rejected - see: http://ordb.org/' % IPaddy,
            'list.dsbl.org' : 'IP Address %s Rejected - see: http://dsbl.org' % IPaddy}

timing_done = 0
start_time = time()
for host in bl_list.keys():
    if host in bl_list.keys():
        IPlookup = "%s.%s" % (reverseIP, host)
        try:
            debug(" IPlookup=%s=" % IPlookup)
            resolvesIP = socket.gethostbyname(IPlookup)
            debug(" resolvesIP=%s=" % resolvesIP)
            if resolvesIP.startswith('127.'):
                end_time = time()
                elapsed_time = end_time - start_time
                timing_done = 1
                debug("Time elapsed for rDNS on bl_list: %f secs" % elapsed_time)
                debug("exiting--SPAM! id'd by %s" % host)
                print bl_list[host]
                sys.exit(0)
        except socket.gaierror:
            pass
if not timing_done:
    end_time = time()
    elapsed_time = end_time - start_time
    debug("2nd try:Time elapsed for rDNS on bl_list: %f secs" % elapsed_time)
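
For readers following the script: the name it looks up is the IP address with its four octets reversed, followed by the blacklist zone. A minimal sketch of just that step, written for current Python 3 rather than the 2.4 used above; the helper name dnsbl_query_name is made up for illustration and is not part of the original script.

```python
def dnsbl_query_name(ip_address, zone):
    """Build the hostname to look up for an IPv4 address in a DNS blacklist zone.

    Hypothetical helper: it mirrors the split/reverse/join steps of the
    script above and does no network access itself.
    """
    octets = ip_address.split(".")
    if len(octets) != 4:
        raise ValueError("expected a dotted-quad IPv4 address: %r" % ip_address)
    # 220.108.204.114 is queried as 114.204.108.220.<zone>
    return ".".join(reversed(octets)) + "." + zone


query = dnsbl_query_name("220.108.204.114", "bl.spamcop.net")
```

The resulting name is what gets passed to socket.gethostbyname; in the script above, an answer starting with 127. is treated as "listed".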
 
Sheila King

I do note that the setdefaulttimeout is accomplishing something in my
full program.

I am testing some error handling in the code at the moment, and am
raising an exception to make the code go into the "except" blocks...

The part that sends an error email notice bombed due to socket timeout.
(well, until I raised the timeout to 3 seconds...)

The last time I asked about this topic in the newsgroups...was a while
back (like 2 or so years) and someone said that because the socket
function that I'm trying to use is coded in C, rather than Python, that
I could not use the timeoutsocket module (which was the only way prior
to Python 2.3 to set timeouts on sockets).

I wonder...is it possible this applies to the particular timeouts I'm
trying to enforce on the "gethostbyname" DNS lookups?

Insights appreciated....
 
Bryan Olson

Sheila said:
> I'm doing DNS lookups [...] it is important to make sure
> that the socket doesn't just hang there waiting for a response.
>
> After a recent system upgrade to Python 2.4.1 (from 2.2.2) I thought I
> could take advantage of the setdefaulttimeout in the socket module, to
> limit the amount of time the sockets take for a lookup.
>
> As a test, I set the default timeout ridiculously low. But it doesn't
> seem to be having any effect.

The timeout applies to network communication on that socket, but
not to calls such as socket.gethostbyname. The gethostbyname
function actually goes to the operating system, which can look
up the name in a cache, or a hosts file, or query DNS servers
on sockets of its own.

Modern OS's generally have reasonable TCP/IP implementations,
and the OS will handle applying a reasonable timeout. Still,
gethostbyname and its brethren can be a pain for single-threaded
event-driven programs, because they can block for
significant time.

Under some older threading systems, any system call would block
every thread in the process, and gethostbyname was notorious for
holding things up. Some systems offer an asynchronous
gethostbyname, but that doesn't help users of Python's library.
Some programmers would keep around a few extra processes to
handle their hosts lookups. Fortunately, threading systems are
now much better, and should only block the thread waiting for
gethostbyname.
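
A small sketch of the distinction described above, written for current Python 3 (not the 2.4 in the original post). It assumes "localhost" resolves on the machine running it. The default timeout is picked up by newly created socket objects, while gethostbyname takes no socket object and so never consults that setting.

```python
import socket

socket.setdefaulttimeout(0.00001)       # the value used in the test script

# New socket objects pick up the default timeout...
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
applied_timeout = sock.gettimeout()
sock.close()

# ...but gethostbyname() is handed to the OS resolver and does not use
# a Python socket object, so the default timeout is not consulted.
timed_out = False
try:
    address = socket.gethostbyname("localhost")   # assumption: resolvable locally
except socket.timeout:
    timed_out = True
except (socket.gaierror, socket.herror):
    address = None                      # name not resolvable on this machine

socket.setdefaulttimeout(None)          # restore the no-timeout default
```

With a timeout this small, a connect() or recv() on sock would be expected to fail almost immediately, whereas the lookup simply runs for as long as the resolver takes.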
 
Sheila King

Bryan Olson wrote:
> Sheila King wrote:
>> I'm doing DNS lookups [...] it is important to make sure that the socket
>> doesn't just hang there waiting for a response.
>>
>> After a recent system upgrade to Python 2.4.1 (from 2.2.2) I thought I
>> could take advantage of the setdefaulttimeout in the socket module, to
>> limit the amount of time the sockets take for a lookup.
>>
>> As a test, I set the default timeout ridiculously low. But it doesn't
>> seem to be having any effect.
>
> The timeout applies to network communication on that socket, but not to
> calls such as socket.gethostbyname. The gethostbyname function actually
> goes to the operating system, which can look up the name in a cache, or a
> hosts file, or query DNS servers on sockets of its own.
>
> Modern OS's generally have reasonable TCP/IP implementations, and the OS
> will handle applying a reasonable timeout. Still, gethostbyname and its
> brethren can be a pain for single-threaded event-driven programs, because
> they can block for significant time.
>
> Under some older threading systems, any system call would block every
> thread in the process, and gethostbyname was notorious for holding things
> up. Some systems offer an asynchronous gethostbyname, but that doesn't
> help users of Python's library. Some programmers would keep around a few
> extra processes to handle their hosts lookups. Fortunately, threading
> systems are now much better, and should only block the thread waiting for
> gethostbyname.

Thanks, Bryan. I'm not doing any threading. But we are running this script on
incoming email as it arrives at the SMTP server, and scripts have a 16 second
max time of execution. Generally they run in much less time. However, we have
seen incidents where, due to issues with the DNS servers for the blacklists,
the script exceeded its max time to run and the process was killed by
the OS. This results in the email being placed back into the mail queue for
attempted re-delivery later. Of course, if this issue goes undetected, the
mail can eventually be "returned to sender". There's no effective way to check
from within the running filter script that the time is not exceeded if the
gethostbyname blocks and doesn't return. :(

As I said, normally this isn't a problem. But there have been a handful of
incidents where it did cause issues briefly over a few days. I was hoping to
address it. :/

Sounds like I'm out of luck.
 
Bryan Olson

Sheila said:
> Bryan Olson wrote: [...]
>>Under some older threading systems, any system call would block every
>>thread in the process, and gethostbyname was notorious for holding things
>>up. Some systems offer an asynchronous gethostbyname, but that doesn't
>>help users of Python's library. Some programmers would keep around a few
>>extra processes to handle their hosts lookups. Fortunately, threading
>>systems are now much better, and should only block the thread waiting for
>>gethostbyname.
>
>
> Thanks, Bryan. I'm not doing any threading. But we are running this script on
> incoming email as it arrives at the SMTP server, and scripts have a 16 second
> max time of execution. Generally they run in much less time. However, we have
> seen incidents where, due to issues with the DNS servers for the blacklists,
> the script exceeded its max time to run and the process was killed by
> the OS. This results in the email being placed back into the mail queue for
> attempted re-delivery later. Of course, if this issue goes undetected, the
> mail can eventually be "returned to sender". There's no effective way to check
> from within the running filter script that the time is not exceeded if the
> gethostbyname blocks and doesn't return. :(
>
> As I said, normally this isn't a problem. But there have been a handful of
> incidents where it did cause issues briefly over a few days. I was hoping to
> address it. :/
>
> Sounds like I'm out of luck.

The separate thread-or-process trick should work. Start a daemon
thread to do the gethostbyname, and have the main thread give up
on the check if the daemon thread doesn't report (via a lock or
another socket) within, say, 8 seconds.

If you have decent thread support, you might do it as follows.
(Obviously I didn't have time to test this well.)



from threading import Thread
from Queue import Queue, Empty
import socket


def start_deamon_thread(func):
    """ Run func -- a callable of zero args -- in a daemon thread.
    """
    thread = Thread(target = func)
    thread.setDaemon(True)
    thread.start()

def gethostbyname_or_timeout(hostname, timeout_secs = 8):
    """ Return IP address from gethostbyname, or None on timeout.
    """
    queue = Queue(1)

    def attempt_ghbn():
        queue.put(socket.gethostbyname(hostname))

    start_deamon_thread(attempt_ghbn)
    try:
        result = queue.get(block = True, timeout = timeout_secs)
    except Empty:
        result = None
    return result
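
For anyone trying this on a current interpreter, here is a rough Python 3 adaptation of the same idea, offered as a sketch rather than a tested replacement. Two things are additions and not part of the code above: the resolver parameter (hypothetical, so a stand-in lookup can be swapped in for testing) and the forwarding of lookup errors to the caller. In the code above, a failed lookup leaves the queue empty, so the caller waits out the full timeout and gets None.

```python
import queue
import socket
import threading
import time


def gethostbyname_or_timeout(hostname, timeout_secs=8.0,
                             resolver=socket.gethostbyname):
    """Return the resolver's answer, or None if it takes longer than timeout_secs.

    `resolver` is a hypothetical hook added for this sketch.
    """
    results = queue.Queue(1)

    def attempt():
        try:
            results.put(("ok", resolver(hostname)))
        except OSError as exc:           # socket.gaierror is an OSError subclass
            results.put(("error", exc))

    # Daemon thread: a lookup that never returns will not keep the process alive.
    threading.Thread(target=attempt, daemon=True).start()
    try:
        status, value = results.get(timeout=timeout_secs)
    except queue.Empty:
        return None                      # gave up waiting; the thread may still be blocked
    if status == "error":
        raise value
    return value


def slow_resolver(hostname):
    # Stand-in for a blacklist DNS server that is not answering.
    time.sleep(1.0)
    return "127.0.0.2"


def fast_resolver(hostname):
    return "127.0.0.2"


def failing_resolver(hostname):
    raise socket.gaierror("name not known")
```

Note that the timeout only bounds how long the caller waits. The blocked lookup itself is not cancelled; it is simply abandoned in its daemon thread.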
 
Sheila King

Bryan: Thanks for the tips/suggestion.

I will definitely look into that. (It will be my first foray into
coding with threads...I do appreciate that you've laid a great deal of
it out. I will certainly refer to my references and do substantial
testing on this...)

Thanks!
 
Dennis Lee Bieber

> The timeout applies to network communication on that socket, but
> not to calls such as socket.gethostbyname. The gethostbyname
> function actually goes to the operating system, which can look
> up the name in a cache, or a hosts file, or query DNS servers
> on sockets of its own.
>
> Modern OS's generally have reasonable TCP/IP implementations,
> and the OS will handle applying a reasonable timeout. Still,
> gethostbyname and its brethren can be a pain for single-threaded
> event-driven programs, because they can block for
> significant time.

I've got the opposite problem -- I'm on a dial-up (well, for a few
more weeks -- until the DSL gear arrives). For some reason DNS lookups
seem to be low priority and, if I'm downloading messages in Agent (from
three servers yet) and email (Eudora), Firefox often comes back with a
"host not found"; enter the same URL/bookmark again, and it finds the
page. It seems my system times out DNS requests when 1) I have a lot of
traffic on the dial-up connection, 2) the request may need to be
traversed (not cached on Earthlink)

John Machin

Dennis said:
> I've got the opposite problem -- I'm on a dial-up (well, for a few
> more weeks -- until the DSL gear arrives). For some reason DNS lookups
> seem to be low priority and, if I'm downloading messages in Agent (from
> three servers yet) and email (Eudora), Firefox often comes back with a
> "host not found"; enter the same URL/bookmark again, and it finds the
> page. It seems my system times out DNS requests when 1) I have a lot of
> traffic on the dial-up connection, 2) the request may need to be
> traversed (not cached on Earthlink)

Dragging the thread off-topic a little:

I was having a similar DNS problem (host not found, try again
immediately, 2nd try successful) but even in a no-other-load situation,
and with this set of gear:

* ADSL with Netgear DG834G v2 "wireless ADSL firewall router"
* both hardwired and wireless LAN connection to router
* Firefox, Thunderbird [normally]
* IE6, Outlook Express [just to test if it was a Mozilla problem]
* Windows XP Professional SP2, Windows 2000 SP4

The problem stopped after I upgraded the router firmware from version
1.something to 2.something, but this may be yet another instance of the
"waved a dead chicken at the volcano" syndrome :)
 
Michael P. Soulier

> The separate thread-or-process trick should work. Start a daemon
> thread to do the gethostbyname, and have the main thread give up
> on the check if the daemon thread doesn't report (via a lock or
> another socket) within, say, 8 seconds.

Wouldn't an alarm be much simpler than a whole thread just for this?

Mike

--
Michael P. Soulier <[email protected]>
"Those who would give up essential liberty for temporary safety deserve
neither liberty nor safety." --Benjamin Franklin

 
bryanjugglercryptographer

Michael said:
> Wouldn't an alarm be much simpler than a whole thread just for this?

You mean a Unix-specific signal? If so that would be much less
portable. As for simpler, I'd have to see your code.
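
For comparison, a rough sketch of what the alarm approach could look like, in current Python 3. Nobody in the thread posted this; the names (LookupTimeout, call_with_alarm, slow_lookup) are invented for the example. It is Unix-only and must run in the main thread. The Python-level handler also only runs once control returns to the interpreter, so whether a lookup blocked inside the C resolver is cut short promptly depends on the platform. The demonstration therefore uses time.sleep as a stand-in for a hung lookup.

```python
import signal
import time


class LookupTimeout(Exception):
    """Raised by the SIGALRM handler when the time budget runs out."""


def _on_alarm(signum, frame):
    raise LookupTimeout()


def call_with_alarm(func, timeout_secs, *args):
    """Run func(*args); return None if SIGALRM fires first.

    Unix only, main thread only; signal.alarm takes whole seconds.
    """
    previous = signal.signal(signal.SIGALRM, _on_alarm)
    signal.alarm(timeout_secs)
    try:
        return func(*args)
    except LookupTimeout:
        return None
    finally:
        signal.alarm(0)                               # cancel any pending alarm
        signal.signal(signal.SIGALRM, previous)       # put the old handler back


def slow_lookup(name):
    # Hypothetical stand-in for a lookup that hangs.
    time.sleep(3)
    return "127.0.0.2"


def quick_lookup(name):
    return "127.0.0.2"
```

It is shorter than the thread version, but it claims the process-wide SIGALRM handler for the duration of the call, which matters if anything else in the program relies on alarms.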
 
Steve Holden

Sheila said:
> I'm doing DNS lookups on common spam blacklists (such as SpamCop and
> others) in an email filtering script. Because the DNS server that is
> resolving the lookups can sometimes go down, it is important to make sure
> that the socket doesn't just hang there waiting for a response.
>
> After a recent system upgrade to Python 2.4.1 (from 2.2.2) I thought I
> could take advantage of the setdefaulttimeout in the socket module, to
> limit the amount of time the sockets take for a lookup.
>
> As a test, I set the default timeout ridiculously low. But it doesn't
> seem to be having any effect. The sockets will take several seconds to
> make the connection and complete the lookups, even though I've set the
> timeout to millionths of a second, which I thought would ensure a
> timeout (sample script below).
>
> Am I doing something wrong? Do I misunderstand something? Is what I
> want to do simply not possible?
>
> Thanks for any tips. Example code follows signature...
>
> --
> Sheila King
> http://www.thinkspot.net/sheila/

I don't believe that gethostbyname()'s use of socket technology can be
expected to raise socket timeout exceptions, since in general it's a
call to a library that uses standard system calls. This would at least
explain the behaviour you were seeing.

It might just be easier to do the DNS work yourself using the rather
nifty "dnspython" module. This does allow you to easily implement
timeouts for specific interactions.

regards
Steve
 
