Python proxy checker: changing to a threaded version

elca · Dec 7, 2009

Hello all,

I have a Python proxy checker.

To speed up the checks, I decided to change it to a multithreaded version.

This is my first time using the thread module. I have tried several times to
convert the script to a threaded version

and have looked for a lot of information, but it is not so easy for a novice
Python programmer.

If anyone can help, I would really appreciate it!

Thanks in advance!

import urllib2, socket

socket.setdefaulttimeout(180)
# read the list of proxy IPs into proxyList, one entry per line
proxyList = open('listproxy.txt').read().splitlines()

def is_bad_proxy(pip):
    try:
        proxy_handler = urllib2.ProxyHandler({'http': pip})
        opener = urllib2.build_opener(proxy_handler)
        opener.addheaders = [('User-agent', 'Mozilla/5.0')]
        urllib2.install_opener(opener)
        req = urllib2.Request('http://www.yahoo.com')  # <--- check whether proxy is alive
        sock = urllib2.urlopen(req)
    except urllib2.HTTPError, e:
        print 'Error code: ', e.code
        return e.code
    except Exception, detail:
        print "ERROR:", detail
        return 1
    return 0

for item in proxyList:
    if is_bad_proxy(item):
        print "Bad Proxy", item
    else:
        print item, "is working"

r0g · Dec 7, 2009

The trick to threads is to create a subclass of threading.Thread, define
the 'run' function and call the 'start()' method. I find threading quite
generally useful so I created this simple generic function for running
things in threads...

def run_in_thread( func, func_args=[], callback=None, callback_args=[] ):
    import threading
    class MyThread ( threading.Thread ):
        def run ( self ):

            # Call function
            if function_args:
                result = function(*function_args)
            else:
                result = function()

            # Call callback
            if callback:
                if callback_args:
                    callback(result, *callback_args)
                else:
                    callback(result)

    MyThread().start()

You need to pass it a test function (+args) and, if you want to get a
result back from each thread, you also need to provide a callback
function (+args). The first parameter of the callback function receives
the result of the test function, so your callback would look something
like this...

def cb( result, item ):
    if result:
        print "Bad Proxy", item
    else:
        print item, "is working"

And your calling loop would be something like this...

for item in proxyList:
    run_in_thread( is_bad_proxy, func_args=[ item ], callback=cb,
                   callback_args=[ item ] )
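
One caveat when the check runs in several threads at once: install_opener replaces a single process-wide default opener, so concurrent threads could overwrite each other's proxy settings before urlopen is called. A minimal sketch of one way around this, assuming Python 3 (where urllib2 became urllib.request), is to call open() on the opener itself. The helper name make_opener and the url and timeout parameters are assumptions of this sketch, not part of the original script.

```python
import urllib.error
import urllib.request


def make_opener(proxy):
    """Build an opener that routes HTTP requests through `proxy`."""
    handler = urllib.request.ProxyHandler({"http": proxy})
    opener = urllib.request.build_opener(handler)
    opener.addheaders = [("User-agent", "Mozilla/5.0")]
    return opener


def is_bad_proxy(proxy, url="http://www.yahoo.com", timeout=30):
    """Return 0 if `url` was fetched through the proxy, else an error code."""
    opener = make_opener(proxy)
    try:
        # opener.open() uses this opener only; no global state is touched.
        with opener.open(url, timeout=timeout):
            pass
    except urllib.error.HTTPError as e:
        return e.code
    except Exception:
        return 1
    return 0
```

Each thread then works with its own opener object, so the result reported for a proxy should reflect that proxy rather than whichever one was installed last.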

Also, you might want to limit the number of concurrent threads so as not
to overload your system. One quick and dirty way to do this is...

import threading, time
# inside the calling loop, before each run_in_thread call:
if threading.activeCount() > 9: time.sleep(1)

Note, this is far from an exact method, but it works well enough for
one-off scripting use.
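
For a stricter cap than the sleep-based throttle, a semaphore can make the launching loop wait until a slot is actually free. This is a sketch of a different technique (threading.BoundedSemaphore), not the method described above, written for Python 3; the names MAX_THREADS and run_limited are hypothetical.

```python
import threading

MAX_THREADS = 10  # assumed cap; tune for your machine
_slots = threading.BoundedSemaphore(MAX_THREADS)


def run_limited(func, *args):
    """Start func(*args) in a thread, blocking while MAX_THREADS are busy."""
    _slots.acquire()  # waits here until a slot is free

    def worker():
        try:
            func(*args)
        finally:
            _slots.release()  # free the slot even if func raised

    thread = threading.Thread(target=worker)
    thread.start()
    return thread
```

The returned thread objects can be collected in a list and joined at the end, so the script only exits once every check has finished.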

Hope this helps.

Suggestions from hardcore Pythonistas on how to make my run_in_thread
function more elegant are quite welcome also.

Roger Heathcote

elca · Dec 7, 2009

Hello,

Thanks for your reply!
I will test it now and comment soon.
Thanks again.

Terry Reedy · Dec 7, 2009

r0g said:
The trick to threads is to create a subclass of threading.Thread, define
the 'run' function and call the 'start()' method. I find threading quite
generally useful so I created this simple generic function for running
things in threads...

Great idea. Thanks for posting this.

r0g said:
def run_in_thread( func, func_args=[], callback=None, callback_args=[] ):
    import threading
    class MyThread ( threading.Thread ):
        def run ( self ):

            # Call function
            if function_args:
                result = function(*function_args)
            else:
                result = function()

The check is not necessary. By design, f(*[]) == f().
The names do not match the parameter names ;=)

r0g said:
            # Call callback
            if callback:
                if callback_args:
                    callback(result, *callback_args)
                else:
                    callback(result)

Ditto. g(x,*[]) == g(x)

        def run(self):
            result = func(*func_args) # matching run_in_thread param names
            callback(result, *callback_args)

r0g said:
    MyThread().start()

This is one of the best uses I have seen for a nested class definition.

r0g said:
Suggestions from hardcore Pythonistas on how to make my run_in_thread
function more elegant are quite welcome also

I shortened it, at least.

Terry Jan Reedy
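
The unpacking behaviour noted above can be checked directly. A small sketch for Python 3, where f and g are placeholder functions that simply return the arguments they receive:

```python
def f(*args):
    return args


def g(x, *args):
    return (x,) + args


# Unpacking an empty sequence passes no extra arguments at all.
assert f(*[]) == f() == ()
assert g(1, *[]) == g(1) == (1,)

# A non-empty sequence is spread into positional arguments.
assert f(*[1, 2]) == f(1, 2) == (1, 2)
```

This is why the "if func_args:" and "if callback_args:" branches can be dropped without changing behaviour.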

r0g · Dec 7, 2009

Terry said:
The check is not necessary. by design, f(*[]) == f()

Excellent, thanks Terry

I knew it would be simpler than I thought! I've been writing a lot of
PHP and AS3 recently and it's easy to forget how python often just works
without needing the same level of hand holding, error checking &
defensive coding as other languages!

Terry said:
Names do not match param names ;=)

Oops yeah! Thought I'd refactor my painfully verbose variable names
before posting in a 70 char wide medium but it looks like I missed one!
*blush*

Roger.

r0g · Dec 7, 2009

<snipped cumbersome older version>

Okay, so here's the more concise version for posterity / future googlers...

import threading

def run_in_thread( func, func_args=[], callback=None, callback_args=[] ):
    class MyThread ( threading.Thread ):
        def run ( self ):
            result = func(*func_args)
            if callback:
                callback(result, *callback_args)
    MyThread().start()

Roger.
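
For readers on Python 3, here is a rough sketch of how the concise version might be used end to end. Several details are assumptions of this sketch rather than part of the posted code: it passes a plain function as the thread target instead of subclassing, returns the thread so the caller can join it, and uses a stand-in fake_check in place of the real network test. The names fake_check, record, BAD and the sample addresses are hypothetical.

```python
import threading


def run_in_thread(func, func_args=(), callback=None, callback_args=()):
    """Run func(*func_args) in a new thread, then pass its result on."""

    def worker():
        result = func(*func_args)
        if callback is not None:
            callback(result, *callback_args)

    thread = threading.Thread(target=worker)
    thread.start()
    return thread  # returned so the caller can join() it


BAD = {"10.0.0.2:8080"}  # made-up data for the stand-in check


def fake_check(proxy):
    # Stand-in for is_bad_proxy: non-zero means the proxy is bad.
    return 1 if proxy in BAD else 0


results = {}
results_lock = threading.Lock()


def record(result, proxy):
    # Callbacks run in worker threads, so guard the shared dict.
    with results_lock:
        results[proxy] = "bad" if result else "working"


proxies = ["10.0.0.1:8080", "10.0.0.2:8080"]
threads = [run_in_thread(fake_check, [p], record, [p]) for p in proxies]
for t in threads:
    t.join()  # wait for every check before reading the results
print(results)
```

Collecting results in a dict rather than printing from each callback keeps output from different threads from interleaving.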

r0g · Dec 7, 2009

Rhodri said:
I'm mighty wary of having mutable defaults for parameters. They make for
the most annoying errors. Even though they're actually safe here, I'd
still write:

def run_in_thread(func, func_args=(), callback=None, callback_args=()):

out of sheer paranoia.

Excellent point, thanks

I'm starting to suspect this is the highest quality group in all of usenet!

Roger.
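
The kind of error Rhodri is guarding against can be seen in a small sketch (Python 3). The functions append_bad and append_good are hypothetical and exist only to show the effect. run_in_thread itself only reads its default lists and never modifies them, which is why it is safe as posted.

```python
def append_bad(item, bucket=[]):
    # The default list is created once, when the function is defined.
    bucket.append(item)
    return bucket


def append_good(item, bucket=()):
    # A tuple default cannot be mutated; build a fresh list per call.
    return list(bucket) + [item]


first = append_bad("a")
second = append_bad("b")
assert first is second        # same list object both times
assert second == ["a", "b"]   # state leaked between calls

assert append_good("a") == ["a"]
assert append_good("b") == ["b"]  # no shared state
```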

Lie Ryan · Dec 8, 2009

Neat, but I think you mean

if callback is not None:
    callback(result, *callback_args)

for that last line.

How about:

import threading

def run_in_thread( func, func_args=[], callback=lambda r,*a: None,
                   callback_args=[] ):
    class MyThread ( threading.Thread ):
        def run ( self ):
            result = func(*func_args)
            callback(result, *callback_args)
    MyThread().start()

(and for me, I'd )

r0g · Dec 8, 2009

Cool, that's a neat trick I'd never have thought of. I think the 2 line
alternative might be a little more pythonic though, in terms of
readability & simplicity...

if callback:
    callback(result, *callback_args)

That could be because I'm not terribly au fait with the whole lambda
calculus thing though. What say those who are comfortable with it?
Obvious or oblique?

Roger.
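
On the "if callback:" versus "if callback is not None:" question raised earlier in the thread: for ordinary functions and lambdas the two behave the same, since function objects are always truthy. They differ only for a callable object that happens to be falsy. A contrived sketch for Python 3, in which the Collector class and both notify functions are hypothetical:

```python
class Collector:
    """A callable that also has a length, so it is falsy while empty."""

    def __init__(self):
        self.items = []

    def __len__(self):
        return len(self.items)

    def __call__(self, result):
        self.items.append(result)


def notify_truthy(callback, result):
    if callback:  # skips None, but also any falsy callable
        callback(result)


def notify_not_none(callback, result):
    if callback is not None:  # skips only a missing callback
        callback(result)


a, b = Collector(), Collector()
notify_truthy(a, 0)
notify_not_none(b, 0)
assert a.items == []   # the empty Collector was falsy, so never called
assert b.items == [0]  # called as intended
```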
