Using a proxy with urllib2

Discussion in 'Python' started by Jack, Jan 10, 2008.

  1. Jack

    Jack Guest

    I'm trying to use a proxy server with urllib2.
    So I have managed to get it to work by setting the environment
    variable:
    export HTTP_PROXY=127.0.0.1:8081

    But I wanted to set it from the code. However, this does not set the proxy:
    httpproxy = '127.0.0.1:3129'
    proxy_support = urllib2.ProxyHandler({"http":"http://" + httpproxy})
    opener = urllib2.build_opener(proxy_support, urllib2.HTTPHandler)
    urllib2.install_opener(opener)
    I'm using it from a web.py URL handler file, not sure if it matters.

    I have another question though. It seems that using either of the
    methods above, the proxy will be global. What if I want to use
    a proxy with one site, but not with another site? Or even use a
    proxy for some URLs but not others? The proxy having to be global
    is really not convenient. Is there any way to do per-fetch proxy?
     
    Jack, Jan 10, 2008
    #1

  2. Jack

    Rob Wolfe Guest

    "Jack" <> writes:

    > I'm trying to use a proxy server with urllib2.
    > So I have managed to get it to work by setting the environment
    > variable:
    > export HTTP_PROXY=127.0.0.1:8081
    >
    > But I wanted to set it from the code. However, this does not set the proxy:
    > httpproxy = '127.0.0.1:3129'
    > proxy_support = urllib2.ProxyHandler({"http":"http://" + httpproxy})
    > opener = urllib2.build_opener(proxy_support, urllib2.HTTPHandler)
    > urllib2.install_opener(opener)


    Works for me.
    How do you know that the proxy is not set?
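
    One way to check without sending any traffic is to look at what the
    opener actually holds. The sketch below is only an illustration: the
    helper names are hypothetical, `opener.handlers` is an implementation
    attribute rather than a documented API, and the try/except import is
    there so the same code also runs on Python 3, where urllib2 became
    urllib.request.

```python
try:                      # Python 2, as used in this thread
    import urllib2
    from urllib import getproxies
except ImportError:       # Python 3: urllib2 was merged into urllib.request
    import urllib.request as urllib2
    from urllib.request import getproxies


def handler_names(opener):
    """Return the class names of the handlers registered on an opener.

    Assumption: relies on the `handlers` list kept by OpenerDirector,
    which is an implementation detail.
    """
    return [type(h).__name__ for h in opener.handlers]


def describe_proxy_setup(opener):
    """Collect proxy-related facts that are visible without a network call."""
    return {
        # Proxies picked up from environment variables such as http_proxy
        # (or from system settings on some platforms).
        "environment": getproxies(),
        # True when a ProxyHandler with at least one proxy is registered.
        "has_proxy_handler": "ProxyHandler" in handler_names(opener),
    }


if __name__ == "__main__":
    proxy_support = urllib2.ProxyHandler({"http": "http://127.0.0.1:3129"})
    opener = urllib2.build_opener(proxy_support, urllib2.HTTPHandler)
    print(describe_proxy_setup(opener))
```

    If "has_proxy_handler" is true for the opener you built but requests
    still bypass the proxy, it is worth checking whether that opener is
    really the one being used for the fetch.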

    > I'm using it from a web.py URL handler file, not sure if it matters.


    I don't think so.

    > I have another question though. It seems that using either of the
    > methods above, the proxy will be global. What if I want to use
    > a proxy with one site, but not with another site? Or even use a
    > proxy for some URLs but not others? The proxy having to be global
    > is really not convenient. Is there any way to do per-fetch proxy?


    Try this:

    <code>
    import urllib2

    def getopener(proxy=None):
        # Build a private opener; nothing is installed globally.
        opener = urllib2.build_opener(urllib2.HTTPHandler)
        if proxy:
            proxy_support = urllib2.ProxyHandler({"http": "http://" + proxy})
            opener.add_handler(proxy_support)
        return opener

    def fetchurl(url, opener):
        # Fetch through whichever opener the caller passes in.
        f = opener.open(url)
        data = f.read()
        f.close()
        return data

    print fetchurl('http://www.python.org', getopener('127.0.0.1:8081'))
    </code>

    HTH,
    Rob
     
    Rob Wolfe, Jan 10, 2008
    #2

  3. Jack

    Jack Guest


    > Works for me.
    > How do you know that the proxy is not set?


    The proxy drops some URLs and the URLs were not being dropped when I did
    this :)

    > Try this:


    Thank you. I'll give it a try.
     
    Jack, Jan 11, 2008
    #3
  4. Jack

    Jack Guest

    Rob,

    I tried your code snippet and it worked great. I'm just wondering if the
    getopener() call is lightweight, so that I can just call it in every call
    to fetchurl()? Or should I try to share the opener object among
    fetchurl() calls?

    Thanks,
    Jack


    "Rob Wolfe" <> wrote in message
    news:...
    > Try this:
    >
    > <code>
    > import urllib2
    >
    > def getopener(proxy=None):
    >     opener = urllib2.build_opener(urllib2.HTTPHandler)
    >     if proxy:
    >         proxy_support = urllib2.ProxyHandler({"http": "http://" + proxy})
    >         opener.add_handler(proxy_support)
    >     return opener
    >
    > def fetchurl(url, opener):
    >     f = opener.open(url)
    >     data = f.read()
    >     f.close()
    >     return data
    >
    > print fetchurl('http://www.python.org', getopener('127.0.0.1:8081'))
    > </code>
    >
    > HTH,
    > Rob
     
    Jack, Jan 11, 2008
    #4
  5. Jack

    Rob Wolfe Guest

    "Jack" <> writes:

    > Rob,
    >
    > I tried your code snippet and it worked great. I'm just wondering if the
    > getopener() call is lightweight, so that I can just call it in every call
    > to fetchurl()? Or should I try to share the opener object among
    > fetchurl() calls?


    Creating an opener for every URL is not a reasonable way to go.
    Sharing is the better approach. In your case you might create,
    for example, two instances: simple_opener and proxy_opener.
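
    A minimal sketch of that idea, assuming the choice between the two
    openers can be made from the host name. PROXIED_HOSTS and opener_for
    are hypothetical names used only for illustration, and the proxy
    address is the one quoted earlier in the thread. Passing an empty
    dictionary to ProxyHandler disables autodetected proxies, so
    simple_opener ignores any HTTP_PROXY setting in the environment.

```python
try:                      # Python 2, as used in this thread
    import urllib2
    from urlparse import urlparse
except ImportError:       # Python 3 equivalents
    import urllib.request as urllib2
    from urllib.parse import urlparse

PROXY = "127.0.0.1:8081"                      # address from the thread
PROXIED_HOSTS = frozenset(["www.python.org"])  # hypothetical host list

# Both openers are created once at import time and then shared.
simple_opener = urllib2.build_opener(urllib2.ProxyHandler({}))
proxy_opener = urllib2.build_opener(
    urllib2.ProxyHandler({"http": "http://" + PROXY}))


def opener_for(url):
    """Pick the shared opener for a URL based on its host name."""
    host = urlparse(url).hostname
    if host in PROXIED_HOSTS:
        return proxy_opener
    return simple_opener


def fetchurl(url):
    """Fetch a URL through whichever shared opener applies to it."""
    f = opener_for(url).open(url)
    try:
        return f.read()
    finally:
        f.close()
```

    Because neither opener is passed to install_opener, plain
    urllib2.urlopen calls elsewhere in the program are not affected.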

    Regards,
    Rob
     
    Rob Wolfe, Jan 11, 2008
    #5
  6. Jack

    Jack Guest


    >> I'm trying to use a proxy server with urllib2.
    >> So I have managed to get it to work by setting the environment
    >> variable:
    >> export HTTP_PROXY=127.0.0.1:8081
    >>
    >> But I wanted to set it from the code. However, this does not set the
    >> proxy:
    >> httpproxy = '127.0.0.1:3129'
    >> proxy_support = urllib2.ProxyHandler({"http":"http://" + httpproxy})
    >> opener = urllib2.build_opener(proxy_support, urllib2.HTTPHandler)
    >> urllib2.install_opener(opener)


    I found out why it doesn't work in my code, but I don't have a solution.
    Somewhere else, the code calls these two lines:

    opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
    urllib2.install_opener(opener)

    and they override the proxy opener. Could anyone tell me how to use both
    openers?
     
    Jack, Jan 12, 2008
    #6
  7. Jack

    Rob Wolfe Guest

    "Jack" <> writes:

    >>> I'm trying to use a proxy server with urllib2.
    >>> So I have managed to get it to work by setting the environment
    >>> variable:
    >>> export HTTP_PROXY=127.0.0.1:8081
    >>>
    >>> But I wanted to set it from the code. However, this does not set the
    >>> proxy:
    >>> httpproxy = '127.0.0.1:3129'
    >>> proxy_support = urllib2.ProxyHandler({"http":"http://" + httpproxy})
    >>> opener = urllib2.build_opener(proxy_support, urllib2.HTTPHandler)
    >>> urllib2.install_opener(opener)

    >
    > I found out why it doesn't work in my code, but I don't have a solution.
    > Somewhere else, the code calls these two lines:
    >
    > opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
    > urllib2.install_opener(opener)
    >
    > and they override the proxy opener. Could anyone tell me how to use both
    > openers?
    >


    You don't have to create another opener if you only want to add
    a handler. You can use the `add_handler` method on the opener
    you already have, e.g.:
    opener.add_handler(urllib2.HTTPCookieProcessor(cj))
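
    If the existing opener is not easy to reach from the other module, an
    alternative is to build one opener with both handlers and install it
    once. The following is only a sketch under that assumption:
    build_shared_opener is a hypothetical helper, and the proxy address
    is the one from the original post.

```python
try:                      # Python 2, as used in this thread
    import urllib2
    import cookielib
except ImportError:       # Python 3 equivalents
    import urllib.request as urllib2
    import http.cookiejar as cookielib


def build_shared_opener(proxy, cookiejar):
    """Return a single opener that uses both the proxy and the cookie jar."""
    proxy_support = urllib2.ProxyHandler({"http": "http://" + proxy})
    cookies = urllib2.HTTPCookieProcessor(cookiejar)
    return urllib2.build_opener(proxy_support, cookies)


cj = cookielib.CookieJar()
opener = build_shared_opener("127.0.0.1:3129", cj)

# Install once; a later install_opener call elsewhere would replace it again.
urllib2.install_opener(opener)
```

    The important part is that install_opener is called in one place
    only, since each call replaces the previously installed opener.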

    HTH,
    Rob
     
    Rob Wolfe, Jan 12, 2008
    #7