Re: http error 301 for urlopen

Discussion in 'Python' started by D'Arcy J.M. Cain, Nov 8, 2010.

  1. On Sun, 7 Nov 2010 19:30:23 -0600
    Wenhuan Yu wrote:
    > I tried to open a link with urlopen:
    >
    > import urllib2
    > alink = "http://feeds.nytimes.com/click.phdo?i=ff074d9e3895247a31e8e5efa5253183"
    > f = urllib2.urlopen(alink)
    > print f.read()
    >
    > and got the following error:
    >
    > urllib2.HTTPError: HTTP Error 301: The HTTP server returned a redirect error
    > that would lead to an infinite loop.
    > The last 30x error message was:
    > Moved Permanently
    >
    > I can open the link in a browser. Any way to solve this? Thanks.


    I checked with my tools and was told that it redirects more than five
    times. Maybe it's not infinite but too many for urlopen. Or, maybe
    the browser just ignores the extra redirects and the part of the page
    with the redirects isn't critical for viewing it. I think that you are
    going to have to investigate the HTML manually and follow all the
    individual links to find the problem. You may have to put in a bug
    request with the New York Times. Good luck with that.
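
    Before digging through the HTML, it may help to see how many hops the
    URL actually takes. The following is a minimal sketch, assuming Python 3,
    where urllib2's redirect handling lives in urllib.request. The class name
    LoggingRedirectHandler and its hops attribute are made up for this
    illustration; only redirect_request is part of the standard library.

```python
import urllib.request


class LoggingRedirectHandler(urllib.request.HTTPRedirectHandler):
    """Hypothetical helper: records every redirect the opener follows."""

    def __init__(self):
        super().__init__()
        self.hops = []  # list of (status_code, target_url) tuples

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # Note the hop, then let the stock handler build the follow-up request.
        self.hops.append((code, newurl))
        return super().redirect_request(req, fp, code, msg, headers, newurl)


def trace_redirects(url):
    """Open url and return the list of redirects seen, even on failure."""
    handler = LoggingRedirectHandler()
    opener = urllib.request.build_opener(handler)
    try:
        opener.open(url).close()
    except urllib.request.HTTPError:
        pass  # a redirect-limit error still leaves the hops recorded
    return handler.hops
```

    Calling trace_redirects() on the link from the original post should list
    each 30x response in order, which shows whether the chain is a true loop
    or just longer than the handler allows.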

    --
    D'Arcy J.M. Cain <> | Democracy is three wolves
    http://www.druid.net/darcy/ | and a sheep voting on
    +1 416 425 1212 (DoD#0082) (eNTP) | what's for dinner.
    D'Arcy J.M. Cain, Nov 8, 2010
    #1

  2. Nobody (Guest)

    On Sun, 07 Nov 2010 20:51:50 -0500, D'Arcy J.M. Cain wrote:

    >> urllib2.HTTPError: HTTP Error 301: The HTTP server returned a redirect error
    >> that would lead to an infinite loop.
    >> The last 30x error message was:
    >> Moved Permanently
    >>
    >> I can open the link in a browser. Any way to solve this? Thanks.

    >
    > I checked with my tools and was told that it redirects more than five
    > times. Maybe it's not infinite but too many for urlopen.


    The default value of urllib2.HTTPRedirectHandler.max_redirections is 10.
    Setting it to 11 allows the request to complete.
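
    A rough sketch of raising that limit follows. It assumes Python 3, where
    the same attribute is found on urllib.request.HTTPRedirectHandler; under
    Python 2 the equivalent names live in urllib2. The function name
    build_opener_with_redirect_limit is invented for this example.

```python
import urllib.request


def build_opener_with_redirect_limit(limit):
    """Return an opener whose redirect handler tolerates `limit` hops."""
    handler = urllib.request.HTTPRedirectHandler()
    # Setting the attribute on the instance shadows the class default (10)
    # without changing it for every other opener in the process.
    handler.max_redirections = limit
    # build_opener() drops its default redirect handler when given our own.
    return urllib.request.build_opener(handler)


opener = build_opener_with_redirect_limit(11)
# opener.open(alink) would then follow up to 11 distinct redirects.
```

    Changing the instance rather than the class keeps the higher limit local
    to this one opener.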
    Nobody, Nov 8, 2010
    #2

  3. John Nagle (Guest)

    On 11/7/2010 5:51 PM, D'Arcy J.M. Cain wrote:
    > On Sun, 7 Nov 2010 19:30:23 -0600
    > Wenhuan Yu wrote:
    >> I tried to open a link with urlopen:
    >>
    >> import urllib2
    >> alink = "http://feeds.nytimes.com/click.phdo?i=ff074d9e3895247a31e8e5efa5253183"
    >> f = urllib2.urlopen(alink)
    >> print f.read()
    >>
    >> and got the following error:
    >>
    >> urllib2.HTTPError: HTTP Error 301: The HTTP server returned a redirect error
    >> that would lead to an infinite loop.
    >> The last 30x error message was:
    >> Moved Permanently
    >>
    >> I can open the link in a browser. Any way to solve this? Thanks.

    >
    > I checked with my tools and was told that it redirects more than five
    > times. Maybe it's not infinite but too many for urlopen. Or, maybe
    > the browser just ignores the extra redirects and the part of the page
    > with the redirects isn't critical for viewing it. I think that you are
    > going to have to investigate the HTML manually and follow all the
    > individual links to find the problem. You may have to put in a bug
    > request with the New York Times. Good luck with that.


    It's the New York Times' paywall. They're trying to set a cookie,
    and will redirect the URL until you store and return the cookie.
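
    If that is the cause, an opener that stores and returns cookies should
    break the cycle. Here is a minimal sketch, assuming Python 3 (under
    Python 2 the counterparts are cookielib.CookieJar and
    urllib2.HTTPCookieProcessor). The function name build_cookie_opener is
    made up for illustration, and whether the site accepts the result is
    not something this sketch can promise.

```python
import http.cookiejar
import urllib.request


def build_cookie_opener():
    """Return (opener, jar); the opener keeps cookies across redirects."""
    jar = http.cookiejar.CookieJar()
    # HTTPCookieProcessor saves Set-Cookie headers from each response into
    # the jar and sends matching cookies back on every later request,
    # including the requests generated by redirects.
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


opener, jar = build_cookie_opener()
# opener.open(alink) would then carry the cookie through the redirect chain;
# iterating over jar afterwards shows which cookies the server set.
```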

    John Nagle
    John Nagle, Nov 8, 2010
    #3
  4. In message <4cd7987e$0$1674$>, John Nagle wrote:

    > It's the New York Times' paywall. They're trying to set a cookie,
    > and will redirect the URL until you store and return the cookie.


    And if they find out you’re accessing them from a script, they’ll probably
    try to find a way to block that as well.
    Lawrence D'Oliveiro, Nov 9, 2010
    #4
  5. On Tuesday 09 November 2010, 03:10:24 Lawrence D'Oliveiro wrote:
    > In message <4cd7987e$0$1674$>, John Nagle wrote:
    > > It's the New York Times' paywall. They're trying to set a
    > > cookie, and will redirect the URL until you store and return the
    > > cookie.

    >
    > And if they find out you’re accessing them from a script, they’ll
    > probably try to find a way to block that as well.


    ...which could be alleviated by carefully crafting the requests ;-)

    Luckily, unpleasant related ground work was already done by others,
    e.g.: http://bugs.python.org/issue2275

    Pete
    Hans-Peter Jansen, Nov 10, 2010
    #5