Getting final url when original url redirects

IanR

I'm processing RSS content from a number of given sources. Most of the
time the URL given by the RSS feed redirects to the real URL (I'm
guessing they do this for tracking purposes).

For example:

This is a URL that I get from an RSS feed,
http://www.pheedcontent.com/click.phdo?i=d22e9bc7641aab8a0566526f61806512
It redirects to
http://www.macsimumnews.com/index.php/archive/klipsch_developing_headphones_for_new_ipod_shuffle/

I want to record the final URL and not the URL I get from the RSS feed.
(However, sometimes there is no redirect, so I might want the original
URL.)

I've tried sniffing the header and don't see any "Location:"... I
think sites are using different ways to redirect. Does anyone have
any suggestions on how I might handle this?
 
Philip Semanchuk


Hi Ian,
Using Firefox's Live HTTP Headers extension, I see a 302 redirect with
a Location header (see session log below). Are you aware that urllib2
resolves redirects for you? That might be why you're not seeing what
you expect. If you want a record of each URL you'll have to implement
an HTTPRedirectHandler.
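
A minimal sketch of such a handler, assuming Python 3, where urllib2's classes live in urllib.request. The names RecordingRedirectHandler and fetch_with_history are made up for illustration and are not part of the library.

```python
import urllib.request


class RecordingRedirectHandler(urllib.request.HTTPRedirectHandler):
    """Remember every URL that a redirect response points to."""

    def __init__(self):
        self.hops = []

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # urllib calls this for each redirect response that carries a
        # Location header; newurl has already been made absolute.
        self.hops.append((code, newurl))
        return super().redirect_request(req, fp, code, msg, headers, newurl)


def fetch_with_history(url, timeout=10):
    """Return (final_url, hops); hops is empty when nothing redirected."""
    handler = RecordingRedirectHandler()
    # Passing our subclass replaces the default redirect handler.
    opener = urllib.request.build_opener(handler)
    with opener.open(url, timeout=timeout) as response:
        return response.geturl(), handler.hops
```

Calling fetch_with_history() on the feed URL should give the final URL plus one (status code, target URL) pair per redirect that was followed.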



http://www.pheedcontent.com/click.phdo?i=d22e9bc7641aab8a0566526f61806512

GET /click.phdo?i=d22e9bc7641aab8a0566526f61806512 HTTP/1.1
Host: www.pheedcontent.com
User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.0.7) Gecko/2009021906 Firefox/3.0.7
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.7,sv;q=0.3
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive

HTTP/1.x 302 Found
Date: Thu, 12 Mar 2009 20:41:29 GMT
Server: Apache
X-Powered-By: PHP/5.2.3-1ubuntu6.3
Pragma: no-cache
Cache-Control: no-cache, must-revalidate
Set-Cookie: phdo=1-tst%7Cv3%3Ac3cbcae440ff783381d0d9fa96f14d05%3Aa8t5sELbkk9oy3pXsrohSnPslqQxQKIhVP%2F8Ots%3D; expires=Fri, 13-Mar-2009 20:41:29 GMT; path=/; domain=pheedo.com
Location: http://www.macsimumnews.com/index.php/archive/klipsch_developing_headphones_for_new_ipod_shuffle/
Content-Encoding: gzip
Vary: Accept-Encoding
Content-Length: 26
Connection: close
Content-Type: text/html
 
Albert Hopkins

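If you are using urllib[2], the response object's geturl() method gives
you the URL that was actually retrieved, after any redirects. For your
example it returns:
'http://www.macsimumnews.com/index.php/archive/klipsch_developing_headphones_for_new_ipod_shuffle/'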
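
A minimal sketch of how that result can be obtained, assuming Python 3, where urllib2's urlopen lives in urllib.request. The helper name final_url and the timeout value are illustrative, not from the thread.

```python
import urllib.request


def final_url(url, timeout=10):
    """Return the URL that was actually fetched, after any redirects."""
    # urlopen follows HTTP redirects on its own; geturl() then reports
    # the address of the resource that was finally retrieved.
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.geturl()
```

When there is no redirect, geturl() simply returns the original URL, so the same call covers both cases from the question.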
 
