open-uri can read Slashdot's RSS, but not Digg's RSS?

aktxyz

Here's an irb session. This used to work; maybe something has changed
in Digg's RSS feed?

irb(main):001:0> require 'open-uri'
=> true
irb(main):002:0> open('http://rss.slashdot.org/Slashdot/slashdot').readlines.length
=> 263
irb(main):003:0> open('http://www.digg.com/rss/containertechnology.xml').readlines.length
/usr/lib/ruby/1.8/timeout.rb:54:in `rbuf_fill': execution expired (Timeout::Error)
from /usr/lib/ruby/1.8/timeout.rb:56:in `timeout'
from /usr/lib/ruby/1.8/timeout.rb:76:in `timeout'
from /usr/lib/ruby/1.8/net/protocol.rb:132:in `rbuf_fill'
from /usr/lib/ruby/1.8/net/protocol.rb:116:in `readuntil'
from /usr/lib/ruby/1.8/net/protocol.rb:126:in `readline'
from /usr/lib/ruby/1.8/net/http.rb:1988:in `read_status_line'
from /usr/lib/ruby/1.8/net/http.rb:1977:in `read_new'
from /usr/lib/ruby/1.8/net/http.rb:1046:in `request'
... 8 levels...
from /usr/lib/ruby/1.8/open-uri.rb:86:in `open'
from (irb):3:in `irb_binding'
from /usr/lib/ruby/1.8/irb/workspace.rb:52:in `irb_binding'
from /usr/lib/ruby/1.8/irb/workspace.rb:52


This happens every time!
 
aktxyz

Digg wants a specific user-agent, like the one my Firefox sends. Why in
the world would they do that? What or whom are they hoping to keep out?
Do they really want only browsers connecting to their feeds?
 
John Joyce

They may just be keeping track of which user-agents are being used,
though browser sniffing is almost pointless.
It should be easy enough to give it whatever string you like. The
"user-agent" is nothing but a string anyway, and almost meaningless,
because it isn't guaranteed to be true or to mean anything valid or
verifiable. (Consider Microsoft's early corruption of it: to circumvent
browser sniffing and blocking, it started its MSIE user-agent string
with Mozilla 4.x (compatible... ).)
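
For what it's worth, here is a rough sketch of passing your own string through open-uri, which treats string keys in its options hash as request headers. The feed_lines helper, its opener parameter, and the BROWSER_UA value are made up for illustration, and I haven't checked which strings Digg actually accepts. It assumes a Ruby recent enough to have URI.open; on 1.8 you would call plain open(url, 'User-Agent' => ...) instead.

```ruby
require 'open-uri'

# Hypothetical browser-like string; substitute whatever the server accepts.
BROWSER_UA = 'Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0'

# Fetch a URL and return its lines, sending a custom User-Agent header.
# open-uri interprets string keys in the options hash as HTTP request headers.
# `opener` is an illustrative hook so the helper can be tried without a network;
# by default it is open-uri's own URI.open.
def feed_lines(url, user_agent = BROWSER_UA, opener = URI.method(:open))
  opener.call(url, { 'User-Agent' => user_agent }) { |io| io.readlines }
end

# Usage (makes a real HTTP request, so it is left commented out):
# puts feed_lines('http://www.digg.com/rss/containertechnology.xml').length
```

If the request still times out with a browser-like string, the block is probably keyed on something other than the user-agent.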
 
