open-uri / net/http bug?

Discussion in 'Ruby' started by Dick Davies, Jun 5, 2004.

  1. Dick Davies

    Dick Davies Guest

    I was trying to use RSSscraper to pull some web forums, and something
    low-level went bang in the Net::* libraries.

    I found some old references to this error from last year, and I
    got the impression it was platform specific?

    Can anyone else let me know if this causes problems for them?

    It's obviously site-specific; url = 'http://www.google.com' has no problems...

    Here's the minimal code (the open(url) call is 'line 6' in the backtraces below):

    require 'open-uri'

    url = 'http://p218.ezboard.com/fdebatingukfrm9'
    page = open(url).readlines

    If I run this I get:

    rasputin@lb:rss$ ./regex.rb
    /data/ruby/lib/ruby/1.9/net/protocol.rb:135:in `sysread': End of file reached (EOFError)
    from /data/ruby/lib/ruby/1.9/net/protocol.rb:135:in `rbuf_fill'
    from /data/ruby/lib/ruby/1.9/net/protocol.rb:116:in `readuntil'
    from /data/ruby/lib/ruby/1.9/net/protocol.rb:126:in `readline'
    from /data/ruby/lib/ruby/1.9/net/http.rb:1850:in `read_status_line'
    from /data/ruby/lib/ruby/1.9/net/http.rb:1839:in `read_new'
    from /data/ruby/lib/ruby/1.9/net/http.rb:934:in `request'
    from /data/ruby/lib/ruby/1.9/net/http.rb:834:in `request_get'
    from /data/ruby/lib/ruby/1.9/open-uri.rb:545:in `proxy_open'
    ... 7 levels...
    from /data/ruby/lib/ruby/1.9/open-uri.rb:134:in `open_uri'
    from /data/ruby/lib/ruby/1.9/open-uri.rb:424:in `open'
    from /data/ruby/lib/ruby/1.9/open-uri.rb:85:in `open'
    from ./regex.rb:6

    This is exactly the error I was getting on the front of RSSscraper.
    If it helps narrow it down, through a proxy I get:

    rasputin@lb:rss$ ./regex.rb
    /data/ruby/lib/ruby/1.9/open-uri.rb:574:in `proxy_open': 503 Service Unavailable (OpenURI::HTTPError)
    from /data/ruby/lib/ruby/1.9/open-uri.rb:167:in `open_loop'
    from /data/ruby/lib/ruby/1.9/open-uri.rb:164:in `catch'
    from /data/ruby/lib/ruby/1.9/open-uri.rb:164:in `open_loop'
    from /data/ruby/lib/ruby/1.9/open-uri.rb:134:in `open_uri'
    from /data/ruby/lib/ruby/1.9/open-uri.rb:424:in `open'
    from /data/ruby/lib/ruby/1.9/open-uri.rb:85:in `open'
    from ./regex.rb:6

    --
    A general leading the State Department resembles a dragon commanding
    ducks.
    -- New York Times, Jan. 20, 1981
    Rasputin :: Jack of All Trades - Master of Nuns
     
    Dick Davies, Jun 5, 2004
    #1

  2. Dick Davies

    Chad Fowler Guest

    On Sun, 6 Jun 2004 05:02:25 +0900, Dick Davies
    <> wrote:
    >
    > I was trying to use RSSscraper to pull some web forums, and something
    > low-level went bang in the Net::* libraries.
    >
    > I found some old references to this error from last year, and I
    > got the impression it was platform specific?
    >
    > Can anyone else let me know if this causes problems for them?
    >
    > It's obviously site-specific; url = 'http://www.google.com' has no problems...
    >
    > Here's the minimal code (the open(url) call is 'line 6' in the backtraces below):
    >
    > require 'open-uri'
    >
    > url = 'http://p218.ezboard.com/fdebatingukfrm9'
    > page = open(url).readlines
    >
    > If I run this I get:
    >
    > rasputin@lb:rss$ ./regex.rb
    > /data/ruby/lib/ruby/1.9/net/protocol.rb:135:in `sysread': End of file reached (EOFError)
    > from /data/ruby/lib/ruby/1.9/net/protocol.rb:135:in `rbuf_fill'
    > from /data/ruby/lib/ruby/1.9/net/protocol.rb:116:in `readuntil'
    > from /data/ruby/lib/ruby/1.9/net/protocol.rb:126:in `readline'
    > from /data/ruby/lib/ruby/1.9/net/http.rb:1850:in `read_status_line'
    > from /data/ruby/lib/ruby/1.9/net/http.rb:1839:in `read_new'
    > from /data/ruby/lib/ruby/1.9/net/http.rb:934:in `request'
    > from /data/ruby/lib/ruby/1.9/net/http.rb:834:in `request_get'
    > from /data/ruby/lib/ruby/1.9/open-uri.rb:545:in `proxy_open'
    > ... 7 levels...
    > from /data/ruby/lib/ruby/1.9/open-uri.rb:134:in `open_uri'
    > from /data/ruby/lib/ruby/1.9/open-uri.rb:424:in `open'
    > from /data/ruby/lib/ruby/1.9/open-uri.rb:85:in `open'
    > from ./regex.rb:6
    >
    > This is exactly the error I was getting on the front of RSSscraper.
    > If it helps narrow it down, through a proxy I get:
    >
    > rasputin@lb:rss$ ./regex.rb
    > /data/ruby/lib/ruby/1.9/open-uri.rb:574:in `proxy_open': 503 Service Unavailable (OpenURI::HTTPError)
    > from /data/ruby/lib/ruby/1.9/open-uri.rb:167:in `open_loop'
    > from /data/ruby/lib/ruby/1.9/open-uri.rb:164:in `catch'
    > from /data/ruby/lib/ruby/1.9/open-uri.rb:164:in `open_loop'
    > from /data/ruby/lib/ruby/1.9/open-uri.rb:134:in `open_uri'
    > from /data/ruby/lib/ruby/1.9/open-uri.rb:424:in `open'
    > from /data/ruby/lib/ruby/1.9/open-uri.rb:85:in `open'
    > from ./regex.rb:6
    >



    It appears to me that this site refuses to respond unless you have a
    recognized User-agent set in the request header. That's probably the
    problem with open-uri.
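
For anyone who wants to reproduce this without hitting the real site, here is a minimal sketch of the behaviour described above. Everything in it is an assumption made for illustration: the local TCPServer stand-in, the RECOGNIZED_AGENT constant and the 'SomeOtherClient' string are hypothetical, and the real server may well behave differently. It uses URI.open, which is how current Ruby versions spell open-uri's open(url); header fields are passed as string-keyed options.

```ruby
require 'open-uri'
require 'socket'

# Hypothetical: the only User-Agent our stand-in server will answer.
RECOGNIZED_AGENT = 'RssScraper'

# Stand-in for the picky site: read the request headers, then either
# answer normally or hang up without sending a single byte.
server = TCPServer.new('127.0.0.1', 0) # port 0 = any free port
port = server.addr[1]
Thread.new do
  loop do
    client = server.accept
    agent = nil
    # Consume header lines up to the blank line that ends the request.
    while (line = client.gets) && line != "\r\n"
      agent = line.split(':', 2).last.strip if line =~ /\Auser-agent:/i
    end
    if agent == RECOGNIZED_AGENT
      body = "forum page\n"
      client.write("HTTP/1.1 200 OK\r\nContent-Type: text/plain\r\n" \
                   "Content-Length: #{body.bytesize}\r\n" \
                   "Connection: close\r\n\r\n#{body}")
    end
    client.close
  end
end

url = "http://127.0.0.1:#{port}/"

# An unrecognized agent: the server closes the socket before sending a
# status line, so the request fails instead of returning a page.
failure = begin
  URI.open(url, 'User-Agent' => 'SomeOtherClient') { |f| f.read }
  nil
rescue StandardError => e
  e
end
puts "unrecognized agent: #{failure.class}"

# The recognized agent gets the page.
page = URI.open(url, 'User-Agent' => RECOGNIZED_AGENT) { |f| f.read }
puts "recognized agent:   #{page.inspect}"
```

If the diagnosis is right, the first call should fail in much the same way as the backtrace in the original post, and the second should return the page.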

    Chad
     
    Chad Fowler, Jun 6, 2004
    #2

  3. Dick Davies

    Dick Davies Guest

    * Chad Fowler <> [0655 03:55]:
    > On Sun, 6 Jun 2004 05:02:25 +0900, Dick Davies


    > > require 'open-uri'
    > >
    > > url = 'http://p218.ezboard.com/fdebatingukfrm9'
    > > page = open(url).readlines


    > > /data/ruby/lib/ruby/1.9/net/protocol.rb:135:in `sysread': End of file reached (EOFError)
    > > from /data/ruby/lib/ruby/1.9/net/protocol.rb:135:in `rbuf_fill'
    > > from /data/ruby/lib/ruby/1.9/net/protocol.rb:116:in `readuntil'
    > > from /data/ruby/lib/ruby/1.9/net/protocol.rb:126:in `readline'
    > > from /data/ruby/lib/ruby/1.9/net/http.rb:1850:in `read_status_line'
    > > from /data/ruby/lib/ruby/1.9/net/http.rb:1839:in `read_new'
    > > from /data/ruby/lib/ruby/1.9/net/http.rb:934:in `request'
    > > from /data/ruby/lib/ruby/1.9/net/http.rb:834:in `request_get'
    > > from /data/ruby/lib/ruby/1.9/open-uri.rb:545:in `proxy_open'
    > > ... 7 levels...
    > > from /data/ruby/lib/ruby/1.9/open-uri.rb:134:in `open_uri'
    > > from /data/ruby/lib/ruby/1.9/open-uri.rb:424:in `open'
    > > from /data/ruby/lib/ruby/1.9/open-uri.rb:85:in `open'
    > > from ./regex.rb:6


    > It appears to me that this site refuses to respond unless you have a
    > recognized User-agent set in the request header. That's probably the
    > problem with open-uri.


    Ah crap. wget worked fine.

    Is there a workaround (other than wget'ting the file to a local
    webserver and pulling it from there)? I can't see an easy way of
    adding a user-agent header to net/http.rb headers.....
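
On the net/http side, one way to do this is to set the header on the request object before sending it. A minimal sketch, assuming a reasonably current Ruby where Net::HTTP::Get.new accepts a URI; the helper names build_request and fetch are made up for illustration, and 'RssScraper' is just an example agent string:

```ruby
require 'net/http'
require 'uri'

# Build a GET request carrying an explicit User-Agent header.
def build_request(url, agent)
  request = Net::HTTP::Get.new(URI(url))
  request['User-Agent'] = agent
  request
end

# Send the request and return the response body (needs network access,
# so it is defined here but not called).
def fetch(url, agent)
  uri = URI(url)
  Net::HTTP.start(uri.host, uri.port) do |http|
    http.request(build_request(url, agent)).body
  end
end

request = build_request('http://p218.ezboard.com/fdebatingukfrm9', 'RssScraper')
puts request['User-Agent'] # header lookup is case-insensitive
```

Calling fetch with the same two arguments would send the request; whether the site accepts that particular agent string is something only a live test can confirm.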

    --
    The District of Columbia has a law forbidding you to exert pressure on
    a balloon and thereby cause a whistling sound on the streets.
    Rasputin :: Jack of All Trades - Master of Nuns
     
    Dick Davies, Jun 8, 2004
    #3
  4. Dick Davies

    Dick Davies Guest

    Bad form to reply to myself, but for the record, adding a
    header was incredibly easy:

    .....
    class DukPolScanner < RSSscraper::AbstractScanner
      def initialize
        @get_headers = { 'User-agent' => 'RssScraper' }
    .....

    Thanks to Chad for the pointer, and to RSSscraper's creator for a
    well-designed tool....

    * Dick Davies <> [0632 12:32]:
    > * Chad Fowler <> [0655 03:55]:
    > > On Sun, 6 Jun 2004 05:02:25 +0900, Dick Davies


    > > > /data/ruby/lib/ruby/1.9/net/protocol.rb:135:in `sysread': End of file reached (EOFError)
    > > > from /data/ruby/lib/ruby/1.9/net/protocol.rb:135:in `rbuf_fill'


    > > It appears to me that this site refuses to respond unless you have a
    > > recognized User-agent set in the request header. That's probably the
    > > problem with open-uri.


    --
    There are two types of people in this world, good and bad. The good
    sleep better, but the bad seem to enjoy the waking hours much more.
    -- Woody Allen
    Rasputin :: Jack of All Trades - Master of Nuns
     
    Dick Davies, Jun 8, 2004
    #4
