[ANN] scRUBYt! 0.2.3 - Hpricot and Mechanize on steroids

Peter Szinek · Feb 21, 2007

Hello,

I am pleased to announce that the new release of scRUBYt!, 0.2.3, is
available for download.

scRUBYt! is a very easy to learn and use, yet powerful web scraping
framework based on Hpricot and Mechanize. Its purpose is to free you
from the drudgery of web page crawling, looking up HTML tags,
attributes, XPaths, form names and other typical low-level web scraping
woes by figuring these out from examples you copy and paste from the web
page.

The current release has a lot of new features, tons of bugfixes and
some shiny new examples, such as scraping reddit and del.icio.us,
logging in to RubyForge, and automatic WordPress commenting.

Thanks everybody for the great feedback!

Cheers,
Peter
__
http://www.rubyrailways.com :: Ruby and Web2.0 blog
http://scrubyt.org :: Ruby web scraping framework
http://rubykitchensink.ca/ :: The indexed archive of all things Ruby.

toulax · Feb 21, 2007

Peter,

I really, really like scRUBYt! so far, especially after so much
scraping using PHP. However, the lack of documentation kills me. Are
there going to be more tutorials and examples soon? When I was first
testing it I could not believe it was possible to scrape Google with
only a few lines, but what about more complicated pages? For example,
it isn't possible to scrape Ask.com the same way you do Google because
of its markup. What are you supposed to do in those cases?

I know how scraping works but I'm not very experienced with XPath, so
it would be really good to have more examples (in their final form,
not only the learner versions, because those usually stop working after
a while), plus a more detailed explanation of everything that can
be done.

Either way, this seems to be shaping up very well and I wish you good
luck with it!

Cheers

Peter Szinek · Feb 21, 2007

> Peter,
>
> I really, really like scRUBYt! so far, especially after so much
> scraping using PHP. However, the lack of documentation kills me. Are
> there going to be more tutorials and examples soon? When I was first
> testing it I could not believe it was possible to scrape Google with
> only a few lines, but what about more complicated pages? For example,
> it isn't possible to scrape Ask.com the same way you do Google because
> of its markup. What are you supposed to do in those cases?

There are tons of examples here:

http://rubyforge.org/frs/download.php/17684/scrubyt-examples-0.2.3.zip

I am also planning to finish the tutorials and add even more docs.
What's up with Ask.com? Send me what you would like to accomplish and I
will help you with it.

> I know how scraping works but I'm not very experienced with XPath, so
> it would be really good to have more examples (in their final form,
> not only the learner versions, because those usually stop working after
> a while),

OK - however, you can always replace the examples with the current ones
and export the extractor to get a production one.
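
A minimal sketch of that learn-then-export idea, using only Ruby's bundled REXML rather than scRUBYt! itself. The names PAGE, path_for_example and extract are hypothetical, and the approach shown is an assumption about the general technique, not scRUBYt!'s actual algorithm or API:

```ruby
require 'rexml/document'

# Toy, well-formed page (assumption: real pages are fetched and messier).
PAGE = <<~HTML
  <html><body><ul>
    <li>Ruby</li>
    <li>Python</li>
    <li>PHP</li>
  </ul></body></html>
HTML

# "Learning" step: find the element whose text equals the example,
# then record the tag names on the way up to the document root.
def path_for_example(doc, example)
  element = REXML::XPath.match(doc, '//*').find { |el| el.text.to_s.strip == example }
  return nil unless element

  steps = []
  until element.nil? || element.is_a?(REXML::Document)
    steps.unshift(element.name)
    element = element.parent
  end
  '/' + steps.join('/')
end

# "Production" step: apply the stored XPath; the example text is no longer needed.
def extract(doc, xpath)
  REXML::XPath.match(doc, xpath).map { |el| el.text.to_s.strip }
end

doc = REXML::Document.new(PAGE)
xpath = path_for_example(doc, 'Ruby')
puts xpath
puts extract(doc, xpath).inspect
```

Once the path is stored, the extraction keeps working even if the text "Ruby" disappears from the page, which is why an exported extractor tends to outlive the example-based one.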

> plus a more detailed explanation of everything that can
> be done.

Yeah, that's my goal too, I am just flooded with everything else... But
I am working on it all the time. Please subscribe to the feed at
http://scrubyt.org if you would like to be notified when new stuff
arrives. I am announcing everything there.

> Either way, this seems to be shaping up very well and I wish you good
> luck with it!

Great! Please send as much feedback as possible so I can improve the
whole thing and add what's missing.

Cheers,
Peter

