Just starting out, where do I go from here?


Alexander York

This is my first experience with any kind of programming language.

I purchased The Pragmatic Programmer's Guide (for Ruby), aka the "Pickaxe."
I started to read through it, but found it dived in a bit too deep
initially, making references to Perl and not fully explaining the
construction of the first example program. Regardless, it looks to be
a great guide for someone with a little more experience than myself.

Ruby-lang.org led me to Chris Pine's tutorial, which was excellent for
someone who knows nothing, but I can't find anything that picks up where
he leaves off and carries the same style and in-depth teaching.

Where I have interest in going with Ruby:

- Screen scraping eBay for specific products/information
- (Unrelated, but I need help finding this) Image recognition software
- Web-based management for screen scraping and storing of data/PDF
  creation of auctions

Could anyone point me in the right direction, maybe to any articles on this
stuff? I know of scrAPI and Ruby on Rails, but I am new and do need
direction.

Thanks in advance for your help.

Cheers,
Alex
 

Peter Szinek

I have found "Ruby for Rails" the best fit for this situation (I also
felt that the Pickaxe is about a level or two higher than what I needed in
the beginning). Actually I just blogged about this:

http://www.rubyrailways.com/book-review-ruby-cookbook/

(It is mainly a review of the Ruby Cookbook, but the opening part
addresses your situation a bit.)

wrt the screen scraping: I am just finishing a (IMHO) very easy to use
yet quite powerful Web scraping framework called scRUBYt! It is
considerably easier than scrAPI, since you need no CSS/XPath or any
other kind of special knowledge (scRUBYt! eats eBay for breakfast ;-).
Basically I am waiting for the release of Hpricot 0.5 (I am using some
features from the edge Hpricot trunk) and I need to add a few minor
features, but otherwise I am more or less ready. The exact release time
depends on _why more than on me, I guess :).

I would be more than happy if you could try it so I would have some
real-life scenario feedback.

Cheers,
Peter

__
http://www.rubyrailways.com
 

Alexander York

Peter said:
I would be more than happy if you could try it so I would have some
real-life scenario feedback.

Cheers,
Peter

Thanks for the informative reply. The book looks promising and I would
love to try out your scraper when it is ready. The scraping I will need
to do is going to be pretty advanced, I think.

Could you shoot me your email address or is it on your site?

Cheers,
Alex
 

James Britt

Peter said:
I have found "Ruby for Rails" the best fit for this situation (I also
felt that the Pickaxe is about a level or two higher than what I needed in
the beginning).

I second the suggestion of getting Ruby for Rails, regardless of any
plans to use Rails.

For screen-scraping, I wrote up my experience with Mechanize in creating
rubystuff.com by extracting data from CafePress sites.

http://www.neurogami.com/cafe-fetcher/

It may be a bit dated; Mechanize has changed some.

I would also suggest looking into Hpricot; my recent site snarfing
forays have been fairly pain-free thanks to this library.


--
James Britt

http://www.ruby-doc.org - Ruby Help & Documentation
http://beginningruby.com - Beginning Ruby: The Online Book
http://www.rubystuff.com - The Ruby Store for Ruby Stuff
http://www.jamesbritt.com - Playing with Better Toys
 

Alexander York

Troy said:
Screen scraping is good fun and all, but have you looked into the web
services APIs that eBay makes available? There may be much easier ways
to get at data than screen scraping. Look at the "Developers" link at the
bottom of their homepage.

Hmm, looks like good advice. Could you perhaps give me an example of how
I could apply this with Ruby? (I'm still new.)

I will need to be able to search a brand of products and search each
individual auction for specific text.
 

Peter Szinek

Troy's point is unquestionably true: no web scraper will ever (?) give
you the 100% reliability, robustness and so on that a web services API
can offer. If your thing is mission critical, and the API offers
everything you need, no one is questioning that it is the way to go,
even if the other alternative is scRUBYt! :)

Alexander said:
Hmm, looks like good advice. Could you perhaps give me an example of how
I could apply this with Ruby? (I'm still new.)

Do you have the Ruby Cookbook? There is one recipe for using the eBay API...
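
Whichever way the listings get fetched (API or scraper), the "search each
auction for specific text" part is plain Ruby. A minimal sketch, assuming
the auctions have already been collected into an array of hashes; the
sample data and the matching_auctions helper are made up for illustration
and are not part of any eBay API or of scRUBYt!:

```ruby
# Made-up sample data standing in for auctions fetched via an API or a scraper.
AUCTIONS = [
  { :title       => "DELL LATITUDE C600 P3 1.0 LAPTOP NOTEBOOK 256MB 20GB HD",
    :description => "Works fine, includes AC adapter." },
  { :title       => "Dell Inspiron 8200 laptop",
    :description => "For parts only, no hard drive." },
  { :title       => "IBM ThinkPad T23",
    :description => "Includes AC adapter and dock." }
].freeze

# Hypothetical helper: keep the auctions whose title mentions the brand and
# whose description contains the wanted text (both case-insensitive).
def matching_auctions(auctions, brand, text)
  brand_re = /\b#{Regexp.escape(brand)}\b/i
  text_re  = /#{Regexp.escape(text)}/i
  auctions.select do |auction|
    auction[:title] =~ brand_re && auction[:description] =~ text_re
  end
end

hits = matching_auctions(AUCTIONS, "dell", "ac adapter")
hits.each { |auction| puts auction[:title] }
```

Regexp.escape keeps characters such as "+" or "." in the search text from
being read as regex syntax.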

Alexander said:
The scraping I will need
to do is going to be pretty advanced, I think.

The present version of scRUBYt! (about 0.0.8) works like this:

ebay_data = Scrubyt::Extractor.define do
  # Navigate to our page
  fetch 'ebay.com'
  fill_textfield 'dell laptop'
  submit
  click_link 'Laptops/Notebooks'

  # Construct the scraper
  record do
    name "DELL LATITUDE C600 P3 1.0 LAPTOP NOTEBOOK 256MB 20GB HD"
    price "$192.50"
    shipping "$37.00"
  end.ensure_presence_of_pattern(:price)
  next_page "8 9 Next", :limit => 5
end

So, this extractor does the following: first it navigates to your page of
interest automatically, as you would do it by hand (powered by Mechanize;
from 0.2.0 I would like to add Watir as well so it can handle JavaScript,
then later on Selenium). Then the scraper is constructed.

As you can see, there is no need for any technical knowledge (powered by
Hpricot): you just show an example record and all the others are
extracted automatically. Moreover, only those records are extracted
which have a price (i.e. not the 'buy now' ones). Another goodie for
free is the navigation to the next pages, in this case the first 5
pages (you can also leave it unlimited).
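
To make the price filter and the page limit concrete, here is a plain-Ruby
sketch of the idea. It does not use scRUBYt! at all: the PAGES data and the
extract helper are hypothetical stand-ins, and the real :limit option may
count pages differently.

```ruby
# Made-up rows, grouped by result page, standing in for what a scraper finds.
PAGES = [
  [{ :name => "Laptop A", :price => "$192.50", :shipping => "$37.00" },
   { :name => "Laptop B", :price => nil,       :shipping => "$20.00" }],
  [{ :name => "Laptop C", :price => "$250.00", :shipping => "$15.00" }],
  [{ :name => "Laptop D", :price => "$99.99",  :shipping => nil }]
]

# Hypothetical helper: walk at most `limit` pages and keep only the rows
# that have a value for the required field.
def extract(pages, required, limit)
  pages.first(limit).flatten.select { |row| row[required] }
end

records = extract(PAGES, :price, 2)
records.each { |r| puts "#{r[:name]}: #{r[:price]}" }
```

With a limit of 2, "Laptop D" on the third page is never visited, and
"Laptop B" is dropped because it has no price.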

Now, this is just a part of the functionality even of the version which
is on my HDD. I am planning to add more stuff in 0.1.0, and even
that will be just the tip of the iceberg of the features I am planning
to add long term.

And of course the best thing is that it *works* - it's not a toy (at
least not on the pages I have tried it so far) - and though I am sure
there will be problems with the earlier versions, I count on the
community to help to catch these and propose features important for them
so I guess the future might be bright for scRUBYt! :)

Cheers,
Peter

__
http://www.rubyrailways.com
 
