Help re recording/replaying (i.e. automating) HTTP interactions with a web site?


Greg Hauptmann

Hi,

Actually can anyone recommend a good technique / software / plugin
that would assist if I wanted to effectively (a) record my interaction
with my bank at the HTTP level, then (b) use this to automate that
behavior in my RoR application to automate pulling down daily account
details?

The best I can think of at the moment is: (a) Firefox Live HTTP
Headers plugin then (b) manually write Ruby code that sends these out
and waits for the response & check it before proceeding to the next
http request. I'm thinking someone probably has a better way, or
plugin, to handle at least part (b)?
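For part (b), a plain-Ruby starting point is the stdlib Net::HTTP. A minimal sketch of replaying a request captured with Live HTTP Headers — the URL is a placeholder, and the raw dump is assumed to be simple `Name: value` lines:

```ruby
require 'net/http'
require 'uri'

# Parse a raw header dump (as copied out of Live HTTP Headers) into a
# Hash; lines without a colon (e.g. the "GET / HTTP/1.1" line) are skipped.
def parse_recorded_headers(raw)
  raw.lines.each_with_object({}) do |line, headers|
    name, value = line.split(':', 2)
    next if value.nil?
    headers[name.strip] = value.strip
  end
end

# Replay one recorded request and return the response (not invoked here;
# the real URL would come from your recording).
def replay(url, recorded_headers)
  uri = URI(url)
  Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == 'https') do |http|
    req = Net::HTTP::Get.new(uri)
    recorded_headers.each { |name, value| req[name] = value }
    http.request(req)
  end
end
```

From there, part (b) is checking each response (status code, a known marker in the body) before sending the next recorded request.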

Tks

Peter Szinek

[Note: parts of this message were removed to make it a legal post.]

Greg,

Have you looked into (Fire)Watir?

I am just releasing a new version of scRUBYt! (a web scraping
framework) where it will be possible to use FireWatir as the agent for
navigation/scraping, so you can write a simple but powerful DSL (stuff
like 'click_link', 'fill_textfield', etc.) which is executed through
Firefox and is very well suited for the scenario you just described. Drop
me a line if you are interested.

But of course plain (Fire)Watir would do, too.

Cheers,
Peter
___
http://www.rubyrailways.com
http://scrubyt.org

Greg Hauptmann

Sounds good - does it support:
• https?
• cookies?
• building in some intelligence? (say, when the link for step N
changes over time but you can write an algorithm for it)

thanks

Greg Hauptmann

PS. 4th question Peter I forgot:
• does it support downloading a file (eg csv file, account transactions)

Phillip Gawlowski


Greg Hauptmann wrote:
| PS. 4th question Peter I forgot:
| • does it support downloading a file (eg csv file, account transactions)

http://wtr.rubyforge.org/

Find it out?

--
Phillip Gawlowski
Twitter: twitter.com/cynicalryan

A born loser:
~ Somebody who calls the number that's scrawled in lipstick on the phone
~ booth wall-- and his wife answers.

Greg Hauptmann

I see Watir requires/drives a browser... I'm after something
browser-independent... any other library/plugin suggestions?

Phillip Gawlowski


A: Because it makes it hard to follow the discussion.
Q: Why is top posting bad?

Greg Hauptmann wrote:
| I see Watir requires/drives a browser... I'm after something
| browser-independent... any other library/plugin suggestions?

WWW::Mechanize is quite popular, from what I've seen so far.
--
Phillip Gawlowski
Twitter: twitter.com/cynicalryan

~ - You know you've been hacking too long when...
...you discover that you're balancing your checkbook in octal.

7stud --

Greg said:
| The best I can think of at the moment is: (a) Firefox Live HTTP
| Headers plugin then (b) manually write Ruby code that sends these out
| and waits for the response & checks it before proceeding to the next
| HTTP request.

I'm not sure what the Firefox Live HTTP headers plugin will do for you.
If you write a ruby program to send out requests to a url, then you know
what headers you are sending in your request, and when you get the
response, you can read the headers in the response.
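To illustrate 7stud's point with the standard library: the response headers are right there on the response object, and for a login flow the main one to carry forward is Set-Cookie. A naive sketch — the URL is hypothetical, and real cookie handling (expiry, paths, domains) has far more edge cases than this:

```ruby
require 'net/http'
require 'uri'

# Fetch a URL and return a couple of interesting response headers
# (not invoked here; the URL would be your bank's login page).
def response_headers(url)
  res = Net::HTTP.get_response(URI(url))
  { 'Content-Type' => res['Content-Type'], 'Set-Cookie' => res['Set-Cookie'] }
end

# Cookies the server sets must be echoed back on the next request; a
# deliberately naive cookie jar built from the Set-Cookie header value
# (drops attributes like Path/HttpOnly, ignores expiry entirely):
def cookie_header(set_cookie)
  set_cookie.to_s
            .split(/,\s*(?=\w+=)/)            # multiple cookies in one header
            .map { |cookie| cookie.split(';').first }
            .join('; ')
end

cookie_header("session=abc; Path=/; HttpOnly, token=xyz; Secure")
# => "session=abc; token=xyz"
```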

Peter Szinek

PS. 4th question Peter I forgot:
• does it support downloading a file (eg csv file, account transactions)

Yes, scRUBYt! supports all these things... In the current
implementation WWW::Mechanize is used as the agent, but it doesn't
support JavaScript, and (more often than not) e-banking sites have
some JS... so that's why I suggested the FireWatir-based solution.

A browser-agnostic solution doesn't exist (Mechanize is a browser too)
- the nature of the task requires a browser. Call it what you like, but
if something is able to GET/POST requests, store cookies, use https,
sessions, ... then it is a browser in my vocabulary.

Besides, FireWatir is platform-independent (unlike Watir, which is win32
only).

Cheers,
Peter
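For the file-download question in plain Ruby, a hedged sketch with the standard library — the export URL, cookie, and CSV column names below are all made up:

```ruby
require 'net/http'
require 'uri'
require 'csv'

# Fetch a CSV export over HTTPS, sending the session cookie captured
# during login (hypothetical URL and cookie; not invoked here).
def download_csv(url, cookie)
  uri = URI(url)
  Net::HTTP.start(uri.host, uri.port, use_ssl: true) do |http|
    req = Net::HTTP::Get.new(uri)
    req['Cookie'] = cookie
    http.request(req).body
  end
end

# Parsing the downloaded body is then a one-liner with the stdlib CSV
# library (sample data with invented column names):
body = "Date,Description,Amount\n2008-04-28,ATM withdrawal,-42.50\n"
transactions = CSV.parse(body, headers: true)
transactions.each { |row| puts "#{row['Date']}: #{row['Amount']}" }
```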

Greg Hauptmann

thanks Peter - I was starting to look at Mechanize but will focus on
scRUBYt!...

Peter Szinek




OK, cool. If you don't have JS/AJAX or other tricks that Mechanize
can't handle, you should be OK.

On the other hand, if you do have JS/AJAX on the page, you will need
FireWatir, whether you like it or not :) The FireWatir-enabled
version of scRUBYt! is not yet officially released - if you want to
try it, you need to d/l it from http://scrubyt.org/scrubyt-0.4.03.gem
and install it.

Let me know if you encounter any problems!

Cheers,
Peter
___
http://www.rubyrailways.com
http://scrubyt.org

Greg Hauptmann


I must admit you're managing to overwhelm me slightly with the number of
libraries/packages here :)
So what does the stack you're talking about look like? Will it be fronted
by scRUBYt! then, like:

- scRUBYt!
- FireWatir
- Mechanize
- Hpricot

Is this correct? If it's simple enough, could you put in brackets the key thing
each layer does/focuses on?

Cheers
Greg

Peter Szinek


In the current official release (0.3.4), FireWatir is not yet added.
So you have just Mechanize + Hpricot.

Mechanize does the navigational part - fill this textfield, then
click that button, and if you arrived at the result page, crawl to all
the detail pages etc.
Once you arrive at the final page from where you don't want to go on
further, you start the actual scraping, and in this case that's done
through Hpricot. You take the page where you arrived, parse it with
Hpricot and collect the results from it.
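Hpricot itself isn't shown here, but the extract-with-XPath step Peter describes can be sketched with Ruby's bundled REXML library on a toy final page (the table contents below are invented):

```ruby
require 'rexml/document'

# The final page, after navigation, might contain a transactions table:
html = <<~HTML
  <table id="transactions">
    <tr><td>2008-04-28</td><td>-42.50</td></tr>
    <tr><td>2008-04-29</td><td>120.00</td></tr>
  </table>
HTML

# Collect the rows with an XPath - the same kind of expression you would
# copy out of FireBug or the DOM Inspector.
doc  = REXML::Document.new(html)
rows = REXML::XPath.match(doc, "//table[@id='transactions']/tr").map do |tr|
  tr.get_elements('td').map(&:text)
end
# rows => [["2008-04-28", "-42.50"], ["2008-04-29", "120.00"]]
```

Note that REXML needs well-formed markup; Hpricot's appeal (and, per Peter, Firefox's even more so) is coping with the tag soup real sites serve.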

In the development release, it is possible to plug-in other agents
than Mechanize - theoretically anything, currently FireWatir is
implemented. But if you want to use Mechanize as the agent for
crawling, you don't need to install FireWatir at all.

FireWatir-based scraping has other benefits beyond JS/AJAX - for
example more robust HTML parsing (which is done by Firefox in this
case). Hpricot is a great parser, but it can't beat Firefox (yet).
Firefox-parsed HTML also means you can use XPaths straight from
FireBug or the DOM Inspector (which is not the case with Mechanize).

On the downside, FireWatir-based navigation/scraping is slower than
Mechanize: with Mechanize you don't have to wait until the page
renders, which is a prerequisite for FireWatir-based navigation etc.

Does this answer your question? (If not, be sure to keep asking ;-)

Cheers,
Peter
___
http://www.rubyrailways.com
http://scrubyt.org