Web automation (was: Pressing a Webpage Button)

qwweeeit

Hi all,
Elliot said:
How do I make Python press a button on a webpage? I looked at
urllib, but I only see how to open a URL with that. I searched
google but no luck.
For example, google has a button <input type=submit value="Google
Search" name=btnG> how would i make a script to press that button?

I have a similar goal: web automation, which
requires not only pressing buttons on a page but also
filling in input fields (in your case, Google's
search field).
On the suggestion of (e-mail address removed), I tried
twill (http://www.idyll.org/~t/www-tools/twill.html).

I think it could be the solution:
I have already used it to read data from an ASP page
(http://groups.google.it/group/comp....3e08d28baf?q=qwweeeit&rnum=2#2e0a593e08d28baf)
I'll try to solve your problem using the interactive mode
(twill can also be called as a module).

Start twill in interactive mode with twill-sh, then:
- load the Google page:
go www.google.it (I'm Italian!)
- show the page with the command 'show'
- get the page's forms:
showforms
## __Name______ __Type___ __ID________ __Value__________________
   hl           hidden    (None)       it
   ie           hidden    (None)       ISO-8859-1
   q            text      (None)
   meta         radio     all          [''] of ['', 'lr=lang_it', 'cr=count ...
1  btnG         submit    (None)       Cerca con Google
2  btnI         submit    (None)       Mi sento fortunato
current page: http://www.google.it

The input field is q (Type:text), while there are two buttons
(Type: submit) and a radio button meta (Type: radio).
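
For anyone who wants to see what such a form listing boils down to, here is a minimal sketch using only Python's standard html.parser module. It is not how twill does it; the SAMPLE_HTML fragment and the FormFieldLister class are made up for illustration, modelled on the fields listed above.

```python
from html.parser import HTMLParser

# Hypothetical page fragment, modelled on the fields in the listing above.
SAMPLE_HTML = """
<form name="f" action="/search">
  <input type="hidden" name="hl" value="it">
  <input type="hidden" name="ie" value="ISO-8859-1">
  <input type="text" name="q" value="">
  <input type="submit" name="btnG" value="Cerca con Google">
  <input type="submit" name="btnI" value="Mi sento fortunato">
</form>
"""


class FormFieldLister(HTMLParser):
    """Collect (name, type, value) for every <input> element seen."""

    def __init__(self):
        super().__init__()
        self.fields = []

    def handle_starttag(self, tag, attrs):
        if tag == "input":
            attributes = dict(attrs)
            self.fields.append((
                attributes.get("name", ""),
                attributes.get("type", "text"),
                attributes.get("value", ""),
            ))


lister = FormFieldLister()
lister.feed(SAMPLE_HTML)
for name, field_type, value in lister.fields:
    print(f"{name:<6}{field_type:<8}{value}")
```

A real page would of course be fetched first; the parsing step stays the same.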

- fill in values:
fv 0 q twill
("twill" being the search string)
- press the search button:
fv 1 btnG "Cerca con Google"
submit
twill answers with the query sent to Google:
http://www.google.it/search?hl=it&ie=ISO-8859-1&q=twill&btnG=Cerca+con+Google&meta=
- save the search result to a file:
save_html /home/qwweeeit/searching_twill.html
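
The query URL shown in the steps above is just the form's fields encoded into a query string. As a rough sketch (standard library only, not twill's own code), the same URL can be rebuilt like this; the field names and values are the ones from the form listing.

```python
from urllib.parse import urlencode

# Field names and values as they appear in the form listing and in the
# URL that twill printed; a list of pairs keeps them in that order.
fields = [
    ("hl", "it"),
    ("ie", "ISO-8859-1"),
    ("q", "twill"),
    ("btnG", "Cerca con Google"),
    ("meta", ""),
]

query_url = "http://www.google.it/search?" + urlencode(fields)
print(query_url)
```

So "pressing the button" on a GET form amounts to requesting that URL.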

And there you have the first 10 hits of the search!
Don't ask me to go further! Perhaps ask the author of twill
(C. Titus Brown)...

With this method you can bypass Google's restrictions, because
you are using the browser (only building the query automatically).

And this answers the valid observation of Grant Edwards:
Ah, never mind. That doesn't work. Google somehow detects
you're not sending the query from a browser and bonks you.

Bye.
 
calfdog

Hello,

It's fairly easy to do; see the code below:
##########################################################################
from win32com.client import DispatchEx
import time

# Instantiate a new IE object
ie = DispatchEx('InternetExplorer.Application')

# Navigate to the site
ie.Navigate("www.google.com")

# You have to wait for the site to load. Best to use a method that checks
# both ie.Busy and readyState == "complete"
while ie.Busy:
    time.sleep(0.1)

doc = ie.Document

while doc.readyState != 'complete':
    time.sleep(0.1)

ie.Document.all["q"].value = "Python rocks"
ie.Document.all["btnG"].click()
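
The two wait loops above spin forever if the page never finishes loading. A small sketch of a generic polling helper with a timeout, using only the standard library, might look like this; wait_until is a made-up name, not part of win32com or PAMIE.

```python
import time


def wait_until(condition, timeout=30.0, interval=0.1):
    """Poll condition() until it returns true or timeout seconds elapse.

    Returns True if the condition became true, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while not condition():
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
    return True


# Stand-in for a page that becomes "ready" on the third check.
checks = iter([False, False, True])
print(wait_until(lambda: next(checks), timeout=5.0, interval=0.01))
```

With the ie object from the code above, the calls would presumably be something like wait_until(lambda: not ie.Busy), but that part is untested here.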

I have a wrapper that does all this for you. It is geared towards people in
QA. You can download it here:
URL: http://pamie.sourceforge.net


Using it is simple. The wait for the document to load is already in the code.
###########################################
from cPAMIE import PAMIE

ie = PAMIE()

ie.Navigate("www.google.com")

# Set the text - arguments: value to set, textbox name, form name
ie.SetTextBox("Python", "q", "f")
ie.ClickButton("btnG", "f")
ie.ClickLink('Python Programming Language')

Hope this helps

Rob





 
calfdog

Twill or PAMIE look good if you're looking for a solution using Python.

If you are looking for a cross-browser solution for testing, there is
also Selenium, which does that.
They have some code for use in Python.

There is also Watir in Ruby and SAMIE in Perl.

It depends on your environment and needs.

Hope that helps!!
Rob M.
 
 
qwweeeit

Hi all,
I must correct myself:
With this method you can bypass Google's restrictions, because
you are using the browser (only building the query automatically).

Of course that's not correct, because you are using a program (twill)
that is not a browser, and if Google checks where the query
comes from...

I must thank calfdog for his reply, but the tool he suggests (PAMIE)
only works on Windows with Internet Explorer. I
didn't explicitly refer to Linux, but I supposed that everyone knew that
web automation (and "automation" in general) is only a problem on
Linux.
Bye.
 
Mike Meyer

qwweeeit said:
but I supposed that everyone knew that web automation (and "automation"
in general) is only a problem on Linux.

I don't know it. I don't believe it, either. I automate web tasks on
Unix systems (I don't use many Linux systems, but it's the same tool
set) on a regular basis. In fact, I automate *lots* of tasks on Unix
systems on a regular basis, and have been doing it for decades. Most
Unix tools are very amenable to automation, much more so than I've
found either Windows or OS X tools to be.

<mike
 
Paul Boddie

Mike said:
I don't know it. I don't believe it, either. I automate web tasks on
Unix systems (I don't use many Linux systems, but it's the same tool
set) on a regular basis.

I imagine that "Web automation" is taken here to mean the automation of
Web browsers, with all the advantages/issues such an approach entails.
The problem on non-Windows systems is the lack of a common (or
enforced) technology for exposing application object models: Mozilla
has XPCOM which apparently doesn't permit the exposure of enough useful
functionality to other processes for "Web automation" tasks (and whose
components seem bizarre enough to defeat my casual investigations into
automation with in-browser components), whilst Konqueror/KHTML is
somewhat accessible via DCOP although the interfaces to much of KDE are
somewhat limited.

Taking the challenge on board, I decided to build on the existing KPart
plugin work done with PyKDE [1] and produce a component which exposes
active documents using DCOP [2]. Combined with an extended version of
qtxmldom [3] the result is a system which permits out-of-process
automation of KHTML and thus Konqueror with the documents available
using a PyXML-style DOM. Currently, the work is in an early phase and
there's a lot of learning about DCOP and PyKDE to be done, but I think
the concept is more or less worked out.

If only GNOME and KDE had stuck with CORBA, though... :-/

Paul

P.S. Of course, the existing KPart plugins permit in-browser embedding
which is easily good enough for many automation tasks, and there are
plenty of examples of moderately useful tools and scripts to prove this
point. In the revised plugins collection [2], there's a plugin which
extracts hCalendar information, for example.

[1] http://www.boddie.org.uk/david/Projects/Python/KDE/index.html
[2] http://www.boddie.org.uk/python/kpartplugins.html
[3] http://www.boddie.org.uk/python/qtxmldom.html
 
Mike Meyer

Paul Boddie said:
I imagine that "Web automation" is taken here to mean the automation of
Web browsers

Yeah, I know. It still seems like an ass-backwards approach to
me. It's not at all clear that emulating user actions makes a sane
model for scripting. I know the non-Unix systems I've seen that did
things like that were clumsy compared to scripting interfaces that
were designed from the ground up to be scripting interfaces. But that
kind of thing varies from application to application on the systems
that support scripting.
Paul Boddie said:
The problem on non-Windows systems is the lack of a common (or
enforced) technology for exposing application object models

OS X has AppleScript. VM/CMS has Rexx. The Amiga had ARexx when MS was
still peddling DOS. Plan 9 has files. I don't think any of them are
"enforced" - then again, I don't think anything enforces exporting
objects from Windows applications, either. The thing that sets all
these apart from Unix is that the technology to export objects from
applications came bundled as a core part of the OS. Unix still doesn't
have that; instead it has a collection of languages that can be
embedded in the application. Lots of applications do that, meaning
they are automatable - but maybe not in the language you want to use
for the project.

Classic Unix (meaning pre-X) was automatable because programs were
expected to produce output that could be processed by other
programs. Those tools are still around and in daily use, and make task
automation on Unix a common and easy thing. It's only when you
restrict the meaning of "automation" to be "driving a GUI app" that
you run into problems.
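
To make the "output that could be processed by other programs" point concrete from the Python side, here is a minimal sketch with the standard subprocess module. The producer command is a stand-in; the Python interpreter itself is used so the example runs anywhere, but any command-line tool that prints text would do.

```python
import subprocess
import sys

# Stand-in "producer" program: any command-line tool that prints text
# would do; the Python interpreter is used so the sketch runs anywhere.
producer = [sys.executable, "-c", "print('alpha'); print('beta'); print('gamma')"]

result = subprocess.run(producer, capture_output=True, text=True, check=True)

# Process the producer's output the way a downstream program would.
lines = result.stdout.splitlines()
selected = [line.upper() for line in lines if line.startswith(("a", "g"))]
print(selected)
```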

<mike
 
Paul Boddie

Mike said:
OS X has AppleScript. VM/CMS has Rexx. The Amiga had ARexx when MS was
still peddling DOS. Plan 9 has files.

I knew I should have written "UNIX systems" or "non-Windows but still
mainstream systems". ;-)
Mike said:
I don't think any of them are "enforced" - then again, I don't think
anything enforces exporting objects from Windows applications, either.

No, but COM is the obvious choice for doing so on Windows. Combine that
with the component developer mindset and it's likely that some kind of
object model will be exposed by an application.

Still, I wouldn't say that automation is necessarily "ass-backwards":
sometimes you want the additional baggage that the browser can give you
- witness the occasional comp.lang.python thread about working with
JavaScript-laden pages - and it's not necessarily automation involving
the activation of coincidental user interface components (find the
"Register Later" button in the "Register Now?" pop-up dialogue and
click on it) that's involved here either.

Yes, the very architecture of the Web should have made automation tasks
a lot more open and convenient, but sometimes there's a need for a
"complete" browser to get at the data.

Paul
 
qwweeeit

Hi all,
I am answering Mike Meyer, who replied to my assertion (that
automation is only a problem in Linux) with:
I don't know it. I don't believe it, either. I automate web tasks on
Unix systems (I don't use many Linux systems, but it's the same tool
set) on a regular basis. In fact, I automate *lots* of tasks on Unix

Perhaps there is a misunderstanding: I don't mean
script automation (as with bash), but user emulation on the GUI side,
which automates things by programmatically performing repetitive
user tasks.
A perfect example is described in another of my posts,
"web automation with twill"
(http://groups.google.it/group/comp....39993bda79?q=qwweeeit&rnum=3#f7faa139993bda79),
in which a program (twill) "presses" a button on an
ASP page 220 times (or rather, on the HTML reply from the web server) to get all
the data.
But if the web server only allows access from a browser, I am
obliged to cheat and disguise twill as a browser.
On Windows you can use a macro language such as AutoIt.
On Linux, and more generally on Unix, the pre-X world is
quite automatable, but (as Mike Meyer said)...
It's only when you restrict the meaning of "automation" to be
"driving a GUI app" that you run into problems.
By the way, you already took part in a discussion on sending keystrokes
to a GUI application, "fake" flags and low-level programming
(you proposed python-xlib precisely to avoid low-level programming).
But on that I disagree with you. For me it's better to spend my time
strengthening security (firewalls, antivirus etc...) and then choose
among the various macro-language solutions on Windows
(very likely I will choose AutoIt).
And I can keep on using Python if I want...

The contribution of Paul Boddie is valuable. I too examined DCOP
and even chose Konqueror as my browser, since it is a KDE application.
But DCOP doesn't go to such a low level. It is not possible
to send a simulated keystroke from one KDE application to another.

Not being an expert, I can neither understand nor comment on the more
technical parts of your reply (out-of-process automation,
PyXML-style DOM etc.).

Nevertheless I thank you all for your contribution.

Bye.
 
Mike Meyer

qwweeeit said:
Perhaps there is a misunderstanding: I don't mean
script automation (as with bash), but user emulation on the GUI side,
which automates things by programmatically performing repetitive
user tasks.

That's a *very* silly way to automate a task. When I'm automating a
task, the *last* thing I want to do is think about selecting menu
entries and pressing buttons on a GUI. I want to think about the
operations on the object in question. A good automation interface will
let me do the latter. A poor one will force me to do the
former. Further, a good automation interface will let my script work
without needing to open a GUI, or even needing the resources required
to open the GUI - unless my script explicitly wants to open a GUI,
anyway.

That's also a very silly way to define automation. Sort of like
defining "transportation" as "a horse and buggy", and then declaring
that people who only have access to Mack trucks or Learjets have a
"problem with transportation."
qwweeeit said:
But if the web server only allows access from a browser, I am
obliged to cheat and disguise twill as a browser.

So? Most such sites actually enable access from a small set of
browsers. So lots of off-brand browsers have a setting to disguise
themselves as other browsers, or even to send an arbitrary string. If
your browser has to do it, why should it matter if your not-a-browser
has to do it?
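
For a not-a-browser written in Python, that disguise is one header. A minimal sketch with the standard urllib.request module follows; the URL and the browser string are placeholders, not values from this thread, and nothing is actually sent.

```python
from urllib.request import Request

# Hypothetical values: the URL and the browser string are placeholders.
URL = "http://www.example.com/data.asp"
BROWSER_STRING = "Mozilla/5.0 (X11; Linux x86_64)"

request = Request(URL, headers={"User-Agent": BROWSER_STRING})

# urllib normalises header names with str.capitalize(), hence "User-agent".
print(request.get_header("User-agent"))

# urllib.request.urlopen(request) would then send the request with that header.
```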
qwweeeit said:
By the way, you already took part in a discussion on sending keystrokes
to a GUI application, "fake" flags and low-level programming
(you proposed python-xlib precisely to avoid low-level programming).

Yes, I'm aware of that. Just because I disagree with your definitions
and think you're doing things the hard way doesn't mean I'm not going
to try and help you if I think I can. But when you then make a false
statement - or even one that could be misinterpreted as false -
disparaging solutions I've had excellent success with, I'm going to
point it out.
qwweeeit said:
But on that I disagree with you. For me it's better to spend my time
strengthening security (firewalls, antivirus etc...) and then choose
among the various macro-language solutions on Windows
(very likely I will choose AutoIt).

On the whole, I believe that Unix has better macro languages than
Windows. That's because Unix scripting has been around since before
there was MS Windows. Or before MS DOS, for that matter.
qwweeeit said:
And I can keep on using Python if I want...

The contribution of Paul Boddie is valuable. I too examined DCOP
and even chose Konqueror as my browser, since it is a KDE application.
But DCOP doesn't go to such a low level. It is not possible
to send a simulated keystroke from one KDE application to another.

That you *want* to do those things would seem to be an indication that
you're trying to do the wrong thing. When scripting on Unix, you
usually get to work at a *much* higher level than emulating keystrokes
or mouse clicks. Why would you want to work at that low a level when
high-level alternatives are available? Wanting to do that on Unix
because that's how you'd do it on Windows is sort of like wanting to
process a string character at a time in Python because that's how
you'd do it in C/C++/Java/whatever. You're better off learning to use
the tools on the platform to your advantage than trying to make the
platform be what it isn't.

<mike
 
Mike Meyer

Paul Boddie said:
I knew I should have written "UNIX systems" or "non-Windows but still
mainstream systems". ;-)

Except OS X is Unix, non-Windows, and still mainstream. Maybe
"non-GUI-intensive systems"?
Paul Boddie said:
No, but COM is the obvious choice for doing so on Windows. Combine that
with the component developer mindset and it's likely that some kind of
object model will be exposed by an application.

I think I pointed out that what all those systems really have in common is
that they come bundled with a way of exporting objects from
applications. Generally, one that's better than embedding an
interpreter in the application, too. There are patches for Linux that
provide the system functionality needed for Plan 9's filesystem export
facilities. If those ever go mainstream, I'll seriously consider
switching to Linux.
Paul Boddie said:
Still, I wouldn't say that automation is necessarily "ass-backwards":
sometimes you want the additional baggage that the browser can give you
- witness the occasional comp.lang.python thread about working with
JavaScript-laden pages - and it's not necessarily automation involving
the activation of coincidental user interface components (find the
"Register Later" button in the "Register Now?" pop-up dialogue and
click on it) that's involved here either.

You know, I recently automated a web task that involved framed,
JavaScript-heavy pages. The pages had a link to a "no javascript"
version, but that just returned a server error. Pretty hopeless,
right? Turns out my script doesn't need any of that. I drilled down
through the frames to access the pages with real forms on them, and
examined the JavaScript source to figure out what it was doing - then
did the appropriate POSTs by hand, and it all works quite nicely.
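
As an illustration of what "POSTs by hand" can look like in Python (a sketch only, not the script described above), the standard library is enough to build such a request. The endpoint and the field names here are invented; in practice they would come from reading the page's forms and JavaScript.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint and field names, standing in for whatever the
# page's JavaScript would have submitted.
ACTION_URL = "http://www.example.com/frames/inner/submit"
form_fields = {"page": "2", "action": "next"}

body = urlencode(form_fields).encode("ascii")
request = Request(ACTION_URL, data=body)

# Supplying data makes urllib issue a POST instead of a GET.
print(request.get_method(), request.data)
```

Passing the request to urllib.request.urlopen would then send it.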
Paul Boddie said:
Yes, the very architecture of the Web should have made automation tasks
a lot more open and convenient, but sometimes there's a need for a
"complete" browser to get at the data.

Well, there are two options to that. One is to find a modern browser
with a good scripting interface. The other is to provide a scripting
facility with the capabilities of a good browser - which mostly means
JavaScript. CSS is a non-issue, and plugins aren't frequent enough to
be a real problem. Especially since you may not be able to run the
proprietary plugins on your system anyway, so even a full-blown
browser won't help :-(.

For JavaScript - there are standalone implementations available. If I
ever run into a case where I actually have to run JavaScript to deal
with a web automation task, I'll check them out. That no one has
wrapped one for use in Python scripts tends to indicate that the
JavaScript problem isn't as bad as it appears at first.

<mike
 
Paul Boddie

qwweeeit said:
The contribution of Paul Boddie is valuable. I too examined DCOP
and even chose Konqueror as my browser, since it is a KDE application.
But DCOP doesn't go to such a low level. It is not possible
to send a simulated keystroke from one KDE application to another.

I imagine that you can send keystrokes using the xlib package described
earlier. Nevertheless, a "proper" automation interface doesn't work at
that level. Instead, you work with more high-level concepts than
sending keypresses and scanning around the window list to see what
happened.

One example of automation is the OutlookExplorer program I wrote [1]
which connects to Microsoft Outlook and exports messages, calendar
events, and so on. Instead of pretending to be a user clicking on
different things, reading things off the screen, and then navigating
around - something which would be very easy to get wrong - the program
connects to Outlook's automation interface via COM, selects
each folder in turn using the high-level interface provided, and
invokes various methods on the interface to export messages.

With a browser, one may use a similarly high-level interface: instead
of firing keypresses into the location bar and then firing a Return
keypress to tell the browser to load a page, you invoke a method in the
browser's automation interface - openURL in the mainwindow interface
for Konqueror, I believe. After that, things can be more difficult, but
even so, you should still have moderately high-level access to the
document being displayed, for example, even if it is via a DOM.
qwweeeit said:
Not being an expert I can't understand nor comment on the more
technical parts of your reply (out-of-process automation,
PyXML-style DOM etc.).

All I meant by "out-of-process" was whether you can just start a Python
program outside the browser (eg. in a normal console) which connects to
the browser in order to do its work. The PyXML-style DOM was a
reference to the way the HTML document is represented - if you're used
to XML processing in Java, JavaScript, Qt or even Python, you'll have
seen such a thing before.
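
To give a flavour of that DOM style without KDE installed, here is a small sketch using xml.dom.minidom from the standard library as a stand-in. It is not qtxmldom and does not talk to a browser; the DOCUMENT string is a made-up, well-formed fragment.

```python
from xml.dom.minidom import parseString

# Hypothetical well-formed document; minidom is an XML parser, so the
# markup must be valid XML (real-world HTML often is not).
DOCUMENT = """<html><body>
<form name="f">
  <input type="text" name="q" value=""/>
  <input type="submit" name="btnG" value="Google Search"/>
</form>
</body></html>"""

dom = parseString(DOCUMENT)

# Walk the tree with the usual DOM calls.
for element in dom.getElementsByTagName("input"):
    print(element.getAttribute("name"), element.getAttribute("type"))

# Changing the document goes through the same interface.
search_box = dom.getElementsByTagName("input")[0]
search_box.setAttribute("value", "twill")
print(search_box.toxml())
```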

Paul

[1] http://www.boddie.org.uk/python/COM.html
 
qwweeeit

Hi Mike,
thank you very much for your reply.
I know that mine could be considered
a "very" silly way to define automation,
but I'm neither a purist nor a professional
programmer.
Besides that, I know that, case by case,
every problem can be solved in a more
"right" way, even in very difficult environments
(framed, JavaScript-heavy pages) ... but not by me!

I must confess: before manually pressing a "Next" button
220 times and saving the data sent by the
web server (simply using cut and paste), I tried to use
shell programming, DCOP etc., but in the end
I fell back on the "by hand" method...

Perhaps if I were an expert like you, I could have
written a small script in a matter of minutes.

Only after a week did I find the solution (twill), but
I have discovered that this solution also obliges me
to consider every case and write the script
accordingly (not to mention the need to
disguise it as a browser).
On the other hand, the "cheating" method doesn't
guarantee 100% success, because web servers can
have a high degree of "cleverness".

If I wanted to cut out all automatic queries from my server,
I would also measure the time between them to see whether the stream
of queries ("apparently" coming from a browser) is
indeed compatible with a browser operated by a
human being...

For all these reasons, reluctantly, I am going back to Windows
and its macro languages.

Unfortunately in Windows there are many more security problems...

If you have a "general" solution for the X world...

While I was posting the reply to Mike I saw the latest
contribution from Paul Boddie.
From what he says I infer that he is a Windows programmer
(OutlookExplorer).
On Windows I have no problems (apart from security) writing
"dirty" automation scripts!
But I have a doubt... how does my Windows theory fit
with DCOP and KDE?
Bye.
 
Paul Boddie

qwweeeit said:
While I was posting the reply to Mike I saw the latest
contribution from Paul Boddie.
From what he says I infer that he is a Windows programmer

Far from it! OutlookExplorer was written as an experiment when I had to
use Windows in a corporate environment. I don't run Windows at all on
my own hardware, and I even assembled my own Linux-compatible machine
in order to (a) not pay the Windows tax and (b) get open enough
hardware for which Linux drivers exist.

Rant: instead of quietly grumbling and installing Linux over Windows on
newly purchased hardware, and then grumbling some more about incomplete
device support, I'd advise both individuals and businesses to tell
vendors where they can shove their bundled Windows licences and special
agreements with Microsoft. Installing Linux after the fact does little
to enlighten the average vendor - to them you're still running Windows,
loving it, and presumably willing to buy the whole package from them
again in future. Rant over!
qwweeeit said:
(OutlookExplorer).
On Windows I have no problems (apart from security) writing
"dirty" automation scripts!
But I have a doubt... how does my Windows theory fit
with DCOP and KDE?

DCOP is mostly equivalent to COM for automation purposes. Whilst there
are command line tools (dcop) and simple graphical tools (kdcop) for
accessing the interfaces exposed by applications, you might want to
consider PyKDE and its own DCOP support. What I've just added to
qtxmldom is support for DCOP-based access to DOM documents in
Konqueror, although the installation of the different components is
still a bit awkward: PyKDE needs a patch, for example.

Paul
 
Mike Meyer

qwweeeit said:
Hi Mike,
thank you very much for your reply.
I know that mine could be considered
a "very" silly way to define automation,
but I'm neither a purist nor a professional
programmer.

Yes, but you still need to communicate with other people. Using words
to mean something other than what those people expect them to mean is
a recipe for trouble.
qwweeeit said:
Besides that, I know that, case by case,
every problem can be solved in a more
"right" way, even in very difficult environments
(framed, JavaScript-heavy pages) ... but not by me!

The "right" way is what works for you. I'd call using a higher-level
approach the "easy" way - at least when compared to simulating GUI
events!
qwweeeit said:
I must confess: before manually pressing a "Next" button
220 times and saving the data sent by the
web server (simply using cut and paste), I tried to use
shell programming, DCOP etc., but in the end
I fell back on the "by hand" method...

Perhaps if I were an expert like you, I could have
written a small script in a matter of minutes.

I can't say without having looked at your example whether or not
that's possible. I can say that it would probably take more than a few
minutes, having done similar things myself.

Automating web stuff is very fragile in any case. Minor changes in web
formatting can break the automation. Someone already pointed out that
the web isn't well-designed for automation.
qwweeeit said:
Only after a week did I find the solution (twill), but
I have discovered that this solution also obliges me
to consider every case and write the script
accordingly (not to mention the need to
disguise it as a browser).
On the other hand, the "cheating" method doesn't
guarantee 100% success, because web servers can
have a high degree of "cleverness".

So can your twill script.
qwweeeit said:
If I wanted to cut out all automatic queries from my server,
I would also measure the time between them to see whether the stream
of queries ("apparently" coming from a browser) is
indeed compatible with a browser operated by a
human being...

So I'd add an automatic - and random - delay between each fetch. No
problem. Well, once I figured out what you were checking for, anyway.
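
A random delay between fetches is only a few lines in Python. This is a rough sketch, not anyone's actual script: fetch_all, its parameters, and the example.com URLs are all invented, and both the fetch function and the sleep function are passed in so that it can be tried without touching the network or actually waiting.

```python
import random
import time


def fetch_all(urls, fetch, min_delay=2.0, max_delay=8.0, sleep=time.sleep):
    """Call fetch(url) for each URL, pausing a random time between calls.

    fetch and sleep are passed in so the sketch can be tried without
    touching the network or actually waiting.
    """
    results = []
    for index, url in enumerate(urls):
        if index > 0:
            sleep(random.uniform(min_delay, max_delay))
        results.append(fetch(url))
    return results


# Dry run with stand-ins: record the delays instead of sleeping.
delays = []
pages = fetch_all(
    [f"http://www.example.com/page{n}" for n in range(1, 4)],
    fetch=lambda url: f"<html>{url}</html>",
    sleep=delays.append,
)
print(len(pages), len(delays))
```

In real use, fetch would be whatever retrieves a page and sleep would be left at its default.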
qwweeeit said:
For all these reasons, reluctantly, I am going back to Windows
and its macro languages.

Unfortunately in Windows there are many more security problems...

If you have a "general" solution for the X world...

Well, what you want is SMOP. The problem is, there are easier to use
solutions for almost every case you run into in the real world, so
there's little incentive for providing a "general" (i.e. - low-level,
as you can't force a high-level interface on apps) solution. Scripting
tools on Windows aren't generally as capable - or at least weren't
until relatively recently - so there's more incentive for a low-level
solution to be developed. You happen to be hitting one of the corner
cases where the high-level tools on Unix just aren't up to the job.

<mike
 
