Automating Internet Explorer

~greg

Hello,

I am trying to find a reliable leak-proof way to control Internet Explorer
by way of buttons (or whatever) in a parallel Tk window.

The script below is where it's at at the moment.
(Please forgive the learner's-comments,
I just clipped the script exactly as it currently is.)

At the moment the Tk button is supposed to switch the IE document
in and out of edit mode (via document.designMode = 'On' and 'Off'.)

And it works!

However, there seems to be a leak.
And my question is, how to stop it?
(I'd like to get this right, and not just seem to be right.)

The printout is this:

BEGIN Loop
TEST 1:
TEST 2:
TEST 3:
TEST 4:
TEST 5:
TEST 6:
BEGIN OnQuit event
END OnQuit event
END Loop
BEGIN Test for leaks:
Object=Win32::OLE=HASH(0x1db18b0) Class=DispHTMLDocument
Object=Win32::OLE=HASH(0x1d99bc0) Class=IWebBrowser2
Object=Win32::OLE=HASH(0x225530) Class=IWebBrowser2
END test for leaks.
BEGIN MyOleQuit
END MyOleQuit

So the leak is that those 3 objects still exist
when the END block is executed.
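
One plausible reading of that printout is that the three objects are simply the ones still held by the file-scoped variables $IE, $IEWindow and $Document, which Perl has not yet freed when an END block runs. The sketch below illustrates that with plain Perl only: FakeOleObject, live_names() and release_all() are hypothetical stand-ins invented for the illustration, not part of Win32::OLE.

```perl
use strict;
use warnings;

# FakeOleObject is a hypothetical stand-in for a Win32::OLE wrapper.
# It keeps a registry of live instances, loosely mimicking what
# Win32::OLE->EnumAllObjects() reports.
package FakeOleObject;

my %live;

sub new {
    my ($class, $name) = @_;
    $live{$name} = 1;
    return bless { name => $name }, $class;
}

sub DESTROY {
    my ($self) = @_;
    delete $live{ $self->{name} };
}

sub live_names {
    my @names = sort keys %live;
    return @names;
}

package main;

# File-scoped lexicals, like $IE, $IEWindow and $Document in the script.
my $IE       = FakeOleObject->new('IE');
my $IEWindow = FakeOleObject->new('IEWindow');
my $Document = FakeOleObject->new('Document');

# Drop the references explicitly, e.g. right after the main loop exits.
sub release_all {
    undef $Document;
    undef $IEWindow;
    undef $IE;
}

# release_all();   # uncomment to see the END report come back empty

END {
    my @names = FakeOleObject::live_names();
    print "Still alive in END: ", (@names ? "@names" : "(none)"), "\n";
}
```

If the same thing is happening in the real script, undef-ing $Document, $IEWindow and $IE after the loop drops out (and before the END block) might be enough to make the enumeration come back empty. That is an assumption to test, not a confirmed fix.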


I'm thinking that I'm supposed to call
Win32::OLE->Uninitialize();
and/or
Win32::OLE->FreeUnusedLibraries();

but I don't know,
and I don't know if FreeUnusedLibraries() has to be protected,
from the VB bug the kludgy way mentioned in the OLE doc
(--as quoted below after the __END__)

and I don't know where to call them, if they are the answer --
(-- in the 'OnQuit' event handler ?
-- in 'MyOnQuit()' ?
-- after the loop exit ?
-- in an END block ?
)

Currently the script exits when the X is clicked in the IE window,
which sends the 'OnQuit' message to MyIEHandler(),
which then stops OLE from listening to IE events,
and destroys the Tk window.
So I'm doing all that in the event handler,
while the MyOleQuit() and loop drop-out do nothing.


I hope I've made it at least a little bit clear what I'm asking!


(Another question: every time I switch in and out
of edit-mode, the page seems to get refreshed.
Can that be stopped?)

~greg.

(note: I've seen something like this (ie, simultaneous Tk and OLE loops)
being done
(here:
D:\Us\Programming\Perl\Internet Explorer\Notes\Scripting iTunes with Perl - Part 2 at cyberrazor.htm
)
by using both Tk's MainLoop and OLE's MessageLoop,
like this:
$tkWin->waitVariable(\$iTunes_OLE);
MainLoop;
Win32::OLE->MessageLoop();

and I think that might have some advantages
in terms of MainLoop and MessageLoop being well tweaked
so as not to take up too much cpu time when you're doing
other things. And I got it to work too, but it got
too confusing for me, and the script always hung
when I quit IE.)

In general, I don't know where to close things.
(as this post is now an example! :)

In particular, I don't know why $TkWindow->destroy();
can't be called just anywhere at all, completely
independent of the OLE automation, but it can't.
The only place I can put it so that Tk doesn't stay hanging
after I've closed IE is in the 'OnQuit' event-handler case.



# ------------------------------------------------------------------------
# the script: ...
# ------------------------------------------------------------------------

use strict;
$|=1;

my $TkWindow;
my $IEWindow;
my $Document;
my $Looping = 1;

# ------------------------------------------------------------------------
# IE

use Win32::OLE qw(EVENTS in);
#use Win32::OLE::Variant;

my $IE = Win32::OLE->new("InternetExplorer.Application", \&MyOleQuit)
|| die "Could not start Internet Explorer.Application\n";

$IE->{visible} = 1;

$IE->Navigate("http://www.google.com");

Win32::OLE->WithEvents($IE, \&MyIEHandler, "DWebBrowserEvents2");

sub MyIEHandler
{
my ($obj,$event,@args) = @_;
#print " Event triggered: $event\n";
if ($event eq "DocumentComplete")
{
$IEWindow = shift @args;
$Document = $IEWindow->{Document};
#print "URL: " . $Document->URL . "\n";
}
elsif($event eq 'OnQuit')
{
print "BEGIN OnQuit event\n";
Win32::OLE->WithEvents($IE); # stop trying to get messages from IE
$TkWindow->destroy(); # end Tk
$Looping = 0;
print "END OnQuit event\n";
}
}

# ------------------------------------------------------------------------
# Tk

use Tk;

$TkWindow = MainWindow->new;
$TkWindow->title("Control Box");
$TkWindow->Button(-text => 'TEST', -command => \&Test)->pack;

my $TestNumber=0;
sub Test
{
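print 'TEST ', ++$TestNumber, ': ';
# $Document is only set once a DocumentComplete event has arrived
unless ($Document) { print "(no document loaded yet)\n"; return; }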
#if ($Document->title)
#{
# print "The title is " . $Document->title;
#}
$Document->{designMode} =
$Document->{designMode} eq 'On' ? 'Off' : 'On';
# starts undef? In any case, not eq 'On'.
print "\n";
}

# ------------------------------------------------------------------------
# Loop
print "BEGIN Loop\n";

while($Looping)
{
# "A delay of 50 milliseconds typically is fine"
Win32::Sleep(50);
$TkWindow->update(); # process Tk messages
Win32::Sleep(50);
Win32::OLE->SpinMessageLoop(); # process IE messages
}

# ------------------------------------------------------------------------
# End

print "END Loop\n";

sub MyOleQuit
{
# This really is After everything!
print "BEGIN MyOleQuit\n";
print "END MyOleQuit\n";
}

END
{
print "BEGIN Test for leaks:\n";
Win32::OLE->EnumAllObjects
(
sub
{
my $object = shift;
my $class = Win32::OLE->QueryObjectType($object);
$class = '?' if ! defined $class;
printf " Object=%s Class=%s\n", $object, $class;
}
);
print "END test for leaks.\n";
# "The EnumAllObjects() method is primarily a debugging tool.
# It can be used e.g. in an END block to check if all
# external connections have been properly destroyed."
}


__END__

# "Win32::OLE->Uninitialize
#
# The Uninitialize() class method
# uninitializes the OLE subsystem.
# It also destroys the hidden top level window
# created by OLE for single threaded apartments.
# All OLE objects will become invalid after this call!
# It is possible to call the Initialize() class method again
# with a different apartment model
# after shutting down OLE with Uninitialize()."

# "Win32::OLE->FreeUnusedLibraries
#
# The FreeUnusedLibraries() class method
# unloads all unused OLE resources.
# These are the libraries of those classes of which
# all existing objects have been destroyed.
# The unloading of object libraries
# is really only important for long running processes
# that might instantiate a huge number of different objects
# over time.
# Be aware that objects implemented in Visual Basic
# have a buggy implementation of this functionality:
# They pretend to be unloadable
# while they are actually still running their cleanup code.
# Unloading the DLL at that moment
# typically produces an access violation.
# The probability for this problem can be reduced
# by calling the SpinMessageLoop() method
# and sleep()ing for a few seconds."



# microsoft DHTML reference:
# http://msdn.microsoft.com/library/d...hor/dhtml/reference/dhtml_reference_entry.asp
 
Mark Clements

~greg said:
Hello,

I am trying to find a reliable leak-proof way to control Internet Explorer
by way of buttons (or whatever) in a parallel Tk window.

The script below is where it's at at the moment.
(Please forgive the learner's-comments,
I just clipped the script exactly as it currently is.)

Other than using Win32::OLE, you could look at

Win32::IE::Mechanize

or using Selenium to generate perl code that controls the browser.

Mark
 
~greg

Mark Clements said:
Other than using Win32::OLE, you could look at
Win32::IE::Mechanize

or using Selenium to generate perl code that controls the browser.

Mark



Thanks,

"Selenium" seems to be both more, and less, than what I want.

Also, "Selenium uses JavaScript and Iframe" -- not just perl.

(Also, Selenium appears to have something or other to do with "Agile teams",
--which may, or may not, have something to do with The "Agile group",
--which Leonard Cohen had a very nasty run-in with, about a year or so ago.
And I am a great fan of Leonard Cohen. :)

As for IE::Mechanize, my version of Win32::IE::Mechanize is 0.009,
and the doc says:

"This module tries to be a sort of drop-in replacement
for the WWW::Mechanize manpage.
It uses the Win32::OLE manpage to manipulate the Internet Explorer.
Don't expect it to be like the mech in that the class
is not derived from the user-agent class (like LWP).
WARNING: This is a work in progress and my first priority
will be to implement the WWW::Mechanize interface
(which is still in full development). Where ever possible
and needed I will also implement LWP::UserAgent methods
that the mech inherits and will help make this thing useful."


I have been learning Mechanize better and better, and I will
be using it in this thing of mine. So, since IE::Mechanize
isn't really Mechanize yet, it would just be one more
unnecessary layer for me to have to learn.

~~
For what it's worth, my approach began
by using Dave Roth's "Win32 Perl Programming",
and Henry Wasserman's essay: "Automating Windows Applications with Win32::OLE", April 21, 2005,
at:
http://www.perl.com/pub/a/2005/04/21/win32ole.html

Wasserman ends his essay by mentioning the further evolution
of the idea in SAMIE, "Simple Automation Module For Internet Explorer",
here: http://samie.sourceforge.net/

And SAMIE may be exactly the wheel I'm trying to re-invent.
I don't know. But it's not available via ActiveState,
and the download includes exes (--which, I got the impression
from somewhere, aren't open-source, --which always bothers me.)

In any case, I am close enough to what I want
just by using Tk and Win32::OLE
(--which I think all the other ways use anyway)
that I'd prefer to stick with it.
I just need to fix the leak.

~greg

http://www.perl.com/pub/a/2005/04/21/win32ole.html
 
~greg

If anyone cares, I tried putting
Win32::OLE->Uninitialize();
in MyOleQuit()
and (of course) got a "Deep recursion" error.

Then I tried putting it in my END block.
And then right after the loop exit.

Each attempt produced these errors:

BEGIN Test for leaks:
Win32::OLE(0.1707): GetOleObject() Not a Win32::OLE object at ...
Object=Win32::OLE=HASH(0x1db18f0) Class=?
Win32::OLE(0.1707): GetOleObject() Not a Win32::OLE object at ...
Object=Win32::OLE=HASH(0x1d99c1c) Class=?
Win32::OLE(0.1707): GetOleObject() Not a Win32::OLE object at ...
Object=Win32::OLE=HASH(0x225530) Class=?
END test for leaks


Which is confusing because
why would
Win32::OLE->EnumAllObjects()
be enumerating objects
that aren't
Win32::OLE objects?

~~

Finally I read
Win32::OLE::NEWS - What's new in Win32::OLE
for the version (0.18) which I have.

It says, in effect, that since version 0.1007,
I should not have to worry about leaks.

Quote:

"more robust global destruction of Win32::OLE objects

The final destruction of Win32::OLE objects
has always been somewhat fragile. The reason for this
is that Perl doesn't honour reference counts during
global destruction but destroys objects in seemingly
random order. This can lead to leaked database connections
or unterminated external objects. The only solution
was to make all objects lexical and hope that no object
would be trapped in a closure. Alternatively all objects
could be explicitly set to undef, which doesn't work
very well with exception handling.

With version 0.1007 of Win32::OLE this problem should be gone:
The module keeps a list of active Win32::OLE objects.
It uses an END block to destroy all objects at program
termination before the Perl's global destruction starts.
Objects still existing at program termination
are now destroyed in reverse order of creation.
The effect is similar to explicitly calling
Win32::OLE->Uninitialize() just prior to termination."
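
That would also fit the printout from my own END block: Perl runs END blocks in reverse order of definition, so an END block compiled when a module is loaded with "use" runs after one written later in the script. Here is a minimal sketch of that ordering in plain Perl. The two END blocks and the end_block_order() helper are made up for the illustration; the child script is written to a temp file and run in a separate perl only so that its END blocks actually fire and the order can be captured.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Child script: the first END stands in for a library's cleanup block
# (compiled early, at "use" time); the second for the script's own END.
my $child_source = <<'CHILD';
END { print "library END: destroy remaining objects\n" }
END { print "script END: enumerate objects\n" }
print "main program finished\n";
CHILD

# Run the child in a separate perl and return the lines it printed,
# in order. (List-form pipe open needs a reasonably recent perl on
# Windows.)
sub end_block_order {
    my ($fh, $path) = tempfile(SUFFIX => '.pl', UNLINK => 1);
    print {$fh} $child_source;
    close $fh or die "close failed: $!";
    open my $pipe, '-|', $^X, $path or die "cannot run $^X: $!";
    chomp(my @lines = <$pipe>);
    close $pipe;
    return @lines;
}

print "$_\n" for end_block_order();
```

The script's own END line comes out before the library's one, so an enumeration done in the script's END block would still see objects that the module's END block has not yet destroyed.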


So I guess that the only thing for me to do
is to run the script a few thousand times
and watch ram usage.

If it goes up monotonically, then there's a problem.
Otherwise not.
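
The "goes up monotonically" check could be sketched like this. How the readings are collected is left open, and the numbers below are invented purely to exercise the function; grows_monotonically() is a hypothetical helper name.

```perl
use strict;
use warnings;

# Returns true if every sample is strictly larger than the one before it.
sub grows_monotonically {
    my @samples = @_;
    return 0 if @samples < 2;
    for my $i (1 .. $#samples) {
        return 0 if $samples[$i] <= $samples[$i - 1];
    }
    return 1;
}

# Made-up readings (in KB), one taken after each run of the script.
my @steady  = (41_200, 41_950, 41_100, 41_600, 41_300);
my @leaking = (41_200, 43_900, 46_500, 49_800, 52_400);

printf "steady:  %s\n",
    grows_monotonically(@steady)  ? 'keeps growing' : 'no steady growth';
printf "leaking: %s\n",
    grows_monotonically(@leaking) ? 'keeps growing' : 'no steady growth';
```

A strict test like this is easily fooled by a single noisy reading, so comparing the first and last few samples may be a more forgiving variant.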


~greg
 
