Automating Internet Explorer

Discussion in 'Perl Misc' started by ~greg, Feb 24, 2007.

  1. ~greg


    I am trying to find a reliable leak-proof way to control Internet Explorer
    by way of buttons (or whatever) in a parallel Tk window.

    The script below is where it's at at the moment.
    (Please forgive the learner's-comments,
    I just clipped the script exactly as it currently is.)

    At the moment the Tk button is supposed to switch the IE document
    in and out of edit mode (via document.designMode = 'On' and 'Off'.)

    And it works!

    However, there seems to be a leak.
    And my question is, how to stop it?
    (I'd like to get this right, and not just seem to be right.)

    The printout is this:

    BEGIN Loop
    TEST 1:
    TEST 2:
    TEST 3:
    TEST 4:
    TEST 5:
    TEST 6:
    BEGIN OnQuit event
    END OnQuit event
    END Loop
    BEGIN Test for leaks:
    Object=Win32::OLE=HASH(0x1db18b0) Class=DispHTMLDocument
    Object=Win32::OLE=HASH(0x1d99bc0) Class=IWebBrowser2
    Object=Win32::OLE=HASH(0x225530) Class=IWebBrowser2
    END test for leaks.
    BEGIN MyOleQuit
    END MyOleQuit

    So the leak is that those 3 objects still exist
    when the END block is executed.
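
    A minimal sketch of why three objects can still be listed at END time, assuming the cause is simply that the file-scoped variables $IE, $IEWindow and $Document still hold references. Tracked is a hypothetical stand-in class, not part of Win32::OLE; it only shows that a Perl object is released as soon as the last reference to it is dropped.

```perl
use strict;
use warnings;

# Tracked is a hypothetical stand-in for a Win32::OLE object.
# It records which instances are alive so we can see when they are released.
package Tracked;

my %live;    # name => 1 for every instance not yet destroyed

sub new {
    my ($class, $name) = @_;
    $live{$name} = 1;
    return bless { name => $name }, $class;
}

sub DESTROY {
    my ($self) = @_;
    delete $live{ $self->{name} };
}

sub live { return sort keys %live }

package main;

# Three file-scoped variables, mirroring the script's $IE, $IEWindow, $Document.
my $IE       = Tracked->new('IE');
my $IEWindow = Tracked->new('IEWindow');
my $Document = Tracked->new('Document');

my @before = Tracked->live;    # all three are still referenced here

# Dropping the references lets Perl run DESTROY immediately.
undef $Document;
undef $IEWindow;
undef $IE;

my @after = Tracked->live;

printf "alive before: %d, alive after: %d\n", scalar @before, scalar @after;
```

    If the same reasoning holds for the real script, clearing the three variables before the END block runs (for example at the end of the 'OnQuit' case) might empty the list; that is a guess to test, not a confirmed fix.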

    I'm thinking that I'm supposed to call
    Win32::OLE->Uninitialize() and Win32::OLE->FreeUnusedLibraries(),
    but I don't know,
    and I don't know if FreeUnusedLibraries() has to be protected
    from the VB bug in the kludgy way mentioned in the OLE doc
    (as quoted below, at the end of the script)

    and I don't know where to call them, if they are the answer:
    -- in the 'OnQuit' event handler?
    -- in 'MyOleQuit()'?
    -- after the loop exit?
    -- in an END block?
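
    A hedged sketch of the "kludgy" protection the OLE doc describes: spin the message loop, sleep a little, then unload. free_ole_libraries() is a hypothetical helper name; the two Win32::OLE class methods it calls are the ones quoted from the documentation at the end of the script. The module is loaded at run time so the sketch still runs where Win32::OLE is not installed.

```perl
use strict;
use warnings;

# Hypothetical helper: pump pending OLE messages, wait, then unload libraries.
# Returns 1 if the OLE calls were made, 0 if Win32::OLE is not available.
sub free_ole_libraries {
    my ($wait_seconds) = @_;

    # Win32::OLE only exists on Windows; load it at run time so the sketch
    # still compiles elsewhere.
    return 0 unless $^O eq 'MSWin32' && eval { require Win32::OLE; 1 };

    Win32::OLE->SpinMessageLoop();        # let pending cleanup messages run
    sleep $wait_seconds;                  # the "few seconds" the doc suggests
    Win32::OLE->FreeUnusedLibraries();    # then unload unused libraries
    return 1;
}

my $done = free_ole_libraries(2);
print $done
    ? "libraries released\n"
    : "Win32::OLE not available, nothing to do\n";
```

    Going by the quoted doc, unloading only affects classes whose objects have all been destroyed, so a call like this would belong after the object references are released, and it mainly matters for long-running processes.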

    Currently the script exits when the X is clicked in the IE window,
    which sends the 'OnQuit' message to MyIEHandler(),
    which then stops OLE from listening to IE events,
    and destroys the Tk window.
    So I'm doing all that in the event handler,
    while the MyOleQuit() and loop drop-out do nothing.

    I hope I've made it at least a little bit clear what I'm asking!

    (Another question: every time I switch in and out
    of edit-mode, the page seems to get refreshed.
    Can that be stopped?)
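
    One thing that may be worth trying, as an assumption to verify rather than a known fix: toggling the body's contentEditable property instead of the document's designMode. toggle_body_editable() is a hypothetical helper; it uses only hash-style property access, so it can be exercised with a plain hash standing in for the IE document.

```perl
use strict;
use warnings;

# Hypothetical helper: flip the editable state of the document body.
# $doc may be a Win32::OLE document object or, for testing, a plain hash
# with the same shape; both are accessed with hash syntax.
sub toggle_body_editable {
    my ($doc) = @_;
    my $body = $doc->{body} or return;    # no body yet (page still loading)
    my $now = lc( $body->{contentEditable} // 'inherit' );
    $body->{contentEditable} = $now eq 'true' ? 'false' : 'true';
    return $body->{contentEditable};
}

# Stand-in document so the sketch runs without Internet Explorer.
my $mock_doc = { body => { contentEditable => 'inherit' } };

my $first  = toggle_body_editable($mock_doc);
my $second = toggle_body_editable($mock_doc);
print "$first then $second\n";
```

    In the real script the Tk button callback would pass $Document instead of the mock. Whether this avoids the refresh in a given IE version has to be checked by running it.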


    Note: I've seen something like this (i.e., simultaneous Tk and OLE loops)
    being done in
    D:\Us\Programming\Perl\Internet Explorer\Notes\Scripting iTunes with Perl - Part 2 at cyberrazor.htm
    by using both Tk's MainLoop and OLE's MessageLoop,
    like this:

    $tkWin->waitVariable(\$iTunes_OLE);
    MainLoop;
    Win32::OLE->MessageLoop();

    and I think that might have some advantages
    in terms of MainLoop and MessageLoop being well tweaked
    so as not to take up too much cpu time when you're doing
    other things. And I got it to work too, but it got
    too confusing for me, and the script always hung
    when I quit the IE.
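
    For comparison, a minimal sketch of the single hand-rolled loop, with a short pause so it does not spin the CPU. pump_until_stopped() is a hypothetical helper and the two pumps here are stand-ins; in the real script they would be sub { $TkWindow->update } and sub { Win32::OLE->SpinMessageLoop }.

```perl
use strict;
use warnings;
use Time::HiRes qw(sleep);

# Hypothetical helper: call every pump callback in turn, then pause briefly,
# until the flag referenced by $running_ref becomes false.
# Returns the number of passes made.
sub pump_until_stopped {
    my ($running_ref, $delay_seconds, @pumps) = @_;
    my $passes = 0;
    while ($$running_ref) {
        $_->() for @pumps;
        $passes++;
        sleep $delay_seconds if $$running_ref;    # yield the CPU between passes
    }
    return $passes;
}

# Stand-in pumps so the sketch runs anywhere.
my $running  = 1;
my $tk_calls = 0;
my $passes   = pump_until_stopped(
    \$running, 0.05,
    sub { $tk_calls++ },                       # pretend Tk pump
    sub { $running = 0 if $tk_calls >= 3 },    # pretend OnQuit arrives on pass 3
);
print "passes: $passes\n";
```

    Because the flag is checked right after the pumps run, the loop stops before the Tk pump is called again on a window that the quit handler has already destroyed.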

    In general, I don't know where to close things.
    (as this post is now an example! :)

    In particular, I don't know why $TkWindow->destroy();
    can't be called just anywhere at all, completely
    independent of the OLE automation, but it can't.
    The only place I can put it so that Tk doesn't stay hanging
    after I've closed IE is in the 'OnQuit' event-handler case.

    # ------------------------------------------------------------------------
    # the script: ...
    # ------------------------------------------------------------------------

    use strict;

    my $TkWindow;
    my $IEWindow;
    my $Document;
    my $Looping = 1;

    # ------------------------------------------------------------------------
    # IE

    use Win32::OLE qw(EVENTS in);
    #use Win32::OLE::Variant;

    my $IE = Win32::OLE->new("InternetExplorer.Application", \&MyOleQuit)
    || die "Could not start Internet Explorer.Application\n";

    $IE->{visible} = 1;


    Win32::OLE->WithEvents($IE, \&MyIEHandler, "DWebBrowserEvents2");

    sub MyIEHandler {
        my ($obj, $event, @args) = @_;
        #print " Event triggered: $event\n";
        if ($event eq "DocumentComplete") {
            # The first event argument is the browser object that finished loading.
            $IEWindow = shift @args;
            $Document = $IEWindow->{Document};
            #print "URL: " . $Document->URL . "\n";
        }
        elsif ($event eq 'OnQuit') {
            print "BEGIN OnQuit event\n";
            Win32::OLE->WithEvents($IE); # stop trying to get messages from IE
            $TkWindow->destroy(); # end Tk
            $Looping = 0;
            print "END OnQuit event\n";
        }
    }

    # ------------------------------------------------------------------------
    # Tk

    use Tk;

    $TkWindow = MainWindow->new;
    $TkWindow->title("Control Box");
    $TkWindow->Button(-text => 'TEST', -command => \&Test)->pack;

    my $TestNumber=0;
    sub Test {
        print 'TEST ', ++$TestNumber, ': ';
        #if ($Document->title) {
        #    print "The title is " . $Document->title;
        #}
        $Document->{designMode} =
            $Document->{designMode} eq 'On' ? 'Off' : 'On';
        # starts undef? In any case, not eq 'On'.
        print "\n";
    }

    # ------------------------------------------------------------------------
    # Loop
    print "BEGIN Loop\n";

    # "A delay of 50 milliseconds typically is fine"
    while ($Looping) {
        $TkWindow->update(); # process Tk messages
        Win32::OLE->SpinMessageLoop(); # process IE messages
    }

    # ------------------------------------------------------------------------
    # End

    print "END Loop\n";

    sub MyOleQuit {
        # This really is After everything!
        print "BEGIN MyOleQuit\n";
        print "END MyOleQuit\n";
    }

    END {
        print "BEGIN Test for leaks:\n";
        # List every Win32::OLE object that still exists at this point.
        Win32::OLE->EnumAllObjects(sub {
            my $object = shift;
            my $class = Win32::OLE->QueryObjectType($object);
            $class = '?' if ! defined $class;
            printf " Object=%s Class=%s\n", $object, $class;
        });
        print "END test for leaks.\n";
    }
    # "The EnumAllObjects() method is primarily a debugging tool.
    # It can be used e.g. in an END block to check if all
    # external connections have been properly destroyed."


    # "Win32::OLE->Uninitialize
    # The Uninitialize() class method
    # uninitializes the OLE subsystem.
    # It also destroys the hidden top level window
    # created by OLE for single threaded apartments.
    # All OLE objects will become invalid after this call!
    # It is possible to call the Initialize() class method again
    # with a different apartment model
    # after shutting down OLE with Uninitialize()."

    # "Win32::OLE->FreeUnusedLibraries
    # The FreeUnusedLibraries() class method
    # unloads all unused OLE resources.
    # These are the libraries of those classes of which
    # all existing objects have been destroyed.
    # The unloading of object libraries
    # is really only important for long running processes
    # that might instantiate a huge number of different objects
    # over time.
    # Be aware that objects implemented in Visual Basic
    # have a buggy implementation of this functionality:
    # They pretend to be unloadable
    # while they are actually still running their cleanup code.
    # Unloading the DLL at that moment
    # typically produces an access violation.
    # The probability for this problem can be reduced
    # by calling the SpinMessageLoop() method
    # and sleep()ing for a few seconds."

    # microsoft DHTML reference:
    ~greg, Feb 24, 2007

  2. Other than using Win32::OLE, you could look at Win32::IE::Mechanize,
    or using Selenium to generate Perl code that controls the browser.

    Mark Clements, Feb 24, 2007

  3. ~greg

    Mark Clements wrote:
    "Other than using Win32::OLE, you could look at ..."

    "Selenium" seems to be both more, and less, than what I want.

    Also, "Selenium uses JavaScript and Iframe" -- not just perl.

    (Also, Selenium appears to have something or other to do with "Agile teams",
    --which may, or may not, have something to do with the "Agile group",
    --which Leonard Cohen had a very nasty run-in with, about a year or so ago.
    And I am a great fan of Leonard Cohen. :)

    As for IE::Mechanize, my version of Win32::IE::Mechanize is 0.009,
    and the doc says:

    "This module tries to be a sort of drop-in replacement
    for the WWW::Mechanize manpage.
    It uses the Win32::OLE manpage to manipulate the Internet Explorer.
    Don't expect it to be like the mech in that the class
    is not derived from the user-agent class (like LWP).
    WARNING: This is a work in progress and my first priority
    will be to implement the WWW::Mechanize interface
    (which is still in full development). Where ever possible
    and needed I will also implement LWP::UserAgent methods
    that the mech inherits and will help make this thing useful."

    I have been learning Mechanize better and better, and I will
    be using it in this thing of mine. So, since IE::Mechanize
    isn't really Mechanize yet, it would just be one more
    unnecessary layer for me to have to learn.

    For what it's worth, my approach began
    by using Dave Roth's "Win32 Perl Programming",
    and Henry Wasserman's essay: "Automating Windows Applications with Win32::OLE", April 21, 2005,
    D:\Us\Programming\Perl\Internet Explorer\Notes\perl_com Automating Windows Applications with Win32OLE.htm

    Wasserman ends his essay by mentioning the further evolution
    of the idea in SAMIE, "Simple Automation Module For Internet Explorer".

    And SAMIE may be exactly the wheel I'm trying to re-invent.
    I don't know. But it's not available via ActiveState,
    and the download includes exes (--which, I got the impression
    from somewhere, aren't open-source, --which always bothers me.)

    In any case, I am close enough to what I want
    just by using Tk and Win32::OLE
    (--which I think all the other ways use anyway)
    that I'd prefer to stick with it.
    I just need to fix the leak.


    ~greg, Feb 24, 2007
  4. ~greg

    ~greg, Feb 24, 2007
  5. ~greg

    If anyone cares, I tried putting
    Win32::OLE->Uninitialize();
    in MyOleQuit()
    and (of course) got a "Deep recursion" error.

    Then I tried putting it in my END block,
    and then right after the loop exit.

    Each of those produced these errors:

    BEGIN Test for leaks:
    Win32::OLE(0.1707): GetOleObject() Not a Win32::OLE object at ...
    Object=Win32::OLE=HASH(0x1db18f0) Class=?
    Win32::OLE(0.1707): GetOleObject() Not a Win32::OLE object at ...
    Object=Win32::OLE=HASH(0x1d99c1c) Class=?
    Win32::OLE(0.1707): GetOleObject() Not a Win32::OLE object at ...
    Object=Win32::OLE=HASH(0x225530) Class=?
    END test for leaks

    Which is confusing, because why would
    Win32::OLE->EnumAllObjects()
    be enumerating objects that aren't Win32::OLE objects?


    Finally I read
    Win32::OLE::NEWS - What's new in Win32::OLE
    for the version (0.18) which I have.

    It says, in effect, that since version 0.1007,
    I should not have to worry about leaks.


    "more robust global destruction of Win32::OLE objects

    The final destruction of Win32::OLE objects
    has always been somewhat fragile. The reason for this
    is that Perl doesn't honour reference counts during
    global destruction but destroys objects in seemingly
    random order. This can lead to leaked database connections
    or unterminated external objects. The only solution
    was to make all objects lexical and hope that no object
    would be trapped in a closure. Alternatively all objects
    could be explicitly set to undef, which doesn't work
    very well with exception handling.

    With version 0.1007 of Win32::OLE this problem should be gone:
    The module keeps a list of active Win32::OLE objects.
    It uses an END block to destroy all objects at program
    termination before the Perl's global destruction starts.
    Objects still existing at program termination
    are now destroyed in reverse order of creation.
    The effect is similar to explicitly calling
    Win32::OLE->Uninitialize() just prior to termination."
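
    The idea in that passage, a list of live objects released newest-first, can be illustrated with a small stand-alone sketch. The registry below is hypothetical and only loosely modelled on the description above; it is not Win32::OLE's actual implementation.

```perl
use strict;
use warnings;

# Hypothetical registry, loosely modelled on the behaviour the NEWS text
# describes. It is not Win32::OLE's actual implementation.
my @registry;    # cleanup callbacks, in creation order
my @log;         # names, in the order they were released

sub register_resource {
    my ($name) = @_;
    push @registry, sub { push @log, $name };
    return $name;
}

sub release_all {
    # pop() takes the newest entry first, giving reverse creation order.
    while ( my $cleanup = pop @registry ) {
        $cleanup->();
    }
    return;
}

register_resource($_) for qw(IE IEWindow Document);
release_all();
print "released: @log\n";
```

    Per the quoted text, the module does the equivalent from its own END block, before Perl's global destruction starts; the sketch calls release_all() directly only to keep the example easy to follow.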

    So I guess that the only thing for me to do
    is to run the script a few thousand times
    and watch ram usage.

    If it goes up monotonically, then there's a problem.
    Otherwise not.
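
    A small sketch of that check, assuming the memory readings have already been collected somehow (by hand from Task Manager, say). grows_monotonically() is a hypothetical helper and the numbers are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: true if every sample is at least as large as the one
# before it and the last sample is larger than the first.
sub grows_monotonically {
    my @samples = @_;
    return 0 if @samples < 2;
    for my $i ( 1 .. $#samples ) {
        return 0 if $samples[$i] < $samples[ $i - 1 ];
    }
    return $samples[-1] > $samples[0] ? 1 : 0;
}

# Made-up memory readings in kilobytes, purely for illustration.
my @leaky  = ( 1000, 1040, 1080, 1120 );
my @steady = ( 1000, 1040, 1010, 1030 );

printf "leaky: %d, steady: %d\n",
    grows_monotonically(@leaky), grows_monotonically(@steady);
```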

    ~greg, Feb 25, 2007