DocumentHTML ?

Discussion in 'Perl Misc' started by ~greg, Feb 26, 2007.

  1. ~greg

    ~greg Guest

    I am trying to get an InternetExplorer.Application to print out
    the whole HTML document as text,
    from the <HTML> (or before) to the </HTML>.
    (-so as to feed it to a TreeBuilder parse).


    print $Document->Body->innerHTML works,
    but returns only the body's innerHTML.

    print $Document->Body->outterHTML,
    and print $Document->DocumentHTML,
    don't work.

    The error is:
    Win32::OLE(0.1707) error 0x80020003: "Member not found"
    in METHOD/PROPERTYGET "" at ...


    Any hints, please?

    ~greg


    use strict;
    $|=1;
    my $IEWindow;
    my $Document;
    my $Looping = 1;
    use Win32::OLE qw(EVENTS in);
    my $IE = Win32::OLE->new("InternetExplorer.Application")
        || die "Could not start Internet Explorer.Application\n";
    Win32::OLE->WithEvents($IE,\&MyIEHandler,"DWebBrowserEvents2");
    sub MyIEHandler
    {
        my ($obj,$event,@args) = @_;
        if ($event eq "DocumentComplete")
        {
            $IEWindow = shift @args;
            $Document = $IEWindow->{Document};
            #print $Document->DocumentHTML; # doesn't work
            #print $Document->Body->outterHTML; # doesn't work
            print $Document->Body->innerHTML; # works
        }
        elsif($event eq 'OnQuit')
        {
            Win32::OLE->WithEvents($IE);
            $Looping = 0;
        }
    }

    $IE->{visible} = 1;
    $IE->Navigate("http://www.google.com");

    while($Looping)
    {
        Win32::Sleep(40);
        Win32::OLE->SpinMessageLoop();
    }
    ~greg, Feb 26, 2007
    #1

  2. "~greg" wrote:

    > I am trying to get an InternetExplorer.Application to print out
    > the whole HTML document as text,
    > from the <HTML> (or before) to the </HTML>.
    > (-so as to feed it to a TreeBuilder parse).
    >
    >
    > print $Document->Body->innerHTML works,
    > but returns only the body's innerHTML.
    >
    > print $Document->Body->outterHTML,
    > and print $Document->DocumentHTML,
    > don't work.
    >
    > The error is:
    > Win32::OLE(0.1707) error 0x80020003: "Member not found"
    > in METHOD/PROPERTYGET "" at ...
    >
    >
    > Any hints, please?


    Well, the first one would be to use

    http://search.cpan.org/~abeltje/Win32-IE-Mechanize-0.009_17/

    I have successfully used that module to do some really complicated
    automated downloading of about 10 GB of HTML from various web sites
    (sorry can't be more specific).

    Note the comment at

    http://search.cpan.org/~abeltje/Win32-IE-Mechanize-0.009_17/lib/Win32/IE/Mechanize.pm#%24ie-%3Econtent

    > use strict;


    use warnings; # do not leave it out.

    #!/usr/bin/perl

    use strict;
    use warnings;


    $|=1;
    my $IEWindow;
    my $Document;
    my $Looping = 1;

    use Win32::OLE qw(EVENTS in);

    my $IE = Win32::OLE->new("InternetExplorer.Application")
    or die "Could not start Internet Explorer.Application\n";

    Win32::OLE->WithEvents($IE, \&MyIEHandler, "DWebBrowserEvents2");

    sub MyIEHandler {
        my ($obj, $event, @args) = @_;

        if ($event eq "DocumentComplete") {
            my $IEWindow = shift @args;
            print $IEWindow->Document->documentElement->{outerHTML};
        }
        elsif($event eq 'OnQuit') {
            Win32::OLE->WithEvents($IE);
            $Looping = 0;
        }
    }

    $IE->{visible} = 1;
    $IE->Navigate("http://www.google.com");

    while ($Looping) {
        Win32::Sleep(40);
        Win32::OLE->SpinMessageLoop();
    }

    __END__

    Sinan
    A. Sinan Unur, Feb 27, 2007
    #2

  3. ~greg

    ~greg Guest

    "A. Sinan Unur" > wrote ...
    > "~greg" > wrote ...
    >> ...
    >> Any hints, please?

    >
    > Well, the first one would be to use
    >
    > http://search.cpan.org/~abeltje/Win32-IE-Mechanize-0.009_17/
    >
    > I have successfully used that module to do some really complicated
    > automated downloading of about 10 GB of HTML from various web sites
    > (sorry can't be more specific).
    >
    > Note the comment at
    >
    > http://search.cpan.org/~abeltje/Win32-IE-Mechanize-0.009_17/lib/Win32/IE/Mechanize.pm#%24ie-%3Econtent
    >
    >> use strict;

    >
    > use warnings; # do not leave it out.





    Thanks.

    I do use Mechanize, and TreeBuilder, together, quite a bit.

    But what I am really trying to do is to add value to my regular browser
    (i.e, IE), --without having to write COM plug-ins
    (or whatever they're called these days.)

    ~~

    I don't know what you mean by "the comment" at the link
    to cpan's Win32::IE::Mechanize,

    but the DESCRIPTION of its current state is not at all encouraging
    (---"Don't expect it to be like the mech in that the class is not derived
    from the user-agent class (like LWP). WARNING: This is a work in progress ... ")

    and the CAVEATS (---"...This means that you may need
    to set your security settings to a low and possibly unsafe level. ...")

    sounds downright dire to me.

    (Part of what I mean by adding value to IE is ADDING security, not subtracting it!)

    ~~~

    But of course I use warnings!

    You didn't see it in my snippet because I always run scripts
    from within a text editor that has it on the command line:
    perl.exe -w -Mstrict ...

    But I do want to thank you because you made me look
    at the setup again, and it turns out that I'd had it as:
    perl.exe -w mstrict ...

    - with small 'm' instead of capital 'M',
    -- which is why I had to still use "use strict;"
    in all my scripts!

    And now I don't have to look at either one of them - "use strict;" or "use warnings;"
    ever again! :)


    (Next I've got to figure out how to hide "$|=1;" )


    ~greg
    ~greg, Feb 27, 2007
    #3
  4. Joe Smith

    Joe Smith Guest

    ~greg wrote:

    > But of course I use warnings!
    >
    > You didn't see it in my snippet because I always run scripts
    > from within a text editor that has it on the command line:
    > perl.exe -w -Mstrict ...


    Do you expect that you will ever pass the responsibility of running those
    scripts to someone else?
    Joe Smith, Feb 27, 2007
    #4
  5. ~greg

    ~greg Guest

    Sinan, and Joe,
    I am very sorry about this. And I beg your forgiveness.
    And I most profusely apologize to you both.


    A couple of days ago I had posted here asking about
    "Automating Internet Explorer".

    My question was about OLE,
    And I got one response that simply ignored my question
    and told me to use Mechanize instead.

    Gentlemen, I am old.
    It seemed to me the same thing was happening again here.

    When I saw that Mr Unur was telling me to use
    Win32-IE-Mechanize instead, - I started dimming out.

    When next I saw him telling me to "use warnings",
    I started blanking out.

    And when, finally, I saw that he'd added "use warnings" to my original code
    -- but ---it seemed to me, when I skimmed over it, with these old eyes,
    --- nothing else ..!..

    Gentlemen, I'm old.

    I glanced over the rest of the script and didn't notice
    that any other changes had been made than the addition
    of "use warnings".

    I thought Mr Unur had "middle-posted" and left,
    after telling me to use warnings.

    ~~
    Also,

    I am not a professional.
    I write maybe 3 or 4 small scripts a day, purely for my own needs.

    It would never have occurred to me that explicit "use strict"
    and "use warnings" is a courtesy to others.

    I will do it from now on.

    ~~~

    Again, I am very sorry about all of this.

    I do understand, now, finally, how you reacted to me.
    Very much the same way that I had mistakenly initially reacted to you.
    But you didn't know that. You didn't know how or why I had reacted
    the way I did. And I didn't know that you didn't know.
    So your reaction to me came as a complete surprise to me.

    I think perhaps such total mis-communication
    has occurred before on usenet. If so it would explain a lot.

    Anyway, thank you both again.

    $IEWindow->Document->documentElement->{outerHTML};
    is indeed exactly what I was asking for. It works.
    (How ever did you find it?)

    And I will remember to include the explicit "use strict" and "use warnings"
    from now on.


    Thank you!


    Greg.














    "A. Sinan Unur" <> wrote in message news:Xns98E46D1FA9AFFasu1cornelledu@127.0.0.1...
    > "~greg" wrote:
    >
    >>
    >> "A. Sinan Unur" > wrote ...
    >>> "~greg" > wrote ...
    >>>> ...
    >>>> Any hints, please?
    >>>
    >>> Well, the first one would be to use
    >>>
    >>> http://search.cpan.org/~abeltje/Win32-IE-Mechanize-0.009_17/
    >>>
    >>> I have successfully used that module to do some really complicated
    >>> automated downloading of about 10 GB of HTML from various web sites
    >>> (sorry can't be more specific).
    >>>
    >>> Note the comment at
    >>>
    >>> http://search.cpan.org/~abeltje/Win32-IE-Mechanize-0.009_17/lib/Win32/IE/Mechanize.pm#%24ie-%3Econtent
    >>>
    >>>> use strict;
    >>>
    >>> use warnings; # do not leave it out.

    >>

    >
    > Repeat:
    >
    > Do not leave
    >
    > use strict;
    > use warnings;
    >
    > out in your source code (whatever peculiar development environment you
    > might have).
    >
    >> But what I am really trying to do is to add value to my regular
    >> browser (i.e, IE), --without having to write COM plug-ins
    >> (or whatever they're called these days.)
    >>
    >> I don't know what you mean by "the comment" at the link
    >> to cpan's Win32::IE::Mechanize,

    >
    > Well, if you had followed the link, you would have seen:
    >
    > $ie->content
    >
    > Fetch the outerHTML from the $ie->Document->documentElement.
    >
    > I have found no way to get to the exact contents of the document. This
    > is basically the interpretation of IE of what the HTML looks like and
    > beware all tags are upcased :(
    >
    >> but the DESCRIPTION of its current state is not at all encouraging
    >> (---"Don't expect it to be like the mech in that the class is not
    >> derived from the user-agent class (like LWP). WARNING: This is a work
    >> in progress ... ")
    >>
    >> and the CAVEATS (---"...This means that you may need
    >> to set your security settings to a low and possibly unsafe level.
    >> ...")
    >>
    >> sounds downright dire to me.

    >
    > Note the *may*. I have never needed to tinker with any security settings
    > and I have used the module for quite complicated tasks where the sites
    > were so dependent on IE that no other solution would have worked.
    >
    > You are free not to take advice and try to re-invent the wheel. I am not
    > likely to waste my time helping you do that.
    >
    >> (Part of what I mean by adding value to IE is ADDING security, not
    >> subtracting it!)

    >
    > One can choose to use CPAN modules and contribute improvements as one
    > comes up with them. IMHO, that is both more productive and more useful
    > to everyone.
    >
    >> But of course I use warnings!

    >
    > I can only see what you chose to show.
    >
    > <SNIP>
    >
    > All this stuff about your configuration and not one comment about
    > whether the solution I posted worked for you or not (a solution which I
    > copied straight from Win32::IE::Mechanize). I will now bid you farewell.
    >
    > Sinan
    > --
    > A. Sinan Unur <>
    > (remove .invalid and reverse each component for email address)
    >
    > comp.lang.perl.misc guidelines on the WWW:
    > http://augustmail.com/~tadmc/clpmisc/clpmisc_guidelines.html
    >
    ~greg, Feb 27, 2007
    #5
  6. ~greg

    ~greg Guest

    Ah!

    Sinan,

    It turns out that I was not entirely to blame
    for our mis-communication!

    But neither were you!

    It's all due to an HTML mistake
    on the Win32::IE::Mechanize document page!

    Starting with NAME as line 1, the 21st and 22nd lines are:
    CONTENT-HANDLING METHODS
    $ie->content

    which I know now is what you meant.

    They are both coded as links to internal anchors,
    and the first one
    http://search.cpan.org/~abeltje/Win...in32/IE/Mechanize.pm#CONTENT-HANDLING_METHODS

    works. That is to say, it links to an existing internal anchor.

    But the second one -- the one you gave me --
    http://search.cpan.org/~abeltje/Win32-IE-Mechanize-0.009_17/lib/Win32/IE/Mechanize.pm#%24ie-%3Econtent

    is broken!
    You can search the source code for "24ie-%3Econtent"
    and you don't find it anywhere else!

    The effect was that when I clicked on the link you gave me,
    it just opened the whole document at the top!
    So I had no idea what "comment" you were specifically
    trying to direct me to. And I couldn't possibly guess
    what you meant.

    Neither your fault nor mine!


    ~greg.
    ~greg, Feb 27, 2007
    #6
  7. "~greg" wrote:

    > But the second one -- the one you gave me --
    > http://search.cpan.org/~abeltje/Win32-IE-Mechanize-0.009_17/lib/Win32/IE/Mechanize.pm#%24ie-%3Econtent
    >
    > is broken!
    > You can search the source code for "24ie-%3Econtent"
    > and you don't find it anywhere else!


    Well, more than likely, it is your browser or newsreader that is broken.
    %24 is the URL encoded version of $ and %3E is the URL encoded version
    of >. So, the link above is the same as:

    http://search.cpan.org/~abeltje/Win32-IE-Mechanize-0.009_17/lib/Win32/IE/Mechanize.pm#$ie->content

    but with the unsafe characters replaced with safe URL encodings.
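
    The decoding described above can be sketched in a few lines of core Perl.
    This is only an illustration: the helper name percent_decode is made up
    here and is not part of any module (for real work, the URI::Escape module
    on CPAN provides uri_unescape).

```perl
use strict;
use warnings;

# Hypothetical helper, written only for this illustration:
# turn each %XX escape back into the character with that hex code.
sub percent_decode {
    my ($text) = @_;
    $text =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $text;
}

# Single quotes keep Perl from interpolating the '$' after decoding.
my $fragment = '%24ie-%3Econtent';
print percent_decode($fragment), "\n";   # prints: $ie->content
```

    Here %24 decodes to '$' and %3E decodes to '>', so the fragment reads
    $ie->content once decoded.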

    > Neither your fault nor mine!


    True.

    Sinan
    A. Sinan Unur, Feb 27, 2007
    #7
  8. "~greg" wrote:

    > When I saw that Mr Unur was telling me to use
    > Win32-IE-Mechanize instead, - I started dimming out.
    >
    > When next I saw him telling me to "use warnings",
    > I started blanking out.
    >
    > And when, finally, I saw that he'd added "use warnings" to my
    > original code -- but ---it seemed to me, when I skimmed over it, with
    > these old eyes, --- nothing else ..!..
    >
    > Gentlemen, I'm old.
    >
    > I glanced over the rest of the script and didn't notice
    > that any other changes had been made than the addition
    > of "use warnings".
    >
    > I thought Mr Unur had "middle-posted" and left,
    > after telling me to use warnings.


    As I wrote that message, I replaced your script with the copy-pasted
    version of my script but gave no indication of the fact that I had changed
    a single line in it.

    That's my fault.

    > $IEWindow->Document->documentElement->{outerHTML};
    > is indeed exactly what I was asking for. It works.
    > (How ever did you find it?)


    I wondered how Win32::IE::Mechanize did it and looked at its source code.

    Microsoft has documentation on InternetExplorer.Application at msdn.microsoft.com.
    The whole thing is really messy and I am not sure if you will be able to do
    better than Win32::IE::Mechanize (and if you are, I am sure we would all
    appreciate your contributions).

    On the other hand, that module helped me achieve a lot recently despite all
    the warnings on CPAN. I am more worried about modules that don't tell me
    their shortcomings up front.

    Sinan
    A. Sinan Unur, Feb 27, 2007
    #8
  9. ~greg

    ~greg Guest


    > Well, more than likely, it is your browser or newsreader that is broken.


    Microsoft Outlook Express - broken ?? --Impossible!
    ;)

    But seriously, ...
    your:

    > http://search.cpan.org/~abeltje/Win32-IE-Mechanize-0.009_17/lib/Win32/IE/Mechanize.pm#$ie->content


    does work.

    And I think I understand what's going on now.
    ~~

    When we used the context menu to "Copy Shortcut",
    we both got the URL Encoding: #%24ie-%3Econtent
    of both the '$' and the '>'.

    However, in the HTML source,
    the link and the anchor that it's supposed to point to
    are, instead:

    <li class='indexItem indexItem2'><a href='#%24ie-%3Econtent'>$ie-&gt;content</a>

    and

    <h2><a class='u' href='#___top' title='click to go to top of document'
    name="$ie-&gt;content"
    >$ie-&gt;content</a></h2>


    respectively.

    That is to say, in the href of the link
    both the '$' and '>' are URL-encoded .

    However, both in the display of the link,
    and in the name of the target anchor,
    the '$' is not encoded at all,
    and the '>' is encoded rather as an HTML entity!

    '$' is safe (I believe) and so needn't be encoded at all.

    '>', on the other hand, has to be encoded,
    in the HTML of course, and I think also as a URL.
    And either the URL-encoding or the HTML-entity encoding would be fine.

    It's the using of the one way in the href of the link,
    and the other way in the name of the target of the link,
    that breaks the connection!
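
    The two encodings described above can be compared with a short sketch in
    core Perl. Both helpers, percent_decode and entity_decode, are hypothetical
    names invented for this illustration, and the entity table covers only the
    handful of entities needed here. The sketch shows only that the two
    spellings differ as raw text and agree once each is decoded by its own
    rule; whether a particular browser does both decodings before matching a
    fragment to an anchor is a separate question that it does not settle.

```perl
use strict;
use warnings;

# Hypothetical helpers, written only for this illustration.

# Undo URL (percent) encoding: %XX becomes the character with hex code XX.
sub percent_decode {
    my ($text) = @_;
    $text =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $text;
}

# Undo a few named HTML entities (deliberately incomplete table).
sub entity_decode {
    my ($text) = @_;
    my %entity = (gt => '>', lt => '<', amp => '&', quot => '"');
    $text =~ s/&(gt|lt|amp|quot);/$entity{$1}/g;
    return $text;
}

my $href_fragment = '%24ie-%3Econtent';   # fragment as written in the href
my $anchor_name   = '$ie-&gt;content';    # name as written in the page source

my $raw_match = $href_fragment eq $anchor_name ? 'yes' : 'no';
my $decoded_match =
    percent_decode($href_fragment) eq entity_decode($anchor_name) ? 'yes' : 'no';

print "raw strings match:     $raw_match\n";
print "decoded strings match: $decoded_match\n";
```

    Compared as raw text the two strings do not match; after decoding, both
    read $ie->content.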


    >> Neither your fault nor mine!

    >
    > True.
    >


    Ditto.


    ~greg
    ~greg, Feb 27, 2007
    #9
  10. ~greg wrote:


    > Gentlemen, I'm old.



    Me too. Bummer eh?


    > It would never have occurred to me that explicit "use strict"
    > and "use warnings" is a courtesy to others.



    Please see the Posting Guidelines that are posted here frequently.


    > I will do it from now on.



    Thank you.



    [ snip TOFU.
    Please don't do that either.
    ]

    --
    Tad McClellan SGML consulting
    Perl programming
    Fort Worth, Texas
    Tad McClellan, Feb 28, 2007
    #10
  11. ~greg

    ~greg Guest


    > [ snip TOFU.
    > Please don't do that either.
    > ]
    >
    > --
    > Tad McClellan SGML consulting
    > Perl programming
    > Fort Worth, Texas





    please don't spam.
    ~greg, Feb 28, 2007
    #11
  12. ~greg

    ~greg Guest

    changed to OT Re: DocumentHTML ?

    Sinan wrote
    > I don't know what to make of you.




    Sinan,
    I wrote the following post just after I read your post.

    And if you read it you'll see why I didn't post it then.
    And why I decided not to post it at all.

    I often write this way.
    And almost never post this way.
    And I'm sure I'll never do it again here.

    But I just kept thinking about you saying
    you don't know what to make of me!

    And that started making me feel very lonely.

    And that's why I've decided to post this after all.
    Just so you'll know what to make of me.

    In this one instance anyway.

    But please understand me --the temperament that
    appears here is not anything that ever lasts in me
    for more than a minute.

    It just so happened that around the time this thread occurred,
    I happened to be searching for something in Google,
    and it seemed to me that every single promising looking
    hit I followed --- wound up in an empty red-herring-like
    post - just like Tad's is in this thread.

    And that's the whole point I want to make.

    All the rest here is "colored bubbles".

    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


    >>> --
    >>> Tad McClellan
    >>> SGML consulting
    >>>
    >>> Perl programming
    >>> Fort Worth, Texas
    >>>

    >>
    >> please don't spam.


    > Greg:
    >
    > I don't know what to make of you. What is the point of that remark? Tad's
    > signature is just that: A standard sig which is separated from the body of
    > the post with the proper sequence of characters so that any reasonable
    > newsreader can automatically snip it in replies.
    > If you haven't read the posting guidelines yet, you should read them and
    > follow them so as to maximize the chances of getting useful responses.
    >
    > Sinan
    > --


    ~~~~~~~~~


    > What is the point of that remark?


    Well, Sinan,
    What is the point of Tad's contribution to this thread?


    Please break out of the little world here for a second
    and read it objectively.

    It is a nasty, insulting, trivializing, pontificating,
    officious -- and - content-wise - a completely
    empty thing.


    ~greg wrote:
    > Gentlemen, I'm old.

    Me too. Bummer eh?
    > It would never have occurred to me that explicit
    > "use strict"
    > and "use warnings" is a courtesy to others.

    Please see the Posting Guidelines that are posted
    here frequently.
    > I will do it from now on.

    Thank you.
    [ snip TOFU.
    Please don't do that either.
    ]


    I had said to you (Sinan) that I will
    use the "use..." stuff from now on.

    So what was Tad's point in bringing it up again?
    right after I said I would?,
    - in his quoting me saying so?,
    --and in his chiming-in with (--what emoticon?)
    "Thank you" ?

    (The answer is that he was being passive-aggressive.)

    Moreover, and more tellingly,
    -- I do not find anything at all in Tad's famous guidelines
    about "use strict" and "use warnings"!

    I had written >>

    >> It would never have occurred to me that explicit
    >> "use strict"
    >> and "use warnings" is a courtesy to others.
    >> I will do it from now on.


    And he answered:

    > Please see the Posting Guidelines that are posted
    >here frequently.


    and then:

    >Thank you.


    Is there something, - anything, - specific in his guidelines
    that he wanted me to see?

    Some generality, perhaps, of obvious applicability
    to "use strict" "use warnings"??

    Or is his mentioning his guidelines
    on every pretext he can think of,
    to everyone he can get away with it,
    --simply a completely empty mechanical habit with him? .

    Is his doing it, therefore, spam?

    ~~~

    I did see one interesting thing in Tad's guidelines.

    And you have asked me. So I will tell you.

    The meaning of my remark was not the same thing as the point of it.
    But the meaning of it was this: ....

    Right near the very top of Tad's
    "Posting Guidelines for comp.lang.perl.misc ($Revision: 1.7 $)"
    it says:

    "This newsgroup, commonly called clpmisc,
    is a technical newsgroup intended to be used for discussion
    of Perl related issues (except job postings), ..."

    And near the middle it says:
    "Never quote a .signature
    (unless that is what you are commenting on)."

    And near the bottom it says:

    "AUTHOR
    Tad McClellan <>
    and many others on the
    comp.lang.perl.misc newsgroup. "

    ~~


    I am now commenting on Tad's .signature.

    Note especially the specifically-mentioned parenthetical exception:
    "(except job postings)"

    Note how it is given pride of place,
    right near the very top of Tad's document.

    So apparently Tad has written just for himself
    some kind of secret-exemption to this rule.

    So. Now.
    Why does Tad tell everybody to go read his guidelines?

    And why does he put "except job postings" right at the top of it?

    Well, this is the reason: ...
    It's so that any other "SGML consultant" who chances by,
    and who wants to follow all "the rules", to be polite,
    - will decide not to advertise himself
    in the same way that (only) Tad is permitted to do,
    in his .signature.

    Thus giving Tad the market advantage.

    And that's what's called Spam.

    The sneaky effort to achieve an unfair market advantage.

    ~~~

    But don't blame me!
    I wasn't among the so-called "many others"
    involved in the authoring of Tad's "guidelines."

    ~~~

    Now, honestly, I don't care at all about little hypocrisies like that.
    (Big ones in the government are much more fun to spot.)

    And I don't have any grudge at all against Tad.
    Certainly not a personal one.
    After all, I only know of him from that single post!
    Obviously I am pre-judging him, based on some
    past experience with a certain type of character
    that that single post reminds me of.

    More than that. I do know that Tad has done
    a lot of good here. In fact I suspect that he is probably
    the most responsible for much of the efficiency of this newsgroup.
    - exactly what makes this place such a
    pleasure to romp through.

    Guidelines are good things.
    And I always try to follow them, whether or not I agree with them.


    But, having said that, .....

    ~~~

    What, pray tell, was Tad's point in commenting to me
    the way he did?
    --this way? : ...

    > ~greg wrote:
    > > Gentlemen, I'm old.

    > Me too. Bummer eh?


    That remark would have sounded very different
    if it had a clear purpose. Or if it had been
    elaborated in some friendly way.

    As it stands, though, it is pure monkey business.
    It's a razz. It doesn't have any point at all
    other than being catty.

    And my point, - the one you asked me about,
    was simply to respond in-kind!

    --- the better to help Tad become a kinder,
    gentler old man.

    In Peter Jackson's King Kong,
    Kong thinks it's so funny the way he keeps knocking
    Ann Darrow (Naomi Watts) down. Until she makes him
    stop. And then he throws a tantrum. And then a rock
    falls on his head. And then, just a few seconds later,
    he gets the connection. And he becomes (almost) human.


    If you, Sinan, honestly believe that it is ok for Tad to be
    as nasty and as irrelevant as he wants to be,
    - to any one he so happens to feel like being nasty to

    - and if you also honestly believe that it is not ok
    for anybody else to respond in-kind to him
    --under penalty of being ostracized,
    (-- and not just by Tad, and perhaps you,
    --but by everybody here --- you two speaking for everybody! ...)

    Well, then, Sinan, that is what I would call
    "megalomania by proxy".


    ( Your comment:
    > "maximize the chances of getting useful responses".

    is a blatant euphemism for
    "if you don't follow Tad's rules to the letter, you'll be ostracized here".
    )

    I prefer plain language.

    (by the way,
    "the guidelines" has it slightly different:
    In "the guidelines" it's: "maximize your chances of getting meaningful replies"
    -- ie, "meaningful",
    - not "useful".
    (--the difference is that "useful" is a dime a dozen,
    whereas "meaningful" is close to the essence of humanity.)

    It's possible, - just possible, --that you, Sinan -- in quoting that, and all,
    - were throwing just a Tad-bit of a curve in your post.
    But I'll probably never know.)


    ~~~~


    I know it was ... something...., of me to admit that I'm old.

    But - quite unlike Tad's monkeying of it
    - my saying it had a clear purpose.
    Clear enough anyway that you got the point, since
    you wrote:
    "As I wrote that message, I replaced your script with the
    copy-pasted version of my script but gave no indication
    of the fact that I had changed a single line in it.
    That's my fault."


    There is probably a rule about it somewhere.
    Something like

    "when you add improvements or make corrections
    or in any other way alter somebody else's code,
    please comment on what you've done, above the code,
    so that old people, who don't see so good, and don't
    scan so well no more, can better see that a change
    has been made, so that they can then better
    apportion their dwindling ability to concentrate to better effect."

    If there is such a rule, and if I were Tad,
    I could say:

    "please COCK. thank you."

    "COCK" meaning
    "(please) Comment Over Changes, (you) Krazy (person!)"
    or something equally obvious as that.

    I mean really --- if Tad is permitted to tell me:
    > " [ snip TOFU.
    > Please don't do that either.
    > ]


    making me have to guess that "T" stands for "TOP"
    (--which I did immediately, because immediately
    after posting the offending "TOFU", when it was too late,
    I had already, all by myself, realized the goof .)

    - then, certainly, I should be permitted to tell Tad:
    "PDMAWOPFFNGR"
    (--"Please Don't Monkey Around With Other People's
    Feelings For No Good Reason")


    So I got the "T" part no problem.

    But there is no way that I or anybody else
    could ever guess the rest of "TOFU" :
    "TOP OVER, FULLQUOTE UNDER".

    For one thing, "fullquote" isn't a real word.

    So I had to waste my valuable time looking up "TOFU".
    (Bean curd.)

    Which I bothered doing, - only because I suspected
    that the "FU" part meant something quite different.

    (Which it probably does, for the cognoscenti,
    since, again, 'fullquote' isn't a real word.)

    Whereas I'd thought that the whole point in having rules in the first place
    was to help people avoid wasting other people's time!

    ~~

    I know that Tad is truthful when he tells me that he's old too.
    Because this kind of pointless cutesy acronym did once
    play a vital role in Usenet. Back when connections were
    full-duplex 300 baud over phone lines.

    Today however they are more often used
    only by obnoxious adolescent cliques.
    (And some dwindling percentage of us troglodytes.)


    As for my "TOFU" blunder
    -- I had really thought that it was pretty obvious that I had
    simply forgotten, in my haste, to delete the automatically quoted part.
    Quite obviously so, to anybody who read the post,
    because the bottom quoted part isn't referred to
    at all in the "TOP OVER" part.

    Being in haste is not a sin.
    Spelling mistakes are not a sin.

    Posting to newsgroups is not the same thing as publishing
    active legal documents.

    ~~

    I have posted perhaps a dozen posts in total to this newsgroup,
    over the last 7 years. So I don't know.

    Maybe I have made this same mistake before?

    If so, then it would mean that it is a persistently bad habit with me.
    And, if so, then I would have to be grateful to anybody who
    pointed it out and made me stop.

    However, when a mistake is made just once, or twice,
    it isn't the mistake that wastes people's time.

    It is obsessive commenting on this kind of trivia,
    - the way that some people seem to be addicted to doing,
    for whatever inscrutable pleasure they derive from it,
    - that is the greatest waste of our time.

    ~~~~


    But I have to tell you the real reason that I am going on like this!

    It is because the greatest frustration in Google newsgroup searching
    is clicking on the about 20% of the Re: posts that look promising
    for an answer, --only to have to scroll down to one or another
    variation on:
    "This is not an appropriate question to be asking in this
    (read: 'my') newsgroup.
    Go ask somewhere else."

    Or, for another example, anything like:
    > " [ snip TOFU.
    > Please don't do that either.
    > ]


    Can't you-all see, those of you who habitually post
    blanks like that, how much happier the whole world
    at large would be if only ---when you have nothing to say,
    -- you simply refrained from saying anything?

    If everybody did obey my little rule,
    then, when I go through a Google return list
    and see that a particular post doesn't have
    any responses,
    - I won't have to waste my time on red herrings!

    When somebody doesn't get a response in a particular newsgroup,
    why can't you-all conceive that they just might be intelligent enough
    to figure out for themselves that it's just probably the wrong group
    to ask the question in?

    If they persist, obviously, then you must tell them.
    You must put them out of their misery.

    But, honestly, how often does that happen?

    No. Some people are just way too-cocked for jumping
    on the heads of rules-violators -- for me to believe
    them when they say that they are just doing
    the rest of us a service.

    Because the truth is, clearly, that they derive
    some sort of perverse pleasure in playing the
    authority and telling people off.

    Read Foucault.
    "His work concerning power and the relation
    between power and knowledge, ...
    have been widely discussed and applied...." - Wikipedia.


    It's precisely the incessant picayune carping that wastes the most
    of people's time. Negative comments are always, whatever their
    claimed holy intent, a waste of time. And when they become
    too high a percentage in somebody's overall contribution
    then, --bummer: -- maybe that person really is just too old
    and dried up, and ought to be let out to pasture.


    Sinan wrote:
    > What is the point of that remark?


    There was no point, Sinan.
    I was joking.
    Just responding in kind to Tad.


    > ERIKSSON: Pardon me sir, what's your point, sir?
    >
    > HILL: There ain't no point, Eriksson. I'm simply
    > trying to illuminate the terrain in which we currently
    > find ourselves deployed. You don't mind that, do you?
    > And if you do ...
    > (- "Casualties of War").




    Vietnam was "my" war.

    And we had a word
    - a rule -
    for the way to deal with Taliban-like self-appointed prefects:

    "frag 'em".

    --
    ~greg.




    (but please understand!
    I am old.
    And I have just been to the dentist!

    And that is, honestly, the only reason
    this post sounds the way it does.)
    ~greg, Mar 2, 2007
    #12
  13. Uri Guttman

    Uri Guttman Guest

    Re: changed to OT Re: DocumentHTML ?

    <snip>

    what a sad waste of precious pixels.

    uri

    --
    Uri Guttman ------ -------- http://www.stemsystems.com
    --Perl Consulting, Stem Development, Systems Architecture, Design and Coding-
    Search or Offer Perl Jobs ---------------------------- http://jobs.perl.org
    Uri Guttman, Mar 2, 2007
    #13
  14. ~greg

    ~greg Guest

    Re: changed to OT Re: DocumentHTML ?

    "sherm" "plonked" me,
    but this is for him too...



    I completely agree with both of you.


    I'm sure you noticed that I retitled this
    Off Topic,
    --so I have no idea why you read it,
    much less why you bothered responding.

    I assume it's brotherhood.

    Anyway, you two aren't the problem that I was talking about.
    I just clicked on a bunch of your posts and they
    are almost exclusively technical. I could learn
    a lot (a lot technical) from every one of them.
    None of them would bother me if they came
    up in a google search.
    So they aren't the time-wasters I was talking about.

    Also, your few nasty comments in your posts
    are almost always clearly deserved and relevant
    to the posts that they're in response to.
    (Even in this case, I admit.)

    So. Now.
    Please! - Just do me this one favor.
    Answer me this:

    Tell me why Tad responded to my:

    >> (I'm not a professional)
    >> It would never have occurred to me
    >> that explicit "use strict"
    >> and "use warnings"
    >> is a courtesy to others.


    with

    >Please see the Posting Guidelines that are posted
    > here frequently.


    When there is nothing whatever
    in his guidelines about "use strict"
    and "use warnings"!

    --and when I had immediately added : ...
    >> I will do it from now on


    ?

    If you can just answer that,
    then of course I'll have to take it all back.

    (Which I'd do anyway if I could.
    It posted at 4am.)
    But that's not your problem. You don't have
    a problem. Not with me anyway.

    Thank you for righting my boat.
    ~greg, Mar 2, 2007
    #14
  15. Re: changed to OT Re: DocumentHTML ?

    ~greg wrote:

    > Please! - Just do me this one favor.
    > Answer me this:
    >
    > Tell me why Tad responded to my:
    >
    > >> (I'm not a professional)
    > >> It would never have occurred to me
    > >> that explicit "use strict"
    > >> and "use warnings"
    > >> is a courtesy to others.

    >
    > with
    >
    > >Please see the Posting Guidelines that are posted
    > > here frequently.



    Because you displayed "good attitude" in your 1st followup, which
    is a depressingly rare response. Most people gripe about netiquette
    rather than responding with the expected "Oh, I didn't know that".

    I had actually added you to the "nice people to help" (scored up)
    in my scorefile (that changed later).


    > When there is nothing whatever
    > in his guidelines about "use strict"
    > and "use warnings"!


    Ask perl to help you
    You can ask perl itself to help you find common programming mistakes
    by doing two things: enable warnings (perldoc warnings) and enable
    "strict"ures (perldoc strict).

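    As a minimal sketch of what the guideline quoted above describes
    (the helper name sum_list is made up for illustration, it is not
    part of the guidelines), a script that asks perl for help might
    start like this:

```perl
use strict;      # refuse undeclared variables, symbolic refs, barewords
use warnings;    # report likely mistakes, e.g. use of undefined values

# Hypothetical helper, only here to give the pragmas something to check.
sub sum_list {
    my @numbers = @_;
    my $total   = 0;
    $total += $_ for @numbers;
    return $total;
}

print "total: ", sum_list(1, 2, 3), "\n";
```

    With strict enabled, misspelling $total as $totl inside the sub
    stops compilation with a "Global symbol ... requires explicit
    package name" error instead of quietly producing a wrong sum.
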
    > --and when I had immediately added : ...
    > >> I will do it from now on



    And I did not point you to the guidelines so that you could learn
    about those pragmas, you already knew about them.

    I figured, here's a guy that _wants_ to be socially acceptable. Oh,
    but he doesn't know about top-posting (TOFU), so he probably wants
    to know that folks don't like that either...


    --
    Tad McClellan SGML consulting
    Perl programming
    Fort Worth, Texas
    Tad McClellan, Mar 2, 2007
    #15
  16. Re: changed to OT Re: DocumentHTML ?

    ~greg wrote:
    > Sinan wrote:
    >> I don't know what to make of you.



    > But please understand me --the temperament that
    > appears here is not anything that ever lasts in me
    > for more than a minute.



    Unfortunately, once the message is sent, you can't undo it.

    Be extra cautious when you get upset
    Count to ten before composing a followup when you are upset
    This is recommended in all Usenet newsgroups. Here in clpmisc, most
    flaming sub-threads are not about any feature of Perl at all! They
    are most often for what was seen as a breach of netiquette. If you
    have lurked for a bit, then you will know what is expected and won't
    make such posts in the first place.

    But if you get upset, wait a while before writing your followup. I
    recommend waiting at least 30 minutes.

    Count to ten after composing and before posting when you are upset
    After you have written your followup, wait *another* 30 minutes
    before committing yourself by posting it. You cannot take it back
    once it has been said.


    >>>> --
    >>>> Tad McClellan
    >>>> SGML consulting
    >>>>
    >>>> Perl programming
    >>>> Fort Worth, Texas
    >>>>
    >>>
    >>> please don't spam.



    Including what you do for a living in a .sig is not spam, it is
    a perfectly acceptable tenet of general netiquette.


    >> What is the point of that remark?

    >
    > Well, Sinan,
    > What is the point of Tad's contribution to this thread?



    You found out one of the group's expectations the hard way.

    My point was that you could find out most of the rest of them
    the easy way.


    > Please break out of the little world here for a second
    > and read it objectively.
    >
    > It is a nasty, insulting, trivializing, pontificating,
    > officious



    Sorry, it was surely not intended as such.


    > -- and - content-wise - a completely
    > empty thing.



    While using warnings/strict is a Perl-specific element of netiquette
    and therefore of limited applicability, avoiding top-posting is
    universally accepted across all of Usenet.

    I wanted you to know about TOFU before you experienced angst
    in other newsgroups as well.


    > ~greg wrote:
    > > Gentlemen, I'm old.

    > Me too. Bummer eh?
    > > It would never have occurred to me that explicit
    > > "use strict"
    > > and "use warnings" is a courtesy to others.

    > Please see the Posting Guidelines that are posted
    > here frequently.
    > > I will do it from now on.

    > Thank you.
    > [ snip TOFU.
    > Please don't do that either.
    > ]
    >
    >
    > I had said to you (Sinan) that I will
    > use the "use..." stuff from now on.
    >
    > So what was Tad's point in bringing it up again?
    > right after I said I would?,



    I thought you'd smack your head and say:

    If only I had known about warnings/strict before posting.


    > - in his quoting me saying so?,
    > --and in his chiming-in with (--what emoticon?)
    > "Thank you" ?



    That was truly sincere.

    The most common response to "you should use warnings/strict" is:

    Don't tell me what to do.

    I expected a more pleasant response from you though, based on
    the attitude displayed in your first followup.


    > (The answer is that he was being passive aggressive.)



    A "conclusion" does not correspond directly to an "answer".

    We have experienced yet more miscommunication it would appear.


    > Is there something, - anything, - specific in his guidelines
    > that he wanted me to see?


    Use an effective followup style
    When composing a followup, quote only enough text to establish the
    context for the comments that you will add. Always indicate who
    wrote the quoted material. Never quote an entire article. Never
    quote a .signature (unless that is what you are commenting on).

    Intersperse your comments *following* each section of quoted text to
    which they relate. Unappreciated followup styles are referred to as
    "top-posting", "Jeopardy" (because the answer comes before the
    question), or "TOFU" (Text Over, Fullquote Under).

    Reversing the chronology of the dialog makes it much harder to
    understand (some folks won't even read it if written in that style).
    For more information on quoting style, see:

    http://web.presby.edu/~nnqadmin/nnq/nquote.html


    > Some generality, perhaps, of obvious applicability
    > to "use strict" "use warnings"??



    Besides knowing the syntax for them, it might be nice
    to peruse their documentation (referenced in the guidelines).


    > Right near the very top of Tad's
    > "Posting Guidelines for comp.lang.perl.misc ($Revision: 1.7 $)"



    They are not "my" guidelines, they are "our" guidelines.

    They were discussed, and agreed upon, over several weeks here.


    > it says:
    >
    > "This newsgroup, commonly called clpmisc,
    > is a technical newsgroup intended to be used for discussion
    > of Perl related issues (except job postings), ..."
    >
    > And near the middle it says:
    > "Never quote a .signature
    > (unless that is what you are commenting on)."
    >
    > And near the bottom it says:
    >
    > "AUTHOR
    > Tad McClellan
    > and many others on the
    > comp.lang.perl.misc newsgroup. "
    >
    > ~~
    >
    >
    > I am now commenting on Tad's .signature.
    >
    > Note especially the specifically-mentioned parenthetical exception:
    > "(except job postings)"
    >
    > Note how it is given pride of place,
    > right near the very top of Tad's document.
    >
    > So apparently Tad has written just for himself
    > some kind of secret-exemption to this rule.



    I did not post a job posting.


    > Why does Tad tell everybody to go read his guidelines?



    So that they can avoid being silently killfiled.


    > And why does he put "except job postings" right at the top of it?



    Because even Perl-related job postings are not welcomed here.


    > Well, this is the reason: ...
    > It's so that any other "SGML consultant" who chances by,
    > and who wants to follow all "the rules", to be polite,
    > - will decide not to advertise himself
    > in the same way that (only) Tad is permitted to do,
    > in his .signature.



    Everyone is permitted to include their job title in their .sig.

    They are even permitted outright advertising in their .sig.

    If you don't know much about Usenet netiquette, then commenting
    on Usenet netiquette is talking out of place...


    > What, pray tell, was Tad's point in commenting to me
    > the way he did?
    > --this way? : ...
    >
    >> ~greg wrote:
    >> > Gentlemen, I'm old.

    >> Me too. Bummer eh?



    It was meant as a witty aside.


    > That remark would have sounded very different
    > if it had a clear purpose. Or if it had been
    > elaborated in some friendly way.



    But I can see how it could be interpreted that way. Let me rephrase:

    I feel your pain.

    I am old too, and it is a drag (but much preferable to the alternative!)


    > As it stands, though, it is pure monkey business.
    > It's a razz.



    No it isn't. I actually _am_ old.


    --
    Tad McClellan SGML consulting
    Perl programming
    Fort Worth, Texas
    Tad McClellan, Mar 2, 2007
    #16
  17. ~greg

    ~greg Guest

    Re: changed to OT Re: DocumentHTML ?

    Tad wrote ...

    > I figured, here's a guy that _wants_ to be socially acceptable. Oh,
    > but he doesn't know about top-posting (TOFU), so he probably wants
    > to know that folks don't like that either...


    Well, I didn't know.

    That instance was a mistake. However,
    I have deliberately posted in every which way
    in a different, very informal newsgroup.

    Google shows the first 3 or 4 lines of posts
    in their search-results-listings, so the only
    rule that ever made logical sense to me
    was to try to make the first few lines
    as informative as possible about the rest of the content.

    I guess that *would* normally mean from the quoted text!

    In any case, I will of course abide.

    ~~~

    But hey, man! - I just wanted to say thank you!

    I don't myself have much of it, but I know
    real wisdom when I see it.

    And it is real wisdom to see when somebody
    else's tantrum has nothing to do with you,
    - you just happened to be there, that's all.

    Thank you!

    Greg.
    ~greg, Mar 2, 2007
    #17
  18. Dr.Ruud

    Dr.Ruud Guest

    ~greg schreef:
    > I am trying to get an InternetExplorer.Application to print out
    > the whole HTML document as text,
    > from the <HTML> (or before) to the </HTML>.
    > (-so as to feed it to a TreeBuilder parse).



    Try wget.

    --
    Affijn, Ruud

    "Gewoon is een tijger."
    Dr.Ruud, Mar 5, 2007
    #18