Windows ActiveState Perl: MSXML transformNodeToObject finally succeeded

Discussion in 'Perl Misc' started by tuser, Feb 17, 2006.

  1. tuser

    tuser Guest

    I have finally found a solution for my long-standing problem
    with Xslt-transformation under Windows ActiveState Perl and
    I thought that other people might have the same problem so I
    would like to share my solution with the group. I hope you
    don't mind this long post, here is the story:

    I had read an article by Shawn Ribordy on
    http://www.perl.com/pub/a/2001/04/17/msxml.html
    ('MSXML, It's Not Just for VB Programmers Anymore')
    in which he described how to do Xslt-transform on XML-files
    using the "transformNodeToObject" method of a Win32::OLE
    object.

    The following lines are copied straight from his article:

    >> $doc_to_transform->transformNodeToObject($style_sheet_doc,
    >> $transformed_doc);
    >> $transformed_doc->save("$new_xml_doc_file");


    "Great...", I thought, "...let's try this at home".

    So I sat down at my Windows XP computer (with Activestate
    v5.8.7 and the latest Msxml2.DOMDocument.4.0/SP2 installed),
    fired up notepad.exe and pasted Shawn's example straight
    into my perl program, and his example worked -- but that
    was as far as it got!

    When I started to use my own xslt-stylesheet, things went
    seriously wrong. Well, I knew that my own xslt-stylesheets
    had some problems, but I hoped (and expected) that the
    transformNodeToObject() method would throw something useful
    at me (which, unfortunately, it did not!). The problem was
    that Shawn's example did not have any error handling
    whatsoever.

    I googled every possible combination of (perl, xslt, msxml,
    win32, errorhandling) under the sun and I searched CPAN to
    destruction, but to no avail.

    After months of "pulling out my hair", I finally stumbled
    upon the following function and properties, which allowed
    me to test correctly and reliably for (almost) every
    possible error condition:
    - Win32::OLE::LastError()
    - $doc->{parseError}->{reason}
    - $doc->{parseError}->{line}
    - $doc->{parseError}->{linePos}
    - $doc->{parseError}->{srcText}
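
    A minimal sketch of how the parseError properties listed above could
    be folded into a single message. The helper name describe_parse_error
    and the sample error text are made up for illustration. Win32::OLE
    exposes COM properties through hash syntax, so a plain hash reference
    stands in for the MSXML document here and the snippet runs without
    Windows:

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Win32::OLE or MSXML): build a one-line
# message from the parseError properties of a document whose Load() failed.
sub describe_parse_error {
    my ($doc, $label) = @_;
    my $err = $doc->{parseError};

    my $reason = defined $err->{reason} ? $err->{reason} : '';
    $reason =~ s/\s+/ /g;      # collapse CR/LF and runs of whitespace
    $reason =~ s/^ | $//g;     # trim one leading/trailing blank

    my $src = defined $err->{srcText} ? $err->{srcText} : '';

    return sprintf q{%s: line %d, pos %d, reason: %s, text: '%s'},
        $label, $err->{line} || 0, $err->{linePos} || 0, $reason, $src;
}

# Stand-in for a real document object; the values are illustrative only.
my $fake_doc = {
    parseError => {
        reason  => "End tag 'data1' does not match the start tag 'data'.\r\n",
        line    => 3,
        linePos => 15,
        srcText => '<data>aaaa</data1>',
    },
};

print describe_parse_error($fake_doc, 'test2.xml'), "\n";
```

    With a real Msxml2.DOMDocument object, the same helper would be called
    right after a failed Load().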

    With the improved error-handling, I was now able to
    experiment with different situations in my xslt-stylesheets.
    Here is what I experienced:

    XML-input-files: Use <?xml version='1.0' encoding='...'?>
    =========================================================
    In your XML-Input-Files, always specify the encoding in the
    first line <?xml version='1.0' encoding='...'?>. Use
    'ISO-8859-1' for Latin-1 text (which covers plain old ASCII),
    or 'UTF-8' or 'UTF-16' if your XML-Input-File is actually
    saved that way. If the declared encoding does not match the
    real encoding of the file, you will end up with an error
    ("An invalid character was found in text content").
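
    As a rough sketch, a script could check up front whether a file
    declares its encoding at all. The helper name declared_encoding is
    hypothetical, and the regex only handles declarations stored in an
    ASCII-compatible byte encoding (so not UTF-16):

```perl
use strict;
use warnings;

# Hypothetical helper: return the encoding named in the XML declaration at
# the start of $text, or undef if there is no declaration or no encoding.
sub declared_encoding {
    my ($text) = @_;

    # Match <?xml version='...' ... ?> and keep whatever follows the version.
    return undef
        unless $text =~ /\A<\?xml\s+version\s*=\s*(["'])[^"']*\1([^?]*)\?>/;
    my $rest = $2;

    return $rest =~ /\bencoding\s*=\s*(["'])([A-Za-z][A-Za-z0-9._-]*)\1/
        ? $2
        : undef;
}

my $with    = declared_encoding(qq{<?xml version='1.0' encoding='ISO-8859-1'?>\n<index/>});
my $without = declared_encoding(qq{<?xml version="1.0"?>\n<index/>});

print "with declaration:    $with\n";
print "without declaration: ", (defined $without ? $without : 'none'), "\n";
```

    This only tells you what the file claims; it cannot prove that the
    bytes in the file really are in that encoding.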

    XSLT-files: Use <?xml version='1.0' encoding='...'?>
    ====================================================
    In your XSLT-Files, always specify the encoding in the first
    line <?xml version='1.0' encoding='...'?>.
    Strictly speaking, it is not necessary to specify the encoding
    in the first line of the XSLT-file; a simple
    <?xml version='1.0'?> is enough. But by doing so, you let
    Microsoft guess the encoding, which it does correctly in 95%
    of the cases. However, in the remaining 5% of the cases,
    Microsoft gets it wrong and you end up with an error
    ("Switch from current encoding to specified encoding not
    supported"). Consequently, I suggest always specifying the
    actual encoding directly in the first line of the XSLT-file.

    XSLT-files: Use <xsl:output encoding='ISO-8859-1'/>
    ===================================================
    It is more convenient to use
    <xsl:output encoding='ISO-8859-1'/> in your XSLT-file. This
    works very well, even with accented characters and Umlaute.
    You can use other encodings (such as
    <xsl:output encoding='UTF-8'/>), and the XML-Output-File
    will be displayed correctly in Internet Explorer, but then
    you will find it inconvenient that Notepad does not display
    the XML-Output-file correctly any more.

    XSLT-files: Use <xsl:output method='xml'/>
    ==========================================
    If you want to generate Html, you can do so easily by
    generating an XML file with its tags in Html-syntax
    (such as <p>, <table>, <hr/>, etc...). However, do not
    attempt to use <xsl:output method='html'/> in your XSLT-file;
    use <xsl:output method='xml'/> instead (even if you want to
    generate 'Html', think of 'XHtml' and use
    <xsl:output method='xml'/>). You may end up in tears when you
    discover that by using <xsl:output method='html'/>, your
    encoding does not work the way you want it to. And you might
    even discover that '&#160;' and/or '&nbsp;' will cause an error
    after having erased your output-file! - Why is that so? - I
    don't know.
    The ultimate rule is: Never use 'html' as your method in
    <xsl:output method='...'/>; you must use
    <xsl:output method='xml'/> at all times.

    XSLT-files: Use <xsl:output indent='yes'/>
    ==========================================
    This advice is more for convenience than anything else. If
    you specify <xsl:output indent='yes'/> and you look at your
    XML-Output-file with Notepad, you will find that its
    linebreaks are more conveniently located than they would
    have been without <xsl:output indent='yes'/>. It is still
    not perfect, but it is better. So finally, the
    <xsl:output... /> line in your XSLT-file should look like
    this:
    <xsl:output method='xml' indent='yes' encoding='ISO-8859-1'/>

    In XSLT-files: Use '&#160;' instead of '&nbsp;'
    ===============================================
    The entity '&nbsp;' does not work with MSXML. If you
    want your XSLT-file to generate a non-breaking space, use
    the numeric character reference '&#160;' instead.
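
    For stylesheets that already contain the HTML entity, a small sketch
    of a clean-up step: it rewrites &nbsp; as the numeric character
    reference &#160;, which an XML parser accepts without a DTD. The
    helper name fix_nbsp is hypothetical:

```perl
use strict;
use warnings;

# Hypothetical helper: replace every &nbsp; in the stylesheet text with the
# numeric character reference for a non-breaking space (U+00A0).
sub fix_nbsp {
    my ($xslt_text) = @_;
    (my $fixed = $xslt_text) =~ s/&nbsp;/&#160;/g;
    return $fixed;
}

print fix_nbsp('<p>non&nbsp;breaking&nbsp;space</p>'), "\n";
# prints: <p>non&#160;breaking&#160;space</p>
```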


    ....that's the end of my list.

    For those of you who want to try, here is a test program:

    use strict;
    use warnings;
    use Win32::OLE;

    my $MxErr;

    testcase(1, 'transformation succeeds');
    testcase(2, 'unbalanced tags in *.xml');
    testcase(3, 'unbalanced tags in *.xsl');
    testcase(4, 'syntax error in *.xsl');
    testcase(5, 'output method=html fails');

    sub testcase {
        my ($Case, $Description) = @_;

        makefiles($Case);

        system('cls');
        print "Testcase no $Case: $Description\n";

        print "\n\nThis is the xml file 'test$Case.xml':\n";
        print "=============================================\n";
        system("type test$Case.xml");
        print "=============================================\n";
        system('pause');

        print "\n\nThis is the xsl file 'trf$Case.xsl':\n";
        print "=============================================\n";
        system("type trf$Case.xsl");
        print "=============================================\n";
        system('pause');

        my $success = TransformXslt(xml  => "test$Case.xml",
                                    xslt => "trf$Case.xsl",
                                    out  => "output$Case.html");

        if ($success) {
            print "\n\nTransformXslt succeeded, result:\n";
            print "=========================================\n";
            system("type output$Case.html");
            print "=========================================\n";
        }
        else {
            print "\n\nProblem with TransformXslt:\n";
            print "=========================================\n";
            print "$MxErr\n";
            print "=========================================\n";
        }
        system('pause');
        print "\n";
    }

    sub makefiles {
        my ($Case) = @_;

        # Each test case breaks exactly one thing in the generated files
        my $XData   = ($Case == 2 ? 'data1'  : 'data');
        my $XTitle  = ($Case == 3 ? 'title1' : 'title');
        my $XFunc   = ($Case == 4 ? 'r([?'   : '.');
        my $XMethod = ($Case == 5 ? 'html'   : 'xml');

        open OFL, '>', "test$Case.xml"
            or die "err write test$Case.xml: $!";
        print OFL qq{<?xml version="1.0"}.
                  qq{ encoding="ISO-8859-1"?>\n};
        print OFL qq{<index>\n};
        print OFL qq{ <data>aaaa</$XData>\n};
        print OFL qq{ <data>bbbb</data>\n};
        print OFL qq{</index>\n};
        close OFL;

        open OFL, '>', "trf$Case.xsl"
            or die "err write trf$Case.xsl: $!";
        print OFL qq{<?xml version="1.0"}.
                  qq{ encoding="ISO-8859-1"?>\n};
        print OFL qq{<xsl:stylesheet version="1.0"\n};
        print OFL qq{xmlns:xsl="http://www.w3.org/1999}.
                  qq{/XSL/Transform">\n};
        print OFL qq{ <xsl:output method="$XMethod" indent=}.
                  qq{"yes" encoding="ISO-8859-1"/>\n};
        print OFL qq{ <xsl:template match="/">\n};
        print OFL qq{ <html>\n};
        print OFL qq{ <body>\n};
        print OFL qq{ <title>Test</$XTitle>\n};
        print OFL qq{ <p>nonbreaking space</p>\n};
        print OFL qq{ <hr/>\n};
        print OFL qq{ <xsl:for-each select="index/data">\n};
        print OFL qq{ <p>Test: *** <xsl:value-of}.
                  qq{ select="$XFunc"/> ***</p>\n};
        print OFL qq{ </xsl:for-each>\n};
        print OFL qq{ </body>\n};
        print OFL qq{ </html>\n};
        print OFL qq{ </xsl:template>\n};
        print OFL qq{</xsl:stylesheet>\n};
        close OFL;
    }

    sub TransformXslt {
        my ($xml_input_file, $xslt_file, $xml_output_file)
            = ($_[1], $_[3], $_[5]);
        $MxErr = '';
        my $DomDocument = 'Msxml2.DOMDocument.4.0';

        # Load the document (Xml-Input-File)
        my $xml_input_doc = Win32::OLE->new($DomDocument);
        unless ($xml_input_doc) {
            $MxErr = qq{Mx-0040: Couldn't create Win32::OLE}.
                     qq{ $DomDocument for XML-Input-File}.
                     qq{ "$xml_input_file"};
            return undef;
        }

        $xml_input_doc->{async} = 'False';
        $xml_input_doc->{validateOnParse} = 'True';
        if (!$xml_input_doc->Load($xml_input_file)) {
            my $Rs = $xml_input_doc->{parseError}->{reason};
            $Rs =~ s/\r//g; chomp $Rs;    # strip all carriage returns
            my $Ln = $xml_input_doc->{parseError}->{line};
            my $Ps = $xml_input_doc->{parseError}->{linePos};
            my $Tx = $xml_input_doc->{parseError}->{srcText};
            $MxErr = qq{Mx-0060: XML-Input-File}.
                     qq{ "$xml_input_file"}.
                     qq{ did not load for $DomDocument at line}.
                     qq{ $Ln, pos $Ps, reason: $Rs, text: '$Tx'};
            return undef;
        }

        # create Output-object
        my $xml_output_doc = Win32::OLE->new($DomDocument);
        unless ($xml_output_doc) {
            $MxErr = qq{Mx-0055: Couldn't create Win32::OLE}.
                     qq{ $DomDocument for XML-Output-File}.
                     qq{ "$xml_output_file"};
            return undef;
        }

        # Load the Stylesheet (Xsl-File)
        my $xslt_doc = Win32::OLE->new($DomDocument);
        unless ($xslt_doc) {
            $MxErr = qq{Mx-0050: Couldn't create Win32::OLE}.
                     qq{ $DomDocument for XSLT-File "$xslt_file"};
            return undef;
        }

        $xslt_doc->{async} = 'False';
        $xslt_doc->{validateOnParse} = 'True';
        if (!$xslt_doc->Load($xslt_file)) {
            my $Rs = $xslt_doc->{parseError}->{reason};
            $Rs =~ s/\r//g; chomp $Rs;    # strip all carriage returns
            my $Ln = $xslt_doc->{parseError}->{line};
            my $Ps = $xslt_doc->{parseError}->{linePos};
            my $Tx = $xslt_doc->{parseError}->{srcText};
            $MxErr = qq{Mx-0070: XSLT-file "$xslt_file" did not}.
                     qq{ load for $DomDocument at line}.
                     qq{ $Ln, pos $Ps, reason: $Rs, text: '$Tx'};
            return undef;
        }

        # Do the work: transform xml using an xslt stylesheet
        $xml_input_doc->transformNodeToObject($xslt_doc,
                                              $xml_output_doc);
        if (Win32::OLE::LastError()) {
            my $Rs = Win32::OLE::LastError(); $Rs =~ s/\s+/ /g;
            $MxErr = qq{Mx-0080: XSLT-file "$xslt_file" has}.
                     qq{ syntax-errors for $DomDocument, }.
                     qq{reason: $Rs};
            return undef;
        }

        # Save the done work to the output-file
        $xml_output_doc->save($xml_output_file);
        if (Win32::OLE::LastError()) {
            my $Rs = Win32::OLE::LastError(); $Rs =~ s/\s+/ /g;
            $MxErr = qq{Mx-0090: Can't save to output-file}.
                     qq{ "$xml_output_file" for $DomDocument, }.
                     qq{reason: $Rs};
            return undef;
        }

        # "-z" tests for empty file, which is considered to be
        # a fatal error
        if (-z $xml_output_file) {
            $MxErr = qq{Mx-0100: A fatal error occurred in either}.
                     qq{ your XSLT-file "$xslt_file", or in}.
                     qq{ your XML-input-file "$xml_input_file",}.
                     qq{ the output-file "$xml_output_file" will}.
                     qq{ be empty.};
            return undef;
        }

        return 1;
    }
     
    tuser, Feb 17, 2006
    #1

  2. tuser

    Samwyse Guest

    Re: Windows ActiveState Perl: MSXML transformNodeToObject finally succeeded

    tuser wrote:
    > I have finally found a solution for my long-standing problem
    > with Xslt-transformation under Windows ActiveState Perl and
    > I thought that other people might have the same problem so I
    > would like to share my solution with the group.


    Thank you very much for posting this. While I doubt that I will ever
    have any need for your specific solution, I hope that you will serve as
    an example to others. Many times when researching a problem, I will
    find USENET posts from people with the same problem as me, but never any
    hint of how it was eventually solved. I sincerely wish that others will
    remember this post and share whatever solutions they find for their
    problems.
     
    Samwyse, Feb 18, 2006
    #2

  3. tuser

    robic0 Guest

    On 17 Feb 2006 13:14:27 -0800, "tuser" <> wrote:

    >I have finally found a solution for my long-standing problem
    >with Xslt-transformation under Windows ActiveState Perl and
    >I thought that other people might have the same problem so I
    >would like to share my solution with the group. I hope you
    >don't mind this long post, here is the story:
    >
    >I had read an article by Shawn Ribordy on
    >http://www.perl.com/pub/a/2001/04/17/msxml.html
    >('MSXML, It's Not Just for VB Programmers Anymore')
    >in which he described how to do Xslt-transform on XML-files
    >using the "transformNodeToObject" method of a Win32::OLE
    >object.
    >

    Good job! You have used a bunch of modules.
    Style sheet transforms? I'm willing to bet you don't know
    a rats ass about markup at all !! You've quoted code and
    folks that do though...
    I wouldn't hire you to clean the toilets!
     
    robic0, Feb 18, 2006
    #3
  4. tuser <> wrote:


    > sub TransformXslt {
    > my ($xml_input_file, $xslt_file, $xml_output_file)
    > = ($_[1], $_[3], $_[5]);



    An "array slice" would make that much prettier:

    my ($xml_input_file, $xslt_file, $xml_output_file) = @_[1,3,5];
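
    Both the original assignment and the array slice depend on the caller
    passing xml, xslt and out in exactly that order. A possible
    alternative, sketched here with a hypothetical helper name
    (parse_transform_args), reads the arguments into a hash and uses a
    hash slice, so the order no longer matters:

```perl
use strict;
use warnings;

# Hypothetical helper: accept name => value pairs in any order and return
# the three file names in a fixed order.
sub parse_transform_args {
    my %arg = @_;
    for my $key (qw(xml xslt out)) {
        die "missing argument '$key'" unless defined $arg{$key};
    }
    return @arg{qw(xml xslt out)};    # hash slice
}

my ($xml_input_file, $xslt_file, $xml_output_file)
    = parse_transform_args(out  => 'output1.html',
                           xml  => 'test1.xml',
                           xslt => 'trf1.xsl');

print "$xml_input_file $xslt_file $xml_output_file\n";
# prints: test1.xml trf1.xsl output1.html
```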


    --
    Tad McClellan SGML consulting
    Perl programming
    Fort Worth, Texas
     
    Tad McClellan, Feb 18, 2006
    #4
  5. tuser

    tuser Guest

    Tad McClellan wrote:
    > tuser <> wrote:
    >
    >
    > > sub TransformXslt {
    > > my ($xml_input_file, $xslt_file, $xml_output_file)
    > > = ($_[1], $_[3], $_[5]);

    >
    >
    > An "array slice" would make that much prettier:
    >
    > my ($xml_input_file, $xslt_file, $xml_output_file) = @_[1,3,5];


    Thanks for your input. I hadn't thought of using array slices in Perl
    before. I will use that in my program.
     
    tuser, Feb 19, 2006
    #5
