XSLT speed comparisons

Discussion in 'Python' started by Damian, Sep 27, 2006.

  1. Damian

    Damian Guest

    Hi, I'm from an ASP.NET background and am considering making the switch
    to Python. I decided to develop my next project in tandem to test the
    waters and everything is working well, loving the language, etc.

    What I've got is:
    two websites, one in ASP.NET v2 and one in Python 2.5 (using 4suite for
    XML/XSLT)
    both on the same box (Windows Server 2003)
    both using the same XML, XSLT, CSS

    The problem is, the Python version is (at a guess) about three times
    slower than the ASP one. I'm very new to the language and it's likely
    that I'm doing something wrong here:

    from os import environ
    from Ft.Lib.Uri import OsPathToUri
    from Ft.Xml import InputSource
    from Ft.Xml.Xslt import Processor

    def buildPage():
        try:
            xsluri = OsPathToUri('xsl/plainpage.xsl')
            xmluri = OsPathToUri('website.xml')

            xsl = InputSource.DefaultFactory.fromUri(xsluri)
            xml = InputSource.DefaultFactory.fromUri(xmluri)

            proc = Processor.Processor()
            proc.appendStylesheet(xsl)

            params = {"url":environ['QUERY_STRING'].split("=")[1]}
            for i, v in enumerate(environ['QUERY_STRING'].split("/")[1:]):
                params["selected_section%s" % (i + 1)] = "/" + v

            return proc.run(xml, topLevelParams=params)
        except:
            return "Error blah blah"

    print "Content-Type: text/html\n\n"
    print buildPage()
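
    The query-string handling in that script can be isolated and exercised
    without a web server. A minimal sketch, assuming Python 3 and a query
    string of the form url=/about/team (as the development URLs suggest).
    build_params is a hypothetical helper name, not part of 4suite, and it
    parses the url value rather than splitting the whole query string as
    the posted code does:

```python
from urllib.parse import parse_qs


def build_params(query_string):
    """Turn 'url=/about/team' into the stylesheet parameters used in the post."""
    # parse_qs maps each key to a list of values; take the first "url".
    url = parse_qs(query_string).get("url", [""])[0]
    params = {"url": url}
    # One selected_sectionN parameter per non-empty path segment.
    sections = [part for part in url.split("/") if part]
    for i, part in enumerate(sections, start=1):
        params["selected_section%d" % i] = "/" + part
    return params


if __name__ == "__main__":
    print(build_params("url=/about/team"))
```

    Unlike split("=")[1], this does not raise IndexError when the query
    string is empty, so a bare except is not needed to cover that case.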

    You can compare the development sites here:
    asp: http://consultum.pointy.co.nz/about/team
    python: http://python.pointy.co.nz/about/team

    Cheers!
     
    Damian, Sep 27, 2006
    #1

  2. Damian

    Ross Ridge Guest

    Damian wrote:
    > two websites, one in ASP.NET v2 and one in Python 2.5 (using 4suite for
    > XML/XSLT)
    > both on the same box (Windows Server 2003)
    > both using the same XML, XSLT, CSS
    >
    > The problem is, the Python version is (at a guess) about three times
    > slower than the ASP one.


    It could just be that 4suite is slower than MSXML. If so, you can use
    MSXML in Python if you want. You'll need to install the Python for
    Windows extensions. Something like this:

    from os import environ
    import win32com.client

    def buildPage():
        xsluri = 'xsl/plainpage.xsl'
        xmluri = 'website.xml'

        xsl = win32com.client.Dispatch("Msxml2.FreeThreadedDOMDocument.4.0")
        xml = win32com.client.Dispatch("Msxml2.DOMDocument.4.0")
        xsl.load(xsluri)
        xml.load(xmluri)

        xslt = win32com.client.Dispatch("Msxml2.XSLTemplate.4.0")
        xslt.stylesheet = xsl
        proc = xslt.createProcessor()
        proc.input = xml

        params = {"url":environ['QUERY_STRING'].split("=")[1]}
        for i, v in enumerate(environ['QUERY_STRING'].split("/")[1:]):
            params["selected_section%s" % (i + 1)] = "/" + v

        for param, value in params.items():
            proc.addParameter(param, value)
        proc.transform()
        return proc.output

    print "Content-Type: text/html\n\n"
    print buildPage()


    Ross Ridge
     
    Ross Ridge, Sep 27, 2006
    #2

  3. Damian

    Damian Guest

    Ross Ridge wrote:
    > It could just be that 4suite is slower than MSXML. If so, you can use
    > MSXML in Python if you want. You'll need to install the Python for
    > Windows extensions. Something like this:


    Thanks for that Ross. That would make sense, I'd read somewhere that
    the 4suite code was a little slower but wanted to be sure it wasn't
    just something I'd done wrong.

    I've experimented with the code you supplied and after installing the
    necessaries have run into a brick wall with a series of errors.

    The errors can be seen at http://python.pointy.co.nz/test (I'm leaving
    the existing, slower version running at the moment for the rest of the
    site).

    I've got to get back on with some work but will look further into this
    tonight.

    Thanks for your help! I really appreciate it.
     
    Damian, Sep 28, 2006
    #3
  4. Damian

    Ross Ridge Guest

    Damian wrote:
    > The errors can be seen at http://python.pointy.co.nz/test (I'm leaving
    > the existing, slower version running at the moment for the rest of the
    > site).


    Hmm... it seems that you don't have MSXML 4.0 installed on your
    machine. I missed the fact that you're using ASP.NET, so your ASP code
    is probably using the .NET XML implementation instead of
    MSXML. In that case, another alternative might be to use IronPython
    and just translate your ASP script into Python.

    Ross Ridge
     
    Ross Ridge, Sep 28, 2006
    #4
  5. Damian

    Damian Guest

    Ross Ridge wrote:
    > Hmm... it seems that you don't have MSXML 4.0 installed on your
    > machine. I missed the fact that you're using ASP.NET, so your ASP code
    > is probably using the .NET XML implementation instead of
    > MSXML. In that case, another alternative might be to use IronPython
    > and just translate your ASP script into Python.
    >
    > Ross Ridge


    Sorted!

    I installed msxml4 and then struggled for an hour or so with an
    encoding error (UnicodeEncodeError: 'ascii' codec.... etc) which was
    fixed by altering your code from:

    return proc.output --> return proc.output.encode('utf-8')
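
    For readers hitting the same error: the transform result is a Unicode
    string, and the failure comes from implicitly encoding it as ASCII on
    output. A minimal sketch of the difference, written for Python 3 (an
    assumption; the thread uses Python 2.5) with a made-up sample string.
    encode_page is a hypothetical helper name:

```python
def encode_page(text, encoding="utf-8"):
    """Encode transform output explicitly before writing it to the client."""
    return text.encode(encoding)


# Hypothetical page text containing one non-ASCII character (e-acute).
sample = "Caf\u00e9 team"

try:
    sample.encode("ascii")  # what an implicit ASCII conversion attempts
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

page_bytes = encode_page(sample)
```

    If the page is sent as UTF-8, the Content-Type header should normally
    say so as well, e.g. text/html; charset=utf-8.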

    The performance gain of MSXML over 4suite is substantial.
    4suite: http://python.pointy.co.nz/test = 2.5s
    MSXML: http://python.pointy.co.nz/test_msxml = 1.1s

    I'd like to eventually break all ties with MS at some stage. It'll be
    interesting to test this performance on a Linux server.
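
    To compare processors on another machine without relying on page-load
    times, the transform call itself can be timed. A rough sketch, assuming
    Python 3; time_transform and the stand-in workload are hypothetical,
    and any function that runs one transform could be passed in instead:

```python
import time


def time_transform(func, repeat=5):
    """Call func() `repeat` times; return the fastest wall-clock time in seconds."""
    best = None
    for _ in range(repeat):
        start = time.perf_counter()
        func()
        elapsed = time.perf_counter() - start
        if best is None or elapsed < best:
            best = elapsed
    return best


if __name__ == "__main__":
    # Stand-in workload; replace with the function that runs the transform.
    print("best of 5: %.4fs" % time_transform(lambda: sum(range(100000))))
```

    Taking the minimum of several runs reduces noise from other processes.
    It measures only the call itself, not CGI start-up or network time.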

    Thank you for your help Ross.
     
    Damian, Sep 28, 2006
    #5
  6. Ross Ridge wrote:


    >> The problem is, the Python version is (at a guess) about three times
    >> slower than the ASP one.

    >
    > It could just be that 4suite is slower than MSXML. If so, you can use
    > MSXML in Python if you want.


    or use lxml:

    http://codespeak.net/lxml/

    (does anyone have any lxml/libxslt vs. msxml benchmarks, btw?)
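
    For anyone wanting to try the lxml route, a minimal sketch of the same
    transform, assuming Python 3 and an installed lxml. The function name
    transform_with_lxml is hypothetical; etree.parse, etree.XSLT and
    etree.XSLT.strparam are lxml calls. The import is done inside the
    function only so the sketch loads on machines without lxml:

```python
def transform_with_lxml(xml_path, xsl_path, params):
    """Apply the stylesheet at xsl_path to xml_path and return the result as text."""
    # Imported lazily so this module can be loaded where lxml is absent.
    from lxml import etree

    stylesheet = etree.XSLT(etree.parse(xsl_path))
    document = etree.parse(xml_path)
    # XSLT parameters are XPath expressions; strparam() quotes plain strings.
    quoted = dict(
        (name, etree.XSLT.strparam(value)) for name, value in params.items()
    )
    result = stylesheet(document, **quoted)
    # str() serialises the result according to the stylesheet's xsl:output.
    return str(result)


if __name__ == "__main__":
    print(transform_with_lxml("website.xml", "xsl/plainpage.xsl",
                              {"url": "/about/team"}))
```

    Keyword arguments only work for parameter names that are valid Python
    identifiers, which holds for url and selected_section1 and so on.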

    </F>
     
    Fredrik Lundh, Sep 28, 2006
    #6
  9. Damian

    Larry Bates Guest

    Damian wrote:
    > Hi, I'm from an ASP.NET background and am considering making the switch
    > to Python. I decided to develop my next project in tandem to test the
    > waters and everything is working well, loving the language, etc.
    >
    > What I've got is:
    > two websites, one in ASP.NET v2 and one in Python 2.5 (using 4suite for
    > XML/XSLT)
    > both on the same box (Windows Server 2003)
    > both using the same XML, XSLT, CSS
    >
    > The problem is, the Python version is (at a guess) about three times
    > slower than the ASP one. I'm very new to the language and it's likely
    > that I'm doing something wrong here:
    >
    > from os import environ
    > from Ft.Lib.Uri import OsPathToUri
    > from Ft.Xml import InputSource
    > from Ft.Xml.Xslt import Processor
    >
    > def buildPage():
    >     try:
    >         xsluri = OsPathToUri('xsl/plainpage.xsl')
    >         xmluri = OsPathToUri('website.xml')
    >
    >         xsl = InputSource.DefaultFactory.fromUri(xsluri)
    >         xml = InputSource.DefaultFactory.fromUri(xmluri)
    >
    >         proc = Processor.Processor()
    >         proc.appendStylesheet(xsl)
    >
    >         params = {"url":environ['QUERY_STRING'].split("=")[1]}
    >         for i, v in enumerate(environ['QUERY_STRING'].split("/")[1:]):
    >             params["selected_section%s" % (i + 1)] = "/" + v
    >
    >         return proc.run(xml, topLevelParams=params)
    >     except:
    >         return "Error blah blah"
    >
    > print "Content-Type: text/html\n\n"
    > print buildPage()
    >
    > You can compare the development sites here:
    > asp: http://consultum.pointy.co.nz/about/team
    > python: http://python.pointy.co.nz/about/team
    >
    > Cheers!
    >

    For max speed you might want to try pyrxp:

    http://www.reportlab.org/pyrxp.html

    -Larry
     
    Larry Bates, Sep 28, 2006
    #9
  10. Microsoft has put a lot of effort into their XML libraries as they are
    (or will be) the foundation of most of their software suites. I think
    you'll be hard pressed to find a library that exceeds it in both
    breadth of functionality and performance.

    Istvan
     
    Istvan Albert, Sep 28, 2006
    #10
  11. Damian

    Jan Dries Guest

    Larry Bates wrote:
    > Damian wrote:

    [...]
    > > What I've got is:
    > > two websites, one in ASP.NET v2 and one in Python 2.5 (using 4suite for
    > > XML/XSLT)
    > > both on the same box (Windows Server 2003)
    > > both using the same XML, XSLT, CSS
    > >
    > > The problem is, the Python version is (at a guess) about three times
    > > slower than the ASP one. I'm very new to the language and it's likely
    > > that I'm doing something wrong here:

    [...]
    > >

    > For max speed you might want to try pyrxp:
    >
    > http://www.reportlab.org/pyrxp.html
    >


    Except that pyrxp, to the best of my knowledge, is an XML parser and
    doesn't support XSLT, which is a requirement for Damian.

    Regards,
    Jan
     
    Jan Dries, Sep 28, 2006
    #11
  12. Damian

    Larry Bates Guest

    Jan Dries wrote:
    > Larry Bates wrote:
    >> Damian wrote:

    > [...]
    >> > What I've got is:
    >> > two websites, one in ASP.NET v2 and one in Python 2.5 (using 4suite for
    >> > XML/XSLT)
    >> > both on the same box (Windows Server 2003)
    >> > both using the same XML, XSLT, CSS
    >> >
    >> > The problem is, the Python version is (at a guess) about three times
    >> > slower than the ASP one. I'm very new to the language and it's likely
    >> > that I'm doing something wrong here:

    > [...]
    >> >

    >> For max speed you might want to try pyrxp:
    >>
    >> http://www.reportlab.org/pyrxp.html
    >>

    >
    > Except that pyrxp, to the best of my knowledge, is an XML parser and
    > doesn't support XSLT, which is a requirement for Damian.
    >
    > Regards,
    > Jan

    Oops, I should have read the OP's post more closely.

    -Larry
     
    Larry Bates, Sep 28, 2006
    #12
  14. Jan Dries wrote:

    >> For max speed you might want to try pyrxp:
    >>
    >> http://www.reportlab.org/pyrxp.html

    >
    > Except that pyrxp, to the best of my knowledge, is an XML parser and
    > doesn't support XSLT, which is a requirement for Damian.


    and last time I checked, both cElementTree and libxml2 (lxml.etree) were
    faster, so the "max speed" claim isn't that accurate either...

    </F>
     
    Fredrik Lundh, Sep 28, 2006
    #14
  15. Damian

    Damian Guest

    Sorry about the multiple posts, folks. I suspect it was the "FasterFox"
    Firefox extension I installed yesterday tricking me.

    I had a brief look at libxml(?) on my Ubuntu machine but haven't run it
    on the server.

    I'll look into pyrxp Larry.

    I have to say I'm struggling a little with the discoverability and
    documentation side of things with Python. I realise that
    discoverability is purported to be one of its strong sides, but I'm
    coming from the Visual Studio IDE, where Intellisense looks after
    everything as you type and shows exactly which methods are available
    on which class and which variables are required and why. Compared
    with that, what I've seen so far is not that flash.

    I've installed Eclipse with Pydev (very impressive) on my Linux box and
    am using IDLE on Windows and it could just be my lack of familiarity
    that is letting me down. Any other IDE recommendations?

    I'd be keen to test pyrxp and libxslt but may need help with the code -
    I spent literally hours yesterday trying to make a 20-line bit of code
    work. To make things worse I started with 4suite in Ubuntu and it
    refused to work with an error about not being able to find default.cat
    or something. Googled for hours with no luck.

    That said, I really want to make the switch and so far Python looks to
    be the best choice.

    Cheers
    Damian
     
    Damian, Sep 28, 2006
    #15
  16. Damian

    Damian Guest

    Ahhhh, thanks for that, I've been searching the documentation and it
    only briefly mentions XSLT but it sounds like a half-arsed attempt.
     
    Damian, Sep 28, 2006
    #16
  17. Damian

    Guest

    Damian wrote:
    > Hi, I'm from an ASP.NET background an am considering making the switch
    > to Python. I decided to develop my next project in tandem to test the
    > waters and everything is working well, loving the language, etc.
    >
    > What I've got is:
    > two websites, one in ASP.NET v2 and one in Python 2.5 (using 4suite for
    > XML/XSLT)
    > both on the same box (Windows Server 2003)
    > both using the same XML, XSLT, CSS
    >
    > The problem is, the Python version is (at a guess) about three times
    > slower than the ASP one. I'm very new to the language and it's likely


    The ASP one being MSXML, right? In that case that result doesn't
    surprise me.

    > that I'm doing something wrong here:


    Not wrong, but we can definitely simplify your API usage.

    > from os import environ
    > from Ft.Lib.Uri import OsPathToUri
    > from Ft.Xml import InputSource
    > from Ft.Xml.Xslt import Processor
    >
    > def buildPage():
    >     try:
    >         xsluri = OsPathToUri('xsl/plainpage.xsl')
    >         xmluri = OsPathToUri('website.xml')
    >
    >         xsl = InputSource.DefaultFactory.fromUri(xsluri)
    >         xml = InputSource.DefaultFactory.fromUri(xmluri)
    >
    >         proc = Processor.Processor()
    >         proc.appendStylesheet(xsl)
    >
    >         params = {"url":environ['QUERY_STRING'].split("=")[1]}
    >         for i, v in enumerate(environ['QUERY_STRING'].split("/")[1:]):
    >             params["selected_section%s" % (i + 1)] = "/" + v
    >
    >         return proc.run(xml, topLevelParams=params)
    >     except:
    >         return "Error blah blah"
    >
    > print "Content-Type: text/html\n\n"
    > print buildPage()


    This should work:

    from os import environ
    from Ft.Xml.Xslt import Transform

    def buildPage():
        try:
            params = {"url":environ['QUERY_STRING'].split("=")[1]}
            for i, v in enumerate(environ['QUERY_STRING'].split("/")[1:]):
                params["selected_section%s" % (i + 1)] = "/" + v

            return Transform('website.xml', 'xsl/plainpage.xsl',
                             topLevelParams=params)
        except:
            return "Error blah blah"

    print "Content-Type: text/html\n\n"
    print buildPage()

    -- % --

    For what it's worth, I just developed, and switched to, WSGI middleware
    that only does the transform on the server side if the client doesn't
    understand XSLT. It's called applyxslt and is part of wsgi.xml [1].
    That reduces server load, and with caching (via Myghty), there's really
    no issue for me. For more on WSGI middleware see [2].

    [1] http://uche.ogbuji.net/tech/4suite/wsgixml/
    [2] http://www.ibm.com/developerworks/library/wa-wsgi/
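
    The general idea can be sketched as plain WSGI middleware. This is not
    the applyxslt code from wsgi.xml: xslt_fallback, the capable-agent
    tokens, demo_app and fake_transform are hypothetical names chosen for
    illustration, the User-Agent check is deliberately crude, and Python 3
    is assumed. It also assumes the XML already carries an xml-stylesheet
    processing instruction for clients that transform it themselves:

```python
def xslt_fallback(app, transform, capable=("MSIE", "Gecko")):
    """Wrap a WSGI app; run transform(xml_bytes) on the server only when
    the client is not known to apply XSLT itself."""

    def middleware(environ, start_response):
        captured = {}
        chunks = []

        def capture(status, headers, exc_info=None):
            # Hold back the real start_response until the body is known.
            captured["status"] = status
            captured["headers"] = list(headers)
            return chunks.append  # legacy write() callable

        result = app(environ, capture)
        try:
            chunks.extend(result)
        finally:
            if hasattr(result, "close"):
                result.close()
        body = b"".join(chunks)

        headers = captured["headers"]
        lowered = dict((k.lower(), v) for k, v in headers)
        media_type = lowered.get("content-type", "").split(";")[0].strip()
        is_xml = media_type in ("application/xml", "text/xml")
        agent = environ.get("HTTP_USER_AGENT", "")
        client_can_transform = any(token in agent for token in capable)

        drop = ["content-length"]
        if is_xml and not client_can_transform:
            # Server-side fallback: send HTML instead of raw XML.
            body = transform(body)
            drop.append("content-type")
        headers = [(k, v) for k, v in headers if k.lower() not in drop]
        if "content-type" in drop:
            headers.append(("Content-Type", "text/html; charset=utf-8"))
        headers.append(("Content-Length", str(len(body))))

        start_response(captured["status"], headers)
        return [body]

    return middleware


def demo_app(environ, start_response):
    """Stand-in application that always returns a tiny XML document."""
    start_response("200 OK", [("Content-Type", "application/xml")])
    return [b"<page/>"]


def fake_transform(xml_bytes):
    """Placeholder for a real XSLT call (4Suite, lxml, ...)."""
    return b"<html>" + xml_bytes + b"</html>"
```

    Wrapping is then a one-liner, e.g. app = xslt_fallback(demo_app,
    fake_transform). Buffering the whole response keeps the sketch short
    but gives up streaming.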

    --
    Uche Ogbuji Fourthought, Inc.
    http://uche.ogbuji.net http://fourthought.com
    http://copia.ogbuji.net http://4Suite.org
    Articles: http://uche.ogbuji.net/tech/publications/
     
    , Sep 29, 2006
    #17
  18. Damian

    Guest

    Ross Ridge wrote:
    > Damian wrote:
    > It could just be that 4suite is slower than MSXML. If so, you can use
    > MSXML in Python if you want. You'll need to install the Python for
    > Windows extensions. Something like this:
    >
    > from os import environ
    > import win32com.client
    >
    > def buildPage():


    [SNIP]

    Added to:

    http://uche.ogbuji.net/tech/akara/nodes/2003-01-01/python-xslt

    --
    Uche Ogbuji Fourthought, Inc.
    http://uche.ogbuji.net http://fourthought.com
    http://copia.ogbuji.net http://4Suite.org
    Articles: http://uche.ogbuji.net/tech/publications/
     
    , Sep 29, 2006
    #18
  19. Damian

    Guest

    wrote:
    > For what it's worth I just developed, and switched to WSGI middleware
    > that only does the transform on the server side if the client doesn't
    > understand XSLT. It's called applyxslt and is part of wsgi.xml [1].
    > That reduces server load, and with caching (via Myghty), there's really
    > no issue for me. For more on WSGI middleware see [2].
    >
    > [1] http://uche.ogbuji.net/tech/4suite/wsgixml/
    > [2] http://www.ibm.com/developerworks/library/wa-wsgi/


    I just wanted to clarify that not only does the applyxslt middleware
    approach reduce server load, but in the case of clients running IE6 or
    IE7, the XSLT *does* end up being executed in MSXML after all: MSXML on
    the client's browser, rather than on the server. In the case of
    Mozilla it's Transformiix, which is between MSXML and 4Suite in
    performance. Not sure what's the XSLT processor in the case of Safari
    (only the most recent versions of Safari). But regardless, with that
    coverage you can write apps using XSLT, support the entire spectrum of
    browsers (and mobile apps, spiders, etc.) and yet rarely ever require
    XSLT applied on the server side.


    > --
    > Uche Ogbuji Fourthought, Inc.
    > http://uche.ogbuji.net http://fourthought.com
    > http://copia.ogbuji.net http://4Suite.org
    > Articles: http://uche.ogbuji.net/tech/publications/
     
    , Sep 29, 2006
    #19
  20. Damian

    Jordan Guest

    If you're using Python 2.4.3, or essentially any of the 2.3 or 2.4 series,
    I'd test out PyScripter as an IDE; it's one of the best that I've used.
    Unfortunately, they have yet to fully accommodate 2.5 code (you can
    still write 2.5 code with almost no problems, but you won't be able to
    use a 2.5 interactive interpreter).


    Damian wrote:
    > Sorry about the multiple posts folks. I suspect it was the "FasterFox"
    > FireFox extension I installed yesterday tricking me.
    >
    > I had a brief look at libxml(?) on my Ubuntu machine but haven't run it
    > on the server.
    >
    > I'll look into pyrxp Larry.
    >
    > I have to say I'm struggling a little with the discoverability and
    > documentation side of things with Python. I realise that
    > discoverability is purported to be one of its strong sides but coming
    > from the Visual Studio IDE where Intellisense looks after everything as
    > you are typing and you can see exactly what methods are available to
    > what class and what variables are required and why what I've seen so
    > far is not that flash.
    >
    > I've installed Eclipse with Pydev (very impressive) on my Linux box and
    > am using IDLE on Windows and it could just be my lack of familiarity
    > that is letting me down. Any other IDE recommendations?
    >
    > I'd be keen to test pyrxp and libxslt but may need help with the code -
    > I spent literally hours yesterday trying to make a 20-line bit of code
    > work. To make things worse I started with 4suite in Ubuntu and it
    > refused to work with an error about not being able to find default.cat
    > or something. Googled for hours with no luck.
    >
    > That said, I really want to make the switch and so far Python looks to
    > be the best choice.
    >
    > Cheers
    > Damian
     
    Jordan, Sep 29, 2006
    #20