XSLT speed comparisons

Damian

Hi, I'm from an ASP.NET background and am considering making the switch
to Python. I decided to develop my next project in tandem to test the
waters and everything is working well, loving the language, etc.

What I've got is:
two websites, one in ASP.NET v2 and one in Python 2.5 (using 4suite for
XML/XSLT)
both on the same box (Windows Server 2003)
both using the same XML, XSLT, CSS

The problem is, the Python version is (at a guess) about three times
slower than the ASP one. I'm very new to the language and it's likely
that I'm doing something wrong here:

from os import environ
from Ft.Lib.Uri import OsPathToUri
from Ft.Xml import InputSource
from Ft.Xml.Xslt import Processor

def buildPage():
    try:
        xsluri = OsPathToUri('xsl/plainpage.xsl')
        xmluri = OsPathToUri('website.xml')

        xsl = InputSource.DefaultFactory.fromUri(xsluri)
        xml = InputSource.DefaultFactory.fromUri(xmluri)

        proc = Processor.Processor()
        proc.appendStylesheet(xsl)

        params = {"url":environ['QUERY_STRING'].split("=")[1]}
        for i, v in enumerate(environ['QUERY_STRING'].split("/")[1:]):
            params["selected_section%s" % (i + 1)] = "/" + v

        return proc.run(xml, topLevelParams=params)
    except:
        return "Error blah blah"

print "Content-Type: text/html\n\n"
print buildPage()

You can compare the development sites here:
asp: http://consultum.pointy.co.nz/about/team
python: http://python.pointy.co.nz/about/team

Cheers!
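
The parameter-building lines in the script above can be pulled out into a small function, which makes them easier to test away from the web server. This is only a sketch: it assumes the query string looks like "url=/about/team" (a guess based on the code, not something stated in the post), build_params is a made-up name, and it is written for Python 3 rather than 2.5.

```python
def build_params(query_string):
    """Turn 'url=/about/team' into the XSLT top-level parameters.

    Assumed input shape: a single 'url=<path>' pair (hypothetical).
    """
    # partition() splits on the first "=" only, so a later "=" in the
    # path stays part of the value.
    _, sep, url = query_string.partition("=")
    if not sep:
        raise ValueError("expected 'url=<path>', got %r" % query_string)

    params = {"url": url}
    # "/about/team".split("/") gives ['', 'about', 'team']; skip the
    # empty first item and number the remaining segments from 1.
    for i, segment in enumerate(url.split("/")[1:]):
        params["selected_section%d" % (i + 1)] = "/" + segment
    return params


if __name__ == "__main__":
    print(build_params("url=/about/team"))
```

Raising ValueError for a malformed query string, instead of catching everything with a bare except, keeps the real cause visible while debugging.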
 
Ross Ridge

Damian said:
two websites, one in ASP.NET v2 and one in Python 2.5 (using 4suite for
XML/XSLT)
both on the same box (Windows Server 2003)
both using the same XML, XSLT, CSS

The problem is, the Python version is (at a guess) about three times
slower than the ASP one.

It could just be that 4suite is slower than MSXML. If so, you can use
MSXML in Python if you want. You'll need to install the Python for
Windows extensions. Something like this:

from os import environ
import win32com.client

def buildPage():
    xsluri = 'xsl/plainpage.xsl'
    xmluri = 'website.xml'

    xsl = win32com.client.Dispatch("Msxml2.FreeThreadedDOMDocument.4.0")
    xml = win32com.client.Dispatch("Msxml2.DOMDocument.4.0")
    xsl.load(xsluri)
    xml.load(xmluri)

    xslt = win32com.client.Dispatch("Msxml2.XSLTemplate.4.0")
    xslt.stylesheet = xsl
    proc = xslt.createProcessor()
    proc.input = xml

    params = {"url":environ['QUERY_STRING'].split("=")[1]}
    for i, v in enumerate(environ['QUERY_STRING'].split("/")[1:]):
        params["selected_section%s" % (i + 1)] = "/" + v

    for param, value in params.items():
        proc.addParameter(param, value)
    proc.transform()
    return proc.output

print "Content-Type: text/html\n\n"
print buildPage()


Ross Ridge
 
Damian

Ross said:
It could just be that 4suite is slower than MSXML. If so, you can use
MSXML in Python if you want. You'll need to install the Python for
Windows extensions. Something like this:

Thanks for that Ross. That would make sense, I'd read somewhere that
the 4suite code was a little slower but wanted to be sure it wasn't
just something I'd done wrong.

I've experimented with the code you supplied and after installing the
necessaries have run into a brick wall with a series of errors.

The errors can be seen at http://python.pointy.co.nz/test (I'm leaving
the existing, slower version running at the moment for the rest of the
site).

I've got to get back on with some work but will look further into this
tonight.

Thanks for your help! I really appreciate it.
 
Ross Ridge

Damian said:
The errors can be seen at http://python.pointy.co.nz/test (I'm leaving
the existing, slower version running at the moment for the rest of the
site).

Hmm... it seems that you don't have MSXML 4.0 installed on your
machine. I missed the fact that you're using ASP.NET, so your ASP code
is probably using the .NET XML implementation instead of
MSXML. In that case, another alternative might be to use IronPython
and just translate your ASP script into Python.

Ross Ridge
 
Damian

Ross said:
Hmm... it seems that you don't have MSXML 4.0 installed on your
machine. I missed the fact that you're using ASP.NET, so your ASP code
is probably using the .NET XML implementation instead of
MSXML. In that case, another alternative might be to use IronPython
and just translate your ASP script into Python.

Ross Ridge

Sorted!

I installed msxml4 and then struggled for an hour or so with an
encoding error (UnicodeEncodeError: 'ascii' codec.... etc) which was
fixed by altering your code from:

return proc.output --> return proc.output.encode('utf-8')
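
For anyone hitting the same error: the transform output is presumably a Unicode string, and writing it out forces an encoding to be chosen. The sketch below shows the failure and the fix in isolation. It is written for Python 3 (the thread uses 2.5), and the sample markup and the function name are invented for the illustration.

```python
def to_utf8_body(markup):
    """Encode transform output explicitly instead of relying on ASCII."""
    return markup.encode("utf-8")


# Invented sample with one non-ASCII character (e with an acute accent).
sample = "<p>Caf\u00e9</p>"

try:
    sample.encode("ascii")
except UnicodeEncodeError as exc:
    # This is the same class of error the thread describes.
    print("ascii fails:", exc.reason)

print(to_utf8_body(sample))
```

If the page is sent as UTF-8, it also helps to say so in the header, for example "Content-Type: text/html; charset=utf-8".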

The performance gain of MSXML over 4suite is substantial.
4suite: http://python.pointy.co.nz/test = 2.5s
MSXML: http://python.pointy.co.nz/test_msxml = 1.1s
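
It is not clear from the post how these two figures were measured. To compare only the transform step, a small timing helper along these lines could wrap each buildPage() variant. It is a sketch: average_seconds is a made-up helper, time.perf_counter needs Python 3.3 or later, and the stand-in workload is there only so the example runs on its own.

```python
import time


def average_seconds(func, repeats=5):
    """Call func() `repeats` times and return the mean wall-clock time."""
    start = time.perf_counter()
    for _ in range(repeats):
        func()
    return (time.perf_counter() - start) / repeats


def stand_in_workload():
    # Placeholder for a real buildPage() call.
    return sum(i * i for i in range(10000))


if __name__ == "__main__":
    print("%.6f s per call" % average_seconds(stand_in_workload))
```

Timing inside the process leaves out interpreter start-up and web server overhead, so the numbers will not match page-load times.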

I'd like to eventually break all ties with MS at some stage. It'll be
interesting to test this performance on a Linux server.

Thank you for your help Ross.
 
Larry Bates

Damian said:
Hi, I'm from an ASP.NET background and am considering making the switch
to Python. I decided to develop my next project in tandem to test the
waters and everything is working well, loving the language, etc.

What I've got is:
two websites, one in ASP.NET v2 and one in Python 2.5 (using 4suite for
XML/XSLT)
both on the same box (Windows Server 2003)
both using the same XML, XSLT, CSS

The problem is, the Python version is (at a guess) about three times
slower than the ASP one. I'm very new to the language and it's likely
that I'm doing something wrong here:

from os import environ
from Ft.Lib.Uri import OsPathToUri
from Ft.Xml import InputSource
from Ft.Xml.Xslt import Processor

def buildPage():
    try:
        xsluri = OsPathToUri('xsl/plainpage.xsl')
        xmluri = OsPathToUri('website.xml')

        xsl = InputSource.DefaultFactory.fromUri(xsluri)
        xml = InputSource.DefaultFactory.fromUri(xmluri)

        proc = Processor.Processor()
        proc.appendStylesheet(xsl)

        params = {"url":environ['QUERY_STRING'].split("=")[1]}
        for i, v in enumerate(environ['QUERY_STRING'].split("/")[1:]):
            params["selected_section%s" % (i + 1)] = "/" + v

        return proc.run(xml, topLevelParams=params)
    except:
        return "Error blah blah"

print "Content-Type: text/html\n\n"
print buildPage()

You can compare the development sites here:
asp: http://consultum.pointy.co.nz/about/team
python: http://python.pointy.co.nz/about/team

Cheers!

For max speed you might want to try pyrxp:

http://www.reportlab.org/pyrxp.html

-Larry
 
Istvan Albert

Microsoft has put a lot of effort into their XML libraries as they are
(or will be) the foundation of most of their software suites. I think
you'll be hard pressed to find a library that exceeds them in both
breadth of functionality and performance.

Istvan
 
Jan Dries

Larry said:
Damian wrote: [...]
What I've got is:
two websites, one in ASP.NET v2 and one in Python 2.5 (using 4suite for
XML/XSLT)
both on the same box (Windows Server 2003)
both using the same XML, XSLT, CSS

The problem is, the Python version is (at a guess) about three times
slower than the ASP one. I'm very new to the language and it's likely
that I'm doing something wrong here: [...]

For max speed you might want to try pyrxp:

http://www.reportlab.org/pyrxp.html

Except that pyrxp, to the best of my knowledge, is an XML parser and
doesn't support XSLT, which is a requirement for Damian.

Regards,
Jan
 
Larry Bates

Jan said:
Larry said:
Damian wrote: [...]
What I've got is:
two websites, one in ASP.NET v2 and one in Python 2.5 (using 4suite for
XML/XSLT)
both on the same box (Windows Server 2003)
both using the same XML, XSLT, CSS

The problem is, the Python version is (at a guess) about three times
slower than the ASP one. I'm very new to the language and it's likely
that I'm doing something wrong here: [...]

For max speed you might want to try pyrxp:

http://www.reportlab.org/pyrxp.html

Except that pyrxp, to the best of my knowledge, is an XML parser and
doesn't support XSLT, which is a requirement for Damian.

Regards,
Jan

Oops, I should have read the OP's post more closely.

-Larry
 
Fredrik Lundh

Jan said:
Except that pyrxp, to the best of my knowledge, is an XML parser and
doesn't support XSLT, which is a requirement for Damian.

and last time I checked, both cElementTree and libxml2 (lxml.etree) were
faster, so the "max speed" claim isn't that accurate either...

</F>
 
Damian

Sorry about the multiple posts folks. I suspect it was the "FasterFox"
Firefox extension I installed yesterday tricking me.

I had a brief look at libxml(?) on my Ubuntu machine but haven't run it
on the server.

I'll look into pyrxp Larry.

I have to say I'm struggling a little with the discoverability and
documentation side of things with Python. I realise that
discoverability is purported to be one of its strong sides, but I'm
coming from the Visual Studio IDE, where IntelliSense looks after
everything as you type: you can see exactly which methods are available
on which class, and which arguments are required and why. Compared with
that, what I've seen so far is not that flash.

I've installed Eclipse with Pydev (very impressive) on my Linux box and
am using IDLE on Windows and it could just be my lack of familiarity
that is letting me down. Any other IDE recommendations?

I'd be keen to test pyrxp and libxslt but may need help with the code -
I spent literally hours yesterday trying to make a 20-line bit of code
work. To make things worse I started with 4suite in Ubuntu and it
refused to work with an error about not being able to find default.cat
or something. Googled for hours with no luck.
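
Since libxslt came up: a minimal sketch of the same kind of transform through lxml (the lxml.etree binding Fredrik mentions) might look like this. The inline XML and stylesheet are invented stand-ins for website.xml and xsl/plainpage.xsl, build_page is a made-up name, and it assumes lxml is installed. It is written for Python 3.

```python
try:
    from lxml import etree   # third-party package: pip install lxml
except ImportError:
    etree = None

# Invented stand-ins for website.xml and xsl/plainpage.xsl.
XML = b"<site name='Example'/>"
XSL = b"""<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:param name="url"/>
  <xsl:template match="/">
    <xsl:value-of select="concat(/site/@name, ' ', $url)"/>
  </xsl:template>
</xsl:stylesheet>"""


def build_page(url):
    if etree is None:
        raise RuntimeError("lxml is not installed")
    # Compile the stylesheet into a reusable transform object.
    transform = etree.XSLT(etree.XML(XSL))
    # strparam() quotes the value so it arrives as an XPath string.
    result = transform(etree.XML(XML), url=etree.XSLT.strparam(url))
    return str(result)


if __name__ == "__main__" and etree is not None:
    print(build_page("/about/team"))
```

In a long-running process the compiled transform could be created once and reused across requests; under plain CGI it is rebuilt on every hit.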

That said, I really want to make the switch and so far Python looks to
be the best choice.

Cheers
Damian
 
Damian

Ahhhh, thanks for that, I've been searching the documentation and it
only briefly mentions XSLT but it sounds like a half-arsed attempt.
 
uche.ogbuji

Damian said:
Hi, I'm from an ASP.NET background and am considering making the switch
to Python. I decided to develop my next project in tandem to test the
waters and everything is working well, loving the language, etc.

What I've got is:
two websites, one in ASP.NET v2 and one in Python 2.5 (using 4suite for
XML/XSLT)
both on the same box (Windows Server 2003)
both using the same XML, XSLT, CSS

The problem is, the Python version is (at a guess) about three times
slower than the ASP one. I'm very new to the language and it's likely

The ASP one being MSXML, right? In that case that result doesn't
surprise me.

Damian said:
that I'm doing something wrong here:

Not wrong, but we can definitely simplify your use of the API.

Damian said:
from os import environ
from Ft.Lib.Uri import OsPathToUri
from Ft.Xml import InputSource
from Ft.Xml.Xslt import Processor

def buildPage():
    try:
        xsluri = OsPathToUri('xsl/plainpage.xsl')
        xmluri = OsPathToUri('website.xml')

        xsl = InputSource.DefaultFactory.fromUri(xsluri)
        xml = InputSource.DefaultFactory.fromUri(xmluri)

        proc = Processor.Processor()
        proc.appendStylesheet(xsl)

        params = {"url":environ['QUERY_STRING'].split("=")[1]}
        for i, v in enumerate(environ['QUERY_STRING'].split("/")[1:]):
            params["selected_section%s" % (i + 1)] = "/" + v

        return proc.run(xml, topLevelParams=params)
    except:
        return "Error blah blah"

print "Content-Type: text/html\n\n"
print buildPage()

This should work:

from os import environ
from Ft.Xml.Xslt import Transform

def buildPage():
    try:
        params = {"url":environ['QUERY_STRING'].split("=")[1]}
        for i, v in enumerate(environ['QUERY_STRING'].split("/")[1:]):
            params["selected_section%s" % (i + 1)] = "/" + v

        return Transform('website.xml', 'xsl/plainpage.xsl',
                         topLevelParams=params)
    except:
        return "Error blah blah"

print "Content-Type: text/html\n\n"
print buildPage()

-- % --

For what it's worth I just developed, and switched to WSGI middleware
that only does the transform on the server side if the client doesn't
understand XSLT. It's called applyxslt and is part of wsgi.xml [1].
That reduces server load, and with caching (via Myghty), there's really
no issue for me. For more on WSGI middleware see [2].

[1] http://uche.ogbuji.net/tech/4suite/wsgixml/
[2] http://www.ibm.com/developerworks/library/wa-wsgi/
 
uche.ogbuji

For what it's worth I just developed, and switched to WSGI middleware
that only does the transform on the server side if the client doesn't
understand XSLT. It's called applyxslt and is part of wsgi.xml [1].
That reduces server load, and with caching (via Myghty), there's really
no issue for me. For more on WSGI middleware see [2].

[1] http://uche.ogbuji.net/tech/4suite/wsgixml/
[2] http://www.ibm.com/developerworks/library/wa-wsgi/

I just wanted to clarify that not only does the applyxslt middleware
approach reduce server load, but in the case of clients running IE6 or
IE7, the XSLT *does* end up being executed in MSXML after all: MSXML on
the client's browser, rather than on the server. In the case of
Mozilla it's Transformiix, which is between MSXML and 4Suite in
performance. Not sure what's the XSLT processor in the case of Safari
(only the most recent versions of Safari). But regardless, with that
coverage you can write apps using XSLT, support the entire spectrum of
browsers (and mobile apps, spiders, etc.) and yet rarely ever require
XSLT applied on the server side.
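
The idea can be sketched in plain WSGI. This is not the applyxslt code, only a rough illustration of the decision it makes: the User-Agent check is a crude guess, the class and function names are invented, and the actual transform is passed in as a callable so the example runs without any XSLT library. It is written for Python 3.

```python
def client_handles_xslt(user_agent):
    """Crude guess from User-Agent substrings (an assumption, not a spec)."""
    return any(token in user_agent for token in ("MSIE 6", "MSIE 7", "Gecko/"))


class ServerSideXsltFallback:
    """Pass XML through to capable clients, transform it for the rest."""

    def __init__(self, app, transform):
        self.app = app
        self.transform = transform  # callable: XML bytes -> HTML bytes

    def __call__(self, environ, start_response):
        if client_handles_xslt(environ.get("HTTP_USER_AGENT", "")):
            # The browser applies the stylesheet itself.
            return self.app(environ, start_response)

        captured = {}

        def capture(status, headers, exc_info=None):
            # Hold back the wrapped app's status and headers for now.
            captured["status"] = status
            captured["headers"] = headers

        chunks = self.app(environ, capture)
        try:
            xml_body = b"".join(chunks)
        finally:
            if hasattr(chunks, "close"):
                chunks.close()

        html = self.transform(xml_body)
        # Replace the XML content headers with ones describing the HTML.
        headers = [(name, value) for name, value in captured["headers"]
                   if name.lower() not in ("content-type", "content-length")]
        headers.append(("Content-Type", "text/html; charset=utf-8"))
        headers.append(("Content-Length", str(len(html))))
        start_response(captured["status"], headers)
        return [html]
```

Wrapping an application would then look like ServerSideXsltFallback(xml_app, my_transform), where my_transform is whatever XSLT call the server has available.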
 
Jordan

If you're using Python 2.4.3 or essentially any of the 2.3 or 2.4 series,
I'd test out PyScripter as an IDE; it's one of the best that I've used.
Unfortunately, they have yet to fully accommodate 2.5 code (you can
still write 2.5 code with almost no problems, but you won't be able to
use a 2.5 interactive interpreter).
 
