[ANN] lxml 1.0 released

Stefan Behnel

Hello everyone,

I have the honour to announce the availability of lxml 1.0.

http://codespeak.net/lxml/

It's downloadable from cheeseshop:
http://cheeseshop.python.org/pypi/lxml

"""
lxml is a Pythonic binding for the libxml2 and libxslt libraries. It provides
safe and convenient access to these libraries using the ElementTree API. It
extends the ElementTree API significantly to offer support for XPath, RelaxNG,
XML Schema, XSLT, C14N and much, much more.

Its goals are:

* Pythonic API.
* Documented.
  http://codespeak.net/lxml/#documentation
* FAST!
  http://codespeak.net/lxml/performance.html
* Use Python unicode strings in API.
* Safe (no segfaults).
* No manual memory management!
  (as opposed to the official libxml2 Python bindings)
"""

While the list of features added since the last beta version (1.0.beta) is
rather small, this version contains a large number of bug fixes found by
various users and testers. Thank you all for your help!

Stefan


Features added since 0.9.2:

* Element.getiterator() and the findall() methods support finding
arbitrary elements from a namespace (pattern {namespace}*)
* Another speedup in tree iteration code
* General speedup of Python Element object creation and deallocation
* Writing C14N no longer serializes in memory (reduced memory footprint)
* PyErrorLog for error logging through the Python logging module
* element.getroottree() returns an ElementTree for the root node of the
document that contains the element.
* ElementTree.getpath(element) returns a simple, absolute XPath expression
to find the element in the tree structure
* Error logs have a last_error attribute for convenience
* Comment texts can be changed through the API
* Formatted output via pretty_print keyword to serialization functions
* XSLT can block access to file system and network via XSLTAccessControl
* ElementTree.write() no longer serializes in memory (reduced memory
footprint)
* Speedup of Element.findall(tag) and Element.getiterator(tag)
* Support for writing the XML representation of Elements and ElementTrees
to Python unicode strings via etree.tounicode()
* Support for writing XSLT results to Python unicode strings via unicode()
* Parsing a unicode string no longer copies the string (reduced memory
footprint)
* Parsing file-like objects now reads chunks rather than the whole file
(reduced memory footprint)
* Parsing StringIO objects from the start avoids copying the string
(reduced memory footprint)
* Read-only 'docinfo' attribute in ElementTree class holds DOCTYPE
information, original encoding and XML version as seen by the parser
* etree module can be compiled without libxslt by commenting out the line
include "xslt.pxi" near the end of the etree.pyx source file
* Better error messages in parser exceptions
* Error reporting now also works in XSLT
* Support for custom document loaders (URI resolvers) in parsers and XSLT,
resolvers are registered at parser level
* Implementation of exslt:regexp for XSLT based on the Python 're' module,
enabled by default, can be switched off with 'regexp=False' keyword
argument
* Support for exslt extensions (libexslt) and libxslt extra functions
(node-set, document, write, output)
* Substantial speedup in XPath.evaluate()
* HTMLParser for parsing (broken) HTML
* XMLDTDID function parses XML into tuple (root node, ID dict) based on
xml:id implementation of libxml2 (as opposed to ET compatible XMLID)
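
A few of the features listed above can be tried out with a short script. This is only a sketch: the sample document, the namespace URI and the variable names are made up for illustration, and it assumes an lxml version in which these calls behave as described in the list.

```python
from lxml import etree

# Hypothetical sample document; the namespace URI and tag names are invented.
NS = "http://example.com/ns"
root = etree.XML(
    '<root xmlns:a="http://example.com/ns">'
    "<a:item>one</a:item><plain/><a:other>two</a:other>"
    "</root>"
)

# The pattern {namespace}* matches any element from that namespace.
namespaced = root.findall("{%s}*" % NS)
names = [el.tag for el in namespaced]

# getroottree() returns the ElementTree of the document that contains the
# element; getpath() builds a simple, absolute XPath expression for it.
plain = root[1]
tree = plain.getroottree()
path = tree.getpath(plain)

# pretty_print asks the serializer for indented, line-broken output.
pretty = etree.tostring(root, pretty_print=True)

print(names)
print(path)
print(pretty.decode("ascii"))
```

Evaluating the returned path with tree.xpath(path) should lead back to the same element, which is a quick way to check what getpath() produced.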


Bugs fixed since 0.9.2:

* Memory leak in Element.__setitem__
* Memory leak in Element.attrib.items() and Element.attrib.values()
* Memory leak in XPath extension functions
* Memory leak in unicode related setup code
* Element now raises ValueError on empty tag names
* Namespace fixing after moving elements between documents could fail if
the source document was freed too early
* Setting namespace-less tag names on namespaced elements ('{ns}t' -> 't')
didn't reset the namespace
* Unknown constants from newer libxml2 versions could raise exceptions in
the error handlers
* lxml.etree compiles much faster
* On libxml2 <= 2.6.22, parsing strings with encoding declaration could
fail in certain cases
* Document reference in ElementTree objects was not updated when the root
element was moved to a different document
* Running absolute XPath expressions on an Element now evaluates against
the root tree
* Evaluating absolute XPath expressions (/*) on an ElementTree could fail
* Crashes when calling XSLT, RelaxNG, etc. with uninitialized ElementTree
objects
* Memory leak when using iconv encoders in tostring/write
* Deep copying Elements and ElementTrees maintains the document
information
* Serialization functions raise LookupError for unknown encodings
* Memory deallocation crash resulting from deep copying elements
* Some ElementTree methods could crash if the root node was not
initialized (neither file nor element passed to the constructor)
* Element/SubElement failed to set attribute namespaces from passed attrib
dictionary
* tostring() now adds an XML declaration for non-ASCII encodings
* tostring() failed to serialize encodings that contain 0-bytes
* ElementTree.xpath() and XPathDocumentEvaluator were not using the
ElementTree root node as reference point
* Calling document('') in XSLT failed to return the stylesheet
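
Some of the fixed behaviours above are easy to check from Python. The following is a rough sketch rather than a test suite: the tag names and the namespace URI are invented, and it assumes an lxml version that includes these fixes.

```python
from lxml import etree

# Empty tag names are rejected with a ValueError.
try:
    etree.Element("")
    empty_tag_rejected = False
except ValueError:
    empty_tag_rejected = True

# Renaming '{ns}t' to a namespace-less 't' resets the namespace.
el = etree.Element("{http://example.com/ns}t")
el.tag = "t"

# An absolute XPath expression run on a child element is evaluated
# against the root of the tree, not against the child itself.
root = etree.XML("<root><child/></root>")
hits = root[0].xpath("/*")

print(empty_tag_rejected, el.tag, [h.tag for h in hits])
```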
 
Stefan Behnel

Kent said:
Are there any plans to offer a Windows installer?

Already there. :)

It just takes a minute longer sometimes, but Windows users are not forgotten.

Stefan
 
