[ANNOUNCEMENT]: VTD-XML released under GPL

Discussion in 'Java' started by Jimmy zhang, Jun 29, 2004.

  1. Jimmy zhang

    Jimmy zhang Guest

    I am pleased to announce that version 0.5 of VTD-XML -- a new,
    non-extractive, Java-based XML processing API licensed under the GPL
    -- is now freely available on sourceforge.net. For source code,
    documentation, a detailed description of the API, and code examples,
    please visit

    http://vtd-xml.sf.net

    Capable of random access, VTD-XML attempts to be both memory
    efficient and high performance. The starting point of this project is
    the observation that, for XML documents that don't declare entities
    in a DTD, tokenization can be done by recording only the starting
    offset and length of each token. A discussion of this subject appeared
    in a recent article on xml.com
    (http://www.xml.com/pub/a/2004/05/19/parsing.html).
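To make the offset-and-length idea concrete, here is a minimal sketch in plain Java. It does not use the VTD-XML API; the class `TokenSpan` and its methods are hypothetical names chosen for illustration, and the sketch assumes an ASCII-encoded document held in a byte array.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration, not part of the VTD-XML API.
// A token is described only by where it sits in the original document.
final class TokenSpan {
    final int offset; // byte offset of the token in the document
    final int length; // token length in bytes

    TokenSpan(int offset, int length) {
        this.offset = offset;
        this.length = length;
    }

    // Build a String only when the caller explicitly asks for one.
    String text(byte[] doc) {
        return new String(doc, offset, length, StandardCharsets.US_ASCII);
    }

    // Compare the token against an expected name directly in the document
    // bytes, without creating a String for the token itself.
    boolean matches(byte[] doc, String expected) {
        byte[] e = expected.getBytes(StandardCharsets.US_ASCII);
        if (e.length != length) {
            return false;
        }
        for (int i = 0; i < length; i++) {
            if (doc[offset + i] != e[i]) {
                return false;
            }
        }
        return true;
    }
}
```

For the document `<a><name>x</name></a>`, the element name `name` would be the span with offset 4 and length 4; nothing is copied out of the document until `text` is called.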

    The core technology of VTD-XML is a binary format specification
    called Virtual Token Descriptor (VTD). A VTD record is a 64-bit integer
    that encodes the starting offset, length, type and nesting depth of a
    token in an XML document. Because VTD records don't contain the actual
    token content, they work alongside the original XML document, which
    the processing model keeps intact in memory.
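As a rough illustration of how four fields can share one 64-bit integer, the sketch below packs and unpacks a record with shifts and masks. The field widths used here (32-bit offset, 20-bit length, 8-bit depth, 4-bit type) and the class name `VtdRecordSketch` are assumptions made for this example only; the real bit assignment is defined by the VTD specification and may differ.

```java
// Hypothetical field layout for illustration only; the real bit assignment
// is defined by the VTD specification and may differ.
final class VtdRecordSketch {
    static final int OFFSET_BITS = 32;
    static final int LENGTH_BITS = 20;
    static final int DEPTH_BITS = 8;
    static final int TYPE_BITS = 4;

    static final int LENGTH_SHIFT = OFFSET_BITS;
    static final int DEPTH_SHIFT = LENGTH_SHIFT + LENGTH_BITS;
    static final int TYPE_SHIFT = DEPTH_SHIFT + DEPTH_BITS;

    // Bit mask with the lowest 'bits' bits set.
    static long mask(int bits) {
        return (1L << bits) - 1;
    }

    private static void check(long value, int bits, String name) {
        if (value < 0 || value > mask(bits)) {
            throw new IllegalArgumentException(name + " out of range: " + value);
        }
    }

    // Combine the four fields into a single primitive long.
    static long pack(int type, int depth, int length, long offset) {
        check(type, TYPE_BITS, "type");
        check(depth, DEPTH_BITS, "depth");
        check(length, LENGTH_BITS, "length");
        check(offset, OFFSET_BITS, "offset");
        return ((long) type << TYPE_SHIFT)
                | ((long) depth << DEPTH_SHIFT)
                | ((long) length << LENGTH_SHIFT)
                | offset;
    }

    static int type(long record) {
        return (int) ((record >>> TYPE_SHIFT) & mask(TYPE_BITS));
    }

    static int depth(long record) {
        return (int) ((record >>> DEPTH_SHIFT) & mask(DEPTH_BITS));
    }

    static int length(long record) {
        return (int) ((record >>> LENGTH_SHIFT) & mask(LENGTH_BITS));
    }

    static long offset(long record) {
        return record & mask(OFFSET_BITS);
    }
}
```

Because the record is a primitive `long`, it can live in a `long[]` rather than as an object on the heap.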

    VTD's memory-conserving features can be summarized as follows:

    * Avoid per-object overhead -- In many VM-based object-oriented
    programming languages, each object allocation incurs a small amount
    of memory overhead. A VTD record avoids this overhead because
    it is not an object.
    * Bulk allocation of storage -- Fixed in length, VTD records can be
    stored in large memory blocks, which are more efficient to allocate
    and garbage-collect. By allocating one large array for 4096 VTD
    records, one incurs the per-array overhead (16 bytes in JDK 1.4) only
    once across 4096 records, reducing the per-record overhead to a
    negligible amount.
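The bulk-allocation point can be sketched as a buffer that grows one 4096-record block at a time. With the 16-byte array overhead quoted above, each block costs 16 / 4096, or roughly 0.004 bytes, of overhead per 8-byte record. The class `LongBlockBuffer` is a hypothetical name for this example and is not taken from the VTD-XML source.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of block-wise storage for 64-bit records.
final class LongBlockBuffer {
    static final int BLOCK_SIZE = 4096; // records per block

    private final List<long[]> blocks = new ArrayList<>();
    private int size;

    // Append a record, allocating a new block only when the last one is full.
    void append(long record) {
        int indexInBlock = size % BLOCK_SIZE;
        if (indexInBlock == 0) {
            blocks.add(new long[BLOCK_SIZE]); // one allocation per 4096 records
        }
        blocks.get(size / BLOCK_SIZE)[indexInBlock] = record;
        size++;
    }

    // Random access by record index.
    long get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        return blocks.get(index / BLOCK_SIZE)[index % BLOCK_SIZE];
    }

    int size() {
        return size;
    }

    int blockCount() {
        return blocks.size();
    }
}
```

Growing by whole blocks also means existing records are never copied when the buffer expands, which is one reason to prefer this shape over a single array that is reallocated.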

    Our benchmark indicates that VTD-XML processes XML at a performance
    level similar to (and often better than) SAX with a NULL content
    handler. Memory usage is typically between 1.3x and 1.6x the size of
    the document, with "1" being the document itself.

    Other features included in this release are:

    * Incremental update -- VTD-XML allows one to modify the content of an
    XML document without touching the unaffected parts of the document.
    * Content extraction -- VTD-XML also allows one to pull an element
    out of an XML document in its serialized form. This can be an
    important feature for partial signing/encryption of SOAP payloads for
    WS-Security.
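Both features follow naturally once a token or element is known by its offset and length in the original bytes. The sketch below shows the underlying idea with plain byte-array operations; it does not use the VTD-XML API, and `FragmentOps`, `extract` and `replace` are hypothetical names introduced for illustration.

```java
// Hypothetical illustration, not part of the VTD-XML API.
final class FragmentOps {
    // Content extraction: copy the element's bytes exactly as serialized,
    // so whitespace and attribute order are preserved.
    static byte[] extract(byte[] doc, int offset, int length) {
        byte[] out = new byte[length];
        System.arraycopy(doc, offset, out, 0, length);
        return out;
    }

    // Incremental update: replace one span of the document. The bytes
    // before and after the span are copied verbatim, never re-serialized.
    static byte[] replace(byte[] doc, int offset, int length, byte[] replacement) {
        byte[] out = new byte[doc.length - length + replacement.length];
        System.arraycopy(doc, 0, out, 0, offset);
        System.arraycopy(replacement, 0, out, offset, replacement.length);
        System.arraycopy(doc, offset + length, out,
                offset + replacement.length, doc.length - offset - length);
        return out;
    }
}
```

For the document `<a><b>old</b><c/></a>`, extracting offset 3 with length 10 yields `<b>old</b>` byte for byte, and replacing the 3 bytes at offset 6 with `newer` yields `<a><b>newer</b><c/></a>`. Byte-exact extraction matters for signing, since a signature is computed over the serialized form.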

    In upcoming releases, we plan to add persistence support so that one
    can save/load VTD records to/from disk along with the XML documents
    to avoid repetitive parsing in read-only situations. XPath support is
    also on the development roadmap. However, we would like to collect as
    many suggestions and bug reports as possible before taking the next
    step.
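Since persistence is only planned at this point, the following is purely a sketch of what saving and reloading fixed-length records could look like, not a description of any VTD-XML file format. The layout (a record count followed by the raw 64-bit values) and the class name `RecordStore` are assumptions made for this example.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical on-disk layout: a 4-byte record count, then each record
// as an 8-byte big-endian value. Not an actual VTD-XML format.
final class RecordStore {
    static void save(long[] records, OutputStream out) throws IOException {
        DataOutputStream data = new DataOutputStream(out);
        data.writeInt(records.length);
        for (long record : records) {
            data.writeLong(record);
        }
        data.flush();
    }

    static long[] load(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        int count = data.readInt();
        long[] records = new long[count];
        for (int i = 0; i < count; i++) {
            records[i] = data.readLong();
        }
        return records;
    }
}
```

Because the records hold only offsets and lengths, a reloaded set is meaningful only next to the exact same XML bytes it was built from.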

    Your input and suggestions are very important in making VTD-XML a
    truly useful XML processor.

    Thanks,

    Jimmy Zhang
