XML parse/build metaphor mismatch

Steve Jorgensen

I'm wondering if there's an approach to writing consistent code to read/write
XML data in arbitrary order that I'm simply missing.

It seems easy to get stuff -out- of a DOM via XPath, but it's much
tougher to build a DOM document in arbitrary order. Yes - I can get the
parent context element first, using XPath, but then I end up building custom
wrappers and helpers to simplify creating fragments and adding them in the
correct namespace, etc. Adding attributes is especially weird and
unnecessarily cumbersome.
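
To make the asymmetry concrete, here is a minimal sketch using only Python's standard library (my choice for illustration; the same shape shows up in other DOM bindings). The namespace URI `urn:example:staff` and the document layout are made-up assumptions. Reading takes one path expression, while writing the same data takes one call per element and attribute.

```python
import xml.etree.ElementTree as ET
from xml.dom import minidom

NS = "urn:example:staff"  # hypothetical namespace URI, for illustration only
PATH = "s:Body/s:Employees/s:Employee[@id='123']/s:Name"

# Reading: a single path expression reaches the value.
source = (
    '<Document xmlns="urn:example:staff"><Body><Employees>'
    '<Employee id="123"><Name first="Jane" last="Doe"/></Employee>'
    '</Employees></Body></Document>'
)
root = ET.fromstring(source)
first_name = root.find(PATH, {"s": NS}).get("first")

# Writing: every element and attribute is its own DOM call, and the
# namespace has to be repeated for each element.
doc = minidom.getDOMImplementation().createDocument(NS, "Document", None)
doc.documentElement.setAttribute("xmlns", NS)  # minidom does not emit this itself
body = doc.documentElement.appendChild(doc.createElementNS(NS, "Body"))
employees = body.appendChild(doc.createElementNS(NS, "Employees"))
employee = employees.appendChild(doc.createElementNS(NS, "Employee"))
employee.setAttribute("id", "123")
name_el = employee.appendChild(doc.createElementNS(NS, "Name"))
name_el.setAttribute("first", "Jane")
name_el.setAttribute("last", "Doe")
built_xml = doc.documentElement.toxml()
```

Nothing in the write half resembles the path used in the read half, which is the duplication I'm complaining about.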

I can reuse my own libraries across applications, of course, but I've
essentially invented my own private code island at this point for something
everyone has to do when building XML documents using the DOM. What I end up
with doesn't nicely mirror how we use XPath to read content out, so I have
more duplication than I'd like between the code that reads and writes any
particular document type.

I keep wishing that XPath or XQuery were more like SQL, inasmuch as a
simple SQL statement can add or update a record in a table in a syntax
independent of the database library employed (e.g. "UPDATE employee SET
first_name='Jane', last_name='Doe' WHERE id=123"). Correct me if I'm wrong,
but I don't see any evidence that XQuery or any other standard XML library has
data updating capability like that.

Since I'm still fairly new to XML, I'm presuming there's at least a 50/50
chance that I'm simply making mountains out of molehills, but I see evidence
that perhaps I am not. For instance, see "Lesson 5: Use DOM Wrapper Objects"
at http://www.developer.com/xml/article.php/2194491.

Now - imagine something like this...

document.selectSingleNode("/Document/Body/Employees").modify("insert
Employee[@id:='123'][ Person[ Name[@first:='Jane'][@last:='Doe']
][Notes/self::text():='Hi there'] ]");

The format for insertion roughly mirrors the XPath predicate format, but each
expected node that does not already exist would be created, and any "="
operators would perform value assignment, not comparison (other ideas are
possible, like using := for assignment, as the example above does). Any item
we assign a value to that already has a value (including empty-string
attributes) should cause an error, because we said we were inserting. There
would be a similar 'update' method that creates new nodes as required and
also unconditionally replaces any existing node values.

There are lots more details to work out for a full spec, of course, like
whether to support position predicates, and whether an insert should fail or
push the existing element forward in that case, etc., but it's the start of an
idea, anyway.
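
The insert/update semantics described above can be approximated in a few lines. This is only a rough sketch in Python, not the proposed syntax: the `modify` function and its parameters are hypothetical, path steps are plain element names (no predicates, no namespaces), and values are passed as arguments instead of being parsed out of an expression.

```python
import xml.etree.ElementTree as ET


def modify(context, path, attrs=None, text=None, mode="insert"):
    """Walk `path` below `context`, creating each missing element.

    mode="insert": raise ValueError if a target attribute or the text
    already has a value (an empty-string attribute counts as a value).
    mode="update": overwrite existing values unconditionally.
    """
    if mode not in ("insert", "update"):
        raise ValueError(f"unknown mode: {mode!r}")
    node = context
    for step in path.split("/"):
        child = node.find(step)  # first child with that tag name, if any
        if child is None:
            child = ET.SubElement(node, step)
        node = child
    for key, value in (attrs or {}).items():
        if mode == "insert" and key in node.attrib:
            raise ValueError(f"attribute {key!r} already set on <{node.tag}>")
        node.set(key, value)
    if text is not None:
        if mode == "insert" and node.text:
            raise ValueError(f"<{node.tag}> already has text")
        node.text = text
    return node


doc = ET.fromstring("<Document><Body><Employees/></Body></Document>")
employees = doc.find("Body/Employees")
modify(employees, "Employee", attrs={"id": "123"})
modify(employees, "Employee/Person/Name", attrs={"first": "Jane", "last": "Doe"})
modify(employees, "Employee/Person/Notes", text="Hi there")
```

Because steps match by tag name only, a second `Employee` insert here collides with the first one instead of adding a sibling. That is exactly the kind of detail (predicates, positions) a real spec would have to settle.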

Now, let's add to this that the elements would be added to the DOM pretty much
as if you had typed the expanded XML text right into that point in the
document. For instance, inserted elements with no prefixes would be treated
the same as child elements with no prefixes, regardless of what namespace that
implies that they should be in (but allow overriding the default with a
defaultNamespaceURI parameter). For node names with prefixes, the namespaces
should be assigned the same way they would be interpreted by a select,
depending on the settings for the DOM object.
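
Here is a small sketch of that namespace rule, again in Python and again with hypothetical names (`add_child`, `prefixes`, `default_namespace_uri`). It makes one loud simplification: ElementTree does not keep track of which default namespace is in scope, so the sketch uses the parent's own namespace as a stand-in, which only matches the "as if typed into the document" behavior when the parent itself is unprefixed.

```python
import xml.etree.ElementTree as ET


def namespace_of(element):
    """Return the namespace URI from a Clark-notation tag, or None."""
    tag = element.tag
    return tag[1:].split("}", 1)[0] if tag.startswith("{") else None


def add_child(parent, name, prefixes=None, default_namespace_uri=None):
    """Append a child named `name`, resolving its namespace.

    Prefixed names are looked up in `prefixes`, the same kind of mapping
    a select would use. Unprefixed names take `default_namespace_uri` if
    given, otherwise the parent's namespace (an approximation).
    """
    prefix, sep, local = name.partition(":")
    if sep:
        uri = (prefixes or {})[prefix]  # KeyError for an unknown prefix
    else:
        local = name
        if default_namespace_uri is not None:
            uri = default_namespace_uri
        else:
            uri = namespace_of(parent)
    tag = f"{{{uri}}}{local}" if uri else local
    return ET.SubElement(parent, tag)


root = ET.fromstring('<Employees xmlns="urn:example:staff"/>')
employee = add_child(root, "Employee")
note = add_child(employee, "x:Note", prefixes={"x": "urn:example:notes"})
legacy = add_child(employee, "Legacy", default_namespace_uri="urn:example:old")
```

The URIs are invented for the example. The point is just that the caller writes names the way they would appear in the XML text, and the helper works out the namespace.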

What say y'all? Am I inventing a solution spec for something that would be a
non-problem if I just knew the best DOM coding practices, or is this somewhat
on-target, and something we should all be clamoring for?
 
