JAXB: Make the unmarshaller more robust. Please comment

Christoph Brunner

Hi,

I submitted a request for enhancement for the JAXB API to the Bug Parade
on the Sun website.
After some time, Sun asked me to post my request to a newsgroup
and get comments from other Java developers.
So please comment on the following request for enhancement:

If I unmarshal an invalid XML document, I get a
javax.xml.bind.UnmarshalException: Unexpected end of element
exception.
The exception is thrown because the element in question is required and
is not in the XML file.
When the exception is thrown, the unmarshalling process is aborted.
At the moment I can only create XML files that contain all the required
elements, because the unmarshaller cannot read an XML file that lacks
any of them into Java.

What do you think about this?

Thanks
Christoph Brunner
 
Heiko Sommer

Hi,

I think that being able to marshal/unmarshal incomplete documents is really
essential. Examples are

(a) a user enters complex data into an application that internally works with
JAXB classes. To go for lunch break etc., the user wants to save his work, even
though he or she is not yet done, so the data will likely not be valid with
respect to the schema. Nonetheless it should be possible to store the data
temporarily as XML on disk or in a database.

(b) from my project: we use a CORBA-based container-component model which uses
XML for transport and persistence of Value Objects. The framework presents Java
binding classes to both the client and the server code, and automatically takes
care of the (un-)marshalling for the CORBA transport in between. As several Java
components might collaborate to produce a valid Value Object (XML), the
container framework will never impose schema validation, but has to merely
convert between JAXB-trees and serialized XML. Validation must be performed
explicitly by the applications.

In my opinion it is a bit sad that the JAXB spec is so tolerant on how
implementations may deal with invalid XML, since it seems to imply that
applications using JAXB are supposed to be monolithic enough to always first
produce valid in-memory trees in one place, and only then serialize/parse them.

I would very much like the Sun reference implementation not to throw an
exception or issue a **FATAL** error when it encounters a missing child element.
A non-fatal error would really do, and applications could deal with this using
the ValidationEventHandler mechanism that JAXB provides for this purpose.
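
To make the fatal versus non-fatal distinction concrete, here is a minimal sketch. It deliberately uses the JDK's javax.xml.validation API rather than JAXB itself, so it only shows the behaviour being asked for, not how any JAXB implementation works. The order/id/customer schema, the sample values and the class and method names are all made up for the example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

class LenientValidationSketch {

    // Hypothetical schema: an <order> requires both <id> and <customer>.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='order'><xs:complexType><xs:sequence>"
      + "<xs:element name='id' type='xs:int'/>"
      + "<xs:element name='customer' type='xs:string'/>"
      + "</xs:sequence></xs:complexType></xs:element>"
      + "</xs:schema>";

    /** Validates the document and returns schema violations instead of aborting on them. */
    static List<String> collectProblems(String xml) {
        List<String> problems = new ArrayList<>();
        try {
            SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
            Validator validator = schema.newValidator();
            validator.setErrorHandler(new ErrorHandler() {
                @Override public void warning(SAXParseException e) { problems.add(e.getMessage()); }
                // Recoverable errors (such as a missing required child) are only recorded.
                @Override public void error(SAXParseException e) { problems.add(e.getMessage()); }
                // Fatal errors (such as malformed XML) still abort processing.
                @Override public void fatalError(SAXParseException e) throws SAXException { throw e; }
            });
            validator.validate(new StreamSource(new StringReader(xml)));
        } catch (SAXException | IOException e) {
            throw new IllegalStateException("document could not be processed", e);
        }
        return problems;
    }

    public static void main(String[] args) {
        // Incomplete but well-formed: <customer> is missing.
        for (String problem : collectProblems("<order><id>1</id></order>")) {
            System.out.println("non-fatal: " + problem);
        }
    }
}
```

The application, not the framework, then decides whether a non-empty problem list should block a save or only produce a warning.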

Maybe I should add that in our project (http://www.eso.org/projects/alma/) we
are currently using the Castor framework, which lets us un-/marshal incomplete
XML without problems. For other reasons we'd like to switch to JAXB though,
which unfortunately does not seem possible given the current restriction in the
Unmarshaller.

If anyone else feels that this is an issue, please add your comments, as they
can be used by Christoph Brunner to convince SUN to take some action there.

cheers, Heiko
 
Steve Slatcher

Heiko said:
(a) a user enters complex data into an application that internally
works with JAXB classes. To go for lunch break etc., the user wants
to save his work, even though he or she is not yet done, so the data
will likely not be valid with respect to the schema. Nonetheless it
should be possible to store the data temporarily as XML on disk or in
a database.

I can understand how such a feature is useful from an end user's point of
view, but I am not convinced that the programmer should reasonably expect
any help from JAXB. What is wrong with simply serialising the object with
the data to save the work?
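
For reference, a minimal sketch of what plain Java serialization of a half-filled object might look like. OrderDraft is a hypothetical stand-in for a binding class, and the sketch assumes the class implements java.io.Serializable, which generated binding classes do not necessarily do.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class DraftSerializationSketch {

    // Hypothetical stand-in for a binding class (an assumption for this example).
    static class OrderDraft implements Serializable {
        private static final long serialVersionUID = 1L;
        Integer id;
        String customer; // may still be null while the user is editing
    }

    /** Serializes the draft as it is, with no schema check involved. */
    static byte[] save(OrderDraft draft) {
        try (ByteArrayOutputStream bytes = new ByteArrayOutputStream();
             ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(draft);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException("could not save draft", e);
        }
    }

    /** Restores a draft previously written by save(). */
    static OrderDraft load(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (OrderDraft) in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("could not load draft", e);
        }
    }

    public static void main(String[] args) {
        OrderDraft draft = new OrderDraft();
        draft.id = 1; // customer deliberately left unset
        OrderDraft restored = load(save(draft));
        System.out.println("id=" + restored.id + ", customer=" + restored.customer);
    }
}
```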
 
Heiko Sommer

Well, then you'll need two mechanisms: Java serialization and XML marshalling.
As a developer, I'd be happier to have one common and easy way to serialize my
object tree, and since this is about XML anyway, additional Java serialization
seems like a bother.
It gets worse if you depend on some XML technology, like an XML database or
anything web-service-like. It won't easily accept natively serialized Java objects.
 
Alan Bridger

I'd like to add my support to this request. Saving "in progress" work is
vital in many applications, and this will frequently be invalid when
measured against the schema. This has been the case in many
applications I've worked on in the past, pre- and post-XML. Allowing
the developer to make the judgement as to when to take the validation
seriously simply makes sense. The developer should be able to make
that choice.

As Heiko Sommer points out, though other serialisation mechanisms could
be used for storing work in progress, this will often mean extra work
and perhaps a less satisfactory result.
 
Bob Foster

If it is necessary to store invalid documents, JAXB is the wrong tool for
the job. Even if it allowed you to write the document, which no doubt could
be done, JAXB wouldn't be able to read it, because in the general case the
generated parsing code, which is based on the schema, would fail.

I don't think I'd use JAXB to implement an editor.

Bob Foster
 
Stefan Bold

Hi,
in my opinion this would be a good feature,
because I also had the problem of saving an incomplete form!

regards
Stefan
 
Heiko Sommer

Hi Bob,

why would parsing an *incomplete* XML document have to fail when using JAXB?
As far as I understand, the spec leaves it open to the implementation to handle
this case gracefully or not. Or did I miss something? After all, Castor manages
to parse such documents.

With incomplete XML I mean "structurally valid except for missing child
elements", something that can be made valid by just adding to it, not taking
away or replacing elements.

Please let me know if there is something more fundamental that prevents
unmarshalling incomplete XML into a JAXB tree "in the general case" as you say.

Heiko
 
Bob Foster

Heiko Sommer said:
Hi Bob,

why would parsing an *incomplete* XML document have to fail when using JAXB?
As far as I understand, the spec leaves it open to the implementation to handle
this case gracefully or not. Or did I miss something? After all, Castor manages
to parse such documents.

Good point. The spec allows an implementation to accept an invalid or not
validated document. But it gives no guidance as to what sort of invalid
documents might be accepted, and does not require an implementation to
accept one. I believed when I first read it that this language is there to
encourage implementations to accept documents with no schemas but the
authors hadn't thought through the implications of that, much less those of
accepting invalid documents.

I am not fond of sloppy specifications that leave major features to the whim
of the implementation. The language around accepting invalid documents is
doubly vague; not only is an implementation not required to do it, but even
if it does it, it is allowed to fail in the process of doing it. Who would
want to use a feature like that? You don't want an editor that only
sometimes can read a document it has written out, any more than you want an
editor that can't save an edit in progress.

You are technically correct, but I stand by the assertion that JAXB is not
appropriate for implementing an application like the editor that started
this thread.

Heiko Sommer said:
With incomplete XML I mean "structurally valid except for missing child
elements", something that can be made valid by just adding to it, not taking
away or replacing elements.

Yes, James Clark has implemented something like this for RELAX NG, which he
calls "feasible" validation. Essentially, it takes a schema and makes every
element optional. If the JAXB spec had defined something like this, even as
a suggestion, one might hope that some implementation would do it. But that
wouldn't help our editor writer, as the user might want to save a document
that didn't pass this test, either.
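
As a rough illustration of the "make every element optional" idea, here is a sketch that applies it to a W3C XML Schema instead of RELAX NG. It is not James Clark's implementation and not something the JAXB spec defines; the schema, the class name and the relax/isFeasible helpers are hypothetical and exist only for this example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class FeasibleSchemaSketch {

    static final String XS = XMLConstants.W3C_XML_SCHEMA_NS_URI;

    // Hypothetical schema: an <order> requires <id> followed by <customer>.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='order'><xs:complexType><xs:sequence>"
      + "<xs:element name='id' type='xs:int'/>"
      + "<xs:element name='customer' type='xs:string'/>"
      + "</xs:sequence></xs:complexType></xs:element>"
      + "</xs:schema>";

    /** Builds a schema in which every local element declaration is optional. */
    static Schema relax(String xsd) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            Document doc = dbf.newDocumentBuilder().parse(new InputSource(new StringReader(xsd)));
            NodeList decls = doc.getElementsByTagNameNS(XS, "element");
            for (int i = 0; i < decls.getLength(); i++) {
                Element decl = (Element) decls.item(i);
                // Global declarations (direct children of xs:schema) cannot carry minOccurs.
                if (decl.getParentNode() != doc.getDocumentElement()) {
                    decl.setAttribute("minOccurs", "0");
                }
            }
            return SchemaFactory.newInstance(XS).newSchema(new DOMSource(doc));
        } catch (Exception e) {
            throw new IllegalStateException("could not relax schema", e);
        }
    }

    /** True if the document is valid against the relaxed version of XSD. */
    static boolean isFeasible(String xml) {
        try {
            relax(XSD).newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // still invalid, e.g. elements in the wrong order
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isFeasible("<order><id>1</id></order>"));
        System.out.println(isFeasible("<order><customer>x</customer><id>1</id></order>"));
    }
}
```

A document with elements in the wrong order still fails the relaxed check, which is the kind of half-edited document an editor user might nevertheless want to save.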

Bob Foster
 
Heiko Sommer

Bob,

I fully agree that the JAXB spec is undesirably sloppy here, and partly agree
that this sloppiness should scare me away from using such a feature in any JAXB
implementation, if available.

Since this thread is about collecting opinions from the XML community so that
something can be changed for the better, would you agree if I summarized your
comments as follows:
One should not use JAXB in the areas outlined by Heiko Sommer or Alan Bridger,
because using implementation features of a binding framework that are not
mandated by the JAXB spec is simply too risky. However, if the JAXB spec were
improved to describe more precisely how serialization and parsing of
incomplete XML must be handled, then you'd agree that these projects and
possibly many others could benefit from using JAXB binding classes.

In other words, you might vote for a more radical change request than what
Christoph Brunner posted, in the sense that not only SUN's JAXB implementation,
but also the spec should be changed. (yep, that would be great!)

Is that more or less correct?

cheers, Heiko
 
Bob Foster

I agree that using JAXB to read or write invalid documents, even if it
"works", is at best a form of vendor lock-in, and at worst a risky
proposition, even with one supplier's software, because there is no precise
specification that tells you what documents it will _not_ work for. (The
wording of the JAXB spec suggests to me that there are cases that the
committee, at least, couldn't figure out how to handle.)

If the feature were fully specified and required, it would be safe enough to
use. But I still wouldn't use it in an editor unless there were _no_ cases
where a document written out couldn't be read back in again, or vice versa.

Bob
 
Heiko Sommer

I just learned that XMLBeans is specifically designed to bind even to invalid
XML documents, so that these can still be manipulated (see http://tinyurl.com/lhm8).
This indicates that it's possible to describe such behavior accurately in a
specification, and that other projects also see a need for it.

Does this mean that JAXB should retreat from the issue, leaving this desirable
feature to the "complementary" product XMLBeans?

See http://dev2dev.bea.com/technologies/xmlbeans/index.jsp and
http://xml.apache.org/xmlbeans/

Heiko
 
chbr0001

Hi all,

as of 2005-03-20, my feature request in the Sun bug database, in which I
described our problem with the robustness of the unmarshaller, is reported
as fixed: this 'bug' is fixed in version 2.0!

regards
Christoph Brunner
 
