XML Validation

Andrew Thompson

There are some XML documents that I am attempting to
validate against DTDs, XSDs and, most recently, Schematron
and Relax-NG style validation documents.

Although both of the latter are mentioned in the documentation*
of the Java classes, I am having trouble figuring out how to get
a Java-based validator for either. (Note that I am most interested
in Schematron, but am attempting Relax-NG first, simply because
it is mentioned more in the JavaDocs and has a constant to
represent it.)

However, this throws an exception:

<sscce>
import javax.xml.validation.SchemaFactory;
import javax.xml.XMLConstants;

class SchemaFactoryTypes {

    public static void main(String[] args) {
        SchemaFactory factory = SchemaFactory.newInstance(
            XMLConstants.RELAXNG_NS_URI);
    }
}
</sscce>

Exception in thread "main" java.lang.IllegalArgumentException:
http://relaxng.org/ns/structure/1.0
    at javax.xml.validation.SchemaFactory.newInstance(SchemaFactory.java:186)
    at test.SchemaFactoryTypes.main(SchemaFactoryTypes.java:7)

Why is there a constant defined for Relax-NG if the
validator does not understand it?

How do I get a validator for the Schematron files?

*
<http://java.sun.com/j2se/1.5.0/docs/api/javax/xml/validation/package-summary.html>
<http://java.sun.com/j2se/1.5.0/docs/api/javax/xml/XMLConstants.html#RELAXNG_NS_URI>
 
Andrew Thompson

I don't know if this is relevant, but that URL does not point to a
DTD.

The DTD is at http://relaxng.org/relaxng.dtd

No. That is a DTD that describes the format of the Relax-NG
schemas themselves.

The Relax-NG schemas are written in XML and often have
a '.rng' extension. Here is a snippet of one:

<?xml version="1.0"?>
<grammar ns="" xmlns="http://relaxng.org/ns/structure/1.0"
         datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
  <start>
    <choice>
      <ref name="command"/>
      <ref name="string"/>
      <ref name="option"/>
      <ref name="file"/>
      <choice>
        <notAllowed/>
        <element name="screensaver">
          <attribute name="_label">
......

That started as a DTD. I used a conversion tool to
convert it to a basic RNG schema.

The major advantage of the newer validators (especially
Schematron) is that they should allow me to check the
information in the elements and attributes of an XML
document to a much greater extent than is possible using
just the DTD format.
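
To make the point about checking element and attribute content concrete: Schematron rules are built from XPath assertions, so the kind of check meant here can be approximated with the JDK's own javax.xml.xpath API. The following is only a rough sketch of a single Schematron-style assertion, not a Schematron implementation. The class name, the method name and the rule itself (a non-empty _label attribute on screensaver) are assumptions made up for illustration.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical helper: evaluates one XPath test against a document,
// in the spirit of a single Schematron <assert test="..."/>.
class XPathAssertionSketch {

    /** Returns true if the XPath expression evaluates to true for the given XML. */
    static boolean holds(String xml, String test) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            InputSource source = new InputSource(new StringReader(xml));
            return (Boolean) xpath.evaluate(test, source, XPathConstants.BOOLEAN);
        } catch (XPathExpressionException e) {
            // Malformed XML or a malformed expression ends up here.
            throw new IllegalArgumentException("Could not evaluate: " + test, e);
        }
    }

    public static void main(String[] args) {
        // Assumed rule: the _label attribute must be present and non-empty.
        String rule = "string-length(/screensaver/@_label) > 0";
        System.out.println(holds("<screensaver _label='Blank'/>", rule));
        System.out.println(holds("<screensaver _label=''/>", rule));
    }
}
```

A DTD can require that the attribute exists, but it has no way to say that its value must be non-empty, which is the sort of gap described above.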
 
Raymond DeCampo

Andrew said:
Why is there a constant defined for Relax-NG if the
validator does not understand it?

Well, the documentation for SchemaFactory does say that an implementation
need only support W3C XML Schema 1.0 to be compliant.

I guess your job is to find an XML parser that does support what you want.

HTH,
Ray
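
Following on from the point about what SchemaFactory is required to support, one way to see what a given runtime actually offers is to try each schema language and catch the IllegalArgumentException. This is a minimal sketch; the class and method names are invented for illustration, and the result for RELAX NG depends on whether a third-party SchemaFactory implementation is on the classpath.

```java
import javax.xml.XMLConstants;
import javax.xml.validation.SchemaFactory;

// Hypothetical helper for probing which schema languages are available.
class SchemaLanguageProbe {

    /** Returns true if a SchemaFactory can be created for the given schema language. */
    static boolean isSupported(String schemaLanguage) {
        try {
            SchemaFactory.newInstance(schemaLanguage);
            return true;
        } catch (IllegalArgumentException e) {
            // Thrown when no implementation of the schema language is available.
            return false;
        }
    }

    public static void main(String[] args) {
        String[] languages = {
            XMLConstants.W3C_XML_SCHEMA_NS_URI,
            XMLConstants.RELAXNG_NS_URI
        };
        for (String language : languages) {
            System.out.println(language + " supported: " + isSupported(language));
        }
    }
}
```

The constant in XMLConstants only names the schema language; it does not promise that a factory for it is installed.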
 
Roedy Green

No. That is a DTD that describes the format of the Relax-NG
schemas themselves.

I think a mini lecture is needed here to explain what is going on to
the peanut gallery, myself included.

Here is what I think is likely so:

There is a DTD language called ___ ? (Its favourite keyword is CDATA)
that is used for specifying what a legitimate XML file looks like,
including generic XML. It looks like (or perhaps is) the DTD
language used for describing the various flavours of HTML.

There are also two other schema languages for describing formats of
XML files.

There are competing languages, one from W3C.org and one from
relaxing.org.

These languages are XML supersets.

So there are three different ways to describe an XML file layout.

The relaxing schema defines more strictly just what a given XML file
is allowed to look like.
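
As a small illustration of a schema language pinning down content more strictly than a DTD can, here is a sketch using W3C XML Schema, chosen only because it is the one language the JDK is required to support (it is not RELAX NG). The timeout element, the schema and the class name are all made up for the example.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

// Hypothetical example: an XSD that requires a positive integer value.
class XsdValueCheck {

    // Assumed schema: a single <timeout> element holding a positive integer.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='timeout' type='xs:positiveInteger'/>"
      + "</xs:schema>";

    /** Returns true if the document validates against the schema above. */
    static boolean isValid(String xml) {
        Schema schema;
        try {
            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
        } catch (SAXException e) {
            throw new IllegalStateException("The schema itself is broken", e);
        }
        try {
            // With no ErrorHandler set, validation errors are thrown as SAXException.
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            throw new IllegalStateException("Could not read the document", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("<timeout>30</timeout>"));
        System.out.println(isValid("<timeout>soon</timeout>"));
    }
}
```

A DTD could only declare timeout as #PCDATA, so both documents would pass it; the schema rejects the second one because of its datatype.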
 
Andrew Thompson

Oops, it is relaxng.org, no i.

I make that mistake regularly.

YADN: 'Yet Another Dumb Name'.
The schema language is called RELAX NG.

BTW - I'll think about putting together the clear explanation
you suggested, perhaps this time only mentioning Schematron,
since it is the only schema format that does what I want
in any case.
 
Andrew Thompson

I guess your job is to find an XML parser that does support what you want.

Yep. I reckon you are right Ray.

I was stuffing about trying to get the Relax-NG validation
working when it was actually Schematron I ultimately wanted.

If I have no luck finding a validator that will understand
Schematron, I'll start a new thread.
 
