Parsing an XML stream with Java (SAX)

Xavier Seneque

Hi everybody,

I want to write a server that receives data and queries as an
XML-formatted stream and parses it with the Java XML parsers :)

But I've run into some problems :(

Here are some general thoughts about the project:

The DTD describing the format of the stream that the server will receive
is an external DTD and lives only on the server... right?

So the client should have no way to specify a DTD or to write an
internal DTD within its stream... the server should always use its own
DTD to parse the stream.
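One way to impose a server-side DTD on a stream that carries no DOCTYPE could look like the sketch below. It is not from the thread; it rests on two assumptions: the client sends neither an XML declaration nor a DOCTYPE of its own, and the server prepends a DOCTYPE line and answers the parser's request for the DTD itself through resolveEntity. The class name, the "session.dtd" system id and the in-memory DTD string are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

class ServerDtdValidation {
    // The DTD from the post, kept in memory for this sketch (assumption:
    // a real server would probably load it from its own file system).
    static final String SERVER_DTD =
        "<!ELEMENT session (query_get_im_srv)>\n"
      + "<!ELEMENT query_get_im_srv (user)>\n"
      + "<!ELEMENT user (#PCDATA)>\n";

    // Hypothetical system id; the resolver below never looks at it.
    static final String DOCTYPE = "<!DOCTYPE session SYSTEM \"session.dtd\">";

    static class ServerDtdHandler extends DefaultHandler {
        @Override
        public InputSource resolveEntity(String publicId, String systemId) {
            // Always hand back the server's DTD, whatever was requested.
            return new InputSource(new StringReader(SERVER_DTD));
        }

        @Override
        public void error(SAXParseException e) throws SAXException {
            throw e; // treat validity errors as failures
        }
    }

    /** Returns true if the client stream is valid against the server DTD. */
    static boolean isValid(InputStream clientStream) {
        // Put the server's DOCTYPE in front of whatever the client sent.
        InputStream withDoctype = new SequenceInputStream(
            new ByteArrayInputStream(DOCTYPE.getBytes(StandardCharsets.UTF_8)),
            clientStream);
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setValidating(true);
            factory.newSAXParser()
                   .parse(new InputSource(withDoctype), new ServerDtdHandler());
            return true;
        } catch (SAXException e) {
            return false; // not well-formed or not valid
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    static boolean isValid(String xml) {
        return isValid(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }
}
```

With this arrangement a client that sends its own DOCTYPE ends up with two DOCTYPE declarations in the prolog, which is not well-formed, so the message is rejected rather than validated against the client's DTD.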

So I started by writing some code to parse a well-formed stream, which works:

First, here is my simple DTD. It seems valid, doesn't it?

//begin DTD
<?xml version='1.0' encoding='UTF-8' ?>
<!ELEMENT session (query_get_im_srv)>
<!ELEMENT query_get_im_srv (user)>
<!ELEMENT user (#PCDATA)>
//end DTD

So the server should receive a message that looks like this:
<session><query_get_im_srv><user>foo</user></query_get_im_srv></session>

Here is the code I use to parse it. It works quite well... how do you
like it? :)

SAXParserFactory factory = SAXParserFactory.newInstance();
factory.setNamespaceAware(true);
factory.setValidating(true);
SAXParser saxParser = factory.newSAXParser();
// stream is the XML stream from the client;
// handler is an instance of a class that extends DefaultHandler and
// overrides the event callbacks
saxParser.parse(new InputSource(stream), handler);
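One thing worth checking with the snippet above: setValidating(true) only makes the parser report validity errors through the handler's error() callback, and DefaultHandler's own error() does nothing, so an invalid document can appear to parse fine. A minimal sketch of a stricter handler follows. The class names are invented, and the DOCTYPE with an internal subset is there only to keep the demo self-contained, not as a suggestion that clients should send one.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

class StrictSaxCheck {
    // Demo-only DOCTYPE carrying the DTD from the post as an internal subset.
    static final String DEMO_DOCTYPE =
        "<!DOCTYPE session ["
      + "<!ELEMENT session (query_get_im_srv)>"
      + "<!ELEMENT query_get_im_srv (user)>"
      + "<!ELEMENT user (#PCDATA)>"
      + "]>";

    /** Handler that turns recoverable validity errors into exceptions. */
    static class StrictHandler extends DefaultHandler {
        @Override
        public void error(SAXParseException e) throws SAXException {
            throw e; // DefaultHandler.error() would silently ignore this
        }
    }

    /** Returns true if the document is well-formed and valid against its DTD. */
    static boolean isValid(String xml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setValidating(true);
            SAXParser parser = factory.newSAXParser();
            parser.parse(new InputSource(new StringReader(xml)), new StrictHandler());
            return true;
        } catch (SAXException e) {
            return false; // fatal error or rethrown validity error
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String good = DEMO_DOCTYPE
            + "<session><query_get_im_srv><user>foo</user></query_get_im_srv></session>";
        String bad = DEMO_DOCTYPE + "<session><user>foo</user></session>";
        System.out.println(isValid(good));
        System.out.println(isValid(bad));
    }
}
```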

I've read somewhere on the net that in order to validate a stream, I
should write something like this, but I'm not sure at all!

SchemaFactory sf =
SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
Schema s = sf.newSchema( new File("mydtd.dtd") );
Validator validator = s.newValidator();
validator.validate(xmlSource);

But it throws this lovely exception:
--> org.xml.sax.SAXParseException: The markup in the document preceding
the root element must be well-formed.

and I don't really know where it comes from... Is my DTD not valid?
Anyhow, do you think I'm on the right track, or do you think I should use
something else? Any links on the topic? Any advice is welcome,
because there is plenty of info on parsing files, but on streams... not a
lot :(
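A likely cause of that exception: XMLConstants.W3C_XML_SCHEMA_NS_URI asks the factory for a W3C XML Schema, which is itself an XML document, so newSchema() tries to read mydtd.dtd as XML and stumbles over the DTD syntax. If the javax.xml.validation route is wanted, the grammar would have to be written as an XSD instead. The sketch below assumes a hand-written schema that roughly mirrors the DTD from the post (it is an approximation, not a mechanical conversion) and validates straight from an InputStream; the class name is made up.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

class XsdStreamValidation {
    // Assumed XSD counterpart of the DTD: session > query_get_im_srv > user.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='session'><xs:complexType><xs:sequence>"
      + "<xs:element name='query_get_im_srv'><xs:complexType><xs:sequence>"
      + "<xs:element name='user' type='xs:string'/>"
      + "</xs:sequence></xs:complexType></xs:element>"
      + "</xs:sequence></xs:complexType></xs:element>"
      + "</xs:schema>";

    /** Compiles the schema once; a broken schema is a programming error. */
    static Schema loadSchema() {
        try {
            SchemaFactory sf = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            return sf.newSchema(new StreamSource(new StringReader(XSD)));
        } catch (SAXException e) {
            throw new IllegalStateException("schema does not compile", e);
        }
    }

    /** Returns true if the stream content is valid against the schema. */
    static boolean isValid(InputStream in) {
        Validator validator = loadSchema().newValidator();
        try {
            // Without a custom ErrorHandler, validate() throws on errors.
            validator.validate(new StreamSource(in));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static boolean isValid(String xml) {
        return isValid(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }
}
```

The InputStream here could just as well be the one obtained from a socket; nothing in the validation call depends on a file.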

thanks !

xavier seneque
 
Hugo Pragt

Hi Xavier,

"Well-formed" means that the document must comply with the basic XML
syntax rules.
In practice this means that:
1) all content must be in a tag, or between a start and an end tag
2) the document must have exactly 1 root element
3) every start tag must have a matching end tag

So, check your document (open it in Internet Explorer, for example)
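For anyone without a browser at hand, the same check can be sketched in Java with a plain, non-validating SAX parse: any violation of the rules above is a fatal error and surfaces as a SAXException. The class and method names below are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class WellFormedCheck {
    /** Returns true if the text parses as a well-formed XML document. */
    static boolean isWellFormed(String xml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            // Non-validating by default: only well-formedness is checked.
            factory.newSAXParser()
                   .parse(new InputSource(new StringReader(xml)), new DefaultHandler());
            return true;
        } catch (SAXException e) {
            return false; // a well-formedness violation was reported
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<a><b/></a>"));   // matching tags, one root
        System.out.println(isWellFormed("<a><b></a>"));    // <b> is never closed
        System.out.println(isWellFormed("<a/><b/>"));      // two root elements
    }
}
```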

Hugo
 
Xavier Seneque

Hugo said:
Hi Xavier,

"Well-formed" means that the document must comply with the basic XML
syntax rules.
In practice this means that:
1) all content must be in a tag, or between a start and an end tag
2) the document must have exactly 1 root element
3) every start tag must have a matching end tag

So, check your document (open it in Internet Explorer, for example)

Hugo

- The SAX API (at least in Java) already checks that the stream is
well-formed, but I want it to be valid, that is, to conform to a DTD.
- My problem is that I want to validate a stream, not a file, so it's
quite different.
- I don't have Internet Explorer, I'm a follower of the penguin lord!
 
