Newbie: parsing and validation

michelle

I'm a seasoned C++ programmer, but I'm new to XML. I'm working on a
multimedia application. To get my feet wet, I've decided to create a
toy application that can play and display a very basic MIDI file. I've come
up with an XML file format that the application should use.

For the player portion, I simply want to translate my file format to a
standard midi file (which is a binary format) and call a midi player. My
usual method would be to write a small utility that parses the xml file and
outputs the binary file. By using xml, do I have the opportunity to save
some programming effort? Can I use XSLT to do a translation? Does it make
sense to try and make use of an existing xml parser? I'm still trying to
figure out if the xml tools are valuable for this project or if my file
format being xml is simply a convenient way for me to organize the file
data.
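
For a sense of what the hand-written route involves, here is a minimal sketch of the binary side only: the big-endian header chunk and the variable-length delta-time encoding used by standard MIDI files. The helper names (putBE16, putBE32, putVarLen, headerChunk) are made up for illustration, and the XML-reading half is deliberately left out.

```cpp
#include <cstdint>
#include <vector>

typedef std::vector<std::uint8_t> Bytes;

// Append a 16-bit value, most significant byte first.
void putBE16(Bytes& b, std::uint16_t v) {
    b.push_back(static_cast<std::uint8_t>(v >> 8));
    b.push_back(static_cast<std::uint8_t>(v & 0xff));
}

// Append a 32-bit value, most significant byte first.
void putBE32(Bytes& b, std::uint32_t v) {
    putBE16(b, static_cast<std::uint16_t>(v >> 16));
    putBE16(b, static_cast<std::uint16_t>(v & 0xffff));
}

// Append a variable-length quantity: 7 bits per byte, most significant
// group first, high bit set on every byte except the last.
void putVarLen(Bytes& b, std::uint32_t v) {
    std::uint8_t tmp[5];
    int n = 0;
    tmp[n++] = static_cast<std::uint8_t>(v & 0x7f);
    while (v >>= 7) {
        tmp[n++] = static_cast<std::uint8_t>((v & 0x7f) | 0x80);
    }
    while (n > 0) b.push_back(tmp[--n]);
}

// Build the "MThd" header chunk: tag, length 6, format, track count, division.
Bytes headerChunk(std::uint16_t format, std::uint16_t tracks,
                  std::uint16_t division) {
    Bytes b = {'M', 'T', 'h', 'd'};
    putBE32(b, 6);
    putBE16(b, format);
    putBE16(b, tracks);
    putBE16(b, division);
    return b;
}
```

For example, headerChunk(0, 1, 480) yields the 14 bytes 4D 54 68 64 00 00 00 06 00 00 00 01 01 E0.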

The validation portion of the xerces parser seems to be useful. But using a
DTD only provides very limited help. What would be very helpful would be to
be able to create new tokens used for constraining data. For example, I
would like to be able to have integer values in the range 0..127 and hex
values like 0x2f3d. Using a DTD, the best I seem to be able to do is
constrain the first to NMTOKEN and the second to PCDATA. Do DTDs allow me
to do this? Or do I have to use a schema? If so, which schema, and where can
I find some more info?

Does xml change the way I should be thinking about the data structures in
my application? I seem to be deriving a very close relationship between C++
classes in the application and xml elements.

Any help is appreciated.
 
Toni Uusitalo

Hi,
Are you familiar with http://www.recordare.com/ (MusicXML and stuff)?

There are DTDs etc. for that format. I'm not familiar with it myself, but
there are mailing lists mentioned on the site.

with respect,
Toni Uusitalo


"And I wish that I was made of stone
So that I would not have to see
A beauty impossible to define
A beauty impossible to believe"
- Nick Cave (Brompton Oratory) - romanticist?


"There are lots of myths that people have around issues of beauty and
attraction, and part of the issue is to stop thinking about things in terms
of myth, but to use the tools of neuroscience, and start dissecting and
understanding how things actually function," said Dr. Hans Breiter, a
psychiatrist and co-author of the study.
- The Brain Is Stimulated by Beauty, Study Finds - abcnews.com - scientist?
 
michelle

Toni said:
Hi,
Are you familiar with http://www.recordare.com/ (MusicXML and stuff)?

Yes. Ultimately, I want to provide a more general system. MusicXML suffers
from the same drawback as all the other music-related formats: You can't
give different tempos/meters to different staves at the same time. It also
limits you to a 12-tone scale. Ultimately, I want to define a scale using
frequencies. Then a conversion program can come along and convert these
notes into midi (note, pitchwheel) tuples. Thus, all sorts of scales can be
represented.
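
The conversion step could be sketched roughly as follows. This assumes equal-tempered MIDI notes with A4 = 440 Hz on note 69 and a pitch wheel range of +/- 2 semitones (a common default, but configurable per device). NoteBend and freqToNoteBend are names invented here, not part of any existing library.

```cpp
#include <cmath>

// One MIDI (note, pitch wheel) pair. Hypothetical type for illustration.
struct NoteBend {
    int note;  // MIDI note number, 0..127
    int bend;  // 14-bit pitch wheel value, 0..16383, 8192 = no bend
};

// Map a frequency in Hz to the nearest equal-tempered MIDI note plus the
// pitch wheel value needed to reach the exact frequency.
// Assumption: the wheel spans +/- bendRange semitones around 8192.
// Returns false if the frequency is not positive or falls outside 0..127.
bool freqToNoteBend(double hz, NoteBend& out, double bendRange = 2.0) {
    if (!(hz > 0.0) || !(bendRange > 0.0)) return false;
    double exact = 69.0 + 12.0 * std::log2(hz / 440.0);
    double nearest = std::round(exact);
    if (nearest < 0.0 || nearest > 127.0) return false;
    double offset = exact - nearest;  // semitones, roughly within [-0.5, 0.5]
    long bend = 8192 + std::lround(offset / bendRange * 8192.0);
    if (bend < 0) bend = 0;
    if (bend > 16383) bend = 16383;
    out.note = static_cast<int>(nearest);
    out.bend = static_cast<int>(bend);
    return true;
}
```

With these assumptions, 440 Hz maps to note 69 with the wheel centered at 8192.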

But my first project is a learning exercise for organizing how to store and
access the various objects in memory. I'm trying to find if I can make use
of an existing validating parser for loading files. xerces includes parsers
for SAX, SAX2, and DOM. I don't know the strengths/weaknesses of these
parsers. I'm also not sure how to use them. It appears that I implement a
handler that gets called at each element.

What I'd like is a validating parser that creates a tree of the xml file in
memory. Then I can visit nodes on the tree and create/populate the objects
of my application. So I guess my question is: Is there such a parser that I
can do this with? And, if so, is there an example program that illustrates
this?
 
Toni Uusitalo

michelle said:
Yes. Ultimately, I want to provide a more general system. MusicXML suffers
from the same drawback as all the other music-related formats: You can't
give different tempos/meters to different staves at the same time. It also
limits you to a 12-tone scale. Ultimately, I want to define a scale using
frequencies. Then a conversion program can come along and convert these
notes into midi (note, pitchwheel) tuples. Thus, all sorts of scales can be
represented.

Ok.

But my first project is a learning exercise for organizing how to store and
access the various objects in memory. I'm trying to find if I can make use
of an existing validating parser for loading files. xerces includes parsers
for SAX, SAX2, and DOM. I don't know the strengths/weaknesses of these
parsers. I'm also not sure how to use them. It appears that I implement a
handler that gets called at each element.

In SAX processing, yes.

What I'd like is a validating parser that creates a tree of the xml file in
memory. Then I can visit nodes on the tree and create/populate the objects
of my application. So I guess my question is: Is there such a parser that I
can do this with? And, if so, is there an example program that illustrates
this?

this might be helpful:
A Survey of APIs and Techniques for Processing XML
http://www.xml.com/pub/a/2003/07/09/xmlapis.html

this is an article about schema definition language (you must use schemas
(or RelaxNG) for validation if you want to do datatype constraints):
Using W3C XML Schema
http://www.xml.com/pub/a/2000/11/29/schemas/part1.html

for Xerces-C specific examples, check the example files that ship with
xerces-c, or google for some xerces-c examples.

with respect,
Toni Uusitalo
 
Martin Honnen

michelle said:
What I'd like is a validating parser that creates a tree of the xml file in
memory. Then I can visit nodes on the tree and create/populate the objects
of my application. So I guess my question is: Is there such a parser that I
can do this with? And, if so, is there an example program that illustrates
this?

A DOM parser creates a tree in memory, so I am sure that if you use the
Xerces C DOM parser you get your tree. As I don't use Xerces C, I can't
tell whether it also allows validation with the DOM parser, but I
suspect so.
As for example programs, when you download Apache Xerces it comes with
some examples.
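
As a rough illustration of the "visit nodes and populate objects" step, here is a sketch that does not use the Xerces API at all. Node is a hypothetical stand-in for whatever node type the DOM parser hands back, and the element and attribute names (track, note, name, pitch, duration) are invented for the example. It assumes validation has already guaranteed that the note attributes are present and numeric.

```cpp
#include <map>
#include <string>
#include <vector>

// Stand-in for a parsed element; a real DOM parser supplies its own node type.
struct Node {
    std::string name;
    std::map<std::string, std::string> attrs;
    std::vector<Node> children;
};

// Application-side objects, kept independent of the tree representation.
struct Note {
    int pitch;
    int duration;
};

struct Track {
    std::string name;
    std::vector<Note> notes;
};

// Depth-first visit: build a Track for each <track> element and a Note for
// each <note> child. at() and stoi() throw if an attribute is missing or
// malformed, which is why prior validation is assumed.
void collectTracks(const Node& node, std::vector<Track>& out) {
    if (node.name == "track") {
        Track t;
        std::map<std::string, std::string>::const_iterator it =
            node.attrs.find("name");
        if (it != node.attrs.end()) t.name = it->second;
        for (const Node& child : node.children) {
            if (child.name != "note") continue;
            Note n;
            n.pitch = std::stoi(child.attrs.at("pitch"));
            n.duration = std::stoi(child.attrs.at("duration"));
            t.notes.push_back(n);
        }
        out.push_back(t);
        return;
    }
    for (const Node& child : node.children) collectTracks(child, out);
}
```

Only collectTracks knows about both the tree and the application classes, so the two can change independently.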
 
Patrick TJ McPhee

% By using xml, do I have the opportunity to save
% some programming effort?

Not if you write your own parser.

% Can I use XSLT to do a translation? Does it make
% sense to try and make use of an existing xml parser?

Sure. For your purposes, I'd suggest one of xerces, libxml, tinyxml, or
expat. xerces and tinyxml have C++ APIs, while libxml and expat have
C APIs. xerces and libxml are quite full-featured, while the other
two provide parsing and nothing else, which may be enough for your
purposes. My current favourite is libxml, but that's just me.

The advantage of using one of these parsers is that you don't have to
write the parsing code, and you can say with some confidence that you're
dealing with XML and not some rough approximation which doesn't support
half the things in the spec.

% The validation portion of the xerces parser seems to be useful. But using a
% DTD only provides very limited help. What would be very helpful would be to
% be able to create new tokens used for constraining data.

You can do that using xml schemas, which are supported by both xerces and
libxml. You could also write your own data validation code, which might
be the same amount of work. You're correct about the limitations of DTDs
in this area. I don't know of any good sources for information about
w3c schemas. There are some tutorials floating around, though, and it
doesn't sound like you want to do anything too complicated, so that may
be enough.
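
If you go the hand-written route instead of a schema, the two checks from the original question might look something like this. The function names are made up, and the rules are assumptions: decimal digits only for the 0..127 value, and a mandatory 0x prefix followed by at most 8 hex digits for the hex value.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Accept a decimal integer in 0..127 with no sign, spaces, or other characters.
bool parseMidiValue(const std::string& text, int& out) {
    if (text.empty() || text.size() > 3) return false;
    int value = 0;
    for (char c : text) {
        if (!std::isdigit(static_cast<unsigned char>(c))) return false;
        value = value * 10 + (c - '0');
    }
    if (value > 127) return false;
    out = value;
    return true;
}

// Accept "0x" (or "0X") followed by 1 to 8 hex digits, e.g. "0x2f3d".
bool parseHexValue(const std::string& text, unsigned long& out) {
    if (text.size() < 3 || text.size() > 10) return false;
    if (text[0] != '0' || (text[1] != 'x' && text[1] != 'X')) return false;
    unsigned long value = 0;
    for (std::size_t i = 2; i < text.size(); ++i) {
        unsigned char c = static_cast<unsigned char>(text[i]);
        if (!std::isxdigit(c)) return false;
        int digit = std::isdigit(c) ? c - '0' : std::tolower(c) - 'a' + 10;
        value = value * 16 + static_cast<unsigned long>(digit);
    }
    out = value;
    return true;
}
```

Both functions leave the output untouched on failure, so the caller can report the offending element and carry on.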

% Does xml change the way I should be thinking about the data structures in
% my application?

In my opinion, it shouldn't. The structure of the XML file should be
descriptive of the data, and the data structures in your application should
be sensible for whatever you want to do with the data. You don't want
to put yourself in a situation where changing one of those structures
requires you to change the other.
 
