Validation of XML file against external XSD Schema using Xerces CDT

christian.eickhoff

Hi Everyone,

I am currently implementing an XercesDOMParser to parse an XML file and
to validate it against its XSD schema file; both are located on my
local hard drive. For this purpose I set the corresponding
XercesDOMParser features as shown in the code below.

As far as I understand, the parsing process should throw a
DOMException if the XML file doesn't match the schema file (e.g.
mismatching element declarations), but in my case nothing happens at all.
I even set parser->setValidationConstraintFatal(true), but this doesn't
help either. Have I forgotten to implement some exception handler?
Does anybody have an idea what I am doing wrong?

XercesDOMParser* XMLparser = new XercesDOMParser();
// optionally set some features
XMLparser->setDoNamespaces(true);
XMLparser->setDoSchema(true);
XMLparser->setDoValidation(true);
XMLparser->setValidationScheme(XercesDOMParser::Val_Always);
XMLparser->setExternalNoNamespaceSchemaLocation(myXSDfilepath);
XMLparser->setValidationSchemaFullChecking(true);
XMLparser->setValidationConstraintFatal(true);

// Define error handler
ErrorHandler *errHandler = (ErrorHandler*) new HandlerBase();
XMLparser->setErrorHandler(errHandler);

// Parse the XML file
try
{
    XMLparser->parse(myXMLfilepath);
}
// Exception handling if parsing fails
catch (const XMLException& toCatch)
{
    char* message = XMLString::transcode(toCatch.getMessage());
    cout << "Exception during parsing. " << "Exception message is: \n"
         << message << "\n";
    XMLString::release(&message);
    return -1;
}
catch (const DOMException& toCatch)
{
    char* message = XMLString::transcode(toCatch.msg);
    cout << "Exception during parsing. " << "Exception message is: \n"
         << message << "\n";
    XMLString::release(&message);
    return -1;
}
catch (...)
{
    cout << "Unexpected Exception \n";
    return -1;
}

PLEASE! It's urgent. Can anyone give me at least a hint?
Thanks in advance for any response.
 
spiff

Hi Christian!

What do you mean by "nothing happens"? Is your ErrorHandler of type
DOMErrorHandler?

Usually the error handler you set with setErrorHandler() gets called
for each error in the XML instance. Is the schema referenced in the
XML file? What happens if you remove the
setExternalNoNamespaceSchemaLocation() statement? Unfortunately I don't
have the code available right now, but I can take a look this evening
for the details.

Regards

spiff

http://www.spycomponents.com
ValidatorBuddy - A tool for using different XML validators
 
christian.eickhoff

Hello spiff,

thanks A LOT for your fast response. First of all, I should mention
that I am a total newbie with Xerces, so I most probably made a
quite elementary mistake.

> What do you mean with nothing happens?

By "nothing happens" I mean that when I manipulate either the schema
file or the XML file so that they become inconsistent, I still
don't get any exception. I thought that if the validation process
fails, the parse method throws a DOMException which I can catch with
the following section?

catch (const DOMException& toCatch)
{
    char* message = XMLString::transcode(toCatch.msg);
    cout << "Exception during parsing. " << "Exception message is: \n"
         << message << "\n";
    XMLString::release(&message);
    return -1;
}
catch (...)
{
    cout << "Unexpected Exception \n";
    return -1;
}

> Is ErrorHandler of type DOMErrorHandler?

The error handler itself is not of type DOMErrorHandler, because when I
try to pass a DOMErrorHandler to the setErrorHandler() function, I
get an error message roughly saying:

(translated) Error: no suitable function call for
xercesc_2_7::XercesDOMParser::setErrorHandler(xercesc_2_7::DOMErrorHandler*&)

... even though I have included <xercesc/dom/DOMErrorHandler.hpp>. Is
there another way of assigning this error handler to the parser? By the
way, is it mandatory to implement my own error handler methods
just to detect a validation problem? I don't necessarily need to know
where the error happened or what the error was. I just want to know
VALID or NOT VALID.

> What happens if you remove the setExternalNoNamespaceSchemaLocation() statement?

With the setExternalNoNamespaceSchemaLocation() statement I just
wanted to assign the path of my external .xsd file
("/home/eickhoff/myfile.xsd"). What other way could I use to
tell the parser where to find my external schema? As I seem to
have made a more basic mistake, this function has no influence on the
behaviour at all: if I remove it, the code still behaves the same.

Other solutions for easily validating an XML file against its
external schema are highly appreciated as well!

Best regards,
Christian A. Eickhoff
 
spiff

Christian,

I'm sorry, I don't have the code available right now, but I will post it
late this evening (European time).

I'm not aware of any way to know whether the XML is valid other than
looking at the errors reported to the DOMErrorHandler. If none are
reported, the XML is valid. I also don't know of any other method to set
the error handler.

The schema could also be assigned in the XML instance itself. In fact I
believe this is the more common case.

Maybe try to get it running with a real DOMErrorHandler in the
meantime.

spiff

 
spiff

Christian,

to define my own DOMErrorHandler class I use the following #includes in
the header file:

#include <xercesc/dom/DOMErrorHandler.hpp>
#include <xercesc/dom/DOMLocator.hpp>

XERCES_CPP_NAMESPACE_USE


This code snippet should do the validation for you. Please note that I
don't set the schema instance here. So you might want to add this on
your own:

// Initialize the XML4C system
try
{
    XMLPlatformUtils::Initialize();
}
catch (const XMLException& toCatch)
{
    strGeneralResult.Format(_T("Error during Xerces initialization! :\n%s\n"),
                            toCatch.getMessage());
    return;
}

// Instantiate the DOM parser.
static const XMLCh gLS[] = { chLatin_L, chLatin_S, chNull };

DOMImplementation* impl =
    DOMImplementationRegistry::getDOMImplementation(gLS);
DOMBuilder* parser = ((DOMImplementationLS*)impl)->createDOMBuilder(
    DOMImplementationLS::MODE_SYNCHRONOUS, 0);

parser->setFeature(XMLUni::fgDOMNamespaces, true);
parser->setFeature(XMLUni::fgXercesSchema, true);
parser->setFeature(XMLUni::fgXercesSchemaFullChecking, true);

parser->setFeature(XMLUni::fgDOMValidation, true);

// enable datatype normalization - default is off
parser->setFeature(XMLUni::fgDOMDatatypeNormalization, true);

// And create our error handler and install it
// CMyDOMErrorHandler must be derived from DOMErrorHandler
parser->setErrorHandler(new CMyDOMErrorHandler);

//
// Get the starting time and kick off the parse of the indicated
// file. Catch any exceptions that might propagate out of it.
//
unsigned long duration;

XERCES_CPP_NAMESPACE_QUALIFIER DOMDocument* doc = 0;

try
{
    // reset document pool
    parser->resetDocumentPool();

    // note the escaped backslash in the Windows path
    doc = parser->parseURI(_T("C:\\your_file.xml"));
}
catch (const XMLException& toCatch)
{
    strGeneralResult.Format(
        _T("Error during parsing: %s\nXML Exception message is:\n %s\n"),
        strXMLFile, toCatch.getMessage());
}
catch (const DOMException& toCatch)
{
    const unsigned int maxChars = 2047;
    XMLCh errText[maxChars + 1];

    strGeneralResult.Format(
        _T("DOM Error during parsing: %s\nDOMException code is: %d\n"),
        strXMLFile, toCatch.code);

    if (DOMImplementation::loadDOMExceptionMsg(toCatch.code,
                                               errText, maxChars))
    {
        CString strDOMExc;
        strDOMExc.Format(_T("\nMessage is: %s"), errText);

        strGeneralResult += strDOMExc;
    }
}
catch (...)
{
    strGeneralResult.Format(
        _T("Unexpected exception during parsing: %s\n"), strXMLFile);
}


Regards
 
christian.eickhoff

Hey spiff,

thanks a lot for your source code. Now that I use the
DOMBuilder instead of XercesDOMParser, I can set my own error handler
without trouble and the validation process works fine :) The only
problem I still had was the external schema location, but once I
inserted the xsi:noNamespaceSchemaLocation attribute into the root tag
of my XML file, everything worked!
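
For anyone unsure what that attribute looks like, here is a small
sketch that builds a minimal XML instance whose root element carries
the xsi:noNamespaceSchemaLocation hint. The helper name makeInstance
and its parameters are hypothetical and not part of Xerces. It does no
escaping, so it assumes the arguments contain no XML special characters.

```cpp
#include <string>

// Hypothetical helper: wraps `body` in a root element named `rootName` that
// points the parser at the schema in `xsdLocation` via the standard
// XML Schema instance attribute xsi:noNamespaceSchemaLocation.
std::string makeInstance(const std::string& rootName,
                         const std::string& xsdLocation,
                         const std::string& body) {
    std::string out = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n";
    out += "<" + rootName;
    // The xsi prefix must be bound to the XML Schema instance namespace.
    out += " xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\"";
    out += " xsi:noNamespaceSchemaLocation=\"" + xsdLocation + "\">\n";
    out += body + "\n";
    out += "</" + rootName + ">\n";
    return out;
}
```

A relative schema location such as "myfile.xsd" is normally resolved
against the location of the XML document itself.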

Thanks again!! You really helped me out a lot :)))

Regards,
Christian
 
Oskar Stuffer

Hello Christian!

Some time ago I developed a little tool that does what you want.
I compiled it on a Linux system using xerces-c 2.6.0.
Usage:
xsdval XSDSchema XML_file

I haven't had time to examine the code in your posting, but I think
you can easily figure out what's wrong by looking at the attached
sources of xsdval.


regards,
Oskar Stuffer






FILES = xsdval.cpp ErrorReporter.cpp
HEADERS = ErrorReporter.hh
PROG = xsdval

OBJECTS = $(FILES:.cpp=.o)

# path of xerces library
XERCESROOT = /home/oskar/software/languages/C++/libraries/xerces-c-src_2_6_0

INCLUDEDIRS = -I $(XERCESROOT)/include
LIBDIRS = -L $(XERCESROOT)/lib
LIBS = -lxerces-c
CC = g++
CFLAGS = -g $(INCLUDEDIRS)
INSTALLDIR = /opt/paghe6

.PHONY: clean cleantags install all

all: $(PROG)

$(PROG): $(OBJECTS) Makefile
	$(CC) $(CFLAGS) $(LIBDIRS) -o $(PROG) $(OBJECTS) $(LIBS)

$(OBJECTS): %.o: %.cpp $(HEADERS)
	$(CC) -c $(CFLAGS) $< -o $@

clean:
	rm -f *.o $(PROG)

TAGS: $(FILES)
	etags $(FILES) $(HEADERS)

cleantags:
	@rm -f TAGS

install: all
	strip $(PROG)
	cp $(PROG) $(INSTALLDIR)/bin
 
xerces xml C++ library: validating xml against xsd

Hi Guys,

I have a problem: I want to validate my XML file against an XSD, both located in my current working directory. Please have a look at the code snippet below:

XercesDOMParser *domParser = new XercesDOMParser;
LocalFileInputSource fin(X("./sample.xsd"));

domParser->setExternalSchemaLocation(X("./sample.xsd"));

Grammar *gmr = domParser->loadGrammar(fin, Grammar::SchemaGrammarType);

if (gmr == NULL)
{
    cerr << "couldn't load schema" << endl;
    //return false;
}

ParserErrorHandler errorHandler; //Derived from SetErrorhandler Class

domParser->setErrorHandler(&errorHandler);
domParser->setValidationScheme(XercesDOMParser::Val_Auto);
domParser->setDoNamespaces(true);
domParser->setDoSchema(true);
domParser->setValidationConstraintFatal(true);

domParser->parse("./sample.xml");
cerr << domParser->getErrorCount() << endl;
if (domParser->getErrorCount() == 0)
    cerr << "XML file validated against the schema successfully" << endl;
else
    cerr << "XML file doesn't conform to the schema" << endl;

When I try to execute the above code, I get the "couldn't load schema" error message.
I can't pinpoint the exact problem here. Is this the way to use the DOM parser to validate XML? Please help me resolve this error.
It's urgent.
Thanks a lot in advance.
 
