[Newbie:]Determine DOMNode value datatype

Ralph Stuber · Apr 20, 2005

Hello,

I just started coding in C++, and I am currently working on a library to
access stored data in XML format. I am using VC++6 and xerces-c-2.6.0. I
parsed an XML document via a DOMBuilder*. If I take a DOMNode*, I am
able to access its value by myNode->getFirstChild()->getNodeValue(). The
parsed XML document references an XML schema definition, which
defines two types of elements under the root element:

1) <nominal> with type xs:string
2) <numerical> with type xs:integer

The method myNode->getFirstChild()->getNodeType() always returns the
value 3, which means that the child is of type TEXT_NODE.

My question: is it possible to determine the type of the data stored in a
DOMNode? I want to determine whether it is an xs:string or an xs:integer, so
that I can create char* results for string nodes (e.g. a company
name) and integer results for integer nodes (e.g. a zip code). Does the
DOMBuilder parse the XML schema, and does it carry the information
provided by the schema over into the DOMNode* pointer?

Thanks in advance,

greetings from Oldenburg, Germany

Ralph

P.S.: Code snippets:

Parsing:
[...]
static const XMLCh gLS[] = { chLatin_L, chLatin_S, chNull };
DOMImplementation *impl =
    DOMImplementationRegistry::getDOMImplementation(gLS);
DOMBuilder* parser = ((DOMImplementationLS*)impl)->
    createDOMBuilder(DOMImplementationLS::MODE_SYNCHRONOUS, 0);
if (parser->canSetFeature(XMLUni::fgDOMValidation, true))
    parser->setFeature(XMLUni::fgDOMValidation, true);
if (parser->canSetFeature(XMLUni::fgDOMNamespaces, true))
    parser->setFeature(XMLUni::fgDOMNamespaces, true);
if (parser->canSetFeature(XMLUni::fgDOMDatatypeNormalization, true))
    parser->setFeature(XMLUni::fgDOMDatatypeNormalization, true);
rootNode = parser->parseURI(xmlFile); // rootNode is a DOMNode*
[...]
Reading and trying to determine the value type...

// Returns the transcoded text of the first matching child element, or 0 if
// none is found. A non-null result must be freed with XMLString::release().
char* getValue(DOMNode* rootNode, const char* elementName) {
    char* nodeValue = 0;
    try {
        DOMNodeList* rootElementList = rootNode->getChildNodes();
        rootNode = rootElementList->item(0);  // assumed to be <config>
        DOMNodeList* subElementList = rootNode->getChildNodes();
        // Stop at the first match so that no earlier result is leaked.
        for (unsigned int i = 0; i < subElementList->getLength() && nodeValue == 0; i++) {
            DOMNode* current = subElementList->item(i);
            if (current->getNodeType() != DOMNode::ELEMENT_NODE)
                continue;
            char* strValue = XMLString::transcode(current->getNodeName());
            DOMNode* text = current->getFirstChild();  // 0 for an empty element
            if (XMLString::equals(strValue, elementName) && text != 0) {
                nodeValue = XMLString::transcode(text->getNodeValue());
                cout << "Element value:" << nodeValue << endl;
                cout << "Element type:" << text->getNodeType() << endl;
                //*******************************************************************
                // it would be great here to determine the type of the
                // current->getFirstChild() value
                //*******************************************************************
            }
            XMLString::release(&strValue);  // transcode() allocates; free the name
        }
    } catch (...) {
        // Errors are swallowed; the function then returns whatever it has so far.
    }
    return nodeValue;
}

The XML document:
<?xml version="1.0" encoding="ISO-8859-1" ?>
<config xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:noNamespaceSchemaLocation="carelis_config.xsd">
    <nominal>Test eines nominalen Wertes</nominal>
    <numerical>12221</numerical>
</config>

The XML schema:
<?xml version="1.0" encoding="ISO-8859-1"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
    <xs:element name="config">
        <xs:complexType>
            <xs:sequence>
                <xs:element name="nominal" type="xs:string"/>
                <xs:element name="numerical" type="xs:integer"/>
            </xs:sequence>
        </xs:complexType>
    </xs:element>
</xs:schema>

ajm · Apr 21, 2005

Ralph,

If I understood correctly, it is the XML document (as opposed to the
schema, which asserts its validity) that you are parsing. As such,
the parser is under no obligation to even read the schema (though of course
you have chosen a validating parser in your case).

To my knowledge it is not possible to do what you want (if anyone has
an idea in this direction, I am certainly interested in hearing about it).
As a general rule, when rolling your own parser you will have to
build in document "intelligence", e.g. if the element name is "numerical"
then parse the text as an int, working on the assumption that this is in fact
true (i.e., that the document validates successfully).
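
A minimal sketch of that name-based "intelligence", assuming the element name and its text have already been transcoded into std::string. The names ValueKind, kindOf and parseInteger are made up for illustration and are not part of Xerces-C; the element names come from the schema in the original post.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdlib>
#include <string>

// Hypothetical tags for the two element kinds declared in the schema.
enum ValueKind { KIND_STRING, KIND_INTEGER };

// Hard-coded document knowledge: only <numerical> is treated as an integer.
inline ValueKind kindOf(const std::string& elementName) {
    return elementName == "numerical" ? KIND_INTEGER : KIND_STRING;
}

// Converts text to long. Returns false if no digits were read, if trailing
// characters remain, or if the value does not fit into a long.
inline bool parseInteger(const std::string& text, long& out) {
    char* end = 0;
    errno = 0;
    const long value = std::strtol(text.c_str(), &end, 10);
    assert(end != 0);  // strtol always sets end when given a non-null address
    if (end == text.c_str() || *end != '\0' || errno == ERANGE)
        return false;
    out = value;
    return true;
}
```

A caller would branch on kindOf(name) and call parseInteger only for KIND_INTEGER elements; the failure return covers the case where the document did not validate after all.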

By the way: if certain nodes of your document have a natural binding to objects
in your code, you might consider using operator<< and operator>> to handle
the XML aspects.
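
One way this could look, as a rough sketch only: the Config struct and the textOf helper are hypothetical, and the naive tag search merely stands in for a real DOM lookup. It assumes the text contains no characters that need escaping.

```cpp
#include <cassert>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical value object mirroring the <config> element.
struct Config {
    std::string nominal;
    long numerical;
};

// Writes the object as a <config> fragment (no escaping of &, < or >).
inline std::ostream& operator<<(std::ostream& os, const Config& c) {
    return os << "<config><nominal>" << c.nominal << "</nominal>"
              << "<numerical>" << c.numerical << "</numerical></config>";
}

// Returns the text between <tag> and </tag>, or "" if the pair is absent.
inline std::string textOf(const std::string& xml, const std::string& tag) {
    assert(!tag.empty());
    const std::string open = "<" + tag + ">";
    const std::string close = "</" + tag + ">";
    std::string::size_type begin = xml.find(open);
    if (begin == std::string::npos) return "";
    begin += open.size();
    const std::string::size_type end = xml.find(close, begin);
    if (end == std::string::npos) return "";
    return xml.substr(begin, end - begin);
}

// Reads the remaining input and fills the object; sets failbit on the
// stream if <numerical> does not hold a number.
inline std::istream& operator>>(std::istream& is, Config& c) {
    std::ostringstream buffer;
    buffer << is.rdbuf();
    const std::string xml = buffer.str();
    c.nominal = textOf(xml, "nominal");
    std::istringstream number(textOf(xml, "numerical"));
    if (!(number >> c.numerical)) is.setstate(std::ios::failbit);
    return is;
}
```

The benefit of this design is that the knowledge about which member is a string and which is a number lives in one place, next to the type it belongs to.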

hth
ajm.

Ralph Stuber · Apr 22, 2005

Hello AJ,

ajm said:
If I understood correctly, it is the XML document (as opposed to the
schema, which asserts its validity) that you are parsing.

You understood my intention correctly.

ajm said:
To my knowledge it is not possible to do what you want (if anyone has
an idea in this direction, I am certainly interested in hearing about it).
As a general rule, when rolling your own parser you will have to
build in document "intelligence", e.g. if the element name is "numerical"
then parse the text as an int, working on the assumption that this is in fact
true (i.e., that the document validates successfully).

In the meantime I implemented a method getType(DOMNode* node) that
determines the type of a node in several steps:

First, it looks up the schema file name given in the XML element named
"config". Then it parses the schema file to find the element
declaration that corresponds to the node. Once the declaration is
found, it reads the "type" attribute and returns an enum indicating the
type (STRING, INTEGER, DATE, ...). The getValue() method returns a
struct containing the enum value and a void* pointer to the child value
of the selected XML node. After checking the enum, the caller can cast
the void* pointer to an int* or a char*.

This way it is possible (albeit in a roundabout way) to determine the
type of every node of an XML document, assuming that every element with
the same name has the same type, and assuming that the name and path of
the corresponding schema file are known.
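
The shape of that result could be sketched as follows. This is not the actual getType()/getValue() code, which is not shown in the thread: the names XsdType, TypedValue, typeFromAttribute, makeValue and asInteger are hypothetical, and the sketch swaps the void* for typed members so that no cast is needed. It assumes the schema uses the "xs:" prefix, as in the example above.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Hypothetical enum for the schema types of interest.
enum XsdType { XSD_STRING, XSD_INTEGER, XSD_DATE, XSD_UNKNOWN };

// Maps the "type" attribute of an xs:element declaration to the enum.
inline XsdType typeFromAttribute(const std::string& attr) {
    if (attr == "xs:string")  return XSD_STRING;
    if (attr == "xs:integer") return XSD_INTEGER;
    if (attr == "xs:date")    return XSD_DATE;
    return XSD_UNKNOWN;
}

// Tagged value: the enum says which member is meaningful.
struct TypedValue {
    XsdType type;
    std::string text;  // always holds the raw element text
    long number;       // meaningful only when type == XSD_INTEGER
};

// Builds a tagged value from the element text; non-integers get number 0.
inline TypedValue makeValue(XsdType type, const std::string& text) {
    TypedValue v;
    v.type = type;
    v.text = text;
    v.number = (type == XSD_INTEGER) ? std::strtol(text.c_str(), 0, 10) : 0;
    return v;
}

// Checked accessor: asserts that the caller queried the tag correctly.
inline long asInteger(const TypedValue& v) {
    assert(v.type == XSD_INTEGER);
    return v.number;
}
```

Compared with a void* member, the typed members avoid questions about who owns the pointed-to memory and make a wrong cast impossible.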

Thank you anyway, greetings from Oldenburg, Germany

Ralph
