Problem understanding what XPATH to write

Clarence

Hi - I have a problem and here is the verbose version of what I am trying to
do (better too much info than not enough).

I am searching through about 4,700 XML files containing company contact
details. In the company details are phone numbers. The phone numbers with
the formats I need have the following structure.

<xs:complexType name="PHONENO">
<xs:sequence maxOccurs="unbounded">
<xs:element name="COUNTRY" type="xs:string"/>
<xs:element name="NUMBER" type="xs:string"/>
<xs:element name="AREACODE" type="xs:string"/>
</xs:sequence>
</xs:complexType>

I worked out this structure myself; I don't have a schema for the XML files
I am searching. The files use many different XML structures, but
the structure of the phone numbers is always the same.

My question is how do I tell if a bit of XML contains this PHONENO structure
(above)? What is a typical XPATH expression I could use to search my XML to
return a result saying Yes, this is a valid bit of XML containing the phone
number or no it isn't?

For example, for the following bit of XML I want a result (using XPATH?) to
say that the complex type PHONENO exists

<COMPANY>
<NAME>ACME Manufacturing</NAME>
<PHONENO>
<COUNTRY>UK</COUNTRY>
<NUMBER>887323</NUMBER>
<AREACODE>01865</AREACODE>
</PHONENO>
</COMPANY>



For the next bit, where the same elements appear in a different order, I also
want the XPATH result to say that the complex type PHONENO exists.

<COMPANY>
<NAME>ACME Manufacturing</NAME>
<PHONENO>
<COUNTRY>UK</COUNTRY>
<AREACODE>01865</AREACODE>
<NUMBER>887323</NUMBER>
</PHONENO>
</COMPANY>


However for the following THREE bits of XML, representative of the XML in my
files, I want the result using XPATH to say that the complex type PHONENO
does NOT exist.


This bit of XML has no PHONENO at all.

<COMPANY>
<NAME>ACME Manufacturing</NAME>
<ADDRESS>18 St Giles Street</ADDRESS>
<ADDRESS>Carfax</ADDRESS>
<ADDRESS>Oxford</ADDRESS>
</COMPANY>



This bit is missing the <NUMBER> element in PHONENO. Since not every
element of type PHONENO is present, it doesn't contain the
structure I want.

<COMPANY>
<NAME>ACME Manufacturing</NAME>
<PHONENO>
<COUNTRY>UK</COUNTRY>
<AREACODE>01865</AREACODE>
</PHONENO>
</COMPANY>



This bit of XML doesn't have the PHONENO structure I want
either. The PHONENO in this XML has an extra element, EXCHANGE, and therefore
isn't the same as the complex type PHONENO defined above.


<COMPANY>
<NAME>ACME Manufacturing</NAME>
<PHONENO>
<COUNTRY>UK</COUNTRY>
<AREACODE>01865</AREACODE>
<NUMBER>887323</NUMBER>
<EXCHANGE>Banbury 1183</EXCHANGE>
</PHONENO>
</COMPANY>


I'm not sure whether this is difficult or easy, but I spent a long time
thinking about it before making this post.

Thanks for everyone thinking about this.

Clarence (this is for home use, not for a job)
 
Richard Tobin

Clarence said:
<xs:complexType name="PHONENO">
<xs:sequence maxOccurs="unbounded">
<xs:element name="COUNTRY" type="xs:string"/>
<xs:element name="NUMBER" type="xs:string"/>
<xs:element name="AREACODE" type="xs:string"/>
</xs:sequence>
</xs:complexType>

Do you really want that maxOccurs="unbounded"? And from what you say
lower down, you don't seem to care about the order.

You seem to be wanting to find elements called PHONENO that contain a
COUNTRY, a NUMBER, an AREACODE and nothing else. If so, this will
match such an element:

PHONENO[count(*) = 3 and COUNTRY and NUMBER and AREACODE]

If you want to discover whether there is one of those within the current
element,

.//PHONENO[count(*) = 3 and COUNTRY and NUMBER and AREACODE]
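
To see how this check behaves on the sample fragments from the question, here is a rough Python sketch. It does not evaluate the XPath itself: the standard library's xml.etree.ElementTree supports only a limited XPath subset and has no count(), so the same predicate is restated in plain Python. The function name has_valid_phoneno and the sample variables are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Sorted tag names a matching PHONENO must contain, once each.
REQUIRED = ["AREACODE", "COUNTRY", "NUMBER"]


def has_valid_phoneno(xml_text):
    """Return True if some PHONENO holds exactly COUNTRY, NUMBER and AREACODE.

    Hypothetical helper: a plain-Python restatement of
    PHONENO[count(*) = 3 and COUNTRY and NUMBER and AREACODE].
    """
    root = ET.fromstring(xml_text)
    # Note: root.iter() also visits the root element itself, whereas
    # .//PHONENO only looks below the current element.
    for phone in root.iter("PHONENO"):
        # Three children with these three distinct names means one of each,
        # in any order.
        if sorted(child.tag for child in phone) == REQUIRED:
            return True
    return False


reordered = (
    "<COMPANY><NAME>ACME Manufacturing</NAME><PHONENO>"
    "<COUNTRY>UK</COUNTRY><AREACODE>01865</AREACODE><NUMBER>887323</NUMBER>"
    "</PHONENO></COMPANY>"
)
missing_number = (
    "<COMPANY><NAME>ACME Manufacturing</NAME><PHONENO>"
    "<COUNTRY>UK</COUNTRY><AREACODE>01865</AREACODE>"
    "</PHONENO></COMPANY>"
)
extra_exchange = (
    "<COMPANY><NAME>ACME Manufacturing</NAME><PHONENO>"
    "<COUNTRY>UK</COUNTRY><AREACODE>01865</AREACODE><NUMBER>887323</NUMBER>"
    "<EXCHANGE>Banbury 1183</EXCHANGE>"
    "</PHONENO></COMPANY>"
)

for name, text in [("reordered", reordered),
                   ("missing_number", missing_number),
                   ("extra_exchange", extra_exchange)]:
    print(name, has_valid_phoneno(text))
```

Only the reordered fragment should report True. To run the real XPath expression you would need a processor with full XPath 1.0 support; which one to use is outside what this thread covers.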

-- Richard
 
Joris Gillis

Richard Tobin said:
Do you really want that maxOccurs="unbounded"? And from what you say
lower down, you don't seem to care about the order.

If "maxOccurs" should really be "unbounded", you could use:

count(*) div (4*count(COUNTRY)*count(AREACODE)*count(NUMBER)) < 1
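
The arithmetic behind this test can be tried out with a small Python sketch. It takes the four counts directly instead of parsing XML, and ratio_test is a hypothetical name used only here. The zero-denominator branch reflects XPath 1.0 numeric rules, where division follows IEEE 754 and yields Infinity or NaN instead of an error.

```python
def ratio_test(total, country, areacode, number):
    """Mimic count(*) div (4*count(COUNTRY)*count(AREACODE)*count(NUMBER)) < 1.

    total is the number of child elements of PHONENO; the other arguments
    are the counts of each named child.
    """
    denominator = 4 * country * areacode * number
    if denominator == 0:
        # In XPath 1.0, dividing by zero gives Infinity or NaN,
        # and neither compares as less than 1.
        return False
    return total / denominator < 1


print(ratio_test(3, 1, 1, 1))  # one complete group: 3/4
print(ratio_test(2, 1, 1, 0))  # NUMBER missing: division by zero
print(ratio_test(4, 1, 1, 1))  # one extra element: 4/4
print(ratio_test(6, 2, 2, 2))  # two complete groups: 6/32
print(ratio_test(7, 2, 2, 2))  # two groups plus one extra element: 7/32
```

With a single group the test rejects both a missing element and an extra one. The last call still passes, though, so once the elements repeat the expression appears to tolerate extras; treat it as a heuristic, not a strict structural check.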
 
Clarence

WOAAAAAH, THANK YOU!!!!
Richard Tobin said:
Do you really want that maxOccurs="unbounded"? And from what you say
lower down, you don't seem to care about the order.

You are right, I don't care about the order. The maxOccurs just
reflects my limited understanding of what I wrote :( I put it there
because some companies have more than one PHONENO, and I thought I was making
it easier for people to help me out.
Richard Tobin said:
If you want to discover whether there is one of those within the current
element,

.//PHONENO[count(*) = 3 and COUNTRY and NUMBER and AREACODE]


Exactly what I was looking for. I was missing the count(*) bit in my code.

Thank you again Richard. Very clear answer.
Clarence



 
Clarence

You guys have got to stop this, there will be nothing left for me to do.
Thank you too Joris.
 
