XML <-> JSON conversion. What do you think?

Max

According to Stefan Goessner, a correct conversion from XML to
JSON must apply certain patterns to preserve the order and structure of the
elements. This not only ensures that the conversion is reversible, but is
especially important for documents such as SVG and SMIL, which require a
precisely correct order of the elements.
For example, in the case of:

<e>
some
<a>textual</a>
content
</e>

a direct conversion would be:

"e": {
"#text": ["some", "content"],
"a": "textual"
}

That would not make sense, because it concatenates a text in an array ...
According to Stefan, it should be as follows:

"e": "some <a>textual</a> content"

In practice the central node should be included in the text.
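
A minimal sketch of that idea, for illustration only. The node shape ({ text } for text nodes, { tag, children } for elements) and the helper names toMarkup, hasMixedContent and convert are assumptions made up for this example, not Goessner's code; attributes, escaping and repeated sibling tags are left out.

```javascript
// Hypothetical node shape (an assumption for this sketch):
//   text node:    { text: "some " }
//   element node: { tag: "a", children: [ ...nodes ] }

// Serialize a node back to markup (no attributes, no escaping).
function toMarkup(node) {
  if ("text" in node) return node.text;
  if (node.children.length === 0) return "<" + node.tag + "/>";
  const inner = node.children.map(toMarkup).join("");
  return "<" + node.tag + ">" + inner + "</" + node.tag + ">";
}

// Mixed content: at least one text child and at least one element child.
function hasMixedContent(element) {
  const hasText = element.children.some((child) => "text" in child);
  const hasElement = element.children.some((child) => "tag" in child);
  return hasText && hasElement;
}

// Convert an element to a JSON-ready value.
function convert(element) {
  if (hasMixedContent(element)) {
    // Keep the inner markup as one string, so the order is preserved.
    return element.children.map(toMarkup).join("");
  }
  if (element.children.every((child) => "text" in child)) {
    // Text-only (or empty) content becomes a plain string.
    return element.children.map((child) => child.text).join("");
  }
  // Element-only content: one property per child tag.
  const result = {};
  for (const child of element.children) result[child.tag] = convert(child);
  return result;
}

// <e>some <a>textual</a> content</e>
const e = {
  tag: "e",
  children: [
    { text: "some " },
    { tag: "a", children: [{ text: "textual" }] },
    { text: " content" },
  ],
};
const goessnerStyle = { e: convert(e) };
console.log(JSON.stringify(goessnerStyle));
```

The receiver still has to parse the string value as markup to get at the inner element again.
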
The same goes for examples like this:

<a>x<c/>y</a>

{
"a":"x<c/>y"
}

Or with a CDATA:

<e>
some text
<![CDATA[ .. some data .. ]]>
more text
</e>

{
"e":"\n some text\n <![CDATA[ .. some data .. ]]>\n more text\n"
}

Or with a double CDATA:

<e>
<![CDATA[ .. some data .. ]]>
<![CDATA[ .. more data .. ]]>
</e>

{
"e":"<![CDATA[ .. some data .. ]]><![CDATA[ .. more data .. ]]>"
}

(I did not understand this last conversion...)


In any case, these examples are confined to CDATA, but in fact they may
well extend to processing instructions, comments, etc.
Indeed, this approach raises a serious question about the difference
between JSON and XML.
Is it a correct approach, or do the nodes need to be converted into objects
and chained arrays in the usual way?
What do you think?

Max
 
Thomas 'PointedEars' Lahn

Max said:
According to Stefan Goessner, a correct conversion from XML to
JSON must apply certain patterns to preserve the order and structure of the
elements. This not only ensures that the conversion is reversible, but is
especially important for documents such as SVG and SMIL, which require a
precisely correct order of the elements.

What type of document would not require a precisely correct order of the
elements?
For example, in the case of:

<e>
some
<a>textual</a>
content
</e>

a direct conversion would be:

"e": {
"#text": ["some", "content"],
"a": "textual"
}

It would not. A direct conversion could be:

[
{
nodeName: "e",
attributes: {},
childNodes: [
{
nodeName: "#text",
nodeValue: "\n some\n "
},
{
nodeName: "a",
attributes: {},
childNodes: [
{"#text": "textual"}
]
},
{
nodeName: "#text",
nodeValue: "\n content\n"
}
]
}
]
That would not make sense, because it concatenates a text in an array ...

It is you who is not making sense here. "Concatenates a text in an array"?
According to Stefan, it should be as follows:

"e": "some <a>textual</a> content"

A possibility, but it would involve deserializing the content afterwards.
It would be even more inefficient than to serve XML in the first place.


PointedEars
 
Thomas 'PointedEars' Lahn

Thomas said:
Max said:
[...]
For example, in the case of:

<e>
some
<a>textual</a>
content
</e>

a direct conversion would be:

"e": {
"#text": ["some", "content"],
"a": "textual"
}

It would not. A direct conversion could be:

[
{
nodeName: "e",
attributes: {},
childNodes: [
{
nodeName: "#text",
nodeValue: "\n some\n "
},
{
nodeName: "a",
attributes: {},
childNodes: [
{"#text": "textual"}

Should have been

{
nodeName: "#text",
nodeValue: "textual"
}


PointedEars
 
Max

Thomas 'PointedEars' Lahn wrote:
What type of document would not require a precisely correct order of the
elements?

A custom XML document might not require a particular order of elements...
It would not. A direct conversion could be:

[
{
nodeName: "e",
attributes: {},
childNodes: [
{
nodeName: "#text",
nodeValue: "\n some\n "
},
{
nodeName: "a",
attributes: {},
childNodes: [
{"#text": "textual"}
]
},
{
nodeName: "#text",
nodeValue: "\n content\n"
}
]
}
]

A possibility...
It is you who is not making sense here. "Concatenates a text in an array"?

When a JSON converter finds two or more #text nodes, it creates one array
with all the string elements:

<e>
some
<a>textual</a>
content
</e>

->

"e": {
"#text": ["some", "content"],
"a": "textual"

}

I think that a JSON converter, in this case, can join strings:

"e": {
"#text": "some content",
"a": "textual"

}

What do you think?
A possibility, but it would involve deserializing the content afterwards.
It would be even more inefficient than to serve XML in the first place.

Yes, I agree with you. In this case, using JSON would be wrong.
 
Thomas 'PointedEars' Lahn

Max said:
Thomas 'PointedEars' Lahn wrote:

A custom XML document might not require a particular order of elements...

If I had meant nodes by "elements", would you agree then?
It would not. A direct conversion could be:

[
{
nodeName: "e",
attributes: {},
childNodes: [
{
nodeName: "#text",
nodeValue: "\n some\n "
},
{
nodeName: "a",
attributes: {},
childNodes: [
{"#text": "textual"}
]
},
{
nodeName: "#text",
nodeValue: "\n content\n"
}
]
}
]

A possibility...

It would be an accurate representation of the subtree, using the relevant
properties of the corresponding DOM objects. One that is easy and efficient
to iterate over.
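
To illustrate the iteration, here is a minimal sketch that walks such a structure in document order after a JSON round trip. The helper names walk and textContent are made up for this example, and the inner text node uses the nodeName/nodeValue form.

```javascript
// The subtree, with the inner text node written as nodeName/nodeValue.
const subtree = [
  {
    nodeName: "e",
    attributes: {},
    childNodes: [
      { nodeName: "#text", nodeValue: "\n some\n " },
      {
        nodeName: "a",
        attributes: {},
        childNodes: [{ nodeName: "#text", nodeValue: "textual" }],
      },
      { nodeName: "#text", nodeValue: "\n content\n" },
    ],
  },
];

// Visit every node in document order (depth first, children left to right).
function walk(nodes, visit) {
  for (const node of nodes) {
    visit(node);
    if (node.childNodes) walk(node.childNodes, visit);
  }
}

// Collect the text in the order it appears in the source document.
function textContent(nodes) {
  const parts = [];
  walk(nodes, (node) => {
    if (node.nodeName === "#text") parts.push(node.nodeValue);
  });
  return parts.join("");
}

// Arrays keep their order through JSON.stringify and JSON.parse.
const roundTripped = JSON.parse(JSON.stringify(subtree));
console.log(JSON.stringify(textContent(roundTripped)));
```

Because sibling nodes are array elements, a plain loop is enough and no for-in enumeration over property names is needed.
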
When a JSON converter finds two or more #text nodes, it creates one array
with all the string elements:

<e>
some
<a>textual</a>
content
</e>

->

"e": {
"#text": ["some", "content"],
"a": "textual"

}

I think that a JSON converter, in this case, can join strings:

"e": {
"#text": "some content",
"a": "textual"

}

What do you think?

It could, but that would not be an accurate representation of the content at
all. It matters that one text node precedes the element node and the other
follows it; neither the array of string values nor the resulting primitive
string value would show that. And object properties have no defined order
in the for-in iteration that one would need to resort to then; hence my
arranging data of adjacent nodes as array elements instead.
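
A small sketch of the information loss, assuming a made-up input shape (an ordered list of { text } and { tag, text } children) and a hypothetical joinedConvert helper that joins the text nodes as suggested: two different documents end up as the same JSON text.

```javascript
// Hypothetical input: the ordered children of <e>, each either
// { text: "..." } for a text node or { tag: "...", text: "..." } for a simple element.

// Join all text children into one "#text" string.
function joinedConvert(children) {
  const texts = [];
  const elements = {};
  for (const child of children) {
    if ("tag" in child) elements[child.tag] = child.text;
    else texts.push(child.text);
  }
  return { "#text": texts.join(" "), ...elements };
}

// <e>some<a>textual</a>content</e>
const textAroundElement = [{ text: "some" }, { tag: "a", text: "textual" }, { text: "content" }];
// <e>some content<a>textual</a></e>
const textBeforeElement = [{ text: "some content" }, { tag: "a", text: "textual" }];

const first = JSON.stringify(joinedConvert(textAroundElement));
const second = JSON.stringify(joinedConvert(textBeforeElement));

// Same JSON for both documents: the position of <a> is gone.
console.log(first === second);
// The ordered child lists themselves still differ.
console.log(JSON.stringify(textAroundElement) === JSON.stringify(textBeforeElement));
```
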


PointedEars
 
Max

Thomas 'PointedEars' Lahn wrote:
If I had meant nodes by "elements", would you agree then?
Yes.

It could, but that would not be an accurate representation of the content at
all. It matters that one text node precedes the element node and the other
follows it; neither the array of string values nor the resulting primitive
string value would show that. And object properties have no defined order
in the for-in iteration that one would need to resort to then; hence my
arranging data of adjacent nodes as array elements instead.

These cases demonstrate the difference between JSON and XML. I think I will
use JSON to transmit simple data and XML for structured data.
Many thanks for the help and suggestions.

Max
 
Lasse Reichstein Nielsen

Max said:
When a JSON converter finds two or more #text nodes, it creates one array
with all the string elements:

Which "JSON converter" are we talking about? I can assure you that it won't
be any XML<->JSON converter that I'll ever write (because I know that in XML,
order is important).

<e>
some
<a>textual</a>
content
</e>

->

"e": {
"#text": ["some", "content"],
"a": "textual"

}

I think that a JSON converter, in this case, can join strings:

Can, yes, obviously. If anyone writes it to do so. Which they definitely
shouldn't.
"e": {
"#text": "some content",
"a": "textual"

}

What do you think?

That it would be a lousy "converter".

/L
 
Lasse Reichstein Nielsen

Max said:
For example, in the case of:

<e>
some
<a>textual</a>
content
</e>

a direct conversion would be:

"e": {
"#text": ["some", "content"],
"a": "textual"
}

That would be a direct conversion *only* if you consider the order of
elements in the "e" element to be irrelevant. The contents suggest that
this is not the case.

This is no more a direct conversion than sorting the words of a book
is a translation of the book. Sometimes, often indeed, the order is
the most important part of the content.

....
Is it a correct approach, or do the nodes need to be converted into objects
and chained arrays in the usual way?

That depends entirely on what you need it for.
A generic conversion between XML and JSON would preserve order and
structure. More specialized conversions, for particular uses, might
know where it can cut corners and keep some of the XML in string form.

/L
 
Lasse Reichstein Nielsen

Max said:
These cases demonstrate the difference between JSON and XML.

Not really. It shows that a particularly naïve implementation
of a conversion from XML to JSON doesn't work well.


What if the conversion of
<e>
some
<a>textual</a>
content
</e>

was:

{"tag": "e",
"content" : [ "some",
{"tag": "a", "content": ["textual"]},
"content" ]}

What is the big difference then?
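
As a rough sketch of why such a structure loses nothing essential, it can be serialized straight back to XML. The helper names escapeText and toXml are made up for this example; attributes are not handled, and the whitespace around the text is already gone because the strings above are trimmed.

```javascript
// The structure from above, parsed from its JSON text.
const doc = JSON.parse(
  '{"tag": "e", "content": ["some", {"tag": "a", "content": ["textual"]}, "content"]}'
);

// Escape the characters that are special in XML character data.
function escapeText(text) {
  return text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Serialize a node back to XML: strings are text, objects are elements.
function toXml(node) {
  if (typeof node === "string") return escapeText(node);
  const inner = node.content.map(toXml).join("");
  if (inner === "") return "<" + node.tag + "/>";
  return "<" + node.tag + ">" + inner + "</" + node.tag + ">";
}

console.log(toXml(doc));
```

The children stay in one array, so text and elements come back out in the order they went in.
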
I think I will use JSON to transmit simple data and XML for structured data.

Your choice. Neither is inherently better (although JSON is often
shorter), but their performances depend on the choice of encoding
as much as the format of the data.

XML only has raw text and element nodes. Element nodes both work as a
list of XML nodes and as a map from strings to strings (attributes),
and it has a type itself (the tag name). Everything is rolled into
this one compound construct.

JSON has two types of compound structures: (unordered) Maps and
(ordered) Lists (i.e., indexed by either name or by number).

In that sense, JSON is richer than XML, where name-indexed attributes
can only contain simple text.

I find that most data can be well represented in JSON, but starting
with XML data obviously makes JSON look worse than XML. Just as starting
with JSON data would probably make XML look worse.

/L
 
Max

Lasse Reichstein Nielsen wrote:
Not really. It shows that a particularly naïve implementation
of a conversion from XML to JSON doesn't work well.

Really? Then most implementations of XML-to-JSON conversion are
naïve!
What if the conversion of
<e>
some
<a>textual</a>
content
</e>

was:

{"tag": "e",
"content" : [ "some",
{"tag": "a", "content": ["textual"]},
"content" ]}

What is the big difference then?

For JSON, "textual" is a value of "a" and then { a: "textual" }.
This conversion is your expansive implementation, created to bypass the
JSON limitations...
In fact, to obtain the string "some", instead of (for example) e["#text"],
you must use a blind positional access such as tag.content[1], while it is
more correct to access it by the real name of the object/tag, that is "e"!
Your choice. Neither is inherently better (although JSON is often
shorter), but their performances depend on the choice of encoding
as much as the format of the data.

XML only has raw text and element nodes. Element nodes both work as a
list of XML nodes and as a map from strings to strings (attributes),
and it has a type itself (the tag name). Everything is rolled into
this one compound construct.

JSON has two types of compound structures: (unordered) Maps and
(ordered) Lists (i.e., indexed by either name or by number).

In that sense, JSON is richer than XML, where name-indexed attributes
can only contain simple text.

I find that most data can be well represented in JSON, but starting
with XML data obviously makes JSON look worse than XML. Just as starting
with JSON data would probably make XML look worse.

/L


Can you suggest a good XML-JSON converter?
 
Thomas 'PointedEars' Lahn

Max said:
Lasse Reichstein Nielsen wrote:

Really? Then most implementations of XML-to-JSON conversion are
naïve!

It would seem you are not exactly in a position to make a correct assessment.
What if the conversion of <e> some <a>textual</a> content </e>

was:

{"tag": "e", "content" : [ "some", {"tag": "a", "content": ["textual"]},
"content" ]}

What is the big difference then?

For JSON, "textual" is a value of "a"
Nonsense.

and then { a: "textual" }.

And if there was

{ a: "textual", b: "foo" }

you could not know which one came first.
This conversion is your expansive implementation, created to bypass the
JSON limitations...

There are no limitations in JSON but those you make up here.
In fact, to obtain the string "some", instead of (for example) e["#text"], you
must use a blind positional access such as tag.content[1],

Not necessarily. As the DOM provides getElementsByTagName(), a similar
method can be implemented to traverse the object created from parsing JSON.
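
A minimal sketch of such a method, assuming the {"tag", "content"} structure discussed above. The name elementsByTagName is made up for this example and is not part of any library. Like the DOM method, it searches descendants only and returns them in document order.

```javascript
// Parsed from JSON text, as it would arrive from a server.
const tree = JSON.parse(
  '{"tag": "e", "content": ["some", {"tag": "a", "content": ["textual"]}, "content"]}'
);

// Return all descendant elements of `root` with the given tag name,
// in document order; "*" matches every element.
function elementsByTagName(root, tagName) {
  const matches = [];
  function visit(node) {
    if (typeof node === "string") return; // text, nothing to match
    if (tagName === "*" || node.tag === tagName) matches.push(node);
    node.content.forEach(visit);
  }
  root.content.forEach(visit);
  return matches;
}

const anchors = elementsByTagName(tree, "a");
console.log(anchors.length, anchors[0].content[0]);
```

With a helper like this the caller asks for elements by name and does not need to know their index in the content array.
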
while it is more correct to access it by the real name of the object/tag,
that is "e"!

You are mistaken. Text child nodes of the same level do not belong together
unless they are adjacent. Your approach would allow for one text child node
per element only.


PointedEars
 
Douglas Crockford

Max said:
According to Stefan Goessner, a correct conversion from XML to
JSON must apply certain patterns to preserve the order and structure of the
elements. This not only ensures that the conversion is reversible, but is
especially important for documents such as SVG and SMIL, which require a
precisely correct order of the elements.

That may be true if the text you are dealing with is primarily a document. If it
is primarily data, then preservation of document structure will add a lot of
inefficiency to the JSON data structure. A problem in much XML practice is that
it is sometimes difficult to understand the difference.

There are a number of transformations possible between JSON and XML. See for
example org.json.XML and org.json.JSONML at www.json.org/java
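
As one illustration of a document-preserving transformation, here is a rough JavaScript sketch of a JsonML-style array layout (tag name first, then an attributes object only when there are attributes, then the children in order). The input shape and the helper name toJsonMLStyle are assumptions for this example; this is not the org.json code.

```javascript
// Hypothetical input shape: { tag, attributes (optional), content: [strings or elements] }.
// Output: [tagName, attributesObject (only when non-empty), ...children].
function toJsonMLStyle(node) {
  if (typeof node === "string") return node; // text stays a plain string
  const result = [node.tag];
  if (node.attributes && Object.keys(node.attributes).length > 0) {
    result.push(node.attributes);
  }
  for (const child of node.content) result.push(toJsonMLStyle(child));
  return result;
}

// <e>some<a>textual</a>content</e>
const source = {
  tag: "e",
  content: ["some", { tag: "a", content: ["textual"] }, "content"],
};
console.log(JSON.stringify(toJsonMLStyle(source)));
```

The order of text and elements is kept because everything lives in arrays, at the cost of positional rather than named access.
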
 
Max

Thomas 'PointedEars' Lahn wrote:
It would seem you are not exactly in a position to make a correct assessment.

Why? I'm talking about the proper implementation of an XML2JSON converter.
I say that most converters are naïve because I find many
converters that make a simple conversion from JSON.
Even on the json.org website there are examples of simple conversion
(http://www.json.org/example.html)!
I have raised doubts and asked a question:
"Can you suggest a good XML-JSON converter?"
I have not received any answer, only chat.

Max
 
Max

Douglas Crockford wrote:
That may be true if the text you are dealing with is primarily a document.
If it is primarily data, then preservation of document structure will
add a lot of inefficiency to the JSON data structure. A problem in much
XML practice is that it is sometimes difficult to understand the difference.

There are a number of transformations possible between JSON and XML. See
for example org.json.XML and org.json.JSONML at www.json.org/java

Thank you, Douglas. I hope to find a solution for implementing a JavaScript
converter from XML to JSON.
(Pay attention to the examples page on json.org, because there are errors.)

Max
 
Thomas 'PointedEars' Lahn

Max said:
Thomas 'PointedEars' Lahn wrote:

Why? I'm talking about the proper implementation of an XML2JSON converter. I
say that most converters are naïve because I find many converters
that make a simple conversion from JSON. Even on the json.org website
there are examples of simple conversion
(http://www.json.org/example.html)! I have raised doubts and asked a
question: "Can you suggest a good XML-JSON converter?" I have not
received any answer, only chat.

You have received several answers addressing your question while we have
been engaging in a technical discussion about what would make up a good
converter. Whether you like that or not is a different matter, and how to
use search engines is beyond the scope of this newsgroup.

<http://jibbering.com/faq/>


PointedEars
 
Max

Thomas 'PointedEars' Lahn wrote:
You have received several answers addressing your question while we have
been engaging in a technical discussion about what would make up a good
converter. Whether you like that or not is a different matter, and how to
use search engines is beyond the scope of this newsgroup.

OK, I have received technical answers, but also accusations of ignorance about
XML2JSON converters... The only practical help was posted by Douglas. That
is all I was asking for. Everything else is chat.

Max
 
Thomas 'PointedEars' Lahn

Max said:
Thomas 'PointedEars' Lahn wrote:

OK, I have received technical answers, but also accusations of ignorance about
XML2JSON converters... The only practical help was posted by Douglas. That
is all I was asking for. Everything else is chat.

Ask for a refund.


Score adjusted

PointedEars
 
Douglas Crockford

Max said:
(Pay attention to the examples page on json.org, because there are errors.)

That remark is not helpful. If there are errors, it would be helpful if you
specifically identified them. You can thank the community for the free
information you have obtained by providing correct information of your own.
 
lorlarz

According to Stefan Goessner, a correct conversion from XML to
JSON must apply certain patterns to preserve the order and structure of the
elements. This not only ensures that the conversion is reversible, but is
especially important for documents such as SVG and SMIL, which require a
precisely correct order of the elements.
For example, in the case of:
[snip]

Max

jQuery has an RSS to JSON converter. No doubt order is maintained
here.
You might like to take a look:
http://ejohn.org/projects/rss2json/
 
