Why Use XML?

Randy Yates

I was hoping someone here could answer this very basic question or
point me to something on the web (I googled but didn't find anything
reasonable).
--
% Randy Yates % "With time with what you've learned,
%% Fuquay-Varina, NC % they'll kiss the ground you walk
%%% 919-577-9882 % upon."
%%%% <[email protected]> % '21st Century Man', *Time*, ELO
http://home.earthlink.net/~yatescr
 
Philippe Poulard

Randy said:
I was hoping someone here could answer this very basic question or
point me to something on the web (I googled but didn't find anything
reasonable).

Why not use XML?

--
Cordialement,

///
(. .)
--------ooO--(_)--Ooo--------
| Philippe Poulard |
-----------------------------
http://reflex.gforge.inria.fr/
Have the RefleX !
 
Bjoern Hoehrmann

* Randy Yates wrote in comp.text.xml:
I was hoping someone here could answer this very basic question or
point me to something on the web (I googled but didn't find anything
reasonable).

Your question is much like "why use wood?" The answers would depend on
whether you are trying to build a plane, a ship, or a house. There are
many wood-based solutions already available, there are many tools to
efficiently process wood, many people understand wood and processing of
wood, it comes in many flavours and is all in all quite flexible. You
would not use wood to build a plane though.
 
Andy Dingley

Randy said:
I was hoping someone here could answer this very basic question

OK, so we have very clear trollsign on this one.

However I like your domain name and you've posted to a TeX group in the
past, so let's hope you're real.

Reasons to use XML:

* It's there

* It's useful

* It's international

* It's synergistic


The world is already full of XML. You have to use it (good or not),
just because everything else you're connecting to is already doing it.
"Why" is still up for debate, but "whether to bother" was forced
several years ago. Just too late to not deal with it nowadays.

It's useful. It actually works. Imagine that, a protocol that comes out
of nowhere, does something useful, has a readable spec and actually
works pretty well. There are a few corner cases where the alternative
might be "better", but by and large XML is not only a possible
solution, it's a damn good one that beats the competition on its own
merits.

It's international. These days The Hell Of Software Development (tm)
isn't about dodging the bouncing feature request, it's about the sudden
internationalisation request. Take a big, ugly (very ugly) English-only
web app and have your sales team suddenly flog it to both Eastern
Europe and an Arabic-speaking country. Now deal with those character
encoding issues in a plain text format (by yesterday). In XML though,
you just pick a workable encoding, and the DOM does the rest of the
hard work. Even for CJKV.
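
As a rough illustration of the encoding point (a sketch, not anything from the original post): the same multi-script text is serialised twice, once as UTF-8 and once as UTF-16, and the parser recovers identical text from both because each document declares its own encoding. Python's xml.etree.ElementTree stands in for the DOM here, and the `greeting` element and `make_doc` helper are made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample text mixing Cyrillic, Arabic and Chinese.
TEXT = "Привет, مرحبا, 你好"


def make_doc(encoding: str) -> bytes:
    """Serialise TEXT as XML bytes in `encoding`, with a matching declaration."""
    xml = f'<?xml version="1.0" encoding="{encoding}"?><greeting>{TEXT}</greeting>'
    return xml.encode(encoding)


# Two byte streams that look nothing alike on the wire.
docs = [make_doc("utf-8"), make_doc("utf-16")]

# The parser reads the declaration (or byte order mark) and decodes for us.
decoded = [ET.fromstring(doc).text for doc in docs]

for value in decoded:
    print(value == TEXT)
```

The application code never mentions an encoding after serialisation; that is the part a plain text format would leave to you.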

Best of all though it's synergistic. This is a great word, even though
sadly mis-used by duck-squeezers and crystal-botherers. It's the idea
that 2+2=5, or at least can give you the benefits of 5. The whole may
be greater than the simple sum of parts.

With XML, synergy means that if I use off-the-shelf tools to work to a
standard protocol, and that if you use compatible tools to work to the
same protocol, then our overall systems together will interwork well
and be more capable than either one in isolation. To take an example
from the TeX world, TeX is a great document format for typesetting, but
it's poor for document management of large libraries. In XML though
(such as DocBook) any generalised tools I've already built to look at
"XML documents" and "extract and index embedded Dublin Core" will
magically find themselves capable of working on my newly imported
library, simply because we've all used XML and some decent good
practice and other common standards. My hypothetical "XML indexing
toolset" doesn't care too much if it's looking at RSS newsfeed entries,
the British Library or contractual definitions.
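
A minimal sketch of that kind of generic tool, assuming the documents embed Dublin Core using the standard `http://purl.org/dc/elements/1.1/` namespace. The two sample documents and the `dublin_core` function are invented for illustration; the point is only that the extractor never looks at the surrounding dialect.

```python
import xml.etree.ElementTree as ET

# The Dublin Core element set namespace.
DC = "http://purl.org/dc/elements/1.1/"


def dublin_core(xml_text: str) -> list[tuple[str, str]]:
    """Return (term, value) pairs for every Dublin Core element, wherever it sits."""
    prefix = "{" + DC + "}"
    pairs = []
    for el in ET.fromstring(xml_text).iter():
        if el.tag.startswith(prefix):
            pairs.append((el.tag[len(prefix):], (el.text or "").strip()))
    return pairs


# Hypothetical newsfeed entry.
rss_item = """<item xmlns:dc="http://purl.org/dc/elements/1.1/">
  <title>Why use XML?</title>
  <dc:creator>A. N. Author</dc:creator>
  <dc:date>2006-01-01</dc:date>
</item>"""

# Hypothetical contract in a completely different dialect.
contract = """<contract xmlns:dc="http://purl.org/dc/elements/1.1/">
  <meta><dc:title>Supply agreement</dc:title><dc:creator>Legal dept</dc:creator></meta>
  <clause>Payment is due in 30 days.</clause>
</contract>"""

for doc in (rss_item, contract):
    print(dublin_core(doc))
```

Both documents are handled by the same few lines because they share the XML syntax and the Dublin Core vocabulary, not because they share a schema.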

(Not that I'm at all biased against TeX, which there's a risk I might
have to be working with soon for just this purpose)

On the downside, XML doesn't do a damn thing on its own and always
needs to have a "dialect" defined for it. This can be ad hoc and
unspecified (i.e. the emergent dialect that's observable by looking at
the data itself) or it can be formally specified and made rigid by DTD,
XML Schema and OWL ontology (the ability to do the first casually is a
big benefit over SGML). However you do always have such a dialect --
whenever you hear snake oil talked about in the XML world, it's usually
by someone who doesn't appreciate this and who thinks that synergistic
benefits arise purely from using XML, not from sharing this dialect
too. XML is _not_ an instant lingua franca for all data.
 
Randy Yates

Andy Dingley said:
OK, so we have very clear trollsign on this one.

However I like your domain name and you've posted to a TeX group in the
past, so let's hope you're real.

[...]

Thanks Andy. I can assure you I am for real. That's a real phone
number below and I really live in North Carolina in a town called
Fuquay-Varina. I really am an electrical engineer who's a member of
the IEEE. Really. I'm not one of those wispy internet "non-entities."

I'm also asking for the very practical and relevant reason that I may
have an opportunity to develop a large on-line, web-based system for
my client and am wondering if XML would be applicable.

Also, please everyone note, I'm not trying to offend anyone or make
anyone angry. I'm simply asking for information.

In a nutshell, here's my dilemma (and I think it may be related to the
"dialect" you were referring to): To interpret any stream of
data---for example, a document in plain TeX---you must know the rules
for interpreting the symbols. So even though XML may provide a
mechanism for automating the definition of data types, the rules for
interpretation of those data types must also in a likewise manner be
known.

That's as precisely and concisely as I can state it given my current
feeble understanding of XML. If you, Andy, or anyone can help me get
a better grasp or understanding of XML, I'd appreciate it.
--
% Randy Yates % "...the answer lies within your soul
%% Fuquay-Varina, NC % 'cause no one knows which side
%%% 919-577-9882 % the coin will fall."
%%%% <[email protected]> % 'Big Wheels', *Out of the Blue*, ELO
http://home.earthlink.net/~yatescr
 
Joseph Kesselman

Randy said:
So even though XML may provide a
mechanism for automating the definition of data types, the rules for
interpretation of those data types must also in a likewise manner be
known.

Absolutely. XML is a shared syntax layer. When you build a data
representation language on top of XML, you still have to define and
implement its semantics.
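
To make the syntax/semantics split concrete, here is a small sketch. The `reading` document and the two interpreting functions are hypothetical; the XML parser hands back the same string either way, and what that string means is decided entirely by the application.

```python
import xml.etree.ElementTree as ET

# A well-formed document in a made-up dialect.
doc = "<reading><value>21</value></reading>"

# The shared syntax layer stops here: we get a tree and some text.
raw = ET.fromstring(doc).findtext("value")


def as_celsius(text: str) -> float:
    """One application's semantics: the value is a temperature in degrees C."""
    return float(text)


def as_item_count(text: str) -> int:
    """Another application's semantics: the value is a whole number of items."""
    return int(text)


print(repr(raw), as_celsius(raw), as_item_count(raw))
```

Nothing in the XML itself says which reading is right; that agreement has to come from the language defined on top of it.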

Whether XML would be applicable depends on specifically what you're
doing with the system and what corner of it you're talking about.

There are a lot of good tutorials on the web about what XML is, what
tools are associated with it, and how to take advantage of it. It sounds
like you should start by reading some of those so you can ask more
specific/productive questions.
 
Istvan

Why Use XML ???


Because we don't waste time teaching different languages to the
machines.
AND
so they understand each other much better, communicate better,
cooperate better... and they don't make mistakes.
And that's very important, that they don't make mistakes, because we
depend on our machines. If they break, we have a catastrophe.
 
Randy Yates

Joseph Kesselman said:
[...]
There are a lot of good tutorials on the web about what XML is, what
tools are associated with it, and how to take advantage of it. It
sounds like you should start by reading some of those so you can ask
more specific/productive questions.

I'm happy with the scope and level of productivity my current
questions yield. Thanks for your concern, though, Joseph.
--
% Randy Yates % "And all that I can do
%% Fuquay-Varina, NC % is say I'm sorry,
%%% 919-577-9882 % that's the way it goes..."
%%%% <[email protected]> % 'Getting To The Point', *Balance of Power*, ELO
http://home.earthlink.net/~yatescr
 
Andy Dingley

Randy said:
In a nutshell, here's my dilemma (and I think it may be related to the
"dialect" you were referring to):

Actually it's far worse! - albeit a fascinating problem.

I know _very_ little of TeX. We're currently looking at adopting LyX
for project documentation, which I'm quite against. I see TeX as a hard
problem to parse, even in the simplest case, and an abandonment of
document management tools I've already put effort into building - it's
still workable though, if I found an "XML DOM" equivalent that would
connect me equally well from my Python / Java build scripts to the
underlying docs.

At this point we're worrying about "dialect". We need some shared DTD /
Schema to represent an abstract notion of "documentation". We also need
this same "dialect" to be shared across all of the tools within our
system. Candidates in the XML world are HTML (crude, but it has
paragraph structure), DocBook (too bulky, and not additionally good
enough to really justify itself over HTML) or a new and as yet
un-proven candidate from OpenOffice (looks interesting though).

M$oft's offering of "XML" for Word / Office is, hardly surprisingly,
execrable and unworthy of serious contemplation.


We also need "vocabularies", which are taxonomic lists of stable
identifiers to classify objects in the document. They could be simple,
such as "table of contents" to classify structure to a semantic level
beyond that of the XML Schema, or more complex such as
"customer-0001-contract-2006-2" to identify particular instances.
Generally these are easy to process in a CMS because they're treated
opaquely as mere identifiers, or at most treated as URIs.
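
A sketch of what "treated opaquely" can look like in practice. It assumes, purely for illustration, that the identifiers are carried in a `class` attribute and that the vocabulary is a flat set; the `index_by_identifier` helper and the sample document are made up.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# The controlled vocabulary: stable identifiers, never parsed or interpreted.
VOCABULARY = {"table-of-contents", "customer-0001-contract-2006-2"}

doc = """<doc>
  <section class="table-of-contents"><p>1. Scope</p></section>
  <section class="customer-0001-contract-2006-2"><p>Terms</p></section>
  <section class="unregistered-term"><p>Misc</p></section>
</doc>"""


def index_by_identifier(xml_text: str):
    """Group section text by vocabulary identifier; report identifiers not in the list."""
    index = defaultdict(list)
    unknown = []
    for el in ET.fromstring(xml_text).iter("section"):
        ident = el.get("class")
        if ident in VOCABULARY:
            index[ident].append(el.findtext("p"))
        elif ident is not None:
            unknown.append(ident)
    return dict(index), unknown


index, unknown = index_by_identifier(doc)
print(index)
print(unknown)
```

The code only ever compares identifiers for equality, which is why adding a new one is a management problem rather than a programming one.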

To interpret any stream of
data---for example, a document in plain TeX---you must know the rules
for interpreting the symbols.

Now you get the tricky one, which is why TeX is both interesting and
unusably difficult to work with! This is an instance of what's still
known as A Hard AI problem, that of "nominals". They're things which
are sufficiently dynamic to stay dynamic even into the instance of the
dataset, i.e. they don't get nailed down by some separate and
pre-existing structure definition.

This is a fascinating area of study in contemporary AI and ontology
design. I've published on it myself. When it comes down to getting
real practical work done though, especially if you limit yourself to
XML, then run far away!

In practice, nominals are like software development. We find the old
"waterfall model" of software dev to be unworkable because it relies on
an unbroken linear progression from a concept to a description to an
implementation. Changing the concept (feature creep) breaks the
following stages and the implementation. Nominals also prevent you
using a waterfall model of knowledge representation.

_If_ you can avoid feature creep (dynamic redefinition of the relevant
structural vocabularies) then you can use waterfall and everything
works fine. In many business cases, you can do just this.

If the vocabularies that change are merely annotational (they identify
things, but they're just either opaque identifiers or leaf nodes with a
little self-contained description) then it's quite easy. Maybe needs a
management process to track who gets to append to the vocabularies and
to propagate this (old stuff we solved years ago).

If the document can itself introduce a new structural term like
"subclause only applying in the 23 states area", then you have problems.
 
Randy Yates

Andy Dingley said:
"dialect"
"vocabularies"
taxonomic lists of stable identifiers to classify objects
Hard AI problem
nominals
[...]

Andy,

I apparently struck a root nerve with you. You mention some things that
sound fascinating!

Let me ask a follow-up simple-minded question:

Which standard XML "vocabularies" exist at this point in time?
--
% Randy Yates % "I met someone who looks a lot like you,
%% Fuquay-Varina, NC % she does the things you do,
%%% 919-577-9882 % but she is an IBM."
%%%% <[email protected]> % 'Yours Truly, 2095', *Time*, ELO
http://home.earthlink.net/~yatescr
 
Stefan Ram

Randy Yates said:
Which standard XML "vocabularies" exist at this point in time?

To me, there are only "vocabularies", not "XML vocabularies".

Examples of vocabularies are:

http://www.iana.org/assignments/language-subtag-registry
http://dublincore.org/documents/dcmi-terms/
http://atompub.org/2005/03/12/draft-ietf-atompub-format-06.html

But I see no need to bind them to XML. They can be used
with XML or without XML.

Like the HTML element types: They are used both in
the SGML-based HTML and in the XML-based XHTML.
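
A small sketch of the same point, using two Dublin Core terms. The record values and both helper functions are hypothetical; the `DC.` prefix in the meta names follows the common convention for Dublin Core in HTML meta tags. The vocabulary (the term names) is identical in both outputs, and only one of them is XML.

```python
import html
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"

# A made-up metadata record keyed by Dublin Core term names.
record = {"title": "Example title", "creator": "Example author"}


def to_xml(rec: dict) -> str:
    """Bind the terms to XML elements in the Dublin Core namespace."""
    ET.register_namespace("dc", DC)
    root = ET.Element("metadata")
    for term, value in rec.items():
        ET.SubElement(root, "{" + DC + "}" + term).text = value
    return ET.tostring(root, encoding="unicode")


def to_html_meta(rec: dict) -> list[str]:
    """Bind the same terms to HTML meta tags, no XML involved."""
    return [
        f'<meta name="DC.{term}" content="{html.escape(value, quote=True)}">'
        for term, value in rec.items()
    ]


print(to_xml(record))
print("\n".join(to_html_meta(record)))
```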
 
