Why ASCII?

  • Thread starter Tejas Arun Kokje
Tejas Arun Kokje

Hi,

What was the reason for having XML files in ASCII format? I understand
that the primary purpose of XML is data interchange. Why could a binary
format not be used instead of ASCII? Wouldn't a binary format be
more efficient than ASCII?

Tejas Kokje
University of Southern California
 
Richard Tobin

What was the reason to have XML files in ASCII format ?

Presumably you mean text format, since XML formats can be encoded in
any character set.
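
As a minimal sketch of the character-set point, the Python below serialises the same small document in three encodings and parses each one back. The sample word, the `<word>` element, and the helper names `encode_doc` and `decode_doc` are illustrative assumptions, not anything from the XML specification.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample text containing one non-ASCII character.
TEXT = "café"


def encode_doc(encoding: str) -> bytes:
    """Build a tiny XML document and serialise it in the given encoding."""
    doc = f'<?xml version="1.0" encoding="{encoding}"?><word>{TEXT}</word>'
    return doc.encode(encoding)


def decode_doc(raw: bytes) -> str:
    """Parse the bytes; the parser reads the encoding from the XML declaration."""
    return ET.fromstring(raw).text


# The bytes on disk differ per encoding, but the parsed text is the same.
results = {
    enc: decode_doc(encode_doc(enc))
    for enc in ("UTF-8", "ISO-8859-1", "UTF-16")
}

for enc, text in results.items():
    print(enc, len(encode_doc(enc)), "bytes ->", text)
```

The byte sequences differ from one encoding to the next, yet each parses to the same text, which is why "text format" is a more accurate description than "ASCII format".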

I understand that primary purpose of XML is for data interchange.

That's your mistake. XML was created to be a simplified SGML for the
web, and human-readability was an explicit goal. Read section 1.1 of
the XML specification: http://www.w3.org/TR/REC-xml/#sec-origin-goals.

-- Richard
 
Keith M. Corbett

Richard Tobin said:
Presumably you mean text format, since XML formats can be encoded in
any character set.


That's your mistake. XML was created to be a simplified SGML for the
web, and human-readability was an explicit goal. Read section 1.1 of
the XML specification: http://www.w3.org/TR/REC-xml/#sec-origin-goals.

Good answer.

Furthermore, SGML itself is designed for "document processing", with
emphasis on features to support document interchange and formatting. (See
Annex A of Goldfarb, "The SGML Handbook".)

Clearly there is a lot of interest in (and experience with) leveraging XML
for data interchange. But depending on requirements, XML may or may not be a
great fit.

/kmc
 
Andy Dingley

What was the reason to have XML files in [text] format ?

Because there were already many binary solutions to much the same
problems as XML, but no-one was using them. Implementing XML with text
makes it incomparably easier to work with and that's an enormous

SGML pre-dates XML, but that wasn't in widespread use either, being
seen as too complex (Berners-Lee had already rejected it, quite
rightly, in favour of HTML). Clearly something with a lower cost of
adoption was needed.

A "binary XML" would also not have been more efficient, in any way
that mattered. Disk space is cheap, programmer time is expensive.
Text-XML only needs to be in its "inefficient but readable" form when
you're actually working with it - for transmission we already have
enough compression down in the network stacks to take care of this.
Compression belongs below the transport layer, not in the application!
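
A rough way to check the compression argument is to build a repetitive text-XML document and run it through a general-purpose compressor. The sketch below uses Python's `zlib` as a stand-in for whatever compression the network stack applies; the catalog structure and the helper name `sample_xml` are made up for illustration.

```python
import zlib


def sample_xml(count: int = 500) -> bytes:
    """Build a hypothetical, repetitive catalog document as UTF-8 bytes."""
    items = "".join(
        f'<item id="{i}"><name>widget {i}</name>'
        f'<price currency="USD">{i}.99</price></item>'
        for i in range(count)
    )
    doc = f'<?xml version="1.0" encoding="UTF-8"?><catalog>{items}</catalog>'
    return doc.encode("utf-8")


raw = sample_xml()
packed = zlib.compress(raw)        # what would travel over the wire
restored = zlib.decompress(packed)  # lossless: identical to the original
ratio = len(packed) / len(raw)

print(f"raw: {len(raw)} bytes, compressed: {len(packed)} bytes, "
      f"ratio: {ratio:.2f}")
```

Because the markup repeats so heavily, the compressed form is a small fraction of the readable form, and the application never has to handle anything but text.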

There's also the similar issue of efficiency in element naming. Around
1999, when XML first started to surface commercially, there were
schemas that looked like this:

<AAA>
<ABC>123</ABC>
<ABB>def</ABB>
</AAA>

These schemas betray a major lack of understanding of XML, and discard
some of its best benefits.
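
To see how little the cryptic names actually save, one can compare a terse schema with a descriptive one before and after compression. This is only a sketch: the descriptive names `customerOrder` and `orderNumber` are invented for the comparison, and `zlib` stands in for transport-level compression.

```python
import zlib


def build_doc(record_tag: str, field_tag: str, count: int = 1000) -> bytes:
    """Build a document of `count` records using the given element names."""
    rows = "".join(
        f"<{record_tag}><{field_tag}>{i}</{field_tag}></{record_tag}>"
        for i in range(count)
    )
    return f"<root>{rows}</root>".encode("utf-8")


terse = build_doc("AAA", "ABC")
descriptive = build_doc("customerOrder", "orderNumber")  # hypothetical names

# Size penalty of readable names, before and after compression.
raw_gap = len(descriptive) - len(terse)
compressed_gap = len(zlib.compress(descriptive)) - len(zlib.compress(terse))

print(f"extra bytes for readable names, raw: {raw_gap}")
print(f"extra bytes for readable names, compressed: {compressed_gap}")
```

The raw penalty for readable names largely disappears once the document is compressed, so the terse names give up readability for very little.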


XML was the synthesis of the decision to stick with text, the ability
to base much of it on SGML, and the goal of integrating it with
HTML, which was already in use in greater volume than any other single
format.
 
Peter Flynn

Andy said:
What was the reason to have XML files in [text] format ?

Because there were already many binary solutions to much the same
problems as XML, but no-one was using them. Implementing XML with text
makes it incomparably easier to work with and that's an enormous

SGML pre-dates XML, but that wasn't in widespread use either, being
seen as too complex (Berners-Lee had already rejected it, quite
rightly, in favour of HTML). Clearly something with a lower cost of
adoption was needed.

Although it turned out that using SGML over the Web worked perfectly,
even with large and complex DTDs like TEI or DocBook. The Panorama
plugin and its big sister, Publisher, and the Synex SGML browser engine
were object lessons in How To Do It Right, from which some current browsers
have yet to learn :)

There's also the similar issue of efficiency in element naming. Around
1999, when XML first started to surface commercially, there were
schemas that looked like this:

<AAA>
<ABC>123</ABC>
<ABB>def</ABB>
</AAA>

These schemas betray a major lack of understanding of XML, and discard
some of its best benefits.

These had their roots in early SGML DTDs like the AAP ones, designed for
speed of keyboarding onto punched cards, not for human comprehension :)

By contrast we daily see XML generated with element type names running
into dozens or even hundreds of characters, which also betrays a major
lack of understanding.

///Peter
 
