XML, XHTML, Text Structuring, and CSS

Andy Dingley

Not true.

Sorry, what I meant here was that you can't achieve the functions of
elements in <head> by taking arbitrary XML elements and applying CSS
to try to turn them into <title>.
 
Jukka K. Korpela

Blue said:
IE doesn't care about me so I don't care about IE.

That's your privilege, but you might get better answers if you
explicitly said, at the very beginning, that you don't want to author
for the WWW.
Frankly, XHTML 1.0 should be served as application/xhtml+xml.

The question really is whether you win anything by using XHTML 1.0 as
opposed to HTML 4. As soon as you find the right question, the answer
is pretty obvious.
(And as a
side note you can still have frames in 1.1 if you write a DTD
driver for it like I did or just ignore the DTD and just use those
tags anyway.)

You can draw a square and call it a circle, but it ain't no circle.
The DTD of XHTML 1.1 is fixed. If you change a single bit in the DTD,
you are not using XHTML 1.1. Is the marketese nonsense around XHTML
really so impressive that people attach that label to virtually
anything? Why not call it XHTML 3.0? Well, apart from the fact that the
W3C claims XHTML to be a "generic trademark" - yet another oxymoron.
Actually, I have XHTML 3.0 for you:
http://www.cs.tut.fi/~jkorpela/html/xhtml3.html
HTML failed as a structural language early in
its life. It IS a presentational language.

No, it is a bastard, or hybrid, language. And it's a practical tool,
which can be used structurally, or as tag soup that is little more
(actually, it's much less) than a clumsy collection of second-hand
text-processing macros. Or something in between. Your choice.
XML is an
acknowledgement of all the proprietary extensions that were going
to happen, and did happen, to HTML, like it or not.

No, of course not. XML is just a strongly simplified metalanguage for
defining the syntax of markup notations. You can use it for pure
structure, or pure presentation, or something in between. It's almost
as exciting as ASCII.
It allows the W3C to enforce its semantics

XML has no semantics, and it cannot serve as a weapon of enforcing
semantics (or anything else).
To be quite frank you have it backward. I'm actually more
interested in structure and I'm saying that HTML 4.01 isn't
structural enough.

We surely agree on HTML 4.01 not being structural enough. But what we
can see around us is a movement toward less structure, under the great
fallacy of getting more structure when you just use XML.
Custom Structural XML -> XSLT -> Presentational XML+CSS

What's really so bad about that?

First, the fact that the "custom structure" exists in your mind only,
not communicated to anyone else. Second, that the only way to access
your document is in the specific, single visible format that you have
defined in your text processing application (implemented using macros
in XML and XSLT and CSS clothes). It would be easier to write the
document using MS Word, with disciplined use of Word styles - and it
would be, in practice, more accessible that way, since so many users
know what to do with MS Word documents, and so few would wish to learn
to use your text processing application e.g. just to increase font
size.

And, of course, it would be even simpler, and considerably more
accessible, to use just HTML today, and maybe add some CSS tomorrow.
 
Toby A Inkster

Blue said:
This is neither here nor there, but I have read somewhere that in XHTML
2.0 the <img> tag might be removed and relegated to the object tag. How's
that for compatibility?

It's compatible with IE 5+, Netscape 6+, Opera 5(?)+.

Besides -- that's the whole point of XHTML 2 -- it sheds compatibility for
the sake of creating a better, more semantic, more structured markup
language.
Furthermore XHTML 2.0 isn't compatible at all
with legacy user agents because the HTTP MIME type has to be
application/xhtml+xml and from what I have read all legacy browsers
choke on it.

Most browsers support ftp:// URLs.

Look, Ma! No Content-Type header! ;-)
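
For the HTTP case raised in the quoted text, one common workaround is to choose the media type per request from the browser's Accept header. The following is only a rough sketch of that idea: the function name is made up for illustration, the parsing is deliberately simplistic, and it assumes the document is written so that it may also be sent as text/html.

```python
def choose_content_type(accept_header):
    """Pick a media type for an XHTML document from an HTTP Accept header.

    Illustrative helper, not part of any library. It returns
    application/xhtml+xml only when the client lists that type explicitly
    with a non-zero q value, and falls back to text/html otherwise.
    """
    for item in accept_header.split(","):
        parts = [p.strip() for p in item.split(";")]
        if parts[0].lower() != "application/xhtml+xml":
            continue
        q = 1.0  # a media range without a q parameter counts as q=1
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat an unparsable q value as "not acceptable"
        if q > 0:
            return "application/xhtml+xml"
    return "text/html"


# A client that advertises XHTML support explicitly:
modern = choose_content_type("text/html,application/xhtml+xml,application/xml;q=0.9")
# A client that only sends a wildcard and never names the XHTML type:
legacy = choose_content_type("image/gif, image/jpeg, */*")
```

A wildcard such as */* is ignored on purpose here, since a client that sends only a wildcard has not actually said it can handle the XML media type.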
 
Blue

That's your privilege, but you might get better answers if you
explicitly said, at the very beginning, that you don't want to author
for the WWW.

How would I not be authoring for the web? One can post Word or PDF
files and still put them on the web.
The question really is whether you win anything by using XHTML 1.0 as
opposed to HTML 4. As soon as you find the right question, the answer
is pretty obvious.

That's not the question, because you do.

You get namespaces, so you can integrate other markup like MathML, which
should definitely not be a direct part of HTML. This is also very
similar and related to XHTML modularization, which more easily allows
subsets and supersets of HTML.
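
To make the namespace point concrete, here is a small sketch using Python's standard xml.etree.ElementTree. The sample document and the helper name are invented for illustration; only the two namespace URIs are the real XHTML and MathML ones. It shows how a namespace-aware parser keeps the two vocabularies apart inside one file.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# Made-up sample: an XHTML page with a MathML fragment embedded in it.
DOC = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Mixed-namespace sketch</title></head>
  <body>
    <p>The sum is
      <math xmlns="http://www.w3.org/1998/Math/MathML">
        <mi>a</mi><mo>+</mo><mi>b</mi>
      </math>
    </p>
  </body>
</html>"""


def count_by_namespace(xml_text):
    """Return {namespace URI: element count} for a well-formed document."""
    counts = {}
    for elem in ET.fromstring(xml_text).iter():
        # ElementTree spells qualified names as "{namespace-uri}local-name".
        if elem.tag.startswith("{"):
            ns = elem.tag[1:].split("}", 1)[0]
        else:
            ns = ""
        counts[ns] = counts.get(ns, 0) + 1
    return counts


counts = count_by_namespace(DOC)
```

The parser only tells you which vocabulary each element claims to belong to. Whether any browser does something useful with the MathML part is a separate question.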

But the immediate benefit is server side scripting. You might not
need those namespace features but server scripting does. You get
proprietary HTML anyway; namespaces and XML just make it official.
But basically 90% of the benefit is that XHTML conforms to XML's uniform
markup rules, which allows XML parsers to process XHTML files with
fewer CPU cycles.
You can draw a square and call it a circle, but it ain't no circle.
The DTD of XHTML 1.1 is fixed. If you change a single bit in the DTD,
you are not using XHTML 1.1. Is the marketese nonsense around XHTML
really so impressive that people attach that label to virtually
anything? Why not call it XHTML 3.0? Well, apart from the fact that the
W3C claims XHTML to be a "generic trademark" - yet another oxymoron.
Actually, I have XHTML 3.0 for you:
http://www.cs.tut.fi/~jkorpela/html/xhtml3.html

Do you understand what the point of the 1.1 revision was over 1.0? With
1.0 you had three different doctypes and if you changed those it
didn't conform, whereas HTML 1.1 is about drivers which allow you to
pick and choose the tag sets (modules) to support. 1.1 is about
DTD drivers and modules instead of defining multiple DTDs for various
features and subsets of the language. With 1.1 you define a driver
and you include modules. Instead of using the frames DTD you create
a driver to include the 1.1 generic DTD and add frames. Instead of
big DTDs for everyone you get little DTDs for subsets.

No, it is a bastard, or hybrid, language. And it's a practical tool,
which can be used structurally, or as tag soup that is little more
(actually, it's much less) than a clumsy collection of second-hand
text-processing macros. Or something in between. Your choice.

It's amazing to me that people insist it's structural just because of
the heading and list tags.
No, of course not. XML is just a strongly simplified metalanguage for
defining the syntax of markup notations. You can use it for pure
structure, or pure presentation, or something in between. It's almost
as exciting as ASCII.

The XML format itself is just simple basic rules for file parsing.
It's like comma delimited text files. You do not define your syntax in
the file; it's defined externally through a DTD or a Schema, which in
the case of Schema can be an XML file. But I understand what you're
trying to say in the last statement.
XML has no semantics, and it cannot serve as a weapon of enforcing
semantics (or anything else).

I beg to differ. It allows them or anyone else to point and say
definitively that so and so isn't following the spec. With XML and
namespaces there is no justification. Statements of conformance when
it's not could amount to fraud in advertising. There is no
justification when that entity can easily create a DTD of their own
syntax. But they can't confuse the issue by claiming conformance with
such and such a reason for not following the spec exactly.
We surely agree on HTML 4.01 not being structural enough. But what we
can see around us is a movement toward less structure, under the great
fallacy of getting more structure when you just use XML.

You do get more structure if you define a format with more structure.
With XML there is nothing stopping you or me from trying to create a
format and get others to follow it. With XML parsers, if we define our
structure adequately, we can translate it to whatever "standard"
document structure wins out. And we can still translate it to HTML
through XHTML if I really need to.
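
As a rough sketch of the translation step described here: the snippet below uses a hand-written Python function in place of an actual XSLT stylesheet, and the private vocabulary (report, heading, para, term) and its mapping are invented for the example. It only renames elements into XHTML ones.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical private vocabulary mapped onto XHTML element names.
TAG_MAP = {"report": "div", "heading": "h1", "para": "p", "term": "em"}


def to_xhtml(elem):
    """Recursively copy elem, renaming tags via TAG_MAP.

    Tags missing from the map become span, and attributes are dropped,
    so this is far less than what a real stylesheet would do.
    """
    out = ET.Element("{%s}%s" % (XHTML_NS, TAG_MAP.get(elem.tag, "span")))
    out.text, out.tail = elem.text, elem.tail
    for child in elem:
        out.append(to_xhtml(child))
    return out


SOURCE = (
    "<report><heading>Results</heading>"
    "<para>A <term>driver</term> pulls in modules.</para></report>"
)

# Serialize the XHTML namespace as the default namespace (no prefix).
ET.register_namespace("", XHTML_NS)
result = ET.tostring(to_xhtml(ET.fromstring(SOURCE)), encoding="unicode")
```

Note that the meaning of the private tags lives entirely in TAG_MAP. Anyone who only has the source file and not the mapping has no way to know that term was meant as emphasis.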

My original post was about asking what the inherent value of HTML
(although I realize now I should have specified structural) was, and
all anyone can say about it is backward compatibility, i.e. that it
will DISPLAY in older browsers.
First, the fact that the "custom structure" exists in your mind only,
not communicated to anyone else. Second, that the only way to access
your document is in the specific, single visible format that you have
defined in your text processing application (implemented using macros
in XML and XSLT and CSS clothes). It would be easier to write the
document using MS Word, with disciplined use of Word styles - and it
would be, in practice, more accessible that way, since so many users
know what to do with MS Word documents, and so few would wish to learn
to use your text processing application e.g. just to increase font
size.

And, of course, it would be even simpler, and considerably more
accessible, to use just HTML today, and maybe add some CSS tomorrow.

I already use CSS simply because it is more capable and it cleans up the
HTML files I create in a text editor. I learned a lot about XSL-FO
last night. I would gladly use XSL-FO instead of CSS if browsers
supported it. What's ironic is that FO looks just like
advanced presentational HTML. But that, ironically, is what is preventing
its adoption even though it's already a W3C recommendation.

I believe that people are right and FO will become a semi-standalone,
PDF-like format, but I don't think it will be widely abused like that.
I believe most people, like me, will not abuse it because they will
embrace multiple stylesheets. The fear of FO is irrational in my
opinion.

And I think it's obvious I have come to the conclusion to use my own
XML formats until a good standard structural format comes along, and
then thanks to namespaces I can use it but still add extensions to it
until the standard catches up.

But the question in my mind originally was why use HTML as an output
format? I have yet to receive anything pointing to a benefit, so I
believe that my conclusion is to only use HTML when I really need it
until I no longer need it, hope for something better eventually, but
in the meanwhile create my own structure.

I thank everyone for their feedback.
 
Barry Pearson

Jukka said:
XML has no semantics, and it cannot serve as a weapon of enforcing
semantics (or anything else).
[snip]

See:
http://www.barry.pearson.name/articles/new_html/

<extract>
"The World Wide Web Organisation of governments (W3O) has announced its plans
for future standards for the web. W3O recently replaced W3C, the World Wide
Web Consortium, as the guardian of standards-recommendations for the web. W3O
has now announced that it is scrapping the W3C's XHTML recommendations, and is
replacing them with "NewHTML".

W3O is publishing the specification for NewHTML 1.0 Strict to coincide with
this announcement, and work is underway to develop NewHTML 2.0 Totalitarian.
Contributions to the latter are particularly sought from political and
religious groups who are in favour of the restrictions that the 2.0
Totalitarian standard will enable them to impose on authors."
</extract>
 
Jukka K. Korpela

Blue said:
How would I not be authoring for the web?

By deliberately ignoring the fact that your methods will exclude the
browser that is currently clearly dominant.
One can post Word or PDF
files and still put them on the web.

And get much better coverage than by publishing according to the
XHTML 1.1 recommendation.
You get namespaces so you can integrate other markup like MathML

On which planet? MathML is a monstrosity with almost negligible
support. Besides, then your documents would not be XHTML documents but
XML documents that contain, among other things, some XHTML tags.
This is
also very similar and related to XHTML modularization which more
easily allows subsets and supersets of HTML.

Is that supposed to be a _benefit_? Is there a sudden need for
additional incompatible dialects of HTML?
But the immediate benefit is server side scripting.

I have no idea what you are talking about. If you have decided to
use a particular server side technology that requires XHTML, then this
has few implications for the rest of us.
But basically 90% of the benefit is that XHTML conforms to XML's uniform
markup rules which allows XML parsers to process XHTML files with
fewer CPU cycles.

No, in the real world, processing XHTML is slower than processing HTML.
Do you understand what the point of the 1.1 revision was over 1.0?

_I_ do.
With
1.0 you had three different doctypes and if you changed those it
didn't conform

Strangely, if you don't conform, you don't conform.
whereas HTML 1.1 is about drivers which allow you
to pick and choose the tag sets (modules) to support.

No it isn't. Have you actually read the XHTML 1.1 specification?
(I presume that's what you refer to by "HTML 1.1".) XHTML 1.1 is
(considerably) _more_ restricted than XHTML 1.0. It has a single
document type definition (DTD). You are confusing it with the general
idea and technology of modularization. You can pick up your favorite
collection of modules (so revealingly called "tag sets", indeed), but
it won't be XHTML 1.1 unless you pick up exactly the same modules.
It's amazing to me that people insist it's structural just because
of the heading and list tags.

Well, maybe you should learn a little more HTML, perhaps starting from
HTML 2.0, in that case.

Trying to ridicule the existing modest structurality isn't a very
bright idea if _your_ alternative is a collection of tags with no
public agreement on semantics.
The XML format itself is just simple basic rules for file parsing.

Well, roughly so, though it does not postulate the existence of files,
and parsing is just the "consumer" side of the matter.
I beg to differ. It allows them or anyone else to point and say
definitively that so and so isn't following the spec.

Syntactically, yes. Do you understand the difference between syntax and
semantics?
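
A tiny illustration of that difference, assuming nothing more than Python's standard XML parser and two made-up snippets. A well-formedness check can say whether the tags nest properly, but it has no opinion on whether the tag names mean what they claim.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if text parses as well-formed XML, else False."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Syntactically fine, semantically absurd: a "circle" with four corners.
misused = "<circle><corner/><corner/><corner/><corner/></circle>"
# Syntactically broken: <corner> is never closed.
broken = "<circle><corner></circle>"

misused_ok = is_well_formed(misused)  # the parser has no objection
broken_ok = is_well_formed(broken)    # the parser rejects this one
```

Validating against a DTD would add checks on which elements may appear where, but that is still syntax. Nothing in the toolchain can tell that a circle should not have corners.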
With XML there is nothing stopping you or me from trying
to create a format and get others to follow it.

That's indescribably trivial. You could replace "With XML" by "With
plain text" or by "Without XML".
My original post was about asking what the inherent value of HTML
(although I realize now I should have specified structural) was,
and all anyone can say about it is backward compatibility, i.e.
that it will DISPLAY in older browsers.

If you wish to put it that way. But you should include newer and future
browsers, of course.
But the question in my mind originally was why use HTML as an
output format?

Maybe because your output should be acceptable as input to interested
parties?

Once again, the blind test is revealing. If you use your collection of
XML tags (with or without a DTD, doesn't matter now) and your
stylesheet, which is virtually surely designed for one particular
visual presentation only, what are the chances of a blind man making
any sense of it?
 
