What's wrong with this HTML (fails validation)?

  • Thread starter robert maas, see http://tinyurl.com/uh3t

dorayme

"Jukka K. Korpela" said:
I have much sympathy for idiots, but not
that much for people who just act like idiots.

Does this mean our partnership is headed for a rocky road?
 

robert maas, see http://tinyurl.com/uh3t

From: "Jukka K. Korpela" said:
Content-Type: text/plain;
format=flowed;
charset="Windows-1252";
reply-type=original

That's not a valid charset to use for posting to newsgroups.
Could you please use US-ASCII or ISO 8859-1 (Latin 1) instead?
In particular, I don't have access to any machine capable of
displaying text in Windows-1252 character set, and the Web browser
doesn't do a particularly good job of rendering non-USASCII
characters in USASCII/VT100, so anything you said might be mangled
by the time I see it here.
Do you by any chance have a validator that spots each time you use
a character of Windows-1252 that is not compatible with US-ASCII,
so that you could eliminate all such characters from your article,
and then post what's left as US-ASCII instead?
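A check of the kind asked about here can be sketched in a few lines. This is only an illustration, written in Python; the name `find_non_ascii` is made up for the example and is not a feature of any newsreader or validator mentioned in this thread. It reports the line and column of every character outside US-ASCII, so that each one can be replaced before posting.

```python
def find_non_ascii(text):
    """Return (line, column, character) for every character outside US-ASCII.

    Lines and columns are 1-based. Hypothetical helper, for illustration only.
    """
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col_no, ch in enumerate(line, start=1):
            if ord(ch) > 127:  # US-ASCII covers code points 0..127 only
                hits.append((line_no, col_no, ch))
    return hits


if __name__ == "__main__":
    # Sample text with an i-diaeresis and two curly quotes.
    sample = "plain text\nna\u00efve \u201cquotes\u201d\n"
    for line_no, col_no, ch in find_non_ascii(sample):
        print(f"line {line_no}, column {col_no}: U+{ord(ch):04X}")
```

The sketch assumes the article has already been decoded into text; it says nothing about which byte encoding (Windows-1252, ISO 8859-1, or other) the characters came from.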
Not very useful, but useful. Knowing whether there is an error or
not is more information than not knowing whether there are errors
or not.

Well that depends on whether you consider the purpose of a
validator to be:
-1- Show the author where the error is, and explain what's wrong,
so that the author can immediately fix it.
-2- Flag the entire WebPage as INVALID, with no idea where the
error actually is, so the author must post to a newsgroup
asking for help, and spend several days before a single error
can be fixed.
I'll agree the W3C validator accomplished -2- in this case.
In my wild youth, I wrote major parts of a Pascal compiler,
including error processing, and I still remember it was rather
difficult to be correct and helpful at the same time.

Hey, as long as you're here, I have an OT question just for you:
Do you have access to any Pascal system (compiler and runtime
library) which is suitable for writing CGI server-side
applications? I.e. it must have some way to inspect system
variables such as REQUEST_METHOD and CONTENT_LENGTH and
QUERY_STRING, and it must have some way to read n characters from
standard input, in order to obtain the urlencoded form contents as
a string for both GET and POST methods.
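For comparison only, and not an answer to the Pascal question: the steps listed above (inspect REQUEST_METHOD, read CONTENT_LENGTH characters from standard input for POST, fall back to QUERY_STRING for GET) look roughly like this in Python. The function name `read_form` is invented for the sketch, and it assumes the form data is plain urlencoded ASCII, so that the byte count in CONTENT_LENGTH equals the number of characters to read.

```python
import os
import sys
from urllib.parse import parse_qs


def read_form(environ, stdin):
    """Return the urlencoded form fields of a CGI request as a dict of lists.

    environ is a mapping such as os.environ; stdin is a text stream.
    Hypothetical helper, for illustration only.
    """
    method = environ.get("REQUEST_METHOD", "GET").upper()
    if method == "POST":
        # CONTENT_LENGTH is a byte count; for urlencoded ASCII data
        # that is also the number of characters to read from stdin.
        length = int(environ.get("CONTENT_LENGTH") or 0)
        raw = stdin.read(length)
    else:
        # For GET the form contents arrive in QUERY_STRING instead.
        raw = environ.get("QUERY_STRING", "")
    return parse_qs(raw, keep_blank_values=True)


if __name__ == "__main__":
    fields = read_form(os.environ, sys.stdin)
    print("Content-Type: text/plain")
    print()
    for name, values in sorted(fields.items()):
        print(f"{name} = {values}")
```

Any language with access to environment variables and a counted read from standard input can follow the same three steps.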
On the contrary. A utility that reports mistakes (even on a
"There is an error" basis, though naturally I prefer more exact
reports) is often essential, but it is not a _substitute_ for
learning and understanding. Rather, an incentive and tool for them.

Well so-far all I've learned is that:
While <tag /> is a perfectly acceptable non-container XML tag, it's
totally invalid as such in SGML, generating completely different
language semantics, therefore must be totally avoided in any
WebPage that is supposed to be transitional between HTML and XHTML.
The br element is defined in a way that's totally incompatible
between HTML and XHTML, so must be completely avoided in
transitional WebPages.
There's a second validator that is very much different from the W3C
validator and can help diagnose errors where the W3C validator gave
an error apparently unrelated to anything wrong in the syntax.
Compare this with spelling checkers.

My experience with spelling checkers was totally frustrating. For
trying to spot typographic mistakes in my transcription of my
personal diary to computer file, I spent many hours adding new
words to the dictionary, such as girls' names, places/streets, and
even psychological jargon that didn't happen to be in the base
dictionary. In the end, after many many hours of such work, not a
single real typo was spotted. Meanwhile several typos crept through
because the mis-typed word happened to match a real word, usually
an archaic word, that happened to be in the base dictionary.

My current computer (Macintosh Performa 600 running System 7.5.5)
doesn't even have a spelling checker as far as I know, and I'm not
going to waste my time downloading one. Whenever I'm about to use a
new word I haven't used before, I type it as I best think it should
be spelled, and feed it to either dictionary.com or google.com, and
see what turns up. Sometimes it matches but gives a meaning I
didn't intend, indicating I either picked the wrong word or
accidentally misspelled one word to yield another word. Often it says
no match but offers alternative spellings, and I try any that look
likely and eventually find the one with the meaning I intended.
About once every few months my spelling is wrong and none of the
offered corrections are what I intended and I have no idea how to
spell the word correctly. So then I have to post to a newsgroup
asking what is the word that sounds like "mumleblortch" and means
"suchandsuch", and hope somebody knows and will tell me.
I have the same problem with movie titles, where indb.com maps from
title to movie, and from actor in movie to all movies that starred
that actor, but has no way to map from concept to movie title.
For example for years I remembered a movie that had somebody go
back in time and almost collide with themselves making the return
trip, and then at the end the time machine is damaged causing the
whole movie to repeat at hyper-speed. I couldn't find anyone who
knew what that movie was. Finally a few years ago the movie
appeared on TV again, and I was able to find the title: "Journey to
the Center of Time".

In cryptography there's a concept of "trapdoor function", where
it's very easy to compute one function but not its inverse. The
same idea applies to information retrieval such as spelling and
movie titles with current network technology. A few WebSites try to
provide the reverse mapping in some very special cases, such as the
site that recognizes famous integer sequences. I don't know any
dictionary/spellcheck reverse-mapping service, do you?
if Word flags an entire sentence in a manner that effectively
says "hey, this went over my head, the sentence is too
complicated", I'm grateful for the information and don't require
it to tell _where_ its analysis broke. Rather, I read the
sentence carefully and then usually reformulate it, typically
breaking it to two sentences.

That's not at all comparable to what I experienced with the
W3C validator. Imagine if you had a sentence that said:
I rather like using the open parens, i.e. "(", to start a footnote.
And then ten pages later you had another sentence that started:
I have 6 (six) reasons to love Heather Thompson:
And the spelling checker showed you just that last line and told
you that that was not a valid sentence. It didn't bother to tell
you that it considered the entire ten pages from the first open
paren, ignoring the second open paren, to the finally matching
close paren, as a single "sentence". It didn't tell you that it was
ignoring that second open parens because you are already in scope
of open parens and it's bad style to nest parens. How can you
carefully read "the sentence" when you don't even know how many
pages were gulped into a single sentence by the parsing of the
spell checker?
All human people should learn more every day to the extent they
can.

I agree. But the ancient practice of throwing somebody in jail, or
flogging somebody, or just saying "that's it" and walking away,
without even telling them what crime they committed that warranted
such punishment, is not the way to help somebody learn to avoid
such mistakes in the future. Unfortunately that ancient practice is
still practiced today, and I've been unable to learn several topics
as a result (of the combination of lack of meaningful error
diagnostics and my lack of knowledge sufficient to guess what I did
wrong).

And let me say something about the word "diagnostic"!! The word
comes from "diagnose", which means to figure out what's wrong and
state that clearly. If you go to a doctor, and he says "Yes, you're
sick, but I won't tell you what is causing you to be sick, you need
to learn enough to figure that out yourself, and I won't even tell
you what topic you need to learn", that's not a diagnosis, IMO.
These days, you can easily use CSS to set the relevant
properties (margin-top and margin-bottom for applicable elements)
to zero, with the usual CSS caveats of course.

The usual caveat is that it doesn't work, plain and simple.
In this context, "net" (better written as "NET") means "null end
tag". When you have "<p /", you have a NET-enabling start tag,
i.e. a start tag that makes the next "/" act as the end tag for
the element that was opened. It's a nice idea in SGML, but it was
never implemented in HTML browsers, even though it was formally
part of HTML up to and including HTML 4.01.

Ah, thanks for telling me that NET is an abbreviation, nothing to
do with Visual Basic .NET, or InterNet, or net profit, etc., and
for telling me what the abbreviation stands for. That's a start.
Now I still don't know why it's called "null" and why it's called
"end tag" and why it really acts like a start tag instead of an end
tag, etc. But clearly it's not anything I ever intended, so my De
Anza class instructor was wrong that I should use empty tags in
transitional HTML documents, so I've stopped using them as of
today. Now I just need to find alternative syntax in all my
existing Web pages that I've created or edited since 2.5 years ago.
By using <hr>.

That violates what I learned in the "Web Design" class.
The instructor required two things:
- Set up header (doctype etc.) as transitional.
- Never use an opening tag without the matching close tag.
As you point out, neither br nor hr is compatible with those two
rules. So I've been forced to go back to what I used to do before I
ever learned about the hr element:
----------------------------------------------------------------
Alternatively, by setting a top or bottom border for an element.

I just want a few blank lines followed by a bar across the screen
to mark the end of each section and start of next section, because
I want a whole lot of sections to be all in a single physical
WebPage, but I want readers to notice when they've run off the end
of one topic and are starting to bump into the next (unrelated)
topic.
There are other techniques like images and background images too,

That doesn't work with text-only access such as the only access I
have here.

So on another pending topic: Do you know any way to force a line
break without causing a blank line? Do you know any way to avoid a
blank line at the end of a pre block?
 

Chris F.A. Johnson

Let me see if I can find the documentation for that ...
<http://www.w3.org/TR/CSS21/box.html>
shows only how to do it in CSS, not directly in the element, so
that's useless for me because not all browsers support CSS, in
particular the only browser available to me for testing over VT100
dialup (lynx) into unix shell doesn't support CSS.
<http://www.thescripts.com/forum/thread154512.html>
also refers to CSS, but at least it shows doing something directly
in the individual element, so let me give it a try ... No, it
doesn't work. Here's the HTML source:

Anything I haven't yet included might be found
<a href="http://merd.sourceforge.net/pixel/language-study/syntax-across-languages.html">here</a>
in much terser form but still perhaps useful if you then use Google to find
the documentation for the keyword cited there.
<p style="margin-top: 0; padding-top: 0"></p>
<em>**(I don't want a blank line here, just a forced newline.)**</em>
Also, the original perl cookbook sourcecode and partial translations to
several other languages can be found
<a href="http://pleac.sourceforge.net/">here</a>,
..

And here's how it appears on-screen.

Anything I haven't yet included might be found here in much terser
form but still perhaps useful if you then use Google to find the
documentation for the keyword cited there.

**(I don't want a blank line here, just a forced newline.)** Also, the
original perl cookbook sourcecode and partial translations to several
other languages can be found here, ...

So what do I need to do there to get rid of that blank line?

Use said:
Also is there any way to prevent the blank line after the pre element here?

.. So suppose you
enter two numbers, with a space between them, such as</p>
<pre>
42 69
</pre>
<pre>

**(I do *not* want a blank line here!!)**
on a single line of input?
Lisp will read the 42, and print it out on a new line.


That's disgusting. I've tried to purge my documents of all
accidental use of that crock, using &gt; whenever I want that
character to *appear* in the display of the WebPage.

Why? There is no good reason to use &gt; instead of >.
 

dorayme

(e-mail address removed) (robert maas, see http://tinyurl.com/uh3t)
So on another pending topic: Do you know any way to force a line
break without causing a blank line?

<br>

As for all the other questions you raise, you will need a team of
surgeons to operate on you. I enjoyed watching your tortured
soul, it reminded me of something in my past. <g>

I am the <br> specialist, so hopefully this bit is now fixed.
 

robert maas, see http://tinyurl.com/uh3t

From: "Andy Dingley" said:
So close your elements with an end tag.

If I do that, validation fails. Should I just ignore validation failure??
Quote your attributes.

Both the key and the value, which I believe is invalid:
<a "href"="http://...">
or just the value, as I already do?
Your code examples are one of the few cases when <pre> might well
be appropriate, ...

Is there any way to have a code example like that with both the
preceding and following text directly adjacent to it, no blank line
either before or after the code example?
It does no such thing, nor does <p>. It causes the content to be
rendered as a block within a box, and CSS might say that there's some
margin space after this. That's a lot different from there being "a
blank line afterwards". There is no line, there's only space after the
line before.

Then please tell me how to get rid of the space after the line before!!!
I want it to look like this:

**StartGoodExample**
Again, there's no automatic newline after each output, so the 42 and
69 run together with each other as well as with the following shell
prompt. To force a line break (in the output) at any point, include
this statement: print "\n";. For example, to print 42 on a line by
itself, not run together with the next shell prompt, do this:
perl -e 'print 42; print "\n";'
and to print 42 on one line and 69 on another line, and move to yet
another line for the next shell prompt, do this:
perl -e 'print 42; print "\n"; print 69; print "\n";'
**EndGoodExample**

not like this:

**StartBadExample**
Again, there's no automatic newline after each output, so the 42 and
69 run together with each other as well as with the following shell
prompt. To force a line break (in the output) at any point, include
this statement: print "\n";. For example, to print 42 on a line by
itself, not run together with the next shell prompt, do this:
perl -e 'print 42; print "\n";'

and to print 42 on one line and 69 on another line, and move to yet
another line for the next shell prompt, do this:
perl -e 'print 42; print "\n"; print 69; print "\n";'
**EndBadExample**

You see the space (blank line of text) after the end of the first
<pre> block in the bad example? How do I get rid of that so it
looks like the good example which I concocted by editing out the
blank line because I don't know any way for HTML to generate it
directly??
Learn some trivial CSS, ...

I already did, when I took that class that taught me crap.
CSS isn't available here over VT100 dialup into Unix shell, the
only net access I have here, so it's impossible for me to develop
any more CSS stuff, and even if I magically blindly guessed the
correct CSS it wouldn't work here.

That starts a NET, which is **not** the semantics I want!!

Note that all Web pages I develop must satisfy these requirements:
- Must render the way I want in lynx, the only Web browser available here.
- Must pass validation, so I don't get harassing crap from others.
- Should render the way I want in most other browsers too, but I
have no way to check that from here. So if something looks wrong
in your browser, after I already say it's fine from here, please
tell me what you think needs fixing that won't break lynx or
validator.
 

robert maas, see http://tinyurl.com/uh3t

From: "Jonathan N. Little" said:
Anything in recent history, Yeah 10-year old browser will have trouble
but who is using one?

I'm using the only Web browser available here. If you don't like
it, why don't you come over here and show me something better that
also works on FreeBSD Unix over VT100 dialup.
Now unfortunately IE does not support "sibling selectors"

That's irrelevant to me because IE doesn't work over VT100 dialup.
 

Rik

If I do that, validation fails. Should I just ignore validation failure??

Don't forget that fixing some errors may draw attention to other errors
previously obscured.
Is there any way to have a code example like that with both the
preceding and following text directly adjacent to it, no blank line
either before or after the code example?

pre{
margin: 0;
}

Or any other value you deem appropriate.
 

Jonathan N. Little

robert said:
I'm using the only Web browser available here. If you don't like
it, why don't you come over here and show me something better that
also works on FreeBSD Unix over VT100 dialup.


That's irrelevant to me because IE doesn't work over VT100 dialup.

So you're saying the page is *not* for publication? Only you are going to
see it in a terminal program? If so drop the HTML and make it a text file!

If not, and you do intend to publish this on the web, then what *you*
use is unimportant, but what *they* use (by that the rest of the world)
is paramount, and I can guarantee you that the lion's share will not be
viewing it with Lynx!
 

John Hosking

robert said:
[Omitted attributions for Jukka replaced] He spake, robert maas answered, Jukka replied:
Not very useful, but useful. Knowing whether there is an error or
not is more information than not knowing whether there are errors
or not.

"Several thousand lines after where the actual error happened" would
indeed be a pain. But from what I saw in your OP and subsequent posts,
that wasn't the actual case. In fact, all the action occurred within
Line 1353.
Well that depends on whether you consider the purpose of a
validator to be:
-1- Show the author where the error is, and explain what's wrong,
so that the author can immediately fix it.
-2- Flag the entire WebPage as INVALID, with no idea where the
error actually is, so the author must post to a newsgroup
asking for help, and spend several days before a single error
can be fixed.
I'll agree the W3C validator accomplished -2- in this case.

I think what'd be more accurate to say is that the W3C validator usually
hits in the range of your -1- above and about -1.7-, because some of the
error descriptions are unmistakeable but others take some studying and
experience.
Well so-far all I've learned is that:
While <tag /> is a perfectly acceptable non-container XML tag, it's
totally invalid as such in SGML, generating completely different
language semantics, therefore must be totally avoided in any
WebPage that is supposed to be transitional between HTML and XHTML.
The br element is defined in a way that's totally incompatible
between HTML and XHTML, so must be completely avoided in
transitional WebPages.

You referred to W3C documents in another post (or was it in this one?),
so I guess this might not be too helpful. Still, I give you some
references in a little table which I hope you will find accurate and useful:

HTML: <br> and <hr>
http://www.w3.org/TR/html401/struct/text.html#h-9.3.2
http://www.w3.org/TR/html401/present/graphics.html#edef-HR

XHTML: <br/> and <hr/> or <hr></hr>
http://www.w3.org/TR/xhtml1/#h-4.6

XHTML Appendix C: <br /> and <hr />
http://www.w3.org/TR/xhtml1/#C_2
Include a space before the trailing / and > of empty elements, e.g. <br
/>, <hr /> and <img src="karen.jpg" alt="Karen" />. Also, use the
minimized tag syntax for empty elements, e.g. <br />, as the alternative
syntax <br></br> allowed by XML gives uncertain results in many existing
user agents.

SGML: Don't know. Haven't needed to care yet.
XML: Similarly not directly relevant to me here (but see above).
There's a second validator that is very much different from the W3C
validator and can help diagnose errors where the W3C validator gave
an error apparently unrelated to anything wrong in the syntax.

Yes, it sometimes helps to get a second opinion.
I have the same problem with movie titles, where indb.com maps from
title to movie, and from actor in movie to all movies that starred
that actor, but has no way to map from concept to movie title.
For example for years I remembered a movie that had somebody go
back in time and almost collide with themselves making the return
trip, and then at the end the time machine is damaged causing the
whole movie to repeat at hyper-speed. I couldn't find anyone who
knew what that movie was. Finally a few years ago the movie
appeared on TV again, and I was able to find the title: "Journey to
the Center of Time".

If you register at IMDb (I assume you didn't really mean "indb") you
have access to the message boards. Sign on and go to the "I Need To
Know" board, where some people ask questions like yours and other people
pop up with the answers. IMDb also has search capabilities using
keywords, but the results are not always satisfying.


Ah, thanks for telling me that NET is an abbreviation, nothing to
do with Visual Basic .NET, or InterNet, or net profit, etc., and
for telling me what the abbreviation stands for. That's a start.
Now I still don't know why it's called "null" and why it's called
"end tag" and why it really acts like a start tag instead of an end
tag, etc.

Well, look again. What you got was "Warning: net-enabling start-tag;"

I interpret that to mean that what was found by the validator is a
NET-enabling thing, meaning it enables a null end tag (and as Jukka
explained - for which I am grateful because I did not know this - that
means the next slash acts as an ending tag, apparently without the need
for a bracket >). And oh yeah, it *is* a start-tag *itself*, but it
enables a null end tag. Maybe you need to re-read Jukka's post more
carefully. (And I don't think he used any characters which the VT100
would mangle, so you shouldn't have to worry.)

That violates what I learned in the "Web Design" class.
The instructor required two things:
- Set up header (doctype etc.) as transitional.

If this is exactly what she said (i.e., that's *all* she said about it,
dogmatically), then she left out some important info. Transitional is
for legacy documents not yet cleaned up to validate as strict. I don't
know when you had your course, but there *was* a time when specifying a
transitional doctype was good advice (or at least, widely recommended;
Jukka or BTS or somebody will have the details).

For *new* pages, or pages you are actively maintaining, go with HTML
4.01 strict, unless you have a bona fide reason to use XHTML.
- Never use an opening tag without the matching close tag.
As you point out, neither br nor hr is compatible with those two
rules.

Sure they are. You're letting yourself get confused. I think you're
hyperventilating. Sit down. ;-)
 

Steve Pugh

If I do that, validation fails. Should I just ignore validation failure??

Not if you do it correctly.

First you need to ask yourself if you are writing XHTML or HTML. Where
and how you close elements varies between the two. So you must be 100%
consistent to the rules of whichever one you are using. Starting with
including either an HTML or XHTML doctype.

I think that most of the original problems have come from mixing the
two formats in one document and then getting more and more confused
when people tried to help you out.
Is there any way to have a code example like that with both the
preceding and following text directly adjacent to it, no blank line
either before or after the code example?

Without using CSS?

In XHTML:
<p>text<br /><code>example</code><br />text</p>
In HTML:
<p>text<br><code>example</code><br>text</p>

However, I would use pre + css and accept the blank lines for the tiny
minority of people who view the page in Lynx. Unless you are the only
visitor to your site, you have to accept that your view is very far
from what most people will see.
That starts a NET, which is **not** the semantics I want!!

Not in XHTML it doesn't. There's no such thing as a NET in XHTML.

And a NET is not semantics - it doesn't alter the meaning of the
element at all - <p/text/ means exactly the same as <p>text</p>.

In XHTML use <br /> and in HTML use <br>. That's it. That's how to
create a valid and working line break in the two languages.

Steve
 

Steve Pugh

Well so-far all I've learned is that:
While <tag /> is a perfectly acceptable non-container XML tag,
Correct.

it's totally invalid as such in SGML, generating completely different
language semantics, therefore must be totally avoided in any
WebPage that is supposed to be transitional between HTML and XHTML.

It has a different (but valid) use in HTML. However that use has never
been supported by browsers. It should be avoided in HTML for a mix of
practical and technical reasons.
The br element is defined in a way that's totally incompatible
between HTML and XHTML, so must be completely avoided in
transitional WebPages.

I think you've misunderstood what Transitional means in the context of
HTML and XHTML.

HTML 4.0 Transitional was for making pages that were in transition
between HTML 3.2 style presentational markup and HTML 4+CSS style
semantic markup plus separate presentation.

XHTML 1.0 Transitional is just an update of this to use XML syntax. It
is nothing to do with a transition between HTML and XHTML.

The only thing that comes close to a transition between HTML and XHTML
is Appendix C of the XHTML spec which recommends the use of <br /> for
empty elements and <p></p> for non-empty elements (where empty is as
defined in the spec, not as in whether they have actual content or
not).

Steve
 

Jukka K. Korpela

Scripsit robert maas, see http://tinyurl.com/uh3t:
That's not a valid charset to use for posting to newsgroups.

Windows-1252 is a registered character encoding, and any software used in
Internet matters needs to deal with it in order to be successful.
Could you please use US-ASCII or ISO 8859-1 (Latin 1) instead?

I could, but I have good reasons not to. I use Outlook Express both for
domestic and other matters and both for email and Usenet, for the time
being, for reasons partly related to my working with IT books for the
general audience. Setting the encoding that way would have too many negative
side effects in other uses, see e.g. "Issues in Unicode email",
http://www.kolumbus.fi/jukka.k.korpela/unicode-email.html
Well that depends on whether you consider the purpose of a
validator to be:
-1- Show the author where the error is, and explain what's wrong,
so that the author can immediately fix it.
-2- Flag the entire WebPage as INVALID, with no idea where the
error actually is, so the author must post to a newsgroup
asking for help, and spend several days before a single error
can be fixed.

It's something between the two: a validator reports mismatches with the
document type definition (and errors in general syntax) in detail, but you
need to understand something about the formal definitions to understand the
messages. See
http://www.cs.tut.fi/~jkorpela/html/validation.html

Validators aren't really suitable for most web page authors, but in the
absence of better checking tools, they are useful.
Well so-far all I've learned is that:
While <tag /> is a perfectly acceptable non-container XML tag, it's
totally invalid as such in SGML, generating completely different
language semantics, therefore must be totally avoided in any
WebPage that is supposed to be transitional between HTML and XHTML.

No, the point is simply that <tag /> causes errors in SGML validation. It's
actually "browser-safe", since browsers are so stupid. Just don't use <tag
which are the correct tags in said:
There's a second validator that is very much different from the W3C
validator and can help diagnose errors where the W3C validator gave
an error apparently unrelated to anything wrong in the syntax.

It's a phoney validator, because a program cannot be very much different
from a validator and still be a validator. There are snake-oil merchants
around; they sometimes even pop up in this group. You have been warned.
The usual caveat is that it doesn't work, plain and simple.

It works in all browsing situations where the visual appearance really
matters. If you use a speech browser, or switch off CSS support, or use a
text-only browser, you must be interested in the content of pages only and
not their graphic excellence.
The instructor required two things:
- Set up header (doctype etc.) as transitional.
- Never use an opening tag without the matching close tag.

That's simply wrong advice.
So on another pending topic: Do you know any way to force a line
break without causing a blank line? Do you know any way to avoid a
blank line at the end of a pre block?

It seems that you have a problem with Lynx. Apparently Lynx prints a blank
line after a <pre> element, no matter what. If that's intolerable, just
don't use <pre>. You can mostly achieve the same result by using
<div>...</div> for each line or <div>...<br>...<br>...</div> if you find
that more convenient. You may wish to set font-family: monospace, for
browsers that use a different font by default. And you may wish to use
&nbsp; instead of spaces to make spaces non-collapsible, so this might get a
bit awkward, and you might reconsider whether the blank line is tolerable.
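The workaround described above (one <div> per line, with &nbsp; in place of spaces) is tedious to type by hand, so here is a rough sketch of how it could be generated. It is written in Python purely for illustration; `pre_to_divs` is a made-up name, and whether the result avoids the blank line in any particular Lynx version is untested here.

```python
from html import escape


def pre_to_divs(text):
    """Render preformatted text as one <div> per line, without using <pre>.

    Spaces become &nbsp; so that browsers do not collapse them.
    Hypothetical helper, for illustration only.
    """
    lines = []
    for line in text.splitlines():
        # Escape &, < and > first, then protect the spaces.
        body = escape(line, quote=False).replace(" ", "&nbsp;")
        # An empty <div> may render with no height, so keep a placeholder.
        lines.append(f"<div>{body or '&nbsp;'}</div>")
    return "\n".join(lines)


if __name__ == "__main__":
    print(pre_to_divs("perl -e 'print 42; print \"\\n\";'"))
```

The output still relies on the browser's default font unless font-family: monospace is applied to the <div> elements, as noted above.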
 

robert maas, see http://tinyurl.com/uh3t

From: John Hosking said:
"Several thousand lines after where the actual error happened" would
indeed be a pain. But from what I saw in your OP and subsequent posts,
that wasn't the actual case. In fact, all the action occurred within
Line 1353.

OK, the "several thousand" was a slight exaggeration. I should have
said "more than one thousand".

I don't know what you mean by "the action". The error message cited
only line 1353, but in fact the error was caused by a mistake
somewhere previously in the file which entered a mode not supported
by any Web browser (and flagged at *that* location by the *other*
validator), and that place where the actual error happened *could*
have been anywhere in the first 1352 lines of the file and would
still have resulted in that exact same validation error noted on
1353. So the anti-diagnostic merely told me that somewhere in the
first 1353 lines of the file I had made some mistake that finally
caused an actual syntax error on line 1353.

The GNU C compiler has the decency to tell me, when it sees gross
syntax garbage immediately after the close of a multi-line string,
that although the error was noted at this point the start of the
string was way up on such-and-such line so the actual error might
really be here. That indeed happened a couple times in producing
the test programs I've been discussing in another thread. The 'n'
key on my keyboard has been very flaky lately, and sometimes when I
type a string whose last logical character is the two-character
notation \n the n is missed, and I don't happen to see it while
typing, so the \ doesn't make a newline, it quotes the closing
quote mark *within* the string, allowing the string to continue
several more lines of the program until finally there's another
string literal, and the start of that literal *ends* the multi-line
string, and the innards of that string literal are parsed as if c
syntax, immediately causing a gross syntax diagnostic. But the
helpful advice from GNU c compiler that the string started way back
there gives me an immediate clue (because in c I *never* write
multi-line string literals, I always use \n within strings instead,
at least I try to when my keyboard cooperates). I wish the
validator would likewise note something like this:

You started a NET 20 lines earlier:
<p />In the descriptions of the built-in functions which take keyword arguments,
^
which finally terminated here on line 1353:
<br /><em>(reduce #'/ nums :end 9)</em>
^
which might be responsible for the validation error which occurred
almost immediately next in your file:
<br /><em>(reduce #'/ nums :end 9)</em>
^
whatever...
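
The kind of hint wished for above could be sketched roughly as follows. This is only a minimal sketch in Python, not how the W3C validator works: it uses a regular expression rather than a real SGML parser, the function name `find_net_starts` is hypothetical, and it assumes attribute values contain no `>` characters. It simply lists every line where a start tag is written with a trailing `/>`, the syntax this thread identifies as enabling a null end tag in HTML 4.01, so the author would know where to look.

```python
import re

# Matches a start tag that ends in "/>", such as <p /> or <br />.
# Assumption: attribute values contain no "<" or ">" characters.
SLASH_CLOSED_TAG = re.compile(r"<([A-Za-z][A-Za-z0-9]*)\b[^<>]*/>")


def find_net_starts(html_text):
    """Return (line_number, tag_name) for every tag written with '/>'."""
    hits = []
    for line_number, line in enumerate(html_text.splitlines(), start=1):
        for match in SLASH_CLOSED_TAG.finditer(line):
            hits.append((line_number, match.group(1).lower()))
    return hits


if __name__ == "__main__":
    sample = "<p />In the descriptions\n<br /><em>(reduce #'/ nums :end 9)</em>"
    for line_number, tag_name in find_net_starts(sample):
        print(f"line {line_number}: <{tag_name} /> may start a NET here")
```

Run over the whole file, a listing like this would point at the earlier line as well as the one where the validator finally complained.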
Start tag: required, End tag: forbidden

It doesn't say anything about empty tag. Allowed or forbidden??

9.3.4 Preformatted text: The PRE element

it doesn't say how to suppress the blank line at the end of each PRE element.
Also this entire section doesn't say which doctype it's applicable to.
For example, is it applicable to "transitional" or not?

Actually it looks like you have provided no links to HTML/XHTML
transitional at all.
XHTML: <br/> and <hr/> or <hr></hr>
http://www.w3.org/TR/xhtml1/#h-4.6

CORRECT: terminated empty elements
<br/><hr/>
INCORRECT: unterminated empty elements
<br><hr>
I originally tried <br/> (no space between br and /), but the
browser barfed on it, so I tried inserting the blank, and then it
worked fine. By the way, here are the homework assignments for that
"Web Design" class where I learned how to do "transitional" Web
pages: <http://www.rawbw.com/~rem/CIS89a/>
Those pages are littered with <br /> all over the place, and the
instructor who looked at all the pages (both source and as
displayed) never made one complaint about any of those empty-br
elements. By the way, she never mentioned the validation service,
and in fact those class assignments fail validation, so obviously
she didn't run the validator on our class assignments or she would
have noticed our grossly invalid HTML/XHTML transitional code, so
more and more I suspect the instructor was totally incompetent, was
just faking her way through teaching her first class in America
after migrating from India.
XHTML Appendix C: <br /> and <hr />
http://www.w3.org/TR/xhtml1/#C_2

Include a space before the trailing / and > of empty elements, e.g.
<br />, <hr /> and <img src="karen.jpg" alt="Karen" />. Also, use the
minimized tag syntax for empty elements, e.g. <br />, as the
alternative syntax <br></br> allowed by XML gives uncertain results in
many existing user agents.

Well, per that, I was doing exactly the right thing!!
Why are you all complaining???

C.3. Element Minimization and Empty Element Content
Given an empty instance of an element whose content model is not EMPTY
(for example, an empty title or paragraph) do not use the minimized
form (e.g. use <p> </p> and not <p />).

it *is* a start-tag *itself*, but it enables a null end tag.

Ah, thanks for clearing up my confusion on the jargon usage.
I don't think he used any characters which the VT100 would mangle, ...

So why does he throw in the red herring of specifying his MIME
charset as that special Windows thing I never heard of before his
post?? (I wouldn't have noticed it except that particular header
field was spread over about five physical lines, causing the most
important part of the header way down at the bottom to overflow to
the next screen so I had to go to extra work to copy&paste it in
with the rest of the header that was near the top of the first
screen. When I went back to see what was taking up so fucking much
space, I found that charset red herring.)
If this is exactly what she said (i.e., that's *all* she said
about it, dogmatically), then she left out some important info.

Well her heavy Indian accent made it hard to understand her, and
it's *possible* she in fact does not know American English sentence
structure well enough to even formulate the correct sequence of
words to express what she had in mind. Or, given that she wasn't
even aware of the validator, or the extra header needed so it
wouldn't have to *guess* the charset we're using, she might just
have been incompetent at the topic she was "teaching".
Transitional is for legacy documents not yet cleaned up to
validate as strict.

She wanted all our *new* documents to be transitional, so they'd
still "work" in existing Web browsers, yet they'd already work in
future XML-based Web browsers. It had nothing to do with any legacy
documents. If you look at the class assignments (URL earlier
above), one very early assignment was to make a template which
would then be copied as the starting point for all future
assignments which would all be brand-new from-scratch (well
from-template) pages. Please take a look at that template and tell
me if there's anything wrong with using it for brand-new documents.
I don't know when you had your course,

Like I said earlier in this thread (but it's easy to overlook): 2.5
years ago. The exact dates are given in the headers of each
homework assignment.
but there *was* a time when specifying a transitional doctype was
good advice (or at least, widely recommended; Jukka or BTS or
somebody will have the details).

When was that?

By the way, slightly related topic: The next semester after that,
one of my instructors gave me his old laptop, and sometime later I
decided to try my hand at JavaScript on it, got my first demo
working, uploaded it to my Web site, and tried it from the public
library, and it didn't work. Turns out JavaScript has suffered an
incompatible change, such that it's impossible to write a script
that performs certain stuff and works in both the old version (on
my laptop) and the new version (elsewhere). If you're curious:
<http://groups.google.com/group/comp.lang.javascript/msg/ecf27d895294dd1a>
= Message-ID: said:
For *new* pages, or pages you are actively maintaining, go with HTML
4.01 strict, unless you have a bona fide reason to use XHTML.

http://groups.google.com/group/alt.support.diet/msg/7570305b77c13cc7
= Message-ID: <260820030806550860%[email protected]>

I try to please everyone and I end up carrying a donkey on my back.
I give up. There's no way to write Web pages that are acceptable.
Sure they are. You're letting yourself get confused.

Well if you think you can find a way to use a br or hr without
violating one of these rules:
Start tag: required, End tag: forbidden
INCORRECT: unterminated empty elements <br><hr>
Feel free to tell how to use BR or HR in HTML/XHTML transitional documents.
 
Dan

I had to be careful *never* to click on any link to anything that
might have images and take another 20 minutes to download. Of
course I cancelled the service before the free month was finished,
but AT&T insisted on billing me for the second month, and refused
to totally retract the charges, and I refused to pay even the
reduced amount, so they cut off my long distance service, which has
remained cut off to this day. Anyway, I've been text-only at home
ever since.

So you had a bad experience with an ISP nine years ago, and because of
that you refuse to use anything but a plain-text shell account ever
since? That seems rather ridiculous. Why not try a different ISP?
And maybe the slowness you report was because your computer was
antiquated even nine years ago; if you got a more modern machine, and
a high-bandwidth Internet connection, you'd have a much richer
experience.
 
Dan

Feel free to tell how to use BR or HR in HTML/XHTML transitional documents.

Well, first thing, you've got to decide whether you're using HTML or
XHTML, and give your document the appropriate doctype. Then make
consistent use of the appropriate syntax for your chosen style; <br>
and <hr> for HTML, or <br /> or <hr /> for XHTML. It validates fine
that way, and works in browsers from Lynx to Firefox.
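
The "pick one doctype, then stay consistent" advice above can be checked mechanically. What follows is a minimal sketch in Python under loud assumptions: the function names `doctype_family` and `mismatched_empty_tags` are hypothetical, the list of empty elements is abbreviated, the doctype is classified only by whether it mentions "XHTML", and a regular expression stands in for a real parser. It is not a substitute for the validator.

```python
import re

# A few common empty elements; the list is deliberately incomplete.
VOID_TAG = re.compile(
    r"<(br|hr|img|meta|link|input)\b[^<>]*?(/?)>", re.IGNORECASE
)


def doctype_family(html_text):
    """Guess 'xhtml' or 'html' from the doctype declaration, else None."""
    match = re.search(r"<!DOCTYPE[^>]*>", html_text, re.IGNORECASE)
    if match is None:
        return None
    return "xhtml" if "XHTML" in match.group(0).upper() else "html"


def mismatched_empty_tags(html_text):
    """List (line_number, tag_text) where the tag style fights the doctype."""
    family = doctype_family(html_text)
    problems = []
    for line_number, line in enumerate(html_text.splitlines(), start=1):
        for match in VOID_TAG.finditer(line):
            self_closed = match.group(2) == "/"
            # HTML wants <br>; XHTML wants <br />.
            if (family == "html" and self_closed) or (
                family == "xhtml" and not self_closed
            ):
                problems.append((line_number, match.group(0)))
    return problems
```

With an HTML 4.01 doctype this flags every `<br />`, and with an XHTML doctype it flags every bare `<br>`; a page with no doctype at all is left alone because there is nothing to be consistent with.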
 
John Hosking

robert said:
Start tag: required, End tag: forbidden

It doesn't say anything about empty tag. Allowed or forbidden??

Required. You have to read the whole section. Here's some possible help
with making out the DTD bits:
http://www.w3.org/TR/REC-html40/intro/sgmltut.html#h-3.3

The "EMPTY" keyword means that this type must not have content.
So it's <br> in HTML 4.01. You may add certain attributes, but the
element starts with a tag beginning with said:
9.3.4 Preformatted text: The PRE element

it doesn't say how to suppress the blank line at the end of each PRE element.

That's not the standard's job. Anyway, the choice of adding space after
an element or not is the browser's, not the W3C's.
Also this entire section doesn't say which doctype it's applicable to.
For example, is it applicable to "transitional" or not?

Yes. It applies to both strict and loose.
Actually it looks like you have provided no links to HTML/XHTML
transitional at all.

Sorry. I guess I'm done transitioning. But then again, *I* never
attended De Anza College. ;-)


Include a space before the trailing / and > of empty elements, e.g.
<br />, <hr /> and <img src="karen.jpg" alt="Karen" />. Also, use the
minimized tag syntax for empty elements, e.g. <br />, as the
alternative syntax <br></br> allowed by XML gives uncertain results in
many existing user agents.

Well, per that, I was doing exactly the right thing!!
Why are you all complaining???

Well, the thing is, nobody much likes Appendix C. I think that's because
it relies on browser weaknesses (and then *formalizes* it by adding it
to the W3C docs). For strict conformance, it's best to stay away from
Appendix C, which means not trying to serve XHTML to IE, which means
staying with HTML 4.01. And strictly speaking, <br /> is not valid HTML.

I mentioned Appendix C because it exists, and provides a weasel-way out
in certain circumstances. I don't like it, want it, trust it, or fully
understand the implications of it, but it *is*.

Anyway, I thought you were trying to do _HTML_ transitional everywhere.

Hmm, I've started doing <p></p> instead, i.e. no space between the
opening and closing paragraph delimiters. Is that wrong??

Don't know. Anyone else?
I *do* know an empty paragraph is a pretty weird thing. I've never had a
need for one on any of my pages, at least not since I learned to achieve
vertical spacing in other ways than extraneous said:
She wanted all our *new* documents to be transitional, so they'd
still "work" in existing Web browsers, yet they'd already work in
future XML-based Web browsers.

In 2004, that's silly.
It had nothing to do with any legacy
documents. If you look at the class assignments (URL earlier
above), one very early assignment was to make a template which
would then be copied as the starting point for all future
assignments which would all be brand-new from-scratch (well
from-template) pages. Please take a look at that template and tell
me if there's anything wrong with using it for brand-new documents.

Here's one from Beauregard T. Shagnasty you might prefer:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
<html lang="en">
<head>
<title>My Template</title>
<meta http-equiv="content-style-type" content="text/css">
<meta http-equiv="language" content="english">
<meta http-equiv="dialect" content="us">
<meta http-equiv="window-target" content="_top">
<meta name="author" content="<Your name>">
<meta name="description" content="<Your description>">
<style type="text/css" media="screen">@import "yourcssfile.css";
</style>
<link rel="shortcut icon" href="favicon.ico" type="image/x-icon">
<link rel="icon" href="favicon.ico" type="image/x-icon">
</head>

<body>
<div id="logo">
[Insert your logo code here]
</div>

<div id="content"><!-- Content section -->
[Each individual page's content goes here]
</div> <!-- End of Content -->

<div id="nav"> <!-- Begin menu -->
[Your menu code goes here]
</div> <!-- End menu -->

<div id="footer">
<p>Copyright &copy;2007 Your Name Here. All rights reserved.</p>
</div>
</body>
</html>

Now save it as "template.html" (remember to change the <your name>
stuff). I'm not sure about the "window-target" meta tag so find out
before you use it; maybe you can leave it out. Otherwise I didn't change
his template much.

When was that?

Oh, late 1990s?

Well if you think you can find a way to use a br or hr without
violating one of these rules:
Start tag: required, End tag: forbidden
INCORRECT: unterminated empty elements <br><hr>
Feel free to tell how to use BR or HR in HTML/XHTML transitional documents.

No, I'll tell you to use <br> and <hr> in HTML 4.01 Strict.

If you _must_ use XHTML, use <br /> and <hr /> if you're targeting IE6
(you said you weren't, really), because IE6 doesn't handle real XHTML
served as XHTML (this is the Appendix C fudge).

If you need to use XHTML and can serve it as XHTML to XHTML-capable
browsers, then use <br/> and <hr/> or <hr></hr>.

Enough for me for now; must sleep. At least I've found somebody even
more verbose than myself. ;-) Can't wait to see what Toby comes up with
for your totals next Sunday.
 
robert maas, see http://tinyurl.com/uh3t

From: "Steve Pugh said:
First you need to ask yourself if you are writing XHTML or HTML.

Both. The instructor in the "Web Design" class required us to write
HTML/XHTML transitional, so it'd work in existing HTML-based Web
browsers, but also would continue to work with future XML-based Web
browsers. More recently somebody said that's impossible to achieve,
the instructor lied. I'm starting to believe the latter.
Where and how you close elements varies between the two.

The instructor insisted we write all our Web pages to be consistent
with both. For example, we can't just say <p> between paragraphs.
So you must be 100% consistent to the rules of whichever one you
are using. Starting with including either an HTML or XHTML doctype.

The instructor insisted we start every document like this:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3c.org/TR/html4/loose.dtd">

<html>
<!-- Author: Robert Maas -->
<!-- Date started: 2004.Jun.28 -->
<!-- Just a first test of writing XML-complient HTML file, but with HR -->

<head>
<title>Assignment 1, a simple template
</title>
</head>

<body>

(where the title may vary between various Web pages, but the
doctype and URL for dtd are the same for all documents)

Every document must end like this:
</body>

</html>

However a year or so later somebody complained that my Web pages
done in that way weren't valid HTML because they failed the W3C
validator, so I tried the validator on one of the newer Web pages,
followed the recommendation given by it, and so for more recent
documents I start them this way instead:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3c.org/TR/html4/loose.dtd">

<html>
<!-- Author: Robert Maas -->
<!-- Date started: 2004.Aug.24 -->
<!-- Linking to the various Hello World! etc. I've written/collected -->

<head>
<meta http-equiv="Content-Type" content="text/html;charset=us-ascii">
<title>Beyond "Hello World"</title>
</head>

<body>

(Again, the actual title varies, but the doctype and URL for dtd
are the same as before, and now the meta http-equiv has been added
and that is now a constant for all new Web pages I write, with
very rare exceptions where I'm trying to absolutely minimize
number of bytes in the file, such as some pages designed for
cellphones where the user is billed in units of 1k bytes so I want
to cram as much as possible into a 1k download.)
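
Since the meta element above promises us-ascii, it may be worth checking that a page really contains nothing else before uploading it. A minimal sketch in Python, assuming the page has been read as raw bytes; the function name `non_ascii_positions` is hypothetical and not part of any validator:

```python
def non_ascii_positions(data: bytes):
    """Return (line_number, column, byte_value) for each byte above 0x7F."""
    positions = []
    for line_number, line in enumerate(data.split(b"\n"), start=1):
        # Iterating over a bytes object yields integers in Python 3.
        for column, byte_value in enumerate(line, start=1):
            if byte_value > 0x7F:
                positions.append((line_number, column, byte_value))
    return positions
```

An empty result means every byte is in the US-ASCII range, so the declared charset at least does not contradict the file's contents.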
In XHTML use <br /> and in HTML use <br>. That's it. That's how to
create a valid and working line break in the two languages.

And in transitional HTML/XHTML I can't do either?
How can I write Web pages that work with old browsers, text-only
browsers, brand-new XML-based browsers, etc., rather than work with
this browser but not with that other browser?
 
robert maas, see http://tinyurl.com/uh3t

From: "Steve Pugh said:
Correct.
It has a different (but valid) use in HTML. However that use has never
been supported by browsers. It should be avoided in HTML for a mix of
practical and technical reasons.

OK, so you're saying I should avoid that syntax.
The only thing that comes close to a transition between HTML and
XHTML is Appendix C of the XHTML spec which recommends the use of
<br /> for empty elements ...

So that appendix says to go ahead and use <br />.
Since you disagree on that point, I assume you believe that
appendix is full of shit and should be ignored.

At this point I believe the class instructor lied, there's no such
thing as Web pages that are HTML/XHTML transitional, and I should
stop trying to write any such pages.

What doctype/dtd/meta combination do you recommend for Web pages
that work with old browsers such as NetScape Navigator, text-only
browsers such as lynx, modern fullfledged browsers such as Mozilla
Fire???, and new XML-based browsers? (Never mind IE. Nobody who
cares about security ever uses IE except on somebody else's
computer where they don't care if the OS is destroyed.)
 
robert maas, see http://tinyurl.com/uh3t

From: "Jukka K. Korpela said:
If you use a speech browser, or switch off CSS support, or use a
text-only browser, you must be interested in the content of pages
only and not their graphic excellence.

That's a snide way of implying that you're better than me because
you have enough money to afford a newer computer that is capable of
fullfledged direct InterNet access, so because you're better than
me you can make derogatory remarks about me without it being
"wrong".

I use a text-only browser because it's the **only** browser
available to me. So **** off with your stupid remarks about my
motivation for using a text-only browser.
That's simply wrong advice.

OK, we're in agreement that the instructor was incompetent to teach
the class she taught. Now what alternative advice do you have for me
now? What doctype/dtd/meta combo do you recommend I use, and where
can I find a list of warnings about what *not* to do and *how* to
do common things when using your recommended combo?
 
Steve Pugh


That's impossible.
The instructor in the "Web Design" class required us to write
HTML/XHTML transitional, so it'd work in existing HTML-based Web
browsers, but also would continue to work with future XML-based Web
browsers. More recently somebody said that's impossible to achieve,
the instructor lied. I'm starting to believe the latter.

Either you misunderstood or the instructor was unclear.
The instructor insisted we write all our Web pages to be consistent
with both. For example, we can't just say <p> between paragraphs.
Instead we must say <p> at start of paragraph and </p> at end of
paragraph. Doesn't that work in both HTML and XHTML??

That is fine in both.

In HTML </p> is optional. In XHTML it is mandatory. So including it is
okay in both.

But there are other cases where the two differ in less compatible ways.
The instructor insisted we start every document like this:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3c.org/TR/html4/loose.dtd">

So you are writing HTML. So use HTML syntax.
And in transitional HTML/XHTML I can't do either?

If you are using the above doctype then you are writing HTML and so
you should use <br>. End of story. You can make your HTML slightly
more XHTML-ish by including all closing tags for non-empty elements
and by quoting all attribute values, but you are still writing HTML so
you must use HTML syntax for empty elements.
How can I write Web pages that work with old browsers, text-only
browsers, brand-new XML-based browsers, etc., rather than work with
this browser but not with that other browser?

Write valid HTML 4.01 or XHTML 1.0. 99% of all browsers will cope just
fine with both, now and in the foreseeable future.

Steve
 
