<q> and language-specific quotation marks

Stan Brown · Oct 14, 2003

comp.infosystems.www.authoring.html, Bertilo Wennergren

Stan Brown:

[stripped attribution restored]

Let's say you want to do this:
quote {
font-style: italic;
}
You can of course add a meaningless "span" to your unstylable piece of
naked text, but wouldn't a meaningful element be better?

No, I don't think so. More precisely, I don't think such a styling
in <quote> or <span> is ever appropriate. Maybe in other languages
things are different, but AFAIK in English inline quotes are not
styled differently from regular text.

--
Stan Brown, Oak Road Systems, Cortland County, New York, USA
http://OakRoadSystems.com/
HTML 4.01 spec: http://www.w3.org/TR/html401/
validator: http://validator.w3.org/
CSS 2 spec: http://www.w3.org/TR/REC-CSS2/
2.1 changes: http://www.w3.org/TR/CSS21/changes.html
validator: http://jigsaw.w3.org/css-validator/

Bertilo Wennergren · Oct 14, 2003

Stan Brown:

No, I don't think so. More precisely, I don't think such a styling
in <quote> or <span> is ever appropriate. Maybe in other languages
things are different, but AFAIK in English inline quotes are not
styled differently from regular text.

That seems to rule out lots of use of styling. E.g.:

<strong>whatever</strong>

strong {
color: red;
background-color: white;
}

Never style a "span"? Never put a special background color on
a piece of text (logically marked up), because you don't do
that kind of thing in ordinary printed text?

And what about using a special voice to read quotes in
voice browsers?

And how do we do this without using some mark-up?

Confucius said:
<quote cite="http://www.wisewords.org/confucius.html#bla">
"Bla, bla."
</quote>
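
As a rough illustration of the voice-browser point above: CSS 2 includes an aural style sheet section with properties such as voice-family and pitch. The sketch below assumes a user agent that actually implements the "aural" media type, which nothing in this thread confirms. It also uses HTML 4's <q> instead of the proposed <quote> element, and the sentence is placeholder text.

```html
<!-- Sketch only: assumes a user agent that honours CSS 2 aural style sheets. -->
<style type="text/css" media="aural">
  /* Read inline quotations in a different voice and at a higher pitch,
     so the quotation is audible without spoken punctuation. */
  q {
    voice-family: female;
    pitch: high;
  }
</style>

<p>Confucius said: <q lang="en">Bla, bla.</q></p>
```

Without the element there would be nothing for the selector to attach to, which is the argument being made for keeping some markup around the quotation.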

Stan Brown · Oct 14, 2003

comp.infosystems.www.authoring.html, Bertilo Wennergren

(Once again, I have restored the attribution that you stripped out.
Please do not put other people's words in my mouth. How would you
like it if I used your name on a quote that you disagree with?)

Stan Brown:

That seems to rule out lots of use of styling. E.g.:

<strong>whatever</strong>

strong {
color: red;
background-color: white;
}

Never style a "span"?

That is not what I said. I said I don't think that styling a quote
in italics is appropriate, whether you use <quote> or <span> to do
it.

Of course there are all sorts of uses for styling <span>. But I
don't believe -- and again, somebody might well post a
counterexample to educate me -- I don't believe that it is ever
appropriate to style an inline quote in English differently from its
surrounding text.

Toby A Inkster · Oct 14, 2003

Stan said:
(Once again, I have restored the attribution that you stripped out.
Please do not put other people's words in my mouth. How would you
like it if I used your name on a quote that you disagree with?)

Can anyone see the irony?

Stan Brown · Oct 15, 2003

comp.infosystems.www.authoring.html, Toby A Inkster

Can anyone see the irony?

You mean, because you use an obviously fake address and do it in
the wrong way?

The alternatives are to use what information, poor as it is, you
give, or to falsely attribute(*) your words, which I do not agree
with, to me.

(*) Yes, I know. See Fowler at "split infinitive".

Micah Cowan · Oct 15, 2003

Stan Brown said:
comp.infosystems.www.authoring.html, Bertilo Wennergren

Stan Brown:

[stripped attribution restored]

Let's say you want to do this:
quote {
font-style: italic;
}
You can of course add a meaningless "span" to your unstylable piece of
naked text, but wouldn't a meaningful element be better?

No, I don't think so. More precisely, I don't think such a styling
in <quote> or <span> is ever appropriate. Maybe in other languages
things are different, but AFAIK in English inline quotes are not
styled differently from regular text.

Not true. I have read many books where inline quotes were
indicated by italics. Sometimes this was 100% consistent; at
other times, even in modern books, italics are used specifically
to indicate "thought"-quotes. In this case a "class" attribute
might be appropriate.

However, I agree that <quote xml:lang="en">"Hello"</quote> has no
advantage over "Hello" ... but <quote xml:lang="en">Hello</quote>
does. In particular, it places the burden of having to remember
what level of quotations to use upon the user agent, instead of
the author; and it allows you to easily quote a section of text
that contains a quote: you merely wrap another <quote/> around it
without having to change double-quotes to singles, etc.

-Micah
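
A minimal sketch of the nesting point, using the CSS 2 "quotes" property with generated content. It is written with HTML 4's <q> instead of <quote>, the sample sentence is invented, and it assumes a user agent that implements generated content and nested quotes (elsewhere in this thread it is said that IE ignores generated content and that Mozilla does not handle nested quotes).

```html
<style type="text/css">
  /* First pair: marks for the outermost quotation.
     Second pair: marks for a quotation nested one level inside it. */
  q { quotes: '"' '"' "'" "'"; }
  q:before { content: open-quote; }
  q:after  { content: close-quote; }
</style>

<!-- The author types no quotation marks at all. -->
<p>She wrote: <q>He said <q>Hello</q> and left.</q></p>
```

A conforming user agent would be expected to put double marks around the outer quotation and single marks around the inner one. Wrapping the whole sentence in yet another <q> needs no changes to the marks inside it.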

Micah Cowan · Oct 15, 2003

Stan Brown said:
comp.infosystems.www.authoring.html, Bertilo Wennergren

(Once again, I have restored the attribution that you stripped out.
Please do not put other people's words in my mouth. How would you
like it if I used your name on a quote that you disagree with?)

That is not what I said. I said I don't think that styling a quote
in italics is appropriate, whether you use <quote> or <span> to do
it.

Yes it is, though it may not have been what you meant. Read your
quote above again.

-Micah

Micah Cowan · Oct 15, 2003

Stan Brown said:
comp.infosystems.www.authoring.html, Toby A Inkster

You mean, because you use an obviously fake address and do it in
the wrong way?

No, because you attribute to Toby A Inkster a quote which in fact
originated from Bertilo Wennergren, whilst you complain about
incorrect attributions to you (which I can't seem to find -- the
quoting levels make it entirely obvious which quotes are yours
and which aren't--though I'll agree that *all* levels should have
been properly attributed for absolute clarity). It's pretty
ironic to me, too.

This seems very much like the stereotypical spelling or grammar
corrections, which are of course obliged to contain at least one
such error in the complaint.

-Micah

Micah Cowan · Oct 15, 2003

Andreas Prilop said:
Young boy!

(Sorry I didn't obey the Followup-To header; mainly because I
don't understand why it broke the cross-post, and because I do
not read the c.i.w.a.h [yet]).

Yeah, you're right: I'm mistaken (for some reason, I'd thought
they were included in the entities for 3.2; obviously
not). However, every *current* browser I've seen supports them,
and the character reference equivalents (to which I frequently
convert these through postprocessing) are supported by the
previous generation of browsers. I don't encounter too many
people still surfing with browsers much older than that, so am
not too concerned; especially since those browsers would have
bigger issues with other standard-conformant but not
backwards-compatible facilities I frequently use.

-Micah

Micah Cowan · Oct 15, 2003

Jukka K. Korpela said:
Thanks for a good laugh. Seriously, you haven't actually studied the
HTML specification much if you think that it really sticks to RFC
language.

I'd be much happier if you'd actually produce some quotes from
the spec to back up that statement, rather than just haughtily
assert that my notion is laughable. In particular, I can't think
of many instances in which you can prove that a specification
didn't mean "must" where it says "must", and "should" where it
says "should". And, in *particular*, I see no reason why you
should not interpret the "should" in 9.2.2, which we were
discussing, in accordance with the RFC language--especially since
the spec itself tells you to. After all, if you can't treat a
spec as law, then what good is a spec? Better to buy a book that
teaches you all sorts of non-conformant but "de facto standard"
extensions and base your code on that :-(

I'll concede the point about requiring CSS; so I will modify my
previous assertion to "you are *supposed* to use style sheets to
indicate your preferences for the handling of <q>"; there, does
that make you happier?

For my part, the mere fact that the W3C recommends their use
instead of typing quotation marks directly (see, e.g., checkpoint
3.7 of the WCAG) gives me pause to dismiss them, and a comparison
of the relative advantages/disadvantages to using typed-in
quotation marks causes me to conclude that it would not be a poor
practice to prefer to use <q> (were it not for the fact that it
is not well-supported by a certain browser with very large
market-share). This doesn't stop me from writing my DocBook XML
stuff using the <quote> element, though; and the major advantage
to this is that I can choose to convert these to XHTML <q> in my
XSLT stylesheets once they are well-supported in the mainstream;
but convert them to suitable quote-mark characters in the
meantime.

The only actual disadvantage to <q> that I can see is in the case
of long quotations containing actual paragraph breaks (for
example, in conversations), since each new paragraph should begin
with opening quote-marks; but in this case, the quote doesn't
really fit the qualification of "inline quote".

-Micah

Alan J. Flavell · Oct 15, 2003

Andreas Prilop said:
Andreas Prilop said:

Young boy!

[...]
Yeah, you're right: I'm mistaken (for some reason, I'd thought
they were included in the entities for 3.2; obviously
not). However, every *current* browser I've seen supports them,

fair comment, but:

and the character reference equivalents (to which I frequently
convert these through postprocessing) are supported by the
previous generation of browsers.

So it seems you have no need to advocate use of the entity names!
While the difference in coverage can now be considered quite small,
and some would deem it no longer of any significance, the fact is that
one does get somewhat wider coverage with &#number; notation for
these characters, than with the &entityname; notations which HTML4
defined.

The only other consideration I could think of is that some of the
entity names, such as &trade;, &Omega; etc. are immediately intuitive
(if somewhat messy) if the browser doesn't understand them - and
therefore displays them as coded. Whereas browsers that are too old
(or incomplete - see WebTV) to understand &#number; notation are
liable to display something incomprehensible and/or silly.

But by now I wouldn't consider that to be a substantive argument. If
folks choose to use old or incomplete software, I'm willing to go some
way - as far as it doesn't disadvantage other users - to maintaining
compatibility, but I see no call for heroic measures.
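
For concreteness, the two notations compared above look like this for the curly double quotation marks. The entity names ldquo and rdquo are defined by HTML 4; 8220 and 8221 are the decimal forms of U+201C and U+201D. The sample text is made up for illustration.

```html
<!-- Named entity references (defined in HTML 4) -->
<p>&ldquo;Hello&rdquo; written with entity names</p>

<!-- Decimal numeric character references for the same two characters
     (U+201C and U+201D) -->
<p>&#8220;Hello&#8221; written with numeric references</p>
```

Both paragraphs should display identically in a browser that understands both notations. The difference only shows up in software that knows one form and not the other.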

Toby A Inkster · Oct 15, 2003

Stan said:
You mean, because you use an obviously fake address and do it in
the wrong way?

No, because you complain about misattributing quotes and in the same post
misattribute a quote which Bertilo said to me.

And the address is real.

Stan Brown · Oct 16, 2003

comp.infosystems.www.authoring.html said:
Yes it is, though it may not have been what you meant. Read your
quote above again.

I have done so, and I stand by what I said.

Stan Brown · Oct 16, 2003

comp.infosystems.www.authoring.html, Toby A Inkster

No, because you complain about misattributing quotes and in the same post
misattribute a quote which Bertilo said to me.

I have tried and failed to figure out who said what -- because
Bertilo persisted in snipping attributions and I was working from
his article, which I replied to.

And the address is real.

Yeah, right.

Toby A Inkster · Oct 16, 2003

Stan said:
Yeah, right.

If you don't believe me, send an e-mail.

Micah Cowan · Oct 16, 2003

Alan J. Flavell said:
Andreas Prilop said:

Every browser I've seen supports &ldquo;, &rdquo;,

Young boy!

[...]
Yeah, you're right: I'm mistaken (for some reason, I'd thought
they were included in the entities for 3.2; obviously
not). However, every *current* browser I've seen supports them,

fair comment, but:

and the character reference equivalents (to which I frequently
convert these through postprocessing) are supported by the
previous generation of browsers.

So it seems you have no need to advocate use of the entity names!
While the difference in coverage can now be considered quite small,
and some would deem it no longer of any significance, the fact is that
one does get somewhat wider coverage with &#number; notation for
these characters, than with the &entityname; notations which HTML4
defined.

Yes, and I do try to prefer the character references. However,
the entity names are much easier to remember (for me), and I tend
to hand-write code using those. If I'm concerned about coverage,
I will post-process.

Of course, there are many much older entity names which I have no
problem with keeping in my production HTML, because they've been
in the HTML standards from the beginning (though IIRC, they
weren't *required* in 2.0).

The only other consideration I could think of is that some of the
entity names, such as &trade;, &Omega; etc. are immediately intuitive
(if somewhat messy) if the browser doesn't understand them - and
therefore displays them as coded. Whereas browsers that are too old
(or incomplete - see WebTV) to understand &#number; notation are
liable to display something incomprehensible and/or silly.

Weren't character references *always* a part of SGML/HTML? I
realize that doesn't do anything for broken browsers, but "too
old"? (I'm afraid that I am a bit young to remember much about
Mosaic and that generation...)

But by now I wouldn't consider that to be a substantive argument. If
folks choose to use old or incomplete software, I'm willing to go some
way - as far as it doesn't disadvantage other users - to maintaining
compatibility, but I see no call for heroic measures.

Agreed here.

-Micah

Alan J. Flavell · Oct 16, 2003

Of course, there are many much older entity names which I have no
problem with keeping in my production HTML, because they've been
in the HTML standards from the beginning (though IIRC, they
weren't *required* in 2.0).

May I congratulate you on the accuracy of your historical detail?

In fact, Netscape were so late[1] in finally implementing those
character entity names that had been proposed in HTML/2.0/RFC1866,
that for a while the entities were on the verge of being taken out of
HTML3.2; but they finally made it.

[1] long after other browsers of the time had implemented them, I
mean.

(And history repeated itself with HTML4 and NN4.*, as you're clearly
aware).

Weren't character references *always* a part of SGML/HTML?

Oh yes, but in SGML the number which appears in the &#number; notation
relates to the "Document Character Set", and, up until RFC2070 set out
the plan for HTML to use iso-10646/Unicode as the "Document Character
Set" regardless of what external coding was in use (that
misleadingly-named "charset" parameter from the MIME specification),
there were many browsers which behaved as if the document character
set contained only 256 code points (8-bit character set) and was
aligned to whatever external character coding was in use.

WebTV still seems to behave on that basis (except that, judging from
its developer viewer, at least, there's only one character set and
coding which it implements: a somewhat stripped-down version of
Windows-1252).

Indeed, for an 8-bit character architecture it was the obvious thing
to do, in the short term, because basically the browser developer just
fed the 8-bit characters to the system's existing display routines:
it needed a lot more work (and understanding and expertise) to support
Unicode, back when operating systems themselves had no support and
browser authors had to spin their own as part of the browser design (I
think for example of Alis Tango browser, which implemented Unicode -
but ran on Windows/3.x OS, that had no Unicode support itself).

But now it's all coming together - in various ways, but all supporting
the same underlying concept (the black-box character model described
in RFC2070).

cheers

Jukka K. Korpela · Oct 17, 2003

Micah Cowan said:
I'd be much happier if you'd actually produce some quotes from
the spec to back up that statement, rather than just haughtily
assert that my notion is laughable.

I'm sorry, but it really is a matter of experience with reading the
HTML 4 specifications. If you read them for a while, with your mind
oriented towards considering exactness and conformance to rules and
specifications, you will soon realize that they are loose prose, rather
than an exact specification. I'll skip the question whether the
paragraph mentioned even tries to be _formal_, so let's just discuss
exactness (which can be formalized or non-formalized).

In section 4, http://www.w3.org/TR/html4/conform.html , the spec says:
"The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [RFC2119]. However, for
readability, these words do not appear in all uppercase letters in this
specification."
The spec uses the odd, non-RFC-style phrase "we recommend that" too.
Should it be read as being identical with "it is recommended that"?
Even if we, rather unnaturally, assume that the answer is negative, you
will find lots of occasions where those key words are used so that they
hardly carry their RFC meanings. For example, B.1 Notes on invalid
documents http://www.w3.org/TR/html4/appendix/notes.html#h-B.1
plays with those words, yet begins with a nullifying disclaimer:
"The following notes are informative, not normative. Despite the
appearance of words such as "must" and "should", all requirements in
this section appear elsewhere in the specification."
What's the idea of using those words at all in an _informative_
appendix? And if all the requirements in the appendix (which is OTOH
said to be informative, i.e. contain no requirements) appear elsewhere,
then we apparently need to read between the lines a lot.

OK, enough said. The HTML 4 specification is nowhere near an exact
specification, and doesn't even really try. (And it contains some
formalized parts, mainly the DTDs and DTD fragments, but these in turn
are to be taken with quite some salt - as we know, HTML wasn't _really_
meant to be an SGML application.)

I'll concede the point about requiring CSS; so I will modify my
previous assertion to "you are *supposed* to use style sheets to
indicate your preferences for the handling of <q>"; there, does
that make you happier?

No, not at all. The theory behind <q> says that browsers should use
language-specific quotation marks, and elsewhere the specification says
quite a lot (though sloppily formulated) about language markup, so if
the spec *supposes* something in this respect, it's this: you're
supposed to use lang attributes, and browsers are supposed to know and
apply the rules of different languages. (CSS might conceivably be used
to express the author's preference on presentational details when a
language has several alternative styles of using quotation marks, as
some languages have.)
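
A sketch of how an author-side preference of that kind could be written in CSS 2, using the :lang() pseudo-class together with the "quotes" property. The choice of English and German and the sample sentences are assumptions made for the illustration, and the sketch presumes a user agent that implements :lang() and generated content.

```html
<style type="text/css">
  /* English: curly double marks, then curly single marks when nested. */
  q:lang(en) { quotes: "\201C" "\201D" "\2018" "\2019"; }
  /* German: low-9 opening mark, high closing mark, at both levels. */
  q:lang(de) { quotes: "\201E" "\201C" "\201A" "\2018"; }

  q:before { content: open-quote; }
  q:after  { content: close-quote; }
</style>

<!-- The q elements inherit their language from the enclosing paragraph. -->
<p lang="en">He said <q>Hello</q>.</p>
<p lang="de">Er sagte <q>Hallo</q>.</p>
```

This only works when the lang attributes are present, which fits the point that language markup comes first and the style sheet merely selects among the conventions of that language.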

For my part, the mere fact that the W3C recommends their use
instead of typing quotation marks directly (see, e.g., checkpoint
3.7 of the WCAG) gives me pause to dismiss them,

Thanks for pointing that out. I'm currently writing some practical
instructions on applying WCAG, and I need to remember to mention that
"checkpoint" 3.7 is mostly nonsense and conflicts even with the way W3C
is going. - Using blockquote for block quotations is OK but hardly
sufficient - speech browsers don't do indents, and neither do most of
them express blockquote markup in any other way, so for accessibility,
the author needs to word his document so that by merely reading it
aloud, it is sufficiently obvious which parts are quoted and which are
not. Well, someone might say that if you do this for inline quotations
too, _then_ you can use <q> markup. You can use it when you don't need
it. But it's better to use quotation marks even then, since _they_ give
an additional hint to all people who see the document.

This doesn't stop me from writing my DocBook XML
stuff using the <quote> element, though;

Naturally you can use any suitable approach there.

and the major advantage
to this is that I can choose to convert these to XHTML <q> in my
XSLT stylesheets once they are well-supported in the mainstream;

That can hardly be a major advantage, since that time will never come.
The <q> element is to be removed sooner than any major changes take
place on the browser front. But maybe you can use the <quote> element.
In fact you could in practice do it right away, since browsers are
expected to ignore tags that they don't recognize.

Vincent · Oct 19, 2003

Jukka said:
from recognizing strings delimited by quotation marks. (One might
consider treating <blockquote> as indicating quotation, but abuse is so
widespread that this would not be pragmatically wise.)

Speaking about <blockquote>...

I've been reading this thread with interest, but I still can't make up
my mind about how to use <q> and <blockquote> (or <quote> and
<blockquote> in XHTML 2.0).

Indeed, short and long quotations should be surrounded by quotation
marks in some way. These quotation marks can either be included directly
in the (X)HTML document, or be rendered via stylesheets.

In my opinion, it would be more consistent to use one single method for
both <q> (or <quote>) and <blockquote>, i.e. directly in the content, or
via style.

We know that the style choice is not really usable as of today: IE will
ignore generated content, Mozilla is unable to handle nested quotes, etc.

Ok, so we're left with including the quotation marks directly in the
content, which makes sense because, after all, quotation marks are
nothing more than punctuation signs.

Now I read somewhere that in this case, quotation marks should ideally
appear outside of the <quote> element, like this: "<quote>quotation</quote>"
instead of <quote>"quotation"</quote>

Fine. But then, what about <blockquote> ? Considering that a typical
<blockquote> looks like:
<blockquote><p>first paragraph</p><p>second paragraph</p></blockquote>

where should we put the quotation marks ? Outside of the <blockquote> ?
But then they will appear on a single line because <blockquote> and <p>
are block elements. Inside of the first <p> and the last </p> (or any
suitable combination), but then we have quotation marks inside a
quotation which we wanted to avoid in the case of <quote>...

So I have this feeling that we will only have a consistent, unique
method to delimit quotations when rendering via stylesheet is widely
implemented (which can take a while, I agree...)

Or do you have any wise advice ?

Toby A Inkster · Oct 19, 2003

Vincent said:
Fine. But then, what about <blockquote> ? Considering that a typical
<blockquote> looks like:
<blockquote><p>first paragraph</p><p>second paragraph</p></blockquote>

where should we put the quotation marks ? Outside of the <blockquote> ?

Well, the best answer is that you don't! Large quoted sections are best
identified by indenting and then maybe italicising.

If you did want to add quote marks:

<blockquote><p>'first paragraph</p><p>'second paragraph'</p></blockquote>

(Note: no closing quote in first paragraph. This is consistent with proper
English punctuation. Indeed, this post assumes that the page language and
thus quoting style is English.)
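
For comparison, the same convention could be left to the style sheet instead of being typed into the content. This is a sketch modelled on the CSS 2 generated-content rules, not something any poster in this thread reports using. The class name "last" is a hypothetical hook for the final paragraph, and it assumes a user agent that supports generated content.

```html
<style type="text/css">
  /* Single curly marks, matching the English style used above. */
  blockquote { quotes: "\2018" "\2019"; }

  /* Every paragraph opens with a quotation mark ... */
  blockquote p:before { content: open-quote; }

  /* ... but only the last paragraph closes it. no-close-quote prints
     nothing while still stepping the nesting level back, so the next
     paragraph's open-quote uses the same mark again. */
  blockquote p:after      { content: no-close-quote; }
  blockquote p.last:after { content: close-quote; }
</style>

<blockquote>
  <p>first paragraph</p>
  <p class="last">second paragraph</p>
</blockquote>
```

The markup then carries no quotation marks at all, so <q> and <blockquote> can be handled the same way once style sheet support allows it.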
