<q> and language-specific quotation marks

Tristan Miller

Greetings.

Do any popular browsers correctly support <q>, at least for Western
languages? I've noticed that Mozilla uses the standard English
double-quote character, ", regardless of the lang attribute of the HTML
document. Will any browsers render German-style quotes or French-style
guillemets for lang="de" and lang="fr", respectively?

Regards,
Tristan
 
kayodeok

Tristan Miller said:
Greetings.

Do any popular browsers correctly support <q>, at least for
Western languages? I've noticed that Mozilla uses the standard
English double-quote character, ", regardless of the lang
attribute of the HTML document. Will any browsers render
German-style quotes or French-style guillemets for lang="de" and
lang="fr", respectively?

IE doesn't support <q>

Getting quote marks around <q> tags in IE
http://groups.google.com/groups?threadm=Pine.LNX.4.53.0310111352150.10264@ppepc56.ph.gla.ac.uk
 
Brian

Tristan said:
Do any popular browsers correctly support <q>, at least for Western
languages? I've noticed that Mozilla uses the standard English
double-quote character, ", regardless of the lang attribute of the HTML
document. Will any browsers render German-style quotes or French-style
guillemets for lang="de" and lang="fr", respectively?

Mozilla displays French-language quote delimiters with the following
in CSS:

[lang="fr"] {
quotes: '« ' ' »'
}

German could be handled in a similar fashion.

[lang="de"] {
quotes: '„' '“'
}

I don't know German nearly well enough to write in it, so I've never
actually used the second example.
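
For anyone who wants to try the CSS route, here is a minimal sketch that pairs the quotes property with the generated-content rules that actually insert the marks. It assumes a browser that implements CSS2 generated content and :lang(). The nested (inner) marks and the no-break spaces around the guillemets are illustrative choices, not something taken from the rules above, so check them against a style guide for the language.

```css
/* Sketch only: selectors and quote choices are illustrative assumptions. */

/* French: guillemets with no-break spaces (\A0), curly quotes when nested. */
q:lang(fr) { quotes: "«\A0" "\A0»" "“" "”"; }

/* German: low-9 opening mark, high closing mark; single forms when nested. */
q:lang(de) { quotes: "„" "“" "‚" "‘"; }

/* Have <q> actually emit whatever pair the quotes property selects. */
q:before { content: open-quote; }
q:after  { content: close-quote; }
```

Unlike [lang="fr"], which matches only an element carrying that exact attribute value, :lang(fr) also matches elements that merely inherit the language from an ancestor. A browser that ignores generated content will show no marks at all with this approach.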
 
Jukka K. Korpela

Tristan Miller said:
Do any popular browsers correctly support <q>

No.

I've noticed that Mozilla uses the standard English
double-quote character, ", regardless of the lang attribute of the
HTML document.

If you mean what you wrote, the Ascii quotation mark, then it's
definitely not _standard_ for English, or any language (except computer
"languages"). It's just the worldwide common surrogate.

Will any browsers render German-style quotes or
French-style guillemets for lang="de" and lang="fr", respectively?

Only if you write them as actual characters (and then the lang
attribute is immaterial in this issue). Why wouldn't you do that? We
can use language-specific punctuation characters for other things (such
as inverted question mark at the start of a question in languages that
require it), and seldom do we see requests to dispense with that by
using markup (like <question>) instead. What's so special about
quotations, then?

Beware that attempts to make browsers implement <q> by using CSS are
generally not successful and that _correct_ use of quotation marks is
trickier than people think.

Anyway, <q> was good idea as described (as an example) in the SGML
standard, but HTML did not adopt the idea early enough (and well
enough), and now it's too late. Just forget <q>.
 
Micah Cowan

Tristan Miller said:
Greetings.

Do any popular browsers correctly support <q>, at least for Western
languages? I've noticed that Mozilla uses the standard English
double-quote character, ", regardless of the lang attribute of the HTML
document. Will any browsers render German-style quotes or French-style
guillemets for lang="de" and lang="fr", respectively?

AIUI, a browser is not required to make allowances for the
declared language; if you want these changes, you are supposed to
use CSS to specify them (shameless snippet from CSS2 spec:)

Q:lang(en) { quotes: '"' '"' "'" "'" }
Q:lang(no) { quotes: "«" "»" "<" ">" }

....however, to my knowledge, neither Mozilla nor MSIE support
this. Mozilla uses " " ' ' regardless of what you specify using
CSS; and MSIE (last I checked) doesn't support the <q> element
properly at all. I think Opera might, but since that's not very
mainstream, it probably won't help you much.

-Micah
 
Jukka K. Korpela

Micah Cowan said:
AIUI, a browser is not required to make allowances for the
declared language;

The HTML specification says: "User agents should render quotation marks
in a language-sensitive manner (see the lang attribute)." In that
sense, it's not a requirement for conformance to recommendation, just a
recommendation in the recommendation. :) On the other hand, it is a
bit unrealistic to say that user agents should behave that way, since
it is rather hard to support all the thousands of languages, even in a
detail like this, since official information on punctuation rules is
not easy to find.

if you want these changes, you are supposed to
use CSS to specify them

No, you're not. The HTML specification says that browsers should do
such things automatically. And in practical terms, <q> markup is
useless.

(shameless snippet from CSS2 spec:)

Q:lang(en) { quotes: '"' '"' "'" "'" }
Q:lang(no) { quotes: "«" "»" "<" ">" }

How typical. Both rules are completely wrong, by the rules of those
languages. Correct English orthography uses none of the characters
listed, and Norwegian surely does not use less than sign and greater
than sign as inner quotes.

To repeat myself: Forget <q>. Use plain Ascii quotation marks, unless
you _know_ the correct use of punctuation characters in the language of
the context where the quotation appears and you can be reasonably sure
that browsers support those characters well enough. And when estimating
whether you _know_ such issues, it is useful to remember that the
authors of the CSS specification didn't have a clue.
 
Micah Cowan

Jukka K. Korpela said:
The HTML specification says: "User agents should render quotation marks
in a language-sensitive manner (see the lang attribute)." In that
sense, it's not a requirement for conformance to recommendation, just a
recommendation in the recommendation. :) On the other hand, it is a
bit unrealistic to say that user agents should behave that way, since
it is rather hard to support all the thousands of languages, even in a
detail like this, since official information on punctuation rules is
not easy to find.


No, you're not. The HTML specification says that browsers should do
such things automatically.

SHOULD and MUST are very different--formally. You *are* supposed
to use CSS if you want to force a conforming user-agent to Do The
Right Thing(TM). However, since there don't seem to be any
conforming user-agents...

And in practical terms, <q> markup is useless.

Yeah, which sucks.

How typical. Both rules are completely wrong, by the rules of those
languages. Correct English orthography uses none of the characters
listed, and Norwegian surely does not use less than sign and greater
than sign as inner quotes.

Agreed about (en); although even if it had been correct, I didn't
post using an encoding that would have allowed more appropriate
ones.

As to (no); you're right, that's stupid. That's how they were in
the CSS2 standard, though (should've been &lsaquo; and &rsaquo; I
believe)

To repeat myself: Forget <q>.

But only until the stupid mainstream browsers (IOW, MSIE) get it
right. However, someone pointed out elsethread that apparently newer
versions of Mozilla *can* get it right. Yay!

Use plain Ascii quotation marks

Why? Every browser I've seen supports &ldquo;, &rdquo;,
etc. Currently, the articles I've written in DocBook which use

, unless
you _know_ the correct use of punctuation characters in the language of
the context where the quotation appears and you can be reasonably sure
that browsers support those characters well enough.

But when you *don't* know this, are you sure that the Ascii
quotation marks are appropriate?

-Micah
 
Tina Holmboe

Jukka K. Korpela said:
such things automatically. And in practical terms, <q> markup is
useless.

So. In practical terms, marking up an inline quotation as an inline
quotation is useless.

This is good to know.
 
Jukka K. Korpela

Micah Cowan said:
SHOULD and MUST are very different--formally.

Theoretically HTML 4 specifications use RFC language here, but in
practice their wording is not that formal. Anyway, by the RFC language,
the statement that browsers SHOULD "render quotation marks in a
language-sensitive manner" means that "there may exist valid reasons in
particular circumstances to ignore [that statement], but the full
implications must be understood and carefully weighed before choosing
a different course". So if an implementer has
understood the full implications etc. and decided not to make a user
agent behave that way, what makes us think that an author knows better?

You *are* supposed
to use CSS if you want to force a conforming user-agent to Do The
Right Thing(TM).

No, of course not. First, HTML specifications do not postulate any use
of CSS. They are meant to be used without a style sheet, with CSS style
sheets, or with other style sheets. Second, author style sheets (by
design and by implementation) certainly cannot force anything. Third, a
duplicate implementation of quotation mark rendering would be a shot in
the dark. A browser programmer can be in a position to _know_ that e.g.
curly quotes are not available in a rendering situation and use Ascii
quotation marks instead, and if an author style sheet tries to force
curly quotes, it could end up with having no quotes rendered.

Agreed about (en); although even if it had been correct, I didn't
post using an encoding that would have allowed more appropriate
ones.

Surely you could write a style sheet in Ascii only and yet use any
Unicode character in generated content.

As to (no); you're right, that's stupid. That's how they were in
the CSS2 standard, though (should've been &lsaquo; and &rsaquo; I
believe)

No, notations like &lsaquo; have no meaning in CSS.

But only until the stupid mainstream browsers (IOW, MSIE) get it
right.

They'll never get it right. It'll take several years before the next
version of MSIE exists and has over 50 % share of MSIE installations.

Why? Every browser I've seen supports &ldquo;, &rdquo;,
etc.

Then you haven't seen enough. Ascii quotation marks are _safe_, as I
wrote. If you consider using real quotation marks, then you should at
least refrain from using those quasi-mnemonic entity references and use
character references instead.

But when you *don't* know this, are you sure that the Ascii
quotation marks are appropriate?

Ascii quotation marks are still the safest way. It's true that these
days, the number of browsers that fail to render the character
references for curly quotes properly is rather small - but yet not
zero, and users are accustomed to seeing Ascii quotation marks, so this
is not a big issue. I'm personally moving towards using "smart"
quotation marks on new pages, especially since it's awkward to change
such things later - I cannot just do a simple editing operation to
change Ascii quotation marks to any smart characters, since Ascii
quotation marks are used for HTML markup (attribute value delimiters).

Besides, there are other problems with correct quotation marks, even
the guillemets. The guillemets are technically rather safe, being
ISO 8859-1 characters, but the clueless line breaking rules in browsers
cause quite some trouble (see
http://www.cs.tut.fi/~jkorpela/html/nobr.html ).
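
To illustrate the earlier point that a style sheet can stay pure Ascii and still produce any Unicode character in generated content: a CSS string escape is a backslash followed by the hexadecimal code point. A rough sketch, assuming a browser that honours the quotes property (the selectors and the choice of marks are illustrative, not taken from the CSS2 spec example):

```css
/* ASCII-only source; each \XXXX is a Unicode code point in hex. */

/* English: U+201C/U+201D outer, U+2018/U+2019 inner. */
q:lang(en) { quotes: "\201C" "\201D" "\2018" "\2019"; }

/* French: U+00AB/U+00BB guillemets, each padded with U+00A0 (no-break space). */
q:lang(fr) { quotes: "\AB\A0" "\A0\BB" "\201C" "\201D"; }
```

Entity notations such as &ldquo; belong to HTML and SGML and are not recognised inside a style sheet, which is why the escapes are needed there.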
 
Jukka K. Korpela

Tina Holmboe said:
So. In practical terms, marking up an inline quotation as an inline
quotation is useless.

Yes, because no software actually uses such markup for useful purposes,
_and_ the theoretically available markup is poorly designed.

This is good to know.

It is, is it not? Similarly, marking up a question as a question would
be useless, if <question> markup existed but had been defined so that
browsers should insert language-specific question mark(s) and they
actually did not do that and no search engine or other useful software
used that markup either.

We can still survive, can't we? The question mark is available, and so
are quotation marks. Actually, both a question mark and the quotation
marks are effectively markup - at the text level. If anyone wishes to
write an indexing robot that recognizes quotations, he could start
from recognizing strings delimited by quotation marks. (One might
consider treating <blockquote> as indicating quotation, but abuse is so
widespread that this would not be pragmatically wise.)
 
Toby A Inkster

Jukka said:
Especially since by that time <q> will
have been officially deprecated or obsolete for years.

It is already gone in the XHTML 2 drafts, replaced by <quote>, which will
have more realistic demands on quote marks -- the author inserts them
directly into the XHTML:

<quote xml:lang="en">"Hello"</quote>

IIRC
 
Tina Holmboe

Jukka K. Korpela said:
Yes, because no software actually uses such markup for useful purposes,
_and_ the theoretically available markup is poorly designed.

Oddly enough, such tools exist. I can only guess that you find the Mozilla
solution "useless", but even Mark Pilgrim has a script for extracting
quotations.


are quotation marks. Actually, both a question mark and the quotation
marks are effectively markup - at the text level. If anyone wishes to
write an indexing robot that recognizes quotations, he could start
from recognizing strings delimited by quotation marks. (One might

In Finnish - of which I know nothing - it might be that quotation marks
are always used to signify actual quotations. Such is not the case in
other languages.

How you intend to attach citation information to that text level markup
I cannot even begin to guess at.

Are we going to start writing browsers that use heuristic algorithms to
determine whether a random piece of text is one thing or the other ? That
might be amusing, but I fail to see it being helpful to anyone.

Be all of this as it may. So far I have not seen a sensible explanation of
why the *name* of the element had to change.
 
Stan Brown

In comp.infosystems.www.authoring.html, Toby A Inkster said:
<quote xml:lang="en">"Hello"</quote>

I'm trying hard to understand what advantage that has over

"Hello"

but I'm failing.

--
Stan Brown, Oak Road Systems, Cortland County, New York, USA
http://OakRoadSystems.com/
HTML 4.01 spec: http://www.w3.org/TR/html401/
validator: http://validator.w3.org/
CSS 2 spec: http://www.w3.org/TR/REC-CSS2/
2.1 changes: http://www.w3.org/TR/CSS21/changes.html
validator: http://jigsaw.w3.org/css-validator/
 
Bertilo Wennergren

Stan Brown:
I'm trying hard to understand what advantage that has over

"Hello"

but I'm failing.

Let's say you want to do this:

quote {
font-style: italic;
}

You can of course add a meaningless "span" to your unstylable piece of
naked text, but wouldn't a meaningful element be better?
 
Micah Cowan

Jukka K. Korpela said:
Theoretically HTML 4 specifications use RFC language here, but in
practice their wording is not that formal.

The second paragraph of section 4 makes it 100% formal.

Anyway, by the RFC language,
the statement that browsers SHOULD "render quotation marks in a
language-sensitive manner" means that "there may exist valid reasons in
particular circumstances to ignore [that statement], but the full
implications must be understood and carefully weighed before choosing
a different course". So if an implementer has
understood the full implications etc. and decided not to make a user
agent behave that way, what makes us think that an author knows
better?

That's a complete non sequitur. The author is the *most*
qualified to make that decision, as it's *his* friggin' document,
and *his* choice of language. Even if it's not "correct", an
author has the right to exert such control over his own document,
and indeed the duty to do so if he wishes to achieve these results.

No, of course not. First, HTML specifications do not postulate any use
of CSS. They are meant to be used without a style sheet, with CSS style
sheets, or with other style sheets.

The <style> element allows you to use any arbitrary style sheet
language, but CSS is specifically required for support of, e.g.,
style attributes.

Second, author style sheets (by
design and by implementation) certainly cannot force anything.

If the user agent claims to be conforming to HTML 4 and CSS Level
2, and the style sheet is active (by default or by user choice)
the rules specified must be obeyed above any defaults specified
by the "internal stylesheet".

Third, a
duplicate implementation of quotation mark rendering would be a shot in
the dark. A browser programmer can be in a position to _know_ that e.g.
curly quotes are not available in a rendering situation and use Ascii
quotation marks instead, and if an author style sheet tries to force
curly quotes, it could end up with having no quotes rendered.

If the programmer is in a position to know that they are not
available, he/she is in a position to substitute appropriate
characters, as is the case in some existing implementations.

Surely you could write a style sheet in Ascii only and yet use any
Unicode character in generated content.

I'm talking about the pasted snippet from my post, not a literal stylesheet.

No, notations like &lsaquo; have no meaning in CSS.

I realize that. I was just using the SGML convention for
specifying the characters I would have wanted (since I wasn't
posting in Unicode).

Then you haven't seen enough. Ascii quotation marks are _safe_, as I
wrote.

&ldquo; and &rdquo; are _safe_ too. And more typographically
correct--as you yourself have pointed out. They work on a huge
variety of user-agents, including non-graphical ones, etc. Where
they are not available, they are often substituted with your
ASCII favorites.

If you consider using real quotation marks, then you should at
least refrain from using those quasi-mnemonic entity references and use
character references instead.

They are required to be supported by HTML 4.0-conformant user
agents, and are much more readable when editing source. However, I
typically post-process my HTML files to replace those entity
references not required by HTML 3.2 with the equivalent character
references.

-Micah
 
Chris Hoess

Micah Cowan said:
The <style> element allows you to use any arbitrary style sheet
language, but CSS is specifically required for support of, e.g.,
style attributes.

Wrong. Read closely Section 14.2.2.
 
Jukka K. Korpela

Micah Cowan said:
The second paragraph of section 4 makes it 100% formal.

Thanks for a good laugh. Seriously, you haven't actually studied the
HTML specification much if you think that it really sticks to RFC
language.

I think I will refrain from commenting further - there's too much
confusion in your ideas of forcing things in CSS, etc. Hang around and
you'll gradually see that.
 
