Characters

Firas D.

Hi,

On http://firasd.org/ the word "resumé" appears fine in IE, Opera and
Lynx, but the last letter is replaced by a "?" in Firebird. What could I
do to rectify that?

I've warily followed some of the current relevant
discussions/mini-flamewars in related NG's (http://tinyurl.com/29hza,
http://tinyurl.com/242xg), but from what I can tell, "<?xml
version="1.0" encoding="UTF-8"?>" should be enough information?

I looked at http://www.cs.tut.fi/~jkorpela/html/chars.html#theway, but
there's no way I'm going to pander to stupid anglocentrism by
"[representing] other than ASCII characters using character references
of the form &#number", and I'm sure modern browsers don't require that
anyway.


http://tinyurl.com/242xg:
(http://groups.google.com/groups?hl=...4a1c964458fc998bc62%40news.odyssey.net&rnum=8)

http://tinyurl.com/29hza:
(http://groups.google.com/groups?hl=...181542290.10332%40ppepc56.ph.gla.ac.uk&rnum=4)
 
Toby A Inkster

Firas said:
On http://firasd.org/ the word "resumé" appears fine in IE, Opera and
Lynx, but the last letter is replaced by a "?" in Firebird. What could I
do to rectify that?

Most unusual.

1. Firebird *does* seem to be respecting the character set (see Tools >
Page Info), so that can't be the problem.

2. You have used the correct character: confirmed by viewing in Opera and
opening the source in gedit.

3. (Not that it should matter, but) the font I was using to view your page
does contain the required glyph.

So it does seem to be a Firebird bug/oddity. I would recommend taking this
query to one of the Mozilla groups, such as netscape.public.mozilla.i18n.
 
Lauri Raittila

Firas D. said:
Hi,

On http://firasd.org/ the word "resumé" appears fine in IE, Opera and
Lynx, but the last letter is replaced by a "?" in Firebird. What could I
do to rectify that?

Get your server to send proper headers?

Right now you are sending text/html without a charset. I don't suppose you
changed that? Or did you? If you did, you just prevented people from
answering your question.
http://www.delorie.com/web/headers.cgi?url=http://firasd.org/

UTF-8 is not the default charset for HTML; that is US-ASCII, but most
browsers use windows-1252 instead.

So, either set it to send XHTML or a utf-8 charset.
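
As a rough illustration of the check described above, here is a minimal Python sketch that pulls the charset parameter out of a Content-Type header value. The function name `charset_from_content_type` is made up for this example, and the header strings are sample values, not the actual responses from firasd.org.

```python
from email.message import Message


def charset_from_content_type(header_value):
    """Return the lowercased charset parameter of a Content-Type value.

    Returns None when the header carries no charset parameter, which is
    the situation described in this post.
    """
    msg = Message()
    msg["Content-Type"] = header_value
    return msg.get_content_charset()


# Sample header values (assumptions for illustration only).
print(charset_from_content_type("text/html"))
print(charset_from_content_type("text/html; charset=UTF-8"))
```

A header without the parameter leaves the browser to work out the encoding on its own, which is where the different browsers can start to disagree.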
 
Firas D.

Lauri said:
So, either set it to send XHTML or a utf-8 charset.

Didn't help :(

I think it's just a bug, but it's disconcerting because FB is my default
browser.

While we're looking at the page, any idea how I would remove the
whitespace between my content and the inner edges of the browser window?
The margin-top is set to 0..
 
Eric B. Bednarz

Firas D. said:
On http://firasd.org/ the word "resumé" appears fine in IE, Opera and
Lynx, but the last letter is replaced by a "?" in Firebird. What could
I do to rectify that?

The last letter isn't displayed at all in Lynx over here; I didn't
bother to check what other browsers do.

Either advertise the character encoding you are actually using (Latin 1,
I suppose) or encode your documents as UTF-8.
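
To see why a mismatch between the advertised and the actual encoding can produce a replacement character, here is a small Python sketch. It assumes, as suggested above, that the file is stored as Latin 1 while being read as UTF-8; that is a guess about the page, not something confirmed in this thread.

```python
# "resumé", written with an escape so the example does not depend on
# how this snippet itself is saved.
text = "resum\u00e9"

latin1_bytes = text.encode("iso-8859-1")  # é becomes the single byte 0xE9
utf8_bytes = text.encode("utf-8")         # é becomes the two bytes 0xC3 0xA9

# Latin 1 bytes read as UTF-8: 0xE9 announces a multi-byte sequence that
# never arrives, so the decoder substitutes U+FFFD (often drawn as "?").
latin1_read_as_utf8 = latin1_bytes.decode("utf-8", errors="replace")

# UTF-8 bytes read as Latin 1: each byte turns into its own character.
utf8_read_as_latin1 = utf8_bytes.decode("iso-8859-1")

print(repr(latin1_read_as_utf8))
print(repr(utf8_read_as_latin1))
```

Either direction of mismatch garbles the last letter, so the bytes on disk and the label the browser sees have to agree.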
 
Firas D.

Eric said:
The last letter isn't displayed at all in Lynx over here; I didn't
bother to check what other browsers do.

Now it works in IE and exhibits the same quirk in Firebird.. but putting
in an .htaccess with 'AddType "text/html; charset=UTF-8" html' breaks
Opera and Lynx as well..

Removing the line from .htaccess (as I did now) reverts to looking ok in
everything but Firebird.

God, WTF.
 
Toby A Inkster

Lauri said:
UTF-8 is not the default charset for HTML; that is US-ASCII

Actually, the HTML spec doesn't specify a default character set for HTML.
However, the HTTP spec specifies a default character set for all text/*
MIME types (including HTML) and that default character set is.. (trumpet:
parah pah pah pum) iso-8859-1, not us-ascii.
 
Lauri Raittila

Toby A Inkster said:
Actually, the HTML spec doesn't specify a default character set for HTML.
However, the HTTP spec specifies a default character set for all text/*
MIME types (including HTML) and that default character set is.. (trumpet:
parah pah pah pum) iso-8859-1, not us-ascii.

Thanks for correcting.
 
Toby A Inkster

Firas said:
Figured out how to send my own
(http://webmaster.info.aol.com/apache.html), but what's up with the "X-"
prefix on custom headers? Is it just good manners?

It's a convention borrowed from RFC822 (the format for mail).

Say in 2007 we get HTTP 1.2 and it includes special graphics morphing
capabilities. There may be a *real* "Bender" header and then if Mark had
used "Bender" his site would be in violation of HTTP 1.2.

So all custom headers are prefixed with an "X-" so that they don't get
trampled over by future standards.
 
Jukka K. Korpela

Toby A Inkster said:
Actually, the HTML spec doesn't specify a default character set for
HTML.

It does not specify the character encoding. It specifies the document
character set. These are two different concepts, and a wording like
"character set" is used to refer to both of them.

Specifically, it does not specify any fixed or default encoding, but it
specifies the mechanism by which the encoding shall be specified, and
it prescribes that browsers must _not_ imply any default if that
mechanism is not used. (This is somewhat theoretical, since we know that
browsers will try to do _something_ anyway. But the spirit seems to be
that they should make an intelligent guess based on the data content,
instead of implying any predefined encoding.)

Toby A Inkster said:
However, the HTTP spec specifies a default character set for
all text/* MIME types (including HTML) and that default character
set is.. (trumpet: parah pah pah pum) iso-8859-1, not us-ascii.

But (boink! boink! boink!) the HTML specification says:

"The HTTP protocol ([RFC2616], section 3.7.1) mentions ISO-8859-1 as a
default character encoding when the "charset" parameter is absent from
the "Content-Type" header field. In practice, this recommendation has
proved useless because some servers don't allow a "charset" parameter
to be sent, and others may not be configured to send the parameter.
Therefore, user agents must not assume any default value for the
"charset" parameter."

http://www.w3.org/TR/html4/charset.html#h-5.2.2
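
The same section of the HTML 4 specification ranks the HTTP "charset" parameter above a META declaration. A minimal Python sketch of that order follows. The function name `pick_encoding`, the simplified regular expression, and the windows-1252 last resort are all assumptions made for illustration; the final guess in particular only mirrors what was said earlier in this thread about browser behaviour, and the specification itself names no default.

```python
import re
from email.message import Message

# Deliberately simplified pattern for a META charset declaration
# (an assumption for this sketch; real browsers parse far more carefully).
META_CHARSET = re.compile(
    rb'<meta[^>]+charset=["\']?([A-Za-z0-9_.:-]+)', re.IGNORECASE
)


def pick_encoding(content_type, body, fallback_guess="windows-1252"):
    """Return (encoding, source) following the HTTP-then-META order."""
    msg = Message()
    msg["Content-Type"] = content_type
    from_http = msg.get_content_charset()
    if from_http:
        return from_http, "http"

    # Only the start of the document is inspected in this sketch.
    match = META_CHARSET.search(body[:1024])
    if match:
        return match.group(1).decode("ascii").lower(), "meta"

    # Nothing declared: this is a guess, not a default from any spec.
    return fallback_guess, "guess"


sample = b'<meta http-equiv="Content-Type" content="text/html; charset=utf-8">'
print(pick_encoding("text/html; charset=ISO-8859-1", sample))
print(pick_encoding("text/html", sample))
print(pick_encoding("text/html", b"<p>no declaration</p>"))
```

The sketch ignores the XML declaration mentioned at the top of the thread, since that mechanism is outside the HTML 4 priority list.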
 
Toby A Inkster

Jukka said:
But (boink! boink! boink!) the HTML specification says

Yes, I realise there is a conflict there, but IMHO a Standards Track RFC
trumps a W3C recommendation.
 
Jukka K. Korpela

Toby A Inkster said:
IMHO a Standards Track RFC trumps a W3C recommendation.

It depends on the game you play. Personally, I prefer a game at
notrumps. But this is what the IETF, the body that issues RFCs, says
about "standards track RFCs", which normally (and here) means
a "proposed standard" (the entry level on that track):

"Implementors should treat Proposed Standards as immature
specifications."

BCP 9, aka RFC 2026, clause 4.1.1

Besides, the IETF has effectively given the W3C a free hand with
HTML (in some RFC whose number I've forgotten now, but it's there).
 