Chinese and Russian versions of the website

  • Thread starter Luigi Donatello Asero

Luigi Donatello Asero

Hello,
I am building the website
https://www.scaiecat-spa-gigi.com
I want to prepare the Russian and Chinese version of the site.
On the homepage I want to add the links to the Russian and Chinese versions.
Should I use Unicode on the homepage?
If so, which charset should I choose?
What about the doctype?
 
David Dorward

Luigi said:
On the homepage I want to add the links to the Russian and Chinese
versions. Should I use Unicode on the homepage?

Yes

If so, which charset should I choose?

UTF-8 makes sense most of the time. I believe there are some efficiency
savings to be made with UTF-16 if a document consists predominantly of
Asian characters.

http://www.tbray.org/ongoing/When/200x/2003/04/26/UTF
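
To make the size trade-off concrete, here is a minimal Python sketch that counts the bytes each encoding needs. The helper name `encoded_sizes` and the sample strings are illustrative assumptions, not something from the linked article.

```python
# Compare how many bytes the same text needs in UTF-8 and UTF-16.
# "utf-16-le" is used so that no byte order mark is added to the count.

def encoded_sizes(text):
    """Return the encoded length of `text` in bytes for both encodings."""
    return {
        "utf-8": len(text.encode("utf-8")),
        "utf-16-le": len(text.encode("utf-16-le")),
    }

if __name__ == "__main__":
    samples = {
        "HTML markup": "<p></p>",   # ASCII: 1 byte each in UTF-8, 2 in UTF-16
        "Russian": "Привет",        # Cyrillic: 2 bytes each in both encodings
        "Chinese": "我们",           # common CJK: 3 bytes in UTF-8, 2 in UTF-16
    }
    for label, text in samples.items():
        print(label, encoded_sizes(text))
```

Since an HTML page mixes ASCII markup with the text itself, any saving from UTF-16 depends on the ratio of markup to CJK text, and Cyrillic gains nothing either way.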
What about the doctype?

HTML 4.01 Strict as per normal.
 
Smike

Luigi said:
Hello,
I am building the website
https://www.scaiecat-spa-gigi.com
I want to prepare the Russian and Chinese version of the site.
On the homepage I want to add the links to the Russian and Chinese versions.
Should I use Unicode on the homepage?
If so, which charset should I choose?
What about the doctype?

I do not know about Chinese, but Cyrillic (Russian) version is
suggested to be done in 16-bit code version. No charset specification
is required, Cyrillic text will be visible in most modern web browsers
immediately without Encoding selection.
View source of this example:
http://bratok.prison.se/heros.htm
http://bratok.prison.se/nv.htm
http://bratok.prison.se/cp.htm

To convert Cyrillic text to HTML to 16-bit code presentation, use the
following tool
http://russiantext.ircdb.org/ruseditE.htm
Format menu->Code conversion.
Converted text should be selected.
 
Jukka K. Korpela

Smike said:
I do not know about Chinese, but Cyrillic (Russian) version is
suggested to be done in 16-bit code version.

Are you kidding? Suggested by whom? Surely not by the Internet Architecture
Board, which clearly favors UTF-8.
No charset specification is required,

You _are_ kidding, are you not?
Cyrillic text will be visible in most modern web browsers
immediately without Encoding selection.

Web browsers recognize encoding from HTTP headers and don't need any manual
encoding selection, for any registered encoding they support.
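
As a rough illustration of what "recognizing the encoding from HTTP headers" involves, the following Python sketch extracts the `charset` parameter from a `Content-Type` header value and uses it to decode the body. The function names and the sample header are assumptions made for the example; real browsers follow more elaborate rules.

```python
from email.message import Message

def charset_from_content_type(header_value):
    """Return the lowercased charset parameter, or None if it is absent."""
    msg = Message()
    msg["Content-Type"] = header_value
    return msg.get_content_charset()

def decode_body(body, header_value):
    """Decode response bytes using the encoding declared in the header."""
    charset = charset_from_content_type(header_value)
    if charset is None:
        # No declaration: the client would have to guess.
        raise ValueError("no charset declared in Content-Type")
    return body.decode(charset)

if __name__ == "__main__":
    header = "text/html; charset=KOI8-R"   # hypothetical server response header
    body = "Привет".encode("koi8-r")
    print(charset_from_content_type(header))
    print(decode_body(body, header))
```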
View source of this example:

Huh? Why should we view source, in an issue like this? There's no sensible
way of viewing source without knowing or guessing the encoding, so what
could it possibly demonstrate?

These all appear to be documents that
a) lack any declaration about character encoding, which is a protocol error
and leaves it to browsers to make their guesses
b) contain just octets < 128, so any reasonable guess, such as US-ASCII or
ISO-8859-1 or ISO-8859-5 or ISO-8859-6 or UTF-8 or UTF-16, will do
c) represent non-ASCII characters as character references, which is of
course possible but rather inefficient and hopelessly obscure unless you
have an editing tool that interprets the references, and if you have, you
could use it cleverly, saving the data as UTF-8 encoded
d) demonstrate nothing relevant to the topic.
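
Points b) and c) can be checked with a short Python sketch. The sample bytes and both function names are assumptions made for illustration. Note that `html.unescape` also expands references such as `&lt;` and `&amp;`, so a real converter would have to leave those alone.

```python
import html

# A document that contains only octets below 128; the Cyrillic word is
# written entirely as numeric character references.
ASCII_ONLY = b"<p>&#1056;&#1091;&#1089;&#1089;&#1082;&#1080;&#1081;</p>"

# Single-byte and UTF-8 guesses that all agree on octets below 128.
GUESSES = ["us-ascii", "iso-8859-1", "iso-8859-5", "utf-8"]

def decodes_identically(data, encodings):
    """True if every listed encoding yields the same text for `data`."""
    return len({data.decode(enc) for enc in encodings}) == 1

def references_to_utf8(data):
    """Expand the character references and re-encode the result as UTF-8."""
    text = html.unescape(data.decode("us-ascii"))
    return text.encode("utf-8")

if __name__ == "__main__":
    print(decodes_identically(ASCII_ONLY, GUESSES))
    print(references_to_utf8(ASCII_ONLY).decode("utf-8"))
```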

Long ago, a "conservative approach" as described in c) made sense, but it's
hardly fruitful these days for authoring in a language that uses a non-Latin
script. Besides, it has absolutely nothing to do with using some "16 bit
version".
 
Luigi Donatello Asero

David Dorward said:
UTF-8 makes sense most of the time. I believe there are some efficiency
savings to be made with UTF-16 if a document consists predominantly of
Asian characters.

How do I use UTF-8 if I save the HTML/PHP documents using WordPad
(Windows 98)?
Do I do it the same way as in Outlook Express?
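
Whatever editor is used, the goal is a file whose bytes are UTF-8 and whose declared charset says so. As a minimal sketch of that end result, assuming Python is available, the script below writes a small HTML 4.01 Strict page as UTF-8. The page content, the file name and the `save_utf8` helper are illustrative assumptions.

```python
import tempfile
from pathlib import Path

# Hypothetical page: HTML 4.01 Strict with a charset declaration that
# matches the encoding the file is actually saved in.
PAGE = """<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
  "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<title>Language versions</title>
</head>
<body>
<p><a href="ru/">Русский</a> <a href="zh/">中文</a></p>
</body>
</html>
"""

def save_utf8(path, text):
    """Write `text` to `path` as UTF-8 bytes and return the bytes written."""
    data = text.encode("utf-8")
    Path(path).write_bytes(data)
    return data

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        target = Path(folder) / "index.html"
        written = save_utf8(target, PAGE)
        print(len(written), "bytes written to", target.name)
```

The `meta` element is only a fallback. If the server sends a charset in the HTTP `Content-Type` header, it should name the same encoding.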
 
dorayme

Luigi Donatello Asero said:
Hello,
I am building the website
https://www.scaiecat-spa-gigi.com
I want to prepare the Russian and Chinese version of the site.
On the homepage I want to add the links to the Russian and Chinese versions.
Should I use Unicode on the homepage?
If so, which charset should I choose?
What about the doctype?

My minders have put in a special request: Can we have a version in Martian?
 
Luigi Donatello Asero

dorayme said:
My minders have put in a special request: Can we have a version in Martian?

I am open to other planets...but unfortunately I do not know Martian, yet...
Any site to recommend to learn it?
 
Luigi Donatello Asero

Luigi Donatello Asero said:
I am open to other planets...but unfortunately I do not know Martian, yet...
Any site to recommend to learn it?

To begin with I would be satisfied with learning to write it on the web...
I might be proficient in the listening comprehension later...
 
Alan J. Flavell

I do not know about Chinese, but Cyrillic (Russian) version is
suggested to be done in 16-bit code version.

Pardon?

No charset specification is required,

Do you have the remotest clue what you are talking about?
Cyrillic text will be visible in most modern web browsers
immediately without Encoding selection.

If you understand what you're doing, it will, yes. Sometimes, by
chance, even if you don't.

As far as I can see, the encoding is us-ascii. It doesn't contain
any actual Cyrillic characters - only numerical character references
(of the form &#nnnn;). This is a permissible option - but rarely the option of
choice. Although it can be useful for authors who haven't a clue what
they're doing. If you catch my drift.
To convert Cyrillic text to HTML to 16-bit code presentation, use the
following tool
http://russiantext.ircdb.org/ruseditE.htm
Format menu->Code conversion.

This seems to be completely pointless. Any decent i18n software can
do this without fuss, and not only for Cyrillic.
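
For instance, Python's standard codecs can produce the same numeric character references for any script, with no special tool. This is a minimal sketch, and the function name is an assumption made for the example.

```python
def to_numeric_references(text):
    """Replace every non-ASCII character with a decimal character reference."""
    # The "xmlcharrefreplace" error handler emits &#NNNN; for each character
    # that the target encoding (here ASCII) cannot represent.
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")

if __name__ == "__main__":
    print(to_numeric_references("Привет"))
    print(to_numeric_references("我"))
    print(to_numeric_references("plain ASCII is left alone"))
```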
 
dorayme

Luigi Donatello Asero said:
To begin with I would be satisfied with learning to write it on the web...
I might be proficient in the listening comprehension later...

Ask Blinky about the relationship between writing and comprehension, he is
the expert on this... WHACK!
 
Luigi Donatello Asero

dorayme said:
Ask Blinky about the relationship between writing and comprehension, he is
the expert on this... WHACK!

Can Blinky speak Martian, then?
 
Luigi Donatello Asero

Frank Olieu said:
_Luigi Donatello Asero_ skrev | wrote | écrivit (24-05-2006 23:32):


You could start with Klingon, then! Here is a resource:
http://www.evertype.com/standards/csur/klingon.html
I've been told Martian is quite close to Klingon (think Swedish and
Danish)...

Joking aside,
If you tell me how Klingon could help me save
Chinese and Russian on my website as UTF-8
or do something else which I need
then it might be interesting to learn it.
 
Luigi Donatello Asero

Sally Thompson said:
Here is a sample: http://www.geocities.com/ctesibos/hampton/image/mars32.gif

I'm sure dorayme will translate if you ask nicely.


Are you sure that the language in the image you linked to above
was not written on Earth...?
Back to Chinese and Unicode: it seems I can find the Unicode number for many
Chinese characters on the site
http://www.mandarintools.com
The question, however, is how do I type it?
Should I press Alt + the number, or something else?
For example, here are a Chinese character, its Unicode number and its
transliteration in Pinyin.
How do I use the Unicode number?
我 = 6211 = wo (Pinyin)
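
The number 6211 here is hexadecimal, i.e. the code point U+6211. As a minimal Python sketch of how that number relates to the character and to what could be written in an HTML source, the helper name `describe` is an assumption made for the example.

```python
def describe(char):
    """Show several equivalent ways of writing one character."""
    code_point = ord(char)
    return {
        "code_point": format(code_point, "04X"),   # hexadecimal, as in U+XXXX
        "decimal": code_point,
        "hex_reference": f"&#x{code_point:X};",    # usable in HTML source
        "decimal_reference": f"&#{code_point};",   # usable in HTML source
        "utf8_bytes": char.encode("utf-8").hex(),  # bytes stored in a UTF-8 file
    }

if __name__ == "__main__":
    # Going from the listed number back to the character:
    print(chr(int("6211", 16)))
    print(describe("我"))
```

In a page saved as UTF-8 the character can simply be stored directly. The references are an alternative when the file has to stay ASCII-only.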
 
Sally Thompson

(Hi Sally!)

It is a dialect from a Martian tribe that lies deeper under the
surface than most... I can make out they are cautioning against
too much freedom... but beyond that I would have to consult my
minders...

I thought the reference to seagulls was interesting, if my Martian/English
dictionary is not at fault. Martian seagulls must be a bit like hen's teeth.
 
