Chinese only in HTML format?

  • Thread starter Luigi Donatello Asero
Els

Toby said:
Els said:
That didn't do much here. Do I need to change a charset something in
Dialog for that?

Here is a checklist with everything you need to display those characters:

[ ] Operating system with decent Unicode support.
[ ] A font with a good range of characters.
[ ] A half-decent newsreader.

I'm guessing in your case, it's the middle one that you need.

I guess so. No idea yet how to get it right though.
Of course, the chances are that most Chinese people will have their
software configured to correctly display Chinese characters.

Sure - but I wanna see them too :)
I seem to remember that when I installed XP, I did click the option
for Hebrew language support - could you send some Hebrew characters
just like you did Chinese, see what happens on my end?
 
Luigi Donatello Asero

Toby Inkster said:

Thank you.
Which code should I use to write in Chinese on my website?
--
Luigi Donatello (un italiano che vive in Svezia)
(minä olen Italian kansalainen, mutta minä asun Ruotsissa)
( 我是 意大利人 , 但是 我 主 在 瑞典)
(я итальянец а я живу в Швеции )
https://www.scaiecat-spa-gigi.com/sv/boende-i-italien.php
 
Dylan Parry

Using a pointed stick and pebbles, Toby Inkster scraped:
Nor here, though it looked fine in my newsreader. It's probably because
Google isn't sending a charset in their HTTP headers. Manually selecting
UTF-8 seems to have fixed it.

As pointed out by someone else in this thread, it works if you view the
"parsed" version of the Google archive. Is there any chance, Toby, that
you could make message-id.net point to the parsed version of articles
rather than the source?
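
The garbling discussed here is what you get when UTF-8 bytes are decoded with a single-byte Western charset because no charset was declared. A minimal Python sketch, assuming the fallback is Latin-1 (the real fallback depends on the browser and its settings):

```python
# Sketch: why UTF-8 text turns into gibberish when no charset is declared.
# Assumption: the reader falls back to Latin-1; real browsers may pick another default.
original = "我是意大利人"

utf8_bytes = original.encode("utf-8")      # what the server actually sends
garbled = utf8_bytes.decode("latin-1")     # what a reader sees with the wrong charset

# Manually selecting UTF-8 amounts to redoing the decode with the right charset.
repaired = garbled.encode("latin-1").decode("utf-8")

print(len(original), len(garbled))  # 6 characters become 18
print(repaired == original)
```

Nothing is lost in this case, which is why switching the charset by hand in the browser fixes the display.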
 
Toby Inkster

Els said:
I seem to remember that when I installed XP, I did click the option
for Hebrew language support - could you send some Hebrew characters
just like you did Chinese, see what happens on my end?

Try this page:
http://www.mechon-mamre.org/c/ct/c0.htm

Not only Hebrew, but also right-to-left text -- try selecting a multi-line
chunk of text and see if the selection works oddly.
 
Dylan Parry

Using a pointed stick and pebbles, Els scraped:
[ ] A font with a good range of characters.

I guess so. No idea yet how to get it right though.

There used to be a languages section on Windows Update, but that seems
to have disappeared now (?). I can't seem to find anywhere through WU
that I can download language packs any more.
 
Toby Inkster

Luigi said:
Which code should I use to write in Chinese on my website?

<html lang="zh">
<title>我是 意大利人, 但是 我在主 瑞典</title>
<h1>我是 意大利人, 但是 我在主 瑞典</h1>
<p>我是 意大利人, 但是 我在主 瑞典</p>
</html>

and then save the file as UTF-16 and make sure you send a "Content-Type:
text/html; charset=utf-16" header. Easy.

How to set a header? As you're using PHP, it's as simple as including this
line at the top of any Chinese pages:

<?php header("Content-Type: text/html; charset=utf-16"); ?>

If you're mixing Chinese and western languages on the same page, you might
find that using UTF-8 gives you smaller files.
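
A comparable sketch in Python, for readers not on PHP: the page is encoded as UTF-8 (the option mentioned in the last paragraph) and the header declares the same charset. The names `PAGE` and `app` are made up for illustration, and WSGI is only an assumed stand-in for the original PHP setup.

```python
# Sketch: send a Chinese page as UTF-8 with a matching Content-Type header.
# Assumptions: Python/WSGI stands in for the PHP setup; PAGE and app are illustrative names.
PAGE = """<!DOCTYPE html>
<html lang="zh">
<head><title>我是意大利人</title></head>
<body><p>我是意大利人</p></body>
</html>
"""

def app(environ, start_response):
    body = PAGE.encode("utf-8")  # the bytes must match the charset declared below
    headers = [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]

# To try it locally: wsgiref.simple_server.make_server("", 8000, app).serve_forever()
```

The point in either language is the same: the charset named in the header has to be the encoding the bytes were actually saved or produced in.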
 
Els

Toby said:
Try this page:
http://www.mechon-mamre.org/c/ct/c0.htm

Not only Hebrew, but also right-to-left text -- try selecting a multi-line
chunk of text and see if the selection works oddly.

No, works perfectly.

סדר הקריאות בבתי הכנסת, הורדת כל התנ"ך הזה, תיקון קוראים
תורה נביאים וכתובים
לפי הכתר וכתבי היד הקרובים לו

מהדורת סיוון התשס"ה
© 2005 כל הזכויות שמורות למכון ממרא
רחוב חיים ויטל 12, ירושלים 95470
(02-652-1906)

יש לך שאלה או הערה? נא לכתוב לנו!

Gives me question marks in TextPad though.
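
Question marks like these typically appear when text is converted to a code page that lacks the characters (an empty box more often means a missing glyph in the font). A small Python sketch of the conversion case, assuming Windows-1252 as the Western code page; whether TextPad did exactly this is not known from the thread, and the fix found later was a font change.

```python
# Sketch: one way Hebrew can end up as question marks.
# Assumption: the text is being converted to Windows-1252, which has no Hebrew letters.
hebrew = "חי"

as_western = hebrew.encode("cp1252", errors="replace")   # unmappable characters become b"?"
as_hebrew_cp = hebrew.encode("cp1255")                   # Windows-1255 is the Hebrew code page
as_utf8 = hebrew.encode("utf-8")

print(as_western)    # b'??'
print(as_hebrew_cp.decode("cp1255") == hebrew)
print(as_utf8.decode("utf-8") == hebrew)
```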
 
Els

Dylan said:
Using a pointed stick and pebbles, Els scraped:
[ ] A font with a good range of characters.

I guess so. No idea yet how to get it right though.

There used to be a languages section on Windows Update, but that seems
to have disappeared now (?). I can't seem to find anywhere through WU
that I can download language packs any more.

Exactly. I thought I could just find an option on my PC somewhere to
upload more fonts from the XP CD, but either it's gone, or I've
misplaced it somewhere (in my brain).
 
Hilarion

There used to be a languages section on Windows Update, but that seems
Exactly. I thought I could just find an option on my PC somewhere to
upload more fonts from the XP CD, but either it's gone, or I've
misplaced it somewhere (in my brain).


On XP you have to go to regional settings (in control panel)
and switch to the languages tab. There you'll find checkboxes which make
Windows install support for many languages (including additional
fonts or expanding existing fonts). If you also want support for
different codepages (not only unicode), then you'll have to check
the advanced tab and mark the conversion tables you need.


Hilarion
 
Els

Hilarion said:
On XP you have to go to regional settings (in control panel)
and switch to the languages tab. There you'll find checkboxes which make
Windows install support for many languages (including additional
fonts or expanding existing fonts).

Yup, found it. The box for all kinds of weird languages was already
ticked, the East Asian languages wasn't, and apparently I need to
install 230MB of stuff when I tick it, so I decided against it :)

Still would like to find out if I can write Hebrew in a text editor
though - I'd guess TextPad supports it, but I haven't found out yet
how.
If you also want support for
different codepages (not only unicode), then you'll have to check
the advanced tab and mark the conversion tables you need.

That bit I have to investigate further, as I have no idea what all
those different codes actually do.

Thanks for the info :)
 
Hilarion

Still would like to find out if I can write Hebrew in a text editor
though - I'd guess TextPad supports it, but I haven't found out yet
how.

I do not know TextPad, but I checked with Notepad and it worked.
To enter Hebrew chars into Notepad you'll have to install a Hebrew
keyboard layout (not a physical keyboard - see regional settings) or (and
I think it's the better way) use "charmap.exe" (available in Accessories
/ System Tools - not sure of the name because I use Polish Windows).
If you launch charmap, choose the font you want to use (e.g. Courier
New), tick "advanced view", select grouping by Unicode subranges
and in the group selection window select Hebrew. This way you'll get
Hebrew chars in the main window of charmap. You can select and copy
those chars into your application (tested in Notepad). This should work
if you use the same font in charmap as in your application (not all
fonts have Hebrew chars).
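
What charmap shows under the Hebrew subrange can also be listed programmatically. A minimal Python sketch using the standard `unicodedata` module; it assumes only the letters U+05D0 to U+05EA are of interest, not the vowel points and other marks in the same block.

```python
import unicodedata

# Sketch: list the letters charmap shows under the "Hebrew" Unicode subrange.
# Assumption: only the letters U+05D0..U+05EA are listed, not the points and marks of the block.
letters = [chr(cp) for cp in range(0x05D0, 0x05EB)]

for ch in letters[:3]:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")

# Hebrew letters carry the right-to-left bidirectional class "R".
print(unicodedata.bidirectional(letters[0]))
```

Whether a given character actually displays still depends on the font, which is the point made above about using the same font in both applications.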

That bit I have to investigate further, as I have no idea what all
those different codes actually do.

I'll give an example. To write posts or create webpages in Polish
I use the ISO-8859-2 codepage. Windows works internally with UTF-16
(often loosely called UCS-2). So if I display an ISO-8859-2 webpage in
Internet Explorer, it uses the translation table for ISO-8859-2 to
translate it to UTF-16 and shows the results. If I use Notepad to create
an ANSI txt file (and I have Windows-1250 selected as the default ANSI
codepage in my Windows), then when I save the file, Notepad uses the
translation table for Windows-1250 to translate the text entered into
Notepad (which works with UTF-16) to Windows-1250 and saves the
translation results to the selected file.
So if someone uses a codepage other than UTF-16 (or UTF-8, which is
another way to encode Unicode) to write a Hebrew news post
or webpage or text file, then you'll have to install the translation
table for this codepage to get this post or webpage or text file
shown properly (assuming that the application you are using uses
the Windows API to do the translation).


If I screwed something up in my explanation then please post corrections
to it. I'm no Windows XP expert and I may be wrong on this subject.

Hilarion
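
The translation-table idea can be tried out with Python's built-in codecs, which here stand in for the Windows conversion tables (an assumption for illustration; Windows itself uses its own API). The sketch shows the same Polish letter under the two codepages named above.

```python
# Sketch: the "translation table" idea, using Python codecs in place of the Windows API.
# The same Polish letter has different byte values in ISO-8859-2 and Windows-1250.
text = "ą"

iso_bytes = text.encode("iso-8859-2")   # b'\xb1'
win_bytes = text.encode("cp1250")       # b'\xb9'

# Reading: bytes in some codepage -> Unicode text (what the browser does internally).
from_page = iso_bytes.decode("iso-8859-2")

# Saving: Unicode text -> bytes in the target codepage (what Notepad does for an "ANSI" file).
saved = from_page.encode("cp1250")

print(iso_bytes, win_bytes, saved == win_bytes)

# Decoding with the wrong table gives a different character, not an error.
print(iso_bytes.decode("cp1250") == text)
```

This is why the declared codepage matters: the bytes alone do not say which table they were written with.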
 
Luigi Donatello Asero

Toby Inkster said:
<html lang="zh">
<title>我是 意大利人, 但是 我在主 瑞典</title>
<h1>我是 意大利人, 但是 我在主 瑞典</h1>
<p>我是 意大利人, 但是 我在主 瑞典</p>
</html>

and then save the file as UTF-16 and make sure you send a "Content-Type:
text/html; charset=utf-16" header. Easy.


Can I configure Notepad or Wordpad to write Chinese?
Do they support it?
Which mode should I use to send the file to the server?
What about Russian?


Luigi Donatello (un italiano che vive in Svezia)
(minä olen Italian kansalainen, mutta minä asun Ruotsissa)
( 我是 意大利人 , 但是 我 主 在 瑞典)
(я итальянец а я живу в Швеции )
https://www.scaiecat-spa-gigi.com/it/traduzioni.php
 
Hilarion

Which code should I use to write in Chinese on my website?
Can I configure Notepad or Wordpad to write Chinese?
Do they support it?

On Windows XP they both do, because they support Unicode.
If by "write Chinese" you mean "save Chinese text in a text file",
then you'll have to choose the Unicode format using the "Save as..."
option. If by "write Chinese" you mean a way to enter Chinese
characters in Notepad or Wordpad, then the solution is installing
a Chinese keyboard layout or using the "charmap.exe" application to
select, copy and paste the desired Chinese characters.

Which mode should I use to send the file to the server?

Binary (you can use text transfer mode if you save files as UTF-8,
but not if it's UTF-16 or if you are not sure which Unicode
format was used). The more important thing is not the way
of sending files to the web server, but the way of using them.
You have to remember to edit them using a Unicode-enabled
editor and to make the web server send a Content-Type header
including the file encoding information (UTF-16 or UTF-8 or UCS-2).

What about Russian?

Same here.


You can also use codepages specific for the language (Chinese
or Russian), but Unicode is probably the best way (UTF-8 being
best if you mix languages and / or mix code - like PHP / HTML
- with text).


Hilarion
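
The advice on transfer modes can be illustrated with a small Python sketch. It assumes that text mode behaves like a byte-level LF to CRLF rewrite, which is a simplification of what FTP clients and servers do; `text_mode_transfer` is a made-up helper name.

```python
# Sketch: why text-mode transfer can damage UTF-16 but not UTF-8.
# Assumption: text mode is modelled as a byte-level LF -> CRLF rewrite, which is a simplification.
def text_mode_transfer(data: bytes) -> bytes:
    return data.replace(b"\n", b"\r\n")

page = "<p>我是意大利人</p>\n"

utf8_sent = text_mode_transfer(page.encode("utf-8"))
utf16_sent = text_mode_transfer(page.encode("utf-16-le"))

# UTF-8 survives: only the line ending changed.
print(utf8_sent.decode("utf-8") == page.replace("\n", "\r\n"))

# UTF-16 does not: a single 0x0D byte is inserted, so the byte count becomes odd.
print(len(utf16_sent) % 2)
```

Binary mode sidesteps the question entirely, because the bytes arrive exactly as saved.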
 
Els

Hilarion said:
I do not know TextPad, but I checked with Notepad and it worked.
To enter Hebrew chars into Notepad you'll have to install a Hebrew
keyboard layout (not a physical keyboard - see regional settings) or (and
I think it's the better way) use "charmap.exe" (available in Accessories
/ System Tools - not sure of the name because I use Polish Windows).
If you launch charmap, choose the font you want to use (e.g. Courier
New), tick "advanced view", select grouping by Unicode subranges
and in the group selection window select Hebrew. This way you'll get
Hebrew chars in the main window of charmap. You can select and copy
those chars into your application (tested in Notepad).

That works: חי (even applies the right to left order of the letters),
but if I ever had to write an entire page in Hebrew, I think I'd go
for the Hebrew keyboard though, as it's obviously very slow typing to
have to click and select every single letter in the charmap.
This should work
if you use the same font in charmap as in your application (not all
fonts have Hebrew chars).

Took a while, but found the option where I can change the font in
TextPad, and it's working now, thanks :)
I'll give an example. To write posts or create webpages in Polish
I use the ISO-8859-2 codepage. Windows works internally with UTF-16
(often loosely called UCS-2). So if I display an ISO-8859-2 webpage in
Internet Explorer, it uses the translation table for ISO-8859-2 to
translate it to UTF-16 and shows the results. If I use Notepad to create
an ANSI txt file (and I have Windows-1250 selected as the default ANSI
codepage in my Windows), then when I save the file, Notepad uses the
translation table for Windows-1250 to translate the text entered into
Notepad (which works with UTF-16) to Windows-1250 and saves the
translation results to the selected file.

I guess it's too early in the morning for me to get that into my head
:S
So if someone uses a codepage other than UTF-16 (or UTF-8, which is
another way to encode Unicode) to write a Hebrew news post
or webpage or text file, then you'll have to install the translation
table for this codepage to get this post or webpage or text file
shown properly (assuming that the application you are using uses
the Windows API to do the translation).

I think I already have that, as I can see Hebrew pages in my browser
just fine.
If I screwed something up in my explanation then please post corrections
to it. I'm no Windows XP expert and I may be wrong on this subject.

Right now, Hebrew is working for me in both my editor and my browser,
so I guess all is good now. It's Chinese I can't read here in Dialog,
but that probably has to do with me not wanting to install the 230MB
needed for those characters (Win XP) :)
 
Hilarion

So if someone uses a codepage other than UTF-16 (or UTF-8, which is
I think I already have that, as I can see Hebrew pages in my browser
just fine.

Maybe. But it's also possible that those pages use Unicode. In this
case translation tables are not needed.


Hilarion
 
Rincewind

I'd guess TextPad supports it, but I haven't found out yet
how.

From the TextPad help file: "English is always available, but other languages
are stored in DLL's with names of the form "TXPAD???.DLL". For example
"???" is "DEU" for German and "FRA" for French."
 
Els

Rincewind said:
From the TextPad help file: "English is always available, but other languages
are stored in DLL's with names of the form "TXPAD???.DLL". For example
"???" is "DEU" for German and "FRA" for French."

No, that's the interface language, not the charset. I've found the
Hebrew support already though. I just had to change the font from
Courier to Courier New to see it :)
 
Jukka K. Korpela

Toby Inkster said:
and then save the file as UTF-16 and make sure you send a "Content-Type:
text/html; charset=utf-16" header. Easy.

Some browsers that understand UTF-8 don't grok UTF-16. Moreover, it seems
that Google does not understand UTF-16. Besides, UTF-8 is favored on the
Internet by the IETF policy on character encodings.

Using UTF-16 probably gives efficiency benefits over UTF-8 for dominantly
Chinese text, but I think the above points matter more. Besides, an HTML
document contains HTML markup, which is all ASCII and therefore more compact
(one byte per character) in UTF-8 than in UTF-16 (two bytes per character).
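
The size trade-off is easy to check. A short Python sketch, using a made-up `sizes` helper and the little-endian UTF-16 codec so that no byte order mark is counted:

```python
# Sketch: byte counts for markup and Chinese text in UTF-8 and UTF-16.
# "utf-16-le" is used so that no byte order mark is added to the counts.
def sizes(s):
    return len(s.encode("utf-8")), len(s.encode("utf-16-le"))

markup = "<p></p>"
chinese = "我是意大利人"

print(sizes(markup))               # (7, 14)
print(sizes(chinese))              # (18, 12)
print(sizes(f"<p>{chinese}</p>"))  # (25, 26)
```

Even for this tiny paragraph the markup cancels out the UTF-16 saving on the Chinese characters; real pages carry far more markup than this.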
 