Problem converting UTF-8 to GBK

Ö£öκÀ

hi all,

I need to convert UTF-8 characters to GBK. Is that possible? Any ideas
will be appreciated.
 
Bart Van der Donck

I need to convert UTF-8 characters to GBK. Is that possible? Any ideas
will be appreciated.

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/1999/REC-html401-19991224/loose.dtd">
<html>
<head>
<title>UTF-8 to GBK Convertor</title>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<script type="text/javascript">
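function conv(utf8) {
// Input is expected to hold one character per byte (code units 0-255).
var r = '';
for (var i=0; i < utf8.length; i++) {
if (utf8.charCodeAt(i) > 127) {
// Join this code unit and the next one into a single %uXXXX escape.
// toString(16) returns plain hex digits, so no inner unescape() is needed.
r += unescape('%u'
+ utf8.charCodeAt(i).toString(16)
+ utf8.charCodeAt(i+1).toString(16)
);
i++; // the second code unit of the pair has been consumed
}
else r += utf8.charAt(i); // ASCII passes through unchanged
}
document.forms[0].t2.value = r;
}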
</script>
</head>
<body>
<form method="get" action="#">
<textarea cols="20" rows="10" name="t1"></textarea>
<input type="button" onClick="conv(document.forms[0].t1.value);"
value="Convert from UTF-8 to GBK -&gt;">
<textarea cols="20" rows="10" name="t2"></textarea>
</form>
<pre>
Examples (cut&paste):
Òþ˽ÉùÃ÷ (GBK)
ºãÐÅ·Ö²¿ (GBK)
CD¼ÜÊé¼Ü 15ÂòR ÊÖÌáÀ¶ (mix Latin/GBK)
</pre>
</body>
</html>

Hope this helps,
 
Bart Van der Donck

Bart Van der Donck wrote:

Examples (cut&paste):
„1¤7„1¤70¼3„1¤7„1¤7„1¤7„1¤7 (GBK)
„1¤7„1¤7„1¤70–90ö2„1¤7 (GBK)
CD„1¤7„1¤7„1¤7„1¤7„1¤7 15„1¤7„1¤7R „1¤7„1¤7„1¤7„1¤7„1¤7 (mix Latin/GBK)

Sorry for the wrong encoding. Here is the right one:
http://www.dotinternet.be/temp/gbk.htm

--
Bart
 
Ö£öκÀ

That should be

charset=ISO-8859-1

Hi Bart Van der Donck,

Thanks for your reply. But if we set charset=ISO-8859-1, then the
characters the user inputs will be encoded in this encoding. Can you give me
some more ideas? Thanks in advance.
 
Bart Van der Donck

Thanks for your reply. But if we set charset=ISO-8859-1, then the
characters the user inputs will be encoded in this encoding. Can you give me
some more ideas? Thanks in advance.

The character encoding for my webpage was only relevant for the
display of the cut-and-paste examples, since all the rest is ASCII-
safe.

See the following example (ISO 8859-1):
http://www.dotinternet.be/temp/utf8toGBK.htm
versus (ASCII):
http://www.dotinternet.be/temp/utf8toGBK-ASCII.htm
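As a rough illustration of what "ASCII-safe" means here, the sketch below rewrites every non-ASCII character of a string as a `\uXXXX` escape, so that the source no longer depends on the page's declared charset. The function name `toAsciiSafe` is hypothetical and is not taken from the linked pages.

```javascript
// Hypothetical helper: replace each UTF-16 code unit above 127 with a
// \uXXXX escape so the resulting text contains only ASCII characters.
function toAsciiSafe(text) {
  var out = '';
  for (var i = 0; i < text.length; i++) {
    var code = text.charCodeAt(i);
    if (code > 127) {
      // Pad the hex value to four digits, as \u escapes require.
      var hex = code.toString(16).toUpperCase();
      out += '\\u' + ('0000' + hex).slice(-4);
    } else {
      out += text.charAt(i);
    }
  }
  return out;
}

console.log(toAsciiSafe('CD\u00BC\u00DC 15'));
```

The escaped form can be pasted into a JavaScript string literal and evaluates back to the original characters, whatever encoding the page itself is served in.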

But maybe you're just looking for a way to display UTF-8 characters in
GBK? Then see:
http://www.dotinternet.be/temp/displayasGBK.htm

Please bear in mind that the W3C considers GBK (CP936) a "rare or
unregistered character encoding"; it's perhaps a good idea to use
another Chinese character set for maximum client compatibility.
 
Ö£öκÀ

Hi Bart,

I have some questions:

Do you mean the charset of our webpage has nothing to do with the
encoding of the user's input?

Can you give me some links for reference?

I have tried your code. When I input:
中国与世界同步

the output in the next textarea is: 中56fd与4e16界540c步NaN

Thanks for your great work!
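The reported output can be traced with a DOM-free copy of the loop from the posted `conv()`. This is only a sketch: `pairUp` is a hypothetical name, and the input is assumed to be the seven characters U+4E2D U+56FD U+4E0E U+4E16 U+754C U+540C U+6B65, written with `\u` escapes so the listing stays ASCII-only.

```javascript
// DOM-free copy of the pairing loop from the posted conv().
// pairUp is a hypothetical name used only for this illustration.
function pairUp(text) {
  var r = '';
  for (var i = 0; i < text.length; i++) {
    if (text.charCodeAt(i) > 127) {
      // Two code units are glued into one '%u...' escape. If the first
      // one already has four hex digits, unescape() consumes only those
      // four and leaves the rest of the digits as literal text.
      r += unescape('%u'
        + text.charCodeAt(i).toString(16)
        + text.charCodeAt(i + 1).toString(16));
      i++;
    } else {
      r += text.charAt(i);
    }
  }
  return r;
}

var input = '\u4E2D\u56FD\u4E0E\u4E16\u754C\u540C\u6B65';
console.log(pairUp(input));
```

Every second character is swallowed as the "partner" of the one before it and shows up as bare hex digits, and the last character has no partner at all, so `charCodeAt()` returns `NaN` and the text `NaN` is appended.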
 
Bart Van der Donck

Do you mean the charset of our webpage has nothing to do with the
encoding of the user's input?

Please see below.

Can you give me some links for reference?

http://www.khngai.com/chinese/charmap/tblgbk.php?page=0
http://en.wikipedia.org/wiki/UTF-8
http://en.wikipedia.org/wiki/GBK

i have tried your code,when i input:
....
output in the next textarea: .....
Thanks for your great work!

The left textarea needs to hold a UTF-8 string representing GBK
characters. It is not the intention to paste Chinese characters into
the left textarea, since the script does not accept those as encoded
input.

If a code point up to 127 is pasted, it is treated as ASCII. Any code
point from the 128-255 range must always appear in pairs, since the
script reads two such values as one character of the GBK table. Code
points above 255 (such as your Chinese input) may never be used in the
left textarea, as they cannot be part of such a pair. Please refer to
the specifications of UTF-8 and GBK to see how multibyte sequences are
used to represent one character.
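As a rough illustration of those multibyte sequences, the sketch below computes the UTF-8 bytes for a single code point by following the bit layout in the UTF-8 specification. The name `utf8Bytes` is hypothetical, and the function does not reject surrogate code points, which a strict encoder would.

```javascript
// Hypothetical helper: return the UTF-8 byte sequence for one code point.
function utf8Bytes(codePoint) {
  if (codePoint < 0x80) {
    return [codePoint];                          // 1 byte: ASCII
  }
  if (codePoint < 0x800) {
    return [0xC0 | (codePoint >> 6),             // 2 bytes
            0x80 | (codePoint & 0x3F)];
  }
  if (codePoint < 0x10000) {
    return [0xE0 | (codePoint >> 12),            // 3 bytes
            0x80 | ((codePoint >> 6) & 0x3F),
            0x80 | (codePoint & 0x3F)];
  }
  return [0xF0 | (codePoint >> 18),              // 4 bytes
          0x80 | ((codePoint >> 12) & 0x3F),
          0x80 | ((codePoint >> 6) & 0x3F),
          0x80 | (codePoint & 0x3F)];
}

// Format a byte list as upper-case hex for display.
function toHex(bytes) {
  return bytes.map(function (b) {
    return ('0' + b.toString(16).toUpperCase()).slice(-2);
  }).join(' ');
}

console.log(toHex(utf8Bytes(0x41)));   // ASCII letter
console.log(toHex(utf8Bytes(0x4E2D))); // a CJK character in the basic plane
```

Every byte of a multibyte sequence is in the 128-255 range. A Chinese character in the basic plane takes three bytes in UTF-8, whereas GBK stores it in two.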

If these conditions are not met for the left textarea, then the input
is not valid GBK as represented under UTF-8, and the outcome of the
conversion on the right side will be unreliable.
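The conditions above can be checked before converting. The following is a minimal sketch, assuming the rules exactly as stated in this post (values up to 127 stand alone, values 128-255 come in pairs, nothing above 255); `checkInput` is a hypothetical name and is not part of the posted page.

```javascript
// Hypothetical validator for the left textarea, following the rules
// stated in the post. Returns a list of problems; empty means the
// input meets the stated conditions.
function checkInput(text) {
  var problems = [];
  for (var i = 0; i < text.length; i++) {
    var code = text.charCodeAt(i);
    if (code > 255) {
      problems.push('position ' + i + ': code point ' + code + ' is above 255');
    } else if (code > 127) {
      // charCodeAt() returns NaN past the end of the string, and every
      // comparison with NaN is false, so a trailing value is reported.
      var next = text.charCodeAt(i + 1);
      if (next > 127 && next <= 255) {
        i++; // valid pair: skip the partner
      } else {
        problems.push('position ' + i + ': value ' + code
          + ' has no partner in the 128-255 range');
      }
    }
  }
  return problems;
}

console.log(checkInput('CD\u00BC\u00DC 15'));   // meets the conditions
console.log(checkInput('\u4E2D\u56FD'));        // pasted Chinese characters
```

Running the check first lets the page show a message instead of filling the right textarea with stray hex digits or `NaN`.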
 
