The nature of string and char[] in .NET

Lau Lei Cheong

Hello,

I'm trying to write a converter between Big5 and UTF-8,
but I want to confirm a few facts before I start.

1) I know that by default .NET stores strings as Unicode. Would there be
any problem if I stored Big5 characters in a string? Or could I set the
code page for an individual string?

2) There are basically three Unicode encoding schemes: UTF-7, UTF-8 and
UCS-2. Which one does the default Unicode setting refer to?

3) Same as 1), but this time for char[].

I'm asking because the web page I'm writing is in Unicode, but it
stores its data in a MySQL database that holds the data in Big5. We also
have a backend written in VB6 that would need to be almost completely
rewritten if we changed it to Unicode. So I plan to translate the data
immediately after reading it from the database, and again before writing
it back, so that no other existing part needs to change. I'm
using LibEx with MyODBC to access MySQL.

This post is crossposted to
microsoft.public.dotnet.internationalization. (The i18n group seems more
appropriate, but as I'm also asking how strings are stored in .NET
applications, I think it's also worth posting here.)

Any advice would be greatly appreciated, whether on the questions above or
on a better way to fetch the data so that no manual translation is needed. :)

Thanks in advance.

Regards,
Lau Lei Cheong
 
Natty Gur

Hi,

1) You can set the application encoding in web.config:

<globalization
requestEncoding="utf-8"
responseEncoding="utf-8"
/>

2) You can convert from one encoding to another with the
Encoding.Convert method:
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/cpref/html/frlrfSystemTextEncodingClassConvertTopic1.asp
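
As a rough illustration of what such a conversion does, here is a minimal sketch. It is written in Python rather than .NET, since the thread contains no code of its own, and the helper name `convert_encoding` is made up for this example. The idea is the same one behind Encoding.Convert(srcEncoding, dstEncoding, bytes): decode the source bytes into Unicode text, then encode that text in the target encoding.

```python
def convert_encoding(data: bytes, src: str, dst: str) -> bytes:
    """Re-encode bytes from the src codec to the dst codec.

    Hypothetical helper for illustration: it goes through a Unicode
    string, which is also how a .NET string sits between two encodings.
    """
    text = data.decode(src)   # bytes in the source encoding -> Unicode text
    return text.encode(dst)   # Unicode text -> bytes in the target encoding


# Sample data: the two characters of "中文" take 2 bytes each in Big5
# and 3 bytes each in UTF-8.
big5_bytes = "中文".encode("big5")
utf8_bytes = convert_encoding(big5_bytes, "big5", "utf-8")

print(len(big5_bytes), len(utf8_bytes))
```

The point of the sketch is that the conversion works on byte arrays; the string in the middle is always Unicode and carries no code page of its own.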

HTH

Natty Gur[MVP]

blog : http://weblogs.asp.net/ngur
Mobile: +972-(0)58-888377


 
Lau Lei Cheong

Natty Gur said:
> Hi,
>
> 1) you can set application encoding in web.config :
>
> <globalization
> requestEncoding="utf-8"
> responseEncoding="utf-8"
> />

Thanks for the information. :)

> 2) you can convert from one encoding to others by using
> Encoding.Convert Method :
> http://msdn.microsoft.com/library/default.asp?url=/library/en-us/cpref/html/frlrfSystemTextEncodingClassConvertTopic1.asp

Unfortunately, my attempt to follow this failed: the modified function
only writes question marks to the database.
It seems more changes are needed to convert multibyte characters.
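
The post does not say where the question marks come from, so the following is only one plausible cause, shown as a minimal Python sketch rather than .NET code. When text is encoded into a character set that cannot represent its characters, and the encoder is set to substitute rather than fail, each unmappable character becomes a single '?'. As far as I know, .NET encodings behave this way by default. The choice of Latin-1 as the "wrong" target here is purely an assumption for the demonstration.

```python
text = "中文"

# Encoding into a character set that lacks these characters, with
# replacement enabled, silently yields one '?' per unmappable character.
lossy = text.encode("latin-1", errors="replace")
print(lossy)

# Encoding into Big5 keeps the characters intact, two bytes each.
big5_bytes = text.encode("big5")
print(len(big5_bytes))
```

If something between the application and the database (for example a driver or connection code page) performs a step like the first one, the original characters are already lost by the time the data is stored, and no later conversion can bring them back.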

So far I've managed to convert a Unicode string to a byte array (the Big5
equivalent). Now I need to convert it back to a string. Any ideas?
Thanks a lot.
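
Going from a Big5 byte array back to a string is a decode step. A minimal sketch of the idea follows, in Python rather than .NET, with a made-up helper name `big5_bytes_to_text`. In .NET the corresponding call would be along the lines of Encoding.GetEncoding("big5").GetString(bytes), assuming the Big5 code page is available on the machine.

```python
def big5_bytes_to_text(data: bytes) -> str:
    """Decode Big5-encoded bytes into a Unicode string.

    Hypothetical helper for illustration. errors="strict" makes invalid
    input raise an exception instead of being silently replaced.
    """
    return data.decode("big5", errors="strict")


# Round trip: Unicode string -> Big5 bytes -> Unicode string.
original = "中文"
big5_bytes = original.encode("big5")
restored = big5_bytes_to_text(big5_bytes)

print(restored == original)
```

The decode must name the same encoding that produced the bytes; decoding Big5 bytes with any other encoding gives garbled text rather than an error in many cases.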
 
