A simple question - how to convert from UTF8 to wide char (wchar_t) on linux

Discussion in 'C Programming' started by uday.sen@gmail.com, Jun 6, 2006.

  1. Guest

    Hi,

    I need to convert a string from UTF-8 to wide characters (wchar_t *). On
    Windows I do this with:

    MultiByteToWideChar(CP_UTF8, 0, pInput, -1, pOutput, nLen);

    This API is not available on Linux. There is mbstowcs(), which converts
    a multibyte string to wide characters, but will it convert a UTF-8
    encoded string, or does it handle *only ASCII* characters?

    There is also iconv(), which converts between character sets using a
    conversion descriptor returned by iconv_open(). For
    iconv_open(char *toCode, char *fromCode), what should the "toCode" and
    "fromCode" values be? I think "toCode" will be UTF8, but what would
    "fromCode" be?


    Thanks and regards,
    - Uday
    , Jun 6, 2006
    #1
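On the mbstowcs() question above: mbstowcs() decodes whatever multibyte encoding the current LC_CTYPE locale specifies, so it is not limited to ASCII, but it only understands UTF-8 while a UTF-8 locale is active. The following is a minimal sketch under that assumption. The helper name utf8_to_wide_mbstowcs and the locale names it tries are illustrative choices, not something from the thread, and which locale names exist varies from system to system.

```c
#include <assert.h>
#include <locale.h>
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper: convert NUL-terminated UTF-8 text with mbstowcs().
 * dst has room for dst_len wide characters, terminator included.
 * Returns the number of wide characters written (terminator excluded),
 * or (size_t)-1 on failure.
 * Caveat: setlocale() changes LC_CTYPE for the whole process. */
size_t utf8_to_wide_mbstowcs(const char *src, wchar_t *dst, size_t dst_len)
{
    /* Assumed locale names; adjust to what the target system provides. */
    static const char *const candidates[] = {
        "C.UTF-8", "en_US.UTF-8", "en_US.utf8"
    };
    const char *chosen = NULL;

    assert(src != NULL && dst != NULL);

    for (size_t i = 0; i < sizeof candidates / sizeof candidates[0]; i++) {
        chosen = setlocale(LC_CTYPE, candidates[i]);
        if (chosen != NULL)
            break;
    }
    if (chosen == NULL)
        return (size_t)-1;      /* no UTF-8 locale could be selected */

    size_t n = mbstowcs(dst, src, dst_len);
    if (n == (size_t)-1)
        return (size_t)-1;      /* invalid multibyte sequence in src */
    if (n == dst_len)
        return (size_t)-1;      /* buffer filled, no room for L'\0' */
    return n;
}
```

Because the result depends on process-wide locale state, this approach suits programs that already run in a UTF-8 locale; iconv() avoids that dependency.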

  2. Re: A simple question - how to convert from UTF8 to wide char (wchar_t)on linux

    Your fromCode is UTF8 and your toCode is WCHAR_T.

    Robert
    Robert Harris, Jun 6, 2006
    #2
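The answer above can be sketched as follows. This is a hedged example, not a definitive implementation: the helper name utf8_to_wide_iconv is hypothetical, and it assumes an iconv implementation that accepts the encoding name "WCHAR_T" (glibc does; 'iconv --list' shows what a given system supports). Note that iconv_open() takes the destination encoding first, and that iconv() takes pointers to the buffer pointers and to the byte counts.

```c
#include <assert.h>
#include <iconv.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: convert NUL-terminated UTF-8 text into dst, which
 * has room for dst_len wide characters, terminator included.
 * Returns the number of wide characters written (terminator excluded),
 * or (size_t)-1 on error. */
size_t utf8_to_wide_iconv(const char *src, wchar_t *dst, size_t dst_len)
{
    assert(src != NULL && dst != NULL);
    if (dst_len == 0)
        return (size_t)-1;

    /* Argument order: tocode first, fromcode second. */
    iconv_t cd = iconv_open("WCHAR_T", "UTF-8");
    if (cd == (iconv_t)-1)
        return (size_t)-1;      /* conversion not supported here */

    char *in = (char *)src;     /* iconv() wants char **; input is only read */
    char *out = (char *)dst;
    size_t in_left = strlen(src);
    size_t out_left = (dst_len - 1) * sizeof(wchar_t);  /* counts are in bytes */

    size_t rc = iconv(cd, &in, &in_left, &out, &out_left);
    iconv_close(cd);
    if (rc == (size_t)-1)
        return (size_t)-1;      /* errno is EILSEQ, EINVAL or E2BIG */

    /* Wide characters written = capacity minus the space left over. */
    size_t written = (dst_len - 1) - out_left / sizeof(wchar_t);
    dst[written] = L'\0';
    return written;
}
```

A call such as utf8_to_wide_iconv(pInput, pOutput, nLen) then plays roughly the role of the MultiByteToWideChar() call in the question, with nLen counted in wide characters.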

  3. ithink Guest

    You need to set tocode to "UTF-8" and fromcode to "WCHAR_T"
    To get the complete list of valid tocode and fromcode names, run 'iconv --list'.

    ithink, Jun 6, 2006
    #3
  4. ithink Guest

    Oops, sorry for the careless swap in my previous post.

    Set fromcode to "UTF-8" and tocode to "WCHAR_T"

    ithink, Jun 6, 2006
    #4
  5. Guest

    Thanks a lot for your prompt help. I have the following two questions:

    1. Is "WCHAR_T" platform independent? Going forward I plan to
    deploy this code on Solaris 10.
    2. When converting with iconv(), how can I determine the size required
    for the output string? The iconv(3) man page does not say anything
    about it.
    size_t retval = iconv(cd, pInput, wcslen(pInput), pOutput,
    ???);
    Do I have to calculate it myself, or is there a platform API that
    I can use for this?

    Thanks again,
    - Uday
    , Jun 6, 2006
    #5
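On the sizing question: one simple bound is that every wide character produced consumes at least one byte of UTF-8 input, so a buffer of strlen(input) + 1 wide characters is always large enough. The sketch below allocates on that basis. The helper name utf8_to_wide_alloc is hypothetical, and the code again assumes that the local iconv accepts "WCHAR_T"; whether Solaris 10 does is not confirmed in this thread and should be checked against that system's list of supported conversions.

```c
#include <assert.h>
#include <iconv.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: returns a malloc()ed, L'\0'-terminated wide string
 * converted from NUL-terminated UTF-8, or NULL on error.
 * The caller frees the result with free(). */
wchar_t *utf8_to_wide_alloc(const char *src)
{
    assert(src != NULL);

    size_t in_left = strlen(src);       /* input length in bytes, not wcslen() */
    size_t cap = in_left + 1;           /* worst case: one wide char per byte */
    wchar_t *buf = malloc(cap * sizeof *buf);
    if (buf == NULL)
        return NULL;

    iconv_t cd = iconv_open("WCHAR_T", "UTF-8");
    if (cd == (iconv_t)-1) {
        free(buf);
        return NULL;
    }

    char *in = (char *)src;
    char *out = (char *)buf;
    size_t out_left = (cap - 1) * sizeof(wchar_t);  /* output space in bytes */

    size_t rc = iconv(cd, &in, &in_left, &out, &out_left);
    iconv_close(cd);
    if (rc == (size_t)-1) {             /* e.g. EILSEQ for malformed UTF-8 */
        free(buf);
        return NULL;
    }

    size_t written = (cap - 1) - out_left / sizeof(wchar_t);
    buf[written] = L'\0';
    return buf;
}
```

The bound over-allocates for non-ASCII text. An alternative design is to start with a smaller buffer and grow it whenever iconv() fails with errno set to E2BIG, at the cost of a retry loop.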