get wide character and multibyte character value

Discussion in 'C++' started by George2, Jan 24, 2008.

  1. George2

    George2 Guest

    Hello everyone,


    I need to know the wide character (Unicode) and multibyte (UTF-8)
    values of a Czech character string. I personally know nothing about
    Czech. Is the following approach correct?

    1. I put the L prefix on the string literal and inspect memory to get
    the wide character representation of the string in little-endian
    form;

    2. I change the computer's region/language to Czech, then call
    WideCharToMultiByte with CP_ACP as the code page and the L string as
    input; the multibyte output comes back through the lpMultiByteStr
    parameter.

    Are (1) and (2) correct? Are there smarter, more efficient ways?


    thanks in advance,
    George
     
    George2, Jan 24, 2008
    #1

  2. George2

    Daniel T. Guest

    George2 wrote:

    > I need to know the wide character (Unicode) and multibyte (UTF-8)
    > values of a Czech character string. I personally know nothing about
    > Czech. Is the following approach correct?
    >
    > 1. I put the L prefix on the string literal and inspect memory to get
    > the wide character representation of the string in little-endian
    > form;


    I don't think that would work. The C++ compilers that I have used don't
    handle Unicode source files well.

    > 2. I change the computer's region/language to Czech, then call
    > WideCharToMultiByte with CP_ACP as the code page and the L string as
    > input; the multibyte output comes back through the lpMultiByteStr
    > parameter.
    >
    > Are (1) and (2) correct? Are there smarter, more efficient ways?


    If you are using Windows there is the "character map" program, on a Mac
    go to the Edit menu and select "Special Characters". Or you could simply
    go to http://unicode.org/charts/. Czech uses the Cyrillic alphabet,
    doesn't it?
     
    Daniel T., Jan 24, 2008
    #2

  3. George2

    James Kanze Guest

    On Jan 24, 8:12 am, George2 wrote:

    > I need to know the wide character (Unicode) and multibyte (UTF-8)
    > values of a Czech character string. I personally know nothing about
    > Czech. Is the following approach correct?


    > 1. I put the L prefix on the string literal and inspect memory to get
    > the wide character representation of the string in little-endian
    > form;


    (Just a nit, using wchar_t avoids any question of endianness.)
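
    A minimal sketch of that point, assuming a compiler whose wchar_t
    holds UTF-16 or UTF-32 code units (true of the common ones, but not
    guaranteed by the standard): print each wchar_t as a number instead
    of reading raw bytes from memory, and byte order never comes up. The
    helper name wide_units_hex is made up for illustration, and the \u
    escapes keep the source file pure ASCII, so the encoding of the
    source file does not matter either.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper: format every wchar_t element as hex, space separated.
// The values are read as integers, so the byte order in memory is irrelevant.
std::string wide_units_hex(const std::wstring& s) {
    std::string out;
    for (wchar_t wc : s) {
        char buf[24];
        std::snprintf(buf, sizeof buf, "%04lX",
                      static_cast<unsigned long>(wc));
        if (!out.empty()) out += ' ';
        out += buf;
    }
    return out;
}

// Usage sketch: a Czech word (c-caron, e, s-caron, t, i, n, a) written with
// universal character names for the two accented letters, U+010D and U+0161.
void print_example() {
    const std::wstring word = L"\u010De\u0161tina";
    assert(word.size() == 7);  // every character here is in the BMP
    std::printf("%s\n", wide_units_hex(word).c_str());
}
```

    For characters in the Basic Multilingual Plane, which covers all
    Czech letters, each element printed this way is the Unicode code
    point itself on such implementations.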

    > 2. I change the computer's region/language to Czech, then call
    > WideCharToMultiByte with CP_ACP as the code page and the L string as
    > input; the multibyte output comes back through the lpMultiByteStr
    > parameter.


    (Just a nit, but there isn't any function WideCharToMultiByte in
    C++. Most of the rest of this paragraph doesn't make much sense
    to me either.)

    > Are (1) and (2) correct? Are there smarter, more efficient ways?


    Neither is correct if you want to know the Unicode encodings.
    Both depend heavily on the vagaries of the implementation.

    What's wrong with just looking the information up in the code
    tables at the Unicode site? (Note that there won't be just one
    encoding---the actual values will depend on the canonical form
    being used.)
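
    To make the parenthetical concrete, here is a small sketch (the
    names encode_utf8 and utf8_hex are invented for this example) that
    takes code points as listed in the Unicode charts and computes their
    UTF-8 bytes by hand. It assumes nothing about locale or code page.
    For the Czech letter c with caron, the precomposed form is U+010D
    and the decomposed form is U+0063 followed by the combining caron
    U+030C, so the two canonical forms give different byte sequences.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Encode a single Unicode code point as UTF-8 (1 to 4 bytes).
std::string encode_utf8(char32_t cp) {
    // Reject values outside Unicode and the surrogate range.
    assert(cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF));
    std::string out;
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}

// Encode a sequence of code points and list the resulting bytes in hex.
std::string utf8_hex(const std::u32string& code_points) {
    std::string out;
    for (char32_t cp : code_points) {
        for (unsigned char byte : encode_utf8(cp)) {
            char buf[4];
            std::snprintf(buf, sizeof buf, "%02X",
                          static_cast<unsigned>(byte));
            if (!out.empty()) out += ' ';
            out += buf;
        }
    }
    return out;
}
```

    With these definitions, utf8_hex(U"\u010D") gives "C4 8D" for the
    precomposed letter, while utf8_hex(U"c\u030C") gives "63 CC 8C" for
    the decomposed one. Both display as the same letter.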

    --
    James Kanze (GABI Software)
    Conseils en informatique orientée objet/
    Beratung in objektorientierter Datenverarbeitung
    9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
     
    James Kanze, Jan 25, 2008
    #3