Transcode Japanese??

Discussion in 'XML' started by Robert M. Gary, Apr 19, 2005.

  1. I'm on a Solaris 9 Japanese machine with an Ultra 5 SPARC CPU. I'm using
    the Xerces 2.6 DOM.

    I've got a document in UTF-8 format..
    <?xml version="1.0" encoding="UTF-8"?>
    <Name>ja_alert-\343\201\250\343\201\241\343\201\244\343\201\252\343\201\256\343\201\253</Name>
    (I'm not sure if the Japanese came out right here, but everything after
    ja_alert- is UTF-8 for Japanese).

    When I extract the text element I get an XMLCh* that claims to be 15 chars
    long. However, when I get a char* from it, all the Japanese is truncated and
    it comes out only 9 chars long.

    char * value = XMLString::transcode( pNode->getNodeValue() );
    cout<<"original length is "<<strlen( value )<<endl;
    cout<<"Its a text named "<<XMLString::transcode(pNode->getNodeName())
    <<" value "
    <<XMLString::transcode(pNode->getNodeValue())
    <<" size is "<<XMLString::stringLen( pNode->getNodeValue())
    <<endl;

    I get back...

    original length is 9
    Its a text named #text value ja_alert- size is 15

    (notice the Japanese is gone).
    My locale looks like...
    => locale
    LANG=ja
    LC_CTYPE="ja"
    LC_NUMERIC="ja"
    LC_TIME="ja"
    LC_COLLATE="ja"
    LC_MONETARY="ja"
    LC_MESSAGES="ja"
    LC_ALL=


    Do I need to do something to tell the transcoder what encoding to transcode
    to?

    -Robert
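
As far as the Xerces-C documentation describes it, XMLString::transcode converts to the local code page, so the result depends on the platform and locale rather than on the encoding declared in the document. One way to take the locale out of the picture is to produce UTF-8 directly from the UTF-16 code units. The following is only a minimal sketch in plain C++ and not Xerces-C API: `utf16ToUtf8` is a hypothetical helper, and it assumes that XMLCh is a 16-bit UTF-16 code unit on the platform so the data can be copied into a std::u16string.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not part of Xerces-C): encode UTF-16 code units as UTF-8.
// Unpaired surrogates are replaced with U+FFFD.
std::string utf16ToUtf8(const std::u16string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        char32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF) {
            // High surrogate: combine with the following low surrogate if present.
            if (i + 1 < in.size() && in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
                ++i;
            } else {
                cp = 0xFFFD;
            }
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            cp = 0xFFFD;  // low surrogate without a preceding high surrogate
        }

        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}

// Usage check with the string from the post: 15 code units in, 27 bytes out.
void checkPostString() {
    const std::u16string name = u"ja_alert-\u3068\u3061\u3064\u306A\u306E\u306B";
    const std::string utf8 = utf16ToUtf8(name);
    assert(name.size() == 15);
    assert(utf8.size() == 27);
}
```

With this approach the 15 code units reported by XMLString::stringLen would become 27 bytes (9 ASCII bytes plus 6 characters of 3 bytes each), so a strlen of 9 suggests the conversion gave up at the first non-ASCII character.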
     
    Robert M. Gary, Apr 19, 2005
    #1

  2. commercial (Guest)

    Opinion 1:

    If you "overwrite" the load/save nomenclature for characters,

    you will render the contentType useless.

    It will not render.

    ======

    Opinion 2:

    If this garbage of triple-digit backslash sequences
    was supposed to be ANY Unicode or Japanese,

    then I live on the moon.
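
For reference, the octal escapes in the original post are well-formed UTF-8: each group of three bytes decodes to one hiragana character (U+3068, U+3061, U+3064, U+306A, U+306E, U+306B), which together with the 9 ASCII characters of "ja_alert-" gives the 15 characters the XMLCh* reported. The following is a minimal sketch in plain C++ that checks this. `decodeUtf8` and `kNameFromPost` are illustrative names, not Xerces-C API, and the decoder assumes well-formed input and does no validation.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// The element content from the original post, escapes written out.
const std::string kNameFromPost =
    "ja_alert-"
    "\343\201\250" "\343\201\241" "\343\201\244"
    "\343\201\252" "\343\201\256" "\343\201\253";

// Decode UTF-8 into Unicode code points.
// Assumes well-formed input; malformed sequences are not detected.
std::u32string decodeUtf8(const std::string& in) {
    std::u32string out;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char lead = static_cast<unsigned char>(in[i]);
        // Number of continuation bytes implied by the lead byte.
        const std::size_t extra = lead < 0x80 ? 0 : lead < 0xE0 ? 1 : lead < 0xF0 ? 2 : 3;
        // Payload bits of the lead byte: 7, 5, 4, or 3 bits.
        char32_t cp = extra == 0 ? lead : lead & (0x3F >> extra);
        for (std::size_t k = 1; k <= extra && i + k < in.size(); ++k) {
            cp = (cp << 6) | (static_cast<unsigned char>(in[i + k]) & 0x3F);
        }
        out += cp;
        i += extra + 1;
    }
    return out;
}

// Usage check: 27 bytes decode to 15 code points, the first Japanese one being U+3068.
void checkDecodedPost() {
    const std::u32string decoded = decodeUtf8(kNameFromPost);
    assert(kNameFromPost.size() == 27);
    assert(decoded.size() == 15);
    assert(decoded[9] == 0x3068);
}
```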

     
    commercial, Apr 19, 2005
    #2

  3. On Mon, 18 Apr 2005, Robert M. Gary wrote:

    > X-Newsreader: Microsoft Outlook Express 6.00.2900.2180
    >
    > <Name>ja_alert-\343\201\250\343\201\241\343\201\244\343\201\252\343\201\256\343\201\253</Name>
    > (I'm not sure if the Japanese came out right here, but everything after
    > ja_alert- is UTF-8 for Japanese).


    If you continue to use this newsreader surrogate instead of an
    actual newsreader, you need to select

    Tools > Options > Send
    Mail Sending Format > Plain Text Settings > Message format MIME
    News Sending Format > Plain Text Settings > Message format MIME
    Encode text using: None

    in order to transmit special, non-ASCII characters.

    --
    Top-posting.
    What's the most irritating thing on Usenet?
     
    Andreas Prilop, Apr 19, 2005
    #3