How to display/input/write Chinese text in Java

Discussion in 'Java' started by Buddha, Feb 20, 2008.

  1. Buddha

    Buddha Guest

    Dear All,

    I am trying to make a very simple java program, where I am trying to
    display Chinese characters. I am trying to save them into a file (for
    now, later into db2). However, I seem to make no progress at all.
    I am completely lost with this one. I have googled and gone through a
    lot of sites like :

    http://www.mandarintools.com/javaconverter.html
    http://www.chinesecomputing.com/programming/java.html
    http://forum.java.sun.com/thread.jspa?threadID=442220&messageID=1995079
    http://www.linuxforum.net/chinese/develop/java.html
    http://java.sun.com/docs/books/tutorial/i18n/locale/services.html

    I have downloaded cyberbit.ttf into /jre/lib/fonts and updated
    fonts.properties as well (I don't really know if I need to do that,
    though).
    I have tried compiling with two different encoding options, Big5 and
    GB2312.
    I have also picked up a few Unicode code points from a site and tried
    them, for example:

    String s = "\U+0061";


    but this fails to compile with an "illegal escape character" error.

    All I am trying to do is use a String reference, feed it with Chinese
    text, which again I copied from a site, and then display it on the
    console. Or even write it to a file.
    I am lost in the maze of Unicode, UTF-8, etc.


    Later on :
    I tried to actually start working on the web app where this change is
    intended.
    I copied a Chinese string from a website (pasted it into Word, and it
    pasted fine).
    I pasted it into the text box and saved this field. The value appears
    as gibberish in DB2 (I am sure the tables aren't set up for UTF-8).
    My app doesn't set any contentType or characterEncoding either.
    Then when I retrieve this value, it comes back as the exact same
    gibberish as in the DB.
    But, behold, when I change the browser encoding (View -> Encoding ->
    Chinese (GB2312)), it displays the exact string that was inserted :)
    Then I added these lines to my JSP:

    <%@ page contentType="text/html; charset=GB2312" %>
    request.setCharacterEncoding("GB2312");

    ..
    so that I can make Chinese (GB2312) the default option. It does
    that (the default was Western European (ISO)).
    However, it now displays question marks (????) in that field.

    What options can I have ?
    I would be really glad if someone could lead me out of this.

    Thanks in advance.

    Rgds,

    This is a crosspost, obviously, because I did not receive help
    elsewhere.
     
    Buddha, Feb 20, 2008
    #1

  2. Buddha

    Mark Space Guest

    Buddha wrote:

    > String s = "\U+0061";


    I think "\u0061" is the correct syntax (lowercase u, no plus sign).
    However, this is a plain Latin letter, yes? Is that what you want?
    (It is an "a".)


    > <%@ page contentType="text/html; charset=GB2312" %>
    > request.setCharacterEncoding("GB2312");


    I am by no means an expert, but is your page really GB2312? Aren't you
    using a regular text editor? Does your own system use GB2312? Because I
    think that's what you are saying here.

    request.setCharacterEncoding("GB2312") will set the request but... is
    the request in GB2312 format? Can you show us the settings before
    setting this? Did you set your browser to request that character set?
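
    To illustrate why the declared encoding has to match the bytes that
    actually arrive, here is a minimal sketch. It is not a diagnosis of the
    application above. The class and method names are made up for
    illustration, and UTF-8 stands in for GB2312 only because every JVM is
    guaranteed to ship it. Decoding with the wrong charset gives gibberish
    that can still be repaired, while encoding into a charset that has no
    Chinese characters replaces them with "?" for good.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; UTF-8 is an assumption standing in for GB2312.
class CharsetMismatch {

    // Bytes produced in one charset but decoded as ISO-8859-1:
    // the result is gibberish, yet no information is lost.
    static String misdecode(String text) {
        byte[] sent = text.getBytes(StandardCharsets.UTF_8);
        return new String(sent, StandardCharsets.ISO_8859_1);
    }

    // Recover the original by getting the raw bytes back and decoding
    // them with the charset that was really used.
    static String repair(String garbled) {
        byte[] raw = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    // Encoding Chinese text into ISO-8859-1 substitutes '?' for every
    // character the charset cannot represent; this cannot be undone.
    static String lossyEncode(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.ISO_8859_1);
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String chinese = "\u4e2d\u6587"; // two Chinese characters
        String garbled = misdecode(chinese);
        System.out.println(garbled.length());               // one char per byte
        System.out.println(repair(garbled).equals(chinese)); // repairable
        System.out.println(lossyEncode(chinese));            // question marks
    }
}
```

    If the second effect is what is happening, some step between the
    browser and the database is converting the text through a charset
    that cannot hold it.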

    You might want to check out the <fmt> stuff in the JSTL. It provides
    localization in JSPs. You still have to provide resource bundles.

    Don't forget that the locale comes from the user and the browser, you
    don't set it. You follow what the user asks for.
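
    As a rough sketch of "following what the user asks for": the browser
    states its language preferences in the Accept-Language header, and
    the standard library can match that against what the application
    supports. The class name, the supported list and the English fallback
    below are assumptions for illustration, not anything from the
    application in question.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical helper; SUPPORTED and the fallback are illustrative choices.
class LocaleNegotiation {

    // Locales this imaginary application can serve.
    static final List<Locale> SUPPORTED =
            Arrays.asList(Locale.ENGLISH, Locale.SIMPLIFIED_CHINESE);

    // Pick the best supported locale for an Accept-Language header value,
    // falling back to English when nothing matches.
    static Locale choose(String acceptLanguage) {
        List<Locale.LanguageRange> ranges = Locale.LanguageRange.parse(acceptLanguage);
        Locale best = Locale.lookup(ranges, SUPPORTED);
        return best != null ? best : Locale.ENGLISH;
    }

    public static void main(String[] args) {
        System.out.println(choose("zh-CN,zh;q=0.9,en;q=0.8").toLanguageTag());
        System.out.println(choose("fr-FR,fr;q=0.9").toLanguageTag());
    }
}
```

    Note that the locale only decides which language the page is shown
    in. It is separate from the character encoding used to carry the text.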
     
    Mark Space, Feb 20, 2008
    #2

  3. Buddha

    Buddha Guest

    Hi,
    First, thanks for the quick reply.

    > yes?  Is that what you want? (or is it an "a"?)

    I don't know, honestly. I just got this off a Chinese site.


    > I am by no means an expert, but is your page really GB2312?  Aren't you
    > using a regular text editor?  Does your own system use GB2312? Because I
    > think that's what you are saying here.

    Again, I am not sure. I read about this on a site, and this is what
    is supposed to be done when dealing with text in another language
    (Chinese).

    > You might want to check out the <fmt> stuff in the JSTL. It provides
    > localization in JSPs. You still have to provide resource bundles.
    >
    > Don't forget that the locale comes from the user and the browser, you
    > don't set it. You follow what the user asks for.


    My application is only in English. It doesn't really use
    internationalization.
    Now, in short: all I need to do is allow my users, irrespective of
    which country the site is opened in, to enter Chinese text.
    I am guessing they will have Chinese keyboards and a way to enter
    Chinese text.
    So all I need to concentrate on is entering some Chinese words/
    characters(?) (which I copied from a site, because I don't know
    Chinese), inserting them in the text box, and hitting the save button.
    The rest is as mentioned above:
    **********************
    The value appears
    as gibberish in DB2 (I am sure the tables aren't set up for UTF-8).
    My app doesn't set any contentType or characterEncoding either.
    Then when I retrieve this value, it comes back as the exact same
    gibberish as in the DB.
    But, behold, when I change the browser encoding (View -> Encoding ->
    Chinese (GB2312)), it displays the exact string that was inserted :)
    **********************
    This is a very old application and there are no
    <%@ page contentType="text/html; charset=GB2312" %> tags anywhere. I
    just inserted them to check this functionality.

    So basically, what I am looking to do is: enter Chinese text and, upon
    retrieval of that record, display it.
    And I am clueless about how to go about it.

    Any help appreciated.

    TIA
     
    Buddha, Feb 20, 2008
    #3
  4. Buddha

    Mark Space Guest

    Buddha wrote:

    > Any help appreciated.


    Well like I say I'm not an expert, but try going here:

    http://www.javapassion.com/j2ee/#JSTL

    Read the PDF files first, paying attention to the i18n stuff mostly.
    Note there are two ways described that fit into J2EE architecture. One
    is for the browser to send a request, the second is for the user to log
    in and configure their preferences.

    I think just putting tags everywhere won't work, because obviously you
    have people browsing with other systems, like English.

    The lab document takes you through some steps to see examples of i18n
    and how they work in J2EE. It's pretty valuable I think.

    Then back up on the lessons there, and check out the NetBeans IDE. It
    has an HTTP monitor tool that allows you to see what is actually being
    requested. (The lab document talks about this.) I think you are going
    to need this to figure out what is really going on.

    That's about all I can say, because I'd really need to see how the
    requests are being sent. You'll likely need to learn how to set your
    browser to request GB2312 so you can test it. And even better would be
    an automated test suite that makes both Latin and GB2312 requests
    automatically so you don't have to wear your fingers out testing. Just
    something to think about.
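
    A minimal sketch of one building block for such a test: encode the
    same form field in different charsets and check whether it survives
    a decode on the other side. The class name, the field name and the
    choice of charsets are assumptions for illustration. "GB2312" can be
    passed as the charset name on JREs that ship it, which is not
    guaranteed everywhere.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical test helper for form-encoded request bodies.
class FormEncodingCheck {

    // Build "name=value" the way a browser would for the given charset.
    static String encodeField(String name, String value, String charsetName) {
        try {
            return URLEncoder.encode(name, charsetName) + "="
                    + URLEncoder.encode(value, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("charset not available: " + charsetName, e);
        }
    }

    // Decode a percent-encoded value the way the server would.
    static String decodeValue(String encoded, String charsetName) {
        try {
            return URLDecoder.decode(encoded, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("charset not available: " + charsetName, e);
        }
    }

    // True when the value comes back unchanged after encode + decode.
    static boolean survivesRoundTrip(String value, String charsetName) {
        String field = encodeField("comment", value, charsetName);
        String encodedValue = field.substring(field.indexOf('=') + 1);
        return decodeValue(encodedValue, charsetName).equals(value);
    }

    public static void main(String[] args) {
        String chinese = "\u4e2d\u6587";
        System.out.println(encodeField("comment", chinese, "UTF-8"));
        System.out.println(survivesRoundTrip(chinese, "UTF-8"));
        System.out.println(survivesRoundTrip(chinese, "ISO-8859-1"));
    }
}
```

    A real suite would send these bodies to the application over HTTP
    and compare what comes back. This sketch only covers the encoding
    step.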
     
    Mark Space, Feb 20, 2008
    #4
  5. Buddha

    Roedy Green Guest

    On Tue, 19 Feb 2008 21:22:25 -0800 (PST), Buddha wrote, quoted or
    indirectly quoted someone who said:

    >I am trying to make a very simple java program, where I am trying to
    >display Chinese characters. I am trying to save them into a file (for
    >now, later into db2). However, I seem to make no progress at all.
    >I am completely lost with this one. I have googled and gone through a
    >lot of sites like :


    Have a look at the source code for fontshower and fontshowerawt.
    They each display a few Chinese characters. Perhaps you could tell me
    which ones I chose. I picked something that had visual appeal and
    that looked "Chinese". I hope they don't have some peculiar meaning.

    see http://mindprod.com/applet/fontshower.html
    http://mindprod.com/jgloss/fontshowerawt.html

    That will show you how to write Chinese in a program in a number of
    ways. Your big problem was that you did not know how to write Unicode
    literals.

    See http://mindprod.com/jgloss/literal.htm

    You want "\u3302\u4e02", with a lowercase u and no plus signs.
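
    A small sketch of the escape syntax, in case it helps. The class name
    and the particular characters are my own picks for illustration. It
    prints code points in hex rather than the characters themselves, so
    the result does not depend on the console's encoding.

```java
// Hypothetical demo class for Unicode escapes in string literals.
class UnicodeEscapes {

    // Lowercase u followed by exactly four hex digits, no plus sign.
    // These two code points are the characters for "Chinese (language)".
    static final String CHINESE = "\u4e2d\u6587";

    // The code point 0061 from the original question is plain Latin 'a'.
    static final String LATIN_A = "\u0061";

    public static void main(String[] args) {
        System.out.println(CHINESE.length());                        // number of chars
        System.out.println(Integer.toHexString(CHINESE.charAt(0)));  // first code point
        System.out.println(Integer.toHexString(CHINESE.charAt(1)));  // second code point
        System.out.println(LATIN_A.equals("a"));
    }
}
```

    The escapes are translated by the compiler, so the source file
    itself can stay pure ASCII whatever encoding your editor uses.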

    If you have a file in Chinese, you next need to figure out what
    encoding it is. See http://mindprod.com/applet/officialencoding.html
    to help you guess.

    Once you know, you can write a program to read the file, decoding it.
    See http://mindprod.com/jgloss/encoding.html
    http://mindprod.com/applet/fileio.html
    for details.
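
    Roughly, reading and writing with an explicit encoding looks like the
    sketch below. The class and method names are made up, and UTF-8 is
    used only because it is always available. Pass whichever Charset the
    file really uses, for example Charset.forName("GB2312") if your JRE
    provides it.

```java
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper showing file I/O with a named encoding.
class EncodedFileIo {

    // Write text to a file, encoding the characters explicitly.
    static void write(Path file, String text, Charset encoding) throws IOException {
        try (Writer out = new OutputStreamWriter(Files.newOutputStream(file), encoding)) {
            out.write(text);
        }
    }

    // Read a whole file, decoding its bytes with the given encoding.
    static String read(Path file, Charset encoding) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (Reader in = new InputStreamReader(Files.newInputStream(file), encoding)) {
            char[] buf = new char[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        }
        return sb.toString();
    }

    // Write to a temporary file with one charset and read back with another.
    static String roundTrip(String text, Charset writeAs, Charset readAs) {
        try {
            Path file = Files.createTempFile("chinese", ".txt");
            try {
                write(file, text, writeAs);
                return read(file, readAs);
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String chinese = "\u4e2d\u6587";
        Charset utf8 = StandardCharsets.UTF_8;
        // Same charset on both sides: the text comes back intact.
        System.out.println(roundTrip(chinese, utf8, utf8).equals(chinese));
        // Wrong charset when reading: the text comes back garbled.
        System.out.println(roundTrip(chinese, utf8, StandardCharsets.ISO_8859_1).equals(chinese));
    }
}
```

    Never rely on the platform default encoding for Chinese text. It
    differs from machine to machine.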

    --

    Roedy Green Canadian Mind Products
    The Java Glossary
    http://mindprod.com
     
    Roedy Green, Feb 21, 2008
    #5
  6. Buddha

    Buddha Guest

    Thank you so much, Roedy and Mark.
    Mark, I don't really have the luxury of using JSTL. Moreover, the
    ONLY thing I need to do is allow the user to ENTER Chinese in the
    text boxes and display it when retrieved. The application that I am
    maintaining went live in '97!!

    Roedy, thanks for all the help. Those links mostly deal with AWT and
    applets. I am working only in JSPs and, as I said before, only need
    to be able to enter and display Chinese (in the text boxes).

    I shall get back to you with whatever I plan to do. In the meantime,
    if you have more info, kindly keep it coming :)

    thanks
    Buddha
     
    Buddha, Feb 21, 2008
    #6
  7. Buddha

    Lew Guest

    Buddha wrote:
    > Thank you so much Roedy and Mark.
    > Mark, I dont really have the luxury of using JSTL.Moreover what I
    > ONLY need to do is allow the user to ENTER Chinese in the text boxes,
    > and display it when retrieved. The application that I am maintaining
    > went live in 97 !!


    Side note: How is it that JSTL is a "luxury" for you? Would you please
    elucidate what makes it unavailable for you?

    JSPs weren't available in 1997. There must be some path to upgrade the
    application platform. What Java version does the application use?

    Using JSTL should be just a matter of dropping the correct JARs in the
    application lib path.

    --
    Lew
     
    Lew, Feb 21, 2008
    #7