Data entry in foreign languages

Discussion in 'Java' started by Roedy Green, Jan 26, 2006.

  1. Roedy Green

    Roedy Green Guest

    To what extent can you ignore the foreign language problem for
    entering data into Java?

    Do the OS keyboard drivers and Unicode handle everything?

    What about Hebrew, right to left. Do the Strings read right to left
    too?

    What about Arabic, which has 2D placement and all kinds of special
    forms for the glyphs? Do the fonts contain enough information that you
    just string the Unicode chars together and it renders plausibly?

    Has anyone any experience with languages that don't use the Roman
    alphabet?

    If something strange is required, how the heck do you write code
    without being able to tell if the results are correct? Are there some
    test strings?

    --
    Canadian Mind Products, Roedy Green.
    http://mindprod.com Java custom programming, consulting and coaching.
     
    Roedy Green, Jan 26, 2006
    #1

  2. I have experience writing programs that accept Chinese variants
    and Japanese characters.

    There are ample example strings and web tools for the variants of
    Unicode. One way to handle input is to copy and paste examples from
    web pages into web form inputs and have those feed into Java servlets.
    Another way is to save examples to a file, read the bytes into Java
    Strings, and then display them with Graphics.drawString.

    This has worked very smoothly for me.
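
    As a rough illustration of the second approach (a sketch only, not
    the poster's actual code; the class name DecodeSample is hypothetical),
    the bytes can be decoded with an explicit charset rather than the
    platform default. The sample bytes are assumed to be UTF-8:

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class; the name is illustrative only.
class DecodeSample {
    // Decode raw bytes as UTF-8 instead of relying on the platform default charset.
    static String decodeUtf8(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // UTF-8 bytes for the two characters U+65E5 and U+672C.
        byte[] raw = {
            (byte) 0xE6, (byte) 0x97, (byte) 0xA5,
            (byte) 0xE6, (byte) 0x9C, (byte) 0xAC
        };
        String text = decodeUtf8(raw);
        // Six bytes decode to two chars.
        System.out.println(text.length());
        System.out.println(Integer.toHexString(text.charAt(0)));
    }
}
```

    If the saved example is in another encoding (Shift_JIS, Big5, and so
    on), the charset passed to the String constructor has to match it.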

    The only gotcha I ran into was when trying to pass parameters into
    Java's main from the environment -- this is specified as non-standard,
    but I like using it. The problem is that main(String a[]) does not
    specify how bytes passed in from the environment are interpreted. I
    suppose the assumption is that ASCII was read in from a terminal.
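
    One way to see what main actually received is to print each argument
    as a list of code points. This is only a diagnostic sketch; ArgDump
    and describe are hypothetical names, not anything from the JDK:

```java
// Hypothetical diagnostic: shows how the JVM decoded each command-line argument.
class ArgDump {
    // Render each code point of s as U+XXXX so mis-decoded arguments are easy to spot.
    static String describe(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> sb.append(String.format("U+%04X ", cp)));
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        for (String arg : args) {
            System.out.println(describe(arg));
        }
    }
}
```

    If the code points printed do not match what was typed, the bytes
    were probably decoded with a charset other than the one the shell
    used.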

    Strings can be concatenated without trouble: once text is inside a
    Java String it is always held as UTF-16, so the encoding only matters
    when converting bytes to or from Strings (95% sure of this). There
    are different Unicode encodings (100% sure of this). UTF-8 appears to
    be the best for files and network data. UTF-8 predates Java 1.0, and
    Java has used 16-bit Unicode chars since 1.0.
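
    A small sketch of the difference between encodings (EncodingSizes is
    a hypothetical name): the same String concatenates normally, and the
    encoding only shows up when it is turned into bytes.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo: one String, two byte encodings.
class EncodingSizes {
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    static int utf16Length(String s) {
        // UTF_16BE writes no byte-order mark, so this is 2 bytes per char.
        return s.getBytes(StandardCharsets.UTF_16BE).length;
    }

    public static void main(String[] args) {
        // Concatenating ASCII and Chinese text needs no special handling.
        String mixed = "abc" + "\u4e2d\u6587";
        System.out.println(utf8Length(mixed));   // 3 ASCII bytes + 2 * 3 bytes
        System.out.println(utf16Length(mixed));  // 5 chars * 2 bytes
        // Round trip: decoding with the same charset restores the String.
        byte[] bytes = mixed.getBytes(StandardCharsets.UTF_8);
        System.out.println(mixed.equals(new String(bytes, StandardCharsets.UTF_8)));
    }
}
```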

    I am not sure about Hebrew being right to left. I suspect Java does
    take account of that, because I remember seeing LayoutManagers take
    account of local conventions. I have not tried displaying a
    right-to-left language.

    Opalinski

    http://www.geocities.com/opalpaweb/
     
    opalinski from opalpaweb, Jan 26, 2006
    #2

  3. Oliver Wong

    Oliver Wong Guest

    "Roedy Green" <> wrote in
    message news:...
    > To what extent can you ignore the foreign language problem for
    > entering data into Java?
    >
    > do the OS keyboard drivers and Unicode handle everything?
    >
    > What about Hebrew, right to left. Do the Strings read right to left
    > too?
    >
    > What about Arabic which has 2D placement and all kinds of special
    > forms for the glyphs. Do the fonts contain enough information that you
    > just string the unicode chars together and it renders plausibly?
    >
    > Has anyone any experience with languages that don't use the Roman
    > alphabet?
    >
    > If something strange is required, how the heck do you write code
    > without being able to tell if the results are correct? Are there some
    > test strings?


    I'll warn you right now that the matter is complicated by the fact that
    there may be an additional layer of indirection between the keyboard and
    Java. Microsoft uses something called an IME (Input Method Editor), such
    that when I set my locale to Japanese Hiragana and type 'k', Java will
    notice that a key-down event has occurred, and that the 'k' key was
    pressed, but the character 'k' has not yet been sent to the input.

    Then I press 'a', and Java will notice that a key-down event has
    occurred, and that the 'a' key was pressed, and still no character has
    been sent to the input.

    I then press 'enter', and Java will notice that yet another key-down
    event has occurred, and that the 'enter' key was pressed, and now,
    finally, a single character 'hiragana ka' (\u304b) has been sent to the
    input.

    So when you ask "Does Java handle everything?" it depends on what you're
    doing. As long as you don't equate key presses with character input, you
    should be okay. The IME (or its equivalent on Linux and Mac OS) will take
    care of translating key events into Unicode strings. But if you start
    mixing key handling and string reading, you may run into problems.
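
    A minimal sketch of reacting to text rather than key codes, using a
    DocumentListener on the text component's Document. TextWatcher and
    capture are hypothetical names. One assumption to verify on your own
    setup: during IME composition, interim text may also pass through
    the Document before it is committed.

```java
import javax.swing.event.DocumentEvent;
import javax.swing.event.DocumentListener;
import javax.swing.text.BadLocationException;
import javax.swing.text.Document;
import javax.swing.text.PlainDocument;

// Hypothetical listener: collects text as it arrives in a Document,
// without looking at key codes at all.
class TextWatcher implements DocumentListener {
    final StringBuilder seen = new StringBuilder();

    @Override
    public void insertUpdate(DocumentEvent e) {
        try {
            seen.append(e.getDocument().getText(e.getOffset(), e.getLength()));
        } catch (BadLocationException ex) {
            throw new IllegalStateException(ex);
        }
    }

    @Override
    public void removeUpdate(DocumentEvent e) { }

    @Override
    public void changedUpdate(DocumentEvent e) { }

    // Inserts text into a fresh document and returns what the listener saw.
    static String capture(String text) {
        Document doc = new PlainDocument(); // in a real UI: textField.getDocument()
        TextWatcher watcher = new TextWatcher();
        doc.addDocumentListener(watcher);
        try {
            doc.insertString(0, text, null);
        } catch (BadLocationException ex) {
            throw new IllegalStateException(ex);
        }
        return watcher.seen.toString();
    }

    public static void main(String[] args) {
        // Hiragana "ka", the character from the example above.
        System.out.println(capture("\u304b").length());
    }
}
```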

    - Oliver
     
    Oliver Wong, Jan 26, 2006
    #3
  4. Roedy Green

    Roedy Green Guest

    On Thu, 26 Jan 2006 20:52:29 GMT, Roedy Green
    <> wrote, quoted or
    indirectly quoted someone who said :

    >To what extent can you ignore the foreign language problem for
    >entering data into Java?


    I am doing some experiments after coming up somewhat dry on Google,
    getting mostly information IN Hebrew rather than about how to render
    it.

    First, I discovered that it is not strictly right to left. Numbers go
    left to right, AND you still enter the digits high-order first.
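
    The mixed direction can be inspected with java.text.Bidi, which
    reports an embedding level per char: odd levels run right to left,
    even levels left to right. A small sketch (DigitRuns is a
    hypothetical name; the sample string is made up for illustration):

```java
import java.text.Bidi;

// Hypothetical demo: Hebrew letters followed by European digits.
class DigitRuns {
    // Returns the embedding level the bidi algorithm assigns to the char at index i.
    static int levelAt(String s, int i) {
        Bidi bidi = new Bidi(s, Bidi.DIRECTION_DEFAULT_LEFT_TO_RIGHT);
        return bidi.getLevelAt(i);
    }

    public static void main(String[] args) {
        String s = "\u05d0\u05d1 123"; // aleph, bet, space, then the digits 1 2 3
        for (int i = 0; i < s.length(); i++) {
            int level = levelAt(s, i);
            String direction = (level % 2 == 1) ? "right to left" : "left to right";
            System.out.println(i + ": level " + level + " (" + direction + ")");
        }
    }
}
```

    The digits stay in the order they were typed in the String; only
    their display direction differs from the surrounding letters.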

    When you key into a Java JTextField the cursor flips back and forth
    between the two modes. It is quite insane.

    Further just setting the input locale and keyboard driver to Hebrew
    was sufficient to trigger this mixed right to left behaviour. However,
    it was not sufficient to trigger an automatic right justify on a
    JTextField.
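
    One possible way to get the right-justified layout, assuming the
    field keeps its default leading alignment, is to set the component
    orientation from the locale explicitly. A minimal sketch;
    OrientationHelper is a hypothetical name, and whether the result
    matches what Hebrew users expect is an assumption worth testing:

```java
import java.awt.Component;
import java.awt.ComponentOrientation;
import java.util.Locale;

// Hypothetical helper: derive layout direction from a locale.
class OrientationHelper {
    // ComponentOrientation knows which languages are written right to left.
    static ComponentOrientation orientationFor(Locale locale) {
        return ComponentOrientation.getOrientation(locale);
    }

    // Applies the orientation to a component and all of its children.
    static void apply(Component component, Locale locale) {
        component.applyComponentOrientation(orientationFor(locale));
    }

    public static void main(String[] args) {
        Locale hebrew = Locale.forLanguageTag("he");
        System.out.println(orientationFor(hebrew).isLeftToRight());
        System.out.println(orientationFor(Locale.ENGLISH).isLeftToRight());
    }
}
```

    In a real UI this would be called as
    OrientationHelper.apply(textField, locale) after the field is
    created.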

    Calling getText from a CaretListener shows something quite different
    from what is on the screen. I have yet to sort out what the internal
    forms are.

    This is all complicated by the fact that the only thing I know about
    Hebrew is what an Aleph character looks like.

    Then on top of this, I have to figure out what drawString does, which
    probably has no notion of locale.

    I have started a web page on my findings. See
    http://mindprod.com/jgloss/hebrew.html
    --
    Canadian Mind Products, Roedy Green.
    http://mindprod.com Java custom programming, consulting and coaching.
     
    Roedy Green, Jan 27, 2006
    #4
  5. Roedy Green

    Roedy Green Guest

    On Fri, 27 Jan 2006 02:07:16 GMT, Roedy Green
    <> wrote, quoted or
    indirectly quoted someone who said :

    >I have started a web page on my findings. See
    >http://mindprod.com/jgloss/hebrew.html


    At least for output, Java, it turns out, is quite automatic. It
    ignores the locale. It simply notices Hebrew characters in your
    JLabel, JTextArea, JTextField or drawString and renders them right to
    left. The first char you pronounce is at position 0 of the string, and
    at the far right on screen.
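
    A short sketch of that logical ordering (LogicalOrder is a
    hypothetical name, and the sample word is chosen for illustration,
    not taken from the test program mentioned below):

```java
import java.text.Bidi;

// Hypothetical demo: Strings hold chars in reading order, not screen order.
class LogicalOrder {
    // True if the text contains characters that need bidirectional layout.
    static boolean needsBidi(String s) {
        char[] chars = s.toCharArray();
        return Bidi.requiresBidi(chars, 0, chars.length);
    }

    public static void main(String[] args) {
        // The word "shalom": shin, lamed, vav, final mem.
        String shalom = "\u05e9\u05dc\u05d5\u05dd";
        // Index 0 is shin, the first letter pronounced, although it is
        // drawn at the far right.
        System.out.println(Integer.toHexString(shalom.charAt(0)));
        System.out.println(needsBidi(shalom));
        System.out.println(needsBidi("abc"));
    }
}
```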

    So all works fine except that everything is left-aligned instead of
    right-aligned. I gather that Israelis are used to seeing inept
    computer printout all left-aligned.

    I have posted a little test program to demonstrate this.

    Console IO is hopeless.
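
    For what it is worth, the bytes sent to the console can at least be
    controlled by wrapping System.out in a PrintStream with an explicit
    encoding. This is a sketch under the assumption that the terminal
    expects UTF-8 (Utf8Console is a hypothetical name); it does nothing
    about right-to-left rendering, which is up to the console itself.

```java
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

// Hypothetical helper: a PrintStream that always encodes as UTF-8.
class Utf8Console {
    static PrintStream utf8(OutputStream out) {
        try {
            return new PrintStream(out, true, "UTF-8");
        } catch (UnsupportedEncodingException ex) {
            // UTF-8 is required to be present on every Java platform.
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        PrintStream out = utf8(System.out);
        out.println("\u05d0"); // aleph, written as the two bytes D7 90
    }
}
```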
    --
    Canadian Mind Products, Roedy Green.
    http://mindprod.com Java custom programming, consulting and coaching.
     
    Roedy Green, Jan 27, 2006
    #5