JNI Unicode String puzzle

Discussion in 'Java' started by Roedy Green, Dec 18, 2007.

  1. Roedy Green

    Roedy Green Guest

    If you do JNI GetStringChars in C++, just what do you get? an array
    of TCHARS? A null terminated TCHAR string?

    If there is no null, is there some idiomatic way to convert to
    null-terminated?

    Do you insert it on the Java side?
    Do you have to allocate a buffer, copy and plop a null??

    It seems odd there would not be something built-in to handle this.
    --
    Roedy Green Canadian Mind Products
    The Java Glossary
    http://mindprod.com
    Roedy Green, Dec 18, 2007
    #1

  2. Roedy Green

    Arne Vajhøj Guest

    Roedy Green wrote:
    > If you do JNI GetStringChars in C++, just what do you get? an array
    > of TCHARS? A null terminated TCHAR string?


    It returns jchar* and jchar is unsigned short, so you should
    get a TCHAR array (assuming _UNICODE).

    GetStringUTFChars returns NULL terminated, so I would
    expect GetStringChars to be the same.

    > Do you insert it on the Java side?
    > Do you have to allocate a buffer, copy and plop a null??


    What ?

    Arne
    Arne Vajhøj, Dec 18, 2007
    #2

  3. Roedy Green

    Roedy Green Guest

    On Mon, 17 Dec 2007 21:50:09 -0500, Arne Vajhøj <>
    wrote, quoted or indirectly quoted someone who said :

    >
    >GetStringUTFChars returns NULL terminated, so I would
    >expect GetStringChars to be the same.


    the docs don't mention the null.
    --
    Roedy Green Canadian Mind Products
    The Java Glossary
    http://mindprod.com
    Roedy Green, Dec 18, 2007
    #3
  4. Roedy Green

    Arne Vajhøj Guest

    Roedy Green wrote:
    > On Mon, 17 Dec 2007 21:50:09 -0500, Arne Vajhøj <>
    > wrote, quoted or indirectly quoted someone who said :
    >> GetStringUTFChars returns NULL terminated, so I would
    >> expect GetStringChars to be the same.

    >
    > the docs don't mention the null.


    Have you upgoogled:

    http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4616318

    ?

    It seems that it is not NULL terminated.

    Arne
    Arne Vajhøj, Dec 18, 2007
    #4
  5. Roedy Green

    Roedy Green Guest

    On Mon, 17 Dec 2007 21:50:09 -0500, Arne Vajhøj <>
    wrote, quoted or indirectly quoted someone who said :

    >
    >It returns jchar* and jchar is unsigned short, so you should
    >get a TCHAR array (assuming _UNICODE).
    >
    >GetStringUTFChars returns NULL terminated, so I would
    >expect GetStringChars to be the same.


    I have been reading Sheng Liang's book. He says definitely that
    GetStringChars can't be trusted to return a null-terminated string.

    Some program listings on the net suggest GetStringUTFChars
    does automatically append null.

    This makes sense. GetStringChars (16-bit chars) probably points you
    at the original, which has no trailing null. GetStringUTFChars has to
    construct an 8-bit string, so it might as well append the null for you.

    --
    Roedy Green Canadian Mind Products
    The Java Glossary
    http://mindprod.com
    Roedy Green, Dec 18, 2007
    #5
  6. Roedy Green

    Roedy Green Guest

    On Tue, 18 Dec 2007 01:37:24 GMT, Roedy Green
    <> wrote, quoted or indirectly quoted
    someone who said :

    >If you do JNI GetStringChars in C++, just what do you get? an array
    >of TCHARS? A null terminated TCHAR string?


    Mystery solved.

    GetStringChars (16-bit) does not terminate with null. You must copy
    the characters into your own buffer and add one yourself, for example
    with wcsncpy_s.

    GetStringUTFChars (8-bit) does terminate with null.
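
A minimal sketch of the copy-and-terminate step in portable C, assuming the characters come from GetStringChars and the count from GetStringLength. The names utf16_unit, dup_utf16_terminated and utf16_len are made up for illustration; utf16_unit stands in for jchar so the sketch compiles without jni.h.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Assumption: stand-in for JNI's jchar, which is a 16-bit unsigned type. */
typedef uint16_t utf16_unit;

/*
 * Copy `len` UTF-16 code units into a freshly malloc'ed buffer and append
 * a 0 terminator. Returns NULL if `src` is NULL or the allocation fails.
 * The caller releases the result with free().
 */
static utf16_unit *dup_utf16_terminated(const utf16_unit *src, size_t len)
{
    utf16_unit *dst;

    if (src == NULL)
        return NULL;
    dst = malloc((len + 1) * sizeof *dst);
    if (dst == NULL)
        return NULL;
    memcpy(dst, src, len * sizeof *dst);
    dst[len] = 0;               /* the terminator the source buffer lacks */
    return dst;
}

/* Length of a 0-terminated UTF-16 buffer, counted in code units. */
static size_t utf16_len(const utf16_unit *s)
{
    size_t n = 0;

    while (s[n] != 0)
        n++;
    return n;
}
```

In real JNI code the pointer returned by GetStringChars should be handed back with ReleaseStringChars once the copy has been made.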

    C++ Unicode 16-bit functions do not work (quietly degrade to 8-bit
    mode) unless you define BOTH:

    #define UNICODE
    #define _UNICODE

    I had forgotten what a nightmare C++ deeply nested typedefs with a
    dozen aliases for every actual type are. YUCCH!

    It came clear with sizeof dumps.
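
A sizeof dump of this kind might look like the sketch below. It sticks to standard C types so that it builds anywhere; uint16_t stands in for jchar, and the function names are invented for illustration. On Windows, sizeof(TCHAR) would be the extra value to print to see whether the Unicode defines took effect.

```c
#include <stdint.h>
#include <stdio.h>
#include <wchar.h>

/* Print the widths of the character types that tend to get confused. */
static void dump_char_sizes(void)
{
    printf("sizeof(char)     = %zu\n", sizeof(char));
    printf("sizeof(uint16_t) = %zu  (same width as jchar)\n", sizeof(uint16_t));
    printf("sizeof(wchar_t)  = %zu\n", sizeof(wchar_t));
}

/* Nonzero when wchar_t has the same width as a 16-bit jchar here. */
static int wchar_matches_utf16(void)
{
    return sizeof(wchar_t) == sizeof(uint16_t);
}
```

wchar_t is 16 bits wide on Windows but usually 32 bits on Linux and macOS, so treating a jchar buffer as a wchar_t buffer is only safe where wchar_matches_utf16() is nonzero.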

    --
    Roedy Green Canadian Mind Products
    The Java Glossary
    http://mindprod.com
    Roedy Green, Dec 18, 2007
    #6
  7. Roedy Green

    Lew Guest

    Roedy Green wrote:
    > I had forgotten what a nightmare C++ deeply nested typedefs with a
    > dozen aliases for every actual type are. YUCCH!


    This resonates with what I've been saying about the dangers of adding
    something like 'typedef' to Java. For some reason people object to adding all
    the extra type-safe decorations, such as in complicated generics declarations,
    and they imagine that a 'typedef' will make life easier.

    As anyone who's had to delve into another's source can tell you, things that
    favor the original writer don't always favor a later reader of that source.
    Shortcut idioms that hide too much, as 'typedef' can, do not always facilitate
    maintenance of the program.

    --
    Lew
    Lew, Dec 18, 2007
    #7
  8. Roedy Green

    Zig Guest

    On Tue, 18 Dec 2007 00:38:46 -0500, Roedy Green
    <> wrote:

    > On Tue, 18 Dec 2007 01:37:24 GMT, Roedy Green
    > <> wrote, quoted or indirectly quoted
    > someone who said :
    >
    >> If you do JNI GetStringChars in C++, just what do you get? an array
    >> of TCHARS? A null terminated TCHAR string?


    I'll assume you're writing for Windows.

    >
    > Mystery solved.
    >
    > GetStringChars (16-bit) does not terminate with null. You must copy
    > the characters into your own buffer and add one yourself, for example
    > with wcsncpy_s.
    >
    > GetStringUTFChars (8-bit) does terminate with null.
    >


    In one of my standard includes for my Windows JNI projects, I have a
    prototype for the function:

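    LPWSTR GetSzwStringCharsFromHeap(JNIEnv * env, HANDLE hHeap, jstring jstr)
    {
        LPWSTR lpwResult=NULL;
        jsize jStrLen;

        if (jstr==NULL)
            goto finished;
        jStrLen=(*env)->GetStringLength(env, jstr);

        /* HEAP_ZERO_MEMORY zero-fills the buffer, so the extra element
           at the end already serves as the terminator. */
        lpwResult=HeapAlloc(hHeap, HEAP_ZERO_MEMORY, (jStrLen+1)*sizeof(WCHAR));
        if (lpwResult==NULL)
        {
            /* HeapAlloc does not set the last-error code on failure,
               so report an out-of-memory error explicitly. */
            fireJavaExceptionForSystemErrorCode(env, ERROR_NOT_ENOUGH_MEMORY);
            goto finished;
        }
        /* Copy the UTF-16 code units straight into the native buffer. */
        (*env)->GetStringRegion(env, jstr, 0, jStrLen, (jchar *)lpwResult);

    finished:
        return lpwResult;
    }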

    (Callers should use (*env)->ExceptionCheck(env) to see if this function
    actually succeeded).

    If there is a more conventional approach, I'd love to hear it. Using
    GetStringRegion to copy data to the native buffer once seems like it
    should be more efficient than allocating a non-terminated buffer and a
    terminated buffer.

    > C++ Unicode 16-bit functions do not work (quietly degrade to 8-bit
    > mode) unless you define BOTH:
    >
    > #define UNICODE
    > #define _UNICODE


    I try to avoid using LPTSTR and TCHAR wherever possible, and instead favor
    LPWSTR and WCHAR. Most Windows functions are declared as

    #ifdef UNICODE
    #define SomeFunction SomeFunctionW
    #else
    #define SomeFunction SomeFunctionA
    #endif

    (With the exception that functions new for Vista / Windows 2008 are
    generally UNICODE only)

    So I explicitly call SomeFunctionW, which avoids depending on the
    compiler's global UNICODE definitions.

    Isn't the UNICODE declaration supposed to be set by the C compiler's
    environment when it's in Unicode mode (which to me would suggest the
    compiler will compile "xyz" the same as L"xyz")? Since <jni.h> expects
    method & type signatures to be supplied as char* , it seems like switching
    the compiler to the full-blown Unicode mode would then break when you
    attempt to make JNI calls of the form:

    (*env)->FindClass(env, "java/lang/Object");

    Anyway, as some of this is speculation and my experimentation with such
    settings is minimal, I'd be curious how your mileage goes.

    For what it's worth though, if you just use the "W" functions and avoid
    the TCHAR abstraction, the rest seems to fall into place.
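
The sketch below illustrates the A/W mapping with a made-up function pair. DemoLengthA, DemoLengthW, DemoLength and DEMO_TEXT are hypothetical names that only mimic the pattern in the Windows headers. As far as I know, UNICODE (tested by the Windows headers) and _UNICODE (tested by the C runtime's tchar.h) are ordinary preprocessor macros and do not change how the compiler treats a plain "xyz" literal, so char* arguments to JNI calls such as FindClass should be unaffected.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical narrow and wide variants, in the style of SomeFunctionA/W. */
static size_t DemoLengthA(const char *s)    { return strlen(s); }
static size_t DemoLengthW(const wchar_t *s) { return wcslen(s); }

/* The generic name is only a macro; the compiler itself is unchanged. */
#ifdef UNICODE
#define DemoLength DemoLengthW
#define DEMO_TEXT(s) L##s
#else
#define DemoLength DemoLengthA
#define DEMO_TEXT(s) s
#endif
```

Calling DemoLengthW(L"xyz") directly works whether or not UNICODE is defined, which is the point of sticking to the "W" functions.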

    >
    > I had forgotten what a nightmare C++ deeply nested typedefs with a
    > dozen aliases for every actual type are. YUCCH!
    >
    > It came clear with sizeof dumps.
    >


    Hope that was interesting or useful,

    -Zig
    Zig, Dec 20, 2007
    #8