Passing large C buffers to Java (via JNI) without copying?

jpknott

I'm trying to determine whether this is even possible... I have a
_very_ large buffer malloc'd on my C heap that I would like to hand
over to Java, preferably as a byte[]. So far, every example of this
that I have come across involves _copying_ the data from my C buffer
into java storage. Is there any way to avoid this expensive (very
expensive in my case) operation?
 
Rationem

You could use Sun's sun.misc.Unsafe class and pass the actual memory
address, although that is unsupported at best.

If you decide to go this route you should include the source for the
class, as it is not present in all implementations of the JVM.

Enjoy
 
Roedy Green

jpknott said:
I'm trying to determine whether this is even possible... I have a
_very_ large buffer malloc'd on my C heap that I would like to hand
over to Java, preferably as a byte[]. So far, every example of this
that I have come across involves _copying_ the data from my C buffer
into java storage. Is there any way to avoid this expensive (very
expensive in my case) operation?

Java code only deals with objects on its stacks or its heaps. Those
are the only things the JVM even knows exist.

C is much more flexible. It is prepared to address anything. So I
think the way to avoid the copy is to make Java (or a JNI system
method) allocate a Java-style byte[] buffer or NIO wrapper, then pass
that to C early on in the game to use as its buffer. Then it can hand
back a Java-style handle to it later.
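
A minimal sketch of the "let Java allocate it" idea, on the Java side only. It assumes a direct NIO buffer is used as the shared memory. The Producer interface is a hypothetical stand-in for the native code so that the example runs without a JNI library; in real code it would be a native method, and the C side could obtain the underlying pointer with the JNI function GetDirectBufferAddress.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class SharedBufferSketch {
    /** Hypothetical stand-in for the C side; real code would declare a native method. */
    interface Producer {
        void fill(ByteBuffer target);
    }

    /**
     * Allocates the buffer from Java first, then hands it to the producer,
     * so the producer writes straight into memory Java can already see.
     */
    static ByteBuffer allocateAndFill(int size, Producer producer) {
        // Direct buffers may live outside the garbage-collected heap.
        // Native byte order avoids byte swapping when C reads or writes wider types.
        ByteBuffer buf = ByteBuffer.allocateDirect(size).order(ByteOrder.nativeOrder());
        producer.fill(buf);
        return buf;
    }

    public static void main(String[] args) {
        ByteBuffer buf = allocateAndFill(8, target -> {
            for (int i = 0; i < target.capacity(); i++) {
                target.put(i, (byte) i); // absolute put, position is not moved
            }
        });
        System.out.println("direct=" + buf.isDirect() + ", byte 3=" + buf.get(3));
    }
}
```

The design choice here is simply the order of events: the buffer exists before the producer runs, so there is never a second buffer to copy from.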
 
Thomas Hawtin

Roedy said:
Java code only deals with objects on its stacks or its heaps. Those
are the only things the JVM even knows exist.

And in particular byte[] has a header (length, type, monitor, hash code,
gc bits, whatever), so a plain C char[] will not do.

Roedy said:
C is much more flexible. It is prepared to address anything. So I
think the way to avoid the copy is to make Java (or a JNI system
method) allocate a Java-style byte[] buffer or NIO wrapper, then pass
that to C early on in the game to use as its buffer. Then it can hand
back a Java-style handle to it later.

I bet you can allocate the memory for a NIO direct buffer, and then
create the Buffer object after the fact. It probably isn't going to be
portable across JVMs, though. (Note, most of the repetitive source for NIO
is generated, so the .java files may appear to be missing.)

Tom Hawtin
 
Chris Uppal

jpknott said:
I'm trying to determine whether this is even possible... I have a
_very_ large buffer malloc'd on my C heap that I would like to hand
over to Java, preferably as a byte[]. So far, every example of this
that I have come across involves _copying_ the data from my C buffer
into java storage. Is there any way to avoid this expensive (very
expensive in my case) operation?

How come it's so expensive? You have presumably filled in the array in your C
code, which will take time (at least) proportionate to the size of the
buffer -- with constants of proportionality that are (at least) approaching the
cost of copying. And you are presumably going to use the data in the buffer in
your Java code, which again will take time (at least) proportionate to the size
of the buffer -- and again with constants that are (at least) approaching the
cost of copying. In fact either the processing on the C side must be
significantly more expensive than the copy, or the processing on the Java side
must be -- or else both are so trivial that there's no point in copying the
data across at all. So I don't see how the copy can take more than a small part
of your total execution time.

Anyway, as Thomas has already mentioned, you can use NIO buffers to wrap a
C-side buffer in a Java object which will access it directly. But remember
that /every/ access to that data from Java will then cost more than a direct
lookup in a byte[] array would (several times more according to a link that
Thomas posted recently), so you should carefully consider whether that overhead
will overall be greater than that of a single copy.

Another approach would be to ask Java to allocate the byte[] array, and then
for the C code to use that memory as its working data (rather than malloc()-ing
a buffer). For that to work depends on GetByteArrayElements() returning a
pointer to the JVM's internal data directly, rather than making a copy (which
is implementation-dependent, and may vary depending on other factors too).
Another factor to consider is that if the JVM allocates the byte buffer then
it'll zero the memory for you, which will add a cost not so far off the cost of
the copy -- but that might be acceptable if the C code would otherwise have to
zero the memory itself.

-- chris
 
Mark Thornton

Chris said:
Anyway, as Thomas has already mentioned, you can use NIO buffers to wrap a
C-side buffer in a Java object which will access it directly. But remember
that /every/ access to that data from Java will then cost more than a direct
lookup in a byte[] array would (several times more according to a link that
Thomas posted recently), so you should carefully consider whether that overhead
will overall be greater than that of a single copy.

If you access the buffer via IntBuffer, CharBuffer, etc views of the
direct ByteBuffer then access from Java can be very efficient.
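
A minimal sketch of reading a direct buffer through a typed view. Without a JNI library to hand, ByteBuffer.allocateDirect stands in here for memory that native code would wrap (the JNI function for that is NewDirectByteBuffer). The class and method names are made up for illustration, and nothing here measures how fast the access actually is.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.IntBuffer;

final class DirectViewSketch {
    /** Sums every int in the buffer through an IntBuffer view, without copying the data. */
    static long sumInts(ByteBuffer direct) {
        // duplicate() shares the memory but has its own position and limit.
        // A duplicate starts out big-endian, so the native order is set explicitly.
        IntBuffer ints = direct.duplicate().order(ByteOrder.nativeOrder()).asIntBuffer();
        long sum = 0;
        for (int i = 0; i < ints.limit(); i++) {
            sum += ints.get(i); // absolute get: one int per call
        }
        return sum;
    }

    /** Builds a direct buffer holding the ints 0..count-1 in native byte order. */
    static ByteBuffer sample(int count) {
        ByteBuffer buf = ByteBuffer.allocateDirect(count * Integer.BYTES)
                                   .order(ByteOrder.nativeOrder());
        for (int i = 0; i < count; i++) {
            buf.putInt(i * Integer.BYTES, i); // absolute put, position stays at 0
        }
        return buf;
    }

    public static void main(String[] args) {
        System.out.println(sumInts(sample(4)));
    }
}
```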

Mark Thornton
 
Thomas Hawtin

Chris said:
Anyway, as Thomas has already mentioned, you can use NIO buffers to wrap a
C-side buffer in a Java object which will access it directly. But remember
that /every/ access to that data from Java will then cost more than a direct
lookup in a byte[] array would (several times more according to a link that
Thomas posted recently), so you should carefully consider whether that overhead
will overall be greater than that of a single copy.

You should be careful, in that that link contains two entries on NIO
performance (unfortunately JRoller doesn't allow me to link to them
individually). The older entry runs its tests within a very large method,
and gives NIO bad marks. The later article uses smaller methods, and NIO
comes out much better (except for the odd outlier).

Chris said:
Another approach would be to ask Java to allocate the byte[] array, and then
for the C code to use that memory as its working data (rather than malloc()-ing
a buffer). For that to work depends on GetByteArrayElements() returning a
pointer to the JVM's internal data directly, rather than making a copy (which
is implementation-dependent, and may vary depending on other factors too).
Another factor to consider is that if the JVM allocates the byte buffer then
it'll zero the memory for you, which will add a cost not so far off the cost of
the copy -- but that might be acceptable if the C code would otherwise have to
zero the memory itself.

I am so not an expert in this field, but mightn't a Java-allocated array
have issues with GC? If C pins the array for some time, won't that cause
issues? If the C code only pins it for short periods, it may be more tricky
to maintain good performance.

Tom Hawtin
 
Roedy Green

Mark said:
If you access the buffer via IntBuffer, CharBuffer, etc views of the
direct ByteBuffer then access from Java can be very efficient.

You can get a handle to the original buffer back, can't you, if you
don't need the fancy access methods?
 
Mark Thornton

Roedy said:
You can get a handle to the original buffer back, can't you, if you
don't need the fancy access methods?

No (at least not without dangerous hacks). You should keep a reference
to the original buffer if you will ever need it again.
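
One way to keep that reference is to carry the original buffer and its view around together. This is only a sketch; the BufferWithView class is hypothetical, not part of NIO.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.IntBuffer;

/** Keeps the original ByteBuffer next to its IntBuffer view so neither is lost. */
final class BufferWithView {
    private final ByteBuffer bytes;
    private final IntBuffer ints;

    BufferWithView(ByteBuffer bytes) {
        this.bytes = bytes;
        // The view is taken from a duplicate so the original's position is untouched;
        // the duplicate is given the original's byte order so both agree on int layout.
        this.ints = bytes.duplicate().order(bytes.order()).asIntBuffer();
    }

    ByteBuffer bytes() { return bytes; }

    IntBuffer ints() { return ints; }

    public static void main(String[] args) {
        ByteBuffer raw = ByteBuffer.allocateDirect(16).order(ByteOrder.nativeOrder());
        BufferWithView pair = new BufferWithView(raw);
        pair.ints().put(1, 42);                     // write through the view
        System.out.println(pair.bytes().getInt(4)); // read the same int via the original
    }
}
```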

Mark Thornton
 
Chris Uppal

Chris said:
Anyway, as Thomas has already mentioned, you can use NIO buffers to
wrap a C-side buffer in a Java object which will access it directly.
But remember that /every/ access to that data from Java will then cost
more than a direct lookup in a byte[] array would (several times more
according to a link that Thomas posted recently), so you should
carefully consider whether that overhead will overall be greater than
that of a single copy.

Mark said:
If you access the buffer via IntBuffer, CharBuffer, etc views of the
direct ByteBuffer then access from Java can be very efficient.

But there is still some overhead, and we are comparing that with the /very/
tiny (per byte) overhead of doing a copy. If the issue is important in the
first place (which I am still not convinced of) then the NIO overhead has to be
considered too (and measured, etc...)
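
A rough sketch of what such a measurement might compare: one bulk copy into a byte[] followed by array access, against per-byte access to the direct buffer in place. The class name and buffer size are arbitrary. The timing in main is naive (no warm-up, single run), so treat its numbers as a starting point only; a proper harness such as JMH would be needed for trustworthy figures.

```java
import java.nio.ByteBuffer;

final class CopyVersusDirectSketch {
    /** Copies the whole buffer into a byte[] once, then sums the array. */
    static long sumViaCopy(ByteBuffer direct) {
        byte[] copy = new byte[direct.capacity()];
        ByteBuffer dup = direct.duplicate(); // own position/limit, shared memory
        dup.clear();                         // position = 0, limit = capacity
        dup.get(copy);                       // single bulk copy
        long sum = 0;
        for (byte b : copy) {
            sum += b;
        }
        return sum;
    }

    /** Sums the buffer in place with one absolute get() per byte. */
    static long sumInPlace(ByteBuffer direct) {
        ByteBuffer dup = direct.duplicate();
        dup.clear();
        long sum = 0;
        for (int i = 0; i < dup.capacity(); i++) {
            sum += dup.get(i);
        }
        return sum;
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocateDirect(16 * 1024 * 1024);
        for (int i = 0; i < buf.capacity(); i++) {
            buf.put(i, (byte) i);
        }
        long t0 = System.nanoTime();
        long viaCopy = sumViaCopy(buf);
        long t1 = System.nanoTime();
        long inPlace = sumInPlace(buf);
        long t2 = System.nanoTime();
        System.out.printf("copy+array: %d us, in place: %d us, sums equal: %b%n",
                (t1 - t0) / 1000, (t2 - t1) / 1000, viaCopy == inPlace);
    }
}
```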

-- chris
 
Chris Uppal

Thomas said:
I am so not an expert in this field, but mightn't a Java-allocated array
have issues with GC? If C pins the array for some time, won't that cause
issues? If the C code only pins it for short periods, it may be more tricky
to maintain good performance.

Personally I wouldn't expect this to be an issue.

For one thing, we have no reason to assume that this is a heavily threaded
application; if it is not, then the JVM will be largely quiescent while the
JNI code has the array pinned.

For another thing, if the buffer is as big as the OP implies, then I wouldn't
expect the JVM to allocate it in a movable memory area anyway (and if it did
then there'd be other problems to worry about...)

For the last thing, if the JVM's design were such that pinning large amounts of
memory during a JNI call were likely to cause problems, then I'd expect it to
use its option to return a copy of the memory from GetByteArrayElements().

-- chris
 
