But I thought the idea of the weak reference is that the Java side can
clean it up if the only references to the object are weak.
The Java side can certainly GC the object itself. However, as I said before,
it's "the JNI mechanisms for tracking references from the C-side to the Java
side" that don't get cleaned up.
Forget weak refs for a second and consider "global" JNI references (which are
somewhat simpler than the "normal" local references, which are tied to a single
thread and native call). The Java GC cannot "see" references to Java objects
that are held by C code -- therefore it doesn't even try to do so. The JVM does
not provide a pointer directly to the Java object itself (a pointer to the
memory it occupies); instead it provides a "handle" (or you could even consider
it to be a kind of Proxy) for the object, and that handle is what the C code
sees. Somewhere inside the JNI implementation
there is a mapping from the handles to the actual Java objects, and the JVM's
GC is aware of those tables, and uses them (amongst other things) to decide
which Java objects are eligible for reaping. If the C code does not release
those handles properly, then there will be /two/ space leaks: one is that the
Java objects themselves won't be collected; the other is that the handles
themselves (and the corresponding table space) won't be released in the
internals of JNI.
(Of course, the above is only an approximation -- I don't know exactly what any
real implementation does, but the JNI bit can't be very different.)
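To make that approximation concrete, here is a toy model in C of such a handle
table. It is only a sketch: ToyObject, Slot, toy_new_global, toy_delete_global,
toy_gc and toy_slots_in_use are all made-up names, and I am not claiming any
real JVM is built this way. The JNI calls being imitated are NewGlobalRef and
DeleteGlobalRef.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TABLE_SIZE 4

/* Stand-in for a Java object; `alive` goes false once the toy GC reaps it. */
typedef struct { bool alive; } ToyObject;

/* One table slot. C code only ever holds the slot index (the "handle"),
   never the pointer to the object itself. */
typedef struct { ToyObject *target; bool in_use; } Slot;

static Slot table[TABLE_SIZE];

/* Imitates creating a global reference: claim a free slot, return its index. */
int toy_new_global(ToyObject *obj) {
    for (int i = 0; i < TABLE_SIZE; i++) {
        if (!table[i].in_use) {
            table[i].in_use = true;
            table[i].target = obj;
            return i;
        }
    }
    return -1; /* table exhausted */
}

/* Imitates deleting a global reference: only the C side can free the slot. */
void toy_delete_global(int handle) {
    assert(handle >= 0 && handle < TABLE_SIZE && table[handle].in_use);
    table[handle].in_use = false;
    table[handle].target = NULL;
}

/* Toy GC. The table is the only root in this model, so an object
   survives only if some in-use slot still points at it. */
void toy_gc(ToyObject *heap, size_t n) {
    for (size_t k = 0; k < n; k++) {
        bool reachable = false;
        for (int i = 0; i < TABLE_SIZE; i++) {
            if (table[i].in_use && table[i].target == &heap[k]) {
                reachable = true;
            }
        }
        if (!reachable) {
            heap[k].alive = false;
        }
    }
}

/* Number of slots currently claimed, i.e. table space that cannot be reused. */
int toy_slots_in_use(void) {
    int count = 0;
    for (int i = 0; i < TABLE_SIZE; i++) {
        if (table[i].in_use) count++;
    }
    return count;
}
```

With two objects and a handle to only the first, toy_gc() reaps the second and
keeps the first. Until toy_delete_global() is called, both the object and its
slot stay occupied, which is the double leak described above.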
Now consider weak references. Exactly the same issues arise, but in this case
the GC knows that the mapping tables don't constitute a "strong" reference to
the Java object itself -- so the object can be reaped. But the table entries
themselves can only be cleaned up with cooperation from the C code. That's why
you have to release the references yourself (with DeleteWeakGlobalRef), even
after the object has gone.
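The weak case can be sketched the same way. Again this is a toy model under my
own assumptions, not how any particular JVM does it: WObject, WeakSlot, weak_new,
weak_delete, weak_is_cleared and weak_gc are invented names, and the
strongly_held flag simply stands in for "something on the Java side still
refers to this object". The JNI calls being imitated are NewWeakGlobalRef and
DeleteWeakGlobalRef.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define WEAK_TABLE_SIZE 3

/* Stand-in for a Java object. `strongly_held` models a strong reference
   from the Java side; `alive` goes false once the toy GC reaps the object. */
typedef struct { bool alive; bool strongly_held; } WObject;

typedef struct { WObject *target; bool in_use; } WeakSlot;

static WeakSlot weak_table[WEAK_TABLE_SIZE];

/* Imitates creating a weak global reference: claim a slot, return its index. */
int weak_new(WObject *obj) {
    for (int i = 0; i < WEAK_TABLE_SIZE; i++) {
        if (!weak_table[i].in_use) {
            weak_table[i].in_use = true;
            weak_table[i].target = obj;
            return i;
        }
    }
    return -1; /* table exhausted */
}

/* Imitates deleting a weak global reference: frees the slot for reuse. */
void weak_delete(int handle) {
    assert(handle >= 0 && handle < WEAK_TABLE_SIZE && weak_table[handle].in_use);
    weak_table[handle].in_use = false;
    weak_table[handle].target = NULL;
}

/* True if the object behind the handle has already been reaped. */
bool weak_is_cleared(int handle) {
    assert(handle >= 0 && handle < WEAK_TABLE_SIZE && weak_table[handle].in_use);
    return weak_table[handle].target == NULL;
}

/* Toy GC. Weak slots keep nothing alive: any object that is not strongly
   held is reaped and the slots pointing at it are cleared. The slots
   themselves stay claimed; the GC never frees them. */
void weak_gc(WObject *heap, size_t n) {
    for (size_t k = 0; k < n; k++) {
        if (heap[k].strongly_held) continue;
        heap[k].alive = false;
        for (int i = 0; i < WEAK_TABLE_SIZE; i++) {
            if (weak_table[i].in_use && weak_table[i].target == &heap[k]) {
                weak_table[i].target = NULL;
            }
        }
    }
}
```

Fill the table with three weak handles, run weak_gc(), and the objects that
were not strongly held are gone, yet weak_new() still returns -1 because every
slot is still claimed. Only after weak_delete() on one of the stale handles
does a slot become available again.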
-- chris