synchronized using String.intern()

Lew

I wondered if it is really true that intern()ed Strings can be GCed.
How could Java keep the promises specified for intern()ed Strings if
they're GCed?

Also, the String#intern() Javadocs state:
"A pool of strings, initially empty, is maintained privately by the
class String."

This apparently implies that the class 'String' keeps all references
to intern()ed Strings, which would prevent them from ever being GCed
unless the class itself were collected.

However, the article at
<http://www.javaworld.com/javaworld/javaqa/2003-12/01-qa-1212-intern.html>
explains, and even provides evidence, that intern()ed Strings can be
and are GCed, and still maintain their semantics.
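
For anyone following along, the promise in question is that equal strings intern to one canonical instance. Here is a minimal sketch of that contract; the class and method names are invented for illustration, and it says nothing about garbage collection or threads:

```java
// Illustration of the documented String#intern() contract.
// InternContractDemo and sameCanonicalInstance are hypothetical names.
class InternContractDemo {
    /** True when both arguments intern to the very same object. */
    static boolean sameCanonicalInstance(String a, String b) {
        return a.intern() == b.intern();
    }

    public static void main(String[] args) {
        // Two distinct objects with equal contents, both built at run time.
        String a = new String("lock:foo");
        String b = new StringBuilder("lock:").append("foo").toString();

        System.out.println(a == b);                      // false: distinct objects
        System.out.println(a.equals(b));                 // true: equal contents
        System.out.println(sameCanonicalInstance(a, b)); // true: one canonical instance
    }
}
```

If an interned String is collected while nothing references it, no caller can observe the difference, because there is no old reference left to compare against.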
 
Tom Anderson

But the very next sentence says:

File-lock objects are safe for use by multiple concurrent threads.

That looks like it contradicts the previous sentence.

No. That just means that if several threads within one process use the
same FileLock object, it won't blow up in anyone's face.

It's like saying that the lock on your front door is safe for use by
multiple concurrent family members. You don't have to worry about anything
weird happening if you get home at the same time as your wife, but it
doesn't let you have the house to yourself.

tom
 
Daniel Pitts

Lew said:
I wondered if it is really true that intern()ed Strings can be GCed.
How could Java keep the promises specified for intern()ed Strings if
they're GCed?

Also, the String#intern() Javadocs state:
"A pool of strings, initially empty, is maintained privately by the
class String."

This apparently implies that the class 'String' keeps all references
to intern()ed Strings, which would prevent them from ever being GCed
unless the class itself were collected.

Or it could use some version of a weak map (which it appears that it does):

public class TestIntern {
    public static void main(String... args) {
        for (int i = 0; i < 1000000; ++i) {
            System.out.println(("Testing" + i + ": 1.2.3").intern());
            System.out.println(
                Runtime.getRuntime().totalMemory() + " total, " +
                Runtime.getRuntime().freeMemory() + " free, " +
                Runtime.getRuntime().maxMemory() + " max. "
            );
        }
    }
}

As you can see, free memory rises and falls, indicating that the
intern'd strings go away.

After thinking about it, however, I think the OP was correct in
thinking that synchronized (someString.intern()) is valid.

One thing I would suggest is having some sort of "namespace" prepended
to your string, so as not to interfere with any other code that uses the
same trick.

synchronized (("net.virtualinfinity.filelock:" + someString).intern()) {
    // critical section for this file name
}
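
A slightly fuller sketch of the same idea, to show that threads building their own copies of the name still end up on one monitor. The class name, the lockFor helper and the counter are made up for the example, and it assumes intern() itself behaves correctly under concurrent calls, which the documentation does not spell out:

```java
// Sketch only: InternLockDemo, lockFor and runWorkers are hypothetical names.
class InternLockDemo {
    private static final String NAMESPACE = "net.virtualinfinity.filelock:";
    private static int counter = 0; // guarded by the interned lock for "foo"

    /** Returns the canonical String used as the monitor for this file name. */
    static Object lockFor(String fileName) {
        return (NAMESPACE + fileName).intern();
    }

    /** Starts several threads that each lock on their own copy of the name. */
    static int runWorkers(int threads, int incrementsPerThread) {
        counter = 0;
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < incrementsPerThread; i++) {
                    // A fresh String object each time; intern() maps it to one monitor.
                    synchronized (lockFor(new String("foo"))) {
                        counter++;
                    }
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter;
    }

    public static void main(String[] args) {
        // If every thread really shares one monitor, no increment is lost.
        System.out.println(runWorkers(4, 10000));
    }
}
```

The prefix only keeps unrelated code from locking on the same interned String by accident; it does not change how the locking itself works.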
 
Daniel Pitts

Tom said:
Okay, i completely missed that the lock was being removed at the end.
You're quite right, my approach is incorrect.

My solution, though, would be to not remove the lock from the map - i'd
leave it there for ever afterwards. I got the impression from the OP
that there was a small set of filenames which could be used, and thus
the map won't grow without limit. If that wasn't the case, using
WeakReferences (but not a WeakHashMap!) would be a solution that still
avoided the need for explicit removal.

tom

Even with WeakReferences, you would have to worry about replacing the
collected result, so it is the same as if you managed it yourself,
except possibly slowing down the GC process.

You *could* use a WeakHashMap, but you'd have to have a synchronized
block around accessing it. I only used putIfAbsent because it was atomic
as-is.

This whole algorithm can be made more robust/generic by allowing
arbitrary key types, timeouts, "tryLock", etc... Maybe even have it use
"Lock" objects, and provide conditional waits etc...
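
To make that concrete, here is a rough sketch of such a generalization. It is not the code posted earlier in the thread: KeyedLocks, lockFor and runLocked are hypothetical names, and this variant never removes entries, so it only suits a bounded set of keys:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical helper: one Lock per key, created on demand and never removed.
class KeyedLocks<K> {
    private final ConcurrentMap<K, Lock> locks = new ConcurrentHashMap<>();

    /** Returns the one Lock associated with this key, creating it if needed. */
    Lock lockFor(K key) {
        Lock existing = locks.get(key);
        if (existing != null) {
            return existing;
        }
        Lock fresh = new ReentrantLock();
        // putIfAbsent is atomic: only one thread's Lock wins for a given key.
        Lock winner = locks.putIfAbsent(key, fresh);
        return winner != null ? winner : fresh;
    }

    /** Runs the task under the key's lock; returns false if the lock was not acquired in time. */
    boolean runLocked(K key, long timeoutMillis, Runnable task) {
        Lock lock = lockFor(key);
        try {
            if (!lock.tryLock(timeoutMillis, TimeUnit.MILLISECONDS)) {
                return false;
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        try {
            task.run();
            return true;
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) {
        KeyedLocks<String> fileLocks = new KeyedLocks<>();
        boolean ran = fileLocks.runLocked("foo", 100,
                () -> System.out.println("holding the lock for foo"));
        System.out.println("ran: " + ran);
    }
}
```

Because keys are compared with equals(), callers do not need intern() at all here, and the Lock objects give timeouts and tryLock for free.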
 
Tom Anderson

So does that mean if thread-1 creates a FileLock on "foo", then, because
the lock is "held" by all threads concurrently, if thread-2 tries to
create a FileLock also on "foo", it will *not* block because thread-2
"already" holds the lock also?

Correct.

And just to make sure i'm not confused, i ran a test - on OS X at least,
this is definitely the case, even when the locks are taken from completely
different RandomAccessFile and FileChannel objects.

tom
 
Mike Schilling

Tom said:
It's like saying that the lock on your front door is safe for use by
multiple concurrent family members. You don't have to worry about
anything weird happening if you get home at the same time as your
wife,

Depending on whom else you're coming home with, of course.
 
Lew

Daniel said:
Or it used some version of a weak map. (which it appears that it does:)

public class TestIntern {
   public static void main(String...args) {
     for (int i = 0; i < 1000000; ++i) {
       System.out.println(("Testing" + i + ": 1.2.3").intern());
       System.out.println(
         Runtime.getRuntime().totalMemory() + " total, " +
         Runtime.getRuntime().freeMemory() + " free, " +
         Runtime.getRuntime().maxMemory() + " max. "
       );
     }
   }

}

As you can see, free memory rises and falls, indicating that the
intern'd strings go away.

After thinking about it, however, I think the OP was correct in
thinking that synchronized (someString.intern()) is valid.

This article
<http://www.cs.umd.edu/~pugh/java/memoryModel/archive/2225.html>
points up a danger to synchronization on intern()ed Strings.

It may not hurt the OP's scenario, but it's worth pondering.
 
Daniel Pitts

Lew said:
This article
<http://www.cs.umd.edu/~pugh/java/memoryModel/archive/2225.html>
points up a danger to synchronization on intern()ed Strings.

It may not hurt the OP's scenario, but it's worth pondering.

The thing to consider though, in Java 1.5 at least, is that the
happens-before relationship *would* be guaranteed. intern() *must* have
synchronization at some level (to ensure its contract appropriately). So
intern() will return the String instance that is being synchronized
against, or if it doesn't, then the block that was synchronized against
previously has already happened-before the new value was returned.

So, I think that other thread is incorrect in the Java 1.5 world. It is
indeed safe to use this approach.
 
Lew

Daniel said:
The thing to consider though, in Java 1.5 at least, is that the Happens
Before relationship *would* be guaranteed. intern() *must* have
synchronization at some level (to ensure its contract appropriately). so

I don't see that.

Neither the Javadocs nor the JLS discuss happens-before in the context of
String#intern().

intern() will return the String instance that is being synchronized
against, or if it doesn't, then the block that was synchronized against
previously has already happened-before the new value was returned.

So, I think that other thread is incorrect in the Java 1.5 world. It is
indeed safe to use this approach.

Can you provide evidence for that assertion?

Looking at the source, all I can tell so far is that any synchronization, if
it exists, is not at the Java language level but deep inside the native code.
 
Daniel Pitts

Lew said:
I don't see that.

Neither the Javadocs nor the JLS discuss happens-before in the context
of String#intern().


Can you provide evidence for that assertion?

Looking at the source, all I can tell so far is that any
synchronization, if it exists, is not at the Java language level but
deep inside the native code.

Indeed, it must be in the native code, but that still guarantees the
order of operations.
 
Lew

Daniel said:
Indeed, it must be in the native code, but that still guarantees the
order of operations.

Again, do you have actual evidence that it actually does actually
synchronize? Simply stating again that "it must" is not evidence.
 
Mike Schilling

Lew said:
Again, do you have actual evidence that it actually does actually
synchronize? Simply stating again that "it must" is not evidence.

Are we (or are we not) presuming that synchronizing on an object is
enough to hold it in memory? If we are, we have the sequence of
events:

1. Call intern() in thread 1
2. Lock the result's monitor
3. Run the synchronized block in thread 1
4. Release the monitor
5. Collect and remove the value of intern()
6. Call intern() in thread 2
7. Lock the result's monitor
8. Run the second synchronized block in thread 2

1-4 are ordered for the usual reason, as are 6-8. If we have the
problematical situation, so are 4-6.

* By our assumption, 4 precedes 5
* I'm assuming that if one or more threads are waiting on an object's
monitor, the object is held in memory so long as any are still
waiting. That means that if 6 precedes 4, 5 won't occur. Thus 4
precedes 6.

4 preceding 6 (really, 4 preceding 7) is enough to provide correct
synchronization semantics, even if there's a race between 5 and 6.
 
Daniel Pitts

Lew said:
Again, do you have actual evidence that it actually does actually
synchronize? Simply stating again that "it must" is not evidence.

The evidence is in the API contract. If it didn't have *some* sort of
synchronization, then it is possible to have a.intern() != b.intern() &&
a.equals(b), which violates the contract.
 
Mark Space

Mike said:
Are we (or are we not) presuming that synchronizing on an object is
enough to hold it in memory?


I think they're discussing just the call to intern(). Read Daniel's
answer to Lew's same post. He claims that intern() must synchronize
internally because:

"The evidence is in the API contract. If it didn't have *some* sort of
synchronization, then it is possible to have a.intern() != b.intern() &&
a.equals(b), which violates the contract. "

Which I don't follow.

The intern() API doesn't address multithreading at all. I'd assume that
there's no synchronization unless specified that there is. In short, it
could happen that a String could somehow violate the contract of the
intern() if multiple threads are involved.

Almost all of the Java API docs are the same way: they only specify
semantics for a single thread. All bets are off if multiple threads are
involved. (Some objects do specify their semantics in a multithreaded
environment, though not many.)

In short, you'd better provide your own locking if you use intern() from
multiple threads.
 
Tom Anderson

I think they're discussing just the call to intern(). Read Daniel's answer
to Lew's same post. He claims that intern() must synchronize internally
because:

"The evidence is in the API contract. If it didn't have *some* sort of
synchronization, then it is possible to have a.intern() != b.intern() &&
a.equals(b), which violates the contract. "

Which I don't follow.

The intern() API doesn't address multithreading at all. I'd assume that
there's no synchronization unless specified that there is. In short, it
could happen that a String could somehow violate the contract of the
intern() if multiple threads are involved.

I differ from you slightly. I think Daniel's belief about the contract not
being violated in multithreaded programs is probably right - it would be a
huge loophole in the semantics of intern() if that wasn't so. However,
there's no java-level synchronization mentioned in the docs, which means
that intern() calls from different threads don't generate formal
happens-before relationships, just as Lew was saying. However^2, i can't
imagine how the invariant could be enforced without locking (okay, i guess
with a lock-free data structure), and therefore i suspect that in
practice, there is effectively a happens-before.

And as Mike observed, whether there is or isn't is irrelevant. The only
thing that matters is whether intern() is threadsafe.

In short, you'd better provide your own locking if you use intern() from
multiple threads.

It would be really good to know if this was true. How could we test it?
Have a multithreaded program which creates strings, interns them, and then
somehow looks for non-identical intern results - how could we do the
check?
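
One way such a check might look, as a rough sketch only. InternRaceCheck and countMismatches are invented names. Each thread interns its own freshly built copy of the same values and keeps the results, and the identities are then compared with == once all threads have finished:

```java
// Sketch of a stress test; a zero result is not a proof of thread safety.
class InternRaceCheck {
    /** Counts intern() results that differ in identity from thread 0's result for the same value. */
    static int countMismatches(int threads, int rounds) {
        // Holding every result in an array keeps the interned Strings
        // strongly reachable, so none can be collected before the comparison.
        final String[][] results = new String[threads][rounds];
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int id = t;
            workers[t] = new Thread(() -> {
                for (int r = 0; r < rounds; r++) {
                    // Run-time concatenation builds a new String object per call.
                    results[id][r] = ("intern-race-" + r).intern();
                }
            });
        }
        for (Thread w : workers) {
            w.start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        int mismatches = 0;
        for (int r = 0; r < rounds; r++) {
            for (int t = 1; t < threads; t++) {
                if (results[t][r] != results[0][r]) {
                    mismatches++;
                }
            }
        }
        return mismatches;
    }

    public static void main(String[] args) {
        System.out.println("mismatches: " + countMismatches(4, 20000));
    }
}
```

A nonzero count would show the contract being broken under concurrency. A zero count only says that this run, on this JVM, did not catch a violation.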

Also, someone with a JDC account should file an RFE or bug report to get
the documentation changed to state whether intern() is or is not
threadsafe.

tom
 
Mark Space

Lew said:
As stated upthread, there are all sorts of API calls whose contract is
violated in the presence of multi-threaded access. We have also seen
times when the Javadocs were incomplete or imprecise. <SNIP>

I'm reminded of Java Concurrency in Practice where it talks about
thread-safety of the JDBC object. It also doesn't document thread
safety, but it must have some guarantees or it can't possibly do its job.

Perhaps Daniel is thinking the same thing. I disagree, I think a
designer could easily give intern() single-threaded semantics and
consider the implementation "correct," but I suppose there is some
precedent for thinking as Daniel does.

Where is the *evidence*? Actual *evidence*?

Most JDBC objects are open source and well understood. Those that
weren't thread-safe had to be corrected. I don't see any evidence that
intern() must work in a multithreaded environment, but I think Tom's
point is well-taken: when in doubt, check. Opening up the source for
that native method might be worthwhile.
 
Mark Space

Tom said:
I differ from you slightly. I think Daniel's belief about the contract
not being violated in multithreaded programs is probably right - it


Well it's good that we all seem to agree on what the central issue is.

It would be really good to know if this was true. How could we test it?

I don't have time to do this right now, this week is very busy for me,
but I'd start by opening up that native method and taking a look at the
comments in the source code. Are the software engineers aware of
multi-threaded issues and designing for them? Or is it only thread-safe
by accident, or not at all? It would be interesting to know.

Personally, I bet the implementation is very simple, and just does the
minimum needed, but I'm now curious.
 
Arne Vajhøj

Mark said:
I'm reminded of Java Concurrency in Practice where it talks about
thread-safety of the JDBC object. It also doesn't document thread
safety, but it must have some guarantees or it can't possibly do its job.

What is "JDBC object" ?

Most classes in java.sql are most certainly not safe to use from
multiple threads.

Most JDBC objects are open source and well understood. Those that
weren't thread-safe had to be corrected.

????

We do not fix JDBC - we code so that each Connection, PreparedStatement
and ResultSet are only used by one thread.

Arne
 
