synchronized using String.intern()

Paul J. Lucas

I've seen differing opinions on whether it will do as one would expect to do
something like:

synchronized ( myString.intern() ) {
// ...
}

that is: for a given, unique string (say "foo"), does String.intern() guarantee
that:

s.intern() == t.intern() iff s.equals( t )

assuming s != null && t != null? I.e., does it guarantee the same object for a
string composed of the same characters? If yes, then synchronizing on it should
work as one would expect, right?

The use case is to prevent concurrent access to a particular file, e.g.:

File f = ...;
synchoronzed ( f.getCanonicalPath().intern() ) {
// ...
}

If this will *not* work, why not? And how can I achieve what I want?

- Paul
 
Mark Space

Paul said:
The use case is to prevent concurrent access to a particular file, e.g.:

File f = ...;
synchoronzed ( f.getCanonicalPath().intern() ) {
// ...
}

If this will *not* work, why not? And how can I achieve what I want?

Concurrent access for your program? Or any running program (outside the
JVM)?

If the latter you're at the mercy of your OS. Look up file locking.
For the former, well, recent Java versions have file locking, so... use
that.
 
Lew

Paul said:
I've seen differing opinions on whether it will do as one would expect to do
something like:

        synchronized ( myString.intern() ) {
           // ...
        }

that is: for a given, unique string (say "foo"), does String.intern() guarantee
that:

        s.intern() == t.intern() iff s.equals( t )

assuming s != null && t != null?  I.e., does it guarantee the same object for a
string composed of the same characters?  If yes, then synchronizing on it should
work as one would expect, right?

Javadocs are your friend:
for any two strings s and t,
s.intern() == t.intern() is true if and only if s.equals(t) is true.

It uses almost the exact same wording that you did.

Paul said:
The use case is to prevent concurrent access to a particular file, e.g.:

As Mark Space pointed out, synchronization between Java threads is not
related to file access.

Paul said:
File f = ...;
synchoronzed

Watch your spelling!

Paul said:
        ( f.getCanonicalPath().intern() ) {
            // ...
        }

If this will *not* work, why not?  And how can I achieve what I want?

You will synchronize on the object that represents the canonical path
to the file. That canonical path may differ depending on whether the
file exists yet, so the synchronization could end up being on
different objects, i.e., no synchronization at all.

There are ways to access the file without using the canonical path,
and those will not be synchronized.

None of the synchronization you do get will have anything to do with
file access, only with multithreaded execution of critical sections of
code within a single JVM.

The fact that the words "concurrent" and "synchronize" occur in
different contexts (memory, execution, file access) does not by itself
guarantee that the words mean the same thing across all contexts.
Different things are being synchronized in different ways for
different kinds of concurrency. In your example, even the things in
the same context, JVM threads, will be locking different objects and
not synchronizing correctly.
 
Tom Anderson

Paul said:
I've seen differing opinions on whether it will do as one would expect to do
something like:

synchronized ( myString.intern() ) {
// ...
}

that is: for a given, unique string (say "foo"), does String.intern()
guarantee that:

s.intern() == t.intern() iff s.equals( t )

assuming s != null && t != null? I.e., does it guarantee the same object for
a string composed of the same characters? If yes, then synchronizing on it
should work as one would expect, right?

The use case is to prevent concurrent access to a particular file, e.g.:

File f = ...;
synchoronzed ( f.getCanonicalPath().intern() ) {
// ...
}

If this will *not* work, why not? And how can I achieve what I want?

I think that's a pretty cunning hack, and i think it'll work.

Up to, as Lew says, degeneracy in the canonical path. If the file exists
before you start doing all this, and you're not using a case-insensitive
filesystem on an OS which doesn't handle it well (e.g. VFAT on Linux), and
you don't have any hard links or other weirdness like bits of the local
filesystem mounted as loopback network shares etc floating around, you
should be fine.

tom
 
Tom Anderson

Paul said:
Yes, I know (and I don't care about other VMs or processes).


In my case, the number of such strings is guaranteed to be on the order of
1-10, so, again, I don't care.

I think i'd be looking for a way to refactor the architecture so that all
access to one of those files went through a single application object. I
would then either synchronize on that, or site a higher-level locking
mechanism there.

Something like:

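import java.io.File;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Entry and ApplicationException stand for classes of your own application.
class ApplicationFile {
    private final File file;
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    ApplicationFile(File file) {
        this.file = file;
    }

    public void addEntryToFile(Entry entry) throws InterruptedException {
        // Give up if the write lock cannot be acquired within 200 ms.
        if (!lock.writeLock().tryLock(200, TimeUnit.MILLISECONDS))
            throw new ApplicationException("timed out waiting to lock file " + file.getName());
        try {
            // ADD ENTRY TO FILE, GIVEN file
        }
        finally {
            // Release the same lock that was acquired above.
            lock.writeLock().unlock();
        }
    }

    public void doSomeOtherLowLevelOperationOfYourApplication() {
        // similar to the above
    }
    // and more methods for the other low-level ops
}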

I'd then stick the 1-10 of these that get used in a HashMap, and pull them
out and use them as needed.

Even better, rather than passing around strings in the app to identify
which file needs to be worked on, i'd pass references to the
ApplicationFile instances, to save on the lookup.

tom
 
EJP

Paul said:
I've seen differing opinions on whether it will do as one would expect
to do something like:

synchronized ( myString.intern() ) {
// ...
}

that is: for a given, unique string (say "foo"), does String.intern()
guarantee that:

s.intern() == t.intern() iff s.equals( t )

That's what it says in the Javadoc. Any 'differing opinions' are uninformed.
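
A quick way to see the guarantee for yourself is a sketch like the one below. The class name InternDemo and the helper sameInterned are made up for illustration; the only library behaviour relied on is String.intern() itself.

```java
public class InternDemo {
    // True when both strings map to the same canonical instance in the intern pool.
    static boolean sameInterned(String s, String t) {
        return s.intern() == t.intern();
    }

    public static void main(String[] args) {
        // Two distinct String objects that hold the same characters.
        String s = new String("foo");
        String t = new StringBuilder("fo").append('o').toString();

        System.out.println(s == t);             // false: different objects
        System.out.println(s.equals(t));        // true: same characters
        System.out.println(sameInterned(s, t)); // true: same interned instance
    }
}
```

This only demonstrates the identity guarantee; it says nothing about whether locking on an interned string is a good idea for files.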
 
Paul J. Lucas

Tom said:
I think i'd be looking for a way to refactor the architecture so that
all access to one of those files went through a single application
object.

It already does.

Tom said:
Even better, rather than passing around strings in the app to identify
which file needs to be worked on, i'd pass references to the
ApplicationFile instances, to save on the lookup.

The name of the file comes from an external source. I can't do the above.

- Paul
 
Daniel Pitts

Paul said:
I've seen differing opinions on whether it will do as one would expect
to do something like:

synchronized ( myString.intern() ) {
// ...
}

that is: for a given, unique string (say "foo"), does String.intern()
guarantee that:

s.intern() == t.intern() iff s.equals( t )

assuming s != null && t != null? I.e., does it guarantee the same
object for a string composed of the same characters? If yes, then
synchronizing on it should work as one would expect, right?

The use case is to prevent concurrent access to a particular file, e.g.:

File f = ...;
synchoronzed ( f.getCanonicalPath().intern() ) {
// ...
}

If this will *not* work, why not? And how can I achieve what I want?

- Paul

Nice try, but it doesn't work...

Interned strings may be reclaimed by the garbage collector at any time,
so two successive calls to myString.intern() may not return the same
object (if the interned String instance wasn't stored somewhere).

You can do something like:

import java.io.File;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

interface FileAccessor<T> {
    T accessFile(File file) throws Exception;
}

public class FileAccessSynchronizer {
    ConcurrentMap<String, Object> fileNameInUse =
        new ConcurrentHashMap<String, Object>();

    <T> T lockAndAccess(File file, FileAccessor<T> accessor)
            throws Exception {
        // Resolve the key once, so a failure here cannot leave a stale map entry.
        final String path = file.getCanonicalPath();
        final Object myLock = new Object();
        // Hold my own lock for the whole operation so that waiters block on it.
        synchronized (myLock) {
            try {
                Object otherLock;
                do {
                    otherLock = fileNameInUse.putIfAbsent(path, myLock);
                    // If someone else has a registered lock:
                    if (otherLock != null) {
                        // Wait for them to release the lock.
                        synchronized (otherLock) {
                            // Grab it myself.
                            fileNameInUse.replace(path, otherLock, myLock);
                        }
                    }
                } while (otherLock != null);
                // do the work.
                return accessor.accessFile(file);
            } finally {
                fileNameInUse.remove(path, myLock);
            }
        }
    }
}

Then you can create any number of FileAccessor classes (even anonymous
ones), that access the file.

Note, this is untested, but it "looks right to me" (famous last words).
 
Mark Space

Paul said:
The name of the file comes from an external source. I can't do the above.

Just curious: what external source?

You can store file names in a Map, associating them to an object which
is synchronized. No need for intern, the Map will compare strings based
on equals() by default.

fileMap.put( filename, synchFile );
SynchronizedFile sf = fileMap.get( filename );

where filename is a String and SynchronizedFile is some class you create
to handle the IO synchronization. fileMap may have to be some sort of
global, which is bad. Try to inject it if possible. Be very certain
that this external source doesn't open any file handles of its own, or
none of this will work.
 
Paul J. Lucas

Daniel said:
Interned strings may be reclaimed by the garbage collector at any time,

How do you know that? The Javadoc doesn't say so.

Daniel said:
so two successive calls to myString.intern() may not return the same
object (if the interned String instance wasn't stored somewhere).

(See: this is one of those "differing opinions" I referred to in my original post.)

[snip]

I just discovered that Java has FileLock. Couldn't I just use that? Does it
actually work as advertised cross-platform (at least Win, Mac, Linux)? Wouldn't
that be simpler?

- Paul
 
Paul J. Lucas

Mark said:
Just curious: what external source?

Some other process says: I want either to fetch or replace the contents of a
file *named* "foo". The other process only specifies the *name* and *not* the
full path to the file. My process is the only one that knows the directory
these files live in. Then, if fetching, I slurp in the contents of said file
and burp it back out to the other process (via a socket); if replacing, I accept
the new contents of the file (via a socket) and overwrite the file. The other
process never accesses the file directly. In effect, the "name" acts as a key to
a file.

Mark said:
You can store file names in a Map, associating them to an object which
is synchronized. No need for intern, the Map will compare strings based
on equals() by default.

fileMap.put( filename, synchFile );
SynchronizedFile sf = fileMap.get( filename );

where filename is a String and SynchronizedFile is some class you create
to handle the IO synchronization.

Or simply have a Map<String,Object> and I synchronize on the Object.

- Paul
 
Tom Anderson

Paul said:
It already does.


The name of the file comes from an external source. I can't do the above.

Oh, i'm sure you could manage it. You have a registry which maps names to
file-wrapper objects somewhere (this could be as simple as a map living in
a static variable somewhere), and the instant you have a filename, in the
code which interfaces with the external source, you map it to the
corresponding file wrapper. Look at it this way: if your external source
had some numbers in it, would you pass around strings, or do an
Integer.parseInt as soon as you could, and then pass around the numbers?

tom
 
Tom Anderson

Paul said:
How do you know that? The Javadoc doesn't say so.


(See: this is one of those "differing opinions" I referred to in my original
post.)

True. But it's misguided: it's not true that interned strings may be
reclaimed at any time. What is true is that they may be reclaimed by the
garbage collector at any time *at which they are not referenced*. In the
idiom you described, the string is referenced throughout the synchronized
block, so we know it won't be collected during that time, and thus we know
that any other internments of the same string during that time will return
the same instance.

If, on the other hand, you used the reference to get a lock, then dropped
it, then went on to do the protected operation, this would indeed be a
problem.

Paul said:
I just discovered that Java has FileLock. Couldn't I just use that?
Does it actually work as advertised cross-platform (at least Win, Mac,
Linux)? Wouldn't that be simpler?

Javadoc says:

File locks are held on behalf of the entire Java virtual machine. They
are not suitable for controlling access to a file by multiple threads
within the same virtual machine.

So it doesn't address your original problem.

tom
 
Tom Anderson

Daniel said:
You can do something like:

import java.io.File;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

interface FileAccessor<T> {
    T accessFile(File file) throws Exception;
}

public class FileAccessSynchronizer {
    ConcurrentMap<String, Object> fileNameInUse =
        new ConcurrentHashMap<String, Object>();

    <T> T lockAndAccess(File file, FileAccessor<T> accessor)
            throws Exception {
        // Resolve the key once, so a failure here cannot leave a stale map entry.
        final String path = file.getCanonicalPath();
        final Object myLock = new Object();
        // Hold my own lock for the whole operation so that waiters block on it.
        synchronized (myLock) {
            try {
                Object otherLock;
                do {
                    otherLock = fileNameInUse.putIfAbsent(path, myLock);
                    // If someone else has a registered lock:
                    if (otherLock != null) {
                        // Wait for them to release the lock.
                        synchronized (otherLock) {
                            // Grab it myself.
                            fileNameInUse.replace(path, otherLock, myLock);
                        }
                    }
                } while (otherLock != null);
                // do the work.
                return accessor.accessFile(file);
            } finally {
                fileNameInUse.remove(path, myLock);
            }
        }
    }
}

Then you can create any number of FileAccessor classes (even anonymous ones),
that access the file.

Note, this is untested, but it "looks right to me" (famous last words).

Despite my trash-talking in my other post, i think this is a pretty good
solution.

I'm not sure it needs to be quite this sophisticated - do you need to lock
your own lock first? Would:

Object myLock = new Object();
Object lock = fileNameInUse.putIfAbsent(file.getCanonicalPath(), myLock);
if (lock == null) lock = myLock;
synchronized (lock) {
// do stuff
}

Not work? I think this is guaranteed to leave everyone with the same value
in lock, and thus getting mutual exclusive access to the synchronized
block. Am i missing something?

It's a shame that ConcurrentMap doesn't have a method that's like
putIfAbsentAndReturnWhateverTheValueIsAfterThisCall; that would eliminate
the need for much of the fancy footwork. I think it'd be much more
generally useful than the current putIfAbsent (although of course you can
build the behaviour i want on top of putIfAbsent, as i do above, so i
suppose the current semantics are more general). Python has this method
(although not threadsafe) on its dictionaries under the name setdefault,
and it's rather useful.
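
For readers on Java 8 or later (an assumption; it postdates this thread), ConcurrentMap.computeIfAbsent behaves much like the method wished for here: it returns whatever value the key maps to once the call completes. A minimal sketch, with the class name GetOrCreateDemo and the helper lockFor made up for illustration:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class GetOrCreateDemo {
    // Returns the lock object registered for key, creating one if none exists yet.
    static <K> Object lockFor(ConcurrentMap<K, Object> locks, K key) {
        return locks.computeIfAbsent(key, k -> new Object());
    }

    public static void main(String[] args) {
        ConcurrentMap<String, Object> locks = new ConcurrentHashMap<>();
        Object first = lockFor(locks, "foo");
        Object second = lockFor(locks, new String("foo"));
        // Keys are compared with equals(), so both calls yield the same lock object.
        System.out.println(first == second); // true
    }
}
```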

tom
 
John B. Matthews

Paul J. Lucas said:
How do you know that? The Javadoc doesn't say so.

The behavior is implementation-dependent.
[...]

Here's an article that discusses the matter further and suggests a way
to check one's implementation:

http://www.javaworld.com/javaworld/javaqa/2003-12/01-qa-1212-intern.html

Here's some interesting history:

http://mindprod.com/jgloss/interned.html#GC

[...]
I just discovered that Java has FileLock. Couldn't I just use that?
Does it actually work as advertised cross-platform (at least Win,
Mac, Linux)? Wouldn't that be simpler?

The FileLock API discusses platform dependencies:

http://java.sun.com/javase/6/docs/api/java/nio/channels/FileLock.html
 
Daniel Pitts

Tom said:
Despite my trash-talking in my other post, i think this is a pretty good
solution.

I'm not sure it needs to be quite this sophisticated - do you need to
lock your own lock first? Would:

Object myLock = new Object();
Object lock = fileNameInUse.putIfAbsent(file.getCanonicalPath(), myLock);
if (lock == null) lock = myLock;
synchronized (lock) {
// do stuff
}

Not work? I think this is guaranteed to leave everyone with the same
value in lock, and thus getting mutual exclusive access to the
synchronized block. Am i missing something?

Think about this scenario:
Thread a: Successfully puts in a new lock.
Thread b: Tries to get the lock, but instead finds thread a's lock (that's
fine for now).
Thread a: Finishes what needs to be finished, removes its lock.
Thread c: Successfully puts in a new lock.
Thread b: Enters the synchronized block with the original thread a lock.
Thread c: Accesses the file concurrently: errors, bugs, nuclear explosions!

Tom said:
It's a shame that ConcurrentMap doesn't have a method that's like
putIfAbsentAndReturnWhateverTheValueIsAfterThisCall; that would
eliminate the need for much of the fancy footwork. I think it'd be much
more generally useful than the current putIfAbsent (although of course
you can build the behaviour i want on top of putIfAbsent, as i do above,
so i suppose the current semantics are more general). Python has this
method (although not threadsafe) on its dictionaries under the name
setdefault, and it's rather useful.

It would help some, but you still need to synchronize on the Object that
is *going into* the map *BEFORE* it gets there, and you still need to
loop until that Object gets in.
 
Tom Anderson

Daniel said:
Think about this scenario:
Thread a: Successfully puts in a new lock.
Thread b: Tries to get the lock, but instead finds thread a's lock (that's
fine for now).
Thread a: Finishes what needs to be finished, removes its lock.
Thread c: Successfully puts in a new lock.
Thread b: Enters the synchronized block with the original thread a lock.
Thread c: Accesses the file concurrently: errors, bugs, nuclear explosions!

Okay, i completely missed that the lock was being removed at the end.
You're quite right, my approach is incorrect.

My solution, though, would be to not remove the lock from the map - i'd
leave it there for ever afterwards. I got the impression from the OP that
there was a small set of filenames which could be used, and thus the map
won't grow without limit. If that wasn't the case, using WeakReferences
(but not a WeakHashMap!) would be a solution that still avoided the need
for explicit removal.
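
A minimal sketch of the never-remove variant, assuming the set of names really is small and bounded. The class NamedLocks and its methods lockFor and withLock are hypothetical names for illustration, not anything from the posts above.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class NamedLocks {
    // One lock object per name; entries are never removed, so the map only
    // stays small if the set of names is bounded (an assumption).
    private final ConcurrentMap<String, Object> locks =
        new ConcurrentHashMap<String, Object>();

    // Returns the single lock object associated with name.
    public Object lockFor(String name) {
        Object fresh = new Object();
        Object existing = locks.putIfAbsent(name, fresh);
        return existing != null ? existing : fresh;
    }

    // Runs action while holding the lock for name.
    public <T> T withLock(String name, Callable<T> action) throws Exception {
        synchronized (lockFor(name)) {
            return action.call();
        }
    }
}
```

Because nothing is ever removed, every thread asking for the same name gets the same object, which sidesteps the remove-then-reinsert race in the scenario above. This only coordinates threads inside one JVM.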

tom
 
Paul J. Lucas

Tom said:
Javadoc says:

File locks are held on behalf of the entire Java virtual machine. They
are not suitable for controlling access to a file by multiple threads
within the same virtual machine.

So it doesn't address your original problem.

But the very next sentence says:

File-lock objects are safe for use by multiple concurrent threads.

That looks like it contradicts the previous sentence.

- Paul
 
Mike Schilling

Paul said:
But the very next sentence says:

File-lock objects are safe for use by multiple concurrent threads.

That looks like it contradicts the previous sentence.

Does it? I read it as saying they're thread-safe, that is, once you
take a lock, which is effectively held by all threads, you can access
and manipulate it from any thread you like.
 
Mike Schilling

Paul said:
So does that mean if thread-1 creates a FileLock on "foo", then,
because the lock is "held" by all threads concurrently, if thread-2
tries to create a FileLock also on "foo", it will *not* block because
thread-2 "already" holds the lock also?

I don't know how file locks work, so I'm not sure if it creates the
lock or throws an exception because a conflicting lock is already held
by that JVM.
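
For what it's worth, the FileChannel Javadoc lists OverlappingFileLockException for the case where a lock overlapping the requested region is already held by this Java virtual machine, which suggests the second attempt is rejected with an exception rather than blocking. The sketch below tries this on a temporary file; the class name FileLockDemo and the method secondAttemptRejected are made up for illustration, and it assumes Java 7 or later for java.nio.file.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class FileLockDemo {
    // Returns true if a second lock attempt from within this JVM is rejected.
    static boolean secondAttemptRejected() throws IOException {
        Path path = Files.createTempFile("filelock-demo", ".tmp");
        try (FileChannel channel = FileChannel.open(path, StandardOpenOption.WRITE)) {
            FileLock first = channel.lock(); // exclusive lock on the whole file
            try {
                channel.tryLock();           // overlapping request from the same JVM
                return false;                // no exception: the attempt was not rejected
            } catch (OverlappingFileLockException e) {
                return true;                 // this JVM already holds the region
            } finally {
                first.release();
            }
        } finally {
            Files.deleteIfExists(path);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("second attempt rejected: " + secondAttemptRejected());
    }
}
```

So FileLock on its own does not serialize threads inside one JVM; some in-process lock is still needed for that.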
 
