Bogus NullPointerExceptions

T

Twisted

while (!dir.equals(baseDir) && dir.list() != null && dir.list().length
== 0) {
File parent = dir.getParentFile();
dir.delete();
dir = parent;
}

is being used to nuke some empty directories in an app of mine, then
the parent if it's now empty, and so forth up the chain to a top
directory.

Sometimes, the while line is throwing an NPE, a problem that seems
impossible.

First, baseDir is not null. It's set only once and never changed,
nothing that uses it ever throws an NPE except this line, and this line
only uses it as the RHS of .equals(), where null is supposed to be
legal anyway.

Second, dir is not null (I added an explicit throw of NPE if dir was
null just before the "while" loop and the "while" line still threw the
NPEs, rather than the line with the explicit throw).

Finally, dir.list().length is accessed only after a short-circuit and
after a test that dir.list() isn't null.

The only logical explanation seems to be that dir.list() can return an
array one nanosecond and null the next...

I don't suppose this is some weird concurrency problem. I guess I'll
try synchronizing on something (baseDir?) before the loop...
 
P

Patricia Shanahan

Twisted said:
while (!dir.equals(baseDir) && dir.list() != null && dir.list().length
== 0) {
File parent = dir.getParentFile();
dir.delete();
dir = parent;
}

is being used to nuke some empty directories in an app of mine, then
the parent if it's now empty, and so forth up the chain to a top
directory.

Sometimes, the while line is throwing an NPE, a problem that seems
impossible.

First, baseDir is not null. It's set only once and never changed,
nothing that uses it ever throws an NPE except this line, and this line
only uses it as the RHS of .equals(), where null is supposed to be
legal anyway.

Second, dir is not null (I added an explicit throw of NPE if dir was
null just before the "while" loop and the "while" line still threw the
NPEs, rather than the line with the explicit throw).

Finally, dir.list().length is accessed only after a short-circuit and
after a test that dir.list() isn't null.

The only logical explanation seems to be that dir.list() can return an
array one nanosecond and null the next...

I don't suppose this is some weird concurrency problem. I guess I'll
try synchronizing on something (baseDir?) before the loop...

Additionally, your code seems to me to depend on baseDir being on the
parent chain from the initial value of dir.

If that were not the case, you would reach the end of the parent chain
without encountering the stop condition, so getParentFile would return
null. If that happened, dir would be null in the while test without
having been null when you went through the check before the while loop.

If you have not already done so, perhaps check the code that calculates
dir and baseDir?

Patricia
 
T

Twisted

Patricia said:
Additionally, your code seems to me to depend on baseDir being on the
parent chain from the initial value of dir.

It does, and earlier code guarantees that baseDir is on the parent
chain. (Certain dirs under baseDir are checked for being empty and
deleted if so.)
 
F

Fred Kleinschmidt

Twisted said:
while (!dir.equals(baseDir) && dir.list() != null && dir.list().length
== 0) {
File parent = dir.getParentFile();

Note that getParentFile() can return null.
dir.delete();
dir = parent;

if parent is null, dir is now null, and dir.equals() will generate a NPE
}

<snip>

Sometimes, the while line is throwing an NPE, a problem that seems
impossible.
<snip>
 
P

Patricia Shanahan

Twisted said:
It does, and earlier code guarantees that baseDir is on the parent
chain. (Certain dirs under baseDir are checked for being empty and
deleted if so.)

The fact that directory X is under directory Y does not guarantee that
all File objects for X have Y as a getParentFile ancestor.

Of course, the way you are creating the File object may ensure that dir
does have baseDir as a getParentFile ancestor, but if I were you I would
put in a check for null getParentFile result just in case.

Patricia
 
T

Twisted

Fred said:
Note that getParentFile() can return null.

I should have been clearer in the original post. This code is reached
only with dir a descendant of baseDir, so it can't reach the root and
try to keep going.
 
T

Twisted

Patricia said:
The fact that directory X is under directory Y does not guarantee that
all File objects for X have Y as a getParentFile ancestor.

It should, as long as X is genuinely under Y rather than some kind of
shortcut to X being under Y. In this particular application, is
certainly is under Y (it was reached by traversal from Y to begin
with).

Anyway it does seem to have been a concurrency issue -- synchronizing
on baseDir (which is app global and unchanging) at several key spots in
the code made the NPEs go away.
 
D

Daniel Pitts

Twisted said:
I should have been clearer in the original post. This code is reached
only with dir a descendant of baseDir, so it can't reach the root and
try to keep going.

It doesn't hurt to add an assert.

File parent = dir.getParentFile();
assert parent != null : "Whoops, missed my parent!";
 
A

Andreas Leitgeb

Twisted said:
Anyway it does seem to have been a concurrency issue -- synchronizing
on baseDir (which is app global and unchanging) at several key spots in
the code made the NPEs go away.

There are two global (principially) modifyable "objects" involved:
baseDir, which you're sure remains unmodified throughout the
running program, and the filesystem itself!

Perhaps the method gets to run twice in parallel, and
the parent-directory obtained in first thread has already
been deleted in the other thread just before you get
to deal with it in the first one. Or something like that.

In that case, synchronizing on baseDir obviously solved
the problem, even if just indirectly. Probably it would
have been enough to synchronize only those methods on baseDir
that actually modify the filesystem's subtree starting at
baseDir.
 
I

Ingo Menger

Twisted said:
I should have been clearer in the original post. This code is reached
only with dir a descendant of baseDir, so it can't reach the root and
try to keep going.

The NullPointerException tells you, however, that it can or that there
may be other reasons why getParentFile() returns null.
BTW, make sure you write code that will work with symbolic links.
 
P

Patricia Shanahan

Twisted said:
It should, as long as X is genuinely under Y rather than some kind of
shortcut to X being under Y. In this particular application, is
certainly is under Y (it was reached by traversal from Y to begin
with).

That should work.
Anyway it does seem to have been a concurrency issue -- synchronizing
on baseDir (which is app global and unchanging) at several key spots in
the code made the NPEs go away.

I would still try to nail down what is really happening, because adding
arbitrary synchronization can make a program work more often, without
fixing the underlying problem.

The implication is that baseDir is changing, although you thought it was
fixed, which is disturbing.

Alternatively, there could be an initialization problem, if dir is being
calculated in a different thread from the NPE.

Patricia
 
C

Chris Uppal

Daniel said:
[...]

It doesn't hurt to add an assert.

File parent = dir.getParentFile();
assert parent != null : "Whoops, missed my parent!";

Or better still -- since this code is running out of control while deleting
stuff (!) -- some heavy duty tracing/logging so that he can find out /exactly/
what erroneous assumption(s) the code is making.

And then leave the null-check in, but use a test plus fatal-internal-error
notification, not a switchable check like an assertion.

-- chris
 
T

Twisted

Ingo said:
The NullPointerException tells you, however, that it can or that there
may be other reasons why getParentFile() returns null.
BTW, make sure you write code that will work with symbolic links.

There are none in this application. It creates and destroys
subdirectories of baseDir to house temporary data of various kinds. The
directories are only ever reached by drilling down from baseDir to
begin with. Someone would have to manually and deliberately drop a
symlink to a different part of the directory hierarchy beneath baseDir
for there to be a problem.

It's unlikely even then -- it gets dir from an internal memory of what
it's created, rather than from browsing around under baseDir.
 
T

Twisted

Andreas said:
There are two global (principially) modifyable "objects" involved:
baseDir, which you're sure remains unmodified throughout the
running program, and the filesystem itself!

Perhaps the method gets to run twice in parallel, and
the parent-directory obtained in first thread has already
been deleted in the other thread just before you get
to deal with it in the first one. Or something like that.

In that case, synchronizing on baseDir obviously solved
the problem, even if just indirectly. Probably it would
have been enough to synchronize only those methods on baseDir
that actually modify the filesystem's subtree starting at
baseDir.

I now think this is what was happening. I do now synchronize on baseDir
everywhere that might create or destroy subdirectories of it.
 
T

Twisted

Patricia said:
The implication is that baseDir is changing, although you thought it was
fixed, which is disturbing.

It can't be -- it's final. It's the structure of the actual filesystem
that was changing asynchronously once I added multithreading.
 
T

Tom Forsmo

Twisted said:
while (!dir.equals(baseDir) && dir.list() != null && dir.list().length
== 0) {
File parent = dir.getParentFile();
dir.delete();
dir = parent;
}

There is one big problem with your code. dir can be changed during the
loop, so it does not matter whether you are testing dir for null before
the loop starts. According to the code you have shown us, baseDir only
needs to be checked before the loop. because its not used elsewhere in
the code so it does not change.

The reason for your problem is that you can *not guarantee* that any of
the references/methods you are using does not return null (you are
experiencing an NPE even when you claim there can not be any).
In addition, your code is very unstable. If, for a reason, you change
how baseDir is used or how getParentFile() works, you could get into
problems again in the future.

What you need to do is the following, for absolute certainty:

while (dir != null && baseDir != null &&
!dir.equals(baseDir) &&
dir.list() != null &&
dir.list().length > == 0) {
...
}

tom
 
T

Twisted

Tom said:
What you need to do is the following, for absolute certainty:

while (dir != null && baseDir != null &&
!dir.equals(baseDir) &&
dir.list() != null &&
dir.list().length > == 0) {
...
}

This has been debated to death already.

As was discussed earlier, if it DID happen that drilling down from
baseDir to an empty child directory and then working back up with
getParent() could produce a null before reaching baseDir again, it
would mean that the library had a bug (unless some sort of symlink was
placed into one of the directories, which my app doesn't do). In actual
FACT, the NPEs went away when some synchronization was sprinkled over
the problem, indicating a concurrency issue. Finally, contrary to the
alarmist remarks of one poster, it can't "run amok deleting stuff"
because it is explicitly only capable of deleting a directory that is
empty at the time. :p
 
I

Ingo Menger

Twisted said:
This has been debated to death already.

As was discussed earlier, if it DID happen that drilling down from
baseDir to an empty child directory and then working back up with
getParent() could produce a null before reaching baseDir again, it
would mean that the library had a bug (unless some sort of symlink was
placed into one of the directories, which my app doesn't do).

It could be enough if some component of the path to baseDir is a
symlink.
$ cd /foo/bar/base
$ /usr/bin/pwd
/has/nothing/to/do/with/pathname/entered/has/it?

Or is there a guarantee that getParent() will work "in text mode" only?
 
T

Twisted

Ingo said:
It could be enough if some component of the path to baseDir is a
symlink.
$ cd /foo/bar/base
$ /usr/bin/pwd
/has/nothing/to/do/with/pathname/entered/has/it?

Or is there a guarantee that getParent() will work "in text mode" only?

I suppose every newcomer to this thread will need this explained
separately and again.

The app creates directories under baseDir. It also creates some files.
It does not create any symlinks, nor is anyone expected to manually
change anything in there.

It also sometimes deletes one of the files, and if this leaves a
directory empty, deletes the directory and works its way up the chain
in case this had made the parent empty and so forth.

The worst case "run amok" scenario would require that someone create a
chain of empty directories somewhere and then place a symlink to the
last one under baseDir. The app would then have to decide on its own to
put a file in a same-named subdirectory, and finding one already
existed not create it; the file then ends up in the symlink. And then
the app has to later delete the file. And the result of this worst-case
scenario is for it to delete the empty directories in the other chain
recursively until it hit one that wasn't empty. And it would have to
delete an empty logical filesystem root before it tried to delete null.

I don't think I've ever even *seen* an empty logical filesystem root.
An empty individual drive root from time to time and *that* is rare.

And of course it can't delete an actual file, except for the extremely
unlikely occurrence of someone (other than the app, now that it's
synchronizing on baseDir) putting a file manually into one of the
subdirectories just as the app is between testing it for being empty
and deleting it. (The most likely case would actually be running two
concurrent instances of the app and pointing them both at the same
baseDir, or at least one at a parent or child of the other's. And that
will, at worst, recreate the concurrency problems that used to exist.)

This thread is over.
 
T

Tom Forsmo

Twisted said:
I suppose every newcomer to this thread will need this explained
separately and again.

Don't be condescending. You have a bad design and you insist on fixing
it by adding more complexity, synchronisation.
> This thread is over

That's why its not over, you may ignore any further comments if you
wish, but you should not tell others what to do with the thread in a
public forum.

tom
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,767
Messages
2,569,572
Members
45,046
Latest member
Gavizuho

Latest Threads

Top