Performance of isDirectory()

Bill Tschumy

I have a customer that is reporting severe performance problems with Jurtle,
my beginner's Java IDE. He is in a high school setting where they netboot all
their Macs. The students' home directories are kept on the server.

Jurtle displays a listing of the current directory, showing class files,
directories, and other "assets". It appears that the act of building my
directory listing is putting a severe load on their server. This is
especially true when the student navigates to the directory containing the
2000 student home directories. It locks up the server (~100% utilization)
for about 60 seconds while building the listing!!!

Basically all I do is get a directory listing with the file.list() method. I
then iterate through the list and determine if the item is a class file,
directory, or something else. I use File's isDirectory() method on each file
in the directory listing to see if it is a directory. The only thing I can
think of is that file.isDirectory() goes back to the server across the
network to get this information and this is killing performance.

I almost think there must be some misconfiguration of the server to cause
such poor performance. Can anyone think of something I might try (coding or
otherwise) to solve this?
 
Robert

You should post the code. Too many things could be wrong. I doubt
it's isDirectory(). Also, if you're afraid to post the code, get JProbe
and you'll find the problem quickly. It's free for 30 days.
 
JScoobyCed

Bill said:
I have a customer that is reporting severe performance problems with Jurtle,
my beginner's Java IDE. He is in a high school setting where they netboot all
their Macs. The students' home directories are kept on the server.

Jurtle displays a listing of the current directory, showing class files,
directories, and other "assets".

Then why don't you list only the current directory and cache the result
in a local object? When you change to another directory, check whether it
is already in the cache; if not, call list(). Caching could be a bit
tricky if other clients modify the file system concurrently, so maybe you
can just list() the directory you want to display on the fly.
This is just a thought.
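
The caching idea above could be sketched roughly as follows. This is only an illustration, not code from Jurtle: the class name DirectoryListingCache and its methods are hypothetical, and using the directory's lastModified() timestamp as the staleness check is an assumption. That check costs one metadata call per directory rather than one per entry, and it can miss changes made within the file system's timestamp granularity.

```java
import java.io.File;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: caches the result of File.list() per directory.
class DirectoryListingCache {
    private static final class Entry {
        final long lastModified;
        final String[] names;

        Entry(long lastModified, String[] names) {
            this.lastModified = lastModified;
            this.names = names;
        }
    }

    private final Map<String, Entry> cache = new HashMap<>();
    private int misses;

    // Returns the cached names unless the directory's timestamp has changed.
    synchronized String[] list(File dir) {
        String key = dir.getAbsolutePath();
        long stamp = dir.lastModified();      // one metadata call per directory
        Entry entry = cache.get(key);
        if (entry == null || entry.lastModified != stamp) {
            String[] names = dir.list();      // null if dir is not a readable directory
            if (names == null) {
                names = new String[0];
            }
            entry = new Entry(stamp, names);
            cache.put(key, entry);
            misses++;
        }
        return entry.names.clone();           // callers cannot corrupt the cached array
    }

    // Number of times the directory actually had to be re-read.
    synchronized int misses() {
        return misses;
    }

    public static void main(String[] args) {
        DirectoryListingCache cache = new DirectoryListingCache();
        File dir = new File(args.length > 0 ? args[0] : ".");
        cache.list(dir);
        cache.list(dir);
        System.out.println("entries: " + cache.list(dir).length
                + ", directory reads: " + cache.misses());
    }
}
```

Note that this only saves repeated list() calls; it does nothing about the per-entry isDirectory() calls unless the classification results are cached as well.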
 
Betty

There are two methods. I use "String[] list = f.list();",
which returns an array of Strings. The other method, listFiles(),
returns an array of "File". It is pretty clear that if all you
have about a file is a string, you will have to go back to
the server to find out if it is a directory. Try the other one.
It might have what you need and not have to play tag with the server.
 
Thomas Weidenfeller

Bill said:
Jurtle displays a listing of the current directory, showing class files,
directories, and other "assets". It appears that the act of building my
directory listing is putting a severe load on their server.

Consider using a network sniffer like Ethereal to figure out which
operations are actually going over the network, and which create the
high load.

/Thomas
 
Bill Tschumy

I sent the person a test program to try and it looks like isDirectory() is
*not* causing the problem. I still don't know what it could be. I think you
may be right that a network sniffer may be the way to go.
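
A test program of this kind might look like the sketch below. It is not the program that was actually sent to the customer; the class name ListingTimer and the measure() method are made up for illustration. It times the list() call and the isDirectory() loop separately, so the two costs can be compared on a local disk and on the network volume.

```java
import java.io.File;

// Hypothetical timing harness: separates the cost of list() from isDirectory().
class ListingTimer {
    // Returns {entryCount, directoryCount, listNanos, isDirectoryNanos}.
    static long[] measure(File dir) {
        long t0 = System.nanoTime();
        String[] names = dir.list();          // null if dir is not a readable directory
        long t1 = System.nanoTime();
        if (names == null) {
            names = new String[0];
        }

        long directories = 0;
        for (String name : names) {
            if (new File(dir, name).isDirectory()) {
                directories++;
            }
        }
        long t2 = System.nanoTime();

        return new long[] { names.length, directories, t1 - t0, t2 - t1 };
    }

    public static void main(String[] args) {
        File dir = new File(args.length > 0 ? args[0] : ".");
        long[] r = measure(dir);
        System.out.println(dir + ": " + r[0] + " entries, " + r[1] + " directories");
        System.out.println("list():        " + (r[2] / 1000000.0) + " ms");
        System.out.println("isDirectory(): " + (r[3] / 1000000.0) + " ms total");
    }
}
```

Running it once as a standalone program and once from inside the application (same directory, same machine) would show whether the surrounding environment changes the cost of the calls.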
 
Bill Tschumy

I have been investigating the performance of isDirectory() using ktrace/kdump
to see what system calls are generated. The results are quite surprising. I
have run the same test code as a standalone program and as a method invoked
from a menu inside my Jurtle application.

Here is the test code:

System.out.println("Start");
File dir = new File( "/Users/bill/testdir/" );

// Fetch only the entry names; list() returns no type information.
System.out.println("getting list");
String[] files = dir.list();
System.out.println("got it!");

// Ask the file system about each entry in turn.
for ( int i = 0; i < files.length; i++ )
{
    String filename = files[ i ];
    File f = new File( dir, filename );
    System.out.println("calling isDirectory()");
    boolean isDir = f.isDirectory();
    System.out.println("called");
}

In the directory I'm listing (testdir) there are only two files. There is a
dramatic difference in what isDirectory() does in the two cases. Here is the
output of calling isDirectory() for the standalone program:

"calling isDirectory()"
6678 java RET write 21/0x15
6678 java CALL write(0x1,0xf07fbb40,0x1)
6678 java GIO fd 1 wrote 1 byte
"
"
6678 java RET write 1
6678 java CALL stat(0x50ace0,0xf07fff00)
6678 java NAMI "/Users/bill/testdir/file2"
6678 java RET stat 0
6678 java CALL write(0x1,0xf07fbad0,0x6)
6678 java GIO fd 1 wrote 6 bytes
"called"

It does a single stat() call on one of the files in the directory.

Here is the same range of output when the **same** code is called via a menu
in my app.

"calling isDirectory()"
6875 java RET write 21/0x15
6875 java CALL write(0x1,0xf0eb8eb0,0x1)
6875 java GIO fd 1 wrote 1 byte
"
"
6875 java RET write 1
6875 java CALL lstat(0xf0ebc690,0xf0ebc150)
6875 java NAMI "/Users"
6875 java RET lstat 0
6875 java CALL lstat(0xf0ebc690,0xf0ebc150)
6875 java NAMI "/Users/bill"
6875 java RET lstat 0
6875 java CALL lstat(0xf0ebc690,0xf0ebc150)
6875 java NAMI "/Users/bill/testdir"
6875 java RET lstat 0
6875 java CALL lstat(0xf0ebc690,0xf0ebc150)
6875 java NAMI "/Users/bill/testdir/file2"
6875 java RET lstat 0
6875 java CALL stat(0xf0ebc690,0xf0ebc550)
6875 java NAMI "/Users"
6875 java RET stat 0
6875 java CALL open(0x94c34658,0,0)
6875 java NAMI "/"
6875 java RET open 12/0xc
6875 java CALL getdirentries(0xc,0xf0eba4d0,0x2000,0xf0ebc4d0)
6875 java RET getdirentries 676/0x2a4
6875 java CALL close(0xc)
6875 java RET close 0
6875 java CALL stat(0xf0ebc690,0xf0ebc550)
6875 java NAMI "/Users/bill"
6875 java RET stat 0
6875 java CALL open(0xf0ebc690,0,0)
6875 java NAMI "/Users"
6875 java RET open 12/0xc
6875 java CALL getdirentries(0xc,0xf0eba4d0,0x2000,0xf0ebc4d0)
6875 java RET getdirentries 144/0x90
6875 java CALL close(0xc)
6875 java RET close 0
6875 java CALL stat(0xf0ebc690,0xf0ebc550)
6875 java NAMI "/Users/bill/testdir"
6875 java RET stat 0
6875 java CALL open(0xf0ebc690,0,0)
6875 java NAMI "/Users/bill"
6875 java RET open 12/0xc
6875 java CALL getdirentries(0xc,0xf0eba4d0,0x2000,0xf0ebc4d0)
6875 java RET getdirentries 4996/0x1384
6875 java CALL close(0xc)
6875 java RET close 0
6875 java CALL stat(0xf0ebc690,0xf0ebc550)
6875 java NAMI "/Users/bill/testdir/file2"
6875 java RET stat 0
6875 java CALL open(0xf0ebc690,0,0)
6875 java NAMI "/Users/bill/testdir"
6875 java RET open 12/0xc
6875 java CALL getdirentries(0xc,0xf0eba4d0,0x2000,0xf0ebc4d0)
6875 java RET getdirentries 56/0x38
6875 java CALL close(0xc)
6875 java RET close 0
6875 java CALL stat(0x5312c0,0xf0ebd270)
6875 java NAMI "/Users/bill/testdir/file2"
6875 java RET stat 0
6875 java CALL write(0x1,0xf0eb8e40,0x6)
6875 java GIO fd 1 wrote 6 bytes
"called"

God only knows what it is doing here. It seems to be (twice) getting stats
and directory listings on every directory in the path. I'm not sure, but I
strongly suspect this is what is killing network performance on my customer's
remote server.

Anyone have any thoughts on this? Quite frankly I'm pretty stumped by this.
 
Esmond Pitt

Bill said:
for ( int i = 0; i < files.length; i++ )
{
    String filename = files[ i ];
    File f = new File( dir, filename );
    System.out.println("calling isDirectory()");
    boolean isDir = f.isDirectory();
    System.out.println("called");
}

Bill

As has already been suggested, don't use list(), use listFiles().
File.list() does enough operations on the directory to return a File[]
but only returns you the filename so you then have to construct another
File and do another operation to find out whether it is a directory.
listFiles() does all this in one operation.
 
Bill Tschumy

No, I tried listFiles() and it has the same problem when calling
isDirectory() as list() does. It apparently does not pre-fetch that
information. I used ktrace to verify this.
 
Betty

As a temporary method you might consider taking advantage
of naming conventions. For example if a filename ends with ".class"
it is more likely to be a java class file than a directory (but this
is not absolutely true).
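
The naming-convention shortcut could look something like this. It is a sketch only: EntryClassifier, Kind and classify() are hypothetical names, and treating every name ending in ".class" as a class file is exactly the assumption described above, so a directory named "Foo.class" would be misclassified.

```java
import java.io.File;

// Hypothetical helper: decides by name first and asks the file system only when needed.
class EntryClassifier {
    enum Kind { CLASS_FILE, DIRECTORY, OTHER }

    static Kind classify(File dir, String name) {
        // Assumption: a ".class" suffix means a class file, so no file system call is made.
        if (name.endsWith(".class")) {
            return Kind.CLASS_FILE;
        }
        // Everything else still costs one isDirectory() call.
        return new File(dir, name).isDirectory() ? Kind.DIRECTORY : Kind.OTHER;
    }

    public static void main(String[] args) {
        File dir = new File(args.length > 0 ? args[0] : ".");
        String[] names = dir.list();          // null if dir is not a readable directory
        if (names == null) {
            names = new String[0];
        }
        for (String name : names) {
            System.out.println(classify(dir, name) + "  " + name);
        }
    }
}
```

This reduces the number of isDirectory() calls only in directories that contain many class files. In the directory holding the 2000 student home directories nearly every entry would still need the call.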
 
Bill Tschumy

I think the mystery has been solved. The reason for the inefficient
isDirectory() in Jurtle, but not in the standalone test code, is that Jurtle
installs its own subclass of SecurityManager. For some reason that still
escapes me, installing a SecurityManager causes isDirectory() to check
permissions of all the directories along a specific directory's path
(twice!).

Looking at Sun's code I can't see any reason for this to be the case. I
think the offending code must be in the platform-specific implementation of
FileSystem (the problem was found on Apple's 1.4 implementation; I don't
know whether it exists on Windows).

The only reason I install my own SecurityManager is to override checkExit()
to prevent student code running from within Jurtle from calling System.exit
and causing Jurtle to exit. I can't think of any other way of controlling
this situation. If there is no other workaround I will probably not install
the new security manager and let the students shoot themselves in the foot.
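
One possible workaround, offered as an assumption rather than something confirmed in this thread: in the JDKs of that era, File.isDirectory() calls SecurityManager.checkRead(String) when a manager is installed, and the default checkRead() builds a java.io.FilePermission, whose construction canonicalizes the path. That canonicalization would account for the per-component lstat/stat/open/getdirentries pattern in the trace. If so, a manager that overrides checkRead(String) directly never creates the FilePermission. The sketch below uses a hypothetical class name, NoExitSecurityManager, and is not Jurtle's actual code.

```java
import java.security.Permission;

// Hypothetical manager: blocks System.exit() but performs no file permission checks.
@SuppressWarnings("removal")
class NoExitSecurityManager extends SecurityManager {
    private volatile boolean exitAllowed;

    // The host application flips this when it really wants to quit.
    void setExitAllowed(boolean allowed) {
        exitAllowed = allowed;
    }

    @Override
    public void checkExit(int status) {
        if (!exitAllowed) {
            throw new SecurityException("System.exit(" + status + ") is blocked");
        }
    }

    // Overridden so that no FilePermission is constructed for reads
    // (assumed to be what triggers the path canonicalization).
    @Override
    public void checkRead(String file) {
    }

    // Everything else is permitted without further checks.
    @Override
    public void checkPermission(Permission perm) {
    }

    @Override
    public void checkPermission(Permission perm, Object context) {
    }

    public static void main(String[] args) {
        // On the Java versions discussed here it would be installed with:
        //   System.setSecurityManager(new NoExitSecurityManager());
        NoExitSecurityManager manager = new NoExitSecurityManager();
        manager.checkRead("/Users/bill/testdir/file2");   // returns immediately
        try {
            manager.checkExit(0);
        } catch (SecurityException e) {
            System.out.println("exit blocked: " + e.getMessage());
        }
    }
}
```

Whether this removes the extra system calls would need to be verified with ktrace, as was done above. The SecurityManager API has been deprecated for removal since Java 17 and recent Java releases refuse to install a manager, so this applies only to older runtimes such as the 1.4 one in question.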
 
