Stuck on a caching issue.

Discussion in 'Java' started by Inertia_sublimation, Oct 17, 2004.

  1. Hello everyone. Im developing a program the requires heavy usage of
    recursive folder size computation, and need to speed it up immensely.

    I thought the use of a cache to store the result of recursively
    getting the size of a folder would be ideal, but am having a problem
    with it: Its not working.

    The size is recomputed every time the sizeRecursive() method is called
    on a CachedDirectory.

    Also, please forgive the amount of code I'm posting, I think all of it
    is pertinant though.

    Note the comments in DirectoryCache.CachedDirectory.sizeRecursive()'s
    for loop, In order to work on the Directorys contained in the
    Directory[] as CachedDirectorys, I have to wrap them in a new instance
    of CachedDirectory. I don't know how to get around the lastComputed
    field of the new CachedDirectory being set to -1 (forcing the new
    CachedDirectory to recompute its value instead of looking in the cache
    for it).
    Please help!

    PS - Is there any where I can put a line break in the declaration of a
    generic Collection? The lines where I declare them at the top of the
    class get kinda long, and they look ugly.

    -------------------Class DirectoryCache-------------------
    import java.util.*;
    import java.io.*;

    /**
    * Provides one place where CachedDirectories can be
    * instantiated, that all use the same cache.
    * To instantiate a CachedDirectory from this cache
    * use this code:
    * DirectoryCache cache = new DirectoryCache();
    * CachedDirectory dir = cache.new CachedDirectory(...);
    */

    public class DirectoryCache {
    // The storage facility:
    private static HashMap<Directory, Long> dirSizes = new
    HashMap<Directory, Long>();

    /**
    * This class subclasses Directory to add caching of the result of
    * calling sizeRecursive().
    */
    public class CachedDirectory extends Directory {
    /** The last time the recusrive size was computed. */
    private long lastComputed = -1;

    /** This is provided as an alternative to
    * typing DirectoryCache.this */
    private final DirectoryCache cache = DirectoryCache.this;

    public CachedDirectory (final String pathName)
    throws NotADirectoryException {
    super(pathName);
    }

    public CachedDirectory(final Directory f) throws IOException {
    super(f);
    }

    public long sizeRecursive() throws IOException {
    /* If this CachedDirectory is NOT in the cache,
    * or if it was modified: */
    if (!cache.dirSizes.containsKey(this)
    || lastModified() > lastComputed) {
    // Debug output:
    if (lastModified() > lastComputed)
    System.out.println(this.toString()
    + " was modified.");
    if (!cache.dirSizes.containsKey(this))
    System.out.println(this.toString()
    + " is not cached.");

    // Recompute the recursive size of this directory:

    lastComputed = lastModified();

    // Get the size of all files in this directory:
    long newSize = size();

    // Get all directories in this CachedDirectory:
    Directory[] dirs = listDirectories();

    for (Directory dir : dirs) {
    /* TODO: Check if there's a better way to do this
    * that doesn't involve creating a new instance.
    * since this "fools" the equals() method...
    * or does it?
    *
    * After some thought, becasuse its a new instance
    * of CachedDirectory, its lastComputed variable
    * is -1, meaning it will always recompute its
    * size "manually". This is a bad thing, and
    * I dont know how to fix it.
    */
    dir = new CachedDirectory(dir);
    newSize += dir.sizeRecursive();
    }

    // Cache and return new size:
    cache.dirSizes.put(this, newSize);
    return newSize;
    } else { // This is in the cache and hasn't been modified.
    System.out.println(this.toString() + " is cached.");
    return cache.dirSizes.get(this);
    }
    }
    }
    }
    -------------------Class Directory-------------------
    import java.io.*;
    import java.util.*;

    /**
    * This class represents a directory, othewise known as a folder.
    * It provides a method of differentiating between Files and
    Directories
    * contained within the folder. Breaking from the java.io.File
    "standard",
    * this class provides a method of getting files and folders contained
    within
    * a folder seperately.
    */
    public class Directory {
    /** Internal representation of this directory. */
    private final File representation;

    /** The last time that the files and dirs arrays were refreshed.
    */
    private long lastRefreshed = -1;

    /** Contains the files stored in this directory. */
    private File[] files;

    /** Contains the directories stored in this directory. */
    private Directory[] dirs;

    public Directory(String pathname) throws NotADirectoryException {
    this(new File(pathname));
    }

    public Directory(File f) throws NotADirectoryException {
    if (f.exists() && f.isFile())
    throw new NotADirectoryException(f);
    representation = f;
    }

    /**
    * Constructs a new Directory object pointing to the same
    * data on disk.
    * @todo Make constructors call each other (via this()).
    */
    public Directory(Directory dir) {
    /*
    * TODO: Try to make all constructors call each other.
    * right now, I cant call this(dir.representation)
    * since I need to catch the NotADirectoryException.
    */
    representation = dir.representation;
    }

    /**
    * If recursive is false, gets the sum of all the sizes
    * of the files contained in this Directory.
    */
    public long size() throws IOException {
    //System.out.println("Computing size for " + toString());
    long size = 0;
    File[] files = listFiles();
    //System.out.println("File[] files == " +
    Arrays.toString(files));
    for (File file : files) {
    size += file.length();
    }
    return size;
    }

    /**
    * This method gets the sum of all the sizes of
    * the files contained within this Directory,
    * including those contained within the directories
    * within this directory.
    */
    public long sizeRecursive() throws IOException {
    long size = size();

    Directory[] directories = listDirectories();
    //System.out.println("Directory[] directories == " +
    Arrays.toString(directories));
    for (Directory dir : directories) {
    //System.out.println("Directory currently scanning == " +
    dir);
    size += dir.sizeRecursive();
    }
    return size;
    }

    /**
    * Repopulates the arrays containing the files and directories
    * contained within this directory.
    */
    private void refreshContentArrays() throws IOException {
    lastRefreshed = representation.lastModified();
    File[] contents = representation.listFiles();

    // contents shouldnt be null unless this directory doesn't
    exist.
    if (contents == null) {
    this.dirs = null;
    this.files = null;
    return;
    }

    // A directory, at most, has contents.length files/directories
    in it.
    List<File> newFiles = new
    ArrayList<File>(contents.length);
    List<Directory> newDirs = new
    ArrayList<Directory>(contents.length);
    for (File item : contents) {
    // TODO: Why does this need to be here?
    if (item.exists() && !isSymbolicLink(item)) {
    if (item.isFile()) {
    newFiles.add(item);
    } else { // Item is directory
    try {
    newDirs.add(new Directory(item));
    } catch (NotADirectoryException nade) {
    nade.printStackTrace();
    }
    // NotADirectoryException shouldn't happen.
    }
    }
    }
    // Store new data in the fields:
    this.dirs = new Directory[newDirs.size()];
    this.files = new File[newFiles.size()];
    newDirs.toArray(dirs);
    newFiles.toArray(files);
    }

    /**
    * Lists all the directories contained within this directory.
    * Note: This doesn't recurse into the directories.
    * The special .. (parent) and . (this) directories are excluded.
    * Returns null if this directory does not exist.
    * @todo Move content array validity check to setLastModified().
    */
    public Directory[] listDirectories() throws IOException {
    // Make sure contents arrays are up-to-date.
    // TODO: Check if I can just override setLastModified() to do
    this.
    long lastModified = representation.lastModified();
    if (lastModified > lastRefreshed) {
    refreshContentArrays();
    }

    Directory[] ret = null;
    if (dirs != null) {
    // Cant pass a refrence to the field, since
    // its not immutable (can I use an UnmodifiableSet?)
    ret = new Directory[dirs.length];
    System.arraycopy(dirs, 0, ret, 0, dirs.length);
    }
    return ret;
    }

    /**
    * Returns all the files contained within this directory.
    * Contrary to java.io.File's listFiles() method, this returns
    * only the *files* in this directory, not the directories.
    */
    public File[] listFiles() throws IOException {
    // Make sure contents arrays are up-to-date.
    // TODO: Check if I can just override setLastModified() to do
    this.
    long lastModified = representation.lastModified();
    if (lastModified > lastRefreshed) {
    refreshContentArrays();
    }

    File[] ret = null;
    if (files != null) {
    // Cant pass a refrence to the field, since
    // its not immutable (can I use an UnmodifiableList?)
    ret = new File[files.length];
    System.arraycopy(files, 0, ret, 0, files.length);
    }
    return ret;
    }

    private static boolean isSymbolicLink(File f) throws IOException {
    return !f.getAbsolutePath().equals(f.getCanonicalPath());
    }

    /**
    * @see java.io.File#lastModified()
    */
    public long lastModified() {
    return representation.lastModified();
    }

    /**
    * @see java.io.File#exists()
    */
    public boolean exists() {
    return representation.exists();
    }

    /**
    * @see java.io.File#mkdir()
    */
    public boolean mkdir() {
    return representation.mkdir();
    }

    /**
    * @see java.io.File#delete()
    */
    public boolean delete() {
    return representation.delete();
    }

    /**
    * @see java.io.File#toString()
    */
    public String toString() {
    return representation.toString();
    }

    /**
    * @see java.io.File#hashCode()
    */
    public int hashCode() {
    return representation.hashCode();
    }

    /**
    * @see java.io.File#equals(Object)
    */
    public boolean equals(Object o) {
    return representation.equals(o);
    }

    public int compareTo(Directory pathname) {
    return representation.compareTo(pathname.representation);
    }

    /**
    * @see java.io.File#getCanonicalPath()
    */
    public String getCanonicalPath() throws IOException {
    return representation.getCanonicalPath();
    }

    /**
    * @see java.io.File#getPath()
    */
    public String getPath() {
    return representation.getPath();
    }
    }
    -------------------Class NotADirectoryException-------------------
    import java.io.File;

    /**
    * This exception is thrown when a Directory is constructed
    * via a File that already exists as a file on the storage media.
    * @todo See if this should instead subclass RuntimeException.
    */
    public class NotADirectoryException extends Exception {
    private final File file;
    public NotADirectoryException(File file) {
    this.file = file;
    }

    public NotADirectoryException(String filePath) {
    this(new File(filePath));
    }

    public String getMessage() {
    return file.toString() + " is not a directory.";
    }
    }
    Inertia_sublimation, Oct 17, 2004
    #1
    1. Advertising

  2. Inertia_sublimation

    Alex Hunsley Guest

    Inertia_sublimation wrote:

    > Also, please forgive the amount of code I'm posting, I think all of it
    > is pertinant though.


    Just one thing for now: post compilable code. What you posted isn't
    compilable, as it has been line-broken and I can see comments spill over
    onto the next line etc.
    Also, when posting a pretty large bit of code, consider placing it on a
    web server or ftp server if you have one!
    Alex Hunsley, Oct 17, 2004
    #2
    1. Advertising

  3. Inertia_sublimation

    Paul Lutus Guest

    Inertia_sublimation wrote:

    > Hello everyone. Im developing a program the requires heavy usage of
    > recursive folder size computation, and need to speed it up immensely.
    >
    > I thought the use of a cache to store the result of recursively
    > getting the size of a folder would be ideal, but am having a problem
    > with it: Its not working.


    First, your code is way too large for the task at hand. The problem can be
    solved in 80 lines. I know, I posted an example for you earlier, but to no
    apparent effect.

    Second, you cannot cache your result from scan to scan, because there may be
    changes in the filesystem between scans. So throw out the idea of a cache.
    Each time you scan the directory tree, you have to scan each and every
    item, period.

    Please consider doing this with the simplest possible embodiment, by
    abandoning any illusions that you can avoid a complete rescan each time a
    scan is required.

    Again, as a reminder:

    // DirSum.java

    import java.io.*;
    import java.util.*;

    public class DirSum {

    DirSum(String path) {
    scanSimple(new File(path));
    }

    private boolean isSymLink(File f)
    {
    try {
    return !f.getCanonicalPath().equals(f.getAbsolutePath());
    }
    catch (java.io.IOException e) {
    return false;
    }
    }

    public DirData scanSimple(File path)
    {
    DirData data = new DirData();
    File[] list = path.listFiles();
    if(list != null) {
    Arrays.sort(list);
    for(int i = 0; i < list.length;i++) {
    File f = list;
    // don't follow symlinks
    if(f.isDirectory()) {
    if(isSymLink(f)) {
    data.symlinks++;
    }
    else {
    data.add(scanSimple(f));
    }
    }
    else {
    data.files++;
    data.length += f.length();
    }
    }
    }
    System.out.println(path + ": " + data);
    return data;
    }


    public static void main(String args[])
    {
    if(args.length == 0) {
    System.out.println("usage: directory to be scanned.");
    }
    else {
    new DirSum(args[0]);
    }
    }
    }

    class DirData {
    public int symlinks = 0;
    public long files = 0;
    public long length = 0;
    public long directories = 0;

    public void add(DirData d)
    {
    symlinks += d.symlinks;
    files += d.files;
    length += d.length;
    directories += d.directories;
    }
    public String toString()
    {
    return files + " files, " + directories + " directories, "
    + symlinks + " symbolic links, " + length + " bytes.";
    }
    }

    --
    Paul Lutus
    http://www.arachnoid.com
    Paul Lutus, Oct 18, 2004
    #3
  4. Alex Hunsley <> wrote in message news:<7uCcd.80227$>...
    > Inertia_sublimation wrote:
    >
    > > Also, please forgive the amount of code I'm posting, I think all of it
    > > is pertinant though.

    >
    > Just one thing for now: post compilable code. What you posted isn't
    > compilable, as it has been line-broken and I can see comments spill over
    > onto the next line etc.
    > Also, when posting a pretty large bit of code, consider placing it on a
    > web server or ftp server if you have one!


    Thanks for the tip. Im still new to usenet, Ill keep that in mind and
    post links to a downloadable copy of the source as soon as I can.
    Inertia_sublimation, Oct 18, 2004
    #4
  5. Paul Lutus <> wrote in message news:<>...
    > Inertia_sublimation wrote:
    >
    > > Hello everyone. Im developing a program the requires heavy usage of
    > > recursive folder size computation, and need to speed it up immensely.
    > >
    > > I thought the use of a cache to store the result of recursively
    > > getting the size of a folder would be ideal, but am having a problem
    > > with it: Its not working.

    >
    > First, your code is way too large for the task at hand. The problem can be
    > solved in 80 lines. I know, I posted an example for you earlier, but to no
    > apparent effect.
    >
    > Second, you cannot cache your result from scan to scan, because there may be
    > changes in the filesystem between scans. So throw out the idea of a cache.
    > Each time you scan the directory tree, you have to scan each and every
    > item, period.


    For my purposes, I assume the filesystem doesn't change. What Im
    trying to make is a utility that allows me to know a good estimate of
    the sizes of each directory. I dont care about a few transient bytes
    that change every so often; the user is responsible for not making
    considerable changes to the filesystem (ex. deleting a huge amount of
    data) while the program is running. Even if he/she does, I can provide
    an option to recache those directories that changed.

    Can your solution be adapted to allow for caching of directory data?
    If so, it would be ideal, and I would gladly throw out my original
    code :)

    Thanks for your replies!
    Inertia_sublimation, Oct 18, 2004
    #5
  6. Inertia_sublimation

    Paul Lutus Guest

    Inertia_sublimation wrote:

    > Paul Lutus <> wrote in message
    > news:<>...
    >> Inertia_sublimation wrote:
    >>
    >> > Hello everyone. Im developing a program the requires heavy usage of
    >> > recursive folder size computation, and need to speed it up immensely.
    >> >
    >> > I thought the use of a cache to store the result of recursively
    >> > getting the size of a folder would be ideal, but am having a problem
    >> > with it: Its not working.

    >>
    >> First, your code is way too large for the task at hand. The problem can
    >> be solved in 80 lines. I know, I posted an example for you earlier, but
    >> to no apparent effect.
    >>
    >> Second, you cannot cache your result from scan to scan, because there may
    >> be changes in the filesystem between scans. So throw out the idea of a
    >> cache. Each time you scan the directory tree, you have to scan each and
    >> every item, period.

    >
    > For my purposes, I assume the filesystem doesn't change.


    Okay, in that case, you only have to scan the filesystem once, and
    subsequently you can read the stored data. So put your tree data in a tree
    data structure, one structurally synchronized with the tree search code,
    and after the scan, you only have to read the tree, not the filesystem.

    This is not difficult at all. But if your purpose is to have a program to
    discover the size of the filesystem once, you might as well read it and
    exit, without storing all the file-by-file data. If on the other hand you
    need to read the filesystem repeatedly and accurately, then you do not want
    to store the prior read data, so there is no point storing the tree state.

    In neither of these scenarios is there a need to store the entire tree.

    > What Im
    > trying to make is a utility that allows me to know a good estimate of
    > the sizes of each directory.


    I provided that, directory by directory, in my short program.

    > I dont care about a few transient bytes
    > that change every so often; the user is responsible for not making
    > considerable changes to the filesystem (ex. deleting a huge amount of
    > data) while the program is running. Even if he/she does, I can provide
    > an option to recache those directories that changed.
    >
    > Can your solution be adapted to allow for caching of directory data?


    Please ask yourself how you plan to use the cached data. There are any
    number of ways to store the data (for example you could store the data that
    is now printed by my program after each directory-level scan), but I
    heaven't heard how you plan to use the data.

    > If so, it would be ideal, and I would gladly throw out my original
    > code :)


    Get out the shovels. :)

    In all seriousness, tell me how you plan to use the data. What do you need
    to know, when, at what detail level?

    --
    Paul Lutus
    http://www.arachnoid.com
    Paul Lutus, Oct 18, 2004
    #6
  7. Inertia_sublimation

    Alex Hunsley Guest

    Inertia_sublimation wrote:
    > Alex Hunsley <> wrote in message news:<7uCcd.80227$>...
    >
    >>Inertia_sublimation wrote:
    >>
    >>
    >>>Also, please forgive the amount of code I'm posting, I think all of it
    >>>is pertinant though.

    >>
    >>Just one thing for now: post compilable code. What you posted isn't
    >>compilable, as it has been line-broken and I can see comments spill over
    >>onto the next line etc.
    >>Also, when posting a pretty large bit of code, consider placing it on a
    >>web server or ftp server if you have one!

    >
    >
    > Thanks for the tip. Im still new to usenet, Ill keep that in mind and
    > post links to a downloadable copy of the source as soon as I can.


    Thanks for listening!
    Generally, putting small-ish bits of code in usenet posts is fine. I
    would say your post was sort of mid-way between the "put code in the
    post" and "upload somewhere and give a link" camps.

    alex
    Alex Hunsley, Oct 18, 2004
    #7
  8. Paul Lutus <> wrote in message news:<>...
    > Inertia_sublimation wrote:
    >
    > > Paul Lutus <> wrote in message
    > > news:<>...
    > >> Inertia_sublimation wrote:
    > >>
    > >> > Hello everyone. Im developing a program the requires heavy usage of
    > >> > recursive folder size computation, and need to speed it up immensely.
    > >> >
    > >> > I thought the use of a cache to store the result of recursively
    > >> > getting the size of a folder would be ideal, but am having a problem
    > >> > with it: Its not working.
    > >>
    > >> First, your code is way too large for the task at hand. The problem can
    > >> be solved in 80 lines. I know, I posted an example for you earlier, but
    > >> to no apparent effect.
    > >>
    > >> Second, you cannot cache your result from scan to scan, because there may
    > >> be changes in the filesystem between scans. So throw out the idea of a
    > >> cache. Each time you scan the directory tree, you have to scan each and
    > >> every item, period.

    > >
    > > For my purposes, I assume the filesystem doesn't change.

    >
    > Okay, in that case, you only have to scan the filesystem once, and
    > subsequently you can read the stored data. So put your tree data in a tree
    > data structure, one structurally synchronized with the tree search code,
    > and after the scan, you only have to read the tree, not the filesystem.
    >
    > This is not difficult at all. But if your purpose is to have a program to
    > discover the size of the filesystem once, you might as well read it and
    > exit, without storing all the file-by-file data. If on the other hand you
    > need to read the filesystem repeatedly and accurately, then you do not want
    > to store the prior read data, so there is no point storing the tree state.
    >
    > In neither of these scenarios is there a need to store the entire tree.
    >
    > > What Im
    > > trying to make is a utility that allows me to know a good estimate of
    > > the sizes of each directory.

    >
    > I provided that, directory by directory, in my short program.
    >
    > > I dont care about a few transient bytes
    > > that change every so often; the user is responsible for not making
    > > considerable changes to the filesystem (ex. deleting a huge amount of
    > > data) while the program is running. Even if he/she does, I can provide
    > > an option to recache those directories that changed.
    > >
    > > Can your solution be adapted to allow for caching of directory data?

    >
    > Please ask yourself how you plan to use the cached data. There are any
    > number of ways to store the data (for example you could store the data that
    > is now printed by my program after each directory-level scan), but I
    > heaven't heard how you plan to use the data.
    >
    > > If so, it would be ideal, and I would gladly throw out my original
    > > code :)

    >
    > Get out the shovels. :)
    >
    > In all seriousness, tell me how you plan to use the data. What do you need
    > to know, when, at what detail level?


    I think the simplest way to describe what Im trying to do is to run
    through what would happen if a user started my theoretical program.

    The program starts out asking the user for a root directory to browse
    in. After selecting one (C:) the program gets all directories 1 level
    within C: (for example: C:\Windows, C:\Documents and Settings,
    C:\RECYCLER, C:\Program Files) I call these directories Level 1. The
    size of the contents in each of these Level 1 directories is then
    calculated recursively. The user is presented with a pie chart
    displaying the sizes of each Level 1 directory as a fraction of the
    total space of C:\.

    The user then clicks on one of the slices of the pie chart, say, the
    one representing C:\Program Files. The program gets all directories 1
    level within C:\Program Files (C:\Program Files\Foo, C:\Program
    Files\Bar, C:\Program Files\Mozilla, etc), call these level 2. The
    program then calculates the size of the contents of each of these
    Level 2 directories recursively.

    This is where caching will help the situation. Because the program
    needed to visit every directory within every directory in C:\, it
    already visited all the directories in C:\Program Files.

    The user can continue browsing deeper into the directories in
    C:\Program Files, or he/she can go back up to any level at any time.

    Ah, and btw, I was able to get my code uploaded, here are links to it:
    <http://ameritech.net/users/linuxgateway/Directory.java>
    <http://ameritech.net/users/linuxgateway/DirectoryCache.java>
    <http://ameritech.net/users/linuxgateway/NotADirectoryException.java>
    Inertia_sublimation, Oct 20, 2004
    #8
  9. Inertia_sublimation

    Paul Lutus Guest

    Inertia_sublimation wrote:

    / ...

    > This is where caching will help the situation. Because the program
    > needed to visit every directory within every directory in C:\, it
    > already visited all the directories in C:\Program Files.
    >
    > The user can continue browsing deeper into the directories in
    > C:\Program Files, or he/she can go back up to any level at any time.


    My posted program already does all this, at every level. All you have to do
    is store the results that my program prints.

    Please think about this. How do you know the actual total for the root
    directory (in your case, C:\)? You do this by visiting all the nested
    directories and totaling their contents. This means you have already
    visited all those directories, and you already have a total number for them
    -- all you have to do is save the result.

    If the user clicks on a specified subdirectory, all you have to do is recall
    the stored total for that directory and its subdirectories. What you do not
    have to do is rescan anything.

    The required addition to my program would be a Vector (or ArrayList), to
    which you would add entries for each visited directory, recursively. You
    could create a utility class that contained a total number plus a vector
    containing all the visited directories for that level. Each visited
    directory would get an instance of this utility class added to the vector
    for the higher level.

    This would create a tree data structure, similar to the original filesystem
    tree, that would contain all the directories and the totals for each. Each
    node in the tree would be an instance of your utility class. Each utility
    class instance would contain a Vector or ArrayList, plus a number that
    represents the total size of all the files beneath that level.

    This is not at all difficult, but you need to think it through.

    --
    Paul Lutus
    http://www.arachnoid.com
    Paul Lutus, Oct 20, 2004
    #9
  10. Paul Lutus <> wrote in message news:<>...
    > The required addition to my program would be a Vector (or ArrayList), to
    > which you would add entries for each visited directory, recursively. You
    > could create a utility class that contained a total number plus a vector
    > containing all the visited directories for that level. Each visited
    > directory would get an instance of this utility class added to the vector
    > for the higher level.

    By "total number" do you mean size?
    Inertia_sublimation, Oct 21, 2004
    #10
  11. Inertia_sublimation

    Paul Lutus Guest

    Inertia_sublimation wrote:

    > Paul Lutus <> wrote in message
    > news:<>...
    >> The required addition to my program would be a Vector (or ArrayList), to
    >> which you would add entries for each visited directory, recursively. You
    >> could create a utility class that contained a total number plus a vector
    >> containing all the visited directories for that level. Each visited
    >> directory would get an instance of this utility class added to the vector
    >> for the higher level.


    > By "total number" do you mean size?


    I'll say it differently. I mean a number that represents the sum of the
    lengths of all the files you encountered during your scan. The total number
    for the root directory should be the grand total of the lengths of all the
    files on the entire scanned directory tree -- e.g. the sum of all the
    individual sums from the subdirectories.

    Just as my original program does it, but with a browsable tree structure at
    the end of the process. Like this:

    ********************************************************************

    // DirSum.java

    import java.io.*;
    import java.util.*;

    public class DirSum {

    private boolean isSymLink(File f) {
    try {
    return !f.getCanonicalPath().equals(f.getAbsolutePath());
    }
    catch (java.io.IOException e) {
    return false;
    }
    }

    public DirData scanSimple(File path,boolean showResult) {
    DirData data = new DirData();
    File[] list = path.listFiles();
    if(list != null) {
    Arrays.sort(list);
    for(int i = 0; i < list.length;i++) {
    File f = list;
    // don't follow symlinks
    if(f.isDirectory()) {
    if(isSymLink(f)) {
    data.addSymlink();
    }
    else {
    data.addDirectory(scanSimple(f,showResult));
    }
    }
    else {
    data.addFile(f);
    }
    }
    }
    if(showResult) {
    System.out.println(path + ": " + data);
    }
    return data;
    }

    public static void main(String args[]) {
    if(args.length == 0) {
    System.out.println("usage: directory to be scanned.");
    }
    else {
    DirSum s = new DirSum();
    DirData d = s.scanSimple(new File(args[0]),false);
    System.out.println("Total for " + args[0] + ": " + d);
    }
    }
    }

    class DirData {
    private long symlinks = 0;
    private long files = 0;
    private long length = 0;
    private long directories = 1;
    private ArrayList subDirs = new ArrayList();

    public final ArrayList getSubs() {
    return subDirs;
    }

    public void addFile(File f) {
    files++;
    length += f.length();
    }

    public void addSymlink() {
    symlinks++;
    }

    public long getFiles() {
    return files;
    }

    public long getLength() {
    return length;
    }

    public long getDirs() {
    return directories;
    }

    public long getSymlinks() {
    return symlinks;
    }

    public void addDirectory(DirData d) {
    subDirs.add(d);
    directories += d.getDirs();
    files += d.getFiles();
    length += d.getLength();
    }

    public String toString() {
    return files + " files, " + directories + " dirs, " + symlinks
    + " symlinks, " + length + " bytes.";
    }
    }

    ********************************************************************

    In this line:

    DirData d = s.scanSimple(new File(args[0]),false);

    What is returned is an instance of the DirData class that contains within it
    a nested ArrayList of all the visited directories, arranged as a tree
    structure. You can traverse the tree to get the information you need.

    --
    Paul Lutus
    http://www.arachnoid.com
    Paul Lutus, Oct 21, 2004
    #11
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Hypo
    Replies:
    6
    Views:
    405
  2. Troy Simpson

    Fragment Caching inside page caching?

    Troy Simpson, Jan 19, 2004, in forum: ASP .Net
    Replies:
    0
    Views:
    783
    Troy Simpson
    Jan 19, 2004
  3. JimLad
    Replies:
    3
    Views:
    911
    JimLad
    Jan 21, 2010
  4. Nick Wild

    Stuck on silly nil issue

    Nick Wild, Aug 25, 2009, in forum: Ruby
    Replies:
    4
    Views:
    88
    Nick Wild
    Aug 25, 2009
  5. Alina

    Stuck with resizing issue

    Alina, Feb 23, 2004, in forum: Javascript
    Replies:
    3
    Views:
    82
    Alina N
    Feb 23, 2004
Loading...

Share This Page