FileList and FileFilter and regular expressions

Discussion in 'Java' started by P.Hill, Nov 17, 2003.

  1. P.Hill

    P.Hill Guest

    Greetings Java Folks,

    I was writing a class that processes a list of files.
    Given an arbitrary file spec that looked like:
    /incoming/projectData*.dat
    The last splat in this case is actually a time stamp.

    but now I need to complicate things by searching for
    /incoming/*/projectData*.dat

    Where the first splat is a userDir.

    Anyone know of code that can help me with this?

    Using generic stuff out of the JDK, I was playing with
    File.listFiles and a FileFilter, creating my special
    FileFilter which has an accept method that looks like:

    public boolean accept(File file) {
        String filepath = file.getPath();
        if ( filepath.startsWith( this.lefthand ) &&
             filepath.endsWith( this.righthand ) ) {
            return true;
        }
        return false;
    }

    Simple enough, but am I missing something somewhere else
    for doing this type of generic filtering?

    Why do I ask? Well we've realized we want to change
    the code to deal with /incoming/*/projectData*.dat

    Now I COULD use the existing code by reworking
    this second spec to put both the username
    and the date into the last splat, but for external
    reasons (other code) I think I'll stick to
    the form /incoming/*/projectData*.dat

    Is there anything in the JDK or elsewhere that I've overlooked that helps
    me more than FileFilter to do a fancier job than the above FileFilter of
    doing such wildcard matches? How about combining FileFilter with
    Reg. Expressions? Never used that, but it looks like something like:

    Pattern pattern = Pattern.compile( "/incoming/*/projectData*.dat" );
    this.matcher = pattern.matcher( "" ); // start with no candidate string.

    ....
    and then in accept( File file) I do:
    String filepath = file.getPath();
    this.matcher.reset( filepath );
    boolean goodFile = this.matcher.matches();
    if ( goodFile ) return true;

    That look good to everyone?

    It seems pretty good, but then I also have to find all
    dirs in the /incoming/ directory looping through those
    to get a fileList from each.

    None of this is rocket science, but it seems to be a well traveled
    path. Anyone know of code that does more of the work for me instead
    of working up a set of two calls to filePath?

    Let me know if you do, or if you have other ideas. Meanwhile I'll
    be coding the solution that gets the file set of dirs followed by
    the file set from each dir.

    TIA,
    -Paul
     
    P.Hill, Nov 17, 2003
    #1

  2. nos

    nos Guest

    If it were me, and it isn't, I would use split();
    then x[0] is a directory,
    x[1] is a directory, and
    x[2] is a filename.
     
    nos, Nov 17, 2003
    #2

  3. John C. Bollinger

    P.Hill wrote:

    > Greetings Java Folks,


    Greetings, P.Hill.

    > I was writing a class that processes a list of files.
    > Given an arbitrary file spec that looked like:
    > /incoming/projectData*.dat
    > The last splat in this case is actually a time stamp.
    >
    > but now I need to complicate things by searching for
    > /incoming/*/projectData*.dat
    >
    > Where the first splat is a userDir.
    >
    > Anyone know of code that can help me with this?
    >
    > Using generic stuff out of the JDK, I was playing with
    > File.listFiles and a FileFilter, creating my special
    > FileFilter which has an accept method that looks like:
    >
    > public boolean accept(File file) {
    > String filepath = file.getPath();
    > if ( filepath.startsWith( this.lefthand ) &&
    > filepath.endsWith( this.righthand ) ) {
    > return true;
    > }
    > return false;
    > }
    >
    > Simple enough, but am I missing something somewhere else
    > for doing this type of generic filtering?


    You are looking in the right places, but I don't think you're going
    about it in the right way.

    For a solution to the specific case you raise
    (/incoming/*/projectData*.dat), I would do something like this:

    File incomingDir = new File("/incoming");
    File[] contents = incomingDir.listFiles();
    List files = new ArrayList();

    for (int i = 0; i < contents.length; i++) {
        if (contents[i].isDirectory()) {
            // Only the name part is filtered here; the directory is already known.
            File[] goodFiles = contents[i].listFiles(myFileFilter);
            files.addAll(Arrays.asList(goodFiles));
        }
    }

    where myFileFilter is an instance of a FileFilter that checks the name
    part of the file. You don't need to worry there about the path, because
    you needed (or at least wanted) to have already checked that (and in
    this case the spec is "*", anyway).
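
    A self-contained version of the loop above might look like the sketch below. The class name UserDirScan, the method names, and the hard-coded "projectData" / ".dat" name check are assumptions made for illustration, not code from this thread. The null checks are there because File.listFiles returns null when the path is not a readable directory.

```java
import java.io.File;
import java.io.FileFilter;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for the /incoming/*/projectData*.dat case.
class UserDirScan {

    // Checks the name part only; the directory part is handled by the loop.
    static boolean isProjectDataName(String name) {
        return name.startsWith("projectData") && name.endsWith(".dat");
    }

    // Stand-in for "myFileFilter": regular files with a matching name.
    static final FileFilter MY_FILE_FILTER =
            file -> file.isFile() && isProjectDataName(file.getName());

    static List<File> findProjectFiles(File incomingDir) {
        List<File> files = new ArrayList<>();
        File[] contents = incomingDir.listFiles();
        if (contents == null) {
            return files; // missing, unreadable, or not a directory
        }
        for (File entry : contents) {
            if (entry.isDirectory()) {
                File[] goodFiles = entry.listFiles(MY_FILE_FILTER);
                if (goodFiles != null) {
                    files.addAll(Arrays.asList(goodFiles));
                }
            }
        }
        return files;
    }

    public static void main(String[] args) {
        for (File f : findProjectFiles(new File("/incoming"))) {
            System.out.println(f.getPath());
        }
    }
}
```

    Files sitting directly in the top directory are skipped, because only subdirectories are searched.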

    To do this in a more general way, you should incorporate the filtration
    into your tree walk. For instance, with something like:

    List findFiles(File path, FileFilter smartFilter) {
        List files = new ArrayList();
        File[] contents = path.listFiles(smartFilter);

        for (int i = 0; i < contents.length; i++) {
            if (contents[i].isDirectory()) {
                // Recurse into every directory the filter accepted.
                files.addAll(findFiles(contents[i], smartFilter));
            } else {
                files.add(contents[i]);
            }
        }

        return files;
    }

    void doSomething() {
        ...
        List filesToProcess = findFiles(new File("/incoming"),
                new MySmartFileFilter());
        ...
    }

    Here all the intelligence is built into one FileFilter, which must both
    select which directories to traverse and select which regular files are
    accepted. A more flexible approach might use a prebuilt chain of
    FileFilters, one per tree level, or even FileFilters at each level that
    can tell you which filters to use for the next level.
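
    The "one FileFilter per tree level" idea could be sketched roughly as follows. The class LevelFilterWalk, its two example filters, and the depth parameter are hypothetical names chosen for this illustration; the filter list is assumed to be non-empty.

```java
import java.io.File;
import java.io.FileFilter;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: one FileFilter per tree level below the start directory.
class LevelFilterWalk {

    static final FileFilter ANY_DIRECTORY = File::isDirectory;

    static final FileFilter PROJECT_DATA = file ->
            file.isFile()
                    && file.getName().startsWith("projectData")
                    && file.getName().endsWith(".dat");

    // filters.get(depth) is applied to the entries of dir. Entries accepted at
    // the last level are results; directories accepted earlier are descended into.
    static List<File> findFiles(File dir, List<FileFilter> filters, int depth) {
        List<File> files = new ArrayList<>();
        File[] contents = dir.listFiles(filters.get(depth));
        if (contents == null) {
            return files; // dir is missing, unreadable, or not a directory
        }
        boolean lastLevel = (depth == filters.size() - 1);
        for (File entry : contents) {
            if (lastLevel) {
                files.add(entry);
            } else if (entry.isDirectory()) {
                files.addAll(findFiles(entry, filters, depth + 1));
            }
        }
        return files;
    }

    public static void main(String[] args) {
        // Corresponds to /incoming/*/projectData*.dat
        List<FileFilter> filters = Arrays.asList(ANY_DIRECTORY, PROJECT_DATA);
        for (File f : findFiles(new File("/incoming"), filters, 0)) {
            System.out.println(f.getPath());
        }
    }
}
```

    Because the list length fixes the search depth, this variant handles a spec with a known number of levels but not an arbitrary-depth walk.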

    [...]

    > Is there anything in the JDK or elsewhere that I've overlooked that helps
    > me more than FileFilter to do a fancier job than the above FileFilter of
    > doing such wildcard matches? How about combining FileFilter with
    > Reg. Expressions? Never used that, but it looks like something like:
    >
    > Pattern pattern = Pattern.compile( "/incoming/*/projectData*.dat" );


    I think you'd want:

    Pattern pattern = Pattern.compile("/incoming/[^/]+/projectData.*\\.dat");

    > this.matcher = pattern.matcher( "" ); // start with no candidate string.
    >
    > ...
    > and then in accept( File file) I do:
    > String filepath = file.getPath();
    > this.matcher.reset( filepath );
    > boolean goodFile = this.matcher.matches();
    > if ( goodFile ) return true;
    >
    > That look good to everyone?


    The pattern (as amended) could work, but it seems like more effort than
    you want, especially if the next change is to support different or more
    complex filename patterns. If there is any chance of that, then you
    would be well advised to build a more general solution now.
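
    To see why the amendment matters, here is a small sketch that runs both patterns against a sample path. The class name PatternCheck and the sample path are made up for the illustration; the paths are assumed to use forward slashes.

```java
import java.util.regex.Pattern;

// Hypothetical demo class comparing the two patterns discussed above.
class PatternCheck {

    // The wildcard spec compiled unchanged: in a regex, "/*" means
    // "zero or more slashes" and "." means "any single character".
    static final Pattern NAIVE =
            Pattern.compile("/incoming/*/projectData*.dat");

    // The amended pattern: one directory level, then projectData<anything>.dat
    static final Pattern AMENDED =
            Pattern.compile("/incoming/[^/]+/projectData.*\\.dat");

    static boolean matches(Pattern pattern, String path) {
        return pattern.matcher(path).matches();
    }

    public static void main(String[] args) {
        String path = "/incoming/alice/projectData20031117.dat";
        System.out.println("naive:   " + matches(NAIVE, path));
        System.out.println("amended: " + matches(AMENDED, path));
    }
}
```

    The unamended pattern rejects the sample path because nothing in it can consume the directory name, while the amended one accepts it. Note that the ".*" in the amended pattern can also span a "/", which only matters if deeper paths are ever fed to the matcher.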

    > It seems pretty good, but then I also have to find all
    > dirs in the /incoming/ directory looping through those
    > to get a fileList from each.


    You have to do this no matter what if you are processing files from
    multiple directories. It's not so hard, though. (Vide supra)

    Good luck,

    John Bollinger
     
    John C. Bollinger, Nov 17, 2003
    #3
  4. P.Hill

    P.Hill Guest

    nos wrote:
    > if it was me, and it isn't, i would use split()
    > then x[0] is a directory
    > x[1] is a directory
    > x[2] is a filename


    Hi Nos,

    Yes, that is how I originally filled my FileFilter.
    I used split and built my simple filter with the LHS and RHS
    (left and right).
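
    A minimal sketch of such a split-built filter, assuming a spec with a single '*': the class name PrefixSuffixFilter and the length check are assumptions for this illustration, not the code actually written. Since String.split takes a regex, the '*' has to be escaped.

```java
import java.io.File;
import java.io.FileFilter;

// Hypothetical filter built by splitting a one-wildcard spec at the '*'.
class PrefixSuffixFilter implements FileFilter {

    private final String lefthand;
    private final String righthand;

    PrefixSuffixFilter(String spec) {
        // Limit 2 keeps a trailing empty part, so "abc*" yields {"abc", ""}.
        String[] parts = spec.split("\\*", 2);
        if (parts.length != 2) {
            throw new IllegalArgumentException("spec needs a '*': " + spec);
        }
        this.lefthand = parts[0];
        this.righthand = parts[1];
    }

    public boolean accept(File file) {
        String filepath = file.getPath();
        // The length check stops the prefix and suffix from overlapping,
        // e.g. "ab*ba" must not accept "aba".
        return filepath.length() >= lefthand.length() + righthand.length()
                && filepath.startsWith(lefthand)
                && filepath.endsWith(righthand);
    }
}
```

    It would be used as dir.listFiles(new PrefixSuffixFilter(spec)), with the spec written in the same form that File.getPath returns on the platform in question.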

    -Paul
     
    P.Hill, Nov 18, 2003
    #4
  5. P.Hill

    P.Hill Guest

    John C. Bollinger wrote:
    > For a solution to the specific case you raise
    > (/incoming/*/projectData*.dat), I would do something like this:
    >
    > File incomingDir = new File("/incoming");
    > File[] contents = incomingDir.listFiles();
    > List files = new ArrayList();
    >
    > for (int i = 0 ; i < contents.length; i++) {
    > if (contents[i].isDirectory()) {
    > File[] goodFiles = contents[i].listFiles(myFileFilter);
    > files.addAll(Arrays.asList(goodFiles));
    > }
    > }


    Actually I ended up implementing a DirectoryTreeSearch class,
    which has the equivalent of some of the code you mention and
    uses two FileFilters: a DirFileFilter and a RegExpFileFilter.
    Now the DirFileFilter does NOT handle the GENERAL case you mention,
    which goes to any depth like the ** operator in Ant:
    /java/src/**/*.java
    But I did implement the DirectoryTreeSearch
    so that it handles /incoming/prePart*PostPart/projectDataPre*PostPart.dat
    Which means I can force a certain pattern on the dir and allow various
    extras in the filename.

    The hard part was making sure I escaped dots and backslashes (in
    Windows file specs) to correctly convert my dir/ls-like wildcard
    spec to a proper java.util.regex (Perl-like) regex. A * had to
    be converted to [a-zA-Z_0-9], which is \w in a Java pattern
    (it is fine for me not to include the $).
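
    One way to sketch that conversion is below. WildcardToRegex and toRegex are hypothetical names, and using Pattern.quote for the literal pieces and "\w*" (zero or more word characters) for each '*' are assumptions; the code actually written is not shown in this thread.

```java
import java.util.regex.Pattern;

// Hypothetical converter from a dir/ls-style spec to a java.util.regex string.
class WildcardToRegex {

    // Only '*' is treated as a wildcard. Every other run of characters is
    // quoted, so dots and Windows backslashes are matched literally.
    static String toRegex(String spec) {
        StringBuilder regex = new StringBuilder();
        int start = 0;
        for (int i = 0; i < spec.length(); i++) {
            if (spec.charAt(i) == '*') {
                if (i > start) {
                    regex.append(Pattern.quote(spec.substring(start, i)));
                }
                regex.append("\\w*"); // same character class as [a-zA-Z_0-9]
                start = i + 1;
            }
        }
        if (start < spec.length()) {
            regex.append(Pattern.quote(spec.substring(start)));
        }
        return regex.toString();
    }

    public static void main(String[] args) {
        String regex = toRegex("/incoming/*/projectData*.dat");
        System.out.println(regex);
        System.out.println(
                Pattern.matches(regex, "/incoming/alice/projectData20031117.dat"));
    }
}
```

    Because \w excludes path separators, each '*' stays within one path component, so a deeper path such as /incoming/a/b/projectData1.dat does not match.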

    That was fun! I even wrote a bunch of test cases for it.

    Combining both doesn't sound very useful for a search to
    find
    /incoming/prePartABC/data123.dat
    OR
    /incoming/prePartABC/prePartXYZ/data123.dat
    given
    /incoming/prePart*/data*.dat

    On the other hand, I happened to stumble on the thought that
    a partial dir spec might be useful in my case.

    Maybe an alternate entry point (or the ** syntax) would be
    a great way to provide both!

    BUT I'm too agile oriented (i.e. don't implement what I don't need),
    and I have a delivery this week, so I think I'll pass on your
    interesting suggestion.

    thanks for the input,
    -Paul
     
    P.Hill, Nov 18, 2003
    #5
  6. Alan Moore

    Alan Moore Guest

    On Mon, 17 Nov 2003 11:38:29 -0700, "P.Hill" <>
    wrote:

    >Greetings Java Folks,
    >
    >I was writing a class that processes a list of files.
    >Given an arbitrary file spec that looked like:
    >/incoming/projectData*.dat
    >The last splat in this case is actually a time stamp.
    >
    >but now I need to complicate things by searching for
    >/incoming/*/projectData*.dat
    >
    >Where the first splat is a userDir.
    >
    >Anyone know of code that can help me with this?


    Have you ever checked out JRegex? Its filesystem utilities seem to be
    a very close match for what you're trying to do:

    http://jregex.sourceforge.net/gstarted.html#filesystem
     
    Alan Moore, Nov 18, 2003
    #6
  7. P.Hill

    P.Hill Guest

    Alan Moore wrote:
    > On Mon, 17 Nov 2003 11:38:29 -0700, "P.Hill" <>
    > wrote:
    >>Anyone know of code that can help me with this?

    >
    > Have you ever checked out JRegex? Its filesystem utilities seem to be
    > a very close match for what you're trying to do:
    >
    > http://jregex.sourceforge.net/gstarted.html#filesystem

    Alan,

    You are right. In fact that is pretty much an exact match for
    everything both John and I suggested.

    1. The jregex.util.io.WildcardFilter class.

    [...]
    File dir=...;
    String[] htmlFiles=dir.list(new WildcardFilter("*.html"));


    2. The jregex.util.io.PathPattern class.
    [...]
    The path pattern can be both relative and absolute and may contain the
    following wildcards:
    ? - any-character
    * - any-string
    ** - any-path

    I should have waited a day while others were responding.
    I will definitely note these classes.

    thanks,
    -Paul
     
    P.Hill, Nov 18, 2003
    #7