Better way of checking file/directory spec in Win32?

Discussion in 'Perl Misc' started by Henry Law, Sep 15, 2005.

  1. Henry Law

    Henry Law Guest

    Running under Activestate on Windows XP I have need to feed a series of
    directory and file specifications to a program, which then interprets
    them. Specifications can be directories, single file names, or wildcard
    combinations like c:\foo\bar\*.pdf which result in lists of matching files.

    The specifications need to be validated to weed out those that don't
    refer to an existing directory, or that result in no matching files.
    I've Googled and searched CPAN for a module that does this, since it
    must be a common requirement, but without success. Can someone suggest one?

    Here's the subroutine I wrote to do the job, so you'll see what I'm
    trying to do. If it needs to exist then comments on how to smarten it
    up would be welcome.

    #!/usr/bin/perl

    use strict;
    use warnings;

    use constant VALID_DIRECTORY => -1;
    use constant VALID_FILE => -2;

    print "Enter spec: ";
    my $spec = <STDIN>;
    chomp $spec;
    while ($spec) {
        my $ret = interpret_spec($spec);
        if ($ret == VALID_DIRECTORY) {
            print "'$spec' is a valid directory\n";
        } elsif ($ret == VALID_FILE) {
            print "'$spec' is an existing file\n";
        } elsif ($ret) {
            print "'$spec' results in $ret files\n";
        } else {
            print "'$spec' doesn't match any file or directory\n";
        }
        print "=> ";
        $spec=<STDIN>;
        chomp $spec;
    }

    sub interpret_spec{
        # Checks a file/directory specification. Returns
        #   0       Spec matches nothing
        #   -1      It's a valid directory specification
        #   -2      It's a valid single file specification
        #   number  It's a glob, resulting in "number" files

        my $spec = shift;
        return VALID_DIRECTORY if (-d $spec);
        return VALID_FILE if (-f $spec);
        my @globs = glob $spec;
        if (scalar @globs == 1){
            return ((-f $globs[0]) ? 1 : 0);
        } else {
            return scalar @globs;
        }
    }
    Henry Law, Sep 15, 2005
    #1

  2. Anno Siegel

    Anno Siegel Guest

    Henry Law wrote in comp.lang.perl.misc:
    > Running under Activestate on Windows XP I have need to feed a series of
    > directory and file specifications to a program, which then interprets
    > them. Specifications can be directories, single file names, or wildcard
    > combinations like c:\foo\bar\*.pdf which result in lists of matching files.
    >
    > The specifications need to be validated to weed out those that don't
    > refer to an existing directory, or that result in no matching files.


    My first idea when I read this was (untested):

    my @valid = grep -e, map glob, @specs;

    but you seem to want a more detailed analysis of each entry.
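
    A minimal sketch of that one-liner in action, written with explicit
    blocks instead of the bare "grep -e, map glob" form. The scratch
    directory and the file names are made up for the demonstration, and it
    assumes the temporary path contains no whitespace (plain glob splits
    its argument on whitespace).

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

# Build a scratch directory holding two empty files (hypothetical names).
my $dir = tempdir(CLEANUP => 1);
for my $name (qw(a.pdf b.pdf)) {
    open my $fh, '>', File::Spec->catfile($dir, $name) or die "open: $!";
    close $fh;
}

# Use forward slashes in the patterns; glob accepts them on Windows as well.
(my $base = $dir) =~ tr{\\}{/};
my @specs = ("$base/*.pdf", "$base/missing.txt", "$base/*.doc");

# Expand every spec, then keep only the names that exist on disk.
# A spec without wildcards comes back unchanged, so -e is what rejects it.
my @valid = grep { -e $_ } map { glob $_ } @specs;

print "$_\n" for @valid;
```

    Only the two .pdf files survive: the wildcard that matches nothing
    expands to an empty list, and the literal name of a missing file is
    dropped by the -e test.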

    > I've Googled and searched CPAN for a module that does this, since it
    > must be a common requirement, but without success. Can someone suggest one?
    >
    > Here's the subroutine I wrote to do the job, so you'll see what I'm
    > trying to do. If it needs to exist then comments on how to smarten it
    > up would be welcome.
    >
    > #!/usr/bin/perl
    >
    > use strict;
    > use warnings;
    >
    > use constant VALID_DIRECTORY => -1;
    > use constant VALID_FILE => -2;


    [code that uses interpret_spec() snipped]

    > sub interpret_spec{
    > # Checks a file/directory specification. Returns
    > # 0 Spec matches nothing
    > # -1 It's a valid directory specification
    > # -2 It's a valid single file specification
    > # number It's a glob, resulting in "number" files
    >
    > my $spec = shift;
    > return VALID_DIRECTORY if (-d $spec);
    > return VALID_FILE if (-f $spec);
    > my @globs = glob $spec;
    > if (scalar @globs == 1){

          ^^^^^^
    You don't need "scalar", it's already in scalar context.

    > return ((-f $globs[0]) ? 1 : 0);
    > } else {
    > return scalar @globs;
    > }
    > }


    There are issues with some forms of glob() that need looking at.

    Under Unix, it is quite possible to have a file named like a glob
    pattern: "touch '*.foo'" creates a file with an actual asterisk in
    its name. The logic of interpret_spec() would detect the file first
    and never try to expand the glob (which may hit several more files).
    This problem may be void under windows, I don't know.

    A different problem occurs with globs of the form "foo.{a,b,c}".
    This expands to "foo.a", "foo.b" and "foo.c" with no check against
    the file system, that is, it doesn't matter whether "foo.a" etc.
    exist as files. That may result in unwanted answers.
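
    The difference can be seen with a small experiment. This is only a
    sketch: the scratch directory and the foo.* names are invented for the
    demonstration, and the temporary path is assumed to contain no
    whitespace.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir = tempdir(CLEANUP => 1);
(my $base = $dir) =~ tr{\\}{/};

# Create only foo.a; foo.b and foo.c never exist.
open my $fh, '>', "$base/foo.a" or die "open: $!";
close $fh;

# Brace expansion is purely textual: all three names come back.
my @expanded = glob "$base/foo.{a,b,c}";

# A wildcard, by contrast, is matched against the directory contents.
my @matched = glob "$base/foo.*";

# Filtering with -e restores the "must exist" guarantee for brace patterns.
my @existing = grep { -e $_ } @expanded;

printf "expanded=%d matched=%d existing=%d\n",
    scalar @expanded, scalar @matched, scalar @existing;
```

    Counting the raw result of a brace pattern therefore overstates the
    number of files; counting after the -e filter does not.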

    Anno
    --
    If you want to post a followup via groups.google.com, don't use
    the broken "Reply" link at the bottom of the article. Click on
    "show options" at the top of the article, then click on the
    "Reply" at the bottom of the article headers.
    Anno Siegel, Sep 15, 2005
    #2

  3. Henry Law

    Henry Law Guest

    Anno Siegel wrote:
    > My first idea when I read this was (untested):
    >
    > my @valid = grep -e, map glob, @specs;
    >
    > but you seem to want a more detailed analysis of each entry.


    Hmm; I've a lot still to learn. That wouldn't have been my first idea;
    nor my twenty-first either.

    >> if (scalar @globs == 1){

    Yes, of course. Laziness, unfortunately.

    > There are issues with some forms of glob() that need looking at.
    >
    > Under Unix, it is quite possible to have a file named like a glob
    > pattern: "touch '*.foo'" creates a file with an actual asterisk in
    > its name.

    How horrible. Fortunately nobody other than me is going to run this
    thing and I'll just stay away from things like that.

    > This problem may be void under windows, I don't know.

    It is; "*" is not a valid character in Redmond's file names.

    > A different problem occurs with globs of the form "foo.{a,b,c}".
    > This expands to "foo.a", "foo.b" and "foo.c" with no check against
    > the file system

    Once again I can avoid those and get round the problem. But thank you
    for pointing out the breadth and depth of the problem I am trying to solve!
    Henry Law, Sep 15, 2005
    #3
  4. Anno Siegel

    Anno Siegel Guest

    Henry Law wrote in comp.lang.perl.misc:
    > Anno Siegel wrote:
    > > My first idea when I read this was (untested):
    > >
    > > my @valid = grep -e, map glob, @specs;
    > >
    > > but you seem to want a more detailed analysis of each entry.

    >
    > Hmm; I've a lot still to learn. That wouldn't have been my first idea;
    > nor my twenty-first either.


    Ah, but it isn't hard at all once you get into the habit of considering
    data structures (like arrays and lists, in this case) as units that are
    the objects of operations. In Perl, there are only three "operators",
    map, grep, and sort, that transform one list into another, but they are
    powerful because they all take a user-supplied function that works on
    list elements.
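
    As a small illustration of chaining the three, here is a sketch that
    normalises, filters and orders a list in one expression. The file
    names are hypothetical and nothing touches the file system.

```perl
use strict;
use warnings;

# Hypothetical file names, chosen only for illustration.
my @names = qw(report.PDF notes.txt summary.pdf draft.pdf);

# Read the pipeline from the bottom up: each operator takes a list,
# applies the supplied block to the elements, and returns a new list.
my @pdfs =
    sort { $a cmp $b }      # order the survivors alphabetically
    grep { /\.pdf\z/ }      # keep only names ending in .pdf
    map  { lc $_ }          # lower-case every name first
    @names;

print "@pdfs\n";
```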

    OO takes that way of thinking a step further and lets you define your
    own operations on complex data structures. Working with objects for
    a while is habit-forming that way. You start to see would-be objects
    all over your normal code as well, which is a productive way of looking
    at things.

    > >> if (scalar @globs == 1){

    > Yes, of course. Laziness, unfortunately.


    Ah, the kind of laziness that results in more text rather than less.
    Who was it who didn't have time for a short letter, so wrote a long
    one instead?

    > > There are issues with some forms of glob() that need looking at.
    > >
    > > Under Unix, it is quite possible to have a file named like a glob
    > > pattern: "touch '*.foo'" creates a file with an actual asterisk in
    > > its name.

    > How horrible. Fortunately nobody other than me is going to run this
    > thing and I'll just stay away from things like that.
    >
    > > This problem may be void under windows, I don't know.

    > It is; "*" is not a valid character in Redmond's file names.


    Yes, I learned that in the meantime too.

    > > A different problem occurs with globs of the form "foo.{a,b,c}".
    > > This expands to "foo.a", "foo.b" and "foo.c" with no check against
    > > the file system

    > Once again I can avoid those and get round the problem. But thank you
    > for pointing out the breadth and depth of the problem I am trying to solve!


    I'm afraid it is not so much depth, but a certain murkiness in the
    problem that I'm pointing out. The question whether a certain string
    "is" a glob or a plain filename can be answered variously. As a
    programmer, you must decide. It seems to me that the decision is
    rather implicit in the code. If it mattered (as it probably doesn't)
    it would be advisable to make the decision explicit.
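
    One way to make the decision explicit, as a sketch only: classify the
    spec before touching the file system. The names is_glob_pattern and
    interpret_spec_explicit are hypothetical, and two choices here are
    assumptions rather than anything settled above: only *, ? and [ count
    as wildcards (a spec containing nothing but braces is tested as a
    literal name), and a pattern is scored by the number of plain files it
    matches.

```perl
use strict;
use warnings;

use constant VALID_DIRECTORY => -1;
use constant VALID_FILE      => -2;

# Hypothetical helper: decide up front whether a spec is a pattern.
# Assumption: only * ? and [ are treated as wildcard characters.
sub is_glob_pattern {
    my ($spec) = @_;
    return $spec =~ /[*?\[]/ ? 1 : 0;
}

# Hypothetical variant of interpret_spec with the decision made first.
sub interpret_spec_explicit {
    my ($spec) = @_;

    if (is_glob_pattern($spec)) {
        # Pattern: report how many plain files it matches (0 if none).
        my @files = grep { -f $_ } glob $spec;
        return scalar @files;
    }

    # Literal name: it is a directory, a file, or nothing at all.
    return VALID_DIRECTORY if -d $spec;
    return VALID_FILE      if -f $spec;
    return 0;
}

# The current directory always exists, so this reports a valid directory.
print interpret_spec_explicit('.'), "\n";
```

    With this ordering a file whose name happens to contain an asterisk is
    always treated as a pattern, which is the opposite of what the
    original subroutine does; either choice is defensible as long as it is
    a deliberate one.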

    Anno
    Anno Siegel, Sep 15, 2005
    #4
  5. Anno Siegel

    Anno Siegel Guest

    Jim Gibson wrote in comp.lang.perl.misc:
    > In article <dgcgjv$m5b$-Berlin.DE>, Anno Siegel
    > <-berlin.de> wrote:
    > >
    > > Ah, the kind of laziness that results in more text rather than less.
    > > Who was it who didn't have time for a short letter, so wrote a long
    > > one instead?
    > >

    >
    > Mark Twain


    Thanks!

    Anno
    Anno Siegel, Sep 15, 2005
    #5
