Cleanup file path

Discussion in 'Perl Misc' started by Yves Martin, Jul 5, 2004.

  1. Yves Martin

    Yves Martin Guest

    Hello,

    I'm looking for a way to clean file paths like the following:
    ~/dir1/../dir2
    ~/dir1/../dir2/f1
    ~/dir1/./dir2/f1
    /dir1/../dir2
    /dir1/../dir2/f1
    /dir1/./dir2/f1

    I tested Cwd->realpath, which reduces ./ and ../ but has several
    drawbacks:
    - it does not understand ~/
    - it follows symbolic links
    - it checks the path against the filesystem, so a nonexistent file
    becomes undef.

    Results I expect are obviously:
    ~/dir1/../dir2 -> ~/dir2
    ~/dir1/../dir2/f1 -> ~/dir2/f1
    ~/dir1/./dir2/f1 -> ~/dir1/dir2/f1
    /dir1/../dir2 -> /dir2
    /dir1/../dir2/f1 -> /dir2/f1
    /dir1/./dir2/f1 -> /dir1/dir2/f1

    whatever the filesystem looks like (nonexistent file or symlink),
    and in a multiplatform way of course ;)

    I know I'm asking a lot, but I wanted to ask before writing the code
    myself with File::Spec's splitpath, catpath, splitdir and catdir.

    If there is a simple existing way to do that, please just tell
    me. Thank you in advance.

    If you just want to check your ideas, here is a short test case:

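    use strict;
    use warnings;
    use File::Spec;

    my @testCase = (
        "~/dir1/../dir2",
        "~/dir1/../dir2/f1",
        "~/dir1/./dir2/f1",
        "/dir1/../dir2",
        "/dir1/../dir2/f1",
        "/dir1/./dir2/f1",
    );
    # canonpath works on the string only; it never looks at the filesystem
    print "Result: " . join( " ", map { File::Spec->canonpath($_) } @testCase ) . "\n";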

    Result: ~/dir1/../dir2 ~/dir1/../dir2/f1 ~/dir1/dir2/f1 /dir1/../dir2 /dir1/../dir2/f1 /dir1/dir2/f1

    Regards,
    --
    Yves Martin
     
    Yves Martin, Jul 5, 2004
    #1

  2. Peter J. Acklam

    Yves Martin wrote:

    > I'm looking for a way to clean file paths like the following:
    > ~/dir1/../dir2
    > ~/dir1/../dir2/f1
    > ~/dir1/./dir2/f1
    > /dir1/../dir2
    > /dir1/../dir2/f1
    > /dir1/./dir2/f1
    >
    > I tested Cwd->realpath, which reduces ./ and ../ but has several
    > drawbacks:


    None of them are drawbacks. I'm not sure you understand what the
    purpose of it is. :)

    > - it does not understand ~/


    It works fine for me:

    $ cd /var/tmp
    $ mkdir -p '~/dir1'
    $ perl -MCwd -wle 'print Cwd::realpath "~/dir1"'
    /var/tmp/~/dir1

    You can't expect it to expand "~" into your home directory since
    "~" might very well be an existing directory.

    If you want tilde expansion, do it yourself. The FAQ (perlfaq5)
    describes how to do this.
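
    A minimal sketch of doing the tilde expansion by hand, in the
    spirit of the FAQ's advice. The name expand_tilde is made up for
    this example, and the code assumes a Unix-like system where HOME
    is set or getpwuid/getpwnam are available:

```perl
use strict;
use warnings;

# Hypothetical helper (not part of any module): expand a leading "~" or
# "~user". Assumes a Unix-like system; getpwnam/getpwuid are not
# implemented on every platform.
sub expand_tilde {
    my ($path) = @_;
    $path =~ s{^~([^/]*)}{
        length $1
            ? ( (getpwnam($1))[7] // "~$1" )       # ~user: home dir, or leave as-is
            : ( $ENV{HOME} // (getpwuid($>))[7] )  # bare ~: current user's home
    }ex;
    return $path;
}

print expand_tilde('~/dir1'), "\n";
```

    Only a tilde at the very start of the string is touched, so a
    directory literally named "~" further down the path is left alone.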

    > - it follows symbolic links


    Of course it does. How else would it find the absolute path?

    > - it checks the path against the filesystem, so a nonexistent
    > file becomes undef.


    That makes sense, since a non-existing file has no real pathname.

    > Results I expect are obviously:
    > ~/dir1/../dir2 -> ~/dir2
    > ~/dir1/../dir2/f1 -> ~/dir2/f1
    > ~/dir1/./dir2/f1 -> ~/dir1/dir2/f1
    > /dir1/../dir2 -> /dir2
    > /dir1/../dir2/f1 -> /dir2/f1
    > /dir1/./dir2/f1 -> /dir1/dir2/f1
    >
    > whatever the filesystem looks like (nonexistent file or
    > symlink) and which is multiplatform of course ;)


    You can't simplify '~/dir1/../dir2' into '~/dir2' unless you make
    sure that 'dir1' is a directory. If 'dir1' is a symbolic link,
    then '~/dir1/../dir2' and '~/dir2' might be two totally different
    directories.
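
    The point can be checked with a small experiment. This is only
    a sketch: the directory names are invented, it runs in a scratch
    directory, and it assumes a platform where symlink() works:

```perl
use strict;
use warnings;
use Cwd qw(realpath);
use File::Temp qw(tempdir);

# Scratch layout (all names are made up for this demonstration):
#   $base/dir2
#   $base/elsewhere/target
#   $base/elsewhere/dir2
#   $base/dir1 -> $base/elsewhere/target   (symbolic link)
my $base = realpath( tempdir( CLEANUP => 1 ) );
mkdir "$base/$_" or die "mkdir $_: $!"
    for qw(dir2 elsewhere elsewhere/target elsewhere/dir2);
symlink "$base/elsewhere/target", "$base/dir1"
    or die "symlink: $!";

my $via_filesystem = realpath("$base/dir1/../dir2");  # follows the link first
my $lexical        = "$base/dir2";                    # result of textual cleanup

print "filesystem: $via_filesystem\n";
print "lexical:    $lexical\n";
print "same directory? ", ( $via_filesystem eq $lexical ? "yes" : "no" ), "\n";
```

    Because "dir1" is a link into "elsewhere", the ".." is resolved
    relative to the link's target, so the two results name different
    directories.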

    Peter

    --
    #!/local/bin/perl5 -wp -*- mode: cperl; coding: iso-8859-1; -*-
    # matlab comment stripper (strips comments from Matlab m-files)
    s/^((?:(?:[])}\w.]'+|[^'%])+|'[^'\n]*(?:''[^'\n]*)*')*).*/$1/x;
     
    Peter J. Acklam, Jul 5, 2004
    #2

  3. Yves Martin

    Yves Martin Guest

    Peter J. Acklam writes:

    > Yves Martin wrote:
    >
    >> I'm looking for a way to clean file paths like the following:
    >> ~/dir1/../dir2
    >> ~/dir1/../dir2/f1
    >> ~/dir1/./dir2/f1
    >> /dir1/../dir2
    >> /dir1/../dir2/f1
    >> /dir1/./dir2/f1


    Sorry for the misunderstanding, just forget what I said about
    'realpath'.

    My aim is not to get the real path of a file BUT to clean up file
    paths according to the following examples:

    ~/dir1/../dir2 -> ~/dir2
    ~/dir1/../dir2/f1 -> ~/dir2/f1
    ~/dir1/./dir2/f1 -> ~/dir1/dir2/f1
    /dir1/../dir2 -> /dir2
    /dir1/../dir2/f1 -> /dir2/f1
    /dir1/./dir2/f1 -> /dir1/dir2/f1

    I have tried many things: File::Spec->canonpath, Cwd->realpath
    ... and realpath was the closest I found, apart from the drawbacks
    I listed.

    So I'm just looking for a function/package that can do the
    cleanup I described in a multi-platform way without checking the
    filesystem.

    Sorry again.
    --
    Yves Martin
     
    Yves Martin, Jul 5, 2004
    #3
  4. Peter J. Acklam

    Yves Martin wrote:

    > My aim is not to get the real path of a file BUT to clean up
    > file paths according to the following examples:
    >
    > ~/dir1/../dir2 -> ~/dir2
    > ~/dir1/../dir2/f1 -> ~/dir2/f1
    > ~/dir1/./dir2/f1 -> ~/dir1/dir2/f1
    > /dir1/../dir2 -> /dir2
    > /dir1/../dir2/f1 -> /dir2/f1
    > /dir1/./dir2/f1 -> /dir1/dir2/f1
    >
    > I have tried many things: File::Spec->canonpath, Cwd->realpath
    > ... and realpath was the closest I found with the drawbacks I
    > enumerated.


    I see, so it's ok that '~/dir1/../dir2' becomes '~/dir2' even if
    they are actually two different directories?

    Oh, well, try this:

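    sub trimpath {
        local $_ = shift;

        # canonpath removes duplicate slashes and "/./" but keeps ".."
        require File::Spec::Functions;
        $_ = File::Spec::Functions::canonpath($_);

        # The substitutions below only handle "/" as separator.

        # /foo/bar/../baz -> /foo/baz
        1 while s{/(?!\.\.?/)[^/]+/\.\.(/|$)}{$1}gx;

        # /../../foo -> /foo
        s|^/(\.\./)+|/|;

        return $_;
    }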
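
    For comparison, here is a sketch of the same purely lexical
    cleanup built only on File::Spec's splitpath, splitdir, catdir
    and catpath, so that the separator is not hard-coded. The name
    clean_path is made up for this example, the special handling of
    a leading "~" component is an assumption about what is wanted,
    and like trimpath it gives misleading results when a component
    is a symbolic link:

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: collapse "." and ".." by looking at the string only.
# It never touches the filesystem, so symbolic links are not honoured.
sub clean_path {
    my ($path) = @_;
    my ( $up, $cur ) = ( File::Spec->updir, File::Spec->curdir );

    # A true second argument tells splitpath there is no file part,
    # so the last component stays in the directory list.
    my ( $volume, $dirs ) = File::Spec->splitpath( $path, 1 );
    my @parts  = File::Spec->splitdir($dirs);
    my $rooted = @parts && $parts[0] eq '';    # leading '' marks an absolute path

    my @kept;
    for my $part (@parts) {
        next if $part eq '' || $part eq $cur;  # drop empty and "." components
        if ( $part ne $up ) { push @kept, $part; next; }

        # Assumption: a leading "~" or "~user" is a home directory whose
        # parent is unknown, so ".." is not allowed to cancel it.
        my $at_home = !$rooted && @kept == 1 && $kept[0] =~ /^~/;
        if ( @kept && $kept[-1] ne $up && !$at_home ) {
            pop @kept;                         # "name/.." cancels out
        }
        elsif ( !$rooted ) {
            push @kept, $up;                   # keep ".." that climbs above the start
        }
        # otherwise: ".." directly under the root is dropped
    }
    @kept = ($cur) if !$rooted && !@kept;      # everything cancelled: current dir

    my $dir = File::Spec->catdir( ( $rooted ? '' : () ), @kept );
    return File::Spec->catpath( $volume, $dir, '' );
}

print "$_ -> ", clean_path($_), "\n"
    for qw(~/dir1/../dir2 ~/dir1/../dir2/f1 ~/dir1/./dir2/f1
           /dir1/../dir2 /dir1/../dir2/f1 /dir1/./dir2/f1);
```

    On a Unix-like system this should reproduce the mappings asked
    for in the thread; behaviour on other platforms depends on the
    File::Spec subclass in use and has not been checked here.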

    Peter

     
    Peter J. Acklam, Jul 5, 2004
    #4
  5. Joe Smith

    Joe Smith Guest

    Yves Martin wrote:

    > Peter J. Acklam writes:
    >
    > My aim is not to get the real path of a file BUT to clean up file
    > paths according to the following examples:
    >
    > ~/dir1/../dir2 -> ~/dir2


    On my system, ~/src = /home/jms/src but
    ~/src/../dir2 = /home/common/sources/dir2, not /home/jms/dir2.

    Why would you want to use an algorithm that is guaranteed to
    return wrong data when symlinks are used? The only way to get
    completely accurate results is to check the filesystem on the
    server (something that cannot be done on a proxy).

    -Joe
     
    Joe Smith, Jul 5, 2004
    #5
  6. Peter J. Acklam

    Joe Smith wrote:

    > On my system, ~/src = /home/jms/src but
    > ~/src/../dir2 = /home/common/sources/dir2, not /home/jms/dir2.
    >
    > Why would you want to use an algorithm that is guaranteed to
    > return wrong data when symlinks are used?


    It's not guaranteed to be wrong. For instance, it won't be wrong
    if the symlink is to a directory within the same directory as the
    symlink.

    So the problem isn't that it is guaranteed to fail, but that it
    isn't guaranteed to succeed. Of course, that is hardly any
    better.

    Peter

     
    Peter J. Acklam, Jul 5, 2004
    #6
  7. Yves Martin

    Yves Martin Guest

    Peter J. Acklam writes:

    > It's not guaranteed to be wrong. For instance, it won't be wrong
    > if the symlink is to a directory within the same directory as the
    > symlink.
    >
    > So the problem isn't that it is guaranteed to fail, but that it
    > isn't guaranteed to succeed. Of course, that is hardly any
    > better.


    Thank you for your help.

    In fact, my Perl script is supposed to read build system
    configuration files (generated and read in Java) and to produce
    Emacs JDEE project files (prj.el).

    The source configuration files have known constraints (relative
    paths and no symbolic links at that level, though perhaps outside
    the build system), and prj.el supports "~/" but not ../ or ./ in
    paths. That is the whole story.
    --
    Yves Martin
     
    Yves Martin, Jul 6, 2004
    #7
