How to test if two strings point to the same file or directory?

Sandra-24

Comparing file system paths as strings is very brittle. Is there a
better way to test if two paths point to the same file or directory
(and that will work across platforms?)

Thanks,
-Sandra
 
Tim Chase

Comparing file system paths as strings is very brittle. Is there a
better way to test if two paths point to the same file or directory
(and that will work across platforms?)

os.path.samefile(filename1, filename2)
os.path.sameopenfile(fileobject1, fileobject2)

-tkc
 
Steven D'Aprano

Comparing file system paths as strings is very brittle.

Why do you say that? Are you thinking of something like this?

/home//user/somedirectory/../file
/home/user/file

Both point to the same file.

Is there a
better way to test if two paths point to the same file or directory
(and that will work across platforms?)

How complicated do you want to get? If you are thinking about aliases,
hard links, shortcuts, SMB shares and other complications, I'd be
surprised if there is a simple way.

But for the simple case above:
'/home/user/file'
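The call that produced that result is missing from the post above. A minimal sketch, assuming the function meant was os.path.normpath (the POSIX flavour, posixpath, is used here so the output does not depend on the machine running it):

```python
import posixpath

# posixpath is the POSIX implementation behind os.path on Unix-like systems.
messy = '/home//user/somedirectory/../file'

# normpath collapses duplicate slashes and resolves '..' purely as text.
clean = posixpath.normpath(messy)
print(clean)  # '/home/user/file'
```

Note that normpath works on the string only. If somedirectory were a symbolic link, collapsing the '..' could name a different file than the operating system would actually reach.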
 
Tim Chase

Comparing file system paths as strings is very brittle.
Why do you say that? Are you thinking of something like this?

/home//user/somedirectory/../file
/home/user/file

Or even

~/file

How complicated do you want to get? If you are thinking about aliases,
hard links, shortcuts, SMB shares and other complications, I'd be
surprised if there is a simple way.

But for the simple case above:

'/home/user/file'

I'd suggest os.path.samefile which should handle case-sensitive
(non-win32) vs case-insensitive (win32) filenames, soft-links,
and hard-links. I'm not sure it's prescient enough to know, if you
have two remote shares, to unwind them to their full
server-path names. It works on my various boxes (Linux, MacOS-X
and OpenBSD) here. I'd assume it's the same functionality on Win32.

-tkc
 
Steven D'Aprano

How complicated do you want to get? If you are thinking about aliases,
hard links, shortcuts, SMB shares and other complications, I'd be
surprised if there is a simple way.

Almost, but not quite, platform independent.

os.path.samefile(path1, path2)

From the docs:

samefile(path1, path2)
Return True if both pathname arguments refer to the same file or
directory (as indicated by device number and i-node number). Raise an
exception if a os.stat() call on either pathname fails. Availability:
Macintosh, Unix

http://docs.python.org/lib/module-os.path.html
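To make the "device number and i-node number" part of that description concrete, here is a rough sketch of the same check done by hand with os.stat. The name same_file_by_stat is made up for illustration; this is not a copy of the library's implementation.

```python
import os

def same_file_by_stat(path1, path2):
    """Return True if both paths resolve to the same device and inode.

    Hypothetical helper for illustration only.
    """
    s1 = os.stat(path1)  # raises OSError if the path cannot be stat'ed
    s2 = os.stat(path2)
    # Two names refer to one file when device and inode numbers both match.
    return (s1.st_dev, s1.st_ino) == (s2.st_dev, s2.st_ino)
```

Because os.stat follows symbolic links, and hard links share an inode, this kind of comparison sees through both, which a string comparison cannot do.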
 
Tim Chase

Comparing file system paths as strings is very brittle. Is there a
Nice try, but they don't "work across platforms".

Okay...double-checking the docs.python.org writeup, it apparently
does "work across platforms" (Mac & Unix), just not "across *all*
platforms", with Win32 being the obvious outlier. It seems a
strange omission from Win32 python, even if it were filled in
with only a stub...something like

    from os.path import abspath

    def samefile(f1, f2):
        return abspath(f1.lower()) == abspath(f2.lower())

it might not so gracefully handle UNC-named files, or SUBST'ed
file-paths, but at least it provides an attempt at providing the
functionality on win32. As it currently stands, it would be
entirely too easy for a [Mac|*nix] programmer to see that there's
a samefile() function available, use it successfully based on its
docstring, only to have it throw an exception or silently fail
when run on win32.

-tkc
 
John Machin

Tim Chase wrote:
[snip]
I'd suggest os.path.samefile which should handle case-sensitive
(non-win32) vs case-insensitive (win32) filenames, soft-links,
and hard-links. Not sure it's prescient enough to know if you
have two remote shares, it will unwind them to their full
server-path name. Works here on my various boxes (Linux, MacOS-X
and OpenBSD) here. I'd assume it's the same functionality on Win32.

Assume nothing. Read the manual.
 
Sandra-24

/home//user/somedirectory/../file
/home/user/file

Both point to the same file.

hard links, shortcuts, SMB shares and other complications, I'd be
surprised if there is a simple way.

So would I. Maybe it would make a good addition to the os.path library?

os.path.isalias(path1, path2)

But for the simple case above:

The simplest I can think of that works for me is:

    def isalias(path1, path2):
        return (os.path.normcase(os.path.normpath(path1)) ==
                os.path.normcase(os.path.normpath(path2)))

But that won't work with more complicated examples. A common one that
bites me on windows is shortening of path segments to 6 characters and
a ~1.

-Dan
 
John Machin

Tim said:
Okay...double-checking the docs.python.org writeup, it apparently
does "work across platforms" (Mac & Unix), just not "across *all*
platforms", with Win32 being the obvious outlier.

My bet would be that it really works on Unix only (OS/X qualifying as
Unix) and not on older Mac setups.

It seems a
strange omission from Win32 python, even if it were filled in
with only a stub...something like

    def samefile(f1, f2):
        return abspath(f1.lower()) == abspath(f2.lower())

it might not so gracefully handle UNC-named files, or SUBST'ed
file-paths, but at least it provides an attempt at providing the
functionality on win32. As it currently stands, it would be
entirely too easy for a [Mac|*nix] programmer to see that there's
a samefile() function available, use it successfully based on its
docstring, only to have it throw an exception or silently fail
when run on win32.

The current setup will not "silently fail when run on win32". How could
it? It doesn't exist; it can't be run.

Python 2.5 (r25:51908, Sep 19 2006, 09:52:17) [MSC v.1310 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
| >>> import os
| >>> os.path.samefile
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'module' object has no attribute 'samefile'

The only way it could "silently fail when run on win32" would be to
have a "stub" which gave the wrong answer.
 
Leif K-Brooks

Tim said:
Or even

~/file

~ is interpreted as "my home directory" by the shell, but when it's used
in a path, it has no special meaning. open('~/foo.txt') tries to open a
file called foo.txt in a subdirectory of the current directory called '~'.
 
Erik Max Francis

Leif said:
~ is interpreted as "my home directory" by the shell, but when it's used
in a path, it has no special meaning. open('~/foo.txt') tries to open a
file called foo.txt in a subdirectory of the current directory called '~'.

That's what os.path.expanduser is for.
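A short sketch of what that looks like in practice. The file name is just the one from the example above, and the expanded result depends on the account running the code:

```python
import os.path

path = '~/foo.txt'

# expanduser replaces a leading '~' with the current user's home directory.
# Paths that do not start with '~' are returned unchanged.
expanded = os.path.expanduser(path)
print(expanded)
```

Expanding first and only then normalizing or comparing avoids treating '~' as a literal directory name.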
 
John Nagle

Sandra-24 said:
Comparing file system paths as strings is very brittle. Is there a
better way to test if two paths point to the same file or directory
(and that will work across platforms?)

No.

There are ways to do it for many operating systems, but there is no
system-independent way. It's often not possible for files accessed
across a network; the information just isn't being sent.

John Nagle
 
Tim Golden

Sandra-24 said:
Comparing file system paths as strings is very brittle. Is there a
better way to test if two paths point to the same file or directory
(and that will work across platforms?)

I suspect that the "and that will work across platforms"
parenthesis is in effect a killer. However, if you're
prepared to waive that particular requirement, I can
suggest reading this page for a Win32 possibility:

http://timgolden.me.uk/python/win32_how_do_i/see_if_two_files_are_the_same_file.html

TJG
 
Tim Chase

The current setup will not "silently fail when run on win32". How could
it? It doesn't exist; it can't be run.

Ah...didn't know which it did (or didn't do) as I don't have a
win32 box at hand on which to test it.

In chasing the matter further, the OP mentioned that their
particular problem was related to tilde-expansion/compression
which python doesn't seem to distinguish.

To take this into consideration, there's some advice at

http://groups.google.com/group/comp...read/thread/ee559c99d54d970b/b71cb7ac1b7be105

where Chris Tismer has an example function/module that uses Win32
API calls to normalize a path/filename to the short-name equiv.
It looks like this could be integrated into the previous code I
posted, so you'd have something like

    os.path.samefile = lambda f1, f2: (
        LongToShort(abspath(f1)).lower() ==
        LongToShort(abspath(f2)).lower()
        )

As stated, it's a bit fly-by-the-seat-of-the-pants as I don't
have any boxes running Win32 here at home, but that would be the
general gist of the idea.

It would be handy to add it, as the semantic meaning is the same
across platforms, even if the implementation details are vastly
different. One of the reasons I use python is because it usually
crosses platform boundaries with nary a blink.

Just a few more ideas,

-tkc
 
Sandra-24

It looks like you can get a fairly good approximation for samefile on
win32. Currently I'm using the algorithm suggested by Tim Chase as it
is "good enough" for my needs.

But if one wanted to add samefile to the ntpath module, here's the
algorithm I would suggest:

If the passed files do not exist, apply abspath and normcase to both
and return the result of comparing them for equality as strings.

If both paths pass isfile(), try the mechanism linked to in this thread,
which opens both files at the same time and compares volume and index
information. Return this result. If that raises an error (maybe we
will not be allowed to open them), try comparing them using the approach
suggested by Tim Chase, but if that works there should be some
indication given that the comparison may not be accurate (raise a
warning?). If that also fails, raise an error.
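The steps above could be sketched roughly as follows. This is only an outline under stated assumptions: samefile_sketch and _normalized are made-up names, os.path.samefile stands in for the volume and index comparison (it is available on Windows in current Python versions), existence is checked with exists() rather than isfile() so directories are covered too, and "do not exist" is read as "at least one of the two is missing".

```python
import os
import warnings

def _normalized(path):
    # String-level fallback: absolute path, then platform case folding.
    return os.path.normcase(os.path.abspath(path))

def samefile_sketch(path1, path2):
    """Hypothetical illustration of the proposed fallback chain."""
    # Step 1: with a missing path, only a string comparison is possible.
    if not (os.path.exists(path1) and os.path.exists(path2)):
        return _normalized(path1) == _normalized(path2)
    try:
        # Step 2: compare real file identity (stand-in for volume/index info).
        return os.path.samefile(path1, path2)
    except OSError:
        # Step 3: fall back to strings, but flag the result as unreliable.
        result = _normalized(path1) == _normalized(path2)
        warnings.warn("file identity unavailable; compared path strings only",
                      RuntimeWarning)
        return result
```

Any error raised by the string fallback itself is simply allowed to propagate, which matches the final "raise an error" step.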

This should allow samefile to be used on win32 in well over 99.9% of
cases with no surprises. For the rest it will either return a result
that is likely correct with a warning of some kind, or it will fail
entirely. It's not perfect, but you can't achieve perfection here. It
would, however, have far fewer surprises than newbies using == to
compare paths for equality. And it would also make os.path.samefile
available on another platform.

os.path.sameopenfile could be implemented perfectly using the
comparison of volume and index information alone (assuming you can get
a win32 handle out of the open file object, which I think you can)
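As a rough illustration of comparing already-open files by identity alone, here is a sketch built on os.fstat. The name same_open_file is made up, and the sketch assumes the platform's os.fstat reports meaningful st_dev and st_ino values for the descriptors involved.

```python
import os

def same_open_file(f1, f2):
    """Hypothetical helper: compare two open file objects by identity."""
    # fileno() gives the OS-level descriptor behind each file object.
    s1 = os.fstat(f1.fileno())
    s2 = os.fstat(f2.fileno())
    # Same device (volume) and same inode (file index) means same file.
    return (s1.st_dev, s1.st_ino) == (s2.st_dev, s2.st_ino)
```

Since the files are already open, no path handling is involved at all, which is why this case is simpler than samefile.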

If someone would be willing to write a patch for the ntpath tests I
would be willing to implement samefile as described above or as agreed
upon in further discussion. Then we can submit it for inclusion in the
stdlib.
 
