how to detect a hard link in Java?

Raymond DeCampo

John said:
I'll make that stronger: a canonical form *must* be unique. The
canonicalization algorithm used by a Java implementation meets this
criterion or it is broken. The issue, though, is the definition of
"unique" in the problem domain. On UNIXen, and perhaps other systems,
whether or not two pathnames refer to the same inode is not a relevant
criterion for this decision, as File objects have nothing whatsoever to
do with inodes, and have only an arm's-length relationship with any file
referred to by the *path* they represent -- if, indeed, there even is
such a file. Canonicalization is relative to the node in a hierarchical
filesystem tree represented by a File, and how or whether that
corresponds to any actual data is not relevant.

Do read the class-level docs of java.io.File if you haven't before. The
class is somewhat misnamed, IMO, and it's important to this discussion
to have a firm grasp of what it actually represents. (As I wrote in a
previous post, that's a path, not the actual file referred to by a path.)

Upon further reflection, I see your point. :)

Ray
 
Monique Y. Mudama

I'll make that stronger: a canonical form *must* be unique. The
canonicalization algorithm used by a Java implementation meets this
criterion or it is broken.

So on unix-like systems, this method should throw an exception?
There's no way for it to return a canonical filename.
 
Hemal Pandya

John said:
I'll make that stronger: a canonical form *must* be unique.

Unique among which set? Of the many (actually infinite number of) path
strings by which a single directory entry can be referred,
getCanonicalPath does return the same unique String for all those
paths.

I suspect getCanonicalPath has its roots in the emacs-lisp function
file-truename, which according to its documentation returns the
canonical name of FILENAME (the parameter), and Java mimics its
behavior on Windows and Linux.
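
To illustrate the point about many path strings sharing one canonical form, here is a minimal sketch. The class name, the helper names, and the scratch files (sub/data.txt in a temp directory) are made up for the example; only java.io.File and java.nio.file.Files are real APIs.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class CanonicalPathDemo {

    // Wraps getCanonicalPath so callers need not handle the checked exception.
    static String canonical(File f) {
        try {
            return f.getCanonicalPath();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Creates a scratch directory containing sub/data.txt (hypothetical names).
    static File makeScratchDir() {
        try {
            Path dir = Files.createTempDirectory("canon-demo");
            Path sub = Files.createDirectory(dir.resolve("sub"));
            Files.createFile(sub.resolve("data.txt"));
            return dir.toFile();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        File dir = makeScratchDir();

        // Three different spellings that lead to the same directory entry.
        File plain = new File(dir, "sub/data.txt");
        File dotted = new File(dir, "sub/./data.txt");
        File dotDot = new File(dir, "sub/../sub/data.txt");

        // The spellings as given, followed by their canonical forms.
        System.out.println(plain.getPath() + " -> " + canonical(plain));
        System.out.println(dotted.getPath() + " -> " + canonical(dotted));
        System.out.println(dotDot.getPath() + " -> " + canonical(dotDot));

        // All three spellings collapse to one canonical string.
        System.out.println(canonical(plain).equals(canonical(dotted)));
        System.out.println(canonical(plain).equals(canonical(dotDot)));
    }
}
```

The same loop could be extended with arbitrarily many "./" segments, which is where the "infinite number of" spellings comes from.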
 
John C. Bollinger

Monique said:
So on unix-like systems, this method should throw an exception?
There's no way for it to return a canonical filename.

Read my post again. Follow my advice and make sure you understand
the abstraction that java.io.File represents. Once you have, it should
be clear why two unequal, canonical java.io.Files may refer to the same
underlying file data, on Unix or in principle on any other system.
Hint: a java.io.File doesn't actually represent a file.
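
A minimal sketch of that situation, and one possible way to approach the thread's original question: two hard-linked names keep distinct canonical paths, while java.nio.file.Files.isSameFile can report that they locate the same file. The class, the helper names, and the file names original.txt and alias.txt are invented for the example. It assumes Java 7 or later and a filesystem that supports hard links; Files.createLink may throw UnsupportedOperationException otherwise.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class HardLinkDemo {

    // Canonical form of the pathname, via the java.io.File API under discussion.
    static String canonical(Path p) {
        try {
            return p.toFile().getCanonicalPath();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // True when both paths locate the same underlying file.
    static boolean sameUnderlyingFile(Path a, Path b) {
        try {
            return Files.isSameFile(a, b);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Creates original.txt plus a hard link alias.txt in a fresh temp directory.
    // Returns { original, alias }. Both file names are hypothetical.
    static Path[] makeLinkedPair() {
        try {
            Path dir = Files.createTempDirectory("hardlink-demo");
            Path original = Files.createFile(dir.resolve("original.txt"));
            Path alias = Files.createLink(dir.resolve("alias.txt"), original);
            return new Path[] { original, alias };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path[] pair = makeLinkedPair();

        // Two names, two canonical forms: canonicalization does not follow hard links.
        System.out.println(canonical(pair[0]));
        System.out.println(canonical(pair[1]));

        // The names nevertheless lead to the same file data.
        System.out.println(sameUnderlyingFile(pair[0], pair[1]));
    }
}
```

Note that isSameFile only answers the pairwise question "do these two names reach the same file?". It does not by itself tell you whether a single given name has other hard links elsewhere.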
 
Monique Y. Mudama

Read my post again. Follow my advice and make sure you
understand the abstraction that java.io.File represents. Once you
have, it should be clear why two unequal, canonical java.io.Files
may refer to the same underlying file data, on Unix or in principle
on any other system. Hint: a java.io.File doesn't actually
represent a file.

No need to be snippy. I understand all of that, which is why it
seems to me that the term canonical doesn't make sense for filenames
on Unix.
 
Larry Barowski

Monique Y. Mudama said:
No need to be snippy. I understand all of that, which is why it
seems to me that the term canonical doesn't make sense for filenames
on Unix.

"Canonical" makes perfect sense for filenames in UNIX.
Multiple filenames may be used to refer to the same
filesystem node. Multiple filesystem nodes may refer
to the same file, directory, named pipe, or device, but
that is irrelevant.
 
John C. Bollinger

Hemal said:
Unique among which set? Of the many (actually infinite number of) path
strings by which a single directory entry can be referred,
getCanonicalPath does return the same unique String for all those
paths.

This is indeed what I mean. The File.equals(Object) method establishes
the equivalence classes from which canonical members are chosen. This
seems to me the only viable context for the File class' internal
canonicalization, but perhaps you see another?
 
Bent C Dalager

No need to be snippy. I understand all of that, which is why it
seems to me that the term canonical doesn't make sense for filenames
on Unix.

It works for filenames, just not for files.

A filename is a path through the directory structure that ends up with
an entry that points to some file. You stop there, you don't consider
which file it is that is being pointed to. In this case, there can be
a canonical filename. All sorts of clever symlinked directories,
NFS-mounts of local drives and whatever will resolve to one and only
one "real" canonical path to the filename.

Cheers
Bent D
 
Chris Uppal

Bent said:
It works for filenames, just not for files.

A filename is a path through the directory structure that ends up with
an entry that points to some file. You stop there, you don't consider
which file it is that is being pointed to. In this case, there can be
a canonical filename. All sorts of clever symlinked directories,
NFS-mounts of local drives and whatever will resolve to one and only
one "real" canonical path to the filename.

But that's exactly the point -- all those clever filesystem mounts, etc, have
nothing at all to do with /names/. It is true that two names may happen to
resolve to the same data (in various ways) but that does /not/ mean that they
are equivalent, nor does it mean that one or the other (or both at once) cannot
be canonical. Being canonical is a property of a name -- in itself -- not of
the thing to which it happens to refer.

I don't dispute that an operation that will resolve as much of the clutter as
possible in a name is a good thing to have, not at all, but "canonical" is the
wrong word to use in the name of such a facility. As an example, the JavaDoc
points out that the so-called canonical name of a file that exists may be
different from the canonical form of a filename that doesn't correspond to an
existing file. It's a semantic muddle.

I think that Monique is essentially correct, though I would say that under UNIX
the true canonical form of any pathname is that name itself -- there is no
proper other form. (So I wouldn't throw exceptions from getCanonicalForm(), or
whatever it's called, but it would be a trivial identity function.)

Compare this with the case in Windows where (as I understand it) the
interpretation of '.' and '..' is part of the defined semantics of the
namespace, rather than being a convention followed in the placement of hard
links (as it is in UNIX). On Windows, it is legitimate (even under the
hard-line pedantic interpretation I'm taking here) to remove '.' and '..' from
pathnames. Similarly it is correct to replace '/' with '\', and to map drive
letters to some standard case. As far as I know, there is no equivalent in
Unix.

-- chris
 
Larry Barowski

Chris Uppal said:
But that's exactly the point -- all those clever filesystem mounts, etc, have
nothing at all to do with /names/. It is true that two names may happen to
resolve to the same data (in various ways) but that does /not/ mean that they
are equivalent, nor does it mean that one or the other (or both at once) cannot
be canonical. Being canonical is a property of a name -- in itself -- not of
the thing to which it happens to refer.

True, "canonical" is not the perfect word here. I suspect
the functionality of the method may have evolved
somewhat in the early days of Java. The last paragraph
of the documentation is fairly clear though: "Every
pathname that denotes an existing file or directory has
a unique canonical form." So in UNIX terms, the
purpose is to get a unique, valid pathname for a
particular file system node. If the node exists, then
this is what the method does. It would probably be
a good idea for the method to throw an exception if
the node does not exist.
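
A rough sketch of the difference being discussed: File.getCanonicalPath will typically still return a string for a path where nothing exists, while Path.toRealPath (Java 7 and later) fails with an IOException in that case. The class and the strictCanonicalPath helper are hypothetical names for the example, not part of any library.

```java
import java.io.File;
import java.io.IOException;

public class MissingNodeDemo {

    // Hypothetical helper: a canonical-style path that insists the node exists.
    // toRealPath throws an IOException when nothing exists at the path.
    static String strictCanonicalPath(File f) throws IOException {
        return f.toPath().toRealPath().toString();
    }

    // True when the strict variant rejects the path.
    static boolean strictFails(File f) {
        try {
            strictCanonicalPath(f);
            return false;
        } catch (IOException e) {
            return true;
        }
    }

    // True when plain getCanonicalPath returns without throwing.
    static boolean lenientSucceeds(File f) {
        try {
            f.getCanonicalPath();
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    // A path in the temp directory that is assumed not to exist.
    static File missingFile() {
        return new File(System.getProperty("java.io.tmpdir"),
                "no-such-file-" + System.nanoTime());
    }

    public static void main(String[] args) {
        File missing = missingFile();
        System.out.println("getCanonicalPath succeeds: " + lenientSucceeds(missing));
        System.out.println("strict variant fails:      " + strictFails(missing));
    }
}
```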
 
