Ways to detect if a file has moved?

withtape

Hello all,
I am writing an application that will track data about files. Once the
application becomes aware of a particular file, I would like to add a
feature whereby, even if the nominal pathname has changed, it is still
able to locate the file. I have thought of a few ways one might do
this:

1) Have the program run a thread that constantly watches each known
file. Not only is this inefficient, it also feels dirty; but it would
get the job done in a cross-platform way.
2) Make use of operating system APIs that implement this kind of
behavior. For example, the Windows Shell API provides for "copy hook
handlers" (http://msdn2.microsoft.com/en-us/library/aa969285.aspx),
which was the original inspiration for this idea. Of course, this
negates "write once, run anywhere" but I could accept that if I could
implement it for even a few other systems.
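
For readers on Java 7 or later, `java.nio.file.WatchService` sits between the two options above: it is pure Java, yet it can use native change notification where the platform offers it. The following is only a minimal sketch under that assumption. The class and method names (`DirectoryWatchSketch`, `nextEvents`) are hypothetical. It sees changes only while the program is running, and a move shows up as separate delete and create events rather than as a rename, so it does not by itself say where a file went.

```java
import java.nio.file.*;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;

import static java.nio.file.StandardWatchEventKinds.ENTRY_CREATE;
import static java.nio.file.StandardWatchEventKinds.ENTRY_DELETE;

// Hypothetical sketch: report create/delete events for one watched directory.
class DirectoryWatchSketch {

    /** Waits up to timeoutSeconds for the next batch of events; returns an empty list on timeout. */
    static List<String> nextEvents(WatchService watcher, long timeoutSeconds)
            throws InterruptedException {
        List<String> seen = new ArrayList<>();
        WatchKey key = watcher.poll(timeoutSeconds, TimeUnit.SECONDS);
        if (key == null) {
            return seen; // nothing happened within the timeout
        }
        for (WatchEvent<?> event : key.pollEvents()) {
            // context() is the file name relative to the watched directory
            seen.add(event.kind().name() + " " + event.context());
        }
        key.reset(); // re-arm the key so later events are delivered
        return seen;
    }

    public static void main(String[] args) throws Exception {
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        try (WatchService watcher = dir.getFileSystem().newWatchService()) {
            dir.register(watcher, ENTRY_CREATE, ENTRY_DELETE);
            while (true) {
                for (String event : nextEvents(watcher, 60)) {
                    System.out.println(event);
                }
            }
        }
    }
}
```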

The real question is whether any non-Windows systems provide anything
similar. I had thought of using the file inodes, rather than the
paths, to get such behavior on Linux (and maybe other such systems),
but I don't think that's the right direction. As I understand it,
inodes are guaranteed unique only within a particular filesystem; so
if the file moves to a different partition, its inode would change.
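
To make the inode idea concrete: Java 7 and later expose a comparable identifier through `BasicFileAttributes.fileKey()`. The sketch below assumes a Unix-like system, where the key is typically derived from the device and inode numbers. The names `FileKeyDemo`, `fileKeyOf` and `sameFile` are made up for illustration. The key can be `null` on platforms that do not provide one, and it survives only a rename within the same filesystem, which is exactly the limitation described above.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;

// Hypothetical helper, not part of any existing application.
class FileKeyDemo {

    /** Returns the platform's file key, or null if the platform does not supply one. */
    static Object fileKeyOf(Path path) throws IOException {
        return Files.readAttributes(path, BasicFileAttributes.class).fileKey();
    }

    /** True only when a key was remembered and the candidate currently has an equal key. */
    static boolean sameFile(Object rememberedKey, Path candidate) throws IOException {
        Object current = fileKeyOf(candidate);
        return rememberedKey != null && rememberedKey.equals(current);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("filekey-demo");
        Path original = Files.createFile(dir.resolve("AFILE.dat"));
        Object key = fileKeyOf(original);

        // Rename within the same directory, hence within the same filesystem.
        Path renamed = Files.move(original, dir.resolve("RENAMED.dat"));

        System.out.println("key available: " + (key != null));
        System.out.println("recognised after rename: " + sameFile(key, renamed));
    }
}
```

A copy to another partition creates a new file with a new key, so this check would report no match in that case.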

My question for the Java-programming audience is whether there are
other ways to achieve this feature, preferably pure-Java ones.

Thanks for your replies,
-Jason Chang
 
John

In a *nx environment, you should start with a baseline of the
directories you want to monitor and use something to compare the
"current" contents against your baseline. If you find something has
changed, take a new baseline and use it until it changes again.
I'm not sure how to do this in the Win* world.
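
The baseline idea can be sketched in plain Java, which would also cover Windows. This is only an illustration: `BaselineSketch`, `snapshot` and `minus` are invented names, the baseline here is just the set of regular-file paths under a root directory, and the five-second polling interval is arbitrary.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical sketch of "take a baseline, compare, re-baseline on change".
class BaselineSketch {

    /** Records the path of every regular file currently under root. */
    static Set<Path> snapshot(Path root) throws IOException {
        try (Stream<Path> paths = Files.walk(root)) {
            return paths.filter(Files::isRegularFile)
                        .collect(Collectors.toCollection(TreeSet::new));
        }
    }

    /** Elements of a that are not in b. */
    static Set<Path> minus(Set<Path> a, Set<Path> b) {
        Set<Path> result = new TreeSet<>(a);
        result.removeAll(b);
        return result;
    }

    public static void main(String[] args) throws Exception {
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        Set<Path> baseline = snapshot(root);
        while (true) {
            Thread.sleep(5000);
            Set<Path> current = snapshot(root);
            if (!current.equals(baseline)) {
                System.out.println("gone:  " + minus(baseline, current));
                System.out.println("added: " + minus(current, baseline));
                baseline = current; // the new state becomes the baseline
            }
        }
    }
}
```

A rename inside the monitored tree appears as one path in "gone" and one in "added"; pairing them up still needs some notion of file identity.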
 
Patricia Shanahan

What is your concept of file identity? Is it identical content? Or
defined in terms of a series of operations? Or is the name involved?

For example, some utilities do their work in a temporary file, and do
rename operations to replace the old file with the new one.

Patricia
 
withtape

Patricia,
Two files would be considered identical if their contents are
identical. Here's one example of what I would like to accomplish:

Suppose we have a file AFILE.dat on the partition mounted at
/mnt/adrive. The program first becomes aware of the file by the user
informing it of /mnt/adrive/AFILE.dat.

Sometime after the program terminates, the user decides to copy
AFILE.dat to some other partition, for example, /other. The Unix path
to the file is now /other/AFILE.dat (although it could just as easily
be /other/RENAMED.dat, as far as the program should care).

The next time the program runs, is there any way to determine that the
file that used to be at /mnt/adrive/AFILE.dat is now at
/other/AFILE.dat?

The Windows API call that I linked seems to offer a way to accomplish
this by hooking custom code into the system's copy command. The
hooked-in code could update the program's data structure with the new
path.
Do other operating systems offer methods like this, let alone ones
that would be easily accessible to Java?

Thanks for your input,
-Jason
 
Eric Sosman

[...]
> Two files would be considered identical if their contents are
> identical.

So, for example, if there are two hundred forty-seven
zero-length files at various places in your file system,
they are all the same file?

> Here's one example of what I would like to accomplish:
>
> Suppose we have a file AFILE.dat on the partition mounted at
> /mnt/adrive. The program first becomes aware of the file by the user
> informing it of /mnt/adrive/AFILE.dat.
>
> Sometime after the program terminates, the user decides to copy
> AFILE.dat to some other partition, for example, /other. The Unix path
> to the file is now /other/AFILE.dat (although it could just as easily
> be /other/RENAMED.dat, as far as the program should care).

You said "copy" (which is what motivated my semi-rhetorical
question above). After the copy, the file system holds two
files with different complete paths but identical content. Why
should your program care? What are you trying to do with these
identical-content copies that requires you to find both -- or
"all" -- of them?

> The next time the program runs, is there any way to determine that the
> file that used to be at /mnt/adrive/AFILE.dat is now at
> /other/AFILE.dat?

... and at /mnt/adrive/AFILE.dat as well, perhaps. I know
of no reasonable[*] way to accomplish this.

[*] You might, at startup, scan every accessible file in the
file system and check for duplications. Computing a checksum for
every file and entering them all in a Map keyed on the checksum
might make this feasible, but it still wouldn't be reasonable.
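
A rough sketch of that footnote, for illustration only: it walks a directory tree and builds a Map from checksum to the files that have it. SHA-256 is an arbitrary choice of checksum, and `ChecksumIndexSketch`, `sha256` and `index` are made-up names. Note that every zero-length file lands under the same key, and that `Files.walk` may fail partway through on unreadable directories.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical sketch: index every regular file under a root by its checksum.
class ChecksumIndexSketch {

    /** Hex-encoded SHA-256 of the file's content, read in chunks. */
    static String sha256(Path file) throws IOException, NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance("SHA-256");
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buffer = new byte[8192];
            int n;
            while ((n = in.read(buffer)) != -1) {
                digest.update(buffer, 0, n);
            }
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    /** Maps each checksum to all files under root that share it. */
    static Map<String, List<Path>> index(Path root)
            throws IOException, NoSuchAlgorithmException {
        List<Path> files;
        try (Stream<Path> paths = Files.walk(root)) {
            files = paths.filter(Files::isRegularFile).collect(Collectors.toList());
        }
        Map<String, List<Path>> byChecksum = new HashMap<>();
        for (Path file : files) {
            byChecksum.computeIfAbsent(sha256(file), k -> new ArrayList<>()).add(file);
        }
        return byChecksum;
    }
}
```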

> The Windows API call that I linked seems to offer a way to accomplish
> this by hooking custom code into the system's copy command. The
> hooked-in code could update the program's data structure with the new
> path.

Windows can invoke these hooks even when your program isn't
running?

> Do other operating systems offer methods like this, let alone ones
> that would be easily accessible to Java?

I've never heard of a file system with this property. It would
sort of vitiate the entire notion of a "backup copy," wouldn't it?
 
Patricia Shanahan

> Patricia,
> Two files would be considered identical if their contents are
> identical. Here's one example of what I would like to accomplish:

Under that definition, I don't think you can depend on tracking copies
and moves. There will be situations in which two files have identical
contents without any local copying relationship.

For example, the same program may be checked out more than once from a
version repository on a different system.

For the identical contents problem I would focus on calculating hashes,
and comparing them. If two files have identical hashes, there is at
least a possibility that they have identical content, and you can
consider either more detailed hashing or direct comparison.
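
One way this staged check might look, as a sketch under assumptions: the program remembered the size and a cheap CRC32 of the known file, uses them to rule candidates out quickly, and falls back to a byte-by-byte comparison only when the original is still available to read. CRC32 is just one possible cheap hash, and `ContentMatchSketch`, `mayMatch` and `bytesEqual` are hypothetical names.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.CRC32;

// Hypothetical sketch: cheap hash first, direct comparison to confirm.
class ContentMatchSketch {

    /** CRC32 of the file's content, read in chunks. */
    static long crc32(Path file) throws IOException {
        CRC32 crc = new CRC32();
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buffer = new byte[8192];
            int n;
            while ((n = in.read(buffer)) != -1) {
                crc.update(buffer, 0, n);
            }
        }
        return crc.getValue();
    }

    /** False means definitely different; true means only "possibly identical". */
    static boolean mayMatch(long knownSize, long knownCrc, Path candidate) throws IOException {
        return Files.size(candidate) == knownSize && crc32(candidate) == knownCrc;
    }

    /** Direct comparison: true only if both files hold exactly the same bytes. */
    static boolean bytesEqual(Path a, Path b) throws IOException {
        try (InputStream x = new BufferedInputStream(Files.newInputStream(a));
             InputStream y = new BufferedInputStream(Files.newInputStream(b))) {
            int bx;
            do {
                bx = x.read();
                if (bx != y.read()) {
                    return false; // differing byte, or one file ended early
                }
            } while (bx != -1);
            return true;
        }
    }
}
```

Checking the size first avoids reading the candidate at all in most non-matching cases.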

Patricia
 
a249

> Sometime after the program terminates, the user decides to copy
> AFILE.dat to some other partition, for example, /other.

That's the user's God-given right and really not your business. Don't
give the user a file of his own if he isn't supposed to do with it
whatever he pleases. Put the data in a database, or put the data in
some file which requires elevated rights to work with, outside of the
user's home. Then give only the application, not the user, the
necessary rights to work with the DB or file.

> The Windows API call that I linked seems to offer a way to accomplish
> this by hooking custom code into the system's copy command.

That scheme will break in hundreds of ways. It will, for example,
break the moment the user uses a tool which doesn't use that particular
copy command. All it takes is one tool that copies the data by
individual read/write operations and deletes the original after the
copy is complete. It will also break if the user decides to put it in
a ZIP archive, delete the original, then unpack the archive in some
other directory. It will break if the user creates a backup,
deletes the original, and then restores the backup in some other
directory.
 
