Is it possible to detect if files on a drive were changed without scanning the drive?

T

Tom Anderson

It is maybe not a pure Python question, but I think it is the right
newsgroup to ask for help, anyway.

You might try comp.arch.storage or comp.sys.ibm.pc.hardware.storage, or a
newsgroup specific to the operating system you're working on.
After connecting a drive to the system (via USB or IDE) I would like to
be able to see within seconds if there were changes in the file system
of that drive since last check (250 GB drive with about four million
files on it).

How to accomplish this?

I don't think there's a portable way to do this. I also don't think
there's a way to do this at all on most disks. I think there might be a
way to do this using journalled filesystems, but i'm not certain, it would
definitely involve fiddling with low-level filesystem APIs (or even
on-disk structures), and, UIVMM, these are a minority of disks anyway.

Sorry i couldn't be more helpful!

tom
 
A

Alessandro Bottoni

Claudio said:
After connecting a drive to the system (via USB
or IDE) I would like to be able to see within seconds
if there were changes in the file system of that drive
since last check (250 GB drive with about four million
files on it).

How to accomplish this? (best if providing
directly a Python receipe for it :)
Do available file systems have something like
archive attribute assigned to the root directory
of the drive?
I suppose not. Am I right?

On Linux there is the FAM (File Alteration Module) for this, as long as I
know. Maybe Python has a wrapper/binding for it.
I ask this question having Microsoft Windows 2000
and Windows proprietary NTFS file system in mind,
but I am also interested to know it about Linux or
Unix file systems.

As long as I know, on Windows there are a few specific "hooks" to perform
such a task. They are provided by the MS API for the NTFS/HPFS file
systems. I do not think Python implements anything so "low level", anyway.
Check the docu to be sure.
I know, that looking for the archive attribute of the
top directories doesn't help when the change
happened to files somewhere deeper in the
hierarchy of directories.

Right. It does not help.

Consider this: if are accessing a network file system, you can intercepts
the calls to the virtualization layer (NFS or NetBIOS). Most likely, Python
can support you in performing this task.

HTH
 
C

Claudio Grondi

It is maybe not a pure Python question, but I think
it is the right newsgroup to ask for help, anyway.

After connecting a drive to the system (via USB
or IDE) I would like to be able to see within seconds
if there were changes in the file system of that drive
since last check (250 GB drive with about four million
files on it).

How to accomplish this? (best if providing
directly a Python receipe for it :)
Do available file systems have something like
archive attribute assigned to the root directory
of the drive?
I suppose not. Am I right?

I ask this question having Microsoft Windows 2000
and Windows proprietary NTFS file system in mind,
but I am also interested to know it about Linux or
Unix file systems.

I know, that looking for the archive attribute of the
top directories doesn't help when the change
happened to files somewhere deeper in the
hierarchy of directories.

Any hints are welcome.

Claudio
 
O

Oren Tirosh

After connecting a drive to the system (via USB
or IDE) I would like to be able to see within seconds
if there were changes in the file system of that drive
since last check (250 GB drive with about four million
files on it).

Whenever a file is modified the last modification time of the directory
containing it is also set. I'm not sure if the root directory itself
has a last modification time field but you can just store and compared
the last mod time of all subdirectories directly under the root
directory.

If you trust the clock of all machines mounting this drives is set
correctly (including time zone) you can store just a single timestamp
and compare for files or directories modified after that time.
Otherwise you will need to store and compare for any changes, not just
going forward.

Oren
 
G

Grant Edwards

Whenever a file is modified the last modification time of the directory
containing it is also set.

Nope.

$ ls -ld --time-style=full-iso .
drwxr-xr-x 2 grante grante 4096 2005-09-12 12:38:04.749815352 -0500 ./

$ touch asdf

$ ls -ld --time-style=full-iso .
drwxr-xr-x 2 grante grante 4096 2005-09-12 12:39:35.657995208 -0500 ./

$ echo "hi" >asdf

$ ls -ld --time-style=full-iso .
drwxr-xr-x 2 grante grante 4096 2005-09-12 12:39:35.657995208 -0500 ./

$ echo "foo" >asdf

$ ls -ld --time-style=full-iso .
drwxr-xr-x 2 grante grante 4096 2005-09-12 12:39:35.657995208 -0500 ./

Notice that writing to the file did not change the modification
date of the directory contining it.
 
S

sven

hi list,
i'd like to define __repr__ in a class to return the standardrepr
a la "<__main__.A instance at 0x015B3DA0>"
plus additional information.
how would i have to do that?
how to get the standardrepr after i've defined __repr__?

sven.
 
C

Claudio Grondi

Alessandro Bottoni said:
On Linux there is the FAM (File Alteration Module) for this, as long as I
know. Maybe Python has a wrapper/binding for it.


As long as I know, on Windows there are a few specific "hooks" to perform
such a task. They are provided by the MS API for the NTFS/HPFS file
systems. I do not think Python implements anything so "low level", anyway.
Check the docu to be sure.


Right. It does not help.

Consider this: if are accessing a network file system, you can intercepts
the calls to the virtualization layer (NFS or NetBIOS). Most likely, Python
can support you in performing this task.

HTH

Thank you for your response.

To be more clear I should maybe add, that I have not to do with
drives beeing altered while the system is running. The drives content
can be altered e.g. by the computer of a friend who I gave the drive
out to.
I tell it here, because it seems, that the answer is biased
towards detecting changes done to files on a drive while
running on a system able to monitor the drives activity.

Claudio
 
S

Steve Holden

sven said:
hi list,
i'd like to define __repr__ in a class to return the standardrepr
a la "<__main__.A instance at 0x015B3DA0>"
plus additional information.
how would i have to do that?
how to get the standardrepr after i've defined __repr__?

sven.
It's relatively easy for new-style (type-based) classes:
... def __repr__(self):
... return "Extra stuff then\n%s" % super(C, self).__repr__()
...
Not immediately clear how to extend that to old-style objects since they
aren't as cooperative in making their superclass as readily available.

Unless your needs are specifically for old-style classes I'd suggest
using new-style classes.

regards
Steve
 
G

Gerrit Holl

sven said:
i'd like to define __repr__ in a class to return the standardrepr
a la "<__main__.A instance at 0x015B3DA0>"
plus additional information.
how would i have to do that?
how to get the standardrepr after i've defined __repr__?
'<int object at 0x94f7d8c>'

Gerrit.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,744
Messages
2,569,482
Members
44,901
Latest member
Noble71S45

Latest Threads

Top