Search for AVI file?

John Doe

I realise that this is not really a python question, but python's the
only language I'd be comfortable trying to deal with this.

What I need is to search a drive and find all the AVI format files
that are NOT listed with the AVI extension. I'm looking over an old
drive of mine from an old computer. I know the files were renamed
with the wrong extension, but I know that they were originally AVI
files. Can python do this for me? Any hints? Anybody have a link to
something that would already do this? I appreciate any help.
 
Peter Kleiweg

John Doe wrote:
I realise that this is not really a python question, but python's the
only language I'd be comfortable trying to deal with this.

What I need is to search a drive and find all the AVI format files
that are NOT listed with the AVI extension. I'm looking over an old
drive of mine from an old computer. I know the files were renamed
with the wrong extension, but I know that they were originally AVI
files. Can python do this for me? Any hints? Anybody have a link to
something that would already do this? I appreciate any help.

Use walk() to find all files. Open each file and read in the
first 12 bytes. The last four of those 12 bytes should be
'AVI ', if I'm not mistaken.
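
A minimal sketch of that suggestion, assuming Python 3 and assuming that the four bytes at offsets 8 to 11 are enough to identify the files in question. The function name find_avi_by_signature is made up for illustration; only os.walk() and plain file reads are used.

```python
import os

def find_avi_by_signature(root):
    """Yield paths under root whose bytes 8..11 are b'AVI '."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.isfile(path):
                continue  # skip FIFOs, sockets, broken symlinks
            try:
                with open(path, "rb") as f:  # binary mode: we compare raw bytes
                    header = f.read(12)
            except OSError:
                continue  # unreadable file; skip it
            if header[8:12] == b"AVI ":
                yield path
```

Calling list(find_avi_by_signature("/some/mount/point")) would collect the matches; the path here is only a placeholder.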
 
Peter Hansen

John said:
I realise that this is not really a python question, but python's the
only language I'd be comfortable trying to deal with this.

What I need is to search a drive and find all the AVI format files
that are NOT listed with the AVI extension. I'm looking over an old
drive of mine from an old computer. I know the files were renamed
with the wrong extension, but I know that they were originally AVI
files. Can python do this for me? Any hints? Anybody have a link to
something that would already do this? I appreciate any help.

Have you looked for and found information about the AVI file
format? Google can help you with that.

You should easily be able to use Python to read the first
X bytes of a given file and check the signature to see if
it's likely an AVI file. I'm sure there are exceptions
and new versions and such things, but if you have only
a bunch of "old" AVI files, it's quite possible they are
all detectable by doing something like checking that
bytes 0 through 3 are 'RIFF' and bytes 8 through 10 are
'AVI' (that info from a few handy sites on the AVI format).

Basically you could just open the file in binary mode, do a .read(12)
and compare the result using slices, e.g. data[0:4] == 'RIFF'
and data[8:11] == 'AVI'.
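
As a rough illustration of the slice comparison, assuming Python 3 (where data read in binary mode is bytes, so the literals need a b prefix). The helper name looks_like_avi is hypothetical, and the check includes the trailing space of 'AVI ' mentioned elsewhere in this thread, so it needs at least 12 bytes.

```python
def looks_like_avi(data):
    """Return True if data (the first 12 or more bytes of a file)
    starts with 'RIFF' and has 'AVI ' at offsets 8..11."""
    return data[0:4] == b"RIFF" and data[8:12] == b"AVI "

# A hand-built header: 'RIFF', four size bytes, then the form type.
avi_header = b"RIFF" + b"\x24\x00\x00\x00" + b"AVI "
wav_header = b"RIFF" + b"\x24\x00\x00\x00" + b"WAVE"

print(looks_like_avi(avi_header))  # True
print(looks_like_avi(wav_header))  # False
print(looks_like_avi(b"RIFF"))     # False: too short, the second slice is empty
```

Slicing past the end of a short read simply gives a shorter result, so truncated files are rejected without any extra length check.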

-Peter
 
Peter Kleiweg

Peter Kleiweg wrote:
Use walk() to find all files. Open each file and read in the
first 12 bytes. The last four of those 12 bytes should be
'AVI ', if I'm not mistaken.

Or from the command line:

find / -type f -exec file '{}' ';' | grep AVI
 
John Lenton

Peter Kleiweg wrote:
Use walk() to find all files. Open each file and read in the
first 12 bytes. The last four of those 12 bytes should be
'AVI ', if I'm not mistaken.

Correct. According to file(1)'s database, there are two types of AVI:

0 string RIFF RIFF (little-endian) data
[...]
>8 string AVI\040 \b, AVI
[...]
0 string RIFX RIFF (big-endian) data
[...]
>8 string AVI\040 \b, AVI

(meaning that little-endian AVIs start with 'RIFF', followed by a
four-byte size and then 'AVI ', whereas big-endian ones start with
'RIFX' in place of 'RIFF').
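
A small sketch of telling the two variants apart, assuming Python 3 and assuming the usual RIFF layout in which bytes 4 to 7 hold a 32-bit size in the container's byte order. The function name riff_avi_info is invented for this example.

```python
import struct

def riff_avi_info(header):
    """Return (byte_order, riff_size) for an AVI header, or None.

    header is expected to be the first 12 bytes of the file.
    """
    if len(header) < 12 or header[8:12] != b"AVI ":
        return None
    magic = header[0:4]
    if magic == b"RIFF":
        order, fmt = "little", "<I"  # little-endian unsigned 32-bit
    elif magic == b"RIFX":
        order, fmt = "big", ">I"     # big-endian unsigned 32-bit
    else:
        return None
    (size,) = struct.unpack(fmt, header[4:8])
    return order, size

print(riff_avi_info(b"RIFF\x10\x00\x00\x00AVI "))  # ('little', 16)
print(riff_avi_info(b"RIFX\x00\x00\x00\x10AVI "))  # ('big', 16)
```

The size value is not needed for detection; it is decoded here only to show where the byte order matters.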

HTH

 
John Lenton

Peter Kleiweg wrote:
Or from the command line:

find / -type f -exec file '{}' ';' | grep AVI

if that's GNU find, you'll find

find / -type f -print0 | xargs -0 file | grep AVI

a lot faster, usually.

 
John Doe

Peter Kleiweg wrote:
Use walk() to find all files. Open each file and read in the
first 12 bytes. The last four of those 12 bytes should be
'AVI ', if I'm not mistaken.

Ah, os.walk() was exactly what I needed. By the time I read your
response, I had found the 'AVI' signature, but did not know an easy
way to get to all those files.

Thank You.
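
For anyone landing on this thread later, here is one way the pieces discussed above might fit together for the original question: files that carry an AVI signature but lack the .avi extension. It is only a sketch, assuming Python 3; the name find_misnamed_avi is hypothetical, and it accepts both the 'RIFF' and 'RIFX' prefixes.

```python
import os

def find_misnamed_avi(root):
    """Return sorted paths of files with an AVI signature but no .avi extension."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if os.path.splitext(name)[1].lower() == ".avi":
                continue  # already named correctly
            path = os.path.join(dirpath, name)
            if not os.path.isfile(path):
                continue  # skip special files and broken symlinks
            try:
                with open(path, "rb") as f:
                    header = f.read(12)
            except OSError:
                continue  # unreadable; skip it
            if header[0:4] in (b"RIFF", b"RIFX") and header[8:12] == b"AVI ":
                hits.append(path)
    return hits if not hits else sorted(hits)
```

The function only reports candidates. Renaming is left to the caller, since a signature match is a strong hint rather than proof that a file is a playable AVI.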
 
