Scanning directories for new files?

Matty Sarro

Hey everyone.
I'm in the midst of writing a parser to clean up incoming files,
remove extra data that isn't needed, normalize some values, etc. The
base files will be uploaded via FTP.
How does one go about scanning a directory for new files? For now
we're looking to run it as a cron job but eventually would like to
move away from that into making it a service running in the
background.
 
Jon Clements

Not a direct answer, but I would have the FTP server itself notify you
when a new file has been added. For instance, from
http://www.pureftpd.org/project/pure-ftpd:

"Any external shell script can be called after a successful upload.
Virus scanners and database archival can easily be set up."

Of course, there are plenty of other servers that I'm sure have
callback events or similar.
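
As a rough sketch of what such a callback could look like in Python: this assumes the server invokes the script with the uploaded file's path as its first argument (which is how I understand pure-ftpd's pure-uploadscript helper to behave, but check your own server's documentation). `clean_file` and `main` are hypothetical names, and the "cleaning" here is only a stand-in for the real parser.

```python
import sys
from pathlib import Path


def clean_file(path):
    # Hypothetical placeholder for the real parser: it only strips
    # trailing whitespace from every line.
    lines = path.read_text().splitlines()
    return "".join(line.rstrip() + "\n" for line in lines)


def main(argv):
    # Assumption: the FTP server passes the uploaded file's path as argv[1].
    if len(argv) != 2:
        print("usage: on_upload.py UPLOADED_FILE", file=sys.stderr)
        return 2
    src = Path(argv[1])
    if not src.is_file():
        print(f"not a regular file: {src}", file=sys.stderr)
        return 1
    # Write the cleaned copy next to the original, with an extra suffix.
    dst = src.with_name(src.name + ".clean")
    dst.write_text(clean_file(src))
    return 0
```

Saved as a standalone script, it would finish by calling sys.exit(main(sys.argv)) so the server sees the exit status.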

Although, yes, monitoring the file system is entirely possible.
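
For the cron-job version, a minimal polling sketch might look like the following. It assumes a plain-text state file, kept outside the scanned directory, that lists the names seen on the previous run; `scan_for_new`, `incoming_dir` and `seen_file` are illustrative names, not part of any library.

```python
import os


def scan_for_new(incoming_dir, seen_file):
    """Return names of regular files in incoming_dir not seen on the last run."""
    # Load the names recorded by the previous run (one per line).
    try:
        with open(seen_file) as fh:
            seen = set(fh.read().splitlines())
    except FileNotFoundError:
        seen = set()

    # os.scandir yields directory entries; subdirectories are skipped.
    with os.scandir(incoming_dir) as entries:
        current = {e.name for e in entries if e.is_file()}

    new = sorted(current - seen)

    # Record only files that still exist, so the state file cannot grow forever.
    with open(seen_file, "w") as fh:
        for name in sorted(current):
            fh.write(name + "\n")
    return new
```

Each cron run would call scan_for_new() once and hand the returned names to the parser. Note that this cannot tell a finished upload from one still in progress.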

hth

Jon.
 
Martin Gregorie

Make sure each file is initially uploaded under a name that the parser
isn't looking for, and is renamed when the upload is finished. This way the
parser won't try to process a partially uploaded file.
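
If the uploading side happens to be a Python script, the idea could be sketched with the standard ftplib module as below. The ".part" suffix and the function name `upload_then_rename` are assumptions made for illustration; any naming scheme the parser ignores would do.

```python
def upload_then_rename(ftp, local_path, final_name, tmp_suffix=".part"):
    """Upload under a temporary name, then rename once the transfer is done."""
    tmp_name = final_name + tmp_suffix
    with open(local_path, "rb") as fh:
        # storbinary returns only after the whole file has been sent.
        ftp.storbinary("STOR " + tmp_name, fh)
    # The parser ignores "*.part", so it only ever sees complete files.
    ftp.rename(tmp_name, final_name)


# Usage (host and credentials are placeholders):
#   from ftplib import FTP
#   with FTP("ftp.example.com") as ftp:
#       ftp.login("user", "password")
#       upload_then_rename(ftp, "report.csv", "report.csv")
```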

If you are uploading to a *nix machine, the rename can move the file
between directories, provided both directories are in the same file
system. Under those conditions rename is always an atomic operation with
no copying involved. This would allow you to, say, upload the file to
"temp/myfile" and rename it to "uploaded/myfile", with your parser only
scanning the uploaded directory and, presumably, renaming processed files
to move them to a third directory ready for further processing.
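
On the parser side, the scan-and-move step could be sketched roughly like this. The function name `process_uploaded`, the directory arguments and the `handle` callback are all made up for the example; the only real requirement is that both directories sit on the same file system.

```python
import os


def process_uploaded(uploaded_dir, done_dir, handle):
    """Run handle(path) on each file in uploaded_dir, then move it to done_dir."""
    moved = []
    for name in sorted(os.listdir(uploaded_dir)):
        src = os.path.join(uploaded_dir, name)
        if not os.path.isfile(src):
            continue  # ignore subdirectories
        handle(src)
        # os.replace is an atomic rename on POSIX when both paths are on the
        # same file system; across file systems it may fail with OSError.
        os.replace(src, os.path.join(done_dir, name))
        moved.append(name)
    return moved
```

Because the file is only moved after handle() returns, a crash part-way through leaves it in the uploaded directory to be picked up on the next run.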

I've used this technique reliably with files arriving via FTP at quite
high rates.
 
