Testing whether given file is open?


Charles Packer

A large data file is being pushed to us via FTP. We can assume that as
long as the file is open, the FTP transfer hasn't completed. How do we
test whether the file is still open? We know about fuser and lsof, but
we (my colleague, at least, who has to code the Perl) would prefer
using a purely Perlish approach. Is there a Perl module out there
somewhere that performs such a test?
 

Bill

Charles said:
A large data file is being pushed to us via FTP. We can assume that as
long as the file is open, the FTP transfer hasn't completed. How do we
test whether the file is still open? We know about fuser and lsof, but
we (my colleague, at least, who has to code the Perl) would prefer
using a purely Perlish approach. Is there a Perl module out there
somewhere that performs such a test?

Depends a _lot_ on the FTP program, but try to see if the file can be
opened for appending, which _should_ fail until it is closed, I believe, e.g.:

while ( ! open(FTPF, '>>', $name) ) {
    print "still downloading\n";
    sleep 10;
}
close FTPF;
 

Tassilo v. Parseval

Also sprach Charles Packer:
A large data file is being pushed to us via FTP. We can assume that as
long as the file is open, the FTP transfer hasn't completed. How do we
test whether the file is still open? We know about fuser and lsof, but
we (my colleague, at least, who has to code the Perl) would prefer
using a purely Perlish approach. Is there a Perl module out there
somewhere that performs such a test?

There is Linux::Fuser. You can't have done a very thorough search if you
didn't find it.
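
For anyone who wants to see roughly what such a check involves without installing a module, here is a minimal pure-Perl sketch. It assumes a Linux /proc filesystem, and the function name pids_with_file_open is made up for illustration; it is not the Linux::Fuser API. Note that the fd directories of other users' processes are unreadable unless you run as root, so an FTP daemon running under a different account may not show up.

```perl
use strict;
use warnings;

# Return the PIDs of processes that hold $path open, judged by the
# /proc/<pid>/fd symlinks. Linux-only (assumption); processes whose fd
# directory cannot be read are silently skipped.
sub pids_with_file_open {
    my ($path) = @_;
    my @want = stat $path or return;       # device + inode identify the file
    my @pids;
    for my $fd_dir (glob '/proc/[0-9]*/fd') {
        opendir my $dh, $fd_dir or next;   # no permission, or process exited
        my ($pid) = $fd_dir =~ m{^/proc/(\d+)/fd$};
        for my $fd (grep { /^\d+$/ } readdir $dh) {
            my @st = stat "$fd_dir/$fd" or next;   # stat follows the symlink
            if ($st[0] == $want[0] && $st[1] == $want[1]) {
                push @pids, $pid;
                last;
            }
        }
        closedir $dh;
    }
    return @pids;
}
```

Usage would be something like: print "still open\n" if pids_with_file_open('/incoming/file');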

Other than that, there might be other ways. If the FTP server is locking
the file (and it really should do so), then you could try acquiring a
lock from Perl and see whether you get one. If you do, the file is not
open any longer:

use Fcntl qw/:flock/;
...
open my $fh, '<', "/incoming/file" or die $!;
# Non-blocking shared lock: fails at once if a writer holds an exclusive lock
if (flock $fh, LOCK_SH|LOCK_NB) {
    print "/incoming/file no longer open\n";
    flock $fh, LOCK_UN;
    ...
}

If the FTP server does not lock the incoming file, you're out of luck
with this approach. Also see the particular pitfalls as described in
'perldoc -f flock'.

Tassilo
 

Tassilo v. Parseval

Also sprach Bill:
Depends a _lot_ on the FTP program, but try to see if the file can be
opened for appending, which _should_ fail until it is closed, I believe, e.g.:

while ( ! open(FTPF, '>>', $name) ) {
    print "still downloading\n";
    sleep 10;
}
close FTPF;

On many (most?) systems you can happily open a file for appending while
it is still written to by another process. So I doubt that your
suggestion would work.

Tassilo
 

chris-usenet

Charles Packer said:
A large data file is being pushed to us via FTP. We can assume that as
long as the file is open, the FTP transfer hasn't completed.

No you can't. A network outage halfway through the transfer will
eventually cause the FTP daemon to abort the connection and implicitly
close the file. Most will not delete an aborted and only partially
transferred file.

There are only two meaningful ways of asynchronously determining whether
a file has been transferred in its entirety.

(a) You transfer the file with a temporary file name (e.g. suffix
".tmp" to its proper name) and when the transfer has completed,
the sender renames it. Recipients must be coded to ignore all
files ending with ".tmp".

(b) The sender transfers a zero length marker file when the real
data file has been successfully transferred. Recipients must be
coded to ignore the data file until the marker appears.

In both cases you need a cleanup process to delete stale ".tmp" files.
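
A minimal receiver-side sketch of convention (a) plus the cleanup step, in Perl. The function names and the age threshold are made up for illustration, and it assumes the sender really does rename the file only after a successful upload.

```perl
use strict;
use warnings;

# Files in $dir that are safe to process under convention (a):
# plain files whose name does not end in ".tmp".
sub completed_files {
    my ($dir) = @_;
    opendir my $dh, $dir or die "Cannot open $dir: $!";
    my @done = grep { -f "$dir/$_" && !/\.tmp\z/ } readdir $dh;
    closedir $dh;
    return sort @done;
}

# Delete ".tmp" files not modified for more than $max_age seconds;
# these are presumed to be leftovers from aborted transfers.
# Returns the paths that were removed.
sub remove_stale_tmp {
    my ($dir, $max_age) = @_;
    opendir my $dh, $dir or die "Cannot open $dir: $!";
    my @tmp = grep { /\.tmp\z/ } readdir $dh;
    closedir $dh;

    my $now = time;
    my @removed;
    for my $name (@tmp) {
        my $file  = "$dir/$name";
        my $mtime = (stat $file)[9];
        next unless defined $mtime && $now - $mtime > $max_age;
        push @removed, $file if unlink $file;
    }
    return @removed;
}
```

The threshold has to be comfortably longer than the slowest legitimate transfer, otherwise the cleanup will delete an upload that is still in progress.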
Chris
 

Richard Morse

No you can't. A network outage halfway through the transfer will
eventually cause the FTP daemon to abort the connection and implicitly
close the file. Most will not delete an aborted and only partially
transferred file.

There are only two meaningful ways of asynchronously determining whether
a file has been transferred in its entirety.

(a) You transfer the file with a temporary file name (e.g. suffix
".tmp" to its proper name) and when the transfer has completed,
the sender renames it. Recipients must be coded to ignore all
files ending with ".tmp".

(b) The sender transfers a zero length marker file when the real
data file has been successfully transferred. Recipients must be
coded to ignore the data file until the marker appears.

In both cases you need a cleanup process to delete stale ".tmp" files.
Chris

One other option is to have the upload client send the size of the file
first, and then send the file. When the uploaded file is the same as
the sent size, it's been entirely transferred.

Ricky
 

James Willmore

A large data file is being pushed to us via FTP. We can assume that as
long as the file is open, the FTP transfer hasn't completed. How do we
test whether the file is still open? We know about fuser and lsof, but
we (my colleague, at least, who has to code the Perl) would prefer
using a purely Perlish approach. Is there a Perl module out there
somewhere that performs such a test?

I get the sense that, on the server side, you need to do something with
the file when the transfer is complete, right? If so, read on.

When the transfer is completed, you could send an empty file to the server
(like filename.extension.200). When the server goes about its business
and sees the file with the extension 200, it knows the file transfer is
complete and can remove the files with the 200 extension. I suggest
this because secure FTP sites won't allow you to remove files, just send
them (with good reason - you don't want someone to remove a file that was
just uploaded whenever they want :) ). This also helps with
the network connection being dropped during transfer, because no file with
the 200 extension will be sent unless the transfer has completed. The worst
case is that the file needs to be resent, which is something you will most
likely do anyway in the case of a network outage :)
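
The server side of that could look roughly like the sketch below. The function name is hypothetical, and it assumes the marker is named exactly like the data file with ".200" appended and lands in the same directory.

```perl
use strict;
use warnings;

# Return the data files whose companion marker "<name>.200" has arrived,
# removing each marker as it is consumed. A marker with no matching
# data file is left alone.
sub claim_marked_files {
    my ($dir) = @_;
    opendir my $dh, $dir or die "Cannot open $dir: $!";
    my @markers = grep { /\.200\z/ } readdir $dh;
    closedir $dh;

    my @ready;
    for my $marker (sort @markers) {
        (my $data = $marker) =~ s/\.200\z//;   # strip the marker suffix
        next unless -f "$dir/$data";
        unlink "$dir/$marker" or die "Cannot remove $dir/$marker: $!";
        push @ready, $data;
    }
    return @ready;
}
```

Data files that never get a marker simply stay unclaimed, so an interrupted upload is ignored until it is resent.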

I'm sure there's issues with this method I haven't thought of, but it
might fit the bill for what you're doing.

HTH

--
Jim

Copyright notice: all code written by the author in this post is
released under the GPL. http://www.gnu.org/licenses/gpl.txt
for more information.

a fortune quote ...
A lack of leadership is no substitute for inaction.
 

lostriver

One other option is to have the upload client send the size of the file
first, and then send the file. When the uploaded file is the same as
the sent size, it's been entirely transferred.


You can also check the size of the file, go sleep()
for a while and check the size again. If sleep()
was long enough and size hasn't increased, you can assume
that the transfer is complete....
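
A rough sketch of that in Perl; the function name and parameters are made up for illustration. It only tells you the size has stopped changing, which is not proof that the whole file arrived.

```perl
use strict;
use warnings;

# Poll $path every $interval seconds and return its size once the size
# has been identical for $checks consecutive comparisons.
# Returns undef (empty list in list context) if the file disappears.
sub wait_until_size_stable {
    my ($path, $interval, $checks) = @_;
    my ($last, $same) = (-1, 0);
    while ($same < $checks) {
        my $size = (stat $path)[7];
        return unless defined $size;            # file is gone
        $same = $size == $last ? $same + 1 : 0; # reset on any change
        $last = $size;
        sleep $interval if $same < $checks;
    }
    return $last;
}
```

For example, wait_until_size_stable('/incoming/file', 30, 3) returns after the size has stayed the same over three successive 30-second waits.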
 

John W. Krahn

Richard said:
One other option is to have the upload client send the size of the file
first, and then send the file. When the uploaded file is the same as
the sent size, it's been entirely transferred.

There is no guarantee that the size of the file on the server will be
the same as the size of the file on the client.


John
 

Uri Guttman

l> You can also check the size of the file, go sleep()
l> for a while and check the size again. If sleep()
l> was long enough and size hasn't increased, you can assume
l> that the transfer is complete....

that is a very poor solution. what if the transfer dies before it
completes? what if there is a long delay for some reason? how long do
you sleep for? what if you need to know quickly and can't afford the
time to sleep?

polling for file transfer completions is not a good idea in
general. there are other ways that work.

uri
 

chris-usenet

Richard Morse said:
One other option is to have the upload client send the size of the file
first, and then send the file. When the uploaded file is the same as
the sent size, it's been entirely transferred.

How would you know that the file containing the file size had not been
truncated? :)

Chris
 

chris-usenet

lostriver said:
You can also check the size of the file, go sleep()
for a while and check the size again. If sleep()
was long enough and size hasn't increased, you can assume
that the transfer is complete....

No you can't. The transfer could have been interrupted (temporarily
or permanently).

Chris
 

lostriver

> that is a very poor solution. what if the transfer dies before it
> completes?

If the transfer dies (by "dies" I assume you mean termination of the TCP
session), it is complete, especially if the sending process does not
automatically resend the file and does not check for success of the transfer.

> what if there is a long delay for some reason? how long do
> you sleep for?

It depends on how often transfers take place.

> what if you need to know quickly and can't afford the
> time to sleep?

Then you cannot use this.

> polling for file transfer completions is not a good idea in
> general. there are other ways that work.

It is not a great solution, but it will work just fine in some situations.
The OP never defined the frequency and quantity of those transfers.


How would you go about it, Uri?
 

Uri Guttman

l> If transfer dies (by dies I assume you mean termination of TCP
l> session) it is complete. Esp. if sending process does not
l> automatically resend the file and does not check for success of the
l> transfer.

huh? i mean dying in the middle of the transfer. and the whole point is
to detect a proper transfer of the file so your point about not checking
is moot.

l> It depends how often transfer takes place.

l> Then you cannot use this.

right.

l> It is not great solution but it will work just fine in some situations.
l> The OP never defined the frequency and quantity of those transfers.

it is a poor design in general. polling is poor in general. it can have
race conditions unless you do atomic stuff like a rename after
transfer. it means the remote system has to be in a polling loop which
is wasteful when no files are transferred.

l> How would you go about it, Uri?

that is for me to know and you to find out. :)

actually i would use a framework where file transfers were integrated in
and remote notification is possible. then the remote side is just told
when files have been transferred and it can process them at will. no
races, no wasted polling. of course stem (on cpan) is such a beast.

uri
 

lostriver

actually i would use a framework where file transfers were integrated in
and remote notification is possible. then the remote side is just told
when files have been transferred and it can process them at will. no
races, no wasted polling. of course stem (on cpan) is such a beast.

You are trying to give me a solution for a perfect world :)
I am dealing with reality: an external entity pushes some
files over FTP or (better) SSH onto my server. This external entity is run
by a bunch of clueless morons who can't or don't want to make any changes.
Since I am on the receiving end, I can use fuser, sleep and check sizes,
or (if the IP address is known) netstat. There is not much else I can do....
 

Bill

No you can't. The transfer could have been interrupted (temporarily
or permanently).

Back before the internet was big, telephone based networks (Fidonet
and others) were used for file transfer, and modem/telephone based
file mailers had an elaborate system of flag files for this. The flag
file code in these could be adapted for an FTP system. For a surviving
example, see e.g.

http://btxe.sourceforge.net/

Such a setup may be overkill for OP's needs, of course.
 
