FTP Offset larger than file.

Bakes

I am writing a python script that performs an identical function to
the 'tail' unix utility, except that it connects to its files over
FTP, rather than the local hard disk.

I am currently using this Python script to generate a steadily growing
'logfile' of garbage:

import time
for i in range(1, 20000):
    time.sleep(0.2)
    print i
    # Reopen in append mode each time so the line reaches disk on close
    f = open("data1.log", "a")
    f.write('%s: This logfile is being automatically generated to help Bakes test his python ftptail. \n' % i)
    f.close()

and I use this script to actually download it:

import time
import os.path
from ftplib import FTP

# Empty the file
filename = 'data1.log'
file = open(filename, 'w')
file.write('')
file.close()

def handleDownload(block):
    # Callback for retrbinary: append each received block to the local file
    file.write(block)
    print ".",

# Create an instance of the FTP object
# Optionally, you could specify username and password:
# (host, user and password are placeholders to fill in)
ftp = FTP(host, user, password)

directory = '/temp'
ftp.cwd(directory)

file = open(filename, 'a')

for i in range(1, 20000):
    # Resume the download from the current size of the local file
    size = os.path.getsize('data1.log')
    ftp.retrbinary('RETR ' + filename, handleDownload, rest=size)

file.close()

print ftp.close()

Now, my problem is that I get a very strange error. What should be
happening is the script gets the size of the local file before
downloading all of the external file after that offset.

The error I get is:
ftplib.error_temp: 451-Restart offset 24576 is too large for file size
22852.
451 Restart offset reset to 0
which tells me that the local file is larger than the external file,
by about a kilobyte. Certainly, the local file is indeed that size, so
my local script is doing the right things. I do wonder what is going
wrong, can anyone enlighten me?
 
Hrvoje Niksic

Bakes said:
The error I get is:
ftplib.error_temp: 451-Restart offset 24576 is too large for file size
22852.
451 Restart offset reset to 0
which tells me that the local file is larger than the external file,
by about a kilobyte. Certainly, the local file is indeed that size, so
my local script is doing the right things. I do wonder what is going
wrong, can anyone enlighten me?

I'd say you failed to take buffering into account. You write into a
buffered file, yet you use os.path.getsize() to find out the current
file size. If the data is not yet flushed, you keep re-reading the same
stuff from the remote file, and writing it out. Once the buffer is
flushed, your file will contain more data than was retrieved from the
remote side, and eventually this will result in the error you see.

As a quick fix, you can add a file.flush() line after the
file.write(...) line, and the problem should go away.
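
For what it's worth, here is a minimal sketch of that quick fix, written for Python 3. The helper name make_flushing_writer is made up for illustration and is not from the original script; it just wraps the write and the flush in one callback, so that os.path.getsize() sees every block as soon as it has been written.

```python
import os
import tempfile

def make_flushing_writer(f):
    """Return a callback that writes a block to f and flushes immediately.

    Hypothetical helper: the returned function has the shape that
    ftplib's retrbinary() expects for its callback argument.
    """
    def handle_block(block):
        f.write(block)
        f.flush()  # push Python's buffer out so getsize() is up to date
    return handle_block

# Small local demonstration, no FTP server needed.
fd, path = tempfile.mkstemp()
os.close(fd)
try:
    with open(path, "ab") as f:
        handle = make_flushing_writer(f)
        handle(b"first block\n")
        size_after_first = os.path.getsize(path)
        handle(b"second block\n")
        size_after_second = os.path.getsize(path)
finally:
    os.remove(path)

print(size_after_first, size_after_second)
```

In the download loop this would be passed as the callback, in place of handleDownload.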
 
Bakes

Hrvoje Niksic said:
I'd say you failed to take buffering into account. You write into a
buffered file, yet you use os.path.getsize() to find out the current
file size. If the data is not yet flushed, you keep re-reading the same
stuff from the remote file, and writing it out. Once the buffer is
flushed, your file will contain more data than was retrieved from the
remote side, and eventually this will result in the error you see.

As a quick fix, you can add a file.flush() line after the
file.write(...) line, and the problem should go away.

Thank you very much, that worked perfectly.
 
Bakes

Bakes said:
Thank you very much, that worked perfectly.

Actually, no it didn't. That fix works seamlessly in Linux, but gave
the same error in a Windows environment. Is that expected?
 
Hrvoje Niksic

Bakes said:
Actually, no it didn't. That fix works seamlessly in Linux, but gave
the same error in a Windows environment. Is that expected?

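Consider opening the file in binary mode, by passing the 'wb' and 'ab'
modes to open instead of 'w' and 'a' respectively. On Windows, Python
(like other languages) converts '\n' to '\r\n' when writing to a file
opened in text mode, so the local file ends up larger than the data
that was actually received.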
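
To see the size difference on any platform, here is a small Python 3 sketch. It is only an illustration: newline="\r\n" is passed explicitly to mimic the translation that Windows text mode applies by default, and the function name written_size is made up for this example.

```python
import os
import tempfile

def written_size(data, binary):
    """Write data to a temporary file and return the resulting size on disk."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        if binary:
            # Binary mode: bytes go to disk unchanged
            with open(path, "wb") as f:
                f.write(data)
        else:
            # Text mode with newline="\r\n": every '\n' is expanded on write,
            # which is an assumption standing in for the Windows default
            with open(path, "w", newline="\r\n", encoding="ascii") as f:
                f.write(data.decode("ascii"))
        return os.path.getsize(path)
    finally:
        os.remove(path)

data = b"line one\nline two\n"
print(written_size(data, binary=True))   # 18
print(written_size(data, binary=False))  # 20
```

Each newline costs one extra byte in text mode, so a long log file drifts further and further ahead of the remote size.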
 
Dave Angel

Bakes said:
Actually, no it didn't. That fix works seamlessly in Linux, but gave
the same error in a Windows environment. Is that expected?

This is a text file you're transferring. And you didn't specify "wb".
So the Windows size will be larger than the Unix size, since you're
expanding the newline characters.

getsize() is looking at the size after newlines are expanded to 0d0a,
while the remote file, presumably on a Unix system, likely has just 0a.

I think you'd do best just keeping track of the bytes you've written.
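
A rough Python 3 sketch of that idea, under a few assumptions: the names ByteCounter and fetch_new_bytes are invented for illustration, and ftp is expected to be an already connected ftplib.FTP object. The offset passed as rest comes from a running count of received bytes, so neither buffering nor newline translation in the local file can throw it off.

```python
class ByteCounter:
    """Callback object that writes blocks and counts the bytes received."""

    def __init__(self, out):
        self.out = out      # any object with a write() method taking bytes
        self.offset = 0     # total number of bytes received so far

    def __call__(self, block):
        self.out.write(block)
        self.offset += len(block)

def fetch_new_bytes(ftp, remote_name, counter):
    """Download whatever lies beyond counter.offset in the remote file.

    ftp is assumed to be a connected ftplib.FTP instance; retrbinary()
    sends a REST command with the given offset before the RETR.
    """
    ftp.retrbinary("RETR " + remote_name, counter, rest=counter.offset)
```

Usage would be roughly: open the local file once with 'ab', create counter = ByteCounter(local_file), then call fetch_new_bytes(ftp, filename, counter) in the polling loop instead of asking os.path.getsize() each time.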


DaveA
 
Bakes

Dave Angel said:
This is a text file you're transferring. And you didn't specify "wb".
So the Windows size will be larger than the Unix size, since you're
expanding the newline characters.

getsize() is looking at the size after newlines are expanded to 0d0a,
while the remote file, presumably on a Unix system, likely has just 0a.

I think you'd do best just keeping track of the bytes you've written.

DaveA

Thank you very much, that worked perfectly.
 
