convert ftp.retrbinary to file object? - Python language lacks expression?

Robert

I just tried to convert a (huge) ftp.retrbinary run into a
pseudo-file object with a .read(bytes) method, in order not to consume
500MB on a copy operation.

At first I thought it would be easy, as usual with Python, using something
like 'yield' or so.

Yet I didn't manage to do it (without using threads or rewriting
'retrbinary'). Any ideas?

#### I tried a pattern like:
....
def open(self, ftppath, mode='rb'):
    class FTPFile:  #TODO
        ...
        def iter_retr():
            ...
            def callback(blk):
                how-to-yield-from-here-to-iter_retr blk???
            ftp.retrbinary("RETR %s" % relpath, callback)
        def read(self, bytes=-1):
            ...
            self.buf += self.iter.next()
            ...
....
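
The buffering half of this pattern, a read(n) method on top of any iterator of blocks, can be sketched independently of FTP. This is only a minimal sketch, assuming Python 3; the class name BlockReader is hypothetical, and a plain list stands in for whatever eventually delivers the FTP blocks:

```python
class BlockReader:
    """File-like read(n) over an iterator that yields bytes blocks."""

    def __init__(self, blocks):
        self._blocks = iter(blocks)
        self._buf = b""

    def read(self, size=-1):
        if size is None or size < 0:
            # Read everything that is left in the buffer and the source.
            data = self._buf + b"".join(self._blocks)
            self._buf = b""
            return data
        # Pull blocks until the buffer can satisfy the request
        # or the source is exhausted.
        while len(self._buf) < size:
            try:
                self._buf += next(self._blocks)
            except StopIteration:
                break
        data, self._buf = self._buf[:size], self._buf[size:]
        return data


# A list of blocks stands in for the FTP transfer here.
reader = BlockReader([b"abc", b"defg", b"h"])
first = reader.read(5)   # spans two blocks
rest = reader.read()     # drains the remainder
```

This leaves open the harder half of the question, namely how to obtain such an iterator from a callback-driven retrbinary.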
 
Martin Franklin

Robert said:
I just tried to convert a (huge) ftp.retrbinary run into a
pseudo-file object with a .read(bytes) method, in order not to consume
500MB on a copy operation.

[snip]


Hmmmm this is nearly there I think...:

import ftplib

class TransferAbort(Exception): pass

class FTPFile:
    def __init__(self, server, filename):
        self.server = server
        self.filename = filename
        self.offset = 0

    def callback(self, data):
        # remember where the next RETR has to restart
        self.offset = self.offset + len(data)
        self.data = data
        ## now quit the RETR command?
        raise TransferAbort("stop right now")

    def read(self, amount):
        self.ftp = ftplib.FTP(self.server)
        self.ftp.login()
        try:
            self.ftp.retrbinary("RETR %s" % self.filename, self.callback,
                                blocksize=amount,
                                rest=self.offset)
        except TransferAbort:
            return self.data


f = FTPFile("HOSTNAME", "FILENAME")

print(f.read(24))
print(f.read(24))

I open the ftp connection inside the read method as it caused an error
(on the second call to read) when I opened it in __init__ ???

HTH
Martin
 
Martin Franklin

Martin said:
Robert said:
I just tried to convert a (huge) ftp.retrbinary run into a
pseudo-file object with a .read(bytes) method, in order not to consume
500MB on a copy operation.
[snip]



Hmmmm this is nearly there I think...:

whoops... spoke too soon..
import ftplib

class TransferAbort(Exception): pass

class FTPFile:
    def __init__(self, server, filename):
        self.server = server
        self.filename = filename
        self.offset = 0

    def callback(self, data):
        self.offset = self.offset + len(data)
        self.data = data
        ## now quit the RETR command?
        raise TransferAbort("stop right now")

    def read(self, amount):
        self.ftp = ftplib.FTP(self.server)
        self.ftp.login()


I needed to insert a time.sleep(0.1) here as the connections were
falling over themselves - I guess testing with a blocksize of 24
is a little silly.

        try:
            self.ftp.retrbinary("RETR %s" % self.filename, self.callback,
                                blocksize=amount,
                                rest=self.offset)
        except TransferAbort:
            return self.data


f = FTPFile("HOSTNAME", "FILENAME")

print(f.read(24))
print(f.read(24))

## new test...

f = FTPFile("HOSTNAME", "FILENAME")

while 1:
    data = f.read(24)
    if not data:
        break
    print(data)
 
Martin Franklin

Martin said:
Martin said:
Robert said:
I just tried to convert a (huge) ftp.retrbinary run into a
pseudo-file object with a .read(bytes) method, in order not to consume
500MB on a copy operation.

[snip]



Hmmmm this is nearly there I think...:


whoops... spoke too soon..


Trigger happy this morning...
I needed to insert a time.sleep(0.1) here as the connections were
falling over themselves - I guess testing with a blocksize of 24
is a little silly.

also need to close the ftp connection here!

self.ftp.close()
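
Pulling the pieces from these posts together, the restart-per-read idea might look roughly like the following. This is a sketch only, assuming Python 3: the class name RestartingFTPFile and the connect parameter (any callable returning a logged-in ftplib.FTP-like object) are hypothetical, and it assumes the server accepts a REST offset equal to the file size at end of file.

```python
class TransferAbort(Exception):
    """Raised inside the callback to stop retrbinary after one block."""


class RestartingFTPFile:
    def __init__(self, connect, filename):
        self.connect = connect      # hypothetical: returns a logged-in FTP object
        self.filename = filename
        self.offset = 0

    def _callback(self, data):
        self._data = data
        raise TransferAbort("stop right now")

    def read(self, amount):
        self._data = b""            # stays empty if no block arrives (EOF)
        ftp = self.connect()
        try:
            ftp.retrbinary("RETR %s" % self.filename, self._callback,
                           blocksize=amount, rest=self.offset)
        except TransferAbort:
            pass
        finally:
            ftp.close()             # always drop the connection again
        self.offset += len(self._data)
        return self._data


# Possible use against a real server (HOSTNAME and FILENAME are placeholders):
#
#   import ftplib
#
#   def connect():
#       ftp = ftplib.FTP("HOSTNAME")
#       ftp.login()
#       return ftp
#
#   f = RestartingFTPFile(connect, "FILENAME")
#   data = f.read(8192)
```

Each read() still costs a full connect, login and RETR, so this only makes sense with large block sizes.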
 
Robert

That turns into periodic new RETR commands with an offset. I think it's
more of an "odd" trick. I'd even prefer a threaded approach (a thread puts
the blocks into a stack; a while ... yield generator loop in the main thread
serves the .read() function of the pseudo-file object, which is what I
want). Yet such tricks are all OS-level tricks with a lot of overhead.

I really wonder whether the Python language itself can express an elegant,
flat solution that turns the block-delivering callback function into a
generator/.read(bytes) solution. I found no way.
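
A 'yield' written inside the callback only turns the callback itself into a generator; it cannot suspend retrbinary, which is why no flat solution falls out of the language directly. The threaded variant described above can be sketched as follows. This is a minimal sketch, assuming Python 3; the names iter_blocks and run_transfer are hypothetical, and a FIFO queue is used instead of a stack so that blocks come out in order. A fake transfer function stands in for ftp.retrbinary so the example runs without a server.

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of the transfer


def iter_blocks(run_transfer, maxsize=8):
    """Yield the blocks that run_transfer(callback) passes to its callback.

    run_transfer is any blocking callable that calls callback(block)
    once per block, for example:
        lambda cb: ftp.retrbinary("RETR " + path, cb)
    """
    q = queue.Queue(maxsize)  # bounded, so the producer cannot run far ahead

    def worker():
        try:
            run_transfer(q.put)
            q.put(_DONE)
        except Exception as exc:
            q.put(exc)        # hand errors over to the consumer

    threading.Thread(target=worker, daemon=True).start()
    while True:
        item = q.get()
        if item is _DONE:
            return
        if isinstance(item, Exception):
            raise item
        yield item


def fake_retrbinary(callback):
    # Stand-in for ftp.retrbinary(cmd, callback) in this example.
    for block in (b"spam", b"eggs", b"ham"):
        callback(block)


blocks = list(iter_blocks(fake_retrbinary))
```

If the consumer stops iterating early, the worker thread stays blocked on the full queue, so a real version would need some way to cancel the transfer.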

(Looking over some Ruby stuff, Ruby seems to be able to do so from within
the language. I am not really familiar with Ruby. I always felt Python to
be as complete, but much cleaner. I became somewhat jealous ... :) )

As the solution in my case has to work compatibly across many different
file systems (the file.read(bytes) function!), and other FTPS & SFTP
classes with different retrbinary functions have to be compatible too, I
cannot even write a simple FTP subclass with its own retrbinary without
things getting really weird. Thus the existing .retrbinary with callback
is the "official interface in this game".


 
Steve Holden

Robert said:
That turns into periodic new RETR commands with an offset. I think it's
more of an "odd" trick. I'd even prefer a threaded approach (a thread puts
the blocks into a stack; a while ... yield generator loop in the main thread
serves the .read() function of the pseudo-file object, which is what I
want). Yet such tricks are all OS-level tricks with a lot of overhead.

I really wonder whether the Python language itself can express an elegant,
flat solution that turns the block-delivering callback function into a
generator/.read(bytes) solution. I found no way.

Don't know whether this would be helpful as a starting point, but a
while (hmm, some years ...) ago I wrote an example of how FTP could be
used as a file-like object. Look for ftpStream.py on

http://www.holdenweb.com/Python/

Of course, in those days files could do a bit less than they can now, so
there's no attempt to provide an iterator interface.

Robert said:
(Looking over some Ruby stuff, Ruby seems to be able to do so from within
the language. I am not really familiar with Ruby. I always felt Python to
be as complete, but much cleaner. I became somewhat jealous ... :) )

As the solution in my case has to work compatibly across many different
file systems (the file.read(bytes) function!), and other FTPS & SFTP
classes with different retrbinary functions have to be compatible too, I
cannot even write a simple FTP subclass with its own retrbinary without
things getting really weird. Thus the existing .retrbinary with callback
is the "official interface in this game".

You will note that my code uses delegation to an FTP object rather than
inheritance. Maybe you would find that approach more fruitful for your
application.
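
To illustrate what delegation could look like here (this is not ftpStream.py, just a hypothetical sketch assuming Python 3): ftplib's FTP.transfercmd() returns the data socket directly, so a wrapper can expose read() without going through the retrbinary callback at all. The class name FTPReader is made up for the example, and this route does sidestep the callback interface Robert wants to keep.

```python
class FTPReader:
    """Read-only file-like object that delegates to an ftplib.FTP-like object."""

    def __init__(self, ftp, path):
        self._ftp = ftp
        ftp.voidcmd("TYPE I")                          # switch to binary mode
        self._conn = ftp.transfercmd("RETR " + path)   # the data socket
        self._file = self._conn.makefile("rb")         # buffered file wrapper

    def read(self, size=-1):
        return self._file.read(size)

    def close(self):
        self._file.close()
        self._conn.close()
        # Consume the server's final reply for the transfer.
        return self._ftp.voidresp()


# Possible use (HOSTNAME and FILENAME are placeholders):
#
#   import ftplib
#   ftp = ftplib.FTP("HOSTNAME")
#   ftp.login()
#   f = FTPReader(ftp, "FILENAME")
#   chunk = f.read(8192)
#   f.close()
```

Closing before the whole file has been read may make the server answer with an error reply, which voidresp() would raise, so a real version would probably want to handle that case.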

regards
Steve
 
