ftp recursively

Jeff Schwab

I need to move a directory tree (~9GB) from one machine to another on
the same LAN. What's the best (briefest and most portable) way to do
this in Python?

I see that urllib has some support for getting files by FTP, but that it
has some trouble distinguishing files from directories.

http://docs.python.org/lib/module-urllib.html

"The code handling the FTP protocol cannot differentiate
between a file and a directory."

I tried it anyway, but got an authentication problem. I don't see a
"how to log into the FTP server" section on docs.python.org. Is there a
tutorial I should read?

I am particularly looking for a quick, "good enough" solution that I can
use this afternoon. Thanks in advance to any kind-hearted soul who
chooses to help me out.
 
Paul Rubin

Jeff Schwab said:
I need to move a directory tree (~9GB) from one machine to another on
the same LAN. What's the best (briefest and most portable) way to do
this in Python?

os.popen("rsync ...")
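A minimal sketch of the rsync suggestion, using subprocess rather than os.popen so that a failed transfer raises an exception. The helper names and the paths in the usage note are hypothetical, and it assumes rsync and ssh are installed on both machines.

```python
import subprocess


def build_rsync_command(src, dest):
    """Return an rsync argument list; src and dest are hypothetical paths."""
    # -a (archive) recurses and preserves permissions and timestamps;
    # -e ssh runs the transfer over an ssh connection.
    return ["rsync", "-a", "-e", "ssh", src, dest]


def mirror(src, dest):
    """Run rsync and raise CalledProcessError if it exits with an error."""
    subprocess.check_call(build_rsync_command(src, dest))
```

Something like mirror("/data/tree/", "user@otherhost:/data/tree/") would then copy the tree, with both paths standing in for your own.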
 
Arnaud Delobelle

Jeff said:
I need to move a directory tree (~9GB) from one machine to another on
the same LAN. What's the best (briefest and most portable) way to do
this in Python?

I see that urllib has some support for getting files by FTP, but that it
has some trouble distinguishing files from directories.

http://docs.python.org/lib/module-urllib.html

"The code handling the FTP protocol cannot differentiate
between a file and a directory."

I tried it anyway, but got an authentication problem. I don't see a
"how to log into the FTP server" section on docs.python.org. Is there a
tutorial I should read?

I am particularly looking for a quick, "good enough" solution that I can
use this afternoon. Thanks in advance to any kind-hearted soul who
chooses to help me out.

Have you tried ftplib? (http://docs.python.org/lib/module-ftplib.html)
Here is an example of how to use it to differentiate between files and
directories (untested).

import ftplib

conn = ftplib.FTP('host', 'user', 'password')

def callback(line):
    # Assumes a Unix-style listing and no whitespace in filenames.
    name = line.rpartition(' ')[-1]
    if line.startswith('d'):
        # The entry is a directory.
        print('directory ' + name)
    else:
        print('file ' + name)

# This prints every entry in the cwd prefixed by 'directory' or 'file'.
# You can change callback to retrieve files and explore
# directories recursively.
conn.dir(callback)

HTH
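The recursive variant hinted at in the comments above might look roughly like this. It is a sketch for Python 3, not tested against a real server: it assumes the server returns Unix "ls -l" style listings, that filenames contain no whitespace, and that symbolic links can be skipped. The function names parse_list_line and mirror are made up for illustration.

```python
import ftplib
import os


def parse_list_line(line):
    """Split one Unix-style LIST line into (is_directory, name)."""
    # Assumption: 'ls -l' style output and no whitespace in names.
    return line.startswith("d"), line.split()[-1]


def mirror(conn, remote_dir, local_dir):
    """Recursively copy remote_dir (via conn) into local_dir."""
    os.makedirs(local_dir, exist_ok=True)
    conn.cwd(remote_dir)
    lines = []
    conn.dir(lines.append)  # collect the whole listing before recursing
    for line in lines:
        # Skip blank lines, the "total N" header, and symbolic links.
        if not line or line.startswith(("total", "l")):
            continue
        is_dir, name = parse_list_line(line)
        if name in (".", ".."):
            continue
        target = os.path.join(local_dir, name)
        if is_dir:
            mirror(conn, name, target)
            conn.cwd("..")  # step back up to the parent directory
        else:
            with open(target, "wb") as out:
                conn.retrbinary("RETR " + name, out.write)
```

With a connection opened as ftplib.FTP('host', 'user', 'password'), a call such as mirror(conn, '/remote/tree', 'local_copy') would start the copy; both paths are placeholders.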
 
Gabriel Genellina

Jeff Schwab said:
I need to move a directory tree (~9GB) from one machine to another on
the same LAN. What's the best (briefest and most portable) way to do
this in Python?

See Tools/scripts/ftpmirror.py in your Python installation.
 
Jeff Schwab

Gabriel said:
See Tools/scripts/ftpmirror.py in your Python installation.

Thank you, that's perfect. Thanks to Arnaud as well, for the pointer to
ftplib, which might be useful for other purposes as well.

Per the earlier advice of other posters (including one whose message
seems mysteriously to have disappeared from c.l.python), I just stuck
with the Unix tools I already knew: I ended up tarring the whole 9GB,
ftping it as a flat file, and untarring it on the other side. Of
course, the motivation wasn't just to get the files from point A to
point B using Unix (which I already know how to do), but to take
advantage of an opportunity to learn some Python; next time, I'll try
the ftpmirror.py script if it's generic enough, or ftplib if there are
more specific requirements.
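For the "learn some Python" angle, the tar-and-untar steps described here can also be done with the standard tarfile module. This is only a sketch under the assumption that gzip compression is wanted; pack_tree and unpack_tree are illustrative names, and the transfer of the archive itself is left out.

```python
import os
import tarfile


def pack_tree(src_dir, archive_path):
    """Write src_dir and everything under it into a gzip-compressed tar file."""
    with tarfile.open(archive_path, "w:gz") as tar:
        # arcname stores entries under the tree's own name, not its full path.
        tar.add(src_dir, arcname=os.path.basename(os.path.normpath(src_dir)))


def unpack_tree(archive_path, dest_dir):
    """Extract the archive into dest_dir; only use on archives you created."""
    with tarfile.open(archive_path, "r:gz") as tar:
        tar.extractall(dest_dir)
```

Run pack_tree on the source machine, move the single archive file across, then run unpack_tree on the other side.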
 
Paul Rubin

Jeff Schwab said:
ftping it as a flat file, and untarring it on the other side. Of
course, the motivation wasn't just to get the files from point A to
point B using Unix (which I already know how to do), but to take
advantage of an opportunity to learn some Python; next time, I'll try
the ftpmirror.py script if it's generic enough, or ftplib if there are
more specific requirements.

I see, that wasn't clear in your original post. You should look at
the os.walk function if you want to know how to traverse a directory
tree (maybe you are already doing this). Also, for security reasons,
it's getting somewhat uncommon, and is generally not a good idea to
run an ftpd these days, even on a LAN. It's more usual these days
to transfer all files by rcp or rsync tunnelled through ssh.
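As a rough illustration of the os.walk suggestion, the sketch below collects every file under a local directory. The function name list_tree is made up for this example, and it only covers the local side of a push.

```python
import os


def list_tree(root):
    """Return the paths of all files under root, relative to root, sorted."""
    found = []
    # os.walk yields (directory, subdirectory names, file names) for each level.
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            found.append(os.path.relpath(full, root))
    return sorted(found)
```

Each relative path could then be handed to whatever does the actual upload.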
 
Jeff Schwab

Paul said:
I see, that wasn't clear in your original post. You should look at
the os.walk function if you want to know how to traverse a directory
tree (maybe you are already doing this).

I thought os.walk was for locally mounted directories... How is it
relevant on remote filesystems?

Paul said:
Also, for security reasons,
it's getting somewhat uncommon, and is generally not a good idea to
run an ftpd these days, even on a LAN. It's more usual these days
to transfer all files by rcp or rsync tunnelled through ssh.

Don't shoot the messenger, but you're severely confused here. Whether
you're using ftp, rcp, or rsync is a completely separate issue to
whether you're running over ssl (which I assume you meant by ssh).

FTP is a work-horse protocol for transferring files. It's going to be
with us for a long, long time. There are various clients and servers
built on it, including the traditional ftp command-line tools on Unix
and Windows.

rcp is a very simple tool for copying files from one (potentially
remote) place to another. The point of rcp is that its interface is
similar to cp, so the flags are easy to remember. Modern Unix and Linux
systems usually include secure versions of both ftp and rcp, called sftp
and scp, respectively.

The point of rsync is to keep a local directory tree in sync with a
remote one, by transferring only change-sets that are conceptually
similar to patches. If you're only transferring files once, there's no
particular benefit (AFAIK) to using rsync rather than some kind of
recursive ftp.
 
Paul Rubin

Jeff Schwab said:
I thought os.walk was for locally mounted directories... How is it
relevant on remote filesystems?

Yes, os.walk is for local directories. I thought you wanted to push a
local tree to a remote host. For pulling, yes, you need to do
something different; for example, rsync can handle it for you.

Jeff Schwab said:
Don't shoot the messenger, but you're severely confused here. Whether
you're using ftp, rcp, or rsync is a completely separate issue to
whether you're running over ssl (which I assume you meant by ssh).

SSL would make more sense but ssh is more commonly used.

Jeff Schwab said:
FTP is a work-horse protocol for transferring files. It's going to be
with us for a long, long time.

Here's what happened when I just tried to use it:

$ ftp localhost
ftp: connect: Connection refused
ftp>

That is quite common these days. I think ftp doesn't play nicely with
encryption tunnels because of the way it uses a separate port for out
of band signalling, but I never paid close attention, so maybe there's
some other issue with it. I can't think of when the last time was
that I actually used ftp.

Jeff Schwab said:
The point of rsync is to keep a local directory tree in sync with a
remote one, by transferring only change-sets that are conceptually
similar to patches. If you're only transferring files once, there's
no particular benefit (AFAIK) to using rsync rather than some kind of
recursive ftp.

One benefit is that it handles the recursion for you by itself.
Another is that in modern unix environments it's more likely to work
at all (i.e. if there is no ftp server at the target machine).
 
Steven D'Aprano

Jeff Schwab said:
The point of rsync is to keep a local directory tree in sync with a
remote one, by transferring only change-sets that are conceptually
similar to patches. If you're only transferring files once, there's no
particular benefit (AFAIK) to using rsync rather than some kind of
recursive ftp.

Avoiding re-inventing the wheel is a big benefit.
 
