Having trouble extracting useful directory details from ftplib.FTP

Stephen Horne

Just recently I decided I want to make use of my ISP's freebie webspace.
In order to make that easier, I'd like to be able to automatically
synchronise an FTP file/folder hierarchy with one on my local hard
drive. I figured this should be easily handled in Python, and broadly
speaking it is, but I am having a little difficulty.

You see, in order to handle the synchronisation correctly, I need to be
able to determine what is on the FTP server to start with. I need to
be able to recursively search the folders on the FTP server. But so
far as I can tell, I'm having to rely on some very flaky techniques
and some observations about my ISP's FTP server that could potentially
change.

I tried using FTP.nlst(pathname) but this wasn't very helpful...

1. It created a bogus first entry in the list - a line saying
"Found 1" (irrespective of the actual number of items found)
IIRC.

2. It provided no indication of whether each item was a file or
a folder.

Therefore, I switched to using the following piece of code...

filelist = []
ftp.retrlines("LIST " + pathname, filelist.append)

I still get a bogus line at the top, but that's easily worked around.
The important thing is that by testing out the results, I found that
for the ftp server I am using I can look at the first character for
the directory flag, and use the slice [53:] to extract the filename
from each line.
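
A minimal sketch of that parsing, assuming a Unix-style `ls -l` listing (which is only a convention, not something every server follows). Instead of the fixed slice [53:], it splits on whitespace so that column widths do not matter. The helper name `parse_list_line` is hypothetical, and it would not preserve leading spaces in a filename.

```python
def parse_list_line(line):
    """Return (is_dir, name) for one Unix-style LIST line, or None.

    Assumed layout (not a standard): permissions, link count, owner,
    group, size, month, day, time-or-year, name.
    """
    # Split into at most 9 fields so a name containing spaces stays whole.
    fields = line.split(None, 8)
    if len(fields) < 9 or fields[0][0] not in "-dl":
        # Header lines such as "Found 1" or "total 2", or an unknown format.
        return None
    name = fields[8]
    if fields[0][0] == "l":
        # Symlinks are listed as "name -> target"; keep only the name.
        name = name.split(" -> ")[0]
    return fields[0].startswith("d"), name
```

Feeding each entry of `filelist` through this and skipping the `None` results would also take care of the bogus first line.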

The problem is, however, that this seems like skating on very thin
ice. So far as I can tell, there is no standard for the format
reported by the LIST command.

This so far is enough - I can use this to find all files and folders
on the FTP server, and I can download backup copies to a hard drive
folder and then delete them all before uploading the replacement
files, all in one automatic process that (at the moment) just works.
But - apart from the fact that I'm very nervous about it - ideally,
I'd like to do a saner synchronisation - only delete files that need
to be deleted, and only upload files that have changed. To do that, I
need to get more details about the files on the FTP server - time and
date stamps in particular. Getting hold of this information looks like
being at least as much of a hack as what I've done already.
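
One possible route to the timestamps, sketched here under the assumption that the server supports the MDTM command (described in RFC 3659; many servers implement it, but whether this ISP's server does is unknown). The reply has the form "213 YYYYMMDDHHMMSS" in UTC, which is far easier to parse than a LIST line. The helper name `parse_mdtm_reply` is hypothetical.

```python
from datetime import datetime


def parse_mdtm_reply(reply):
    """Convert an MDTM reply such as '213 20031105123456' to a datetime.

    The timestamp is in UTC; the returned datetime is naive.
    """
    code, _, stamp = reply.partition(" ")
    if code != "213":
        raise ValueError("unexpected MDTM reply: %r" % reply)
    # Keep the 14 date/time digits; ignore optional fractional seconds.
    return datetime.strptime(stamp.strip()[:14], "%Y%m%d%H%M%S")
```

With a connected `ftplib.FTP` object, the reply would come from `ftp.sendcmd("MDTM " + pathname)`, which raises an ftplib error if the server rejects the command.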

So the question is... Is there some better way of handling this that
I'm missing, or is this just the way it is? After all, there are a lot
of FTP client utilities out there that seem to get this information
perfectly reliably, which seems surprising if they have to rely on
parsing an inconsistent directory listing format.

Any hints?
 
Paul McGuire

Stephen Horne said:
Just recently I decided I want to make use of my ISP's freebie webspace.
In order to make that easier, I'd like to be able to automatically
synchronise an FTP file/folder hierarchy with one on my local hard
drive. I figured this should be easily handled in Python, and broadly
speaking it is, but I am having a little difficulty.
<snip>
Is this anything like ftpmirror.py? You'll find it in the tools\scripts
directory of the Python distribution.

-- Paul
 
Stephen Horne

<snip>
Is this anything like ftpmirror.py? You'll find it in the tools\scripts
directory of the Python distribution.

Thanks.

It's the exact reverse of what I want (it mirrors from the server to
the local disk, I want to mirror from the disk to the server) but it
is extremely clueful - the way it assumes a standard (unix oriented)
format for the listing unless a Mac server option is specified tells
me precisely what I need to know.
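
For the disk-to-server direction, the decision step could be sketched roughly as follows, assuming both sides have already been reduced to a dict mapping relative path to modification time (how those dicts get built depends on the listing format discussed above). `plan_sync` is a hypothetical name and this is not how ftpmirror.py works; it only illustrates "upload what changed, delete what is gone".

```python
def plan_sync(local, remote):
    """Decide what to upload and delete when mirroring local to remote.

    local, remote: dicts mapping relative path -> modification time
    (any comparable values, e.g. datetimes or epoch seconds).
    Returns (to_upload, to_delete) as sorted lists of paths.
    """
    # Upload files missing on the server or newer on the local disk.
    to_upload = sorted(
        path for path, mtime in local.items()
        if path not in remote or mtime > remote[path]
    )
    # Delete files that exist on the server but no longer exist locally.
    to_delete = sorted(path for path in remote if path not in local)
    return to_upload, to_delete
```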
 
