Thanks comp.lang.python!!!



While migrating a lot of Mac OS9 machines to PCs, we encountered several
problems with Mac file and directory names. Below is a list of the problems:

1. Macs allow characters in file and dir names that are not acceptable
on PCs. Specifically this set of characters [*<>?\|/]

2. Mac files and dirs that contained a "/" in their names would ftp to
the server OK, but the "/" would be translated to "%2f". So, a Mac file
named 7/19/03 would ftp as 7%2f19%2f03... not a very desirable filename,
especially when there are hundreds of them.

3. The last problem was spaces at the beginning and ending of file and
dir names. We encountered hundreds of files and dirs like this on the
Macs. They would ftp up to the Linux server OK but when the Windows PC
attempted to d/l them, the ftp transaction would stop and complain about
not finding files whenever it tried to transfer a file. Dirs with spaces
at the beginning or ending would literally crash the ftp client.
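
The three problems above can be illustrated on a single name before any files are touched. The following is only a minimal sketch, written for a current Python 3 interpreter rather than the 2.3b2 used in this thread. The helper name sanitize_name and the choice of '-' as the replacement are assumptions made for illustration; the pattern is the one used in the script further down.

```python
import re

# Same pattern as the script in this post: the FTP-escaped "%2f" and "%25",
# plus the characters that are not acceptable in PC file names.
BAD = re.compile(r'%2f|%25|[*?<>/\|\\]')

def sanitize_name(name, replacement='-'):
    """Return name with bad sequences replaced and outer whitespace stripped."""
    return BAD.sub(replacement, name).strip()

print(sanitize_name('7%2f19%2f03'))
print(sanitize_name('  report?.txt '))
```

Working on the name alone makes it easy to check the result before calling os.rename on anything.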

These were problems that we did not expect. So, we wrote a script to
clean up these names. Since all of the files were being uploaded to a
Linux ftp server, we decided to do the cleaning there. Python is a
simple, easily readable programming language, so we chose to use it.
Long story short, attached to this email is the script. If anyone can
use it to address Mac to PC migrations, feel free to.

The only caveat is that the script uses os.walk. I don't think Python
2.2.x comes with os.walk. To address this, we d/l 2.3b2 and have used it
extensively with this script w/o any problems. And, someone here on
comp.lang.python told me that os.walk could be incorporated into 2.2.x
too, but I never tried to do that as 2.3b2 worked just fine.

Thanks to everyone who contributed to this script. Much of it is
straight from advice that I received here. Also, if anyone sees how it
can be improved, let me know. For now, I'm satisfied with it as it works
"well enough" for what I need it to do. However, I'm trying to become a
better programmer, so I appreciate feedback from those who are much more
experienced than I am.

Special Thanks to Andy Jewell, Bengt Richter and Ethan Mindlace Fremen
as they wrote much of the code initially and gave a lot of great tips!!!

print " "
import os, re
setpath = raw_input("Path to the Directory Where Mac Directory & Filenames Need to be Made Sane: ")
print " "
print "--- Replace Bad Characters in Directory Names & Filenames ---"
print " "
def clean_names(setpath):
    bad = re.compile(r'%2f|%25|[*?<>/\|\\]') #search for these bad chars.
    for root, dirs, files in os.walk(setpath):
        for dir in dirs:
            badchars = bad.findall(dir) # find all bad chars.
            newdir = dir
            for badchar in badchars: # loop through each character in badchars
                print "replaced: ",badchar," in dir ",newdir," ",
                newdir = newdir.replace(badchar,'-') #replace bad chars.
                print newdir
            if newdir != dir: # If there were any bad characters in the name, do this:
                newpath = os.path.join(root,newdir)
                oldpath = os.path.join(root,dir)
                os.rename(oldpath,newpath) #rename dir without bad chars.
    for root, dirs, files in os.walk(setpath):
        for file in files:
            badchars = bad.findall(file) # find all bad chars.
            newfile = file
            for badchar in badchars: # loop through each character in badchars
                print "replaced: ",badchar," in file ",newfile," ",
                newfile = newfile.replace(badchar,'-') #replace bad chars.
                print newfile
            if newfile != file: # If there were any bad characters in the name, do this:
                newpath = os.path.join(root,newfile)
                oldpath = os.path.join(root,file)
                os.rename(oldpath,newpath) #rename file without bad chars.
clean_names(setpath) #1
clean_names(setpath) #2
clean_names(setpath) #3
clean_names(setpath) #4
clean_names(setpath) #5 Be recursive so program gets sub dirs too.
clean_names(setpath) #6 You may add or remove as many of these as you need.
clean_names(setpath) #7
clean_names(setpath) #8
clean_names(setpath) #9
clean_names(setpath) #10
print " "
print "--- Done ---"
print " "
print "--- Remove Spaces from Beginning and Ending of Directory Names & Filenames ---"
print " "
def clean_spaces(setpath):
    for root, dirs, files in os.walk(setpath):
        for dir in dirs:
            old_dname = dir #original name of dir as it exists in the filesystem.
            new_dname = old_dname.strip() #new name of dir with spaces stripped from beginning and ending.
            if new_dname != old_dname: #if space(s) are found...
                print "removed spaces from dir:",old_dname #show user what's going on.
                newpath = os.path.join(root,new_dname) #declare new path.
                oldpath = os.path.join(root,old_dname) #declare old path.
                os.rename(oldpath,newpath) #rename dir without spaces.
    for root, dirs, files in os.walk(setpath):
        for file in files:
            old_fname = file #original name of file as it exists in the filesystem.
            new_fname = old_fname.strip() #new name of file with spaces stripped from beginning and ending.
            if new_fname != old_fname: #if space(s) are found...
                print "removed spaces from file:",old_fname #show user what's going on.
                newpath = os.path.join(root,new_fname) #declare new path.
                oldpath = os.path.join(root,old_fname) #declare old path.
                os.rename(oldpath,newpath) #rename file without spaces
clean_spaces(setpath) #1
clean_spaces(setpath) #2
clean_spaces(setpath) #3
clean_spaces(setpath) #4
clean_spaces(setpath) #5 Be recursive as it doesn't hurt anything, although it's probably not needed for files.
clean_spaces(setpath) #6
clean_spaces(setpath) #7
clean_spaces(setpath) #8
clean_spaces(setpath) #9
clean_spaces(setpath) #10
print " "
print "--- Done --- "
print " "
print "--- This program was written by Brad Tilley (e-mail address removed) ---"
print " "

Andy Jewell

> While migrating a lot of Mac OS9 machines to PCs, we encountered several
> problems with Mac file and directory names. Below is a list of the
> problems:
>
> 1. Macs allow characters in file and dir names that are not acceptable
> on PCs. Specifically this set of characters [*<>?\|/]
>
> 2. Mac files and dirs that contained a "/" in their names would ftp to
> the server OK, but the "/" would be translated to "%2f". So, a Mac file
> named 7/19/03 would ftp as 7%2f19%2f03... not a very desirable filename,
> especially when there are hundreds of them.
>
> 3. The last problem was spaces at the beginning and ending of file and
> dir names. We encountered hundreds of files and dirs like this on the
> Macs. They would ftp up to the Linux server OK but when the Windows PC
> attempted to d/l them, the ftp transaction would stop and complain about
> not finding files whenever it tried to transfer a file. Dirs with spaces
> at the beginning or ending would literally crash the ftp client.

Shame on Apple for allowing subversive filenames! ;-)

> These were problems that we did not expect. So, we wrote a script to
> clean up these names. Since all of the files were being uploaded to a
> Linux ftp server, we decided to do the cleaning there. Python is a
> simple, easily readable programming language, so we chose to use it.
> Long story short, attached to this email is the script. If anyone can
> use it to address Mac to PC migrations, feel free to.

You never do, until they bite you!

> The only caveat is that the script uses os.walk. I don't think Python
> 2.2.x comes with os.walk. To address this, we d/l 2.3b2 and have used it
> extensively with this script w/o any problems. And, someone here on
> comp.lang.python told me that os.walk could be incorporated into 2.2.x
> too, but I never tried to do that as 2.3b2 worked just fine.

There is os.path.walk, instead.

> Thanks to everyone who contributed to this script. Much of it is
> straight from advice that I received here. Also, if anyone sees how it
> can be improved, let me know. For now, I'm satisfied with it as it works
> "well enough" for what I need it to do. However, I'm trying to become a
> better programmer, so I appreciate feedback from those who are much more
> experienced than I am.

You're welcome. We all come here to learn... :))

> Special Thanks to Andy Jewell, Bengt Richter and Ethan Mindlace Fremen
> as they wrote much of the code initially and gave a lot of great tips!!!


Some additional comments on your source-code, if I may. The following points
will help you make your program much more efficient:

1) You'd normally place your functions in a separate section, usually at the
top of your program, rather than 'in the middle'. It will work fine this
way, but it's a bit less readable.

2) There seem to be some indentation anomalies, probably because of using a
combination of tabs and spaces. This WILL bite you sometime in the future:
best to stick to one or t'other, preferably just spaces: the convention in
Python is to indent by 4 spaces for each 'suite', or logical 'block' of code.

3) I'm not sure you quite get the recursive bit yet! Simply calling your
function lots of times in succession doesn't cut it... all that happens is
that each time you call it, it does the same thing, effectively doing the job
10 times... What you'd have to do is call the function from *WITHIN* itself,
i.e. in the body, like:

def recurse(dir,depth=0):
    """ walk dir's subdirectories recursively, printing their name """
    # process list of files in dir...
    for entry in os.listdir(dir):
        # if the current one is a directory...
        if os.path.isdir(os.path.join(dir,entry)):
            print " "*depth+"+"+entry
            # recurse (call ourselves)
            recurse(os.path.join(dir,entry),depth+1)

** NOTE: Looking at the docs, if you use os.walk, you don't need to do the
recursion yourself, as os.walk does it for you!
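
For comparison with the hand-written recursion, here is a rough sketch of the same directory listing done with os.walk. It is written for a current Python 3 interpreter, not the 2.3b2 discussed in this thread, and the function name list_dirs and the throwaway test tree are made up for the example.

```python
import os
import tempfile

def list_dirs(top):
    """Return (depth, name) pairs for every directory below top."""
    found = []
    top = os.path.abspath(top)
    for root, dirs, files in os.walk(top):
        dirs.sort()  # sorting in place also fixes the order os.walk descends in
        # Depth is the number of path components between top and root.
        rel = os.path.relpath(root, top)
        depth = 0 if rel == os.curdir else rel.count(os.sep) + 1
        for name in dirs:
            found.append((depth, name))
    return found

# Build a small throwaway tree to walk.
top = tempfile.mkdtemp()
os.makedirs(os.path.join(top, 'a', 'b'))
os.makedirs(os.path.join(top, 'c'))

for depth, name in list_dirs(top):
    print(' ' * depth + '+' + name)
```

There is no call from the function to itself here; os.walk visits each subdirectory in turn and hands back one (root, dirs, files) triple per directory.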

4) You're still repeating yourself several times, too. You can get away with
JUST ONE os.walk() loop:

for root, dirs, files in os.walk(setpath):
    for thisfile in dirs+files:
        newname=thisfile.strip() # strip off leading and trailing whitespace
        # replace any bad characters...
        for badchar in bad.findall(newname): # 'bad' is the regex compiled in clean_names
            newname=newname.replace(badchar,'-')
        # rename thisfile ONLY if newname is different...
        if newname != thisfile: # check if it's changed:
            print "renaming",thisfile,"to",newname,"in",root
            os.rename(os.path.join(root,thisfile),os.path.join(root,newname))

!! that replaces what your four for loops do... 8-0
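
One thing to watch with a single loop: when a directory is renamed during a top-down walk, os.walk still tries to descend into the old name, so the contents of that directory are not visited on that pass. A possible way around it, sketched here for a current Python 3 interpreter, is to walk bottom-up with topdown=False so that a directory is renamed only after its contents. The function name clean_tree, the test tree, and the decision to skip a rename when the target name already exists are all assumptions for this example, not part of the original script.

```python
import os
import re
import tempfile

BAD = re.compile(r'%2f|%25|[*?<>/\|\\]')  # pattern taken from the original script

def clean_tree(top):
    """Rename every entry below top in one bottom-up pass; return the renames."""
    renamed = []
    # topdown=False yields the deepest directories first, so a directory is
    # renamed only after everything inside it has been handled.
    for root, dirs, files in os.walk(top, topdown=False):
        for name in dirs + files:
            newname = BAD.sub('-', name).strip()
            if not newname or newname == name:
                continue  # nothing to change, or the name would become empty
            oldpath = os.path.join(root, name)
            newpath = os.path.join(root, newname)
            if os.path.exists(newpath):
                continue  # assumption: skip rather than overwrite on a clash
            os.rename(oldpath, newpath)
            renamed.append((oldpath, newpath))
    return renamed

# Throwaway tree: a badly named directory holding a badly named file.
top = tempfile.mkdtemp()
os.makedirs(os.path.join(top, ' 7%2f19%2f03 '))
open(os.path.join(top, ' 7%2f19%2f03 ', 'notes?.txt '), 'w').close()

for oldpath, newpath in clean_tree(top):
    print('renamed %r -> %r' % (oldpath, newpath))
```

With the walk going bottom-up, one call should be enough for the whole tree, and running it a second time finds nothing left to rename.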

hope you find this useful :)

