Thanks comp.lang.python!!!

Discussion in 'Python' started by hokiegal99, Jul 20, 2003.

  1. hokiegal99

    hokiegal99 Guest

    While migrating a lot of Mac OS9 machines to PCs, we encountered several
    problems with Mac file and directory names. Below is a list of the problems:

    1. Macs allow characters in file and dir names that are not acceptable
    on PCs. Specifically this set of characters [*<>?\|/]

    2. Mac files and dirs that contained a "/" in their names would ftp to
    the server OK, but the "/" would be translated to "%2f". So, a Mac file
    named 7/19/03 would ftp as 7%2f19%2f03... not a very desirable filename,
    especially when there are hundreds of them.

    3. The last problem was spaces at the beginning and ending of file and
    dir names. We encountered hundreds of files and dirs like this on the
    Macs. They would ftp up to the Linux server OK but when the Windows PC
    attempted to d/l them, the ftp transaction would stop and complain about
    not finding files whenever it tried to transfer a file. Dirs with spaces
    at the beginning or ending would literally crash the ftp client.
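
    A minimal sketch of the per-name cleanup these three problems call for, written for Python 3 rather than the 2.3b2 used in this thread. The names BAD and sanitize_name are illustrative and do not appear in the script; the pattern is the same one the script uses, and like the script it only matches lowercase "%2f" and "%25".

```python
import re

# Same pattern as the script's regex: the escaped "/" and "%" sequences
# left behind by ftp, plus the characters listed in problem 1.
BAD = re.compile(r'%2f|%25|[*?<>/|\\]')

def sanitize_name(name, replacement='-'):
    """Return name with bad sequences replaced and outer whitespace removed."""
    return BAD.sub(replacement, name).strip()

print(sanitize_name('7%2f19%2f03'))      # 7-19-03
print(sanitize_name('  notes?.txt '))    # notes-.txt
```

    Doing the replacement and the strip in one function means a name only has to be renamed once, whichever of the three problems it has.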

    These were problems that we did not expect. So, we wrote a script to
    clean up these names. Since all of the files were being uploaded to a
    Linux ftp server, we decided to do the cleaning there. Python is a
    simple, easily readable programming language, so we chose to use it.
    Long story short, attached to this email is the script. If anyone can
    use it to address Mac to PC migrations, feel free to.

    The only caveat is that the script uses os.walk. I don't think Python
    2.2.x comes with os.walk. To address this, we d/l 2.3b2 and have used it
    extensively with this script w/o any problems. And, someone here on
    comp.lang.python told me that os.walk could be incorporated into 2.2.x
    too, but I never tried to do that as 2.3b2 worked just fine.

    Thanks to everyone who contributed to this script. Much of it is
    straight from advice that I received here. Also, if anyone sees how it
    can be improved, let me know. For now, I'm satisfied with it, as it works
    "well enough" for what I need it to do. However, I'm trying to become a
    better programmer, so I appreciate feedback from those who are much more
    experienced than I am.

    Special Thanks to Andy Jewell, Bengt Richter and Ethan Mindlace Fremen
    as they wrote much of the code initially and gave a lot of great tips!!!

    print " "
    import os, re, string
    setpath = raw_input("Path to the Directory Where Mac Directory & Filenames Need to be Made Sane: ")
    print " "
    print "--- Replace Bad Characters in Directory Names & Filenames ---"
    print " "
    def clean_names(setpath):
        bad = re.compile(r'%2f|%25|[*?<>/\|\\]') #search for these bad chars.
        for root, dirs, files in os.walk(setpath):
            for dir in dirs:
                badchars = bad.findall(dir) # find all bad chars.
                newdir = dir
                for badchar in badchars: # loop through each character in badchars
                    print "replaced: ",badchar," in dir ",newdir," ",
                    newdir = newdir.replace(badchar,'-') #replace bad chars.
                    print newdir
                if newdir != dir: # If there are any bad characters in the name, do this:
                    newpath = os.path.join(root,newdir)
                    oldpath = os.path.join(root,dir)
                    os.rename(oldpath,newpath)
        for root, dirs, files in os.walk(setpath):
            for file in files:
                badchars = bad.findall(file) # find all bad chars.
                newfile = file
                for badchar in badchars: # loop through each character in badchars
                    print "replaced: ",badchar," in file ",newfile," ",
                    newfile = newfile.replace(badchar,'-') #replace bad chars.
                    print newfile
                if newfile != file: # If there are any bad characters in the name, do this:
                    newpath = os.path.join(root,newfile)
                    oldpath = os.path.join(root,file)
                    os.rename(oldpath,newpath)
    clean_names(setpath) #1
    clean_names(setpath) #2
    clean_names(setpath) #3
    clean_names(setpath) #4
    clean_names(setpath) #5 Be recursive so program gets sub dirs too.
    clean_names(setpath) #6 You may add or remove as many of these as you need.
    clean_names(setpath) #7
    clean_names(setpath) #8
    clean_names(setpath) #9
    clean_names(setpath) #10
    print " "
    print "--- Done ---"
    print " "
    print "--- Remove Spaces from Beginning and Ending of Directory Names & Filenames ---"
    print " "
    def clean_spaces(setpath):
        for root, dirs, files in os.walk(setpath):
            for dir in dirs:
                old_dname = dir #original name of dir as it exists in the filesystem.
                new_dname = old_dname.strip() #new name of dir with spaces stripped from beginning and ending.
                if new_dname != old_dname: #if space(s) are found...
                    print "removed spaces from dir:",old_dname #show user what's going on.
                    newpath = os.path.join(root,new_dname) #declare new path.
                    oldpath = os.path.join(root,old_dname) #declare old path.
                    os.rename(oldpath,newpath) #rename dir without spaces.
        for root, dirs, files in os.walk(setpath):
            for file in files:
                old_fname = file #original name of file as it exists in the filesystem.
                new_fname = old_fname.strip() #new name of file with spaces stripped from beginning and ending.
                if new_fname != old_fname: #if space(s) are found...
                    print "removed spaces from file:",old_fname #show user what's going on.
                    newpath = os.path.join(root,new_fname) #declare new path.
                    oldpath = os.path.join(root,old_fname) #declare old path.
                    os.rename(oldpath,newpath) #rename file without spaces
    clean_spaces(setpath) #1
    clean_spaces(setpath) #2
    clean_spaces(setpath) #3
    clean_spaces(setpath) #4
    clean_spaces(setpath) #5 Be recursive as it doesn't hurt anything, although it's probably not needed for files.
    clean_spaces(setpath) #6
    clean_spaces(setpath) #7
    clean_spaces(setpath) #8
    clean_spaces(setpath) #9
    clean_spaces(setpath) #10
    print " "
    print "--- Done --- "
    print " "
    print "--- This program was written by Brad Tilley ---"
    print " "
    hokiegal99, Jul 20, 2003

  2. Andy Jewell

    Andy Jewell Guest

    On Sunday 20 Jul 2003 4:40 am, hokiegal99 wrote:
    > While migrating a lot of Mac OS9 machines to PCs, we encountered several
    > problems with Mac file and directory names. Below is a list of the
    > problems:
    >
    > 1. Macs allow characters in file and dir names that are not acceptable
    > on PCs. Specifically this set of characters [*<>?\|/]
    >
    > 2. Mac files and dirs that contained a "/" in their names would ftp to
    > the server OK, but the "/" would be translated to "%2f". So, a Mac file
    > named 7/19/03 would ftp as 7%2f19%2f03... not a very desirable filename,
    > especially when there are hundreds of them.
    >
    > 3. The last problem was spaces at the beginning and ending of file and
    > dir names. We encountered hundreds of files and dirs like this on the
    > Macs. They would ftp up to the Linux server OK but when the Windows PC
    > attempted to d/l them, the ftp transaction would stop and complain about
    > not finding files whenever it tried to transfer a file. Dirs with spaces
    > at the beginning or ending would literally crash the ftp client.
    >


    Shame on Apple for allowing subversive filenames! ;-)

    > These were problems that we did not expect. So, we wrote a script to
    > clean up these names. Since all of the files were being uploaded to a
    > Linux ftp server, we decided to do the cleaning there. Python is a
    > simple, easily readable programming language, so we chose to use it.
    > Long story short, attached to this email is the script. If anyone can
    > use it to address Mac to PC migrations, feel free to.
    >


    You never do, until they bite you!

    > The only caveat is that the script uses os.walk. I don't think Python
    > 2.2.x comes with os.walk. To address this, we d/l 2.3b2 and have used it
    > extensively with this script w/o any problems. And, someone here on
    > comp.lang.python told me that os.walk could be incorporated into 2.2.x
    > too, but I never tried to do that as 2.3b2 worked just fine.
    >


    There is os.path.walk, instead.

    > Thanks to everyone who contributed to this script. Much of it is
    > straight from advice that I received here. Also, if anyone sees how it
    > can be improved, let me know. For now, I'm satisfied with it, as it works
    > "well enough" for what I need it to do. However, I'm trying to become a
    > better programmer, so I appreciate feedback from those who are much more
    > experienced than I am.
    >


    You're welcome. We all come here to learn... :))

    > Special Thanks to Andy Jewell, Bengt Richter and Ethan Mindlace Fremen
    > as they wrote much of the code initially and gave a lot of great tips!!!


    :))

    Some additional comments on your source-code, if I may. The following points
    will help you make your program much more efficient:

    1) You'd normally place your functions in a separate section, usually at the
    top of your program, rather than 'in the middle'. It will work fine this
    way, but it's a bit less readable.

    2) There seem to be some indentation anomalies, probably because of using a
    combination of tabs and spaces. This WILL bite you sometime in the future:
    best to stick to one or t'other, preferably just spaces: the convention in
    Python is to indent by 4 spaces for each 'suite', or logical 'block' of
    code.

    3) I'm not sure you quite get the recursive bit yet! Simply calling your
    function lots of times in succession doesn't cut it... all that happens is
    that each time you call it, it does the same thing, effectively doing the job
    10 times... What you'd have to do is call the function from *WITHIN* itself,
    i.e. in the body, like:

    def recurse(dir,depth=0):
        """ walk dir's subdirectories recursively, printing their name """
        # process list of files in dir...
        for entry in os.listdir(dir):
            # if the current one is a directory...
            if os.path.isdir(os.path.join(dir,entry)):
                print " "*depth+"+"+entry
                # recurse (call ourselves)
                recurse(os.path.join(dir,entry),depth+1)

    ** NOTE: Looking at the docs, if you use os.walk, you don't need to do the
    recursion yourself, as os.walk does it for you!
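
    To illustrate the note above, here is a rough Python 3 sketch that produces the same indented listing as recurse() without any explicit recursion. The helper name list_dirs and the throwaway directory tree are made up for the demonstration.

```python
import os
import tempfile

def list_dirs(top):
    """Return every directory under top, indented by depth, using os.walk."""
    lines = []
    top = os.path.normpath(top)
    for root, dirs, files in os.walk(top):
        dirs.sort()  # in-place sort makes the traversal order predictable
        rel = os.path.relpath(root, top)
        if rel == os.curdir:
            continue  # skip the starting directory itself
        depth = rel.count(os.sep)
        lines.append(" " * depth + "+" + os.path.basename(root))
    return lines

# Build a small throwaway tree to demonstrate.
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "a", "b"))
    os.makedirs(os.path.join(tmp, "c"))
    print("\n".join(list_dirs(tmp)))
```

    os.walk hands back one (root, dirs, files) triple per directory, however deeply nested, so a single call already covers the whole tree.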

    4) You're still repeating yourself several times, too. You can get away with
    JUST ONE os.walk() loop:

    for root, dirs, files in os.walk(setpath):
        for thisfile in dirs+files:
            badchars=bad.findall(thisfile)
            newname=thisfile.strip() # strip off leading and trailing whitespace
            # replace any bad characters...
            for badchar in badchars:
                newname=newname.replace(badchar,"-")
            # rename thisfile ONLY if newname is different...
            if newname != thisfile: # check if it's changed:
                print "renaming",thisfile,newname,"in",root
                os.rename(os.path.join(root,thisfile),os.path.join(root,newname))

    !! that replaces what your four for loops do... 8-0
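
    One thing the single loop does not address: when a directory is renamed during a top-down walk, os.walk still tries to descend into the old name, fails silently, and skips that subtree, which is probably why the original script needed repeated calls. A hedged Python 3 sketch that walks bottom-up instead is shown here. The name clean_tree, the skip-on-collision policy, and the sample tree are assumptions for illustration, and the demo expects a POSIX filesystem like the Linux server in this thread.

```python
import os
import re
import tempfile

BAD = re.compile(r'%2f|%25|[*?<>/|\\]')

def clean_tree(top):
    """Rename bad names under top in one bottom-up pass; return the renames."""
    renamed = []
    # topdown=False yields a directory only after everything inside it, so
    # renaming an entry never invalidates a path os.walk still has to visit.
    for root, dirs, files in os.walk(top, topdown=False):
        for name in dirs + files:
            newname = BAD.sub('-', name).strip()
            if not newname or newname == name:
                continue  # nothing to fix, or the name would become empty
            src = os.path.join(root, name)
            dst = os.path.join(root, newname)
            if os.path.lexists(dst):
                print("skipped (target exists):", src)
                continue
            os.rename(src, dst)
            renamed.append((src, dst))
    return renamed

# Demonstrate on a throwaway tree with nested problem names.
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, " 7%2f19%2f03", " inner "))
    open(os.path.join(tmp, " 7%2f19%2f03", " inner ", "what?.txt"), "w").close()
    for src, dst in clean_tree(tmp):
        print(src, "->", dst)
```

    With the bottom-up order a single call reaches every level, and a second call should find nothing left to rename.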

    hope you find this useful :)

    -andyj
    Andy Jewell, Jul 20, 2003