rename and case insensitive filing systems

Jon Ripley

A while back I wrote a perl script which recursed through a directory
structure and tidied up any non-preferred file names. It used the standard
rename (oldfile, newfile); approach and I discovered, to my horror, some
very unexpected behaviour when it was run on an operating system with a case
insensitive filing system.

A filename such as 'aaa_bbb.ccc.ccc' was correctly renamed to 'Aaa Bcc.ccc'
on all systems. But where only the case of a filename changed, as in
'Aaa bbb.ccc' becoming 'Aaa Bbb.ccc', perl successfully deleted all the
files.

It seems that perl was doing:

copy oldfile newfile
delete oldfile

which on a case-insensitive filesystem has the effect of simply deleting
oldfile.

Is this, as it seems to be, standard perl behaviour or is it possible that
it is a piece of very sloppy programming in the particular port I am using?

If copy/delete is standard then there is a *major* data loss causing bug in
perl which needs to be resolved.

(Currently I am manually doing rename old TMP; rename TMP new which is just
a hacky workaround.)
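
For reference, the two-step workaround described above can be sketched roughly as follows. This is only an illustration, not code from the original script: the helper name rename_via_temp and the ".tmp-PID-N" suffix are made up here, and the suffix may not suit filing systems with short name limits. There is also a small race between the existence check and the first rename.

```perl
use strict;
use warnings;

# Hypothetical helper: rename $old to $new by way of a temporary name,
# so a case-only change never targets the file's own name directly.
# Returns 1 on success, 0 on failure (with $! set by the failed rename).
sub rename_via_temp {
    my ($old, $new) = @_;

    # Pick a temporary name that does not exist yet. The suffix is an
    # arbitrary choice for this sketch.
    my $n   = 0;
    my $tmp = "$old.tmp-$$-$n";
    $tmp = "$old.tmp-$$-" . ++$n while -e $tmp;

    # Step 1: move the file out of the way.
    rename($old, $tmp) or return 0;

    # Step 2: give it the final name; on failure try to restore it.
    unless (rename($tmp, $new)) {
        my $err = $!;
        rename($tmp, $old);    # best effort, result not checked
        $! = $err;
        return 0;
    }
    return 1;
}
```

A call would look like rename_via_temp('Aaa bbb.ccc', 'Aaa Bbb.ccc') or warn "rename failed: $!"; an existing, unrelated file called $new is still clobbered, exactly as with a plain rename.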

Jon Ripley
 
John Bokma

Jon Ripley wrote:

[ rename ]
It seems that perl was doing:

copy oldfile newfile
delete oldfile

which on a case-insensitive filesystem has the effect of simply
deleting oldfile.

ls -al index.html
-rw-rw-rw- 1 user group 0 Feb 1 2004 index.html

perl -e "rename('index.html', 'InNdeX.html')"

ls -al InNdeX.html
-rw-rw-rw- 1 user group 0 Feb 1 2004 InNdeX.html

Windows XP, perl, v5.8.3

Which OS, version of Perl, etc. are you using?
 
Andy Hassall

Jon Ripley wrote:

[ rename ]
It seems that perl was doing:

copy oldfile newfile
delete oldfile

which on a case-insensitive filesystem has the effect of simply
deleting oldfile.

John Bokma wrote:

ls -al index.html
-rw-rw-rw- 1 user group 0 Feb 1 2004 index.html

perl -e "rename('index.html', 'InNdeX.html')"

But you've changed the filename now - there are more N's in the second one.

John Bokma wrote:

Windows XP, perl, v5.8.3

G:\temp>dir /b index.html
index.html

G:\temp>perl -e "rename('index.html', 'InDeX.html')"

G:\temp>dir /b index.html
InDeX.html

G:\temp>perl -v

This is perl, v5.8.6 built for MSWin32-x86-multi-thread

(On Windows 2000)
 
Sherm Pendley

Jon said:
Where filename 'aaa_bbb.ccc.ccc' would be correctly renamed to 'Aaa
Bcc.ccc' on all systems. For cases where only the case of a filename
changed - as in 'Aaa bbb.ccc' becoming 'Aaa Bbb.ccc', perl successfully
deleted all the files.

It seems that perl was doing:

Perl was doing exactly what the docs say it will do:

rename OLDNAME,NEWNAME
Changes the name of a file; an existing file NEWNAME will be
clobbered. Returns true for success, false otherwise.

If you try to rename a file, Perl first looks to see if the new name already
exists. If so, it *must* delete the file first to allow the rename to take
place; if it doesn't the rename will fail, because you can't have two files
with the same name in the same directory.

Jon said:
which on a case-insensitive filesystem has the effect of simply deleting
oldfile.

Yes. If your OS doesn't distinguish between "newname" and "NEWNAME", then
rename("newname", "NEWNAME") will delete the file. Perl checks to see if
"NEWNAME" exists, sees that it does, and deletes it to make room for the
rename.
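
A script that tidies file names could at least detect this situation before calling rename. The following is a minimal sketch, assuming that plain lc() is a good enough stand-in for the filing system's own case folding (which it may not be for non-ASCII names); the helper name case_only_rename is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: true when $old and $new are different strings
# that become identical once case is ignored.
sub case_only_rename {
    my ($old, $new) = @_;
    return $old ne $new && lc($old) eq lc($new);
}

# Classify two example renames taken from the thread.
for my $pair (['Aaa bbb.ccc', 'Aaa Bbb.ccc'], ['aaa_bbb.ccc', 'Aaa Bbb.ccc']) {
    my ($old, $new) = @$pair;
    printf "%s -> %s: %s\n", $old, $new,
        case_only_rename($old, $new) ? 'case-only change' : 'ordinary rename';
}
```

Renames flagged as case-only changes are the ones to treat with care on a case-insensitive filing system, for example by going through a temporary name.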

Jon said:
If copy/delete is standard then there is a *major* data loss causing bug
in perl which needs to be resolved.

Not a bad idea. I've been playing with the idea of diving into the perl
source for some time now - this looks like a reasonably small patch to
start with. I'll have a look at it. (No promises, though!)

sherm--
 
Andy Hassall

Sherm Pendley wrote:

Perl was doing exactly what the docs say it will do:

rename OLDNAME,NEWNAME
Changes the name of a file; an existing file NEWNAME will be
clobbered. Returns true for success, false otherwise.

If you try to rename a file, Perl first looks to see if the new name already
exists. If so, it *must* delete the file first to allow the rename to take
place; if it doesn't the rename will fail, because you can't have two files
with the same name in the same directory.


Yes. If your OS doesn't distinguish between "newname" and "NEWNAME", then
rename("newname", "NEWNAME") will delete the file. Perl checks to see if
"NEWNAME" exists, sees that it does, and deletes it to make room for the
rename.


Not a bad idea. I've been playing with the idea of diving into the perl
source for some time now - this looks like a reasonably small patch to
start with. I'll have a look at it. (No promises, though!)

Looks like Perl 5.8.6 already tries to cope with this under Windows: in
$PERL_SRC/win32/win32.c, from line 3071 onwards, it works differently
depending on whether you're on an NT-based system (NT, 2000, XP, 2003) or a
Windows 95-based system.

The NT branch uses the Windows MoveFileEx function, which is supposed to
handle case changes, whereas the 95 version does a bit of trickery with
temporary files, e.g. see the comments:

3072 /* if newname exists, rename it to a temporary name so that we
3073 * don't delete it in case oname happens to be the same file
3074 * (but perhaps accessed via a different path)
3075 */

Maybe the OP is on an older Perl, a different operating system or using a
different filesystem, etc.? Perhaps running from Linux and accessing a
case-insensitive filesystem over a Samba share, so the code from above wouldn't
be applicable?
 
John Bokma

Jon said:
RISC OS 4.02, Perl 5.005_03, ADFS file system.

I learned Perl (4) on RISC OS 3.1x :-D.

5.005_03 is like 10(?) years old or so. Isn't there a more recent version?
 
