case sensitive filenames

Tom Anderson

Tom said:
Also, it appears that getCanonicalPath deals with varying case-sensitivity
across the directory tree correctly - i'm on a Mac, which has a
case-insensitive HFS+ filesystem [1], and have a linux box mounted over
sftp, which has a case-sensitive filesystem of some sort. If i have a
foo.txt on both, getCanonicalPath correctly maps foo.TXT to foo.txt on the
Mac filesystem, and keeps it as foo.TXT on the linux.

Nigel said:
It doesn't on Linux with VFAT filesystems. They remain resolutely case-sensitive
as far as File is concerned:

File.getCanonicalPath("/some/vfatpath/foo.txt") returns /some/vfatpath/foo.txt
File.getCanonicalPath("/some/vfatpath/FOO.txt") returns /some/vfatpath/FOO.txt

Interesting. But that's clearly a bug with linux, not java! :)

tom

--
But in natural sciences whose conclusions are true and necessary and
have nothing to do with human will, one must take care not to place
oneself in the defence of error; for here a thousand Demostheneses and
a thousand Aristotles would be left in the lurch by every mediocre wit
who happened to hit upon the truth for himself. -- Galileo
 
Tom Anderson

The fact that he's interested in getting the answer demonstrates
that he's not dealing exclusively with Windows ...

I thought he was interested in dealing with variable case, which is very
much a Windows (well, and Mac) problem.

Eric said:
When do you suppose we'll get a font-sensitive file system? Wouldn't
it be nice if the editor saved the old version of Foo.java not as
Foo.java~ but as <i>Foo.java</i>?

Hmm. Can you do some of this (in DOS, at least) with ANSI escapes?

tom


Martin Gregorie

Tom said:
Also, it appears that getCanonicalPath deals with varying
case-sensitivity across the directory tree correctly - i'm on a Mac,
which has a case-insensitive HFS+ filesystem [1], and have a linux box
mounted over sftp, which has a case-sensitive filesystem of some sort.
If i have a foo.txt on both, getCanonicalPath correctly maps foo.TXT
to foo.txt on the Mac filesystem, and keeps it as foo.TXT on the
linux.

It doesn't on Linux with VFAT filesystems. They remain resolutely
case-sensitive as far as File is concerned:

File.getCanonicalPath("/some/vfatpath/foo.txt") returns
/some/vfatpath/foo.txt File.getCanonicalPath("/some/vfatpath/FOO.txt")
returns /some/vfatpath/FOO.txt

Interesting. But that's clearly a bug with linux, not java! :)

If it's a bug. The Linux tin says it's case sensitive. It matches the
description by being consistently case sensitive over its native filing
systems plus at least vfat.

For me the only annoyance with this approach is that the tendency of some
Windows applications to use uppercase extensions upsets programs that use
the MIME mechanism to determine file type from the extension. Nautilus is
an example of this behavior. Other programs, such as the GIMP image
handler, determine file type from content rather than extension, and so
aren't fazed by this.
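
One way around the uppercase-extension annoyance is to fold the extension to lower case before looking it up. What follows is only a minimal sketch of that idea: the class name, the typeFor method and the tiny mapping table are all made up for illustration, and it is not meant to describe how Nautilus or any real MIME library works. It assumes Java 9 or later for Map.of.

```java
import java.util.Locale;
import java.util.Map;

public class ExtensionLookup {
    // Hypothetical, deliberately tiny extension-to-type table.
    private static final Map<String, String> TYPES = Map.of(
            "txt", "text/plain",
            "jpg", "image/jpeg");

    private static final String UNKNOWN = "application/octet-stream";

    /** Looks up a type by extension, ignoring the case of the extension. */
    static String typeFor(String filename) {
        int dot = filename.lastIndexOf('.');
        if (dot < 0 || dot == filename.length() - 1) {
            return UNKNOWN; // no extension at all
        }
        // Locale.ROOT avoids locale-specific case rules (e.g. Turkish dotless i).
        String ext = filename.substring(dot + 1).toLowerCase(Locale.ROOT);
        return TYPES.getOrDefault(ext, UNKNOWN);
    }

    public static void main(String[] args) {
        System.out.println(typeFor("PHOTO.JPG"));
        System.out.println(typeFor("notes.txt"));
    }
}
```

This only changes how the name is interpreted; it does not change which file the name refers to on a case-sensitive filesystem.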

It's rational, at least to me, that Java should be case sensitive on a
Linux platform.

If there's confusion anywhere, it's on Windows where generally the
extension:application mapping is case insensitive but the Java compiler,
for one, breaks this convention by insisting that class and file names
match case.
 
Tom Anderson

Tom said:
Also, it appears that getCanonicalPath deals with varying case-sensitivity
across the directory tree correctly - i'm on a Mac, which has a
case-insensitive HFS+ filesystem [1], and have a linux box mounted over
sftp, which has a case-sensitive filesystem of some sort. If i have a
foo.txt on both, getCanonicalPath correctly maps foo.TXT to foo.txt on the
Mac filesystem, and keeps it as foo.TXT on the linux.

It doesn't on Linux with VFAT filesystems. They remain resolutely
case-sensitive
as far as File is concerned:

File.getCanonicalPath("/some/vfatpath/foo.txt") returns
/some/vfatpath/foo.txt
File.getCanonicalPath("/some/vfatpath/FOO.txt") returns
/some/vfatpath/FOO.txt

Interesting. But that's clearly a bug with linux, not java! :)

Quick question - did the file foo.txt exist when you did that test?

tom

Tom Anderson

Tom Anderson wrote:

Also, it appears that getCanonicalPath deals with varying
case-sensitivity across the directory tree correctly - i'm on a Mac,
which has a case-insensitive HFS+ filesystem [1], and have a linux box
mounted over sftp, which has a case-sensitive filesystem of some sort.
If i have a foo.txt on both, getCanonicalPath correctly maps foo.TXT
to foo.txt on the Mac filesystem, and keeps it as foo.TXT on the
linux.

It doesn't on Linux with VFAT filesystems. They remain resolutely
case-sensitive as far as File is concerned:

File.getCanonicalPath("/some/vfatpath/foo.txt") returns
/some/vfatpath/foo.txt File.getCanonicalPath("/some/vfatpath/FOO.txt")
returns /some/vfatpath/FOO.txt

Interesting. But that's clearly a bug with linux, not java! :)

If it's a bug. The Linux tin says it's case sensitive. It matches the
description by being consistently case sensitive over its native filing
systems plus at least vfat.

Hang on, what actually is the case sensitivity under linux + vfat? If i
do:

echo "lowercase" > foo.txt
cat foo.TXT

What do i get?

If it is case sensitive, and cat there tells me "No such file or
directory", then the behaviour Nigel reports is quite correct.

If cat prints the contents of foo.txt, then java should be correcting the
case in getCanonicalPath.
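
The same echo/cat experiment can be run from Java. This is a rough sketch under stated assumptions, not a definitive test: the class name CaseProbe and the method isCaseInsensitive are invented here, it uses java.nio.file (Java 7 or later), and it only probes the one directory it is given, since case sensitivity can differ per mount point.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Locale;

public class CaseProbe {
    /**
     * Creates a lower-case temp file in dir and reports whether the same
     * name in upper case is also visible there.
     */
    static boolean isCaseInsensitive(Path dir) {
        Path created = null;
        try {
            // The prefix and suffix contain lower-case letters, so the
            // upper-cased name is guaranteed to be a different string.
            created = Files.createTempFile(dir, "caseprobe", ".txt");
            String upperName = created.getFileName().toString().toUpperCase(Locale.ROOT);
            return Files.exists(dir.resolve(upperName));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            if (created != null) {
                try {
                    Files.deleteIfExists(created); // clean up the probe file
                } catch (IOException ignored) {
                    // best effort only
                }
            }
        }
    }

    public static void main(String[] args) {
        Path dir = Paths.get(args.length > 0 ? args[0] : System.getProperty("java.io.tmpdir"));
        System.out.println(dir + " looks case-"
                + (isCaseInsensitive(dir) ? "insensitive" : "sensitive"));
    }
}
```

Pointing it at the VFAT mount (for example by passing /some/vfatpath as the argument) would answer the question without relying on what getCanonicalPath returns.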

tom

Roedy Green

FileSystem.getFileSystem(
do you mean FileSystemView or some hidden class FileSystem?
--
Roedy Green Canadian Mind Products
http://mindprod.com
PM Steven Harper is fixated on the costs of implementing Kyoto, estimated as high as 1% of GDP.
However, he refuses to consider the costs of not implementing Kyoto which the
famous economist Nicholas Stern estimated at 5 to 20% of GDP
 
Arne Vajhøj

Andreas said:
Depends. In most cases: yes.

If it were security-relevant, things may be different.

True.

But it looks more like something to be used to help the user
than critical logic to me.

Arne
 
Arne Vajhøj

Eric said:
When do you suppose we'll get a font-sensitive file system?
Wouldn't it be nice if the editor saved the old version of Foo.java
not as Foo.java~ but as <i>Foo.java</i>?

:)

Hopefully never!

Arne
 
Arne Vajhøj

Tom said:
Tom said:
Also, it appears that getCanonicalPath deals with varying
case-sensitivity
across the directory tree correctly - i'm on a Mac, which has a
case-insensitive HFS+ filesystem [1], and have a linux box mounted over
sftp, which has a case-sensitive filesystem of some sort. If i have a
foo.txt on both, getCanonicalPath correctly maps foo.TXT to foo.txt
on the
Mac filesystem, and keeps it as foo.TXT on the linux.

It doesn't on Linux with VFAT filesystems. They remain resolutely
case-sensitive
as far as File is concerned:

File.getCanonicalPath("/some/vfatpath/foo.txt") returns
/some/vfatpath/foo.txt
File.getCanonicalPath("/some/vfatpath/FOO.txt") returns
/some/vfatpath/FOO.txt

Interesting. But that's clearly a bug with linux, not java! :)

I would call it a bug in the Java implementation
on Linux.

Arne
 
Roedy Green

Cross-platform application? His customers may not like it...

I keep a separate little database for each directory, for UNTOUCH, so
you may get duplicate checksums, but it should work out in the wash.

It would likely even work if I just presumed case sensitivity.

Mike Schilling

Arne said:
Tom said:
Tom Anderson wrote:
Also, it appears that getCanonicalPath deals with varying
case-sensitivity
across the directory tree correctly - i'm on a Mac, which has a
case-insensitive HFS+ filesystem [1], and have a linux box mounted
over sftp, which has a case-sensitive filesystem of some sort. If
i have a foo.txt on both, getCanonicalPath correctly maps foo.TXT
to foo.txt on the
Mac filesystem, and keeps it as foo.TXT on the linux.

It doesn't on Linux with VFAT filesystems. They remain resolutely
case-sensitive
as far as File is concerned:

File.getCanonicalPath("/some/vfatpath/foo.txt") returns
/some/vfatpath/foo.txt
File.getCanonicalPath("/some/vfatpath/FOO.txt") returns
/some/vfatpath/FOO.txt

Interesting. But that's clearly a bug with linux, not java! :)

I would call it a bug in the Java implementation
on Linux.

I'm unconvinced it's a bug. Windows filesystems are case-preserving, though
not case-sensitive. There's no reason to consider either upper or lower
case more canonical than the other.
 
Arne Vajhøj

Mike said:
Arne said:
Tom said:
Also, it appears that getCanonicalPath deals with varying
case-sensitivity
across the directory tree correctly - i'm on a Mac, which has a
case-insensitive HFS+ filesystem [1], and have a linux box mounted
over sftp, which has a case-sensitive filesystem of some sort. If
i have a foo.txt on both, getCanonicalPath correctly maps foo.TXT
to foo.txt on the
Mac filesystem, and keeps it as foo.TXT on the linux.
It doesn't on Linux with VFAT filesystems. They remain resolutely
case-sensitive
as far as File is concerned:

File.getCanonicalPath("/some/vfatpath/foo.txt") returns
/some/vfatpath/foo.txt
File.getCanonicalPath("/some/vfatpath/FOO.txt") returns
/some/vfatpath/FOO.txt
Interesting. But that's clearly a bug with linux, not java! :)
I would call it a bug in the Java implementation
on Linux.

I'm unconvinced it's a bug. Windows filesystems are case-preserving, though
not case-sensitive. There's no reason to consider either upper or lower
case more canonical than the other.

The docs for getCanonicalPath say:

"A canonical pathname is both absolute and unique."

which I would interpret as meaning that there can be only one canonical
path for a given file.

But on the other hand:

System.out.println((new File("C:\\foo.txt")).getCanonicalPath());
System.out.println((new File("C:\\FOO.txt")).getCanonicalPath());

outputs:

C:\foo.txt
C:\FOO.txt

so apparently unique does not mean that.
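
For anyone who wants to repeat this on another platform, here is the same check as a small self-contained program. It is a sketch only: the class name CanonicalCase and the helper sameCanonicalPath are invented for illustration, and the directory defaults to java.io.tmpdir unless one is passed on the command line. No output is claimed, because the result depends on the filesystem and on whether foo.txt exists there.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;

public class CanonicalCase {
    /** True if both pathnames canonicalize to exactly the same string. */
    static boolean sameCanonicalPath(File a, File b) {
        try {
            return a.getCanonicalPath().equals(b.getCanonicalPath());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        File dir = new File(args.length > 0 ? args[0] : System.getProperty("java.io.tmpdir"));
        File lower = new File(dir, "foo.txt");
        File upper = new File(dir, "FOO.txt");

        // Print both canonical forms, then whether they agree.
        System.out.println(lower.getCanonicalPath());
        System.out.println(upper.getCanonicalPath());
        System.out.println("same canonical path: " + sameCanonicalPath(lower, upper));
    }
}
```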

Arne
 
Lew

Tom said:
Quick question - did the file foo.txt exist when you did that test?

IIRC, the existence or non-existence of a file has no bearing on the action of
'getCanonicalPath()'.
 
Lew

Arne said:
The docs for getCanonicalPath say:

"A canonical pathname is both absolute and unique."

which I would interpret as meaning that there can be only one canonical
path for a given file.

File does not represent a file, it represents a pathname.

<http://java.sun.com/javase/6/docs/api/java/io/File.html>
"An abstract representation of file and directory pathnames. "

So the canonicalization is not based on whether two pathnames represent the
same file, but whether they represent the same name.

It makes sense that "foo.txt" and "FOO.txt" would have different
canonicalizations, particularly since, as Mike Schilling stated upthread,
Windows filesystems are case-preserving, though not case-sensitive.

Arne said:
But on the other hand:

System.out.println((new File("C:\\foo.txt")).getCanonicalPath());
System.out.println((new File("C:\\FOO.txt")).getCanonicalPath());

outputs:

C:\foo.txt
C:\FOO.txt

so apparently unique does not mean that.

It means "unique" in the context of pathnames, not in the context of files.
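
If the question really is "do these two pathnames denote the same file?" rather than "are these the same name?", the two can be asked separately. The sketch below assumes Java 7 or later, since it relies on java.nio.file.Files.isSameFile, which is not part of java.io.File; the class and method names are invented for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class SameFileVsSameName {
    /** Compares the pathnames as plain strings: same name, not same file. */
    static boolean sameName(Path a, Path b) {
        return a.toString().equals(b.toString());
    }

    /** Asks the filesystem whether both paths locate the same file. */
    static boolean sameFile(Path a, Path b) {
        try {
            return Files.isSameFile(a, b);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("samefile");
        Path lower = Files.createFile(dir.resolve("foo.txt"));
        Path upper = dir.resolve("FOO.txt");
        try {
            System.out.println("same name: " + sameName(lower, upper));
            if (Files.exists(upper)) {
                System.out.println("same file: " + sameFile(lower, upper));
            } else {
                System.out.println("FOO.txt does not exist here (case-sensitive directory)");
            }
        } finally {
            // Remove the scratch file and directory again.
            Files.deleteIfExists(lower);
            Files.deleteIfExists(dir);
        }
    }
}
```

The string comparison answers the pathname question; the filesystem call answers the file question.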
 
Andreas Leitgeb

Roedy Green said:
do you mean FileSystemView or some hidden class FileSystem?

I looked into the src.zip shipped with my jdk.
There was a toplevel class FileSystem in java.io, but I didn't
care if it was public or package-default. So it might be
a "hidden" class from the public-API PoV.

Ok, I quick-checked back: "abstract class FileSystem"
is obviously hidden (no "public")
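
The same check can be made without opening src.zip, by asking reflection for the class modifiers. This is only a sketch: the class name VisibilityCheck and the method visibilityOf are made up for the example, and whether java.io.FileSystem is present at all depends on the JDK in use, so the lookup failure is handled rather than assumed away.

```java
import java.lang.reflect.Modifier;

public class VisibilityCheck {
    /** Returns "public", "not public", or "not found" for a class name. */
    static String visibilityOf(String className) {
        try {
            // Load without initializing; only the modifiers are needed.
            Class<?> c = Class.forName(className, false,
                    VisibilityCheck.class.getClassLoader());
            return Modifier.isPublic(c.getModifiers()) ? "public" : "not public";
        } catch (ClassNotFoundException e) {
            return "not found";
        }
    }

    public static void main(String[] args) {
        System.out.println("java.io.File: " + visibilityOf("java.io.File"));
        System.out.println("java.io.FileSystem: " + visibilityOf("java.io.FileSystem"));
    }
}
```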
 
Andreas Leitgeb

Arne Vajhøj said:
True.
But it looks more like something to be used to help the user
than critical logic to me.

I confess that my "PS:"-placed paragraph about a "safe" test was not an
answer to any explicitly posed question in the post I responded to.

I further confess, that I'm entirely not sorry for it, and that even
in future I shall not refrain from adding extra unsolicited remarks
based on my own (not necessarily unflawed) judgement on usefulness to
either the poster or any by-reader.

:)
 
Nigel Wade

Tom said:
Tom said:
Also, it appears that getCanonicalPath deals with varying case-sensitivity
across the directory tree correctly - i'm on a Mac, which has a
case-insensitive HFS+ filesystem [1], and have a linux box mounted over
sftp, which has a case-sensitive filesystem of some sort. If i have a
foo.txt on both, getCanonicalPath correctly maps foo.TXT to foo.txt on the
Mac filesystem, and keeps it as foo.TXT on the linux.

It doesn't on Linux with VFAT filesystems. They remain resolutely case-sensitive
as far as File is concerned:

File.getCanonicalPath("/some/vfatpath/foo.txt") returns /some/vfatpath/foo.txt
File.getCanonicalPath("/some/vfatpath/FOO.txt")
returns /some/vfatpath/FOO.txt

Interesting. But that's clearly a bug with linux, not java! :)

tom

There is nothing wrong with Linux. Outside of Java FOO.txt and foo.txt on a VFAT
filesystem are the same file. If you ask for the canonical path to foo.txt you
get back foo.txt, which is a perfectly valid path. If you ask for the path to
FOO.txt you get back FOO.txt which is also a perfectly valid path to the same
file.

I see no justification for expecting to get back foo.txt when you ask for
FOO.txt. Why foo.txt and not, for example, Foo.txt or fOO.txt or FOO.TXT all of
which are equally valid responses? Why do you expect the lowercase filename
to be returned?
 
Tom Anderson

Lew said:
IIRC, the existence or non-existence of a file has no bearing on the action
of 'getCanonicalPath()'.

Shocking! Lew, you're the last person i expected to have to say this to,
but: read the javadocs. They say:

Every pathname that denotes an existing file or directory has a unique
canonical form. Every pathname that denotes a nonexistent file or
directory also has a unique canonical form. The canonical form of the
pathname of a nonexistent file or directory may be different from the
canonical form of the same pathname after the file or directory is
created. Similarly, the canonical form of the pathname of an existing
file or directory may be different from the canonical form of the same
pathname after the file or directory is deleted.

Returns: The canonical pathname string denoting the same file or
directory as this abstract pathname

Two key things: whether a file exists or not *does* (or at least *can*)
matter, and the definition of the canonicalised path is "the canonical
pathname string *denoting the same file or directory* as this abstract
pathname" (my emphasis). That means that if a file or directory exists,
then all paths which refer to it map to the same canonical path. If it
doesn't, i don't think that bit can apply. Thus, my question is germane.

I'd be interested to see the result of the linux VFAT test when the named
file exists.
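
A test along these lines could be sketched as follows. The class name CanonicalBeforeAfter and its method are invented for illustration; the idea is simply to canonicalize a differently-cased name once before the file exists and once after, in a directory on the filesystem of interest. No result is claimed here, since it depends on the platform and filesystem.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;

public class CanonicalBeforeAfter {
    /**
     * Canonicalizes queriedName before and after storedName is created
     * in dir. Returns {before, after}.
     */
    static String[] canonicalBeforeAndAfterCreate(File dir, String storedName, String queriedName) {
        File stored = new File(dir, storedName);
        File queried = new File(dir, queriedName);
        boolean created = false;
        try {
            String before = queried.getCanonicalPath();
            created = stored.createNewFile(); // false if it already existed
            String after = queried.getCanonicalPath();
            return new String[] { before, after };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            if (created) {
                stored.delete(); // only remove what this method created
            }
        }
    }

    public static void main(String[] args) {
        File dir = new File(args.length > 0 ? args[0] : System.getProperty("java.io.tmpdir"));
        String[] result = canonicalBeforeAndAfterCreate(dir, "foo.txt", "foo.TXT");
        System.out.println("before create: " + result[0]);
        System.out.println("after create:  " + result[1]);
    }
}
```

If the two lines differ on a given mount, then existence does affect canonicalization there; if they are identical, the implementation is treating that mount as case-sensitive.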

tom
 
Tom Anderson

Mike said:
Arne said:
Tom Anderson wrote:
Also, it appears that getCanonicalPath deals with varying
case-sensitivity
across the directory tree correctly - i'm on a Mac, which has a
case-insensitive HFS+ filesystem [1], and have a linux box mounted
over sftp, which has a case-sensitive filesystem of some sort. If
i have a foo.txt on both, getCanonicalPath correctly maps foo.TXT
to foo.txt on the
Mac filesystem, and keeps it as foo.TXT on the linux.
It doesn't on Linux with VFAT filesystems. They remain resolutely
case-sensitive
as far as File is concerned:

File.getCanonicalPath("/some/vfatpath/foo.txt") returns
/some/vfatpath/foo.txt
File.getCanonicalPath("/some/vfatpath/FOO.txt") returns
/some/vfatpath/FOO.txt
Interesting. But that's clearly a bug with linux, not java! :)
I would call it a bug in the Java implementation
on Linux.

I'm unconvinced it's a bug. Windows filesystems are case-preserving,
though not case-sensitive. There's no reason to consider either upper or
lower case more canonical than the other.

The docs for getCanonicalPath say:

"A canonical pathname is both absolute and unique."

which I would interpret as meaning that there can be only one canonical
path for a given file.

But on the other hand:

System.out.println((new File("C:\\foo.txt")).getCanonicalPath());
System.out.println((new File("C:\\FOO.txt")).getCanonicalPath());

outputs:

C:\foo.txt
C:\FOO.txt

so apparently unique does not mean that.

It does if the file exists.

tom
 
