Parsing FileName for upload

Tony McGuire

If a user selects a file on a Windows box, with IE at least, the FULL
PATH of the file on the user's system is transmitted to the server.

If the user does the same thing using Opera on a Linux box, and
apparently Firebird, then only the specific filename gets transmitted
to the server.

I've been going batty trying to figure out a routine that will detect
when there is a full path sent and parse the file name from that path,
and when there is only a file name sent.

I would dearly appreciate anyone who can help with this. I found many
references to parsing filenames, but nothing I could translate or make
work for this specific situation. Although I only went thru a couple
dozen posts, it's true.
 
Zebee Johnstone

In comp.lang.perl.misc on 24 Aug 2004 14:49:59 -0700
Tony McGuire said:
I've been going batty trying to figure out a routine that will detect
when there is a full path sent and parse the file name from that path,
and when there is only a file name sent.

Can you give examples of the strings you expect?

Don't forget IE on Mac :) What strings does that send?

What problems did you have parsing the strings?

Zebee
 
Brian McCauley

Tony said:
If a user selects a file on a Windows box, with IE at least, the FULL
PATH of the file on the user's system is transmitted to the server.

If the user does the same thing using Opera on a Linux box, and
apparently Firebird, then only the specific filename gets transmitted
to the server.

I've been going batty trying to figure out a routine that will detect
when there is a full path sent and parse the file name from that path,
and when there is only a file name sent.

I would dearly appreciate anyone who can help with this. I found many
references to parsing filenames, but nothing I could translate or make
work for this specific situation.

Your problem is fundamentally insoluble. This, of course, is not related
to Perl.

Your problem can be restated as "given a string in which one or more
unspecified characters may act as delimiters extract the portion after
the last delimiter".

In practice, however, it's usually a valid assumption that the delimiter
is either '/' or '\' and the other one won't appear, so you can use
/([^\\\/]+$)/.
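A minimal sketch of the regex approach above, assuming '/' and '\' are the only delimiters a browser will send. The helper name client_basename is hypothetical. It returns the text after the last delimiter, or the whole string when no delimiter is present, so the same call covers both the full-path and the bare-filename case.

```perl
use strict;
use warnings;

# Hypothetical helper: return the part of a client-supplied name that
# follows the last '/' or '\'. Returns undef if nothing follows it.
sub client_basename {
    my ($name) = @_;
    return $name =~ m{([^\\/]+)\z} ? $1 : undef;
}

for my $sent ('C:\Documents and Settings\Tony\pic1.jpg',
              '/home/tony/pic1.jpg',
              'pic1.jpg') {
    print client_basename($sent), "\n";    # pic1.jpg each time
}
```

A name that ends in a delimiter yields undef, which the caller should treat as an error rather than as an empty filename.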
 
Richard Gration

If a user selects a file on a Windows box, with IE at least, the FULL
PATH of the file on the user's system is transmitted to the server. If
the user does the same thing using Opera on a Linux box, and apparently
Firebird, then only the specific filename gets transmitted to the
server.

Here are some snippets from code I wrote to deal with this exact problem:

use CGI qw(&param &cookie &upload &url_param &uploadInfo);
use File::Basename;

my ($error,$flag);
my @img_file_suffices = qw(jpg jpeg gif png bmp);
my $isWin = ($ENV{HTTP_USER_AGENT} =~ /Windows/);

my $fh = upload('img_file');
my $fn = param('img_file');
my $fi = uploadInfo($fn);
my $user_dir = get_upload_dir_for_user($c,$i);
if ($fi->{'Content-Type'} =~ /^image/) {
    fileparse_set_fstype("MSWin32") if ($isWin);
    my ($fn_name,$fn_path,$fn_suffix) = fileparse($fn,@img_file_suffices);
    fileparse_set_fstype("Unix") if ($isWin);
    $fn = $fn_name . $fn_suffix;
    my $index = 0;
    my $path = "$ENV{DOCUMENT_ROOT}/$user_dir";
    my $filename = "$i->{ui}->{uid}_${index}_$fn";
    while (-e "$path/$filename") {
        $index++;
        $filename = "$i->{ui}->{uid}_${index}_$fn";
    }

    open (IMG,">$path/$filename") or die MyApp::Error->new(403,qq(Error writing file "$path/$filename": $!));
    while (<$fh>) {print IMG}
    close (IMG);
    $i->{img_src_link} = "/$user_dir/$filename";
} else {
    $error++;
    $flag = 'ERROR_NOT_AN_IMAGE_FILE';
}

HTH
Rich
 
Tore Aursand

I've been going batty trying to figure out a routine that will detect
when there is a full path sent and parse the file name from that path,
and when there is only a file name sent.

Use the File::Basename module. With that module, you can easily extract
the various parts of a filename (i.e. path and the filename itself), and
then compare it to the original string.

BTW: Why do you need to know _if_ there is a full path present?
 
Trey Waters

Use the File::Basename module. With that module, you can easily extract
the various parts of a filename (i.e. path and the filename itself), and
then compare it to the original string.

BTW: Why do you need to know _if_ there is a full path present?

Assuming the OP is on a *nix system, just laying down the file with the
filename given by the browser would create a file with a name similar to:

"C:\Documents and Settings\Administrator\Desktop\MyFilename.blah"

Instead of what one expects: "MyFilename.blah"
 
Tore Aursand

Assuming the OP is on a *nix system, just laying down the file with the
filename given by the browser would create a file with a name similar
to:

"C:\Documents and Settings\Administrator\Desktop\MyFilename.blah"

Instead of what one expects: "MyFilename.blah"

I'm fully aware of that, but that wasn't my question, was it?

"Why do you need to know _if_ there is a full path present?"

Maybe it's my English that's bad, but I think that the OP really wants a
way to extract only the filename from a full path.

My suggestion was to use File::Basename; it is easy to deal with, cross
platform, and (if I remember correctly) comes with the standard Perl
distribution.

In other words: No need to reinvent the wheel.
 
Tony McGuire

Tore Aursand said:
BTW: Why do you need to know _if_ there is a full path present?

The examples I've found expect to see either '/' or '\'. Then they
grab the last portion as the file name.

A user with Opera on Linux selects a file on their system, and
something like '/home/name/file.jpg' is entered into the text block of
which file to upload.

On the server, all that arrives is 'file.jpg'.

If you parse out and take the second portion based on '/' or '\', you
get a blank. Which means the system sees only the directory you were
going to place the file in, and prevents writing to a file name that
is the directory.

I've even tried replacing the variable I'm using as the file name with
the full value received. And this also fails; this portion could be
an error on my part but I've gone over it many times and haven't found
where I'm doing anything wrong.
 
Tore Aursand

The examples I've found expect to see either '/' or '\'. Then they grab
the last portion as the file name. [...]

Still, I don't understand. Why can't you use File::Basename? Am I
misunderstanding something here?
 
jon

#!/usr/bin/perl -w

$file='C:\Documents and Settings\My Pictures\pic1.jpg';

$file=~s/.*[\/\\](.*)/$1/;
# result: $file="pic1.jpg"

:)
 
Tore Aursand

#!/usr/bin/perl -w

$file='C:\Documents and Settings\My Pictures\pic1.jpg';

$file=~s/.*[\/\\](.*)/$1/;
# result: $file="pic1.jpg"

Don't do this. What happens if the separator is something other
than '/' or '\'?

Use the File::Basename module, for God's sake!
 
Brian McCauley

Tore said:
#!/usr/bin/perl -w

$file='C:\Documents and Settings\My Pictures\pic1.jpg';

$file=~s/.*[\/\\](.*)/$1/;
# result: $file="pic1.jpg"


Don't do this. What happens if the separator is something other
than '/' or '\'?

Use the File::Basename module, for God's sake!

Bzzzt! That doesn't help. File::Basename parses (by default) according
to the local OS's syntax. Since we're talking here about filenames
coming from a remote computer running a potentially different OS, this
is worse than useless.
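A small sketch of the point above: the same client string parses differently depending on which filesystem rules File::Basename is told to use. The wrapper parse_as is a hypothetical name. It relies on fileparse_set_fstype() returning the previously active type so that the setting can be restored afterwards.

```perl
use strict;
use warnings;
use File::Basename qw(fileparse fileparse_set_fstype);

# Hypothetical wrapper: parse $path under the rules of $fstype, then
# restore whatever filesystem type was active before the call.
sub parse_as {
    my ($fstype, $path) = @_;
    my $previous = fileparse_set_fstype($fstype);
    my ($name, $dir) = fileparse($path);
    fileparse_set_fstype($previous);
    return ($name, $dir);
}

my $sent = 'C:\Documents and Settings\Tony\pic1.jpg';

for my $fstype ('Unix', 'MSWin32') {
    my ($name, $dir) = parse_as($fstype, $sent);
    print "$fstype: name=[$name] dir=[$dir]\n";
}
```

Under 'Unix' rules the backslashes are not separators, so the whole string comes back as the name. Under 'MSWin32' rules only pic1.jpg does. A server that never calls fileparse_set_fstype() gets whichever behaviour matches its own $^O.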
 
Tore Aursand

Bzzzt! That doesn't help. File::Basename parses (by default) according
to the local OS's syntax. Since we're talking here about filenames
coming from a remote computer running a potentially different OS, this
is worse than useless.

Damn! You're so right. My fault for not thinking of that, of course. But
surely there must be a module that is safer than the proposed approach?
 
Tony McGuire

Tore Aursand said:
The examples I've found expect to see either '/' or '\'. Then they grab
the last portion as the file name. [...]

Still, I don't understand. Why can't you use File::Basename? Am I
misunderstanding something here?

I've not tried this yet.

I'll have to investigate to see if I have what is needed; otherwise
I'll get it...I really need this to work reliably.

By the way, my 'server' is on W2k Pro. And I'm running Apache
2.something.

Thanks Tore, and everyone else taking the time to try to help me. As
a newbie to Perl, it is comforting to know there is a place to ask for
and get help.
 
Gunnar Hjalmarsson

Tony said:
THANK YOU for pointing this out.

Preliminary results are *perfect*!

The filename is correctly parsed both when sent from an IE/Windows
machine as well as when sent from the Opera/Linux machine.

I never wanted the remote path on the user's machine, but IE sends
it anyway.

File::Basename and fileparse() get the correct information every
time.

Please read the rest of the thread to find out why File::Basename is
*not* an adequate solution to your problem.
 
Tony McGuire

Tore Aursand said:
Still, I don't understand. Why can't you use File::Basename? Am I
misunderstanding something here?

Tore,

THANK YOU for pointing this out.

Preliminary results are *perfect*!

The filename is correctly parsed both when sent from an IE/Windows
machine as well as when sent from the Opera/Linux machine.

I never wanted the remote path on the user's machine, but IE sends it
anyway.

File::Basename and fileparse() get the correct information every
time.

Again, thanks for responding as well as for your patience.
 
Tony McGuire

Brian McCauley said:
Bzzzt! That doesn't help. File::Basename parses (by default) according
to the local OS's syntax. Since we're talking here about filenames
coming from a remote computer running a potentially different OS, this
is worse than useless.

Well, initial testing indicates that File::Basename does work - at
least in my situation.

I have Apache running on Windows.

That routine correctly parses the [remote] path, name and extension
from a system running on Linux with Opera.

While the path is '.\', that's OK since I'm throwing out the remote
path anyway.

And the filename and extension do parse correctly.

In reading up on the fileparse() routine, it says that if you don't
set the OS type, the routine uses the incoming connection and attempts
to set the OS-type based on that.

At least in limited testing, it correctly parses Opera/Linux as well
as IE/Windows.

So, while not perfect perhaps (time will tell), it appears to be
working.

Please correct my findings if they aren't correct. I'd rather do more
work now and put up a working system than have it fall apart after
people are trying to use it.
 
Tore Aursand

THANK YOU for pointing this out.

No. Please don't thank me, as I was severely wrong. :) I think I must
have been thinking "backwards" or something, 'cause File::Basename will
only work (correctly) for the server OS (i.e. not the client's OS, which is
the important part here).

Still, I'm quite puzzled by the fact that there doesn't seem to be a module
(or a reasonable way) to solve this. Why is that?

One approach which _might_ work, is to read the client's HTTP_USER_AGENT
and try to figure out what OS the client is running, and then feed
File::Basename with that information.

It should work, but I don't know how "valid" this "solution" is.
 
Tore Aursand

Well, initial testing indicates that File::Basename does work - at least
in my situation.

[...]

In reading up on the fileparse() routine, it says that if you don't set
the OS type, the routine uses the incoming connection and attempts to
set the OS-type based on that.

Where does it say that? The documentation clearly states that if one
hasn't set the filesystem type (using 'fileparse_set_fstype()'), the
'fileparse()' function will use $^O, which is a special variable
containing the current OS name.

It returns the OS name for the computer where the CGI script is running,
and not the OS name of the client machine that is requesting the CGI
script.

Please correct my findings if they aren't correct.

Try this simple script (untested):

#!/usr/bin/perl
#
use strict;
use warnings;
use Data::Dumper;
use File::Basename qw( fileparse );

my @paths = ('c:\Documents and Settings\username\filename.ext',
             '/home/username/filename.ext',
             'Doc_Root:[username]:filename.ext');

foreach ( @paths ) {
    my @info = fileparse( $_ );
    print Dumper( \@info );
}

Gunnar: Please respect my use of Data::Dumper in this case also. :)
 
ctcgag

Tore Aursand said:
No. Please don't thank me, as I was severely wrong. :) I think I must
have been thinking "backwards" or something, 'cause File::Basename will
only work (correctly) for the server OS (i.e. not the client's OS, which
is the important part here).

I'm not sure that that is so important. I assume that he is using the
basename as the basis of a filename to store the file locally. In that
case, the most important thing is probably that the basename that you
obtain doesn't contain things that are directory separators on the server's
filesystem. Whether or not they are separators on the client's filesystem
is probably of lesser importance. If some bizarre OS uses q as a separator
and tells me the file is 'qetcqpasswd', and I leave it like that, well,
that is no skin off my nose.

Still, I'm quite puzzled by the fact that there doesn't seem to be a module
(or a reasonable way) to solve this. Why is that?

Because you generally let the server do the server's job and the client do
the client's job. Why make the server gratuitously psychoanalyze the
client?

If all you want is something to name the file which serves
as a mnemonic for the person who loaded it, then I'd use something like:

$name =~ /([a-zA-Z0-9._]+)$/ or die "Bad name $name";
my $newname=$1;

Xho
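The whitelist idea above can be wrapped in a small routine. This is only a sketch: safe_upload_name is a hypothetical name, and rejecting names that start with a dot (such as ".." or ".htaccess") is an added assumption, not something stated in the post.

```perl
use strict;
use warnings;

# Hypothetical helper: derive a name that is safe to store on the server
# from whatever the browser sent. Returns nothing if no usable name exists.
sub safe_upload_name {
    my ($sent) = @_;

    # Keep only the trailing run of whitelisted characters. Any separator
    # ('/', '\', ':') falls outside the whitelist, so a client path is cut off.
    my ($name) = $sent =~ /([a-zA-Z0-9._]+)\z/
        or return;

    # Assumption: names beginning with a dot are not wanted.
    return if $name =~ /^\./;

    return $name;
}

for my $sent ('C:\My Pictures\pic1.jpg', '/home/tony/pic1.jpg', 'pic1.jpg') {
    print safe_upload_name($sent), "\n";    # pic1.jpg each time
}
```

Because only the trailing whitelisted run is kept, a name such as "my photo.jpg" is shortened to "photo.jpg". That is acceptable if the stored name is just a mnemonic, but not if it must match the client's name exactly.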
 
