Re: Ascertaing uploaded file size

Discussion in 'Perl Misc' started by Peter J. Holzer, Apr 25, 2013.

  1. On 2013-04-25 14:17, Joey@still_Learning.invalid <Joey@still_Learning.invalid> wrote:
    > Ben Morrow wrote:
    >
    >>
    >>Quoth "Joey@still_Learning.invalid" <Joey@still_Learning.invalid>:
    >>> Ben Morrow wrote:
    >>> >Quoth "Joey@still_Learning.invalid" <Joey@still_Learning.invalid>:
    >>> >>
    >>> >> Indeed this is a piece of server-side code that accepts a multi-file
    >>> >> upload from a web browser.
    >>> >>
    >>> >> What I ended up doing was upload all files to the server and then using a
    >>> >> filename as a key, I search the server directory at $filepath and compare
    >>> >> the actual size of the uploaded file to the size of the file submitted for
    >>> >> upload:

    >>[...]
    >>> >
    >>> >If not, then your comparison still isn't accomplishing anything, because
    >>> >you still don't have access to the files the client was trying to
    >>> >upload.
    >>>
    >>> HTML5 Forms and client-side javascript enable the capture of filenames,
    >>> size and other parameters of the multiple files selected for upload and
    >>> submitted to the perl script. My perl script has all the information about
    >>> the files intended for upload and can easily compare that information with
    >>> the parameters of files stored on the server, e.g., file size.

    >>
    >>OK, so where are these file sizes sent from the client?

    >
    > Are you trying to pull my leg?
    >
    > Files are selected by the user and are submitted to the server:


    Where are the file sizes which are submitted to the server IN YOUR CODE?

    You haven't shown us any code which retrieves the parameters containing
    those file sizes. Instead you have shown us this:

    for (my $j = 0; $j < $imgCnt; $j++) {
    $filename = $uploadFiles[$j];
    $filesize = -s $filename

    At this point $filesize doesn't contain a size which was possibly
    captured by some javascript code on the client and submitted to the
    server. It contains the size of the file $uploadFiles[$j] ON THE SERVER.

    As far as I can see you haven't shown us how @uploadFiles is populated
    either, so we don't know what's in that array. It might be a
    relative or absolute path name or it might be a file handle. It might
    refer to a temporary file created by your request parsing module (I
    think that's another piece of information you've neglected to tell us)
    or it might refer to one of the files at their final destination.



    >>You do realise, don't you, that the code you have posted so far doesn't
    >>upload anything,


    True.

    >>it just copies a file from one place on the server to
    >>another?


    Also true.

    >>The actual upload is being done by the webserver, or CGI.pm, or
    >>something;


    Um, I'd quibble with your terminology here. The server doesn't upload
    anything. It's the client which does the uploading. The server just
    receives the uploaded data (and may pass it on to a CGI script or
    something).

    >>since you still haven't explained properly how you are
    >>running this it's hard to tell. In fact, you almost certainly don't need
    >>to copy the file at all: since it's already been written out on the
    >>server you can probably just rename it to where you want it to live in
    >>the end.


    I don't think it's safe to assume that the temporary files and the final
    destination are on the same file system. (Of course File::Copy::move
    will take care of that detail).
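
As a minimal sketch of that point, assuming the path of the temporary file is known: File::Copy's move() renames the file when it can and otherwise copies it and deletes the original, so it also works across file systems. The helper name `move_upload` and its arguments are hypothetical.

```perl
use strict;
use warnings;
use File::Copy qw(move);

# Move a received upload to its final place. move() renames when it can
# and otherwise copies the data and removes the source.
sub move_upload {
    my ($tmp_path, $final_path) = @_;
    move($tmp_path, $final_path)
        or die "Cannot move $tmp_path to $final_path: $!";
    return -s $final_path;    # size of the file at its destination
}
```

The returned size is still a server-side size; it says nothing about the file on the client.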


    > I realize nothing of the sort.


    I guess that's part of the problem.

    > I have files on my desktop machine that are
    > being successfully uploaded and tested for size.


    Yes, but not in the code you've shown us. Both uploading and receiving
    the upload are happening in code you haven't shown us.


    > Here's a snippet of the client HTML code:


    Ah, now we are finally getting to see the relevant code (or at least
    part of it).


    ><div>
    ><form id='form2' name='form2' action='cgi-bin/image_uploadtest.pl'
    > method='post' enctype='multipart/form-data' onSubmit='sendItP2()'>
    >
    ><input type='file' name='myFile0[]' id='myFile0' multiple='multiple'
    > min='1' max='1000'>
    >
    ><input type=hidden name='fileArray' value=' '>
    >
    ></div>
    ></form>
    >
    > A js event listener is employed:
    >
    > fileselect = document.getElementById("myFile0");
    > fileselect.addEventListener("change", FileSelectHandler, false);
    >
    > where (in part)::
    >
    > function FileSelectHandler(e) {
    > var files = e.dataTransfer.files;
    > document.forms["form2"].elements["fileArray"].value = files;
    > imgCnt = files.length;
    > document.forms["form2"].elements["imgCnt"].value = imgCnt;
    > }


    I don't see where this code "captures ... the size ... of the multiple
    files", as you claimed above. It does capture the number of files,
    though.


    > Files are selected by the user and passed to javascript function
    > 'sendItP2(),' which submits the form: document.forms['form2'].submit();
    >
    > At the server side the information submitted is passed:
    >
    > $filePath = path to server folder where files are to be stored;
    > $imgCnt = param('imgCnt');
    > @uploadFiles = param('myFile0[]');
    > @upload_filehandles = $query->upload("myFile0[]");
    > @filesize = ();


    Ok, from the sub names param and upload I gather that you are using the
    CGI module (The mixture of function and method calls is curious, though).

    The value(s) returned by param() are magic: They contain both the
    file name sent by the client and an open file handle to the temporary
    file created by CGI.pm (same as upload()).

    > chdir $filePath or die
    >
    > for (my $i=0;$i<$imgCnt;$i++) {
    > $filename = $uploadFiles[$i];
    > $filesize = -s $filename;


    So here you apply -s to one of these dual-valued variables. That might
    either try to determine the size of a local file with the same name as
    the one submitted by the client (and as you changed directory to
    $filePath that would probably be a previously uploaded one) or the size
    of the temporary file which the filehandle refers to. Without having
    tried it I think the latter is more likely.

    In no case does this determine the size of a file on the client. Except
    of course, that the temporary file on the server should have the same
    contents (and size) as the file on the client, but that's the thing you
    are trying to test, so you can't assume it.
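
One way to sidestep that ambiguity is to apply -s explicitly to a plain file handle or to a plain path. The sketch below uses only core modules; the temporary file merely stands in for the one CGI.pm would create, and nothing in it measures anything on the client.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Stand-in for a received upload: a temporary file with known content.
my ($out, $path) = tempfile(UNLINK => 1);
binmode $out;
print {$out} "x" x 1234;
close($out) or die "Cannot close $path: $!";

open(my $in, '<', $path) or die "Cannot open $path: $!";
my $size_by_handle = -s $in;      # size of the file the handle refers to
my $size_by_name   = -s $path;    # size of the file with this name
close($in);

print "handle: $size_by_handle, name: $size_by_name\n";
```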

    > push(@filesize, $filesize);
    > $upload_filehandle = $upload_filehandles[$i];
    > $filename =~ s/.*[\/\\](.*)/$1/;
    > open (UPLOADFILE, ">$filePath/$filename") or die...
    > binmode UPLOADFILE;
    > while (<$upload_filehandle>) {
    > print UPLOADFILE;
    > }
    > close UPLOADFILE or die "Can't Close. $!";
    > }
    >
    > At this point the selected files have all (ostensibly) been uploaded to
    > $filepath.


    More correctly: They have been uploaded to a temporary location before
    your loop even started. In the loop you copied them from this temporary
    location to $filepath.
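
If a copy is needed at all, a version that reads binary blocks and checks every write might look like the sketch below. It assumes an already open input handle such as the one upload() returns; the name `copy_handle_to_file` and the block size are hypothetical choices.

```perl
use strict;
use warnings;

# Copy everything from an already open input handle to $dest_path in
# fixed-size binary blocks, dying on any read, write, or close error
# (a full disk typically surfaces at print or close).
sub copy_handle_to_file {
    my ($in, $dest_path) = @_;
    binmode $in;
    open(my $out, '>', $dest_path) or die "Cannot open $dest_path: $!";
    binmode $out;
    my $total = 0;
    while (1) {
        my $n = read($in, my $buf, 65536);
        die "Read error: $!" unless defined $n;
        last if $n == 0;
        print {$out} $buf or die "Cannot write $dest_path: $!";
        $total += $n;
    }
    close($out) or die "Cannot close $dest_path: $!";
    return $total;    # number of bytes copied
}
```

Reading with read() instead of the line-oriented <$fh> avoids treating binary image data as text lines.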

    > Next, comes the comparison of file sizes (which hasn't yet been
    > corrected re your input):
    >
    > for (my $j = 0; $j < $imgCnt; $j++) {
    > $filename = $uploadFiles[$j];
    > $filesize = $filesize[$j];
    > opendir DH,$filePath or die "Can't opendir $filePath $!";
    > while (my $serverFileName = readdir DH) {
    > $serverFileSize = -s $serverFileName;
    > if (($serverFileName eq $filename) && ($filesize ==
    > $serverFileSize)) {
    > $numImagesReceived++;
    > }
    > }


    Here you either compare the sizes of the temporary files to the final
    files (should be the same unless the disk is full (you should check that
    above) or a user is simultaneously uploading another file with the same
    name) or the size of the final file before and after the upload (that
    would generally not be the same so you would notice this).

    You do not compare to anything on the client. If the upload was
    interrupted or garbled you wouldn't notice it here (though there's a
    good chance that either the web server or CGI.pm would catch the problem
    and wouldn't even let you get that far).
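
A check against client-declared sizes could look like the sketch below. It assumes the client script sends each file's size in an extra form field (for example taken from the File API's `size` property) and that those values have already been read into a hash keyed by file name. The hash layout and the name `size_mismatches` are hypothetical; none of the code posted in this thread does this.

```perl
use strict;
use warnings;

# %$declared: file name => size in bytes as reported by the client.
# $dir: directory holding the received files.
# Returns the names whose size on the server differs or which are missing.
sub size_mismatches {
    my ($declared, $dir) = @_;
    my @bad;
    for my $name (sort keys %$declared) {
        my $actual = (stat "$dir/$name")[7];    # undef if stat fails
        push @bad, $name
            unless defined $actual && $actual == $declared->{$name};
    }
    return @bad;
}
```

An empty return list means every declared file arrived with the declared size. A size match alone does not prove the contents are intact.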

    >>What's much more interesting is whether the file
    >>written out by whatever actually did the upload is the same size as the
    >>file the client sent; in order to test that, you need to compare a -s on
    >>that file with a size sent by the client.

    >
    > Yes, which is exactly what I believe I'm doing.


    To believe is to not know.

    >>I'm beginning to suspect the code you are running is not the same as the
    >>code you are posting, since the code you are posting doesn't refer to
    >>any client-submitted file sizes at all. Please only post code *you have
    >>actually run*.

    >
    > With the exception of code that is not related to the upload itself, you
    > have the exact code I'm running, above.


    Now we have it (well, part of it, but we can guess at the rest). Before
    this posting we had ONLY code NOT related to the upload itself.


    > I contend that the above code properly submits files selected for upload,
    > ascertains the file sizes, uploads the file to $filePath, and compares the
    > uploaded file sizes with the selected file sizes.
    >
    > I have tested this by hard-coding file names and sizes to simulate form
    > submission, eliminating the upload process,


    How can you test something about the upload process by eliminating the
    upload process? You've eliminated the very thing you claim you are
    trying to test!

    > and comparing hard coded parameters with the files previously uploaded
    > and stored on the server. If the file size of the uploaded files is
    > different from the hard-coded values, I detect the error. It works
    > albeit with the issues you originally highlighted.


    The purpose of your code is to detect an error during upload. You
    don't know that your code works until you have successfully detected such
    an error (and even then you know only that you can detect that specific
    error).

    hp


    --
    _ | Peter J. Holzer | The curse of electronic word processing:
    |_|_) | Sysadmin WSR | you keep filing away at your text until
    | | | | the parts of the sentence no longer
    __/ | http://www.hjp.at/ | fits together. -- Ralph Babel
    Peter J. Holzer, Apr 25, 2013
    #1

  2. Dave Saville

    On Thu, 25 Apr 2013 23:09:31 UTC, Ben Morrow <> wrote:
    >
    > I was going to get to that next, once we'd got to the point of
    > distinguishing between 'files on the server' and 'files on the client'.
    > In any case, you can usually arrange for the uploads to be spooled to a
    > directory of your choosing, so that a rename is sufficient. There's no
    > point copying the file from one filesystem to another if you don't need
    > to.


    Hi Ben

    Please, how do you get it to spool where you want? I am troubled by
    left overs sometimes. It seems to default to the directory the CGI
    script is in.

    --
    Regards
    Dave Saville
    Dave Saville, Apr 26, 2013
    #2

  3. Ben Morrow <> writes:
    > Quoth "Dave Saville" <>:
    >> On Thu, 25 Apr 2013 23:09:31 UTC, Ben Morrow <> wrote:
    >> >
    >> > I was going to get to that next, once we'd got to the point of
    >> > distinguishing between 'files on the server' and 'files on the client'.
    >> > In any case, you can usually arrange for the uploads to be spooled to a
    >> > directory of your choosing, so that a rename is sufficient. There's no
    >> > point copying the file from one filesystem to another if you don't need
    >> > to.

    >>
    >> Please, how do you get it to spool where you want? I am troubled by
    >> left overs sometimes. It seems to default to the directory the CGI
    >> script is in.

    >
    > You're using CGI.pm? You set $CGITempFile::TMPDIRECTORY. (Yes, CGI.pm
    > stomps on other random top-level namespaces, including Fh.)


    The documented way to achieve this is to set the environment variable
    TMPDIR to a suitable value.
    Rainer Weikusat, Apr 26, 2013
    #3
  4. CGI.pm file upload locations (was: Ascertaing uploaded file size)

    "Dave Saville" <> writes:
    > On Thu, 25 Apr 2013 23:09:31 UTC, Ben Morrow <> wrote:
    >> I was going to get to that next, once we'd got to the point of
    >> distinguishing between 'files on the server' and 'files on the client'.
    >> In any case, you can usually arrange for the uploads to be spooled to a
    >> directory of your choosing, so that a rename is sufficient. There's no
    >> point copying the file from one filesystem to another if you don't need
    >> to.

    >
    > Hi Ben
    >
    > Please, how do you get it to spool where you want? I am troubled by
    > left overs sometimes. It seems to default to the directory the CGI
    > script is in.


    Quoting the CGI.pm documentation:

    The temporary directory is selected using the following
    algorithm:

    1. if the current user (e.g. "nobody") has a directory named
    "tmp" in its home directory, use that (Unix systems only).

    2. if the environment variable TMPDIR exists, use the location
    indicated.

    3. Otherwise try the locations /usr/tmp, /var/tmp, C:\temp,
    /tmp, /temp, ::Temporary Items, and \WWW_ROOT.

    Each of these locations is checked that it is a directory and
    is writable. If not, the algorithm tries the next choice.
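
The fallback part of that algorithm (steps 2 and 3 together with the directory and writability check) amounts to a first-match search. The following is a sketch of the idea only, not CGI.pm's actual code; `first_usable_dir` is a hypothetical name and the candidate list is shortened.

```perl
use strict;
use warnings;

# Return the first candidate that is an existing, writable directory,
# or undef if none qualifies.
sub first_usable_dir {
    my @candidates = @_;
    for my $dir (@candidates) {
        next unless defined $dir && length $dir;
        return $dir if -d $dir && -w $dir;
    }
    return undef;
}

my $tmpdir = first_usable_dir($ENV{TMPDIR}, '/usr/tmp', '/var/tmp', '/tmp', '/temp');
print defined $tmpdir ? "would spool to $tmpdir\n" : "no usable directory\n";
```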

    Another option is to pass an 'upload hook' subroutine to the CGI.pm
    constructor (or use the upload_hook function). This looks like this:

    my $cgi = CGI->new(\&hook, undef, undef);


    The 2nd argument is an optional 'data item' which will be passed to
    the upload_hook routine. It is usually more convenient to use a
    closure with direct access to any necessary 'data item' instead. The
    3rd argument specifies whether CGI.pm should automatically create a
    temporary file (default is 'yes'). If this is set to false, no
    temporary file will be created. The 'upload hook' routine is going to
    be invoked with the remote filename, a buffer containing a block of
    file data, the amount of data in this buffer (someone had a serious
    "Am I not writing C ATM?" problem here ...) and the 'data item' which
    was specified when the hook was created. The 'upload hook' subroutine
    can then do whatever it wants to do with the file data, including
    writing it to a real file with any name and in any location the CGI
    process can write to.
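
A closure-based hook along those lines might look like the sketch below. It relies only on the argument order described above (remote filename, buffer, byte count, data item); the names `make_upload_hook` and `$finish` and the destination directory are hypothetical. The hook is plain Perl and can be exercised without CGI.pm, and the constructor call appears only as a comment.

```perl
use strict;
use warnings;

# Build a hook that appends each block to "$dest_dir/<base name>".
# Returns the hook and a finisher that closes all files and reports
# the number of bytes written per file.
sub make_upload_hook {
    my ($dest_dir) = @_;
    my (%fh, %bytes);
    my $hook = sub {
        my ($filename, $buffer, $bytes_read, $data) = @_;
        (my $base = $filename) =~ s{.*[/\\]}{};    # strip any client path
        die "Refusing file name '$filename'"
            if $base eq '' || $base =~ /^\.\.?$/;
        unless ($fh{$base}) {
            open($fh{$base}, '>', "$dest_dir/$base")
                or die "Cannot open $dest_dir/$base: $!";
            binmode $fh{$base};
        }
        print { $fh{$base} } substr($buffer, 0, $bytes_read)
            or die "Cannot write $dest_dir/$base: $!";
        $bytes{$base} += $bytes_read;
    };
    my $finish = sub {
        for my $base (keys %fh) {
            close($fh{$base}) or die "Cannot close $dest_dir/$base: $!";
        }
        %fh = ();
        return %bytes;
    };
    return ($hook, $finish);
}

# With CGI.pm this would be wired up roughly as follows (not executed here):
#   my ($hook, $finish) = make_upload_hook('/some/upload/dir');
#   my $cgi = CGI->new($hook, undef, 0);   # false 3rd argument: no temp file
```

Because the files are written directly to their destination, no later copy or rename is needed. Calling the finisher after the request has been parsed closes the files and returns the byte counts.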
    Rainer Weikusat, Apr 26, 2013
    #4
  6. Ben Morrow <> writes:
    > Quoth Rainer Weikusat <>:
    >> Ben Morrow <> writes:
    >> >
    >> > You're using CGI.pm? You set $CGITempFile::TMPDIRECTORY. (Yes, CGI.pm
    >> > stomps on other random top-level namespaces, including Fh.)

    >>
    >> The documented way to achieve this is to set the environment variable
    >> TMPDIR to a suitable value.

    >
    > That affects more than just CGI.pm. The CGI.pm documentation (version
    > 3.59) says
    >
    > | The temporary directory is selected using the following algorithm:
    > |
    > | 1. if $CGITempFile::TMPDIRECTORY is already set, use that
    > |
    > | 2. if the environment variable TMPDIR exists, use the location
    > | indicated.
    > |
    > | 3. Otherwise try the locations /usr/tmp, /var/tmp, C:\temp,
    > | /tmp, /temp, ::Temporary Items, and \WWW_ROOT.
    > |
    > | Each of these locations is checked that it is a directory and is
    > | writable. If not, the algorithm tries the next choice.


    I quoted this from 3.43.
    Rainer Weikusat, Apr 26, 2013
    #6
  7. Dave Saville

    On Fri, 26 Apr 2013 14:04:54 UTC, Ben Morrow <> wrote:

    ><snip>


    > That affects more than just CGI.pm. The CGI.pm documentation (version
    > 3.59) says
    >
    > | The temporary directory is selected using the following algorithm:
    > |
    > | 1. if $CGITempFile::TMPDIRECTORY is already set, use that
    > |
    > | 2. if the environment variable TMPDIR exists, use the location
    > | indicated.
    > |
    > | 3. Otherwise try the locations /usr/tmp, /var/tmp, C:\temp,
    > | /tmp, /temp, ::Temporary Items, and \WWW_ROOT.
    > |
    > | Each of these locations is checked that it is a directory and is
    > | writable. If not, the algorithm tries the next choice.


    Interesting, TMPDIR exists but OTOH it is not explicitly passed by
    Apache PassEnv. Never figured out why some are passed by default and
    others you have to specify. But I have certainly seen files called
    CGItmpnnn or similar appearing in the cgi-bin directories. Yet another
    OS/2 perl 5.8.2 funny I expect :)

    I may have a play with the first one.
    --
    Regards
    Dave Saville
    Dave Saville, Apr 26, 2013
    #7
  8. Dave Saville

    On Fri, 26 Apr 2013 12:07:22 UTC, Ben Morrow <> wrote:

    > You're using CGI.pm? You set $CGITempFile::TMPDIRECTORY. (Yes, CGI.pm
    > stomps on other random top-level namespaces, including Fh.)
    >


    Hi Ben - Would that be before or after calling any CGI.pm modules?

    TIA
    --
    Regards
    Dave Saville
    Dave Saville, Apr 27, 2013
    #8
  9. "Dave Saville" <> writes:
    > On Fri, 26 Apr 2013 12:07:22 UTC, Ben Morrow <> wrote:
    >
    >> You're using CGI.pm? You set $CGITempFile::TMPDIRECTORY. (Yes, CGI.pm
    >> stomps on other random top-level namespaces, including Fh.)
    >>

    >
    > Hi Ben - Would that be before or after calling any CGI.pm modules?


    That would be after loading them and before processing any user data
    which may contain file uploads.
    Rainer Weikusat, Apr 27, 2013
    #9
  10. Dave Saville

    On Sat, 27 Apr 2013 14:01:22 UTC, Ben Morrow <> wrote:

    >
    > Quoth "Dave Saville" <>:
    > > On Fri, 26 Apr 2013 12:07:22 UTC, Ben Morrow <> wrote:
    > >
    > > > You're using CGI.pm? You set $CGITempFile::TMPDIRECTORY. (Yes, CGI.pm
    > > > stomps on other random top-level namespaces, including Fh.)

    > >
    > > Hi Ben - Would that be before or after calling any CGI.pm modules?

    >
    > I would imagine it would be before creating a CGI object; that is,
    > either before the call to CGI->new or before the first call to one of
    > the functional interface functions (which construct an object behind the
    > scenes). (I don't use CGI.pm, so I'm not an expert. If I were handling
    > upload requests without a framework I would probably use HTTP::Body.)
    >


    Yes it is. I now have both my upload scripts using it - Saving a file
    copy. Thanks so much.
    --
    Regards
    Dave Saville
    Dave Saville, Apr 27, 2013
    #10
