copying files

Discussion in 'C Programming' started by Hans Vlems, Feb 16, 2012.

  1. Hans Vlems

    Hans Vlems Guest

    I'm maintaining large numbers of Adobe Reader files (.pdf). One of my
    programs, written in C (gcc 4.4.4), must make copies of these pdf
    files between different filesystems.
    There is AFAIK no library function that does this, which leaves me two
    options:
    1- use the console interface, i.e. build a command string and pass
    this to system().
    2- open the file, copy the contents and close the target
    I'd rather avoid option 1 because system() runs out of control of my
    program.
    My question is: which read and write functions are best suited to copy
    the (binary) pdf files?
    Performance is not the main objective, but I want to be sure that the
    copy finished successfully and accurately.
    Hans
     
    Hans Vlems, Feb 16, 2012
    #1

  2. On Feb 16, 9:03 am, Hans Vlems <> wrote:
    > I'm maintaing large numbers of Adobe Reader files (.pdf). One of my
    > programs, written in C (gcc 4.4.4), must make a copies between
    > different filesystems of these pdf files.
    > There is AFAIK no library function that does this, which leaves me two
    > options:
    > 1- use the console interface, i.e. build a command string and pass
    > this to system().
    > 2- open the file, copy the contents and close the target
    > I'd rather avoid option 1 because system runs out of control of my
    > program.
    > My question is what read and write functions are best suited to copy
    > the (binary) pdf files?
    > 'Performance is not the main objective, but I want to be sure that the
    > copy finished succesfully and accurately.
    > Hans


    /*
    Untested code

    Returns -2 if the input can't be opened, -3 if the output can't be
    opened, -1 on a read/write error (probably hardware problems).
    */
    int copy(const char *dest, const char *source)
    {
        FILE *fpin;
        FILE *fpout;
        int err;
        int ch;

        fpin = fopen(source, "rb2");
        if(!fpin)
            return -2;
        fpout = fopen(dest, "wb");
        if(!fpout)
        {
            fclose(fpin);
            return -3;
        }
        while( (ch = fgetc(fpin)) != EOF)
        {
            err = fputc(ch, fpout);
            if(err == EOF)
                goto error_exit;
        }
        /* if EOF was generated by a read error instead of end of file,
           feof is false */
        if(!feof(fpin))
            goto error_exit;
        fclose(fpin);
        /* we need to check that fclose flushes data to the destination
           correctly */
        err = fclose(fpout);
        if(err == EOF)
            return -1; /* both streams are already closed here, so
                          don't jump to error_exit and close them twice */
        return 0;

        /* read/write error */
    error_exit:
        fclose(fpin);
        fclose(fpout);
        return -1;
    }
     
    Malcolm McLean, Feb 16, 2012
    #2

  3. Malcolm McLean <> writes:
    [...]
    > fpin = fopen(source, "rb2");

    [...]

    Is "rb2" a typo?

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    Will write code for food.
    "We must do something. This is something. Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
     
    Keith Thompson, Feb 16, 2012
    #3
  4. Hans Vlems <> writes:
    > I'm maintaing large numbers of Adobe Reader files (.pdf). One of my
    > programs, written in C (gcc 4.4.4), must make a copies between
    > different filesystems of these pdf files.
    > There is AFAIK no library function that does this, which leaves me two
    > options:
    > 1- use the console interface, i.e. build a command string and pass
    > this to system().
    > 2- open the file, copy the contents and close the target
    > I'd rather avoid option 1 because system runs out of control of my
    > program.
    > My question is what read and write functions are best suited to copy
    > the (binary) pdf files?
    > 'Performance is not the main objective, but I want to be sure that the
    > copy finished succesfully and accurately.


    I'd just invoke the OS's command to copy the files ("cp" on
    Unix-like systems, "copy" on Windows, probably something else on
    other systems). It's likely to be at least as fast as anything
    you write yourself, and it may preserve metadata (permissions,
    etc.) that you're not going to be able to handle in your own code
    without considerable difficulty.

    I'm not sure why system running "out of control" of your program
    should be an issue; can you elaborate?

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    Will write code for food.
    "We must do something. This is something. Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
     
    Keith Thompson, Feb 16, 2012
    #4
  5. Hans Vlems

    JohnF Guest

    Hans Vlems <> wrote:
    > I'm maintaing large numbers of Adobe Reader files (.pdf). One of my
    > programs, written in C (gcc 4.4.4), must make a copies between
    > different filesystems of these pdf files.
    > There is AFAIK no library function that does this, which leaves me two
    > options:
    > 1- use the console interface, i.e. build a command string and pass
    > this to system().
    > 2- open the file, copy the contents and close the target
    > I'd rather avoid option 1 because system runs out of control of my
    > program.
    > My question is what read and write functions are best suited to copy
    > the (binary) pdf files?
    > 'Performance is not the main objective, but I want to be sure that the
    > copy finished succesfully and accurately.
    > Hans


    I'm guessing you're already aware of what Malcolm suggested
    in preceding followup, and it's not adequate. And since I've
    read your posts in comp.os.vms, I'm also guessing we're
    talking about an ods-2/5 filesystem at one end, and maybe a
    linux ext3, or whatever, at the other. Could you elaborate on
    that a little? And does linux have some ods-2/5 support so you
    can mount a vms disk? I wasn't aware of that. If you can indeed
    just mount it, then by all means try Malcolm's suggestion and
    see what happens. Should just work if the ods filesystem support
    is any good. Otherwise, how are you intending to even access
    the disk? Decnet for linux (note that it's no longer being
    very actively supported)? I think that would be the driving
    question that dictates an appropriate answer to your question.
    So you need to supply all that additional info first.
    By the way, I usually just ftp zipped files back and forth
    between vms and linux boxes on my soho lan. Despite your
    "out of control" issue, I'd just write a script (using C's
    system() if you want the script in C) to do the job, unless
    security's some really, really significant issue for your
    situation.
    --
    John Forkosh ( mailto: where j=john and f=forkosh )
     
    JohnF, Feb 16, 2012
    #5
  6. Hans Vlems

    Nobody Guest

    On Thu, 16 Feb 2012 01:03:12 -0800, Hans Vlems wrote:

    > I'm maintaing large numbers of Adobe Reader files (.pdf). One of my
    > programs, written in C (gcc 4.4.4), must make a copies between
    > different filesystems of these pdf files.
    > There is AFAIK no library function that does this, which leaves me two
    > options:
    > 1- use the console interface, i.e. build a command string and pass
    > this to system().


    Avoid system() unless executing a "canned" command supplied by the user.
    If you need to spawn a child process with specific arguments, use fork()
    and exec*() rather than attempting to construct a shell command.
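
    A minimal sketch of the fork()/exec*() approach, assuming a POSIX
    system with a "cp" utility on the PATH (so it would not apply to a
    plain Windows build). The helper name run_cp is hypothetical. The file
    names are passed as separate arguments, so no shell ever parses them.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run "cp -- source dest" without going through a shell.
   Returns cp's exit status (0 on success), or -1 if the child could not be
   started or did not exit normally. */
int run_cp(const char *source, const char *dest)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;                      /* fork failed */

    if (pid == 0) {
        /* Child: each argument is passed as-is, so no quoting is needed.
           "--" ends option parsing, so names starting with '-' are safe. */
        execlp("cp", "cp", "--", source, dest, (char *)NULL);
        _exit(127);                     /* only reached if exec failed */
    }

    /* Parent: wait for the child, retrying if interrupted by a signal. */
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;
    }
    if (!WIFEXITED(status))
        return -1;                      /* e.g. killed by a signal */
    return WEXITSTATUS(status);
}
```

    Unlike a bare system() call, the caller gets the command's exit status
    directly and can treat any non-zero value as a failed copy.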

    > 2- open the file, copy the contents and close the target


    First, you need to decide what you mean by "copy". Part of the reason that
    there isn't a library function is that there isn't a single obvious
    definition of what it means to copy a file. Two plausible choices are:

    1. open() the destination, write the contents of the source to it, close()
    it.

    2. open() a temporary file in the same directory as the destination,
    write the contents of the source to it, close() it, rename() it over the
    original.

    The two alternatives have many differences, including (but not limited to):

    1. If there are multiple hard links to the destination, #1 will leave all
    links intact, and all will refer to the modified file. #2 will break one
    specific hard link, causing the filename to point to a new file; the
    others will still refer to the original file.

    2. If the destination file is open in some other process, #1 will cause
    the process to immediately see the new contents, while #2 will only affect
    processes which open() the file after the rename() has occurred.

    3. #1 requires that you have write permission on the destination file if
    it exists, or write permission on the directory if it doesn't. #2 requires
    that you have write permission on the directory (the file's permissions
    don't matter); if the destination exists and the directory has the sticky
    bit set, you must own the file (or be root).

    4. #1 won't affect the owner, group or permissions of an existing file. #2
    will create a new file with your uid, primary gid and umask.

    5. If the destination exists and is a device (block or character) or FIFO,
    #1 will open() it and write to it. #2 will replace it with a file.

    6. If the destination exists and is a symlink, #1 will open() it (i.e.
    open its target) and write to it. #2 will replace it with a file.

    Note that the Unix "cp" command is similar to option #1, except that if
    the file exists but open()ing it fails and the "-f" flag is used, it
    attempts to unlink() it then, if that succeeds, proceeds as if the file
    didn't exist.

    > I'd rather avoid option 1 because system runs out of control of my
    > program.
    > My question is what read and write functions are best suited to copy
    > the (binary) pdf files?
    > 'Performance is not the main objective, but I want to be sure that the
    > copy finished succesfully and accurately.


    For robustness, choose option #2 above. If writing the file fails,
    remove() the temporary file rather than rename()ing it. The original file
    will be left intact.
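
    The temporary-file variant can be sketched in portable C as follows.
    This is only an illustration: the name copy_via_temp and the ".tmp"
    suffix are assumptions, and a real program would want a temporary name
    that cannot clash with an existing file. Note also that ISO C leaves
    rename() implementation-defined when the destination already exists;
    POSIX replaces it atomically, other platforms may refuse.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copy source to "<dest>.tmp", then rename that file
   over dest. Returns 0 on success, -1 on any failure; on failure the
   temporary file is removed and an existing dest is left untouched. */
int copy_via_temp(const char *source, const char *dest)
{
    char buf[4096];
    size_t n;
    int ok;
    FILE *in, *out;
    char *tmp = malloc(strlen(dest) + sizeof ".tmp");  /* sizeof counts '\0' */

    if (tmp == NULL)
        return -1;
    strcpy(tmp, dest);
    strcat(tmp, ".tmp");

    in = fopen(source, "rb");
    if (in == NULL) {
        free(tmp);
        return -1;
    }
    out = fopen(tmp, "wb");
    if (out == NULL) {
        fclose(in);
        free(tmp);
        return -1;
    }

    /* Copy block by block; stop early on a short write. */
    while ((n = fread(buf, 1, sizeof buf, in)) > 0) {
        if (fwrite(buf, 1, n, out) != n)
            break;
    }
    ok = !ferror(in) && !ferror(out);
    fclose(in);
    if (fclose(out) == EOF)     /* fclose flushes; failure means lost data */
        ok = 0;

    if (ok && rename(tmp, dest) != 0)
        ok = 0;
    if (!ok)
        remove(tmp);            /* never leave a partial copy behind */
    free(tmp);
    return ok ? 0 : -1;
}
```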

    Performance-wise, mmap()ing the source and write()ing directly from the
    mmap()d region eliminates a copy. mmap()ing both source and destination
    and memcpy()ing may or may not provide any additional benefit.
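
    A rough sketch of the first mmap() variant, assuming a POSIX system.
    The name copy_mmap and the 0644 creation mode are illustrative choices,
    not anything prescribed above. The source is mapped read-only and
    written out with write(), looping because write() may transfer fewer
    bytes than requested.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: copy source to dest by mmap()ing the source and
   write()ing straight from the mapping. Returns 0 on success, -1 on error. */
int copy_mmap(const char *source, const char *dest)
{
    struct stat st;
    int result = -1;
    int out;
    int in = open(source, O_RDONLY);

    if (in < 0)
        return -1;
    out = open(dest, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (out < 0) {
        close(in);
        return -1;
    }

    if (fstat(in, &st) == 0) {
        if (st.st_size == 0) {
            result = 0;             /* mmap() rejects a zero length */
        } else {
            size_t len = (size_t)st.st_size;
            char *map = mmap(NULL, len, PROT_READ, MAP_PRIVATE, in, 0);
            if (map != MAP_FAILED) {
                size_t done = 0;
                while (done < len) {
                    ssize_t n = write(out, map + done, len - done);
                    if (n < 0) {
                        if (errno == EINTR)
                            continue;   /* interrupted: just retry */
                        break;          /* real write error */
                    }
                    done += (size_t)n;
                }
                if (done == len)
                    result = 0;
                munmap(map, len);
            }
        }
    }
    close(in);
    if (close(out) < 0)
        result = -1;
    return result;
}
```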
     
    Nobody, Feb 16, 2012
    #6
  7. Hans Vlems

    Hans Vlems Guest

    On 16 feb, 10:49, JohnF <> wrote:
    > Hans Vlems <> wrote:
    > > I'm maintaing large numbers of Adobe Reader files (.pdf). One of my
    > > programs, written in C (gcc 4.4.4), must make a copies between
    > > different filesystems of these pdf files.
    > > There is AFAIK no library function that does this, which leaves me two
    > > options:
    > > 1- use the console interface, i.e. build a command string and pass
    > > this to system().
    > > 2- open the file, copy the contents and close the target
    > > I'd rather avoid option 1 because system runs out of control of my
    > > program.
    > > My question is what read and write functions are best suited to copy
    > > the (binary) pdf files?
    > > 'Performance is not the main objective, but I want to be sure that the
    > > copy finished succesfully and accurately.
    > > Hans

    >
    > I'm guessing you're already aware of what Malcolm suggested
    > in preceding followup, and it's not adequate. And since I've
    > read your posts in comp.os.vms, I'm also guessing we're
    > talking about an ods-2/5 filesystem at one end, and maybe a
    > linux ext3, or whatever, at the other. Could you elaborate on
    > that a little? And does linux have some ods-2/5 support so you
    > can mount a vms disk? I wasn't aware of that. If you can indeed
    > just mount it, then by all means try Malcolm's suggestion and
    > see what happens. Should just work if the ods filesystem support
    > is any good. Otherwise, how are you intending to even access
    > the disk? Decnet for linux (note that it's no longer being
    > very actively supported)? I think that would be the driving
    > question that dictates an appropriate answer to your question.
    > So you need to supply all that additional info first.
    >    By the way, I usually just ftp zipped files back and forth
    > between vms and linux boxes on my soho lan. Despite your
    > "out of control" issue, I'd just write a script (using C's
    > system() if you want the script in C) to do the job, unless
    > security's some really, really significant issue for your
    > situation.
    > --
    > John Forkosh ( mailto: where j=john and f=forkosh )


    John,
    your investigating powers are impressive! Unfortunately they've led
    you into a dead end street...
    On a VMS system I wouldn't have had the need to ask a question. VMS
    has an IO subsystem (RMS) and a neatly documented API.
    And I doubt I'd have used C to solve this problem ;-) since I have a
    choice of at least 4 other languages that I'm more
    comfortable with...
    The project I'm involved in runs on a Windows platform, on Citrix
    servers more precisely, and I have _no_ privileges on these
    systems. The reason I use the (old) DJGPP compiler is that it doesn't
    need a Windows install process that uses the registry.
    The command line interface on Windows doesn't even come close to what
    DCL has to offer. But I digress.

    I want to copy pdf files from one Windows disk to another, so the
    rename() function is useless. Next, I must retain the original file,
    which is another reason why rename() won't do.
    C has a choice of functions to read from and write to disk files. I
    want to be sure that all content gets copied, unaltered and without
    inflating the file too much. One option is to read the input file one
    byte at a time and write it until EOF is signalled.
    Or read blocks, say 1 kB, and write them. Probably faster but may have
    other drawbacks I'm not aware of.
    The original post was written with this in mind and that was perhaps
    not too smart.

    Hans
     
    Hans Vlems, Feb 16, 2012
    #7
  8. Hans Vlems

    Hans Vlems Guest

    On 16 feb, 10:23, Malcolm McLean <>
    wrote:
    > On Feb 16, 9:03 am, Hans Vlems <> wrote:
    >
    > > I'm maintaing large numbers of Adobe Reader files (.pdf). One of my
    > > programs, written in C (gcc 4.4.4), must make a copies between
    > > different filesystems of these pdf files.
    > > There is AFAIK no library function that does this, which leaves me two
    > > options:
    > > 1- use the console interface, i.e. build a command string and pass
    > > this to system().
    > > 2- open the file, copy the contents and close the target
    > > I'd rather avoid option 1 because system runs out of control of my
    > > program.
    > > My question is what read and write functions are best suited to copy
    > > the (binary) pdf files?
    > > 'Performance is not the main objective, but I want to be sure that the
    > > copy finished succesfully and accurately.
    > > Hans

    >
    > /*
    >    Untested code
    >
    >    return -2 can't open input, -3 can't open output, -1 read/write
    > error (probably hardware problems).
    > */
    > int copy(const char *dest, const char *source)
    > {
    >   FILE *fpin;
    >   FILE *fpout;
    >   int err;
    >   int ch;
    >
    >   fpin = fopen(source, "rb2");
    >   if(!fpin)
    >     return -2;
    >   fpout = fopen(dest, "wb");
    >   if(!fpout)
    >   {
    >     fclose(fpin);
    >     return -3;
    >   }
    >   while( (ch = fgetc(fpin)) != EOF)
    >   {
    >     err = fputc(ch, fpout);
    >     if(err == EOF)
    >       goto error_exit;
    >   }
    >   /* if EOF was generated by read error instead of end of file, feof
    > is false */
    >   if(!feof(fpin))
    >     goto error_exit;
    >   fclose(fpin);
    >   /* we need to check that fclose flushes data to destination
    > correctly */
    >   err = fclose(fpout);
    >   if(err == EOF)
    >     goto error_exit;
    >   return 0;
    >
    >   /* read/write error */
    > error_exit:
    >   fclose(fpin);
    >   fclose(fpout);
    >   return -1;
    > }


    Malcolm,
    thanks for the example. Copying the input file one byte per read
    (fgetc) operation may not be the fastest way,
    but it does not inflate the destination file size (we pay for disk
    storage here).
    Hans
     
    Hans Vlems, Feb 16, 2012
    #8
  9. Hans Vlems

    Hans Vlems Guest

    On 16 feb, 10:31, Keith Thompson <> wrote:
    > Malcolm McLean <> writes:
    >
    > [...]>   fpin = fopen(source, "rb2");
    >
    > [...]
    >
    > Is "rb2" a typo?
    >
    > --
    > Keith Thompson (The_Other_Keith)  <http://www.ghoti.net/~kst>
    >     Will write code for food.
    > "We must do something.  This is something.  Therefore, we must do this."
    >     -- Antony Jay and Jonathan Lynn, "Yes Minister"


    Possibly, but setting the proper file mode is not my main concern.
    Hans
     
    Hans Vlems, Feb 16, 2012
    #9
  10. Hans Vlems

    Kleuske Guest

    On Thu, 16 Feb 2012 01:03:12 -0800, Hans Vlems saw fit to publish the
    following:

    > I'm maintaing large numbers of Adobe Reader files (.pdf). One of my
    > programs, written in C (gcc 4.4.4), must make a copies between different
    > filesystems of these pdf files. There is AFAIK no library function that
    > does this, which leaves me two options:
    > 1- use the console interface, i.e. build a command string and pass this
    > to system().


    Don't. It's ineffective and may open up your system to abuse.


    > 2- open the file, copy the contents and close the target I'd rather
    > avoid option 1 because system runs out of control of my program.
    > My question is what read and write functions are best suited to copy the
    > (binary) pdf files?


    Try fopen, fread, fwrite and fclose. Use a big buffer, since PDFs (especially
    with graphics) tend to be big.

    > 'Performance is not the main objective, but I want to be sure that the
    > copy finished succesfully and accurately. Hans


    Check for error codes.
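
    As a rough illustration of that advice, here is a block-wise copy
    using only standard C. The function name copy_blocks, the 64 KiB buffer
    size and the return codes are assumptions made for the sketch. Every
    call that can fail is checked, including the final fclose(), which is
    where buffered data actually gets flushed.

```c
#include <stdio.h>
#include <stdlib.h>

#define COPY_BUFSIZE (64 * 1024)    /* assumed buffer size; tune to taste */

/* Hypothetical helper. Returns 0 on success, -2 if source can't be opened,
   -3 if dest can't be opened, -1 on any other failure (memory, read, write). */
int copy_blocks(const char *source, const char *dest)
{
    FILE *in, *out;
    unsigned char *buf;
    size_t n;
    int result = 0;

    in = fopen(source, "rb");       /* "b": no newline translation */
    if (in == NULL)
        return -2;
    out = fopen(dest, "wb");
    if (out == NULL) {
        fclose(in);
        return -3;
    }
    buf = malloc(COPY_BUFSIZE);
    if (buf == NULL) {
        fclose(in);
        fclose(out);
        return -1;
    }

    /* fread returns 0 at end of file and on error; ferror tells them apart. */
    while ((n = fread(buf, 1, COPY_BUFSIZE, in)) > 0) {
        if (fwrite(buf, 1, n, out) != n) {
            result = -1;            /* short write */
            break;
        }
    }
    if (ferror(in))
        result = -1;                /* loop ended on a read error */

    free(buf);
    fclose(in);
    if (fclose(out) == EOF)
        result = -1;                /* final flush failed */
    return result;
}
```

    Because both streams are opened in binary mode and exactly the bytes
    read are written, the destination ends up the same length as the
    source.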

    --
    Each kiss is as the first.
    -- Miramanee, Kirk's wife, "The Paradise Syndrome",
    stardate 4842.6
     
    Kleuske, Feb 16, 2012
    #10
  11. Hans Vlems

    Hans Vlems Guest

    On 16 feb, 10:34, Keith Thompson <> wrote:
    > Hans Vlems <> writes:
    > > I'm maintaing large numbers of Adobe Reader files (.pdf). One of my
    > > programs, written in C (gcc 4.4.4), must make a copies between
    > > different filesystems of these pdf files.
    > > There is AFAIK no library function that does this, which leaves me two
    > > options:
    > > 1- use the console interface, i.e. build a command string and pass
    > > this to system().
    > > 2- open the file, copy the contents and close the target
    > > I'd rather avoid option 1 because system runs out of control of my
    > > program.
    > > My question is what read and write functions are best suited to copy
    > > the (binary) pdf files?
    > > 'Performance is not the main objective, but I want to be sure that the
    > > copy finished succesfully and accurately.

    >
    > I'd just invoke the OS's command to copy the files ("cp" on
    > Unix-like systems, "copy" on Windows, probably something else on
    > other systems).  It's likely to be at least as fast as anything
    > you write yourself, and it may preserve metadata (permissions,
    > etc.) that you're not going to be able to handle in your own code
    > without considerable difficulty.
    >
    > I'm not sure why system running "out of control" of your program
    > should be an issue; can you elaborate?
    >
    > --
    > Keith Thompson (The_Other_Keith)  <http://www.ghoti.net/~kst>
    >     Will write code for food.
    > "We must do something.  This is something.  Therefore, we must do this."
    >     -- Antony Jay and Jonathan Lynn, "Yes Minister"


    Keith, the system(s) we have to work with are connected to disks in a
    way that seriously affects performance.
    I've seen a file copy last a little over one minute, with a file size
    of approx. 2 MB. A copy may also just fail.
    The function described in the OP may be used for several files, say
    about 20, all at least 1 MB in size.
    That is certainly not impressive by today's standards and shouldn't be
    a challenge for the underlying hardware.
    Unfortunately, these things do fail occasionally, and when a copy fails
    system() does not signal that failure.
    I'd rather know about it, hence the desire to perform the copy in
    my own code. Malcolm's example demonstrates
    the various possibilities to signal an error situation.
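
    One further way to learn about a bad copy, sketched here purely as an
    illustration, is to read both files back afterwards and compare them.
    The name files_identical is hypothetical, and the check has a limit:
    the read-back may be served from the operating system's cache, so it
    shows the two files look the same to the program, not that the data
    has physically reached the disk.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns 1 if both files have identical contents,
   0 if they differ, -1 if either file can't be opened or read. */
int files_identical(const char *path_a, const char *path_b)
{
    unsigned char buf_a[4096], buf_b[4096];
    int result = -1;
    FILE *fa = fopen(path_a, "rb");
    FILE *fb = fopen(path_b, "rb");

    if (fa != NULL && fb != NULL) {
        for (;;) {
            size_t na = fread(buf_a, 1, sizeof buf_a, fa);
            size_t nb = fread(buf_b, 1, sizeof buf_b, fb);
            if (na != nb || memcmp(buf_a, buf_b, na) != 0) {
                result = 0;         /* different length or content */
                break;
            }
            if (na == 0) {
                result = 1;         /* both ended at the same point */
                break;
            }
        }
        if (ferror(fa) || ferror(fb))
            result = -1;            /* a read error overrides the verdict */
    }
    if (fa != NULL)
        fclose(fa);
    if (fb != NULL)
        fclose(fb);
    return result;
}
```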
    Hans
     
    Hans Vlems, Feb 16, 2012
    #11
  12. Hans Vlems

    JohnF Guest

    Hans Vlems <> wrote:
    > JohnF <> wrote:
    >> Hans Vlems <> wrote:
    >> > I'm maintaing large numbers of Adobe Reader files (.pdf). One of my
    >> > programs, written in C (gcc 4.4.4), must make a copies between
    >> > different filesystems of these pdf files.
    >> > There is AFAIK no library function that does this, which leaves me two
    >> > options:
    >> > 1- use the console interface, i.e. build a command string and pass
    >> > this to system().
    >> > 2- open the file, copy the contents and close the target
    >> > I'd rather avoid option 1 because system runs out of control of my
    >> > program.
    >> > My question is what read and write functions are best suited to copy
    >> > the (binary) pdf files?
    >> > 'Performance is not the main objective, but I want to be sure that the
    >> > copy finished succesfully and accurately.
    >> > Hans

    >>
    >> I'm guessing you're already aware of what Malcolm suggested
    >> in preceding followup, and it's not adequate. And since I've
    >> read your posts in comp.os.vms, I'm also guessing we're
    >> talking about an ods-2/5 filesystem at one end, and maybe a
    >> linux ext3, or whatever, at the other. Could you elaborate on
    >> that a little? And does linux have some ods-2/5 support so you
    >> can mount a vms disk? I wasn't aware of that. If you can indeed
    >> just mount it, then by all means try Malcolm's suggestion and
    >> see what happens. Should just work if the ods filesystem support
    >> is any good. Otherwise, how are you intending to even access
    >> the disk? Decnet for linux (note that it's no longer being
    >> very actively supported)? I think that would be the driving
    >> question that dictates an appropriate answer to your question.
    >> So you need to supply all that additional info first.
    >> By the way, I usually just ftp zipped files back and forth
    >> between vms and linux boxes on my soho lan. Despite your
    >> "out of control" issue, I'd just write a script (using C's
    >> system() if you want the script in C) to do the job, unless
    >> security's some really, really significant issue for your
    >> situation.
    >> --
    >> John Forkosh ( mailto: where j=john and f=forkosh )


    > John,
    > your investigating powers are impressive!

    Indeed they are!
    > Unfortunately they've led you into a dead end street...

    Indeed they do!

    > On a VMS system I wouldn't have had the need to ask a question. VMS
    > has an IO subsystem (RMS) and a neatly documented API.
    > And I doubt I'd have used C to solve this problem ;-) since I have a
    > choice of at least 4 other languages that I'm more
    > comfortable with...
    > The project I'm involved in runs on a Windows platform, on Citrix
    > servers more precisely and I have _no_ provileges on these
    > systems. The reason I use the (old) DJGPP compiler is that doesn't
    > need a Windows install process that uses the registry.
    > The command line interface on WIndows doesn't even come close to what
    > DCL has to offer. But I digress.
    >
    > I want to copy pdf files from one windows disk to another, so the
    > rename() function is useless. Next, I must retain the original file
    > which is another reason why rename() won't do.
    > C has a choice of functions to read from and write to diskfiles. I
    > want to be sure that all content gets copied, unaltered and without
    > inflating the file too much. One option is to read the input file one
    > byte at a time and write it until EOF is signalled.
    > Or read blocks, say 1 kB, and write them. Probably faster but may have
    > other drawbacks I'm not aware of.
    > The original post was written with this in mind and that was perhaps
    > not too smart.
    > Hans


    Yeah, I've used djgpp and mingw to compile C programs on windows.
    I think I'd recommend mingw if that works for you (I say "think"
    because I can't recall >>why<< I prefer it). Anyway, when you say
    you have >>no<< privileges, I assume your program can read and write
    files, i.e., whoever's running it has whatever's necessary to do that.
    In that case, block reads and writes are fine. I've done that
    and it works okay for me. Of course, you should try it yourself,
    since only God knows what'll happen in your particular situation.
    But I think you can safely start with something of the form...

    int fcopy( char *infile, char *outfile ) {
      FILE *inptr = fopen(infile,"rb"),      /*open file for binary read*/
           *outptr = fopen(outfile,"wb");    /*and write*/
      unsigned char buff[256];               /*block of bytes from infile*/
      int buflen=255, nread=0,nwrite=0,      /*#bytes we try to read/write*/
          nrw = 0;                           /*total bytes read/written*/
      if ( inptr!=NULL && outptr!=NULL ) {   /*have opened files*/
        while ( 1 ) {                        /*read & write them till eof*/
          /* --- read bytes from infile --- */
          nread = fread(buff,sizeof(unsigned char),buflen,inptr); /*read*/
          if ( nread < 1 ) break;            /* no bytes left in file */
          /* --- write bytes to outfile --- */
          nwrite = fwrite(buff,sizeof(unsigned char),nread,outptr); /*write*/
          if ( nwrite != nread ) { nrw=(-1); break; } /*problem writing*/
          nrw += nwrite;                     /*total #bytes*/
          if ( nread < buflen ) break;       /* no bytes left in file */
        } /* --- end-of-while(1) --- */
        if ( ferror(inptr) ) nrw = (-1);     /*problem reading*/
      } /* --- end-of-if(fileptrs!=NULL) --- */
      else nrw = (-1);                       /*couldn't open a file*/
      if ( inptr != NULL ) fclose(inptr);    /* close whatever was opened */
      if ( outptr != NULL && fclose(outptr) == EOF ) nrw = (-1);
      return ( nrw );                        /*file size, or -1 on error*/
    } /* --- end-of-function fcopy() --- */

    ....which I've snipped from some code that works (with a few
    essentially cosmetic changes so it reads okay as a code fragment).
    --
    John Forkosh ( mailto: where j=john and f=forkosh )
     
    JohnF, Feb 16, 2012
    #12
  13. Hans Vlems

    Stefan Ram Guest

    Keith Thompson <> writes:
    >I'd just invoke the OS's command to copy the files ("cp" on
    >Unix-like systems, "copy" on Windows, probably something else on
    >other systems). It's likely to be at least as fast as anything


    I'd invoke the OS call to copy the file, on Windows it's
    »CopyFile«. For example, from one of my programs:


    #include <windows.h>
    #include <tchar.h>

    ....

    int filecopy( LPTSTR const target, LPTSTR const source )
    { BOOL const success = CopyFile( source, target, TRUE );
      int const terminate = success ? 1 : error5();
      return terminate; }

    ....
     
    Stefan Ram, Feb 16, 2012
    #13
  14. Hans Vlems

    Hans Vlems Guest

    On 16 feb, 13:56, Stefan Ram wrote:
    > Keith Thompson <> writes:
    > >I'd just invoke the OS's command to copy the files ("cp" on
    > >Unix-like systems, "copy" on Windows, probably something else on
    > >other systems).  It's likely to be at least as fast as anything

    >
    >   I'd invoke the OS call to copy the file, on Windows it's
    >   »CopyFile«. For example, from one of my programs:
    >
    > #include <windows.h>
    > #include <tchar.h>
    >
    > ...
    >
    > int filecopy( LPTSTR const target, LPTSTR const source )
    > { BOOL const success = CopyFile( source, target, TRUE );
    >   int const terminate = success ? 1 : error5();
    >   return terminate; }
    >
    > ...


    OK, I tried both John's and Malcolm's solutions and they both work
    fine, as expected.
    There's very little difference in performance between the two. Both
    copy 2 MB files
    instantly: that is, the user hits return and immediately gets a result
    printed.
    Thanks for your time!

    Hans
     
    Hans Vlems, Feb 16, 2012
    #14
  15. Hans Vlems

    Hans Vlems Guest

    On 16 feb, 12:15, Kleuske <> wrote:
    > On Thu, 16 Feb 2012 01:03:12 -0800, Hans Vlems saw fit to publish the
    > following:
    >
    > > I'm maintaing large numbers of Adobe Reader files (.pdf). One of my
    > > programs, written in C (gcc 4.4.4), must make a copies between different
    > > filesystems of these pdf files. There is AFAIK no library function that
    > > does this, which leaves me two options:
    > > 1- use the console interface, i.e. build a command string and pass this
    > > to system().

    >
    > Don't. It's ineffective and may open up your system to abuse.
    >
    > > 2- open the file, copy the contents and close the target I'd rather
    > > avoid option 1 because system runs out of control of my program.
    > > My question is what read and write functions are best suited to copy the
    > > (binary) pdf files?

    >
    > Try fopen, fread, fwrite and fclose. Use a big buffer, since PDF's (especially
    > with grahics) tend to be big.
    >
    > > 'Performance is not the main objective, but I want to be sure that the
    > > copy finished succesfully and accurately. Hans

    >
    > Check for error codes.
    >
    > --
    > Each kiss is as the first.
    >                 -- Miramanee, Kirk's wife, "The Paradise Syndrome",
    >                    stardate 4842.6


    Given the quality and performance of the disks on the systems we've
    got to work with,
    error checking is my main objective here. Performance is not my main
    concern right now.
    However, once the hardware problems get solved, performance may just
    become user issue #1 again.
    Hans
     
    Hans Vlems, Feb 16, 2012
    #15
  16. Hans Vlems

    Hans Vlems Guest

    On 16 feb, 12:52, JohnF <> wrote:
    > Hans Vlems <> wrote:
    > > JohnF <> wrote:
    > >> Hans Vlems <> wrote:
    > >> > I'm maintaing large numbers of Adobe Reader files (.pdf). One of my
    > >> > programs, written in C (gcc 4.4.4), must make a copies between
    > >> > different filesystems of these pdf files.
    > >> > There is AFAIK no library function that does this, which leaves me two
    > >> > options:
    > >> > 1- use the console interface, i.e. build a command string and pass
    > >> > this to system().
    > >> > 2- open the file, copy the contents and close the target
    > >> > I'd rather avoid option 1 because system runs out of control of my
    > >> > program.
    > >> > My question is what read and write functions are best suited to copy
    > >> > the (binary) pdf files?
    > >> > 'Performance is not the main objective, but I want to be sure that the
    > >> > copy finished succesfully and accurately.
    > >> > Hans

    >
    > >> I'm guessing you're already aware of what Malcolm suggested
    > >> in preceding followup, and it's not adequate. And since I've
    > >> read your posts in comp.os.vms, I'm also guessing we're
    > >> talking about an ods-2/5 filesystem at one end, and maybe a
    > >> linux ext3, or whatever, at the other. Could you elaborate on
    > >> that a little? And does linux have some ods-2/5 support so you
    > >> can mount a vms disk? I wasn't aware of that. If you can indeed
    > >> just mount it, then by all means try Malcolm's suggestion and
    > >> see what happens. Should just work if the ods filesystem support
    > >> is any good. Otherwise, how are you intending to even access
    > >> the disk? Decnet for linux (note that it's no longer being
    > >> very actively supported)? I think that would be the driving
    > >> question that dictates an appropriate answer to your question.
    > >> So you need to supply all that additional info first.
    > >>    By the way, I usually just ftp zipped files back and forth
    > >> between vms and linux boxes on my soho lan. Despite your
    > >> "out of control" issue, I'd just write a script (using C's
    > >> system() if you want the script in C) to do the job, unless
    > >> security's some really, really significant issue for your
    > >> situation.
    > >> --
    > >> John Forkosh ( mailto: where j=john and f=forkosh )

    > > John,
    > > your investigating powers are impressive!

    > Indeed they are!
    > > Unfortunately they've led you into a dead end street...

    >
    > Indeed they do!
    >
    > > On a VMS system I wouldn't have had the need to ask a question. VMS
    > > has an IO subsystem (RMS) and a neatly documented API.
    > > And I doubt I'd have used C to solve this problem ;-) since I have a
    > > choice of at least 4 other languages that I'm more
    > > comfortable with...
    > > The project I'm involved in runs on a Windows platform, on Citrix
    > > servers more precisely and I have _no_ privileges on these
    > > systems. The reason I use the (old) DJGPP compiler is that it doesn't
    > > need a Windows install process that uses the registry.
    > > The command line interface on Windows doesn't even come close to what
    > > DCL has to offer. But I digress.

    >
    > > I want to copy pdf files from one windows disk to another, so the
    > > rename() function is useless. Next, I must retain the original file
    > > which is another reason why rename() won't do.
    > > C has a choice of functions to read from and write to diskfiles. I
    > > want to be sure that all content gets copied, unaltered and without
    > > inflating the file too much. One option is to read the input file one
    > > byte at a time and write it until EOF is signalled.
    > > Or read blocks, say 1 kB, and write them. Probably faster but may have
    > > other drawbacks I'm not aware of.
    > > The original post was written with this in mind and that was perhaps
    > > not too smart.
    > > Hans

    >
    > Yeah, I've used djgpp and mingw to compile C programs on windows.
    > I think I'd recommend mingw if that works for you (I say "think"
    > because I can't recall >>why<< I prefer it). Anyway, when you
    > say >>no<< privileges, I assume your program can read and write files,
    > i.e., whoever's running it has whatever's necessary to do that.
    >    In that case, block reads and writes are fine. I've done that
    > and it works okay for me. Of course, you should try it yourself,
    > since only God knows what'll happen in your particular situation.
    > But I think you can safely start with something of the form...
    >
    > int     fcopy( char *infile, char *outfile ) {
    > FILE    *inptr = fopen(infile,"rb"),  /*open file for binary read*/
    >         *outptr = fopen(outfile,"wb");        /*and write*/
    > unsigned char buff[256];                /*block of bytes from infile*/
    > int     buflen=255, nread=0,nwrite=0,   /*#bytes we try to read/write*/
    >         nrw = 0;                        /*total bytes read/written*/
    > if ( inptr!=NULL && outptr!=NULL ) {    /*have opened files*/
    >   while ( 1 ) {                         /*read & write them till eof*/
    >     /* --- read bytes from infile --- */
    >     nread = fread(buff,sizeof(unsigned char),buflen,inptr); /*read*/
    >     if ( nread < 1 ) break;          /* no bytes left in file */
    >     /* --- write bytes to outfile --- */
    >     nwrite = fwrite(buff,sizeof(unsigned char),nread,outptr); /*write*/
    >     if ( nwrite != nread ) { nrw=(-1); break; } /*problem writing*/
    >     nrw += nwrite;                      /*total #bytes*/
    >     if ( nread < buflen ) break;     /* no bytes left in file */
    >     } /* --- end-of-while(1) --- */
    >   fclose(inptr); fclose(outptr);        /* close files */
    >   } /* --- end-of-if(fileptrs!=NULL) --- */
    > return ( nrw );                         /*back to caller with file size*/
    >
    > } /* --- end-of-function fcopy() --- */
    >
    > ...which I've snipped from some code that works (with a few
    > essentially cosmetic changes so it reads okay as a code fragment).
    > --
    > John Forkosh  ( mailto:    where j=john and f=forkosh)


    The reason I use djgpp is that it is *very* simple to set up: unpack a
    zip file and that's it.
    Much later I came across mingw and that proved not as easy to set up.
    Since I don't write code that is so subtle that it takes a very
    refined compiler I think that gcc 4.4.4 is quite alright.

    Hans
     
    Hans Vlems, Feb 16, 2012
    #16
  17. Ike Naar

    Ike Naar Guest

    On 2012-02-16, JohnF <> wrote:
    > int fcopy( char *infile, char *outfile ) {
    > FILE *inptr = fopen(infile,"rb"), /*open file for binary read*/
    > *outptr = fopen(outfile,"wb"); /*and write*/
    > unsigned char buff[256]; /*block of bytes from infile*/
    > int buflen=255, nread=0,nwrite=0, /*#bytes we try to read/write*/
    > nrw = 0; /*total bytes read/written*/
    > if ( inptr!=NULL && outptr!=NULL ) { /*have opened files*/
    > while ( 1 ) { /*read & write them till eof*/
    > /* --- read bytes from infile --- */
    > nread = fread(buff,sizeof(unsigned char),buflen,inptr); /*read*/


    sizeof(unsigned char) is 1 by definition.
    Is there a particular reason why you are reading one byte less than
    the size of the buffer? It seems more logical to use the entire buffer:

    nread = fread(buff, 1, sizeof buff, inptr);

    > if ( nread < 1 ) break; /* no bytes left in file */
    > /* --- write bytes to outfile --- */
    > nwrite = fwrite(buff,sizeof(unsigned char),nread,outptr); /*write*/
    > if ( nwrite != nread ) { nrw=(-1); break; } /*problem writing*/
    > nrw += nwrite; /*total #bytes*/
    > if ( nread < buflen ) break; /* no bytes left in file */


    if (nread < sizeof buff) break;

    > } /* --- end-of-while(1) --- */
    > fclose(inptr); fclose(outptr); /* close files */
    > } /* --- end-of-if(fileptrs!=NULL) --- */
    > return ( nrw ); /*back to caller with file size*/


    Potential file descriptor leak: if one of the files could be opened
    but the other couldn't, the opened file is not closed.
    Also, in this case 0 is returned, so from the return value you
    cannot distinguish between having fopen errors, and having completed
    a successful copy of an empty file.

    > } /* --- end-of-function fcopy() --- */
    >
    > ...which I've snipped from some code that works (with a few
    > essentially cosmetic changes so it reads okay as a code fragment).
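
    The points raised above (use the whole buffer, close whichever file
    did open, and keep open failures distinguishable from an empty file)
    could be folded into a revised routine along these lines. This is a
    sketch, not a tested drop-in: the name copy_file is made up, and the
    -1/-2/-3 codes simply follow the convention used earlier in the thread.

```c
#include <stdio.h>

/* Sketch of a block copy. Returns 0 on success, -1 on a read or write
   error, -2 if the input cannot be opened, -3 if the output cannot be
   opened. The function name and error codes are illustrative. */
int copy_file(const char *infile, const char *outfile)
{
    unsigned char buff[4096];
    size_t nread;
    int status = 0;
    FILE *in, *out;

    in = fopen(infile, "rb");
    if (in == NULL)
        return -2;
    out = fopen(outfile, "wb");
    if (out == NULL) {
        fclose(in);               /* do not leak the input stream */
        return -3;
    }
    /* fread returns 0 at end of file and also on error; ferror below
       tells the two apart. */
    while ((nread = fread(buff, 1, sizeof buff, in)) > 0) {
        if (fwrite(buff, 1, nread, out) != nread) {
            status = -1;
            break;
        }
    }
    if (ferror(in))
        status = -1;
    fclose(in);
    if (fclose(out) != 0)         /* buffered data is flushed here */
        status = -1;
    return status;
}
```

    With this shape an empty input file yields 0 and an empty output
    file, while a failed fopen yields -2 or -3, so the two cases no
    longer look alike to the caller.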
     
    Ike Naar, Feb 16, 2012
    #17
  18. Hans Vlems <> writes:
    <snip>
    > Keith, the system(s) we have to work with are connected to disks in a
    > way that seriously affects performance.
    > I've seen a file copy last a little over one minute, with a filesize
    > of approx. 2 MB. A copy may also just fail.
    > The function described in the OP may be used for several files, say
    > about 20 and all at least 1 MB in size.
    > That is certainly not impressive by today's standards and shouldn't be
    > a challenge for the underlying hardware.
    > Unfortunately, these things do fail occasionally and when a copy fails
    > then system() does not signal that failure.
    > I'd rather know about it and hence the desire to perform the copy in
    > my own code. Malcolm's example demonstrates
    > the various possibilities to signal an error situation.


    Given your concerns about safety, you may well be working at the wrong
    level. It's perfectly possible for a C program (using standard C IO) to
    signal success without a single byte of data hitting the disk.

    If your concern for a safe copy is indeed paramount, you will have to ask
    about OS-level facilities in a suitable group. Both the OS and the
    file-system types that are involved in the copy may have a bearing on
    how to get maximum safety.
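
    To illustrate how far standard C can go, here is a rough sketch that
    checks every stage, including the final flush and close. The helper
    name write_checked is invented for this example. Even when it
    reports success, the data may still sit in an OS cache; pushing it
    to the device needs a platform call (fsync on POSIX systems,
    FlushFileBuffers in the Win32 API), which is outside standard C.

```c
#include <stdio.h>

/* Hypothetical helper: writes len bytes to path and reports failure if any
   stage, including the final flush and close, reports an error.
   Returns 0 on success, -1 on failure. */
int write_checked(const char *path, const void *data, size_t len)
{
    FILE *fp = fopen(path, "wb");
    int ok;

    if (fp == NULL)
        return -1;
    ok = fwrite(data, 1, len, fp) == len;
    if (fflush(fp) != 0)      /* push stdio's buffer to the OS */
        ok = 0;
    /* Standard C stops here: the data may still sit in an OS cache.
       Forcing it to the device needs a platform-specific call. */
    if (fclose(fp) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

    Ignoring the return values of fflush and fclose is a common way for
    a late write error to go unnoticed, since fwrite may only have
    filled a buffer.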

    --
    Ben.
     
    Ben Bacarisse, Feb 16, 2012
    #18
  19. Nobody <> writes:
    > On Thu, 16 Feb 2012 01:03:12 -0800, Hans Vlems wrote:
    >> I'm maintaining large numbers of Adobe Reader files (.pdf). One of my
    >> programs, written in C (gcc 4.4.4), must make copies between
    >> different filesystems of these pdf files.
    >> There is AFAIK no library function that does this, which leaves me two
    >> options:
    >> 1- use the console interface, i.e. build a command string and pass
    >> this to system().

    >
    > Avoid system() unless executing a "canned" command supplied by the user.
    > If you need to spawn a child process with specific arguments, use fork()
    > and exec*() rather than attempting to construct a shell command.

    [...]

    This, as well as the rest of your response, relies heavily on the
    assumption that the OP is on a Unix-like system.

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    Will write code for food.
    "We must do something. This is something. Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
     
    Keith Thompson, Feb 16, 2012
    #19
  20. Hans Vlems <> writes:
    > On 16 feb, 10:31, Keith Thompson <> wrote:
    >> Malcolm McLean <> writes:
    >>
    >> [...]
    >> >   fpin = fopen(source, "rb2");
    >>
    >> [...]
    >>
    >> Is "rb2" a typo?

    >
    > Possibly, but setting the proper file mode is not my main concern.


    If you mean that it's a trivial thing to correct and you're not going to
    get it wrong, that's fine. If you mean that it doesn't matter, it most
    certainly does.
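
    To make that concrete: the C standard lists the valid mode strings,
    and anything outside that list (such as "rb2") has undefined
    behaviour. Leaving out the "b" matters too. On Windows-style
    systems, text mode typically translates CR LF pairs and may treat
    the byte 0x1A as end of file, either of which would corrupt a PDF.
    The sketch below (binary_roundtrip_ok is a made-up name) writes
    exactly those bytes with "wb", reads them back with "rb", and checks
    that nothing changed.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical round-trip check: writes bytes that text mode may alter
   (CR, LF, and 0x1A) with "wb", reads them back with "rb", and returns 1
   if they come back unchanged, 0 otherwise. */
int binary_roundtrip_ok(const char *path)
{
    static const unsigned char sample[] = { 'A', 0x0D, 0x0A, 0x1A, 'B', 0x0A };
    unsigned char back[sizeof sample + 1];
    size_t n;
    FILE *fp = fopen(path, "wb");

    if (fp == NULL)
        return 0;
    n = fwrite(sample, 1, sizeof sample, fp);
    if (fclose(fp) != 0 || n != sizeof sample)
        return 0;

    fp = fopen(path, "rb");
    if (fp == NULL)
        return 0;
    n = fread(back, 1, sizeof back, fp);   /* asks for one extra byte */
    fclose(fp);
    return n == sizeof sample && memcmp(sample, back, n) == 0;
}
```

    Swapping "wb" and "rb" for "w" and "r" on the target system is a
    quick way to see whether its text mode alters the data.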


    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    Will write code for food.
    "We must do something. This is something. Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
     
    Keith Thompson, Feb 16, 2012
    #20