filecopy with std::copy()

  • Thread starter Thomas J. Clancy

Thomas J. Clancy

I was wondering if anyone knew of a way to use std::copy() and
istream_iterator<>/ostream_iterator<> to write a file copy function that is
quick and efficient.

Doing this messes up the file because it seems to ignore '\n':

std::ifstream in("somefile");
std::ofstream out("someOtherFile");

std::copy(std::istream_iterator<unsigned char>(in),
          std::istream_iterator<unsigned char>(),
          std::ostream_iterator<unsigned char>(out));

Now, I figured out how to do it correctly, but it is dog slow. I was
wondering if anyone knew how to do this in an elegant manner?

thomas j. clancy
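
The lost newlines in the snippet above come from istream_iterator reading through operator>>, which skips whitespace by default. One possible way to keep std::copy() and still get a byte-exact copy is to switch to the stream-buffer iterators, which bypass formatted extraction. The sketch below assumes that substitution; the function name copyFileRaw is made up for illustration and does not come from the thread.

```cpp
#include <algorithm>
#include <fstream>
#include <iterator>

// Hypothetical helper: copies src to dst byte for byte.
// Returns false if a file could not be opened or a write failed.
bool copyFileRaw(const char* src, const char* dst)
{
    std::ifstream in(src, std::ios::binary);
    std::ofstream out(dst, std::ios::binary);
    if (!in || !out)
        return false;

    // istreambuf_iterator reads straight from the stream buffer, so no
    // whitespace is skipped and no formatting is applied.
    std::ostreambuf_iterator<char> last =
        std::copy(std::istreambuf_iterator<char>(in),
                  std::istreambuf_iterator<char>(),
                  std::ostreambuf_iterator<char>(out));

    out.flush();
    // failed() reports whether any write to the output buffer failed.
    return !last.failed() && out.good();
}
```

Calling `in >> std::noskipws` before building the istream_iterator pair should also preserve whitespace, although every byte then still goes through formatted extraction.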
 

Ivan Vecerina

Hi Thomas,
Thomas J. Clancy said:
I was wondering if anyone knew of a way to use std::copy() and
istream_iterator<>/ostream_iterator<> to write a file copy function that is
quick and efficient. ....
Now, I figured out how to do it correctly, but it is dog slow. I was
wondering if anyone knew how to do this in an elegant manner?

Unless you insist on using std::copy, the elegant and efficient
way to copy an entire file (or stream) is:
dstStream << srcStream.rdbuf();
A C++ implementation should ultimately be able to optimize this
operation (but performance may vary...).

hth,
Ivan
 

Thomas J. Clancy

Ivan Vecerina said:
Hi Thomas,


Unless you insist on using std::copy, the elegant and efficient
way to copy an entire file (or stream) is:
dstStream << srcStream.rdbuf();
A C++ implementation should ultimately be able to optimize this
operation (but performance may vary...).


Elegant, yes... this I already knew about, but boy is it
sloooooooooooooooowwwwww.... I came up with a different solution using
std::copy and a type (class) that contains a buffer of chars and uses
stream::read() and stream::write() within the input stream operator (>>) and
the output stream operator (<<), respectively. And man does it scream.

Anyway, I was just wondering if there were alternatives to creating this
sort of thing using or extending the stream stuff.
 

Josh Sebastian

Thomas J. Clancy said:
Elegant, yes... this I already knew about, but boy is it
sloooooooooooooooowwwwww....

Nothing using IOStreams is going to be faster. File copies are best left
to OS routines.

Josh
 

Josh Sebastian

Thomas J. Clancy said:
Ummm... the rest of my previous reply talks about what I did to make it
much, much faster than the solution you mentioned, so what do you mean by
your statement above?

It was actually faster using a copy than rdbuf? That's a messed-up
IOStreams implementation. :-}
 

Josh Sebastian

Thomas J. Clancy said:
Not at all... when you use the output iterator of rdbuf(), I believe that it
is doing it byte by byte and not in chunks. At least this is the behaviour
I am seeing with VC7.1's implementation, which they get from Dinkumware, I
believe. Now I could try this using STLPort.

It shouldn't be; there should be buffering done both by your OS and by
IOStreams. For example:

curien@balar:~/prog$ uname -a
Linux balar 2.4.18 #1 Sun Aug 10 12:24:29 EDT 2003 i686 GNU/Linux
curien@balar:~/prog$ cat blah.cpp
#include <fstream>
#include <ios>

int main() {
    std::ifstream infile("test.dat", std::ios_base::binary);
    std::ofstream outfile("test~.dat", std::ios_base::binary);

    outfile << infile.rdbuf();
}
curien@balar:~/prog$ dd if=/dev/zero of=test.dat bs=1024 count=50K
51200+0 records in
51200+0 records out
52428800 bytes transferred in 0.700340 seconds (74861922 bytes/sec)
curien@balar:~/prog$ g++ -v
Reading specs from /usr/lib/gcc-lib/i486-linux/3.3.2/specs
Configured with: ../src/configure -v --enable-languages=c,c++,java,f77,pascal,objc,ada,treelang --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-gxx-include-dir=/usr/include/c++/3.3 --enable-shared --with-system-zlib --enable-nls --without-included-gettext --enable-__cxa_atexit --enable-clocale=gnu --enable-debug --enable-java-gc=boehm --enable-java-awt=xlib --enable-objc-gc i486-linux
Thread model: posix
gcc version 3.3.2 20030812 (Debian prerelease)
curien@balar:~/prog$ g++ -ansi -pedantic -W -Wall -O2 blah.cpp
curien@balar:~/prog$ time ./a.out

real 0m0.619s
user 0m0.030s
sys 0m0.540s

Josh
 

Thomas J. Clancy

Josh Sebastian said:
It shouldn't be; there should be buffering done both by your OS and by
IOStreams. For example:

curien@balar:~/prog$ uname -a
Linux balar 2.4.18 #1 Sun Aug 10 12:24:29 EDT 2003 i686 GNU/Linux
curien@balar:~/prog$ cat blah.cpp
#include <fstream>
#include <ios>

int main() {
    std::ifstream infile("test.dat", std::ios_base::binary);
    std::ofstream outfile("test~.dat", std::ios_base::binary);

    outfile << infile.rdbuf();
}
curien@balar:~/prog$ dd if=/dev/zero of=test.dat bs=1024 count=50K
51200+0 records in
51200+0 records out
52428800 bytes transferred in 0.700340 seconds (74861922 bytes/sec)

Hey man, I have the numbers, too, and believe me, they suck. I wonder if
Microsoft is pulling a fast one? :)

curien@balar:~/prog$ g++ -v
Reading specs from /usr/lib/gcc-lib/i486-linux/3.3.2/specs
Configured with: ../src/configure -v --enable-languages=c,c++,java,f77,pascal,objc,ada,treelang --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-gxx-include-dir=/usr/include/c++/3.3 --enable-shared --with-system-zlib --enable-nls --without-included-gettext --enable-__cxa_atexit --enable-clocale=gnu --enable-debug --enable-java-gc=boehm --enable-java-awt=xlib --enable-objc-gc i486-linux
 

Thomas J. Clancy

Josh Sebastian said:
It shouldn't be; there should be buffering done both by your OS and by
IOStreams. For example:

My bad, you're right. Under Microsoft, if you build this little application
in debug mode, it is dog slow. I thought I had been building in release mode.
Once I set it to release mode and rebuilt, the thing flew! Thanks for the
information on this.

Tom
 

Kevin Goodsell

Thomas said:
Elegant, yes... this I already knew about, but boy is it
sloooooooooooooooowwwwww....

Are you using Visual C++ 5 or 6 by chance? There's a known bug in the
iostream library that causes buffering to be wrongly disabled in file
streams that are opened by name. That could account for this being slow,
I think. Check here:

http://www.dinkumware.com/vc_fixes.html

Thomas said:
I came up with a different solution using
std::copy and a type (class) that contains a buffer of chars and uses
stream::read() and stream::write() within the input stream operator (>>) and
the output stream operator (<<), respectively. And man does it scream.

Buffering should be automatic, making this unnecessary. I guess that it
should be possible to make a solution using the standard stream classes
perform just as well as, or better than, this one, but I wouldn't know
exactly how to do it.

-Kevin
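
One way to experiment with the buffering Kevin mentions is to hand each file stream a larger buffer through pubsetbuf() and then let rdbuf() do the copy. This is only a sketch under stated assumptions: the effect of pubsetbuf() with a non-null buffer is implementation-defined, so a library may ignore the request or honour it only at a particular point relative to open(). The function name and the bufSize parameter are hypothetical.

```cpp
#include <cstddef>
#include <fstream>
#include <ios>
#include <vector>

// Hypothetical helper: rdbuf() copy with caller-chosen buffer sizes.
bool copyWithBigBuffers(const char* src, const char* dst, std::size_t bufSize)
{
    if (bufSize == 0)
        return false;

    // Declared before the streams so the storage outlives them.
    std::vector<char> inBuf(bufSize);
    std::vector<char> outBuf(bufSize);

    std::ifstream in;
    std::ofstream out;

    // Requested before open(); an implementation is free to ignore this.
    in.rdbuf()->pubsetbuf(&inBuf[0], static_cast<std::streamsize>(bufSize));
    out.rdbuf()->pubsetbuf(&outBuf[0], static_cast<std::streamsize>(bufSize));

    in.open(src, std::ios::in | std::ios::binary);
    out.open(dst, std::ios::out | std::ios::binary);
    if (!in || !out)
        return false;

    // Skip the insertion for an empty source, which would set failbit.
    if (in.peek() != std::ifstream::traits_type::eof())
        out << in.rdbuf();

    out.flush();
    return !in.bad() && !out.fail();
}
```

Whether this changes the timing at all depends on the library and the build mode, so it is worth measuring in a release build before drawing conclusions.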
 

Thomas J. Clancy

Kevin Goodsell said:
Buffering should be automatic, making this unnecessary. I guess that it
should be possible to make a solution using the standard stream classes
perform just as well as, or better than, this one, but I wouldn't know
exactly how to do it.

Here was my solution before I realized that using stream::rdbuf() worked
well while NOT in debug mode using VC++7.1 (.NET 2003).

#include <algorithm>
#include <cstddef>
#include <fstream>
#include <iostream>
#include <iterator>

namespace tjc_std
{

/**
 * A block buffer type that can be used with std::copy() and
 * istream_iterators without having to write a special form of copy or
 * an istream_iterator.
 */
class ByteBlock
{
public:
    ByteBlock()
        : m_bytesRead(0),
          m_fileSize(-1),
          m_totalRead(0)
    {
    }

private:
    unsigned char m_block[10240];
    int m_bytesRead;    // bytes held in m_block after the last read
    long m_fileSize;    // -1 until the first read measures the stream
    long m_totalRead;   // running total across all reads
    friend std::istream& operator >> (std::istream& stream, ByteBlock& block);
    friend std::ostream& operator << (std::ostream& stream, const ByteBlock& block);
};

std::istream& operator >> (std::istream& stream, ByteBlock& block)
{
    // On the first read, measure the stream by seeking to its end.
    if (block.m_fileSize == -1)
    {
        stream.seekg(0, std::ios::end);
        block.m_fileSize = stream.tellg();
        stream.seekg(0, std::ios::beg);
    }
    std::size_t leftToRead = block.m_fileSize - block.m_totalRead;
    if (leftToRead)
    {
        stream.read((char*)block.m_block,
                    std::min(sizeof(block.m_block), leftToRead));
        block.m_bytesRead = stream.gcount();
        block.m_totalRead += block.m_bytesRead;
    }
    else
    {
        // Nothing left: put the stream into a failed state so that the
        // istream_iterator becomes the end-of-stream iterator.
        stream.setstate(std::ios_base::eofbit | std::ios_base::badbit);
    }
    return stream;
}

std::ostream& operator << (std::ostream& stream, const ByteBlock& block)
{
    stream.write((char*)block.m_block, block.m_bytesRead);
    return stream;
}

} // namespace tjc_std

void blockCopyFile(const char* source, const char* dest)
{
    std::ifstream in(source, std::ios::in | std::ios::binary);
    std::ofstream out(dest, std::ios::out | std::ios::binary);
    std::copy(std::istream_iterator<tjc_std::ByteBlock>(in),
              std::istream_iterator<tjc_std::ByteBlock>(),
              std::ostream_iterator<tjc_std::ByteBlock>(out));
}

Yes, this was a naive approach, but it worked quickly and in fact for some
reason this approach still seemed to work slightly faster than:

out << in.rdbuf();

I don't know why that would be, especially given what I've recently read and
what I've been told by others here in this newsgroup. But hey, I just need
a way to copy files without relying on the OS, so both of these ideas seem
to work just fine.

Tom
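
For comparison, the same chunked read()/write() idea can be written as a plain loop without the iterator adapter or the up-front seek to find the file size. This is a sketch only, not Tom's code: the function name chunkCopyFile is made up, and the 10240-byte block simply mirrors the size used in ByteBlock.

```cpp
#include <fstream>
#include <ios>

// Hypothetical helper: copies source to dest in fixed-size blocks.
bool chunkCopyFile(const char* source, const char* dest)
{
    std::ifstream in(source, std::ios::in | std::ios::binary);
    std::ofstream out(dest, std::ios::out | std::ios::binary);
    if (!in || !out)
        return false;

    char block[10240];
    for (;;)
    {
        // read() sets failbit on a short final block, so rely on
        // gcount() to learn how many bytes actually arrived.
        in.read(block, sizeof block);
        const std::streamsize got = in.gcount();
        if (got <= 0)
            break;
        out.write(block, got);
        if (!out)
            return false;
    }

    out.flush();
    // Success means the loop ended at end-of-file, not on a read error.
    return in.eof() && out.good();
}
```

The loop stops at end-of-file on its own, which also makes it usable on streams that cannot seek.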
 
