Need advice on efficient file I/O with the standard library


Gan Quan

I'm writing a C++ program in which many (100+) threads read and write
files simultaneously. It works correctly, but file I/O seems to be the
bottleneck.

This is my code to read from and write to files:

#include <fstream>
#include <sstream>
#include <string>

using namespace std;

bool write(const string &path, const string &contents,
           ios::openmode mode)
{
    ofstream out;
    bool status;

    out.open(path.c_str(), mode);
    if (!out.fail()) {
        out << contents;
    }
    status = !out.fail();
    out.close();

    return status;
}

bool read(const string &path, string &contents)
{
    ifstream in;
    stringstream ss;
    bool status;

    contents.clear();
    in.open(path.c_str(), ios::in);
    if (in) {
        ss << in.rdbuf();
        contents = ss.str();
    }
    status = !in.fail();
    in.close();

    return status;
}

I have few clues about how to optimize my code; any direction would be
greatly appreciated.
 

Gan Quan


I just tried to use fopen(), fread(), fclose() to read file contents:

bool read(const string &path, string &contents)
{
    FILE *fp;
    char buf[2048];

    fp = fopen(path.c_str(), "r");
    if (fp) {
        while (fread(buf, 2048, 1, fp)) {
            contents += buf;
        }
        fclose(fp);
        return true;
    } else {
        return false;
    }
}

This runs 4 times faster (literally) than the previous C++ version. Is
it possible to get the C++ version close to this speed?
 

Kai-Uwe Bux

Gan said:
[snip]

This runs 4 times faster (literally) than the previous C++ version. Is
it possible to get the C++ version close to this speed?

I don't know about speed, but you could try:

#include <iterator>
#include <iostream>
#include <fstream>
#include <string>
#include <iosfwd>

bool read_1 ( std::string const & path,
              std::string & contents )
{
    std::ifstream in;
    bool status;
    in.open( path.c_str(), std::ios::in );
    if ( in ) {
        std::string buffer ( std::istreambuf_iterator<char>( in ),
                             (std::istreambuf_iterator<char>()) );
        contents.swap( buffer );
    }
    status = !in.fail();
    in.close();
    return status;
}

At least, this avoids the detour through a stringstream. As for the
performance, you will just have to measure. But I take it that you
already have a framework in place for doing that. I would be interested
in the comparison.


Best

Kai-Uwe Bux
 

william

[snip]

This runs 4 times faster (literally) than the previous C++ version. Is
it possible to get the C++ version close to this speed?

You should reduce the number of disk operations for performance.
For each file, you can calculate its size first and allocate a buffer
big enough to get the data in one read operation.
 

Roland Pibinger

I just tried to use fopen(), fread(), fclose() to read file contents:

Well, there is always a way to speed up things ...
bool read(const string &path, string &contents)
{

Reserve enough space for 'contents' to read the entire file without
reallocation (see thread 'How do i copy an entire file into a
string'). But verify that your string implementation really uses the
reserved capacity. Some string implementations deallocate the reserved
space on assignment.

    FILE *fp;
    char buf[2048];
    fp = fopen(path.c_str(), "r");
    if (fp) {
        while (fread(buf, 2048, 1, fp)) {
            contents += buf;

I guess you need to terminate the read chars in buf with '\0' (fread
returns the number of chars read).

        }
        fclose(fp);
        return true;

Better:

        bool success = (ferror(fp) == 0);
        fclose(fp); // return value can be ignored for read
        return success;

    } else {
        return false;
    }
}

This runs 4 times faster (literally) than the previous C++ version. Is
it possible to get the C++ version close to this speed?

It's no secret that iostreams are slow. Some people also don't like
their design and interfaces. But they have one advantage: operator<<,
which makes it very convenient to trace information. For other I/O
tasks I prefer functions that encapsulate the stdio functions, similar
to the one you have implemented.

Best wishes,
Roland Pibinger
 

Gan Quan

Kai-Uwe Bux wrote:
[snip]

Interesting, I did a simple test on the following 3 methods:

#include <cstdio>
#include <ctime>
#include <fstream>
#include <iostream>
#include <iterator>
#include <sstream>
#include <string>

using namespace std;

bool read_1(const string &path, string &contents)
{
    ifstream in;
    stringstream ss;
    bool status;

    contents.clear();
    in.open(path.c_str(), ios::in);
    if (in) {
        ss << in.rdbuf();
        contents = ss.str();
    }
    status = !in.fail();
    in.close();

    return status;
}

bool read_2(const string &path, string &contents)
{
    ifstream in;
    bool status;

    contents.clear();
    in.open(path.c_str(), ios::in);
    if (in) {
        string buffer(istreambuf_iterator<char>(in),
                      (istreambuf_iterator<char>()));
        contents.swap(buffer);
    }
    status = !in.fail();
    in.close();
    return status;
}

bool read_3(const string &path, string &contents)
{
    FILE *fp;
    char *snippet, *buffer;
    bool status = false;
    size_t i;

    if ((fp = fopen(path.c_str(), "r"))) {
        fseek(fp, 0, SEEK_END);
        buffer = new char[ftell(fp) + 1];
        fseek(fp, 0, SEEK_SET);

        snippet = buffer;
        while ((i = fread(snippet, sizeof(char), 2048, fp)) > 0) {
            snippet += i;
        }
        *snippet = '\0';

        status = feof(fp) ? true : false;
        fclose(fp);

        contents.clear();
        contents.assign(buffer);
        delete[] buffer;
    }
    return status;
}

int main(int argc, char *argv[])
{
    string path("/path/to/file");
    string contents;
    int i, j, k;
    clock_t c1, c2;

    i = j = k = 100;

    c1 = clock();
    while (i--) {
        read_1(path, contents);
    }
    c2 = clock();
    cout << "read_1(): " << ((double)c2 - c1) / CLOCKS_PER_SEC << endl;

    c1 = clock();
    while (j--) {
        read_2(path, contents);
    }
    c2 = clock();
    cout << "read_2(): " << ((double)c2 - c1) / CLOCKS_PER_SEC << endl;

    c1 = clock();
    while (k--) {
        read_3(path, contents);
    }
    c2 = clock();
    cout << "read_3(): " << ((double)c2 - c1) / CLOCKS_PER_SEC << endl;

    return 0;
}

Each method was called 100 times to read the contents of the same file
(file size: 2,395,008 bytes); clock() was used to measure time, and
each method was tested 3 times. The average times consumed by each
method are as follows:
read_1(): 80.49s
read_2(): 99.907s
read_3(): 13.578s
clock() is not so accurate, but since the differences are fairly
obvious, I think it's good enough for this test.
 

Kai-Uwe Bux

Gan said:
[snip]

You got lucky. On my machine, the numbers from your measurement code are
devastating for the istreambuf_iterator approach:

read_1(): 0.09
read_2(): 2.68
read_3(): 0.02

This is somewhat strange, because with a good STL implementation, method
read_2 could be really fast: one would need an overload of the string
constructor for istreambuf_iterator, and an implementation of
istreambuf_iterator that allows measuring the size of the file (that
would be an extension for internal use by the implementation). Then the
string could allocate the right amount of memory and dump the file into
the right place with one read. No further copying in memory should be
necessary. But apparently, the g++ implementation is very non-smart
about istreambuf_iterators: they managed to make it 27 times slower than
a stringstream approach and more than 100 times slower than FILE*-based
I/O. This sucks :-(


Thanks

Kai-Uwe Bux
 

Roland Pibinger

I did a simple test on the following 3 methods: [...]
[snip]

What about:

#include <stdio.h>
#include <string>


inline bool read_4 (const std::string &path, std::string &contents) {
    bool status = false;
    contents.resize(0); // may deallocate string buffer

    FILE* fp = fopen(path.c_str(), "r");
    if (fp) {
        fseek(fp, 0, SEEK_END);
        long len = ftell(fp);
        if (len > 0) {
            contents.reserve (len);
        }
        fseek(fp, 0, SEEK_SET);

        char buf[BUFSIZ] = "";
        size_t numread = 0;
        while ((numread = fread (buf, 1, sizeof (buf), fp)) > 0) {
            contents.append (buf, numread);
        }

        status = feof(fp) > 0 && ferror(fp) == 0;
        fclose(fp);
    }
    return status;
}
[snip]

Be aware though that you measure the performance of various caches and
buffers (disk, OS, program).

Best wishes,
Roland Pibinger
 

Alf P. Steinbach

* Gan Quan:
I'm writing a c++ program that has many (100+) threads read/write files
simultaneously. It works well if not considering the efficiency. The
file i/o seems to be the bottleneck.

Try calling sync_with_stdio(false).
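For reference, a minimal sketch of that call; note (my observation, not Alf's) that it only affects the standard streams, not named fstreams, which may be why the OP later reports it makes little difference here:

```cpp
#include <ios>

// Call once, before any I/O on the standard streams.
void disable_stdio_sync()
{
    // Decouples cin/cout/cerr from C's stdio buffers. It does not
    // change how named ifstream/ofstream objects buffer, so it may
    // not speed up file-reading loops like the ones in this thread.
    std::ios_base::sync_with_stdio(false);
}
```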
 

Yannick Tremblay

I'm writing a c++ program that has many (100+) threads read/write files
simultaneously. It works well if not considering the efficiency. The
file i/o seems to be the bottleneck.

This is my code to read from and write to files:

#include <fstream>
#include <sstream>
#include <string>

using namespace std;

bool write(const string &path, const string &contents,
           ios::openmode mode)
{
    ofstream out;
    bool status;

Why do you declare variables uninitialised at the top of
your function like that?

`bool status` will be unsafe.
`ofstream out` calls a default constructor that is not
needed and unsafe. This wastes a few CPU cycles but,
more importantly, you have created an invalid (well,
kind of invalid) object. Google RAII for more details.

    out.open(path.c_str(), mode);

Replace both lines with:

    ofstream out(path.c_str(), mode);

    if (!out.fail()) {
        out << contents;
    }

can become:

    if (out) {
        out << contents;
    }

    status = !out.fail();

Not needed; why not return !out.fail() directly?

    out.close();

Not needed; the ofstream destructor will close the file.

    return status;

can become:

    return !out.fail();

or

    return out.good();

The << operator might be quite slow since it does formatting
and all kinds of other things behind the scenes.
fstream.read() and fstream.write() might be faster, but
you will need to profile on your own platform.

For real optimisation and multithreading performance, you
will have to do more than just optimise reading and writing
one file. File locking and multiple threads accessing
the same file might be a real cause of concern.

For example, you could use a solution with a file object
knowing its own internal state, including potentially lock,
mutex, share status, buffering, etc. Depending on your
platform and your usage pattern, you may find memory-mapped
files beneficial.
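Putting Yannick's suggestions together, a minimal RAII-style sketch of the write function; `write_raii` is a hypothetical name, not from the thread:

```cpp
#include <fstream>
#include <string>

// RAII version: the ofstream is opened by its constructor and closed
// by its destructor, so no explicit open()/close() calls are needed.
bool write_raii(const std::string &path, const std::string &contents,
                std::ios::openmode mode = std::ios::out)
{
    std::ofstream out(path.c_str(), mode);
    if (!out) {
        return false;
    }
    out << contents;
    return out.good(); // destructor closes the file on scope exit
}
```

Because the stream closes itself on every return path, early returns and exceptions cannot leak the file handle, which matters in a program with 100+ threads holding descriptors.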



Yan
 

Larry Smith

Gan said:
[snip]

This runs 4 times faster (literally) than the previous C++ version. Is
it possible to get the C++ version close to this speed?

Yes.

You used the formatted IO operators (<< and >>) in your first
example, but you used the raw functions (fread and fwrite)
in your second example.

Try using fstream.read() and fstream.write(); these are
the raw functions corresponding to fread() and fwrite().
 

Gan Quan

Roland said:
What about:

#include <stdio.h>
#include <string>


inline bool read_4 (const std::string &path, std::string &contents) {
bool status = false;
contents.resize(0); // may deallocate string buffer
Indeed your solution runs faster. I thought it was because the buffer is
appended to the string during the loop, which avoids the big .assign(),
but it turned out to be the .resize(0) that made the big difference.
Commenting out the .resize(0) line makes the function run 3 times
slower. It's really odd; .reserve() will resize the string anyway, so
what's the difference between .resize() and .reserve()?



Yannick said:
Why do you declare variable uninitialised at the top of
your function like that?

bool status will be unsafe.
ofstream out calls a default constructor that is not
needed and unsafe. This waste a few CPU cycles but
more importantly, you have created an invalid (well,
kind of invalid) object. google RAII for more details.
I just thought it would make the code clearer.
not needed, why not return !out.fail() directly

not needed the ofstream destructor will close the file.
I was used to writing a close() right after the open() call, and I
assumed that out.fail() would be invalid after out.close() was executed,
hence the status variable.
[snip]

Larry said:
[snip]

Thanks for the good input, I'm working on replacing <</>> operators
with read() and write() calls.
 

Gan Quan

I have a new version of i/o functions now, comments interspersed:

bool read_cpp(const string &path, string &contents)
{
    ifstream in(path.c_str(), ios::in);
    char buf[BUFSIZ] = "";
    bool status = false;

    ios::sync_with_stdio(false); // commenting out this line
                                 // didn't make much difference

    contents.resize(0);
    in.seekg(0, ios::end);
    contents.reserve(in.tellg()); // surprisingly, commenting out
                                  // this line didn't make much
                                  // difference either
    in.seekg(0, ios::beg);

    while (in.good()) {
        in.read(buf, BUFSIZ);
        contents.append(buf, in.gcount()); // gcount(), not BUFSIZ:
                                           // the last read is short
    }
    status = in.eof();
    in.close();

    return status;
}

bool write_cpp(const string &path, const string &contents)
{
    ofstream out(path.c_str(), ios::out);
    bool status = false;

    if (out) {
        out.write(contents.c_str(), contents.length());
    }
    status = !out.fail();
    out.close();

    return status;
}

read_cpp() and write_cpp() run (almost) as fast as their
fread()/fwrite()-implemented counterparts.
Thanks to all you guys.
 

Roland Pibinger

contents.resize(0); just clears the contents of 'contents'.
Unfortunately, some std::string implementations thereby deallocate the
internal buffer (std::string internals are not standardized).
Indeed your solution runs faster. I thought it's because the buffer is
appended to the string during the loop, this avoided the big .assign(),
but it turned out to be the .resize(0) that made the big difference.
comment out the .resize(0) line just made the function runs 3 times
slower. It's really odd, .reserve() will resize the string anyway,
what's the difference between .resize() and .reserve()?

After .resize(n) the new .size() of the string is n; after .reserve(n)
only the .capacity() is at least n (the .size() is unchanged). It's
unspecified how long the reserved capacity will remain.
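The distinction can be checked directly; a small sketch (exact capacity values are implementation-defined, so only the guaranteed lower bounds are tested):

```cpp
#include <string>

// Returns true iff the standard size()/capacity() guarantees hold:
// reserve(n) leaves size() alone, resize(n) actually changes size().
bool demo_resize_vs_reserve()
{
    std::string s;

    s.reserve(100);            // capacity() >= 100, size() still 0
    bool ok = (s.size() == 0) && (s.capacity() >= 100);

    s.resize(100);             // size() == 100, filled with '\0'
    ok = ok && (s.size() == 100);

    s.resize(0);               // size() back to 0; whether the buffer
                               // is kept is unspecified, hence the
                               // performance surprise in this thread
    ok = ok && (s.size() == 0);
    return ok;
}
```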

Best wishes,
Roland Pibinger
 

Gan Quan

Roland said:
After .resize(n) the new .size() of the string is n, after .reserve(n)
only the .capacity() is n (not the .size()). It's unspecified how long
the reserved capacity will remain.
Thanks, but still: why does .resize(0) make such a huge difference in
performance? Without contents.resize(0); my code runs 3-4 times slower.
 

Roland Pibinger

I have a new version of i/o functions now, comments interspersed:

bool write_cpp(const string &path, const string &contents)
{
    ofstream out(path.c_str(), ios::out);
    bool status = false;

    if (out) {
        out.write(contents.c_str(), contents.length());
    }
    status = !out.fail();
    out.close();

You must check the success of .close(), otherwise you will not detect
some errors.

    return status;
}
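A sketch of Roland's point, folding the result of close() into the returned status; `write_checked` is a hypothetical name, not from the thread:

```cpp
#include <fstream>
#include <string>

// write_cpp variant that also checks close(): close() sets failbit on
// error, so testing fail() after close() catches late write errors
// (e.g. buffered data flushed only at close time).
bool write_checked(const std::string &path, const std::string &contents)
{
    std::ofstream out(path.c_str(), std::ios::out | std::ios::binary);
    if (!out) {
        return false;
    }
    out.write(contents.c_str(),
              static_cast<std::streamsize>(contents.length()));
    out.close(); // any error here is reflected in fail()
    return !out.fail();
}
```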

read_cpp() and write_cpp() runs as (almost) fast as their
fread()/fwrite()-implemented counterparts.

Since you have written an encapsulated function, that's merely an
implementation detail.

Best wishes,
Roland Pibinger
 
