fstream Buffers


Mike Copeland

I have the following routine that determines the time required to
copy a large text file to another. I'm comparing this code to a class I
wrote (too large to post here) that does text file I/O with an explicit
buffer size designation (4096-character buffers). The code below runs
12-13 times slower, although it does exactly the same work.
I tried to google "fstream buffer" with great confusion and little
understanding of what I'd need to do to change the code below to use
explicit buffers, hoping to get a better performance match to my old
processing.
I'd like to use C++ strings and I/O streams if possible, but I wonder
whether the overhead of these functions can match older C processing
(fgets, setbuf, etc.) at all. The following seems probable:
1. The overhead to size and allocate/release a string variable for each
logical record will always be slower than using an explicit char
[some_size] for the text string being read from the input file.
2. The "<<" operators are probably much slower than "fputs".
Are my assumptions valid? Are there options or function calls I can
use to specify fstream buffers (and how are they used)? Please advise.
TIA

//////////////////// Code ////////////////////
string  line;
fstream fVar1, fVar2;
fVar1.open("pat12.n00", fstream::in);
if(fVar1.is_open())
{
  fVar2.open("test.txt", fstream::out);
  while(getline(fVar1, line))
  {
    fVar2 << line << endl;
  }
  fVar1.close(), fVar2.close();
}
else cout << "Unable to open file";
//////////////////////////////////////////////
 

Joshua Maurice

   I have the following routine that determines the time required to
copy a large text file to another. I'm comparing this code to a class I
wrote (too large to post here) that does text file I/O with an explicit
buffer size designation (4096-character buffers). The code below runs
12-13 times slower, although it does exactly the same work. ....

First silly question: what compiler flags? Are optimizations enabled?
What compiler?
 

cartec69

while(getline(fVar1, line))
{
  fVar2 << line << endl;
}
fVar1.close(), fVar2.close();

std::endl sends a '\n' to the stream and then flushes it. Flushing after each and every line of output will likely kill performance.
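A minimal sketch of the same loop with the per-line flush removed (file
names taken from the original post):

#include <fstream>
#include <string>
using namespace std;

int main()
{
    ifstream src("pat12.n00");
    ofstream dest("test.txt");
    string line;
    while(getline(src, line))
        dest << line << '\n';   // '\n' appends a newline without flushing
}                               // destructors flush and close the streams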
 

Joshua Maurice

std::endl sends a '\n' to the stream and then flushes it. Flushing after each and every line of output will likely kill performance.

Heh. Good point.

I usually won't look at code in such questions until I know which
compiler and which compiler options are in use. It's always annoying to
debug performance problems in code that was built without
optimizations.
 

Jorgen Grahn

I have the following routine that determines the time required to
copy a large text file to another. ....

string  line;
fstream fVar1, fVar2;

Not very good names. 'src' and 'dest' would be better.

fVar1.open("pat12.n00", fstream::in);
if(fVar1.is_open())
{
  fVar2.open("test.txt", fstream::out);
  while(getline(fVar1, line))
  {
    fVar2 << line << endl;
                     ^^^^
You're flushing the stream for every line. That almost certainly is
the cause of your 12x slowdown. Fix that and measure again.

Remember, the C++ line terminator is the same as in C, namely '\n'.

/Jorgen
 

Jorgen Grahn

Not to mention buffering the data _three_ times: ....
That's a lot of CPU cycles wasted moving data between buffers, and more
system calls than necessary.

Probably insignificant compared to the constant flushing, though.

/Jorgen
 

Mike Copeland

   I have the following routine that determines the time required to
copy a large text file to another. I'm comparing this code to a class I
wrote (too large to post here) that does text file I/O with explicit
buffer size designation (4096-character buffers). ....

First silly question: what compiler flags? Are optimizations enabled?
What compiler?
Unfortunately, a "silly answer": I don't know what compiler flags
and/or efficiencies are involved (don't know how to find out...), and
it's VS6.0 (which everyone here condemns, but it's all I have.
The real point is that this code is in linear sequence in a program,
the old calls followed by the posted code. I also reversed the order of
the tests, to assure that there's not a "caching" advantage that the
second process has versus the first. Whether the above code is first or
second, the time it takes is 12-13 times as long as the old
code/technique (fgets/fputs/setbuf). 8<{{
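For what it's worth, here is a self-contained sketch of the kind of
back-to-back comparison described above. The file names come from the
thread; everything else is illustrative, since the original class isn't
posted (note it uses setvbuf rather than setbuf, because setbuf takes
no size argument):

//////////////////// Code ////////////////////
#include <cstdio>
#include <ctime>
#include <fstream>
#include <string>

// C-style copy: fgets/fputs with explicit 4096-byte stream buffers.
static void copy_stdio(const char* src, const char* dst)
{
  char inbuf[4096], outbuf[4096], line[4096];
  FILE* in  = fopen(src, "r");
  FILE* out = fopen(dst, "w");
  if(!in || !out) return;
  setvbuf(in,  inbuf,  _IOFBF, sizeof inbuf);
  setvbuf(out, outbuf, _IOFBF, sizeof outbuf);
  while(fgets(line, sizeof line, in))
    fputs(line, out);
  fclose(in), fclose(out);
}

// iostream copy, with '\n' instead of endl (no per-line flush).
static void copy_fstream(const char* src, const char* dst)
{
  std::ifstream in(src);
  std::ofstream out(dst);
  std::string line;
  while(std::getline(in, line))
    out << line << '\n';
}

int main()
{
  std::clock_t t0 = std::clock();
  copy_stdio("pat12.n00", "copy1.txt");
  std::clock_t t1 = std::clock();
  copy_fstream("pat12.n00", "copy2.txt");
  std::clock_t t2 = std::clock();
  std::printf("stdio:   %.2fs\n", double(t1 - t0) / CLOCKS_PER_SEC);
  std::printf("fstream: %.2fs\n", double(t2 - t1) / CLOCKS_PER_SEC);
  // rerun with the two calls swapped to rule out cache warm-up
}
//////////////////////////////////////////////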
 

Mike Copeland

std::endl sends a '\n' to the stream and then flushes it. Flushing after each and every line of output will likely kill performance.
Okay, that's useful. Guess I'll have to append a '\n' to the data
line prior to the "<<" operator. 8<}}
 

Mike Copeland

(e-mail address removed) says...
Okay, that's useful. Guess I'll have to append a '\n' to the data
line prior to the "<<" operator. 8<}}
Doing so has no effect whatsoever; code follows:

line += '\n';
fVar2 << line;
 

Mike Copeland

Not very good names. 'src' and 'dest' would be better.

You're flushing the stream for every line. That almost certainly is
the cause of your 12x slowdown. Fix that and measure again.

Remember, the C++ line terminator is the same as in C, namely '\n'.
And doing so (appending the '\n' to "line" prior to writing it) has
no effect. I.e.

while(getline(fVar1, line))
{
  line += '\n';
  fVar2 << line;
}

Same 12x-13x performance degradation with this code versus
fgets/fputs/setbuf. 8<{{
 

Mike Copeland

Can you invoke the operating system utility that is optimized for
copying files (e.g. system("cp pat12.n00 test.text");)?

You'll never be able to approach optimal performance using buffering
in the application (either via C++ streams or using vanilla <stdio.h>).

   That's not the point here: I'm writing applications that do a lot of
text file I/O and am trying to convert fgets/fputs/setbuf logic (with C-
style strings) to fstream/string processing. My first tests of such
modifications seemed to show a great slowdown; I wrote some simple "file
copy" tests to see if that's so, and I found an enormous performance
"hit" with the new logic.
   My guess was that the absence of a user-defined I/O buffer was the
major source of the degradation, and I sought a way to achieve the
"setbuf" functionality with something that fstream might provide.
Searching the Web for information hasn't led to an answer, so I asked in
this NG.
   I don't much care about actual "file copying", per se. My goal here
is to (greatly) improve the performance of fstream/string usage in
applications I write, as it's a deal-killer for now... 8<{{
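For the record, the stream-level equivalent of setbuf is pubsetbuf on
the stream's buffer object. A sketch (note the call must come before
open(); whether the implementation actually honors the request is
implementation-defined, and an old library like VC6's may well ignore
it):

char inbuf[4096], outbuf[4096];
fstream src, dest;
src.rdbuf()->pubsetbuf(inbuf, sizeof inbuf);    // before open()!
dest.rdbuf()->pubsetbuf(outbuf, sizeof outbuf);
src.open("pat12.n00", fstream::in);
dest.open("test.txt", fstream::out);
// ... copy as before ...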
 

Joshua Maurice

   Unfortunately, a "silly answer": I don't know what compiler flags
and/or efficiencies are involved (don't know how to find out...),

You need to learn your tools. Determine if it's an optimized build or
not, and get back to us. If you're using default settings, there's
probably a toolbar with a couple dropdowns, one for "config" and one
for "platform". Currently, the toolbar in my IDE reads; "Debug"
"Win32". Change "Debug" to "Release", compile, run, and see if that
helps.
and
it's VS6.0 (which everyone here condemns, but it's all I have).

Visual Studio 2008, 2010, etc., are all free for download, at least the
Express versions, and AFAIK they're all you need unless you're doing
some Microsoft-specific coding (beyond the basic Win32 API), which I
know nothing about. It's sufficient for my uses.
   The real point is that this code runs in linear sequence in a program,
the old calls followed by the posted code. ....

The difference is that the C++ standard library, and good C++ code in
general, depends heavily for its efficiency on good optimization,
including inline expansion of functions. If you don't have the basic
optimization flags on, chances are that the C++ standard library
fstream solution will be much slower.

In addition to what was noted else-thread: std::endl writes a '\n' and
does a flush. You probably don't want to flush that often. You
probably want to flush only at the end.
 

Joshua Maurice

   Doing so has no effect whatsoever; code follows:

    line += '\n';
    fVar2 << line;

Instead, for this context I'd suggest:
fVar2 << line << '\n';

Also, fstream exposes a read() member function. You can do the same
fixed-size buffer approach, and that may get you better results. As
is, you're reading in potentially very small lines, resulting in more
work than necessary.

Also, are you copying a text file or some other kind of file? Look up
std::ios::binary if you don't want garbled binary files.
http://cplusplus.com/reference/iostream/ios_base/openmode/
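A sketch of that fixed-buffer approach (4096 bytes, to match the
original class; ios::binary so nothing gets translated):

ifstream src("pat12.n00", ios::in | ios::binary);
ofstream dest("test.txt", ios::out | ios::binary);
char buf[4096];
while(src)
{
    src.read(buf, sizeof buf);      // reads up to sizeof buf bytes
    dest.write(buf, src.gcount());  // write however many arrived
}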
 

Jorgen Grahn

And doing so (appending the '\n' to "line" prior to writing it) has
no effect. I.e.

while(getline(fVar1, line))
{
  line += '\n';
  fVar2 << line;
}

Same 12x-13x performance degradation with this code versus
fgets/fputs/setbuf. 8<{{

I can't repeat that result here (Linux, gcc -Os).

fVar2 << line << endl;

./filecopy 0.89s user 2.81s system 98% cpu 3.774 total
./filecopy 0.78s user 2.77s system 97% cpu 3.623 total
./filecopy 0.68s user 2.68s system 98% cpu 3.421 total

fVar2 << line << '\n';

./filecopy 0.36s user 0.19s system 97% cpu 0.565 total
./filecopy 0.35s user 0.18s system 70% cpu 0.750 total
./filecopy 0.34s user 0.22s system 56% cpu 0.996 total

line +='\n'; fVar2 << line;

./filecopy 0.32s user 0.14s system 96% cpu 0.475 total
./filecopy 0.32s user 0.15s system 95% cpu 0.488 total
./filecopy 0.31s user 0.18s system 68% cpu 0.711 total

cp a b # standard Unix tool

cp a b 0.00s user 0.16s system 96% cpu 0.166 total
cp a b 0.00s user 0.15s system 95% cpu 0.155 total
cp a b 0.00s user 0.16s system 94% cpu 0.166 total

The version with the flushing is wedged in a myriad of system calls. The
others read and write (on my system) 8 kB at a time. 'cp' does the
same system calls, but 32 kB at a time. It doesn't try to find lines
either, so it's not surprising it's quite a bit faster. You normally
read line-by-line because you want to parse the text line by line.

If you're not seeing any difference, perhaps it's the very old
compiler you're using (mentioned elsethread) or something specific to
Windows. I have no experience with either of them.

/Jorgen
 

Dombo

On 18-May-12 0:13, Mike Copeland wrote:
   Unfortunately, a "silly answer": I don't know what compiler flags
and/or optimizations are involved (I don't know how to find out...), and
it's VS6.0 (which everyone here condemns, but it's all I have).

I vaguely remember there was an issue with the iostream implementation
that came with Visual Studio 6.0 which made it very slow. I believe it
had something to do with constantly syncing with stdio, and that it
could be disabled.
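If it is the stdio synchronization, the standard switch is
sync_with_stdio. A sketch; note the standard specifies it for the
standard stream objects (cin, cout, etc.), so whether it helps VC6's
fstreams as well is another matter:

#include <iostream>

int main()
{
    std::ios_base::sync_with_stdio(false); // call before any I/O
    // ... rest of the program ...
}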
 

Adam Skutt

Not to mention buffering the data _three_ times:

  once in the kernel file/buffer cache (at one or more page (typ. 4096 byte)  granularity)

You're done once the data reaches here. There's no guaranteed way to
bypass this caching, nor any value in doing so for most applications
(certainly any application using fstream). We can pretend it does not
exist.
  once in the streams (or stdio) buffer (at streams buffer/setvbuf (1024/2048 byte) granularity)
  once in the application buffer.  (at line size granularity)

That's a lot of CPU cycles wasted moving data between buffers, and more
system calls than necessary.

Moving data between buffers is actually pretty cheap, especially
compared to making system calls. However, if more system calls than
necessary are occurring, it's a bug in the iostream implementation, not
in his code. I suspect that while not strictly optimal, it's probably
close enough.

Adam
 

Adam Skutt

Can you invoke the operating system utility that is optimized for
copying files (e.g. system("cp pat12.n00 test.text");)?

Please don't do this. For starters, it's incredibly difficult to use
the system() function securely--in a few cases it simply cannot be
used securely. Secondly, for all the effort involved, you could just
copy the code from a BSD implementation or something similar.
You'll never be able to approach optimal performance using buffering
in the application (either via C++ streams or using vanilla <stdio.h>).

Actually, one can, but it depends on the iostream implementation.
Using pread/pwrite with 1MB transfers is probably the best you'll be able
to do.

I'm not sure why you think 1MB [sic] is the right transfer size. I'm
not aware of any cp(1) implementation that uses a buffer that large,
because it's pointless.
 And you may want to open the files with O_DIRECT to avoid the kernel
buffer/file cache, which may help avoid filling the OS file cache with
"access once" data.  On POSIX  systems anyway.

Nope. There's no guarantee that it does any of those things, and it
has terrible semantics for correct usage. Again, why are you
recommending things that cp(1) doesn't do?
System copy utilities will generally preserve 'holes' in the file as well, where
a less sophisticated copy algorithm will fill the holes (and the resulting output
file will consume more disk space).

This is not relevant here and plainly apparent from the OP.

Adam
 

Adam Skutt

I'm not sure why you think 1MB [sic] is the right transfer size.  I'm
not aware of any cp(1) implementation that uses a buffer that large,
because it's pointless.

Twenty years of measuring and optimizing disk performance, particularly
on large Oracle installations.

I'm not sure why you think that is remotely relevant here. The OP is
plainly not writing a part of, nor a replacement for, Oracle.
Optimal performance, of course.  cp is hardly optimal. dd(1) or xdd(1)
are much better from a performance standpoint.

It hits the I/O limit on my systems with no problems. He's
manipulating text, so optimal performance in the I/O stack is very
likely to be pointless anyway. His largest source of overhead is
likely to be in whatever processing his application is actually
doing. Your suggestions are both factually incorrect and
irrelevant. dd(1) also doesn't do direct I/O unless asked, FWIW.
O_DIRECT is guaranteed to  DMA the data from the device into the
application buffer, completely bypassing any kernel or glibc buffers.

Nope. All it does is disable any _caching_ done by the kernel, if
possible. On modern Linux, that means that I/O skips the page cache
for the file data only. It doesn't necessarily result in direct DMA
because that's simply not possible on some systems. If the I/O
requires bounce buffers, then there still will be a copy operation in
the kernel. If the file is open anywhere else, then you still get
manipulation of the page cache in order to guarantee cache coherency.
If the filesystem needs to do some magic to ensure consistency (e.g.,
block-level checksums, ordered writes) then those will still happen,
and they may or may not involve a copy operation or some other I/O
"performance killing" behavior.

That's ignoring all the other implementation requirements for using
O_DIRECT:
* You're not guaranteed it's supported. Some file systems simply do
not support it, and the open(2) call will fail. Some operating
systems simply do not support the flag. There's no portable way to
determine whether the flag is supported.
* You have to align everything correctly, and there is no portable way
to find out the correct alignment, never mind the optimal one.

All of that for a rather dubious performance benefit, and direct I/O
as you describe it is essentially impossible on some modern
filesystems (e.g., ZFS, BTRFS). Moreover, I would suspect the VM
gymnastics required to allow the FS to work correctly would kill a lot
of the performance, especially when compared to just copying the
buffer. TLB flushes and messing with page tables are expensive.
Making copies is not expensive up to a pretty large point.
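To make the correct-usage burden concrete, this is roughly what
O_DIRECT demands on Linux/glibc; the 4096-byte alignment is an
assumption, since there is no portable way to query the requirement:

#define _GNU_SOURCE
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

int main()
{
    int fd = open("pat12.n00", O_RDONLY | O_DIRECT);
    if(fd < 0)
        return 1;   // open(2) may simply fail: some file systems and
                    // OSes do not support the flag at all
    void* buf = 0;
    if(posix_memalign(&buf, 4096, 64 * 4096) != 0)
        return 1;   // the buffer address must be suitably aligned...
    ssize_t n;
    while((n = read(fd, buf, 64 * 4096)) > 0)
        ;           // ...and so must the transfer size
    free(buf);
    close(fd);
    return 0;
}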

Adam
 

Ian Collins

I have the following routine that determines the time required to
copy a large text file to another. I'm comparing this code to a class I
wrote (too large to post here) that does text file I/O with explicit
buffer size designation (4096-character buffers). The code below runs
12-13 times slower, although it does exactly the same work. ....

You haven't provided enough information (anywhere in the thread) to get
a meaningful answer.

What is the code that runs so much faster?

Is the data format line-based or record-based?
 

Mike Copeland

   I tried to google "fstream buffer" with great confusion and little
understanding of what I'd need to do to change the code below to use
explicit buffers, hoping to get a better performance match to my old
processing. ....

You haven't provided enough information (anywhere in the thread) to get
a meaningful answer.

What is the code that runs so much faster?

Is the data format line-based or record-based?

   The code above _is_ the code that runs much slower, and the code that
I'm basing it on uses fgets/fputs and setbuf with a size of 4096
characters/bytes. That is the only distinction between the two code
fragments, but including the I/O class definitions I wrote to prepare
and handle the fgets/fputs with FILE* and open/close logic is too much
to post here.
   Bottom line: the executing code is only comparing FILE*
fgets/fputs/setbuf with the fstream getline/<< code above. The only
conclusion I can see is that either the absence of a "setbuf" capability
with fstreams is an enormous performance hit... or getline/<< is a
terrible way to do text file I/O. 8<{{
 
