What is the fastest way to do Data File I/O in C++

elet.mirror

Hi everyone,

I am currently using Visual C++ 8 Standard.

I have a quite large data array (1GB) in the form of an unsigned char
array.
I need to save this to disk in 16-bit format (two chars form a 16-bit
value) or 32-bit format (half of 32 bit value empty).

I was wondering what class I should use to save this data as fast as
possible.
Saving it as bytes is not best, as unfortunately the reading program
(Matlab) will have to convert this to 16-bit, and this will take quite
long.

What matters is for this data to end up on disk as fast as possible in
16-bit or 32-bit format.

Any help would be greatly appreciated!

Thank you,
Best regards,
Tele
 
Alf P. Steinbach

Presumably when you write that you "have" that array you mean that
there's no problem holding it in memory.

So just fix up endianness (if required) in memory.

Then dump the whole thing to a file using relevant API functions.
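As a minimal sketch of the in-memory fix-up, assuming the byte pairs are stored in the opposite order from what the reader expects: the helper name swap_byte_pairs is hypothetical, and whether a swap is needed at all depends on how the data was produced and on the byte order Matlab is told to read.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical helper: swap each adjacent byte pair in place, turning
// big-endian 16-bit values into little-endian ones (or the reverse).
// A trailing odd byte, if any, is left untouched.
void swap_byte_pairs(unsigned char* data, std::size_t size)
{
    assert(data != nullptr || size == 0);
    for (std::size_t i = 0; i + 1 < size; i += 2) {
        std::swap(data[i], data[i + 1]);
    }
}
```

If the pairs are already in the reader's byte order, this step can be skipped and the buffer written out unchanged.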

It has little or nothing to do with C++ classes, or C++ at all.

Follow-ups therefore set to [comp.os.ms-windows.programmer.win32], and
please in future post Windows questions to Windows groups, etc.
 
HumbleWorker

Use ofstream class.
 
Ian Collins

HumbleWorker said:
Use ofstream class.

That's probably bad advice!

The good advice is to ask on a windows group for the appropriate
platform specific API.
 
HumbleWorker

elet.mirror said:
Any particular reason why I should use ofstream,
as opposed to, for example, BinaryWriter or something else?

Regards,
Tele

BinaryWriters are .NET classes, and they are only wrappers around
streams. They may not wrap Standard C++ streams specifically, but they
definitely wrap something conceptually similar. If you pass ios::binary
to the fstream constructor, you get the closest equivalent: a stream
that writes bytes without text-mode translation.
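A minimal sketch of that approach, assuming the whole array is already in memory and can be handed to the stream in one call. The helper name write_with_ofstream is hypothetical; the point is only that ios::binary plus the unformatted write() member writes the bytes as they are.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>

// Hypothetical helper: write `size` raw bytes to `path` without any
// text-mode translation. Returns true if the stream is still good
// after writing and flushing.
bool write_with_ofstream(const char* path, const unsigned char* data, std::size_t size)
{
    assert(path != nullptr);
    std::ofstream out(path, std::ios::binary | std::ios::trunc);
    if (!out) {
        return false;
    }
    // write() is unformatted output: no conversion of the bytes takes place.
    out.write(reinterpret_cast<const char*>(data), static_cast<std::streamsize>(size));
    out.flush();
    return out.good();
}
```

Note that operator<< would format the values as text, so it is not what you want here.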

If you can send me your code, I can reply with a complete example.

Thanks.
HW
 
elet.mirror

Thanks, this information helps.
I didn't know BinaryWriters wrapped around streams, but I know how to
use ios::binary.
Thanks!
Tele
 
James Kanze

Ian Collins wrote:
That's probably bad advice!
The good advice is to ask on a windows group for the appropriate
platform specific API.

It depends. A good implementation of ofstream is likely to
handle buffering in some clever fashion which would require a
lot of work on your part to duplicate.

Given the quantity of data, of course, and the fact that he's
probably on a PC, IO bandwidth will probably be the limiting
factor, whatever solution he adopts, so he might as well do
whatever is simplest.
 
Ian Collins

James said:
It depends. A good implementation of ofstream is likely to
handle buffering in some clever fashion which would require a
lot of work on your part to duplicate.

I'd expect a memory mapped file (where supported) to be the fastest and
simplest solution, but obviously some testing would be required.

James said:
Given the quantity of data, of course, and the fact that he's
probably on a PC, IO bandwidth will probably be the limiting
factor, whatever solution he adopts, so he might as well do
whatever is simplest.

True, although the filesystem cache might mask this. Again, testing and
measuring beats speculation.
 
Gernot Frisch

I am currently using Visual C++ 8 Standard.


fstreams are quite slow in Visual Studio's implementation. I found
fopen/fwrite to be pretty fast, but the native Windows API might be an
even better choice.
comp.os.ms-windows.programmer.win32 might be the place to ask.
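For comparison, a minimal sketch of the fopen/fwrite route, assuming the data is written in fixed-size chunks. The helper name write_with_fwrite and the 1 MiB chunk size are illustrative choices, not measured optima.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Hypothetical helper: write `size` raw bytes to `path` with C stdio.
// "wb" opens the file in binary mode, so no newline translation occurs.
bool write_with_fwrite(const char* path, const unsigned char* data, std::size_t size)
{
    assert(path != nullptr);
    std::FILE* f = std::fopen(path, "wb");
    if (!f) {
        return false;
    }
    const std::size_t chunk = 1 << 20;  // 1 MiB per call; an arbitrary choice
    std::size_t done = 0;
    while (done < size) {
        const std::size_t n = (size - done < chunk) ? (size - done) : chunk;
        if (std::fwrite(data + done, 1, n, f) != n) {
            break;  // short write: stop and report failure below
        }
        done += n;
    }
    // fclose flushes the stdio buffer, so its result matters too.
    const bool closed = (std::fclose(f) == 0);
    return closed && done == size;
}
```

Whether this beats ofstream on a given compiler and disk is something only a measurement can settle.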
 
James Kanze

Ian Collins said:
I'd expect a memory mapped file (where supported) to be the
fastest and simplest solution, but obviously some testing
would be required.

It depends. For random access, memory mapped files almost
always beat anything else. For sequential access, I've seen
intelligent buffering win out (although more often for reading
than for writing).

Ian Collins said:
True, although the filesystem cache might mask this.

Given the size of his data, I'm not sure.

Ian Collins said:
Again, testing and measuring beats speculation.

I certainly agree with that. I was just throwing out some
ideas. If the original is not fast enough, and the profiler
shows that the problem isn't CPU per se, then he'll have to
experiment with different versions. I just have a sneaky
suspicion, however, that if filebuf is well implemented, and
he's not an expert on the system, he won't be able to beat it.

Of course, if the problem is CPU, because e.g. the standard
requires filebuf to consider locale-specific mappings, even if
the file type is binary, or to maintain a local buffer (whereas
he has all of his data already formatted and buffered), then it
shouldn't be too hard to come up with something which requires
very little CPU.

As to which case he's in, of course: only his profiler knows for
sure.
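In the spirit of measuring rather than guessing, here is a rough timing sketch. It assumes a C++11 or later compiler (newer than Visual C++ 8, which lacks <chrono>), and the helper names elapsed_ms and megabytes_per_second are hypothetical. It only measures wall-clock time until the call returns, so the file system cache can still make a write look faster than the disk really is.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper: run `work` once and return the elapsed
// wall-clock time in milliseconds.
template <typename Work>
double elapsed_ms(Work work)
{
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Hypothetical helper: convert a byte count and a duration in
// milliseconds into a throughput figure in MiB per second.
double megabytes_per_second(double bytes, double ms)
{
    assert(ms > 0.0);
    return (bytes / (1024.0 * 1024.0)) / (ms / 1000.0);
}
```

Wrapping each candidate write routine in a lambda and passing it to elapsed_ms gives comparable numbers, ideally over several runs and with a file large enough to defeat caching.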
 
Ian Collins

James said:
Given the size of his data, I'm not sure.

On this box (Solaris with ZFS), the file system will use as much of the
8GB of RAM as it can for cache - so it can make a big impact!
 
