Optimization of file reading...

Paolo

Hi!

I'm trying to read a very big file (a few GB at least, possibly
hundreds). I started out with i/ostream, but found two problems: the
streams can't handle 64-bit offsets efficiently, and they don't perform
well. I then switched to fread/fwrite with a 20 MB buffer. My question
is: is there an optimal buffer size? Should I change fread's internal
buffer instead? Which parameters must I take into consideration to
optimize reading?

Thank you!
 
Ian Collins

If your platform supports it, memory-map the file (mmap on UNIX-like
systems).
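
As a rough illustration of the mmap suggestion, here is a minimal sketch for a POSIX system. The helper name sum_bytes_mmap is hypothetical, and the byte sum only stands in for whatever processing the real program does. It assumes a 64-bit process, so that the whole file fits in the address space.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <fcntl.h>
#include <stdexcept>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical helper: sums every byte of a file through a read-only mapping.
// Assumes a 64-bit process, so the file size fits in std::size_t.
std::uint64_t sum_bytes_mmap(const char* path) {
    assert(path != nullptr);

    int fd = open(path, O_RDONLY);
    if (fd < 0) throw std::runtime_error("open failed");

    struct stat st;
    if (fstat(fd, &st) != 0) {
        close(fd);
        throw std::runtime_error("fstat failed");
    }

    std::size_t size = static_cast<std::size_t>(st.st_size);
    if (size == 0) {  // mmap rejects zero-length mappings
        close(fd);
        return 0;
    }

    void* p = mmap(nullptr, size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);  // the mapping stays valid after the descriptor is closed
    if (p == MAP_FAILED) throw std::runtime_error("mmap failed");

    // The kernel pages the file in on demand as the bytes are touched.
    const unsigned char* data = static_cast<const unsigned char*>(p);
    std::uint64_t sum = 0;
    for (std::size_t i = 0; i < size; ++i) sum += data[i];

    munmap(p, size);
    return sum;
}
```

In a 32-bit process a file of this size cannot be mapped in one piece. It would have to be mapped one window at a time, with each window starting at an offset that is a multiple of the page size.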
 
Jacek Dziedzic

I do not know the answer to your question, but out of curiosity, were
you reading your files in binary or in text mode when you noticed
fstream's poor performance?

- J.
 
Paolo

I used binary mode. I didn't notice the poor performance myself, but I
read about it in some discussions, maybe in this group too.
 
Paolo

Any ideas?
 
Victor Bazarov

Try asking in a database newsgroup; they have plenty of tricks up their
sleeves. Also, the optimal buffer size for any I/O operation is very
specific to your platform, so consider asking in the newsgroup for your
OS. Yet another idea: experiment on a smaller file with different
buffer sizes and find which one works best, although the best size may
well differ from program to program, depending on the surrounding
code. Any time you're concerned with performance, you should measure
first and only then decide what (if anything) can or should be done.

V
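
One way to run the experiment suggested above is sketched below. The names ReadResult and timed_read are hypothetical. The sketch turns off stdio's internal buffer with setvbuf, on the assumption that the chunk size passed to fread is the thing being compared; whether that helps or hurts is itself platform-specific and worth measuring.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <stdexcept>
#include <vector>

// Hypothetical result type: bytes read and elapsed wall-clock time.
struct ReadResult {
    std::uint64_t bytes;
    double seconds;
};

// Hypothetical helper: reads the whole file with fread, buffer_size bytes
// at a time, and reports how long it took.
ReadResult timed_read(const char* path, std::size_t buffer_size) {
    assert(buffer_size > 0);

    std::FILE* f = std::fopen(path, "rb");
    if (!f) throw std::runtime_error("fopen failed");

    // Must be called before any other operation on the stream.
    // _IONBF asks stdio not to add a buffer of its own.
    std::setvbuf(f, nullptr, _IONBF, 0);

    std::vector<unsigned char> buf(buffer_size);
    std::uint64_t total = 0;

    auto start = std::chrono::steady_clock::now();
    std::size_t n;
    while ((n = std::fread(buf.data(), 1, buf.size(), f)) > 0) {
        total += n;  // real code would process buf[0..n) here
    }
    auto stop = std::chrono::steady_clock::now();

    std::fclose(f);
    return {total, std::chrono::duration<double>(stop - start).count()};
}
```

Calling timed_read on the same file with sizes from, say, 4 KB up to 64 MB and comparing the seconds field gives a first impression. Keep in mind that after the first pass the operating system may serve the file from its cache, so repeated runs over the same file can look much faster than a cold read.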
 
Paolo

Thank you very much for your answer! I posted only to find out whether
there are "global" rules that might help me, but it seems there aren't,
so I'll do some testing and choose the best size.
 