Having trouble with setvbuf with large buffer size under linux

grunes

I wish to fread a few thousand bytes at a time in turn from several
very large files from a DVD data disk, under Redhat Fedora Linux.

When this is done the DVD drive wastes a lot of time and almost shakes
itself to pieces.

I tried using setvbuf, with large buffers, e.g. (example only, not
checked):

#include <stdio.h>
#include <string.h>
FILE *FP1,*FP2;
#define big (32*1024*1024)
char buf1(big),buf2(big);
char a(1024),b(1024);
int i;

if(FP1=fopen("/mnt/dvd/file1","rb") == NULL) {
printf("bad fopen\n");
exit(1);
}
if(setvbuf(FP1,buf1,_IOFBF,big) {
printf("bad setvbuf\n");
exit(1);
}
if(FP2=fopen("/mnt/dvd/file2","rb") == NULL) {
printf("bad fopen\n");
exit(1);
}
if(setvbuf(FP2,buf2,_IOFBF,big) {
printf("bad setvbuf\n");
exit(1);
}
for (i=0,i<1024*1024*1024; i+=1024) {
if (fread(a,sizeof(char),1024,FP1) != 1024) {
printf("bad fread\n");
exit(1);
}
if (fread(b,sizeof(char),1024,FP2) != 1024) {
printf("bad fread\n");
exit(1);
}
... play with a and b ...
}

Why does the value of "big" have no effect on time or number of drive
shakes? What can I do instead? (e.g., are there free substitutes for
stdio that work?)

Obviously I could create
my_fopen, my_fopen64, my_fread, my_fseek, my_fseeko, my_fscanf,
my_ftell, my_getc...

which use the standard library calls to read large blocks sequentially
into my own buffers, but it would be silly to virtually recreate
stdio. It would also be nice to be able to efficiently use other
people's software, without substituting my routine names for all the
standard ones.
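
For concreteness, here is a minimal sketch of the "own buffers" idea, in plain standard C. The names chunk_reader, chunk_open, chunk_read and chunk_close are hypothetical, made up for this illustration. Each file gets one large private buffer that is refilled with a single big fread, and small records are then copied out of it. Whether this actually reduces seeking depends on the OS and the drive, which this sketch cannot control.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper type: one large private buffer per input file. */
typedef struct {
    FILE  *fp;   /* underlying stream */
    char  *buf;  /* private chunk buffer */
    size_t cap;  /* capacity of buf in bytes */
    size_t len;  /* number of valid bytes currently in buf */
    size_t pos;  /* index of the next unread byte in buf */
} chunk_reader;

/* Open path for binary reading and allocate a cap-byte chunk buffer.
   Returns 0 on success, -1 on failure. */
int chunk_open(chunk_reader *r, const char *path, size_t cap)
{
    assert(r != NULL && path != NULL && cap > 0);
    r->fp = fopen(path, "rb");
    if (r->fp == NULL)
        return -1;
    r->buf = malloc(cap);
    if (r->buf == NULL) {
        fclose(r->fp);
        r->fp = NULL;
        return -1;
    }
    r->cap = cap;
    r->len = 0;
    r->pos = 0;
    return 0;
}

/* Copy up to n bytes into dst, refilling the chunk buffer with one large
   fread whenever it runs dry. Returns the number of bytes copied, which is
   less than n only at end of file or on a read error. */
size_t chunk_read(chunk_reader *r, void *dst, size_t n)
{
    size_t done = 0;
    size_t take;

    while (done < n) {
        if (r->pos == r->len) {
            r->len = fread(r->buf, 1, r->cap, r->fp);
            r->pos = 0;
            if (r->len == 0)
                break;  /* end of file or error: nothing more to hand out */
        }
        take = r->len - r->pos;
        if (take > n - done)
            take = n - done;
        memcpy((char *)dst + done, r->buf + r->pos, take);
        r->pos += take;
        done += take;
    }
    return done;
}

/* Release the stream and the buffer. */
void chunk_close(chunk_reader *r)
{
    if (r->fp != NULL)
        fclose(r->fp);
    free(r->buf);
    r->fp = NULL;
    r->buf = NULL;
    r->len = 0;
    r->pos = 0;
}
```

With two readers opened with a cap of, say, 32*1024*1024, the loop would call chunk_read(&r1, a, 1024) and chunk_read(&r2, b, 1024) in turn. It still leaves the problem that other people's code calling fread directly would not benefit.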
 
Jens.Toerring

grunes said:
I wish to fread a few thousand bytes at a time in turn from several
very large files from a DVD data disk, under Redhat Fedora Linux.
When this is done the DVD drive wastes a lot of time and almost shakes
itself to pieces.
I tried using setvbuf, with large buffers, e.g. (example only, not
checked):
#include <stdio.h>
#include <string.h>
FILE *FP1,*FP2;
#define big (32*1024*1024)
char buf1(big),buf2(big);
char a(1024),b(1024);

What's that? Didn't you mean e.g. "buf1[big]" etc.? And do you
realize that there's normally a size limit for automatic arrays?
As far as I remember you can't count on more than 65536 bytes.
So you might be better off malloc()ing the buffers.

if(FP1=fopen("/mnt/dvd/file1","rb") == NULL) {
printf("bad fopen\n");
exit(1);
}
if(setvbuf(FP1,buf1,_IOFBF,big) {
printf("bad setvbuf\n");
exit(1);
}
if(FP2=fopen("/mnt/dvd/file2","rb") == NULL) {
printf("bad fopen\n");
exit(1);
}
if(setvbuf(FP2,buf2,_IOFBF,big) {
printf("bad setvbuf\n");
exit(1);
}
for (i=0,i<1024*1024*1024; i+=1024) {
if (fread(a,sizeof(char),1024,FP1) != 1024) {
printf("bad fread\n");
exit(1);
}
if (fread(b,sizeof(char),1024,FP2) != 1024) {
printf("bad fread\n");
exit(1);
}
... play with a and b ...
}
Why does the value of "big" have no effect on time or number of drive
shakes? What can I do instead? (e.g., are there free substitutes for
stdio that work?)

That's something that's not a C question - the C standard doesn't
talk about the "shaking of DVD drives" nor even, shockingly enough,
about DVD drives at all... Chances are that this is nothing you can
influence from your userland program at all (<OT> just think about
what influence e.g. the size of kernel buffers, OS timing issues,
the way the driver for the DVD drive is written etc. may have </OT>).
Perhaps you get some reasonable answers in
comp.os.linux.development.apps.
Regards, Jens
 
grunes

Maybe I should give a working program, which I have in fact tested!
You should eject, then remount the disk just before running it, so
execution time is not lessened by disk buffer caching (otherwise the
2nd run would be much faster, regardless of software efficiency,
because the file would already be in memory).



/* Why doesn't the size of "big" affect execution time much? */

#include <stdio.h>
#include <stdlib.h>

#define big (32*1024*1024) /* Buffer size */
/*#define big (1024) */ /* My other trial value */
#define fsiz (128*1024*1024) /* File size */

int main(int argc, char *argv[]) {
    FILE *F1, *F2;
    char *b1, *b2;          /* stdio buffers handed to setvbuf */
    char a[1024], b[1024];  /* one record from each file */
    int i;

    if ((b1 = malloc(big)) == NULL) {
        printf("bad malloc\n");
        exit(1);
    }

    if ((b2 = malloc(big)) == NULL) {
        printf("bad malloc\n");
        exit(1);
    }

    F1 = fopen("/mnt/cdrom/file1", "rb");
    if (F1 == NULL) {
        printf("bad fopen\n");
        exit(1);
    }

    /* setvbuf is called right after fopen, before any read on the stream */
    if (setvbuf(F1, b1, _IOFBF, big)) {
        printf("bad setvbuf\n");
        exit(1);
    }

    F2 = fopen("/mnt/cdrom/file2", "rb");
    if (F2 == NULL) {
        printf("bad fopen\n");
        exit(1);
    }

    if (setvbuf(F2, b2, _IOFBF, big)) {
        printf("bad setvbuf\n");
        exit(1);
    }

    /* Read 1024 bytes from each file in turn */
    for (i = 0; i < fsiz; i += 1024) {
        if (fread(a, 1, 1024, F1) != 1024) {
            printf("bad fread\n");
            exit(1);
        }
        if (fread(b, 1, 1024, F2) != 1024) {
            printf("bad fread\n");
            exit(1);
        }
    }
    return(0);
}
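
To put numbers on the comparison between buffer sizes, a small timing helper along these lines might be useful. This is only a sketch: timed_read is a hypothetical name, and it uses the standard time() and difftime(), which typically resolve to whole seconds only, so it is meaningful only for reads that take many seconds (as they do from a DVD). It says nothing about why the times do or do not change.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical helper: read the file at path to end of file in
   record-sized fread calls, using a stdio buffer of bufsize bytes.
   Stores the total number of bytes read in *total and returns the
   elapsed wall-clock time in seconds, or -1.0 on failure. */
double timed_read(const char *path, size_t bufsize, size_t record,
                  size_t *total)
{
    FILE *fp;
    char *iobuf, *rec;
    size_t got;
    time_t start, end;

    assert(path != NULL && bufsize > 0 && record > 0 && total != NULL);
    *total = 0;

    fp = fopen(path, "rb");
    if (fp == NULL)
        return -1.0;

    iobuf = malloc(bufsize);
    rec = malloc(record);
    /* setvbuf has to come before any other operation on the stream */
    if (iobuf == NULL || rec == NULL ||
        setvbuf(fp, iobuf, _IOFBF, bufsize) != 0) {
        fclose(fp);
        free(iobuf);
        free(rec);
        return -1.0;
    }

    start = time(NULL);
    while ((got = fread(rec, 1, record, fp)) > 0)
        *total += got;
    end = time(NULL);

    fclose(fp);   /* close the stream before freeing the buffer it uses */
    free(iobuf);
    free(rec);
    return difftime(end, start);
}
```

Calling it once with bufsize set to 1024 and once with 32*1024*1024 on the same file, with the disk ejected and remounted between runs, gives the two times to compare.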
 
Jens.Toerring

grunes said:
Maybe I should give a working program, which I have in fact tested!
You should eject, then remount the disk just before running it, so
execution time is not lessened by disk buffer caching (otherwise the
2nd run would be much faster, regardless of software efficiency,
because the file would already be in memory).

Ok, your program is written in C, and that's probably why you
think it is useful to post your question in comp.lang.c. But C
as a language makes no promises at all about how fast a program
is going to run, how short disk access times are or how much
"DVD shaking" is allowed. All C as a language is concerned with
is that it works - it doesn't distinguish how fast or slow.

You have tried a standard C conforming solution by playing around
with setvbuf(). But you have found that it doesn't help you. Now
you have to start looking behind the scenes and try to understand
what are the real factors, most probably related to things like
the OS you're using, the way drivers talk with the DVD drive and
dozens of other factors. But for getting help on these kinds of
topics you have to look somewhere else. Since you are running
some kind of Linux comp.os.linux.development.apps would seem to
be the first place to go to, or comp.os.linux.development.system,
if you find that you have to deal with real low level issues.

Regards, Jens
 
grunes

What's that? Didn't you mean e.g. "buf1[big]" etc.? And do you
realize that there's normally a size limit for automatic arrays?
As far as I remember you can't count on more than 65536 bytes.
So you might be better off malloc()ing the buffers.

Oops. Both of those, as well as the inappropriate include files, were
fixed in my second post, where I used code I had actually tried. Not
doing that the first time was a mistake.

That's something that's not a C question - the C standard doesn't
talk about the "shaking of DVD drives" nor even, shockingly enough,
about DVD drives at all... Chances are that this is nothing you can
influence from your userland program at all (<OT> just think about
what influence e.g. the size of kernel buffers, OS timing issues,
the way the driver for the DVD drive is written etc. may have </OT>).
Perhaps you get some reasonable answers in
comp.os.linux.development.apps.

I'm not talking ANSI standard C in general. I'm talking recent
versions of Redhat Fedora Linux, compiled by gcc using default options
(except that I use -O). Clearly setvbuf size specification simply does
not work in this environment. I don't actually need to know why
changing buffer size does not reduce the number of hardware disk
seeks. I just need to know that it does not, and that I need it to, as
does anyone who is trying to set buffer size to speed up access to
multiple files, or reduce wear and tear on the drive.

(The same problem might apply to data from hard disk drives, but it is
more important on CD and DVD drives, because of the slower seek, and
because the mechanisms used on many CD and DVD drives are more
easily worn out than those used on modern hard drives.)
 
Richard Bos

I'm not talking ANSI standard C in general.

Then why do you ask in a newsgroup that _does_ talk ISO C?
I'm talking recent versions of Redhat Fedora Linux,

Then you really do need to visit a system-specific group. What, you
think we're experts on every single system there was ever a C compiler
for? This group is for C, not for C-and-some-guys-system-setup.

Richard
 
