Help with "read" issue please

ZafT

Thanks in advance for any tips that might get me going in the right
direction.

I am working on a simple exercise for school that is supposed to use read to
read a file (about 10 MB). I am supposed to change the buffer size and see
how this affects the read time. In other words, the buffer is supposed to
limit how much of the file gets read per call, and cause some change in
speed. I am supposed to do the same with fread as well, but I'm just asking
for help on read for now.

Everything works like a charm if I have the buffer set to the same size as
the file, but if I make the buffer size smaller than the file, I get a core
dump - segmentation fault.

Here's all the code. Do I need to put the data somewhere or clear the
buffer after a pass or something? I'm guessing I'm stepping out of my
buffer when it gets full.

#include <fstream>
#include <time.h>
#include <iostream>
using namespace std;


int main(){

cout << "start \n" ;

char buff[100];
int length;
time_t seconds;

ifstream inputFile ("TEST", ios::binary);

inputFile.seekg (0, ios::end);
length = inputFile.tellg();
inputFile.seekg (0, ios::beg);

seconds = time (NULL);

cout << seconds << " Seconds at start \n" << length << " bytes\n" ;

// If buffer is same size as the file, I have no problem, but if I
// drop the buff size - CRASH!
while(inputFile.read(buff,10000000)){
}

seconds = time(NULL);

cout << seconds << " Seconds at finish \n" ;

}


Any help is appreciated!


Shane
 
Phlip

ZafT said:
char buff[100];
while(inputFile.read(buff,10000000)){

The only thing read() knows about your buf is where it begins. read() relies
on you to tell it where the buf array ends. That's why lying in the second
argument is not a good idea. Pass the actual length of the buf array.

You are learning that the C languages are very mechanical: a compiler will
accept any well-formed source code you give it, regardless of whether it
makes sense.
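
A minimal sketch of the advice above, assuming a C++11 compiler. The function name count_bytes_100 is made up for illustration and is not from the original post. It passes the real capacity of the array to read(), and uses gcount() so the last, shorter chunk is still counted:

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>

// Hypothetical helper: reads a whole file through a 100-byte buffer and
// returns how many bytes were delivered in total.
std::size_t count_bytes_100(const char* path)
{
    assert(path != nullptr);

    char buff[100];
    std::ifstream in(path, std::ios::binary);
    std::size_t total = 0;

    // sizeof buff is the true capacity of the array, so read() is never
    // asked to store more than the buffer can hold. The final chunk is
    // usually short: read() then reports failure, but gcount() still says
    // how many bytes were stored, so the loop handles that case too.
    while (in.read(buff, sizeof buff) || in.gcount() > 0)
    {
        total += static_cast<std::size_t>(in.gcount());
    }
    return total;
}
```

If the returned total matches the file length, the loop really did walk the whole file rather than stopping after the first chunk.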
 
ZafT

Phlip said:
ZafT said:
char buff[100];
while(inputFile.read(buff,10000000)){

The only thing read() knows about your buf is where it begins. read() relies
on you to tell it where the buf array ends. That's why lying in the second
argument is not a good idea. Pass the actual length of the buf array.
Phlip
http://industrialxp.org/community/bin/view/Main/TestFirstUserInterfaces

Hmmm...

I agree that normally things would work better if I passed a buffer the size
of the file, but I have been asked to do the opposite for the sake of
reading the file in chunks to test performance. Is there no way to do this?

Thanks for letting me know I'm not crazy at least :) The help is very
appreciated.

Shane
 
ZafT

Here's the description of my exercise -

Your program will read the input file several times: for each of the two
cases above, you will use a read size of 1, 2, 5, 10, 20, 50, 100, 200, 500,
1000, 2000, 5000, 10000, 20000, and 50000. In other words, your program will
first open the input file, read its entire contents 1-byte at a time using
the read system call, then close the file. Your program will then open the
file again and read through the file, reading 2-bytes at a time, and so on,
up to 50,000 bytes at a time.

I'm not sure how to accomplish this without changing the buffer.
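
The assignment text says "the read system call", which on a POSIX system is the read() function from <unistd.h> rather than ifstream::read. A rough sketch of one pass, assuming a POSIX platform and C++11. The helper name read_file_syscall is hypothetical. The buffer is made exactly as large as the requested read size, so the two can never disagree:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>
#include <fcntl.h>
#include <unistd.h>

// Hypothetical helper: reads the whole file with the POSIX read() call,
// asking for at most chunk_size bytes per call. Returns the total number
// of bytes read, or -1 if the file cannot be opened or a read fails.
long long read_file_syscall(const char* path, std::size_t chunk_size)
{
    assert(chunk_size > 0);

    std::vector<char> buf(chunk_size);  // buffer size == read size
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    long long total = 0;
    ssize_t n;
    // read() returns the number of bytes stored, 0 at end of file,
    // and -1 on error.
    while ((n = read(fd, buf.data(), buf.size())) > 0)
        total += n;

    close(fd);
    return n < 0 ? -1 : total;
}
```

Calling it once per size in the assignment's list (1, 2, 5, ... 50000) gives one open/read/close pass per size.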
 
Phlip

ZafT said:
char buff[100];
while(inputFile.read(buff,10000000)){

The only thing read() knows about your buf is where it begins. read() relies
on you to tell it where the buf array ends. That's why lying in the second
argument is not a good idea. Pass the actual length of the buf array.
Hmmm...

I agree that normally things would work better if I passed a buffer the size
of the file, but I have been asked to do the opposite for the sake of
reading the file in chunks to test performance. Is there no way to do
this?

You didn't read my reply, or the help files for read, or something.

read() doesn't give a rat's ass how big your buffer is. It will read the
amount specified in its second argument. If that number is too big, read()
will write characters outside your buf array. Your buf is 100 and your read
says 10000000. You are writing all over your stack.

Look these topics up in your tutorial before proceeding.
 
John Harrison

You obviously have some sort of blind spot here. Here are two cases:

// read the entire file 100 bytes at a time
char buf[100];
while (file.read(buf, 100))
{
}

// read the entire file 50000 bytes at a time
char buf[50000];
while (file.read(buf, 50000))
{
}

That's all there is to it.

john
 
ZafT

Phlip said:
read() doesn't give a rat's ass how big your buffer is. It will read the
amount specified in its second argument. If that number is too big, read()
will write characters outside your buf array. Your buf is 100 and your read
says 10000000. You are writing all over your stack.

Look these topics up in your tutorial before proceeding.

Thanks for the help. I guessed that I was going a bit out of bounds, but
since the teacher referred to the buffer as the point of manipulation in
class, I thought read() would go until the buffer was full, then make
another pass at it or something. At least now I know I'm wrong.
Unfortunately, I'm not sure how to proceed now.

Can I read 100-byte chunks, i.e.:

char buff[100];

while(inputFile.read(buff,100)){
// fill buffer
// empty buffer
// go back and read the next 100 bytes
}

??

btw, I did not intentionally ignore your advice - I just misinterpreted it.

Shane
 
ZafT

You obviously have some sort of blind spot here, here's two cases

// read the entire file 100 bytes at a time
char buf[100];
while (file.read(buf, 100))
{
}

// read the entire file 50000 bytes at a time
char buf[50000];
while (file.read(buf, 50000))
{
}

That's all there is to it.

john

It's sad that I actually tried that, but both read and fread read in the
file so fast that I assumed that I was doing something wrong. I thought
that I was reading only the first ___ bytes and ending without continuing on
through the rest of the file. At least I can call the program done now and
report my results!

Thanks for all of the help!
 
John Harrison

There are several things you should be aware of in interpreting your
results. First, there are several other buffers between the buffer in your
code and the file. ifstream will maintain its own buffer, it is very
likely that the operating system will have its own buffers as well, and
the hardware could also have its own buffers.

So don't assume that

char buf[100];
while (file.read(buf, 100))
{
}

results in bytes being physically read 100 at a time from the file.

Another thing to be aware of is that on some operating systems you could
get different results on different occasions. For instance it could be
that if you run your program twice it will run faster the second time.
This is because the operating system is 'remembering' that you read the
file before and so storing part or all of the file in memory assuming that
you will want to read it again. Another case of buffering.

john
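
One way to see these effects is to time each pass with a finer clock than time(NULL), which only counts whole seconds. This is a sketch assuming C++11 (for <chrono>). The names PassResult and timed_pass are invented for illustration. It measures only what the program observes, not what physically happens on the disk:

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <fstream>
#include <vector>

// Hypothetical result type: bytes delivered and wall-clock time for one pass.
struct PassResult
{
    std::size_t bytes;
    double seconds;
};

// Hypothetical helper: reads the whole file chunk_size bytes at a time
// through ifstream::read and times the loop with a monotonic clock.
PassResult timed_pass(const char* path, std::size_t chunk_size)
{
    assert(chunk_size > 0);

    std::vector<char> buf(chunk_size);
    std::ifstream in(path, std::ios::binary);
    std::size_t total = 0;

    auto start = std::chrono::steady_clock::now();
    while (in.read(buf.data(), static_cast<std::streamsize>(buf.size()))
           || in.gcount() > 0)
    {
        total += static_cast<std::size_t>(in.gcount());
    }
    std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;

    return PassResult{total, elapsed.count()};
}
```

Running timed_pass twice on the same file with the same chunk size and comparing the two seconds values is a quick way to check whether the second run benefits from the caching described above. The difference will vary by system.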
 
ZafT

It's like you were reading my mind. I was just wondering why it ran so much
faster the second time. heh. Well it seems as though this experiment for
comparing read and fread is not going to give any kind of raw result.
That's okay though. I've learned a few things tonight. I appreciate your
time very much. This group has been good to me as long as I try everything
myself first and post enough code so people don't have to read my mind. :)
 
Thomas Matthews

#include <cstdlib>
#include <fstream>
#include <iostream>
using namespace std; // for toy programs.

const unsigned int MAX_BUFFER_SIZE = 50000;
const unsigned int chunk_sizes[] =
{
    1, 2, 5, 10, 20, 50, 100, 200, 500,
    1000, 2000, 5000, 10000, 20000, 50000
};
const unsigned int NUM_CHUNK_SIZES =
    sizeof(chunk_sizes) / sizeof(chunk_sizes[0]);

// read() takes a char*, so the buffer is plain char.
char buffer[MAX_BUFFER_SIZE];

int main()
{
    ifstream data_file;

    for (unsigned int i = 0; i < NUM_CHUNK_SIZES; ++i)
    {
        unsigned int chunk_size = chunk_sizes[i];
        data_file.clear(); // Reset eof/fail left over from the previous pass.
        data_file.open(/*...*/); // Per requirements.
        while (data_file.read(buffer, chunk_size))
        {
            // Do stuff here.
            // Note: a final partial chunk makes read() fail;
            // data_file.gcount() tells how many bytes it stored.
        }
        data_file.close(); // Per requirements.
    }
    return EXIT_SUCCESS;
}


If your compiler cannot allocate an automatic array
of 50,000, then you will have to use the new operator
and allocate the memory before you start reading. I
highly recommend allocating the largest size, and
doing so only once.
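
A sketch of the "allocate the largest size once" suggestion, using std::vector in place of a raw new[] (a substitution made here, not something the post prescribes) and assuming C++11. The function name count_reads is hypothetical. It returns, for each chunk size, how many read() calls delivered data:

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <vector>

// Hypothetical helper: for each chunk size, reads the whole file and
// counts the read() calls that stored at least one byte.
std::vector<std::size_t> count_reads(const char* path,
                                     const std::vector<std::size_t>& chunk_sizes)
{
    // Find the largest chunk size so one allocation serves every pass.
    std::size_t largest = 0;
    for (std::size_t size : chunk_sizes)
    {
        assert(size > 0);
        if (size > largest)
            largest = size;
    }
    std::vector<char> buffer(largest);  // allocated once, on the heap

    std::vector<std::size_t> calls;
    for (std::size_t size : chunk_sizes)
    {
        // A fresh stream per pass opens and closes the file each time
        // and starts with clean state flags.
        std::ifstream in(path, std::ios::binary);
        std::size_t n = 0;
        while (in.read(buffer.data(), static_cast<std::streamsize>(size))
               || in.gcount() > 0)
        {
            ++n;
        }
        calls.push_back(n);
    }
    return calls;
}
```

Because every requested size is at most the buffer's size, no pass can write past the end of the buffer.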

--
Thomas Matthews

 
