fgets and problems reading into array

Eigenvector

I've been dinking around with this code for a while now and I can't seem to
figure out where the problem lies. The original problem was that, for some
reason or another, I couldn't allocate a few 80,000,000-element arrays.
Normally that shouldn't be a problem, but it was. So I tried to do some
calloc calls and everything went south from there.

Essentially I'm trying to read in a big file, stick each line into an array,
then from then on do stuff to it. The do stuff part should work without a
hitch, if I can just get that read in part to work.

Here's the code.
#include <stdio.h>
#include <string.h>

int main()
{
    int i=0;
    int j=0;

    int eof_marker=0;

    char *array[1000000];
    char *out_array[1000000];

    char input_line[80];

    FILE *fp;
    FILE *fp_out;
    /*************************************************/

    fp=fopen("/var/tmp/cpu.out", "r");
    fp_out=fopen("/var/tmp/cpu.output2", "w");

    for (i=0; i<1000000; i++)
        array[i]=calloc(80, sizeof(char));

    while(fgets(input_line, sizeof(input_line), fp) != NULL)
    {
        strcat(array[i], input_line); /* At this line the code core dumps */
        i++;
    }
    eof_marker=i;
}

Please forgive M$'s abominable text parsing, my neat indenting doesn't seem
to have survived intact.

I don't think I understand the logic behind malloc and calloc. I've read
the FAQ and a few past threads from this group already - which is where my
syntax comes from mostly. It just doesn't seem to be clicking.

Rob
 
Artie Gold

Eigenvector said:
I've been dinking around with this code for a while now and I can't seem to
figure out where the problem lies. The original problem was that I for some
reason or another couldn't allocate a few 80,000,000-element arrays. Normally
that shouldn't be a problem but it was. So I tried to do some calloc calls
and everything went south from there.

Essentially I'm trying to read in a big file, stick each line into an array,
then from then on do stuff to it. The do stuff part should work without a
hitch, if I can just get that read in part to work.

Here's the code.
#include <stdio.h>
#include <string.h>

int main()
{
int i=0;
int j=0;

int eof_marker=0;

char *array[1000000];
char *out_array[1000000];

char input_line[80];

FILE *fp;
FILE *fp_out;
/*************************************************/

fp=fopen("/var/tmp/cpu.out", "r");
fp_out=fopen("/var/tmp/cpu.output2", "w");

for (i=0; i<1000000; i++)
array[i]=calloc(80, sizeof(char));

^^^^^^^^^^^^
sizeof(char) is 1 by definition.

....but back to your problem. What is the value of `i' here?
while(fgets(input_line, sizeof(input_line), fp) != NULL)
{
strcat(array[i], input_line); /* At this line the code core dumps */
BOOM!

i++;
}
eof_marker=i;
}

Please forgive M$'s abominable text parsing, my neat indenting doesn't seem
to have survived intact.

I don't think I understand the logic behind malloc and calloc. I've read
the FAQ and a few past threads from this group already - which is where my
syntax comes from mostly. It just doesn't seem to be clicking.


HTH,
--ag
 
Kelsey Bjarnason

[snips]

for (i=0; i<1000000; i++)
array[i]=calloc(80, sizeof(char));


Three items on one line...

1) why not allocate the length of the line actually read in? That way
short lines don't waste space, long lines don't overflow.

2) No check for calloc failure.

3) sizeof(char)? Just use 1.
while(fgets(input_line, sizeof(input_line), fp) != NULL)
{
strcat(array[i], input_line); /* At this line the code core dumps */

Assuming calloc properly zero-inits the buffer, that should be okay...
except you don't even know if the allocation worked or not.

'Course, the fact that i currently points past the _last_ element of the
array might be an issue...
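
Putting those three points (plus the reset of i) together, a minimal sketch
of the read-in loop - untested, and assuming the same file name and
80-character line limit as the original post - might look like this:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_LINES 1000000

int main(void)
{
    static char *array[MAX_LINES]; /* static: keeps the big table off the stack */
    char input_line[80];
    FILE *fp;
    int i = 0;

    fp = fopen("/var/tmp/cpu.out", "r");
    if (fp == NULL) {
        perror("fopen");
        return EXIT_FAILURE;
    }

    while (i < MAX_LINES && fgets(input_line, sizeof input_line, fp) != NULL) {
        /* allocate exactly as much as this line needs */
        array[i] = malloc(strlen(input_line) + 1);
        if (array[i] == NULL) {
            fprintf(stderr, "out of memory after %d lines\n", i);
            return EXIT_FAILURE;
        }
        strcpy(array[i], input_line);
        i++;
    }
    /* i is now the number of lines read; do stuff with array[0..i-1] here */

    fclose(fp);
    return 0;
}

Allocating per line means short lines don't waste the extra bytes, and the
bound on i keeps the loop inside the table if the file turns out to have
more than a million lines.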
 
Eigenvector

Kelsey Bjarnason said:
[snips]

for (i=0; i<1000000; i++)
array[i]=calloc(80, sizeof(char));


Three items on one line...

1) why not allocate the length of the line actually read in? That way
short lines don't waste space, long lines don't overflow.

2) No check for calloc failure.

3) sizeof(char)? Just use 1.
while(fgets(input_line, sizeof(input_line), fp) != NULL)
{
strcat(array[i], input_line); /* At this line the code core dumps */

Assuming calloc properly zero-inits the buffer, that should be okay...
except you don't even know if the allocation worked or not.

'Course, the fact that i currently points past the _last_ element of the
array might be an issue...


Oh, crap!!!!!!! I didn't even think of that! I forgot to reset the value
for i.

Please tell me that the value for 'i' was in fact the reason why my code
bombed.

You'll have to forgive some of the generic formatting and other strange
anomalies in my code - some of it was straight plagiarism just to figure out
how the syntax for calloc worked. It'll get cleaned up when I actually get
the code working.
 
Burne C

Eigenvector said:
I've been dinking around with this code for a while now and I can't seem to
figure out where the problem lies. The original problem was that I for some
reason or another couldn't allocate a few 80,000,000-element arrays. Normally
that shouldn't be a problem but it was. So I tried to do some calloc calls
and everything went south from there.

Essentially I'm trying to read in a big file, stick each line into an array,
then from then on do stuff to it. The do stuff part should work without a
hitch, if I can just get that read in part to work.

Here's the code.
#include <stdio.h>
#include <string.h>

int main()
{
int i=0;
int j=0;

int eof_marker=0;

char *array[1000000];
char *out_array[1000000];

char input_line[80];

FILE *fp;
FILE *fp_out;
/*************************************************/

fp=fopen("/var/tmp/cpu.out", "r");
fp_out=fopen("/var/tmp/cpu.output2", "w");

for (i=0; i<1000000; i++)
array[i]=calloc(80, sizeof(char));

while(fgets(input_line, sizeof(input_line), fp) != NULL)
{
strcat(array[i], input_line); /* At this line the code core dumps */
i++;
}


I think `i' is the main problem: you didn't reset i to 0 before the while loop. Also, why do you use
strcat instead of strcpy? (I know it is not a problem here, because calloc fills the array block
with all zeros.)

BTW, you should add code to trap the failure of fopen and calloc.
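
For what it's worth, the only reason strcat happens to work on a freshly
calloc'd buffer is that every byte comes back zero, so the slot already
holds an empty string. A small stand-alone sketch (the input_line value
here is made up purely for illustration):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    const char *input_line = "some line of text\n"; /* made-up sample data */
    char *slot = calloc(80, 1);    /* all 80 bytes are zero, so slot is "" */

    if (slot == NULL) {            /* trap the allocation failure */
        fprintf(stderr, "calloc failed\n");
        return EXIT_FAILURE;
    }

    strcat(slot, input_line);      /* appends to the empty string - works */
    /* strcpy(slot, input_line);      would say the same thing more directly */

    printf("%s", slot);
    free(slot);
    return 0;
}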
 
Randy Howard

char *array[1000000];
char *out_array[1000000];

You've already received several good responses to the program operation,
but you should be aware that there is no guarantee that you can declare
an array this large and have it work (portably). A fair number of
compilers will bomb out on arrays that large. I believe the limit for
the size of an array declared this way is 64KB in C99, and half that for
C89, but I can't recall the "chapter and verse" on that as I can't
remember the last time I even attempted to declare an array this large in
this way.

clc FAQ entries 16.3 and 19.23 cover this, perhaps a bit obliquely.
(http://www.eskimo.com/~scs/C-faq/faq.html)
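
If a table that big is really needed, one portable way around the
declared-object limit is to get the pointer table itself from malloc rather
than declaring it as an array; a rough sketch (error handling kept minimal):

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    size_t nslots = 1000000;

    /* heap-allocate the table instead of declaring "char *array[1000000]" */
    char **array = malloc(nslots * sizeof *array);

    if (array == NULL) {
        fprintf(stderr, "could not allocate the pointer table\n");
        return EXIT_FAILURE;
    }

    /* ... fill array[0] .. array[n-1] with malloc'd copies of each line ... */

    free(array);
    return 0;
}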
 
Eigenvector

Randy Howard said:
char *array[1000000];
char *out_array[1000000];

You've already received several good responses to the program operation,
but you should be aware that there is no guarantee that you can declare
an array this large and have it work (portably). A fair number of
compilers will bomb out on arrays that large. I believe the limit for
the size of an array declared this way is 64KB in C99, and half that for
C89, but I can't recall the "chapter and verse" on that as I can't
remember the last time I even attempted to declare an array this large in
this way.
Well that's kind of lame. I understand that not every computer has loads of
memory, but putting an actual limit or even recognizing a limit seems
artificial and unnecessary. I guess programmers need to make their code as
efficient as possible at all times, but I intensely dislike those types of
standards.

64KB is pretty damn small, maybe not for a PC, but certainly my 16 GB
internal-memory UNIX server shouldn't have this limit imposed on it. But
like you said, it's not guaranteed, and there are other more efficient ways
of doing the job.
 
Richard Heathfield

Eigenvector wrote:

64Kb is pretty damn small, maybe not for a PC, but certainly my 16 GB
internal memory UNIX server shouldn't have this limit imposed on it.

The limit is not imposed on your implementation. It is imposed on programs
which the programmer requires to be portable to any conforming C compiler.

<snip>
 
Kevin Easton

Eigenvector said:
Randy Howard said:
char *array[1000000];
char *out_array[1000000];

You've already received several good responses to the program operation,
but you should be aware that there is no guarantee that you can declare
an array this large and have it work (portably). A fair number of
compilers will bomb out on arrays that large. I believe the limit for
the size of an array declared this way is 64KB in C99, and half that for
C89, but I can't recall the "chapter and verse" on that as I can't
remember the last time I even attempted to declare an array this large in
this way.
Well that's kind of lame. I understand that not every computer has loads of
memory, but putting an actual limit or even recognizing a limit seems
artificial and unnecessary. I guess programmers need to make their code as
efficient as possible at all times, but I intensely dislike those types of
standards.

Don't think of it as telling you your array won't work if it's bigger
than 64kB - instead think of it as promising you that your array
*will* work if it's smaller than 64kB.

- Kevin.
 
Kelsey Bjarnason

[snips]

Well that's kind of lame. I understand that not every computer has loads of
memory, but putting an actual limit or even recognizing a limit seems
artificial and unnecessary. I guess programmers need to make their code as
efficient as possible at all times, but I intensely dislike those types of
standards.

64Kb is pretty damn small, maybe not for a PC, but certainly my 16 GB
internal memory UNIX server shouldn't have this limit imposed on it. But
like you said, it's not guaranteed, and there are other more efficient ways
of doing the job.

It's not a limit; you can (perhaps) create objects terabytes in size.
What it is is a statement that you cannot create objects larger than X and
expect them to actually work across all conforming implementations.
 
Eigenvector

Kelsey Bjarnason said:
[snips]

Well that's kind of lame. I understand that not every computer has loads of
memory, but putting an actual limit or even recognizing a limit seems
artificial and unnecessary. I guess programmers need to make their code as
efficient as possible at all times, but I intensely dislike those types of
standards.

64Kb is pretty damn small, maybe not for a PC, but certainly my 16 GB
internal memory UNIX server shouldn't have this limit imposed on it. But
like you said, it's not guaranteed, and there are other more efficient ways
of doing the job.

It's not a limit; you can (perhaps) create objects terabytes in size.
What it is is a statement that you cannot create objects larger than X and
expect them to actually work across all conforming implementations.
Okay, then perhaps I read it wrong. What the standard is saying, then, is
that 64 KB is the RAM size of a minimal system running C? Something like that.

Still, it seems artificial for C to be assuming system capacity as part of
C standard conformity. What does the compiler or the language care about
the upper bounds of memory allocation? What is so special about 64 KB in
the first place? At 64 KB you might as well have 128 or 256 or 16 for that
matter. Granted, not concerning yourself about resource allocation is what
got us into that M$ Windows memory race that bloated computer systems
everywhere - but still, what is the 64 KB concern - is that 16 bits?
Does that imply that C is still based in the 16 bit world?
 
Eigenvector

Kelsey Bjarnason said:
[snips]

Okay, then perhaps I read it wrong. What the standard is saying, then, is
that 64 KB is the RAM size of a minimal system running C? Something like
that.

Rather, that to be conforming, an implementation _only_ needs to be able to
create one object of at least a specific size. Nothing prevents it from being
able to create a larger object, or more than one such object.

It's about like saying "to be a car, a vehicle must have at least three
wheels" - most have four, nothing wrong with that.
Still, it seems artificial for C to be assuming system capacity as part of
C standard conformity.

Not at all. Without setting some minimums, the coder has no way to
predict whether his code can run anywhere other than the implementation he
developed it with.

Consider writing, say, a compiler - something presumably doable in ISO C.
Producing an efficient compiler, or one capable of decent optimizations,
might require creating buffers of, say, 64K. Now, in developing this
compiler, can I be assured that I'll actually _have_ 64K to play with?

Yes, I can; C defines this. If it didn't, I might not be able to do this;
indeed, I might not be able to reliably create buffers 1K in size.
What does the compiler or the language care about
the upper bounds of memory allocation?

It doesn't; it cares about the *lower* bound. "You must be able to
provide *at least* this much memory, or you ain't a C compiler."
Granted, not concerning yourself about resource allocation is what
got us into that M$ Windows memory race that bloated computer systems
everywhere - but still what is the 64 Kb concern - what is that 16 bits?
Does that imply that C is still based in the 16 bit world?

No, but it suggests that C admits that 16 bit systems are still important
in many areas. Or even just memory-limited systems.

I think I understand now. I was looking at it from the wrong perspective.

I'm still a little hacked off that I couldn't get an 80-million-element
array to compile, but perhaps my OS manufacturer can patch that up.
However, pertaining to my original problem, it works just fine now. A
calloc call, resetting the value for i, and away we go.

Thanks all for the replies.
 
Glen Herrmannsfeldt

Eigenvector said:
Kelsey Bjarnason said:
[snips]

It's not a limit; you can (perhaps) create objects terabytes in size.
What it is is a statement that you cannot create objects larger
than X and expect them to actually work across
all conforming implementations.
(snip)
No, but it suggests that C admits that 16 bit systems are still important
in many areas. Or even just memory-limited systems.

I think I understand now. I was looking at it from the wrong perspective.

I'm still a little hacked off that I couldn't get an 80 million element
array to compile, but perhaps my OS manufacturer can patch that up.
However, pertaining to my original problem, it works just fine now. A
calloc call, resetting the value for i, and away we go.

I have found that even large systems can limit the size of static arrays. I
believe it was on a DEC Alpha, running its preferred OS, with 64 bit
pointers on a system with over 4GB of memory. I tried to create a static
array of about 100,000 and it failed. malloc() would allocate many
gigabytes, though.

-- glen
 
