adapting getline

E

Eric Sosman

Franken said:
I'm trying to read a line of text from a file one at a time and pass the
linenumber and and a pointer to the text thereafter. I've got a pretty
good start on it. This compiles:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define my_file "text42.txt"
#define NUMBER 100
#define MAXFMTLEN 200

int main(void)
{
FILE *fp;
char *text;
unsigned long lineNumber;
int len;


if ((fp = fopen(my_file, "r")) == NULL ) {
fprintf(stderr, "can't open file\n");
exit(1);
}

while((len = getline(text, NUMBER)) > 0) {
printf("%c, %s", len, text);
}


fclose(fp);
return 0;
}


/* getline: get line into s, return length */
int getline(char *s, int lim)
{
char *p;
int c;

p = s;
while (--lim > 0 && (c = getchar()) != EOF && c != '\n')
*p++ = c;
if (c == '\n')
*p++ = c;
*p = '\0';
return (int)(p - s);
}


// gcc ben5.c -Wall -o out


E:\gfortran\dan>gcc ben5.c -Wall -o out
ben5.c: In function `main':
ben5.c:24: warning: implicit declaration of function `getline'
ben5.c:15: warning: unused variable `lineNumber'

E:\gfortran\dan>

The warnings show where I'm stuck. How do I write getline to read from
my_file instead of stdin?

Change getline() so it takes a third parameter, the
FILE* stream to read from, and use getc(the_stream_param)
instead of getchar(). In the call to the new getline(),
pass the value of `fp' as the argument matching the new
parameter.

To get rid of the "implicit declaration" warning, either
insert a declaration of getline() someplace before it's first
called, or move the entire definition of getline() to a spot
before main().

To get rid of the "unused variable" warning, either get
rid of `lineNumber' or use it for something.

To get a useful warning about a potential problem in
getline(), crank up gcc's diagnostic level and call for an
optimization level that prompts the compiler to study the
usage patterns of variables. `-W -Wall -ansi -pedantic -O2'
is a reasonable starting point.

To avert criticism on grounds of style and portability,
change exit(1) to exit(EXIT_FAILURE).

To cure the error that will occur after you get the code
compiled and try to run it, see Question 7.1 in the comp.lang.c
Frequently Asked Questions (FAQ) list at <http://www.c-faq.com/>.

To forestall the accusation that you're re-inventing the
wheel, explain why you're writing your own getline() instead
of using fgets() and perhaps strlen(), both of which are
already written for you.
 
F

Flash Gordon

Franken said:
I'm trying to read a line of text from a file one at a time and pass the
linenumber and and a pointer to the text thereafter. I've got a pretty
good start on it. This compiles:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define my_file "text42.txt"
#define NUMBER 100
#define MAXFMTLEN 200

int main(void)
{
FILE *fp;
char *text;

Where does text point?
unsigned long lineNumber;
int len;


if ((fp = fopen(my_file, "r")) == NULL ) {
fprintf(stderr, "can't open file\n");
exit(1);
}

while((len = getline(text, NUMBER)) > 0) {

text cannot be changed by this call, only the data it points to can be
changed. Oops, it doesn't point anywhere yet!
printf("%c, %s", len, text);
}


fclose(fp);
return 0;
}


/* getline: get line into s, return length */
int getline(char *s, int lim)
{
char *p;
int c;

p = s;
while (--lim > 0 && (c = getchar()) != EOF && c != '\n')

fgetc and getc take an additional parameter you might be interested in.
You should also read your C text book to understand the differences.
*p++ = c;
if (c == '\n')
*p++ = c;
*p = '\0';
return (int)(p - s);
}


// gcc ben5.c -Wall -o out


E:\gfortran\dan>gcc ben5.c -Wall -o out
ben5.c: In function `main':
ben5.c:24: warning: implicit declaration of function `getline'
ben5.c:15: warning: unused variable `lineNumber'

E:\gfortran\dan>

The warnings show where I'm stuck. How do I write getline to read from
my_file instead of stdin?

Both of these warnings are, I would say, fairly self explanatory and not
related to your question. At line 24 you call a function for which there
isn't yet a declaration. Add a declaration before main or move the
entire definition to before main. As to the other warning, read your
code and you will see that you don't use lineNumber.

Please also read the comments I put in your code. One answers your
question, the other comments point out a major problem you have not yet
spotted.

You may also find the C FAQ at http://c-faq.com/ specifically the
sections about arrays and pointers which might help you with your
misunderstandings.
 
I

Ian Collins

Franken said:
I'm trying to read a line of text from a file one at a time and pass the
linenumber and and a pointer to the text thereafter. I've got a pretty
good start on it. This compiles:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define my_file "text42.txt"
#define NUMBER 100
#define MAXFMTLEN 200

Why are these global?
int main(void)
{
FILE *fp;
char *text;

Here you declare a pointer.
unsigned long lineNumber;
int len;


if ((fp = fopen(my_file, "r")) == NULL ) {
fprintf(stderr, "can't open file\n");
exit(1);
}

while((len = getline(text, NUMBER)) > 0) {

Here you pass it to a function. Nowhere do you allocate or release any
memory.
printf("%c, %s", len, text);
}


fclose(fp);
return 0;
}


/* getline: get line into s, return length */
int getline(char *s, int lim)
{
char *p;
int c;

p = s;
while (--lim > 0 && (c = getchar()) != EOF && c != '\n')
*p++ = c;
if (c == '\n')
*p++ = c;

Why double up on '\n'?
*p = '\0';
return (int)(p - s);

Why the cast?
 
F

Franken Sense

I'm trying to read a line of text from a file one at a time and pass the
linenumber and and a pointer to the text thereafter. I've got a pretty
good start on it. This compiles:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define my_file "text42.txt"
#define NUMBER 100
#define MAXFMTLEN 200

int main(void)
{
FILE *fp;
char *text;
unsigned long lineNumber;
int len;


if ((fp = fopen(my_file, "r")) == NULL ) {
fprintf(stderr, "can't open file\n");
exit(1);
}

while((len = getline(text, NUMBER)) > 0) {
printf("%c, %s", len, text);
}


fclose(fp);
return 0;
}


/* getline: get line into s, return length */
int getline(char *s, int lim)
{
char *p;
int c;

p = s;
while (--lim > 0 && (c = getchar()) != EOF && c != '\n')
*p++ = c;
if (c == '\n')
*p++ = c;
*p = '\0';
return (int)(p - s);
}


// gcc ben5.c -Wall -o out


E:\gfortran\dan>gcc ben5.c -Wall -o out
ben5.c: In function `main':
ben5.c:24: warning: implicit declaration of function `getline'
ben5.c:15: warning: unused variable `lineNumber'

E:\gfortran\dan>

The warnings show where I'm stuck. How do I write getline to read from
my_file instead of stdin?

Thanks for your comment, and Cheers,
--
Frank

Mistakes are a part of being human. Appreciate your mistakes for what they
are: precious life lessons that can only be learned the hard way. Unless
it's a fatal mistake, which, at least, others can learn from.
~~ Al Franken,
 
K

kid joe

#define my_file "text42.txt"
#define NUMBER 100
#define MAXFMTLEN 200

Hi Franken,

I would avoid this sort of thing - don't put arbitrary limits in your
program but let buffers resize dynamically.

Here's my personal version of getline which does this, it's also much
shorter than yours. Check it out.

Cheers,
Joe


// returns a NULL-terminated line, caller should free buffer,
// returns NULL for end of file or error
void *getline(FILE * fd)
{
char *buf = NULL;
int c = 1, bufsz = 0;
while (c && (c = fgetc(fd)) != EOF) {
LOOP:
buf = realloc(buf, ++bufsz);
*(buf + bufsz - 1) = c;
if (c == '\n') {
c = 0;
goto LOOP;
}
}
return buf;
}



.---. .-----------
/ \ __ / ------
/ / \( )/ -----
////// ' \/ ` --- /=================\
//// / // : : --- | |
// / / /` '-- | Good Evening... |
// //..\\ | |
=============UU====UU============\================/
'//||\\`
''``
 
B

BartC

Franken Sense said:
In Dread Ink, the Grave Hand of kid joe Did Inscribe:
realloc is, in my book, dangerous for programmers and not to be used if
you
can avoid it, which, for my current purposes, I can. I'm inputting text.
It's not text if it's over a hundred characters long per line, it's
calligraphy or a long vector of letters.

If it's over a hundred it should give me a run-time.

Text files with 100 characters per line are common.

1000 characters per line are less common (or the file is not line-oriented).

Why not just set the limit at 1000 or some other number not likely to be
exceeded?
 
K

kid joe

In Dread Ink, the Grave Hand of kid joe Did Inscribe:


Thanks, joe, I've got two problems here. One is that I don't know how to
call this, when it has one more degree of indirection. I should, but I
don't.

Hi Franken,

It's very simple to use, for example:

main()
{
char *line;
int n = 0;
double len = 0;
while (line = getline(stdin)) {
n++;
len += strlen(line);
printf(line);
free(line);
}
printf("There were %d lines, average length %f\n", n, len / n);
}

realloc is, in my book, dangerous for programmers and not to be used if you
can avoid it, which, for my current purposes, I can. I'm inputting text.
It's not text if it's over a hundred characters long per line, it's
calligraphy or a long vector of letters.

No, it's much safer using realloc than fixedsize buffers. Check out
"buffer overflows" on Wikipedia!

Cheers,
Joe


.---. .-----------
/ \ __ / ------
/ / \( )/ -----
////// ' \/ ` --- /=================\
//// / // : : --- | |
// / / /` '-- | Good Evening... |
// //..\\ | |
=============UU====UU============\================/
'//||\\`
''``
 
F

Franken Sense

In Dread Ink, the Grave Hand of Eric Sosman Did Inscribe:
Franken Sense wrote:
Change getline() so it takes a third parameter, the
FILE* stream to read from, and use getc(the_stream_param)
instead of getchar(). In the call to the new getline(),
pass the value of `fp' as the argument matching the new
parameter.

That's got it:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define my_file "text42.txt"
#define NUMBER 100
#define MAXFMTLEN 200

int getline(FILE *, char *, int );

int main(void)
{
FILE *fp;
char text[NUMBER];
unsigned long lineNumber = 0;
int len;

if ((fp = fopen(my_file, "r")) == NULL ) {
fprintf(stderr, "can't open file\n");
exit(EXIT_FAILURE);
}

while((len = getline(fp, text, NUMBER)) > 0) {
++ lineNumber,
printf("%lu, %s", lineNumber, text);
}


fclose(fp);
return 0;
}


/* getline: get line into s, return length */
int getline(FILE *fp, char *s, int lim)
{
char *p;
int c;


p = s;
while (--lim > 0 && (c = getc(fp)) != EOF && c != '\n')
*p++ = c;
if (c == '\n')
*p++ = c;
*p = '\0';
return (p - s);
}


// gcc ben6.c -Wall -o out
To get a useful warning about a potential problem in
getline(), crank up gcc's diagnostic level and call for an
optimization level that prompts the compiler to study the
usage patterns of variables. `-W -Wall -ansi -pedantic -O2'
is a reasonable starting point.

What would I be asking gcc.exe to do with these warnings?
To cure the error that will occur after you get the code
compiled and try to run it, see Question 7.1 in the comp.lang.c
Frequently Asked Questions (FAQ) list at <http://www.c-faq.com/>.

I think I've got that covered. It shows the proper use of fgets:

#include <stdio.h>
#include <string.h>

char answer[100], *p;
printf("Type something:\n");
fgets(answer, sizeof answer, stdin);
if((p = strchr(answer, '\n')) != NULL)
*p = '\0';
printf("You typed \"%s\"\n", answer);
To forestall the accusation that you're re-inventing the
wheel, explain why you're writing your own getline() instead
of using fgets() and perhaps strlen(), both of which are
already written for you.

I'm following the development in K&R and am relying heavily on the
solutions to these that I can find. Any suggestions for sleeker code are
welcome.

It seemed complicated at dinkumware:

fgets
char *fgets(char *restrict s, int n, FILE *restrict stream);

The function reads characters from the input stream stream and stores them
in successive elements of the array beginning at s and continuing until it
stores n-1 characters, stores an NL character, or sets the end-of-file or
error indicators. If fgets stores any characters, it concludes by storing a
null character in the next element of the array. It returns s if it stores
any characters and it has not set the error indicator for the stream;
otherwise, it returns a null pointer. If it sets the error indicator, the
array contents are indeterminate.
 
F

Franken Sense

In Dread Ink, the Grave Hand of Ian Collins Did Inscribe:
Why are these global?

? It's clear.
Here you declare a pointer.


Here you pass it to a function. Nowhere do you allocate or release any
memory.

Thanks, I solve that by declaring
char text[NUMBER];
Why double up on '\n'?

That doesn't make any sense to me either.
Why the cast?

I snagged the getline from the clc wiki solns, which are usually pretty
good. I took the cast off and got no warnings.
 
F

Franken Sense

In Dread Ink, the Grave Hand of kid joe Did Inscribe:
Hi Franken,

I would avoid this sort of thing - don't put arbitrary limits in your
program but let buffers resize dynamically.

Here's my personal version of getline which does this, it's also much
shorter than yours. Check it out.

Cheers,
Joe


// returns a NULL-terminated line, caller should free buffer,
// returns NULL for end of file or error
void *getline(FILE * fd)
{
char *buf = NULL;
int c = 1, bufsz = 0;
while (c && (c = fgetc(fd)) != EOF) {
LOOP:
buf = realloc(buf, ++bufsz);
*(buf + bufsz - 1) = c;
if (c == '\n') {
c = 0;
goto LOOP;
}
}
return buf;
}

Thanks, joe, I've got two problems here. One is that I don't know how to
call this, when it has one more degree of indirection. I should, but I
don't.

realloc is, in my book, dangerous for programmers and not to be used if you
can avoid it, which, for my current purposes, I can. I'm inputting text.
It's not text if it's over a hundred characters long per line, it's
calligraphy or a long vector of letters.

If it's over a hundred it should give me a run-time.
--
Frank

Mistakes are a part of being human. Appreciate your mistakes for what they
are: precious life lessons that can only be learned the hard way. Unless
it's a fatal mistake, which, at least, others can learn from.
~~ Al Franken,
 
C

CBFalconer

Franken said:
In Dread Ink, the Grave Hand of kid joe Did Inscribe:
.... snip ...


My counterargument is that if realloc can blow up in Heathfield's
face, then it can do so to anyone. Persons who have had similar
difficulty comprise a who's who of clc. Maybe this use is an
example of something safe with realloc.

Anything in C can be misused. The malloc group, properly used, is
very safe. Barring implementation errors, which will be rapidly
caught.
 
F

Franken Sense

In Dread Ink, the Grave Hand of kid joe Did Inscribe:
It's very simple to use, for example:

These suggestions were enough:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define my_file "text42.txt"

void *getline(FILE *);

int main(void)
{
FILE *fp;
char *line;
int n = 0;
double len = 0;

if ((fp = fopen(my_file, "r")) == NULL ) {
fprintf(stderr, "can't open file\n");
exit(EXIT_FAILURE);
}

while ((line = getline(fp))) {
n++;
len += strlen(line);
printf(line);
free(line);
}
printf("There were %d lines, average length %f\n", n, len / n);


fclose(fp);
return (0);
}

// returns a NULL-terminated line, caller should free buffer,
// returns NULL for end of file or error
void *getline(FILE * fd)
{
char *buf = NULL;
int c = 1, bufsz = 0;
while (c && (c = fgetc(fd)) != EOF) {
LOOP:
buf = realloc(buf, ++bufsz);
*(buf + bufsz - 1) = c;
if (c == '\n') {
c = 0;
goto LOOP;
}
}
return buf;
}

// gcc ben7.c -Wall -o out

// end source begin abridged output

E:\gfortran\dan>out
ALLUSERSPROFILE=C:\Documents and Settings\All Users
APPDATA=C:\Documents and Settings\dan\Application Data
CLIENTNAME=Console
....
USERNAME=dan
USERPROFILE=C:\Documents and Settings\dan
windir=C:\WINDOWS
There were 32 lines, average length 38.125000

E:\gfortran\dan>

I really like working something up a few different ways, for when, for
example, I need variable buffers. One question here:
printf("There were %d lines, average length %f\n", n, len / n);
I wouldn't think an integer divided by an integer would behave like it
does(?)
No, it's much safer using realloc than fixedsize buffers. Check out
"buffer overflows" on Wikipedia!

My counterargument is that if realloc can blow up in Heathfield's face,
then it can do so to anyone. Persons who have had similar difficulty
comprise a who's who of clc. Maybe this use is an example of something
safe with realloc.
--
Frank

I think some people hold [G.W.Bush] in high esteem because they watch Fox.
And they get their news from Rush Limbaugh. And they are fooled.
~~ Al Franken, in response to the 2004 SOTU address
 
B

Ben Bacarisse

Franken Sense said:
In Dread Ink, the Grave Hand of kid joe Did Inscribe:

Fix the bugs before you use it!

// returns a NULL-terminated line, caller should free buffer,
// returns NULL for end of file or error
void *getline(FILE * fd)
{
char *buf = NULL;
int c = 1, bufsz = 0;
while (c && (c = fgetc(fd)) != EOF) {
LOOP:
buf = realloc(buf, ++bufsz);

This only works if realloc never fails.
*(buf + bufsz - 1) = c;
if (c == '\n') {
c = 0;
goto LOOP;
}
}
return buf;
}

An error or EOF not preceded by a newline returns a buffer that is
not null-terminated.

Setting aside the design (no way to limit memory use, no length
returned), the method (one realloc per byte is rather expensive), the
return type (should be char *) and the memory leak (realloc assumed
not to fail) all the trickery with c and the goto just makes the code
buggy and more complex than it needs to be:

void *getline(FILE *fd)
{
char *buf = NULL;
int c, bufsz = 1;
while ((c = fgetc(fd)) != EOF) {
buf = realloc(buf, ++bufsz);
buf[bufsz - 2] = c;
if (c == '\n')
break;
}
if (buf)
buf[bufsz - 1] = 0;
return buf;
}

does null-terminate the strings even when no '\n' is seen. But don't
use even this revision without correcting the other issues.
 
C

CBFalconer

Franken said:
.... snip ...

size_t len = 0;

...
while ((line = getline(fp))) {
n++;
len += strlen(line);
printf(line);
free(line);
}
printf("There were %d lines, average length %f\n", n, len / n);
E:\gfortran\dan>gcc ben8.c -Wall -o out
ben8.c: In function `main':
ben8.c:27: warning: double format, different type arg (arg 3)

How should I change the printf specifier?

You seem to have a faulty compiler. size_t is an unsigned integral
value, picked so as to be able to describe all sizes. Not a
double, which is signed and floating point. I assume n is
integral.

With C99 I believe the printf format for size_t is x (assuming a
C99 library). For C90, cast the value to long and print a long.
 
B

Ben Bacarisse

Franken Sense said:
In Dread Ink, the Grave Hand of CBFalconer Did Inscribe:


The if control is meaningless.

What do you mean by this? The if makes sense to me. The code is
buggy but the if is not the issue (try calling getline(&c, 1);).
 
F

Franken Sense

In Dread Ink, the Grave Hand of CBFalconer Did Inscribe:
He didn't. Look again.

The if control is meaningless.
rather, why is the function returning int. It should be size_t.
So should lim.

size_t len = 0;

....
while ((line = getline(fp))) {
n++;
len += strlen(line);
printf(line);
free(line);
}
printf("There were %d lines, average length %f\n", n, len / n);
E:\gfortran\dan>gcc ben8.c -Wall -o out
ben8.c: In function `main':
ben8.c:27: warning: double format, different type arg (arg 3)

E:\gfortran\dan>

How should I change the printf specifier?
 
F

Franken Sense

In Dread Ink, the Grave Hand of Ben Bacarisse Did Inscribe:
Setting aside the design (no way to limit memory use, no length
returned), the method (one realloc per byte is rather expensive), the
return type (should be char *) and the memory leak (realloc assumed
not to fail) all the trickery with c and the goto just makes the code
buggy and more complex than it needs to be:

I just dug this up from the clc wiki K&R solns by Heathfield:

*
* Ideally, we'd use a clever GetLine function which expanded its
* buffer dynamically to cope with large lines. Since we can't use
* realloc, and because other solutions would require quite hefty
* engineering, we'll adopt a simple solution - a big buffer.
*
* Note: making the buffer static will help matters on some
* primitive systems which don't reserve much storage for
* automatic variables, and shouldn't break anything anywhere.
*
*/

--
Frank

Drug war, well, as Rush Limbaugh said, anyone who uses drugs illegally
should be prosecuted and put away. I don't agree with him; I think they
should be treated, but that's what Rush believes and so, you know, we're
praying for Rush because he's in recovery and you take responsibilities for
your actions so I'm sure any day now Rush will demand to be put away for
the maximum sentence and ask for the most dangerous prison and we'll be
praying for maybe an African American cellmate who saw the Donovan McNabb
comments on ESPN. So we're prayin'.
~~ Al Franken, Book TV, on Rush Limbaugh's illegal drug arrest and racist
remarks
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,769
Messages
2,569,579
Members
45,053
Latest member
BrodieSola

Latest Threads

Top