reading line by line from file

plmanikandan

Hi,

I need to read a file line by line. Each line contains a different number
of characters. I opened the file using the fopen function. Is there any
function to read the file line by line?

Regards,
Mani
 
Vladimir S. Oka

Hi,

I need to read a file line by line. Each line contains a different number
of characters. I opened the file using the fopen function. Is there any
function to read the file line by line?

Look up `fgets()`.
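
A minimal sketch of what the `fgets()` approach can look like, assuming the stream is already open. The function name `count_lines` and the 256-byte chunk size are illustrative choices, not anything from the thread. Because lines have different lengths, a long line may arrive in several chunks, so the code checks for the trailing '\n' before treating a chunk as the end of a line.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Count the lines in an open text stream, reading with fgets().
 * A line longer than the buffer arrives in several chunks, so a line
 * is only counted when a chunk ends in '\n' or the input runs out. */
static long count_lines(FILE *fp)
{
    char buf[256];      /* arbitrary chunk size, not a line-length limit */
    long lines = 0;
    int in_line = 0;    /* nonzero while inside a line not yet terminated */

    assert(fp != NULL);
    while (fgets(buf, sizeof buf, fp) != NULL) {
        size_t len = strlen(buf);
        if (len > 0 && buf[len - 1] == '\n') {
            lines++;        /* this chunk completes a line */
            in_line = 0;
        } else {
            in_line = 1;    /* partial chunk, or final line without '\n' */
        }
    }
    if (in_line)
        lines++;            /* last line had no trailing newline */
    return lines;
}
```

A caller would pass the `FILE *` returned by `fopen()` and check `ferror()` afterwards if it needs to tell a read error from a normal end of file.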
 
CBFalconer

Vladimir S. Oka said:
Look up `fgets()`.

Among others. Possibly the most convenient is ggets, which is
non-standard, but written in standard C, and available at:

<http://cbfalconer.home.att.net/download/ggets.zip>

--
"If you want to post a followup via groups.google.com, don't use
the broken "Reply" link at the bottom of the article. Click on
"show options" at the top of the article, then click on the
"Reply" at the bottom of the article headers." - Keith Thompson
More details at: <http://cfaj.freeshell.org/google/>
Also see <http://www.safalra.com/special/googlegroupsreply/>
 
CBFalconer

Vladimir S. Oka said:
CBFalconer opined:

Plugging your wares again? ;-)

Good! Anything to get the world rid of getses!

Yup. I put it out there in the public domain almost four years
ago, and have had no reports of bugs with it. Some people dislike
the linear buffer increase, but I consider that optimum for the
normal use in interactive input.

 
Jordan Abel

Yup. I put it out there in the public domain almost four years
ago, and have had no reports of bugs with it. Some people dislike
the linear buffer increase, but I consider that optimum for the
normal use in interactive input.

Is there an easy way to get it to increase the buffer size by doubling
instead?
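
For illustration only, here is a sketch of a line reader whose buffer doubles when it fills up. This is not the ggets source and does not modify it; `read_line_doubling` is a hypothetical name, the initial size of 16 is arbitrary, and stripping the newline simply mirrors the behavior described for ggets in this thread.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Read one complete line into a malloc'd buffer that doubles when full.
 * Returns the line without its '\n' (caller frees), or NULL on EOF with
 * nothing read, on a read error, or if memory runs out. */
static char *read_line_doubling(FILE *fp)
{
    size_t cap = 16;    /* small initial capacity, arbitrary */
    size_t len = 0;     /* characters stored so far */
    char *buf, *tmp;

    assert(fp != NULL);
    buf = malloc(cap);
    if (buf == NULL)
        return NULL;

    /* Each fgets() call appends at buf + len; cap never exceeds INT_MAX. */
    while (fgets(buf + len, (int)(cap - len), fp) != NULL) {
        len += strlen(buf + len);
        if (len > 0 && buf[len - 1] == '\n') {
            buf[len - 1] = '\0';    /* complete line: strip the newline */
            return buf;
        }
        if (len + 1 == cap) {       /* buffer full: double it */
            if (cap > (size_t)INT_MAX / 2) {
                free(buf);          /* refuse absurdly long lines */
                return NULL;
            }
            tmp = realloc(buf, cap * 2);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }
    }

    /* fgets() failed: keep a final line that merely lacked its newline. */
    if (len > 0 && !ferror(fp))
        return buf;
    free(buf);
    return NULL;
}
```

Doubling keeps the number of `realloc()` calls logarithmic in the line length, at the cost of up to roughly half the buffer going unused. Whether that matters for interactive input is exactly the judgment call being argued here.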
 
Keith Thompson

CBFalconer said:
Yup. I put it out there in the public domain almost four years
ago, and have had no reports of bugs with it. Some people dislike
the linear buffer increase, but I consider that optimum for the
normal use in interactive input.

I do have one small quibble. Since it quietly strips the newline
character from the input line, there's no good way to tell whether the
last line of an input file had a trailing newline in the first place
(for systems that don't require one).

It's not a huge deal, and it's probably ok as a default behavior, but
it might be nice to have an alternative interface that handles this --
perhaps a second function that leaves the '\n' in place.

<OT>
I like Perl's behavior in this area, but it may not translate well to
C. In Perl, reading a line gives you a string that includes the
trailing newline character; the "chomp" function deletes it. Perl's
strings are variable-length, and are represented in such a way that
"chomp" doesn't have to scan to find the end of the string.
Duplicating this behavior in C might require too much scaffolding,
e.g., returning a structure with additional information rather than
just returning a pointer to a string.
</OT>
 
CBFalconer

Jordan said:
Is there an easy way to get it to increase the buffer size by
doubling instead?

Yes, but I am not going to do it, nor sanction it. Using such a
routine for a truly large buffer is going to be an extremely rare
occurrence. If it happens, it is likely to be due to the cat
standing on an auto-repeat key, and the important thing is that the
system doesn't barf. I don't need an efficient cat-on-key
detection system.

There is a much better argument for making the internal operation
depend on getc, rather than fgets, because this reduces the load on
the standard library for embedded work.

 
William Ahern

Among others. Possibly the most convenient is ggets, which is
non-standard, but written in standard C, and available at:

<http://cbfalconer.home.att.net/download/ggets.zip>

I beg to differ. All the BSDs offer fgetln(3) and fparseln(3), the
latter is useful for reading configuration files.

fgetln(3) is at least as convenient as ggets(3), plus you can get the line
length for free. It's also available w/o a download on tens of thousands,
maybe even millions, of systems.

And in response to Keith's comment about not being able to detect the
absence of a newline, having the line length allows one to a) detect this
condition and b) easily strip the newline as an option if/when one decides
(since trailing spaces aren't stripped, stripping a newline by default
hardly offers anything in the way of convenience for most uses).
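
To make the point about the line length concrete, here is a small sketch of an optional newline strip for a buffer whose length is already known, however it was obtained. The name `chomp` is borrowed from the Perl function mentioned earlier in the thread. This helper is hypothetical and not part of any library discussed here.

```c
#include <assert.h>
#include <stddef.h>

/* Strip one trailing '\n' from a line of known length.
 * Returns 1 if a newline was removed, 0 if the line had none
 * (for example the last line of a file with no final newline).
 * *len is updated to the new length. */
static int chomp(char *line, size_t *len)
{
    assert(line != NULL && len != NULL);
    if (*len > 0 && line[*len - 1] == '\n') {
        *len -= 1;
        line[*len] = '\0';  /* overwrite the newline in place */
        return 1;
    }
    return 0;
}
```

The return value answers the "was there a newline?" question, and no scan of the string is needed because the length comes from the caller.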
 
Jordan Abel

I do have one small quibble. Since it quietly strips the newline
character from the input line, there's no good way to tell whether the
last line of an input file had a trailing newline in the first place
(for systems that don't require one).

feof(). It's not pretty, but it works.
 
CBFalconer

William said:
I beg to differ. All the BSDs offer fgetln(3) and fparseln(3), the
latter is useful for reading configuration files.

fgetln(3) is at least as convenient as ggets(3), plus you can get the line
length for free. It's also available w/o a download on tens of thousands,
maybe even millions, of systems.

And in response to Keith's comment about not being able to detect the
absence of a newline, having the line length allows one to a) detect this
condition and b) easily strip the newline as an option if/when one decides
(since trailing spaces aren't stripped, stripping a newline by default
hardly offers anything in the way of convenience for most uses).

The prototype of ggets is "int ggets(char **ln);"
and for fggets is "int fggets(char **ln, FILE *f);"

My theory is that, having gotten complete lines, we are not in the
least interested in the terminating \n. The routine is written in
standard C, so is available anywhere. Returning the linelength
would be possible, but would complicate the simplified interface,
and thus could lead to errors. The return differentiates between
file errors/EOF and memory exhaustion.

The user has no control of trailing space stripping, that is
entirely up to the file system, not the interface routines. If the
blanks are there, ggets will return them.

What are the prototypes for fgetln and fparseln?

 
Keith Thompson

Jordan Abel said:
feof(). It's not pretty, but it works.

(Context: discussing CBFalconer's ggets() routine.)

How does feof() help? I haven't done the experiment, but I presume
that if I've read the last line of a file using ggets(), then feof()
will return true whether that line had a newline terminator or not.
If the newline is there, ggets() will have read and discarded it.
 
websnarf

Out of the frying pan, into the fire. It's just trading in a buffer
overflow vulnerability for a denial of service vulnerability. Try
this:

http://www.azillionmonkeys.com/qed/userInput.html

and let's try to get rid of all the core problems at once.

But it's not optimum.
I do have one small quibble. Since it quietly strips the newline
character from the input line, there's no good way to tell whether the
last line of an input file had a trailing newline in the first place
(for systems that don't require one).

Ooooh! Nice catch; I see you are getting better at this "code review"
thing. I missed this the last time I looked at it.
It's not a huge deal, and it's probably ok as a default behavior, but
it might be nice to have an alternative interface that handles this --
perhaps a second function that leaves the '\n' in place.

I don't know how you can possibly come to this conclusion. This takes
*away* obvious functionality that is otherwise present even when using
fgets(). Losing the faithful representation property seems like a
pretty big deal to me.
<OT>
I like Perl's behavior in this area, but it may not translate well to
C. In Perl, reading a line gives you a string that includes the
trailing newline character; the "chomp" function deletes it. Perl's
strings are variable-length, and are represented in such a way that
"chomp" doesn't have to scan to find the end of the string.
Duplicating this behavior in C might require too much scaffolding,
e.g., returning a structure with additional information rather than
just returning a pointer to a string.
</OT>

Ironic comment. Bstrlib matches Perl's behavior here (it doesn't have
a chomp, but it has trim functions and simulating an exact chomp is a
one-liner) -- I don't know what you mean by "too much scaffolding".
(And of course, it's not technically OT to talk about it.)
 
Jordan Abel

(Context: discussing CBFalconer's ggets() routine.)

How does feof() help? I haven't done the experiment, but I presume
that if I've read the last line of a file using ggets(), then feof()
will return true whether that line had a newline terminator or not.
If the newline is there, ggets() will have read and discarded it.

But it will not have attempted to read _past_ the newline, thus feof()
will return false. The same applies to fgets() itself.
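
A sketch of the feof() check applied to plain fgets(), under the assumption (true of typical stdio implementations, as argued above) that fgets() stops right after the newline and does not try to read further. The name `last_line_unterminated` and the 128-byte buffer are illustrative. One corner case needs extra care: a chunk that exactly fills the buffer without a newline does not set the end-of-file indicator either, so it is tracked separately.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Read the stream to the end with fgets() and report whether the final
 * line was ended by end-of-file rather than by '\n'.
 * Returns 1 if the last line lacked a newline, 0 otherwise
 * (including for an empty stream). */
static int last_line_unterminated(FILE *fp)
{
    char buf[128];          /* arbitrary chunk size */
    int unterminated = 0;

    assert(fp != NULL);
    while (fgets(buf, sizeof buf, fp) != NULL) {
        size_t len = strlen(buf);
        /* A full chunk with no newline says nothing about EOF yet;
         * treat it as unterminated until a later chunk says otherwise. */
        int full_chunk = (len == sizeof buf - 1 && buf[len - 1] != '\n');
        /* feof() is true here only if fgets() ran into end-of-file
         * while still looking for the newline. */
        unterminated = feof(fp) || full_chunk;
    }
    return unterminated;
}
```

The same test would apply after any reader built on fgets(), provided nothing in between clears or re-triggers the end-of-file indicator.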
 
Richard G. Riley

But it will not have attempted to read _past_ the newline, thus feof()
will return false. The same applies to fgets() itself.

Assuming the status hasn't been reset in whatever that third-party
library was. Seems to me to be taking a flamethrower to a candle:
unnecessary and overkill. Reading line by line is hardly new and has
been addressed in the C language. 99.999% of programs use the standard
functions with no issue, since the programmer knows the maximum input line
length and can set a safety buffer accordingly. That, combined with
the ability to tell the function the maximum number of bytes to read, would
make me reluctant to pull someone else's stuff into the equation.
 
Keith Thompson

Jordan Abel said:
But it will not have attempted to read _past_ the newline, thus feof()
will return false. The same applies to fgets() itself.

Ok, I think you're right. (I'll give it a try later.)
 
Jordan Abel

[...]
I do have one small quibble. Since it quietly strips the newline
character from the input line, there's no good way to tell whether the
last line of an input file had a trailing newline in the first place
(for systems that don't require one).

feof(). It's not pretty, but it works.

(Context: discussing CBFalconer's ggets() routine.)

How does feof() help? I haven't done the experiment, but I presume
that if I've read the last line of a file using ggets(), then feof()
will return true whether that line had a newline terminator or not.
If the newline is there, ggets() will have read and discarded it.

But it will not have attempted to read _past_ the newline, thus feof()
will return false. The same applies to fgets() itself.

Assuming the status hasn't been reset in whatever that third-party
library was. Seems to me to be taking a flamethrower to a candle:
unnecessary and overkill.

But how often do you need to know this? _especially_ given the assurance
that what is read is a complete line, which you don't even get with
fgets().
 
William Ahern

The prototype of ggets is "int ggets(char **ln);"
and for fggets is "int fggets(char **ln, FILE *f);"

My theory is that, having gotten complete lines, we are not in the
least interested in the terminating \n. The routine is written in
standard C, so is available anywhere. Returning the linelength
would be possible, but would complicate the simplified interface,
and thus could lead to errors. The return differentiates between
file errors/EOF and memory exhaustion.

The user has no control of trailing space stripping, that is
entirely up to the file system, not the interface routines. If the
blanks are there, ggets will return them.

What are the prototypes for fgetln and fparseln?

char *fgetln(FILE *stream, size_t *len);

char *fparseln(FILE *stream, size_t *len, size_t *lineno,
const char delim[3], int flags);

It's been a while since I used fgetln(). As somebody else pointed
out by e-mail, fgetln() returns a pointer that you don't own (the space
isn't immutable, but to persist the data you have to copy it out). Also,
fgetln() doesn't return a true, NUL-terminated string. But for most of my
purposes, having the line length is far more useful than having an altered
string that I have to free (since sometimes you just want to parse
in place and move on).

GNU getline(), I think, is a great compromise between
fgetln() and ggets(): ssize_t getline(char **buf, size_t *bufsiz, FILE *).
It's one of the rare GNU extensions that does most everything you want in
a more-or-less elegant manner:

getline() reads an entire line, storing the address of the buffer
containing the text into *lineptr. The buffer is null-terminated
and includes the newline character, if a newline delimiter was found.

If *lineptr is NULL, the getline() routine will allocate a buffer for
containing the line, which must be freed by the user program.
Alternatively, before calling getline(), *lineptr can contain a pointer
to a malloc()-allocated buffer *n bytes in size. If the buffer is not
large enough to hold the line read in, getline() resizes the buffer to
fit with realloc(), updating *lineptr and *n as necessary. In either
case, on a successful call, *lineptr and *n will be updated to reflect
the buffer address and size respectively.

...

On success, getline() ... return the number of characters
read, including the delimiter character, but not including the
terminating null character.
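
A short usage sketch based on the description quoted above. It assumes a system that provides getline() (a GNU extension when this thread was written, and since standardized in POSIX.1-2008); the function name `longest_line` is made up for the example. It starts from a NULL buffer so that getline() allocates, reuses that buffer for every line, and frees it once at the end.

```c
#define _POSIX_C_SOURCE 200809L  /* request getline() on POSIX systems */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Return the length of the longest line in fp (newline excluded),
 * reading with getline(). */
static size_t longest_line(FILE *fp)
{
    char *line = NULL;  /* NULL with size 0 lets getline() allocate */
    size_t cap = 0;     /* buffer size, maintained by getline() */
    ssize_t n;          /* characters read, or -1 on EOF or error */
    size_t longest = 0;

    assert(fp != NULL);
    while ((n = getline(&line, &cap, fp)) != -1) {
        size_t len = (size_t)n;
        if (len > 0 && line[len - 1] == '\n')
            len--;      /* do not count the newline itself */
        if (len > longest)
            longest = len;
    }
    free(line);         /* the caller owns the buffer, even after failure */
    return longest;
}
```

Because the return value is the number of characters read, the length comes for free and the presence of the final newline can be checked directly, which addresses both concerns raised earlier in the thread.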

fparseln() is probably not a fair comparison since it's not strictly a
general purpose line buffering interface. From the man page:

The fparseln() function returns a pointer to the next logical line
from the stream referenced by stream. This string is null terminated and
dynamically allocated on each invocation. It is the responsibility of
the caller to free the pointer.

By default, if a character is escaped, both it and the preceding escape
character will be present in the returned string. Various flags alter
this behaviour.
 
Jordan Abel

GNU getline(), I think, is a great compromise between
fgetln() and ggets(): ssize_t getline(char **buf, size_t *bufsiz, FILE *).
It's one of the rare GNU extensions that does most everything you want in
a more-or-less elegant manner:

getline in terms of fgetln:

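#include <stdio.h>      /* FILE, fgetln() (BSD extension) */
#include <stdlib.h>     /* free(), realloc(), reallocf() on BSD */
#include <string.h>     /* memcpy() */
#include <sys/types.h>  /* ssize_t */

/* reallocf() is a BSD extension; enable this fallback where it is missing.
void *reallocf(void *orig, size_t size) {
    void *tmp = realloc(orig, size);
    if (!tmp)
        free(orig);
    return tmp;
}
*/

ssize_t getline(char **buf, size_t *bufsiz, FILE *stream) {
    size_t len;
    /* fgetln() returns a pointer we don't own, with no NUL terminator */
    char *fgotln = fgetln(stream, &len);
    if (!fgotln)
        return -1;
    /* grow when there is no buffer yet or no room for len bytes plus NUL */
    if (!*buf || *bufsiz < len + 1) {
        *buf = reallocf(*buf, len + 1);
        if (!*buf) {
            *bufsiz = 0;    /* old buffer was freed by reallocf() */
            return -1;
        }
        *bufsiz = len + 1;
    }
    memcpy(*buf, fgotln, len);
    (*buf)[len] = '\0';
    return (ssize_t)len;
}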

I get the impression that fgetln is _very_ low-level - the way it's
described implies that the pointer it returns may be into the stdio
stream's buffer.
 
William Ahern

I get the impression that fgetln is _very_ low-level - the way it's
described implies that the pointer it returns may be into the stdio
stream's buffer.

Yep. This is exactly how it works, where it's native. I had to rewrite
it for a compat library, since fparseln() requires it.
 
