How to display the last n lines from a text file

nic977

I have been asked to write a simple program that displays the last n lines
of a given text file, but I have no idea how C defines a "line" in a text
file. How does it tell where a line ends? Is there such a thing as an EOL,
like EOF?
 
Nick Austin

I am asked to write a simple program to displays the last n lines from a
given text file. But I have no ideas how C defines a "line" in a text
file. How does it tell if it is the end of the line, is there such thing
call EOL like the EOF?

You open the file in text mode, e.g:

FILE *f = fopen( "example.txt", "r" );

In this mode the library translates each of the platform's line endings
to the single character '\n' as you read.

Nick.
 
Jack Klein

I am asked to write a simple program to displays the last n lines from a
given text file. But I have no ideas how C defines a "line" in a text
file. How does it tell if it is the end of the line, is there such thing
call EOL like the EOF?

Yes, it is named '\n'.

--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://www.eskimo.com/~scs/C-faq/top.html
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++ ftp://snurse-l.org/pub/acllc-c++/faq
 
Irrwahn Grausewitz

nic977 said:
I am asked to write a simple program to displays the last n lines from a
given text file. But I have no ideas how C defines a "line" in a text
file. How does it tell if it is the end of the line, is there such thing
call EOL like the EOF?

C does not define a line in a text file.

To find the end of a line, read the file one character at a time and
watch for '\n', which is C's representation of the newline character.

Or use the fgets function, which reads a whole line of input (when
given a large enough buffer).
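
For what it's worth, here is a minimal sketch of the fgets approach, assuming a deliberately small buffer so that long lines arrive in several pieces. The function name count_lines and the buffer size are made up for illustration. It treats a chunk that does not end in '\n' as part of a longer line (or as an unterminated final line) rather than as a line of its own.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Count the lines in a text stream using fgets.
   A chunk that does not end in '\n' is either the first part of a line
   longer than the buffer or an unterminated final line. */
long count_lines(FILE *fp)
{
    char buf[128];          /* deliberately small; long lines arrive in pieces */
    long lines = 0;
    int in_line = 0;        /* nonzero while inside an unfinished line */

    assert(fp != NULL);
    while (fgets(buf, sizeof buf, fp) != NULL) {
        size_t len = strlen(buf);
        if (len > 0 && buf[len - 1] == '\n') {
            lines++;        /* saw the end of a line */
            in_line = 0;
        } else {
            in_line = 1;    /* partial chunk, keep reading */
        }
    }
    return lines + in_line; /* count an unterminated last line too */
}
```

Call it on a stream opened with fopen(name, "r"). Input containing embedded '\0' characters would confuse strlen, so this is only suitable for ordinary text.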

And, please, read the faq-list:

http://www.eskimo.com/~scs/C-faq/top.html


Irrwahn
 
T.M. Sommers

nic977 said:
I am asked to write a simple program to displays the last n lines from a
given text file. But I have no ideas how C defines a "line" in a text
file. How does it tell if it is the end of the line, is there such thing
call EOL like the EOF?

No; you just look for newlines. The simplest way to do what you want
is to look at each character in the file, and count how many are newlines.
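
One way to turn that counting idea into a working tail is to make two passes: count the lines, rewind, skip all but the last n, and copy the rest. The sketch below assumes the stream can be rewound (so a regular file, not a pipe), and the name tail_two_pass is invented for this example.

```c
#include <assert.h>
#include <stdio.h>

/* Print the last n lines of a rewindable text stream to out.
   Pass 1 counts lines; pass 2 skips all but the last n and copies the rest.
   Returns the number of lines printed, or -1 on a read error. */
long tail_two_pass(FILE *in, FILE *out, long n)
{
    long total = 0, skip;
    int c, last = '\n';

    assert(in != NULL && out != NULL && n >= 0);

    /* Pass 1: count newlines, plus one for an unterminated final line. */
    while ((c = fgetc(in)) != EOF) {
        if (c == '\n')
            total++;
        last = c;
    }
    if (ferror(in))
        return -1;
    if (last != '\n')
        total++;

    /* Pass 2: start over and discard the first (total - n) lines. */
    rewind(in);
    skip = total > n ? total - n : 0;
    while (skip > 0 && (c = fgetc(in)) != EOF)
        if (c == '\n')
            skip--;

    /* Copy whatever is left. */
    while ((c = fgetc(in)) != EOF)
        fputc(c, out);

    return total < n ? total : n;
}
```

It reads the file twice, which is simple but wasteful for large files where only the tail is wanted.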
 
lallous

I have never tried it, but a fast way might be to:
1. Open the file in binary mode.
2. Start at the end and read backwards, scanning for '\n' (newline).
3. Once you have found n occurrences of '\n', you have located the
last n lines of the file.
 
Glen Herrmannsfeldt

(snip about finding the last n lines in a text file)

lallous said:
Never tried that, but a fast way might be to:
1.open file in binary mode
2.start reading backwards (and from its end) and scanning for '\n' (new
line)
3.when you locate n times the '\n' then you have found your last n lines
from the file.

I think you have to test for seekable source first. If it isn't seekable
then you need to read all the lines and save the lines in some kind of
buffer.
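
A rough sketch of the buffering approach, assuming n is at least 1 and that keeping n lines in memory is acceptable: save each line as it is read in a ring of n slots, so the oldest line is dropped whenever a new one arrives. The names read_line and tail_ring are invented for this example, and an allocation failure while reading is simply treated as end of input.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Read one line of any length, including its '\n' if present.
   Returns a malloc'd string, or NULL at end of input or on failure. */
static char *read_line(FILE *fp)
{
    size_t cap = 64, len = 0;
    char *buf = malloc(cap);
    int c;

    if (buf == NULL)
        return NULL;
    while ((c = fgetc(fp)) != EOF) {
        if (len + 2 > cap) {             /* room for c and the '\0' */
            char *tmp = realloc(buf, cap * 2);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }
        buf[len++] = (char)c;
        if (c == '\n')
            break;
    }
    if (len == 0) {                      /* nothing read: end of input */
        free(buf);
        return NULL;
    }
    buf[len] = '\0';
    return buf;
}

/* Print the last n lines of in to out using a ring of n saved lines.
   It never seeks, so it also works on pipes and terminals.
   Returns the number of lines printed, or -1 if the ring cannot be allocated. */
long tail_ring(FILE *in, FILE *out, size_t n)
{
    char **ring, *line;
    size_t seen = 0, i, start, count;

    assert(in != NULL && out != NULL && n > 0);
    ring = malloc(n * sizeof *ring);
    if (ring == NULL)
        return -1;
    for (i = 0; i < n; i++)
        ring[i] = NULL;

    while ((line = read_line(in)) != NULL) {
        free(ring[seen % n]);            /* drop the line that falls out */
        ring[seen % n] = line;
        seen++;
    }

    count = seen < n ? seen : n;
    start = seen - count;                /* index of the oldest kept line */
    for (i = 0; i < count; i++)
        fputs(ring[(start + i) % n], out);

    for (i = 0; i < n; i++)
        free(ring[i]);
    free(ring);
    return (long)count;
}
```

Something like tail_ring(stdin, stdout, 10) would then behave roughly like a basic tail on piped input.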

-- glen
 
Joe Wright

Never tried that, but a fast way might be to:
1.open file in binary mode
2.start reading backwards (and from its end) and scanning for '\n' (new
line)
3.when you locate n times the '\n' then you have found your last n lines
from the file.

You're kidding. Read backwards? How, exactly?
 
Tom Zych

You're kidding. Read backwards? How, exactly?

Well, I don't think it's a good approach in general - only works on
seekable files, for one thing - but it doesn't seem hard. fseek(),
fread, scan the buffer backwards, repeat as needed. Kind of messy
but workable.
 
Ben Pfaff

Irrwahn Grausewitz said:
C does not define a line in a text file.

Yes it does, see C99 7.19.2#2:

A text stream is an ordered sequence of characters
composed into lines, each line consisting of zero or more
characters plus a terminating new-line character.
 
Joe Wright

Tom said:
Well, I don't think it's a good approach in general - only works on
seekable files, for one thing - but it doesn't seem hard. fseek(),
fread, scan the buffer backwards, repeat as needed. Kind of messy
but workable.

Bear with me, Tom: fseek() to what, exactly? fread() on a text file? In
binary mode? I think not. You can't read backward. Let's start over.

The term 'lines' suggests a 'text' file. A line is a series of zero or
more characters terminating with a '\n' character. There may be any
number of lines in a text file. The last line of the text file may or
may not be terminated with a '\n' character.
 
Tom Zych

Joe said:
Bear with me Tom, fseek() what exactly? fread() a text file? In binary
mode? I think not. You can't read backward. Let's start over..
The term 'lines' suggests a 'text' file. A line is a series of zero or
more characters terminating with a '\n' character. There may be any
number of lines in a text file. The last line of the text file may or
may not be terminated with a '\n' character.

It's my understanding that C makes no fundamental distinction
between "text" files/streams and "binary" ones. The "r"/"rb"
business in fopen is just there for line-terminator translation.
So we can treat the input file as a bunch of bytes and read it any
way we want.

Here's an outline of the algorithm I was thinking of. It would
need a lot more work but I think it shows the approach is
workable, if messy.

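--------------------------------------------------
/* Assumes the input file is seekable and that the stream supports
   SEEK_END in binary mode (the C standard does not require this).
   A sketch with minimal error checking. */

#include <stdio.h>

#define SIZE 1024
#define NLINES 10
#define FILENAME "whatever"

int main(void)
{
    FILE *in;
    char buf[SIZE];
    long file_size, pos, start = 0;
    size_t ct, i;
    int c, found = 0, done = 0;

    in = fopen(FILENAME, "rb");
    if (in == NULL) {
        perror(FILENAME);
        return 1;
    }
    if (fseek(in, 0L, SEEK_END) != 0 || (file_size = ftell(in)) < 0) {
        perror(FILENAME);
        fclose(in);
        return 1;
    }

    /* Start at the last block boundary and walk toward the beginning. */
    pos = file_size - (file_size % SIZE);
    while (pos >= 0 && !done) {
        fseek(in, pos, SEEK_SET);
        ct = fread(buf, 1, SIZE, in);
        /* Scan the first ct bytes of buf backwards, counting \n's. */
        for (i = ct; i > 0 && !done; i--) {
            if (buf[i - 1] != '\n')
                continue;
            /* A \n in the very last byte only terminates the final line. */
            if (pos + (long)i == file_size)
                continue;
            if (++found == NLINES) {
                start = pos + (long)i;   /* offset just past that \n */
                done = 1;
            }
        }
        pos -= SIZE;
    }

    /* Copy everything from start to the end of the file. The bytes were
       read in binary mode and go to stdout untranslated, which is only
       safe where '\n' is the on-disk line terminator. */
    fseek(in, start, SEEK_SET);
    while ((c = getc(in)) != EOF)
        putchar(c);

    fclose(in);
    return 0;
}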
 
Twirlip

Tom said:
It's my understanding that C makes no fundamental distinction
between "text" files/streams and "binary" ones.

The world is stranger than you understand...

Tom said:
The "r"/"rb" business in fopen is just there for line-terminator
translation.

Exactly. If you read a text file as binary, then the line
terminator will not be translated. It could appear as anything:
a single arbitrary character, a sequence of characters, or nothing
at all.

Conversely, treating a text stream as binary can lead to spurious
'\n's being injected into the stream or to characters being
removed. On some platforms this breaks many stdin -> stdout
binary filter programs (stdin and stdout are supposed to be text
streams).

Twirlip
 
Tom Zych

Twirlip said:
The world is stranger than you understand...
Exactly. If you read a text file as binary, then the line
terminator will not be translated. It could appear as anything:
a single arbitrary character, a sequence of characters, or nothing
at all.
Conversely, treating a text stream as binary can lead to spurious
'\n's being injected into the stream or to characters being
removed. On some platforms this breaks many stdin -> stdout
binary filter programs (stdin and stdout are supposed to be text
streams).

Good timing...I recently started reading K&R2 for the first time and
just now reached Section 1.5. I'm glad this was pointed out to me
here before I read that bit, or I might have missed its importance.
If I had ever thought about it I'd have realized that there are all
kinds of systems out there that use all kinds of coding, including
EBCDIC and fixed line lengths. But only being experienced with *nix
and MS-DOS (and maybe due to stuff I read in my old Borland C
manual), I'd assumed \n and \r\n covered all possibilities. :p

So, my new understanding: if you open a stream with "r" or "w", the
program will read and/or write lines terminated with '\n', but the
actual file or whatever on the system may have a completely
different representation, and the library translates between the
two. If you open a file with "rb" or "wb", you get straight binary,
no translation. Don't mix them. And stdin, stdout, and stderr are
text, not binary. All correct?
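
To see the translation for yourself, something along these lines might help. It is only a sketch: the helper names write_two_lines and count_chars are invented here, and any file name you pass (say "demo.txt") is just a placeholder. It writes two lines in text mode, then counts what comes back under a given fopen mode.

```c
#include <assert.h>
#include <stdio.h>

/* Count how many characters a read of the whole file yields in the given
   fopen mode ("r" for text, "rb" for binary). Returns -1 on failure. */
long count_chars(const char *path, const char *mode)
{
    FILE *fp;
    long n = 0;

    assert(path != NULL && mode != NULL);
    fp = fopen(path, mode);
    if (fp == NULL)
        return -1;
    while (fgetc(fp) != EOF)
        n++;
    fclose(fp);
    return n;
}

/* Write two lines in text mode so the library applies its translation. */
int write_two_lines(const char *path)
{
    FILE *fp = fopen(path, "w");

    if (fp == NULL)
        return -1;
    fputs("one\ntwo\n", fp);
    return fclose(fp) == 0 ? 0 : -1;
}
```

After write_two_lines, counting with "r" should give 8 characters. Counting with "rb" also gives 8 on Unix-like systems, but would typically give 10 where the on-disk terminator is "\r\n", as on MS-DOS and Windows.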

Thanks to all for straightening me out!
 
Irrwahn Grausewitz

So, my new understanding: if you open a stream with "r" or "w", the
program will read and/or write lines terminated with '\n', but the
actual file or whatever on the system may have a completely
different representation, and the library translates between the
two. If you open a file with "rb" or "wb", you get straight binary,
no translation. Don't mix them. And stdin, stdout, and stderr are
text, not binary. All correct?

Exactly. Just to add, I found the following footnote in n843:

213) The primary use of the freopen function is to change the
file associated with a standard text stream (stderr,
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
stdin, or stdout), as those identifiers need not be
^^^^^^^^^^^^^^^^^
modifiable lvalues to which the value returned by the
fopen function may be assigned.

Irrwahn
--
do not write: void main(...)
do not use gets()
do not cast the return value of malloc()
do not fflush( stdin )
read the c.l.c-faq: http://www.eskimo.com/~scs/C-faq/top.html
 
