How to get the total row number of a text file

Wei Su

Hi,
I have a text file abc.txt and it looks like:
12 34 56
23 45 56
33 56 78
... .. ..
... .. ..

I want to find out how many rows there are in the text file. How can I do this?
Thanks.
 
castreuil.anthony

Wei Su wrote:
Hi,
I have a text file abc.txt and it looks like:
12 34 56
23 45 56
33 56 78
.. .. ..
.. .. ..

I want to find out how many rows there are in the text file. How can I do this?
Thanks.

You could count the number of line breaks.
 
ThunderBird

The reply said "You could count the number of line breaks." My question
is: how do I count the number of line breaks?
TB
 
Charles M. Reinke

----- Original Message -----
From: "ThunderBird" <[email protected]>
Newsgroups: comp.lang.c
Sent: Thursday, August 11, 2005 1:14 PM
Subject: Re: How to get the total row number of a text file

The reply said "You could count the number of line breaks." My question
is: how do I count the number of line breaks?
TB

This works for me...

cmreinke@hologram>cat lines.c
/* lines.c */
#include <stdio.h>

int main(int argc, char **argv) {
    long lines=0;
    FILE *fp;

    if(argc>1) {
        if((fp=fopen(argv[1], "r"))) {
            while(fscanf(fp, "%*[^\n]\n")!=EOF) lines++;
            printf("Number of lines in file \"%s\": %ld\n", argv[1], lines);
        } /* if */
        else printf("ERROR: could not open file \"%s\"\n", argv[1]);
    } /* if */
    else printf("ERROR: no input file specified\n");

    return 0;
} /* main */
/* lines.c */

cmreinke@hologram>gcc -Wall -ansi -pedantic -o lines.exe lines.c
cmreinke@hologram>lines.exe output2
Number of lines in file "output2": 270

-Charles
 
moxm

This is a somewhat tricky question, because different operating systems
use different character sequences to mark a newline:
Windows: \r\n
Unix: \n
Mac: \r (or so I am told; I have never used a Mac)
When transferring text files between Windows and Unix, I did run into
trouble with the carriage return and line feed.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>

static const char *prog_name="wcn";

int
main(int argc, char *argv[]) {
    FILE *fp;
    int i;

    if (argc == 1) {
        printf("Usage: %s [files]\n", prog_name);
        exit(EXIT_FAILURE);
    }

    for (i=1; i<argc; ++i) {
        /* n_c: every '\n'; rn_c: "\r\n" pairs; r_c: '\r' not followed by '\n' */
        unsigned int r_c=0, n_c=0, rn_c=0;
        int c, last=0;
        /* binary mode, so the library does not translate line endings */
        if (! (fp=fopen(argv[i], "rb"))) {
            printf("%s: %s: %s\n", prog_name, argv[i], strerror(errno));
            continue;
        }

        while ((c=fgetc(fp)) != EOF) {
            if (c == '\n') {
                ++n_c;
                if (last == '\r')
                    ++rn_c;
            } else if (last == '\r')
                ++r_c;
            last = c;
        }
        r_c += last=='\r'; /* In case the last char is '\r' */

        if (ferror(fp)) {
            printf("%s: %s: %s\n", prog_name, argv[i], strerror(errno));
            if (fclose(fp) == EOF)
                perror(prog_name);
            continue;
        }
        if (fclose(fp) == EOF)
            printf("%s: %s: %s\n", prog_name, argv[i], strerror(errno));
        printf("%s:\t \\n: %u\t \\r: %u\t \\r\\n: %u\n", argv[i], n_c, r_c,
               rn_c);
    }
    exit(EXIT_SUCCESS);
}
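The three counters printed by the program above could be combined into a single line count and a guess at the file's convention. What follows is only a sketch under stated assumptions: the names `classify_eol`, `total_line_endings` and the `eol_style` enum are hypothetical and not part of the program above, and the arguments are assumed to be the `n_c`, `r_c` and `rn_c` values it computes (every '\n', every '\r' not followed by '\n', and every "\r\n" pair).

```c
#include <assert.h>

/* Hypothetical labels for the line-ending convention a file seems to use. */
enum eol_style { EOL_NONE, EOL_LF, EOL_CRLF, EOL_CR, EOL_MIXED };

/* Guess the convention from the three counters:
   n_c  = number of '\n' characters (including those in "\r\n"),
   r_c  = number of '\r' characters NOT followed by '\n',
   rn_c = number of "\r\n" pairs. */
enum eol_style classify_eol(unsigned n_c, unsigned r_c, unsigned rn_c)
{
    unsigned lf_only;
    int kinds;

    assert(rn_c <= n_c);          /* every "\r\n" pair contains one '\n' */
    lf_only = n_c - rn_c;         /* '\n' characters not preceded by '\r' */
    kinds = (lf_only > 0) + (rn_c > 0) + (r_c > 0);

    if (kinds == 0) return EOL_NONE;
    if (kinds > 1)  return EOL_MIXED;
    if (lf_only)    return EOL_LF;
    if (rn_c)       return EOL_CRLF;
    return EOL_CR;
}

/* Each '\n' ends a line (alone or as part of "\r\n"), and so does each
   lone '\r', so the number of line terminators is their sum. */
unsigned total_line_endings(unsigned n_c, unsigned r_c)
{
    return n_c + r_c;
}
```

For a pure Windows file with three lines the counters would be n_c=3, r_c=0, rn_c=3, which this sketch classifies as EOL_CRLF with three line endings.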
 
Walter Roberson

This is a somewhat tricky question, because different operating systems
use different character sequences to mark a newline:
Windows: \r\n
Unix: \n
Mac: \r (or so I am told; I have never used a Mac)
When transferring text files between Windows and Unix, I did run into
trouble with the carriage return and line feed.

if (! (fp=fopen(argv[i], "rb"))) {


If you do not open in binary mode, then the local OS's internal convention
will be transformed into \n when the data is read.

If you are trying to deal with files that have been copied from other
OS's then the first line of defence is to copy them in text mode instead
of binary mode.
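One way to see the difference between the two modes on a given system is to count how many characters the library returns for the same file in each mode. A minimal sketch, assuming a hypothetical helper named `count_chars` (it is not a standard function):

```c
#include <assert.h>
#include <stdio.h>

/* Return the number of characters the C library hands back when `path`
   is opened with `mode` ("r" for text, "rb" for binary), or -1 if the
   file cannot be opened or a read error occurs. */
long count_chars(const char *path, const char *mode)
{
    FILE *fp;
    long n = 0;
    int failed;

    assert(path != NULL && mode != NULL);
    fp = fopen(path, mode);
    if (fp == NULL)
        return -1;
    while (getc(fp) != EOF)
        n++;
    failed = ferror(fp);
    fclose(fp);
    return failed ? -1 : n;
}
```

Calling `count_chars("abc.txt", "r")` and `count_chars("abc.txt", "rb")` on a system whose native line ending is "\r\n" would typically give a smaller number for "r", because each "\r\n" is read as a single '\n'. On Unix-like systems the two modes behave the same, so the counts should agree.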
 
moxm

Charles' code is very clean and elegant. I think it works well on
Windows and Unix, but what if the newline character is just '\r', as on
a Mac?
 
Keith Thompson

Charles M. Reinke said:
ThunderBird said:
The reply said "You could count the number of line breaks." My question
is: how do I count the number of line breaks?

This works for me...

cmreinke@hologram>cat lines.c
/* lines.c */
#include <stdio.h>

int main(int argc, char **argv) {
    long lines=0;
    FILE *fp;

    if(argc>1) {
        if((fp=fopen(argv[1], "r"))) {
            while(fscanf(fp, "%*[^\n]\n")!=EOF) lines++;
            printf("Number of lines in file \"%s\": %ld\n", argv[1], lines);
        } /* if */
        else printf("ERROR: could not open file \"%s\"\n", argv[1]);
    } /* if */
    else printf("ERROR: no input file specified\n");

    return 0;
} /* main */
/* lines.c */

Why bother with fscanf()?

#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    long lines = 0;
    FILE *fp;

    if (argc>1) {
        if((fp = fopen(argv[1], "r"))) {
            int c;
            while ((c = getc(fp)) != EOF) {
                if (c == '\n') {
                    lines++;
                }
            }
            fclose(fp);
            printf("Number of lines in file \"%s\": %ld\n", argv[1], lines);
        }
        else {
            fprintf(stderr, "ERROR: could not open file \"%s\"\n", argv[1]);
            exit(EXIT_FAILURE);
        }
    }
    else {
        fprintf(stderr, "ERROR: no input file specified\n");
        exit(EXIT_FAILURE);
    }
    return 0;
}

I've also added a call to fclose(), improved error handling, and
(IMHO) improved code layout.
 
M

moxm

Is there any way to tell what the newline character is on our OS? If it is
not '\n', then Charles' fscanf statement has to be changed.

I mean, if I have a text file that was copied from another OS in binary
mode, and I want a general program to deal with it, what should I do?
 
Charles M. Reinke

Keith Thompson said:
Why bother with fscanf()?

I thought about using something like fgetc(), but preferred consuming the
whole line at a time. Although fscanf() is still probably not the *best*
way to do this. :)

[code snipped]
I've also added a call to fclose(), improved error handling, and
(IMHO) improved code layout.

I completely forgot to fclose(), thanks! And I agree that your error
handling is superior. Alas, I am not a l33t haxor.

-Charles
 
Keith Thompson

moxm said:
Is there any way to tell what the newline character is on our OS? If it is
not '\n', then Charles' fscanf statement has to be changed.

I mean, if I have a text file that was copied from another OS in binary
mode, and I want a general program to deal with it, what should I do?

In the article to which you replied, Walter Roberson wrote:

If you do not open in binary mode, then the local OS's internal
convention will be transformed into \n when the data is read.

As long as you open the file in text mode, and as long as the input
file is a valid text file for the OS, you don't need to know how the
OS represents an end-of-line.

And please provide some context when you post a followup. Search for
"google broken reply link" in this newsgroup for many many copies of
the same explanation.
 
CBFalconer

ThunderBird said:
The reply said "You could count the number of line breaks." My question
is: how do I count the number of line breaks?

This is meaningless. See my sig below for a means of quoting on
the foul google usenet interface. Without context you might as
well not write anything.
 
Keith Thompson

Charles M. Reinke said:
I thought about using something like fgetc(), but preferred consuming the
whole line at a time. Although fscanf() is still probably not the *best*
way to do this. :)

I think fscanf() is defined to do the equivalent of calling fgetc()
anyway. It may do some optimizations to avoid reading a single
character at a time, but fgetc() can use the same optimizations
(reading from an in-memory buffer whenever possible, for example).
There may be some overhead from the function calls, but using getc()
(which is typically a macro) avoids that, and in any case function
call overhead is probably insignificant.
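For anyone who still prefers to consume a line at a time, fgets() is another option besides fscanf(). A minimal sketch, with some assumptions: `count_lines_fgets` is a hypothetical name, the 256-byte buffer size is arbitrary, and the file is assumed to contain no embedded NUL characters (strlen() would stop early at one). Like the getc() loop, it counts only lines that end in '\n'.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Count newline-terminated lines by reading the stream in chunks with
   fgets(). A line longer than the buffer arrives as several chunks;
   only the chunk that ends in '\n' is counted, so each line is
   counted exactly once. */
long count_lines_fgets(FILE *fp)
{
    char buf[256];
    long lines = 0;

    assert(fp != NULL);
    while (fgets(buf, sizeof buf, fp) != NULL) {
        size_t len = strlen(buf);
        if (len > 0 && buf[len - 1] == '\n')
            lines++;
    }
    return lines;
}
```

The caller would open the file in text mode, pass the stream to the function, and close it afterwards, as in the program above.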
 
Mark McIntyre

The reply said "You could count the number of line breaks." My question
is: how do I count the number of line breaks?

Walk through the file byte by byte, counting linefeed characters.
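One detail worth deciding is what to do when the last line has no trailing linefeed. A minimal sketch, assuming that such a final partial line should still count as a row (that choice, and the name `count_lines`, are assumptions for illustration):

```c
#include <assert.h>
#include <stdio.h>

/* Count rows in a text stream: one per '\n', plus one more if the
   stream is non-empty and does not end with '\n'. */
long count_lines(FILE *fp)
{
    long lines = 0;
    int c;
    int last = '\n';   /* so that an empty stream yields 0 */

    assert(fp != NULL);
    while ((c = getc(fp)) != EOF) {
        if (c == '\n')
            lines++;
        last = c;
    }
    if (last != '\n')
        lines++;       /* final line without a terminating linefeed */
    return lines;
}
```

With this convention a file containing three rows gives 3 whether or not the third row ends in a linefeed.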
 
Chris Croughton

Is there any way to tell what the newline character is on our OS? If it is
not '\n', then Charles' fscanf statement has to be changed.

fscanf is for use with text mode streams, so the C library should do any
conversion required for files that are valid for that OS. But...

I mean, if I have a text file that was copied from another OS in binary
mode, and I want a general program to deal with it, what should I do?

... you are now talking about files from a foreign OS. If the file has
been imported as a binary copy, the C library on your machine has no
idea how to handle it. For the most common ones these days, you can
fairly easily write your own code to read a line, though. Mine uses an
algorithm something like:

if the character is CR
    get the next character.
    if that is not LF, unget it (push it back so it's got next time).
    return end of line
else if the character is LF
    get the next character.
    if that is not CR, unget it (push it back so it's got next time).
    return end of line
end if

That copes with lines ending with CR, LF, CRLF and LFCR, as long as
there are no stray CR or LF characters which are supposed to be part of
the data (which is silly but occasionally happens).
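The algorithm above can be sketched in C roughly as follows. This is an illustration only: `is_eol` and `count_lines_any_eol` are hypothetical names, the stream is assumed to have been opened in binary mode ("rb") so that the library does no translation of its own, and it shares the limitation already noted for stray CR or LF characters inside the data.

```c
#include <assert.h>
#include <stdio.h>

/* If c begins an end-of-line sequence (CR, LF, CRLF or LFCR), consume
   the partner character when present and return 1; otherwise return 0.
   A following character that is not the partner is pushed back. */
static int is_eol(FILE *fp, int c)
{
    int partner, next;

    if (c == '\r')
        partner = '\n';
    else if (c == '\n')
        partner = '\r';
    else
        return 0;

    next = getc(fp);
    if (next != partner && next != EOF)
        ungetc(next, fp);   /* one character of pushback is guaranteed */
    return 1;
}

/* Count end-of-line sequences in a binary-mode stream. */
long count_lines_any_eol(FILE *fp)
{
    long lines = 0;
    int c;

    assert(fp != NULL);
    while ((c = getc(fp)) != EOF) {
        if (is_eol(fp, c))
            lines++;
    }
    return lines;
}
```

Pushing back with ungetc() keeps the loop simple: a blank line in an LF-only file is seen as two separate LF characters rather than being swallowed as a pair.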

Of course, your foreign file might have come from a system where each
line is represented as a 2 byte count followed by that many characters
in the line, or where all files are kept in a compressed form, or text
files have a header stating the line width and all lines are constant
width, or something more strange, and there is no way that you can
automatically detect all of the possibilities...

Chris C
 
CBFalconer

Charles M. Reinke said:
ThunderBird said:
The reply said "You could count the number of line breaks."
My question is: how do I count the number of line breaks?

This works for me...

cmreinke@hologram>cat lines.c
/* lines.c */
#include <stdio.h>

int main(int argc, char **argv) {
    long lines=0;
    FILE *fp;

    if(argc>1) {
        if((fp=fopen(argv[1], "r"))) {
            while(fscanf(fp, "%*[^\n]\n")!=EOF) lines++;
            printf("Number of lines in file \"%s\": %ld\n", argv[1], lines);
        } /* if */
        else printf("ERROR: could not open file \"%s\"\n", argv[1]);
    } /* if */
    else printf("ERROR: no input file specified\n");

    return 0;
} /* main */
/* lines.c */

Why so complex? Use stdin. :)

[1] c:\c\wc>cat lc.c
#include <stdio.h>

int main(void)
{
    unsigned long chars, lines;
    int ch;

    chars = lines = 0;
    while (EOF != (ch = getchar())) {
        chars++;
        if ('\n' == ch) lines++;
    }
    printf("%lu chars in %lu lines\n", chars, lines);
    return 0;
} /* main lines */

[1] c:\c\wc>cc lc.c -o lc.exe

[1] c:\c\wc>.\lc < lc.c
285 chars in 15 lines
 
Mark

Keith Thompson said:
In the article to which you replied, Walter Roberson wrote:

If you do not open in binary mode, then the local OS's internal
convention will be transformed into \n when the data is read.

I don't understand how... what magic causes this to happen?
The local OS wouldn't know what character(s) to 'transform'.

As long as you open the file in text mode, and as long as the input
file is a valid text file for the OS, you don't need to know how the
OS represents an end-of-line.

He has already stated that it may not be a valid text file for the current
system ...
Obviously, when transferring 'text' files between different operating
systems, they should be transferred in 'text' mode to make the appropriate
conversions. When a text file is transferred in binary mode, these
conversions do NOT take place.

Mark
 
