How to count lines in a text file?

Alex Martelli

Andrew Dalke said:
'Cause that's what Python does. Witness:

If you tell it to count non-lines too (pieces that don't end with an
endline marker), it does, of course:
% echo -n 'bu' | python -c \
? 'import sys; print len(sys.stdin.readlines())'
1

But that's just because you told it to.

To reproduce wc's behavior, you have to exclude non-lines -- use
len([ l for l in sys.stdin if l.endswith('\n') ]) for example. Or, the
simpler .count('\n') approach.
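The two definitions can be compared side by side. This is only a sketch in Python 3 syntax (the snippets in this thread are Python 2), and the function names are made up for illustration; `io.StringIO` stands in for `sys.stdin` so the example is self-contained.

```python
import io

def count_terminated_lines(text):
    """wc -l style: count only pieces that end with a newline."""
    return len([l for l in io.StringIO(text) if l.endswith("\n")])

def count_newlines(text):
    """The simpler equivalent: count the newline characters directly."""
    return text.count("\n")

def count_pieces(text):
    """readlines() style: a trailing unterminated piece is counted too."""
    return len(io.StringIO(text).readlines())

for sample in ("", "bu", "a\nb\n", "a\nb"):
    print(repr(sample),
          count_terminated_lines(sample),
          count_newlines(sample),
          count_pieces(sample))
```

The first two functions always agree with each other; they differ from `count_pieces` only when the input does not end with a newline.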

I suspect somebody who asks the subject question wants to reproduce wc's
counting behavior. Of course, it _is_ an imprecise spec they're giving.


Alex
 
Andrew Dalke

Alex said:
If you tell it to count non-lines too (pieces that don't end with an
endline marker), it does, of course:

My reply was meant to be a bit of a jest, pointing out that
I'm using Python's definition of a line. Otherwise, if
lines must end with a newline, then the method should be
named "readlines_and_any_trailing_text()".

Since you used

numlines=0
for line in file('/usr/share/dict/words'): numlines+=1

as a way to count lines, I assumed you would agree with
Python's definition as a reasonable way to count the
number of lines in a file and that your previous post
(on the behavior of wc) was meant more as a rhetorical
way to highlight the ambiguity than as an assertion of
general correctness.
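For reference, here is roughly how that loop behaves on a file whose last line is unterminated. This is a hedged sketch in Python 3 (where `open` replaces the Python 2 `file` builtin); the helper names and the temporary file are assumptions for illustration, not anything from the posts above.

```python
import os
import tempfile

def count_by_iteration(path):
    """Count the pieces yielded by iterating over the file object."""
    numlines = 0
    with open(path, "rb") as f:  # binary mode: pieces are split on b"\n" only
        for _ in f:
            numlines += 1
    return numlines

def count_for_contents(data):
    """Write data to a temporary file and count it (hypothetical helper)."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        return count_by_iteration(path)
    finally:
        os.remove(path)

print(count_for_contents(b"a\nb\n"))  # both lines terminated
print(count_for_contents(b"a\nb"))    # last piece has no newline
print(count_for_contents(b""))        # empty file
```

Iteration counts the unterminated last piece, so the first two calls give the same answer even though `wc -l` would report different numbers for those two files.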

I suspect somebody who asks the subject question wants to reproduce wc's
counting behavior.

Really? I was actually surprised at what wc does. I didn't
realize it only did a "\n" character count. The other programs
I know of number based on the start of line rather than end
of line.


% echo -n "blah" > blah.txt
% less blah.txt
(then press "=")
blah.txt lines 1-1/1 byte 4/4 (END) (press RETURN)


% echo -n "" | perl -ne '$line++; END{$line+=0;print "$line\n"}'
0
% echo -n "blah" | perl -ne '$line++; END{$line+=0;print "$line\n"}'
1

% echo -n "" | awk 'END {print NR}'
0
% echo -n "blah" | awk 'END {print NR}'
1

% echo -n "blah" | grep -n "blah"
1:blah
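The "start of line" versus "end of line" distinction shown in these sessions can be written down directly. A minimal sketch in Python 3, with made-up function names; it models only the counting rule and makes no claim about how less, perl, awk, or grep are implemented.

```python
def count_line_ends(data: bytes) -> int:
    """End-of-line counting: one per newline byte, as wc -l reports."""
    return data.count(b"\n")

def count_line_starts(data: bytes) -> int:
    """Start-of-line counting: one per position where a line begins."""
    if not data:
        return 0
    # A line starts at offset 0 and after every newline except a final one.
    return 1 + data[:-1].count(b"\n")

for sample in (b"", b"blah", b"blah\n", b"a\nb"):
    print(sample, count_line_ends(sample), count_line_starts(sample))
```

The two counts differ by exactly one when the input is non-empty and does not end with a newline, which is the "blah" case above.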

> Of course, it _is_ an imprecise spec they're giving.

Yup.

Andrew
 
Alex Martelli

Andrew Dalke said:
Really? I was actually surprised at what wc does. I didn't
realize it only did a "\n" character count. The other programs

Ah well -- maybe it's just me, 25+ years of either using Unix or pining
for it (when I had to use VMS, VM/SP, Windows, etc, etc) must have left
their mark.


Alex
 