How to count lines in a text file?

Discussion in 'Python' started by Ling Lee, Sep 20, 2004.

  1. Ling Lee

    Ling Lee Guest

    Hi all.

    I'm trying to write a program that:
    1) Asks me which file I want to count the lines in, then counts the
    lines and writes the answer out.

    2) I made the first part like this:

    in_file = raw_input("What is the name of the file you want to open: ")
    in_file = open("test.txt","r")
    text = in_file.read()

    3) I think that I have to use a for loop (something like: for line in text:
    count += 1).
    Or maybe I have to create a def, something like (def loop(line,
    count)), but I'm not sure how to do this properly.
    And then perhaps use the readlines() function, but again I'm not quite sure how
    to do this. So does one of you have a good idea?

    Thanks for all help
     
    Ling Lee, Sep 20, 2004
    #1

  2. Ling Lee

    Ling Lee Guest

    Oh I just did it.

    Just used the line:

    print "%d lines in your chosen file" % len(open("test.txt").readlines())

    Thanks though :)
     
    Ling Lee, Sep 20, 2004
    #2

  3. Phil Frost

    Phil Frost Guest

    Yes, you need a for loop, and a count variable. You can count in several
    ways. File objects are iterable, and they iterate over the lines in the
    file. readlines() returns a list of the lines, which will have the same
    effect, but because it builds the entire list in memory first, it uses
    more memory. Example:

    ########

    filename = raw_input('file? ')
    file = open(filename)

    lines = 0
    for line in file:
        # line is ignored here, but it contains each line of the file,
        # including the newline
        lines += 1

    print '%r has %r lines' % (filename, lines)

    ########

    Another alternative is to use the standard POSIX program "wc" with the
    -l option, but this isn't Python.
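
    For readers on a current interpreter, the two approaches described above might look roughly like this in Python 3 (the thread itself uses Python 2 syntax). This is only a sketch: the helper names `count_by_iterating` and `count_with_readlines` are invented for illustration, and the temporary file stands in for whatever file you want to count.

```python
import os
import tempfile


def count_by_iterating(path):
    """Count lines by looping over the file object, one line at a time."""
    lines = 0
    with open(path) as f:
        for _line in f:  # each _line still includes its trailing newline
            lines += 1
    return lines


def count_with_readlines(path):
    """Count lines by building the full list of lines in memory first."""
    with open(path) as f:
        return len(f.readlines())


# Demo on a throwaway file (hypothetical content, not from the thread).
fd, demo_path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as f:
    f.write("first\nsecond\nthird\n")

iter_count = count_by_iterating(demo_path)
list_count = count_with_readlines(demo_path)
os.remove(demo_path)
```

    Both helpers return the same number; the difference is that the second one holds every line in memory at once.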

     
    Phil Frost, Sep 20, 2004
    #3
  4. Ling Lee <> wrote:

    > Oh I just did it.
    >
    > Just used the line:
    >
    > print "%d lines in your chosen file" % len(open("test.txt").readlines())
    >
    > Thanks though :)


    You're welcome;-). However, this approach reads all of the file into
    memory at once. If you must be able to deal with humungous files, too
    big to fit in memory at once, try something like:

    numlines = 0
    for line in open('text.txt'): numlines += 1


    Alex
     
    Alex Martelli, Sep 20, 2004
    #4
  5. Ling Lee wrote:
    > Hi all.
    >
    > I'm trying to write a program that:
    > 1) Asks me which file I want to count the lines in, then counts the
    > lines and writes the answer out.
    >
    > 2) I made the first part like this:
    >
    > in_file = raw_input("What is the name of the file you want to open: ")
    > in_file = open("test.txt","r")
    > text = in_file.read()
    >
    > 3) I think that I have to use a for loop (something like: for line in text:
    > count += 1).
    > Or maybe I have to create a def, something like (def loop(line,
    > count)), but I'm not sure how to do this properly.
    > And then perhaps use the readlines() function, but again I'm not quite sure how
    > to do this. So does one of you have a good idea?
    >
    > Thanks for all help
    >
    >
    >

    text = in_file.readlines()
    print len(text)

    HtH, Roland
     
    Roland Heiber, Sep 20, 2004
    #5
  6. Ling Lee

    Ling Lee Guest

    Thanks for your replies :)

    I just ran the program with a different file name, and it only counts the
    number of lines in the file named test.txt. I'll give it another try
    with your input...

    Thanks again... for the fast reply... Hope I get it right this time :)
     
    Ling Lee, Sep 20, 2004
    #6
  7. Erik Heneryd

    Erik Heneryd Guest

    Phil Frost wrote:
    > another alternative is to use the standard posix program "wc" with the
    > -l option, but this isn't Python.
    >


    Not the same thing. wc -l counts newline bytes, not "real" lines.
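
    To make the distinction concrete, here is a small Python 3 sketch. It assumes, as stated above, that `wc -l` reports the number of newline bytes; the function names and the sample data are made up for illustration.

```python
import io


def count_newline_bytes(stream, chunk_size=64 * 1024):
    """Count newline bytes in a binary stream, reading fixed-size chunks."""
    total = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        total += chunk.count(b"\n")
    return total


def count_iterated_lines(stream):
    """Count lines the way iterating over a file object does."""
    return sum(1 for _line in stream)


data = b"one\ntwo\nthree"  # final line has no trailing newline
newline_bytes = count_newline_bytes(io.BytesIO(data))
iterated_lines = count_iterated_lines(io.BytesIO(data))
```

    On data whose last line lacks a trailing newline, the two counts differ by one.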


    Erik
     
    Erik Heneryd, Sep 20, 2004
    #7
  8. Ling Lee said unto the world upon 2004-09-20 09:36:
    > Thanks for your replies :)
    >
    > I just ran the program with a different file name, and it only counts the
    > number of lines in the file named test.txt. I'll give it another try
    > with your input...
    >
    > Thanks again... for the fast reply... Hope I get it right this time :)
    >
    >


    <SNIP>


    Hi Ling Lee,

    you've got:

    in_file = raw_input("What is the name of the file you want to open: ")
    in_file = open("test.txt","r")

    What this does is take the user input and assign it the name "in_file",
    and then promptly reassign the name "in_file" to the output of
    open("test.txt","r").

    So, you never make use of the input, and keep asking it to open test.txt
    instead.

    Try something like:

    in_file_name = raw_input("What is the file you want to open: ")
    in_file = open(in_file_name,"r")
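
    The rebinding can be seen in isolation with a short Python 3 sketch. The file name is hypothetical, and `io.StringIO` merely stands in for the object that `open()` would return, so nothing touches the disk.

```python
import io

# First binding: the name refers to a string (the would-be user input).
in_file = "notes.txt"  # hypothetical file name typed by the user
first_type = type(in_file).__name__

# Second binding: the same name now refers to a file-like object, and the
# string above is no longer reachable through it.
in_file = io.StringIO("stand-in for open('test.txt')\n")
second_type = type(in_file).__name__
in_file.close()
```

    After the second assignment the user's answer is simply gone, which is why two separate names are needed.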

    Also, and I say this as a fellow newbie, you might want to check out the
    Tutor list: <http://mail.python.org/pipermail/tutor/>

    HTH,

    Brian vdB
     
    Brian van den Broek, Sep 20, 2004
    #8
  9. Andrew Dalke

    Andrew Dalke Guest

    Ling Lee wrote:
    > 2) I made the first part like this:
    >
    > in_file = raw_input("What is the name of the file you want to open: ")
    > in_file = open("test.txt","r")
    > text = in_file.read()


    You have two different objects related to the file.
    One is the filename (the result of calling raw_input) and
    the other is the file handle (the result of calling open).
    You are using the same variable name for both of them. You
    really should make them different.

    First you get the file name and reference it by the variable
    named 'in_file'. Next you use another filename ("test.txt")
    for the open call. This returns a file handle, but not
    a file handle to the file named in 'in_file'.

    You then change things so that 'in_file' no longer refers
    to the filename but now refers to the file handle.

    A nicer solution is to use one variable name for the name
    (like "in_filename") and another for the handle (you can
    keep "in_file" if you want to). In the following I
    reformatted it so the example fits in under 80 columns:

    in_filename = raw_input("What is the name of the file "
                            "you want to open: ")
    in_file = open(in_filename,"r")
    text = in_file.read()


    Now the in_file.read() reads all of the file into memory. There
    are several ways to count the number of lines. The first is
    to count the number of newline characters. Because the newline
    character is special, it's most often written as what's called
    an escape code. In this case, "\n". Others are backspace ("\b")
    and beep ("\a"), and backslash ("\\") since otherwise there's
    no way to get the single character "\".
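
    A quick way to convince yourself that each escape code is a single character is to check its length and character code. The snippet below is a Python 3 sketch; it writes the bell (beep) character as "\a".

```python
newline = "\n"
backspace = "\b"
bell = "\a"
backslash = "\\"

# Each escape sequence is a single character, despite taking two to type.
lengths = [len(s) for s in (newline, backspace, bell, backslash)]

# ord() gives the character code each escape stands for.
codes = [ord(s) for s in (newline, backspace, bell, backslash)]
```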

    Here's how to count the number of newlines in the text:

    num_lines = text.count("\n")

    print "There are", num_lines, "lines in", in_filename


    This will work for almost every file except for one where
    the last line doesn't end with a newline. It's rare, but
    it does happen. To fix that you need to see if the
    text ends with a newline and if it doesn't then add one
    more to the count


    num_lines = text.count("\n")
    # an unterminated last line still counts; an empty file has no lines
    if text and not text.endswith("\n"):
        num_lines = num_lines + 1

    print "There are", num_lines, "lines in", in_filename


    > 3) I think that I have to use a for loop ( something like
    > for line in text: count +=1)


    Something like that will work, but not as written. When you say "for xxxx in string"
    it loops through every character in the string, and not
    every line. What you need is some way to get the lines.

    One solution is to use the 'splitlines' method of strings.
    This knows how to deal with the "final line doesn't end with
    a newline" case and return a list of all the lines. You
    can use it like this

    count = 0
    for line in text.splitlines():
        count = count + 1

    or, since splitlines() returns a list of lines you can
    also do

    count = len(text.splitlines())
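
    The difference between looping over the string and looping over `splitlines()` shows up in a small Python 3 sketch (the sample text is made up):

```python
text = "alpha\nbeta\ngamma"  # hypothetical sample; no trailing newline

# Looping over a string visits characters, not lines.
char_count = sum(1 for _ch in text)

# splitlines() returns the lines themselves, without their newlines.
lines = text.splitlines()
line_count = len(lines)

# A trailing newline, when present, does not create an extra empty line.
with_trailing = len("alpha\nbeta\ngamma\n".splitlines())

# An empty string has no lines at all.
empty_count = len("".splitlines())
```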

    It turns out that reading lines from a file is very common.
    When you say "for xxx in file" it loops through every line
    in the file. This is not a list so you can't say

    len(open(in_filename, "r")) # DOES NOT WORK

    instead you need to have the explicit loop, like this

    count = 0
    for line in open(in_filename, "r"):
        count = count + 1

    An advantage to this approach is that it doesn't read
    the whole file into memory. That's only a problem
    if you have a large file. Try counting the number of
    lines in a 1.5 GB file!

    By the way, "r" is the default mode when opening a file.
    Most people omit it from the parameter list and just use

    open(in_filename)
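
    A Python 3 sketch of both points, under the assumption that a throwaway temporary file is an acceptable stand-in for `in_filename`: calling `len()` on the file object raises `TypeError`, while the explicit loop counts the lines.

```python
import os
import tempfile

# Create a small temporary file to count (contents are arbitrary).
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as f:
    f.write("a\nb\nc\n")

with open(path) as f:
    try:
        len(f)  # file objects define no __len__
        len_error = None
    except TypeError as exc:
        len_error = type(exc).__name__

# The explicit loop works and holds only one line in memory at a time.
count = 0
with open(path) as f:
    for _line in f:
        count += 1

os.remove(path)
```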

    Hope this helped!

    By the way, you might want to look at the "Beginner's
    Guide to Python" page at http://python.org/topics/learn/ .
    It has pointers to resources that might help, including
    the tutor mailing list meant for people like you who
    are learning to program in Python.

    Andrew
     
    Andrew Dalke, Sep 20, 2004
    #9
  10. Ling Lee

    Ling Lee Guest

    Thanks for explaining it that well, really makes sense now :)

    Cheers....
     
    Ling Lee, Sep 20, 2004
    #10
  11. On Mon, 20 Sep 2004 15:29:18 +0200, rumours say that
    (Alex Martelli) might have written:

    >Ling Lee <> wrote:


    >> Oh I just did it.
    >>
    >> Just used the line:
    >>
    >> print "%d lines in your chosen file" % len(open("test.txt").readlines())
    >>
    >> Thanks though :)


    [Alex]
    >You're welcome;-). However, this approach reads all of the file into
    >memory at once. If you must be able to deal with humungous files, too
    >big to fit in memory at once, try something like:
    >
    >numlines = 0
    >for line in open('text.txt'): numlines += 1


    And a short story of premature optimisation follows...

    Saw the plain code above and instantly the programmer's instinct of
    optimisation came into action... we all know that C loops are faster
    than python loops, right? So I spent 2 minutes of my time to write the
    following 'clever' function:

    def count_lines(filename):
        fp = open(filename)
        count = 1 + max(enumerate(fp))[0]
        fp.close()
        return count

    Proud of my programming skills, I timed it against another function
    containing Alex's code. Guess what? My code was slower... (and I should
    add a try: except ValueError: clause to cater for empty files)

    Of course, on second thought, the reason must be that enumerate
    generates one tuple for every line in the file; in any case, I'll mark
    this rule:

    C loops are *always* faster than python loops, unless the loop does
    something useful ;-) in the latter case, timeit.py is your friend.
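
    For what it's worth, the empty-file case can also be handled without try/except. The following is a Python 3 sketch, not the function that was timed above; `count_lines_enumerate` is a hypothetical name and `io.StringIO` stands in for a real file.

```python
import io


def count_lines_enumerate(fp):
    """Count lines via enumerate(), returning 0 for an empty file."""
    count = 0
    # Starting enumerate at 1 makes the last index equal the line count;
    # if the loop body never runs, count stays at its initial 0.
    for count, _line in enumerate(fp, 1):
        pass
    return count


empty_total = count_lines_enumerate(io.StringIO(""))
three_total = count_lines_enumerate(io.StringIO("x\ny\nz\n"))

# max() on an empty iterable is what raises in the max(enumerate(fp)) form.
try:
    max(enumerate(io.StringIO("")))
    raised = None
except ValueError as exc:
    raised = type(exc).__name__
```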
    --
    TZOTZIOY, I speak England very best,
    "Tssss!" --Brad Pitt as Achilles in unprecedented Ancient Greek
     
    Christos TZOTZIOY Georgiou, Sep 22, 2004
    #11
  12. Christos TZOTZIOY Georgiou <> wrote:
    ...
    > >memory at once. If you must be able to deal with humungous files, too
    > >big to fit in memory at once, try something like:
    > >
    > >numlines = 0
    > >for line in open('text.txt'): numlines += 1

    >
    > And a short story of premature optimisation follows...


    Thanks for sharing!

    > def count_lines(filename):
    >     fp = open(filename)
    >     count = 1 + max(enumerate(fp))[0]
    >     fp.close()
    >     return count


    Cute, actually!

    > containing Alex's code. Guess what? My code was slower... (and I should
    > add a try: except ValueError: clause to cater for empty files)
    >
    > Of course, on second thought, the reason must be that enumerate
    > generates one tuple for every line in the file; in any case, I'll mark


    I thought built-ins could recycle their tuples, sometimes, but you may
    in fact be right (we should check with Raymond Hettinger, though).

    With 2.4, I measure 30 msec with your approach, and 24 with mine, to
    count the 45425 lines of /usr/share/dict/words on my Linux box
    (admittedly not a great example of 'humungous file'); and similarly
    kjv.txt, a King James' Bible (31103 lines, but 10 times the size of the
    words file), 41 with yours, 36 with mine. They're pretty close. At
    least they beat len(file(...).readlines()), which takes 33 on words, 62
    on kjv.txt...

    If one is really in a hurry counting lines, a dedicated C extension
    might help. E.g.:

    static PyObject *count(PyObject *self, PyObject *args)
    {
        PyObject* seq;
        PyObject* item;
        int result;

        /* get one argument as an iterator */
        if(!PyArg_ParseTuple(args, "O", &seq))
            return 0;
        seq = PyObject_GetIter(seq);
        if(!seq)
            return 0;

        /* count items */
        result = 0;
        while((item=PyIter_Next(seq))) {
            result += 1;
            Py_DECREF(item);
        }

        /* clean up and return result */
        Py_DECREF(seq);
        return Py_BuildValue("i", result);
    }

    Using this count-items-in-iterable thingy, words takes 10 msec, kjv
    takes 26.

    Happier news is that one does NOT have to learn C to gain this.
    Consider the Pyrex file:

    def count(seq):
        cdef int i
        it = iter(seq)
        i = 0
        for x in it:
            i = i + 1
        return i

    pyrexc'ing this and building the Python extension from the resulting C
    file gives just about the same performance as the pure-C coding: 10 msec
    on words, 26 on kjv, the same to within 1% as pure-C coding (there is a
    systematic speedup of a bit less than 1% for the C-coded function).

    And if one doesn't even want to use pyrex? Why, that's what psyco is
    for...:

    import psyco
    def count(seq):
        it = iter(seq)
        i = 0
        for x in it:
            i = i + 1
        return i
    psyco.bind(count)  # bind the function itself; "seq" is not defined here

    Again to the same level of precision, the SAME numbers, 10 and 26 msec
    (actually, in this case the less-than-1% systematic bias is in favour of
    psyco compared to pure-C coding...!-)


    So: your instinct that C-coded loops are faster wasn't too far off...
    and you can get the same performance (just about) with Pyrex or (on an
    Intel or compatible processor, only -- sigh) with psyco.


    Alex
     
    Alex Martelli, Sep 22, 2004
    #12
  13. On Mon, 20 Sep 2004 15:29:18 +0200, (Alex Martelli) wrote:

    >Ling Lee <> wrote:
    >
    >> Oh I just did it.
    >>
    >> Just used the line:
    >>
    >> print "%d lines in your chosen file" % len(open("test.txt").readlines())
    >>
    >> Thanks though :)

    >
    >You're welcome;-). However, this approach reads all of the file into
    >memory at once. If you must be able to deal with humungous files, too
    >big to fit in memory at once, try something like:
    >
    >numlines = 0
    >for line in open('text.txt'): numlines += 1


    I don't have 2.4, but how would that compare with a generator expression like (untested)

    sum(1 for line in open('text.txt'))

    or, if you _are_ willing to read in the whole file,

    open('text.txt').read().count('\n')

    Regards,
    Bengt Richter
     
    Bengt Richter, Sep 22, 2004
    #13
  14. Bengt Richter <> wrote:
    ...
    > >memory at once. If you must be able to deal with humungous files, too
    > >big to fit in memory at once, try something like:
    > >
    > >numlines = 0
    > >for line in open('text.txt'): numlines += 1

    >
    > I don't have 2.4


    2.4a3 is freely available for download and everybody's _encouraged_ to
    download it and try it out -- come on, don't be the last one to!-)

    > but how would that compare with a generator expression like (untested)
    >
    > sum(1 for line in open('text.txt'))
    >
    > or, if you _are_ willing to read in the whole file,
    >
    > open('text.txt').read().count('\n')


    I'm not on the same machine as when I ran the other timing measurements
    (including pyrex &c) but here's the results on this one machine...:

    $ wc /usr/share/dict/words
    234937 234937 2486825 /usr/share/dict/words
    $ python2.4 ~/cb/timeit.py "numlines=0
    for line in file('/usr/share/dict/words'): numlines+=1"
    10 loops, best of 3: 3.08e+05 usec per loop
    $ python2.4 ~/cb/timeit.py "file('/usr/share/dict/words').read().count('\n')"
    10 loops, best of 3: 2.72e+05 usec per loop
    $ python2.4 ~/cb/timeit.py "len(file('/usr/share/dict/words').readlines())"
    10 loops, best of 3: 3.25e+05 usec per loop
    $ python2.4 ~/cb/timeit.py "sum(1 for line in file('/usr/share/dict/words'))"
    10 loops, best of 3: 4.42e+05 usec per loop

    Last but not least...:

    $ python2.4 ~/cb/timeit.py -s'import cou' "cou.cou(file('/usr/share/dict/words'))"
    10 loops, best of 3: 2.05e+05 usec per loop

    where cou.pyx is the pyrex program I've already shown on the other
    subthread. Using the count.c I've also shown takes 2.03e+05 usec.
    (Can't try psyco here, not an intel-like cpu).


    Summary: "sum(1 for ...)" is no speed demon; the plain loop is best
    among the pure-python approaches for files that can't fit in memory. If
    the file DOES fit in memory, read().count('\n') is faster, but
    len(...readlines()) is slower. Pyrex rocks, essentially removing the
    need for C-coded extensions (less than a 1% advantage) -- and so does
    psyco, but not if you're using a Mac (quick, somebody gift Armin Rigo
    with a Mac before it's too late...!!!).
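
    To repeat this kind of comparison on your own machine, something along these lines should work in Python 3. It is only a sketch: the generated file, the function names and the repeat counts are arbitrary choices, and the timings it produces will not match the figures quoted above.

```python
import os
import tempfile
import timeit

# Build a throwaway file; 10,000 short lines is an arbitrary choice.
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as f:
    f.write("some text\n" * 10000)


def plain_loop():
    n = 0
    with open(path) as f:
        for _line in f:
            n += 1
    return n


def genexp_sum():
    with open(path) as f:
        return sum(1 for _line in f)


def read_count():
    with open(path) as f:
        return f.read().count("\n")


def readlines_len():
    with open(path) as f:
        return len(f.readlines())


candidates = [plain_loop, genexp_sum, read_count, readlines_len]
results = {fn.__name__: fn() for fn in candidates}

# Best of 3 runs of 5 calls each, in seconds; numbers vary by machine.
timings = {
    fn.__name__: min(timeit.repeat(fn, number=5, repeat=3))
    for fn in candidates
}
os.remove(path)
```

    Printing `timings` after a run shows the relative order on your hardware; only the line counts in `results` are expected to agree everywhere.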


    Alex
     
    Alex Martelli, Sep 22, 2004
    #14
  15. Andrew Dalke

    Andrew Dalke Guest

    Bengt Richter wrote:
    > or, if you _are_ willing to read in the whole file,
    >
    > open('text.txt').read().count('\n')


    Except the last line might not have a terminal newline.

    Andrew
     
    Andrew Dalke, Sep 22, 2004
    #15
  16. Andrew Dalke

    Andrew Dalke Guest

    Alex Martelli wrote:
    > If one is really in a hurry counting lines, a dedicated C extension
    > might help. E.g.:
    >
    > static PyObject *count(PyObject *self, PyObject *args)

    ...
    > Using this count-items-in-iterable thingy


    There have been a few times I've wanted a function like
    this. I keep expecting that len(iterable) will work,
    but of course it doesn't.

    Would itertools.len(iterable) be useful? More likely
    the name collision with len itself would be a problem,
    so perhaps itertools.length(iterable).


    BTW, I saw itertools.count and figured that might be
    it. Nope. And don't try the following

    >>> import itertools
    >>> itertools.count(5)
    count(5)
    >>> print list(_)


    :)
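
    If you do want to peek at what `itertools.count` produces, `itertools.islice` can cap it at a fixed number of items. A minimal Python 3 sketch:

```python
import itertools

counter = itertools.count(5)  # yields 5, 6, 7, ... without end

# islice() takes a bounded prefix, so this terminates.
first_three = list(itertools.islice(counter, 3))
```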

    Andrew
     
    Andrew Dalke, Sep 22, 2004
    #16
  17. On Wed, 22 Sep 2004 19:48:21 GMT, Andrew Dalke <> wrote:

    >Bengt Richter wrote:
    >> or, if you _are_ willing to read in the whole file,
    >>
    >> open('text.txt').read().count('\n')

    >
    >Except the last line might not have a terminal newline.
    >

    I _knew_ I should have mentioned that ;-)

    Regards,
    Bengt Richter
     
    Bengt Richter, Sep 22, 2004
    #17
  18. Andrew Dalke <> wrote:

    > Alex Martelli wrote:
    > > If one is really in a hurry counting lines, a dedicated C extension
    > > might help. E.g.:
    > >
    > > static PyObject *count(PyObject *self, PyObject *args)

    > ...
    > > Using this count-items-in-iterable thingy

    >
    > There's been a few times I've wanted a function like


    Me too, that's why I wrote the C and Pyrex versions:).

    > this. I keep expecting that len(iterable) will work,
    > but of course it doesn't.


    Yep -- it would probably be too risky to have len(...) consume a whole
    iterator, beginning users wouldn't expect that and might get burnt.


    > Would itertools.len(iterable) be useful? More likely
    > the name collision with len itself would be a problem,
    > so perhaps itertools.length(iterable).


    Unfortunately, itertools's functions are there to produce iterators, not
    to consume them. I doubt Raymond Hettinger, itertools' guru, would
    approve of changing that (though one could surely ask him, and if he
    surprised me, I guess the change might get in).

    There's currently no good single place for 'accumulators', i.e.
    consumers of iterators which produce scalars or thereabouts -- sum, max,
    and min, are built-ins; other useful accumulators can be found in heapq
    (because they're implemented via a heap...)... and there's nowhere to
    put the obviously needed "trivial" accumulators, such as average,
    median, variance, count...

    A "stats" module was proposed, but also shot down (presumably people
    have more ambitious ideas about 'statistics' than these simple
    accumulators, alas -- I'm not sure exactly what the problem was).
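
    As a rough illustration of such accumulators in Python 3: the standard library there includes a `statistics` module with `mean`, `median` and `variance`, while a plain item counter still has to be hand-written. `count_items` below is a hypothetical helper and the sample values are arbitrary.

```python
import statistics


def count_items(iterable):
    """Consume an iterable and return how many items it produced."""
    return sum(1 for _item in iterable)


values = [1, 2, 3, 4]  # arbitrary sample data

n_items = count_items(iter(values))
average = statistics.mean(values)
middle = statistics.median(values)
spread = statistics.variance(values)  # sample variance (n - 1 divisor)
```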


    Alex
     
    Alex Martelli, Sep 22, 2004
    #18
  19. Andrew Dalke <> wrote:

    > Bengt Richter wrote:
    > > or, if you _are_ willing to read in the whole file,
    > >
    > > open('text.txt').read().count('\n')

    >
    > Except the last line might not have a terminal newline.


    ....and wc would then not count that non-line as a line, so why should
    we...? Witness...:

    $ echo -n 'bu'>em
    $ wc em
    0 1 2 em

    zero lines, one word, two characters: seems right to me.


    Alex
     
    Alex Martelli, Sep 22, 2004
    #19
  20. Andrew Dalke

    Andrew Dalke Guest

    Alex Martelli wrote:
    > ....and wc would then not count that non-line as a line, so why should
    > we...? Witness...:



    'Cause that's what Python does. Witness:

    % echo -n 'bu' | python -c \
    ? 'import sys; print len(sys.stdin.readlines())'
    1

    ;)

    Andrew
     
    Andrew Dalke, Sep 22, 2004
    #20