RE: Beginner question: skips every second line in file when using readline()

Discussion in 'Python' started by Pettersen, Bjorn S, Oct 20, 2003.

  1. > From: Hans Nowak [mailto:]
    >
    > peter leonard wrote:
    >
    > > Hi,
    > > I'm having a problem with reading each line from a text file.

    [...]
    > >
    > > The following script attempts to print out each line :
    > >
    > > datafile ="C:\\Classifier\Data\\test.txt"
    > > dataobject = open(datafile,"r")
    > >
    > > while dataobject.readline() !="":
    > >
    > >     line = dataobject.readline()
    > >     print line

    >
    > You're calling readline() twice. Use something like:
    >
    >     line = dataobject.readline()
    >     while <line is not empty>:
    >         ...do stuff...
    >         line = dataobject.readline()
    >
    > or:
    >
    >     for line in dataobject:
    >         if <line is empty>:
    >             break
    >         ...do stuff...
    >
    > I'm writing <line is empty>, because you might want to revise
    > dataobject.readline() != "" as well. If you read from a text
    > file, lines will end in \n, so comparing them to "" will always
    > return false.
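
A minimal sketch of why the original loop skips lines, written for Python 3 and using an in-memory `io.StringIO` as a stand-in for the data file. The sample text and the function names are illustrative assumptions, not from the thread. Each `readline()` call consumes one line, so the call in the `while` condition discards every other line.

```python
import io

SAMPLE = "one\ntwo\nthree\nfour\n"  # hypothetical file contents


def read_skipping(fp):
    """Reproduce the bug: readline() is called twice per iteration."""
    seen = []
    while fp.readline() != "":      # consumes a line and throws it away
        line = fp.readline()        # consumes the *next* line
        seen.append(line)
    return seen


def read_all(fp):
    """Fixed version: exactly one readline() call per iteration."""
    seen = []
    line = fp.readline()
    while line:                     # '' (end of file) tests false
        seen.append(line)
        line = fp.readline()
    return seen


print(read_skipping(io.StringIO(SAMPLE)))  # ['two\n', 'four\n']
print(read_all(io.StringIO(SAMPLE)))       # all four lines
```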

    [..]

    To back up a couple of steps... (it looks like you're coming from a
    C/C++/Java background <wink>).

    In Python, reading from a file (by either read or readline) always
    returns _something_, unless the end of the file is reached. This works
    because readline doesn't throw away the newline at the end of a line.
    I.e. if you're reading an empty line in a text file, readline() returns
    the string '\n' (a one character string). It also gives a convenient way
    of testing for the end of the file: readline() returns the empty string
    only at end of file. If you look at Python code that's a little older
    you'll find this idiom:

        while 1:
            line = fp.readline()
            if not line:
                break
            ..do stuff..

    Note that the empty string tests false (as do all other empty objects
    in Python), while a blank line ('\n') tests true. The above is
    considered much better (re. style as well as flexibility) than:

    if line <> '':

    or the harder to type:

    if line != "":
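
The end-of-file test described above can be sketched as follows, assuming Python 3 and an in-memory `io.StringIO` in place of a real file (the contents are made up for illustration). A blank line comes back as `'\n'`, which is true, so only the empty string returned at end of file stops the loop.

```python
import io

# Hypothetical contents: a blank line in the middle, no trailing newline.
fp = io.StringIO("first\n\nlast")

lines = []
while True:
    line = fp.readline()
    if not line:                    # only '' (end of file) tests false
        break
    lines.append(line)

print(lines)  # ['first\n', '\n', 'last']
```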

    Assuming that your version of Python is more recent, you can now iterate
    over file objects using a for loop without having to deal with end of
    file conditions. A common idiom is:

        for line in file(datafile):
            ..do stuff..

    (file being the preferred way of spelling open.. at least officially ;-)
    This takes advantage of both

    - the default mode for opening files is for reading ('r'),
    so it doesn't need to be specified.
    - the file is automatically closed at the end of the for loop.

    Automatic closing is "implementation defined" behavior -- i.e. it won't
    ever change in CPython (the C implementation), but doesn't work this way
    in Jython (the Java implementation). Some people argue that you should
    always close files explicitly, like you would in Jython (and most other
    programming languages):

        df = file(datafile)
        for line in df:
            ..do something..
        df.close()

    personally I just find that obfuscated <grin>.
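
For readers on later Python versions: the `with` statement (available from Python 2.5 via a `__future__` import, and by default from 2.6) closes the file when the block exits, on every implementation. In Python 3 the `file` builtin no longer exists, so `open` is the only spelling. A minimal sketch, assuming Python 3; the temporary file and its contents are stand-ins for `datafile`.

```python
import os
import tempfile

# Create a small stand-in for the data file (hypothetical contents).
fd, datafile = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as out:
    out.write("alpha\nbeta\n")

collected = []
with open(datafile) as df:          # mode defaults to 'r'
    for line in df:
        collected.append(line.rstrip("\n"))

# The file is closed as soon as the with block exits.
print(collected, df.closed)

os.remove(datafile)
```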

    hth,
    -- bjorn
    Pettersen, Bjorn S, Oct 20, 2003
    #1

  2. Paul Watson

    "Pettersen, Bjorn S" <> wrote in message
    news:...
    [...]
    > Assuming that your version of Python is more recent, you can now iterate
    > over file objects using a for loop without having to deal with end of
    > file conditions. A common idiom is:
    >
    >     for line in file(datafile):
    >         ..do stuff..
    >
    [...]

    Does this cause the entire input file to be read into memory before the for
    loop begins execution?

    This is great for reading 5 lines, but I might need to read 30 million lines
    from a mortgage company file. I cannot read the entire file into memory.
    Paul Watson, Oct 20, 2003
    #2

  3. On Mon, 20 Oct 2003 11:20:22 -0500, "Paul Watson"
    <> wrote:

    >>
    >>"Pettersen, Bjorn S" <> wrote in message
    >>news:...
    >>
    >>Assuming that your version of Python is more recent, you can now iterate
    >>over file objects using a for loop without having to deal with end of
    >>file conditions. A common idiom is:
    >>
    >>     for line in file(datafile):
    >>         ..do stuff..
    >>

    >
    >Does this cause the entire input file to be read into memory before the for
    >loop begins execution?
    >
    >This is great for reading 5 lines, but I might need to read 30 million lines
    >from a mortgage company file. I cannot read the entire file into memory.
    >


    From the library reference:

    xreadlines()
    This method returns the same thing as iter(f). New in version 2.1.
    Deprecated since release 2.3. Use for line in file instead.

    So, this reads the file in line by line, and you can feed your 30
    million lines to it without problems...
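
A rough sketch of that line-by-line behavior, assuming Python 3. The throwaway file, its size, and its contents are made up for illustration. The loop keeps only a running count and a maximum, so no list of all lines is ever built.

```python
import os
import tempfile

# Throwaway file standing in for a large data file (contents are made up).
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as out:
    for i in range(100000):
        out.write("record %d\n" % i)

count = 0
longest = 0
with open(path) as f:
    same_object = iter(f) is f      # a file object is its own iterator
    for line in f:                  # yields one line at a time
        count += 1
        longest = max(longest, len(line))

print(count, longest, same_object)  # 100000 13 True
os.remove(path)
```

Reading all lines at once would instead be spelled `f.readlines()` or `list(f)`, which is the pattern to avoid for very large files.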

    --
    Christopher
    Christopher Koppler, Oct 21, 2003
    #3
