read lines

Horacius ReX · Dec 4, 2007

Hi, I have a text file like this;

1 -33.453579
2 -148.487125
3 -195.067172
4 -115.958374
5 -100.597841
6 -121.566441
7 -121.025381
8 -132.103507
9 -108.939327
10 -97.046703
11 -52.866534
12 -48.432623
13 -112.790419
14 -98.516975
15 -98.724436

So I want to write a program in python that reads each line and
detects which numbers of the second column are the maximum and the
minimum.

I tried with;

import os, sys,re,string

# first parameter is the name of the data file
name1 = sys.argv[1]
infile1 = open(name1,"r")

# 1. get minimum and maximum

minimum=0
maximum=0

print " minimum = ",minimum
print " maximum = ",maximum

while 1:
    line = infile1.readline()
    ll = re.split("\s+",string.strip(line))
    print ll[0],ll[1]
    a=ll[0]
    b=ll[1]
    print a,b
    if(b<minimum):
        minimum=b
        print " minimum= ",minimum
    if(b>maximum):
        maximum=b
        print " maximum= ",maximum

print minimum, maximum

But it does not work and I get errors like;

Traceback (most recent call last):
  File "translate_to_intervals.py", line 20, in <module>
    print ll[0],ll[1]
IndexError: list index out of range

Could anybody help me ?

Thanks

Chris · Dec 4, 2007

You're not guaranteed to get 2, or even 1, elements after
splitting. If the line is empty or holds only a space, you need to
handle it. Also, is there really a need for a regex for a simple
string split?

import sys

infile = open(sys.argv[1], 'r')
min, max = 0, 0

for each_line in infile.readlines():
    if each_line.strip():
        tmp = each_line.strip().split()
        try:
            b = tmp[1]
        except IndexError:
            continue
        if b < min: min = b
        if b > max: max = b
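
For readers on Python 3, the idea of skipping lines that do not split into two fields might be sketched as below. The name `second_column` and the choice to skip malformed lines silently are assumptions made for illustration, not something from the post above. The sketch also converts the field to a float, because strings would otherwise be compared character by character.

```python
def second_column(lines):
    """Yield the second whitespace-separated field of each line as a float.

    Lines with fewer than two fields (blank lines, a lone space) or with a
    non-numeric second field are skipped.
    """
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue
        try:
            yield float(fields[1])
        except ValueError:
            # second field is present but is not a number
            continue


# A few in-memory lines stand in for the data file.
sample = ["1 -33.453579\n", "\n", " \n", "2 -148.487125\n", "3\n"]
values = list(second_column(sample))
print(min(values), max(values))
```

Any iterable of lines works here, so an open file object could be passed in place of `sample`.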

Zepo Len · Dec 4, 2007

Your regex is not working correctly I guess, I don't even know why you are
using a regex, something like this would work just fine:

import sys
nums = [float(line.split(' -')[1]) for line in open(sys.argv[1])]
print 'min=', min(nums), 'max=', max(nums)

Neil Cerutti · Dec 4, 2007

Hi, I have a text file like this;

1 -33.453579
2 -148.487125
3 -195.067172
4 -115.958374
5 -100.597841
6 -121.566441
7 -121.025381
8 -132.103507
9 -108.939327
10 -97.046703
11 -52.866534
12 -48.432623
13 -112.790419
14 -98.516975
15 -98.724436

So I want to write a program in python that reads each line and
detects which numbers of the second column are the maximum and
the minimum.

Check out 3.6.1 String Methods in the Python Library Reference.
It contains what you need.

Also, read about max and min from 2.1 Built-in Functions.

I tried with;

import os, sys,re,string

The string module is best avoided, except for a few character
classes, e.g., Paladins and Clerics. ;-) Use str methods instead.

It's more readable to import one module per line.

# first parameter is the name of the data file
name1 = sys.argv[1]
infile1 = open(name1,"r")

# 1. get minimum and maximum

minimum=0
maximum=0

print " minimum = ",minimum
print " maximum = ",maximum

while 1:
    line = infile1.readline()

This isn't the best way to read files in Python. Check out 7.2
Reading and Writing Files in the Python Tutorial.

    ll = re.split("\s+",string.strip(line))
    print ll[0],ll[1]
    a=ll[0]
    b=ll[1]

Don't mix tabs and spaces. Python's Style Guide generally
recommends four spaces per indent.

    print a,b
    if(b<minimum):

readline returns str objects. You'll need to convert them to
numbers manually before comparing.

        minimum=b
        print " minimum= ",minimum
    if(b>maximum):
        maximum=b
        print " maximum= ",maximum

print minimum, maximum

But it does not work and I get errors like;

Traceback (most recent call last):
  File "translate_to_intervals.py", line 20, in <module>
    print ll[0],ll[1]
IndexError: list index out of range

This is caused by line becoming an empty string when readline
encounters end of the file.

Could anybody help me ?

The following will not work in Python 2.4 or earlier.

from __future__ import with_statement
import sys
from operator import itemgetter
from contextlib import closing

with closing(file(sys.argv[1])) as fp:
    table = [(int(i), float(n)) for i, n in (line.split() for line in fp)]
print table
print "maximum =", max(table, key=itemgetter(1))
print "minimum =", min(table, key=itemgetter(1))
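
On Python 3, where `file()` and the print statement no longer exist, roughly the same approach could be sketched as follows. The helper name `load_table` is an assumption, blank lines are skipped as an extra precaution, and an in-memory `io.StringIO` stands in for the data file so the example is self-contained.

```python
import io
from operator import itemgetter


def load_table(fp):
    """Parse 'index value' lines into (int, float) tuples, skipping blank lines."""
    rows = (line.split() for line in fp if line.strip())
    return [(int(i), float(n)) for i, n in rows]


# Stand-in for open(sys.argv[1]); a real file object iterates the same way.
data = io.StringIO("1 -33.453579\n2 -148.487125\n3 -195.067172\n")
table = load_table(data)

# key=itemgetter(1) compares rows by their second element only.
print("maximum =", max(table, key=itemgetter(1)))
print("minimum =", min(table, key=itemgetter(1)))
```

Keeping the whole row means the result reports which line number holds the extreme value, not just the value itself.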

Piet van Oostrum · Dec 4, 2007

Horacius ReX said:
HR> while 1:
HR>     line = infile1.readline()

You have an infinite loop. Fortunately your program stops because of the
error. When you encounter end of file, line becomes the empty string and
the split gives you only 1 item instead of 2.

So add the following:
    if not line: break

Also, 0 is a poor initial value for minimum and maximum: every number in
your file is negative, so the maximum would never move off 0.
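
Put together, the two points above might look like this in Python 3. This is only a sketch: the function name `min_max_readline` is made up for illustration, `io.StringIO` stands in for the opened file, and starting from `None` is just one reasonable alternative to 0.

```python
import io


def min_max_readline(infile):
    """Track min and max of the second column using readline() and an EOF check."""
    minimum = maximum = None  # nothing seen yet, so 0 cannot win by accident
    while True:
        line = infile.readline()
        if not line:  # readline() returns '' only at end of file
            break
        fields = line.split()
        if len(fields) < 2:  # blank or incomplete line
            continue
        b = float(fields[1])  # compare numbers, not strings
        if minimum is None or b < minimum:
            minimum = b
        if maximum is None or b > maximum:
            maximum = b
    return minimum, maximum


result = min_max_readline(
    io.StringIO("1 -33.453579\n2 -148.487125\n\n3 -195.067172\n")
)
print(result)
```

A blank line in the middle of the file is read as `"\n"`, which is not empty, so only the real end of file ends the loop.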

Zepo Len · Dec 4, 2007

Your regex is not working correctly I guess, I don't even know why you
are using a regex, something like this would work just fine:

import sys
nums = [float(line.split(' -')[1]) for line in open(sys.argv[1])]
print 'min=', min(nums), 'max=', max(nums)

Sorry, that should be line.split() - didn't realise those were negative
numbers.
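
To illustrate the correction for Python 3 readers, here is a small sketch. The function name `read_values` and the in-memory sample lines are assumptions made so the example runs without a data file. It also shows what goes wrong with `split(' -')`: the minus sign is consumed as part of the separator.

```python
def read_values(lines):
    """Return the second column as floats, ignoring blank lines."""
    # split() with no argument splits on any run of whitespace
    return [float(line.split()[1]) for line in lines if line.strip()]


sample = ["1 -33.453579", "2 -148.487125", "10 -97.046703"]
nums = read_values(sample)
print('min=', min(nums), 'max=', max(nums))

# For comparison: splitting on ' -' drops the sign, so the value comes
# back positive (and a line with a positive number would not split at all).
lost_sign = float("1 -33.453579".split(' -')[1])
print(lost_sign)
```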

Bruno Desthuilliers · Dec 4, 2007

Chris wrote:

(snip)

You're not guaranteed to have that 2 or even 1 element after
splitting. If the line is empty or has 1 space you need to handle
it. Also is there really a need for regex for a simple string split ?

import sys

infile = open(sys.argv[1], 'r')
min, max = 0, 0

# shadowing the builtin min and max functions may not be such
# a good idea !-)
# Also, you may want to use a sentinel value here instead:
mini, maxi = None, None

for each_line in infile.readlines():

# You don't need to read the whole file in memory
# the file object knows how to iterate over lines.
# Also, you may want to track line numbers so you can
# warn about an incorrect line, cf below

for linenum, line in enumerate(infile):

    if each_line.strip():

    # you're uselessly calling line.strip two times...
    line = line.strip()
    if line:

        tmp = each_line.strip().split()

        tmp = line.split()

        try:
            b = tmp[1]

        # Notice that here, b is a string, not a number...
        try:
            b = int(tmp[1])

        except (IndexError, ValueError), e:

            # you may want to warn about incorrect/unexpected format here
            # (writing to sys.stderr, since stdout is for normal outputs)
            print >> sys.stderr, \
                "incorrect line format line %s ('%s') : %s" \
                % (linenum, line, e)

            continue

        if b < min: min = b
        if b > max: max = b

        # If the first test succeeds, doing the second is useless.
        # also, take into account the sentinel value. The identity test
        # against None should not be too costly. If it was, it's simple to
        # optimize it out of the for loop.
        # The first value seen has to initialise both mini and maxi.

        if mini is None:
            mini = maxi = b
        elif b < mini:
            mini = b
        elif b > maxi:
            maxi = b

# closing the file might be a good idea too, at least for any
# serious app
infile.close()

Now there are also these two builtin functions min and max, and the
itertools tee() function...

import sys
from itertools import tee

def extract_numbers(iterable):
    for linenum, line in enumerate(iterable):
        try:
            yield int(line.strip().split()[1])
        except (IndexError, ValueError), e:
            print >> sys.stderr, e
            continue

# please add proper error handling around here
infile = open(sys.argv[1])
lines1, lines2 = tee(infile)
print min(extract_numbers(lines1)), max(extract_numbers(lines2))
infile.close()

HTH

Bruno Desthuilliers · Dec 4, 2007

Bruno Desthuilliers wrote:
(snip)

        # Notice that here, b is a string, not a number...
        try:
            b = int(tmp[1])

oops, I meant:
            b = float(tmp[1])

Same here:

def extract_numbers(iterable):
    for linenum, line in enumerate(iterable):
        try:
            yield int(line.strip().split()[1])

            yield float(line.strip().split()[1])
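
A quick Python 3 check of why the correction matters: `int()` rejects a string with a decimal point, and it does so with `ValueError`, so that is the exception to catch alongside `IndexError`. The helper name `parse_second_field` is hypothetical and only serves the illustration.

```python
def parse_second_field(line):
    """Return the second field as a float, or None if it is missing or not numeric."""
    try:
        return float(line.split()[1])
    except (IndexError, ValueError):
        # IndexError: fewer than two fields; ValueError: not a number
        return None


# int() cannot parse a decimal string; float() can.
try:
    int("-33.453579")
    int_failed = False
except ValueError:
    int_failed = True

print(int_failed, parse_second_field("1 -33.453579"))
```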

Peter Otten · Dec 4, 2007

Bruno said:
# You don't need to read the whole file in memory

lines1, lines2 = tee(infile)
print min(extract_numbers(lines1)), max(extract_numbers(lines2))

tee() internally maintains a list of items that were seen by
one but not all of the iterators returned. Therefore after calling min()
and before calling max() you have a list of one float per line in memory
which is quite close conceptually to reading the whole file in memory.

If you want to use memory efficiently, stick with the for-loop.
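
As an illustration of the for-loop alternative, here is a minimal Python 3 sketch that finds both extremes in one pass and holds only two values at a time. The function name `min_max` is an assumption, and raising `ValueError` on empty input simply mirrors what the built-in `min()` does.

```python
def min_max(values):
    """Return (smallest, largest) of an iterable in a single pass."""
    iterator = iter(values)
    try:
        lo = hi = next(iterator)  # the first value initialises both extremes
    except StopIteration:
        raise ValueError("min_max() arg is an empty iterable") from None
    for v in iterator:
        if v < lo:
            lo = v
        elif v > hi:
            hi = v
    return lo, hi


# A generator expression feeds values one at a time; nothing is buffered.
lines = ["1 -33.453579", "2 -148.487125", "3 -195.067172"]
numbers = (float(line.split()[1]) for line in lines)
print(min_max(numbers))
```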

Peter

Bruno Desthuilliers · Dec 4, 2007

Indeed - I should have specified that the second version was not
necessarily better with respect to performance or resource usage. Thanks
for making this point clear.
