A question about yield

chad · Nov 7, 2010

I have an input file named 'freq' which contains the following data

123 0

133 3
146 1
200 0
233 10
400 2

Now I've attempted to write a script that would take a number from the
standard input and then
have the program return the number in the input file that is closest
to that input file.

#!/usr/local/bin/python

import sys

def construct_set(data):
    for line in data:
        lines = line.splitlines()
        for curline in lines:
            if curline.strip():
                key = curline.split(' ')
                value = int(key[0])
                yield value

def approximate(first, second):
    midpoint = (first + second) / 2
    return midpoint

def format(input):
    prev = 0
    value = int(input)

    with open("/home/cdalten/oakland/freq") as f:
        for next in construct_set(f):
            if value > prev:
                current = prev
                prev = next

        middle = approximate(current, prev)
        if middle < prev and value > middle:
            return prev
        elif value > current and current < middle:
            return current

if __name__ == "__main__":
    if len(sys.argv) != 2:
        print >> sys.stderr, "You need to enter a number\n"
        sys.exit(1)

    nearest = format(sys.argv[1])
    print "The closest value to", sys.argv[1], "is", nearest

When I run it, I get the following...

[cdalten@localhost oakland]$ ./android4.py 123
The closest value to 123 is 123
[cdalten@localhost oakland]$ ./android4.py 130
The closest value to 130 is 133
[cdalten@localhost oakland]$ ./android4.py 140
The closest value to 140 is 146
[cdalten@localhost oakland]$ ./android4.py 146
The closest value to 146 is 146
[cdalten@localhost oakland]$ ./android4.py 190
The closest value to 190 is 200
[cdalten@localhost oakland]$ ./android4.py 200
The closest value to 200 is 200
[cdalten@localhost oakland]$ ./android4.py 205
The closest value to 205 is 200
[cdalten@localhost oakland]$ ./android4.py 210
The closest value to 210 is 200
[cdalten@localhost oakland]$ ./android4.py 300
The closest value to 300 is 233
[cdalten@localhost oakland]$ ./android4.py 500
The closest value to 500 is 400
[cdalten@localhost oakland]$ ./android4.py 1000000
The closest value to 1000000 is 400
[cdalten@localhost oakland]$

The question is about the construct_set() function.

def construct_set(data):
    for line in data:
        lines = line.splitlines()
        for curline in lines:
            if curline.strip():
                key = curline.split(' ')
                value = int(key[0])
                yield value

I have it yield on 'value' instead of 'curline'. Will the program
still read the input file named freq line by line even though I don't
have it yielding on 'curline'? Or since I have it yield on 'value',
will it read the entire input file into memory at once?

Chad

chad · Nov 7, 2010

I have an input file named 'freq' which contains the following data

123 0

133 3
146 1
200 0
233 10
400 2

Now I've attempted to write a script that would take a number from the
standard input and then
have the program return the number in the input file that is closest
to that input file.

*and then have the program return the number in the input file that is
closest to the number the user inputs (or enters).*
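
For reference, the stated goal (pick the value closest to the number the user enters) can be sketched with min() and a distance key. This is only an illustration under assumptions, not the script from the post: the helper name nearest is hypothetical, and the keys are typed in by hand from the first column of the 'freq' data rather than read from the file.

```python
def nearest(target, keys):
    """Return the element of keys with the smallest absolute distance to target."""
    # min() keeps the first of several equally close candidates, so with
    # ascending keys a tie resolves to the lower value.
    return min(keys, key=lambda k: abs(k - target))

# First-column values from the 'freq' data shown in the post.
keys = [123, 133, 146, 200, 233, 400]

for target in (130, 140, 300, 1000000):
    print("%d -> %d" % (target, nearest(target, keys)))
```

For these four inputs the sketch picks 133, 146, 233 and 400, the same values the script reports in its sample runs. Note that min() raises ValueError when keys is empty.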

Chris Rebert · Nov 7, 2010

chad said:
#!/usr/local/bin/python

import sys

def construct_set(data):
    for line in data:
        lines = line.splitlines()
        for curline in lines:
            if curline.strip():
                key = curline.split(' ')
                value = int(key[0])
                yield value

def approximate(first, second):
    midpoint = (first + second) / 2
    return midpoint

def format(input):
    prev = 0
    value = int(input)

    with open("/home/cdalten/oakland/freq") as f:
        for next in construct_set(f):
            if value > prev:
                current = prev
                prev = next

        middle = approximate(current, prev)
        if middle < prev and value > middle:
            return prev
        elif value > current and current < middle:
            return current

The question is about the construct_set() function.

I have it yield on 'value' instead of 'curline'. Will the program
still read the input file named freq line by line even though I don't
have it yielding on 'curline'? Or since I have it yield on 'value',
will it read the entire input file into memory at once?

The former. The yield has no effect at all on how the file is read.
The "for line in data:" iteration over the file object is what makes
Python read from the file line-by-line. Incidentally, the use of
splitlines() is pointless; you're already getting single lines from
the file object by iterating over it, so splitlines() will always
return a single-element list.

Cheers,
Chris
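
Chris's two points (iterating over the file object already yields one line at a time, and splitlines() adds nothing) can be sketched as below. This is a simplified variant under assumptions, not the poster's code: io.StringIO stands in for the real file, and split() with no argument replaces split(' ') so blank lines simply produce an empty list.

```python
import io

def construct_set(data):
    """Yield the integer in the first column of each non-blank line."""
    for line in data:           # the file-like object hands over one line per iteration
        fields = line.split()   # split on any whitespace; [] for a blank line
        if fields:
            yield int(fields[0])

# io.StringIO stands in for the open 'freq' file (an assumption for this sketch).
sample = io.StringIO(u"123 0\n\n133 3\n")

for line in sample:
    # Each item is already a single line, so splitlines() gives
    # a one-element list for this data.
    assert len(line.splitlines()) == 1

sample.seek(0)
print(list(construct_set(sample)))
```

What the generator yields (the parsed value or the raw line) is decided after the line has already been fetched, which is why it has no bearing on how the file is read.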

chad · Nov 7, 2010

<snip>

#!/usr/local/bin/python

import sys

def construct_set(data):
    for line in data:
        lines = line.splitlines()
        for curline in lines:
            if curline.strip():
                key = curline.split(' ')
                value = int(key[0])
                yield value

def approximate(first, second):
    midpoint = (first + second) / 2
    return midpoint

def format(input):
    prev = 0
    value = int(input)

    with open("/home/cdalten/oakland/freq") as f:
        for next in construct_set(f):
            if value > prev:
                current = prev
                prev = next

        middle = approximate(current, prev)
        if middle < prev and value > middle:
            return prev
        elif value > current and current < middle:
            return current


The question is about the construct_set() function.

I have it yield on 'value' instead of 'curline'. Will the program
still read the input file named freq line by line even though I don't
have it yielding on 'curline'? Or since I have it yield on 'value',
will it read the entire input file into memory at once?

The former. The yield has no effect at all on how the file is read.
The "for line in data:" iteration over the file object is what makes
Python read from the file line-by-line. Incidentally, the use of
splitlines() is pointless; you're already getting single lines from
the file object by iterating over it, so splitlines() will always
return a single-element list.

But what happens if the input file is say 250MB? Will all 250MB be
loaded into memory at once? Just curious, because I thought maybe
using something like 'yield curline' would prevent this scenario.

Chris Rebert · Nov 7, 2010

chad said:

#!/usr/local/bin/python

import sys

def construct_set(data):
    for line in data:
        lines = line.splitlines()
        for curline in lines:
            if curline.strip():
                key = curline.split(' ')
                value = int(key[0])
                yield value

def approximate(first, second):
    midpoint = (first + second) / 2
    return midpoint

def format(input):
    prev = 0
    value = int(input)

    with open("/home/cdalten/oakland/freq") as f:
        for next in construct_set(f):
            if value > prev:
                current = prev
                prev = next

        middle = approximate(current, prev)
        if middle < prev and value > middle:
            return prev
        elif value > current and current < middle:
            return current


The question is about the construct_set() function.

I have it yield on 'value' instead of 'curline'. Will the program
still read the input file named freq line by line even though I don't
have it yielding on 'curline'? Or since I have it yield on 'value',
will it read the entire input file into memory at once?

The former. The yield has no effect at all on how the file is read.
The "for line in data:" iteration over the file object is what makes
Python read from the file line-by-line. Incidentally, the use of
splitlines() is pointless; you're already getting single lines from
the file object by iterating over it, so splitlines() will always
return a single-element list.

But what happens if the input file is say 250MB? Will all 250MB be
loaded into memory at once?

No. As I said, the file will be read from 1 line at a time, on an
as-needed basis; which is to say, "line-by-line".

Just curious, because I thought maybe
using something like 'yield curline' would prevent this scenario.

Using "for line in data:" is what prevents that scenario.
The "yield" is only relevant to how the file is read insofar as the
alternative to yield-ing would be to return a list, which would
necessitate going through the entire file in one continuous go and then
returning a very large list; but even then, the file's content would
still be read line-by-line, not all at once as one humongous
string.

Cheers,
Chris
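
The difference Chris describes between yielding and returning a list can be made visible by counting how many lines have been pulled from the source. The sketch below is illustrative only: CountingLines, keys_lazy and keys_eager are hypothetical names, and io.StringIO again stands in for the file.

```python
import io

class CountingLines(object):
    """Hypothetical wrapper counting the lines pulled from a file-like object."""
    def __init__(self, source):
        self.source = source
        self.lines_read = 0

    def __iter__(self):
        for line in self.source:
            self.lines_read += 1
            yield line

def keys_lazy(data):
    """Generator version: consumes input only as values are requested."""
    for line in data:
        fields = line.split()
        if fields:
            yield int(fields[0])

def keys_eager(data):
    """List version: walks the whole input before returning anything."""
    result = []
    for line in data:
        fields = line.split()
        if fields:
            result.append(int(fields[0]))
    return result

text = u"123 0\n133 3\n146 1\n200 0\n233 10\n400 2\n"

lazy_source = CountingLines(io.StringIO(text))
first = next(keys_lazy(lazy_source))   # request a single value
lazy_count = lazy_source.lines_read    # 1: only the first line was consumed

eager_source = CountingLines(io.StringIO(text))
values = keys_eager(eager_source)      # every line is consumed up front
eager_count = eager_source.lines_read  # 6

print("lazy: %d line(s) read; eager: %d line(s) read" % (lazy_count, eager_count))
```

In both versions the input is still consumed one line at a time; the list version just cannot return until it has seen all of them, and it holds every parsed value in memory at once.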

Simon Brunning · Nov 8, 2010

No. As I said, the file will be read from 1 line at a time, on an
as-needed basis; which is to say, "line-by-line".

IIRC, it's somewhere in between. Python will read the file in blocks.
It only *looks* like it's reading the file line by line.
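
Simon's point about block reads can be observed with the io module by putting a counting raw stream underneath a buffered text reader. This is a sketch under assumptions: CountingRaw is a hypothetical in-memory class, the 4096-byte buffer size is arbitrary, and the exact size and number of low-level reads depend on the Python version. The script in this thread uses Python 2's built-in file object, which buffers through a different mechanism, so this shows the general idea rather than that exact code path.

```python
import io

class CountingRaw(io.RawIOBase):
    """Hypothetical in-memory raw stream recording the size of each low-level read."""
    def __init__(self, payload):
        super(CountingRaw, self).__init__()
        self._buf = io.BytesIO(payload)
        self.read_sizes = []

    def readable(self):
        return True

    def readinto(self, b):
        # Fill the caller's buffer with as many bytes as it asks for.
        chunk = self._buf.read(len(b))
        self.read_sizes.append(len(chunk))
        b[:len(chunk)] = chunk
        return len(chunk)

# 10000 short lines shaped like the 'freq' data.
payload = b"".join(("%d 0\n" % n).encode("ascii") for n in range(10000))

raw = CountingRaw(payload)
reader = io.TextIOWrapper(io.BufferedReader(raw, buffer_size=4096), encoding="ascii")

# Consume the stream line by line, exactly as "for line in f:" would.
line_count = sum(1 for _ in reader)

print("%d lines delivered by %d low-level reads" % (line_count, len(raw.read_sizes)))
```

The loop receives one line per iteration, but the underlying stream is asked for data far fewer times than there are lines, each time in a multi-kilobyte block. Memory use stays bounded by the buffer size plus the current line, not by the size of the file.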
