A question about yield

chad

I have an input file named 'freq' which contains the following data

123 0

133 3
146 1
200 0
233 10
400 2


Now I've attempted to write a script that would take a number from the
standard input and then have the program return the number in the input
file that is closest to that input file.

#!/usr/local/bin/python

import sys

def construct_set(data):
    for line in data:
        lines = line.splitlines()
        for curline in lines:
            if curline.strip():
                key = curline.split(' ')
                value = int(key[0])
                yield value

def approximate(first, second):
    midpoint = (first + second) / 2
    return midpoint

def format(input):
    prev = 0
    value = int(input)

    with open("/home/cdalten/oakland/freq") as f:
        for next in construct_set(f):
            if value > prev:
                current = prev
                prev = next

        middle = approximate(current, prev)
        if middle < prev and value > middle:
            return prev
        elif value > current and current < middle:
            return current

if __name__ == "__main__":
    if len(sys.argv) != 2:
        print >> sys.stderr, "You need to enter a number\n"
        sys.exit(1)

    nearest = format(sys.argv[1])
    print "The closest value to", sys.argv[1], "is", nearest


When I run it, I get the following...

[cdalten@localhost oakland]$ ./android4.py 123
The closest value to 123 is 123
[cdalten@localhost oakland]$ ./android4.py 130
The closest value to 130 is 133
[cdalten@localhost oakland]$ ./android4.py 140
The closest value to 140 is 146
[cdalten@localhost oakland]$ ./android4.py 146
The closest value to 146 is 146
[cdalten@localhost oakland]$ ./android4.py 190
The closest value to 190 is 200
[cdalten@localhost oakland]$ ./android4.py 200
The closest value to 200 is 200
[cdalten@localhost oakland]$ ./android4.py 205
The closest value to 205 is 200
[cdalten@localhost oakland]$ ./android4.py 210
The closest value to 210 is 200
[cdalten@localhost oakland]$ ./android4.py 300
The closest value to 300 is 233
[cdalten@localhost oakland]$ ./android4.py 500
The closest value to 500 is 400
[cdalten@localhost oakland]$ ./android4.py 1000000
The closest value to 1000000 is 400
[cdalten@localhost oakland]$

The question is about the construct_set() function.

def construct_set(data):
    for line in data:
        lines = line.splitlines()
        for curline in lines:
            if curline.strip():
                key = curline.split(' ')
                value = int(key[0])
                yield value

I have it yield on 'value' instead of 'curline'. Will the program
still read the input file named freq line by line even though I don't
have it yielding on 'curline'? Or since I have it yield on 'value',
will it read the entire input file into memory at once?

Chad
 
chad

> I have an input file named 'freq' which contains the following data
>
> 123 0
>
> 133 3
> 146 1
> 200 0
> 233 10
> 400 2
>
> Now I've attempted to write a script that would take a number from the
> standard input and then
> have the program return the number in the input file that is closest
> to that input file.

*and then have the program return the number in the input file that is
closest to the number the user inputs (or enters).*
 
Chris Rebert

> The question is about the construct_set() function.
> I have it yield on 'value' instead of 'curline'. Will the program
> still read the input file named freq line by line even though I don't
> have it yielding on 'curline'? Or since I have it yield on 'value',
> will it read the entire input file into memory at once?

The former. The yield has no effect at all on how the file is read.
The "for line in data:" iteration over the file object is what makes
Python read from the file line-by-line. Incidentally, the use of
splitlines() is pointless; you're already getting single lines from
the file object by iterating over it, so splitlines() will always
return a single-element list.
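
To illustrate the point about splitlines(), here is a minimal sketch of
construct_set() without it. It is not the original poster's code: it uses
io.StringIO as a stand-in for the open file (an assumption, made only so
the example is self-contained), it is written to run on Python 3, and the
names SAMPLE, single and values are hypothetical.

```python
import io

# Stand-in for the 'freq' file from the original post, blank line included.
SAMPLE = u"123 0\n\n133 3\n146 1\n200 0\n233 10\n400 2\n"

def construct_set(data):
    # Iterating over a file-like object already hands back one line at a
    # time, so there is nothing left for splitlines() to split.
    for line in data:
        if line.strip():
            # split() with no argument splits on any run of whitespace.
            yield int(line.split()[0])

# A line taken from the file yields a one-element list under splitlines().
single = u"133 3\n".splitlines()

values = list(construct_set(io.StringIO(SAMPLE)))
```

The list() call is only there to collect the values for inspection; a
caller that loops over construct_set() directly still gets them one at a
time.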

Cheers,
Chris
 
chad

<snip>


>> The question is about the construct_set() function.
>> I have it yield on 'value' instead of 'curline'. Will the program
>> still read the input file named freq line by line even though I don't
>> have it yielding on 'curline'? Or since I have it yield on 'value',
>> will it read the entire input file into memory at once?
>
> The former. The yield has no effect at all on how the file is read.
> The "for line in data:" iteration over the file object is what makes
> Python read from the file line-by-line. Incidentally, the use of
> splitlines() is pointless; you're already getting single lines from
> the file object by iterating over it, so splitlines() will always
> return a single-element list.

But what happens if the input file is say 250MB? Will all 250MB be
loaded into memory at once? Just curious, because I thought maybe
using something like 'yield curline' would prevent this scenario.
 
Chris Rebert

> But what happens if the input file is say 250MB? Will all 250MB be
> loaded into memory at once?

No. As I said, the file will be read from 1 line at a time, on an
as-needed basis; which is to say, "line-by-line".

> Just curious, because I thought maybe
> using something like 'yield curline' would prevent this scenario.

Using "for line in data:" is what prevents that scenario.
The "yield" is only relevant to how the file is read insofar as the
alternative to yield-ing would be to return a list, which would
necessitate going through the entire file in one continuous go and then
returning a very large list; but even then, the file's content would
still be read line-by-line, not all at once as one humongous
string.
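
The difference between yield-ing and returning a list can be sketched
roughly as follows. This is an illustration under assumptions, not code
from the thread: CountingLines, values_lazy and values_eager are
hypothetical names, io.StringIO stands in for the real file, and the
example targets Python 3.

```python
import io

SAMPLE = u"123 0\n133 3\n146 1\n200 0\n233 10\n400 2\n"

class CountingLines(object):
    """Hypothetical wrapper that counts how many lines were handed out."""
    def __init__(self, stream):
        self.stream = stream
        self.lines_read = 0

    def __iter__(self):
        for line in self.stream:
            self.lines_read += 1
            yield line

def values_lazy(data):
    # Generator version: produces one value each time the caller asks.
    for line in data:
        if line.strip():
            yield int(line.split()[0])

def values_eager(data):
    # List version: has to walk the whole input before it can return.
    result = []
    for line in data:
        if line.strip():
            result.append(int(line.split()[0]))
    return result

lazy_source = CountingLines(io.StringIO(SAMPLE))
first_lazy = next(values_lazy(lazy_source))
lines_after_first_lazy = lazy_source.lines_read

eager_source = CountingLines(io.StringIO(SAMPLE))
first_eager = values_eager(eager_source)[0]
lines_after_first_eager = eager_source.lines_read
```

Both versions pull lines from the source one at a time; the only
difference is that the list version has consumed every line by the time
the caller sees the first value.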

Cheers,
Chris
 
Simon Brunning

> No. As I said, the file will be read from 1 line at a time, on an
> as-needed basis; which is to say, "line-by-line".

IIRC, it's somewhere in between. Python will read the file in blocks.
It only *looks* like it's reading the file line by line.
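
One rough way to see the block reads is to check how far the underlying
file position has moved after asking for a single line. The sketch below
assumes CPython 3, where a text file exposes its unbuffered layer as
f.buffer.raw; the script in this thread is Python 2, whose file objects
buffer as well but do not expose these attributes. The temporary file and
the names first_line, bytes_pulled and total_size are made up for the
example.

```python
import os
import tempfile

# Write a throwaway file that is much larger than a single line.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    for i in range(20000):
        tmp.write("%d 0\n" % i)
    path = tmp.name

try:
    total_size = os.path.getsize(path)
    with open(path) as f:
        first_line = next(f)  # ask for just one line
        # f.buffer.raw is the unbuffered file underneath the text layer;
        # its position shows how many bytes were actually requested from
        # the operating system so far.
        bytes_pulled = f.buffer.raw.tell()
finally:
    os.remove(path)
```

Here bytes_pulled comes out larger than the length of first_line but no
larger than total_size: a block was read ahead, yet nowhere near the whole
file, which is consistent with both replies above. The exact block size
depends on the platform and Python build.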
 
