A question about yield

Discussion in 'Python' started by chad, Nov 7, 2010.

  1. chad

    chad Guest

    I have an input file named 'freq' which contains the following data

    123 0

    133 3
    146 1
    200 0
    233 10
    400 2


    Now I've attempted to write a script that would take a number from the
    standard input and then
    have the program return the number in the input file that is closest
    to that input file.

    #!/usr/local/bin/python

    import sys

    def construct_set(data):
        for line in data:
            lines = line.splitlines()
            for curline in lines:
                if curline.strip():
                    key = curline.split(' ')
                    value = int(key[0])
                    yield value

    def approximate(first, second):
        midpoint = (first + second) / 2
        return midpoint

    def format(input):
        prev = 0
        value = int(input)

        with open("/home/cdalten/oakland/freq") as f:
            for next in construct_set(f):
                if value > prev:
                    current = prev
                    prev = next

            middle = approximate(current, prev)
            if middle < prev and value > middle:
                return prev
            elif value > current and current < middle:
                return current

    if __name__ == "__main__":
        if len(sys.argv) != 2:
            print >> sys.stderr, "You need to enter a number\n"
            sys.exit(1)

        nearest = format(sys.argv[1])
        print "The closest value to", sys.argv[1], "is", nearest


    When I run it, I get the following...

    [cdalten@localhost oakland]$ ./android4.py 123
    The closest value to 123 is 123
    [cdalten@localhost oakland]$ ./android4.py 130
    The closest value to 130 is 133
    [cdalten@localhost oakland]$ ./android4.py 140
    The closest value to 140 is 146
    [cdalten@localhost oakland]$ ./android4.py 146
    The closest value to 146 is 146
    [cdalten@localhost oakland]$ ./android4.py 190
    The closest value to 190 is 200
    [cdalten@localhost oakland]$ ./android4.py 200
    The closest value to 200 is 200
    [cdalten@localhost oakland]$ ./android4.py 205
    The closest value to 205 is 200
    [cdalten@localhost oakland]$ ./android4.py 210
    The closest value to 210 is 200
    [cdalten@localhost oakland]$ ./android4.py 300
    The closest value to 300 is 233
    [cdalten@localhost oakland]$ ./android4.py 500
    The closest value to 500 is 400
    [cdalten@localhost oakland]$ ./android4.py 1000000
    The closest value to 1000000 is 400
    [cdalten@localhost oakland]$

    The question is about the construct_set() function.

    def construct_set(data):
        for line in data:
            lines = line.splitlines()
            for curline in lines:
                if curline.strip():
                    key = curline.split(' ')
                    value = int(key[0])
                    yield value

    I have it yield on 'value' instead of 'curline'. Will the program
    still read the input file named freq line by line even though I don't
    have it yielding on 'curline'? Or since I have it yield on 'value',
    will it read the entire input file into memory at once?

    Chad
     
    chad, Nov 7, 2010
    #1

  2. chad

    chad Guest

    On Nov 7, 9:34 am, chad <> wrote:
    > I have an input file named 'freq' which contains the following data
    >
    > 123 0
    >
    > 133 3
    > 146 1
    > 200 0
    > 233 10
    > 400 2
    >
    > Now I've attempted to write a script that would take a number from the
    > standard input and then
    > have the program return the number in the input file that is closest
    > to that input file.


    *and then have the program return the number in the input file that is
    closest to the number the user inputs (or enters).*
     
    chad, Nov 7, 2010
    #2

  3. Chris Rebert

    Chris Rebert Guest

    On Sun, Nov 7, 2010 at 9:34 AM, chad <> wrote:
    <snip>
    > #!/usr/local/bin/python
    >
    > import sys
    >
    > def construct_set(data):
    >    for line in data:
    >        lines = line.splitlines()
    >        for curline in lines:
    >            if curline.strip():
    >                key = curline.split(' ')
    >                value = int(key[0])
    >                yield value
    >
    > def approximate(first, second):
    >    midpoint = (first + second) / 2
    >    return midpoint
    >
    > def format(input):
    >    prev = 0
    >    value = int(input)
    >
    >    with open("/home/cdalten/oakland/freq") as f:
    >        for next in construct_set(f):
    >            if value > prev:
    >                current = prev
    >                prev = next
    >
    >        middle = approximate(current, prev)
    >        if middle < prev and value > middle:
    >            return prev
    >        elif value > current and current < middle:
    >            return current

    <snip>
    > The question is about the construct_set() function.

    <snip>
    > I have it yield on 'value' instead of 'curline'. Will the program
    > still read the input file named freq line by line even though I don't
    > have it yielding on 'curline'? Or since I have it yield on 'value',
    > will it read the entire input file into memory at once?


    The former. The yield has no effect at all on how the file is read.
    The "for line in data:" iteration over the file object is what makes
    Python read from the file line-by-line. Incidentally, the use of
    splitlines() is pointless; you're already getting single lines from
    the file object by iterating over it, so splitlines() will always
    return a single-element list.
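
    A minimal sketch of both points, written in Python 3 syntax rather than the Python 2 of the original script. The name first_fields and the in-memory io.StringIO stand-in for the 'freq' file are assumptions made for illustration only.

```python
import io


def first_fields(data):
    """Yield the first whitespace-separated field of each non-blank line as an int."""
    for line in data:        # iterating the file object hands back one line at a time
        if line.strip():     # skip blank lines
            yield int(line.split()[0])


# io.StringIO stands in for the 'freq' file so the sketch is self-contained.
sample = io.StringIO("123 0\n\n133 3\n146 1\n200 0\n233 10\n400 2\n")

# Every line obtained by iteration splits into exactly one element,
# which is why the inner splitlines() loop adds nothing.
pieces = [line.splitlines() for line in sample]
assert all(len(p) == 1 for p in pieces)

sample.seek(0)
values = list(first_fields(sample))
print(values)  # [123, 133, 146, 200, 233, 400]
```

    Dropping splitlines() leaves the line-by-line behaviour unchanged, because that behaviour comes from the for loop over the file object.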

    Cheers,
    Chris
    --
    http://blog.rebertia.com
     
    Chris Rebert, Nov 7, 2010
    #3
  4. chad

    chad Guest

    On Nov 7, 9:47 am, Chris Rebert <> wrote:
    > On Sun, Nov 7, 2010 at 9:34 AM, chad <> wrote:
    >
    > <snip>
    >
    >
    >
    > > #!/usr/local/bin/python

    >
    > > import sys

    >
    > > def construct_set(data):
    > >    for line in data:
    > >        lines = line.splitlines()
    > >        for curline in lines:
    > >            if curline.strip():
    > >                key = curline.split(' ')
    > >                value = int(key[0])
    > >                yield value

    >
    > > def approximate(first, second):
    > >    midpoint = (first + second) / 2
    > >    return midpoint

    >
    > > def format(input):
    > >    prev = 0
    > >    value = int(input)

    >
    > >    with open("/home/cdalten/oakland/freq") as f:
    > >        for next in construct_set(f):
    > >            if value > prev:
    > >                current = prev
    > >                prev = next

    >
    > >        middle = approximate(current, prev)
    > >        if middle < prev and value > middle:
    > >            return prev
    > >        elif value > current and current < middle:
    > >            return current

    > <snip>
    > > The question is about the construct_set() function.

    > <snip>
    > > I have it yield on 'value' instead of 'curline'. Will the program
    > > still read the input file named freq line by line even though I don't
    > > have it yielding on 'curline'? Or since I have it yield on 'value',
    > > will it read the entire input file into memory at once?

    >
    > The former. The yield has no effect at all on how the file is read.
    > The "for line in data:" iteration over the file object is what makes
    > Python read from the file line-by-line. Incidentally, the use of
    > splitlines() is pointless; you're already getting single lines from
    > the file object by iterating over it, so splitlines() will always
    > return a single-element list.
    >


    But what happens if the input file is say 250MB? Will all 250MB be
    loaded into memory at once? Just curious, because I thought maybe
    using something like 'yield curline' would prevent this scenario.
     
    chad, Nov 7, 2010
    #4
  5. Chris Rebert

    Chris Rebert Guest

    On Sun, Nov 7, 2010 at 9:56 AM, chad <> wrote:
    > On Nov 7, 9:47 am, Chris Rebert <> wrote:
    >> On Sun, Nov 7, 2010 at 9:34 AM, chad <> wrote:
    >> <snip>
    >> > #!/usr/local/bin/python

    >>
    >> > import sys

    >>
    >> > def construct_set(data):
    >> >    for line in data:
    >> >        lines = line.splitlines()
    >> >        for curline in lines:
    >> >            if curline.strip():
    >> >                key = curline.split(' ')
    >> >                value = int(key[0])
    >> >                yield value

    >>
    >> > def approximate(first, second):
    >> >    midpoint = (first + second) / 2
    >> >    return midpoint

    >>
    >> > def format(input):
    >> >    prev = 0
    >> >    value = int(input)

    >>
    >> >    with open("/home/cdalten/oakland/freq") as f:
    >> >        for next in construct_set(f):
    >> >            if value > prev:
    >> >                current = prev
    >> >                prev = next

    >>
    >> >        middle = approximate(current, prev)
    >> >        if middle < prev and value > middle:
    >> >            return prev
    >> >        elif value > current and current < middle:
    >> >            return current

    >> <snip>
    >> > The question is about the construct_set() function.

    >> <snip>
    >> > I have it yield on 'value' instead of 'curline'. Will the program
    >> > still read the input file named freq line by line even though I don't
    >> > have it yielding on 'curline'? Or since I have it yield on 'value',
    >> > will it read the entire input file into memory at once?

    >>
    >> The former. The yield has no effect at all on how the file is read.
    >> The "for line in data:" iteration over the file object is what makes
    >> Python read from the file line-by-line. Incidentally, the use of
    >> splitlines() is pointless; you're already getting single lines from
    >> the file object by iterating over it, so splitlines() will always
    >> return a single-element list.

    >
    > But what happens if the input file is say 250MB? Will all 250MB be
    > loaded into memory at once?


    No. As I said, the file will be read from 1 line at a time, on an
    as-needed basis; which is to say, "line-by-line".

    > Just curious, because I thought maybe
    > using something like 'yield curline' would prevent this scenario.


    Using "for line in data:" is what prevents that scenario.
    The "yield" is only relevant to how the file is read insofar as the
    alternative to yield-ing would be to return a list, which would
    necessitate going through the entire file in one continuous go and then
    returning a very large list; but even then, the file's content would
    still be read line-by-line, not all at once as one humongous
    string.
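
    The difference between yielding and returning a list can be sketched as follows, in Python 3 syntax. The helper names values_lazy, values_eager and CountingLines are hypothetical, and io.StringIO stands in for the real file.

```python
import io


def values_lazy(data):
    """Generator: pulls a line only when the caller asks for the next value."""
    for line in data:
        if line.strip():
            yield int(line.split()[0])


def values_eager(data):
    """List version: walks the whole input before returning anything."""
    result = []
    for line in data:
        if line.strip():
            result.append(int(line.split()[0]))
    return result


class CountingLines:
    """Hypothetical wrapper that counts how many lines have been handed out."""

    def __init__(self, lines):
        self._it = iter(lines)
        self.count = 0

    def __iter__(self):
        return self

    def __next__(self):
        line = next(self._it)   # StopIteration propagates at end of input
        self.count += 1
        return line


text = "123 0\n133 3\n146 1\n200 0\n233 10\n400 2\n"

lazy_src = CountingLines(io.StringIO(text))
first = next(values_lazy(lazy_src))
lazy_count = lazy_src.count      # only one line consumed so far

eager_src = CountingLines(io.StringIO(text))
eager = values_eager(eager_src)
eager_count = eager_src.count    # every line consumed before the list is returned

print(first, lazy_count, eager_count)  # 123 1 6
```

    In both versions the lines arrive one at a time; what changes is whether the results pile up in a list or are handed out as they are produced.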

    Cheers,
    Chris
    --
    http://blog.rebertia.com
     
    Chris Rebert, Nov 7, 2010
    #5
  6. Simon Brunning

    On 7 November 2010 18:14, Chris Rebert <> wrote:
    > On Sun, Nov 7, 2010 at 9:56 AM, chad <> wrote:
    >> But what happens if the input file is say 250MB? Will all 250MB be
    >> loaded into memory at once?

    >
    > No. As I said, the file will be read from 1 line at a time, on an
    > as-needed basis; which is to say, "line-by-line".


    IIRC, it's somewhere in between. Python will read the file in blocks.
    It only *looks* like it's reading the file line by line.
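
    One way to observe the block-wise reads is sketched below, assuming Python 3, where open() is built on the io module. CountingRaw is a hypothetical class that counts low-level read requests, and the generated data is made up for the demonstration.

```python
import io


class CountingRaw(io.RawIOBase):
    """Hypothetical raw stream that counts how often the layers above ask for bytes."""

    def __init__(self, data):
        super().__init__()
        self._inner = io.BytesIO(data)
        self.calls = 0

    def readable(self):
        return True

    def readinto(self, b):
        self.calls += 1
        return self._inner.readinto(b)


# 10000 short lines in the same "number count" shape as the 'freq' file.
payload = "".join("%d 0\n" % i for i in range(10000)).encode("ascii")

raw = CountingRaw(payload)
text = io.TextIOWrapper(io.BufferedReader(raw), encoding="ascii")

line_count = sum(1 for _ in text)   # iterate line by line, as the script does

# The raw stream is asked for data far fewer times than there are lines.
print(line_count, raw.calls)
```

    The loop still sees one line per iteration, and memory use stays bounded by the buffer size rather than the file size.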

    --
    Cheers,
    Simon B.
     
    Simon Brunning, Nov 8, 2010
    #6
