Geneator/Iterator Nesting Problem - Any Ideas? 2.4

Discussion in 'Python' started by ChaosKCW, Mar 4, 2005.

  1. ChaosKCW

    ChaosKCW Guest

    Hi

    Using Python 2.4 I am trying to procduce a generator which will return
    the results of dbi SQL statement using fetchmany for performance.

    So instead of fetching one record for each call, I will fetch batches
    of X (eg 100) and yeild each record in turn.

    For reasons of pure asthetics and my own learning I wanted it to look
    like this:

    </B>
    def resultsetbatchgen(cursor, size=100):
    for results in (recordbatch for recordbatch in
    cursor.fetchmany(size)):
    for rec in results:
    yield rec
    </B>

    Problem is this this gives spurious results. To understand the problem
    if I excute the following in the console I get the correct results:

    </B>
    >>> cur.execute(<QUERY WITH MOER THAN 1000 records>)
    >>> recs = (recordbatch for recordbatch in cur.fetchmany(1000))
    >>> sum(map(lambda x: 1, (rec for rec in recs)))

    1000
    </B>

    This is PERFECT! 1000 is what I expected. but now lets nest this inside
    another for construct like so

    </B>
    >>> cur.execute(<QUERY WITH MOER THAN 1000 records>)
    >>> for results in (recordbatch for recordbatch in

    cur.fetchmany(1000)):
    .... print sum(map(lambda x: 1, (rec for rec in results)))
    76
    76
    76
    76
    76
    76
    76
    76
    ............
    </B>

    Now it thinks each batch size is 76 ... ? This completly wrong, and
    baffling.

    The commands are exactly the same as far as I can tell, the only
    difference is that it is now nested wihtin another for loop?

    Any help would be greatly aprpeciated.

    PS a working but far less elegant version of this below but I would
    like to understand why teh above doesnt work and doesnt work
    consitantly:

    def resultsetbatch(cursor, size=100):
    results = cursor.fetchmany(size)
    while results <> []:
    for rec in results:
    yield rec
    results = cursor.fetchmany(size)
    ChaosKCW, Mar 4, 2005
    #1
    1. Advertising

  2. ChaosKCW wrote:
    > For reasons of pure asthetics and my own learning I wanted it to look
    > like this:
    >
    > </B>
    > def resultsetbatchgen(cursor, size=100):
    > for results in (recordbatch for recordbatch in
    > cursor.fetchmany(size)):
    > for rec in results:
    > yield rec
    > </B>


    Note that this is equivalent to:

    def resultsetbatchgen(cursor, size=100):
    for results in cursor.fetchmany(size):
    for rec in results:
    yield rec

    That is, your generator expression isn't really doing anything.

    >>>>cur.execute(<QUERY WITH MOER THAN 1000 records>)
    >>>>recs = (recordbatch for recordbatch in cur.fetchmany(1000))
    >>>>sum(map(lambda x: 1, (rec for rec in recs)))

    > 1000


    This should be equivalent to
    ....
    >>> recs = iter(cur.fetchmany(1000))
    >>> len(list(recs))


    >>>>cur.execute(<QUERY WITH MOER THAN 1000 records>)
    >>>>for results in (recordbatch for recordbatch in cur.fetchmany(1000)):

    > ... print sum(map(lambda x: 1, (rec for rec in results)))


    This is now equivalent to:
    >>> cur.execute(<QUERY WITH MOER THAN 1000 records>)
    >>> for rec in cur.fetchmany(1000):

    .... print len(list(results))

    Note that you're counting the number of elements in record, which I'm
    guessing is 76?

    I'm thinking you want to do something like:

    def resultsetbatchgen(cursor, size=100):
    while True:
    results = cursor.fetchmany(size)
    if not results:
    break
    for rec in results:
    yield rec

    I don't actually know how you tell when there isn't anything more to
    fetch, but if it returns an empty list, you could also write this like:

    def resultsetbatchgen(cursor, size=100):
    for results in iter(lambda: cursor.fetchmany(size), [])
    for rec in results:
    yield rec


    HTH,

    STeVe
    Steven Bethard, Mar 4, 2005
    #2
    1. Advertising

  3. ChaosKCW

    ChaosKCW Guest

    Hi

    Thanks this was very helpfull.

    Your final solution seems like the best (most elegant). I was trying to
    avoid the ugly while loops with breaks and this final one certainly
    does that.

    Thanks for your help .
    ChaosKCW, Mar 4, 2005
    #3
  4. ChaosKCW

    ChaosKCW Guest

    For those that are interested I ran a performance comparison of various
    functions for wrapping sql results in a interables and generators. The
    results are below and the code is at the very bottom.

    Surprisinly (in a happy way) the functional version of the batch
    retrieve comes in a very close second beating out two other common
    iterative approaches. However the winner by a small margin is still an
    iterative routine. The elegance of the more functional approach (thanks
    to Steve) and its excellent performance makes it a clear winner in my
    mind.

    The other intersting conclusion is that batch fetching results gives a
    fairly significant and real performance boost. Its not just academic.

    The Winner:

    def resultset_functional_batchgenerator(cursor, size=100):
    for results in iter(lambda: cursor.fetchmany(size), []):
    for rec in results:
    yield rec


    Test Results (P2.4 IBM T41 Thinkpad):
    Ordered by: cumulative time
    List reduced from 57 to 7 due to restriction <'test_'>

    ncalls tottime percall cumtime percall filename:lineno(function)
    1 2.140 2.140 54.002 54.002
    PerformanceTestGenerators.py:102(test_resultset_functional_generator)
    1 1.957 1.957 45.484 45.484
    PerformanceTestGenerators.py:98(test_resultset_iterative_generator)
    1 2.433 2.433 41.844 41.844
    PerformanceTestGenerators.py:94(test_resultset_iterator)
    1 1.930 1.930 39.793 39.793
    PerformanceTestGenerators.py:110(test_resultset_iterative_batchgenerator_2)
    1 1.734 1.734 35.561 35.561
    PerformanceTestGenerators.py:114(test_resultset_iterative_batchgenerator_3)
    1 1.980 1.980 34.579 34.579
    PerformanceTestGenerators.py:118(test_resultset_functional_batchgenerator)
    1 1.780 1.780 31.696 31.696
    PerformanceTestGenerators.py:106(test_resultset_iterative_batchgenerator_1)

    Code:

    import unittest
    import odbc
    import profile
    import pstats

    class resultset_iterator:
    "Iterate over the recordset and frees the cursor afterwards."
    def __init__(self, cursor):
    self.cursor = cursor
    def __iter__(self): return self
    def next(self):
    rec = self.cursor.fetchone()
    if not rec:
    raise StopIteration
    return rec

    def resultset_iterative_generator(cursor):
    rec = cursor.fetchone();
    while rec:
    yield rec
    rec = cursor.fetchone();

    def resultset_functional_generator(cursor):
    for rec in iter(lambda: cursor.fetchone(), None):
    yield rec

    def resultset_iterative_batchgenerator_1(cursor, size=100):
    results = cursor.fetchmany(size)
    while results:
    for rec in results:
    yield rec
    results = cursor.fetchmany(size)

    def resultset_iterative_batchgenerator_2(cursor, arraysize=100):
    'An iterator that uses fetchmany to keep memory usage down'
    done = False
    while not done:
    results = cursor.fetchmany(arraysize)
    if results == []:
    done = True
    for result in results:
    yield result

    def resultset_iterative_batchgenerator_3(cursor, size=100):
    while True:
    results = cursor.fetchmany(size)
    if not results:
    break
    for rec in results:
    yield rec

    def resultset_functional_batchgenerator(cursor, size=100):
    for results in iter(lambda: cursor.fetchmany(size), []):
    for rec in results:
    yield rec

    class testResultSetGenerators(unittest.TestCase):

    connectstring = "*REMOVED*"
    sql = "*REMOVED*"

    def setUp(self):
    self.con = odbc.odbc(self.connectstring)
    self.cur = self.con.cursor()
    self.cur.execute(self.sql)

    def tearDown(self):
    self.cur.close()
    self.con.close()

    def test_resultset_iterator(self):
    for row in resultset_iterator(self.cur):
    pass

    def test_resultset_iterative_generator(self):
    for row in resultset_iterative_generator(self.cur):
    pass

    def test_resultset_functional_generator(self):
    for row in resultset_functional_generator(self.cur):
    pass

    def test_resultset_iterative_batchgenerator_1(self):
    for row in resultset_iterative_batchgenerator_1(self.cur):
    pass

    def test_resultset_iterative_batchgenerator_2(self):
    for row in resultset_iterative_batchgenerator_2(self.cur):
    pass

    def test_resultset_iterative_batchgenerator_3(self):
    for row in resultset_iterative_batchgenerator_3(self.cur):
    pass

    def test_resultset_functional_batchgenerator(self):
    for row in resultset_functional_batchgenerator(self.cur):
    pass

    if __name__ == '__main__':
    suite = unittest.makeSuite(testResultSetGenerators)
    profile.run('unittest.TextTestRunner(verbosity=2).run(suite)',
    'c:\\temp\\profile')

    p = pstats.Stats('c:\\temp\\profile')
    p.strip_dirs().sort_stats('cumulative').print_stats('test_')
    ChaosKCW, Mar 7, 2005
    #4
  5. ChaosKCW

    Peter Otten Guest

    ChaosKCW wrote:

    > def resultset_functional_generator(cursor):
    > for rec in iter(lambda: cursor.fetchone(), None):
    > yield rec


    This can be simplified to

    def resultset_functional_generator(cursor):
    return iter(cursor.fetchone, None)

    It should be a bit faster, too.

    Peter
    Peter Otten, Mar 7, 2005
    #5
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Hendrik Maryns
    Replies:
    18
    Views:
    1,390
  2. greg
    Replies:
    6
    Views:
    441
    Dietmar Kuehl
    Jul 17, 2003
  3. Sonoman

    random number geneator

    Sonoman, Jan 8, 2004, in forum: C++
    Replies:
    10
    Views:
    2,731
    Martijn Lievaart
    Jan 10, 2004
  4. Trans
    Replies:
    10
    Views:
    279
    Sean O'Halpin
    Sep 16, 2005
  5. Jim Anderson

    problem with iterator (map iterator)

    Jim Anderson, Jan 10, 2014, in forum: C++
    Replies:
    3
    Views:
    106
    Luca Risolia
    Jan 13, 2014
Loading...

Share This Page