pymssql - execute loads all results into memory!


ChaosKCW

Hi

I am trying to use pymssql, and have an issue whereby the execute
(not the fetch) appears to load all records into memory.

if I execute

con = pymssql.connect(...)
cur = con.cursor()
cur.execute(sql)
rec = cur.fetchone()

If I put a query which returns a lot of records into "sql", the
execute never returns and Python's memory usage slowly balloons until
the machine can't give any more. If I put in a SQL query returning
only a few rows, it works fine.

So I am not sure why an execute would feel the need to load all rows,
but it's seriously crippling. Does anyone know if this is a bug or
something I can "turn off"?

thanks,
 

Eric Wertman

ChaosKCW said:
I am trying to use pymssql, and have an issue whereby the execute
(not the fetch) appears to load all records into memory.

if I execute

con = pymssql.connect(...)
cur = con.cursor()
cur.execute(sql)
rec = cur.fetchone()

If I put a query which returns a lot of records into "sql", the
execute never returns and Python's memory usage slowly balloons until
the machine can't give any more. If I put in a SQL query returning
only a few rows, it works fine.

So I am not sure why an execute would feel the need to load all rows,
but it's seriously crippling. Does anyone know if this is a bug or
something I can "turn off"?

I ran into this myself. After some digging I discovered that what you
are after is a server-side cursor, which isn't implemented yet in
pymssql. There is one in MySQLdb, but it's not the default behavior.
Regardless of your usage (fetchone vs fetchmany), the result set is
held client-side. AFAIK the only workaround is to keep your result
set small (enough). If you use fetchmany and iterate over it
directly, it may keep your memory usage down; I can't remember if
that worked. I definitely tried making a generator with it, and that
did not help.
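A minimal sketch of that fetchmany workaround, written against the generic DB-API 2.0 cursor interface (the helper name and batch size are illustrative, not part of any library):

```python
def iter_rows(cursor, batch_size=1000):
    """Yield rows one at a time, pulling them from the cursor in batches.

    This bounds the Python-side working set to one batch at a time.
    Note that a driver which buffers the whole result set client-side
    (as pymssql does here) will still hold it in its own memory.
    """
    while True:
        batch = cursor.fetchmany(batch_size)
        if not batch:
            break
        for row in batch:
            yield row
```

With a driver that actually streams results, `for row in iter_rows(cur): ...` keeps memory usage flat; with pymssql's client-side buffering it only helps at the Python layer, which matches what I saw.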

Eric
 

Tim Golden

Eric said:
I ran into this myself. After some digging I discovered that what you
are after is a server-side cursor that isn't implemented yet in
pymssql. There is one in MySQLdb, but it's not the default behavior.
Regardless of your usage (fetchone vs fetchmany), the result set is
held client side. AFAIK the only workaround is to keep your result
set small (enough). If you use fetchmany and iterate over it
directly, it may keep your memory usage down, I can't remember if
that worked. I definitely tried making a generator with it, that did
not help.


... or just switch to pyodbc, for example, which behaves
perfectly well with this snippet against a table of >24 million
rows:

<code>
import pyodbc

conn = [
    "Driver={SQL Server}",
    "Server=SVR17",
    "Database=TDI",
    "Trusted_Connection=Yes",
]
db = pyodbc.connect(";".join(conn))
q = db.cursor()
q.execute("SELECT * FROM revenue")  # 24 million rows
q.fetchone()
q.close()
</code>


TJG
 

ChaosKCW

I ran into this myself.  After some digging I discovered that what you
are after is a server-side cursor that isn't implemented yet in
pymssql.  There is one in MySQLdb, but it's not the default behavior.
Regardless of your usage (fetchone vs fetchmany), the result set is
held client side.  AFAIK the only workaround is to keep your result
set small (enough).  If you use fetchmany and iterate over it
directly, it may keep your memory usage down,  I can't remember if
that worked.  I definitely tried making a generator with it, that did
not help.

... or just switch to pyodbc, for example, which behaves
perfectly well with this snippet against a table of >24 million
rows:

<code>
import pyodbc

conn = [
    "Driver={SQL Server}",
    "Server=SVR17",
    "Database=TDI",
    "Trusted_Connection=Yes",
]
db = pyodbc.connect(";".join(conn))
q = db.cursor()
q.execute("SELECT * FROM revenue")  # 24 million rows
q.fetchone()
q.close()
</code>

TJG

Thanks for the responses. I am astounded that any DB API tool doesn't
support server-side cursors! pymssql is mostly useless to me, so I
will switch to an ODBC interface.
 

Aspersieman

Eric said:
I am trying to use pymssql, and have an issue whereby the execute
(not the fetch) appears to load all records into memory.
if I execute
con = pymssql.connect(...)
cur = con.cursor()
cur.execute(sql)
rec = cur.fetchone()
If I put a query which returns a lot of records into "sql", the
execute never returns and Python's memory usage slowly balloons until
the machine can't give any more. If I put in a SQL query returning
only a few rows, it works fine.
So I am not sure why an execute would feel the need to load all rows,
but it's seriously crippling. Does anyone know if this is a bug or
something I can "turn off"?
I ran into this myself.  After some digging I discovered that what you
are after is a server-side cursor that isn't implemented yet in
pymssql.  There is one in MySQLdb, but it's not the default behavior.
Regardless of your usage (fetchone vs fetchmany), the result set is
held client side.  AFAIK the only workaround is to keep your result
set small (enough).  If you use fetchmany and iterate over it
directly, it may keep your memory usage down,  I can't remember if
that worked.  I definitely tried making a generator with it, that did
not help.

... or just switch to pyodbc, for example, which behaves
perfectly well with this snippet against a table of >24 million
rows:

<code>
import pyodbc

conn = [
    "Driver={SQL Server}",
    "Server=SVR17",
    "Database=TDI",
    "Trusted_Connection=Yes",
]
db = pyodbc.connect(";".join(conn))
q = db.cursor()
q.execute("SELECT * FROM revenue")  # 24 million rows
q.fetchone()
q.close()
</code>

TJG

Thanks for the responses. I am astounded that any DB API tool doesn't
support server-side cursors! pymssql is mostly useless to me, so I
will switch to an ODBC interface.

Yes, this is true, unfortunately...

However, pyodbc doesn't support return variables (output parameters) in
stored procedures (at least with MS SQL). pymssql is the only DB API for
Python that I've found that can reliably do this. I've tried adodbapi,
pyodbc and one or two others (can't think of the names now... :-/).

Regards

Nicol
 

Tim Golden

Aspersieman said:
However, pyODBC doesn't support return variables(parameters) in stored
procedures (at least with MS SQL). pymssql is the only db api for python
that I've found that can reliably do this. I've tried adodbapi, pyodbc
and one or two others (can't think of the names now... :-/).

That's a fair point (I'm assuming you're right; I've never tried). That
tends to be a shortcoming of SQL interfaces. However, pymssql is based
on the defunct ntwdblib.dll interface, which is becoming harder and
harder to find as of SQL Server 2005. Have you tried putting a feature
request in to the pyodbc team? I've not had anything to do with them
for a bit, but they did implement the .nextset method at my request a
couple of years ago. (That's assuming it's even possible with ODBC in
general, which I also haven't checked.)
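For what it's worth, a common workaround for output parameters over ODBC is to run a small T-SQL batch that declares a variable, EXECs the procedure into it, and SELECTs the result back as a one-row result set. A minimal sketch, assuming a hypothetical procedure usp_get_total and a driver that accepts ? placeholders:

```python
# The procedure name (usp_get_total) and its parameters are hypothetical;
# substitute your own. Works with any DB-API cursor whose driver supports
# ? placeholders (pyodbc does).
OUTPUT_PARAM_BATCH = """
DECLARE @total int;
EXEC usp_get_total @region = ?, @total = @total OUTPUT;
SELECT @total;
"""

def call_with_output(cursor, region):
    # Run the batch; the output parameter comes back as an ordinary row.
    cursor.execute(OUTPUT_PARAM_BATCH, region)
    return cursor.fetchone()[0]
```

It's not as tidy as a real callproc with output bindings, but it sidesteps the missing feature entirely.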

TJG
 

Tim Golden

Aspersieman wrote:
[... re output params in pymssql / pyodbc ...]
Oh, trust me - I've tried a *LOT*. I agree, submitting a feature
request is a good idea. I think I'll do that. :)

Looks like someone already has:

http://sourceforge.net/tracker/index.php?func=detail&aid=1985956&group_id=162557&atid=824257

I did read somewhere (I think on the pyodbc site) that pyodbc isn't DB
API 2.0 feature-complete yet, so I'll probably have to check whether
they haven't already planned this.

Obtaining output parameters from stored procedures is something I have
to do quite often, unfortunately - so maybe I can submit a patch even... :)


I'm sure that will be more than welcome. Looks like it's a
one-man band (as I guess most projects are) and there aren't
always enough minutes in the day.

TJG
 
