Remap MySQL tuples to a dictionary

Pom

Hello

I want to convert a MySQL resultset to a dictionary.

I made some code myself, and want to ask you if I'm doing this the right way.

def remapmysql(a):
    return (a[0], a[1:])

def test_map():
    count = 100000  # count of simulated records
    l1 = range(0, count)
    l2 = range(count, 2 * count)
    l3 = range(2 * count, 3 * count)
    z1 = zip(l1, l2, l3)  # simulate a MySQL resultset

    d1 = dict(map(remapmysql, z1))

    return d1
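
A quick sanity check of the shape of the result:

d1 = test_map()
print(d1[0])    # -> (100000, 200000): key 0 maps to the remaining fields
print(len(d1))  # -> 100000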
 
Fredrik Lundh

Pom said:
I want to convert a MySQL resultset to a dictionary.

that is, you want to convert an array of (key, value, ...) tuples to a
dictionary containing key: (value, ...) pairs, right?

I made some code myself, and want to ask you if I'm doing this the right way.

def remapmysql(a):
    return (a[0], a[1:])

def test_map():
    count = 100000  # count of simulated records
    l1 = range(0, count)
    l2 = range(count, 2 * count)
    l3 = range(2 * count, 3 * count)
    z1 = zip(l1, l2, l3)  # simulate a MySQL resultset

    d1 = dict(map(remapmysql, z1))

    return d1

looks fine to me.

if you care about performance, and are using a recent Python, you could
try doing

d1 = dict((row[0], row[1:]) for row in z1)

instead, and see if that runs faster (this uses a generator expression
instead of a callback and a full list).
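
for what it's worth, a minimal harness along these lines (a sketch,
reusing the simulated resultset from your test_map) lets you compare
the two:

import time

def remapmysql(a):
    return (a[0], a[1:])

count = 100000
z1 = list(zip(range(0, count), range(count, 2 * count),
              range(2 * count, 3 * count)))  # materialise the fake resultset

t0 = time.time()
d_map = dict(map(remapmysql, z1))
t1 = time.time()
d_gen = dict((row[0], row[1:]) for row in z1)
t2 = time.time()

assert d_map == d_gen  # both build the same mapping
print("dict+map:      %.3f seconds" % (t1 - t0))
print("dict+gen-expr: %.3f seconds" % (t2 - t1))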

</F>
 
Tim Chase

def remapmysql(a):
    return (a[0], a[1:])

def test_map():
    count = 100000  # count of simulated records
    l1 = range(0, count)
    l2 = range(count, 2 * count)
    l3 = range(2 * count, 3 * count)
    z1 = zip(l1, l2, l3)  # simulate a MySQL resultset

    d1 = dict(map(remapmysql, z1))

    return d1

I'm not sure the map() is needed, as it could just be
>>> d1 = dict((row[0], row[1:]) for row in z1)

which worked in my tests.

However, either seems to work fairly well.

-tkc
 
Pom

Fredrik said:
if you care about performance, and are using a recent Python, you could

yes I do ;-)

try doing

d1 = dict((row[0], row[1:]) for row in z1)

instead, and see if that runs faster (this uses a generator expression
instead of a callback and a full list).

</F>

I changed it, and it saves me some time, so I'll leave it like that!

with 100000 test records:
dict+map (1, 2, 3) -> {1: (2, 3)}: 1.343 seconds.
dict+gen-expr (1, 2, 3) -> {1: (2, 3)}: 0.861 seconds.
dict+map (1, 2, 3) -> {1: (2, 3)}: 1.397 seconds.
dict+gen-expr (1, 2, 3) -> {1: (2, 3)}: 0.943 seconds.

with 500000 test records:
dict+map (1, 2, 3) -> {1: (2, 3)}: 13.297 seconds.
dict+gen-expr (1, 2, 3) -> {1: (2, 3)}: 8.335 seconds.
dict+map (1, 2, 3) -> {1: (2, 3)}: 14.548 seconds.
dict+gen-expr (1, 2, 3) -> {1: (2, 3)}: 9.793 seconds.

thank you!!
 
Pom

Tim said:
def remapmysql(a):
    return (a[0], a[1:])

def test_map():
    count = 100000  # count of simulated records
    l1 = range(0, count)
    l2 = range(count, 2 * count)
    l3 = range(2 * count, 3 * count)
    z1 = zip(l1, l2, l3)  # simulate a MySQL resultset

    d1 = dict(map(remapmysql, z1))

    return d1

I'm not sure the map() is needed, as it could just be
d1 = dict((row[0], row[1:]) for row in z1)

which worked in my tests.

However, either seems to work fairly well.

-tkc

thank you!!

I changed it; see my previous post.
 
Dennis Lee Bieber

Hello

I want to convert a MySQL resultset to a dictionary.

One thing I've not seen answered (or expanded upon)...

I made some code myself, and want to ask you if I'm doing this the right way.

def remapmysql(a):
    return (a[0], a[1:])

def test_map():
    count = 100000  # count of simulated records

You mention here 100,000 records in a result set... Do you really
envision performing a crsr.fetchall() on that result set BEFORE
converting to a dictionary?

Since, as I recall, MySQL does not have server-side cursor
operations, the MySQLdb emulation has already received the full
result data... COPY 1

crsr.fetchall() will create a list of tuples of all the data... COPY 2

Your remap operation will create a dictionary using the first field
as the key, and the rest of the fields as a list identified by that
key.... COPY 3

I don't know when "COPY 1" gets freed -- on the next crsr.execute()
or after the last row has been fetched from the cursor.

"COPY 2" will hang around until you reuse it or explicitly delete
it.

In other words -- you could have up to THREE copies of your 100,000
record result set consuming memory at one time.

It might be more economical to perform the conversion while fetching
the data:

mdict = {}
for rec in crsr:
    mdict[rec[0]] = rec[1:]
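
Fleshed out, that looks something like this (a sketch assuming the
MySQLdb driver and a hypothetical "items" table whose first column
is a unique key):

import MySQLdb

conn = MySQLdb.connect(host="localhost", user="user",
                       passwd="secret", db="test")  # hypothetical credentials
crsr = conn.cursor()
crsr.execute("select id, name, price from items")

mdict = {}
for rec in crsr:               # one row at a time; no fetchall() copy
    mdict[rec[0]] = rec[1:]

crsr.close()
conn.close()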

--
Wulfraed Dennis Lee Bieber KD6MOG
(e-mail address removed) (e-mail address removed)
HTTP://wlfraed.home.netcom.com/
(Bestiaria Support Staff: (e-mail address removed))
HTTP://www.bestiaria.com/
 
Pom

Dennis said:
It might be more economical to perform the conversion while fetching
the data:

mdict = {}
for rec in crsr:
    mdict[rec[0]] = rec[1:]

I didn't think of that. I just kept the fetchall() from my first
version (where I looped through the tuples, which was very slow).

I just dropped the .fetchall() part, and as you said, it works fine,
with one copy less (but it doesn't save me any time):

dict((int(row[0]), row[1:]) for row in cursor)


thanks!
 
Lawrence D'Oliveiro

I want to convert a MySQL resultset to a dictionary.

Here's a function that does this one row at a time:

def GetEachRecord(TableName, Fields, Condition, Values, Extra = "") :
    """generator which does an SQL query which can return 0 or more
    result rows, yielding each record in turn as a mapping from
    field name to field value. TableName can be a single table name,
    or a comma-separated list of names for a join. Extra allows
    specification of order/group clauses."""
    Cursor = sql.conn.cursor() # modify this as appropriate
    Cursor.execute \
      (
            "select "
        +
            ", ".join(Fields)
        +
            " from "
        +
            TableName
        +
            " where "
        +
            Condition
        +
            " "
        +
            Extra,
        Values
      )
    while True :
        NextRow = Cursor.fetchone()
        if NextRow is None :
            Cursor.close()
            return # ends the generator (the caller sees StopIteration)
        #end if
        yield dict(zip(Fields, NextRow))
    #end while
#end GetEachRecord

You'd use this something like

for Link in GetEachRecord(...) :
    ... Link[fieldname] ... blah-blah ...
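
For example, a hypothetical call against a "links" table with "url"
and "title" columns, filtered on a numeric "site_id" column, might
read:

for Link in GetEachRecord("links", ("url", "title"),
                          "site_id = %s", (42,), "order by title") :
    print(Link["url"], Link["title"])
#end for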
 
Fredrik Lundh

Dennis said:
Your remap operation will create a dictionary using the first field
as the key, and the rest of the fields as a list identified by that
key.... COPY 3

the tuple-to-dictionary conversion mostly shuffles pointers around,
though; it's not a full copy of all the data in the result set.
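
you can see this for yourself; the slice builds a new (small) tuple,
but its elements are the very same objects:

row = ("key", "x" * 100000, "value")  # one big field
rest = row[1:]            # a new, small tuple object...
print(rest[0] is row[1])  # ...holding the same element objects: True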

</F>
 
Fredrik Lundh

Lawrence said:
What other kind of mapping could you produce?

and here we go again. how about reading the code the OP posted, or the
first few followups?

</F>
 
Lawrence D'Oliveiro

What other kind of mapping could you produce?

All right, sorry, looks like they want to load the entire table into RAM and
key it off the first field. Kind of defeats the point of having SQL, but
there you go...
 
Fredrik Lundh

Lawrence said:
Kind of defeats the point of having SQL, but there you go...

there are plenty of reasons to use Python data structures instead of the
SQL engine for data crunching. especially if you care about efficiency.

</F>
 
Lawrence D'Oliveiro

there are plenty of reasons to use Python data structures instead of the
SQL engine for data crunching. especially if you care about efficiency.

I think you have something different in mind when you say "efficiency"
than what I'm thinking of.

SQL databases like MySQL are _designed_ for efficiency.
 
Diez B. Roggisch

SQL databases like MySQL are _designed_ for efficiency.

Efficiency with respect to what? That statement is plain wrong. They are
designed for a pretty general case of data-storage efficiency, in the
domain of relational algebra. And for a lot of use-cases, they offer a
good ratio of ease-of-use, speed, and reliability to the overhead they
introduce.

But there are lots of cases where hand-tailored data structures - in Python
as well as in C - are way better suited. Think of graph representations, for
example, especially for reachability queries and the like. Does Google use
ORACLE (or whatever DB) for their index? Nope.
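
As a minimal illustration (a sketch; the graph data is made up),
reachability over a dict-of-sets adjacency structure is a few lines of
Python, while the equivalent recursive query is awkward at best in SQL:

graph = {               # node -> set of direct successors
    "a": {"b", "c"},
    "b": {"d"},
    "c": {"d"},
    "d": set(),
}

def reachable(graph, start):
    # collect all nodes reachable from start (iterative depth-first search)
    seen = set()
    frontier = [start]
    while frontier:
        node = frontier.pop()
        for succ in graph.get(node, ()):
            if succ not in seen:
                seen.add(succ)
                frontier.append(succ)
    return seen

print(reachable(graph, "a"))  # -> {'b', 'c', 'd'}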

So - without knowing the use-case,

"""
All right, sorry, looks like they want to load the entire table into RAM and
key it off the first field. Kind of defeats the point of having SQL, but
there you go...
"""

is a pretty strong thing to say.

Diez
 
Simon Brunning

All right, sorry, looks like they want to load the entire table into RAM and
key it off the first field. Kind of defeats the point of having SQL, but
there you go...

Keeping an in-memory cache of small, unchanging, frequently-read
tables is a very common, and often very effective, performance tweak
in database-driven systems.
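
A minimal sketch of that pattern, assuming a hypothetical small
"countries" lookup table (code -> name):

def load_country_cache(crsr):
    crsr.execute("select code, name from countries")
    return dict((row[0], row[1]) for row in crsr)

# Load it once at startup, then read from the dict instead of the database:
#   COUNTRIES = load_country_cache(crsr)
#   name = COUNTRIES["NZ"]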
 
