nested dictionary assignment goes too far

Jake Emerson

I'm attempting to build a process that helps me to evaluate the
performance of weather stations. The script below operates on an MS
Access database, brings back some data, and then loops through to pull
out statistics. One such stat is the frequency of reports from the
stations ('char_freq'). I have a collection of methods that operate on
the data to return the 'char_freq' and this works great. However, when
the process goes to insert the unique 'char_freq' into a nested
dictionary the value gets put into ALL of the sub-keys for all of the
weather stations. I have isolated (I think) the problem to the compound
key assignment in the next-to-last line before the print statements.
The result is that the last 'freq' to run through the for loops gets
posted to all of the sensor_numbers. Eventually the process will put
in stats for the other keys in the nested dictionary, so that's why I
have set up the dictionary this way.

Thanks in advance!

run_flag=1
if run_flag > 0:

    distinctID = runSQL(Unique_IDs)
    # converts the list of tuples that is returned to a list of numbers
    distinctID = map(firstpart,distinctID)
    rain_raw_dict = dict.fromkeys(distinctID,{'N':-6999,'char_freq':-6999,'tip1':-6999,'tip2':-6999,'tip3':-6999,'tip4':-6999,'tip5':-6999,'tip6':-6999,'lost_rain':-6999})

    rawList = runSQL(Rain_Raw_Count)

    temp_list = [110,140,650,1440]

    for sensor_count, sensor_number in enumerate(temp_list):
        # get the frequency of timer reports for each rain gauge
        # note that when a for loop is of the form "for X, x in enumerate(xList)",
        # X is an index value, and x is the value itself.
        timerList = []
        for icount, i in enumerate(rawList):
            if i[0]==sensor_number:
                # look ahead to the next values for comparison
                for jcount in range(icount+1,len(rawList)):
                    if rawList[jcount][0]==sensor_number:
                        if rawList[icount][2]==rawList[jcount][2]:
                            temp = rawList[jcount][1]-rawList[icount][1]
                            timerList.append(temp)
                        icount = jcount-1
                        break

        # now build a histogram of the time differences stored in "timerList"
        h = Histogram()
        timer = h.histo(timerList)
        sorted_timer = h.sorted_histo(timer)
        freq = h.characteristic_freq(sorted_timer)
        rain_raw_dict[sensor_number]['char_freq'] = sensor_number # <<<< here's the problem!!
        freq = -6999

    print "ID = 110:",rain_raw_dict[110]
    print "ID = 140:",rain_raw_dict[140]
    print "ID = 650:",rain_raw_dict[650]
    print "ID = 1440:",rain_raw_dict[1440]
 
Serge Orlov

I'm attempting to build a process that helps me to evaluate the
performance of weather stations. The script below operates on an MS
Access database, brings back some data, and then loops through to pull
out statistics. One such stat is the frequency of reports from the
stations ('char_freq'). I have a collection of methods that operate on
the data to return the 'char_freq' and this works great. However, when
the process goes to insert the unique 'char_freq' into a nested
dictionary the value gets put into ALL of the sub-keys for all of the
weather stations.

It's a sure sign you're sharing an object. In Python, unless a method is
specifically written to do so, an assignment-like method doesn't create copies:
>>> d = dict.fromkeys([1,2,3],[4,5,6])
>>> id(d[1]) == id(d[2])
True

Instead of
rain_raw_dict = dict.fromkeys(distinctID,{'N':-6999,'char_freq':-6999,'tip1':-6999,'tip2':-6999,'tip3':-6999,'tip4':-6999,'tip5':-6999,'tip6':-6999,'lost_rain':-6999})

you should do something like this:

defaults = {'N':-6999,'char_freq':-6999,'tip1':-6999,'tip2':-6999,'tip3':-6999,'tip4':-6999,'tip5':-6999,'tip6':-6999,'lost_rain':-6999}
rain_raw_dict = {}
for ID in [110,140,650,1440]:
    rain_raw_dict[ID] = defaults.copy()
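
To see the difference side by side, here is a minimal sketch that reproduces the symptom and the fix. It assumes a trimmed two-key record instead of the full one, and `make_defaults`, `station_ids` and the value 15 are made up for illustration; they are not from the original script.

```python
# Hypothetical helper: returns a fresh default record each time it is called.
def make_defaults():
    return {'N': -6999, 'char_freq': -6999}

station_ids = [110, 140, 650, 1440]

# Symptom: fromkeys stores the *same* dict object under every key.
shared = dict.fromkeys(station_ids, make_defaults())
shared[110]['char_freq'] = 15
shared_values = [shared[sid]['char_freq'] for sid in station_ids]

# Fix: give every key its own copy of the defaults.
defaults = make_defaults()
separate = {}
for sid in station_ids:
    separate[sid] = defaults.copy()
separate[110]['char_freq'] = 15
separate_values = [separate[sid]['char_freq'] for sid in station_ids]

print(shared_values)    # every station shows the value written for 110
print(separate_values)  # only station 110 changes
```

With the shared dict, the write to station 110 shows up under all four IDs; with per-key copies it stays where it was written.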
 
Ben Cartwright

Jake said:
However, when
the process goes to insert the unique 'char_freq' into a nested
dictionary the value gets put into ALL of the sub-keys

The way you're currently defining your dict:
rain_raw_dict = dict.fromkeys(distinctID,{'N':-6999,'char_freq':-6999,...})

Is shorthand for:
tmp = {'N':-6999,'char_freq':-6999,...}
rain_raw_dict = {}
for key in distinctID:
    rain_raw_dict[key] = tmp

Note that tmp is a *reference*. Python does not magically create
copies for you; you have to be explicit. Unless you want a shared
value, dict.fromkeys should only be used with an immutable value (e.g.,
int or str).
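
A small sketch of why the immutable case is safe while the mutable one is not. The keys and variable names here (`counts`, `buckets`) are invented for the example.

```python
# Immutable shared value: "+=" rebinds counts['a'] to a new int object,
# so the other keys still refer to the original 0.
counts = dict.fromkeys(['a', 'b', 'c'], 0)
counts['a'] += 1

# Mutable shared value: append() changes the one list object in place,
# and every key refers to that same list.
buckets = dict.fromkeys(['a', 'b', 'c'], [])
buckets['a'].append(1)

same_object = buckets['a'] is buckets['b']

print(counts)
print(buckets)
print(same_object)
```

The point is that the sharing happens in both cases; it only becomes visible when the shared object can be modified in place.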

What you'll need to do is either:
tmp = {'N':-6999,'char_freq':-6999,...}
rain_raw_dict = {}
for key in distinctID:
    # explicitly make a (shallow) copy of tmp
    rain_raw_dict[key] = dict(tmp)

Or more simply:
rain_raw_dict = {}
for key in distinctID:
    rain_raw_dict[key] = {'N':-6999,'char_freq':-6999,...}

Or if you're a one-liner kinda guy,
rain_raw_dict = dict((key, {'N':-6999,'char_freq':-6999,...})
                     for key in distinctID)
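
One caveat on the "(shallow)" remark above: for a flat record of numbers like Jake's, a shallow copy is enough, but if the record ever holds a list or another dict, the inner object would still be shared. A rough sketch, assuming a hypothetical record with a nested `'tips'` list (not part of the original script), using `copy.deepcopy` from the standard library:

```python
import copy

# Hypothetical nested record; the real one in this thread is flat.
template = {'char_freq': -6999, 'tips': [-6999, -6999]}

shallow = dict(template)         # new outer dict, same inner list
deep = copy.deepcopy(template)   # new outer dict and new inner list

shallow['tips'][0] = 1   # also visible through template['tips']
deep['tips'][1] = 2      # affects only the deep copy

print(template['tips'])
print(shallow['tips'])
print(deep['tips'])
```

Building a fresh literal per key, as in the "more simply" version, sidesteps the question entirely because nothing is copied at all.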

--Ben
 
Jake Emerson

Thanks a lot Serge and Ben. Your posts were right on.

I hope the weather is good wherever you are.

Jake
 
