Thanks a lot for the clarification.
Actually, my problem is: given two raster datasets in GeoTIFF format, find
the unique pair combinations and count the number of observations of each,
the unique values in rast1 and count the number of observations of each,
the unique values in rast2 and count the number of observations of each.
I tried different solutions and this one seems to me the fastest:
import numpy as np

Rast00 = dsRast00.GetRasterBand(1).ReadAsArray()
Rast10 = dsRast10.GetRasterBand(1).ReadAsArray()
# Maybe these masking operations can be included in the for loop.
mask = (Rast00 != 0) & (Rast10 != 0)
Rast00_mask = Rast00[mask]
Rast10_mask = Rast10[mask]
# One (Rast00 value, Rast10 value) row per pixel that is non-zero in both rasters.
array2D = np.array(list(zip(Rast00_mask, Rast10_mask)))
unique_u = dict()   # (value in Rast00, value in Rast10) -> count
unique_k1 = dict()  # value in Rast00 -> count
unique_k2 = dict()  # value in Rast10 -> count
for key1, key2 in array2D:
    row = (key1, key2)
    if row in unique_u:
        unique_u[row] += 1
    else:
        unique_u[row] = 1
    if key1 in unique_k1:
        unique_k1[key1] += 1
    else:
        unique_k1[key1] = 1
    if key2 in unique_k2:
        unique_k2[key2] += 1
    else:
        unique_k2[key2] = 1
with open(dst_file_rast0010, "w") as output:
    for (a, b), c in unique_u.items():
        print(a, b, c, file=output)
with open(dst_file_rast00, "w") as output:
    for a, b in unique_k1.items():
        print(a, b, file=output)
with open(dst_file_rast10, "w") as output:
    for a, b in unique_k2.items():
        print(a, b, file=output)
What do you think? Is there a way to speed up the process?
Thanks
Giuseppe
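One possible way to speed this up, sketched here under assumptions rather than measured on the real rasters, is to let NumPy do the counting instead of a Python-level loop. The function name count_values_and_pairs and the nodata parameter are made up for illustration; the sketch assumes both rasters have the same shape and that np.unique with axis and return_counts is available (NumPy 1.13 or later).

```python
import numpy as np


def count_values_and_pairs(rast_a, rast_b, nodata=0):
    """Count unique pairs and unique per-raster values, skipping nodata pixels.

    Hypothetical helper: rast_a and rast_b are 2-D arrays of equal shape,
    e.g. the results of GetRasterBand(1).ReadAsArray().
    """
    # Keep only pixels that are valid in both rasters.
    mask = (rast_a != nodata) & (rast_b != nodata)
    a = rast_a[mask]
    b = rast_b[mask]

    # One row per pixel: (value in rast_a, value in rast_b).
    stacked = np.stack([a, b], axis=1)
    pairs, pair_counts = np.unique(stacked, axis=0, return_counts=True)
    vals_a, counts_a = np.unique(a, return_counts=True)
    vals_b, counts_b = np.unique(b, return_counts=True)

    # tolist() converts NumPy scalars to plain Python numbers.
    pair_dict = {tuple(p): c for p, c in zip(pairs.tolist(), pair_counts.tolist())}
    dict_a = dict(zip(vals_a.tolist(), counts_a.tolist()))
    dict_b = dict(zip(vals_b.tolist(), counts_b.tolist()))
    return pair_dict, dict_a, dict_b


# Tiny made-up rasters, only to show the call.
rast_a = np.array([[4, 5, 0], [4, 2, 4]])
rast_b = np.array([[5, 4, 3], [4, 3, 0]])
pair_dict, dict_a, dict_b = count_values_and_pairs(rast_a, rast_b)
```

The three dictionaries have the same layout as unique_u, unique_k1 and unique_k2 above, so the file-writing part can stay as it is. Whether this is actually faster depends on the raster size and the number of distinct values, so it is worth timing on real data.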
Actually, they are different.
Put a dict.{iter}items() in an O(k^N) algorithm and make it a hundred thousand entries, and you will feel the difference.
Dict uses hashing to get a value from the dict and this is why it's O(1).
On 10.08.2012, at 1:21, Tim Chase wrote:
On 10.08.2012, at 0:35, Tim Chase wrote:
On 08/09/12 15:22, Roman Vashkevich wrote:
{(4, 5): 1, (5, 4): 1, (4, 4): 2, (2, 3): 1, (4, 3): 2}
and I want to print it to a file without the brackets, commas and colons, in order to obtain something like this:
4 5 1
5 4 1
4 4 2
2 3 1
4 3 2
for key in dict:
    print key[0], key[1], dict[key]
This might read more cleanly with tuple unpacking:
for (edge1, edge2), cost in d.iteritems(): # or .items()
    print edge1, edge2, cost
(I'm making the assumption that this is an edge/cost graph...use
appropriate names according to what they actually mean)
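Since the original question was about printing to a file, here is a minimal sketch of the tuple-unpacking version that writes to a stream. It assumes Python 3 (print as a function, .items() instead of .iteritems()); the function name write_pair_counts is made up for illustration, and io.StringIO stands in for a real file opened with open(path, "w").

```python
import io


def write_pair_counts(pair_counts, stream):
    """Write one 'a b count' line per (a, b) key of pair_counts."""
    for (a, b), count in pair_counts.items():
        print(a, b, count, file=stream)


# The dictionary from the question; StringIO stands in for a real file.
pairs = {(4, 5): 1, (5, 4): 1, (4, 4): 2, (2, 3): 1, (4, 3): 2}
buffer = io.StringIO()
write_pair_counts(pairs, buffer)
```

On Python 3.7 and later a dict keeps insertion order, so the lines come out in the order the keys were added; on older versions the order is arbitrary.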
dict.items() is a list - linear access time whereas with 'for
key in dict:' access time is constant:
http://python.net/~goodger/projects/pycon/2007/idiomatic/handout.html#use-in-where-possible-1
That link doesn't actually discuss dict.{iter}items()
Both are O(N) because you have to touch each item in the dict--you
can't iterate over N entries in less than O(N) time. For small
data-sets, building the list and then iterating over it may be
faster; for larger data-sets, the cost of building the list
overshadows the (minor) overhead of a generator. Either way, the
iterate-and-fetch-the-associated-value of .items() & .iteritems()
can (should?) be optimized in Python's internals to the point I
wouldn't think twice about using the more readable version.
-tkc
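The disagreement above is easy to check empirically. The following is only a rough sketch, assuming Python 3 (where .items() returns a view rather than a list); the function names and the dictionary contents are made up for illustration, and the timings will vary by machine and interpreter version.

```python
import timeit


def sum_by_key_lookup(d):
    """Iterate over keys and fetch each value with d[key]."""
    total = 0
    for key in d:
        total += d[key]
    return total


def sum_by_items(d):
    """Iterate over (key, value) pairs directly."""
    total = 0
    for _key, value in d.items():
        total += value
    return total


def compare(n_entries=100_000, repeats=5):
    """Return (seconds for key lookup, seconds for .items()) over `repeats` passes."""
    d = {(i, i + 1): i for i in range(n_entries)}
    t_lookup = timeit.timeit(lambda: sum_by_key_lookup(d), number=repeats)
    t_items = timeit.timeit(lambda: sum_by_items(d), number=repeats)
    return t_lookup, t_items


if __name__ == "__main__":
    t_lookup, t_items = compare()
    print("for key in d: d[key]    ", t_lookup)
    print("for k, v in d.items()   ", t_items)
```

Both loops visit every entry once, so both scale linearly with the size of the dict; the measurement only shows the constant-factor difference between them.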