Save a dictionary to a file without brackets

  • Thread starter giuseppe.amatulli
G

giuseppe.amatulli

Hi,
I have a dict named unique, like this:
{(4, 5): 1, (5, 4): 1, (4, 4): 2, (2, 3): 1, (4, 3): 2}
and I want to print it to a file without the brackets, commas, and colons, in order to obtain something like this:
4 5 1
5 4 1
4 4 2
2 3 1
4 3 2
Any ideas?
Thanks in advance
Giuseppe
 
R

Roman Vashkevich

for key in dict:
    print key[0], key[1], dict[key]

10.08.2012, at 0:11, (e-mail address removed) wrote:
 
T

Tim Chase

{(4, 5): 1, (5, 4): 1, (4, 4): 2, (2, 3): 1, (4, 3): 2}
and I want to print it to a file without the brackets, commas, and colons, in order to obtain something like this:
4 5 1
5 4 1
4 4 2
2 3 1
4 3 2

for key in dict:
    print key[0], key[1], dict[key]

This might read more cleanly with tuple unpacking:

for (edge1, edge2), cost in d.iteritems():  # or .items()
    print edge1, edge2, cost

(I'm making the assumption that this is an edge/cost graph... use
appropriate names according to what they actually mean)

-tkc
 
G

Gelonida N

Hi,
I have a dict named unique, like this:
{(4, 5): 1, (5, 4): 1, (4, 4): 2, (2, 3): 1, (4, 3): 2}
and I want to print it to a file without the brackets, commas, and colons, in order to obtain something like this:
4 5 1
5 4 1
4 4 2
2 3 1
4 3 2
Any ideas?
Thanks in advance
Giuseppe
Boring explicit solution:

d = {(4, 5): 1, (5, 4): 1, (4, 4): 2, (2, 3): 1, (4, 3): 2}
fout = open("out.txt", "w")  # any writable file object (filename assumed)
for key, val in d.items():
    v1, v2 = key
    fout.write("%d %d %d\n" % (v1, v2, val))
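The same output can also be built in one pass with str.join and a generator expression; a minimal sketch under the same assumptions (the dict d as above, writing to out.txt):

d = {(4, 5): 1, (5, 4): 1, (4, 4): 2, (2, 3): 1, (4, 3): 2}
fout = open("out.txt", "w")
# one "k1 k2 value" line per dict entry
fout.write("\n".join("%d %d %d" % (k1, k2, v) for (k1, k2), v in d.items()) + "\n")
fout.close()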
 
G

Giuseppe Amatulli

Thanks for the fast replies!
My testing was very close to yours, but I did not know how to print the number after the colon.
Thanks!


Hi,
I have a dict named unique, like this:
{(4, 5): 1, (5, 4): 1, (4, 4): 2, (2, 3): 1, (4, 3): 2}
and I want to print it to a file without the brackets, commas, and colons, in order to obtain something like this:
4 5 1
5 4 1
4 4 2
2 3 1
4 3 2
Any ideas?
Thanks in advance

How's this?

from __future__ import print_function

output = open("out.txt", "w")

for (a, b), c in d.items():
    print(a, b, c, file=output)

output.close()

Oscar.
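
For what it's worth, the same approach with a with-statement closes the file even if an error occurs mid-write; a minimal sketch assuming the dict d from the original post:

from __future__ import print_function

d = {(4, 5): 1, (5, 4): 1, (4, 4): 2, (2, 3): 1, (4, 3): 2}
with open("out.txt", "w") as output:
    for (a, b), c in d.items():
        print(a, b, c, file=output)
# output is closed automatically when the with-block exits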
 
R

Roman Vashkevich

dict.items() is a list - linear access time whereas with 'for key in dict:' access time is constant: http://python.net/~goodger/projects/pycon/2007/idiomatic/handout.html#use-in-where-possible-1

10.08.2012, at 0:35, Tim Chase wrote:
{(4, 5): 1, (5, 4): 1, (4, 4): 2, (2, 3): 1, (4, 3): 2}
and I want to print it to a file without the brackets, commas, and colons, in order to obtain something like this:
4 5 1
5 4 1
4 4 2
2 3 1
4 3 2

for key in dict:
    print key[0], key[1], dict[key]

This might read more cleanly with tuple unpacking:

for (edge1, edge2), cost in d.iteritems():  # or .items()
    print edge1, edge2, cost

(I'm making the assumption that this is an edge/cost graph... use
appropriate names according to what they actually mean)

-tkc
 
M

Mark Lawrence

dict.items() is a list - linear access time whereas with 'for key in dict:' access time is constant: http://python.net/~goodger/projects/pycon/2007/idiomatic/handout.html#use-in-where-possible-1

10.08.2012, at 0:35, Tim Chase wrote:
{(4, 5): 1, (5, 4): 1, (4, 4): 2, (2, 3): 1, (4, 3): 2}
and I want to print it to a file without the brackets, commas, and colons, in order to obtain something like this:
4 5 1
5 4 1
4 4 2
2 3 1
4 3 2

for key in dict:
    print key[0], key[1], dict[key]

This might read more cleanly with tuple unpacking:

for (edge1, edge2), cost in d.iteritems():  # or .items()
    print edge1, edge2, cost

(I'm making the assumption that this is an edge/cost graph... use
appropriate names according to what they actually mean)

-tkc

I'm impressed, the OP gives a dict with five entries and already we're
optimising, a cunning plan if ever there was one. Hum, I think I'll
start on the blast proof ferro-concrete bunker tonight just in case
WWIII starts tomorrow.
 
T

Tim Chase

10.08.2012, at 0:35, Tim Chase wrote:
{(4, 5): 1, (5, 4): 1, (4, 4): 2, (2, 3): 1, (4, 3): 2}
and I want to print it to a file without the brackets, commas, and colons, in order to obtain something like this:
4 5 1
5 4 1
4 4 2
2 3 1
4 3 2

for key in dict:
    print key[0], key[1], dict[key]

This might read more cleanly with tuple unpacking:

for (edge1, edge2), cost in d.iteritems():  # or .items()
    print edge1, edge2, cost

(I'm making the assumption that this is an edge/cost graph... use
appropriate names according to what they actually mean)

dict.items() is a list - linear access time whereas with 'for
key in dict:' access time is constant:
http://python.net/~goodger/projects/pycon/2007/idiomatic/handout.html#use-in-where-possible-1

That link doesn't actually discuss dict.{iter}items()

Both are O(N) because you have to touch each item in the dict--you
can't iterate over N entries in less than O(N) time. For small
data-sets, building the list and then iterating over it may be
faster; for larger data-sets, the cost of building the list
overshadows the (minor) overhead of a generator. Either way, the
iterate-and-fetch-the-associated-value of .items() & .iteritems()
can (should?) be optimized in Python's internals to the point I
wouldn't think twice about using the more readable version.

-tkc
 
R

Roman Vashkevich

Actually, they are different.
Put a dict.{iter}items() in an O(k^N) algorithm and make it a hundred thousand entries, and you will feel the difference.
Dict uses hashing to get a value from the dict and this is why it's O(1).

10.08.2012, at 1:21, Tim Chase wrote:
10.08.2012, at 0:35, Tim Chase wrote:
On 08/09/12 15:22, Roman Vashkevich wrote:
{(4, 5): 1, (5, 4): 1, (4, 4): 2, (2, 3): 1, (4, 3): 2}
and I want to print it to a file without the brackets, commas, and colons, in order to obtain something like this:
4 5 1
5 4 1
4 4 2
2 3 1
4 3 2

for key in dict:
    print key[0], key[1], dict[key]

This might read more cleanly with tuple unpacking:

for (edge1, edge2), cost in d.iteritems():  # or .items()
    print edge1, edge2, cost

(I'm making the assumption that this is an edge/cost graph... use
appropriate names according to what they actually mean)

dict.items() is a list - linear access time whereas with 'for
key in dict:' access time is constant:
http://python.net/~goodger/projects/pycon/2007/idiomatic/handout.html#use-in-where-possible-1

That link doesn't actually discuss dict.{iter}items()

Both are O(N) because you have to touch each item in the dict--you
can't iterate over N entries in less than O(N) time. For small
data-sets, building the list and then iterating over it may be
faster; for larger data-sets, the cost of building the list
overshadows the (minor) overhead of a generator. Either way, the
iterate-and-fetch-the-associated-value of .items() & .iteritems()
can (should?) be optimized in Python's internals to the point I
wouldn't think twice about using the more readable version.

-tkc
 
T

Terry Reedy

10.08.2012, at 0:35, Tim Chase wrote:
On 08/09/12 15:22, Roman Vashkevich wrote:
{(4, 5): 1, (5, 4): 1, (4, 4): 2, (2, 3): 1, (4, 3): 2}
and I want to print it to a file without the brackets, commas, and colons, in order to obtain something like this:
4 5 1
5 4 1
4 4 2
2 3 1
4 3 2

for key in dict:
    print key[0], key[1], dict[key]

This might read more cleanly with tuple unpacking:

for (edge1, edge2), cost in d.iteritems():  # or .items()
    print edge1, edge2, cost

(I'm making the assumption that this is an edge/cost graph... use
appropriate names according to what they actually mean)

dict.items() is a list - linear access time whereas with 'for
key in dict:' access time is constant:
http://python.net/~goodger/projects/pycon/2007/idiomatic/handout.html#use-in-where-possible-1

That link doesn't actually discuss dict.{iter}items()

Both are O(N) because you have to touch each item in the dict--you
can't iterate over N entries in less than O(N) time. For small
data-sets, building the list and then iterating over it may be
faster; for larger data-sets, the cost of building the list
overshadows the (minor) overhead of a generator. Either way, the
iterate-and-fetch-the-associated-value of .items() & .iteritems()
can (should?) be optimized in Python's internals to the point I
wouldn't think twice about using the more readable version.

In 3.x, .keys, .values, and .items are set-like read-only views
specifically designed for iteration. So in 3.x they are THE way to do so
for whichever alternative is appropriate. Iterating by keys and then
looking up values instead of yielding the values at the same time is
extra work.
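
A small Python 3 sketch of the view behavior Terry describes (the dict contents are just the example from this thread):

d = {(4, 5): 1, (5, 4): 1}
items = d.items()        # a dict_items view, not a list
d[(2, 3)] = 1
print(len(items))        # 3 -- the view reflects the later insertion
for (a, b), c in items:  # key tuple and value in one pass, no extra lookup
    print(a, b, c)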
 
D

Dave Angel

Actually, they are different.
Put a dict.{iter}items() in an O(k^N) algorithm and make it a hundred thousand entries, and you will feel the difference.
Dict uses hashing to get a value from the dict and this is why it's O(1).

Sure, that's why

for key in dict:
    print key[0], key[1], dict[key]

is probably slower than

for (edge1, edge2), cost in d.iteritems():  # or .items()
    print edge1, edge2, cost


So, the latter is both faster and easier to read. Why are you arguing against it?

Also, please stop top-posting. It's impolite here, and makes it much harder to figure out who is saying what, in what order.
 
C

Chris Kaynor

Actually, they are different.
Put a dict.{iter}items() in an O(k^N) algorithm and make it a hundred thousand entries, and you will feel the difference.
Dict uses hashing to get a value from the dict and this is why it's O(1).

Using "in" as an operator such as: "if key in dict" or "result = key
in dict" is O(1) as you say. Iterating on the dictionary requires
touching every item, and so is O(n), even though it also using "in" in
the command.

Here are a few quick timing tests I just ran with Python 2.6:

timeit.timeit('for i in d: pass', 'd=dict.fromkeys(range(1))')
0.078683853332734088
timeit.timeit('for i in d: pass', 'd=dict.fromkeys(range(10))')
0.17451784110969015
timeit.timeit('for i in d: pass', 'd=dict.fromkeys(range(100))')
1.1708168159579486
timeit.timeit('for i in d.iteritems(): pass', 'd=dict.fromkeys(range(1))')
0.14186911440299355
timeit.timeit('for i in d.iteritems(): pass', 'd=dict.fromkeys(range(10))')
0.33836512561802579
timeit.timeit('for i in d.iteritems(): pass', 'd=dict.fromkeys(range(100))')
2.2544262854249268
timeit.timeit('for i in d: v=d[i]', 'd=dict.fromkeys(range(1))')
0.10009793211446549
timeit.timeit('for i in d: v=d[i]', 'd=dict.fromkeys(range(10))')
0.38825072496723578
timeit.timeit('for i in d: v=d[i]', 'd=dict.fromkeys(range(100))')
3.3020098061049339


As can be seen here, a 1-item dictionary iterated in 0.07 seconds, 10
items in 0.17 seconds, and 100 items in 1.17 seconds. That is fairly
close to linear, especially when considering the overhead of a
complete no-op loop.

Using iteritems, it appears to actually scale slightly better than
linear, though it is slower than just the plain iteration.

Doing a plain iteration, then looking up the keys to get the values
also appears to be linear, and is even slower than iteritems.
 
C

Chris Kaynor

I realized I should have done 10, 100, 1000 rather than 1, 10, 100
for better results, so here are the results for 1000 items. It still
maintains the same pattern:

timeit.timeit('for i in d: pass', 'd=dict.fromkeys(range(1000))')
10.166595947685153
timeit.timeit('for i in d.iteritems(): pass', 'd=dict.fromkeys(range(1000))')
19.922474218828711
timeit.timeit('for i in d: v=d[i]', 'd=dict.fromkeys(range(1000))')
31.007666660415282

Chris

Actually, they are different.
Put a dict.{iter}items() in an O(k^N) algorithm and make it a hundred thousand entries, and you will feel the difference.
Dict uses hashing to get a value from the dict and this is why it's O(1).

Using "in" as an operator such as: "if key in dict" or "result = key
in dict" is O(1) as you say. Iterating on the dictionary requires
touching every item, and so is O(n), even though it also using "in" in
the command.

Here are a few quick timing tests I just ran with Python 2.6:
timeit.timeit('for i in d: pass', 'd=dict.fromkeys(range(1))')
0.078683853332734088
timeit.timeit('for i in d: pass', 'd=dict.fromkeys(range(10))')
0.17451784110969015
timeit.timeit('for i in d: pass', 'd=dict.fromkeys(range(100))')
1.1708168159579486
timeit.timeit('for i in d.iteritems(): pass', 'd=dict.fromkeys(range(1))')
0.14186911440299355
timeit.timeit('for i in d.iteritems(): pass', 'd=dict.fromkeys(range(10))')
0.33836512561802579
timeit.timeit('for i in d.iteritems(): pass', 'd=dict.fromkeys(range(100))')
2.2544262854249268
timeit.timeit('for i in d: v=d[i]', 'd=dict.fromkeys(range(1))')
0.10009793211446549
timeit.timeit('for i in d: v=d[i]', 'd=dict.fromkeys(range(10))')
0.38825072496723578
timeit.timeit('for i in d: v=d[i]', 'd=dict.fromkeys(range(100))')
3.3020098061049339


As can be seen here, a 1-item dictionary iterated in 0.07 seconds, 10
items in 0.17 seconds, and 100 items in 1.17 seconds. That is fairly
close to linear, especially when considering the overhead of a
complete no-op loop.

Using iteritems, it appears to actually scale slightly better than
linear, though it is slower than just the plain iteration.

Doing a plain iteration, then looking up the keys to get the values
also appears to be linear, and is even slower than iteritems.
 
G

Giuseppe Amatulli

Thanks a lot for the clarification.
Actually my problem is: given two raster datasets in GeoTIFF format, find
the unique pair combinations and count the number of observations,
the unique values in rast1 and count the number of observations,
the unique values in rast2 and count the number of observations.

I tried different solutions and this seems to me the fastest:


from __future__ import print_function  # needed in Python 2 for print(..., file=...)

# dsRast00 and dsRast10 are GDAL datasets opened earlier; np is numpy
Rast00 = dsRast00.GetRasterBand(1).ReadAsArray()
Rast10 = dsRast10.GetRasterBand(1).ReadAsArray()

# maybe these masking operations can be included in the for loop
mask = (Rast00 != 0) & (Rast10 != 0)
Rast00_mask = Rast00[mask]
Rast10_mask = Rast10[mask]

array2D = np.array(zip(Rast00_mask, Rast10_mask))

unique_u = dict()
unique_k1 = dict()
unique_k2 = dict()

for key1, key2 in array2D:
    row = (key1, key2)
    if row in unique_u:
        unique_u[row] += 1
    else:
        unique_u[row] = 1
    if key1 in unique_k1:
        unique_k1[key1] += 1
    else:
        unique_k1[key1] = 1
    if key2 in unique_k2:
        unique_k2[key2] += 1
    else:
        unique_k2[key2] = 1

output = open(dst_file_rast0010, "w")
for (a, b), c in unique_u.items():
    print(a, b, c, file=output)
output.close()

output = open(dst_file_rast00, "w")
for a, b in unique_k1.items():
    print(a, b, file=output)
output.close()

output = open(dst_file_rast10, "w")
for a, b in unique_k2.items():
    print(a, b, file=output)
output.close()

What do you think? Is there a way to speed up the process?
Thanks
Giuseppe
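
One way to cut down the per-pixel Python loop is to let collections.Counter (standard library since Python 2.7) do the three counts, one pass each. A sketch under the assumption that Rast00_mask and Rast10_mask are 1-D integer arrays as above; the small arrays here are hypothetical stand-ins for the masked rasters:

import numpy as np
from collections import Counter

# hypothetical stand-ins for the masked raster values
Rast00_mask = np.array([4, 5, 4, 4, 2, 4, 4])
Rast10_mask = np.array([5, 4, 4, 4, 3, 3, 3])

# each Counter replaces one if/else block from the loop above
unique_u = Counter(zip(Rast00_mask, Rast10_mask))  # pair counts
unique_k1 = Counter(Rast00_mask)                   # rast1 value counts
unique_k2 = Counter(Rast10_mask)                   # rast2 value counts

print(unique_u[(4, 4)])  # 2, matching the dict in the original post

Counters are dicts, so the three write-out loops work unchanged; whether this beats the explicit loop on real rasters is worth timing.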





Actually, they are different.
Put a dict.{iter}items() in an O(k^N) algorithm and make it a hundred thousand entries, and you will feel the difference.
Dict uses hashing to get a value from the dict and this is why it's O(1).

10.08.2012, at 1:21, Tim Chase wrote:
10.08.2012, at 0:35, Tim Chase wrote:
On 08/09/12 15:22, Roman Vashkevich wrote:
{(4, 5): 1, (5, 4): 1, (4, 4): 2, (2, 3): 1, (4, 3): 2}
and I want to print it to a file without the brackets, commas, and colons, in order to obtain something like this:
4 5 1
5 4 1
4 4 2
2 3 1
4 3 2

for key in dict:
    print key[0], key[1], dict[key]

This might read more cleanly with tuple unpacking:

for (edge1, edge2), cost in d.iteritems():  # or .items()
    print edge1, edge2, cost

(I'm making the assumption that this is an edge/cost graph... use
appropriate names according to what they actually mean)

dict.items() is a list - linear access time whereas with 'for
key in dict:' access time is constant:
http://python.net/~goodger/projects/pycon/2007/idiomatic/handout.html#use-in-where-possible-1

That link doesn't actually discuss dict.{iter}items()

Both are O(N) because you have to touch each item in the dict--you
can't iterate over N entries in less than O(N) time. For small
data-sets, building the list and then iterating over it may be
faster; for larger data-sets, the cost of building the list
overshadows the (minor) overhead of a generator. Either way, the
iterate-and-fetch-the-associated-value of .items() & .iteritems()
can (should?) be optimized in Python's internals to the point I
wouldn't think twice about using the more readable version.

-tkc
 
R

Roman Vashkevich

10.08.2012, at 1:47, Dave Angel wrote:
Actually, they are different.
Put a dict.{iter}items() in an O(k^N) algorithm and make it a hundred thousand entries, and you will feel the difference.
Dict uses hashing to get a value from the dict and this is why it's O(1).

Sure, that's why

for key in dict:
    print key[0], key[1], dict[key]

is probably slower than

for (edge1, edge2), cost in d.iteritems():  # or .items()
    print edge1, edge2, cost


So, the latter is both faster and easier to read. Why are you arguing against it?

Also, please stop top-posting. It's impolite here, and makes it much harder to figure out who is saying what, in what order.

I'm not arguing at all. Sorry if it sounded like I was arguing.
Thanks for notifying me of the way messages should be sent.

Roman
 
A

Andrew Cooper

Actually, they are different.
Put a dict.{iter}items() in an O(k^N) algorithm and make it a hundred thousand entries, and you will feel the difference.
Dict uses hashing to get a value from the dict and this is why it's O(1).

Slightly off topic, but looking up a value in a dictionary is actually
O(n) in the number of other entries in the dict that suffer a hash
collision with the searched entry.

True, a sensible choice of hash function will reduce n to 1 in common
cases, but it becomes an important consideration for larger datasets.

~Andrew
 
D

Dave Angel

Slightly off topic, but looking up a value in a dictionary is actually
O(n) in the number of other entries in the dict that suffer a hash
collision with the searched entry.

True, a sensible choice of hash function will reduce n to 1 in common
cases, but it becomes an important consideration for larger datasets.

~Andrew

I'm glad you're wrong for CPython's dictionaries. The only time the
lookup would degenerate to O(n) would be if the hash table had only one
slot. CPython sensibly increases the hash table size when it becomes
too small for efficiency.


Where have you seen dictionaries so poorly implemented?
 
C

Chris Kaynor

Sligtly off topic, but looking up a value in a dictionary is actually
O(n) for all other entries in the dict which suffer a hash collision
with the searched entry.

True, a sensible choice of hash function will reduce n to 1 in common
cases, but it becomes an important consideration for larger datasets.

~Andrew

I'm glad you're wrong for CPython's dictionaries. The only time the
lookup would degenerate to O(n) would be if the hash table had only one
slot. CPython sensibly increases the hash table size when it becomes
too small for efficiency.


Where have you seen dictionaries so poorly implemented?

There are plenty of ways to make a pathological hash function that
will have that issue in CPython.

The very simple (and stupid):

class O(object):
    def __hash__(self):
        return 0
    def __eq__(self, other):  # I am aware this is the default equals method.
        return self is other

Start adding those to a dictionary to get O(n) lookups.

Any case where the hash return values modulo the dictionary hash table
size are constant will have similar results; powers of 2 are likely to
result in such behavior as well.
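
A quick way to feel that degradation is to time membership tests as such a dict grows; a minimal sketch using a class like the one above (the sizes are arbitrary):

import timeit

class O(object):
    def __hash__(self):
        return 0  # every instance lands in the same hash bucket

for n in (100, 1000, 10000):
    objs = [O() for _ in range(n)]
    d = dict.fromkeys(objs)
    probe = objs[-1]  # inserted last, so it sits deepest in the probe chain
    # the per-lookup time grows roughly linearly with n instead of staying flat
    print(n, timeit.timeit(lambda: probe in d, number=1000))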
 
