Delete common entries between two dictionaries

Amy G

I have received such good help on this message board. I wonder if I might
not get a little more help from you on this.

I am at the point where I have two dictionaries, with information of a
domain and a frequency of that domain.

Now that I have the two, I want to delete each entry from one that the two
have in common, leaving only those that are unique to the dictionary?

Say I have a dictionary called domains_black
and another domains_white...

Thanks for the help.
 
Derrick 'dman' Hudson

Amy G said:
I have received such good help on this message board. I wonder if I
might not get a little more help from you on this.

I am at the point where I have two dictionaries, with information of
a domain and a frequency of that domain.

Now that I have the two, I want to delete each entry from one that
the two have in common, leaving only those that are unique to the
dictionary?

This would be great for sets, if a set adequately models your data.
(with two sets, this would simply be (s1-(s1&s2)))

Amy G said:
Say I have a dictionary called domains_black and another
domains_white...

Did you want to define equality by key or by (key, value) pair?

# equality by key
for key in domains_white.keys():
    if key in domains_black: del domains_black[key]

# equality by (key, value) pair
for key in domains_white.keys():
    if key in domains_black and domains_white[key] == domains_black[key]:
        del domains_black[key]
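
For comparison, here is a minimal sketch of both variants that builds a new dictionary instead of deleting in place. The helper names and the sample data (including the domain example.org and all counts) are made up for illustration; only the two variable names come from the thread.

```python
# Sample data is hypothetical; only the variable names come from the thread.
domains_black = {"yahoo.com": 118, "msn.com": 764, "example.org": 5}
domains_white = {"yahoo.com": 11, "example.org": 5}

def remove_common_keys(black, white):
    """Keep entries of black whose key does not appear in white."""
    return {k: v for k, v in black.items() if k not in white}

def remove_common_items(black, white):
    """Keep entries of black unless white holds the same key AND value."""
    return {k: v for k, v in black.items()
            if k not in white or white[k] != v}

by_key = remove_common_keys(domains_black, domains_white)
by_item = remove_common_items(domains_black, domains_white)
print(by_key)
print(by_item)
```

In this sample, yahoo.com survives the (key, value) comparison because its counts differ between the two dictionaries, while the key-only comparison removes it.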

-D

--
He who scorns instruction will pay for it,
but he who respects a command is rewarded.
Proverbs 13:13

www: http://dman13.dyndns.org/~dman/ jabber: (e-mail address removed)
 
Paul Rubin

Amy G said:
Now that I have the two, I want to delete each entry from one that the two
have in common, leaving only those that are unique to the dictionary?

Say I have a dictionary called domains_black
and another domains_white...

for k in domains_white():
    if k in domains_black:
        del domains_black[k]
 
David Eppstein

Derrick 'dman' Hudson said:
This would be great for sets, if a set adequately models your data.
(with two sets, this would simply be (s1-(s1&s2)))

You mean s1 - s2, no need for that extra &.
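
A quick way to convince yourself of that identity is sketched below. It uses Python's built-in set type and made-up domain names, both of which are assumptions (the thread itself uses the older Set spelling).

```python
# Hypothetical sample sets for illustration.
s1 = {"yahoo.com", "msn.com", "example.org"}
s2 = {"yahoo.com", "aol.com"}

# Subtracting the intersection gives the same result as subtracting s2,
# because elements of s2 that are not in s1 have nothing to remove.
long_form = s1 - (s1 & s2)
short_form = s1 - s2
print(long_form == short_form)
```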
 
Amy G

How do I do this same thing but with lists???

I apparently have two lists... not dictionaries.

This is what it prints if I add
print domains_black

[('YAHOO.COM', 118), ('buildingonline.com', 130), ('canada.com', 95),
('china.com', 104), ('earthlink.com', 118), ('earthlink.net', 122),
('email.com', 286), ('excite.com', 200), ('hongkong.com', 110), ('juno.com',
233), ('lycos.com', 95), ('mail.com', 399), ('minedu.fi', 134), ('msn.com',
764), ('shaw.ca', 259), ('stderr.windsongnews.com', 88), ('yahoo.ca', 435),
('yahoo.co.uk', 303), ('yahoo.com.hk', 156), ('yahoo.fr', 266)]

This is domains_white

[('aol.com', 17), ('awci.org', 6), ('cox.net', 12), ('hotmail.com', 6),
('yahoo.com', 11)]

I want to be left with domains_black =

[('YAHOO.COM', 118), ('buildingonline.com', 130), ('canada.com', 95),
('china.com', 104), ('earthlink.com', 118), ('earthlink.net', 122),
('email.com', 286), ('excite.com', 200), ('hongkong.com', 110), ('juno.com',
233), ('lycos.com', 95), ('mail.com', 399), ('minedu.fi', 134), ('msn.com',
764), ('shaw.ca', 259), ('stderr.windsongnews.com', 88), ('yahoo.ca', 435),
('yahoo.co.uk', 303), ('yahoo.com.hk', 156), ('yahoo.fr', 266)]

i.e., minus the entries in domains_white.

Thanks again guys.

AMY
 
David Eppstein

Amy G said:
How do I do this same thing but with lists???

I apparently have two lists... not dictionaries.

domains_black = [x for x in domains_black if x not in domains_white]

If domains_white is a long list, this will be inefficient due to the
linear search to test whether each x belongs to it. In that case, you
might be better off using a set:

mask = Set(domains_white)
domains_black = [x for x in domains_black if x not in mask]

Also, this creates a new list. If you instead want to change the same
list in-place, you could replace "domains_black =" with
"domains_black[:] =".
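
A minimal sketch of the set-based, in-place variant follows. It assumes the built-in set type of current Python (the Set above presumably refers to the class in the old sets module), and the sample pairs and the name alias are made up for illustration.

```python
# Hypothetical sample data in the same (domain, count) shape as the thread.
domains_black = [("yahoo.com", 118), ("msn.com", 764), ("foo.bar", 100)]
domains_white = [("yahoo.com", 11), ("foo.bar", 100)]

alias = domains_black          # second reference to the same list object
mask = set(domains_white)      # tuples are hashable, so they can go in a set

# Slice assignment replaces the contents of the existing list object,
# so every reference to it (including alias) sees the filtered result.
domains_black[:] = [x for x in domains_black if x not in mask]
print(domains_black)
print(alias is domains_black)
```

Note that this compares whole (domain, count) tuples, so ("yahoo.com", 118) is kept even though yahoo.com appears in the whitelist with a different count.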
 
Amy G

Don't know what I could have done wrong, but it just returned the original
list, unchanged.


 
John Hazen

Amy G said:
How do I do this same thing but with lists???

I apparently have two lists... not dictionaries.

Well, it's hard to tell exactly what you want, given that none of the
domains in the whitelist are in the original blacklist (and that you
didn't answer Derrick's question about whether you want to consider
entries equal if just the domain is the same, or do you require the
domain and the count to be the same).

(Also, I would recommend you normalize all of the domains to lowercase,
since case information is not significant to DNS.)
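
As a rough sketch of that normalization step, the helper below lowercases the domain in every (domain, count) pair before comparing. The function name normalize and the shortened sample lists are assumptions for illustration. The sketch also does not merge counts when two entries collapse to the same lowercase domain.

```python
def normalize(pairs):
    """Return the (domain, count) pairs with every domain lowercased."""
    return [(domain.lower(), count) for domain, count in pairs]

# Shortened, hypothetical excerpts in the shape of the lists in the thread.
black = normalize([("YAHOO.COM", 118), ("msn.com", 764)])
white = normalize([("aol.com", 17), ("yahoo.com", 11)])

# Compare by domain only, ignoring the counts.
white_domains = [domain for domain, _ in white]
remaining = [pair for pair in black if pair[0] not in white_domains]
print(remaining)
```

Without the lowercasing, 'YAHOO.COM' and 'yahoo.com' would count as different domains and nothing would be removed.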

Anyway, I would solve the question you're asking with list
comprehensions:
>>> black = [('yahoo.com',118),
...          ('buildingonline.com',130),('foo.bar',100)]
>>> white = [('yahoo.com',11),('foo.bar',100)]
>>> [x for x in black if x not in white]
[('yahoo.com', 118), ('buildingonline.com', 130)]
>>> # note the version above only removes entries that have both
... # domain and count equal.
...
>>> [x for x in black if x[0] not in [y[0] for y in white]]
[('buildingonline.com', 130)]
>>> # I think the above is what you want. I think it'll be
... # more readable with an intermediate assignment:
...
>>> w = [y[0] for y in white]
>>> [x for x in black if x[0] not in w]
[('buildingonline.com', 130)]

HTH-

John
 
Aahz

Amy G said:
Now that I have the two, I want to delete each entry from one that the two
have in common, leaving only those that are unique to the dictionary?

Say I have a dictionary called domains_black
and another domains_white...

for k in domains_white():
    if k in domains_black:
        del domains_black[k]

Didja try that before posting?.... (I see at least two errors.)
 
Paul Rubin

Aahz said:
for k in domains_white():
    if k in domains_black:
        del domains_black[k]

Didja try that before posting?.... (I see at least two errors.)

Oops, editing error (I removed 'keys' which is no longer needed, but
forgot to remove the parentheses). No I didn't try it first.
What's the second error?
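
With the stray parentheses removed, the loop would look like the sketch below. It fixes only the error acknowledged above, and the sample dictionaries are made up for illustration.

```python
# Hypothetical sample data; only the variable names come from the thread.
domains_black = {"yahoo.com": 118, "msn.com": 764}
domains_white = {"yahoo.com": 11, "aol.com": 17}

# Iterating over a dictionary yields its keys, so no call is needed.
# Deleting from domains_black is safe here because the loop iterates
# over domains_white, a different dictionary.
for k in domains_white:
    if k in domains_black:
        del domains_black[k]

print(domains_black)
```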
 
Derrick 'dman' Hudson

David Eppstein said:
You mean s1 - s2, no need for that extra &.

Ah, of course -- if an item is in s2 but not in s1, the subtraction is
a no-op.

--
Love is not affectionate feeling, but a steady wish for the loved
person's ultimate good as far as it can be obtained.
--C.S. Lewis

 
