Algorithm for Labels like in Gmail

rhcarvalho

Hello there!

I'm trying to make a simple Contact Manager using Python (console
only), but I'm having trouble implementing a division by "Groups"
or "Labels" just like in Gmail. I don't have any real code to post
because all I have now is a raw TXT file holding the names and phones of
my contacts.

The only idea I have come up with so far seems too weak, so that's
why I'm asking for help. I thought of making a separate list (in a text
file) holding all the possible groups, where each group holds the names
of its contacts. Then if I access a group, I'll be able to see all
contacts related to that group. On the other hand, I'll also need to
attach to each contact a list of every group it belongs to.
I think it's a bad idea because it seems very easy to get things
messed up. The data could get corrupted or badly linked, and I'd
always need to run functions to check all the relations between
contact names and groups...

I like the way I can "label" emails in Gmail. Does anyone know how I
can implement that kind of feature? What's the best or most widely used
algorithm?

Sorry for the long message, and thanks in advance

Rodolfo Carvalho
 
Jerry

I would probably go with an SQLite database to store your information.
You can have the contacts listed in a table with unique ids, then a
table of labels. Finally, create a table that links one or more labels
with each contact. Then you can just keep adding more labels.
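
As a rough sketch of that three-table layout, assuming Python's standard sqlite3 module: the table names, column names, and helper functions below are made up for illustration and are not from the thread.

```python
import sqlite3

# Illustrative schema: one table per entity plus a link table between them.
SCHEMA = """
CREATE TABLE IF NOT EXISTS contacts (
    id    INTEGER PRIMARY KEY,
    name  TEXT NOT NULL,
    phone TEXT
);
CREATE TABLE IF NOT EXISTS labels (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE IF NOT EXISTS contact_labels (
    contact_id INTEGER NOT NULL REFERENCES contacts(id),
    label_id   INTEGER NOT NULL REFERENCES labels(id),
    PRIMARY KEY (contact_id, label_id)
);
"""

def open_db(path=":memory:"):
    """Open (or create) the database and make sure the tables exist."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn

def add_contact(conn, name, phone):
    """Insert a contact and return its generated id."""
    cur = conn.execute(
        "INSERT INTO contacts (name, phone) VALUES (?, ?)", (name, phone))
    return cur.lastrowid

def add_label(conn, contact_id, label):
    """Attach a label to a contact, creating the label if it is new."""
    conn.execute("INSERT OR IGNORE INTO labels (name) VALUES (?)", (label,))
    label_id = conn.execute(
        "SELECT id FROM labels WHERE name = ?", (label,)).fetchone()[0]
    conn.execute(
        "INSERT OR IGNORE INTO contact_labels VALUES (?, ?)",
        (contact_id, label_id))

def labels_of(conn, contact_id):
    """Return the label names attached to one contact, sorted by name."""
    rows = conn.execute(
        "SELECT l.name FROM labels l "
        "JOIN contact_labels cl ON cl.label_id = l.id "
        "WHERE cl.contact_id = ? ORDER BY l.name", (contact_id,))
    return [row[0] for row in rows]

conn = open_db()
ann = add_contact(conn, "Ann", "555-0100")
add_label(conn, ann, "Family")
add_label(conn, ann, "Work")
print(labels_of(conn, ann))  # ['Family', 'Work']
```

Adding another label later is just one more row in the link table, so neither the contact row nor the label row has to change.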
 
George Sakkis

Google for "many-to-many relationships". In short, you have two entity
classes (say emails and labels) where each instance of one entity may
be associated to zero or more instances of the other entity. In
databases you implement this by having three tables, one for each
entity and one for their association:

Email           RelEmailLabel           Label
----------      --------------          ---------
ID        <---  EmailID
subject         LabelID          --->   ID
...                                     name
                                        ...


Then you can associate mails to labels by joining all three tables
together on the IDs. Of course you can implement this in memory as well,
but you will probably want to store the data somewhere persistent
anyway, so an RDBMS is the way to go. SQLite (with pysqlite) would meet
your needs just fine.
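
For the in-memory variant, one possible sketch is below. The class name `LabelIndex` and its methods are hypothetical. The idea is that both directions of the relationship are only ever changed through `link` and `unlink`, so the two lookups cannot drift apart.

```python
from collections import defaultdict

class LabelIndex:
    """Many-to-many links between items and labels, kept in memory."""

    def __init__(self):
        self._labels_of = defaultdict(set)  # item  -> set of labels
        self._items_of = defaultdict(set)   # label -> set of items

    def link(self, item, label):
        # Update both directions together so they stay consistent.
        self._labels_of[item].add(label)
        self._items_of[label].add(item)

    def unlink(self, item, label):
        # discard() is a no-op when the link does not exist.
        self._labels_of[item].discard(label)
        self._items_of[label].discard(item)

    def labels_of(self, item):
        # .get() avoids creating empty entries for unknown items.
        return sorted(self._labels_of.get(item, ()))

    def items_of(self, label):
        return sorted(self._items_of.get(label, ()))

index = LabelIndex()
index.link("Ann", "Family")
index.link("Ann", "Work")
index.link("Bob", "Work")
print(index.labels_of("Ann"))  # ['Family', 'Work']
print(index.items_of("Work"))  # ['Ann', 'Bob']
```

This keeps nothing on disk, so it would still need to be saved and reloaded between runs.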

HTH,
George
 
Rodolfo

Hi George,

George said:
Google for "many-to-many relationships". In short, you have two entity
classes (say emails and labels) where each instance of one entity may
be associated to zero or more instances of the other entity. In
databases you implement this by having three tables, one for each
entity and one for their association:

Email           RelEmailLabel           Label
----------      --------------          ---------
ID        <---  EmailID
subject         LabelID          --->   ID
...                                     name
                                        ...

Ok, but how can I keep my relationship table free of bugs and bad data?
I wonder how I'll handle the following:
1st) Given a generic email, in which group(s) is it contained?
2nd) Given a group, which emails/contacts does it contain?

I don't have much experience with databases (yet).
Will pysqlite be easy to work with? I don't mean to make a big
program, just something very simple for my personal use.

George said:
Then you can associate mails to labels by joining all three tables
together on the IDs. Of course you can implement this in memory as well
but you should probably want to store them in some persistent area
anyway, so an rdbms the way to go. Sqlite (with pysqlite) would meet
your needs just fine.

I'll google for this module tomorrow and try to learn something about
it.
I plan to post the code I manage to write.

BTW which is the best way to store all those data files? Plain text
files? Some kind of binary file? or what?

Thank you once again,

Rodolfo
 
Guest

Rodolfo wrote:
[...]
Ok, but how can I keep my relationship table free of bugs and bad data?
I wonder how I'll handle the following:
1st) Given a generic email, in which group(s) is it contained?
2nd) Given a group, which emails/contacts does it contain?

To none perhaps?

I don't have much experience with databases (yet).
Will pysqlite be easy to work with? I don't mean to make a big
program, just something very simple for my personal use.

If a database sounds like too big a beast, there are easy ways to save your
objects using the standard library. Take a look at shelve, pickle and cPickle.
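
A minimal sketch of the pickle route, written for Python 3 (where cPickle no longer exists as a separate module and pickle uses the C implementation automatically). The dictionary layout and the function names are assumptions made for illustration.

```python
import os
import pickle
import tempfile

def save_contacts(path, contacts):
    """Write the whole contacts dictionary to one binary file."""
    with open(path, "wb") as f:
        pickle.dump(contacts, f)

def load_contacts(path):
    """Read the dictionary back, or start empty if the file does not exist yet."""
    try:
        with open(path, "rb") as f:
            return pickle.load(f)
    except FileNotFoundError:
        return {}

# Hypothetical data layout: name -> phone number and a set of labels.
contacts = {"Ann": {"phone": "555-0100", "labels": {"Family", "Work"}}}

path = os.path.join(tempfile.mkdtemp(), "contacts.pkl")
save_contacts(path, contacts)
print(load_contacts(path) == contacts)  # True
```

Pickle files should only be loaded from sources you trust, which is fine for a personal data file like this one.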

[...]
 
Dennis Lee Bieber

Rodolfo wrote:
Ok, but how can I keep my relationship table free of bugs and bad data?
I wonder how I'll handle the following:
1st) Given a generic email, in which group(s) is it contained?
2nd) Given a group, which emails/contacts does it contain?
I don't have much experience with databases (yet).
Will pysqlite be easy to work with? I don't mean to make a big
program, just something very simple for my personal use.

Start with it -- it will be much simpler (personally, since I have
MySQL running as a service on my desktop I'd use MySQLdb with it,
but...). SQLite is a "file server" database; each "database" is one
file, and all the tables are stored within that one file.

I didn't catch all of what you are trying to store, but...

Table: Email
    ID           integer primary key autoincrement
    Date         datetime                              # taken from message header
    Subject      varchar(255)                          # from message header
    FromID       integer foreign key (AddressBook:ID)
    Body         blob                                  # or other unlimited text type

Table: Group
    ID           integer primary key autoincrement
    Name         varchar(80)
    Description  varchar(255)

Table: AddressBook
    ID           integer primary key autoincrement
    RealName     varchar(80)
    From         varchar(80)                           # taken/matched to message header

Table: EmailGroup
    ID           integer primary key autoincrement     # optional if ...
    EmailID      integer foreign key (Email:ID)
    GroupID      integer foreign key (Group:ID)
    (Key unique index: EmailID, GroupID)               # ... this is defined

-=-=-=-

Find all emails with Group:Name XYZ

    select Email.ID, Date, RealName, Subject from Email
        inner join AddressBook on FromID = AddressBook.ID
        inner join EmailGroup on Email.ID = EmailGroup.EmailID
        inner join "Group" on EmailGroup.GroupID = "Group".ID
    where "Group".Name = 'XYZ'
    order by Date

When adding a new email, you have to add EmailGroup records for each
group you assign the message to -- EmailGroup only contains the record
(ID) number of the message, and each group.

    ID   EmailID   GroupID
    *    3         2
    *    3         5
    *    3         6
    *    4         5

    * autoincrement, assigned by database engine on insert

This says message #3 belongs to group #2, group #5, and group #6;
message #4 belongs to group #5.
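
One way to try this schema out, sketched with Python's standard sqlite3 module. It makes some assumptions: the optional EmailGroup ID column is left out, `Group` and `From` are quoted because they are SQL keywords, and the sample rows and the helpers `groups_of` and `emails_in` are invented for illustration. The foreign keys and the unique index are what keep bad links out of the relationship table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE AddressBook (
    ID INTEGER PRIMARY KEY AUTOINCREMENT, RealName TEXT, "From" TEXT);
CREATE TABLE Email (
    ID INTEGER PRIMARY KEY AUTOINCREMENT, Date TEXT, Subject TEXT,
    FromID INTEGER REFERENCES AddressBook(ID), Body TEXT);
CREATE TABLE "Group" (
    ID INTEGER PRIMARY KEY AUTOINCREMENT, Name TEXT, Description TEXT);
CREATE TABLE EmailGroup (
    EmailID INTEGER NOT NULL REFERENCES Email(ID),
    GroupID INTEGER NOT NULL REFERENCES "Group"(ID),
    UNIQUE (EmailID, GroupID));
""")

# Made-up sample data: one sender, two groups, one email in both groups.
conn.execute('INSERT INTO AddressBook (RealName, "From") VALUES (?, ?)',
             ("Ann Example", "ann@example.com"))
conn.executemany('INSERT INTO "Group" (Name) VALUES (?)',
                 [("Family",), ("Work",)])
conn.execute("INSERT INTO Email (Date, Subject, FromID) VALUES (?, ?, ?)",
             ("2006-05-01", "Hello", 1))
conn.executemany("INSERT INTO EmailGroup VALUES (?, ?)", [(1, 1), (1, 2)])

def groups_of(email_id):
    """1st question: which groups contain this email?"""
    rows = conn.execute(
        'SELECT g.Name FROM "Group" g '
        "JOIN EmailGroup eg ON eg.GroupID = g.ID "
        "WHERE eg.EmailID = ? ORDER BY g.Name", (email_id,))
    return [r[0] for r in rows]

def emails_in(group_name):
    """2nd question: which emails does this group contain?"""
    rows = conn.execute(
        "SELECT e.Subject FROM Email e "
        "JOIN EmailGroup eg ON eg.EmailID = e.ID "
        'JOIN "Group" g ON g.ID = eg.GroupID '
        "WHERE g.Name = ? ORDER BY e.Date", (group_name,))
    return [r[0] for r in rows]

print(groups_of(1))       # ['Family', 'Work']
print(emails_in("Work"))  # ['Hello']

# A link to a group that does not exist is refused by the database.
try:
    conn.execute("INSERT INTO EmailGroup VALUES (?, ?)", (1, 99))
except sqlite3.IntegrityError as exc:
    print("rejected:", exc)
```

Inserting the same (EmailID, GroupID) pair twice is refused in the same way, because of the unique index.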
--
Wulfraed Dennis Lee Bieber KD6MOG
(e-mail address removed) (e-mail address removed)
HTTP://wlfraed.home.netcom.com/
(Bestiaria Support Staff: (e-mail address removed))
HTTP://www.bestiaria.com/
 
Gerard Flanagan

There's a program called 'buzhug' (http://buzhug.sourceforge.net/)
which is described as "a pure-Python database engine, using a Pythonic,
no-SQL syntax". I think it's in its early stages of development but it
might be suitable for your project. I spent the morning playing about
with it using your example of a (Very Simple) Contact Manager, and
there's some runnable but unfinished code below. It only implements a
'Many-to-one' relationship between Contacts and Groups - in other
words, a Contact can only belong to one Group. There's probably a lot
I haven't considered, it will break easily, and the docstrings are in
the post, but there you go.

Have fun!

Gerard

(also - http://groups.google.com/group/buzhug)
------------------------------------------------------------

# Python 2 code (print statements, raw_input)
print

from buzhug import Base
import os

def get_bases():
    if os.path.exists('data'):
        return Base('data/dt_groups'), Base('data/dt_contacts')
    else:
        # Bases don't exist, so create them
        os.mkdir('data')
        groups = Base('data/dt_groups')
        groups.create( ('name', str) )
        groups.insert( 'Family' )
        groups.insert( 'Friends' )
        groups.commit()

        contacts = Base('data/dt_contacts')
        contacts.create( ('group', groups), ('first_name', str),
                         ('last_name', str), ('email', str) )
        contacts.insert( groups[0], 'Jack', 'Jones', '(e-mail address removed)' )
        contacts.insert( groups[0], 'John', 'Jones', '(e-mail address removed)' )

        contacts.insert( groups[1], 'James', 'diGriz',
                         '(e-mail address removed)' )
        contacts.insert( groups[1], 'Dirk', 'Gently',
                         '(e-mail address removed)' )
        contacts.commit()
        return groups, contacts

def usage():
    print '''
ADD - Add a Contact
    eg. ADD Family, Susan, Smith, (e-mail address removed)
    eg. ADD Work, Jason, Jones, (e-mail address removed)
DEL - Delete a Contact or Group
    eg. DEL firstname=Susan, lastname=Smith
    eg. DEL group=Family
FIND - Search for contacts
    eg. FIND lastname=Smith
    eg. FIND group=Family
EXIT - End the program
'''

def intro():
    print '\n' * 5
    print '#' * 52
    print '#' * 20, ' CONTACTS ', '#' * 20
    print '#' * 52
    usage()
    print '\n' * 2


def get_input(prompt):
    s = raw_input(prompt).strip()
    if s.upper() == 'EXIT':
        raise EOFError
    return s

def ADD(groupname, firstname, lastname, email):
    groups, contacts = get_bases()
    # strip first so the lookup and the insert use the same name
    groupname = groupname.strip()
    try:
        groups.open()
        # see if a group with this name exists
        records = [ g for g in groups if g.name == groupname ]
        if len(records) == 0:
            # no group with this name, so create it
            gid = groups.insert( name=groupname )
            group = groups[gid]
        else:
            group = records[0]
    finally:
        groups.commit()
    try:
        contacts.open()
        contacts.insert(group, firstname.strip(), lastname.strip(),
                        email.strip())
    finally:
        contacts.commit()

def test():
    # list every contact together with the name of its group
    groups, contacts = get_bases()
    try:
        contacts.open()
        for contact in contacts:
            print contact.group.name, contact.first_name, contact.last_name
    finally:
        contacts.commit()

if __name__ == '__main__':

    intro()
    while True:
        try:
            s = get_input('-> ')
            if s[:1] == '?':    # slicing avoids an IndexError on empty input
                usage()
            elif s == 'test':
                test()
            elif s[:4] == 'ADD ':
                try:
                    # split inside the try so a wrong field count is reported
                    grp, fname, lname, email = s[4:].split(',')
                    ADD(grp, fname, lname, email)
                except Exception, e:
                    print e
        except EOFError:
            break
    print '\nbye'

--------------------------------------------------------------------
 
Rodolfo

Thanks to you all.

I'll start by trying the buzhug solution, which seems more Pythonic. But I
feel like I've got to learn how to use databases... I once studied SQL a
little bit, but don't remember much.
I'll play around with Python and then show what I've got.

Bye

Rodolfo

P.S.: Is there a way to do a "clear screen" in a Python program? I mean
something like the "cls" command in MS-DOS...
 
