nntplib: abstraction of threads

Rakesh

For an application of mine, I need to fetch the messages from a Usenet
newsgroup and group them by thread. My starting Python code looks
as follows.

<--- Startup code to read messages from a newsgroup -->

import nntplib, cStringIO, rfc822, sys

SRVR = '<my_news_server>' # Your news server
newsgroup = 'comp.lang.c' # Group of your choice

def inpdflt(s, d):
    resp = raw_input("%s [%s]: " % (s, d))
    return resp or d

news = nntplib.NNTP(SRVR)
resp, estimate, first, last, name = news.group(newsgroup)

if estimate == '0':
    sys.exit("No messages in " + newsgroup)

#
# Get (article number, subject, poster, date, id, references, size, lines)
# for each of the articles between first and last
#
xover = news.xover(first, last)

# loop through articles, extracting headers
for x in xover[1]:
    # x == (article number, subject, poster, date, id, references, size, lines)
    try:
        hdrs = news.head(x[0])[3]
        mesg = rfc822.Message(cStringIO.StringIO("\r\n".join(hdrs)))
        print '%s\n+++%s' % (mesg.getheader("from"),
                             mesg.getheader("subject"))
    except nntplib.NNTPError:
        pass
news.quit()

<-- End newsgroup -->

This gets me all the messages of the newsgroup that are stored on the
news server.
What I want is to *group the messages belonging to each thread*.

How would I do that?

E.g.:

Topic 1
|
-- Re: Topic:1
-- Re: Topic: 1
|
-- Re: Re: Topic 1

Topic 2
|
-- Re: Topic:2


The total number of messages is 6, but the number of threads is 2.
I want to get an abstraction similar to this.
 
Werner Amann

Rakesh said:
What I want is to *group the messages belonging to each thread* .

Hello

Why not sort by Message-ID and References?
Beware: this is a newbie solution.

import nntplib

hamster = nntplib.NNTP('127.0.0.1', 119, 'user', 'pass')
resp, count, first, last, name = hamster.group('comp.lang.python')
resp, items = hamster.xover(first, last)

start_dic = {}  # subject -> (author, message_id) of each thread starter
re_dic = {}     # running number -> (subject, author, references) of each reply
numb = 1

for id, subject, author, date, message_id, references, size, lines in items:
    if 'Re:' not in subject:
        start_dic[subject] = (author, message_id)
    else:
        re_dic[numb] = (subject, author, references)
        numb += 1

resp = hamster.quit()

for a in start_dic:
    print a
    print start_dic[a][0]
    for b in re_dic:
        # a reply belongs to this thread if the starter's Message-ID
        # appears in the reply's References
        if start_dic[a][1] in re_dic[b][2]:
            print '|'
            print ' ->', re_dic[b][0]
            print '   ', re_dic[b][1]
    print
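
A variation on the idea above, as a rough sketch only: instead of testing the subject for 'Re:', each article can be keyed on the first entry of its References list, which by convention is the Message-ID of the thread starter. The function name `group_by_thread` and the sample data are made up for illustration, and the tuples are assumed to have the same layout as the xover items above, with references already split into a list. It is written in Python 3 syntax and does not talk to a server.

```python
def group_by_thread(overview):
    """Group overview tuples by the Message-ID of their thread starter.

    Each item is assumed to look like an xover entry:
    (number, subject, poster, date, message_id, references, size, lines),
    where references is a list of Message-IDs, oldest first.
    """
    threads = {}
    for item in overview:
        message_id, references = item[4], item[5]
        # The first reference is the oldest ancestor; an article without
        # references is treated as the start of its own thread.
        root = references[0] if references else message_id
        threads.setdefault(root, []).append(item)
    return threads


# Hypothetical sample data, not taken from a real server.
sample = [
    ('1', 'Topic 1', 'alice', '', '<a1@example>', [], '0', '0'),
    ('2', 'Re: Topic 1', 'bob', '', '<b1@example>', ['<a1@example>'], '0', '0'),
    ('3', 'Topic 2', 'carol', '', '<c1@example>', [], '0', '0'),
    ('4', 'Re: Topic 1', 'dave', '', '<d1@example>',
     ['<a1@example>', '<b1@example>'], '0', '0'),
    ('5', 'Re: Topic 2', 'erin', '', '<e1@example>', ['<c1@example>'], '0', '0'),
]

threads = group_by_thread(sample)
for root, items in threads.items():
    print(root, [item[1] for item in items])
```

This only gives a flat list per thread, and it relies on posters' newsreaders filling in References correctly; a reply whose thread starter is missing from the server still ends up under that starter's Message-ID.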
 
Steve Holden

Better still, do a Google search on "mail threading algorithm",
implement the algorithm described in

http://www.jwz.org/doc/threading.html

and post your implementation back to the newsgroup :)

regards
Steve
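
To give a feel for the parent/child linking that such an algorithm starts from, here is a much-simplified sketch. It is not the full algorithm from the page linked above, which (among other things) also creates placeholders for missing messages and falls back on subjects. Here the last entry of References is simply taken as the direct parent. The names `build_tree` and `render` and the sample data are invented for illustration; the tuples are assumed to follow the xover layout, with references as a list. Python 3 syntax.

```python
def build_tree(overview):
    """Link each article to its direct parent (the last reference)."""
    subjects = {}  # message_id -> subject
    parent = {}    # message_id -> message_id of the direct parent
    for item in overview:
        message_id, references = item[4], item[5]
        subjects[message_id] = item[1]
        if references:
            parent[message_id] = references[-1]

    roots, children = [], {}
    for message_id in subjects:
        p = parent.get(message_id)
        if p in subjects:
            children.setdefault(p, []).append(message_id)
        else:
            # No parent, or the parent is not in the overview (for
            # example expired): treat the article as a thread root.
            roots.append(message_id)
    return roots, children, subjects


def render(ids, children, subjects, depth=0):
    """Return the tree as a list of indented text lines."""
    lines = []
    for message_id in ids:
        prefix = '    ' * depth + ('-- ' if depth else '')
        lines.append(prefix + subjects[message_id])
        lines.extend(render(children.get(message_id, []),
                            children, subjects, depth + 1))
    return lines


# Hypothetical sample data: 6 messages in 2 threads.
sample = [
    ('1', 'Topic 1', '', '', '<t1@example>', [], '0', '0'),
    ('2', 'Re: Topic 1', '', '', '<t1r1@example>', ['<t1@example>'], '0', '0'),
    ('3', 'Re: Topic 1', '', '', '<t1r2@example>', ['<t1@example>'], '0', '0'),
    ('4', 'Re: Re: Topic 1', '', '', '<t1r2a@example>',
     ['<t1@example>', '<t1r2@example>'], '0', '0'),
    ('5', 'Topic 2', '', '', '<t2@example>', [], '0', '0'),
    ('6', 'Re: Topic 2', '', '', '<t2r1@example>', ['<t2@example>'], '0', '0'),
]

roots, children, subjects = build_tree(sample)
tree_lines = render(roots, children, subjects)
print('\n'.join(tree_lines))
```

The number of threads is then `len(roots)`, and the printed tree has the nested shape asked for in the original question.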
 
Rakesh

Steve said:
Better still, do a Google search on "mail threading algorithm",
implement the algorithm described in

http://www.jwz.org/doc/threading.html


Thanks a lot for the link.

Steve said:
and post your implementation back to the newsgroup :)

Sure, I will.
I am a Python newbie and am reading the NNTP spec (RFC) right now.
Once I have a working version, I will definitely post it.
 
